❈ Zheng Xiaowai, Python engineer, mainly responsible for web development and log data processing. His blog posts "True Tornado Asynchronous Non-blocking", "Use JWT to Make Your RESTful API Safer" and many others have been featured as daily picks by well-known technology communities. "Using Shipyard to Build a Docker Cluster" was selected for the Dockerone Weekly Report.
Personal blog: https://www.hexiangyu.me
GitHub: https://github.com/zhengxiaowai ❈
Tornado is defined as both a web framework and an asynchronous networking library. Its asynchronous non-blocking capability can solve the problem of blocking requests found in the other two frameworks, so Tornado is the one to use when concurrency is required.
In practice, however, it is easy to end up using Tornado as a blocking framework, in which case it has no advantage over the other two frameworks. This article shows how to achieve true asynchronous non-blocking behavior.
The examples below use Python 2.7.13 on a 2016 MacBook Pro.
Two decorators in Tornado:
The asynchronous decorator turns the request into a long-lived connection. The response is only sent once you manually call self.finish().

    class MainHandler(tornado.web.RequestHandler):
        @tornado.web.asynchronous
        def get(self):
            # bad
            self.write("Hello, world")

With the asynchronous decorator, self.finish() is not called automatically. If the request is never finished explicitly, the long-lived connection stays open and the request remains in the pending state.
So the correct usage is to apply asynchronous and call finish manually:

    class MainHandler(tornado.web.RequestHandler):
        @tornado.web.asynchronous
        def get(self):
            self.write("Hello, world")
            self.finish()

The coroutine decorator marks the request handler as a coroutine. Put simply, it lets you write asynchronous Tornado code with yield.
Tornado implements its own protocol for coroutines, so an ordinary Python generator cannot be used as one.
Before programming in the coroutine style, you need to know how to write asynchronous functions in Tornado. Tornado offers several ways to write them: callbacks, Futures, coroutines, and so on. Of these, the coroutine style is the simplest and the most widely used.
Writing a coroutine-based asynchronous function also requires the coroutine decorator:

    @gen.coroutine
    def sleep(self):
        yield gen.sleep(10)
        raise gen.Return([1, 2, 3, 4, 5])

This is an asynchronous function. Tornado's coroutine-based asynchronous functions have two characteristics: they are wrapped with the coroutine decorator, and their return value is thrown as an exception with raise gen.Return().
The return value is thrown as an exception because generators were not allowed to return a value before Python 3.3.
Anyone who has used Python generators knows that a generator does not start running until next() is called on it manually. One of the jobs of the coroutine decorator is therefore to run the generator automatically when the asynchronous function is called.
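Both points can be illustrated without Tornado. The following is a minimal sketch in plain Python: Return, add_later and drive are hypothetical names invented for this example, not Tornado APIs, and the driver only mimics, in a highly simplified way, what a coroutine decorator has to do (start the generator, send results back in, and catch the exception that carries the return value).

```python
class Return(Exception):
    """Hypothetical stand-in for gen.Return: an exception carrying a value."""

    def __init__(self, value=None):
        super(Return, self).__init__()
        self.value = value


def add_later(a, b):
    # A generator-based "coroutine": it yields a request and waits for
    # the driver to send the result back in.
    total = yield (a, b)
    # Deliver the result by raising, as Python 2 generators cannot return it.
    raise Return(total * 2)


def drive(generator):
    """Toy driver: run the generator to completion and return its result."""
    try:
        request = next(generator)  # nothing runs until next() is called
        while True:
            # Pretend the "asynchronous operation" is summing the tuple.
            request = generator.send(sum(request))
    except Return as exc:
        return exc.value           # the value arrived via the exception
    except StopIteration:
        return None                # the generator ended without a value


result = drive(add_later(1, 2))
print(result)
```

Calling add_later(1, 2) on its own only creates a generator object. The body runs once drive calls next() on it, which is the step a coroutine decorator takes care of for you.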
The obvious disadvantage of coroutines is that they depend heavily on third-party library support. If a library does not itself support Tornado's asynchronous operation, the code will still block no matter how you use coroutines. Here is an example.

    import time
    import logging
    import tornado.ioloop
    import tornado.web
    import tornado.options
    from tornado import gen

    tornado.options.parse_command_line()


    class MainHandler(tornado.web.RequestHandler):
        @tornado.web.asynchronous
        def get(self):
            self.write("Hello, world")
            self.finish()


    class NoBlockingHandler(tornado.web.RequestHandler):
        @gen.coroutine
        def get(self):
            # gen.sleep hands control back to the IOLoop while waiting
            yield gen.sleep(10)
            self.write('Blocking Request')


    class BlockingHandler(tornado.web.RequestHandler):
        def get(self):
            # time.sleep holds the IOLoop thread for the whole 10 seconds
            time.sleep(10)
            self.write('Blocking Request')


    def make_app():
        return tornado.web.Application([
            (r"/", MainHandler),
            (r"/block", BlockingHandler),
            (r"/noblock", NoBlockingHandler),
        ], autoreload=True)


    if __name__ == "__main__":
        app = make_app()
        app.listen(8000)
        tornado.ioloop.IOLoop.current().start()

The sleep is set to 10 seconds to make the effect obvious.
With yield gen.sleep(10), the asynchronous sleep, other requests are not blocked while it runs.
With time.sleep(10), other requests are blocked for the whole duration.
Note that asynchronous non-blocking here is from the point of view of other requests. The request that is sleeping still has to wait until the sleep is over.
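The difference between the two kinds of sleep can be reproduced with a toy single-threaded loop. This is only a sketch in plain Python and not how Tornado's IOLoop is implemented: run_all, cooperative and blocking are hypothetical names, and the 0.2 second delay is an arbitrary value chosen to keep the example fast.

```python
import heapq
import time


def run_all(tasks):
    """Toy single-threaded loop: each task yields how long it wants to wait."""
    start = time.time()
    ready = [(start, index, task) for index, task in enumerate(tasks)]
    heapq.heapify(ready)
    while ready:
        wake_at, index, task = heapq.heappop(ready)
        # Idle until the earliest task is due to wake up.
        time.sleep(max(0.0, wake_at - time.time()))
        try:
            delay = next(task)  # run the task up to its next yield
        except StopIteration:
            continue            # this task has finished
        heapq.heappush(ready, (time.time() + delay, index, task))
    return time.time() - start


def cooperative(seconds):
    # Hands control back to the loop, which can run other tasks meanwhile.
    yield seconds


def blocking(seconds):
    # Holds the only thread: no other task can run during this sleep.
    time.sleep(seconds)
    yield 0


cooperative_time = run_all([cooperative(0.2), cooperative(0.2)])
blocking_time = run_all([blocking(0.2), blocking(0.2)])
print("cooperative: %.2fs, blocking: %.2fs" % (cooperative_time, blocking_time))
```

The two cooperative tasks wait at the same time, so the total is about one delay. The two blocking tasks run back to back, so the total is about two delays, even though each individual task waits just as long in both cases.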
Since Tornado 3.1, gen.coroutine automatically calls self.finish() to end the request, so the asynchronous decorator is no longer needed with it.
This way of achieving asynchronous non-blocking behavior therefore depends on asynchronous libraries built for Tornado's protocol, which limits where it can be used. Fortunately, some such asynchronous libraries do exist.
Thread-based asynchronous programming
When an asynchronous function is written with the gen.coroutine decorator but the library it calls does not support asynchronous operation, the response will still block.
Tornado provides a decorator, run_on_executor, that uses a ThreadPoolExecutor to turn blocking code into non-blocking code. The principle is to start a separate thread, outside Tornado's own thread, to run the blocking code, so that Tornado itself is not blocked.
concurrent.futures is part of the standard library in Python 3. In Python 2 it must be installed manually with pip install futures.

    import time
    import logging
    import tornado.ioloop
    import tornado.web
    import tornado.options
    from tornado import gen
    from tornado.concurrent import run_on_executor
    from concurrent.futures import ThreadPoolExecutor

    tornado.options.parse_command_line()


    class MainHandler(tornado.web.RequestHandler):
        @tornado.web.asynchronous
        def get(self):
            self.write("Hello, world")
            self.finish()


    class NoBlockingHandler(tornado.web.RequestHandler):
        # Thread pool shared by all instances of this handler
        executor = ThreadPoolExecutor(4)

        @run_on_executor
        def sleep(self, second):
            # Runs in a pool thread, so the IOLoop thread stays free
            time.sleep(second)
            return second

        @gen.coroutine
        def get(self):
            second = yield self.sleep(5)
            self.write('noBlocking Request: {}'.format(second))


    def make_app():
        return tornado.web.Application([
            (r"/", MainHandler),
            (r"/noblock", NoBlockingHandler),
        ], autoreload=True)


    if __name__ == "__main__":
        app = make_app()
        app.listen(8000)
        tornado.ioloop.IOLoop.current().start()

ThreadPoolExecutor is a high-level wrapper around threading in the standard library. It uses threads to make blocking functions asynchronous, which solves the problem that many libraries do not support asynchronous operation.
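What the executor does on its own, without Tornado, can be seen in a few lines. This is a minimal sketch using only concurrent.futures: blocking_sleep is a hypothetical stand-in for any blocking call, and the pool size and durations are arbitrary.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_sleep(seconds):
    # Stand-in for any blocking call, such as a database query.
    time.sleep(seconds)
    return seconds


executor = ThreadPoolExecutor(max_workers=4)

start = time.time()
# submit() returns a Future immediately; the work runs in a pool thread.
futures = [executor.submit(blocking_sleep, 0.2) for _ in range(4)]
submitted_after = time.time() - start

# result() waits until the corresponding worker thread has finished.
results = [future.result() for future in futures]
elapsed = time.time() - start
executor.shutdown()

print("submit returned after %.3fs, all done after %.3fs"
      % (submitted_after, elapsed))
```

The calling thread gets its Futures back almost at once and the four sleeps overlap, which is the property that lets a handler yield on such a Future while the IOLoop keeps serving other requests.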
The problem that comes with it is that if a large number of threaded asynchronous functions are used for high-load work, the Tornado process will perform poorly and respond slowly. This simply trades one problem for another.
So for small workloads it works well and lets Tornado run asynchronously without blocking.
But if you know that a function does high-load work, you should take another approach: combining Tornado with Celery to achieve asynchronous non-blocking behavior.
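One concrete way a thread pool stops helping is saturation: once every worker thread is busy, further jobs wait in the pool's queue. The sketch below uses only the standard library and is not a Tornado benchmark. slow_job and run_jobs are hypothetical names, and the pool sizes and durations are arbitrary values chosen for the example.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_job(seconds):
    # Stand-in for a long-running blocking call.
    time.sleep(seconds)
    return seconds


def run_jobs(pool_size, job_count, seconds):
    """Run job_count blocking jobs on a pool and return the wall-clock time."""
    executor = ThreadPoolExecutor(max_workers=pool_size)
    start = time.time()
    futures = [executor.submit(slow_job, seconds) for _ in range(job_count)]
    for future in futures:
        future.result()
    elapsed = time.time() - start
    executor.shutdown()
    return elapsed


enough_threads = run_jobs(pool_size=4, job_count=4, seconds=0.2)
too_few_threads = run_jobs(pool_size=2, job_count=4, seconds=0.2)
print("4 threads: %.2fs, 2 threads: %.2fs" % (enough_threads, too_few_threads))
```

With four jobs on two threads, the last two jobs cannot start until the first two finish, so the total time roughly doubles. In a web handler this shows up as requests that wait for a free thread before their work even begins.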
Celery-based asynchronous programming
Celery is a simple, flexible and reliable distributed system that processes a large number of messages. It focuses on a task queue for real-time processing and also supports task scheduling.
Celery is not the only choice. You can choose other task queues to implement it, but Celery is written in Python and can be used quickly. At the same time, Celery provides an elegant interface and is easy to integrate with the Python Web framework.
To use it with Tornado there is tornado-celery, a package that already wraps Celery for Tornado and can be used directly.
In actual testing, because tornado-celery has not been updated for a long time, the request blocks forever and never returns. The solution is to pin these older versions:
pip install celery==3.1
pip install pika==0.9.14

    import time
    import logging
    import tornado.ioloop
    import tornado.web
    import tornado.options
    from tornado import gen

    import tcelery, tasks

    tornado.options.parse_command_line()
    tcelery.setup_nonblocking_producer()


    class MainHandler(tornado.web.RequestHandler):
        @tornado.web.asynchronous
        def get(self):
            self.write("Hello, world")
            self.finish()


    class CeleryHandler(tornado.web.RequestHandler):
        @gen.coroutine
        def get(self):
            response = yield gen.Task(tasks.sleep.apply_async, args=[5])
            self.write('CeleryBlocking Request: {}'.format(response.result))


    def make_app():
        return tornado.web.Application([
            (r"/", MainHandler),
            (r"/celery-block", CeleryHandler),
        ], autoreload=True)


    if __name__ == "__main__":
        app = make_app()
        app.listen(8000)
        tornado.ioloop.IOLoop.current().start()

The tasks module imported above:

    import os
    import time
    from celery import Celery
    from tornado import gen

    celery = Celery("tasks", broker="amqp://")
    celery.conf.CELERY_RESULT_BACKEND = os.environ.get('CELERY_RESULT_BACKEND', 'amqp')


    @celery.task
    def sleep(seconds):
        time.sleep(float(seconds))
        return seconds


    if __name__ == "__main__":
        celery.start()

Celery's worker runs in a separate process, independent of the Tornado process, so it does not affect Tornado's own efficiency, and it handles complex tasks more efficiently than the thread mode.
Summary
Method | Advantage | Disadvantage | Availability |
---|---|---|---|
gen.coroutine | Simple and elegant | Needs asynchronous library support | ★★☆☆☆ |
Thread | Simple | May affect performance | ★★★☆☆ |
Celery | Good performance | Complex to operate, requires old versions | ★★★☆☆ |
At present there is no single best approach to asynchronous non-blocking programming. The available asynchronous libraries are relatively limited and cover only the most commonly used cases, and it is difficult for an individual to write an asynchronous library.
The recommendation is to use the thread and Celery modes for asynchronous programming: lightweight tasks run in threads, and complex tasks run in Celery. Of course, if an asynchronous library is available, using it is best.
In Python 3, Tornado can be set to asyncio mode, so that a library compatible with asyncio mode can be used. This should be the future direction.
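As a rough idea of what asyncio-style code looks like, here is a minimal sketch using only the Python 3 standard library (asyncio.run requires Python 3.7 or later). It does not involve Tornado at all, and blocking_sleep, non_blocking_task and main are hypothetical names for this example. It shows a natively asynchronous wait next to a blocking function pushed onto the default thread pool with run_in_executor.

```python
import asyncio
import time


def blocking_sleep(seconds):
    # A blocking call from a library with no asyncio support.
    time.sleep(seconds)
    return seconds


async def non_blocking_task(seconds):
    # asyncio.sleep yields to the event loop instead of holding the thread.
    await asyncio.sleep(seconds)
    return seconds


async def main():
    loop = asyncio.get_running_loop()
    start = time.monotonic()
    results = await asyncio.gather(
        non_blocking_task(0.2),
        non_blocking_task(0.2),
        # Run the blocking function in the loop's default thread pool.
        loop.run_in_executor(None, blocking_sleep, 0.2),
    )
    return results, time.monotonic() - start


results, elapsed = asyncio.run(main())
print(results, "in %.2fs" % elapsed)
```

All three waits overlap, so the total is close to a single delay. The same two options discussed in this article appear again: use an asynchronous library where one exists, and fall back to a thread where it does not.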