Real Tornado asynchronous non-blocking

Zheng Xiaowai is a Python engineer, mainly responsible for web development and log data processing. Blog posts such as "True Tornado Asynchronous Non-blocking" and "Use JWT to Make Your RESTful API Safer" have been featured in the daily picks of well-known technology communities, and "Using Shipyard to Build a Docker Cluster" was selected for the Dockerone Weekly Report.

Personal blog: https://www.hexiangyu.me

GitHub: https://github.com/zhengxiaowai

Tornado is defined as both a web framework and an asynchronous networking library. Its asynchronous, non-blocking capabilities solve the request-blocking problem of the other two frameworks it is compared with, so Tornado is the one to use when concurrency is required.

In practice, however, it is easy to end up using Tornado as a blocking framework, in which case it has no advantage over the other two frameworks. This article shows how to achieve truly asynchronous, non-blocking behavior.

The Python version used below is 2.7.13 and the platform is a MacBook Pro (2016).

Asynchronous programming using gen.coroutine

Two decorators in Tornado:

  • tornado.web.asynchronous
  • tornado.gen.coroutine

The asynchronous decorator turns the request into a long-lived connection. The response is only sent once you call self.finish() manually.

class MainHandler(tornado.web.RequestHandler):
    @tornado.web.asynchronous
    def get(self):
        # bad: self.finish() is never called, so the request never completes
        self.write("Hello, world")

The asynchronous decorator does not call self.finish() automatically. If the request is never finished explicitly, the long-lived connection stays open and the request remains pending.

So the correct usage is to call self.finish() manually whenever asynchronous is used:

class MainHandler(tornado.web.RequestHandler):
    @tornado.web.asynchronous
    def get(self):
        self.write("Hello, world")
        self.finish()

The coroutine decorator turns the request handler into a coroutine. Put simply, it lets you write asynchronous Tornado code with yield.

Tornado implements its own protocol for coroutines, so ordinary Python generators cannot be used as they are.

Before programming in coroutine mode, you need to know how to write asynchronous functions in Tornado. Tornado offers several styles of asynchronous code: callbacks, Futures, coroutines, and so on. Of these, the coroutine style is the simplest and the most widely used.
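To make the difference between the callback style and the Future style concrete, here is a minimal sketch. It uses the standard library's concurrent.futures.Future as a stand-in, not Tornado's own Future class, and completes the Future by hand to simulate an operation finishing. The names on_done, received and direct are illustrative only. It targets Python 3.

```python
from concurrent.futures import Future

received = []


def on_done(future):
    # Callback style: the result is pushed to a function registered up front.
    received.append(future.result())


future = Future()
future.add_done_callback(on_done)

# Simulate the asynchronous operation completing; this triggers on_done.
future.set_result("response body")

# Future style: whoever holds the Future object can ask it for the result.
direct = future.result()
```

The coroutine style builds on the Future style: instead of asking for the result explicitly, the code yields the Future and is resumed with its result.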

Writing a coroutine-based asynchronous function also requires the coroutine decorator:

@gen.coroutine
def sleep(self):
    yield gen.sleep(10)
    raise gen.Return([1, 2, 3, 4, 5])

This is an asynchronous function. A coroutine-based asynchronous function in Tornado has two characteristics:

  • It must use the coroutine decorator
  • Its return value must be raised as an exception with raise gen.Return()

The return value is raised as an exception because generators were not allowed to return a value before Python 3.3.
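The idea of carrying a return value inside an exception can be sketched in plain Python. The Return class and the run helper below are hypothetical stand-ins written for this illustration. They are not Tornado's implementation, and the sketch targets Python 3 so that both styles can be shown side by side.

```python
class Return(Exception):
    """Hypothetical stand-in that mimics the idea behind tornado.gen.Return."""

    def __init__(self, value=None):
        super().__init__()
        self.value = value


def old_style():
    yield "working"
    # No `return value` allowed in old generators, so raise it instead.
    raise Return([1, 2, 3, 4, 5])


def new_style():
    yield "working"
    # Python 3.3+: the value travels inside StopIteration.
    return [1, 2, 3, 4, 5]


def run(generator):
    """Exhaust a generator and hand back whichever kind of result it produced."""
    try:
        while True:
            next(generator)
    except Return as exc:
        return exc.value
    except StopIteration as stop:
        return stop.value


old_result = run(old_style())
new_result = run(new_style())
```

In both cases the caller recovers the same list; only the way the generator hands it over differs.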

Anyone who has used Python generators knows that a generator does nothing until it is started manually with next(). One of the jobs of the coroutine decorator is therefore to drive the generator automatically when the asynchronous function is called.
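A minimal sketch of that behavior in plain Python 3 follows. The auto_drive decorator is a hypothetical, heavily simplified stand-in: it drives the generator synchronously and echoes each yielded value straight back, whereas the real gen.coroutine resumes the generator from the IOLoop when a yielded Future resolves.

```python
import functools

calls = []


def fetch():
    calls.append("body started")
    reply = yield "ping"
    return "got " + reply


idle = fetch()                 # creates the generator object only
ran_before_next = bool(calls)  # the body has not run yet


def auto_drive(func):
    """Toy decorator: start the generator and run it to completion."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        generator = func(*args, **kwargs)
        try:
            yielded = next(generator)              # the manual start, automated
            while True:
                yielded = generator.send(yielded)  # echo each value back in
        except StopIteration as stop:
            return stop.value
    return wrapper


driven_fetch = auto_drive(fetch)
result = driven_fetch()
```

Calling fetch() directly only builds a generator, while calling the decorated version actually executes the body and produces the final value.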

The obvious disadvantage of coroutines is that they depend heavily on how third-party libraries are implemented. If a library does not itself support Tornado's asynchronous operation, its calls will still block no matter how coroutines are used. Here is an example.

import time
import logging

import tornado.ioloop
import tornado.web
import tornado.options
from tornado import gen

tornado.options.parse_command_line()


class MainHandler(tornado.web.RequestHandler):
    @tornado.web.asynchronous
    def get(self):
        self.write("Hello, world")
        self.finish()


class NoBlockingHandler(tornado.web.RequestHandler):
    @gen.coroutine
    def get(self):
        # Asynchronous sleep: the IOLoop keeps serving other requests.
        yield gen.sleep(10)
        self.write('NoBlocking Request')


class BlockingHandler(tornado.web.RequestHandler):
    def get(self):
        # Synchronous sleep: the whole IOLoop is stalled for 10 seconds.
        time.sleep(10)
        self.write('Blocking Request')


def make_app():
    return tornado.web.Application([
        (r"/", MainHandler),
        (r"/block", BlockingHandler),
        (r"/noblock", NoBlockingHandler),
    ], autoreload=True)


if __name__ == "__main__":
    app = make_app()
    app.listen(8000)
    tornado.ioloop.IOLoop.current().start()

The sleep is set to 10 seconds to make the effect easy to see.

With the asynchronous sleep, yield gen.sleep(10), other requests are not blocked.

With time.sleep(10), other requests are blocked.

Asynchronous non-blocking here applies to the other requests. The request that is sleeping still takes as long as it takes.
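The same effect can be measured without starting a web server. The sketch below is an assumption-laden stand-in: it uses the standard library's asyncio event loop in place of Tornado's IOLoop, Python 3 syntax, and 0.2 seconds instead of 10 to keep it quick. Two jobs are started together on one event loop, once with a blocking sleep and once with an asynchronous sleep.

```python
import asyncio
import time


async def blocking_job():
    # time.sleep holds the thread, so the event loop cannot run anything else.
    time.sleep(0.2)


async def non_blocking_job():
    # asyncio.sleep gives control back to the event loop while waiting.
    await asyncio.sleep(0.2)


async def timed(job):
    """Run two copies of `job` concurrently and return the elapsed seconds."""
    start = time.perf_counter()
    await asyncio.gather(job(), job())
    return time.perf_counter() - start


blocking_elapsed = asyncio.run(timed(blocking_job))
non_blocking_elapsed = asyncio.run(timed(non_blocking_job))
```

The blocking pair runs one after the other, while the non-blocking pair overlaps, so the second measurement should come out at roughly half of the first.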

Since Tornado 3.1, gen.coroutine calls self.finish() automatically to end the request, so the asynchronous decorator is not needed.

This way of achieving asynchronous non-blocking behavior therefore depends on a large number of asynchronous libraries built on Tornado's protocol, which limits where it can be used. Fortunately, some usable asynchronous libraries do exist.

Thread-based asynchronous programming

When an asynchronous function is written with the gen.coroutine decorator but the underlying library does not support asynchronous operation, the request is still handled in a blocking way.

Tornado provides the run_on_executor decorator, which uses a ThreadPoolExecutor to turn blocking code into non-blocking code. The principle is to start a separate thread, outside Tornado's own thread, to run the blocking code, so that Tornado itself is not blocked.

futures (concurrent.futures) is part of the standard library in Python 3, but in Python 2 it must be installed manually with pip install futures.

import time
import logging

import tornado.ioloop
import tornado.web
import tornado.options
from tornado import gen
from tornado.concurrent import run_on_executor
from concurrent.futures import ThreadPoolExecutor

tornado.options.parse_command_line()


class MainHandler(tornado.web.RequestHandler):
    @tornado.web.asynchronous
    def get(self):
        self.write("Hello, world")
        self.finish()


class NoBlockingHandler(tornado.web.RequestHandler):
    # run_on_executor looks up this attribute to find the thread pool.
    executor = ThreadPoolExecutor(4)

    @run_on_executor
    def sleep(self, second):
        # Blocking call, executed in a worker thread instead of the IOLoop.
        time.sleep(second)
        return second

    @gen.coroutine
    def get(self):
        second = yield self.sleep(5)
        self.write('noBlocking Request: {}'.format(second))


def make_app():
    return tornado.web.Application([
        (r"/", MainHandler),
        (r"/noblock", NoBlockingHandler),
    ], autoreload=True)


if __name__ == "__main__":
    app = make_app()
    app.listen(8000)
    tornado.ioloop.IOLoop.current().start()

ThreadPoolExecutor is a high-level wrapper around threading in the standard library. It uses threads to make blocking functions asynchronous, which solves the problem of many libraries not supporting asynchronous operation.
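The principle can also be observed without Tornado. The following sketch is an illustration under assumptions, not the article's setup: it uses Python 3 and asyncio's run_in_executor in place of Tornado's run_on_executor, and the tick counter is a hypothetical stand-in for other requests being served. It hands a blocking call to a worker thread and counts how often the event loop gets to run while waiting.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_sleep(seconds):
    """Stand-in for a library call that does not support async operation."""
    time.sleep(seconds)
    return seconds


async def main():
    loop = asyncio.get_running_loop()
    ticks = 0
    with ThreadPoolExecutor(max_workers=1) as executor:
        # The blocking call runs in a worker thread, not on the event loop.
        future = loop.run_in_executor(executor, blocking_sleep, 0.3)
        while not future.done():
            ticks += 1                 # the loop is still free to do other work
            await asyncio.sleep(0.05)
        result = await future
    return result, ticks


result, ticks = asyncio.run(main())
```

Because the event loop keeps ticking while the worker thread sleeps, ticks ends up greater than zero, which would not happen if blocking_sleep were called directly on the loop.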

The problem that comes with it is that if a large number of thread-based asynchronous functions are used for high-load work, the Tornado process will perform poorly and respond slowly. This only trades one problem for another.
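One way this slowdown can show up, assuming a fixed-size pool such as the ThreadPoolExecutor(4) in the example above, is pool saturation: once every worker thread is busy, further calls wait in a queue. The sketch below is a standard-library-only illustration in Python 3. The helper name run_batch and the 0.2-second duration are made up for the demonstration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_sleep(seconds):
    """Stand-in for a slow, blocking library call."""
    time.sleep(seconds)
    return seconds


def run_batch(workers, jobs, seconds=0.2):
    """Run `jobs` blocking calls on `workers` threads and time the batch."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as executor:
        results = list(executor.map(blocking_sleep, [seconds] * jobs))
    return results, time.perf_counter() - start


# Enough threads: all four calls overlap.
_, roomy = run_batch(workers=4, jobs=4)

# Too few threads: two calls have to wait for a free worker.
_, saturated = run_batch(workers=2, jobs=4)
```

With only two workers the batch needs two rounds, so it takes about twice as long. In a web server, that waiting time is added to the response time of the queued requests.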

So for small workloads this approach works well and lets Tornado run asynchronously without blocking.

But if you know that a function does high-load work, you should take another approach: combine Tornado with Celery to achieve asynchronous non-blocking behavior.

Celery-based asynchronous programming

Celery is a simple, flexible and reliable distributed system that processes a large number of messages. It focuses on a task queue for real-time processing and also supports task scheduling.

Celery is not the only choice. You can choose other task queues to implement it, but Celery is written in Python and can be used quickly. At the same time, Celery provides an elegant interface and is easy to integrate with the Python Web framework.

It can be combined with Tornado through tornado-celery, a package that already wraps Celery for Tornado and can be used directly.

In actual testing, because tornado-celery has not been updated for a long time, the request blocks forever and never returns. The workaround is:

  1. Downgrade celery to 3.1: pip install celery==3.1
  2. Downgrade pika to 0.9.14: pip install pika==0.9.14

import time
import logging

import tornado.ioloop
import tornado.web
import tornado.options
from tornado import gen

import tcelery, tasks

tornado.options.parse_command_line()
tcelery.setup_nonblocking_producer()


class MainHandler(tornado.web.RequestHandler):
    @tornado.web.asynchronous
    def get(self):
        self.write("Hello, world")
        self.finish()


class CeleryHandler(tornado.web.RequestHandler):
    @gen.coroutine
    def get(self):
        # Hand the task to a Celery worker and resume when the result arrives.
        response = yield gen.Task(tasks.sleep.apply_async, args=[5])
        self.write('CeleryBlocking Request: {}'.format(response.result))


def make_app():
    return tornado.web.Application([
        (r"/", MainHandler),
        (r"/celery-block", CeleryHandler),
    ], autoreload=True)


if __name__ == "__main__":
    app = make_app()
    app.listen(8000)
    tornado.ioloop.IOLoop.current().start()

# tasks.py: the Celery tasks imported by the Tornado application above
import os
import time

from celery import Celery
from tornado import gen

celery = Celery("tasks", broker="amqp://")
celery.conf.CELERY_RESULT_BACKEND = os.environ.get('CELERY_RESULT_BACKEND', 'amqp')


@celery.task
def sleep(seconds):
    # Runs inside the Celery worker process, not in the Tornado process.
    time.sleep(float(seconds))
    return seconds


if __name__ == "__main__":
    celery.start()

Celery's worker runs in a separate process, independent of the Tornado process, so it does not affect Tornado's own efficiency. For complex tasks it is more efficient than the thread-based approach.

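Summary

Method        | Advantage          | Disadvantage                                | Availability
--------------|--------------------|---------------------------------------------|-------------
gen.coroutine | Simple and elegant | Needs asynchronous library support          | ★★☆☆☆
Thread        | Simple             | May affect performance                      | ★★★☆☆
Celery        | Good performance   | Complex to operate, requires older versions | ★★★☆☆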

At present there is no single best asynchronous non-blocking programming model. The available asynchronous libraries are limited to the most frequently used ones, and it is difficult for an individual to write an asynchronous library.

The recommendation is to use the thread and Celery approaches for asynchronous programming: run lightweight work in threads and complex work in Celery. Of course, if an asynchronous library is available, using it is the best option.

In Python 3, Tornado can be set to asyncio mode, so that libraries compatible with asyncio can be used. This should be the future direction.
