Running Python tasks in the background

In the past few months I’ve built a couple of web apps using Python and Flask where the main task of the program takes a significant time to run. In one app, a customer enters their email address and clicks a button, and the app generates some data files and emails them to the customer. That takes 30 seconds or so, but in a simple Flask app it means the customer clicks the button and then has to wait the whole time for the task to finish before the next page loads – which is far too long. In other cases it’s not a user entering details but a webhook passing data to the web application – and the webhook requires a response within 5 seconds. So we need to either load the next page immediately, telling the customer they will get an email shortly, or respond to the webhook immediately, and then run the actual task in the background.

In Python you can do this with threading, where each ‘thread’ is a piece of code that can run at the same time, i.e. asynchronously. (See The Raspberry Pi Education Manual page 108 for a demo, or the short sketch after the list below.)

However, there is a more powerful way to do this using a task queue. This brings the added bonus of being able to run the task queue and the customer-facing web app on separate servers (so the task queue can chug away doing its thing without affecting the performance of the website being served to the end customer). There are a few options for this, but the most popular Python package seems to be Celery. Like all good Python packages, the name tells you nothing about what it does, but basically Celery lets you call a standard Python function which gets added to a task queue and run in the background, enabling us to return the web page request quickly. You’ll also need to run the task queue itself (the ‘broker’), such as Redis. It’s a little confusing at first, but in the end you’ll have three programs running, all of which could be on the same server or on different servers:

1. The Flask web application, which runs the Celery client, allowing you to add a background task to the task queue.
2. The task queue itself, such as Redis, which does nothing except hold the details of the tasks to be executed.
3. The Celery worker, which continuously grabs tasks from the task queue and actually executes them.
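
As a quick illustration of the threading approach, here’s a minimal sketch – the function names (send_files and handle_request) are hypothetical, just to show the pattern:

import threading

def send_files(email):
    # stand-in for the slow work: generate the data files and email them
    print('sending files to', email)

def handle_request(email):
    # start the slow task on its own thread so we don't wait for it
    thread = threading.Thread(target=send_files, args=(email,))
    thread.start()
    # respond immediately while send_files carries on in the background
    return 'task started'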

Basic Example

This is a basic, single-file example of using Celery with Flask.

Installation:

You’ll need to install Redis, and the flask and celery Python packages (pip install flask celery).
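
Redis itself usually comes from your system’s package manager – on a Debian/Ubuntu machine, for example (the package name varies by platform):

$ sudo apt-get install redis-server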

Python Code: (app.py)


# Import required modules
from flask import Flask
from celery import Celery

# Create the Flask instance
app = Flask(__name__)

# Create the Celery instance, referring to the task queue (or broker) as redis
celery = Celery(app.name, broker='redis://localhost:6379/0')

# This is the celery task that will be run by the worker in the background
# We need to give it the celery decorator to denote this
@celery.task
def my_background_task(arg1, arg2):
    # some long running task would go here; this simple example
    # just multiplies the two arguments and returns the result
    return arg1 * arg2
    
# Create a flask route - this is a simple GET request
@app.route('/', methods=['GET'])
def index():
    # add the background task to the task queue;
    # arguments for the task: arg1=10, arg2=20
    # the optional countdown argument delays execution by 60 seconds
    task = my_background_task.apply_async(args=[10, 20], countdown=60)

    # Flask returns this message to the browser
    return 'task started'

# Run the Flask development server when invoked as `python app.py`
if __name__ == '__main__':
    app.run()
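
As an aside, Celery also offers a shorthand: my_background_task.delay(10, 20) is equivalent to my_background_task.apply_async(args=[10, 20]), just without support for the extra options like countdown.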

Run the three programs:

$ redis-server
$ celery -A app.celery worker --loglevel=info
$ python app.py

That’s it. Point your browser at wherever you’re running the Flask app and it will return the ‘task started’ message immediately; 60 seconds later, the Celery worker will pick the task off the queue and complete it.
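
As written, the example fires and forgets the task. If you want to check on a task afterwards, Celery also needs a result backend configured; here’s a minimal sketch, assuming you reuse the same Redis instance as both the broker and the backend:

# configure Redis as the result backend as well as the broker
celery = Celery(app.name,
                broker='redis://localhost:6379/0',
                backend='redis://localhost:6379/0')

task = my_background_task.apply_async(args=[10, 20])
task_id = task.id  # a unique id you could store and look up later

# later on (even in a different request), check the task's status
result = my_background_task.AsyncResult(task_id)
print(result.state)   # e.g. 'PENDING', then 'SUCCESS' when done
print(result.result)  # 200, once the task has finished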

Miguel Grinberg has a more detailed tutorial on Celery showing how you can get updates on the progress of your background task, which is well worth a read.