Django sync or async, that's the question
January 21, 2025
In the current world of cloud-hosted services, integrating such a service into your project through an API is more the rule than the exception. When you integrate such a service into your "sync" Django project, you can run into concurrency issues pretty fast. Django now has an async, coroutine-based stack; would using it be beneficial? Let's try to find out by diving into the Python world of concurrency handling and by testing (and benchmarking) a test application with both "sync" and "async" solutions.
Know thy options
We first need a bit of background information and to understand the terminology and implications before we can make a considered assessment. "Async" is the newest kid on the block, but Django's codebase still has a lot of "sync" parts (like all generic views and forms). Async also adds complexity, and the benefits might not outweigh the extra burden. But what is this async all about? It's all about handling concurrency.
Handling concurrency
For a web application to handle concurrency, there are roughly two flavours:
A web server that runs one or more processes that spawn threads for each request:

A "no threads" single web server, running callbacks or coroutines via an event loop:

Combinations do exist too: an event loop with native threads (or vice versa), multiprocessing with multiple event loops with coroutines or callbacks / events, greenlets (or "green" threads), task queues, etc., but for this blog post we'll stick with these two, as these are the most distinct forms of (web-based) concurrency handling. Alternative definitions used for these two forms are preemptive multitasking and cooperative multitasking:
- With preemptive multitasking, the OS kernel decides how much time each task (thread) gets. It will mostly divide the available time equally and can give precedence to important events. When using (POSIX) processes and threads, you are utilising preemptive multitasking.
- With cooperative multitasking, the application process / task itself decides when to give control to another task. When a task suspends itself (when invoking yield or await), the next task is allowed to execute. It's called "cooperative" because a task needs to cooperate with the others by suspending explicitly, saying "now another task can use the CPU". Coroutine-based programming is a form of cooperative multitasking.
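To make the "explicit suspend" concrete, here is a minimal asyncio sketch (illustrative, not part of the test application discussed later). Two coroutines run on a single thread; each await hands control back to the event loop so the other can proceed:

```python
import asyncio

order = []

async def task(name, delay):
    # Each await is an explicit suspension point: the coroutine
    # cooperates by handing control back to the event loop.
    await asyncio.sleep(delay)
    order.append(name)

async def main():
    # Both tasks run concurrently on a single thread; the event
    # loop switches between them at each suspension point.
    await asyncio.gather(task("slow", 0.02), task("fast", 0.01))

asyncio.run(main())
print(order)  # ['fast', 'slow'] — the faster task finishes first
```

Neither task is ever interrupted mid-statement, which is exactly the difference from preemptive threads: switches happen only where the code says await.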
Let's get back to the question
Sync
Writing code for a Django application has mostly been done "sync": you expect the web application server (like uWSGI or Gunicorn) to take care of concurrency with multiple processes and/or threads. The complete request / response loop is blocking: a thread or process is literally blocked until all the code is executed and a response is given.
This includes waiting for results like a database query or network call. You can still do all you want in parallel in your application, though, like spawning multiple threads, but the dedicated thread handling your request is blocked until the response. The slower the processing of your requests, the fewer requests can be handled simultaneously.
In the Python world, this interface between your application and the web application server (including the request / response loop) is standardised as WSGI.
GIL
Note that in Python, spawned threads cannot really run concurrently, which has to do with the GIL (Global Interpreter Lock), unless you are using C extensions that release the GIL or the new free-threaded mode in Python 3.13. This means multithreaded applications can't really use multiple CPU cores (in a single process) either, as only one thread can be active at a time. However, when the threads are mainly waiting for IO, it's still beneficial to use threads.
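To illustrate why threads still help for IO-bound work despite the GIL, here is a small stdlib-only sketch (the io_bound function is a stand-in for a blocking network call, which releases the GIL while it waits):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def io_bound(seconds):
    # time.sleep releases the GIL, just like a socket waiting for
    # a response, so other threads can run in the meantime.
    time.sleep(seconds)
    return seconds

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=3) as executor:
    results = list(executor.map(io_bound, [0.1, 0.1, 0.1]))
elapsed = time.perf_counter() - start

# The three 100 ms waits overlap: total time is ~100 ms, not 300 ms.
print(f"{elapsed:.2f}s for {results}")
```

For CPU-bound work the picture flips: the GIL lets only one thread execute Python bytecode at a time, so the same experiment with busy loops would take roughly the serial time.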
Async
With async programming, you are aware of concurrency in the application layer. In other words, you write code that is prepared for and aware of being concurrent, and most importantly, of when. This can be event-driven (such as good old Twisted), with "implicit" coroutine libraries (such as Gevent), or with asyncio and the async / await keywords (such as Starlette and FastAPI). The asyncio library (formerly known as Tulip) was standardised in Python 3.4, and the async / await keywords followed in 3.5. Django received async support (built on asyncio) in version 3.1, and this was expanded further in 4.0, 4.1, 4.2 and 5.1.
WSGI was not suitable for async due to its blocking nature; therefore, a new standard was developed for Python applications: ASGI.
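The difference between the two interfaces is easiest to see in minimal hand-written callables (a sketch for illustration; in practice Django generates these for you, and the app names here are made up):

```python
# WSGI: a synchronous callable. The worker thread is blocked
# until the full response has been produced.
def wsgi_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello from WSGI"]

# ASGI: an async callable driven by an event loop. It can await
# slow IO without blocking the worker, and the same interface
# also covers non-HTTP protocols like WebSockets.
async def asgi_app(scope, receive, send):
    assert scope["type"] == "http"
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"text/plain")],
    })
    await send({"type": "http.response.body", "body": b"hello from ASGI"})
```

Because the ASGI callable communicates through awaitable send / receive events rather than a single blocking return value, one worker can interleave many in-flight requests.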
Django channels
Django Channels is an add-on package for Django for handling WebSockets and other non-HTTP protocols (although it can handle HTTP too). It depends on Daphne, the first ASGI server, which is based on Twisted. The first version of Django Channels in 2016 did not make use of asyncio, nor did the first version of Daphne. With the release of Daphne 2.0.0 and Channels 2.0.0 (2018), both were rewritten to use asyncio and async / await.
Implicit and explicit coroutines
When you are writing "sync" code and use a library like Gevent (with or without monkey patching), you are using coroutines in an implicit manner. In the case of monkey patching, you are not even aware of the fact that coroutines are scheduled on an event loop. So while Gevent can be a terrific solution, it can also be very difficult to tell when and why things go wrong.
This is especially true when you run code that does no IO at all: it never suspends, which can make your whole application block. When you use the async / await keywords in conjunction with asyncio (or any other event loop engine), you are using coroutines explicitly: you can see exactly where in your codebase a coroutine will suspend. This makes it easier to spot problems and, of course, more obvious to see what is going on.
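The failure mode becomes visible in a small asyncio sketch (illustrative, not from the test application): a coroutine that blocks without awaiting stalls the entire event loop, while explicit awaits let tasks interleave:

```python
import asyncio
import time

async def cooperative():
    # Explicit suspension point: the event loop can run other
    # tasks while this one waits.
    await asyncio.sleep(0.1)

async def blocking():
    # No await here: time.sleep blocks the *entire* event loop,
    # freezing every other coroutine for the duration.
    time.sleep(0.1)

async def main():
    start = time.perf_counter()
    await asyncio.gather(cooperative(), cooperative())
    concurrent = time.perf_counter() - start

    start = time.perf_counter()
    await asyncio.gather(blocking(), blocking())
    serial = time.perf_counter() - start
    return concurrent, serial

concurrent, serial = asyncio.run(main())
# Two cooperative sleeps overlap (~0.1s); two blocking sleeps
# run back to back (~0.2s).
print(f"awaiting: {concurrent:.2f}s, blocking: {serial:.2f}s")
```

With monkey-patched Gevent the blocking variant would be silently rewritten into the cooperative one, which is convenient but hides exactly this distinction.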
asyncio
Asyncio was not only received with applause when it was released, and it still receives a lot of criticism. The package was added to the standard library before the async / await keywords were, so it had backward compatibility issues from the start. The package itself has been heavily improved (and changed) in newer Python versions, notably 3.8, 3.9, 3.11 and 3.13. A lot of improvements have been added thanks to developments in alternative async libraries like Trio (and Curio). AnyIO is now a popular library, as it exposes a single API that works with both the asyncio and Trio libraries.
Benchmarks
So now we know a bit about all the technologies and how they work; let's run some benchmarks to see how the different stacks respond. Note that these are not 100% true scientific benchmarks. The benchmarks mostly show the characteristics of the different stacks in the context of a particular setup.
The test application
We are going to benchmark a simple Django view that calls an "external" API which returns a random country. This "external" API is slow on purpose (300 ms latency by default), and we call it multiple times to test whether the view performs these API calls concurrently. If it does, the request should only be as slow as the slowest API call, as the calls are made in parallel.
The test application (and the tests) are available on GitHub in case you're interested in the full source code or if you want to run the benchmarks yourself: https://github.com/maerteijn/django-sync-or-async
The configurations we are going to test this with are:
- uWSGI, 1 process, 2 threads
- uWSGI, 1 process, 100 threads
- uWSGI, 1 process, gevent
- Gunicorn, 1 process, 100 threads
- Gunicorn, 1 process, gevent
- Uvicorn, 1 process, async
The API view (with asyncio, so IO won't be a bottleneck):
```python
async def api(request, ms=300):
    await asyncio.sleep(delay=ms / 1000)
    return JsonResponse(random.choice(exampledata))
```
The "sync" view, which consumes the API, is deployed with WSGI (uWSGI and Gunicorn):
```python
def sync_view(request, ms=300):
    api_urls = (
        f"{API_ENDPOINT}/{ms}/",
        f"{API_ENDPOINT}/{ms * 2}/",
        f"{API_ENDPOINT}/{int(ms / 2)}/",
    )
    client = httpx.Client()
    with ThreadPoolExecutor() as executor:
        futures = executor.map(lambda url: client.get(url), api_urls)
        country = next(futures).json()
    return render(
        request,
        "django_sync_or_async/index.html",
        dict(country=country),
    )
```
ThreadPoolExecutor
The standard ThreadPoolExecutor uses system threads to make the API requests in parallel. When using Gevent, threads are monkey patched to be cooperative, so greenlets will be spawned when using the ThreadPoolExecutor, without any code change.
The "async" view, which consumes the API, deployed with ASGI (Uvicorn):
```python
async def async_view(request, ms=300):
    api_urls = (
        f"{API_ENDPOINT}/{ms}/",
        f"{API_ENDPOINT}/{ms * 2}/",
        f"{API_ENDPOINT}/{int(ms / 2)}/",
    )
    client = httpx.AsyncClient()
    results = await asyncio.gather(*[client.get(url) for url in api_urls])
    await client.aclose()
    country = results[0].json()
    return render(
        request,
        "django_sync_or_async/index.html",
        dict(country=country),
    )
```
Response time
Both views will call the API three times, with the following "delays":
- 300 ms
- 600 ms
- 150 ms
So the minimum response time will (should) be 600 ms, as we make these calls to the external API in parallel, plus some extra overhead for rendering the view.
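The arithmetic behind that expectation can be sketched in a couple of lines:

```python
# Latencies of the three API calls made by each view (ms)
delays = [300, 600, 150]

# Sequential calls would take the sum of all latencies...
sequential = sum(delays)   # 1050 ms

# ...but since the calls run in parallel (ThreadPoolExecutor in
# the sync view, asyncio.gather in the async view), the view is
# only as slow as the slowest call.
parallel = max(delays)     # 600 ms

print(f"sequential: {sequential} ms, parallel: {parallel} ms")
```

Any benchmarked minimum close to 600 ms therefore confirms the view really does fan out its API calls concurrently; a minimum near 1050 ms would mean it degraded to sequential calls.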
Locust
The benchmarks are done with Locust, a simple and Python-friendly (open source) benchmark tool. We start with 10 concurrent users and slowly increase to 100. We measure the response times of the view and also how many requests per second (RPS) the selected configuration can handle.
The benchmarks are performed on an Apple MacBook Air with an M2 processor.
uWSGI, 1 process, 2 threads

uWSGI, 1 process, 100 threads

uWSGI, 1 process, gevent

Gunicorn, 1 process, 100 threads

Gunicorn, 1 process, gevent

Uvicorn, 1 process, async

Results
So let's put all benchmark results in a single overview:
| Configuration | Min(ms) | 95%-ile(ms) | Average(ms) | Max RPS |
|---|---|---|---|---|
| uWSGI, 1 process, 2 threads | 783 | 22000 | 12107.87 | 3 |
| uWSGI, 1 process, 100 threads | 613 | 950 | 782.45 | 131 |
| uWSGI, 1 process, gevent | 614 | 1100 | 857.1 | 110 |
| Gunicorn, 1 process, 100 threads | 612 | 1000 | 716.69 | 140 |
| Gunicorn, 1 process, gevent | 619 | 1400 | 1038.78 | 91 |
| Uvicorn, 1 process, async | 616 | 1000 | 853.56 | 115 |
Some things that stand out (also by looking at the graphs):
- As expected, the performance with uWSGI configured with 1 process and 2 threads is terrible, with response times over 22 seconds and only 3 RPS, due to the slow API.
- Gunicorn with 1 process and 100 threads is the best-performing configuration overall with the test application, something I wouldn't have expected up front. uWSGI with 1 process and 100 threads is not far behind, though.
- uWSGI with Gevent performs faster than Gunicorn with Gevent.
- I'm pretty impressed by Gevent: without any code change, all IO is performed using cooperative multitasking, and it performs pretty well.
- Uvicorn is not as fast as the threaded counterparts, but it is slightly faster than the implicit Gevent coroutine counterparts. Its performance is very reasonable; ASGI / async is mostly about handling many connections efficiently, not about pure throughput.
Running benchmarks
Performance characteristics really differ per environment. If you are deploying in the cloud (probably yes), the available CPU time can be much lower compared to a "bare metal" machine. Containerized applications also behave differently on different platforms, so when optimising the configuration for your setup, do not just copy / paste the "recommended" settings: run benchmarks for your own situation too.
Final notes
So how can we answer the question "Django sync or async" when we want to integrate an external service?
I think the answer is: it depends. There are many ways of making your application concurrent. Using the explicit coroutine code style (or "coloured functions") in combination with asyncio is not required per se, as we have already seen with Gevent or with a ThreadPoolExecutor. Being very explicit about when task switching happens in your code using async / await can be beneficial for some applications, but certainly not all.
Note that concurrency will always be difficult, no matter how you program it. For CRUD applications or any other Django project where you're not trying to build the next social media platform with millions of users, using "sync Django" (and the many libraries out there) is perfectly fine. uWSGI or Gunicorn with many workers and threads is an excellent option to make it perform at a smaller scale, just as Gevent is.
Use asyncio with native async views in Django when you really expect thousands of concurrent users combined with a lot of slow IO, when you use WebSockets, streaming or the like, or when you are building networking applications. Be prepared to carry the "burden": with async, you won't be able to use the enormous library of existing "sync coloured" functions (well, technically you can, but not without performance penalties). The generic views in Django do not have async variants either, so you'll need to build your own.
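As a sketch of that "technically you can" escape hatch: asyncio can offload a blocking "sync coloured" call to a worker thread with asyncio.to_thread (Django's own asgiref offers the similar sync_to_async). The thread hop is part of the performance penalty mentioned above; legacy_sync_call here is a hypothetical stand-in for an existing blocking library function:

```python
import asyncio
import time

def legacy_sync_call(x):
    # A hypothetical "sync coloured" function, e.g. a library
    # call that performs blocking IO.
    time.sleep(0.05)
    return x * 2

async def main():
    # asyncio.to_thread (Python 3.9+) runs the blocking calls in
    # worker threads, so the event loop stays responsive and the
    # two calls overlap instead of running back to back.
    return await asyncio.gather(
        asyncio.to_thread(legacy_sync_call, 1),
        asyncio.to_thread(legacy_sync_call, 2),
    )

results = asyncio.run(main())
print(results)  # [2, 4]
```

This keeps an async view usable with sync libraries, but every offloaded call pays for a thread switch, which is why an application doing this everywhere tends to lose the efficiency benefits async promised.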
The world of async is not black and white but full of colours 🌈.
Links
- 🔗 Full source code repository, in case you want to check out the code used in this article and/or run the performance tests yourself.
- 💬 Discussion page on GitHub; any questions or remarks are welcome!
If you would like to read more
Many great Pythonistas have written about this subject, some quite a while ago, but most of it is still relevant today and was of great value while writing this blog post.
- What Color is Your Function?
- The Function Colour Myth
- asyncio: We Did It Wrong
- Asyncio, Twisted, Tornado, and gevent walk into a bar…
- Think twice before using asyncio in Python
- Unyielding
- Notes on structured concurrency, or: Go statement considered harmful
- Python 3.10 native coroutine asyncio in practice
- Python Asyncio: The Complete Guide
- Some thoughts on asynchronous API design in a post-async/await world
- How to fit triangles into squares—run blocking functions in the event loop
- Why Taskgroup and Timeout Are so Crucial in Python 3.11
- Save the day with gevent