Async 101 and why it matters Link to heading

Today, we will explore the concept of asynchronous programming in Python, and how it can improve the performance and scalability of web applications. We will also compare two different ways of implementing a web server in Python: one using synchronous functions, and another using asynchronous functions. Finally, we will use a tool called Apache Bench to measure and compare the performance of the two servers to see how they perform differently.


What is asynchronous programming? Link to heading

Asynchronous programming is a paradigm that lets us write concurrent code that runs without blocking the main thread of execution. Instead of waiting for each task to finish before starting the next one, we can interleave many tasks, making progress on one while another is waiting. This improves the responsiveness and efficiency of our applications, especially when dealing with I/O operations such as network requests, database queries, or file access. Note that in Python's asyncio this is concurrency, not parallelism: a single thread switches between tasks at await points rather than running them simultaneously on multiple cores.
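To make this concrete, here is a minimal, self-contained sketch. The fake_io coroutine is a stand-in we invented for a real I/O call such as a network request:

```python
import asyncio
import time

async def fake_io(name, delay):
    # stand-in for an I/O-bound operation such as a network request
    await asyncio.sleep(delay)
    return name

async def main():
    start = time.monotonic()
    # both waits overlap, so the total time is ~0.2 s, not ~0.4 s
    results = await asyncio.gather(fake_io('a', 0.2), fake_io('b', 0.2))
    return results, time.monotonic() - start

results, elapsed = asyncio.run(main())
print(results, f'{elapsed:.2f}s')  # → ['a', 'b'] and roughly 0.2 s
```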

Why it matters Link to heading

Asynchronous programming is particularly useful for web applications, which often need to handle multiple requests from different clients simultaneously. If we use synchronous programming, we would have to process each request one by one, blocking the main thread until the request is completed. This would result in poor performance and low throughput, as the server would not be able to serve other requests while waiting for I/O operations. Moreover, we would need to use multiple threads or processes to handle concurrent requests, which would increase the complexity and overhead of our application.

On the other hand, with asynchronous programming we can process each request in a non-blocking way, using coroutines or callbacks to handle the I/O operations. The main thread can switch to a different task as soon as the current one is waiting on I/O, and resume it when the I/O is ready. The result is better performance and higher throughput: the server can serve more requests with fewer resources. Moreover, we do not need multiple threads or processes, since concurrency is handled within a single thread by an event loop that schedules the tasks.
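The switching only happens when a task actually yields to the event loop. The following small sketch (the coroutine names are ours) contrasts a blocking wait, which stalls the whole loop, with an awaited wait, which lets other tasks run in the meantime:

```python
import asyncio
import time

async def blocking_wait():
    time.sleep(0.2)            # blocks the event loop: no task switching

async def cooperative_wait():
    await asyncio.sleep(0.2)   # yields to the loop: other tasks run meanwhile

async def timed(task_func):
    # run two copies of the task concurrently and measure the total time
    start = time.monotonic()
    await asyncio.gather(task_func(), task_func())
    return time.monotonic() - start

blocking = asyncio.run(timed(blocking_wait))        # ~0.4 s: the waits serialize
cooperative = asyncio.run(timed(cooperative_wait))  # ~0.2 s: the waits overlap
print(f'blocking: {blocking:.2f}s, cooperative: {cooperative:.2f}s')
```

This is exactly why a blocking call inside an async handler is harmful: it freezes every other task on the loop.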

Simple web server example Link to heading

To illustrate the difference between synchronous and asynchronous programming, let us implement a simple web server in Python, using the built-in modules socket and asyncio.

Synchronous version Link to heading

The synchronous version of the web server uses the socket module to create a TCP socket, bind it to port 7080, and listen for incoming connections. It then uses a while loop to accept and handle each connection in a blocking way, using the recv and sendall methods of the socket object. The handle_get function reads the request from the client, parses the HTTP method and path, and sends back a short message as a response.

Here is the code for the synchronous version server_sync.py:

import socket
import traceback 

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

server_address = ('0.0.0.0', 7080)
print('Starting server on %s port %s' % server_address)
sock.bind(server_address)

sock.listen()

# synchronous function -- blocking
def handle_get(client, client_address):
    request_data = client.recv(1024)
    print(f'Request from {client_address}: {request_data.decode()}')

    request_line = request_data.split(b'\r\n')[0]
    tokens = request_line.split()
    if len(tokens) == 3 and tokens[0] == b'GET':
        response_header = b'HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n'
        response_body = b'<html><body><h1>Hello %s</h1></body></html>' % client_address[0].encode()
        # sendall retries until the whole buffer is sent; send may write only part
        client.sendall(response_header)
        client.sendall(response_body)
    else:
        response_header = b'HTTP/1.1 400 Bad Request\r\n\r\n'
        client.sendall(response_header)

    client.close()


while True:
    client, client_address = sock.accept()

    try:
        handle_get(client, client_address)
    except Exception as e:
        traceback.print_exc() 
        print('Error:', e)
        client.close()
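The request parsing in handle_get is easy to sanity-check on its own. Here is the same logic applied to a hand-written raw request (the example request bytes are ours):

```python
# a standalone check of the request-line parsing used in handle_get above
raw_request = b'GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n'

request_line = raw_request.split(b'\r\n')[0]  # first line of the request
tokens = request_line.split()                 # [method, path, version]
print(tokens)  # → [b'GET', b'/index.html', b'HTTP/1.1']
```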

Asynchronous version Link to heading

The asynchronous version of the web server is very similar to the synchronous version, but it uses the asyncio module to create an event loop, runs a TCP server on port 8080, and handles each connection in a non-blocking way. The handle_get function is defined with the async keyword and uses the await keyword to read the request from the client, parse the HTTP method and path, and write back a response.

Here is the code for the asynchronous version server_async.py:

import socket
import asyncio
import traceback 

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

server_address = ('0.0.0.0', 8080)
print('Starting server on %s port %s' % server_address)
sock.bind(server_address)

sock.listen()
# Set the socket to non-blocking mode
sock.setblocking(False)

# Create a new event loop (calling asyncio.get_event_loop() without a
# running loop is deprecated in recent Python versions)
loop = asyncio.new_event_loop()

# asynchronous function -- non-blocking
async def handle_get(client, client_address):
    request_data = await loop.sock_recv(client, 1024)
    print(f'Request from {client_address}: {request_data.decode()}')

    request_line = request_data.split(b'\r\n')[0]
    tokens = request_line.split()
    if len(tokens) == 3 and tokens[0] == b'GET':
        response_header = b'HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n'
        response_body = b'<html><body><h1>Hello %s</h1></body></html>' % client_address[0].encode()
        await loop.sock_sendall(client, response_header)
        await loop.sock_sendall(client, response_body)
    else:
        response_header = b'HTTP/1.1 400 Bad Request\r\n\r\n'
        await loop.sock_sendall(client, response_header)

    client.close()


async def safe_handle(client, client_address):
    try:
        await handle_get(client, client_address)
    except Exception as e:
        traceback.print_exc()
        print('Error:', e)
        client.close()


async def run_server():
    while True:
        client, client_address = await loop.sock_accept(sock)

        # create_task only schedules the coroutine, so exceptions raised inside
        # handle_get must be caught in the task itself (safe_handle), not here
        loop.create_task(safe_handle(client, client_address))


loop.run_until_complete(run_server())
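As an aside, the same server can be written more compactly with asyncio's higher-level streams API, which manages the listening socket and per-connection reads and writes for us. A hedged sketch, equivalent in behavior to the hand-rolled version above:

```python
import asyncio

async def handle_client(reader, writer):
    # read the request and apply the same GET check as handle_get
    request_data = await reader.read(1024)
    tokens = request_data.split(b'\r\n')[0].split()
    if len(tokens) == 3 and tokens[0] == b'GET':
        writer.write(b'HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n'
                     b'<html><body><h1>Hello</h1></body></html>')
    else:
        writer.write(b'HTTP/1.1 400 Bad Request\r\n\r\n')
    await writer.drain()
    writer.close()

async def main():
    server = await asyncio.start_server(handle_client, '0.0.0.0', 8080)
    async with server:
        await server.serve_forever()

# asyncio.run(main())  # uncomment to serve
```

asyncio.start_server hides the accept loop and the non-blocking socket setup entirely, which is usually what you want outside of a tutorial.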

Performance benchmark Link to heading

To compare the performance of the two servers, we can use Apache Bench, a command-line tool that generates load against a web server. It sends a specified number of requests with a specified level of concurrency and reports statistics on response time, throughput, and errors. The utility is pre-installed on macOS; on Ubuntu, you can install it via:

apt install -y apache2-utils

To use Apache Bench, we need to specify the following parameters:

  • -n: the number of requests to perform
  • -c: the number of concurrent requests to perform
  • -r: don’t exit on socket receive errors
  • -l: accept variable document length
  • url: the URL of the web server to test

Let’s fire up the synchronous version of the server and run the load test.

# run the synchronous version of the server on port 7080 from the server side
python server_sync.py

# run apache bench from a different computer on the network
ab -r -n1000 -c150 -l http://SERVER_IP:7080/
...
Concurrency Level:      150
Time taken for tests:   1.890 seconds
Complete requests:      1000
Failed requests:        130
   (Connect: 0, Receive: 65, Length: 0, Exceptions: 65)
Total transferred:      91630 bytes
HTML transferred:       50490 bytes
Requests per second:    529.13 [#/sec] (mean)
Time per request:       283.484 [ms] (mean)
Time per request:       1.890 [ms] (mean, across all concurrent requests)
Transfer rate:          47.35 [Kbytes/sec] received
...

Among other things, note the 130 failed requests and the throughput of 529 requests per second. Now, let’s do the same with the asynchronous server.

# run the asynchronous version of the server on port 8080 from the server side
python server_async.py

# run apache bench from a different computer on the network
ab -r -n1000 -c150 -l http://SERVER_IP:8080/
...
Concurrency Level:      150
Time taken for tests:   1.418 seconds
Complete requests:      1000
Failed requests:        44
   (Connect: 0, Receive: 22, Length: 0, Exceptions: 22)
Total transferred:      95844 bytes
HTML transferred:       52812 bytes
Requests per second:    705.41 [#/sec] (mean)
Time per request:       212.641 [ms] (mean)
Time per request:       1.418 [ms] (mean, across all concurrent requests)
Transfer rate:          66.03 [Kbytes/sec] received
...

This time, we have 44 failed requests and a throughput of 705 requests per second. The lower failure rate and higher throughput clearly indicate that the async version outperforms the sync version. In a real-world scenario where the server sends larger responses to clients, the synchronous version would suffer even more, as I/O becomes an even bigger bottleneck.

Conclusion Link to heading

In this post, we have learned about asynchronous programming in Python and how it can improve the performance and scalability of web applications. We implemented and compared two web servers, one using synchronous functions and one using asynchronous functions, and used Apache Bench to measure their performance. The results showed that the asynchronous server achieves higher throughput and a lower failure rate than the synchronous one, because it can handle more requests with fewer resources and without blocking the main thread. Asynchronous programming is a powerful and elegant way to write concurrent code that handles I/O efficiently.