reredis
Published AUG 19, 2020
Redis is a popular in-memory database project used worldwide, and it is a good project for programmers to learn data structures and network programming. Several years ago, I was very interested in the redis project, especially the network I/O part, so I reimplemented redis in the Rust programming language. The project is called "reredis" and is located here. At that time, the name meant "reimplement redis", because what I did was understand the C code and translate it into fancier Rust code, like wrapping the reference counting and the iterators. My network I/O code was similar to redis's, using an event-based I/O model. This is the most exciting part of redis, and it is what I redesigned later. So let's talk about that.
The word "event-based" is not clear, some may call it "asynchronous model" or some other names, but none of name are hard to understand without background. So let's go through the popular I/O models quickly. I like to use the waiter/customer model in the restaurant to simulate the server/client model. Their similarities are:

Suppose a poor restaurant hires only one waiter, and the waiter is not very smart but is very friendly to customers, so the waiter will wait for any customer to finish ordering before leaving the table, even when the customer is still thinking about what to order. On the server side, this is called the sequential model. Once the server accepts a connection, it serves that client until the client closes the connection. The code looks something like:
loop:
    client_fd = accept(listen_fd)
    while msg = read(client_fd) and len(msg) != 0:
        // ... handle client
    close(client_fd)

The problem here is that, before closing the current connection, the server cannot accept and handle other requests, even if the current client does not send anything (the server blocks on read).
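To make this concrete, here is a minimal sketch of the sequential model using Rust's standard library (Rust because that is what reredis is written in; the echo-style handler and the port are stand-ins I chose for illustration):

// A sketch of the sequential model: one client at a time, blocking reads.
use std::io::{Read, Write};
use std::net::TcpListener;

fn main() -> std::io::Result<()> {
    let listener = TcpListener::bind("127.0.0.1:6379")?; // 6379 is redis's default port
    loop {
        let (mut client, _addr) = listener.accept()?;    // blocks until a client connects
        let mut buf = [0u8; 1024];
        while let Ok(n) = client.read(&mut buf) {        // blocks on a slow client
            if n == 0 { break; }                         // client closed the connection
            // ... handle client; here we just echo the message back
            if client.write_all(&buf[..n]).is_err() { break; }
        }
        // `client` is dropped (closed) here before the next accept
    }
}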

This is not an error in itself, because clients of applications like redis often send multiple commands on one connection, and a later command might be related to an earlier one. But other requests are either queued or discarded, so many clients will experience latency or failed connections. In the restaurant model, most customers will complain if the waiter spends a long time waiting on a customer who does nothing but look at the menu.
How can we solve this if the restaurant does not want to hire extra waiters? The waiter needs to be smarter: if the current customer is not ordering anything, write down what the customer has already ordered and go to the other tables to see if anyone there is ready to order.

Therefore, any customer who has some idea of what to order can be served soon. In the server model, this is called the event-based model or the asynchronous model. The server accepts connections and puts them into an event loop, and it polls the event loop to see whether any event is active (this won't block on a single client). Actually, accepting is also an event, and the accepting event is active when the OS has already received connection requests. Other events include "some message has arrived from the client" or "the client is ready to receive some message". You might want to ask how the server knows that a connection or a message has arrived. This is because the operating system can monitor the listening socket and the connection sockets and notify the application when some socket is active; the OS provides APIs like select, poll, epoll, and kqueue for this. The server code looks like:
event_loop = create_event_loop()
add_event(event_loop, listen_fd, LISTEN_EVENT)
loop:
    event = poll_event(event_loop)
    switch event_type(event):
        LISTEN_EVENT:
            client_fd = accept(listen_fd) // won't block
            add_event(event_loop, client_fd, READ_EVENT)
        READ_EVENT:
            client_fd = get_client(event)
            msg = read(client_fd)
            // ... handle message
            // might save the msg before handling because the client
            // does not send the whole message at once
            // might add_event(event_loop, client_fd, WRITE_EVENT)
        WRITE_EVENT:
            client_fd = get_client(event)
            // ... send some message to client

The event-based model isn't famous for its readability, but it is very time-efficient because it blocks on all events instead of on only one client. It's like the waiter monitoring all customers instead of waiting for one customer. Redis and many other servers choose this model. Here is a piece of redis's early code for processing the event loop; the main part starts at line 275 (the function aeProcessEvents). At the beginning, reredis had similar code, but as I learned more about computer systems, I wanted to make reredis multi-threaded, because modern computers have multiple cores and we can utilize these cores to achieve parallelism.
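Before moving on, here is a minimal sketch of this single-threaded event loop in Rust. I use the mio crate here (assuming its 0.8 API with the os-poll and net features) as a thin wrapper over epoll/kqueue; the address, buffer size, and echo-style handling are stand-ins I chose for illustration:

// A sketch of the event-based model: one thread, many clients.
use std::collections::HashMap;
use std::io::{ErrorKind, Read, Write};

use mio::net::{TcpListener, TcpStream};
use mio::{Events, Interest, Poll, Token};

const LISTENER: Token = Token(0);

fn main() -> std::io::Result<()> {
    let mut poll = Poll::new()?;                       // create_event_loop()
    let mut events = Events::with_capacity(128);

    let mut listener = TcpListener::bind("127.0.0.1:6379".parse().unwrap())?;
    poll.registry()                                    // add_event(..., LISTEN_EVENT)
        .register(&mut listener, LISTENER, Interest::READABLE)?;

    let mut clients: HashMap<Token, TcpStream> = HashMap::new();
    let mut next_token = 1;

    loop {
        poll.poll(&mut events, None)?;                 // blocks on *all* events at once
        for event in events.iter() {
            match event.token() {
                LISTENER => loop {
                    // the listening socket is active: accept won't block
                    match listener.accept() {
                        Ok((mut stream, _addr)) => {
                            let token = Token(next_token);
                            next_token += 1;
                            poll.registry()            // add_event(..., READ_EVENT)
                                .register(&mut stream, token, Interest::READABLE)?;
                            clients.insert(token, stream);
                        }
                        Err(ref e) if e.kind() == ErrorKind::WouldBlock => break,
                        Err(e) => return Err(e),
                    }
                },
                token => {
                    // a client socket is active: read won't block
                    // (a real server would drain reads until WouldBlock)
                    let mut done = false;
                    if let Some(stream) = clients.get_mut(&token) {
                        let mut buf = [0u8; 1024];
                        match stream.read(&mut buf) {
                            Ok(0) => done = true,                        // client closed
                            Ok(n) => {
                                // ... handle message; here we just echo it back
                                stream.write_all(&buf[..n])?;
                            }
                            Err(ref e) if e.kind() == ErrorKind::WouldBlock => {}
                            Err(_) => done = true,
                        }
                    }
                    if done {
                        clients.remove(&token);                          // drop closes the fd
                    }
                }
            }
        }
    }
}

So how does a multi-threaded server work? Let's see.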
Imagine you are a very rich boss who can hire as much staff as you want to serve your customers, so you can send a new waiter to serve each customer on request. The waiters can be dumb now because you have plenty of them. On the server side, this is just like creating a new thread every time a request comes in and letting it execute the sequential handling process (sketched below). This sounds ideal, but the truth is that these threads are not truly parallel, because your CPU has a limited number of cores, and the OS switches the threads' execution on the cores. That is actually not bad, because blocking threads don't waste much CPU time, but switching threads in the OS is slow, and tons of threads can consume a lot of system resources. So many people choose to use thread pools to limit the number of threads a program uses. But then the slow-client issue appears again, because each thread still executes the sequential handling, and they can all block at the same time. This is like having several waiters to serve customers, but all of them are facing a customer who has no idea what to order.
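Here is the thread-per-connection version mentioned above, again as a minimal Rust sketch with an echo-style handler standing in for real command handling:

// One OS thread per connection: each thread runs the sequential loop.
use std::io::{Read, Write};
use std::net::{TcpListener, TcpStream};
use std::thread;

fn handle(mut client: TcpStream) {
    let mut buf = [0u8; 1024];
    // this thread may block on read, but only its own client waits
    while let Ok(n) = client.read(&mut buf) {
        if n == 0 { break; } // client closed the connection
        if client.write_all(&buf[..n]).is_err() { break; }
    }
}

fn main() -> std::io::Result<()> {
    let listener = TcpListener::bind("127.0.0.1:6379")?;
    for stream in listener.incoming() {
        let client = stream?;
        thread::spawn(move || handle(client)); // a new "waiter" per customer
    }
    Ok(())
}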
To fix this, all waiters need to be smarter. Each of them needs to poll all tables to serve the customers who are ready, and record something if the current customer blocks but has already ordered something. However, all waiters polling together would result in chaos, so we can assign one waiter to monitor all customers and tell the other waiters to take an order whenever a customer requests one. And if the waiter taking the order gets stuck, he/she can record the order and tell the monitoring waiter that this customer is blocked. Therefore, on the server side, we can make one thread poll the event loop and other threads handle the events. The code can be:
Polling Thread:

event_loop = create_event_loop()
add_event(event_loop, listen_fd, LISTEN_EVENT)
loop:
    event = poll_event(event_loop)
    switch event_type(event):
        LISTEN_EVENT:
            client_fd = accept(listen_fd)
            // send the client handle to a serving thread
        READ_EVENT:
            client_fd = get_client(event)
            // send the client handle to a serving thread
        WRITE_EVENT:
            client_fd = get_client(event)
            // send the client handle to a serving thread

Serving Threads:

loop:
    client_fd = get_client_from_polling_thread()
    if will_block(client_fd):
        // add the corresponding event to event_loop
        // this requires mutual exclusion on event_loop
        yield()
    else:
        result = handle_client(client_fd)
        if !is_closed(client_fd): // the client is not finished
            // add the corresponding event to event_loop
        else:
            close(client_fd)
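As a rough illustration of this design in Rust, here is a condensed sketch combining the mio crate (again assuming its 0.8 API, with the os-ext feature for Waker) with standard-library channels. To stay short, a serving thread handles one wakeup and then hands the connection back to the polling thread for re-registration; real code would also track write events, partial reads, and accept in a loop:

// One polling thread plus a pool of serving threads.
use std::collections::HashMap;
use std::io::{ErrorKind, Read};
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

use mio::net::{TcpListener, TcpStream};
use mio::{Events, Interest, Poll, Token, Waker};

const LISTENER: Token = Token(0);
const WAKER: Token = Token(1);

fn main() -> std::io::Result<()> {
    let mut poll = Poll::new()?;
    let waker = Arc::new(Waker::new(poll.registry(), WAKER)?);
    let mut events = Events::with_capacity(128);

    let mut listener = TcpListener::bind("127.0.0.1:6379".parse().unwrap())?;
    poll.registry().register(&mut listener, LISTENER, Interest::READABLE)?;

    // polling thread -> serving threads: connections with an active event
    let (work_tx, work_rx) = mpsc::channel::<TcpStream>();
    let work_rx = Arc::new(Mutex::new(work_rx));
    // serving threads -> polling thread: connections to watch again
    let (back_tx, back_rx) = mpsc::channel::<TcpStream>();

    for _ in 0..4 {
        let (work_rx, back_tx, waker) =
            (Arc::clone(&work_rx), back_tx.clone(), Arc::clone(&waker));
        thread::spawn(move || loop {
            // get_client_from_polling_thread()
            let mut stream = work_rx.lock().unwrap().recv().unwrap();
            let mut buf = [0u8; 1024];
            match stream.read(&mut buf) {
                Ok(0) | Err(_) => {} // closed or failed: drop the connection
                Ok(_n) => {
                    // ... handle_client(); then give the connection back to
                    // the polling thread and wake it up
                    back_tx.send(stream).unwrap();
                    waker.wake().unwrap();
                }
            }
        });
    }

    let mut idle: HashMap<Token, TcpStream> = HashMap::new();
    let mut next = 2;
    loop {
        poll.poll(&mut events, None)?;
        for event in events.iter() {
            match event.token() {
                LISTENER => {
                    let (mut stream, _addr) = listener.accept()?;
                    let tok = Token(next);
                    next += 1;
                    poll.registry().register(&mut stream, tok, Interest::READABLE)?;
                    idle.insert(tok, stream);
                }
                WAKER => {
                    // re-register connections returned by serving threads
                    while let Ok(mut stream) = back_rx.try_recv() {
                        let tok = Token(next);
                        next += 1;
                        poll.registry().register(&mut stream, tok, Interest::READABLE)?;
                        idle.insert(tok, stream);
                    }
                }
                tok => {
                    // this client's READ_EVENT fired: hand it to a worker
                    if let Some(mut stream) = idle.remove(&tok) {
                        poll.registry().deregister(&mut stream)?;
                        work_tx.send(stream).unwrap();
                    }
                }
            }
        }
    }
}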
This design makes the threads cooperate with each other and can utilize multiple CPU cores. However, the real code is definitely difficult to read, and it will be error prone because of data races. But since every server using this design will have the same boilerplate, we can abstract the design away. Note that every serving thread here follows the same way of handling: it keeps handling until it blocks, and then it yields control. This looks like OS processes or threads, except that they don't yield control on a time slice. So we can build a runtime to run our own threads/processes on top of the OS threads and switch them only when they block. Now we have found our way to coroutines (cooperative routines). And we can simplify the code to:
handling_worker_func(client_fd):
    while msg = read(client_fd) && len(msg) != 0:
        response = handle_msg(msg)
        write(client_fd, response)

accepting_worker_func(listen_fd):
    loop:
        client_fd = accept(listen_fd)
        spawn(handling_worker_func, client_fd)

main():
    listen_fd = get_listen_fd()
    accepting_worker_func(listen_fd)
The spawn function spawns a new worker. Each worker runs on an OS thread managed by the runtime, and the runtime sends a non-blocking worker to an idle thread. Here, all worker functions look like sequential code, but the runtime behind them can switch them out when they block; their context is saved to the runtime, and when they resume, they are in the same context as before. For example, let's look at handling_worker_func:
handling_worker_func(client_fd):
    while msg = read(client_fd) && len(msg) != 0: // msg = "hello"
        response = handle_msg(msg)                // response = "hello back"
        write(client_fd, response)                // <- blocks here

The worker blocks at write, and it saves msg, response, and the current position to the runtime. When the runtime knows the write won't block, it will resume the worker at the write line, and msg and response will be the same as before. In real practice, the runtime might find it hard to know whether a function will block or not, so we need a strategy to mark such functions. Here comes async/await: await is used to execute a function that may block (awaiting this function no matter how long it takes), and async is used to mark a function containing await. Then the code will be:
async handling_worker_func(client_fd):
    while msg = await read(client_fd) && len(msg) != 0:
        response = handle_msg(msg)
        await write(client_fd, response)

async accepting_worker_func(listen_fd):
    loop:
        client_fd = await accept(listen_fd)
        spawn(handling_worker_func, client_fd)

async main():
    listen_fd = get_listen_fd()
    await accepting_worker_func(listen_fd)

Because the spawn function does not execute handling_worker_func itself, it should not use await here.
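To connect this back to real Rust, here is roughly what the pseudocode above looks like with the Tokio runtime (a sketch I wrote for illustration, not reredis's actual code; the echo-style handler stands in for real command handling):

// The coroutine model with async/.await on Tokio.
use tokio::io::{AsyncReadExt, AsyncWriteExt};
use tokio::net::{TcpListener, TcpStream};

async fn handling_worker(mut client: TcpStream) {
    let mut buf = [0u8; 1024];
    loop {
        match client.read(&mut buf).await {  // may yield control to the runtime
            Ok(0) | Err(_) => return,        // connection closed or failed
            Ok(n) => {
                // ... handle the message; here we just echo it back
                if client.write_all(&buf[..n]).await.is_err() {
                    return;
                }
            }
        }
    }
}

#[tokio::main] // sets up the runtime and runs main on it
async fn main() -> std::io::Result<()> {
    let listener = TcpListener::bind("127.0.0.1:6379").await?;
    loop {
        let (client, _addr) = listener.accept().await?;
        tokio::spawn(handling_worker(client)); // spawn, not await
    }
}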
In the next blog, I will talk about how I used this design to modify reredis, using the Rust language's async/.await semantics and the Tokio runtime.
