quiz7B - multi-threaded web server

# quiz7B - multi-threaded web server contributed by < `secminhr` > [github repo](https://github.com/secminhr/quiz7B) ## 進度表 - [x] http - [x] gain full understand of current codes (including request parsing and response) - [ ] 並行與多執行緒程式設計講座 - [ ] io_uring - [ ] io multiplexer, epoll series - [ ] 高效 Web 伺服器開發講座 - [ ] tool - ab - [ ] tool - wrk2 ## HTTP notes :::warning 除了自規格摘錄，應該要點評解說。 :notes: jserv ::: ### stateless > HTTP is defined as a stateless protocol, meaning that **each request message can be understood in isolation**. > [RFC 7230 2.3](https://datatracker.ietf.org/doc/html/rfc7230#section-2.3) ### syntax *The original representation of ` ` and `\r\n` in RFC 7230 is SP and CRLF, here we convert it into the representation used in C.* #### request ``` method request-target HTTP-version \r\n *( header-field \r\n ) \r\n [ message-body ] ``` `*( header-field \r\n )` means 0 or more `( header-field \r\n )` `[ message-body ]` means at most 1 `message-body` > RFC 5234 [3.5](https://datatracker.ietf.org/doc/html/rfc5234#section-3.5), [3.6](https://datatracker.ietf.org/doc/html/rfc5234#section-3.6), [3.8](https://datatracker.ietf.org/doc/html/rfc5234#section-3.8) #### response ``` HTTP-version status-code reason-phrase \r\n *( header-field \r\n ) \r\n [ message-body ] ``` #### header-field ``` field-name: OWS field-value OWS ``` OWS indicates optional whitespace > [RFC 7230 3.2.3](https://datatracker.ietf.org/doc/html/rfc7230#section-3.2.3) ### message-body in request > The presence of a message body in a request is signaled by a Content-Length or Transfer-Encoding header field. > [RFC 7230 3.3](https://datatracker.ietf.org/doc/html/rfc7230#section-3.3) ### message-body in response > The presence of a message body in a response depends on both the request method to which it is responding and the response status code. > [RFC 7230 3.3](https://datatracker.ietf.org/doc/html/rfc7230#section-3.3) ## length of messsage body To determine the length of messsage body, see [RFC 7230 3.3.3](https://datatracker.ietf.org/doc/html/rfc7230#section-3.3.3). ## organization of the program The main part of this program can be seen as interactions between [connection queue](#connection-queue), [greeters](#greeters) and [workers](#workers). ```graphviz digraph { rankdir=LR listen[label="listen fd"] subgraph cluster_greeters { label="greeters" greeter greeter1[label="greeter"] more[label="..." shape="none"] } listen -> greeter[ltail=cluster_greeters] queue[label="connection queue"] greeter -> queue[label="put accepted connection in"] subgraph cluster_workers { label="workers" worker worker1[label="worker"] more_w[label="...", shape="none"] } worker -> queue[label="get a connection", dir=back] worker -> queue[label="put the connection back to keep alive"] subgraph { connfd[label="connection fd"] worker -> connfd[label="process request and send response"] } } ``` ### connection queue A queue with accepted connections inside, both greeters and workers depend on this queue to put/get connections. This queue uses two locks, `head_lock` for enqueue, `tail_lock` for enqueue to avoid greeters and workers from affecting each other. This queue also guarantees that enqueue is always successful by keeping a dummy node at head. ```graphviz digraph queue { label="normal case" head[shape=none] tail[shape=none label="tail, enqueue position"] dummy head -> dummy tail -> node_n dequeue[shape=none label="dequeue position"] dequeue -> node1[dir=back] subgraph { rank=same dummy -> node1[shape=rect] more[label="..." shape=none] node1 -> more more -> node_n NULL[shape=none] node_n -> NULL } } ``` ```graphviz digraph queue { label="empty, size=0" head[shape=none] tail[shape=none label="tail, enqueue position"] dummy head -> dummy tail -> dummy subgraph { rank=same NULL[shape=none] dummy -> NULL } } ``` #### structure ```c typedef struct { node_t *head, *tail; pthread_mutex_t *head_lock, *tail_lock; /* guards head and tail */ pthread_cond_t *non_empty; int size; /* only used for connection timeout heuristic */ } queue_t; ``` A thread will wait for `non_empty` if `size` is 0 when dequeuing. #### enqueue Create a new node with `fd`, put it into queue `q`, and wake any threads that are waiting for `non_empty`. `size` check is not needed since the queue ensures that an enqueue action must succeed. ```c static void enqueue(queue_t *q, int fd) { /* Construct new node */ node_t *node = malloc(sizeof(node_t)); node->fd = fd, node->next = NULL; pthread_mutex_lock(q->tail_lock); /* Add node to end of queue */ q->tail->next = node; q->tail = node; q->size++; /* Wake any sleeping worker threads */ pthread_cond_signal(q->non_empty); pthread_mutex_unlock(q->tail_lock); } ``` #### dequeue If `size` is 0, which indicates that the queue is empty (has only dummy node), then wait for `non_empty`. Otherwise, dequeue **the node next to head(dummy)**, get `fd` from it and free it. ```c static void dequeue(queue_t *q, int *fd) { node_t *old_head; pthread_mutex_lock(q->head_lock); /* Wait until signaled that queue is non_empty. * Need while loop in case a new thread manages to steal the queue * element after the waiting thread is signaled, but before it can * re-acquire head_lock. */ while (q->size == 0) { pthread_cond_wait(q->non_empty, q->head_lock); } node_t *node = q->head->next; *fd = node->fd; old_head = q->head; q->head = node; q->size--; pthread_mutex_unlock(q->head_lock); free(old_head); } ``` ### greeters A greeter is responsible for accepting an incoming connection from client, and put it into connection queue for workers to handle it. #### structure This will be passed into greeter function as its argument. ```c struct greeter_args { int listfd; queue_t *q; }; ``` #### greeter The important part of `greeter_routine` is inside its `while` loop, which continuously accepting incoming connections. The loop can be separated in 3 pieces. - accept a connection from `listfd` ```c socklen_t clientlen = sizeof(clientaddr); int connfd = accept(listfd, (struct sockaddr *) &clientaddr, &clientlen); if (connfd < 0) { perror("accept"); continue; } ``` - setup timeout Since persistent connection is a default behavior after HTTP/1.1, we setup a timeout to avoid a connection which has no further requests being kept in our queue. [Details about how the program handle persistent connection]() ```c /* Basic heuristic for timeout based on queue length. * Minimum timeout 10s + another second for every 50 connections on the * queue. */ int n = q->size; timeout.tv_sec = 10; if (n > 0) timeout.tv_sec += n / 50; setsockopt(connfd, SOL_SOCKET, SO_RCVTIMEO, (void *) &timeout, sizeof(timeout)); ``` - enqueue the connection for workers to handle ```c enqueue(q, connfd); ``` ### workers A worker will dequeue a connection, read request from the connection, parse the request and send the response. There are some variables and constants being used in the worker function: - `q`: connection queue - `connfd`: connection fd, where a worker receives requests and sends responses - `msg`: - reading phase: request - sending phase: initial line content, file content - `MAXMSG`: length of `msg` - `recv_bytes`: length of received request (in bytes) - `len`: - reading phase: how many bytes read by `recv` - sending phase: length of initial line, length of file - `status`: http status to send - `file`: the fd of the file to send as response message - `buf`: content of `Date` header - `st`: status of the file to send, we use `st` to get the size of the file #### dequeue and read request - dequeue and clear `msg` ```c loopstart: dequeue(q, &connfd); memset(msg, 0, MAXMSG); recv_bytes = 0; ``` - read request From the [syntax of request](#request) we can see that `\r\n\r\n` indecates the start of message body (or end of a request if message body does not present). Since the program does not deal with message body, having `\r\n\r\n` is enough. The following code will keep reading from connection into `msg` unless - the request is finished - request exceeds the length of `msg` - `recv` somehow cannot read Last situation is handled inside `if ((len = recv(connfd, msg + recv_bytes, MAXMSG - recv_bytes, 0)) <= 0)` `len == 0`, by manual, indicates that the socket has been shut down, in this case, we close the connection and jump to `loopstart` to deal with next connection. `len == -1` indicates that an error has occurred, we then give different status depends on the error. ```c /* Loop until full HTTP msg is received */ while (strstr(strndup(msg, recv_bytes), "\r\n\r\n") == NULL && strstr(strndup(msg, recv_bytes), "\n\n") == NULL && recv_bytes < MAXMSG) { if ((len = recv(connfd, msg + recv_bytes, MAXMSG - recv_bytes, 0)) <= 0) { /* If client has closed, then close and move on */ if (len == 0) { close(connfd); goto loopstart; } /* If timeout or error, skip parsing and send appropriate * error message */ if (errno == EWOULDBLOCK) { status = STATUS_REQUEST_TIMEOUT; } else { status = STATUS_SERVER_ERROR; perror("recv"); } goto send; } recv_bytes += len; } ``` :::warning Note: The program cannot handle properly when the request exceeds `MAXMSG`. ::: #### request parsing This line will parse the request and give out the status to send. ```c status = parse_request(msg, request); ``` `parse_request` is supported by some functions and macros, let's break them down one by one. ##### structure This structure represent a http request, pretty straight forward. ```c typedef struct { http_method_t method; char path[MAXPATH]; content_type_t type; int protocol_version; } http_request_t; ``` ##### macros These macros are used extensively in parsing funcions. - `TRY_CATCH` Add check to a statement which return a `status_t`. If status is not `STATUS_OK`, then the function using this will return with the status (as if the function throws the error status). Note that it use a `do {...} while(0)` so we can use it like a normal function. For example ```c TRY_CATCH(parse_header()); ``` ```c #define TRY_CATCH(STATEMENT) \ do { \ status_t s = (STATEMENT); \ if (s != STATUS_OK) \ return s; \ } while (0) ``` - `TRY_CATCH_S` `TRY_CATCH` for `sep_newline` and `sep_whitespace`. `TRY_CATCH_S` will throw a `STATUS_BAD_REQUEST` if `STATEMENT` returns `NULL`. ```c #define TRY_CATCH_S(STATEMENT) \ do { \ if (!(STATEMENT)) \ return STATUS_BAD_REQUEST; \ } while (0) ``` ##### functions Since we now know that `TRY_CATCH` and `TRY_CATCH_S` will throw if an error happens, we can just ignore them to see the normal flow of a function. - `strsep_whitespace` This function will return the token before whitespace, and make `*s` point to next character of whitespaces. ```c static char *strsep_whitespace(char **s) { char *ret = strsep(s, " \t"); while (*s && (**s == ' ' || **s == '\t')) (*s)++; /* extra whitespace */ return ret; } ``` - `strsep_newline` Return first line (terminated by `\r\n` or `\n`), and make `*s` point to the next character of newline character(s). ```c static char *strsep_newline(char **s) { char *ret; char *r = strchr(*s, '\r'); char *n = strchr(*s, '\n'); if (!r || n < r) ret = strsep(s, "\n"); else { ret = strsep(s, "\r"); (*s)++; /* advance past the trailing \n */ } return ret; } ``` - `parse_method` Check the method of a request, and fill in `request->method` if it is `GET` or `HEAD`. Other tokens are not accepted and the function will return an error status `STATUS_BAD_REQUEST` in that case. ```c static status_t parse_method(char *token, http_request_t *request) { if (strcmp(token, "GET") == 0) request->method = GET; else if (strcmp(token, "HEAD") == 0) request->method = HEAD; else return STATUS_BAD_REQUEST; return STATUS_OK; } ``` - `parse_path` Map request to `/` and `/index.html` to full path of `index.html`, and assign it to `request->path`, also, assign `request->type` to `TEXT`. ```c static status_t parse_path(char *token, http_request_t *request) { if (strcmp(token, "/") == 0 || strcmp(token, "/index.html") == 0) { snprintf(request->path, MAXPATH, "%s/index.html", DOCUMENT_ROOT); request->type = TEXT; } else /* FIXME: handle images files and other resources */ return STATUS_NOT_FOUND; return STATUS_OK; } ``` :::info Note: `parse_path` now can only handle requset to `/` and `/index.html`, request to other paths will result in a `STATUS_NOT_FOUND` error. ::: - `parse_protocol_version` This function only accepts HTTP/1.0 and HTTP/1.1 If neither is recognized, it throws a `STATUS_BAD_REQUEST` error. ```c static status_t parse_protocol_version(char *token, http_request_t *request) { if (!strcmp(token, "HTTP/1.0")) request->protocol_version = 0; else if (!strcmp(token, "HTTP/1.1")) request->protocol_version = 1; else return STATUS_BAD_REQUEST; return STATUS_OK; } ``` - `parse_initial_line` This function will parse the first line, and fill the result into `*request`. From the [syntax note of request](#request), we know that the initial line should be ``` method request-target HTTP-version \r\n ``` So the way this function parse the initial line is getting these 3 tokens sequentially and put them in corresponding parsing functions. ```c static status_t parse_initial_line(char *line, http_request_t *request) { char *token; TRY_CATCH_S(token = strsep_whitespace(&line)); TRY_CATCH(parse_method(token, request)); TRY_CATCH_S(token = strsep_whitespace(&line)); TRY_CATCH(parse_path(token, request)); TRY_CATCH_S(token = strsep_whitespace(&line)); TRY_CATCH(parse_protocol_version(token, request)); return STATUS_OK; } ``` - `parse_header` This function should be used to parse a header, however it is not yet implemented. ```c /* FIXME: Currently ignores any request headers */ static status_t parse_header(char *line, http_request_t *request) { (void) line, (void) request; return STATUS_OK; } ``` - `parse_request` By [the syntax of a request](#request), a request should look like ``` method request-target HTTP-version \r\n *( header-field \r\n ) \r\n [ message-body ] ``` First line is its initial line, and the following lines are its header fields. ```c static status_t parse_request(char *msg, http_request_t *request) { char *line; TRY_CATCH_S(line = strsep_newline(&msg)); TRY_CATCH(parse_initial_line(line, request)); while ((line = strsep_newline(&msg)) != NULL && *line != '\0') TRY_CATCH(parse_header(line, request)); return STATUS_OK; } ``` #### response The response must follow [the syntax of a response](#response) ``` HTTP-version status-code reason-phrase \r\n *( header-field \r\n ) \r\n [ message-body ] ``` So we can separate sending phase into 3 parts: initial line, headers, and message body. ##### initial line Here we reuse `msg` to store the initial line. `status_to_str` turns the status enum into its corresponding string reason. ```c send: /* Send initial line */ len = sprintf(msg, "HTTP/1.%d %d %s\r\n", request->protocol_version, status, status_to_str(status)); send(connfd, msg, len, 0); ``` ##### headers Send `Date`, `Content-Length`, and `Content-Type` headers. - `Date` A preferred format of `Date` is defined in [RFC 7231 7.1.1.1](https://datatracker.ietf.org/doc/html/rfc7231#section-7.1.1.1). ``` day-name, date1 time-of-day GMT ``` Since it requires `GMT`, we use `gmtime(&now)` to translate current time to correnponding GMT, then follow the rule above to translate it into a string using `strftime`. ```c /* Send header lines */ time_t now; time(&now); len = strftime(buf, 1024, "Date: %a, %d %b %Y %H:%M:%S GMT\r\n", gmtime(&now)); send(connfd, buf, len, 0); ``` - `Content-Length` and `Content-Type` In [the note of message body for response](#message-body-in-response) we see that the presence of a message body depends on the request method and the status code, and [RFC 7230 3.3](https://datatracker.ietf.org/doc/html/rfc7230#section-3.3) has stated the rule. Here I'll just show the important part for us, since the program only handle the request with `HEAD` or `GET` method. > Responses to the HEAD request method (Section 4.3.2 of [RFC7231]) never include a message body because the associated response header fields (e.g., Transfer-Encoding, Content-Length, etc.), if present, indicate only what their values would have been if the request method had been GET (Section 4.3.1 of [RFC7231]). That is to say, we only have to send message body when the request method is `GET`, and we should indicate that message body is present by `Transfer-Encoding` or `Content-Length`. In this case where we use `Content-Length`, its value should be the length of message body. And `Content-Type` indicates the media type of message body, which was given in `parse_path`. ```c if (status == STATUS_OK && request->method == GET) { stat(request->path, &st); len = sprintf(msg, "Content-Length: %d\r\n", (int) st.st_size); send(connfd, msg, len, 0); len = sprintf(msg, "Content-Type: %s\r\n", type_to_str(request->type)); send(connfd, msg, len, 0); } ``` - end of headers ```c send(connfd, "\r\n", 2, 0); ``` ##### message body As we mentioned in previous section, the program has to send message body only when request method is `GET`, and the message body we want to send is specified by `requset->path` given by `parse_path`. ```c /* If request was well-formed GET, then send file */ if (status == STATUS_OK && request->method == GET) { if ((file = open(request->path, O_RDONLY)) < 0) perror("open"); while ((len = read(file, msg, MAXMSG)) > 0) if (send(connfd, msg, len, 0) < 0) perror("sending file"); close(file); } ``` #### after sending After the response is sent, we have to decide whether to close the connection or keep it (due to default persistent connection behavior in HTTP/1.1). `enqueue` is used to keep connection. We enqueue the connection back to connection queue so it can be handled later by a worker. ```c /* If HTTP/1.0 or recv error, close connection. */ if (request->protocol_version == 0 || status != STATUS_OK) close(connfd); else /* Otherwise, keep connection alive and re-enqueue */ enqueue(q, connfd); ``` :::danger 先闡述程式設計的考量，再來分析程式碼 :notes: jserv ::: :::info Descriptions are added. ::: ## make `recv` non-blocking ### Problem to fix Look at the reading part in a worker: ```c /* Loop until full HTTP msg is received */ while (strstr(strndup(msg, recv_bytes), "\r\n\r\n") == NULL && strstr(strndup(msg, recv_bytes), "\n\n") == NULL && recv_bytes < MAXMSG) { if ((len = recv(connfd, msg + recv_bytes, MAXMSG - recv_bytes, 0)) <= 0) { /* If client has closed, then close and move on */ if (len == 0) { close(connfd); goto loopstart; } /* If timeout or error, skip parsing and send appropriate * error message */ if (errno == EWOULDBLOCK) { status = STATUS_REQUEST_TIMEOUT; } else { status = STATUS_SERVER_ERROR; perror("recv"); } goto send; } recv_bytes += len; } ``` It uses `recv` to get the content of a request. Since `recv` will block if there are no messages available, we can easily imagine that all workers are blocked due to persistent connection is possible. Let's see this effect in action by reducing the number of workers. First, we modify the length of `workers` array to 1 and change its filling loop. ```c pthread_t workers[1], greeters[N_THREADS / 2]; ... for (int i = 0; i < 1; i++) pthread_create(&workers[i], NULL, worker_routine, (void *) connectio ns); ``` Now we can see the effect with `wget`. ```shell wget -w 9 localhost:9000 localhost:9000 ``` This line will send 2 requests to `localhost:9000` within the same connection, with a 9 second delay between 2 requests. (we choose 9 only because the timeout set by a greeter is 10 seconds) During the delay, open another terminal and use ```shell wget localhost:9000 ``` to send a request immediately. We can clearly see that the later command, which should receive a response immediately, waits until the previous command is finished, and the reason is the blocking behavior of `recv`. ### Solution See full code on [nonblocking_only branch](https://github.com/secminhr/quiz7B/tree/nonblock_only). To solve this problem, it's intuitive to make the connection fd non-blocking. We can do that by change the file status flag of connection fd in a greeter. ```c int flag = fcntl(connfd, F_GETFL, 0); flag |= O_NONBLOCK; fcntl(connfd, F_SETFL, flag); ``` And add a check in reading loop when `recv` returns `EAGAIN`. `EAGAIN` means there are currently no messages available, so we `enqueue` the connection back and start next loop. ```c if ((len = recv(connfd, msg + recv_bytes, MAXMSG - recv_bytes, 0)) <= 0) { if (errno == EAGAIN) { enqueue(q, connfd); goto loopstart; } ... ``` This solution manages to solve the problem. When we apply the above test method to it, the later command gives the response immediately. However, this solution makes the CPU usage of the program rises dramatically up to 99% and therefore gets killed by the kernel before long. ### Adding epoll The problem in previous section is that the workers spend most of their time dequeuing, reading, and enqueuing a connection that has no messages since we made the connection fd non-blocking. Epoll gives us the ability to know when a connection gets messages so we can avoid unnecessary `recv` if we use it after messages' arrival. :::spoiler notes note: after reboot, performance of epoll The throughput of epoll solution seems to vary across different machines. The latency, however, is always lower test on - multipass (vm): much higher throughput than queue solution - Intel(R) Core(TM) i5-5257U CPU @ 2.70GHz - number of cores: 1 - digital ocean droplet (vm): nearly the same queue solution - Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz - number of cores: 1 - rpi 3 model B: slightly lower - Quad Core 1.2GHz Broadcom BCM2837 64bit CPU - number of cores: 4 Consider adding timer first, leave this problem later. :::