Topics Covered: HTTP, Client-Server Model, Web APIs
HTTP (Hypertext Transfer Protocol) is an application-layer protocol designed for transferring information over the web between networked devices. It is primarily built on TCP (HTTP/3 instead runs over QUIC, which uses UDP; this will be discussed further later in the course).
Like any protocol, it's a standard agreed upon by networked devices on the Internet. It has a specific message format, which every sender and receiver adheres to.
This diagram from Professor Zhang's slide deck is especially useful for visualizing the format. You can think of the headers as providing information about the client message (request) and how it should be handled, and the body as containing some data the server can use.
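To make the header/body split concrete, here is a minimal sketch that pulls apart a hand-written HTTP/1.1 request. The function name `parse_http_request`, the host `example.com`, and the path `/messages` are made up for illustration; real servers use far more careful parsers.

```python
def parse_http_request(raw: bytes):
    """Split a raw HTTP/1.x request into request line, headers, and body."""
    # A blank line (CRLF CRLF) separates the head from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    # First line of the head: method, request target, protocol version.
    method, target, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, target, version, headers, body


# Hypothetical request, written out byte for byte.
raw_request = (
    b"POST /messages HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"Content-Type: application/json\r\n"
    b"Content-Length: 17\r\n"
    b"\r\n"
    b'{"text": "hello"}'
)

method, target, version, headers, body = parse_http_request(raw_request)
print(method, target, version)   # request line
print(headers["content-type"])   # a header describing the body
print(body)                      # the data the server can use
```

The headers tell the server what the body is (JSON, 17 bytes) before it ever looks at the body itself.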
Generally speaking, the web works as an interaction between a client device and a server device. Both are just computers connected to a network; there's nothing special about a server. The client sends the server a request, the server processes the request and (usually) performs some action, then the server sends the client a response. The request and response are both sent over HTTP. TCP provides a reliable, connection-oriented byte stream, while UDP is closer to send-and-forget, so TCP makes more sense for now.
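The request/response cycle can be tried on a single machine using only Python's standard library. This is a sketch, not production code: `HelloHandler` and `round_trip` are names invented here, and the server listens on the loopback address with a port chosen by the OS.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Server side: process a GET request and send back a response."""

    def do_GET(self):
        body = b"hello from the server"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet (no per-request log lines)


def round_trip():
    """Start a local server, send it one GET as a client, return (status, body)."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: OS picks a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # Client side: open a TCP connection, send the request, read the response.
        conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
        conn.request("GET", "/")
        response = conn.getresponse()
        status, body = response.status, response.read()
        conn.close()
        return status, body
    finally:
        server.shutdown()
        server.server_close()
```

Calling `round_trip()` plays both roles in one process, which underlines the point that the "server" is just another program on a networked computer.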
HTTP Methods specify the type of action the client wants from the server. Here are the main ones:
- GET: comparable to a SELECT operation against a database.
- POST: comparable to an INSERT operation against a database.
- PATCH: comparable to an UPDATE on a database.
- PUT: comparable to an UPDATE, but it could also be an INSERT on the database, depending on how the database is structured.
- DELETE: comparable to a DELETE FROM on the database.
There are others, such as HEAD and OPTIONS, which you normally won't encounter.
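The database analogy can be sketched with an in-memory SQLite table. Everything here is illustrative: the `items` table, `make_db`, and `handle` are hypothetical names, and a real web API would sit behind an HTTP server rather than a plain function call.

```python
import sqlite3


def make_db():
    """Create an in-memory database with one hypothetical table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
    return db


def handle(db, method, item_id=None, name=None):
    """Map an HTTP method onto the SQL statement it resembles."""
    if method == "GET":      # read a resource
        return db.execute(
            "SELECT id, name FROM items WHERE id = ?", (item_id,)
        ).fetchone()
    if method == "POST":     # create a new resource, server picks the id
        return db.execute(
            "INSERT INTO items (name) VALUES (?)", (name,)
        ).lastrowid
    if method == "PATCH":    # modify part of an existing resource
        return db.execute(
            "UPDATE items SET name = ? WHERE id = ?", (name, item_id)
        ).rowcount
    if method == "PUT":      # replace the resource, creating it if absent
        return db.execute(
            "INSERT OR REPLACE INTO items (id, name) VALUES (?, ?)",
            (item_id, name),
        ).rowcount
    if method == "DELETE":   # remove the resource
        return db.execute(
            "DELETE FROM items WHERE id = ?", (item_id,)
        ).rowcount
    raise ValueError(f"unsupported method: {method}")
```

Note how PUT is written as `INSERT OR REPLACE`: it behaves like UPDATE when the row exists and like INSERT when it does not, which is the ambiguity described in the list above.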
Obviously, HTTP did not come from nowhere. There are some capabilities associated with different versions which you should be familiar with at this stage in the course.
Some Definitions:
Non-persistent means you need a new TCP connection for each request/response pair. So if you wanted to retrieve an image from a server, assuming that image fits in one packet, it would take 2 RTT (one to set up the TCP connection, one for the request/response), neglecting transmission time. However, note the structure of an HTML document (the standard format for structuring a webpage).
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>My Webpage</title>
</head>
<body>
<img src="/path/to/image.jpg" alt="alt text">
<script src="/path/to/javascript.js"></script>
</body>
</html>
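In this case, we have our original HTML document, and this HTML document requires two extra requests: an image and a JavaScript source file, which it fetches from the server. So our sequence in HTTP/1.0 is as follows: open a TCP connection and fetch the HTML document, open a new TCP connection and fetch the image, then open another new TCP connection and fetch the JavaScript file.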
Therefore, each object takes 1 RTT for its request/response, plus another RTT to set up its TCP connection.
For an HTML document referring to n objects within it, that gives 2(n + 1) RTT in total: 2 RTT for the document itself and 2 RTT for each of the n objects. Note, you must first retrieve the HTML document, since you need the entire thing before you can start retrieving the other objects.
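The non-persistent sequence for the example page can be written out step by step. This is only a sketch: `non_persistent_timeline` is a made-up helper, the file names stand in for the paths in the HTML above, and transmission time is neglected.

```python
def non_persistent_timeline(objects):
    """Return (step, cost_in_rtts) pairs for a page fetched over non-persistent HTTP."""
    timeline = []
    # The HTML document comes first; every fetch pays its own TCP setup.
    for name in ["index.html", *objects]:
        timeline.append((f"TCP handshake for {name}", 1))
        timeline.append((f"GET {name} and receive the response", 1))
    return timeline


steps = non_persistent_timeline(["image.jpg", "javascript.js"])
total_rtts = sum(cost for _, cost in steps)

for step, cost in steps:
    print(f"{cost} RTT  {step}")
print("total:", total_rtts, "RTT")
```

With two referenced objects, the six steps add up to 6 RTT, matching the count of 2 RTT per fetched item.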
Persistent HTTP, introduced in HTTP/1.1, means you no longer need a new TCP connection for each request/response pair. So the sequence is the same, except that the 1 RTT of TCP setup is paid only once rather than once per object, so the calculation is slightly different.
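One way to estimate the persistent case, assuming a single connection that is reused for every object, requests sent one after another (no pipelining), and negligible transmission time. The function name `persistent_rtts` is invented for this sketch.

```python
def persistent_rtts(n_objects: int) -> int:
    """Estimated RTTs for an HTML page plus n objects over one persistent connection."""
    tcp_setup = 1    # the handshake happens once and the connection stays open
    html_fetch = 1   # request/response for the HTML document itself
    # Each referenced object then costs one request/response RTT.
    return tcp_setup + html_fetch + n_objects


# The example page refers to two objects (an image and a script).
print(persistent_rtts(2), "RTT")
```

Under these assumptions the example page costs 4 RTT instead of 6, and the saving grows by 1 RTT for each additional object.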
Parallel connections aren't necessarily part of HTTP itself, but HTTP being a stateless protocol allows for easy parallelism of connections between a client and server (limited data dependencies). However, doing so introduces additional overhead on the operating system (POSIX threads are expensive) and contention for the limited bandwidth available. If your OS can provide you with 8 parallel threads (8 physical cores), then you can perform 8 request/response pairs in parallel. However, you cannot parallelize the fetching of index.html, since it is a prerequisite for the others.
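For m threads and n objects, the objects are fetched in ⌈n/m⌉ rounds of at most m parallel requests each, after index.html has been retrieved.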
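A rough way to put numbers on parallel connections. This sketch assumes non-persistent connections (every fetch pays its own TCP handshake), m requests in flight at once, no bandwidth contention, and negligible transmission time; `parallel_rtts` is a hypothetical helper, and a course formula built on different assumptions may differ.

```python
import math


def parallel_rtts(n_objects: int, m_threads: int) -> int:
    """Estimated RTTs with m parallel non-persistent connections."""
    html = 2  # index.html cannot be parallelized: 1 RTT handshake + 1 RTT fetch
    # The n objects go out in rounds of at most m; each round costs 2 RTT.
    rounds = math.ceil(n_objects / m_threads)
    return html + 2 * rounds


print(parallel_rtts(2, 8), "RTT")    # both objects fit in a single round
print(parallel_rtts(10, 4), "RTT")   # ten objects need three rounds
```

Once m is at least n, adding more threads no longer helps: the cost bottoms out at 4 RTT because index.html must always come first.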
Pipelining is difficult to provide an exact calculation for. Essentially, you send many requests one after another without waiting for the response to the previous one, but it is not done completely in parallel. This is possible only over HTTP/1.1 (why?). Theoretically, it approximates simultaneous requests, so it can be treated as 1 RTT for all objects (not each), but in reality there is some overhead, and the server can be overloaded, so the response time may be longer, etc.
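The idealized estimate described here can be written down as a small function. It deliberately ignores the overhead and server load mentioned above, so treat it as a lower bound rather than a prediction; `pipelined_rtts` is a name made up for this sketch.

```python
def pipelined_rtts(n_objects: int) -> int:
    """Idealized RTTs for a page fetched with pipelining on one persistent connection."""
    tcp_setup = 1    # one handshake for the single connection
    html_fetch = 1   # index.html must arrive before anything else is requested
    # All object requests are sent back to back, so together they are
    # treated as costing roughly 1 RTT (and nothing if there are no objects).
    objects = 1 if n_objects > 0 else 0
    return tcp_setup + html_fetch + objects


print(pipelined_rtts(2), "RTT")
print(pipelined_rtts(50), "RTT")
```

In this model the total stays at 3 RTT no matter how many objects the page refers to, which is exactly why the real-world overhead matters.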
Web APIs leverage the structure of the HTTP protocol to offer services to clients. For example, programmers can interact with Discord through a script if they create and register an application and access Discord's API. This is how Discord bots and webhooks work.
Take a look under the channel documentation. If you have the time, try creating a bot application (everything I'm describing is free). Write a Python script to do the following using the authentication information described in their documentation (the requests library is pretty good).
1. POST request to publish a new message.
2. PUT to add a reaction.
3. GET the message.
4. DELETE the reaction.
Isn't HTTP neat?
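If you get stuck, here is one possible starting point. Treat it as a sketch to check against Discord's documentation, not a reference solution: the API version in `API_BASE`, the endpoint paths, and the `Bot` authorization prefix reflect my reading of their docs, and the helper names (`message_url`, `reaction_url`, `run`) are made up.

```python
from urllib.parse import quote

# Assumption: v10 of Discord's REST API; check their docs for the current version.
API_BASE = "https://discord.com/api/v10"


def message_url(channel_id, message_id=None):
    """URL for a channel's messages, or for one specific message."""
    url = f"{API_BASE}/channels/{channel_id}/messages"
    return url if message_id is None else f"{url}/{message_id}"


def reaction_url(channel_id, message_id, emoji):
    """URL for the bot's own reaction; the emoji must be URL-encoded."""
    return f"{message_url(channel_id, message_id)}/reactions/{quote(emoji)}/@me"


def run(token, channel_id, emoji="\U0001F44D"):
    # Imported here so the URL helpers work without the third-party package;
    # install it with: pip install requests
    import requests

    headers = {"Authorization": f"Bot {token}"}

    # 1. POST: publish a new message.
    resp = requests.post(message_url(channel_id), headers=headers,
                         json={"content": "Hello over HTTP!"}, timeout=10)
    resp.raise_for_status()
    message_id = resp.json()["id"]

    # 2. PUT: add a reaction to that message.
    requests.put(reaction_url(channel_id, message_id, emoji),
                 headers=headers, timeout=10).raise_for_status()

    # 3. GET: read the message back.
    fetched = requests.get(message_url(channel_id, message_id),
                           headers=headers, timeout=10)
    fetched.raise_for_status()
    print(fetched.json()["content"])

    # 4. DELETE: remove the reaction again.
    requests.delete(reaction_url(channel_id, message_id, emoji),
                    headers=headers, timeout=10).raise_for_status()


# Usage (placeholders, supply your own values):
# run(token="<your bot token>", channel_id="<a channel id the bot can post in>")
```

Each step is the same kind of HTTP message seen earlier: a method, a URL, some headers (here carrying the authentication), and optionally a JSON body.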