**[« Back to the main CSCI1680 website](https://cs.brown.edu/courses/csci1680/f22/)**
# Final project :email:
## Introduction
We have reached the end of the course, congratulations! Take a deep
breath. You've made it. :grin:
In this course, we have discussed some of the core protocols and
concepts that power the Internet. Yet, there are many topics we have not
had time to cover. Though some of the core protocols will be around
forever, networking is a fast-moving field of CS, with new protocols,
ways to build applications, and new performance and security concerns
evolving every day. Thus, our goal is to give you the tools you need to
tackle new networking challenges you encounter.
In this project, you will have an opportunity to use these skills by
further exploring some component of networking that interests
you---whether it's exploring something we've already covered more
deeply, or by investigating something new. At the end of this document,
we have provided a list of potential project ideas with resources to get
started. You can either pick one of these ideas, or use your own.
This is an open-ended project, and is intended to be *much* lighter than
IP or TCP. We have about two weeks left of the semester: in that time,
you will propose a brief project idea, and spend about week working on
it. After that, you will submit a brief writeup about what you did and
what you learned (in addition to any code you have written). It's okay
if you don't fully complete your idea in time, so long as you show
progress and demonstrate what you've learned.
Our hope is to create something that's fun and interesting, but doesn't create a lot of stress. You all have put up with a lot this term---so I hope this project is a bit of a break.
## Overview
In this project, you are asked to create something that demonstrates a
networking concept of your choice, and to explain what is happening in a
brief writeup. In general, your project might take one of two forms:
- **System-building**: Implement something that demonstrates a
networking concept we have learned but not otherwise covered in a
project. Examples could include building a networked application in
an interesting way, or programmatically controlling network behavior
with Software Defined Networking (SDN) applications. In this form,
your deliverable would include your code for the application, and a
writeup of what you have built and how it works
- **Network measurement**: Measure network traffic in a certain
scenario to learn something about the network. This could involve
writing scripts to send packets and do network measurements (ie,
doing DNS queries from multiple vantage points to characterize a
CDN), or parsing packet captures to gather information about certain
traffic (eg. "How much of my traffic is encrypted?" or "How much
bandwidth is my Zoom call using?"). In this case, your deliverable
might include some results measurement results (including, eg,
figures/tables, where applicable)
See the [Sample topics](#Sample-project-topics) section for a list of possible topics. You are welcome to use any of
these, modify them, or suggest your own! Any ideas you suggest do not
need to fit into these two categories---you can work on whatever you
want, so long as we approve your idea.
## Logistics
### Groups
Similar to how you formed groups for IP and TCP, you can form groups on your own, or ask us to match you to a group. You may choose to keep the same group you worked with for IP/TCP, or you may wish to form a different group.
Groups SHOULD consist of two students. Exceptions MAY be made in the following cases:
1. If you worked as a group of 3 for IP/TCP, you can continue to do so in your current group
2. If you have particular concern, such as end-of-semester logistical constraints may make group-based work very difficult, you may *request* to work on this project on your own. The group assignment form contains a place to list your reasoning
3. If you had approval to work on IP/TCP on your own, you can continue to do so for this project
4. Groups >2 may be permitted with reasonable justification that the project would require it and a plan for how the work will be distributed
:::danger
**To form your group, fill out the [group assignment form](https://forms.gle/nAEB4FpMtSWNJ2ZZ9) by Tuesday November 29 by 11:59pm EST.** All team members MUST fill out this form--only mutual requests will be honored.
:::
### Repository
Once your team has been formed, you will receive a github classroom link to create a repository.
This repository is completely blank. Since this is an open-ended
project, there is no starter code or reference implementation---this
repository is just a place to keep your work and collaborate.
### Language requirements (or lack thereof)
You can work on the project using any language(s) you want---whatever
you think will help you accomplish the project most easily. For example,
if you're doing network measurements, you could write a shell script or
python program to make a bunch of DNS queries and store the results to a
file, and/or a Python script to plot the results or parse a capture
file. The sample projects have links to some software libraries for
common languages that you may find useful---you are welcome to use
these, or find other libraries for other languages you might prefer.
### Implementation requirements
Your implementation does not need to be particularly robust or complex,
so long as you implement something to demonstrate progress toward your
goal and show you understand the networking concepts involved. In
general, this project is similar to what research is like: you can use
any libraries, tools, resources, tutorials, or other materials that
already exist to help you *so long as you understand how they work, and
document where you found them*.
For system-building style projects, your implementation might start by
following a tutorial to build X using some software framework, which you
can then modify/extend to meet your goal. For measurement-style
projects, your implementation might be some scripts that gather and/or
plot results that you include in your writeup. Either way, you should
expect to write some code on top of resources you find online, even if
it's just code that "glues" components together.
## Deadlines and deliverables
### Group assignment form (Tuesday, November 29 by 11:59pm EST)
To request your group, fill out the [group assignment form](https://forms.gle/nAEB4FpMtSWNJ2ZZ9) by **Tuesday, November 29 by 11:59pm EST**. For details, see [Groups](#groups). We will confirm your group within 24 hours and send you a link to create your repository.
### Project proposal (Monday, December 5 by 11:59pm EST)
On or before ***Monday, December 5 by 11:59pm EST***, you will write up a *brief* description of
the project you want to implement and submit it on Gradescope under the assignment "final project proposal." Your
proposal writeup need not be more than a couple of paragraphs: just
describe what you're thinking of doing, what you hope to achieve as a
final deliverable, how you intend to get started, and any questions you
have. **If you have enough ideas to write up your proposal before the
deadline, we encourage you to submit early**---we will be monitoring
Gradescope **daily** and are happy to provide feedback sooner!
### Final writeup and submission (Monday, December 12 by 11:59pm EST)
When you are done, you will submit your work by pushing all code to your
repository and submitting a short writeup. Your writeup should explain
what you have built and what you learned. There is no length
requirement, but a reasonable estimate is on the order of 3--4 pages of
text/figures. In general, your writeup should contain at least the
following components:
- **Introduction**: What were your overall project goals? What
(briefly) did you achieve?
- **Design/Implementation**: What did you build, and how does it work?
Explain the major design decisions behind what you implemented. For
a system-building project, this would be an overview of the major
components of your system design, similar what you might write in a
readme. For a measurement-style project, this would be an
explanation of what you intend to measure, how you intend to measure
it, and how you would draw conclusions from the results.
- **Discussion/Results**: How far did you get toward your goal? In
this section, describe any results you have, what you have learned,
and any challenges you faced along the way. For a system-building
project, you might include some logs or screenshots of your program
operating. For a measurement-style project, this is the place for a
concise summary of your results, perhaps in some figures or tables,
and your interpretation of the data. If you don't meet your goal, or
don't have a lot of results, that's okay---just describe what you
have and what you learned along the way.
- **Conclusions/Future work**: Overall, what have you learned? How did
you feel about this project overall? If you could keep working on
this project, what would you do next? Are there any other directions
of this work you find interesting? If you have any thoughts or
feedback on this project model, please let us know!
### About the deadline
Your final submission is due by **Monday, December 12 by 11:59pm EST**. This
deadline has been selected to give you maximum flexibility in this busy
end of semester. Since this date approaches the official grading deadline, **no late days may be used**. If the deadline would be problematic for you, please contact the instructor ASAP so that we can make a plan together.
## What "open-ended" means
Once again, this project is meant to be a chance for you to get started
exploring some more network concepts in the very short time we have left
in the semester. You are **not** being asked or expected to build a
fully-fledged system or research project: just pick an idea, spend about
a week on it---with a number of hours that you'd consider reasonable for
a single class---and submit your work. If you envision your project as
having steps 1--5, and step 1 takes you a week, **that's okay**: just
submit what you have and tell me what you did and what you learned. So
take this time to explore something that interests you, and have fun!
# Sample project topics
The following pages contain some sample project ideas. Note that this
project is new to the course, and that these are just ideas: you don't
have to use them, or you can use a subset of one to help you reach an
idea. For each project, we have listed a few resources that we think may
be helpful, but these are just suggestions, and we have not necessarily
tested them in this context.
Each project idea also lists a guess for a suitable development
environment: due to the way in which certain network measurements may be
performed, not all can fully run within the course container
environment.
If you have questions about using any tools, development environments,
etc, please feel free to ask us for help on Ed or during office hours.
However, note that we have not tested these projects before, so we won't
have all the answers---but we're happy to help you figure out how to
approach the problem and point you in the right direction!
## System-building: A better Snowcast
When we built Snowcast, you wrote code to manually compose messages in
the Snowcast protocol format and send them along TCP sockets. This is a
great exercise in implementing a protocol. However, modern applications
often leverage frameworks to help build network APIs more quickly. One
such framework is gRPC: users can define their API and message formats,
and the gRPC framework automatically generates code for establishing
connections, authentication, serializing messages and more, in your
language of choice.
To explore these tools, you could implement part of Snowcast (or some
other protocol of your choice) in gRPC, or some other framework that
provides similar functionality. A good starting point might be to build
a client and server that connects and exchanges Snowcast's
`Hello`/`Welcome` messages, and then continue with selecting stations
and streaming data. As you do this, reflect on the differences between
using a framework like gRPC and building Snowcast on your own from a TCP
or UDP socket. What components are automatically provided for you? What
parts of the spec do you still need to implement? Is there anything that
you would need to change about the protocol to use a framework like
gRPC?
### Environment
You should be able to implement this from our course's container
environment.
**Notes/Resources**
- <https://grpc.io/> contains getting started guides for various
languages
- gRPC is based on Protocol Buffers (protobuf), a data serialization
framework. You can read more about Protocol Buffers
[here](https://developers.google.com/protocol-buffers/docs/overview)
## System-building: Implement a webserver
In class so far, we have had a taste of implementing server
applications. Knowing what you know now, implement a webserver (starting
from sockets, like Snowcast) that speaks enough of HTTP 1.0 or 1.1 to
serve files from a configurable directory. You could add to your
webserver by considering any of the following in your implementation:
- How could you make your server scale to large numbers of requests?
Use a benchmarking tool like `wrk` to measure your server's
performance (eg. number of requests that can be handled per second)
and see how you could improve it
- Modern webservers don't just serve files, they often run server-side
application code to make pages dynamic. How might you modify your
server to do this in an extensible way? What are some security and
performance implications?
### Environment
You should be able to implement this using our course's container
environment.
**Notes/Resources**
- [Wikipedia's page on
HTTP](https://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol) has
a good overview of the basic elements of the protocol, and a list of
RFCs. For a quick experiment to measure performance, you can
probably get away with just implementing the `GET` command (similar
to what was shown in lecture)
- [wrk](https://github.com/wg/wrk), an HTTP benchmarking tool
## System-building: SDN Application
We have discussed how routers and switches implement various protocols
to control how packets are forwarded, such as OSPF, Spanning Tree, etc.
We can think of these protocols as the network's *control plane*.
Traditional switches and routers are shipped with pre-built firmware
that implement these control plane protocols, which can only be tweaked
to a limited degree by network administrators. Software Defined
Networking (SDN) is a departure from this paradigm in which switches do
not contain pre-written control-plane software, and instead only forward
packets based on a generic set of forwarding rules. SDN switches export
an API to configure rules, as well as provide information about packet
events, allowing the switch's entire behavior to be controlled by an
external application, often by a centralized network controller with a
global view of the network. This separation of the control and data plan
permits more dynamic, flexible, and extensible network configurations.
SDN is an area of active research, with new hardware, methods,
applications being developed to work with more "programmable" network
devices.
One way to explore SDNs for your project is to experiment with writing
an SDN application that runs on a network controller. Example
applications can implement switch features like mac learning, Spanning
Tree Protocol, DHCP, network telemetry, and so on. To implement an SDN
application, you would write a program for an SDN controller---when a
switch receives a packet, it will ask the controller how to handle the
packet, which will consult your program to tell the switch how to set up
its forwarding table or respond to the packet.
This course once had a [whole assignment for building SDN
applications](https://cs.brown.edu/courses/csci1680/f18/content/sdn.pdf),
using the well-known OpenFlow protocol and Ryu SDN controller. To get
started working with SDN, we recommend looking here first. This
assignment was about implementing shortest-path forwarding, but the
tutorial and setup information is useful even if you want to build a
different kind of application.
### Environment
The easiest way to start with SDN is to use Mininet, which should be run
in a VM[^1]. In the past, our course has used a VM environment using
Vagrant, which should be straightforward to set up if your computer
supports it. You can find instructions on the course VM environment
here:\
<https://cs.brown.edu/courses/csci1680/f22/content/vagrant.pdf>\
**M1 mac users**: This VM will not run on your system. If want to do
this project, it would be extra work to figure out how to use Mininet in
a container or VM on an M1 Mac---in fact, if this interests you, this
alone could be your project (and it would really help me learn what's
possible)!
**Notes/Resources**
- The starter code for the old SDN assignment can be found at:
<https://github.com/brown-csci1680/sdn-starter>, but beware this
code has not been tested in four years.
- [Mininet](http://mininet.org/) is a network emulator that can create
arbitrary network topologies with switches that you can use to test
SDN applications.
- [Ryu](https://ryu.readthedocs.io/en/latest/getting_started.html) is
an easy-to-use OpenFlow-based SDN controller written in Python. It
has some good tutorials for getting started and a lot of example
applications
- [P4](https://p4.org/) is another interface for programming network
hardware, separate from the OpenFlow model described in the
assignment here. P4 has gained much more traction for its
flexibility in recent years compared to OpenFlow. You are welcome to
experiment with P4 as well, though the setup cost may be higher
## Measurement: Investigating Zoom traffic
As an old person, I don't understand how Zoom works---or, at least, in
terms of how it uses the network. I want you to explain it to me.
Capture some traffic while you use Zoom and report on what happens while
using various features (hosting calls, joining calls, screen sharing,
etc.). Can you tell what protocols are used? How much bandwidth does the
call use? Does everything happen over one connection, or multiple? What
IPs are involved, and where are they located? Does the video data get
sent to Zoom's servers, or directly to other users on the connection?
How does this differ from other videoconferencing applications
(Hangouts, Messenger, FaceTime, etc.)?
You can start this by using Wireshark and looking at the traffic. For a
more detailed analysis, one option is to write a script that parses a
capture file and outputs some useful statistics. Note: I don't expect
you to understand everything about how Zoom works (and indeed, much of
the traffic may be encrypted), but I'm quite curious what can be learned
from a surface-level analysis!
### Environment
To capture traffic from Zoom calls, you would need to install Wireshark
on your own machine, rather than inside the container. For more detailed
analysis, you can save the data captured by Wireshark to a file (ie,
called a "capture file" or "pcap file") and process the data in your
container environment (or anywhere else). Capture files are a
standardized format that can be processed by various tools and
libraries.
**Notes/Resources**
- Zoom has [a whitepaper](https://explore.zoom.us/docs/doc/Zoom Connection Process Whitepaper.pdf) about its connection process. You might consider
comparing what you observe against this (or potentially other resources you find online) to see if you can replicate their results
- [`tshark`](https://www.wireshark.org/docs/man-pages/tshark.html),
Wireshark's terminal-based counterpart, has some good options for
generating summaries of capture files that might be useful
- [scapy](https://scapy.net/) and
[PyPCAPKit](https://pypi.org/project/pypcapkit/) are Python
libraries for parsing packet capture (PCAP) files
## Measurement: Investigating CDNs
In class, we discussed how CDNs use DNS to direct users to nearby
servers. We this by querying a domain name from multiple DNS servers at
different geographic locations. One way to explore this further is to
query a domain from many vantage points and examine the IPs that are
returned. For example, let's say you ask 100 DNS servers around the
world to resolve `randomsite.com`. How many different IPs do you learn?
Where are they located? Do they all belong to the same content provider?
To investigate this, you could write a script that takes in a domain
name, queries it against a list of DNS servers around the world, and
examines the results. From here, you could potentially ask other
questions of different domains: Do all the DNS responses use similar TTL
values? Are the records signed with DNSSEC?
### Environment
You should be able to implement this from our course's container
environment.
**Notes/Resources**
- <https://public-dns.info/> curates a list of public DNS servers
around the world you can query.
- When sending DNS queries, don't send a huge number of queries to the
same server in rapid succession---otherwise you might get blocked!
Instead, wait >=100ms in between queries.
- You can map IP addresses to coarse physical locations using a GeoIP
database. For example, you can do this in Python using
[`python-geoip`](https://pythonhosted.org/python-geoip/), which
reads an IP-to-location database stored on your system. You may need
to install the database separately---this link should lead you to
instructions.
- You can make DNS queries from a script by using a DNS library (a
good Python one is [`dnspython`](https://pythonhosted.org/python-geoip/), or you can simply run shell commands from a script and parse the output (fast, but can get ugly)
## Measurement: Investigating network conditions with Mahimahi
[Mahimahi](http://mahimahi.mit.edu/) is a set of tools to emulate
various network conditions. For example, you can create links with
certain levels of packet loss and delay in order to explore how
protocols may behave under various conditions.
One way to explore this in your project could be to explore how modern
TCP implementations behave differently under various network load
conditions. How much is web page or video performance impacted? How
might different congestion control algorithms behave differently (eg.
BBR, vs. CUBIC, vs. Reno)? To report on your results, you could measure
throughput observed for transferring files of a known size, monitor time
to load web pages, play videos, etc.
### Environment
You may able to implement this from our course's container environment.
Testing different congestion control mechanism may require support from
your OS kernel, which may not be possible on all platforms---your
mileage may vary, let us know if you have questions and we may be able
to help.
**Notes/Resources**
- Mahimahi's documentation has some
[e](http://mahimahi.mit.edu/#usage)xample usages
- [`iperf`](https://iperf.fr/) is a common tool for measuring network
throughput. When supported by the OS, `iperf` supports testing TCP
using different congestion control mechanisms
[^1]: For reasons I can describe if you're interested, Mininet won't run
in our container. We do not recommend running mininet natively on
your machine, either, as it requires running ancient Python 2 code
as root.