Fundamentals of Operating System (Day 1 - Afternoon Session)
We must choose a suitable operating system and hardware to successfully deploy our server `html.js` or `server.js`. Surely we can deploy it on our personal computer, but that's not an ideal machine to use for long-term servers (unreliable, lacking proper security measures, not to mention expensive to maintain).
As such, we typically rely on existing services such as AWS EC2, Google Compute Engine, DigitalOcean Droplets, and Azure VMs, among many others. All of these are web services that provide secure, resizable compute capacity in the cloud. They're ideal for running our server code reliably.
Amazon Web Services setup (50 mins)
For the purpose of this course, we will be using AWS. We have created an account for each of you. Log in to your AWS account and go to the EC2 homepage.
Then, choose a region of your choice. Remember this region because that's where your instance is hosted. In this example, we use "Ohio".

Right now we do not have any instances yet, so let's create one. Click on Launch Instances, and select `Ubuntu Server 20.04 LTS` for the Operating System (also known as machine image) option:

For the instance type, select `t2.micro`. It is free tier eligible. The instance type simply defines the hardware capacity of our computer.

Click "Review and Launch" to immediately launch the instance. We can do other settings later. You will see this page:

When you click "Launch", you will be prompted with key-pair generation. Select create new pair and give it a name, then download the `.pem` file.

This is important if you want to remotely access your AWS EC2 instance via `ssh`. We will do this later. For now, you will see that you have one instance in the dashboard:

Click on the instance and then click the connect button on the next page:

Connect to EC2 using InstanceConnect
AWS EC2 supports direct access to your instance via the web browser.

Clicking on "connect" opens a new tab that shows the command prompt of your newly made instance. From here, you can type various commands and use the computer as per normal.

There's no GUI here; we are simply accessing the operating system services via the command line interface (CLI). We have been doing quite a bit of that earlier to run `node` and various `git` commands, but we will learn more about the CLI and OS very soon.
Connect to EC2 using SSH client
If you wish to use your own SSH client, then you can follow these steps:
1. Restrict the permissions of the key file you downloaded: `chmod 400 <filename>.pem`
2. Issue the `ssh` command to your instance. The details can be found in the SSH Client tab:

The output should look similar to that of EC2 InstanceConnect:

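A minimal sketch of the two steps above, using a throwaway file in place of the real key so that it can be run safely. The file name `demo-key.pem` and the placeholder `<your-EC2-hostname>` are assumptions for illustration; the real values come from your own download and from the SSH Client tab. OpenSSH refuses to use a private key that other users can read, which is why the permissions are tightened first.

```shell
# Stand-in for the downloaded key pair file (hypothetical name).
key_dir=$(mktemp -d)
key_file="$key_dir/demo-key.pem"
echo "not a real key" > "$key_file"

# Step 1: make the key readable by its owner only.
chmod 400 "$key_file"
perms=$(ls -l "$key_file" | cut -c1-10)
echo "permissions: $perms"

# Step 2: the connection command has this general shape. It is only printed
# here; substitute your own key path and hostname before running it.
echo "ssh -i $key_file ubuntu@<your-EC2-hostname>"
```

The `ubuntu` username matches the default mentioned for this machine image; other images may use a different default user.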
Connect to EC2 using Cloud9
Another convenient and recommended way to connect to your EC2 instance is to use AWS Cloud9. Cloud9 is a free web IDE that allows you to connect and access your EC2 instances. Select Cloud9 services from the "Services" tab of your AWS console:

Afterwards, click on create environment:

Give your environment a nice name:

Configure the environment using your EC2 username (which will be `ubuntu` by default) and hostname:

The EC2 hostname can be found in the EC2 dashboard we saw earlier:

Click "Copy key to clipboard" in the Cloud9 environment settings, then head to your EC2 InstanceConnect terminal or your own terminal (the one with SSH access to your EC2 instance from earlier) and paste in the following commands in succession:
The first two commands update the Ubuntu OS in your EC2 instance. The third command adds the Cloud9 public key to your EC2 instance's `~/.ssh/authorized_keys` file. The complete command will look something like this:
You can read the file by typing the command:
Finally, the fourth command installs `node` on the EC2 instance. Cloud9 requires `node` to run. You might be faced with warnings like the one below, asking you to restart some outdated daemons. Simply select all of them (navigate with the arrow keys and press space to select), then tab to `OK` to restart.
Once you're done, head back to the Cloud9 dashboard and click on "next step":

This will bring you to the next page where you can simply click "Create Environment". You will then be prompted with this:

You can click "next" to let Cloud9 run the installation for you, but that's rather a hassle and you might be met with unexpected errors. It's best to install this manually.
Right click on the C9 install link and get the link address: https://d1q2hgnv37wylw.cloudfront.net/static/c9-install.sh
Head back to your EC2 terminal in InstanceConnect or your own SSH client. Type the following commands in succession:
The first command downloads the shell script `c9-install.sh` from the CloudFront website. We then change the file permission of this script to make it executable. The next two commands install the necessary libraries (Python and some basic utilities, which generally include the GCC/g++ compilers and libraries). The last command executes the script to install C9 on your EC2 instance.
Wait for a few minutes and once it is done, return to your Cloud9 webpage and click refresh to relaunch the IDE. You can also find your environments in the Cloud9 dashboard:

The IDE looks like this:

If you're faced with an outdated package warning, e.g. for `tmux`, just click `Update`:

Right now, you only have one file in your root directory, which is the `install.sh` script you downloaded earlier using `wget`. With this, you're all connected with a convenient editor GUI to traverse and manipulate the EC2 filesystem.
One last thing to do is to run the code you pushed to the remote GitHub repository on this EC2 instance.
Exercise: Run the code in your EC2 instance (5 mins)
You can `git clone <repository-url>` in your EC2 instance. You should have this in the C9 IDE; run the server `html.js`:

But wait! How do we access it? We can't just go to http://0.0.0.0:8000 in our browser because the server is now hosted on another machine (our EC2 instance).
This means we must find out the public IP address of our EC2 instance. It is somewhere in your instance's EC2 dashboard (IPv4). Find it.
Afterwards, construct the URL: `http://<your-EC2-public-IPv4>:8000`. We also need to tell our EC2 instance to accept incoming traffic. Go to the security tab in your instance's EC2 dashboard, and click on the security group entry highlighted in blue:

Click on "edit inbound rules" in the new page:

Then add the new rule to allow custom TCP connection from any IPv4 address (so the public can access this server) and give it a nice description:

After saving the inbound rules, try the URL to your server in your computer's browser. You should see the nice webpage as before, but this time round it is hosted from your EC2 instance:

Exercise: make changes to your file and pull from the EC2 instance (5 mins)
Open index.html in your local repository and make some small changes to it:
Save it and `push` to the remote repository (of course, it's implied that you need to `add` then `commit` first). Afterwards, attempt to `pull` it on the EC2 instance, and re-serve the webpage. Refresh the site's URL in your web browser and you should see this new text instead:

Similarly, if you want to `push` directly from the EC2 instance, you need to generate a new personal access token and save it in the EC2 instance's GitHub login credentials.
Find out how to do this (10 mins). You can review the previous notes for hints.
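The add, commit, push, and pull cycle from this exercise can be rehearsed on a single machine before trying it against GitHub. The sketch below is only an illustration under stated assumptions: a local bare repository stands in for the remote, and the directory names `local` and `ec2`, the user name, and the email address are all made up.

```shell
# Everything lives in a throwaway directory; a bare repository plays the
# role of the remote (GitHub in the exercise).
work=$(mktemp -d)
git init --quiet --bare "$work/remote.git"

# "local" stands for the clone on your own computer.
git clone --quiet "$work/remote.git" "$work/local" 2>/dev/null
cd "$work/local"
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "<h1>Hello</h1>" > index.html
git add index.html
git commit --quiet -m "Add index.html"
git push --quiet origin HEAD

# "ec2" stands for the clone on the EC2 instance.
git clone --quiet "$work/remote.git" "$work/ec2"

# Edit locally, then add, commit, and push.
echo "<h1>Hello from my laptop</h1>" > index.html
git add index.html
git commit --quiet -m "Change heading"
git push --quiet origin HEAD

# On the "EC2" side, pull to receive the change.
cd "$work/ec2"
git pull --quiet
pulled=$(cat index.html)
echo "$pulled"
```

With a real remote, only the clone URL and the authentication step differ; the order of commands stays the same.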
Possible error: EADDRINUSE
If you're met with certain errors like this when running node for the second time:

...it means that another process is already using that exact port, 8000. We need to find that process and kill it manually. First, type the command `ps aux | grep node`. This lists all processes whose entry contains "node":

The first entry indicates that there's a `node` process that still runs `html.js`. We need to kill that process with the `kill` command: `kill -9 <pid>`, where `pid` is the process ID number (51934 in this case). Afterwards, you can run `node html.js` again.
Introduction to Operating System (60 mins)
Now that we have successfully "deployed" our web server code onto the EC2 instance, it is time to understand a little bit about what's going on under the hood. In particular, we want to shed some light on these questions:
Let's begin with the Basics of Operating System.
The Operating System
An operating system is a special program that acts as an intermediary between users of the computer and the computer hardware.
The goal of an operating system is such that we have a dedicated program to fulfil the following essential roles:
Once we have an operating system, it becomes easier for users to run programs, or to write new programs for other purposes, within a computer system.
There are a lot of things that make up an operating system, but they are generally divided into three categories:
The OS provides an environment such that user programs such as the text editor, web browser, compiler, database system, music player, video editor, etc can do useful work.
The special program at the core of the operating system is called the kernel. It provides essential services, such as interprocess communication and file system management.
The Kernel
The kernel is a piece of software that forms the heart of an operating system. Its size varies greatly depending on the architecture, for example,
One of the most famous kernels, used by many operating systems, is the Linux kernel. A few examples of Linux distributions (operating systems made from a software collection based upon the Linux kernel) are Ubuntu, Debian, Fedora, Android, and Chrome OS, among many others.
See for yourself the code for the Linux Kernel originally developed by Linus Torvalds here.
Operating System Services
Operating System User Interface
The OS interface gives users convenient access to various OS services. They are programs that can execute specialised commands and help users perform appropriate system calls in order to navigate and utilise the computer system.
There are two general ways for users to conveniently access OS services:
The OS GUI
The OS GUI is what we usually call our “home screen” or “desktop”. It characterises the feel and look of an operating system. We use our mouse and keyboard everyday to interact with the OS GUI and make various system calls:
The Command Line Interface
The OS CLI is what we usually know as the “terminal” or “command line”.
A command-line interface (CLI) is a means of interacting with a computer program where the user issues successive commands to the program in the form of text. The program which handles this interface feature is called a command-line interpreter.
Command Line Interpreter
The programs that act as interpreters of these commands are known as shells. A typical OS might come with multiple command line interpreters. A user may choose among several different shells, including the Bourne shell, C shell, Bourne-Again shell, Korn shell, and others. Users typically interact with a shell via a terminal emulator, or by writing a shell script that contains a series of commands to be executed in succession.
Common shells that we may have encountered:
Bourne-Again shell (bash): written as part of the GNU Project to provide a superset of Bourne shell functionality. This shell is the default interactive shell on most Linux distros, and was the default on macOS before 10.15 Catalina.

Z shell (zsh): a relatively modern shell that is largely backward compatible with bash. It has been the default shell in macOS since 10.15 Catalina.

PowerShell – An object-oriented shell developed originally for Windows OS and now available to macOS and Linux.

UNIX-like Operating System
Commands that are valid on macOS might not be valid for Windows users. For instance, the command `ls` can be used on macOS to list the items in the current directory, but the same command won't work in the Windows Command Prompt. The command `dir` must be used instead.
For practical purposes, there has to be some coherence between various OSes, such that they at least conform to the same set of commands. Therefore, a family of UNIX-like operating systems was born.
POSIX
Unix was selected as the basis for a standard system interface partly because it was "manufacturer-neutral" (users are free to choose the technology best suited to their needs at any time). However, several major versions of Unix existed, so there was a need to develop a common-denominator system.
The standard that these UNIX-like OSes conform to is called the POSIX standard.
A brief history:
The POSIX specifications for Unix-like operating systems originally consisted of a single document for the core programming interface, but eventually grew to many separate documents. The standardized user command line and scripting interface were based on the UNIX System V shell.
Many user-level programs, services, and utilities (including `awk`, `echo`, `ed`) were also standardized, along with required program-level services (including basic I/O: file, terminal, and network). POSIX also defines a standard threading library API which is supported by most modern operating systems. In 2008, most parts of POSIX were combined into a single standard.
Ubuntu, macOS, and Solaris are all UNIX-like and largely POSIX compliant. Therefore, similar sets of UNIX commands can be used to access the services of all these OSes. Windows, however, is not UNIX-like, resulting in it having an entirely different set of commands.
Here's a list of common UNIX commands. We have seen some of them before:

And here's a list of common PowerShell commands:

Commands
Through the CLI, we can conveniently type out commands such as `git`, `echo`, `node`, etc. Some commands are "default" (they come with the OS without the need to install anything else), and some require installation (like `git`).
So how exactly do these commands work? By now we know that the commands that can be interpreted depend heavily on the OS type (UNIX-like or not). The shell primarily interprets a command from the user and executes it. There are two ways to implement commands:
In short, when we install something new, like `python`, its binary (executable) is installed at any of these paths, e.g. `/usr/bin/python`.
- When we type `python file.py`, our shell attempts to find the `python` program (the name must match) in any of the paths shown in the output of `echo $PATH`
- If it finds `python` in any of these paths, the shell will execute that program with the argument `file.py`
- Otherwise, `command not found` will be shown
Here's a very handy and useful website to learn what a particular command does. Simply paste a command such as `git log --graph --all --oneline --decorate` into it and observe some magic in action:

Environment Variable
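`PATH` is itself an environment variable, and the way a shell uses it to look up a command name can be sketched as a small function. This is only an approximation written for `sh` or `bash`: `find_in_path` is a hypothetical helper, and real shells also check built-in commands first and cache earlier results.

```shell
# Hypothetical helper that mimics the shell's PATH search for a command name.
find_in_path() {
    name=$1
    old_ifs=$IFS
    IFS=:
    # With IFS set to ":", the unquoted $PATH splits into one directory per entry.
    for dir in $PATH; do
        # An empty PATH entry means the current directory.
        candidate="${dir:-.}/$name"
        # A match must be an executable regular file, not a directory.
        if [ -f "$candidate" ] && [ -x "$candidate" ]; then
            IFS=$old_ifs
            echo "$candidate"
            return 0
        fi
    done
    IFS=$old_ifs
    echo "$name: command not found" >&2
    return 1
}

# A command that exists on practically every UNIX-like system.
find_in_path ls

# A made-up name that should not exist anywhere on PATH.
find_in_path surely-no-such-command || echo "lookup failed, as expected"
```

The search stops at the first match, so the order of directories in `PATH` decides which program wins when two share a name.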
What if we want the shell to search in other locations? We can modify the `PATH` environment variable in a file in your home directory called `.zshenv` (if you use zsh), or `.bash_profile` (if you use bash).
For instance, suppose we write and compile a C program that prints "Hello World" as follows:

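The same experiment can be rehearsed without a C compiler. In the sketch below a tiny shell script stands in for the compiled program; the name `hello-demo` and the temporary directory are assumptions for illustration, and the `PATH` change lasts only for the current shell session.

```shell
# Create a throwaway directory holding a tiny executable script.
demo_dir=$(mktemp -d)
printf '#!/bin/sh\necho "Hello, World!"\n' > "$demo_dir/hello-demo"
chmod +x "$demo_dir/hello-demo"
cd "$demo_dir"

# An explicit path (the leading ./ means "this directory") always works.
./hello-demo

# The bare name usually does not, because this directory is not on PATH yet.
command -v hello-demo || echo "hello-demo: command not found"

# Append the directory to PATH for this shell session only.
PATH="$PATH:$demo_dir"
export PATH

# Now the bare name resolves like any other command.
output=$(hello-demo)
echo "$output"
```

Placing the same `PATH=...` assignment in `.zshenv` or `.bash_profile` is what makes the change apply to every new shell.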
We tell zsh to execute the program from this directory, signified by a dot: `./nat-cprog`, thus printing `Hello, World!`. Attempting to execute the program name directly will result in a `command not found` error because zsh doesn't know where this program is located.
We can then add this current `~/Desktop/SUTD Acad` path to the `PATH` environment variable, and this allows zsh to execute the program as a command:

Processes
There are hundreds of processes running in our computer at any given time. You can type the command `ps -ef` to list all running processes in the system.
A process is formally defined as a program in execution. A program is not a process: a program is a passive file stored on disk, while a process is a running instance of it.
A process couples two abstractions: concurrency and protection. Each process runs in a different address space and sees itself as running in a virtual machine – unaware of the presence of other processes in the machine.
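To make this concrete, every process carries a numeric process ID that `ps` reports and `kill` accepts. A minimal sketch, assuming a background `sleep` as a stand-in for a program that hangs (such as a forgotten `node` server):

```shell
# Start a long-running process in the background.
sleep 300 &
pid=$!
echo "started sleep with pid $pid"

# Look the process up by its pid.
ps -p "$pid" || echo "ps could not report on pid $pid"

# Ask the kernel to terminate it. kill sends SIGTERM by default;
# kill -9 sends SIGKILL, which a process cannot ignore.
kill "$pid"

# Reap the child so it disappears from the process table.
wait "$pid" 2>/dev/null || true

# kill -0 sends no signal; it only checks whether the pid still exists.
if kill -0 "$pid" 2>/dev/null; then
    still_running=yes
else
    still_running=no
fi
echo "still running: $still_running"
```

Trying a plain `kill` before `kill -9` gives the program a chance to clean up; the forced variant is the fallback when it does not respond.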
Important: Each process is given a `pid` (process ID). We can give commands to our kernel via the CLI to `kill` any existing process using its `pid`. This is extremely handy when we need to kill a process that hangs or takes up too many resources during runtime.
OS Service: Interprocess Communication with Message Passing
Every message passed back and forth between a writer and a reader (for example, a server and a client) must go through the kernel. One way for processes to communicate is message passing; the other is shared memory.
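Kernel-mediated message passing can be tried from the shell with a named pipe (FIFO). Note that this is a swapped-in technique: a named pipe is a simpler, one-way channel and not a socket, but it shows the same idea of one process writing a message that the kernel delivers to another. The directory and message text are made up for illustration.

```shell
# Create a named pipe (FIFO) in a throwaway directory.
channel_dir=$(mktemp -d)
channel="$channel_dir/channel"
mkfifo "$channel"

# Writer: a separate background process sends one message.
( echo "hello from the writer" > "$channel" ) &

# Reader: this shell blocks until the kernel delivers the message.
read -r message < "$channel"
wait

echo "reader received: $message"
rm -r "$channel_dir"
```

Neither side touches the other's memory; the bytes travel through a buffer that the kernel manages on their behalf.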
*A socket is one of the message passing interfaces.*
A socket is one endpoint of a two-way communication link between two programs running on the network:
In short, sockets can be used by two processes on the same machine to communicate locally, or by two processes on different machines to communicate (over the internet):
Port
A port is a virtual point where network connections start and end. Ports are software-based and managed by a computer's operating system. Each port is associated with a specific process that listens to it.
Ports allow computers to easily differentiate between different kinds of traffic: emails go to a different port than webpages, for instance, even though both reach a computer over the same Internet connection.
Ports are standardized across all network-connected devices, with each port assigned a number between 0 and 65535. Many ports are reserved for certain protocols. For example, Hypertext Transfer Protocol (HTTP) messages go to port 80 by default. Samples of standard ports include:
- If a server listens on a non-standard port such as `8080`, we need to specify it when keying the URL in the address bar like we did earlier.
- Suppose one of Google's servers has the IP address `172.217.194.102`. We can access Google by typing `https://172.217.194.102`. We can also access Google by typing `https://google.com`.
- The hostname `google.com` has to be translated to `172.217.194.102` so that our queries to Google can be routed properly. The system that is responsible for this hostname-address translation is DNS. We will learn more about it later today.
As a summary, IP addresses enable messages to go to and from specific devices, while port numbers allow targeting of specific processes within those devices.
OS Service: The File System
Another important service provided by our OS and its kernel is file system management. A file system controls how data is stored and retrieved in a system. It is a set of rules (and features) that determine the way the data is stored. A physical disk can be separated into multiple partitions, where each partition can have a different file system.
The purpose of a file system is to maintain and organize secondary storage.
Without a file system, data placed in a storage medium would be one large body of data with no way to tell where one piece of data stops and the next begins.
The file system operates using a specific data structure and has a specific format. Its interface is part of the OS, so file systems vary between operating systems.
Examples of common file systems include:
A File
The file system manages the collection of files in our storage device.
A file is a logical storage unit in the computer, defined by the operating system. In layman's terms, a file is a group of data bytes which are stored neatly in a known location with a unique name (path).
Consider the file named `classrecording.mp4` below. The file consists of file attributes and file content. File attributes contain important information such as name, size, date and time of creation, user ID, etc. Its content is essentially a group of data bytes (~536MB). When we use this file, we don't really care about its physical address (where it is actually stored on disk).
We only care about its path. That's what we mean by a "logical" storage unit.
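The split between attributes and content can be seen from the shell. A minimal sketch, assuming a small throwaway text file (`notes.txt` is a made-up name) in place of the video:

```shell
# Create a small file with known content.
file_dir=$(mktemp -d)
file="$file_dir/notes.txt"
printf 'hello, file\n' > "$file"

# Attributes: permissions, owner, size, modification time, and name.
ls -l "$file"

# Size in bytes is one of the attributes (12 here, counting the newline).
size=$(wc -c < "$file" | tr -d ' ')
echo "size: $size bytes"

# Content: the data bytes themselves.
content=$(cat "$file")
echo "content: $content"
```

At no point does the sketch mention where on disk the bytes live; the path alone is enough to reach both the attributes and the content.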
Directories
Directories are metadata that organize files in a structured namespace. In layman's terms, directories are lists of names assigned to each file. You can think of it like a mall directory, where we have shop names and mall locations, e.g. #03-68. In the case of a file system, the "mall location" is the ID of that file, which lets our computer know where the file content is physically located.
We can change the names of each file by changing the content of the directory, while keeping the ID the same. This is analogous to changing the name of a Shop (we need to change the mall directory), but the "address" within that mall is still the same, e.g: #03-68.
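On UNIX-like systems the file ID in this analogy is the inode number, which `ls -i` prints. A short sketch, with made-up file names, showing that a rename changes the directory entry but not the ID:

```shell
# Create a file in a throwaway directory.
shop_dir=$(mktemp -d)
echo "shop contents" > "$shop_dir/old-name.txt"

# ls -i prints the inode number (the file's ID) before the name.
id_before=$(ls -i "$shop_dir/old-name.txt" | awk '{print $1}')

# Renaming only edits the directory entry; the file itself is untouched.
mv "$shop_dir/old-name.txt" "$shop_dir/new-name.txt"
id_after=$(ls -i "$shop_dir/new-name.txt" | awk '{print $1}')

echo "ID before rename: $id_before"
echo "ID after rename:  $id_after"
```

This holds for a rename within the same file system; moving a file to a different partition copies the data and assigns a new ID.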
Note that directory is very similar to the definition of a folder. However folder is a GUI concept, associated with the common folder icons to represent collection of files. If you are referring to a container of documents, then the term folder can be used – related to the GUI. The term directory refers more broadly to the way a structured list of document files and folders is stored on the computer.
Windows vs Linux File System
When we compare the file systems in Windows and Linux: in Microsoft Windows, files are stored in folders on different data drives such as C:, D:, and E:. In Linux, however, files are ordered in a tree structure starting with the root directory (denoted with a forward slash, `/`).
This root directory can be considered the start of the file system, and it further branches out into various other subdirectories. A general tree file system on your UNIX-like OS may look like this:
The full path of the folder `Documents` is `/Users/Ubuntu/Documents`. A file located inside Documents, for example `homework.py`, has a full path of `/Users/Ubuntu/Documents/homework.py`.
Working directory
The working directory of a process is dynamically associated with each process. Our currently running shell is also a process. It gives the "starting point" for a process to navigate the file system.
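That "starting point" behaviour can be sketched in the shell, which is itself a process with a working directory. The temporary directory and the file name `foo.txt` are assumptions for illustration.

```shell
# Set up a throwaway directory tree.
base=$(mktemp -d)
mkdir "$base/Documents"

# Change this shell's working directory, then print it.
cd "$base/Documents"
pwd

# A relative path is resolved against the working directory...
echo "some text" > foo.txt

# ...so the file ends up at <working directory>/foo.txt.
if [ -f "$base/Documents/foo.txt" ]; then
    created=yes
else
    created=no
fi
echo "created at $base/Documents/foo.txt: $created"

# ls with no argument lists the working directory (one level only).
listing=$(ls)
echo "$listing"
```

Running the same `echo "some text" > foo.txt` after a `cd` elsewhere would create a different file, even though the command text is identical.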
When the process refers to a file using a simple file name or relative path (as opposed to a file designated by a full path from a root directory), the reference is interpreted relative to the working directory of the process. For example, a process with working directory `/Users/Ubuntu/Documents` that asks to create the file `foo.txt` will end up creating the file `/Users/Ubuntu/Documents/foo.txt`.
We can change the current working directory of our POSIX-compliant shell with the command `cd`, and see the current working directory with the command `pwd`. The command `ls` lists all files in the current working directory (one level).

When `ls` was first executed in the above example, the current working directory was `/Users/natalie_agus/Desktop`, thus listing all first-level files at that path. The second execution of `ls` lists different outputs because it was executed in a different working directory, `/Users/natalie_agus/Desktop/SUTD Acad`.
Summary
We have learned a lot in just a few hours, and the information might be overwhelming. It is useful to search for terms that are unclear, such as virtualisation and memory, to piece the knowledge together. The purpose of all the knowledge above is to help you utilise OS services better:
- Work with command-line tools such as `git` and `node`, and know how to troubleshoot (we will have more fun with shell scripting on Day 2)
Next, we will learn the basics of Computer Networks to get a basic idea of what goes on when data is transferred between two machines via the Internet. There exist many protocols to ensure smooth delivery of data between the client and server, much like the protocols for package delivery.
Follow this link to proceed.