---
tags: summer_training
---
# Process in Linux
## After course you should learn
### Concept
* Have a solid understanding how Linux treat a process?
* How does compiler convert a text code into binary executable?
* How do multiple processes interact with each other?
## Steps of Compilation
### From source to binary
Let's consider our first and simplest program **helloworld**. Have you ever been curious about how helloworld actually runs in Linux?
```c=
// helloworld.c
#include <stdio.h>
int main() { printf("helloworld\n"); }
```
```
$ gcc helloworld.c -o helloworld
```
Everybody knows how to write `helloworld.c` and compile it with `gcc` which was known **GNU C Compiler** but is renamed as **[GNU Compiler Collection](https://zh.wikipedia.org/wiki/GCC)** currently. From the time Linux was developed, compiler has played a core role in software ecosystem and it is considered as even more important componant in the future.
Following is the standard procedure of compilation. We will go through each step in detail in each section.
```mermaid
graph TD
c_code(helloworld.c)-- preprocessor -->preprocessed(helloworld.i)-- assembler -->asm(helloworld.s)-- compiler -->object(helloworld.o)-- linker -->elf(helloworld.exe)
```
### Preprocessor (PP)
In a word, preprocssing is simply string pasting. With following command, one can inspect the output file after preprocessing.
```bash=
$ gcc -E bigger.c -o bigger.i
```
Here's a simple example with a macro function. As you can see, PP basically replace argument a and b with 2 and 3.
```c=
// bigger.c
#define BIGGER(a, b) a > b? a: b;
int main() {
return BIGGER(2, 3);
}
// bigger.i
# 1 "bigger.c"
# 1 "<built-in>" 1
# 1 "<built-in>" 3
# 341 "<built-in>" 3
# 1 "<command line>" 1
# 1 "<built-in>" 2
# 1 "bigger.c" 2
int main(int argc, char *argv[]) {
return 2 > 3? 2: 3;;
}
```
Although PP seems really stupid, it actually worth more attention if you try to write a better code. Once you master PP, it's a piece of cake for you to generate huge amount of c code from a simple macro.
```c=
// Macro technique
// 1.Include guard
// Sometimes, inclusion logic may seem too difficult to
// assure all compilation unit won't double-include
// certain herder. With include guard, it would be extremely
// easy
// lib.h
#ifndef __MY_LIB__
#define __MY_LIB__
void my_func() {}
#endif
// main.c
#include "lib.h"
int main() {}
```
```c=
// 2.Stringify
// Sometimes it would be really useful to convert variable
// to plain string.
#include <time.h>
#define TIME_FUNC(fun) \
do { \
time_t start = clock(); \
fun(); \
time_t end = clock(); \
double click; \
click = ((double) (end - start)) / CLOCKS_PER_SEC; \
printf("%f click after %s runs\n", click, #fun); \
} while(0)
void heavy_function() {}
int main() { TIME_FUNC(heavy_function); }
```
* [do-while statement in macro](https://stackoverflow.com/a/1067238/13003697)
```c=
// 3.template-like function
// _Generic function has the same behaviour just like
// template function in C++.
#define times_two(x) \
_Generic((x) \
double: two_double(x), \
int32_t: two_int32(x), \
default: no_support(x) \
)(x)
int main() { times_two(...) }
```
#### homework
Please implement a singly linked-list which has following features.
1. Support carrying float, int32_t and uint32_t in node.
2. Support following operation
+ Delete first or last node
+ Append to first or last node
+ Print list. The funciton will out each value and its type depending on node order.
```
[ INT v= 3 00]
[ FLOAT v=0.87 01]
[ FLOAT v=0.45 02]
[ INT v= 87 03]
```
> The exercise here requires participants to master several basic concept in C including:
> 1. [struct](https://www.programiz.com/c-programming/c-structures)
> 2. [enum](https://www.programiz.com/c-programming/c-enumeration)
> 3. [void pointer](https://overiq.com/c-programming-101/void-pointers-in-c/)
> [color=#5bd375]
```c=
// list.c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
enum DataType {
INT32,
FLOAT
};
typedef struct Node {
void *data;
enum DataType dtype;
struct Node *next; // this is the key!!!
} node_t;
#define append(x) ...
#define delete(x) ...
int append_int(int data, node_t *first, bool is_head);
int append_float(float data, node_t *first, bool is_head);
int delete_int(int data, node_t *first, bool is_head);
int delete_float(float data, node_t *first, bool is_head);
float get_random_fp() {
return (float)rand()/(float)(RAND_MAX/1.);
}
int get_random_int() {
return rand();
}
#define N_STEP 100
int main() {
node_t *start = NULL;
srand((unsigned int)time(NULL));
char snum[100];
for(int i = 0; i < N_STEP; i++) {
float fp = get_random_fp();
int x = get_random_int();
uint r = rand() % 8;
printf(
"[%s] %s to %s\n",
r & 1? "APPEND": "DELETE",
r & 2? ftoa(fp, snum, 2): itoa(x, snum, 10),
r & 4? "HEAD": "TAIL"
);
if (r & 1)
append(r & 2? fp: x, start, r & 4);
else
delete(rand() % 2? fp: x, start, is_head);
}
print_list(start);
}
```
### Compiler
As I mention above, though compiler technology has been developed decades, it plays even more important role in modern software stack. It often appears in somewhere you don't expect.
* [Compiler in SQLite Architecture](https://www.sqlite.org/arch.html)
* [BPF Compiler Collection](https://github.com/iovisor/bcc)
* [Deep Learning Compiler TVM](https://tvm.apache.org/docs/)
* [Google V8 Javascript Engine](https://en.wikipedia.org/wiki/V8_(JavaScript_engine))
Usually compiler is tied with Programming Language strongly. As time goes by, various modern [programming model](https://en.wikipedia.org/wiki/Programming_model) is introduced. Compiler design is refined constantly. Compiler is viewed as one of three holly grails from a programmer's perspective. It does require everyone to spend time mastering it.
#### Compiler vs Interpreter
The biggest difference between these two is about execution timing. The only job compiler will do and it is responsible to do it perfectly is to convert from human readable source code to executable binary.
On the other hand, interpreter values different part of process lifecycle. It focus on providing best user interaction. It makes user possible to perform operation on every state. For example, inspecting variable, modifying value in variable and even injecting extra function call in to context.
#### LLVM
In modern days, it is not that **difficult** to implement your own language. There is a existing tool called [LLVM](https://zh.wikipedia.org/wiki/LLVM). The core contribution of the project is that LLVM develops a set of **[Intermidate Representation](https://en.wikipedia.org/wiki/Intermediate_representation)** and make compiler easier to achieve many-to-many compilation.

```c=
// llvm-ir demo
// demo.c
#include <stdio.h>
int main() {printf("helloworld\n");}
```

#### Optimization
* [Out of order execution](https://en.wikipedia.org/wiki/Out-of-order_execution)
As seeking optimal performance switches from higher clock frequency to orchestration among multiple cores, how to make your program suitable to run with multicore becomes an big issue.
CPU instructrion pipeline ensures the behaviour of program with out-of-order excution is just like sequential execution on single core. However things become messy in multicore. It is programmer's duty to make process run on mulitcore perfectly.
* Brench Prediction
Considering following two code snippets.
```c=
// random.c
float get_random_fp() {
return (float)rand()/(float)(RAND_MAX/1.);
}
void testing_op(int *order) {
int a = 0;
for (int i = 0; i < 10000; i++) {
if (order[i])
a++;
else
a--;
}
}
int main() {
srand(time(NULL));
int random[10000]; // [0, 0, 0, 1, 0, 1, 1, 0]
int order[10000]; // [0, 0, 0, 0, 1, 1, 1, 1]
for (int i = 0; i < 10000; i++)
random[i] = rand() % 2;
for (int i = 0; i < 5000; i++)
order[i] = 0;
for (int i = 0; i < 5000; i++)
order[5000 + i] = 1;
// timeit
testing_op(random);
testing_op(order);
}
```
* Localization (Linkedlist and Array)
Considering sorting elements in both linkedlist and array, linkedlist is definitely going to be slower since it is not **[cache-friendly](https://www.stardog.com/blog/writing-cache-friendly-code/)**. To make program runs even faster, when CPU asks for data in certain memory address, it moves both required object but also the neighboring content. Therefore contiguous array would definitely perform outstanding than linkedlist does.

### Linker
As the time goes, your code repository may grow. A intuitive solution is to decouple your application into multiple unit and organize them with directory. Here comes the issue: **how to correctly reassemble them back to the one it should be?**
#### Static and Dynamic Library
There are two ways of linking components together, staticly and dynamicly. It's about timing again. In static linking, compiler aggregate all components into one huge code collection after output.
As for dynamic linking, all components rely on "**Symbols**" to link one to each other. There is a default linking directory priority in compiler and it is configurable through flag `clang -L...`.

Now let's build our own static and dynamic lib.
```c=
// dep.c
#include "dep.h"
void foo() {
printf("helloworld\n");
}
// dep.h
#include <stdio.h>
void foo();
// main.c
#include "dep.h"
int main() {foo();}
```
```bash=
# Static linking
# compile dep.c to object file
$ gcc -c dep.c
$ ar rcs libdep.a dep.o
# Link together
$ gcc main.c libdep.a
# or
$ gcc main.c -L. -ldep
$ ./a.out
```
```bash=
# Dynamic linking
$ gcc -fPIC -c dep.c
$ gcc -shared -o libdep.so dep.o
$ gcc main.c libdep.so
# or
$ gcc main.c -L. -ldep
```
So how do I choose when to use which linking. Following is the comparison between two methods. Yet please remember that nothing is absolute a common scenerio is that both static linking and dynamic linking work perfect together.
| | Static linking | Dynamic linking |
|-|:--------------:|:---------------:|
| code size | bigger | smaller |
| performance | faster | slower |
| flexibility | none | high |
#### homework
Refine linkedlist application above to move `append` and `delete` into different compilation unit.
```
-+- main.c
|
+- operation.c (delete and append)
```
#### Linker interposition
Since all linker cares about is symbol, it is able to fake the symbol and redirect the same symbol to different implementation or in proficient term, relocation. The technique is often used in profiler and debugger such as [Valgrind](https://valgrind.org/).
```c=
// malloc_count.c
#include <stddef.h>
#include <string.h>
#include <stdio.h>
#include <dlfcn.h>
// The customized malloc will be first loaded and
// executed.
void *malloc(size_t size) {
char buf[32];
static void *(*real_malloc)(size_t) = NULL;
if (real_malloc == NULL) {
// Seeking next "malloc" symbol and assign
// it to function pointer.
real_malloc = dlsym(RTLD_NEXT, "malloc");
}
sprintf(buf, "malloc called, size = %zu\n", size);
write(2, buf, strlen(buf));
// here comes the reall allocation
return real_malloc(size);
}
```
#### homework
Please use linker interposition technique to inject printing function in `malloc` and `free` to log out operation information.
### Connection with Python
As a relative senior (older) programmer, I do know the feeling of you guys facing these shitty things. Thus, it's my duty to let you guys know why it matters. Have you guys ever wonder how Python intepreter `import` certain lib? Let me give you a hint.

Yep! Since CPython is written in C, Python has perfect compatibility with C. There are several popular tool such as [Cython](https://cython.org/) and [Pybind11](https://github.com/pybind/pybind11) to write tool and is importable in Python.
Trust me it is not a guarantee that you can always live in a Python environment.
## Process, abstraction of execution unit
So far, what we learn is about how to generate an **executable ([ELF](https://en.wikipedia.org/wiki/Executable_and_Linkable_Format))**. In this chapter we will focus on how Linux treat a process and interaction between Operating System and process and even among processes, or InterProcess Communication.

Note: yes, there is also a format called [DWARF](https://en.wikipedia.org/wiki/DWARF). :smile:
### Process and Thread
#### Terminology
First let's clearify several confusing concept of program, process and thread.
##### Program
Program is a static collection of code without any **living** meaning. It's often referred to unfunctional corpse of application.
##### Process
Process is defined as a dynamic living entity also a minimum resource manageable and sustainable unit. Here an resource manageable unit represents a set of necessary resource for execution including memory, disk, CPU computation power and the most important code itself.

##### Thread
Thread is defined as minimum **executable** and **schedulable** unit. Compared to process, the definition of thread focuses on executing stage. Thread is the direct subject of Linux scheduler. However, thread can't live by its own.
##### Sum up
Followins are some illustration about the relation between thread and process.


### PID and TID
Although by definition thread and process are so different in nature, they are actually quite similar in terms of Linux kernel source code. In fact there is only process at the beginning of Linux and the concept of thread is affected by Windows.
Currently, the thread implementation in Linux is called Native Posix Thread Library (NPTL). Here's a simplification of `task_structt` in Linux kernel [`include/linux/sched.h`](https://elixir.bootlin.com/linux/latest/source/include/linux/sched.h#L629).
```c=
// include/linux/sched.h
struct task_struct {
// ...
pid_t pid; // L804
pid_t tgid; // L805
// ...
struct task_struct *group_leader; // L828
struct list_head thread_group; // L842
}
```
In Posix, PID should satisfy following two conditions.
1. When qurying (`getpid`) PID for different threads under a process, it should return the same value (Process ID).
2. Each thread under a process should have its own Thread ID.
However the `pid` member in L804 is actually Thread ID and `tgid` is socalled Process ID.
```bash
$ ps -eLf
```

### Process lifecycle

#### Fork

```c=
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
void forkexample()
{
// child process because return value zero
if (fork() == 0)
printf("Hello from Child!\n");
// parent process because return value non-zero.
else
printf("Hello from Parent!\n");
}
int main()
{
forkexample();
return 0;
}
```
##### [Copy-On-Write](https://en.wikipedia.org/wiki/Copy-on-write)

##### PID Tree

* pid_0
`swapper` is responsible to perform paging.

* pid_1
`init` is responsible to turn on/off components in Linux kernel.
#### exec
`exec` replace the original process and runs other text from file. The key is that the replaced process still owns the original PID. In other world, we successfully breakthrough the obstacle of sharing same code between parent and child process.

### Process interaction
In Linux everything is a **file** including process ifself. Here what I mean `file` is that it is manageable through FileSystem ([procfs](https://man7.org/linux/man-pages/man5/procfs.5.html)) such as `cat`, `ls` ...

#### [Pipe](https://man7.org/linux/man-pages/man2/pipe.2.html)

* [Send info from one process to another](https://jan.newmarch.name/ProgrammingUnix/ipc/lecture.html)
```c=
#include <stdio.h>
#define SIZE 1024
int main(int argc, char **argv) {
int pipefd[2];
int nread;
int pid;
char buf[SIZE];
if (pipe(pipefd) == -1) {
perror("pipe failed");
exit(1);
}
if (pid == 0) {
/* child */
close(pipefd[1]);
while ((nread = read(pipefd[0], buf, SIZE)) != 0)
printf("child read %s\n", buf);
close(pipefd[0]);
} else {
/* parent */
close(pipefd[0]);
strcpy(buf, "hello...");
/* include null terminator in write */
write(pipefd[1], buf, strlen(buf)+1);
close(pipefd[1]);
}
exit(0);
}
```
## Reference
* [Linking and Loading](http://www.cs.fsu.edu/~baker/opsys/notes/linking.html)
* [How a compiler works](https://www.slideshare.net/jserv/how-a-compiler-works-gnu-toolchain)
* [你所不知道的 C 語言:前置處理器應用篇](https://hackmd.io/@sysprog/c-preprocessor)
* [你所不知道的 C 語言:動態連結器篇](https://hackmd.io/@sysprog/c-dynamic-linkage?type=view)
* [你所不知道的 C 語言:編譯器和最佳化原理篇](https://hackmd.io/@sysprog/c-compiler-optimization?type=view)
* [你所不知道的 C 語言: 執行階段程式庫 (CRT)](https://hackmd.io/@sysprog/c-runtime?type=view)
* [Linux 環境編程 從應用到內核](https://leonawang.com/books/Linux%E7%8E%AF%E5%A2%83%E7%BC%96%E7%A8%8B%EF%BC%9A%E4%BB%8E%E5%BA%94%E7%94%A8%E5%88%B0%E5%86%85%E6%A0%B8%20%28Linux-Unix%E6%8A%80%E6%9C%AF%E4%B8%9B%E4%B9%A6%29.pdf)
* [讀書會](https://hackmd.io/@VIRqdo35SvekIiH4p76B7g/rJVila1o?type=view#%E6%B4%BB%E5%8B%95)
* [Operating systems - Spring 2018](http://www.it.uu.se/education/course/homepage/os/vt18/)