Lecture 05 - Processes and Threads
#### Process
A process, sometimes called a task, is the execution or instantiation of a program on a processor. It is a dynamic sequence of actions combined with the corresponding changes in state, and it represents the entire status information of the resources of a program.
**Virtual processors** - In multiprogramming environments, virtual processors provide an illusion of concurrency for each process or thread. They are not limited to specific scenarios and appear in various forms of virtualisation. The operating system assigns each process its own virtual processor; true parallel processing is only achieved when multiple threads run on their own physical cores. Although the processes appear to run concurrently, each real processor is assigned to only one virtual processor at a time, which makes process switches, managed by the operating system, necessary. This dynamic allocation improves resource utilisation, isolates processes from each other, and gives flexibility in managing computing resources, though challenges such as switching overhead and contention for physical resources also exist.
![[Pasted image 20240217103918.png]]
**Process context** - contains the complete status information of a process: heap data, program code, stack, file information and access rights, the kernel stack (the stack used for the process's system calls), and the hardware context, which consists of the CPU registers and the Memory Management Unit (MMU) state.
![[Pasted image 20240217104419.png]]
**Process tables** - are managed by the OS and hold the information that process management needs for each process; they are stored as a table or as a list of tables. An entry in the process table is also called a **Process Control Block** (PCB) and keeps track of important per-process information such as:
- Program counter
- Process state
- Priority
- Processor time consumed since the process started
- Process number (PID) and parent process PID
- Assigned resources such as files and their metadata
- Current register contents
**Process context switching** - is the mechanism by which the operating system rapidly switches the central processing unit (CPU) from one process to another. This occurs when the operating system decides to interrupt the currently running process and start or resume the execution of another. Context switching involves saving the current state, or context, of the running process, including its register values, program counter and other relevant information, into its process control block (PCB). The operating system then loads the context of the next process to be executed from its PCB into the CPU. This seamless transition allows multiple processes to share the CPU, giving the illusion of concurrent execution and providing effective multitasking in a time-sharing system. In the following example the hardware context of process A is saved into its PCB, the hardware context of process B is loaded from its PCB into the hardware, and later the switch happens the other way around.
![[Pasted image 20240217105033.png]]
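To make this concrete, here is a rough C sketch of a PCB and of the core of a context switch. This is not how any real kernel defines it - the field names, types and the `save_registers()`/`load_registers()` helpers are assumptions for illustration; real context switches are done in architecture-specific assembly.
```c
/* Hypothetical, strongly simplified PCB - real kernels store many more fields. */
typedef enum { READY, RUNNING, BLOCKED, TERMINATED } proc_state_t;

typedef struct pcb {
    int           pid;             /* Process number */
    int           parent_pid;      /* PID of the parent process */
    proc_state_t  state;           /* Current process state */
    int           priority;        /* Scheduling priority */
    unsigned long cpu_time_used;   /* Processor time consumed so far */
    unsigned long program_counter; /* Saved program counter */
    unsigned long registers[16];   /* Saved register contents (hardware context) */
    /* ... open files, MMU/memory-management information, accounting data ... */
} pcb_t;

/* Stand-ins for the architecture-specific assembly routines (hypothetical). */
static void save_registers(unsigned long regs[], unsigned long *pc) { (void)regs; (void)pc; }
static void load_registers(unsigned long regs[], unsigned long *pc) { (void)regs; (void)pc; }

/* Core of a context switch: save the hardware context of the running process
 * into its PCB, then restore the context of the next process from its PCB. */
void context_switch(pcb_t *current, pcb_t *next) {
    save_registers(current->registers, &current->program_counter);
    current->state = READY;

    load_registers(next->registers, &next->program_counter);
    next->state = RUNNING;   /* Execution continues in the next process */
}
```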
**Process life cycle** - starts by:
1. Creating a process via a system call such as `fork()` in Unix or `CreateProcess()` in Windows, which assigns a process identification (PID) and allocates the real processor, main memory and other resources
2. Loading program code and data into memory
3. Creating the PCB in the process table
4. Loading the process context and starting the process
A process can terminate in a couple of ways:
- Normal exit
- Error exit - requested by programmer or fatal error
- Terminated by another process (killed)
The state transitions that a process goes through during its lifetime are:
1. Activate - the OS selects the process
2. Deactivate (preemption) - the OS selects another process
3. Block - the process waits for input or for a resource to be freed
4. Blocking reason lifted - the resource becomes available
5. Process termination or fatal error
![[Pasted image 20240217180316.png]]
##### Process management in Unix
Unix organises processes into a tree-like structure with a clear process hierarchy. Each process receives a unique process ID (**PID**) from the OS. Special processes under Unix are:
- **scheduler** - has PID 0 and is responsible for managing the execution of processes, determining which process runs next and allocating CPU time to them. It plays a crucial role in the overall performance of the system.
- **swapper/page daemon (swapper, swapd)** - manages the swapping of memory pages between RAM and disk to optimise memory usage. It is part of the virtual memory subsystem of the operating system.
- **init** - the first process started by the kernel during system boot. It has PID 1 and is responsible for initialising the system and starting other processes. In modern Unix systems, init has been replaced by newer init systems such as systemd.
In Unix a new process is created with the `fork()` system call, which is invoked by a parent process; the child process inherits a copy of its parent's environment, including:
- all open files and network connections
- environment variables
- current working directory
- data areas
- code areas
- ....
Through the system call `execve()` a new program can then be loaded into the child process (see the second sketch below). Typical process management with `fork()` looks like the following:
```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    int status;   /* Exit status of the child process */
    pid_t ret;    /* Return value of fork(); pid_t is a special data type for a PID */
    pid_t pid;

    ret = fork(); /* Child process is created */
    /* fork() returns 0 in the child process and the PID of the child in the parent */
    if (ret == 0) {
        /* Statements to be executed in the child process */
        /* ... */
        exit(0);  /* Terminate the child process with status 0 (ok) */
    } else {
        /* Statements that are executed only in the parent process
           (return value = PID of the child process) */
        /* ... */
        pid = wait(&status); /* Wait for a child process to end */
        exit(0);  /* Terminate the parent process with status 0 (ok) */
    }
}
```
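The sketch below combines `fork()` with `execve()`: the child replaces its inherited program image with a new program. The concrete program `/bin/ls` and its arguments are only an example.
```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    pid_t pid = fork();                      /* Create the child process */

    if (pid == 0) {
        /* Child: load a new program into the process via execve() */
        char *argv[] = { "ls", "-l", NULL };
        char *envp[] = { NULL };
        execve("/bin/ls", argv, envp);
        perror("execve");                    /* Only reached if execve() fails */
        exit(EXIT_FAILURE);
    } else if (pid > 0) {
        int status;
        waitpid(pid, &status, 0);            /* Parent waits for the child to end */
        printf("child %d exited with status %d\n", (int)pid, WEXITSTATUS(status));
    } else {
        perror("fork");                      /* fork() failed */
        return EXIT_FAILURE;
    }
    return 0;
}
```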
In practice each terminal has a getty process waiting for input or a login; after a successful login a shell process is started, and each command usually executes in its own process.
![[Pasted image 20240217182847.png]]
Every process except the init process has a parent process. This leads to an anomaly: a finished process can continue to exist until its parent is notified of its termination. Such processes are called **zombie processes** - terminated processes that still have an entry in the process table. They exist briefly after a process has terminated but before the parent process retrieves its exit status. When a parent process dies before its child processes, the children continue to execute independently as orphans and are adopted (reparented) by the init process (PID 1), the ancestor of all processes in Unix-like operating systems, which then reaps them when they terminate.
![[Pasted image 20240217183159.png]]
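A small sketch that makes the zombie state visible on a Unix/Linux system (the 10-second sleep is arbitrary, just to leave time to observe the zombie with `ps` in another terminal):
```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    pid_t pid = fork();

    if (pid == 0) {
        exit(0);        /* Child terminates immediately */
    }

    /* Parent does not reap the child yet: for these 10 seconds the child
     * remains a zombie - visible e.g. as state "Z" in the output of ps. */
    sleep(10);

    wait(NULL);         /* Reap the child: its process-table entry disappears */
    return 0;
}
```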
**Process management under Windows** - process creation in Windows is more complex than in Unix. The function `CreateProcess()` is used, and each new process is assigned a PID for management purposes. A POSIX `fork()` also works under Windows within its POSIX subsystem. Whether it is `fork()` on POSIX systems or `CreateProcess()` on Windows, the end goal is the same: creating a new process. Developers working across different platforms often need to account for these differences when writing cross-platform code.
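A minimal Win32 sketch of `CreateProcess()`; the started program (`notepad.exe`) is only an example:
```c
#include <windows.h>
#include <stdio.h>

int main(void) {
    STARTUPINFOA si;
    PROCESS_INFORMATION pi;

    ZeroMemory(&si, sizeof(si));
    si.cb = sizeof(si);
    ZeroMemory(&pi, sizeof(pi));

    char cmd[] = "notepad.exe";   /* Command line must be a writable buffer */

    if (!CreateProcessA(NULL, cmd, NULL, NULL, FALSE, 0, NULL, NULL, &si, &pi)) {
        fprintf(stderr, "CreateProcess failed (%lu)\n", GetLastError());
        return 1;
    }

    WaitForSingleObject(pi.hProcess, INFINITE); /* Wait for the new process to end */
    CloseHandle(pi.hProcess);                   /* Release the process and thread handles */
    CloseHandle(pi.hThread);
    return 0;
}
```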
#### Threads
Threads are lightweight processes. A thread uses the resources of its process - the shared address space, open files and network connections - and exists inside a process; several threads can coexist in one process, which is called multithreading. A thread has its own **state machine** similar to a process, threads can be implemented at different levels (user level or kernel level), and threads are not protected against each other, which means they can communicate with one another but require synchronisation. Besides its own state machine, each thread has its own program counter, its own register set and its own stack.
![[Pasted image 20240217184443.png]]
Threads can be implemented on two levels (a small POSIX threads sketch follows the figure below):
- **kernel level** - threads managed entirely by the kernel, where each thread is a separate entity that is scheduled and managed independently of its process. They can exploit multiprocessor architectures, since the kernel schedules them individually, but context switching between them causes higher overhead than with user-level threads, and they are less portable.
- **user level** - threads managed by user-level thread libraries or runtimes without any kernel involvement; they live entirely inside processes. They are lightweight, since thread creation, context management and synchronisation happen in user space, which also makes them very portable, but they offer only limited concurrency (the kernel sees just one thread of execution per process).
![[Pasted image 20240217190122.png]]
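A minimal POSIX threads sketch (under Linux, pthreads are typically mapped 1:1 to kernel-level threads): two threads of the same process increment a shared global counter, illustrating the shared address space and the need for synchronisation. Compile with `-pthread`.
```c
#include <pthread.h>
#include <stdio.h>

static int shared_counter = 0;   /* Shared: lives in the common address space */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg) {
    const char *name = arg;      /* Each thread has its own stack and arguments */
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&lock);    /* Synchronisation between the threads */
        shared_counter++;
        pthread_mutex_unlock(&lock);
    }
    printf("%s finished\n", name);
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, "thread 1");
    pthread_create(&t2, NULL, worker, "thread 2");
    pthread_join(t1, NULL);      /* Wait for both threads to terminate */
    pthread_join(t2, NULL);
    printf("shared_counter = %d\n", shared_counter);   /* Always 200000 */
    return 0;
}
```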
Threads are not mapped one per process; multiple threads can exist in one process. The mapping of user-level threads (ULTs) to kernel-level threads (KLTs) is crucial for the effective and efficient execution of multithreaded applications. This mapping, often referred to as the threading model, determines how user-level threads are scheduled and executed by the operating system kernel. Threads exist because a thread context switch is much faster than a process context switch, which allows parallelism within a process if the program is written accordingly. One example: *one thread listens for network connection requests, another performs calculations, and another takes care of the user interface and I/O such as keyboard input and screen output.* This approach is used on multi-CPU systems as well as in web servers, where a dispatcher thread waits for incoming HTTP requests and multiple worker threads process them.
![[Pasted image 20240217190709.png]]
```
Dispatcher() {
while (true) {
r = receive_request(); // Wait for incoming requests
start_thread(workerThread, r); // Request arrived
}
}
workerThread(r) { // Thread for request processing
a = process_request(r);
reply_request(a); // Reply back to requestor
}
```
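A runnable C sketch of the same dispatcher/worker pattern with POSIX threads; the incoming requests are simulated here as plain integers, since a real network server is out of scope:
```c
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

static void *worker_thread(void *arg) {
    int request = (int)(intptr_t)arg;       /* Request handed over by the dispatcher */
    printf("worker: processing request %d\n", request);
    sleep(1);                               /* Simulate the actual work */
    printf("worker: replying to request %d\n", request);
    return NULL;
}

int main(void) {                            /* Plays the role of the dispatcher */
    for (int request = 1; request <= 3; request++) {   /* Simulated incoming requests */
        pthread_t tid;
        if (pthread_create(&tid, NULL, worker_thread, (void *)(intptr_t)request) != 0) {
            perror("pthread_create");
            exit(EXIT_FAILURE);
        }
        pthread_detach(tid);                /* Dispatcher does not wait for the worker */
    }
    sleep(2);                               /* Give the workers time to finish */
    return 0;
}
```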
**Process/thread management on Windows** - is quite unique, since Windows groups processes into jobs that are managed as a unit and can carry quotas such as the maximum memory usage per process and the maximum number of processes.
A process in Windows is a container for resources such as threads and memory, a thread is the scheduling unit, and a fiber is a lightweight thread managed in user space.
![[Pasted image 20240217191356.png|inlR|400]]
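As a hedged Win32 sketch (the limit of 4 processes is chosen arbitrarily), a job object with a process quota could be set up roughly like this:
```c
#include <windows.h>
#include <stdio.h>

int main(void) {
    HANDLE job = CreateJobObjectA(NULL, NULL);       /* Create an unnamed job object */
    if (job == NULL) {
        fprintf(stderr, "CreateJobObject failed (%lu)\n", GetLastError());
        return 1;
    }

    /* Quota: at most 4 active processes in this job (value chosen arbitrarily) */
    JOBOBJECT_BASIC_LIMIT_INFORMATION limits = {0};
    limits.LimitFlags = JOB_OBJECT_LIMIT_ACTIVE_PROCESS;
    limits.ActiveProcessLimit = 4;
    SetInformationJobObject(job, JobObjectBasicLimitInformation,
                            &limits, sizeof(limits));

    /* Put the current process (and the children it creates) into the job */
    AssignProcessToJobObject(job, GetCurrentProcess());

    CloseHandle(job);
    return 0;
}
```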
#### Threads in the runtime systems
A runtime, or runtime system, is a set of software tools and libraries responsible for supporting the execution of programs written in a specific programming language. Runtime systems manage the creation, scheduling, and synchronisation of threads, enabling parallelism and concurrency within a program. The use of threads in runtime systems enhances the performance and responsiveness of applications by allowing tasks to be executed concurrently, often making efficient use of multicore processors.
**Threads in Java - JVM and Threads**
For each separate program one JVM is started, which runs as an OS process and supports threads via the package `java.lang`, with a much simplified state machine:
![[Pasted image 20240217192549.png|inlR|500]]
1. Constructor call of the Thread class
2. Calling the `run()` method
3. Calling the `stop()` method
4. Calling the `sleep()` method
5. Calling the `resume()` method
6. Calling the `yield()` method
The translation process and execution flow of a Java program can be sketched as follows:
![[Pasted image 20240217192757.png]]
**Thread class** and **Runnable interface** - are the Java types that enable concurrency: we either define our own class derived from `Thread` and override its `run()` method, or we implement the interface `Runnable` and provide a `run()` method.
![[Pasted image 20240217192923.png|inlR|500]]
```java
import java.lang.Thread;
class myThread extends Thread{
String messageText;
public myThread(String messageText){
this.messageText = messageText; // My Thread class
}
public void run(){ //Method that carries out the actual action
for (;;) {
System.out.println("Thread " +getName()+ ": " +messageText);
try {
sleep(2000);
} catch (Exception e) {
/* Handle exception */
}
}
}
}
public class myThreadTest{
public static void main(String[] args){
myThread t1;
t1 = new myThread("...up and down again and again...");
t1.start(); // automatically starts run()
if (t1.isAlive()) {
for (int i=0; i < 10000000; i++) {
try {
t1.join(10000); // waits up to 10000 millis and continues
// without a parameter it waits until the thread dies
} catch (InterruptedException e) {
/* Handle exception */
}
}
}
}
}
```
Other methods are:
- `getPriority()` - returns the thread's priority
- `isAlive()` - determines whether the thread is still alive
- `getThreadGroup()` - returns the thread group of the thread
- `interrupt()` - interrupts the thread
- `getName()` - returns the name of the thread
In Java, threads are organised hierarchically in groups: the **system** group contains the threads of the JVM itself, and **main**, a subgroup of system, contains the user-specific threads. The **system** group also contains the **finalizer** thread, which calls the finalize method of objects that are to be released.
Garbage collection runs as a very low-priority thread that waits for a signal from the idle thread.
The idle thread sets a mark when it starts running; the garbage collector watches for this mark, which indicates that the JVM has nothing else to do, so the garbage collector can clear the memory.
![[Pasted image 20240217200323.png]]