### Lecture 06 - CPU scheduling

**Important terms**
- **Service / operating time** - the time during which a process holds the CPU and can do work
- **Waiting period / queueing time** - the time a process has to wait for execution, i.e. the sum of all periods in which the process waits
- **Turnaround time** - the total time a process spends in the system, i.e. service time + waiting time
- **Response time** - the time a user has to wait for a request to be processed
- **Throughput** - the number of requests a system can process in a certain time
- **CPU usage** - the CPU utilisation while processing, as a percentage of total capacity
- **Scheduler** - the component that plans the allocation of resources
- **Dispatcher** - the component responsible for the process change and the distribution of work

![[Pasted image 20240217201011.png|inlR|500]]

CPU allocation depends heavily on the scheduling unit, which can be:
- **Process-based** - the CPU is only allocated to entire processes; allocation can also depend on the process type (computing-intensive vs. I/O-intensive)
- **Thread-based** - the CPU is allocated to individual threads

**Time sharing** or **time slicing** - the operating system withdraws the CPU when the running process waits for an event or has used the processor for a certain amount of time. In fixed **time slices** (typically 10 ms or 100 ms, depending on CPU and OS) the system checks whether the process is consuming too much computing time.

**Scheduling criteria and goals** for the allocation of time slices:
- **Fairness** - each process receives a guaranteed minimum allocation
- **Efficiency** - the CPU is used to the fullest
- **Response time** - should be minimised
- **Turnaround time** - waiting times should be as short as possible
- **Throughput** - should be maximised
- **Reliability/predictability** - real-time requirements must be met

![[Pasted image 20240217201541.png|inlR|500]]

#### Scheduling Procedures

**Non-preemptive** scheduling, also known as run-to-completion, ensures that a running process is not interrupted until it is finished; the process gives up the CPU only on its own. It is not suitable for real-time processing, but works well for batch processing, as in MS-DOS.

**Preemptive** scheduling (displacing scheduling, also known as priority interruption) allows compute-ready processes to be suspended, which makes it suitable for competing users; it requires a time-slice mechanism.

Some scheduling algorithms:
- **FCFS** / **FIFO** - First Come First Served / First In First Out
- **Shortest Job First** (SJF)
- **Shortest Remaining Time Next** (SRTN) - preemptive, since the process with the shortest remaining computing time in the system is prioritised

Some strategies allow processes to starve, meaning a process never gets its turn on the CPU; with SRTN, for example, long-running processes may never be scheduled.

Interactive systems/processes tend to combine different scheduling algorithms, such as:
- **Round Robin scheduling** (RR) - FCFS in conjunction with time slices. Its performance depends heavily on the length of the time slice: context switching also takes time, so the ratio of switching time to time slice must not become too large, otherwise throughput suffers. A good value is roughly between 10 and 200 ms.
- **Priority Scheduling** (PS) - highest-priority process next; works with static and dynamic priorities as well as combinations of both
- **Shortest Remaining Time Next** (SRT)
- **Lottery/random scheduling**
- **Fair-share / guaranteed scheduling** - each process gets a uniform share of CPU usage

A combination of several algorithms is possible; usually RR is combined with priority scheduling. In this combination the time slice should be set in relation to the cost of a context switch: for a 1 ms context switch, a time slice of about 10 ms makes sense, so the context switch accounts for roughly 10 % overhead (see the sketch below).

![[Pasted image 20240217204128.png]]
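The relation between quantum length and context-switch overhead can be made concrete with a short calculation. The following Java snippet is only an illustrative sketch under the assumption that overhead is the switching time divided by switching time plus quantum; the class name and the sample values are made up for the example.

```java
// Hypothetical helper: fraction of CPU time lost to context switching for a
// given time-slice length. The 1 ms switch cost and the candidate quanta are
// assumed example values, not measurements.
public class QuantumOverhead {

    // Fraction of time spent switching: switch / (switch + quantum).
    static double overhead(double switchMs, double quantumMs) {
        return switchMs / (switchMs + quantumMs);
    }

    public static void main(String[] args) {
        double switchMs = 1.0; // assumed context-switch cost
        for (double quantum : new double[] {1, 5, 10, 50, 100, 200}) {
            System.out.printf("quantum %6.0f ms -> overhead %5.1f %%%n",
                    quantum, 100 * overhead(switchMs, quantum));
        }
        // For a 10 ms quantum this prints about 9 %, i.e. roughly the 10 %
        // impact mentioned in the text.
    }
}
```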
Additionally, priorities can either be set statically when the process starts or adapted at runtime, which is also referred to as relative prioritisation.

**The ideal case** would be a dynamic method with an initially defined but adjustable time-slice length: computationally intensive processes get a shorter slice, I/O-intensive processes a longer quantum. Process priorities are statically preset but also subject to dynamic change, in combination with multi-level feedback scheduling. This requires management of process data, time-slice lengths and priorities as well as cyclical adjustment.

**Multi-level scheduling** - multiple queues holding different types of jobs with different priorities are managed.

**Multi-level feedback scheduling** - a process can also move to a higher- or lower-priority queue over time.

![[Pasted image 20240217204659.png]]

**Scheduling in real-time systems** - fast and predictable reactions to upcoming events are required. We differentiate between:
- **hard real-time systems** - the reaction must occur within a fixed deadline
- **soft real-time systems** - delays are allowed, but they must stay within targets

In these systems we can also differentiate between:
- **static** scheduling algorithms - the scheduling decision is known in advance
- **dynamic** scheduling algorithms - scheduling decisions are made at runtime

| Real-time OS: Embedded Linux, etc. | Universal OS: Windows, macOS, Linux |
| ---- | ---- |
| Supports hard real-time requirements | Supports only soft real-time requirements |
| Optimised for the worst case | Optimised for various use cases; reaction time is not in the foreground |
| Predictability has high priority when selecting the next task | Efficient scheduling with fair service for all processes, especially dialogue processes, is usually in the foreground |
| Usually only a few dedicated functions are supported | Many applications can run on one system |
| Time optimisation is important | Throughput and response-time behaviour are important |

#### Example of scheduling algorithms

Five processes A to E with estimated execution times in ms. Determine the average turnaround time for the following algorithms (context switching is neglected):

1. Priority scheduling (highest priority value first)

| job | A | B | C | D | E |
| ---- | ---- | ---- | ---- | ---- | ---- |
| Time | 10 | 6 | 4 | 2 | 8 |
| Priority | 3 | 5 | 2 | 1 | 4 |

Execution:

| job | B | E | A | C | D |
| ---- | ---- | ---- | ---- | ---- | ---- |
| Completion time | 6 | 14 | 24 | 28 | 30 |

Total: 102, average turnaround time = 102 / 5 = 20.4 ms

![[Pasted image 20240217205854.png|inlR|500]]

2. FCFS

| job | A | B | C | D | E |
| ---- | ---- | ---- | ---- | ---- | ---- |
| Time | 10 | 6 | 4 | 2 | 8 |
| Priority | 3 | 5 | 2 | 1 | 4 |

Execution:

| job | A | B | C | D | E |
| ---- | ---- | ---- | ---- | ---- | ---- |
| Completion time | 10 | 16 | 20 | 22 | 30 |

Total: 98, average turnaround time = 98 / 5 = 19.6 ms

![[Pasted image 20240217210112.png|inlR|500]]

3. Shortest Job First

| job | A | B | C | D | E |
| ---- | ---- | ---- | ---- | ---- | ---- |
| Time | 10 | 6 | 4 | 2 | 8 |
| Priority | 3 | 5 | 2 | 1 | 4 |

Execution:

| job | D | C | B | E | A |
| ---- | ---- | ---- | ---- | ---- | ---- |
| Completion time | 2 | 6 | 12 | 20 | 30 |

Total: 70, average turnaround time = 70 / 5 = 14 ms

![[Pasted image 20240217210255.png|inlR|500]]

4. RR with priorities

| job | A | B | C | D | E |
| ---- | ---- | ---- | ---- | ---- | ---- |
| Time | 10 | 6 | 4 | 2 | 8 |
| Priority | 3 | 5 | 2 | 1 | 4 |

Execution:

| job | A | E | D | B | C |
| ---- | ---- | ---- | ---- | ---- | ---- |
| Completion time | 10 | 18 | 20 | 26 | 30 |

Total: 104, average turnaround time = 104 / 5 = 20.8 ms

![[Pasted image 20240217211034.png|inlR|500]]

In summary, SJF gives the best result if only the time spent in the system is considered. A small sketch for checking these averages follows below.
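To double-check the averages above, here is a minimal Java sketch that computes completion times for a given non-preemptive execution order; it assumes all jobs arrive at time 0, as in the examples, and the class and method names are made up.

```java
import java.util.Map;

// Minimal sketch: average turnaround time of a non-preemptive schedule,
// assuming all jobs arrive at time 0 (as in the examples above).
public class TurnaroundCheck {

    static double averageTurnaround(Map<String, Integer> burstMs, String[] order) {
        int clock = 0;   // current time
        int sum = 0;     // sum of completion times
        for (String job : order) {
            clock += burstMs.get(job);  // job runs to completion
            sum += clock;               // completion time == turnaround time here
        }
        return (double) sum / order.length;
    }

    public static void main(String[] args) {
        Map<String, Integer> t = Map.of("A", 10, "B", 6, "C", 4, "D", 2, "E", 8);
        System.out.println("Priority: " + averageTurnaround(t, new String[] {"B", "E", "A", "C", "D"})); // 20.4
        System.out.println("FCFS:     " + averageTurnaround(t, new String[] {"A", "B", "C", "D", "E"})); // 19.6
        System.out.println("SJF:      " + averageTurnaround(t, new String[] {"D", "C", "B", "E", "A"})); // 14.0
    }
}
```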
**Exercise 2**

A CPU scheduler supports priority-controlled, thread-based scheduling with static priorities and manages the threads with "ready" status in a multi-level queue structure (run queue). The time slices (quanta) of all threads in a queue with higher priority are always processed completely before the next queue with lower priority is processed.

The following table shows the currently ready threads A to G with their static priorities and the remaining running times in milliseconds. Priority 1 is the highest, priority 3 the lowest.

| thread | A | B | C | D | E | F | G |
| ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- |
| priority | 2 | 1 | 3 | 1 | 2 | 3 | 3 |
| remaining running time in ms | 300 | 200 | 200 | 200 | 300 | 200 | 200 |

The following figure shows the current occupancy of the run queue:

![[Pasted image 20240217211703.png]]

Based on the current situation, determine the scheduling order of the seven threads A to G for priority scheduling with round robin per priority queue and a static quantum of 100 milliseconds (unchanged at runtime) on hardware with one CPU (single-core processor). The pure thread switching time (context switch) is neglected in the calculation. A thread is preempted before its quantum has expired only if it terminates beforehand.

Enter the scheduling order in the table by checking the boxes; one box represents a time slot of 100 milliseconds.

| Thread | 1 | 2 | 3 | 4 | 5 | 6 |
| ---- | ---- | ---- | ---- | ---- | ---- | ---- |
| A |  |  |  |  |  | X |
| B | X |  | X |  |  |  |
| C |  |  |  |  |  |  |
| D |  | X |  | X |  |  |
| E |  |  |  |  | X |  |
| F |  |  |  |  |  |  |
| G |  |  |  |  |  |  |

Now determine the scheduling order for the seven threads on hardware with two computing cores (dual-core processor), where two threads can truly be processed in parallel. Everything else remains as before (static quantum of 100 milliseconds, priority scheduling with round robin per priority queue), the pure thread switching time (context switch) is again neglected, and a thread is preempted before its quantum has expired only if it terminates beforehand. A simulation sketch for the single-core case follows below.
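The single-core case can be checked with a small simulation. This is only a sketch: the initial order inside each queue (B before D, E before A, and C, F, G in alphabetical order) is an assumption based on the boxes ticked above, since the run-queue figure itself is not reproduced here.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch: priority scheduling with round robin per priority queue on one CPU,
// static quantum of 100 ms. The initial order inside each queue is an
// assumption (see the note above).
public class MultiLevelRR {

    public static void main(String[] args) {
        final int quantum = 100;

        // remaining running times in ms
        Map<String, Integer> remaining = new HashMap<>(Map.of(
                "A", 300, "B", 200, "C", 200, "D", 200,
                "E", 300, "F", 200, "G", 200));

        // one queue per priority, highest priority first (assumed initial order)
        List<Deque<String>> queues = List.of(
                new ArrayDeque<>(List.of("B", "D")),          // priority 1
                new ArrayDeque<>(List.of("E", "A")),          // priority 2
                new ArrayDeque<>(List.of("C", "F", "G")));    // priority 3

        int slot = 1;
        // Draining queue by queue works here because all threads are ready at
        // the start and no new threads arrive while scheduling.
        for (Deque<String> queue : queues) {
            while (!queue.isEmpty()) {
                String t = queue.pollFirst();     // front of the current queue
                int run = Math.min(quantum, remaining.get(t));
                remaining.put(t, remaining.get(t) - run);
                System.out.printf("slot %2d: %s runs %d ms%n", slot++, t, run);
                if (remaining.get(t) > 0) {
                    queue.addLast(t);             // quantum used up: back of the same queue
                }
            }
        }
        // Expected order: B D B D E A E A E A C F G C F G,
        // which matches the first six boxes ticked in the table above.
    }
}
```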
#### Scheduling Strategies under Unix

Every Unix derivative has its subtleties, but the supported scheduling strategies share fundamental similarities:
- **Preemptive procedure**
- **Process as the scheduling unit**
- **Priorities with RR multi-level feedback scheduling** - one queue per priority and RR within the same priority; the front-most process from the currently highest-priority queue is selected next. After its time slice has expired, the process is added to the end of its current queue or to the end of another queue. The multi-level queues are organised in **run queues**.

![[Pasted image 20240217212409.png]]

Priorities in Unix are assigned as follows: a process receives an initial priority at start, which is subject to change over time. The levels originally ranged from -127 (highest) to +127 (lowest); nowadays they range from 0 to 255. The priority is usually recalculated cyclically, about once per second. Past CPU usage is a parameter in the priority calculation: if a process has used the CPU heavily recently, its priority is lowered, whereas I/O-intensive processes such as dialogue processes are preferred.

`nice` - a Unix command that affects the static priority when starting a program. It is "nice" to other processes because it lowers the priority of the command in question; it accepts parameter values from -20 to 19.

`nice -<int-value> <command>` - `<command>` is started with a priority value `<int-value>` lower than the default, e.g. `nice -10 bash` starts `bash` with a priority value 10 below the default.

Other commands/system calls that adjust priorities:
- **renice** - changes the priority of a running process
- **setpriority** - system call used instead of nice

#### Scheduling strategies under Linux

The Linux **O(1) scheduler** uses priorities with RR and a multi-level feedback mechanism; the scheduling unit is the **thread**, implemented at kernel level. The supported scheduling policies can be checked with `ps -c`, which shows the policy of each process/thread. They usually are:
1. **Real-time FIFO** (POSIX, SCHED_FIFO) - highest priorities, non-preemptive
2. **Real-time Round Robin** (POSIX, SCHED_RR) - like FIFO, but preemptive and using time slices
3. **Timeshare** (SCHED_BATCH, SCHED_OTHER, SCHED_IDLE) - preemptive, for standard user threads

Linux also manages the multi-level feedback queues via run queues.

![[Pasted image 20240217214026.png]]

The CPU is withdrawn from a thread when its time slice has expired or when the thread blocks (waits for a resource); ready threads with higher priority are checked for from time to time. This policy treats the highest-priority processes with preference: they are processed as much as possible until they are either completed or blocked, and only when the highest-priority queue has been processed is the next lower-priority queue considered. Above all, real-time processes may hinder or take precedence over all others because of their time criticality.

Priorities in Linux are handled as follows: each thread has one static and one dynamic priority. The priority scale ranges from 0 to 139, where the timeshare priorities range from 100 (highest) to 139; the corresponding `nice`/`setpriority` values range from -20 to 19. The default priority value is 120, and priorities 0 to 99 are real-time priorities. Higher-priority processes receive more CPU time, and changing the priority of a process can mean attaching it to another queue.

The time slice, i.e. the time a process may spend on the CPU, is a direct function of the static priority (see the sketch below):

$$\text{time-slice} = f(\text{static-priority}) = \begin{cases} (140 - \text{static-priority}) \times 20 \text{ ms} & \text{if static-priority} < 120 \\ (140 - \text{static-priority}) \times 5 \text{ ms} & \text{if static-priority} \geq 120 \end{cases}$$
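As an illustration of the formula above, here is a small Java sketch; the class and method names are made up, and the sample values follow directly from the lecture's formula.

```java
// Sketch of the O(1) time-slice formula given above (values in ms).
public class O1TimeSlice {

    static int timeSliceMs(int staticPriority) {
        // timeshare priorities (>= 120) get the smaller factor
        int factor = (staticPriority < 120) ? 20 : 5;
        return (140 - staticPriority) * factor;
    }

    public static void main(String[] args) {
        System.out.println(timeSliceMs(100)); // 800 ms (highest timeshare priority)
        System.out.println(timeSliceMs(120)); // 100 ms (default priority)
        System.out.println(timeSliceMs(139)); //   5 ms (lowest priority)
    }
}
```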
The scheduler also assigns a bonus that depends on the average sleep time; the bonus can be negative, which improves the priority, and a priority change can mean placing the thread in another queue.

**Effective priority** $\approx$ **static priority + bonus**

![[Pasted image 20240218090854.png]]

**Summary:** O(1) stands for the constant time complexity of the scheduler's operations. It works with **static priorities**, which determine CPU allocation and time-slice length, while the effective (dynamic) priorities determine the placement in the run queue. Computationally intensive threads use up their time slice (granted according to their static priority) and receive a positive ("bad") bonus, which worsens their effective priority; so the static priority determines the time-slice length, and the computational intensity influences the effective priority. The **average sleep time** is used to decide whether a process is interactive (I/O-intensive) or computationally intensive. I/O-intensive threads receive a negative bonus to the effective priority and are scheduled earlier.

**Linux Completely Fair Scheduler** (CFS) - replaced the O(1) scheduler as of Linux 2.6.23. It is the scheduler for timesharing processes; the `fair.c` module with approximately 4400 source lines of code implements CFS, while another module, `rt.c`, with only about 1700 source lines of code, implements real-time scheduling with 100 priority levels. Its particularities: little to no statistics, no classic run queue, no switching between active and expired queues, and no time-slice adjustments. CFS is somewhat controversial because it can be a disadvantage for workstations, but it works very well for servers.

The CFS strategy is quite simple: for each process/thread (per group/CPU) a waiting-time value is maintained, sorted and managed in a red-black tree. As an example of resource allocation: with 5 processes, each is entitled to 20 % CPU utilisation. The **vruntime** (virtual runtime) of each process is calculated with the formula

$$\text{vruntime} = 1 - \frac{\text{wait-time}}{\text{alive-time}} \times \text{nice-weight}$$

The process with the lowest vruntime is selected and runs until a fair state is reached again; this ensures that processes are scheduled in a way that maintains fairness among them. The fairness goal is that the vruntime values of all processes eventually reach an equilibrium. The strategy implicitly handles the differences between I/O-bound and CPU-bound processes: it does not explicitly differentiate between these types, but relies on fair allocation based on vruntime. A simplified sketch of the selection mechanism follows below.

`chrt -p <pid>` - command that shows the scheduling policy and priority of a process; it can also set the scheduling policy of a process.
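To make the red-black-tree selection concrete, here is a simplified Java model. It is not the lecture's exact vruntime formula and not kernel code: it merely advances each task's virtual runtime in proportion to the CPU time it receives divided by its weight and always picks the task with the smallest virtual runtime (Java's `TreeMap` is itself a red-black tree). All names, weights and the 10 ms slice are made-up example values.

```java
import java.util.Map;
import java.util.TreeMap;

// Simplified CFS-like selection: runnable tasks are kept in a red-black tree
// (TreeMap) ordered by virtual runtime; the leftmost (smallest vruntime) task
// runs next, and its vruntime grows with the CPU time it used divided by its
// weight.
public class CfsSketch {

    public static void main(String[] args) {
        final double sliceMs = 10.0;   // assumed CPU time granted per pick

        // bigger weight -> vruntime grows more slowly -> task runs more often
        Map<String, Double> weight = Map.of(
                "io-task", 2.0, "cpu-task", 1.0, "batch-task", 1.0);

        // key = current vruntime, value = task name (keys kept unique here)
        TreeMap<Double, String> tree = new TreeMap<>();
        tree.put(0.000, "io-task");
        tree.put(0.001, "cpu-task");
        tree.put(0.002, "batch-task");

        for (int pick = 0; pick < 9; pick++) {
            Map.Entry<Double, String> next = tree.pollFirstEntry(); // smallest vruntime
            String task = next.getValue();
            System.out.printf("pick %d: %-10s (vruntime %6.2f)%n",
                    pick, task, next.getKey());
            // charge the consumed time, scaled by the task's weight
            tree.put(next.getKey() + sliceMs / weight.get(task), task);
        }
    }
}
```

In the real kernel the weight is derived from the task's nice value and the tree holds scheduling entities rather than plain strings; the sketch only shows why a more heavily weighted (I/O-favoured) task gets picked more often.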
#### Scheduling strategies under Windows

Windows uses priority-controlled, preemptive scheduling with multi-level feedback, where threads are the scheduling unit. Threads are organised by priority in multi-level feedback queues; the kernel provides 32 internal priorities, from 0 to 31:
- the zero-page and idle threads have priority 0
- priorities 16 - 31 are real-time priorities and do not change
- priorities 1 - 15 are dynamic priorities

![[Pasted image 20240218093615.png|inlR|300]]

Multi-level feedback queues in Windows. "Deferred ready" threads are already assigned to a processor but not yet active.

![[Pasted image 20240218093638.png]]

**Windows API priorities** - the API provides 6 priority classes for **processes**: idle, below normal, normal, above normal, high and real-time. There are also **thread** priority classes: time critical, highest, above normal, normal, below normal, lowest and idle.

The function `SetPriorityClass()` sets the priority class for the threads of a process; `SetThreadPriority()` allows the priority of an individual thread within the process to be changed actively.

| Thread priority \ Priority class | real time | high | above normal | normal |
| ---- | ---- | ---- | ---- | ---- |
| time critical | 31 | 15 | 15 | 15 |
| highest | 26 | 15 | 12 | 10 |
| above normal | 25 | 14 | 11 | 9 |
| normal | 24 | 13 | 10 | 8 |
| below normal | 23 | 12 | 9 | 7 |
| lowest | 22 | 11 | 8 | 6 |
| idle | 16 | 1 | 1 | 1 |

These priority classes can be manipulated with the Task Manager, which can also show the base priority; changing the priority of a process also changes the priorities of its threads. System processes use the special system call NtSetInformationProcess.

![[Pasted image 20240218095510.png]]

**Priority increase:** for I/O operations the priority can be raised depending on the device driver that delivers the input; for example, mouse/keyboard input can raise the priority by +6 and the end of a disk I/O by up to +1. The maximum an I/O boost can reach is priority 15.

![[Pasted image 20240218095936.png]]

**Priority increase to rescue starving threads:** a low priority can be a disadvantage for certain threads, because computationally intensive threads with higher priority always come first. Therefore, every second the system checks whether a thread has not been active for a while even though it is ready. If a thread has been stale for a long time, its priority is raised to 15 and its time-slice length is quadrupled; after it has run, it is reset to its old state, which avoids starvation (see the sketch below).

![[Pasted image 20240218100802.png]]

The clock interval in Windows is approx. 10 ms for x86 single processors and 15 ms for x86 multiprocessors. The quantum (time-slice) counter per thread is set to 6 on workstations (since XP) and 36 on Windows Server, and it is reduced by 3 per clock interrupt.
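The anti-starvation boost can be illustrated with a simplified model. This is not Windows code, only a hedged sketch of the rule described above: once per "second", ready threads that have not run for a while are lifted to priority 15 with a quadrupled quantum and restored after they have run. The 3-second threshold and all class and field names are assumptions made for the example.

```java
import java.util.List;

// Simplified model of the Windows anti-starvation boost described above.
// The threshold and all names are assumptions, not Windows internals.
public class StarvationBoostSketch {

    static class ReadyThread {
        final String name;
        final int basePriority;   // original dynamic priority (1..15)
        final int baseQuantum;    // quantum counter (e.g. 6 on a workstation)
        int priority;             // current priority
        int quantum;              // current quantum counter
        int waitedSeconds;        // time spent ready but not running

        ReadyThread(String name, int priority, int quantum) {
            this.name = name;
            this.basePriority = this.priority = priority;
            this.baseQuantum = this.quantum = quantum;
        }
    }

    static final int STARVATION_SECONDS = 3;   // assumed threshold

    // Called once per second by the (hypothetical) balance check.
    static void balanceCheck(List<ReadyThread> readyThreads) {
        for (ReadyThread t : readyThreads) {
            if (t.waitedSeconds >= STARVATION_SECONDS) {
                t.priority = 15;                 // lift to the top dynamic priority
                t.quantum = t.baseQuantum * 4;   // quadruple the quantum
            }
        }
    }

    // Called after a boosted thread has used its quantum.
    static void restore(ReadyThread t) {
        t.priority = t.basePriority;
        t.quantum = t.baseQuantum;
    }

    public static void main(String[] args) {
        ReadyThread t = new ReadyThread("worker", 4, 6);
        t.waitedSeconds = 5;                     // ready but idle for 5 s
        balanceCheck(List.of(t));
        System.out.println(t.priority + " / " + t.quantum);  // 15 / 24
        restore(t);
        System.out.println(t.priority + " / " + t.quantum);  // 4 / 6
    }
}
```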
Comparison of Windows and the Linux O(1) scheduler:

| Criteria | Windows | Linux O(1) scheduler |
| ---- | ---- | ---- |
| Supported process types | Timesharing and real-time processes; distinction via priority | Timesharing, real time with FIFO, real time with RR; distinction via priority |
| Priority levels | Internal kernel priority levels 0 - 31 (0 - 15: timesharing threads, 16 - 31: real-time and system threads); own WinAPI priorities with base priorities for processes and relative priorities for threads | Static and effective priorities with levels 0 - 139 (139 lowest); 0 - 99: real-time processes, 100 - 139: timesharing processes |
| Scheduling strategy for timesharing processes | Thread-based, favouring interactive over computationally intensive threads; quantum, priority-driven, round robin, multi-level feedback queue, 32 queues | Process-based, favouring interactive over computationally intensive processes (thread = process); quantum, priority-controlled, round robin, multi-level feedback queue, 140 queues |
| Priority calculation for timesharing processes | Current priority is calculated at runtime; priority boost for waiting threads after a waiting time or for GUI threads | Effective priority is calculated at runtime, depending on the static priority and a bonus (+/- 5) for long-sleeping processes; effective priority = static priority + bonus (simplified) |
| Quantum calculation for timesharing processes | Quantum counter = 6 for workstations and 36 for Windows servers; with each clock interval the counter is decremented by 3; the clock interval is approx. 10 ms (100 Hz) for x86 single processors and approx. 15 ms (67 Hz) for multiprocessors; quantum increase for GUI threads or to prevent thread starvation | Depends on the static priority: quantum = (140 - static prio) * 20 ms (or * 5), maximum 800 ms; the internal system clock could be set to 100 Hz up to Linux 2.5 and to 100, 250, 300 or 1000 Hz from Linux 2.6; with each tick the quantum of the active process is reduced by the clock interval |

#### Scheduling in Java

Scheduling in Java is based on priority and time slices. Each thread has a priority between `Thread.MIN_PRIORITY` and `Thread.MAX_PRIORITY` (with `Thread.NORM_PRIORITY` as the default); the methods `setPriority()` and `getPriority()` set and read the current priority. A high-priority thread usually has an advantage, but this depends on how the JVM implements prioritisation. Scheduling is preemptive, meaning a thread is interrupted after a certain time slice. Java scheduling can also be described as "weakly fair": every thread gets its turn at some point, but not an even share. There is one queue per priority; the thread at the front of the queue with the highest priority runs next, and priorities are increased for long-waiting threads. A short example follows below.
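A minimal example of the Java priority API mentioned above; note that the effect of the priorities on actual scheduling depends on the JVM and the underlying OS, so the finishing order is not guaranteed.

```java
// Minimal demo of Java thread priorities: setPriority()/getPriority() and the
// MIN/NORM/MAX constants from java.lang.Thread. Whether the high-priority
// thread actually finishes first depends on the JVM and the OS scheduler.
public class PriorityDemo {

    static Thread countingThread(String name, int priority) {
        Thread t = new Thread(() -> {
            long counter = 0;
            for (int i = 0; i < 50_000_000; i++) {
                counter += i;           // busy work so the scheduler has something to share
            }
            System.out.println(name + " finished (priority "
                    + Thread.currentThread().getPriority() + ", counter " + counter + ")");
        });
        t.setPriority(priority);        // must lie between MIN_PRIORITY and MAX_PRIORITY
        return t;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("MIN=" + Thread.MIN_PRIORITY
                + " NORM=" + Thread.NORM_PRIORITY
                + " MAX=" + Thread.MAX_PRIORITY);   // 1, 5, 10

        Thread low  = countingThread("low ", Thread.MIN_PRIORITY);
        Thread high = countingThread("high", Thread.MAX_PRIORITY);
        low.start();
        high.start();
        low.join();
        high.join();
    }
}
```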