Scheduling Beuth Hochschule Summer Term 2014 Pictures (C) W. Stallings, if not stated otherwise
Page 1: Operating Systems 1 (10/12) - Scheduling

Scheduling

Beuth Hochschule, Summer Term 2014. Pictures (C) W. Stallings, if not stated otherwise

Page 2: Operating Systems 1 (10/12) - Scheduling

Operating Systems I PT / FF 14

Process Concept

• Classically, processes are executed programs that have ...

• Resource Ownership

• Process includes a virtual address space to hold the process image

• Operating system prevents unwanted interference between processes

• Scheduling/Execution

• Process follows an execution path that may be interleaved with other processes

• Process has an execution state (Running, Ready, etc.) and a dispatching priority and is scheduled and dispatched by the operating system

• Today, the unit of dispatching is referred to as a thread or lightweight process

• The unit of resource ownership remains the process or task


Page 3: Operating Systems 1 (10/12) - Scheduling


Single and Multithreaded Processes

[Figure: a single-threaded process (code, data, files, one set of registers, one stack, one thread) next to a multithreaded process (shared code, data, files; each thread has its own registers and stack)]

Page 4: Operating Systems 1 (10/12) - Scheduling


Scheduling

• Assign activities (processes / threads) to processor(s)

• System objectives to be considered: response time, throughput, efficiency, ...

• Long-term scheduling: Decision to add a process to the pool of executed processes

• Example: Transition of a new process into „ready“ state; batch processing queue

• Medium-term scheduling: Decision to load process into memory for execution

• Example: Resume suspended processes from backing store

• Short-term scheduling: Decision which particular ready process will be executed

• Example: Move a process from „ready“ state into „running“ state

• I/O scheduling: Decision which process is allowed to perform device activities

• Overall goal is to minimize queuing time for all processes


Page 5: Operating Systems 1 (10/12) - Scheduling


Short-Term Scheduler

• In cooperation with the dispatcher as part of the core operating system function

• Frequent fine-grained decision about what runs next, happens on:

• Clock interrupt (regular scheduling interval)

• I/O interrupts

• Operating system calls

• Signals

• Any event that blocks the currently running process / thread

• Needs decision criteria to choose the next process / thread

• User perspective vs. system perspective


Page 6: Operating Systems 1 (10/12) - Scheduling


CPU and I/O Bursts

• Processes / threads can be described as either:

• I/O-bound – spends more time doing I/O than computations, many short CPU bursts

• Compute-bound – spends more time doing computations, few very long CPU bursts

• Behavior can change during run time

• Many short CPU bursts are typical

[Figure: execution trace alternating between CPU bursts (e.g. load val, inc val, read file; inc count, add data, val, write file) and I/O bursts (wait for I/O), next to a distribution of burst durations in msec (axis 0 to 30)]

Page 7: Operating Systems 1 (10/12) - Scheduling


Short-Term Scheduler

• Scheduling criteria

• CPU utilization - Keep the CPU as busy as possible

• Throughput - Number of processes that complete their execution per time unit

• Turnaround time - Amount of time to fully execute a particular process

• Waiting time - Amount of time a process has been waiting in the ready queue

• Response time - Amount of time it takes from when a request was submitted until the first response is produced

• Response is not necessarily valuable output, can also be just a wait indicator
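
The criteria above can be made concrete with a small calculation. The following is a minimal sketch, assuming a non-preemptive first-come-first-served run on one CPU; the `Job` type, the `fcfs_metrics` helper, and the sample jobs are hypothetical and not part of the lecture material.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    arrival: int  # time at which the job was submitted
    burst: int    # CPU time the job needs in total


def fcfs_metrics(jobs):
    """Run jobs in arrival order without preemption and collect per-job metrics."""
    clock = 0
    rows = {}
    for job in sorted(jobs, key=lambda j: j.arrival):
        start = max(clock, job.arrival)   # the CPU may be idle until the job arrives
        finish = start + job.burst
        rows[job.name] = {
            "waiting": start - job.arrival,      # time spent in the ready queue
            "turnaround": finish - job.arrival,  # submission to completion
        }
        clock = finish
    throughput = len(jobs) / clock if clock else 0.0  # completions per time unit
    return rows, throughput


jobs = [Job("A", 0, 5), Job("B", 1, 3), Job("C", 2, 8)]
rows, throughput = fcfs_metrics(jobs)
for name, row in rows.items():
    print(name, row)
print("throughput:", throughput)
```

Because the run is non-preemptive, each job's response time here equals its waiting time; under a preemptive policy the two would differ.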


Page 8: Operating Systems 1 (10/12) - Scheduling


Short-Term Scheduling Criteria

              User-oriented                                  System-oriented

Performance   Turnaround time (submission to completion)     Throughput (#process completions)
              Response time (interactive)                    Resource utilization
              Deadlines

Other         Predictability (regardless of system load)     Fairness (no starvation)
                                                             Priority enforcement
                                                             Resource balancing

Page 9: Operating Systems 1 (10/12) - Scheduling


Short-Term Scheduling: Multiprocessors

• Load Sharing - Processes are not assigned to a particular processor, global queue

• Central data structure with mutual exclusion may become a bottleneck

• Caching may become ineffective

• Optimized version became default in all standard operating systems

• Gang Scheduling - Set of related threads is scheduled to run on a set of processors at the same time on a one-to-one basis

• Mainly beneficial for parallel applications

• Dedicated Processor Assignment - Implicit scheduling by the fixed assignment of threads to processors until completion

• Sacrifices processor utilization for an exact metric of performance

• Dynamic Scheduling - Number of threads in a process can be altered by the scheduler (research approach)


Page 10: Operating Systems 1 (10/12) - Scheduling


Scheduling Function and Decision Mode

• Selection function for scheduling determines which process, among ready processes, is selected next for execution

• May be based on priority, resource requirements, or the execution characteristics

• If based on execution characteristics, then important quantities are:

• w = time spent in system so far, waiting

• e = time spent in execution so far

• s = total time required by the process, including e (user estimation)

• Decision mode specifies the kind of scheduler

• Preemptive: Currently running process is interrupted and moved to ready queue

• Non-preemptive: Process runs until termination or intentional blocking (e.g. I/O)


Page 11: Operating Systems 1 (10/12) - Scheduling


Round Robin

• Uses preemption based on a clock interrupt, manages „ready“ processes in a queue

• Also known as time slicing - each process gets a time quantum

• Particularly effective in time-sharing systems or transaction processing systems

• Compute-bound processes are favored over I/O bound processes in mixed load

• I/O wait delays the move-back to the „ready“ list

• Better for short jobs in comparison to FCFS

• Very short quantum brings overhead penalty, typical lower limit of 10ms


Page 12: Operating Systems 1 (10/12) - Scheduling


Round Robin


• Quantum should be slightly longer than the time required to complete a typical request or function

• Quantum higher than the longest request processing time leads to pure FCFS

Thread execution time: 15

Quantum   Context switches
20        0
10        1
1         14
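
The relation between quantum and context switches in this example can be checked with a short calculation. This is a sketch under the assumption that a single thread runs alone and is switched out only when a quantum expires before it finishes; `context_switches` is a hypothetical helper name.

```python
import math


def context_switches(execution_time, quantum):
    """Number of quantum expirations before the thread completes."""
    # The thread needs ceil(execution_time / quantum) time slices;
    # a context switch happens between consecutive slices.
    return math.ceil(execution_time / quantum) - 1


for quantum in (20, 10, 1):
    print(quantum, context_switches(15, quantum))
```

A quantum at or above the execution time yields zero switches, which is the pure FCFS case mentioned above.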

Page 13: Operating Systems 1 (10/12) - Scheduling


Round Robin with I/O Bursts


Thread   Burst time
T1       23
T2       7
T3       38
T4       14

Resulting schedule (start time of each slice in parentheses, last slice ends at 82):

T1 (0), T2 (10), T3 (17), T4 (27), T1 (37), T3 (47), T4 (57), T1 (61), T3 (64), T3 (74)
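
The schedule above can be reproduced with a small simulation. This is a minimal sketch, assuming a quantum of 10 (inferred from the slice lengths in the chart, not stated on the slide), all threads ready at time 0, and no context-switch cost; `round_robin` is a hypothetical helper.

```python
from collections import deque


def round_robin(bursts, quantum):
    """Return the schedule as (thread, start, end) slices."""
    queue = deque(bursts.items())  # ready queue in arrival order
    clock = 0
    schedule = []
    while queue:
        name, remaining = queue.popleft()
        run = min(quantum, remaining)  # run for one quantum or until done
        schedule.append((name, clock, clock + run))
        clock += run
        if remaining > run:
            # Quantum expired: the thread goes to the tail of the ready queue.
            queue.append((name, remaining - run))
    return schedule


schedule = round_robin({"T1": 23, "T2": 7, "T3": 38, "T4": 14}, quantum=10)
for name, start, end in schedule:
    print(f"{name}: {start} to {end}")
```

Note that T3 receives two consecutive slices at the end because it is the only thread left in the queue.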

Page 14: Operating Systems 1 (10/12) - Scheduling


Multilevel Queue Scheduling

• Ready queue is partitioned into separate queues

• Real-time (system, multimedia) and Interactive

• Queues may have different scheduling algorithms

• Real-Time – Round Robin

• Interactive – Round Robin + priority-elevation + quantum stretching

• Scheduling must be done between the queues

• Fixed priority scheduling (i.e., serve all real-time threads first, then threads from the interactive queue)

• Possibility of starvation

• Time slice – each queue gets a certain amount of CPU time which it can schedule

• Established approach in Solaris operating system family


Page 15: Operating Systems 1 (10/12) - Scheduling


Example: Windows

• Windows dispatcher

• Gives control to the thread selected by the short-term scheduler

• Switching context, switching to user mode

• Jumping to the proper location in the user program to restart that program

• Windows has no mid-term or long-term scheduler

• Dispatch latency – time it takes for the dispatcher to stop one thread and start another

• Windows scheduling is event-driven - no central dispatcher module in the kernel

• Starvation problem

• Unix: Decreasing priority + aging

• VMS / Windows: Priority elevation


Page 16: Operating Systems 1 (10/12) - Scheduling


Windows Scheduler

• Priority-driven preemptive scheduling system

• Highest-priority runnable thread always runs

• Thread runs for the duration of one quantum

• No single scheduler - event-based scheduling code spread across the kernel

• Dispatcher routine triggered by the following events:

• Thread becomes ready for execution

• Thread leaves running state (quantum expires, wait state)

• Thread‘s priority changes (system call / NT activity)

• Processor affinity of a running thread changes


Page 17: Operating Systems 1 (10/12) - Scheduling


Windows Scheduling Principles

• 32 priority levels

• Threads within same priority are scheduled following round robin policy

• Realtime priorities (i.e., > 15) are assigned statically to threads

• Non-realtime priorities are adjusted dynamically

• Priority elevation as response to certain I/O and dispatch events

• Quantum stretching to optimize responsiveness

• In multiprocessor systems, affinity mask is considered

• No attempt to share processors fairly among processes, only among threads

[Figure: Windows priority levels. 16 „realtime“ levels (16 to 31), 15 variable levels (1 to 15), level 0 used by the zero page thread, and a level below that used by the idle thread(s)]

Page 18: Operating Systems 1 (10/12) - Scheduling


Multiprocessor Systems

• Threads can run on any CPU, unless specified otherwise

• Scheduling tries to keep threads on same CPU (soft affinity)

• Threads can be bound to particular CPUs (hard affinity)

• SetThreadAffinityMask, SetProcessAffinityMask, SetInformationJobObject

• Bit mask where each bit corresponds to a CPU number

• Thread affinity mask must be a subset of process affinity mask, which must be a subset of the active processor mask and may be derived from the image affinity mask, if given

• The scheduling code runs fully distributed, no ‚master‘ processor

• Any processor can interrupt another processor to schedule a thread

• Scheduling database as per-CPU data structure of ready queues
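
The subset rule for affinity masks can be illustrated with plain integer bit masks. The following is a sketch only; `validate_affinity` is a hypothetical helper working on Python integers and does not call the Windows API functions named above.

```python
def validate_affinity(thread_mask, process_mask, active_mask):
    """Check the subset rules and return the CPU numbers the thread may run on.

    Bit n of each mask stands for CPU number n.
    """
    if process_mask & ~active_mask:
        raise ValueError("process mask contains CPUs that are not active")
    if thread_mask & ~process_mask:
        raise ValueError("thread mask is not a subset of the process mask")
    return [cpu for cpu in range(active_mask.bit_length())
            if (thread_mask >> cpu) & 1]


# Four active CPUs, process restricted to CPUs 0-2, thread bound to CPUs 0 and 2.
print(validate_affinity(0b0101, 0b0111, 0b1111))
```

A mask `a` is a subset of a mask `b` exactly when `a & ~b` is zero, which is the test used in both checks.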


Page 19: Operating Systems 1 (10/12) - Scheduling


Multiprocessor Systems

• Every thread has an ideal processor

• System selects ideal processor for the first thread of a fresh process (round robin across CPUs)

• Next thread gets next CPU relative to the process seed

• SetThreadIdealProcessor (HANDLE hThread, DWORD dwIdealProcessor)

• Hard affinity changes update ideal processor settings

• Used in selecting where a thread runs next

• Hyperthreading: GetLogicalProcessorInformation()

• NUMA systems: GetProcessAffinityMask(), GetNumaProcessorNode(), GetNumaHighestNodeNumber(), GetNumaNodeProcessorMask()


Page 20: Operating Systems 1 (10/12) - Scheduling


Windows Scheduling Principles

• No central scheduler, i.e. there is no routine or thread called “the scheduler”

• Routines are called whenever events change the ready state of a thread

• Things that cause scheduling events include:

• Interval timer interrupts (for quantum end)

• Interval timer interrupts (for timed wait completion)

• Other hardware interrupts (for I/O wait completion)

• Thread changes the state of a waitable object upon which thread(s) are waiting

• A thread waits on one or more dispatcher objects

• A thread priority is changed

• Based on doubly-linked lists (queues) of ready threads


Page 21: Operating Systems 1 (10/12) - Scheduling


Windows Scheduling Principles

• Windows API point of view

• Processes are given a priority class upon creation ( Idle, Normal, High, Realtime )

• Windows 2000 added “Above normal” and “Below normal”

• Threads have a relative priority within the class ( Idle, Lowest, Below_Normal, Normal, Above_Normal, Highest, and Time_Critical )

• Different API functions to influence scheduling ( Get/SetPriorityClass, Get/SetThreadPriority, Get/SetProcessAffinityMask, SetThreadAffinityMask, SetThreadIdealProcessor, Suspend/ResumeThread )

• Kernel point of view

• Threads have priorities 0 through 31, scheduled accordingly

• Process priority class is not used to make scheduling decisions
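
How the API view collapses into a single kernel priority can be sketched as a lookup. The numeric values below are not given on the slides; they are assumptions taken from Microsoft's published scheduling priority table, and `kernel_priority` is a hypothetical helper, not a Windows function.

```python
# Assumed base priority per process priority class (not stated on the slides).
CLASS_BASE = {
    "Idle": 4, "Below normal": 6, "Normal": 8,
    "Above normal": 10, "High": 13, "Realtime": 24,
}
# Assumed offset per relative thread priority.
RELATIVE_OFFSET = {
    "Lowest": -2, "Below_Normal": -1, "Normal": 0,
    "Above_Normal": 1, "Highest": 2,
}


def kernel_priority(priority_class, relative):
    """Map (process class, relative thread priority) to a kernel priority 1..31."""
    realtime = priority_class == "Realtime"
    if relative == "Idle":            # saturates at the bottom of the range
        return 16 if realtime else 1
    if relative == "Time_Critical":   # saturates at the top of the range
        return 31 if realtime else 15
    return CLASS_BASE[priority_class] + RELATIVE_OFFSET[relative]


print(kernel_priority("Normal", "Normal"))
print(kernel_priority("High", "Highest"))
```

After this mapping the kernel only sees the resulting number, which matches the statement that the process priority class itself plays no part in scheduling decisions.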


Page 22: Operating Systems 1 (10/12) - Scheduling


Windows vs. Kernel Priorities


Page 23: Operating Systems 1 (10/12) - Scheduling


Other Examples


Page 24: Operating Systems 1 (10/12) - Scheduling


Special Thread Priorities

• One idle thread per CPU

• When no threads want to run, idle thread is executed

• Appears to have priority zero, but actually runs “below” priority 0

• Provides CPU idle time accounting - unused clock ticks are charged to idle thread

• Loop:

• Calls HAL to allow for power management, processes DPC list

• Dispatches to a thread if selected

• One zero page thread per system

• Zeroes pages of memory in anticipation of “demand zero” page faults

• Runs at priority zero (lower than reachable with Windows API) in the „system“ process


Page 25: Operating Systems 1 (10/12) - Scheduling


Thread Scheduling States (2000, XP)

[Figure: thread state diagram with the states Init (0), Ready (1), Running (2), Standby (3), Terminate (4), Waiting (5), and Transition (6). Edge labels: voluntary switch; preemption, quantum end; preempt; wait resolved after kernel stack made pageable]

Ready = thread eligible to be scheduled to run

Standby = thread is selected to run on CPU

>= Vista: additional ‘Deferred ready’ state

Page 26: Operating Systems 1 (10/12) - Scheduling


Thread Scheduling States (2000, XP)


• Transition:

• Thread was in a wait entered from user mode for 12 seconds or more

• System was short on physical memory, so the balance set manager marked the thread’s kernel stack as pageable

• Later, the thread’s wait was satisfied, but it can’t become ready until the system allocates a non-pageable kernel stack frame

• Initiate:

• Thread is “under construction” and can’t run yet

• Standby: One processor has selected a thread for execution on another processor

• Terminate: Thread has executed its last code, but can’t be deleted until all handles and references to it are closed (object manager)

Page 27: Operating Systems 1 (10/12) - Scheduling


Scheduling Scenarios

• Preemption

• A thread becomes ready at a higher priority than the currently running thread

• The lower-priority running thread is preempted

• The preempted thread goes back to the head of its ready queue

• Scheduler needs to pick the lowest priority thread to preempt

• Preemption is strictly event-driven, does not wait for the next clock tick

• Threads in kernel mode may be preempted (unless they raise IRQL to >= 2)


Page 28: Operating Systems 1 (10/12) - Scheduling


Priority Adjustments

• Dynamic priority adjustments are applied to threads in dynamic classes

• Disable if desired with SetThreadPriorityBoost or SetProcessPriorityBoost

• Types of priority adjustment

• I/O completion

• Wait completion on executive events or semaphores

• When threads in the foreground process complete a wait operation

• Boost value of 2, lost after one full quantum

• Quantum is decremented by 1 so that threads that get boosted after I/O completion do not keep running without ever experiencing quantum end

• GUI threads that wake up to windowing input (e.g. messages) get a boost of 2

• Added to the current priority, not the base priority


Page 29: Operating Systems 1 (10/12) - Scheduling


Priority Adjustments

• No automatic adjustments in real-time class (16 or above)

• Real time here really means “system won’t change the relative priorities of your real-time threads”

• Hence, scheduling is predictable with respect to other “real-time” threads, but not for absolute latency

• Example: Boost on I/O completion

• Specified by the device driver through IoCompleteRequest(Irp, PriorityBoost)

• Common boost values (see NTDDK.H): 1 - disk, CD-ROM, parallel, video; 2 - serial, network, named pipe, mailslot; 6 - keyboard or mouse; 8 - sound
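
The effect of an I/O completion boost can be sketched as follows. This is an illustration only: the `IO_BOOST` table restates the values listed above, while `boosted_priority` and the cap at priority 15 (the top of the variable range) are assumptions made for this sketch and not quoted from the slides.

```python
# Boost values per device type, as listed above.
IO_BOOST = {
    "disk": 1, "cd-rom": 1, "parallel": 1, "video": 1,
    "serial": 2, "network": 2, "named pipe": 2, "mailslot": 2,
    "keyboard": 6, "mouse": 6,
    "sound": 8,
}


def boosted_priority(priority, device):
    """Priority of a thread right after its I/O on `device` completes."""
    if priority >= 16:
        return priority  # real-time class: no automatic adjustments
    # Assumption: a boost never lifts a thread out of the variable range.
    return min(15, priority + IO_BOOST[device])


print(boosted_priority(8, "disk"))
print(boosted_priority(8, "keyboard"))
print(boosted_priority(24, "sound"))
```

The larger boosts for keyboard, mouse, and sound reflect that these devices serve latency-sensitive interactive work.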


Page 30: Operating Systems 1 (10/12) - Scheduling


Foreground Applications

• Quantum Stretching

• The threads of a normal-priority process that owns the foreground window may get longer quantum (Win32PrioritySeparation registry key)

• „Maximum“ - 6 ticks, „Middle“ - 4 ticks, „None“ - 2 ticks

• Does not happen on Server editions by default, depends on Windows „performance options“; NT4 Server had 12 ticks


Page 31: Operating Systems 1 (10/12) - Scheduling


Choosing a CPU for a Ready Thread

• For Windows 2000 / XP

• Check if any processors are idle that are in the thread’s hard affinity mask:

• If its ideal processor is idle, it runs there

• Else, if the previous processor it ran on is idle, it runs there

• Else if the current processor is idle, it runs there

• Else it picks the highest numbered idle processor in the thread’s affinity mask

• If no processors are idle:

• If the ideal processor is in the thread’s affinity mask, it selects that

• Else if the previous processor is in the thread’s affinity mask, it selects that

• Else it picks the highest numbered processor in the thread’s affinity mask

• Check the priority of the thread running on the processor for preemption
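
The selection order above can be written down as a short function. This is a minimal sketch, assuming CPUs are identified by integers and masks are modelled as Python sets; `choose_cpu` is a hypothetical name, and the final preemption check is left to the caller.

```python
def choose_cpu(affinity, idle, ideal, previous, current):
    """Pick the CPU for a thread that just became ready.

    affinity: set of CPUs in the thread's hard affinity mask
    idle:     set of CPUs that are currently idle
    """
    candidates = idle & affinity
    if candidates:
        # Prefer ideal, then previous, then current processor if idle.
        for cpu in (ideal, previous, current):
            if cpu in candidates:
                return cpu
        return max(candidates)  # highest numbered idle processor
    # No idle processor: prefer ideal, then previous, if allowed by affinity.
    for cpu in (ideal, previous):
        if cpu in affinity:
            return cpu
    return max(affinity)        # highest numbered processor in the mask


print(choose_cpu({0, 1, 2, 3}, idle={1, 3}, ideal=0, previous=1, current=2))
```

In the no-idle case the returned CPU is only a candidate: the thread runs there only if its priority is high enough to preempt the thread currently running on it.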

Page 32: Operating Systems 1 (10/12) - Scheduling


Choosing a Thread for a CPU

• For Windows 2000 / XP

• System needs to choose a thread to run on a specific CPU at quantum end, wait state entering, affinity mask changes, or thread exit

• Starting with the first thread in the highest priority non-empty ready queue, it scans the queue for the first thread that:

• Has the current processor in its hard affinity mask, and

• Ran last on the current processor, or has its ideal processor equal to the current processor, or has been in its ready queue for 3 or more clock ticks, or has a priority >=24

• If it cannot find such a candidate, it selects the highest priority thread that can run on the current CPU (whose hard affinity includes the current CPU)
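
The scan described above can be sketched in a few lines. The `Thread` record, the dictionary of ready queues, and `pick_thread` are hypothetical structures for illustration; in particular, reading the fallback rule as a scan over all queues from highest to lowest priority is an interpretation of the slide, not a confirmed detail.

```python
from dataclasses import dataclass


@dataclass
class Thread:
    name: str
    priority: int
    affinity: frozenset  # CPUs in the hard affinity mask
    last_cpu: int
    ideal_cpu: int
    ticks_ready: int     # clock ticks spent in the ready queue


def pick_thread(ready_queues, cpu):
    """ready_queues maps priority -> list of threads in queue order."""
    levels = [p for p in sorted(ready_queues, reverse=True) if ready_queues[p]]
    if not levels:
        return None
    # First pass: highest non-empty queue, preferred candidates only.
    for t in ready_queues[levels[0]]:
        preferred = (t.last_cpu == cpu or t.ideal_cpu == cpu
                     or t.ticks_ready >= 3 or t.priority >= 24)
        if cpu in t.affinity and preferred:
            return t
    # Fallback: highest priority thread that may run on this CPU at all.
    for p in levels:
        for t in ready_queues[p]:
            if cpu in t.affinity:
                return t
    return None


a = Thread("a", 10, frozenset({0, 1}), last_cpu=1, ideal_cpu=1, ticks_ready=0)
b = Thread("b", 10, frozenset({0}), last_cpu=0, ideal_cpu=0, ticks_ready=0)
print(pick_thread({10: [a, b]}, cpu=0).name)
```

Thread `a` is first in the queue but has no tie to CPU 0, so the scan skips it in favor of `b`, which ran there last.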


Page 33: Operating Systems 1 (10/12) - Scheduling


Scheduling Data Structures (since Server 2003)

• Threads always go into the ready queue of their ideal processor

• Instead of locking the dispatcher database to look for a candidate to run, per-CPU ready queue is checked first (PRCB lock)

• If a thread has been selected on the CPU, just perform the dispatching

• Otherwise, scan the other CPUs’ ready queues looking for a thread to run

• This scan is done OUTSIDE the dispatcher lock, just acquires PRCB lock

• Dispatcher lock is still needed to wait or un-wait a thread

• In sum, global dispatcher database lock is now held for a MUCH shorter time

• Idle processor selection considers NUMA and hyperthreading characteristics

• Next ideal processor is the first logical processor on the next physical processor


Page 34: Operating Systems 1 (10/12) - Scheduling


New since Windows 7

• Core Parking

• Historically, CPU workload was distributed fairly evenly across logical processors, even on low utilization

• Core Parking tries to keep the load on the fewest logical processors possible, all others can sleep; only overridden by hard affinity and thread ideal processor

• Power management code notifies scheduling code about parked cores

• Considers socket topology - newer processors put sockets into deep sleep if all the cores are idle

• At least one CPU in each NUMA node is left unparked for fast memory access

• Core Parking is active on server and hyperthreading systems

• Best returns on medium utilization workloads, but typical Desktop client systems tend to run at extremes


Page 35: Operating Systems 1 (10/12) - Scheduling


New since Windows 7

• Before, no quality of service for Remote Desktop (formerly called Terminal Server)

• One user could hog server’s CPU

• Remote Desktop role now automatically enables dynamic fair share scheduling

• Sessions are given weight 1-9 (default is 5), internal API can set weight

• Each session given CPU budget, charge happens at every scheduler event

• When session exceeds quota, its threads go to idle-only queue

• Scheduled only when no other session wants to run

• At end of interval, all threads made ready to run


Page 36: Operating Systems 1 (10/12) - Scheduling


Unix SVR4 Scheduling

• Three priority classes covering 160 priority levels

• Real-time processes (159-100)

• kernel-mode processes (99-60)

• time-shared processes (59-0, user mode)

• Kernel was not preemptible, so specific preemption points were defined

• Region of code where all kernel data structures are either updated and consistent, or locked via a semaphore

• One dispatch queue per priority level, each handled in round-robin

• Each time a time-shared process uses up a quantum, its priority is decreased

• Each time it blocks on an event or resource, its priority is increased

• Time-shared process quantum depends on priority, fixed for real-time processes

Page 37: Operating Systems 1 (10/12) - Scheduling


Linux Scheduling

• schedule function as central organization point for scheduling

• Runtime of the scheduler became thread-count-independent with Linux 2.6 - O(1) scheduler

• Also established for a while in BSD and Windows NT kernels

• Internal priorities: real-time processes (0-99), regular processes (100-139)

• nice system call allows modifying the static priority between -20 and +19 (a lower value means higher priority)


Page 38: Operating Systems 1 (10/12) - Scheduling


Linux Scheduling

• Each process is represented by a task_struct, which contains all scheduling-related information

• Dynamic and static priority

• Scheduling policy - SCHED_NORMAL, SCHED_RR, SCHED_FIFO

• Real-time scheduling classes demanded for POSIX compatibility

• Round-robin real-time processes have a quantum, FIFO processes do not

• Processor affinity mask

• Average sleep time of the task (high sleep time gives better priority, to support interactive tasks in the best-possible way)

• Remaining quantum as time slice

• Tasks are scheduled independently, so threads from the same process can run on different processors


Page 39: Operating Systems 1 (10/12) - Scheduling


Linux Scheduling

• Each CPU has three queues

• active queue (still have quantum)

• expired queue (quantum over)

• migration queue (for processor migration)

• Queues are summarized in a runqueue structure

• When active queue is empty, it is swapped with the expired queue

• Periodic scheduling function (scheduler_tick) decreases the current quantum and calls the main scheduling function if needed

• Main function takes the highest priority task from the active queue and runs it

• Calculation of the dynamic priority in the effective_prio() function
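
The interplay of the active and expired queues can be sketched with a toy run queue. This is an illustration under simplifying assumptions: `RunQueue` and its methods are hypothetical names, the migration queue is left out, and priority lookup uses a sorted dictionary instead of the constant-time structure a real O(1) scheduler would need.

```python
from collections import deque


class RunQueue:
    """Per-CPU run queue with an active and an expired set of priority lists."""

    def __init__(self):
        self.active = {}   # priority -> deque of tasks that still have quantum
        self.expired = {}  # priority -> deque of tasks whose quantum is over

    def enqueue(self, task, priority):
        self.active.setdefault(priority, deque()).append(task)

    def quantum_expired(self, task, priority):
        self.expired.setdefault(priority, deque()).append(task)

    def pick_next(self):
        if not any(self.active.values()):
            # Active set is empty: swap the roles of the two sets.
            self.active, self.expired = self.expired, self.active
        for priority in sorted(self.active):  # lower value = higher priority
            if self.active[priority]:
                return priority, self.active[priority].popleft()
        return None


rq = RunQueue()
rq.enqueue("a", 120)
rq.enqueue("b", 100)
first = rq.pick_next()
rq.quantum_expired("b", 100)
second = rq.pick_next()
rq.quantum_expired("a", 120)
third = rq.pick_next()
print(first, second, third)
```

Swapping the two sets is a constant-time operation, so tasks that used up their quantum become runnable again without any per-task work.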


Page 40: Operating Systems 1 (10/12) - Scheduling


Linux Scheduling

• Base time quantum

• Static priority determines the base time quantum, which is assigned when the former quantum is exhausted

• With static priority < 120: (140 - static priority) * 20

• With static priority >= 120: (140 - static priority) * 5

• Base time quantum gets longer with higher priority (lower value)

• Dynamic priority

• max(100, min(static priority - bonus + 5, 139))

• Bonus is a value between 0 and 10, depends on average sleep time

• less than 5 is a penalty, more than 5 is a premium

• Average sleep time is decreasing when the process is running
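
The two formulas above can be evaluated directly. The following sketch uses hypothetical function names; the mapping `static priority = 120 + nice` and the reading of the quantum in milliseconds are assumptions not stated on the slide, chosen to be consistent with the 100-139 range for regular processes.

```python
def static_priority(nice):
    """Assumed mapping of nice values -20..+19 onto static priorities 100..139."""
    return 120 + nice


def base_time_quantum(static_prio):
    """Base time quantum (assumed to be in milliseconds)."""
    factor = 20 if static_prio < 120 else 5
    return (140 - static_prio) * factor


def dynamic_priority(static_prio, bonus):
    """bonus ranges from 0 to 10 and grows with the average sleep time."""
    return max(100, min(static_prio - bonus + 5, 139))


for nice in (-20, 0, 19):
    sp = static_priority(nice)
    print(nice, sp, base_time_quantum(sp))
print(dynamic_priority(120, 10), dynamic_priority(120, 0))
```

A bonus of 5 leaves the priority unchanged, which is why smaller values act as a penalty and larger values as a premium.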
