Chapter 4: Threads (continuing from Chapter 3: Processes)
Start of Lecture on January 22, 2014
Reminders
• Likely you’ve gotten some useful info for your assignment and are hopefully working on it; start early, as C can be frustrating and difficult to debug
• Added Lab Lectures section online
• I’ve added references from CMPUT 201 to eClass page
• Working from home FAQ
• Linux tutorial
• Introduction to Emacs
• What are your questions and comments? Am I talking too fast? Or going too fast and you want more review?
Review of what we’ve learned
• A feel for the purpose of operating systems — history of their role and main services they provide, such as an interface, resource allocation, abstraction and protection
• System calls as the interface to the main OS services, ones that processes cannot/should not be trusted to execute themselves
• e.g. create process (fork), write to file (write), create shared mem (shm_open)
• Operating system design decisions, e.g. microkernel, modules
• Process as entity, process states and context switching
• Interprocess communication: shared mem & message passing
End of Chapter 3 on Processes
• Any questions about processes? Or anything else?
• If you’re too embarrassed to ask, remember that someone else likely has the same question. There are no “dumb” questions; you are not dumb for not knowing an answer. Every question you ask makes you more knowledgeable
• If you are too embarrassed, ask on the anonymous forum
CMPUT 379, Section A1, Winter 2014 (January 22 and 27)
Chapter 4: Threads
Objectives
• Understand the notion of a “thread”: a fundamental unit of CPU utilization; the basis of multithreaded systems
• Understand why threads became popular and how they affected OS design
• Understand benefits and issues with multithreaded programming
• Understand how to use POSIX pthreads
Multicore (multiprocessor) versus Singlecore (single processor)
• In response to the need for more processing power, multiprocessor systems arose
• Why multiple processors rather than more powerful single processors? For example, 8 GHz instead of 4 GHz?
• Heat and transmission delays (settling signals) limit the CPU speed
• Multi-tasking makes multiple processors useful
• The increase in speed from multiple processors is not linear; there is overhead in managing multiple CPUs and multithreading
Video: Multicore vs Singlecore
Parallelism vs. Concurrency
• Concurrency supports more than one task by allowing multiple tasks to make progress
• a single-processor system can achieve concurrency by constantly switching between processes; a multi-processor system likely has concurrency
• Parallelism is running more than one task simultaneously
• a single-processor system cannot run processes in parallel
[Figure: execution timelines. A single core interleaves T1, T2, T3, T4 one at a time over time (concurrency); with two cores, core 1 runs T1 and T3 while core 2 simultaneously runs T2 and T4 (parallelism).]
Open problem: Scalability Analysis
• Performance likely scales with increasing number of cores
• Particularly if you choose to improve precision or increase the problem size, since you know more computing power is available
• Useful to have a measure/formula of scalability
• Unfortunately, parallel performance is dependent on many factors, such as uni-processor power, network speed, I/O system speed, problem size, problem input and serial versus parallel instruction percentages
• Neither Amdahl’s Law nor Gustafson’s Law really tells the whole story, because it’s difficult to gauge how much of the problem can be parallelized
What is a thread?
• A thread is a basic unit of CPU utilization, with program counter, CPU register set and stack
• Often referred to as a light-weight process
• It shares code, data, system resources and other OS-related information with its peer group (the other threads of the same process)
• A thread belongs to one instantiating process, whereas a process can have many threads
Threads and Processes
Copyright © 1996–2002 Eskicioglu and Marsland (and Prentice-Hall and Paul Lu)
[Figure: threads versus processes. A traditional process has a single STACK along with its DATA and TEXT segments; a multi-threaded process (task) shares one DATA and one TEXT segment among all its threads, but each thread has its own STACK.]
Why did multithreading arise?
• Separate processes can take advantage of multiprocessor systems, so it’s not obvious that threads are useful
• Considering the importance of multi-tasking (i.e. multiprogramming), particularly on interactive systems, there are already clear speed-ups from separate processes taking advantage of multiple cores
• So why introduce threads, and complicate things?
Benefits of threads
• Benefits that parallel processes and threads share:
• Responsiveness — may allow continued execution if part of the process/process group is blocked, especially important for user interfaces
• Scalability — can take advantage of multiprocessor architectures
• Threads, however, can often be simpler and more efficient for parallelism, particularly due to:
• Resource sharing — threads share resources of process, easier than shared memory or message passing for communication/coordination
• Economy — cheaper than process creation, thread switching has less overhead than context switching
fork() versus pthread_create()
Timing comparison from the LLNL POSIX Threads Programming tutorial (https://computing.llnl.gov/tutorials/pthreads/#WhyPthreads; fork_vs_thread.txt), retrieved 1/21/14. Note: don't expect the system and user times to add up to real time, because these are SMP systems with multiple CPUs working on the problem at the same time. At best, these are approximations run on local machines, past and present.

Platform                                      fork()               pthread_create()
                                              real  user   sys     real  user  sys
Intel 2.6 GHz Xeon E5-2670 (16 cores/node)     8.1   0.1   2.9      0.9   0.2   0.3
Intel 2.8 GHz Xeon 5660 (12 cores/node)        4.4   0.4   4.3      0.7   0.2   0.5
AMD 2.3 GHz Opteron (16 cores/node)           12.5   1.0  12.5      1.2   0.2   1.3
AMD 2.4 GHz Opteron (8 cores/node)            17.6   2.2  15.7      1.4   0.3   1.3
IBM 4.0 GHz POWER6 (8 cpus/node)               9.5   0.6   8.8      1.6   0.1   0.4
IBM 1.9 GHz POWER5 p5-575 (8 cpus/node)       64.2  30.7  27.6      1.7   0.6   1.1
IBM 1.5 GHz POWER4 (8 cpus/node)             104.5  48.6  47.2      2.1   1.0   1.5
INTEL 2.4 GHz Xeon (2 cpus/node)              54.9   1.5  20.8      1.6   0.7   0.9
INTEL 1.4 GHz Itanium2 (4 cpus/node)          54.5   1.1  22.2      2.0   1.2   0.6
Parallelism important, so threads important
• Threads arose due to importance of software speed-ups
• Hardware speed-ups are important, but it’s equally (more?) important to focus on efficiently using that hardware with clever software design
• Since parallelism is such a key approach for speeding up task completion, the (seemingly small) gains from threads are important
Video Break: brought to you by another fantabulous classmate
Threading examples
• Summing elements from 1 to N (joint task with focus on data parallelism, i.e. accessing data in parallel)
• Thread1: sum elements 1 to floor(N/2)
• Thread2: sum elements floor(N/2)+1 to N
• Combine (sum) the results from Thread1 and Thread2 to obtain the total
• Compute statistics in parallel (separate tasks with focus on task parallelism, rather than accessing data in parallel)
• Thread1: compute mean of N elements — sum arr[i]
• Thread2: compute 2nd moment of N elements (for Var) — sum arr[i]^2
• Use separately computed statistics for an end goal (e.g. profile summary)
Example: Multithreaded Server Architecture
• Server process runs, receiving requests and creating new threads to service those requests
• Since threads share much code and data with server process, there is no need to create separate processes (which require significant overhead)
[Figure: a client sends (1) a request to the server; the server (2) creates a new thread to service the request and (3) resumes listening for additional client requests.]
And many more situations where threads make more sense than multiple processes
• Intensive data sharing settings:
• e.g. computing statistics on the same (large) dataset for machine learning
• large scale face recognition (finding match in large database)
• Situations where there are many more subtasks than available cores, since thread switches are faster than context-switches for processes
• threads are useful for logically separating a program with multiple tabs (without the need for protection between tabs), even if they can’t all run in parallel
• “embarrassingly parallel” tasks: the work easily separates into many subtasks; e.g. calculate the potential energy of several thousand independent conformations of a molecule and, when done, find the minimum-energy conformation
Example: POSIX Pthreads
• Run pthreads_hello.c
• Run pthreads_hello_arg2.c
• Run pthreads_join1.c
• Notice that
• pthread_exit() called from child thread is like exit() for children in fork()
• pthread_join() called from the parent thread is like waitpid() for forked children, except threads do not have the same parent/child hierarchy
• The way arguments are passed to threads (as pointers to variables in the data segment or on the heap) and the way threads return values to join