Threads By Dr. Yingwu Zhu
Jan 04, 2016
Threads
- What is a thread?
  - Lightweight process (LWP)
  - Basic unit of CPU utilization
  - Contains a thread ID, a program counter, a register set, and a stack
- Why multithreading?
  - Creating processes is expensive
  - Other advantages
Benefits
- Responsiveness
- Resource sharing
  - Threads share the memory and resources of the process they belong to
  - Sharing code and data allows different threads of activity within the same address space
- Economy
  - Processes are expensive to create and to context-switch
  - In Solaris, creating a process is about 30 times slower than creating a thread, and a context switch is about 5 times slower
- Utilization of MP architectures
  - A single-threaded process can run on only one CPU
User Threads
- Thread management (creation, scheduling) is done by a user-level threads library
- No kernel resources are allocated to the threads
- Drawback: a blocking system call suspends the other threads in the same process
- Three primary thread libraries: POSIX Pthreads, Win32 threads, Java threads
Kernel Threads
- Supported by the kernel
- Advantages
  - Non-blocking thread execution (as with processes, when a kernel thread makes a blocking call, only that thread blocks)
  - Multiprocessors (threads can run on different processors)
- Drawback: slower to create and manage than user-level threads
- Examples: Windows XP/2000, Solaris, Linux, Tru64 UNIX, Mac OS X
Many-to-One
- Many user-level threads are mapped to a single kernel thread
- Thread management is done by the thread library in user space, so it is efficient
- But a thread making a blocking system call blocks the entire process
- Multiple threads cannot run in parallel on MP computers (only one thread can access the kernel at a time)
- Used on systems that do not support kernel threads
- Examples: Solaris Green Threads, GNU Portable Threads
Many-to-one Model
- Applies when the kernel does not support multiple threads of control
- Multithreading can be implemented entirely as a user-level library
- The library schedules multiple threads onto the process's single kernel thread, i.e., it multiplexes multiple user threads on a single kernel thread
Many-to-one (cont.): Benefits
- Cheap synchronization
  - When a user thread wishes to perform synchronization, the user-level thread library checks whether the thread needs to block
  - If it does, the library enqueues it, dequeues another user thread from the library's run queue, and switches the active thread
  - No system calls are required
- Cheap thread creation
  - The thread library need only create a context (i.e., a stack and registers) and enqueue it in the user-level run queue
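To make "create a context and switch to it" concrete, here is a minimal sketch that uses the ucontext routines (getcontext, makecontext, swapcontext) available on Linux/glibc. It is only an illustration under that assumption, not how any particular thread library is implemented; the names user_thread, user_switch_demo, and trace are hypothetical. Note that glibc's swapcontext may still enter the kernel to save the signal mask, so real libraries usually use their own switch code.

```c
#include <stdlib.h>
#include <ucontext.h>

#define STACK_SIZE (64 * 1024)

static ucontext_t main_ctx;    /* context of the caller, acting as scheduler */
static ucontext_t thread_ctx;  /* context of the single user-level thread */
static int trace[4];           /* records the order in which code ran */
static int trace_len;

/* Body of the user-level thread: runs, yields once, then finishes. */
static void user_thread(void) {
    trace[trace_len++] = 1;
    swapcontext(&thread_ctx, &main_ctx);   /* yield back to the caller */
    trace[trace_len++] = 3;
    /* returning resumes uc_link, which is main_ctx */
}

/* Returns 1 if control alternated caller/thread in the expected order. */
int user_switch_demo(void) {
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;
    trace_len = 0;

    /* "Creating a thread": a stack plus a saved register set. */
    if (getcontext(&thread_ctx) != 0) {
        free(stack);
        return -1;
    }
    thread_ctx.uc_stack.ss_sp = stack;
    thread_ctx.uc_stack.ss_size = STACK_SIZE;
    thread_ctx.uc_link = &main_ctx;
    makecontext(&thread_ctx, user_thread, 0);

    swapcontext(&main_ctx, &thread_ctx);   /* run the thread until it yields */
    trace[trace_len++] = 2;
    swapcontext(&main_ctx, &thread_ctx);   /* resume it until it finishes */
    trace[trace_len++] = 4;

    free(stack);
    return trace[0] == 1 && trace[1] == 2 && trace[2] == 3 && trace[3] == 4;
}
```

Calling user_switch_demo() from main() should return 1: the kernel sees a single thread the whole time, while the library-level code decides which context runs.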
Many-to-one (cont.): Benefits
- Resource efficiency
  - Kernel memory is not wasted on a stack for each user thread
  - Allows as many threads as virtual memory permits
- Portability
  - User-level thread packages are implemented entirely with standard UNIX and POSIX library calls
Many-to-one (cont.): Drawbacks
- Single-threaded OS interface
  - If a user thread blocks (e.g., in a blocking system call), the entire process blocks, so no other user thread can execute until the kernel thread (which is blocked in the system call) becomes available
  - Solution: use nonblocking system calls
- Cannot utilize MP architectures
- Examples: Java, Netscape
One-to-One
- Each user-level thread maps to a kernel thread
- More concurrency than many-to-one: another thread can run when a thread makes a blocking system call, and multiple threads can run in parallel on MP computers
- Overhead: creating a user thread requires creating a kernel thread
- Examples: Windows NT/XP/2000, Linux, Solaris 9 and later
One-to-one (cont.): Benefits
- Scalable parallelism
  - Each kernel thread is a different kernel-schedulable entity; multiple threads can run concurrently on multiprocessors
- Multithreaded OS interface
  - When one user thread and its kernel thread block, the other user threads can continue to execute since their kernel threads are unaffected
One-to-one (cont.): Drawbacks
- Expensive synchronization
  - Kernel threads require kernel involvement to be scheduled; kernel thread synchronization requires a system call if the lock is not immediately acquired
  - If a trap is required, synchronization is 3 to 10 times more costly than in the many-to-one model
- Expensive creation
  - Every thread creation requires explicit kernel involvement and consumes kernel resources
  - 3 to 10 times more expensive than creating a user thread
One-to-one (cont.): Drawbacks
- Resource inefficiency
  - Every thread created by the user requires kernel memory for a stack, as well as some kernel data structure to keep track of it
  - Many parts of many kernels cannot be paged out
  - Kernel threads are therefore likely to take physical memory away from applications
Many-to-Many Model
- Allows many (K) user-level threads to be mapped to many (M) kernel threads, with M <= K
- Allows the operating system to create a sufficient number of kernel threads without overburdening the system
- Examples: Solaris prior to version 9; Windows NT/2000 with the ThreadFiber package
Many-to-Many Model
- Combines the previous two models
- User threads are multiplexed on top of kernel threads, which in turn are scheduled on top of processors
- Takes advantage of the previous two models while minimizing the disadvantages of both
- Creating a user thread does not necessarily require creating a kernel thread; synchronization can be purely user-level
Threading Issues
- Issues that arise due to multithreading:
  - Semantics of the fork() and exec() system calls
  - Thread cancellation
  - Signal handling
  - Thread pools
  - Thread-specific data
  - Scheduler activations
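Semantics of fork() and exec()
- Does fork() duplicate only the calling thread (producing a single-threaded process) or all threads?
- It depends on the application
- Example: if exec() is called right after fork(), duplicating only the calling thread is enough, because exec() replaces the entire process anyway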
Thread Cancellation
- Terminating a thread before it has finished
- Examples
  - Multiple threads are concurrently doing the same task
  - Canceling a web browser's ongoing tasks
- Two general approaches:
  - Asynchronous cancellation terminates the target thread immediately
  - Deferred cancellation allows the target thread to periodically check whether it should be cancelled
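Deferred cancellation is the default mode in Pthreads, and the "periodic check" corresponds to reaching a cancellation point. The sketch below is one way to show this, using pthread_cancel(), pthread_testcancel(), and the PTHREAD_CANCELED status; the names polling_worker and deferred_cancel_demo are hypothetical.

```c
#include <pthread.h>

/* Worker that loops forever but offers a cancellation point every pass. */
static void *polling_worker(void *arg) {
    unsigned long *iterations = arg;
    for (;;) {
        (*iterations)++;         /* stand-in for one unit of real work */
        pthread_testcancel();    /* "should I be cancelled?" check */
    }
}

/* Returns 1 if the worker ended because of the cancellation request. */
int deferred_cancel_demo(void) {
    pthread_t tid;
    void *status = NULL;
    unsigned long iterations = 0;

    if (pthread_create(&tid, NULL, polling_worker, &iterations) != 0)
        return -1;
    if (pthread_cancel(tid) != 0)        /* only posts a request */
        return -1;
    if (pthread_join(tid, &status) != 0) /* wait until it takes effect */
        return -1;
    return status == PTHREAD_CANCELED;
}
```

The request takes effect only when the worker reaches pthread_testcancel(), so the thread is never stopped in the middle of a unit of work.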
Signal Handling
- Signals are used in UNIX systems to notify a process that a particular event has occurred
- A signal handler is used to process signals (a user-defined handler overrides the default handler)
  1. A signal is generated by a particular event
  2. The signal is delivered to a process
  3. The signal is handled
- Where the signal is delivered depends on the signal type
  - Synchronous signals (e.g., division by 0, illegal memory access) are delivered to the thread that caused the signal
  - Asynchronous signals have several options
- Options:
  - Deliver the signal to the thread to which the signal applies
  - Deliver the signal to every thread in the process, e.g., Ctrl-C
  - Deliver the signal to certain threads in the process, e.g., pthread_kill(tid, signal); note that kill(pid, signal) targets the whole process
  - Assign a specific thread to receive all signals for the process
Thread Pools
- Create a number of threads in a pool where they await work
- Advantages:
  - Usually slightly faster to service a request with an existing thread than to create a new thread
  - Allows the number of threads in the application(s) to be bounded by the size of the pool
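A pool can be sketched with a shared work queue protected by a mutex and a condition variable. The example below is a deliberately small sketch, not a production design: the queue is a fixed array, each "request" is just an integer to be squared, and the names pool_t, pool_worker, and pool_sum_of_squares are hypothetical.

```c
#include <pthread.h>

#define MAX_TASKS   64
#define MAX_WORKERS 8

typedef struct {
    int tasks[MAX_TASKS];      /* pending work items */
    int head, tail;            /* take at head, add at tail */
    int closed;                /* set once no more work will arrive */
    long total;                /* result accumulated by the workers */
    pthread_mutex_t lock;
    pthread_cond_t not_empty;
} pool_t;

/* Each pool thread waits for work, processes it, and repeats. */
static void *pool_worker(void *arg) {
    pool_t *p = arg;
    for (;;) {
        pthread_mutex_lock(&p->lock);
        while (p->head == p->tail && !p->closed)
            pthread_cond_wait(&p->not_empty, &p->lock);
        if (p->head == p->tail) {          /* closed and drained */
            pthread_mutex_unlock(&p->lock);
            return NULL;
        }
        int n = p->tasks[p->head++];
        pthread_mutex_unlock(&p->lock);

        long result = (long)n * n;         /* the "request" being served */

        pthread_mutex_lock(&p->lock);
        p->total += result;
        pthread_mutex_unlock(&p->lock);
    }
}

/* Submits 1..ntasks to a pool of nworkers threads; returns the sum of squares. */
long pool_sum_of_squares(int nworkers, int ntasks) {
    pthread_t workers[MAX_WORKERS];
    pool_t p = { .head = 0, .tail = 0, .closed = 0, .total = 0 };
    int started = 0;

    if (nworkers < 1 || nworkers > MAX_WORKERS ||
        ntasks < 0 || ntasks > MAX_TASKS)
        return -1;
    pthread_mutex_init(&p.lock, NULL);
    pthread_cond_init(&p.not_empty, NULL);

    /* The threads are created once, up front, and then await work. */
    while (started < nworkers &&
           pthread_create(&workers[started], NULL, pool_worker, &p) == 0)
        started++;
    if (started == 0) {                    /* nobody could serve the queue */
        pthread_cond_destroy(&p.not_empty);
        pthread_mutex_destroy(&p.lock);
        return -1;
    }

    for (int n = 1; n <= ntasks; n++) {
        pthread_mutex_lock(&p.lock);
        p.tasks[p.tail++] = n;
        pthread_cond_signal(&p.not_empty);
        pthread_mutex_unlock(&p.lock);
    }

    pthread_mutex_lock(&p.lock);
    p.closed = 1;                          /* tell idle workers to exit */
    pthread_cond_broadcast(&p.not_empty);
    pthread_mutex_unlock(&p.lock);

    for (int i = 0; i < started; i++)
        pthread_join(workers[i], NULL);
    pthread_cond_destroy(&p.not_empty);
    pthread_mutex_destroy(&p.lock);
    return p.total;
}
```

For example, pool_sum_of_squares(4, 10) serves ten requests with four threads; the number of threads never exceeds the pool size no matter how many requests are queued.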
Thread Specific Data
- Threads belonging to a process share the data of the process
- Thread-specific data allows each thread to have its own copy of data
- Useful when you do not have control over the thread creation process (e.g., when using a thread pool)
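In Pthreads, thread-specific data is accessed through a key: every thread uses the same key but sees its own value. The following is a minimal sketch using pthread_key_create(), pthread_setspecific(), and pthread_getspecific(); the names slot_key, current_id, tsd_worker, and tsd_demo are hypothetical.

```c
#include <pthread.h>
#include <stdlib.h>

static pthread_key_t slot_key;   /* one key, a separate value per thread */

typedef struct {
    int id;     /* value handed to the thread */
    int seen;   /* value the thread later read back from its own slot */
} tsd_arg_t;

/* Takes no parameter: it reads the calling thread's own copy. */
static int current_id(void) {
    int *mine = pthread_getspecific(slot_key);
    return mine ? *mine : -1;
}

static void *tsd_worker(void *arg) {
    tsd_arg_t *a = arg;
    int *mine = malloc(sizeof *mine);
    if (mine == NULL)
        return NULL;
    *mine = a->id;
    /* Stored per thread; the key's destructor (free) runs at thread exit. */
    pthread_setspecific(slot_key, mine);
    a->seen = current_id();
    return NULL;
}

/* Returns 1 if each thread read back its own value, not the other's. */
int tsd_demo(void) {
    pthread_t t[2];
    tsd_arg_t args[2] = { { 1, 0 }, { 2, 0 } };
    int started = 0;

    if (pthread_key_create(&slot_key, free) != 0)
        return -1;
    while (started < 2 &&
           pthread_create(&t[started], NULL, tsd_worker, &args[started]) == 0)
        started++;
    for (int i = 0; i < started; i++)
        pthread_join(t[i], NULL);
    pthread_key_delete(slot_key);
    return started == 2 && args[0].seen == 1 && args[1].seen == 2;
}
```

current_id() works from any code the thread happens to run, which is what makes this useful when the thread was created by someone else, such as a pool.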
Scheduler Activations
- M:M models require communication to maintain the appropriate number of kernel threads allocated to the application
  - This is done through an intermediate data structure called an LWP (lightweight process), which acts as a virtual processor
- An LWP runs a user thread; the LWP maps to a kernel thread, which the OS schedules to run on a physical processor
- Scheduler activations provide upcalls, a communication mechanism from the kernel to the thread library
  - An upcall handler performs the task: mapping a user thread to a new LWP, or removing a blocked user thread from an LWP
- The kernel provides an LWP for a user thread
- This communication allows an application to maintain the correct number of kernel threads
Pthreads
- A POSIX standard (IEEE 1003.1c) API for thread creation and synchronization
- The API specifies the behavior of the thread library; the implementation is up to the developers of the library
- Common in UNIX operating systems (Solaris, Linux, Mac OS X)
How to compile?
$ gcc -o proj2 proj2.c -pthread
- The -pthread option specifies that the pthreads library should be linked
- It also causes the compiler to properly handle multiple threads in the code that it generates
Creating and Destroying Threads
- Creating threads
  - Step 1: create a thread
  - Step 2: send the thread one or more parameters
- Destroying threads
  - Step 1: destroy a thread
  - Step 2: retrieve one or more values that are returned from the thread
Creating Threads
#include <pthread.h>
int pthread_create(pthread_t *thread_id, const pthread_attr_t *attr,
                   void *(*thread_fun)(void *), void *args);
- Parameter 1 returns the thread ID
- Parameter 2 points to the thread attributes; NULL means use the default attribute settings
- Parameter 3 is a pointer to the function the thread is to execute
- Parameter 4 is the argument passed to that function
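Since the fourth parameter is a single void *, sending a thread more than one parameter is usually done by packing them into a struct. The sketch below illustrates that pattern; the struct count_job_t and the functions count_chars and count_char_in_thread are hypothetical names, and the result is read back after pthread_join(), which is covered on a later slide.

```c
#include <pthread.h>

typedef struct {
    const char *text;   /* input 1: string to scan */
    char target;        /* input 2: character to count */
    int count;          /* output: filled in by the thread */
} count_job_t;

/* Thread function: unpacks the struct behind the void * argument. */
static void *count_chars(void *arg) {
    count_job_t *job = arg;
    job->count = 0;
    for (const char *p = job->text; *p != '\0'; p++)
        if (*p == job->target)
            job->count++;
    return NULL;
}

/* Counts occurrences of target in text using a separate thread. */
int count_char_in_thread(const char *text, char target) {
    pthread_t tid;
    count_job_t job = { text, target, 0 };

    /* job lives on this stack frame, so we must join before returning. */
    if (pthread_create(&tid, NULL, count_chars, &job) != 0)
        return -1;
    if (pthread_join(tid, NULL) != 0)
        return -1;
    return job.count;
}
```

The struct must outlive the thread that uses it; here that holds because the creator waits for the thread before its own stack frame disappears.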
Thread Terminates
- A Pthread terminates when its thread function returns, or when the thread calls pthread_exit()
void pthread_exit(void *status);
- status is the return value of the thread
- A thread_fun returns a void *, so "return (void *)status;" in the thread function is the equivalent of calling this function
Thread termination
- One thread can wait (block) for the termination of another by using pthread_join()
- You can collect the exit status of every thread you created with pthread_join()
int pthread_join(pthread_t thread_id, void **status);
- The exit status is returned in status
pthread_t pthread_self(void);
- Returns the calling thread's own ID
int pthread_equal(pthread_t t1, pthread_t t2);
- Compares two thread IDs; returns nonzero if they are equal
Example
#include <pthread.h>

void *thread_fun(void *arg) {
    int *inarg = (int *)arg;    /* points at val in main() */
    …
    return NULL;
}

int main() {
    pthread_t tid;
    void *exit_state;
    int val = 42;
    pthread_create(&tid, NULL, thread_fun, &val);
    pthread_join(tid, &exit_state);
    return 0;
}
Kill Threads
- Kill a thread before it returns normally using pthread_cancel()
- But make sure the thread has released any local resources; unlike with processes, the OS will not clean up the resources
- Why? Threads in a process share resources
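One way for a thread to make sure its resources are released even if it is cancelled is to register a cleanup handler with pthread_cleanup_push(). The sketch below assumes the default deferred cancellation mode; the names resource_t, release_buffer, cancellable_worker, and cleanup_on_cancel_demo are hypothetical.

```c
#include <pthread.h>
#include <stdlib.h>

typedef struct {
    char *buffer;    /* resource owned by the worker thread */
    int released;    /* set to 1 once the cleanup handler has run */
} resource_t;

/* Cleanup handler: runs when the thread is cancelled or calls pthread_exit(). */
static void release_buffer(void *arg) {
    resource_t *r = arg;
    free(r->buffer);
    r->buffer = NULL;
    r->released = 1;
}

static void *cancellable_worker(void *arg) {
    resource_t *r = arg;
    r->buffer = malloc(64);
    pthread_cleanup_push(release_buffer, r);
    for (;;)
        pthread_testcancel();   /* cancellation takes effect here */
    pthread_cleanup_pop(1);     /* never reached; push and pop must pair up */
    return NULL;
}

/* Returns 1 if the buffer was released as part of the cancellation. */
int cleanup_on_cancel_demo(void) {
    pthread_t tid;
    resource_t r = { NULL, 0 };

    if (pthread_create(&tid, NULL, cancellable_worker, &r) != 0)
        return -1;
    pthread_cancel(tid);
    pthread_join(tid, NULL);
    return r.released;
}
```

Without the handler, the 64-byte buffer would stay allocated after the cancel, because the memory belongs to the process and the process is still running.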
Exercise
- Write a multithreaded program that calculates, in a separate thread, the sum of the integers from 0 up to a given non-negative integer
- The non-negative integer comes from a command-line parameter
- The result is kept in a global variable:
int sum; // shared by threads
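Step 1: write a thread function
extern int sum;                 /* the shared global, defined in Step 2 */

void *thread_sum(void *arg) {
    int i;
    int m = *(int *)arg;        /* cast the void * before dereferencing */
    sum = 0;                    /* initialization */
    for (i = 0; i <= m; i++)
        sum += i;
    pthread_exit(0);
}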
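Step 2: write the main()
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

int sum;                        /* shared by threads */
void *thread_sum(void *arg);    /* the thread function from Step 1 */

int main(int argc, char *argv[]) {
    pthread_t tid;
    if (argc != 2) {
        printf("Usage: %s <integer-para>\n", argv[0]);
        return -1;
    }
    int i = atoi(argv[1]);
    if (i < 0) {
        printf("integer para must be non-negative\n");
        return -2;
    }
    pthread_create(&tid, NULL, thread_sum, &i);
    pthread_join(tid, NULL);    /* wait for the thread before reading sum */
    printf("sum = %d\n", sum);
    return 0;
}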
Exercise
Write a program that creates 10 threads. Have each thread execute the same function and pass each thread a unique number. Each thread should print “Hello, World (thread n)” five times, where ‘n’ is replaced by the thread’s number. Use an array of pthread_t objects to hold the various thread IDs. Be sure the program doesn’t terminate until all the threads are complete. Try running your program on more than one machine. Are there any differences in how it behaves?
Returning Results from Threads
- A thread function returns a pointer to void: void *
- Pitfalls in the return value
Pitfall #1
void *thread_function(void *arg) {
    int code = DEFAULT_VALUE;
    return (void *)code;
}
- Only works on machines where an integer can be converted to a pointer and then back to an integer without loss of information
Pitfall #2
void *thread_function(void *arg) {
    char buffer[64];
    // fill up the buffer with sth good
    return (void *)buffer;
}
- This buffer disappears as soon as the thread function returns
Pitfall #3
void *thread_function(void *arg) {
    static char buffer[64];
    // fill up the buffer with sth good
    return (void *)buffer;
}
- It does not work in the common case of multiple threads running the same thread function, because they all share the one static buffer
Right Way
void *thread_function(void *arg) {
    char *buffer = (char *)malloc(64);
    // fill up the buffer with sth good
    return (void *)buffer;
}
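Right Way
int main() {
    void *exit_state;
    char *buffer;
    ….
    pthread_join(tid, &exit_state);
    buffer = (char *)exit_state;
    /* pthread_t is opaque; the cast assumes it is an integer type, as on Linux */
    printf("from thread %lu: %s\n", (unsigned long)tid, buffer);
    free(buffer);   /* the joiner frees what the thread malloc'ed */
}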
Exercise
Write a program that computes the square roots of the integers from 0 to 99 in a separate thread and returns an array of doubles containing the results. In the meantime the main thread should display a short message to the user and then display the results of the computation when they are ready.