Multithreading in C++
CS204 Advanced Programming, Sabancı University
Not in the books; try to understand it in class/labs.
There are good explanations at http://www.cplusplus.com/reference/multithreading/
Jan 11, 2016

Overview
• Terminology
  – Processes
  – Threads
  – Multithreading
• Thread synchronization and synchronization conflicts
  – How to resolve them using mutex
• C++ multithreading support (standard classes/functions)

Processes
• What is a "process"?
  – For the sake of simplicity, let us define a process as a running program/application in a computer system.
    • Sometimes a program may have more than one process.
    • Every application always has at least one process.
• A process has a self-contained execution environment.
  – It has its own code, data, address space in memory, and some other resources (some of which are shared with other processes).
  – The Operating System (OS), such as MS Windows, Linux, Android, iOS, etc., lets many processes share the computer's resources concurrently.
  – To give the illusion that many processes are executing concurrently, the CPU switches between processes quickly: this is called context switching.
• Processes can work independently or by cooperating and communicating with other processes.
  – Cooperation/communication among processes is not in the scope of CS204.

Threads
• Actually, the processor (CPU) does not execute processes; it executes threads.
• A process is composed of one or more threads.
• A thread is a "path of execution" in a process.
  – It is a single sequential flow of control within a process.
  – The resources of the process are shared among its threads (including memory and files).
    • However, a particular thread may have its own variables/objects and other resources.
• As you can see, threads are similar to processes.
  – But threads are more flexible to use cooperatively since they share memory and resources within a process/program.
  – Most programming languages support programming with threads; today we will see how to do this using standard C++ techniques.
• Several threads of a process/program work concurrently.
  – They seem to work in parallel, but on a single CPU core there is actually context switching.
  – Every process has at least one thread (the main thread).
    • That was the case in CS201/CS204 so far.

Processes & Threads

The Story So Far
• You compile/build your code into an executable (exe).
• You run your executable.
• Your process starts.
• The main thread is located.
• The instructions are executed sequentially.
• The instructions finish, the main thread returns, and your process ends.

Now! Multithreading
• You still have the main thread, but there are going to be some other threads working concurrently.
• Why multithreading?
  – The answer is already given: we need concurrency and multitasking in some cases.
  – A typical application has many tasks:
    • The real computation and data processing
      – Sometimes several objects work on the same shared resource in parallel.
    • Interacting with the user via a Graphical User Interface (GUI)
    • Input/Output (I/O) operations
    • etc.
  – All these tasks are better handled concurrently.
    • Wouldn't it be great if the computation continued while waiting for input from the user?

Multithreading: a simple example
[Figure: (a) Sequential process; (b) Multithreaded process. Labels: compute, I/O, compute thread, I/O thread, I/O request, I/O complete, I/O result needed.]
• Starting from the main thread, you can spawn "child" threads.
• Child thread: the spawned thread
• Parent thread: the thread that spawned it

Multithreaded Programming
• There was no set standard in C++ for multithreaded programming until C++11 (the C++ standard published in 2011).
• Every library or framework implements threads in a different way:
  – Pthreads for *NIX
  – Windows Threads
  – Higher-level libraries:
    • Boost.Thread
    • Intel's Threading Building Blocks
    • OpenMP
    • Ting: ThreadING (multiplatform library for Windows/Linux/OS X/Android)
• Every OS has a different thread implementation.
• In CS204, we will use standard C++ threads.

Multithreading in C++
• The thread object is used by the C++ standard library to make thread-management tasks easier:
  – launching threads
  – checking if they are finished
  – keeping an eye on them
  – The library has many functionalities; we will cover only a few of them.
• Every C++ program has at least one thread:
  – a thread running main(), started by the C++ runtime.
    • We will call it the main thread.
  – The main thread can then spawn additional threads that have another function as the entry point.
  – All these threads run concurrently with each other,
    • via a schedule organized by the OS.

A simple multithreaded example
#include <iostream>
#include <thread>   // standard C++ header to use threads
using namespace std;

// Later, this function will be the entry point (starting point) of aThread.
void hello()
{
    cout << "Hello thread\n";
}

int main()   // the main thread starts here
{
    thread aThread(&hello);   // aThread starts (is spawned) here
    aThread.join();
    cout << "Bye main\n";
    return 0;
}

Output:
Hello thread
Bye main

Every thread has to have an initial function where the new thread of execution begins. The new thread is started by constructing the aThread object, which specifies the task hello() to run on that thread.
Parent thread: main
Child thread: aThread
See threads1.cpp

Another multithreading example
#include <iostream>
#include <thread>
using namespace std;

void hello()
{
    for (int i = 0; i < 100; i++)
        cout << "Hello thread ";
}

int main()
{
    thread aThread(&hello);
    for (int i = 0; i < 100; i++)
        cout << "Hi main ";
    aThread.join();
    cout << "Bye main\n";
    return 0;
}

• The previous example may look like a regular function call. However, it is not: the main thread and aThread work concurrently.
• To visualize concurrency, consider the example above, where each thread displays several outputs.
• Please run this several times. You will see that the "Hello thread" and "Hi main" strings really interleave.
See threads2.cpp

Some syntax
• Thread objects can be created with or without initialization.
  – Execution of the thread starts once the thread object is initialized.
• Below is the syntax for creating with initialization (using the parametric constructor):
    thread thread_object_name(&starting_function, argument list);
• Below is the syntax for creating the thread object without initialization (using the default constructor). The thread does not start.
    thread thread_object_name;
• Below is the syntax for initialization of an already created thread object. The thread now starts.
    thread_object_name = thread(&starting_function, argument list);
See threads3.cpp for another example with arguments

join
• Once a thread, say aThread, is started by the main thread, normally the main thread should wait for aThread to finish before it terminates.
  – Otherwise, the resources of the main (parent) thread are destroyed while the child thread (aThread) still wants to use them. This causes your program to crash.
  – In general, a parent thread should not terminate before its child threads are finished.
  – To make sure that the parent thread does not terminate before the child threads finish, call the thread method join() on the child threads (in our case aThread) within the parent thread function (in our case main).
Syntax:  thread_object_name.join();
Example: aThread.join();

More on join
• Actually, join() causes the current thread in which it was called (in our case the main thread) to be blocked until the thread on which it was called (in our case aThread) completes and terminates.
  – That is why "Bye main" was always displayed at the end in the previous examples.
  – You can use this characteristic of join() when the parent thread needs a value generated by the child thread in order to continue (for example, when the child thread reads data and the parent thread needs it).
  – Of course, if the execution of aThread has already completed, the current thread continues without waiting.

void hello()
{
    cout << "Hello thread\n";
}

int main()
{
    thread aThread(&hello);
    aThread.join();
    cout << "Bye main\n";
    return 0;
}

More on join
• If join is not called, the main thread does not wait for aThread to terminate. This is not safe, since aThread is still using some resources. (In standard C++, destroying a thread object that is still joinable ends the program.) Thus your program crashes. Sometimes it crashes after all the output is displayed, but this is still a problem.

void hello()
{
    cout << "Hello thread\n";
}

int main()
{
    thread aThread(&hello);
    // aThread.join();
    cout << "Bye main\n";
    return 0;
}

More on join
• In order to call join() on a thread object, that thread must be joinable.
• A thread object is said to be joinable:
  – if it is not the current thread (i.e. you cannot call join on a thread inside itself); well, you have to work hard to be able to do so.
  – if it has not been joined/detached before
  – if it is not a default-constructed thread object that has not been initialized later (e.g. thread aThread; )
• If you call join() on a thread which is not joinable, then your program crashes.
• We can check whether the thread is joinable before joining to avoid these problems:
    if (aThread.joinable())
        aThread.join();
• Let us see some of the cases that cause a crash and the use of joinable() in threads4.cpp

Freedom for a thread: detach
• Sometimes (but not often) we want the child threads to become independent of the parent one.
  – For example, when the child thread is performing a long task and main does not need anything from the child.
• The detach() function of the thread class detaches the thread on which it is called from the calling (parent) thread, allowing them to execute independently from each other.
• After calling detach(), both threads continue to run in parallel. Note that when either one ends execution, its resources are released.
• After a call to detach(), the thread object becomes non-joinable.
• After the main thread finishes, the program ends, so detached threads that are still incomplete do not get to finish their work.
  – So you may not see some of the outputs of the detached threads on screen (the ones that would have been displayed after main ends).
  – Thus you may say that detach() is not so useful, and I agree.
Let us see threads5.cpp for an example of detach() use.

this_thread
• This namespace contains a set of functions that access the current thread.
• Functions
  – get_id: Returns the thread id. No parameters.
  – yield: Yields to other threads (i.e. the thread gives up the rest of the duration scheduled for it). No parameters.
    • Kind of an advanced issue; we will skip it.
  – sleep_until: Sleeps until a time point given as parameter
  – sleep_for: Sleeps for a duration given as parameter
• Usage
    this_thread::function_name(arguments_if_any);

this_thread::get_id
Returns the thread id of the calling thread. This value uniquely identifies the thread.

void hello(int order)
{
    cout << order << " " << this_thread::get_id() << endl;
}

int main()
{
    thread threads[5];
    for (int i = 0; i < 5; i++)
        threads[i] = thread(&hello, i);
    for (int i = 0; i < 5; i++)
        threads[i].join();
    return 0;
}

Outputs of two executions:
0 42401 89404 2 78523 267616224
0 53161 3 293636162 92124 2952

• Same program, but different executions. One reason is that the thread ids are different at each program run. The other reason is multithreading and the difference in scheduling.
• The reason that the outputs are mixed up is the interleaved execution of threads, even during one cout statement.
See threads6.cpp

this_thread::sleep_for
#include <iostream>
#include <thread>
#include <chrono>   // for timing
using namespace std;

void pause_thread(int n)
{
    this_thread::sleep_for(chrono::seconds(2*n));   // means 2*n seconds
    cout << "thread " << n << " is here\n";
}

int main()
{
    cout << "Spawning 2 threads...\n";
    thread t1(pause_thread, 1);
    thread t2(pause_thread, 2);
    cout << "Done spawning threads.\n";
    cout << "Now waiting for them to join:\n";
    t1.join();
    t2.join();
    cout << "Bye from main!\n";
    return 0;
}

Output:
Spawning 2 threads...
Done spawning threads.
Now waiting for them to join:
thread 1 is here
thread 2 is here
Bye from main!

• We will see more about the chrono namespace later. You can use minutes for longer durations, and milliseconds and microseconds for shorter ones.
• Here we always see the same output. This is due to the different waiting times in the threads: one of them returns while the other is still waiting.
• Try the same waiting time and see that the output becomes garbled again.
See threads7.cpp

this_thread::sleep_until (abs_time)
• Blocks the calling thread until abs_time.
  – Other threads may continue to work.
  – abs_time is of a data type from the chrono namespace of C++.
    • No need to understand the details of the time structures, but you should be able to apply the basics by analogy with the examples below.
• Example: race
  – All threads will start at the same time (the beginning of the next minute).
  – In the thread function (count1m), an empty loop iterates 1 million times. This is the actual race (counting to 1 million).
  – With sleep_until, the order in which the threads finish the loop looks random (try several runs).
  – Without sleep_until, the threads that start first (0, 1, 2, ...) are more likely to finish first (try several runs after commenting out sleep_until).
• See the code in the next slide and in threads8.cpp

this_thread::sleep_until (abs_time)
#include <iostream>
#include <iomanip>   // for put_time
#include <thread>
#include <chrono>    // for chrono::system_clock
#include <ctime>     // for time_t, tm, localtime, mktime
using namespace std;

void count1m(int id, struct tm *ptm)
{
    // ptm is the name of the parameter; the rest of this line is fixed code
    this_thread::sleep_until(chrono::system_clock::from_time_t(mktime(ptm)));
    for (int i = 0; i < 1000000; ++i) {}
    cout << id;
}

int main()
{
    // gets the current time
    time_t tt = chrono::system_clock::to_time_t(chrono::system_clock::now());
    // creating the time struct to be used in the threads
    struct tm *ptm = new struct tm;
    // converting the time structures (Visual C++ form of localtime_s)
    localtime_s(ptm, &tt);
    cout << "Time is now " << put_time(ptm, "%X") << endl;
    ptm->tm_min++;   // new time is the next minute
    ptm->tm_sec = 0;
    cout << "Race will start at " << put_time(ptm, "%X") << endl;
    thread threads[10];
    for (int i = 0; i < 10; i++)
        threads[i] = thread(count1m, i, ptm);
    for (int i = 0; i < 10; i++)
        threads[i].join();
    cout << endl;
    delete ptm;
    return 0;
}

Output:
Time is now 11:29:17
Race will start at 11:30:00
6574392180

Comment out the sleep_until line to see that the thread completion order then follows a pattern similar to the creation order.

Re-initialization of a thread object
• An existing thread object can be re-initialized to run as another thread after join() or detach().
  – To do so, you just assign thread(&func_name, arguments) to the thread object.
  – If you reassign/reinitialize a thread object without join() or detach(), then your program crashes.
• In this way, you can have multiple threads in your program using a single thread object.
  – However, if you use join() between reassignments, the threads work consecutively (i.e. one ends and then the other starts).
  – If you use detach() between reassignments, several threads may work concurrently.
  – Moreover, if you use detach(), you may not see all the outputs of the child threads (since the main thread may end before the child threads). As a remedy, you may make the main thread sleep for a while at the end.
See threads9.cpp for an example code

Sharing among threads
• In all examples, threads were working concurrently.
• What if two or more threads modify the same variable?
  – Let us see in an example:
    • Declare a global value = 0.
    • Thread 1 and Thread 2 both run a thread function that increments the global value 100000 times.
    • What is the final value of value after they join in main?
• You can never know the answer.
• It rarely becomes the expected value of 200000.
  – Mostly it is below 200000.
• The reason is that the scheduler may switch threads in the middle of the increment.
• We will see more on this problem and its solutions. But before that, we will see some basics.

Sharing among threads: Scheduling
• Scheduling is a key concept in the design of multitasking, multiprocessing, and real-time operating systems.
• Scheduling refers to the way processes are assigned to run on the available CPUs, since there are typically many more processes running than there are available CPUs.
• This assignment is carried out by software known as a scheduler and dispatcher.

Sharing among threads: Scheduling
• Why is scheduling important?
  – Multiple threads share the same process resources.
  – The scheduler schedules them according to "some" algorithm that only the OS knows (i.e. you cannot control scheduling).
  – The sequence of scheduling can create conflicts and problems:
    • Read-write conflicts (not the I/O type of read-write; we mean reading from and writing to memory)
    • Deadlocks
• Example: three threads sharing a single CPU
[Figure: timeline of Thread 1, Thread 2 and Thread 3 on a single CPU]
• Actually, this is a simplified and ideal schedule. Normally, scheduling may not be round-robin and the allocated durations may vary.

Sharing among threads: Synchronization Conflicts
Let us go back to the increment problem.

#include <iostream>
#include <thread>
using namespace std;

int value = 0;   // global value to be shared by threads
#define THREADS_NUM 2

void increment()
{
    for (int i = 0; i < 100000; i++)   // each thread increments value 100000 times
        value++;
}

int main()
{
    cout << "At the beginning of main value is: " << value << endl;
    thread threads[THREADS_NUM];
    for (int i = 0; i < THREADS_NUM; i++)
        threads[i] = thread(&increment);
    for (int i = 0; i < THREADS_NUM; i++)
        threads[i].join();
    cout << "At the end of main value is: " << value << endl;
    return 0;
}

Most of the time, the final value of value does not reach 200000 (it may rarely be 200000). Let us see this by running threads10.cpp.
The reason for this unexpected behavior is explained in the next slide.

Synchronization Conflict: Why Does It Happen?
• value++ has three internal steps (we do not see this in the program, but internally these happen).
  – First, the value of value is read from memory by the thread.
  – The thread increments it.
  – The thread writes the new value of value back to memory.
• Scheduling may happen in the middle of these three steps.
• Example scenarios (switching occurs where the thread number changes):
Scenario 1:
  Thread 1 reads value
  Thread 1 increments
  Thread 1 writes to value
  Thread 2 reads value
  Thread 2 increments
  Thread 2 writes to value
  This is correct since value is incremented twice.
Scenario 2:
  Thread 1 reads value
  Thread 2 reads value
  Thread 2 increments
  Thread 2 writes to value
  Thread 1 increments
  Thread 1 writes to value
  This yields a wrong value, since both threads increment the same value and value is incremented only once, although we think that it is incremented twice.

Synchronization Conflict: Why Does It Happen?
• The problem explained in the previous slide may happen several times in the example code (threads10.cpp).
• This problem is a read-write type of synchronization conflict.
  – The object is read from memory by one thread, and before that thread has a chance to write the updated value, another thread reads it.
  – In order not to allow this, read-process-write operations must be done without switching to another thread.
• As said before, switching may occur at any time during scheduling, and we cannot control scheduling.
  – Thus, we need other mechanisms to avoid read-write conflicts. Fortunately, there are ways to handle these problems, as will be seen later.

Synchronization Conflicts: Another Example
• Typical example: Producer-Consumer Problem
  – Shared queue of items
  – Producer thread(s) add items to the queue
  – Consumer thread(s) remove items from the queue

IntQueue dataQ;   // note: "IntQueue dataQ();" would declare a function, not an object

void producer()
{
    Data d = ProduceData();
    dataQ.enqueue(d);
}

void consumer()
{
    if (!dataQ.isEmpty()) {
        Data d = dataQ.dequeue();
        ConsumeData(d);
    }
}

• Suppose there are 3 threads:
  • thread1: Produces data and puts it into a queue
  • thread2: Checks the queue and, if there is data, removes and consumes it
  • thread3: Checks the queue and, if there is data, removes and consumes it
• How can things go wrong in this producer-consumer example?

Synchronization Conflicts
• Problematic scheduling:
  – Suppose that only one data item is left in the queue:
    • thread2 (consumer) checks for empty, receives false, and enters the if-statement
    • thread3 (consumer) checks for empty, receives false, and enters the if-statement
    • thread3 dequeues and consumes
    • thread2 tries to dequeue from an empty queue, which causes a crash/error/exception
• What if the queue is size-limited and we have two producers?
  – In the producer function, we need an if (!dataQ.isFull()) check before enqueueing.
  – A similar scheduling problem may occur if there is only one empty spot and each producer thread checks isFull() before the other enqueues.

Synchronization Conflicts
• There are several solutions to remedy synchronization conflicts:
  – Semaphores
  – Atomic references
  – Monitors
  – Condition variables
  – Compare and swap
  – etc.
• In this course, we will see a specific semaphore called "mutex".
  – This is the most general solution.

Mutex
• A mutex (short for mutual exclusion) is a way of communicating among threads or processes that are executing concurrently.
• This communication is usually used to coordinate the activities of multiple threads or processes, typically by controlling access to a shared resource by "locking" and "unlocking" the resource.

Mutex
• Actually, a mutex can be thought of as a binary counter.
  – To lock a mutex, you "up" its value.
  – To unlock/release a mutex, you "down" its value.
• Once a mutex is locked, only the unlock operation is allowed on it.
  – If another thread wants to lock the same mutex, it has to wait until it is unlocked by the locking thread.
  – However, if the locking thread tries to lock the same mutex again before unlocking it, then the program crashes/behaves unexpectedly.
  – If an unlocked mutex is (somehow) unlocked again, this generally causes a crash or unexpected results. Thus you have to manage the lock/unlock sequences carefully.
• Generally, mutex objects are created as global objects to be shared by many thread functions (when created, they are initially in the unlocked state).
  – However, only the locking thread can unlock a locked mutex later.
  – If a mutex is in the unlocked state, then any thread can lock it (the same or a different thread).
  – A mutex can be locked and unlocked several times by several threads.
• Let's see how we use mutex in C++ in the next slide and in threads11.cpp

Solving Synchronization Conflicts using mutex
Revisiting the increment problem

. . .
#include <mutex>   // a new header file for mutex use
mutex myMutex;     // global mutex object created
int value = 0;
#define THREADS_NUM 2

void increment()
{
    for (int i = 0; i < 100000; i++) {
        myMutex.lock();     // locked before value++ so that no other thread can attempt to increment value
        value++;
        myMutex.unlock();   // unlocked after value++ so that other threads can increment value
    }
}

int main()
{
    cout << "At the beginning of main value is: " << value << endl;
    thread threads[THREADS_NUM];
    for (int i = 0; i < THREADS_NUM; i++)
        threads[i] = thread(&increment);
    for (int i = 0; i < THREADS_NUM; i++)
        threads[i].join();
    cout << "At the end of main value is: " << value << endl;
    return 0;
}

The final value of value now reaches 200000, since at any given time the ++ operation on value can be executed by only one thread. Let us see this by running threads11.cpp.
Also see threads11.cpp for some special cases related to lock/unlock sequences

Solving Synchronization Conflicts using mutex
Revisiting the producer-consumer problem: just a sketch here; more in labs
Question: where to lock/unlock myMutex?

IntQueue dataQ;
mutex myMutex;

void producer()
{
    if (!dataQ.isFull()) {
        Data d = ProduceData();
        dataQ.enqueue(d);
    }
}

void consumer()
{
    if (!dataQ.isEmpty()) {
        Data d = dataQ.dequeue();
        ConsumeData(d);
    }
}

Locking/unlocking around only enqueue and dequeue is not a correct solution, since two threads can concurrently check emptiness/fullness, which yields wrong results. The check and the queue operation must be in the same locked region:

void producer()
{
    myMutex.lock();
    if (!dataQ.isFull()) {
        Data d = ProduceData();
        dataQ.enqueue(d);
    }
    myMutex.unlock();
}

void consumer()
{
    myMutex.lock();
    if (!dataQ.isEmpty()) {
        Data d = dataQ.dequeue();
        myMutex.unlock();   // release before the (possibly long) consume step
        ConsumeData(d);
    }
    else
        myMutex.unlock();
}

Solving Output Tidiness Problem using mutex
• In some of the previous examples, the outputs of the threads were not tidy.
  – The reason was thread scheduling in the middle of cout.
  – We somehow solved it using ostringstream, but there is no guarantee that a string will be displayed at once.
  – A guaranteed way to have tidy output is to lock a mutex before cout and unlock it after cout.
  – See the code below and threads13.cpp

void hello(int order)
{
    coutMutex.lock();
    cout << order << " " << this_thread::get_id() << endl;
    coutMutex.unlock();
}

Try with and without lock/unlock to see the difference in the output format.

Non-blocking lock trial: try_lock()
• Normally, a thread that is trying to lock a mutex waits idle if the mutex has been locked by another thread.
• Sometimes, you want a thread not to remain idle while waiting for a mutex,
  – but to do something else while waiting for it.
• There is a member function of mutex for this purpose: try_lock()
  – If the thread successfully locks the mutex, then try_lock() returns true; otherwise, it returns false.
  – We can check this returned value to either get into the critical section or do something else.

void increment()
{
    int unsuccessfulLock = 0;
    for (int i = 0; i < 10000; i++) {
        bool isLocked;
        isLocked = myMutex.try_lock();
        if (isLocked) {
            value++;
            myMutex.unlock();
        }
        else {
            unsuccessfulLock++;
        }
    }
    cout << "Unsuccessful Lock: " << unsuccessfulLock << endl;
}

See threads12.cpp for the full sample code

The Remaining Problem: Deadlock
• A catastrophic case that occurs when a thread, say thread1, waits for a mutex to be unlocked, but the mutex will never be unlocked.
  – One reason is that the mutex has been locked by another thread and the corresponding unlock has been forgotten.
    • Solution: do not forget to unlock!
  – Another reason is that the thread that has locked the mutex, say thread2, is blocked since it is waiting for another mutex to be unlocked, and this other mutex has been locked by thread1.
    • Solution: if you are using multiple mutexes, lock them in the same order in all thread functions.
• Let us see this problem below and the solution in threads14.cpp

void thread_func1()
{
    myMutex1.lock();
    myMutex2.lock();

    myMutex2.unlock();
    myMutex1.unlock();
}

void thread_func2()
{
    myMutex2.lock();
    myMutex1.lock();

    myMutex1.unlock();
    myMutex2.unlock();
}

Passing reference parameters to threads
• Since the functions associated with threads are not directly called when the thread is initialized, a copy of each argument is actually created, even if the corresponding parameter is a reference parameter.
  – Thus the updates in the function are not reflected back to the parent thread. This is a problem.
  – The solution is to use ref(argument) instead of just argument while initializing the thread.

void increment(int & value)
{
    for (int i = 0; i < 100; i++) {
        myMutex.lock();
        value++;
        myMutex.unlock();
    }
}

int main()
{
    int counter = 0;
    thread myThread(&increment, counter);
    myThread.join();
    cout << counter << endl;
    return 0;
}

The output is unexpectedly 0 on compilers that accept this code (e.g. older Visual C++ versions); other compilers refuse to compile the thread line. The solution is to change the thread line as
    thread myThread(&increment, ref(counter));
After that, the output becomes 100.
See threads15.cpp for the full sample code