Dr. Fridtjof Siebert, aicas
5th Annual
ME703: Challenges for Real-Time SW Development for Multicore Systems
Class ME703, Fridtjof Siebert
Challenges for the Real-Time Software Development for Multicore Systems
Dr. Fridtjof Siebert, CTO, aicas
29 April 2010
Typical Problems on Multicore
typical code sequence (C/C++ or Java)
int counter;
void increment(){ counter++; }
counter++ is not atomic; it executes as three steps:
    r1 = counter;
    r2 = r1 + 1;
    counter = r2;
if two threads interleave these steps, both can read the same old value:
    Thread 1            Thread 2
    r1 = counter;       r1 = counter;
    r2 = r1 + 1;        r2 = r1 + 1;
    counter = r2;       counter = r2;
One increment() can get lost!
this code misses synchronization
but on a single core, it practically always works!
on a multicore, chances for failure explode!
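The lost update can be replayed deterministically by writing out the three steps of counter++ for each thread and interleaving them by hand. The following is a minimal single-threaded Java sketch of one such interleaving; the class name LostUpdateReplay and the locals t1r1, t1r2, t2r1, t2r2 (standing in for each thread's registers r1 and r2) are assumptions made for illustration.

```java
// Single-threaded replay of one unlucky interleaving of two increment() calls.
final class LostUpdateReplay {
    static int counter;

    // Returns the final counter value after both "threads" have run increment().
    static int replay() {
        counter = 0;
        int t1r1 = counter;      // Thread 1: r1 = counter   (reads 0)
        int t2r1 = counter;      // Thread 2: r1 = counter   (also reads 0)
        int t1r2 = t1r1 + 1;     // Thread 1: r2 = r1 + 1
        int t2r2 = t2r1 + 1;     // Thread 2: r2 = r1 + 1
        counter = t1r2;          // Thread 1: counter = r2   (writes 1)
        counter = t2r2;          // Thread 2: counter = r2   (writes 1 again)
        return counter;
    }

    public static void main(String[] args) {
        // Two increments were executed, but the counter ends at 1.
        System.out.println("increments: 2, counter: " + replay());
    }
}
```

On real hardware the interleaving is chosen by the scheduler and the cores, so the failure is sporadic rather than reproducible as it is here.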
Synchronization
solution: synchronize
int counter;
synchronized void increment(){ counter++; }
easy, problem solved.
Or? See later.
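As a sketch of the synchronized variant in a complete program, the following Java class runs several threads against a synchronized increment(). The class name SynchronizedCounter, the helper run() and the thread and iteration counts are assumptions for illustration; the example only shows correctness and says nothing about the cost of the lock.

```java
// Counter whose increment() is guarded by the object's monitor.
final class SynchronizedCounter {
    private int counter;

    synchronized void increment() { counter++; }

    synchronized int get() { return counter; }

    // Starts `threads` threads that each call increment() `perThread` times
    // and returns the final counter value.
    static int run(int threads, int perThread) {
        SynchronizedCounter c = new SynchronizedCounter();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    c.increment();
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return c.get();
    }

    public static void main(String[] args) {
        System.out.println(run(2, 100_000)); // prints 200000
    }
}
```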
Atomic Operations
What is the result of
int a, b; /* 32 bit, initially 0 */
    Thread 1        Thread 2
    b = a;          a = 1;
?
b == 0    b == 1
Atomic Operations
What is the result of
long a, b; /* 64 bit, initially 0 */
    Thread 1        Thread 2
    b = a;          a = 1;
?
b == 0    b == 1    b == 4294967296    b == 4294967295
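A 64-bit access that is carried out as two 32-bit accesses can observe the upper half of one value together with the lower half of another. The Java Language Specification permits this for non-volatile long and double fields; declaring the field volatile makes the access atomic. The sketch below replays such a torn read by hand. The class name TornReadReplay, the helper mix() and the old and new values are assumptions chosen for illustration and are not the values shown on the slide.

```java
// Replays what a reader could see if a 64-bit write is split into two 32-bit halves.
final class TornReadReplay {
    // Combines the upper 32 bits of one value with the lower 32 bits of another.
    static long mix(long highFrom, long lowFrom) {
        return (highFrom & 0xFFFFFFFF00000000L) | (lowFrom & 0x00000000FFFFFFFFL);
    }

    public static void main(String[] args) {
        long oldValue = 0x00000000FFFFFFFFL; // 4294967295
        long newValue = 0x0000000100000000L; // 4294967296

        // Reader sees the new upper half but still the old lower half.
        System.out.println(mix(newValue, oldValue)); // prints 8589934591
        // Reader sees the old upper half but already the new lower half.
        System.out.println(mix(oldValue, newValue)); // prints 0
    }
}
```

Neither printed value was ever written by the program, which is why the outcome of an unsynchronized 64-bit access is harder to predict than that of a 32-bit one.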
Typical Problems on Multicore
polling update
long counter;
[..]
do {
    doSomething();
} while (counter < MAX);
counter is incremented by parallel thread
on multicore, changes to counter may not become visible!
Solution: volatile?
polling update
volatile long counter;
[..]
do {
    doSomething();
} while (counter < MAX);
works for Java
does not work for C!
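For the Java case, the polling loop can be sketched as a complete program. The class name VolatilePolling, the value of MAX, the helper pollUntilMax() and the use of Thread.yield() in place of doSomething() are assumptions for illustration. The writer uses a plain read-modify-write on the volatile field, which is only safe here because it is the sole writer.

```java
// Polling a volatile counter that a second thread increments.
final class VolatilePolling {
    static final long MAX = 1_000;        // assumed bound for the demo
    static volatile long counter;

    // Polls until the writer thread has pushed counter up to MAX, then returns it.
    static long pollUntilMax() {
        counter = 0;
        Thread writer = new Thread(() -> {
            for (long i = 0; i < MAX; i++) {
                counter = counter + 1;    // not atomic; fine with a single writer
            }
        });
        writer.start();
        do {
            Thread.yield();               // stands in for doSomething()
        } while (counter < MAX);          // volatile read: sees the writer's updates
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter;
    }

    public static void main(String[] args) {
        System.out.println(pollUntilMax()); // prints 1000
    }
}
```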
We must understand the memory model!
The memory model specifies which optimisations the compiler and the underlying hardware are permitted to perform
C/C++ programs have undefined semantics in the presence of data races
Java defines a strict memory model
Java's memory model:
ordering operations are
    synchronized block
    accessing a volatile variable
The presence of an ordering operation determines the visible state in shared memory
Java's memory model: Enforcing Order
read fence: all reads are completed before entering a synchronized block or reading a volatile variable
write fence: all writes are completed before exiting a synchronized block or writing a volatile variable
Java's memory model: Data Races
data races are not forbidden in Java
you can use shared memory variables
your code has to tolerate optimizations
example: collecting debugging / profiling information
useful if occasional errors due to data races are tolerable
Example use of Java's memory model:
Shared memory communication
Ptr p;
boolean p_valid;
    Thread 1            Thread 2
    p = new Ptr();      if (p_valid)
    p_valid = true;     {
                          p.call();
                        }
Shared memory communication
Ptr p;
boolean p_valid;
    Thread 1            Thread 2
    t1 = new Ptr();     if (p_valid)
    t2 = true;          {
    p_valid = t2;         p.call();
    p = t1;             }
Writes reordered!
Shared memory communication
Ptr p;
boolean p_valid;
    Thread 1            Thread 2
    t1 = new Ptr();     t3 = p;
    t2 = true;          if (p_valid)
    p_valid = t2;       {
    p = t1;               t3.call();
                        }
Writes reordered! Reads reordered!
Shared memory communication
volatile Ptr p;
volatile boolean p_valid;
    Thread 1            Thread 2
    p = new Ptr();      if (p_valid)
    p_valid = true;     {
                          p.call();
                        }
in Java
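The volatile hand-over can be sketched as a runnable Java program. The class name VolatilePublish, the contents of Ptr (a payload field and a call() that returns it), the field name pValid and the spin loop in the reader are assumptions for illustration; the slide only shows a single if (p_valid) check.

```java
// Thread 1 publishes an object through a volatile flag; Thread 2 uses it once the flag is set.
final class VolatilePublish {
    static final class Ptr {
        final int payload = 42;          // hypothetical content of the shared object
        int call() { return payload; }
    }

    static volatile Ptr p;
    static volatile boolean pValid;

    // Returns what Thread 2 obtained from p.call().
    static int run() {
        p = null;
        pValid = false;
        int[] result = new int[1];

        Thread reader = new Thread(() -> {
            while (!pValid) {            // volatile read of the flag
                Thread.yield();
            }
            result[0] = p.call();        // p was written before pValid, so it is visible here
        });
        Thread writer = new Thread(() -> {
            p = new Ptr();               // first the data
            pValid = true;               // then the volatile flag
        });

        reader.start();
        writer.start();
        try {
            writer.join();
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result[0];
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 42
    }
}
```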
Example use of C's memory model:
Shared memory communication
volatile Obj *p;
volatile boolean p_valid;
    Thread 1            Thread 2
    p = malloc(..);     if (p_valid)
    p_valid = TRUE;     {
                          p->f = ..;
                        }
in C? CPU may still reorder memory accesses!
Example use of C's memory model:
Shared memory communication
volatile Obj *p;
volatile boolean p_valid;
    Thread 1                               Thread 2
    p = malloc(..);                        if (p_valid)
    asm volatile("sfence":::"memory");     {
    p_valid = TRUE;                          asm volatile("lfence":::"memory");
                                             p->f = ..;
                                           }
How to fix it? Add memory fences!
Out-of-Thin-Air
imagine this code
int x = 0, n = 0;
    Thread 1                Thread 2
    for(i=0;i<n;i++)        x = 42;
      x += f(n);            print(x);
can only print 42 in Java
Out-of-Thin-Air: Introduction of Writes
loop optimization in C/C++
int x = 0, n = 0;
    Thread 1                Thread 2
    tmp = x;                x = 42;
    for(i=0;i<n;i++)        print(x);
      tmp += f(n);
    x = tmp;
can print 0 in C/C++
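The effect of the introduced write can be replayed by hand. The following single-threaded Java sketch executes the statements of both threads in one interleaving that yields 0; it only simulates the transformed C/C++ code and does not claim that a Java compiler would perform this transformation. The class name IntroducedWriteReplay and the body of f() are assumptions for illustration.

```java
// Replays one interleaving of the original and the loop-optimized code.
final class IntroducedWriteReplay {
    static int x, n;

    static int f(int n) { return n; }    // hypothetical; never called since n == 0

    // Original code: with n == 0, Thread 1 never writes x.
    static int replayOriginal() {
        x = 0; n = 0;
        for (int i = 0; i < n; i++) {    // Thread 1: loop body runs zero times
            x += f(n);
        }
        x = 42;                          // Thread 2: x = 42
        return x;                        // Thread 2: print(x)
    }

    // Optimized code: x is cached in tmp and written back unconditionally.
    static int replayOptimized() {
        x = 0; n = 0;
        int tmp = x;                     // Thread 1: tmp = x      (reads 0)
        x = 42;                          // Thread 2: x = 42
        for (int i = 0; i < n; i++) {    // Thread 1: loop body runs zero times
            tmp += f(n);
        }
        x = tmp;                         // Thread 1: introduced write overwrites 42
        return x;                        // Thread 2: print(x)
    }

    public static void main(String[] args) {
        System.out.println(replayOriginal());  // prints 42
        System.out.println(replayOptimized()); // prints 0
    }
}
```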
Out-of-Thin-Air
imagine this code
int x = 0, y = 0;
    Thread 1        Thread 2
    r1 = x;         r2 = y;
    y = r1;         x = r2;
Expected result x == 0; y == 0;
Only possible result in Java
Out-of-Thin-Air: Optimization in C/C++
imagine this code
int x = 0, y = 0;
    Thread 1            Thread 2
    y = 42;             r2 = y;
    r1 = x;             x = r2;
    if (r1 != 42)
      y = r1;
Possible in upcoming C++ MM. Results in x == 42; y == 42;
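To see how the speculative version arrives at 42 for both variables, the statements of the two threads can be executed by hand in one particular order. This single-threaded Java sketch is only a replay of the transformed code above; the class name SpeculationReplay is an assumption for illustration.

```java
// Replays one interleaving of the speculatively optimized code.
final class SpeculationReplay {
    static int x, y;

    // Returns {x, y} after both "threads" have finished.
    static int[] replay() {
        x = 0; y = 0;
        y = 42;                  // Thread 1: speculative write
        int r2 = y;              // Thread 2: r2 = y   (reads the speculated 42)
        x = r2;                  // Thread 2: x = r2
        int r1 = x;              // Thread 1: r1 = x   (reads 42)
        if (r1 != 42) {          // Thread 1: speculation looks confirmed,
            y = r1;              //           so y is never corrected
        }
        return new int[] { x, y };
    }

    public static void main(String[] args) {
        int[] xy = replay();
        System.out.println("x == " + xy[0] + "; y == " + xy[1]); // prints x == 42; y == 42
    }
}
```

The value 42 appears in neither thread of the original program, which is what the term out-of-thin-air refers to.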
Performance on a Multicore
example: single-core app, 3 threads
all threads synchronize frequently on the same lock
example: single-core app, 3 threads, on a multicore
Lock-free Algorithms
typical code sequence
do {
    x = counter;
    result = CAS(counter,x,x+1);
} while (result != x);
Compare-And-Swap Issues
typical code sequence
do {
    x = counter;
    result = CAS(counter,x,x+1);
} while (result != x);
what is the WCET? ∞?
On dual core: [Figure: frequency vs. # iterations]
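The retry loop can be written in Java with AtomicInteger.compareAndSet, which returns a boolean instead of the old value that the CAS() on the slide returns. The sketch below also counts failed attempts, which is the quantity that makes the worst-case execution time hard to bound. The class name CasLoopCounter, the retries counter and the helper run() are assumptions for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

// Lock-free increment via a compare-and-set retry loop.
final class CasLoopCounter {
    static final AtomicInteger counter = new AtomicInteger();
    static final AtomicLong retries = new AtomicLong();   // number of failed CAS attempts

    static void increment() {
        boolean swapped;
        do {
            int x = counter.get();
            swapped = counter.compareAndSet(x, x + 1);    // fails if another thread got in first
            if (!swapped) {
                retries.incrementAndGet();
            }
        } while (!swapped);
    }

    // Starts `threads` threads that each call increment() `perThread` times.
    static int run(int threads, int perThread) {
        counter.set(0);
        retries.set(0);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    increment();
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println("counter: " + run(2, 100_000));
        System.out.println("failed CAS attempts: " + retries.get()); // varies from run to run
    }
}
```

No increment is lost, but the number of retries per call has no fixed upper limit in this form.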
Compare-And-Swap Solution
introduce long enough code sections between two compare-and-swap loops
then, if a retry is required, one other CPU was successful
after n-1 conflicts, we can be sure that all other CPUs are outside the CAS loop
Example: on an 8 CPU system: # of tries limited
Lock-free library code
use of libraries helps
AtomicInteger counter = new AtomicInteger();
void increment(){ counter.incrementAndGet(); }
Code is easier and safer
Handmade lock-free algorithms are not for everyday development
Conclusion
Code that runs well on a single CPU can suddenly fail on a multicore
Clear semantics of concurrent code is required for safe applications
Concurrency at a high level is most beneficial
Thanks: This work was partially funded by the European Commission's 7th framework program's JEOPARD project, #216682.