ME-703: Challenges for Real-Time SW Development for Multicore Systems

Class ME-703, Fridtjof Siebert

Challenges for the Real-Time Software Development for Multicore Systems

Dr. Fridtjof Siebert, CTO, aicas

5th Annual

29 April 2010

Typical Problems on Multicore

typical code sequence (C/C++ or Java)

  int counter;

  void increment()
  {
    counter++;
  }

counter++ is executed as

  r1 = counter;
  r2 = r1 + 1;
  counter = r2;

  Thread 1             Thread 2
  r1 = counter;        r1 = counter;
  r2 = r1 + 1;         r2 = r1 + 1;
  counter = r2;        counter = r2;

One increment() can get lost!
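The lost update can be replayed deterministically. The following is a minimal sketch, not taken from the slides: it uses no real threads and simply executes the three steps of both threads by hand in an overlapping order. The class name LostUpdateReplay and the variable names t1r1, t2r1, t1r2, t2r2 are made up for illustration.

```java
// Deterministic replay of the overlapping interleaving; no real threads are used.
final class LostUpdateReplay {
    static int counter = 0;

    /** Replays two overlapping increment() calls step by step and returns counter. */
    static int replay() {
        counter = 0;
        int t1r1 = counter;   // Thread 1: r1 = counter   (reads 0)
        int t2r1 = counter;   // Thread 2: r1 = counter   (also reads 0)
        int t1r2 = t1r1 + 1;  // Thread 1: r2 = r1 + 1
        int t2r2 = t2r1 + 1;  // Thread 2: r2 = r1 + 1
        counter = t1r2;       // Thread 1: counter = r2   (writes 1)
        counter = t2r2;       // Thread 2: counter = r2   (writes 1 again)
        return counter;
    }

    public static void main(String[] args) {
        // Two increments were executed, but the counter only advanced by one.
        System.out.println("counter after two increments: " + replay());
    }
}
```

Both threads read the old value before either one writes, so the second write overwrites the first instead of building on it.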

this code lacks synchronization

but on a single core, it practically always works!

on a multicore, chances for failure explode!

Synchronization

solution: synchronize

  int counter;

  synchronized void increment()
  {
    counter++;
  }

easy, problem solved.

Or? See later.
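A minimal runnable sketch of the synchronized counter in Java. The class name SynchronizedCounter, the get() accessor and the run() harness with its thread and iteration counts are assumptions added for illustration; only the synchronized increment() comes from the slide.

```java
final class SynchronizedCounter {
    private int counter;

    // Only one thread at a time can execute the read-modify-write.
    synchronized void increment() { counter++; }

    synchronized int get() { return counter; }

    /** Starts `threads` threads, each calling increment() `perThread` times. */
    static int run(int threads, int perThread) {
        SynchronizedCounter c = new SynchronizedCounter();
        Thread[] ts = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            ts[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) c.increment();
            });
            ts[i].start();
        }
        for (Thread t : ts) {
            try { t.join(); } catch (InterruptedException e) { throw new IllegalStateException(e); }
        }
        return c.get();
    }

    public static void main(String[] args) {
        System.out.println(run(4, 100_000)); // no increment is lost: 400000
    }
}
```

With the lock in place every increment is counted, at the price of all threads contending for the same monitor.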

Atomic Operations

What is the result of

  int a, b;  /* 32 bit, initially 0 */

  Thread 1            Thread 2
  b = a;              a = -1;

?

  b == 0
  b == -1

Atomic Operations

What is the result of

  long a, b;  /* 64 bit, initially 0 */

  Thread 1            Thread 2
  b = a;              a = -1;

?

  b == 0
  b == -1
  b == -4294967296
  b == 4294967295
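The two extra values are what a reader sees when a 64-bit access is carried out as two 32-bit halves and only one half comes from the new value. In Java this is permitted for non-volatile long and double fields (JLS 17.7). The sketch below does not provoke a real torn read; it only computes the four possible combinations. The class name TornLongValues and the helper combine() are made up for illustration.

```java
final class TornLongValues {
    /** Takes the high 32 bits from one value and the low 32 bits from another. */
    static long combine(long highFrom, long lowFrom) {
        return (highFrom & 0xFFFFFFFF00000000L) | (lowFrom & 0x00000000FFFFFFFFL);
    }

    public static void main(String[] args) {
        long oldA = 0L;   // value of a before Thread 2 runs
        long newA = -1L;  // value of a after  Thread 2 runs
        System.out.println(combine(oldA, oldA)); // 0            both halves old
        System.out.println(combine(newA, newA)); // -1           both halves new
        System.out.println(combine(newA, oldA)); // -4294967296  high half new, low half old
        System.out.println(combine(oldA, newA)); // 4294967295   high half old, low half new
    }
}
```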

Typical Problems on Multicore

polling update

  long counter;
  [..]
  do
    {
      doSomething();
    }
  while (counter < MAX);

counter is incremented by parallel thread

on multicore, changes to counter may not become visible!

Solution: volatile?

polling update

  volatile long counter;
  [..]
  do
    {
      doSomething();
    }
  while (counter < MAX);

works for Java

does not work for C!
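A minimal Java sketch of the polling loop with a volatile counter. The class name VolatilePolling, the value of MAX, the producer thread and the iteration count standing in for doSomething() are assumptions added for illustration.

```java
final class VolatilePolling {
    static final long MAX = 1_000;
    static volatile long counter;   // volatile: every loop test re-reads the field

    /** Polls until the producer has raised counter to MAX; returns the loop iterations. */
    static long run() {
        counter = 0;
        Thread producer = new Thread(() -> {
            for (long i = 0; i < MAX; i++) {
                counter = i + 1;    // single writer, so a plain volatile store is enough
            }
        });
        producer.start();

        long iterations = 0;
        do {
            iterations++;           // stands in for doSomething()
        } while (counter < MAX);

        try { producer.join(); } catch (InterruptedException e) { throw new IllegalStateException(e); }
        return iterations;
    }

    public static void main(String[] args) {
        System.out.println("polled " + run() + " times");
    }
}
```

Without volatile, a Java compiler would be free to read counter once and keep the value in a register, so the loop might never terminate.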

We must understand the memory model!

The memory model specifies which optimisations the compiler and the underlying hardware are permitted to perform

C/C++ programs have undefined semantics in case of data races

Java defines a strict memory model

Java's memory model:

ordering operations are

  synchronized block

  accessing a volatile variable

The presence of an ordering operation determines the visible state in shared memory

Java's memory model: Enforcing Order

all reads are completed before

  entering a synchronized block, or

  reading a volatile variable

  (read fence)

all writes are completed before

  exiting a synchronized block, or

  writing a volatile variable

  (write fence)

Java's memory model: Data Races

data races are not forbidden in Java

  you can use shared memory variables

  your code has to tolerate optimizations

examples

  collecting debugging / profiling information

  useful if occasional errors due to data races are tolerable
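A minimal sketch of such a tolerated data race: a profiling counter that is updated without any synchronization. The class name RacyProfileCounter, the method recordHit() and the run() harness are made up for illustration. The result may be lower than the true number of calls, which is acceptable only if an approximate count is good enough.

```java
final class RacyProfileCounter {
    static int hits;   // deliberately neither volatile nor synchronized

    static void recordHit() { hits++; }   // racy read-modify-write, updates may be lost

    /** Returns the recorded hit count, which may be less than threads * perThread. */
    static int run(int threads, int perThread) {
        hits = 0;
        Thread[] ts = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            ts[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) recordHit();
            });
            ts[i].start();
        }
        for (Thread t : ts) {
            try { t.join(); } catch (InterruptedException e) { throw new IllegalStateException(e); }
        }
        return hits;   // join() makes the final writes visible here
    }

    public static void main(String[] args) {
        System.out.println("recorded " + run(4, 100_000) + " of 400000 hits");
    }
}
```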

Example use of Java's memory model:

Shared memory communication

  Ptr     p;
  boolean p_valid;

  Thread 1                Thread 2
  p = new Ptr();          if (p_valid)
  p_valid = true;           {
                              p.call();
                            }

Example use of Java's memory model:

Shared memory communication

  Ptr     p;
  boolean p_valid;

  Thread 1                Thread 2
  t1 = new Ptr();         if (p_valid)
  t2 = true;                {
  p_valid = t2;               p.call();
  p = t1;                   }

Writes reordered!

Example use of Java's memory model:

Shared memory communication

  Ptr     p;
  boolean p_valid;

  Thread 1                Thread 2
  t1 = new Ptr();         t3 = p;
  t2 = true;              if (p_valid)
  p_valid = t2;             {
  p = t1;                     t3.call();
                            }

Writes reordered! Reads reordered!

Example use of Java's memory model:

Shared memory communication

  volatile Ptr     p;
  volatile boolean p_valid;

  Thread 1                Thread 2
  p = new Ptr();          if (p_valid)
  p_valid = true;           {
                              p.call();
                            }

in Java
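A minimal runnable sketch of the volatile version in Java. The names p, p_valid, Ptr and call() follow the slide. The body of Ptr (a field holding 42), the class name VolatilePublication and the spin loop that waits for p_valid are assumptions added so that the example terminates with a checkable result.

```java
final class VolatilePublication {
    static final class Ptr {
        int value = 42;                  // hypothetical payload
        int call() { return value; }
    }

    static volatile Ptr p;
    static volatile boolean p_valid;

    /** Publishes a Ptr from one thread and uses it from another. */
    static int run() {
        p = null;
        p_valid = false;
        Thread writer = new Thread(() -> {
            p = new Ptr();               // must be visible before the flag is
            p_valid = true;              // volatile write: earlier writes cannot move below it
        });
        writer.start();

        while (!p_valid) {               // volatile read: later reads cannot move above it
            Thread.onSpinWait();
        }
        int result = p.call();           // p is guaranteed to be non-null here

        try { writer.join(); } catch (InterruptedException e) { throw new IllegalStateException(e); }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Strictly, making p_valid volatile would already order the two writes and the two reads; the sketch declares both fields volatile to match the slide.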

Example use of C's memory model:

Shared memory communication

  volatile Obj    *p;
  volatile boolean p_valid;

  Thread 1                Thread 2
  p = malloc(..);         if (p_valid)
  p_valid = TRUE;           {
                              p->f = ..;
                            }

in C? CPU may still reorder memory accesses!

Example use of C's memory model:

Shared memory communication

  volatile Obj    *p;
  volatile boolean p_valid;

  Thread 1                    Thread 2
  p = malloc(..);             if (p_valid)
  asm volatile(                 {
    "sfence":::"memory");         asm volatile(
  p_valid = TRUE;                   "lfence":::"memory");
                                  p->f = ..;
                                }

How to fix it? Add memory fences!
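For comparison, the same fence placement can be sketched in Java with the static fence methods of java.lang.invoke.VarHandle (Java 9 and later). This is only a loose analogue and not part of the slides: storeStoreFence() stands in for the "sfence" and loadLoadFence() for the "lfence". The class name FencedPublication, the value written to f, and the bounded polling loop are assumptions added for illustration. In everyday Java code the volatile variant is the simpler choice.

```java
import java.lang.invoke.VarHandle;

final class FencedPublication {
    static final class Obj { int f; }

    static Obj p;              // plain (non-volatile) fields, ordered by explicit fences
    static boolean p_valid;

    /** Returns false only if the reader saw p_valid without a fully written object. */
    static boolean run() {
        p = null;
        p_valid = false;
        Thread writer = new Thread(() -> {
            Obj o = new Obj();
            o.f = 7;                          // hypothetical payload
            p = o;
            VarHandle.storeStoreFence();      // earlier stores stay before the flag store
            p_valid = true;
        });
        writer.start();

        boolean ok = true;
        long deadline = System.nanoTime() + 100_000_000L;   // poll for at most 100 ms
        while (System.nanoTime() < deadline) {
            if (p_valid) {
                VarHandle.loadLoadFence();    // flag load stays before the loads below
                ok = (p != null && p.f == 7);
                break;
            }
        }

        try { writer.join(); } catch (InterruptedException e) { throw new IllegalStateException(e); }
        return ok && p_valid && p != null;    // after join() the writes are visible
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```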

Out-of-thin-Air

imagine this code

  int x = 0, n = 0;

  Thread 1                Thread 2
  for(i=0;i<n;i++)        x = 42;
    x += f(n);            print(x);

can only print 42 in Java

Out-of-thin-Air: Introduction of Writes

loop optimization in C/C++

  int x = 0, n = 0;

  Thread 1                Thread 2
  tmp = x;
  for(i=0;i<n;i++)        x = 42;
    tmp += f(n);
  x = tmp;
                          print(x);

can print 0 in C/C++

Out-of-thin-Air

imagine this code

  int x = 0, y = 0;

  Thread 1                Thread 2
  r1 = x;                 r2 = y;
  y = r1;                 x = r2;

Expected result  x == 0; y == 0;

Only possible result in Java

Out-of-thin-Air: Optimization in C/C++

imagine this code

  int x = 0, y = 0;

  Thread 1                Thread 2
  y = 42;                 r2 = y;
  r1 = x;                 x = r2;
  if (r1 != 42)
    y = r1;

Possible in upcoming C++ memory model. Results in  x == 42; y == 42;

Performance on a Multicore

example: single-core app, 3 threads

all threads synchronize frequently on the same lock

[Figure: example: single-core app, 3 threads]

[Figure: example: single-core app, 3 threads, on a multicore]

Lock-free Algorithms

typical code sequence

  do
    {
      x = counter;
      result = CAS(counter,x,x+1);
    }
  while (result != x);
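In Java the same retry loop can be written with java.util.concurrent.atomic.AtomicInteger. This is a minimal sketch: compareAndSet() returns a boolean indicating success, whereas the CAS in the slide's pseudocode returns the old value. The class name CasIncrement and the run() harness are made up for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class CasIncrement {
    static final AtomicInteger counter = new AtomicInteger();

    /** Lock-free increment: retry until no other thread changed counter in between. */
    static void increment() {
        int x;
        do {
            x = counter.get();
        } while (!counter.compareAndSet(x, x + 1));
    }

    /** Starts `threads` threads, each calling increment() `perThread` times. */
    static int run(int threads, int perThread) {
        counter.set(0);
        Thread[] ts = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            ts[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) increment();
            });
            ts[i].start();
        }
        for (Thread t : ts) {
            try { t.join(); } catch (InterruptedException e) { throw new IllegalStateException(e); }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(run(4, 100_000)); // no increment is lost: 400000
    }
}
```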

Compare-And-Swap Issues

typical code sequence

  do
    {
      x = counter;
      result = CAS(counter,x,x+1);
    }
  while (result != x);

what is the WCET? ∞?

On dual core:

[Figure: frequency over # iterations]

Compare-And-Swap Solution

introduce long enough code sections in between 2 compare-and-swap loops

then, if a retry is required, one other CPU was successful

after n-1 conflicts, we can be sure that all other CPUs are outside the CAS loop
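The number of attempts per CAS loop can be measured with a sketch like the following. It inserts a stretch of unrelated work between two CAS loops and records the largest number of attempts any single increment needed. It does not prove the n-1 bound, which depends on the work section being long enough and on the threads really running on separate CPUs. The class name CasRetryStats, the helper nonCriticalWork() and its length are assumptions made for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class CasRetryStats {
    static final AtomicInteger counter = new AtomicInteger();

    /** One lock-free increment; returns the number of CAS attempts it took. */
    static int incrementCountingTries() {
        int tries = 0;
        int x;
        do {
            x = counter.get();
            tries++;
        } while (!counter.compareAndSet(x, x + 1));
        return tries;
    }

    /** Hypothetical stand-in for the code section between two CAS loops. */
    static int nonCriticalWork(int seed) {
        int h = seed;
        for (int i = 0; i < 200; i++) h = h * 31 + i;
        return h;
    }

    /** Returns the largest number of attempts observed for a single increment. */
    static int run(int threads, int perThread) {
        counter.set(0);
        AtomicInteger maxTries = new AtomicInteger();
        AtomicInteger sink = new AtomicInteger();   // keeps the work from being optimized away
        Thread[] ts = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            final int seed = i;
            ts[i] = new Thread(() -> {
                int localMax = 0;
                int h = seed;
                for (int j = 0; j < perThread; j++) {
                    localMax = Math.max(localMax, incrementCountingTries());
                    h = nonCriticalWork(h);
                }
                maxTries.accumulateAndGet(localMax, Math::max);
                sink.addAndGet(h);
            });
            ts[i].start();
        }
        for (Thread t : ts) {
            try { t.join(); } catch (InterruptedException e) { throw new IllegalStateException(e); }
        }
        return maxTries.get();
    }

    public static void main(String[] args) {
        System.out.println("max tries for one increment: " + run(8, 100_000));
    }
}
```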

Compare-And-Swap Solution

Example: On an 8 CPU system: # of tries limited

Lock-free library code

use of libraries helps

  AtomicInteger counter = new AtomicInteger();

  void increment()
  {
    counter.incrementAndGet();
  }

Code is easier and safer

Hand-made lock-free algorithms are not for every-day development

Conclusion

Code that runs well on a single CPU suddenly fails on a multicore

Clear semantics of concurrent code is required for safe applications

Concurrency at a high level is most beneficial

Thanks: This work was partially funded by the European Commission's 7th framework program's JEOPARD project, #216682.