Lock-Free Data Exchange for Real-Time Applications
Peter Soetens
Flanders' Mechatronics Technology Centre, Leuven
25 Feb 2006, Free and Open Source Developers Meeting
Outline
1 Introduction
  About You and Me
  Application Domain
2 Lock-Free Explained
  The Good ’Ol Days
  A New Kind of Computer Science
it allows individual threads to starve (loop forever) but denies livelock.
Redefinition (2003): “Obstruction Free”
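The "threads may starve, but there is no livelock" property can be illustrated with a compare-and-swap (CAS) retry loop. This is a minimal C++ sketch, assuming C++11 `std::atomic`; the names `shared_counter`, `lock_free_add` and `check_lock_free_add` are hypothetical and chosen only for illustration. A thread may loop any number of times, but every failed CAS means some other thread's update went through, so the system as a whole always makes progress.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical shared counter, used only for illustration.
std::atomic<int> shared_counter{0};

// Lock-free add: read the value, compute the new one, then try to publish it
// with a compare-and-swap. Returns the number of retries this caller needed.
int lock_free_add(int delta) {
    int retries = 0;
    int observed = shared_counter.load();
    // On failure, compare_exchange_strong reloads `observed` with the value
    // another thread just wrote, so the loop recomputes from fresh data.
    while (!shared_counter.compare_exchange_strong(observed, observed + delta)) {
        ++retries;  // another thread's update succeeded in between
    }
    return retries;
}

// Single-threaded sanity check of the sketch.
void check_lock_free_add() {
    shared_counter.store(0);
    assert(lock_free_add(2) == 0);  // no contention, so no retries
    assert(shared_counter.load() == 2);
}
```

No thread ever blocks while holding a resource, which is why a preempted low-priority thread cannot stall a high-priority one.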
Real-Time
A term denoting execution-time determinism of an action or sequence of actions in response to an event. This means that the action always completes (and/or starts) within a bounded time interval.
When to Use Lock-Free Algorithms
Lock-Free is especially useful for
multi-threaded or -process applications
blob data or pointer exchange
OS kernels and applications
Going Real-Time
Given one of these Real-Time schedulers:
Rate Monotonic Scheduler (RMS)
Deadline Monotonic Scheduler (DMS)
Earliest Deadline First Scheduler (EDFS)
The following properties always hold for any lock-free algorithm:
The highest-priority writer thread has best-case access time.
The other writer threads have bounded access time.
Any reader always has best-case access time.
Real-Time Validation Experiment
Real-time Machine Controller
Pentium III 750 MHz, 128 MB RAM (this is vastly oversized for our purpose, but allowed on-target data capturing)
Software
Linux 2.4.18 with RTAI/LXRT 3.0 Patch
Orocos configured for LXRT
Many readers / many writers test applications
Both FIFO buffers and shared data exchange
Real-Time Validation: Data Flow
[Figure: data flow of the "Small" and "Concurrent" test applications. Node labels: NRT, RT 1Hz, RT 500Hz, RT 1KHz, RT 2KHz.]
Real-Time Validation: No Communication
[Figure: two latency histograms with logarithmic axes. X axis: latency time (s), bucket size 5 us. Y axis: occurrences. First panel series: 1ms/0.5ms and 5ms/1ms. Second panel series: 0.5ms/0.1ms, 1ms/0.2ms and 2ms/0.3ms.]
Execution latencies for a small (2 RT threads) and concurrent (3 RT threads) application.
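The plots bin measured latencies into buckets of 5 us. As an illustration of how such a histogram could be collected on the target, here is a minimal C++ sketch. The class `LatencyHistogram` and its members are hypothetical and are not taken from Orocos or from the original test applications. All storage is allocated in the constructor, so `record()` itself performs no allocation.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper: a fixed-width latency histogram.
class LatencyHistogram {
public:
    LatencyHistogram(std::chrono::microseconds bucket_width, std::size_t bucket_count)
        : width_(bucket_width), counts_(bucket_count, 0UL) {
        assert(bucket_width.count() > 0 && bucket_count > 0);
    }

    // Count one measured latency. Negative values go to the first bucket,
    // values beyond the range go to the last (overflow) bucket.
    void record(std::chrono::nanoseconds latency) {
        if (latency < std::chrono::nanoseconds::zero()) {
            latency = std::chrono::nanoseconds::zero();
        }
        std::size_t index = static_cast<std::size_t>(latency / width_);
        if (index >= counts_.size()) {
            index = counts_.size() - 1;
        }
        ++counts_[index];
    }

    // Number of samples that fell into the given bucket.
    unsigned long occurrences(std::size_t bucket) const { return counts_.at(bucket); }

private:
    std::chrono::microseconds width_;
    std::vector<unsigned long> counts_;
};
```

With a bucket width of `std::chrono::microseconds(5)`, a latency of 4999 ns lands in bucket 0 and a latency of 5000 ns lands in bucket 1.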
Real-Time Validation: Data Exchange
[Figure: two latency histograms with logarithmic axes, lock-based (top) and lock-free (bottom). X axis: latency time (s), bucket size 5 us. Y axis: occurrences. Series: 1ms/0.5ms.]
Communication latencies. Small application, high priority thread.
Real-Time Validation: Data Exchange
[Figure: two latency histograms with logarithmic axes, lock-based (top) and lock-free (bottom). X axis: latency time (s), bucket size 5 us. Y axis: occurrences. Series: 5ms/0.5ms.]
Communication latencies. Small application, low priority thread.
Real-Time Validation: Data Exchange
[Figure: two latency histograms with logarithmic axes, lock-based (top) and lock-free (bottom). X axis: latency time (s), bucket size 5 us. Y axis: occurrences. Series: 0.5ms/0.1ms.]
Communication latencies. Concurrent application, high priority thread.
Real-Time Validation: Data Exchange
[Figure: two latency histograms with logarithmic axes, lock-based (top) and lock-free (bottom). X axis: latency time (s), bucket size 5 us. Y axis: occurrences. Series: 1ms/0.2ms.]
Communication latencies. Concurrent application, medium priority thread.
Real-Time Validation: Data Exchange
[Figure: two latency histograms with logarithmic axes, lock-based (top) and lock-free (bottom). X axis: latency time (s), bucket size 5 us. Y axis: occurrences. Series: 2ms/0.3ms.]
Communication latencies. Concurrent application, medium priority thread.
Real-Time Validation: Conclusions
Small Applications
Lock-free performs better on average
Lock-free performs better in the worst case
Concurrent Applications
Lock-free performs better on average
Lock-free performs better in the worst case
Lock-free prevents deadline failures
OK for most real-time and embedded applications, where the number of threads is well known.
Worse, if not catastrophic, for OS kernels, where the number of threads is unknown.
Reference counted memory
Both readers and writers need to reference count data blocks.
Requires ’atomic’ processor instructions.
[Figure: reference-counted data blocks (D)]
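To make the role of the atomic instructions concrete, here is a minimal C++ sketch of a reference-counted data block, assuming C++11 `std::atomic`. The names `DataBlock`, `acquire` and `release` are hypothetical and do not come from the slides. The atomic read-modify-write operations keep the count consistent even when a reader and a writer touch it at the same time.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical data block: a payload plus an atomic reference count.
struct DataBlock {
    std::atomic<int> refcount{0};
    double payload{0.0};
};

// Take a reference on the block. Returns the new count.
int acquire(DataBlock& block) {
    return block.refcount.fetch_add(1) + 1;
}

// Drop a reference. Returns true when the caller held the last one,
// meaning the block is free for reuse.
bool release(DataBlock& block) {
    int previous = block.refcount.fetch_sub(1);
    assert(previous > 0);  // releasing an unreferenced block is a bug
    return previous == 1;
}
```

A plain `int` counter updated with `++` and `--` would not be safe here, because two threads could read the same old value and lose an update.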
Overhead for Readers
Increase reference count
Since a data block may not be freed before all readers are done reading it, a reference counting implementation is required. Analogous to RCU.
Detect moved ’Most Recent’ pointer
If a reader ’locks’ the data block but detects that the refcount is one, it must retry, since the block may already be in re-use.
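The reader steps above can be sketched in C++ as follows. This is an illustration under stated assumptions, not the implementation measured in the talk: the types `Block` and `SharedData`, the pool size, and the function `read_latest` are hypothetical, and the published block is assumed to hold one reference of its own. Instead of testing the refcount value, this sketch detects a stale block by re-reading the 'Most Recent' pointer after taking the reference, which is a swapped-in but closely related check.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical data block with an atomic reference count.
struct Block {
    std::atomic<int> refs{0};  // 0 means the block is free
    int value{0};
};

// Hypothetical shared object: a fixed pool plus a 'Most Recent' pointer.
struct SharedData {
    Block pool[4];
    std::atomic<Block*> most_recent{&pool[0]};
    SharedData() { pool[0].refs.store(1); }  // the published block holds one reference
};

// Return a copy of the most recent value. If `retries_out` is given,
// it receives the number of times the reader had to start over.
int read_latest(SharedData& shared, int* retries_out = nullptr) {
    int retries = 0;
    Block* block = nullptr;
    for (;;) {
        block = shared.most_recent.load();
        block->refs.fetch_add(1);  // 'lock' the block against reuse
        if (block == shared.most_recent.load()) {
            break;  // pointer did not move, the block is safe to read
        }
        block->refs.fetch_sub(1);  // pointer moved, undo and retry
        ++retries;
    }
    int copy = block->value;   // read the data
    block->refs.fetch_sub(1);  // done reading
    if (retries_out != nullptr) {
        *retries_out = retries;
    }
    return copy;
}
```

The reader only retries when a writer replaced the block between the two loads of the pointer, so an uncontended read costs two atomic updates plus the copy.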
Overhead for Writers
Find an empty data block
Possibly race against other writers
Increase reference count
Copy and update the data
Large data blocks will reduce performance
Retry if necessary (W > 1)
In case the source data block changed, start over with the copy-update from the new data block.
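The writer steps above can be sketched in C++ as a copy-update-publish loop. This is a sketch under stated assumptions rather than the code used in the experiments: `Block`, `SharedData`, `kPoolSize`, `claim_free_block`, `pin_most_recent` and `write_update` are hypothetical names, the pool size is assumed to be chosen for a known number of threads, and the published block is assumed to hold one reference of its own.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

struct Block {
    std::atomic<int> refs{0};  // 0 means the block is free
    int value{0};
};

constexpr std::size_t kPoolSize = 4;  // assumption: sized for the known thread count

struct SharedData {
    Block pool[kPoolSize];
    std::atomic<Block*> most_recent{&pool[0]};
    SharedData() { pool[0].refs.store(1); }  // the published block holds one reference
};

// Find an empty block. The CAS from 0 to 1 settles races between writers.
Block* claim_free_block(SharedData& s) {
    for (Block& b : s.pool) {
        int expected = 0;
        if (b.refs.compare_exchange_strong(expected, 1)) return &b;
    }
    return nullptr;  // pool exhausted
}

// Take a reference on the current source block so it cannot be reused
// while this writer copies from it.
Block* pin_most_recent(SharedData& s) {
    for (;;) {
        Block* b = s.most_recent.load();
        b->refs.fetch_add(1);
        if (b == s.most_recent.load()) return b;
        b->refs.fetch_sub(1);  // pointer moved, try again
    }
}

// Copy the latest value, apply `update`, and publish the result.
// Returns false only when no free block is available.
template <typename Update>
bool write_update(SharedData& s, Update update) {
    Block* fresh = claim_free_block(s);
    if (fresh == nullptr) return false;
    for (;;) {
        Block* source = pin_most_recent(s);
        fresh->value = update(source->value);  // copy and update
        Block* expected = source;
        bool published = s.most_recent.compare_exchange_strong(expected, fresh);
        source->refs.fetch_sub(1);  // unpin the source
        if (published) {
            source->refs.fetch_sub(1);  // old block gives up its published reference
            return true;
        }
        // Another writer published first: start over from the new source block.
    }
}
```

The cost of each attempt is dominated by the copy, which is why large data blocks reduce performance, and a writer only repeats the copy when another writer published in between.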
Exception: Lock-free Pointer Queues
+ Best case access time $\triangleq T_{\mathrm{Findpointer}} + T_{\mathrm{Pointercopy}}$
Always the case for the highest priority thread.
+ Worst case access time $\triangleq W_{\mathrm{HigherPriority}} \cdot T_{\mathrm{Bestcase}}$
Depends on the number of higher priority writers.
+ Memory requirements $\triangleq \mathrm{sizeof}(D) \cdot N_{\mathrm{queue}}$
⇒ Best of both worlds! Independent of number of threads.
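As a worked example of the three formulas above, the following C++ sketch evaluates them for made-up numbers. The struct `QueueCosts`, the function names and all timing values are hypothetical; the formulas are transcribed literally from the slide.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical per-operation costs, in nanoseconds.
struct QueueCosts {
    long find_pointer_ns;
    long pointer_copy_ns;
};

// Best case access time = T_Findpointer + T_Pointercopy.
constexpr long best_case_ns(QueueCosts c) {
    return c.find_pointer_ns + c.pointer_copy_ns;
}

// Worst case access time = W_HigherPriority * T_Bestcase.
constexpr long worst_case_ns(QueueCosts c, int higher_priority_writers) {
    return higher_priority_writers * best_case_ns(c);
}

// Memory requirements = sizeof(D) * N_queue.
constexpr std::size_t memory_bytes(std::size_t element_size, std::size_t queue_length) {
    return element_size * queue_length;
}
```

With an assumed 200 ns to find the pointer and 50 ns to copy it, the best case is 250 ns, and a thread with three higher priority writers has a worst case of 750 ns. A queue of 16 elements of 8 bytes each needs 128 bytes, whatever the number of threads.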
Summary
Lock-Free algorithms are a drastic improvement for real-time applications.
Lock-Free algorithms don’t require any scheduler intervention.
But be aware of memory requirements.
Thank you for your attention !
References
http://www.orocos.org
Anderson, J., S. Ramamurthy, and K. Jeffay (1995). Real-time computing with lock-free shared objects. Proceedings of the 16th IEEE Real-Time Systems Symposium.
Herlihy, M., V. Luchangco, and M. Moir (2003). Obstruction-free synchronization: Double-ended queues as an example. In Proceedings of the 23rd International Conference on Distributed Computing Systems, Washington, DC, USA, pp. 522. IEEE Computer Society.