-
A New Reachability Algorithm for Symmetric Multi-processor Architecture
D. Sahoo, Stanford
J. Jain, Fujitsu
S. Iyer, UT-Austin
D. Dill, Stanford
Formal Equivalence and Assertion-based Verification Workshop 2005
-
Outline
Standard Reachability Analysis
Multithreaded Reachability
Multithreaded Reachability in SMP machines
Engineering Issues
Results
Conclusion and Future Work
-
Related Work
Parallel Reachability Analysis:
Stern and Dill [CAV, 97]
Stornetta and Brewer [DAC, 96]
Yang, Hallaron [97]
Heyman, Geist, Grumberg, Schuster [CAV, 00]
Garavel, Mateescu, Smarandache [SPIN, 01]
Pixley, Havlicek [03]
-
Reachability using BDD [Burch et al. : 91]
Partitioned Transition Relation
Initial State
Image computation
Least Fixed Point
[Figure labels: I, R1, R2, Ri, Tr1, Tri, Trn]
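The least-fixed-point iteration named on this slide can be sketched with explicit Python sets standing in for BDDs. This is only an illustration under stated assumptions: the partitioned transition relation is modeled disjunctively as a list of successor functions (conjunctive partitioning is also common), and the names `image`, `reachable`, `tr_inc`, and `tr_dbl` are hypothetical, not taken from the talk.

```python
from typing import Callable, Iterable, List, Set

State = int
# Assumption: each partition of the transition relation maps a state to its successors.
TransitionPart = Callable[[State], Iterable[State]]


def image(states: Set[State], parts: List[TransitionPart]) -> Set[State]:
    """Return every state reachable in one step through any partition."""
    result: Set[State] = set()
    for s in states:
        for tr in parts:
            result.update(tr(s))
    return result


def reachable(initial: Set[State], parts: List[TransitionPart]) -> Set[State]:
    """Iterate R_{i+1} = R_i | Image(R_i) until no new state is added."""
    reached = set(initial)
    frontier = set(initial)
    while frontier:
        new_states = image(frontier, parts) - reached
        reached |= new_states
        frontier = new_states
    return reached


# Toy system on states 0..7, split into two hypothetical partitions.
def tr_inc(s: State) -> List[State]:
    return [(s + 1) % 8] if s % 2 == 0 else []


def tr_dbl(s: State) -> List[State]:
    return [(2 * s) % 8] if s % 2 == 1 else []


if __name__ == "__main__":
    print(sorted(reachable({0}, [tr_inc, tr_dbl])))
```

Only the newly discovered states are used as the next frontier, which is a common refinement of the plain iteration.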
-
Partitioned Reachability using POBDD
POBDD - [Jain : 92]
Reachability - [Narayan et al. : 97]
Initial States : I
-
Partitioned Reachability using POBDD
Local Fixed Point 1
Local Fixed Point 2
Initial States : I
POBDD - [Jain : 92]
Reachability - [Narayan et al. : 97]
-
Partitioned Reachability using POBDD
Initial States : I
Similarly repeat for other partitions
POBDD - [Jain : 92]
Reachability - [Narayan et al. : 97]
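The sequence shown across these slides (compute a local fixed point in one partition, then repeat for the others) can be sketched as follows. Explicit sets again stand in for POBDDs, and the function `partitioned_reachable`, the `partition_of` mapping, and the toy transition function are assumptions made for illustration only.

```python
from typing import Callable, Dict, Iterable, List, Set

State = int


def partitioned_reachable(
    initial: Set[State],
    successors: Callable[[State], Iterable[State]],
    partition_of: Callable[[State], int],
    num_partitions: int,
) -> Dict[int, Set[State]]:
    """Reachability that keeps one reached set per partition of the state space."""
    reached: Dict[int, Set[State]] = {p: set() for p in range(num_partitions)}
    pending: Dict[int, Set[State]] = {p: set() for p in range(num_partitions)}
    for s in initial:
        pending[partition_of(s)].add(s)

    while any(pending.values()):
        for p in range(num_partitions):
            # Local fixed point: keep working inside partition p.
            frontier = pending[p] - reached[p]
            pending[p] = set()
            while frontier:
                reached[p] |= frontier
                next_frontier: Set[State] = set()
                for s in frontier:
                    for t in successors(s):
                        q = partition_of(t)
                        if q == p:
                            if t not in reached[p]:
                                next_frontier.add(t)
                        elif t not in reached[q]:
                            # Communication: hand the state to the partition that owns it.
                            pending[q].add(t)
                frontier = next_frontier
    return reached


def toy_successors(s: State) -> List[State]:
    # Hypothetical transition function on states 0..7.
    return [(s + 1) % 8] if s % 2 == 0 else [(2 * s) % 8]


if __name__ == "__main__":
    print(partitioned_reachable({0}, toy_successors, lambda s: s // 4, 2))
```

States that cross into another partition are queued rather than explored immediately, so each partition is processed in long local runs.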
-
Partitioned Reachability using POBDD
Improvements:
[Iyer et al. : 03]
[Sahoo et al. : 04]
POBDD - [Jain : 92]
Reachability - [Narayan et al. : 97]
-
Motivation for Multi-threaded Approach
Scheduling Problem
Increasing availability of powerful SMP machines
Multi-threading is a way of achieving real parallelism in SMP machines
-
Multi-threaded Reachability [DAC 05]
Naive parallelization
Advantage:
Parallel speedup
Catch a bug faster than the sequential version
Problems:
Not much parallelism
[Figure axis: Time]
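One plausible reading of naive parallelization is a round-based scheme: every partition with pending work runs its local fixed point in its own thread, and states are exchanged only after all threads of the round have finished. The sketch below assumes that reading; it is not the DAC 05 implementation, and all names (`local_fixed_point`, `parallel_reachable`, `toy_successors`) are hypothetical. In CPython the global interpreter lock limits real speedup, so the code shows the structure only.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Set

State = int


def local_fixed_point(p, seeds, reached_p, successors, partition_of):
    """Explore partition p from seeds; return (newly reached, outgoing states)."""
    new_reached: Set[State] = set()
    outgoing: Dict[int, Set[State]] = {}
    frontier = set(seeds) - reached_p
    while frontier:
        new_reached |= frontier
        nxt: Set[State] = set()
        for s in frontier:
            for t in successors(s):
                q = partition_of(t)
                if q != p:
                    outgoing.setdefault(q, set()).add(t)
                elif t not in reached_p and t not in new_reached:
                    nxt.add(t)
        frontier = nxt
    return new_reached, outgoing


def parallel_reachable(initial, successors, partition_of, num_partitions):
    reached = {p: set() for p in range(num_partitions)}
    pending = {p: set() for p in range(num_partitions)}
    for s in initial:
        pending[partition_of(s)].add(s)
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        while any(pending.values()):
            # One task per partition that has work in this round.
            jobs = {
                p: pool.submit(local_fixed_point, p, pending[p], reached[p],
                               successors, partition_of)
                for p in range(num_partitions) if pending[p]
            }
            pending = {p: set() for p in range(num_partitions)}
            # Wait for each task, then communicate its cross-partition states.
            for p in sorted(jobs):
                new_reached, outgoing = jobs[p].result()
                reached[p] |= new_reached
                for q, states in outgoing.items():
                    pending[q] |= states
            for q in pending:
                pending[q] -= reached[q]
    return reached


def toy_successors(s: State) -> List[State]:
    # Hypothetical transition function on states 0..7.
    return [(s + 1) % 8] if s % 2 == 0 else [(2 * s) % 8]


if __name__ == "__main__":
    print(parallel_reachable({0}, toy_successors, lambda s: s // 4, 2))
```

In the toy run only one partition has work in each round, so the threads never overlap, which mirrors the "not much parallelism" problem listed on the slide.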
-
Multi-threaded Reachability [DAC 05]
Early Communication
Advantage:
Parallel speedup
Finishes the reachability analysis faster
Catches bug faster than the naive version
Problems:
Parallelism could be better
[Figure axis: Time]
-
Multi-threaded Reachability [DAC 05]
Early Communication and Partial Communication
Advantage:
Parallel speedup
Finishes the reachability analysis faster
Catches bug faster than the previous versions
[Figure axis: Time]
-
Reachability in SMP Architecture
We find the bugs faster!
Improved parallelism
Better parallel speedup
[Figure axis: Time]
-
Engineering Issues
Thread-safe BDD library
Deterministic behavior
Smart thread scheduling
-
Sources of Non-determinism
Extensive memory based optimizations
Pointer comparisons
Hashing based on memory address
Solutions:
Deterministic Hashing
Deterministic comparisons
[Figure: Thread 1: p = malloc (); Thread 2: p = malloc (); if (p > p1); key = hash(p)]
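The idea of replacing address-based hashing and pointer comparisons can be illustrated with a small sketch. It assumes that each node receives a creation-order identifier; the `Node` class, the `uid` field, and the multiplier constant are hypothetical choices, not details from the talk. With several threads a single shared counter would itself depend on scheduling, so a real implementation would need some per-thread numbering scheme.

```python
import itertools


class Node:
    """A BDD-like node identified by a creation-order id, not by its address."""

    _counter = itertools.count()

    def __init__(self, var, low=None, high=None):
        self.uid = next(Node._counter)  # same value on every run with the same creation order
        self.var, self.low, self.high = var, low, high


def address_key(node, table_size):
    # Non-deterministic across runs: in CPython, id() is the object's memory address.
    return id(node) % table_size


def deterministic_key(node, table_size):
    # Deterministic: depends only on the order in which nodes were created.
    return (node.uid * 2654435761) % table_size


def ordered_pair(a, b):
    # Deterministic replacement for a pointer comparison such as "p > p1".
    return (a, b) if a.uid <= b.uid else (b, a)


if __name__ == "__main__":
    x, y = Node("x"), Node("y")
    print(deterministic_key(x, 1024), deterministic_key(y, 1024))
    print(ordered_pair(y, x)[0] is x)
```

The multiplicative constant only spreads consecutive ids over the table; any fixed mixing function that ignores addresses would serve the same purpose.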
-
Sources of Non-determinism
Thread synchronization
Solutions:
Synchronization based on deterministic count
Number of ITE operations
Number of Sift operations
[Figure: Thread 1 and Thread 2, with Image #n and Image #n+1]
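Synchronizing on an operation count rather than on wall-clock timing can be sketched as below. The assumption is that each thread counts its own operations (a stand-in for ITE calls) and exchanges data only when the count reaches a fixed multiple. `SYNC_EVERY`, `run`, and the arithmetic inside the loop are hypothetical placeholders.

```python
import threading

SYNC_EVERY = 100  # hypothetical: synchronize after this many counted operations


def run(num_threads=2, total_ops=1000):
    """Run workers that exchange values only at deterministic operation counts."""
    barrier = threading.Barrier(num_threads)
    outbox = [0] * num_threads              # value published by each thread
    logs = [[] for _ in range(num_threads)]

    def worker(tid):
        local = tid
        for op in range(1, total_ops + 1):
            local = (local * 31 + op) % 1_000_003   # stands in for one counted operation
            if op % SYNC_EVERY == 0:
                outbox[tid] = local
                barrier.wait()                       # every thread has published
                seen = sum(outbox)                   # read what all threads published
                barrier.wait()                       # every thread has read
                local = (local + seen) % 1_000_003
                logs[tid].append(local)

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return logs


if __name__ == "__main__":
    print(run() == run())
```

Because the exchange points depend only on the counters, two runs produce the same logs regardless of how the operating system schedules the threads.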
-
Smart Thread Scheduling
Each processor has its own cache
Thread is assigned to a processor
The cache fills up with the thread's memory usage
The same thread is assigned to a different processor after some time
A large number of unnecessary cache misses occur when the thread uses its previously used memory locations
Solutions:
Bind thread to a processor
Leads to suboptimal throughput if the number of threads exceeds the number of processors
[Figure: CPU1 with Cache1 holding 0x07ffd0; CPU2 with Cache2; the thread's lookup of 0x07ffd0 is a cache miss]
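Binding a thread to a processor can be tried out with the affinity calls that Python exposes on Linux. This sketch assumes a Linux system, where `os.sched_setaffinity` with pid 0 acts on the calling thread; on other platforms the call is absent and the function simply runs the work unpinned. `run_pinned` and `first_allowed_cpu` are hypothetical helper names.

```python
import os
import threading


def run_pinned(cpu, work):
    """Run work() in a thread bound to one CPU; return (affinity, result)."""
    out = {}

    def target():
        pinned = None
        if hasattr(os, "sched_setaffinity"):    # Linux-only API
            try:
                os.sched_setaffinity(0, {cpu})  # 0 refers to the calling thread on Linux
                pinned = os.sched_getaffinity(0)
            except OSError:
                pinned = None                   # not permitted, or CPU not available
        out["pinned"] = pinned
        out["result"] = work()

    t = threading.Thread(target=target)
    t.start()
    t.join()
    return out["pinned"], out["result"]


def first_allowed_cpu():
    """Pick a CPU this process is allowed to run on (0 if the API is missing)."""
    if hasattr(os, "sched_getaffinity"):
        return min(os.sched_getaffinity(0))
    return 0


if __name__ == "__main__":
    print(run_pinned(first_allowed_cpu(), lambda: sum(range(10))))
```

Pinning more threads than there are processors forces several threads to share one CPU, which is the throughput cost the slide mentions.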
-
BDD Performance : CUDD Vs New
BDD Statistics after Reachability Analysis (Static Order)
                          CUDD                                          New
Ckts    P/F  #img#nodes   Mem(MB)  Cache hits  Cache collision  Time    Mem    Cache hits  Cache collision  Time
bpb     F    101.8M       50M      41.0%       90.4%            18.6    61M    41.0%       88.2%            26.3
eight   P    4779K        6.1M     42.9%       26.2%            0.8     7.5M   42.9%       26.2%            1.5
fru32   F    28K          9.2M     34.0%       28.4%            7.9     10.9M  34.0%       28.9%            8.9
idu32   F    136K         6.6M     28.8%       5.0%             4.2     7.8M   28.7%       7.7%             4.5
usbphy  P    190K         6.4M     37.7%       16.6%            0.7     7.8M   37.7%       17.1%            0.7
-
BDD Performance : CUDD Vs New
[Chart: BDD performance; axes: Ckts, Ratio; series: Cudd, New Memory, New Cache Hits, New Cache Collision, New Time]
-
Performance : Non-deterministic Vs Deterministic
-
Performance: Cache or Parallelism
Verification Time in Sec
Ckts  Uniprocessor  Sequential in 8-way SMP  Parallel in 8-way SMP
c1    1570          286                      227
d1    125           13                       13
d2    180           39                       30
d3    295           130                      100
d4    176           60                       38
-
Results on Industrial Circuits
-
Results on public benchmarks
-
Results : Gantt charts
Real execution traces from our multi-threaded reachability program
-
Conclusion and Future Work
Parallelize the Reachability
Multi-threaded Reachability
Better results
Deterministic behavior
Future Work
Improve the parallelism further
Study cache behavior