Page 1
DICE Horizon 2020 Research & Innovation Action, Grant Agreement no. 644869, http://www.dice-h2020.eu
Funded by the Horizon 2020 Framework Programme of the European Union
A Tool for Verification of Big-Data Applications
Jul 21st, 2016
QUDOS 2016, Saarbrücken, Germany
M. M. Bersani, F. Marconi, M. G. Rossi, Politecnico di Milano
Milan, Italy
Madalina Erascu, Institute e-Austria Timisoara & Western University of Timisoara
Timisoara, Romania
Page 2
Our work at a glance
o Approach and tool for the automated verification of topology-based data-intensive applications.
§ Based (so far) on a temporal logic model
§ Performs automated transformation from high-level application description to formal model
§ Enables verification of safety properties
Page 3
Roadmap
§ Context
• Quality assurance in DIA
§ Research Design
• Research question
• Our approach
§ Conclusions
• Contributions
• Future works
Page 4
CONTEXT
Quality Analysis and Verification for data-intensive applications
Page 5
Formal Verification
o Given a Model M and a Property specification P, verification checks whether P holds in M.
o M and P can be expressed in many different ways
§ various kinds of automata (operational models)
§ various kinds of logics (descriptive models)
Page 6
Data-Intensive Applications (DIA)
Page 7
DICE Project
o Horizon 2020 Research & Innovation Action (RIA)
§ Quality-Aware Development for Big Data applications
§ Feb 2015 - Jan 2018, 4M Euros budget
§ 9 partners (Academia & SMEs), 7 EU countries
Page 8
Quality Dimensions in DICE
o Reliability
§ Availability
§ Fault-tolerance
o Efficiency
§ Performance
§ Costs
o Safety & Privacy
§ Verification
§ Data protection
Page 9
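Our positioning in DICE framework (1)
[Figure: overview of the DICE framework. The DICE IDE (Profile, Plugins; Sim, Ver, Opt) works on DPIM, DTSM and DDSM/TOSCA models and the Methodology; further blocks cover Deploy, Config, Test, Mon, Anomaly, Trace, Iter. Enh., Cont. Int. and Fault Inj. for a Data Intensive Application (DIA) running on Big Data Technologies in a Cloud (Priv/Pub). Blocks are labelled with work packages WP1 to WP5 and WP6 - Demonstrators.]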
Page 10
Our positioning in DICE framework (2)
Featuring the DICE H2020 EU Project
[Figure: DICE process views. The DICE DPIM, DTSM, DDSM and TOSCA meta-models are linked by model-to-model and model-to-text transformations and by JSON or YAML file exchange, ending in a TOSCA *.yaml blueprint. Attached tools: Safety Verification with ZOT, Reliability and Resource Management with GreatSPN, Configuration Optimisation with BO4CO, and DMON monitoring for Storm, Spark, Hadoop MR and Oryx 2. Scenarios shown: Tech. Comparison and Deployment.]
Page 11
RESEARCH DESIGN
Quality Analysis and Verification for data-intensive applications
Page 12
Research question
“How can we verify safety properties of a data-intensive application?”
Page 13
State of the art
o Formal verification of distributed systems is a major research area in software engineering
o Few works address formal verification in the context of DIA
§ Main focus on verifying application-independent properties related to specific frameworks
• Reliability and load balancing of MapReduce
• Validity of messaging flow in MapReduce
§ No modeling and verification of application-dependent properties
o Verification tools have been used as verification engines to build formal verification techniques for UML models
§ Few of them deal with real-time constraints.
§ Mainly focused on functional requirements.
Page 14
Our Approach
o Focus on a specific set of technologies
§ Topology-based streaming applications → Apache Storm
o Identify safety issues
o Devise a formal model
§ Having an appropriate level of abstraction
§ Capturing meaningful system behavior and properties
§ Using a formalism that enables automatic verification
o Define a tool-supported mechanism for formal verification
§ Starting from a high-level application description (annotated UML)
Page 15
Apache Storm
o Open Source Distributed Stream Processing System
o Analytics, Log Event processing, etc.
o Reliability, at-least-once semantics
o Wide adoption in production
o Main concepts
§ Streams
§ Topologies
Page 16
Storm Applications
o Applications defined by means of Topologies, graphs of computations composed of:
§ Spouts
• Sources of data streams (tuples)
§ Bolts
• Calculate, Filter, Aggregate, Join, Talk to databases
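The topology structure described on this slide can be sketched as a small directed graph. This is an illustration only, not the Storm API: the `Topology` class, its methods and the component names (`sentences`, `split`, `count`, `store`) are hypothetical.

```python
class Topology:
    """Hypothetical in-memory picture of a topology: spouts feed bolts."""

    def __init__(self):
        self.spouts = set()
        self.bolts = {}  # bolt name -> list of upstream component names

    def add_spout(self, name):
        # A spout is a source of tuples and has no upstream component.
        self.spouts.add(name)

    def add_bolt(self, name, subscribes_to):
        # A bolt may only subscribe to components that already exist.
        for upstream in subscribes_to:
            if upstream not in self.spouts and upstream not in self.bolts:
                raise ValueError(f"unknown upstream component: {upstream}")
        self.bolts[name] = list(subscribes_to)

    def downstream(self, name):
        # Bolts that receive tuples directly from the given component.
        return sorted(b for b, ups in self.bolts.items() if name in ups)


topology = Topology()
topology.add_spout("sentences")
topology.add_bolt("split", ["sentences"])
topology.add_bolt("count", ["split"])
topology.add_bolt("store", ["count"])
```

Here `topology.downstream("sentences")` lists the bolts fed by the spout, which is the kind of structural information a verification model needs.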
Page 17
Safety Issues
o Important requirements for streaming applications
§ Latency
§ Throughput
o Critical points
§ incorrect design of timing constraints
§ node failures
o might cause
§ latency in processing tuples
§ monotonic growth of the size of used memory (queues).
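The queue-growth issue can be illustrated with a toy discrete-time simulation. This is a sketch under simplifying assumptions (constant arrival and processing rates, one queue, one bolt), not the formal model used by the tool; the function name and parameters are hypothetical.

```python
def simulate_queue(arrivals_per_step, processed_per_step, steps):
    """Return the queue size after each step for constant rates."""
    size = 0
    history = []
    for _ in range(steps):
        # The queue cannot hold a negative number of tuples.
        size = max(0, size + arrivals_per_step - processed_per_step)
        history.append(size)
    return history


# Bolt slower than its input: the queue grows at every step.
overloaded = simulate_queue(arrivals_per_step=5, processed_per_step=3, steps=4)

# Bolt faster than its input: the queue stays empty.
balanced = simulate_queue(arrivals_per_step=3, processed_per_step=5, steps=4)
```

When tuples arrive faster than the bolt processes them, the queue size increases monotonically, which is the behavior the safety analysis aims to detect.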
Page 18
DICE Verification Tool
o We want to
§ Verify whether a topology reaches an unwanted configuration
• e.g., where bolts are not able to process incoming tuples on time
§ Let the user specify the topology by means of high-level models (UML)
Page 19
D-VerT - DICE Verification Tool
Page 20
DICE DTSM::Storm UML profile
Page 21
D-VerT - DICE-profiled UML Class Diagram
Page 22
DTSM2Json module
o Relies on the Eclipse UML2 Java library
o “Navigates” the DTSM class diagram and extracts topology structure and information
o Gathers verification options from the Eclipse launch configuration
o Maps topology components to Java objects
o Directly converts Java objects to JSON objects via the gson library
Page 23
Json2MC module
o Python component based on the Jinja2 templating engine
o Generates the Formal Model based on the content of the JSON file and on the selected template (TL or FOL).
[Figure: JSON2MC takes the JSON file plus either the TL Model Template or the FOL Model Template. The TL Model is passed to Zot; the FOL Model is intended for Cubicle and is marked as WIP.]
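The JSON-to-model step can be sketched as template rendering. The slide names Jinja2 as the engine; to stay self-contained this sketch swaps in `string.Template` from the Python standard library. The JSON field names (`avg_emit_rate`, `subs`, `alpha`) and the output syntax are hypothetical and do not reproduce the actual D-VerT exchange format or Zot input language.

```python
import json
from string import Template

# Hypothetical topology description, standing in for the DTSM2Json output.
topology_json = """
{
  "spouts": [{"id": "sentences", "avg_emit_rate": 2.0}],
  "bolts": [{"id": "split", "subs": ["sentences"], "alpha": 1.5}]
}
"""

# Hypothetical templates for one line of the generated model per component.
SPOUT_TEMPLATE = Template("(spout $id :rate $rate)")
BOLT_TEMPLATE = Template("(bolt $id :subscribes ($subs) :alpha $alpha)")


def render_model(description):
    """Fill the templates with the values found in the description."""
    lines = []
    for spout in description["spouts"]:
        lines.append(SPOUT_TEMPLATE.substitute(
            id=spout["id"], rate=spout["avg_emit_rate"]))
    for bolt in description["bolts"]:
        lines.append(BOLT_TEMPLATE.substitute(
            id=bolt["id"], subs=" ".join(bolt["subs"]), alpha=bolt["alpha"]))
    return "\n".join(lines)


model_text = render_model(json.loads(topology_json))
```

Keeping the model text in templates means a different target formalism only needs a different template, which matches the TL/FOL choice on the slide.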
Page 24
Verification Approaches
o Bounded Satisfiability Checking (BSC)
§ Input:
• Temporal logic formula (Model)
• Negated Property over time
§ Outcome:
• SAT → counterexample trace
• UNSAT → Property holds for the considered time bound
§ We use the Zot verification tool (https://github.com/fm-polimi/zot)
o Reachability Checking (WIP)
§ Model defined by a FOL array-based system
• Set of initial states and transitions
• Formula defining undesired states (Negated property)
§ Outcome:
• UNSAFE → Trace showing that undesired states are reachable from initial states
• SAFE → No undesired state can be reached from initial states
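The idea of checking a negated property up to a time bound can be sketched with a tiny explicit-state search. This is not how Zot works internally (Zot encodes the problem for a solver); the function `bounded_check`, the toy queue model and the threshold are assumptions made for illustration.

```python
def bounded_check(initial, successors, is_bad, bound):
    """Look for a trace of at most `bound` steps that ends in a bad state."""
    traces = [[initial]]
    for _ in range(bound + 1):
        next_traces = []
        for trace in traces:
            if is_bad(trace[-1]):
                return "SAT", trace  # counterexample trace found
            for nxt in successors(trace[-1]):
                next_traces.append(trace + [nxt])
        traces = next_traces
    return "UNSAT", None  # property holds within the bound


# Toy model: a queue gains 2 tuples or drains 1 tuple at each step.
def queue_steps(size):
    return [size + 2, max(0, size - 1)]


# Negated property: the queue exceeds 5 tuples.
def too_long(size):
    return size > 5


within_two = bounded_check(0, queue_steps, too_long, bound=2)
within_three = bounded_check(0, queue_steps, too_long, bound=3)
```

With bound 2 no violating trace exists, while bound 3 yields one, which shows why an UNSAT answer only holds for the considered time bound.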
Page 25
D-VerT - Output trace
o When at least one queue grows with an unbounded trend
§ an infinite ultimately periodic model is found
§ Output Parser provides a graphical counterexample trace
Page 27
Contributions
o We enabled automatic verification of topology-based streaming applications by
§ Defining a formal model based on temporal logic
§ Defining automatic mechanisms for translating a high-level description into the formal model
§ Extending the Zot verification tool to support the formalism and carry out BSC on it
Page 28
Preliminary results
o Validation through open source and industrial use cases
§ Meaningful qualitative results in identifying critical points in topology design
§ Execution time strongly depends on the size of the topology and on the configurations of single components
http://dice-project.github.io/DICE-Verification/
Page 29
Ongoing and Future works
o Identification and verification of further properties
§ Privacy and Security
o Tool improvements
o Modeling different technologies (Spark, CEP, Tez)
o Developing the FOL model
o New theoretical results on the correctness and completeness of the formal analysis
Page 30
Questions?
Thank you!