DICE Horizon 2020 Research & Innovation Action, Grant Agreement no. 644869, http://www.dice-h2020.eu
Funded by the Horizon 2020 Framework Programme of the European Union
A Tool for Verification of Big-Data Applications
Jul 21st, 2016
QUDOS 2016, Saarbrücken, Germany
M. M. Bersani, F. Marconi, M. G. Rossi, Politecnico di Milano
Milan, Italy
Madalina Erascu, Institute e-Austria Timisoara & Western University of Timisoara
Timisoara, Romania
Our work at a glance
o Approach and tool for the automated verification of topology-based data-intensive applications.
  § Based (so far) on a temporal logic model
  § Performs automated transformation from a high-level application description to a formal model
  § Enables verification of safety properties
Roadmap
§ Context
  • Quality assurance in DIA
§ Research Design
  • Research question
  • Our approach
§ Conclusions
  • Contributions
  • Future works
Formal Verification
o Given a model M and a property specification P, verification checks whether P holds in M.
o M and P can be expressed in many different ways
  § various kinds of automata (operational models)
  § various kinds of logics (descriptive models)
DICE Project
o Horizon 2020 Research & Innovation Action (RIA)
  § Quality-Aware Development for Big Data applications
  § Feb 2015 - Jan 2018, 4M Euros budget
  § 9 partners (Academia & SMEs), 7 EU countries
Quality Dimensions in DICE
o Reliability
  § Availability
  § Fault-tolerance
o Efficiency
  § Performance
  § Costs
o Safety & Privacy
  § Verification
  § Data protection
Our positioning in DICE framework (1)
[Figure: DICE framework overview. DICE IDE with profile and plugins (Sim, Ver, Opt); DPIM, DTSM and DDSM models; TOSCA; methodology; Deploy, Config, Test; Mon, Anomaly, Trace, Iter. Enh.; Cont. Int., Fault Inj.; the Data Intensive Application (DIA) running on Big Data technologies and private/public cloud; work packages WP1 to WP5, with WP6 for demonstrators.]
Our positioning in DICE framework (2)
[Figure: DICE meta-models and process views (featuring the DICE H2020 EU project). DPIM, DTSM, DDSM and TOSCA meta-models linked by model-to-model and model-to-text transformations (JSON or YAML file exchange, TOSCA *.yaml blueprint); technologies Storm, Spark, Hadoop MR and Oryx 2, each with monitoring (DMON); safety verification with ZOT; reliability and resource management with GreatSPN; configuration optimisation with BO4CO.]
State of the art
o Formal verification of distributed systems is a major research area in software engineering
o Few works try to address formal verification in the context of DIA
  § Main focus on verifying application-independent properties related to specific frameworks
    • Reliability and load balancing of MapReduce
    • Validity of messaging flow in MapReduce
  § No modeling and verification of application-dependent properties
o Verification tools have been used as verification engines to build formal verification techniques for UML models
  § Few of them deal with real-time constraints.
  § Mainly focused on functional requirements.
Our Approach
o Focus on a specific set of technologies
  § Topology-based streaming applications → Apache Storm
o Identify safety issues
o Devise a formal model
  § Having an appropriate level of abstraction
  § Able to capture meaningful system behavior and properties
  § Using a formalism that enables automatic verification
o Define a tool-supported mechanism for formal verification
  § Starting from a high-level application description (annotated UML)
Apache Storm
o Open Source Distributed Stream Processing System
o Analytics, log event processing, etc.
o Reliability, at-least-once semantics
o Wide adoption in production
o Main concepts
  § Streams
  § Topologies
Storm Applications
o Applications defined by means of Topologies, graphs of computations composed of:
  § Spouts
    • Sources of data streams (tuples)
  § Bolts
    • Calculate, Filter, Aggregate, Join, Talk to databases
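
To make the "graph of computations" idea concrete, here is a minimal Python sketch of a topology as a directed graph of spouts and bolts. It is not the Storm API: the `Topology` class, its methods and the word-count component names are all hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Topology:
    """A topology as a directed graph of spouts (sources) and bolts (processors)."""
    spouts: set = field(default_factory=set)
    bolts: set = field(default_factory=set)
    # edges[node] lists the bolts subscribed to the stream emitted by node
    edges: dict = field(default_factory=dict)

    def add_spout(self, name):
        self.spouts.add(name)
        self.edges.setdefault(name, [])

    def add_bolt(self, name, subscribes_to):
        # Sources must already be declared, otherwise a KeyError is raised.
        self.bolts.add(name)
        self.edges.setdefault(name, [])
        for source in subscribes_to:
            self.edges[source].append(name)

    def downstream(self, name):
        """Return every bolt reachable from the given node (depth-first)."""
        seen, stack = [], list(self.edges[name])
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.append(node)
                stack.extend(self.edges[node])
        return seen


# Hypothetical word-count style topology: one spout feeding a chain of bolts.
topology = Topology()
topology.add_spout("sentences")
topology.add_bolt("split", subscribes_to=["sentences"])
topology.add_bolt("count", subscribes_to=["split"])
topology.add_bolt("store", subscribes_to=["count"])
```

Calling `topology.downstream("sentences")` lists the bolts that tuples from the spout can eventually reach, which is the structural information a verification model of the topology needs.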
Safety Issues
o Important requirements for streaming applications
  § Latency
  § Throughput
o Critical points
  § incorrect design of timing constraints
  § node failures
o might cause
  § latency in processing tuples
  § monotonic growth of the size of used memory (queues).
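
The queue-growth issue can be illustrated with a deliberately simple simulation. This sketch assumes fixed arrival and service rates per time step, which is far coarser than the timing constraints of a real topology; the function name and the rates are hypothetical.

```python
def queue_sizes(arrival_rate, service_rate, steps):
    """Queue length after each time step, for fixed per-step rates (tuples per step)."""
    size, history = 0, []
    for _ in range(steps):
        # The queue cannot become negative: a bolt cannot process tuples it has not received.
        size = max(0, size + arrival_rate - service_rate)
        history.append(size)
    return history


# A bolt that receives tuples faster than it processes them: the queue keeps growing.
overloaded = queue_sizes(arrival_rate=5, service_rate=3, steps=4)

# A bolt that keeps up with its input: the queue stays empty.
healthy = queue_sizes(arrival_rate=3, service_rate=5, steps=4)
```

In the overloaded case the queue grows by two tuples at every step, which is the kind of monotonic growth the verification is meant to detect at design time.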
DICE Verification Tool
o We want to
  § Verify whether a topology reaches an unwanted configuration
    • e.g., where bolts are not able to process incoming tuples on time
  § Let the user specify the topology by means of high-level models (UML)
DTSM2Json module
o Relies on the Eclipse UML2 Java library
o "Navigates" the DTSM class diagram and extracts topology structure and information
o Gathers verification options from the Eclipse launch configuration
o Maps topology components to Java objects
o Directly converts Java objects to JSON objects via the gson library
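
The module itself is written in Java and uses gson. The following Python analogue only illustrates the "objects to JSON" step. The field names (`parallelism`, `subs`, `verification_options`, `time_bound`) are assumptions made for this sketch and are not the tool's actual JSON schema.

```python
import json
from dataclasses import dataclass, asdict, field


@dataclass
class Spout:
    id: str
    parallelism: int  # hypothetical attribute


@dataclass
class Bolt:
    id: str
    parallelism: int  # hypothetical attribute
    subs: list = field(default_factory=list)  # ids of the components this bolt subscribes to


def topology_to_json(spouts, bolts, options):
    """Serialize topology components and verification options into one JSON document."""
    document = {
        "topology": {
            "spouts": [asdict(s) for s in spouts],
            "bolts": [asdict(b) for b in bolts],
        },
        "verification_options": options,
    }
    return json.dumps(document, indent=2, sort_keys=True)


text = topology_to_json(
    [Spout("sentences", 2)],
    [Bolt("split", 4, ["sentences"])],
    {"time_bound": 15},
)
```

As with gson in the original module, the serialization is driven directly by the object fields, so no hand-written mapping code is needed per component type.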
Json2MC module
o Python component based on the Jinja2 templating engine
o Generates the formal model based on the content of the JSON file and on the selected template (TL or FOL).
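
The template-driven generation step can be sketched as follows. To stay within the standard library, this example uses `string.Template` as a stand-in for Jinja2. The template text, the JSON keys and the generated lines are invented for illustration and are not Zot input syntax.

```python
import json
from string import Template

# Hypothetical template; the real templates produce a TL or FOL formal model.
TL_TEMPLATE = Template("; model for $name\n; time bound: $bound\n$components")


def render_model(json_text, template=TL_TEMPLATE):
    """Fill the selected template with values read from the JSON description."""
    data = json.loads(json_text)
    lines = [
        f"(bolt {bolt['id']} :parallelism {bolt['parallelism']})"
        for bolt in data["bolts"]
    ]
    return template.substitute(
        name=data["name"],
        bound=data["time_bound"],
        components="\n".join(lines),
    )


example = json.dumps(
    {"name": "wordcount", "time_bound": 15, "bolts": [{"id": "split", "parallelism": 4}]}
)
model_text = render_model(example)
```

Keeping the formalism in the template and the application data in the JSON file is what allows the same JSON input to be rendered with either template.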
[Figure: JSON2MC pipeline. JSON2MC combines the JSON file with a TL model template to produce the TL model for Zot, or with a FOL model template to produce the FOL model for Cubicle (WIP).]
Verification Approaches
o Bounded Satisfiability Checking (BSC)
  § Input:
    • Temporal logic formula (Model)
    • Negated property over time
  § Outcome:
    • SAT → counterexample trace
    • UNSAT → property holds for the considered time bound
  § We use the Zot verification tool (https://github.com/fm-polimi/zot)
o Reachability Checking (WIP)
  § Model defined by a FOL array-based system
    • Set of initial states and transitions
    • Formula defining undesired states (negated property)
  § Outcome:
    • UNSAFE → trace showing that undesired states are reachable from initial states
    • SAFE → no undesired state can be reached from initial states
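
The SAT/UNSAT reading of a bounded check can be illustrated with a toy explicit-state search. This is not how Zot works (Zot encodes the temporal logic formula for a solver). The sketch only shows the idea of looking for a trace, within a time bound, that satisfies the negated property. The transition relation and the threshold are hypothetical.

```python
def bounded_check(initial, successors, violates, bound):
    """Search all traces with at most `bound` transitions for a violating state.

    Returns a counterexample trace (the SAT case), or None (the UNSAT case:
    the property holds within the bound, and nothing is claimed beyond it).
    """
    frontier = [[initial]]
    for _ in range(bound + 1):
        next_frontier = []
        for trace in frontier:
            if violates(trace[-1]):
                return trace
            for state in successors(trace[-1]):
                next_frontier.append(trace + [state])
        frontier = next_frontier
    return None


def queue_successors(size):
    # Hypothetical model: in one step the queue either gains 2 tuples or loses 1.
    return [size + 2, max(0, size - 1)]


def queue_too_long(size):
    # Negated property: the queue reaches a length of 6 or more.
    return size >= 6


within_two_steps = bounded_check(0, queue_successors, queue_too_long, bound=2)
within_three_steps = bounded_check(0, queue_successors, queue_too_long, bound=3)
```

With a bound of 2 no violating state is found, so the property holds only for that bound. Raising the bound to 3 yields a counterexample trace, which shows why an UNSAT outcome must always be read relative to the chosen time bound.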
D-VerT - Output trace
o When at least one queue grows with an unbounded trend
  § an infinite ultimately periodic model is found
  § Output Parser provides a graphical counterexample trace
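
An ultimately periodic trace consists of a finite prefix followed by a loop that repeats forever. The sketch below assumes a representation in which each position records the change in size of every queue. This representation and the function names are hypothetical and do not reflect D-VerT's actual output format.

```python
from itertools import chain, cycle, islice


def unroll(prefix, loop, steps):
    """First `steps` positions of the infinite trace prefix + loop + loop + ..."""
    return list(islice(chain(prefix, cycle(loop)), steps))


def growing_queues(loop):
    """Names of queues whose net size change over one loop iteration is positive."""
    totals = {}
    for step in loop:
        for queue, delta in step.items():
            totals[queue] = totals.get(queue, 0) + delta
    return sorted(queue for queue, total in totals.items() if total > 0)


# Hypothetical counterexample: per-step queue size changes for two bolts.
prefix = [{"split": 1, "count": 0}]
loop = [{"split": 2, "count": 1}, {"split": -1, "count": -1}]

unbounded = growing_queues(loop)
first_steps = unroll(prefix, loop, 4)
```

Because the loop repeats forever, a positive net change over a single iteration is enough to conclude that the queue grows without bound. Here that is the case for `split` but not for `count`.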
Contributions
o We enabled automatic verification of topology-based streaming applications by
  § defining a formal model based on temporal logic
  § defining automatic mechanisms for translating a high-level description into the formal model
  § extending the Zot verification tool to support the formalism and carry out BSC on it
Preliminary results
o Validation through open source and industrial use cases
  § Meaningful qualitative results in identifying critical points in topology design
  § Execution time strongly depends on the size of the topology and on the configurations of single components
http://dice-project.github.io/DICE-Verification/
Ongoing and Future works
o Identification and verification of further properties
  § Privacy and Security
o Tool improvements
o Modeling different technologies (Spark, CEP, Tez)
o Developing the FOL model
o New theoretical results on the correctness and completeness of the formal analysis