From Model to Event Data Analysis Marco Montali Free University of Bozen-Bolzano ATAED 2016 Marrying Data and Processes

ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Jan 09, 2017


Marco Montali
Transcript
Page 1: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

From Model to Event Data Analysis

Marco Montali Free University of Bozen-Bolzano

ATAED 2016

Marrying Data and Processes

Page 2: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Our Starting Point

Marrying processes and data is a must if we want to really understand

how complex dynamic systems operate

Dynamic systems of interest: • business processes• multiagent systems • distributed systems

2

Page 3: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Our Thesis

Knowledge representation and computational logics are a Swiss Army knife to

• understand data-aware dynamic systems, and
• provide automated reasoning and verification capabilities along their entire lifecycle

3

Page 4: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Business Process Lifecycle

4

picture by Wil van der Aalst

Page 5: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Formal Verification

Automated analysis of a formal model of the system against a property of interest,

considering all possible system behaviors

5

picture by Wil van der Aalst

Page 6: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Process Mining

Extraction of valuable, process-related information

from event logs, i.e., the footprint of reality

6

picture by Wil van der Aalst

Page 7: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

7

Page 8: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Loneliness

Page 9: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Data/Process Fragmentation

• A business process consists of a set of activities that are performed in coordination in an organizational and technical environment [Weske, 2007]
• Activities change the real world
• The corresponding updates are reflected in the organizational information system(s)
• Data trigger decision-making, which in turn determines the next steps to be taken in the process
• Survey by Forrester [Karel et al, 2009]: lack of interaction between data and process experts

9

Page 10: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Experts Dichotomy

• BPM professionals: data are subsidiary to processes
• Master data managers: data are the main driver for the company’s existence
• Forrester: in 83 out of 100 companies, no interaction at all between these two groups
• This isolation propagates to languages and tools, which never properly account for the process-data connection

10

Page 11: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

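Conventional Data Modeling

Focus: relevant entities, relations, static constraints

[Figure: data model of the purchase-order scenario, with areas Supplier, Manufacturing, Procurement/Supplier and Sales, and entities Customer PO, Line Item, Work Order, Material PO and Material; a "spawns" association with multiplicity 0..1 is visible.]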

But… how do data evolve? Where can we find the “state” of a purchase order?

11

Page 12: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Conventional Process Modeling

Focus: control-flow of activities in response to events

But… how do activities update data? What is the impact of canceling an order?

12

Page 13: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

A Deployed Process

13

Page 14: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Do you like Spaghetti?

[Figure: five separate silos (Decompose Customer PO, Manage Material POs, Assemble, Ship, Manage Cancelation), each with its own Activities, Process and Data layers, on top of the data stores Customers, Suppliers & Catalogues, Customer POs, Work Orders and Material POs.]

IT integration: difficult to manage, understand, evolve

14

Page 15: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Too Late…

• Where are the data?
• Where shall we model relevant business rules?

15

Too late to reconstruct the missing pieces

Where is our data?
• Part is in the DBs.
• Part is hidden in the process execution engine.

Where are the relevant business rules, and how are they modeled?
• At the DB level? Which DB? How to import the process data?
• (Also) in the business model? How to import data from the DBs?

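[Figure: the Data side (purchase-order data model with Supplier, Manufacturing, Procurement/Supplier, Sales, Customer PO, Line Item, Work Order, Material PO and Material, including the "spawns" association) next to the Process side (Process Engine and Process State, with the tasks "Determine cancelation penalty" and "Notify penalty").]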

Business rules:

For each work order W
    For each material PO M in W
        if M has been shipped
            add returnCost(M) to penalty

Diego Calvanese (FUB) Foundations of Data-Aware Process Analysis INRIA Saclay Paris – 18/3/2016 (10/1)
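The business rule above can be made concrete with a small Python sketch. The class and field names (`WorkOrder`, `MaterialPO`, `shipped`, `return_cost`) are hypothetical and only mirror the wording of the rule; the slides do not prescribe any particular schema. The point is that the rule needs both process information (has the material PO been shipped?) and data (the return cost), which is exactly what gets split between the process engine and the DBs.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MaterialPO:
    # Hypothetical fields mirroring the rule: shipping status and return cost.
    po_id: str
    shipped: bool
    return_cost: float


@dataclass
class WorkOrder:
    order_id: str
    material_pos: List[MaterialPO] = field(default_factory=list)


def cancellation_penalty(work_orders: List[WorkOrder]) -> float:
    """Sum the return cost of every material PO that has already been shipped."""
    penalty = 0.0
    for work_order in work_orders:
        for material_po in work_order.material_pos:
            if material_po.shipped:
                penalty += material_po.return_cost
    return penalty


# Illustrative data only.
orders = [
    WorkOrder("W1", [MaterialPO("M1", True, 10.0), MaterialPO("M2", False, 5.0)]),
    WorkOrder("W2", [MaterialPO("M3", True, 2.5)]),
]
penalty = cancellation_penalty(orders)
```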

Page 16: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

How is Research Reacting?

A recent review…

“Verification typically takes place at the design stage of a business process type. However, at this stage, required knowledge about data (database schema, integrity constraints) is typically not yet available.”

16

Page 17: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

…But There is Hope!

• [Meyer et al, 2011]: data-process integration crucial to assess the value of processes and evaluate KPIs
• [Dumas, 2011]: data-process integration crucial to aggregate all relevant information, and to suitably inject business rules into the system
• [Reichert, 2012]: “Process and data are just two sides of the same coin”

17

Page 18: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Formal Verification The Conventional, Propositional Case

Process control-flow

(Un)desired property

18

Page 19: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Formal Verification: The Conventional, Propositional Case

Process control-flow: finite-state transition system
(Un)desired property: propositional temporal formula Φ
Verification question: transition system ⊨ Φ

19

Page 20: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Formal Verification: The Conventional, Propositional Case

Process control-flow: finite-state transition system
(Un)desired property: propositional temporal formula Φ
Verification question: transition system ⊨ Φ

Verification via model checking (2007 Turing award: Clarke, Emerson, Sifakis)

20

Page 21: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Marriage

Page 22: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

(Un)desired property

Formal Verification The Data-Aware Case

22

Process+Data

Page 23: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Formal Verification: The Data-Aware Case

Process+Data: infinite-state, relational transition system [Vardi 2005]
(Un)desired property: first-order temporal formula Φ
Verification question: transition system ⊨ Φ

23

Page 24: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Formal Verification: The Data-Aware Case

Process+Data: infinite-state, relational transition system [Vardi 2005]
(Un)desired property: first-order temporal formula Φ
Verification question: transition system ⊨ Φ ?

24

Page 25: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Why FO Temporal Logics

• To inspect data: FO queries
• To capture system dynamics: temporal modalities
• To track the evolution of objects: FO quantification across states
• Example: It is always the case that every order is eventually either cancelled or paid

25

Page 26: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Why FO Temporal Logics

• To inspect data: FO queries
• To capture system dynamics: temporal modalities
• To track the evolution of objects: FO quantification across states
• Example: It is always the case that every order is eventually either cancelled or paid

26

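$$\mathbf{G}\Big(\forall x.\ \mathit{Order}(x) \rightarrow \mathbf{F}\big(\mathit{State}(x, \mathit{cancelled}) \lor \mathit{State}(x, \mathit{paid})\big)\Big)$$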

Page 27: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Problem Dimensions

Data component: Relational DB; Description logic KB; OBDA system; Inconsistency-tolerant KB; …
Process component: condition-action rules; BPMN; Golog program; Petri nets; …
Task modeling: Conditional effects; Add/delete assertions; Programs; User forms; …
External inputs: None; External services; Input DB; Fixed input; …
Network topology: Single orchestrator; Full mesh; Connected, fixed graph; Ring; …
Interaction mechanism: None; Synchronous; Asynchronous and ordered; Asynchronous lossy; …

27

Page 28: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Colored Petri Nets

28

Page 29: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Colored Petri Nets

• Where is the data model?
• Formal analysis only with a-priori propositionalization

29

Page 30: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Tools (e.g. BizAgi)

30

[Figure: BizAgi process model with tasks "Review Request", "Fill Reimbursement" and "Review Reimbursement", and outcomes "Rejected" and "Accepted".]

• Which formal semantics?
• Analysable?

Page 31: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

[Figure: task A with outgoing branches labeled "true" and "false".]

BPMS and Data

Correct?

• BizAgi…

• YAWL…

31

Page 32: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

[Figure: task A with outgoing branches labeled "true" and "false".]

BPMS and Data

Correct?

• BizAgi… not sure…

• YAWL… YES!

32

Page 33: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

RAW-SYS

• Integrated data+process modeling
  • Standard relational model for capturing data
  • Standard workflow nets (or other types of Petri nets) for capturing processes
• Net transitions interplay with data
  • Conditionally enabled by FO queries over the data
  • Described in terms of full-fledged CRUD operations over the data
• Bridge between theory and practice
  • Mimics how BPMS actually work
  • Has unambiguous execution semantics

33

Page 34: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

RAW-SYS

34

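[Figure: RAW-SYS architecture. Each case runs a net of tasks, and every task works on data local to the case; cases are created ("new case"), closed ("close case") and archived ("archive"), with a shared store and an archive visible to all cases.]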

Page 35: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Example: User Cart

35

Shared DB: Customer(Id, …); Product(name, …) (read-only)
Local DB: InCart(BarCode, Product); Owner(CustId)

Net transitions: create case; close case

Page 36: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Example: User Cart

36

Shared DB: Customer(Id, …); Product(name, …) (read-only)
Local DB: InCart(BarCode, Product); Owner(CustId)

Net transitions:
• create case: guard Customer(x,…); update ADD Owner(x)
• close case

Page 37: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Example: User Cart

37

Shared DB: Customer(Id, …); Product(name, …) (read-only)
Local DB: InCart(BarCode, Product); Owner(CustId)

Net transitions:
• create case: guard Customer(x,…); update ADD Owner(x)
• open cart…
• close case

Page 38: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Example: User Cart

38

Shared DB: Customer(Id, …); Product(name, …) (read-only)
Local DB: InCart(BarCode, Product); Owner(CustId)

Net transitions:
• create case: guard Customer(x,…); update ADD Owner(x)
• open cart
• insert item(p): guard Product(p,…); update ADD InCart(getBC(),p)
• close case

Page 39: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Example: User Cart

39

Shared DB: Customer(Id, …); Product(name, …) (read-only)
Local DB: InCart(BarCode, Product); Owner(CustId)

Net transitions:
• create case: guard Customer(x,…); update ADD Owner(x)
• open cart
• insert item(p): guard Product(p,…); update ADD InCart(getBC(),p)
• empty cart: guard Exist x,p. InCart(x,p); update Forall x,p. InCart(x,p) -> DEL InCart(x,p)
• close case

Page 40: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Example: User Cart

40

Shared DB: Customer(Id, …); Product(name, …) (read-only)
Local DB: InCart(BarCode, Product); Owner(CustId)

Net transitions:
• create case: guard Customer(x,…); update ADD Owner(x)
• open cart…
• insert item(p): guard Product(p,…); update ADD InCart(getBC(),p)
• empty cart: guard Exist x,p. InCart(x,p); update Forall x,p. InCart(x,p) -> DEL InCart(x,p)
• close cart
• close case
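The interplay between net transitions and data in the cart example can be sketched in Python as follows. This is only an illustration under simplifying assumptions: the net marking is left out, relations are plain Python sets, and `get_bc` stands in for the external input `getBC()` by handing out fresh bar codes from a counter. None of these names come from an actual RAW-SYS implementation.

```python
import itertools


class CartCase:
    """One running case with its local DB (Owner, InCart) over a shared DB."""

    def __init__(self, shared_db, customer_id):
        # create case: guard Customer(x, ...), update ADD Owner(x)
        if customer_id not in shared_db["Customer"]:
            raise ValueError("guard Customer(x, ...) does not hold")
        self.shared = shared_db
        self.owner = {customer_id}
        self.in_cart = set()
        self._barcodes = itertools.count(1)

    def get_bc(self):
        # Stand-in for the external input getBC(): returns a fresh bar code.
        return f"bc{next(self._barcodes)}"

    def insert_item(self, product):
        # insert item(p): guard Product(p, ...), update ADD InCart(getBC(), p)
        if product not in self.shared["Product"]:
            return False
        self.in_cart.add((self.get_bc(), product))
        return True

    def empty_cart(self):
        # empty cart: guard Exist x,p. InCart(x,p); update deletes every InCart tuple
        if not self.in_cart:
            return False
        self.in_cart.clear()
        return True


shared = {"Customer": {"c1"}, "Product": {"pen", "ink"}}
case = CartCase(shared, "c1")
```

Each method returns `False` when its guard fails, which corresponds to the transition not being enabled in the current state.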

Page 41: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Execution Semantics

Relational transition system. Each state is labeled by:
• Instance of the shared DB
• Case IDs of running cases, together with corresponding
  • Instances of local DBs
  • Markings of their nets

Successors constructed considering all possible ground executable actions and all possible input configurations (s.t. the resulting state satisfies the schema constraints) → infinite-state transition system

Page 42: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

The Good…

RAW-SYS are:

• Markovian: Next state only depends on the current state + input. Two states with identical DBs are bisimilar.
• Generic: FO/SQL (like all query languages) does not distinguish structures that are identical modulo uniform renaming of data objects.

→ Two isomorphic states are bisimilar

42
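Genericity can be illustrated with a brute-force isomorphism test between two small database instances. The sketch below is hypothetical and exponential in the number of values, so it is only meant to make "identical modulo uniform renaming" concrete; it represents an instance as a dictionary from relation names to sets of tuples.

```python
from itertools import permutations


def active_domain(db):
    """All data values occurring in some tuple of the instance."""
    return sorted({value for tuples in db.values() for t in tuples for value in t})


def rename(db, mapping):
    """Apply a renaming of data values uniformly to every tuple."""
    return {
        relation: {tuple(mapping[value] for value in t) for t in tuples}
        for relation, tuples in db.items()
    }


def isomorphic(db1, db2):
    """True if some bijection between the active domains turns db1 into db2."""
    dom1, dom2 = active_domain(db1), active_domain(db2)
    if len(dom1) != len(dom2):
        return False
    return any(
        rename(db1, dict(zip(dom1, candidate))) == db2
        for candidate in permutations(dom2)
    )


state_a = {"InCart": {("b1", "pen")}, "Owner": {("c1",)}}
state_b = {"InCart": {("x", "y")}, "Owner": {("z",)}}
state_c = {"InCart": {("x", "x")}, "Owner": {("z",)}}
```

Here `state_a` and `state_b` differ only in the names of the values, so no FO query can tell them apart, whereas `state_c` repeats a value and therefore has a different structure.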

Page 43: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

… and the Bad

Reachability undecidable even with a single safe net

• Counter → “size” of a unary relation
• Test counter for zero: check whether counter relation is empty
• What matters is the # of tuples, not the actual values
• Can be reconstructed also without negation in the queries

43

[Figure: net fragment with labels New, Increment and Decrement]
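The counter encoding behind this undecidability argument can be sketched in a few lines of Python. The class below is a hypothetical illustration: the counter value is the number of tuples in a unary relation, incrementing inserts a fresh value, decrementing deletes an arbitrary tuple, and the zero test checks emptiness. Two such counters driven by a finite control give a two-counter machine, which is the standard route to undecidability of reachability.

```python
import itertools


class RelationCounter:
    """A counter whose value is the size of a unary relation."""

    def __init__(self):
        self.relation = set()
        self._fresh = itertools.count()  # source of fresh, never-reused values

    def increment(self):
        # Insert a fresh value: only the number of tuples matters.
        self.relation.add(next(self._fresh))

    def decrement(self):
        if self.is_zero():
            raise ValueError("cannot decrement an empty counter")
        # Delete an arbitrary tuple: which value is removed is irrelevant.
        self.relation.pop()

    def is_zero(self):
        # Zero test: is the counter relation empty?
        return not self.relation

    def value(self):
        return len(self.relation)


counter = RelationCounter()
counter.increment()
counter.increment()
counter.decrement()
```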

Page 44: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

State-Boundedness [PODS 2013]

Put a pre-defined bound on the DB size (not the size of the data domain!)

• Resulting transition system: still infinite-state
• But: infinitely many encountered values along a run cannot be “accumulated” in a single state

44
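The difference between bounding the DB size and bounding the data domain can be seen in a toy run, sketched below in Python under purely illustrative assumptions: every step inserts a fresh value and, if the bound would be exceeded, first deletes the oldest one. The state never holds more than `bound` values, yet the run as a whole encounters as many distinct values as it has steps.

```python
import itertools


def bounded_run(bound, steps):
    """Simulate a run that keeps at most `bound` values per state.

    Returns the largest state size observed and the number of distinct
    values encountered along the whole run.
    """
    fresh = itertools.count()
    db = []          # current state, oldest value first
    seen = set()     # every value encountered along the run
    max_size = 0
    for _ in range(steps):
        if len(db) == bound:
            db.pop(0)  # forget the oldest value to stay within the bound
        value = next(fresh)
        db.append(value)
        seen.add(value)
        max_size = max(max_size, len(db))
    return max_size, len(seen)


max_state_size, distinct_values = bounded_run(bound=3, steps=100)
```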

Page 45: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

RAW-SYS, Boundedness, and Reachability

Reachability undecidable as soon as one of the following conditions holds:

• Shared DB with unbounded size

• Local DB with unbounded size

• Unboundedly many simultaneously running cases

What happens if all these three sources are “bounded in size”?

45

Page 46: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Magic!

46

Infinite-state transition system ⊨ first-order temporal formula (FO-CTL or FO-LTL with persistent quantification)

Page 47: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Magic!

47

Infinite-state transition system ⊨ first-order temporal formula (FO-CTL or FO-LTL with persistent quantification)

Finite-state abstraction ⊨ propositional temporal formula

Page 48: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Magic!

48

Infinite-state transition system ⊨ first-order temporal formula (FO-CTL or FO-LTL with persistent quantification)

if and only if

Finite-state abstraction ⊨ propositional temporal formula

Page 49: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Towards Implementations

• [IJCAI 2015] Planning can be lifted to deal with this infinite-state setting
  • Ongoing implementation effort using DLVk and state-of-the-art ADL planners
• [SEBD 2015, AMW 2015] Ongoing effort for implementing model checking techniques based on our abstraction natively in relational technology
• Goal: combine the best of databases and formal methods

49

Page 50: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Process Mining

50 picture by Wil van der Aalst

Page 51: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Process Mining

51 picture by Wil van der Aalst

Page 52: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Expected Reality

52

[Figure: a log made of traces, each made of events]

Page 53: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Expected Reality

XES standard for event logs

53

<log xes.version="1.0" xes.features="nested-attributes">
  <trace>
    <string key="concept:name" value="1"/>
    <event>
      <string key="concept:name" value="register request"/>
      <date key="time:timestamp" value="2010-12-30T11:02:00.000+01:00"/>
    </event>
  </trace>
</log>
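As a minimal sketch of how such a log can be consumed, the following Python code reads a cleaned-up copy of the snippet above with the standard library XML parser and groups events by trace. It assumes the simplified, namespace-free form shown on the slide; real XES files usually declare an XML namespace and extensions, which this sketch does not handle.

```python
import xml.etree.ElementTree as ET

XES_SNIPPET = """<log xes.version="1.0" xes.features="nested-attributes">
  <trace>
    <string key="concept:name" value="1"/>
    <event>
      <string key="concept:name" value="register request"/>
      <date key="time:timestamp" value="2010-12-30T11:02:00.000+01:00"/>
    </event>
  </trace>
</log>"""


def read_traces(xes_text):
    """Return {trace name: [event attribute dicts]} for a namespace-free XES log."""
    root = ET.fromstring(xes_text)
    traces = {}
    for trace in root.findall("trace"):
        # The trace name is the trace-level attribute with key concept:name.
        case_id = next(
            child.get("value")
            for child in trace
            if child.tag == "string" and child.get("key") == "concept:name"
        )
        events = []
        for event in trace.findall("event"):
            # Flatten each event into a key -> value dictionary.
            events.append({child.get("key"): child.get("value") for child in event})
        traces[case_id] = events
    return traces


traces = read_traces(XES_SNIPPET)
```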

Page 54: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Actual Reality

54

Page 55: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Actual Reality

55

PaperInfo

Page 56: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Understanding Reality…

56

[Figure: UML class diagram of the conference submission domain.
Classes: Conference (creationtime: DateTime, confname: String); User (creationtime: DateTime, username: String); Paper (creationtime: DateTime, title: String); ReviewRequest (invitationtime: DateTime); Review (submissiontime: DateTime); Decision (decisiontime: DateTime, outcome: Bool); UploadSubmitted (uploadtime: DateTime); UploadAccepted (uploadtime: DateTime); AcceptedPaper <<notime>>.
Associations: submittedto, organizerof, reviewer, PhasD, RhasR, correspondsto, UhasP, AhasU, for, author, by, USuploadbyU, creator, UAuploadbyU.]

Page 57: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

From here…

Impedance Mismatch

57

[Figure: UML class diagram of the conference submission domain.
Classes: Conference (creationtime: DateTime, confname: String); User (creationtime: DateTime, username: String); Paper (creationtime: DateTime, title: String); ReviewRequest (invitationtime: DateTime); Review (submissiontime: DateTime); Decision (decisiontime: DateTime, outcome: Bool); UploadSubmitted (uploadtime: DateTime); UploadAccepted (uploadtime: DateTime); AcceptedPaper <<notime>>.
Associations: submittedto, organizerof, reviewer, PhasD, RhasR, correspondsto, UhasP, AhasU, for, author, by, USuploadbyU, creator, UAuploadbyU.]

Page 58: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

…to there!

Impedance Mismatch

58

Page 59: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Key Issues

• How to resolve the “impedance mismatch”?

• How to get a “view” of the data tailored to process mining?

59

PaperInfo

[Figure: UML class diagram of the conference submission domain.
Classes: Conference (creationtime: DateTime, confname: String); User (creationtime: DateTime, username: String); Paper (creationtime: DateTime, title: String); ReviewRequest (invitationtime: DateTime); Review (submissiontime: DateTime); Decision (decisiontime: DateTime, outcome: Bool); UploadSubmitted (uploadtime: DateTime); UploadAccepted (uploadtime: DateTime); AcceptedPaper <<notime>>.
Associations: submittedto, organizerof, reviewer, PhasD, RhasR, correspondsto, UhasP, AhasU, for, author, by, USuploadbyU, creator, UAuploadbyU.]

Page 60: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Impedance Mismatch is Really an Issue

Crompton (2008): domain experts lose too much time digging into data and turning them into knowledge

• Engineers in the oil/gas industry: 30-70% of their working time is spent on data searching and data quality

60

Page 61: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Optique

Scalable, End-User Access to Big Data

• http://optique-project.eu

• Goal: engineer techniques that enable end-users to access data through domain ontologies

• Case studies: Statoil, Siemens

61

Page 62: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Facts on Statoil

• 1000 TB of data inside relational DBMSs
• Schemas not aligned
• More than 2000 tables, in a plethora of different DBs
• 900 experts part of “Statoil Exploration”
• Up to 4 days to formulate queries and encode them in SQL

62

Page 63: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Query Example

63


How much time/money is spent searching for data?

A user query at Statoil

Show all Norwegian wellbores with some additional attributes (wellbore id, completion date, oldest penetrated age, result). Limit to all wellbores with a core and show attributes like (wellbore id, core number, top core depth, base core depth, intersecting stratigraphy). Limit to all wellbores with core in Brentgruppen and show key attributes in a table. After connecting to EPDS (slegge) we could for instance limit further to cores in Brent with measured permeability and where it is larger than a given value, for instance 1 mD. We could also find out whether there are cores in Brent which are not stored in EPDS (based on NPD info) and where there could be permeability values. Some of the missing data we possibly own, other not.

At Statoil, it takes up to 4 days to formulate a query in SQL.

Statoil loses up to 50.000.000 € per year because of this!!

Diego Calvanese (FUB) Ontologies for Data Integration FOfAI 2015, Buenos Aires – 27/7/2015 (5/52)

Page 64: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

64


SELECT [...]
FROM
  db_name.table1 table1,
  db_name.table2 table2a, db_name.table2 table2b,
  db_name.table3 table3a, db_name.table3 table3b, db_name.table3 table3c, db_name.table3 table3d,
  db_name.table4 table4a, db_name.table4 table4b, db_name.table4 table4c,
  db_name.table4 table4d, db_name.table4 table4e, db_name.table4 table4f,
  db_name.table5 table5a, db_name.table5 table5b,
  db_name.table6 table6a, db_name.table6 table6b,
  db_name.table7 table7a, db_name.table7 table7b,
  db_name.table8 table8,
  db_name.table9 table9,
  db_name.table10 table10a, db_name.table10 table10b, db_name.table10 table10c,
  db_name.table11 table11,
  db_name.table12 table12,
  db_name.table13 table13,
  db_name.table14 table14,
  db_name.table15 table15,
  db_name.table16 table16
WHERE [...]
  table2a.attr1='keyword' AND
  table3a.attr2=table10c.attr1 AND
  table3a.attr6=table6a.attr3 AND
  table3a.attr9='keyword' AND
  table4a.attr10 IN ('keyword') AND
  table4a.attr1 IN ('keyword') AND
  table5a.kinds=table4a.attr13 AND
  table5b.kinds=table4c.attr74 AND
  table5b.name='keyword' AND
  (table6a.attr19=table10c.attr17 OR (table6a.attr2 IS NULL AND table10c.attr4 IS NULL)) AND
  table6a.attr14=table5b.attr14 AND
  table6a.attr2='keyword' AND
  (table6b.attr14=table10c.attr8 OR (table6b.attr4 IS NULL AND table10c.attr7 IS NULL)) AND
  table6b.attr19=table5a.attr55 AND
  table6b.attr2='keyword' AND
  table7a.attr19=table2b.attr19 AND
  table7a.attr17=table15.attr19 AND
  table4b.attr11='keyword' AND
  table8.attr19=table7a.attr80 AND
  table8.attr19=table13.attr20 AND
  table8.attr4='keyword' AND
  table9.attr10=table16.attr11 AND
  table3b.attr19=table10c.attr18 AND
  table3b.attr22=table12.attr63 AND
  table3b.attr66='keyword' AND
  table10a.attr54=table7a.attr8 AND
  table10a.attr70=table10c.attr10 AND
  table10a.attr16=table4d.attr11 AND
  table4c.attr99='keyword' AND
  table4c.attr1='keyword' AND
  table11.attr10=table5a.attr10 AND
  table11.attr40='keyword' AND
  table11.attr50='keyword' AND
  table2b.attr1=table1.attr8 AND
  table2b.attr9 IN ('keyword') AND
  table2b.attr2 LIKE 'keyword'% AND
  table12.attr9 IN ('keyword') AND
  table7b.attr1=table2a.attr10 AND
  table3c.attr13=table10c.attr1 AND
  table3c.attr10=table6b.attr20 AND
  table3c.attr13='keyword' AND
  table10b.attr16=table10a.attr7 AND
  table10b.attr11=table7b.attr8 AND
  table10b.attr13=table4b.attr89 AND
  table13.attr1=table2b.attr10 AND
  table13.attr20='keyword' AND
  table13.attr15='keyword' AND
  table3d.attr49=table12.attr18 AND
  table3d.attr18=table10c.attr11 AND
  table3d.attr14='keyword' AND
  table4d.attr17 IN ('keyword') AND
  table4d.attr19 IN ('keyword') AND
  table16.attr28=table11.attr56 AND
  table16.attr16=table10b.attr78 AND
  table16.attr5=table14.attr56 AND
  table4e.attr34 IN ('keyword') AND
  table4e.attr48 IN ('keyword') AND
  table4f.attr89=table5b.attr7 AND
  table4f.attr45 IN ('keyword') AND
  table4f.attr1='keyword' AND
  table10c.attr2=table4e.attr19 AND
  (table10c.attr78=table12.attr56 OR (table10c.attr55 IS NULL AND table12.attr17 IS NULL))


Page 65: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

65


50.000.000 €/year

Page 66: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Ontology-Based Data Access

66


Ontology-based data integration framework

[Figure: a query is posed over the ontology and a result is returned; the ontology is linked through mappings to the underlying data sources.]

• Ontology: provides global vocabulary and conceptual view
• Mappings: semantically link sources and ontology
• Data Sources: external and heterogeneous

We achieve logical transparency in accessing data:
• The user does not know where and how the data is stored.
• The user can only see a conceptual view of the data.

Diego Calvanese (FUB) Ontologies for Data Integration FOfAI 2015, Buenos Aires – 27/7/2015 (7/52)

data sources

“lightweight” conceptual model

mapping

Page 67: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Ontop

• Open-source OBDA technology developed at UNIBZ (supervisor: Diego Calvanese)
• Fully supports semantic web standards (OWL/SPARQL)
• Integrates with a plethora of relational DBMSs
• Apache open license
• http://ontop.inf.unibz.it

67

Page 68: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Resolving the Impedance Mismatch

68

PaperInfo

FullPaper (creationTime: DateTime, title: String)

mappingId  fp-mapping
target     paper{ID} a :FullPaper ; :title {Title} ; :creationTime {CT}
source     select I.ID, I.Title, I.CT from PaperInfo I where I.Type = 'FP'
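To make the meaning of this mapping concrete, the Python sketch below applies it by hand to a couple of invented `PaperInfo` rows and produces the corresponding triples. The rows and the helper `fp_mapping` are hypothetical. The sketch materializes triples only to show what the mapping expresses; OBDA systems such as Ontop typically answer queries by rewriting them over the sources rather than by materializing the data.

```python
# Invented rows of the PaperInfo table, for illustration only.
PAPER_INFO = [
    {"ID": 1, "Title": "Data-aware processes", "CT": "2016-06-20T10:00:00", "Type": "FP"},
    {"ID": 2, "Title": "A short note", "CT": "2016-06-21T09:30:00", "Type": "SP"},
]


def fp_mapping(rows):
    """Apply the fp-mapping by hand: source query filter, then target template."""
    triples = []
    for row in rows:
        # Source part: select ... from PaperInfo I where I.Type = 'FP'
        if row["Type"] != "FP":
            continue
        # Target part: paper{ID} a :FullPaper ; :title {Title} ; :creationTime {CT}
        subject = f"paper{row['ID']}"
        triples.append((subject, "a", ":FullPaper"))
        triples.append((subject, ":title", row["Title"]))
        triples.append((subject, ":creationTime", row["CT"]))
    return triples


triples = fp_mapping(PAPER_INFO)
```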

Page 69: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

What if my DB is Very Nice?

Ontology bootstrapping automatically creates

• a conceptual model that mirrors 1-1 the relational DB

• identity mappings

Useful for “small” case studies

69

Page 70: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

OBDA for Process Mining

• Need to resolve a second impedance mismatch problem!
• From here…

70

[Figure: UML class diagram of the conference submission domain.
Classes: Conference (creationtime: DateTime, confname: String); User (creationtime: DateTime, username: String); Paper (creationtime: DateTime, title: String); ReviewRequest (invitationtime: DateTime); Review (submissiontime: DateTime); Decision (decisiontime: DateTime, outcome: Bool); UploadSubmitted (uploadtime: DateTime); UploadAccepted (uploadtime: DateTime); AcceptedPaper <<notime>>.
Associations: submittedto, organizerof, reviewer, PhasD, RhasR, correspondsto, UhasP, AhasU, for, author, by, USuploadbyU, creator, UAuploadbyU.]

Page 71: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

OBDA for Process Mining

• …To there!

Figure: UML class diagram of the XES event log metamodel.
Classes:
• Log (xes.version: xs:decimal, xes.features: xs:token)
• Classifier (name: xs:string, keys: xs:string)
• Extension (prefix: xs:string, uri: xs:string, key: xs:string, type: xs:string)
• Trace
• Event
• Attribute (key: xs:string, value: xs:string, type: xs:string)
• GlobalAttribute (scope: {event, trace})
Associations: declare, define, and contain links among these classes

Page 72: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

OBDA for Process Mining

• From here…

Figure label: PaperInfo

Page 73: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

OBDA for Process Mining

• …To there!

Figure labels: log, trace, event

Page 74: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Log Annotations

Figure: the conference domain class diagram (Conference, User, Paper, ReviewRequest, Review, Decision, UploadSubmitted, UploadAccepted, AcceptedPaper), annotated with one trace annotation and four event annotations.

Trace annotation
• condition: submittedto = BPM2015

Event annotation “decision”
• trace: follow has
• activity name: “decision”
• timestamp: decisiontime
• resource: follow by
• type: complete
• attributes: outcome

Event annotation “review”
• trace: follow has & for
• activity name: “review”
• timestamp: submissiontime
• resource: follow RhasR & reviewer
• type: complete

Event annotation “uploadsubmitted”
• trace: follow has
• activity name: “uploadsubmitted”
• timestamp: uploadtime
• resource: follow USuploadbyU
• type: complete

Event annotation “uploadaccepted”
• trace: follow has & corr.to
• activity name: “uploadaccepted”
• timestamp: uploadtime
• resource: follow UAuploadbyU
• type: complete
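One possible way to read such an annotation is as a small record that tells an extractor where to find the trace, timestamp, and resource of each event. The Python sketch below encodes only the “decision” annotation and applies it to hypothetical instance data; the class `EventAnnotation`, the helper names, and the dictionary layout are assumptions for illustration, not the authors' implementation.

```python
from dataclasses import dataclass, field


@dataclass
class EventAnnotation:
    source_class: str    # domain class whose instances become events
    activity: str        # constant activity name
    timestamp_attr: str  # attribute holding the event timestamp
    trace_path: list     # associations to follow to reach the trace object
    resource_path: list  # associations to follow to reach the resource
    lifecycle: str = "complete"
    extra_attrs: list = field(default_factory=list)


# Hypothetical instance data; link names reuse the labels on the slide.
OBJECTS = {
    "p1": {"class": "Paper", "submittedto": "BPM2015"},
    "p2": {"class": "Paper", "submittedto": "OtherConf"},
    "u1": {"class": "User", "username": "chair"},
    "d1": {"class": "Decision", "decisiontime": "2015-05-20T12:00:00",
           "outcome": True, "has": "p1", "by": "u1"},
    "d2": {"class": "Decision", "decisiontime": "2015-06-01T09:00:00",
           "outcome": False, "has": "p2", "by": "u1"},
}

DECISION = EventAnnotation("Decision", "decision", "decisiontime",
                           trace_path=["has"], resource_path=["by"],
                           extra_attrs=["outcome"])


def follow(objects, obj_id, path):
    """Navigate from obj_id along the given association names."""
    for link in path:
        obj_id = objects[obj_id][link]
    return obj_id


def extract_events(objects, ann, trace_condition):
    """Turn each instance of ann.source_class into an event of a qualifying trace."""
    events = []
    for obj_id, obj in objects.items():
        if obj["class"] != ann.source_class:
            continue
        trace_id = follow(objects, obj_id, ann.trace_path)
        if not trace_condition(objects[trace_id]):
            continue
        event = {
            "trace": trace_id,
            "activity": ann.activity,
            "timestamp": obj[ann.timestamp_attr],
            "resource": follow(objects, obj_id, ann.resource_path),
            "type": ann.lifecycle,
        }
        event.update({name: obj[name] for name in ann.extra_attrs})
        events.append(event)
    return events


events = extract_events(OBJECTS, DECISION, lambda p: p["submittedto"] == "BPM2015")
print(events)
```

The decision on paper p2 is dropped because its trace object does not satisfy the condition submittedto = BPM2015.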


Page 76: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Multiple Log Views

Figure: the same conference domain class diagram, annotated with a different trace annotation and four event annotations.

Event annotation “decisionauthor”
• trace: follow has author
• activity name: “decisionauthor”
• timestamp: decisiontime
• resource: follow PhasD
• type: complete

Event annotation “decisionchair”
• trace: follow by
• activity name: “decisionchair”
• timestamp: decisiontime
• resource: follow PhasD
• type: complete
• attributes: outcome

Event annotation “review”
• trace: follow has & reviewer
• activity name: “review”
• timestamp: submissiontime
• resource: follow RhasR & for
• type: complete

Event annotation “uploadsubmitted”
• trace: follow uploadby
• activity name: “uploadsubmitted”
• timestamp: uploadtime
• resource: follow UhasP
• type: complete
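The point of multiple log views can be illustrated with a small Python sketch: the same underlying records yield different traces depending on which object is chosen as the case. The records, the field names, and the function `log_view` are hypothetical and only loosely follow the annotations on the slide.

```python
from collections import defaultdict

# Hypothetical event records derived from the same data.
RECORDS = [
    {"activity": "uploadsubmitted", "time": "2015-03-01", "paper": "p1", "author": "alice"},
    {"activity": "uploadsubmitted", "time": "2015-03-02", "paper": "p2", "author": "alice"},
    {"activity": "review", "time": "2015-04-10", "paper": "p1", "author": "alice"},
    {"activity": "decisionauthor", "time": "2015-05-20", "paper": "p2", "author": "alice"},
]


def log_view(records, case_key):
    """Group time-ordered activities into traces identified by case_key."""
    traces = defaultdict(list)
    for rec in sorted(records, key=lambda r: r["time"]):
        traces[rec[case_key]].append(rec["activity"])
    return dict(traces)


by_paper = log_view(RECORDS, "paper")
by_author = log_view(RECORDS, "author")
print(by_paper)
print(by_author)
```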


Page 78: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

And Now?

Figure labels: database (×3), Mapping, Domain Ontology, Annotation, Event Ontology, XES Log Extraction; two links are marked “?”.

Page 79: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Mapping Synthesis

Figure labels: database (×3), Mapping, Domain Ontology, Annotation, Event Ontology, XES Log Extraction; two links are marked “?”, with the callout “Automatically synthesized”.

1. Annotation transformed into an ontology-to-ontology mapping M’

2. M’ is “rewritten” using the data-to-domain ontology mapping

3. The result is a mapping connecting the XES ontology directly to the data
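The three steps can be sketched, under strong simplifying assumptions, as unfolding: each rule of M’ names a domain class, and that class is replaced by the SQL query that populates it. In the Python sketch below the table DecisionInfo, its columns, the dictionary layouts, and the function `synthesize` are all hypothetical; the real rewriting works on full ontology mappings rather than single-class rules.

```python
import sqlite3

# Data-to-domain-ontology mapping: domain class -> SQL over the sources (hypothetical).
DOMAIN_MAPPING = {
    ":Decision": "SELECT D.ID AS id, D.DTime AS decisiontime FROM DecisionInfo D",
}

# M': each rule populates an XES concept from a domain class (derived from the annotation).
ANNOTATION_MAPPING = [
    {"xes": ":Event", "domain": ":Decision",
     "activity": "decision", "timestamp": "decisiontime"},
]


def synthesize(annotation_mapping, domain_mapping):
    """Unfold each rule of M' with the SQL that defines its domain class."""
    log_mapping = []
    for rule in annotation_mapping:
        inner = domain_mapping[rule["domain"]]
        sql = (f"SELECT id, '{rule['activity']}' AS activity, "
               f"{rule['timestamp']} AS ts FROM ({inner}) AS src")
        log_mapping.append({"target": rule["xes"], "source": sql})
    return log_mapping


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE DecisionInfo (ID INTEGER, DTime TEXT)")
conn.execute("INSERT INTO DecisionInfo VALUES (1, '2015-05-20T12:00:00')")
log_mapping = synthesize(ANNOTATION_MAPPING, DOMAIN_MAPPING)
rows = conn.execute(log_mapping[0]["source"]).fetchall()
print(rows)
```

The synthesized source query mentions only database tables, so XES events can be obtained from the data without going through the domain ontology at query time.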

Page 80: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Log Materialization

Figure labels: database (×3), Mapping, Domain Ontology, Annotation, Event Ontology, Log Mapping, XES Event Data, XES File, Process Mining Tools; steps numbered 1 to 4.

SPARQL queries:

SELECT DISTINCT ?t ?v ?e WHERE { ?t :TcontainsA ?ta . ?ta :valueA ?v . ?t :TcontainsE ?e . }
SELECT DISTINCT ?e ?t WHERE { ?e :EcontainsA ?a . ?a :typeA ?t . }
SELECT DISTINCT ?e ?t WHERE { ?e :EcontainsA ?a . ?a :keyA ?t . }
SELECT DISTINCT ?e ?t WHERE { ?e :EcontainsA ?a . ?a :valueA ?t . }
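Once the query answers are available, writing the XES file is mostly a serialization step. The Python sketch below assumes the answers have already been collected into two hypothetical lists (trace/event pairs and per-event attributes) and builds an XES-style XML tree with the standard library; the sample values and the function `build_xes` are illustrative assumptions.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical query answers: (trace, event) pairs and (event, key, type, value) rows.
TRACE_EVENTS = [("paper1", "e1"), ("paper1", "e2"), ("paper2", "e3")]
EVENT_ATTRS = [
    ("e1", "concept:name", "string", "uploadsubmitted"),
    ("e1", "time:timestamp", "date", "2015-03-01T10:00:00"),
    ("e2", "concept:name", "string", "review"),
    ("e2", "time:timestamp", "date", "2015-04-10T15:30:00"),
    ("e3", "concept:name", "string", "uploadsubmitted"),
    ("e3", "time:timestamp", "date", "2015-03-02T11:00:00"),
]


def build_xes(trace_events, event_attrs):
    """Assemble a log element with one trace per trace id and one event per pair."""
    attrs = defaultdict(list)
    for event_id, key, typ, value in event_attrs:
        attrs[event_id].append((key, typ, value))
    log = ET.Element("log", {"xes.version": "1.0", "xes.features": ""})
    traces = {}
    for trace_id, event_id in trace_events:
        if trace_id not in traces:
            trace = ET.SubElement(log, "trace")
            ET.SubElement(trace, "string", {"key": "concept:name", "value": trace_id})
            traces[trace_id] = trace
        event = ET.SubElement(traces[trace_id], "event")
        for key, typ, value in attrs[event_id]:
            # The XES attribute type becomes the XML element name.
            ET.SubElement(event, typ, {"key": key, "value": value})
    return log


xes_log = build_xes(TRACE_EVENTS, EVENT_ATTRS)
xml_text = ET.tostring(xes_log, encoding="unicode")
print(xml_text)
```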


Page 83: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Log Virtualization

Figure labels: database (×3), Mapping, Domain Ontology, Annotation, Event Ontology, Log Mapping, OnDemand XES Loader, Process Mining Tools; steps numbered 1 and 2.

On-demand implementation classes: XFactoryOnDemandImpl, XLogOnDemandImpl, XTraceOnDemandImpl, XEventOnDemandImpl, XLogOnDemandIterator, XTraceOnDemandIterator

Example: xlog.get(7).get(90) retrieves the event at index 90 inside the trace at index 7 of a log.
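The on-demand idea can be sketched in Python as list-like objects that hit the data source only when an index is requested. The classes `OnDemandLog` and `OnDemandTrace`, the `events` table, and its columns are hypothetical stand-ins, not the Java classes listed on the slide.

```python
import sqlite3


class OnDemandTrace:
    """List-like trace that fetches a single event only when it is indexed."""

    def __init__(self, conn, trace_id):
        self._conn = conn
        self.trace_id = trace_id

    def __getitem__(self, index):
        if index < 0:
            raise IndexError(index)
        row = self._conn.execute(
            "SELECT activity, ts FROM events WHERE trace_id = ? "
            "ORDER BY ts LIMIT 1 OFFSET ?", (self.trace_id, index)).fetchone()
        if row is None:
            raise IndexError(index)
        return {"activity": row[0], "timestamp": row[1]}


class OnDemandLog:
    """List-like log that resolves a trace only when it is indexed."""

    def __init__(self, conn):
        self._conn = conn

    def __getitem__(self, index):
        if index < 0:
            raise IndexError(index)
        row = self._conn.execute(
            "SELECT DISTINCT trace_id FROM events ORDER BY trace_id "
            "LIMIT 1 OFFSET ?", (index,)).fetchone()
        if row is None:
            raise IndexError(index)
        return OnDemandTrace(self._conn, row[0])


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (trace_id TEXT, activity TEXT, ts TEXT)")
conn.executemany("INSERT INTO events VALUES (?, ?, ?)", [
    ("p1", "uploadsubmitted", "2015-03-01"),
    ("p1", "review", "2015-04-10"),
    ("p2", "uploadsubmitted", "2015-03-02"),
])
log = OnDemandLog(conn)
event = log[0][1]  # second event of the first trace
print(event)
```

Nothing is loaded up front: each index access becomes one query, which keeps memory use low but makes the cost of repeated access depend on the data source.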

Page 84: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Questions

• How to optimize and test the scalability of the approach? Fine-tuning is a must!
• Real vs simulated data? (Benchmarking OBDA)
• Initial benchmarking using CPN Tools
• Is the “virtual” approach useful? How do process mining algorithms access the data?
• Hybrid virtual approach with caching strategies?

Page 85: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

KAOS Project
Knowledge-Aware Operational Support

• Goal: Empowering process mining and online operational support with domain knowledge
• Euregio project: Trento + Bolzano + Innsbruck
• Mix of expertise from AI, BPM, database theory, formal methods, formal ontology, conceptual modeling, process mining, machine learning, software engineering
• Just started: we are hiring!!!

Page 86: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Conclusion

Page 87: ATAED2016 Montali - Marrying data and processes: from model to event data analysis

Acknowledgments

All coauthors of this research, in particular:
• Diego Calvanese (UNIBZ)
• Giuseppe De Giacomo (UNIROMA)
• Riccardo De Masellis (FBK-Trento)
• Alin Deutsch (UCSD)
• Chiara Di Francescomarino (FBK-Trento)
• Chiara Ghidini (FBK-Trento)
• Fabio Patrizi (UNIBZ)
• Sergio Tessaris (UNIBZ)
• Alifah Syamsiyah (TU/e)
• Wil van der Aalst (TU/e)