Mining Declarative Models using Intervals Jan Martijn van der Werf Ronny Mans Wil van der Aalst
Feb 08, 2016
A service landscape
How to combine logs?
Merge using time stamps!
Are timestamps synchronized across the landscape?
Semantics of timestamps?
• Time when the event occurred?
• Time when it started / completed?
• Time when the event is recorded?
• Time when the event is stored?
• ...
Time stamps
• Time scale of data?
− Dense (time stamps)
− Coarse (hour, minute, day)
• Reliability of the data?
− User entered?
− System generated?
Events & intervals: “old theory”
• Structure of concurrency:
− Observe whether an event preceded another event
− Observe whether events occurred simultaneously
• Implies an order
• Interval order!
• Position of intervals on the axis!
Interval orders
• Define relation > by a > b iff “a occurs wholly after b”
• Interval order if:
− [ a > b and c > d ] imply [ a > d or c > b ]
• Generalization of transitivity
• Simultaneousness: ¬(a > b) ∧ ¬(b > a)
[Figure: example intervals a, b, c, d]
But this only works at the level of events!
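The interval-order condition and the simultaneousness test can be checked mechanically. The sketch below assumes each interval is a `(start, end)` pair of numbers; the function names are illustrative and not taken from the slides.

```python
from itertools import product

def wholly_after(a, b):
    """a > b: interval a starts strictly after interval b has ended."""
    return a[0] > b[1]

def is_interval_order(items, gt):
    """Check that [a > b and c > d] implies [a > d or c > b] for all a, b, c, d."""
    return all(
        gt(a, d) or gt(c, b)
        for a, b, c, d in product(items, repeat=4)
        if gt(a, b) and gt(c, d)
    )

def simultaneous(a, b, gt=wholly_after):
    """Neither a > b nor b > a."""
    return not gt(a, b) and not gt(b, a)

intervals = [(0, 2), (1, 4), (3, 5), (6, 7)]
print(is_interval_order(intervals, wholly_after))  # True
print(simultaneous((0, 2), (1, 4)))                # True
```

A relation that only contains a > b and c > d, with no relation between the two pairs, fails the check, which is what separates interval orders from arbitrary partial orders.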
Process mining & intervals
1. Derive interval for each event
• Singleton set (single time stamp)
• Accuracy interval ( t ± )
• Time scale (week, day, hour, minute, ...)
2. Relate events and intervals to activity
3. Discover process model
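Step 1 could look as follows. This is a minimal sketch, assuming timestamps are `datetime` objects, that the accuracy margin is supplied by the analyst, and that a time-scale interval is half-open (start included, end excluded). All function names are hypothetical, and only minute, hour and day are covered.

```python
from datetime import datetime, timedelta

def singleton(ts):
    """Single time stamp: the interval is one point."""
    return (ts, ts)

def accuracy_interval(ts, margin):
    """Time stamp known only up to a margin on either side (margin is an assumption)."""
    return (ts - margin, ts + margin)

# Fields to reset and the length of one unit, per time scale.
_UNITS = {
    "minute": (dict(second=0, microsecond=0), timedelta(minutes=1)),
    "hour": (dict(minute=0, second=0, microsecond=0), timedelta(hours=1)),
    "day": (dict(hour=0, minute=0, second=0, microsecond=0), timedelta(days=1)),
}

def time_scale_interval(ts, scale):
    """Coarse time stamp: the whole minute, hour or day containing ts."""
    reset, length = _UNITS[scale]
    start = ts.replace(**reset)
    return (start, start + length)

ts = datetime(2016, 2, 8, 14, 37, 12)
print(singleton(ts))
print(accuracy_interval(ts, timedelta(seconds=5)))
print(time_scale_interval(ts, "hour"))
```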
Activities & intervals
• First event until last event
• Following the life cycle of the activities
Activities & intervals
• Activities relate to a set of intervals
• Many different mappings possible!
• Granularity (density of intervals)
− Fine: many small intervals
− Coarse: few large intervals
• Finest interval function:
− Only intervals of single points
• Coarsest interval function:
− Each activity maps to a single interval
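The two extreme interval functions can be illustrated on a toy log. The sketch assumes the events of one case are given as `(activity, timestamp)` pairs with numeric timestamps; `finest` and `coarsest` are illustrative names, not an existing API.

```python
def finest(events):
    """Finest mapping: every time stamp becomes its own point interval."""
    result = {}
    for activity, t in events:
        result.setdefault(activity, set()).add((t, t))
    return result

def coarsest(events):
    """Coarsest mapping: one interval per activity, from first to last event."""
    bounds = {}
    for activity, t in events:
        lo, hi = bounds.get(activity, (t, t))
        bounds[activity] = (min(lo, t), max(hi, t))
    return {activity: {iv} for activity, iv in bounds.items()}

events = [("A", 1), ("B", 2), ("A", 4), ("B", 6)]
print(finest(events))    # A: points 1 and 4, B: points 2 and 6
print(coarsest(events))  # A: (1, 4), B: (2, 6)
```

Under the finest mapping A and B never overlap in this log, while under the coarsest mapping they do, which is how granularity changes the amount of concurrency that is observed.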
Process mining & intervals
1. Derive interval for each event
• Singleton set (single time stamp)
• Accuracy interval ( t ± )
• Time scale (week, day, hour, minute, ...)
2. Relate events and intervals to activity
• Many different approaches!
3. Discover process model
Relations on interval sets (1)
• Simultaneousness
− Weak: there is some overlap somewhere
− Dependent: whenever A occurs, B occurs as well
− Strong: if A occurs, then B occurs, and vice versa
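One possible reading of the three simultaneousness relations, as a sketch. It assumes interval sets are lists of closed `(start, end)` pairs and interprets "B occurs as well" as "the interval of A overlaps some interval of B"; that interpretation is an assumption, not a definition from the slides.

```python
def overlaps(x, y):
    """Closed intervals x and y share at least one point."""
    return x[0] <= y[1] and y[0] <= x[1]

def weak_simultaneous(A, B):
    """Some interval of A overlaps some interval of B."""
    return any(overlaps(a, b) for a in A for b in B)

def dependent_simultaneous(A, B):
    """Every interval of A overlaps some interval of B (assumed reading)."""
    return all(any(overlaps(a, b) for b in B) for a in A)

def strong_simultaneous(A, B):
    """Dependent in both directions."""
    return dependent_simultaneous(A, B) and dependent_simultaneous(B, A)

A = [(0, 2), (5, 6)]
B = [(1, 3)]
print(weak_simultaneous(A, B))       # True
print(dependent_simultaneous(A, B))  # False: (5, 6) overlaps nothing in B
print(dependent_simultaneous(B, A))  # True
print(strong_simultaneous(A, B))     # False
```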
Relations on interval sets (2)
• Causality
− Wholly: all intervals of A before B
− Succeeded: each interval of B is followed by one of C
− Preceded: each interval of B occurs after one of A
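The causality relations can be sketched in the same style. The code assumes `(start, end)` pairs and takes "before" to mean "ends strictly before the other starts"; the function names and the generic parameter names `X` and `Y` are illustrative.

```python
def before(x, y):
    """Interval x ends strictly before interval y starts."""
    return x[1] < y[0]

def wholly_before(X, Y):
    """Wholly: every interval of X lies before every interval of Y."""
    return all(before(x, y) for x in X for y in Y)

def succeeded(X, Y):
    """Succeeded: each interval of X is followed by some interval of Y."""
    return all(any(before(x, y) for y in Y) for x in X)

def preceded(X, Y):
    """Preceded: each interval of X occurs after some interval of Y."""
    return all(any(before(y, x) for y in Y) for x in X)

A = [(0, 1), (4, 5)]
B = [(2, 3), (6, 7)]
print(wholly_before(A, B))  # False: (4, 5) is not before (2, 3)
print(succeeded(A, B))      # True
print(preceded(B, A))       # True
```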
Declarative language
• Interval relations are highly declarative:
− Granularity influences degree of concurrency
• Activities occur simultaneously, unless prohibited
Succeeds!
Precedes!
Declarative language
An example
Discover declarative model
1. Derive interval sets
2. Calculate relations on interval sets
3. Generate declarative model
− Problems:
− Simultaneousness relations overlapping
− Causality: always finds the transitive closure!
• Transitive reduction: a minimal relation S with S* = R*
• Minimal edge problem:
− Only use “existing” edges for transitive reduction
− What are existing arcs in process mining?
Causality & transitive closure
• Transitive reduction: polynomial
• Minimal edge problem: NP-hard
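For the polynomial case, a transitive reduction is easy to sketch when the discovered causality relation is acyclic and already transitively closed, as the discovery step above produces. The code assumes the relation is a set of `(a, b)` pairs; it does not address the NP-hard minimal edge variant.

```python
def transitive_reduction(closure):
    """Keep (a, b) only if no intermediate c exists with (a, c) and (c, b).

    Assumes `closure` is an irreflexive, transitively closed, acyclic relation.
    """
    nodes = {x for pair in closure for x in pair}
    return {
        (a, b)
        for (a, b) in closure
        if not any((a, c) in closure and (c, b) in closure for c in nodes)
    }

closure = {("a", "b"), ("b", "c"), ("a", "c")}
print(sorted(transitive_reduction(closure)))  # [('a', 'b'), ('b', 'c')]
```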
Next to and betweenness relation
• Next to
− Weak: there is an interval of A directly followed by B
− Strong: all intervals of A are directly followed by B
• Betweenness:
− An interval of B is between two intervals of A
− Weak or strong?
[Figure: example intervals a, b, c, d]
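The weak and strong next-to relations could be checked as sketched below. The slides do not define "directly followed", so the code assumes it means: among all intervals of all activities that start after the given interval ends, the earliest-starting one belongs to B. The function name and the dictionary layout are hypothetical.

```python
def next_to(sets, A, B, strong=False):
    """Weak: some interval of A is directly followed by B. Strong: all are."""
    labelled = [(iv, act) for act, ivs in sets.items() for iv in ivs]

    def directly_followed(a):
        # Intervals of any activity that start after a has ended.
        later = [(iv, act) for iv, act in labelled if iv[0] > a[1]]
        if not later:
            return False
        first = min(iv[0] for iv, _ in later)
        return any(act == B for iv, act in later if iv[0] == first)

    check = all if strong else any
    return check(directly_followed(a) for a in sets[A])

sets = {"A": [(0, 1), (4, 5)], "B": [(2, 3)], "C": [(6, 7)]}
print(next_to(sets, "A", "B"))               # True: (0, 1) is followed by B
print(next_to(sets, "A", "B", strong=True))  # False: (4, 5) is followed by C
```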
Conclusions & future work
• Approach:
1. Derive interval for each event
2. Relate events and intervals to activity
− Many possibilities!
3. Discover process model
• Proof of concept implemented in ProM
• Apply approach to case studies