Discovering Branching Conditions from Business Process Execution Logs

Massimiliano de Leoni, Marlon Dumas, Luciano García-Bañuelos
University of Tartu, Estonia (Joint work with Eindhoven University of Technology)

Dec 07, 2014

Marlon Dumas

Paper presentation given at the International Conference on Fundamental Approaches to Software Engineering (FASE) in March 2013. The paper can be found here.
Transcript
Page 1: Discovering Branching Conditions from Business Process Execution Logs

Discovering Branching Conditions from Business Process Execution Logs

Massimiliano de Leoni, Marlon Dumas, Luciano García-Bañuelos

University of Tartu, Estonia (Joint work with Eindhoven University of Technology)

Page 2: Discovering Branching Conditions from Business Process Execution Logs

Business Process Management

[Figure: example process model with the activities Start, Get Ready, Travel by Car, Travel by Train, BETA PhD Day Starts, Give a Talk, Visit Brewery, Have Dinner, Go Home, Pay for Parking, and End. Additional labels: Implementation, Execution, Event Log.]

Page 3: Discovering Branching Conditions from Business Process Execution Logs

Business Process Mining

[Figure: process mining overview around the ProM process mining workbench. Labels: Event Log, Process Model (Start, Register order, Prepare shipment, Ship goods, (Re)send bill, Receive payment, Contact customer, Archive order, End), Performance Analysis, Organizational Model, Social Network.]

Slide by Ana Karla Alves de Medeiros (TU/e)

Page 4: Discovering Branching Conditions from Business Process Execution Logs

Data perspective?

[Figure: process model with its branching points highlighted and the data attributes salary, age, installment, amount, and length.]

Page 5: Discovering Branching Conditions from Business Process Execution Logs

ProM’s Decision Miner

[Figure: process model fragment annotated with the data attributes salary, age, installment, amount, and length.]

Event log:

CID    Task  Data                     Time Stamp             …
13219  ELA   Amount=8500 Len=1        2007-11-09 T 11:20:10  -
13219  RAP   Salary=2000 Age=25       2007-11-09 T 11:22:15  -
13220  ELA   Amount=25000 Len=1       2007-11-09 T 11:22:40  -
13219  CI    Installm=750             2007-11-09 T 11:22:45  -
13219  NE                             2007-11-09 T 11:23:00  -
13219  ASA                            2007-11-09 T 11:24:30  -
13220  CI    Installm=1200            2007-11-09 T 11:24:35  -
…      …     …                        …                      …

Table of attribute values derived from the event log:

CID    Amount  Len  Salary  Age   Installm  Task
13219  8500    1    NULL    NULL  NULL      ELA
13219  8500    1    2000    25    NULL      RAP
13219  8500    1    2000    25    750       RAP
13219  8500    1    2000    25    750       NE
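The step from the event log to the attribute table can be sketched as a replay of the log that carries forward the latest value of each attribute per case. This is only an illustrative sketch and not ProM's implementation: the in-memory log format, the name `build_observations`, and the choice to label each row with the task of the event that produced it are all assumptions made for this example.

```python
# Hypothetical in-memory event log: (case id, task, attributes written by the event).
event_log = [
    (13219, "ELA", {"Amount": 8500, "Len": 1}),
    (13219, "RAP", {"Salary": 2000, "Age": 25}),
    (13220, "ELA", {"Amount": 25000, "Len": 1}),
    (13219, "CI", {"Installm": 750}),
    (13219, "NE", {}),
    (13219, "ASA", {}),
    (13220, "CI", {"Installm": 1200}),
]

ATTRIBUTES = ["Amount", "Len", "Salary", "Age", "Installm"]


def build_observations(log):
    """Replay the log and emit one row per event with the case's latest values."""
    state = {}  # case id -> latest known value of every attribute
    rows = []
    for cid, task, written in log:
        # None plays the role of NULL for attributes not written yet.
        values = state.setdefault(cid, dict.fromkeys(ATTRIBUTES))
        values.update(written)
        rows.append({"CID": cid, **values, "Task": task})
    return rows


observations = build_observations(event_log)
for row in observations:
    print(row)
```

Each row is a snapshot taken after the event's own writes, so the row for ELA already contains the amount and length recorded by that event.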

Page 6: Discovering Branching Conditions from Business Process Execution Logs

ProM’s decision miner / 2

CID    Amount  Installm  Salary  Age  Len  Task
13219  8500    750       2000    25   1    ASA
13220  12500   1200      3500    35   4    ACA
13221  9000    450       2500    27   2    ASA
…      …       …         …       …    …    …

Decision tree learning

amount
  < 10000: Approve Simple Application (ASA)
  ≥ 10000: age
    < 35: Approve Simple Application (ASA)
    ≥ 35: Approve Complex Application (ACA)

Resulting branching conditions:

ASA: (amount < 10000) ∨ (amount ≥ 10000 ∧ age < 35)
ACA: amount ≥ 10000 ∧ age ≥ 35
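Reading branching conditions off a learned tree amounts to collecting the root-to-leaf paths per task: each path is a conjunction, and the paths ending in the same task are joined by a disjunction. The sketch below hard-codes the tree from this slide in a made-up tuple encoding (`TREE`, `paths`, and `conditions` are hypothetical names); it does not learn the tree and is not the decision miner's code.

```python
# Internal node: (variable, threshold, subtree if value < threshold, subtree otherwise).
# Leaf: the task label. The tree mirrors the one shown on the slide.
TREE = ("amount", 10000, "ASA", ("age", 35, "ASA", "ACA"))


def paths(node, atoms=()):
    """Yield (task, atoms) for every root-to-leaf path of the tree."""
    if isinstance(node, str):
        yield node, atoms
        return
    variable, threshold, low, high = node
    yield from paths(low, atoms + (f"{variable} < {threshold}",))
    yield from paths(high, atoms + (f"{variable} >= {threshold}",))


def conditions(tree):
    """Join the paths per task: 'and' within a path, 'or' across paths."""
    by_task = {}
    for task, atoms in paths(tree):
        by_task.setdefault(task, []).append(" and ".join(atoms))
    return {task: " or ".join(f"({c})" for c in conjs) for task, conjs in by_task.items()}


for task, condition in conditions(TREE).items():
    print(task, ":", condition)
```

Because every atom on a path compares one variable with a constant, conditions obtained this way are always built from atoms of the form "v op c".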

Page 7: Discovering Branching Conditions from Business Process Execution Logs

Decision miner: Not a panacea!

• Decision tree learning cannot discover expressions of the form “v op v”

installment > salary

The decision miner would return:

installment ≤ 1760 ∧ salary ≤ 1750 ∨
installment ≤ 1810 ∧ salary ≤ 1800 ∨
installment ≤ 1875 ∧ salary ≤ 1850 ∨
installment ≤ 1960 ∧ salary ≤ 1950 ∨
installment ≤ 1975 ∧ salary ≤ 1970 ∨
installment ≤ 2000 ∧ salary ≤ 1990 ∨ …
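A small experiment can illustrate the limitation. On a made-up grid of (installment, salary) pairs, no single atom of the form "v op c" reproduces the label installment > salary, while the single "v op v" atom matches it exactly. The data, the threshold range, and the function names are assumptions chosen for this sketch only.

```python
# Hypothetical observations on a 5 x 5 grid; the true condition is installment > salary.
values = range(1000, 2001, 250)
points = [(i, s) for i in values for s in values]
labels = [i > s for i, s in points]


def accuracy(predicate):
    """Fraction of observations on which the predicate agrees with the label."""
    hits = sum(predicate(i, s) == label for (i, s), label in zip(points, labels))
    return hits / len(points)


def best_single_atom():
    """Best accuracy reachable with one atom 'v op c' over a range of constants."""
    best = 0.0
    for c in range(750, 2251, 250):
        atoms = [
            lambda i, s, c=c: i > c,
            lambda i, s, c=c: i <= c,
            lambda i, s, c=c: s > c,
            lambda i, s, c=c: s <= c,
        ]
        best = max(best, max(accuracy(atom) for atom in atoms))
    return best


print("best 'v op c' atom:", best_single_atom())
print("'v op v' atom     :", accuracy(lambda i, s: i > s))
```

A tree learner can only get closer by stacking more and more constant thresholds, which is how long staircase-like disjunctions arise.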

Page 8: Discovering Branching Conditions from Business Process Execution Logs

Problem statement

• Discovery of branching conditions composed of atoms of the form “v op c” and “v op v”, including linear equations or inequalities involving multiple variables

• Our solution combines
  • Tools for dynamic analysis of software (i.e., likely invariant discovery)
  • Theory of decision tree learning

Page 9: Discovering Branching Conditions from Business Process Execution Logs

Daikon

• Tool for discovering likely invariants from execution logs
• Given a set of program points, Daikon:
  • Instantiates a set of invariant templates (over certain combinations of variables)
  • Traverses the execution log
    • Falsifying some invariants
    • Gathering the statistical support for the remaining templates
  • Discards some invariants based on:
    • Subsumption
    • Statistical support

Daikon strongly relies on code instrumentation/analysis
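The instantiate, falsify, and discard loop described above can be mimicked in a few lines. The sketch below is not Daikon and does not use its API: it only handles binary comparison templates between pairs of variables, counts the number of confirming observations as a stand-in for statistical support, and applies a very simple subsumption rule. All names (`TEMPLATES`, `SUBSUMES`, `likely_invariants`) and the sample rows are assumptions.

```python
from itertools import combinations
import operator

# Binary comparison templates "a op b" (a tiny subset of what a real tool offers).
TEMPLATES = {"<": operator.lt, "<=": operator.le, "==": operator.eq,
             ">=": operator.ge, ">": operator.gt}
# A stronger invariant makes the listed weaker ones redundant.
SUBSUMES = {"<": ["<="], ">": [">="], "==": ["<=", ">="]}


def likely_invariants(observations, variables, min_support=1):
    # 1. Instantiate every template over every pair of variables.
    candidates = {(a, op, b) for a, b in combinations(variables, 2) for op in TEMPLATES}
    support = dict.fromkeys(candidates, 0)
    # 2. Traverse the observations: falsify candidates or count support.
    for row in observations:
        for a, op, b in list(candidates):
            if row[a] is None or row[b] is None:
                continue  # missing values neither confirm nor falsify
            if TEMPLATES[op](row[a], row[b]):
                support[(a, op, b)] += 1
            else:
                candidates.discard((a, op, b))
    # 3. Discard by support, then by subsumption.
    survivors = {c for c in candidates if support[c] >= min_support}
    for a, op, b in list(survivors):
        for weaker in SUBSUMES.get(op, []):
            survivors.discard((a, weaker, b))
    return {c: support[c] for c in survivors}


# Hypothetical observations for one task.
rows = [
    {"installment": 750, "salary": 2000, "length": 1, "age": 25},
    {"installment": 450, "salary": 2500, "length": 2, "age": 27},
]
result = likely_invariants(rows, ["installment", "salary", "length", "age"])
for invariant in sorted(result):
    print(*invariant)
```

With only two observations many accidental invariants survive (for example installment > age), which is why a support threshold matters on real logs.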

Page 10: Discovering Branching Conditions from Business Process Execution Logs

Daikon: Tool for mining likely invariants

CID    Amount  Installm  Salary  Age  Len  Task
13210  20000   2000      2000    25   1    NR
13220  25000   1200      3500    35   2    NE
13221  9000    450       2500    27   2    NE
13219  8500    750       2000    25   1    ASA
13220  25000   1200      3500    35   2    ACA
13221  9000    450       2500    27   2    ASA
…      …       …         …       …    …    …

Invariant sets returned by Daikon:

• installment > salary, amount ≥ 5000, length < age, …
• installment ≤ salary, amount ≥ 5000, length < age, …
• installment ≤ salary, amount ≤ 9500, length < age, …
• installment ≤ salary, amount ≥ 10000, length < age, …

Page 11: Discovering Branching Conditions from Business Process Execution Logs

BranchMiner (Conjunctive)

• Information Gain (IG) quantifies the discriminating power of a predicate (with respect to two different outcomes)
• Approach:
  • Use Daikon for discovering invariants
  • Combine invariants in a conjunction so as to maximize the overall IG

a1: installment > salary
a2: amount ≥ 5000
a3: length < age
…

IG(a1) = 0.8
IG(a2) = 0.2
IG(a3) = 0
…

IG(a1 ∧ a2) = 0.8
…
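The idea of scoring atoms by information gain and combining them into a conjunction can be sketched as follows. The entropy-based definition of IG is the standard one from decision tree learning. The greedy strategy (add an atom only while the gain strictly increases) is a simplification chosen for this example and is not claimed to be the search used by BranchMiner. The sample rows and atom names are hypothetical.

```python
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of outcome labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((labels.count(v) / n) * log2(labels.count(v) / n) for v in set(labels))


def information_gain(rows, predicate):
    """Entropy of the outcomes minus the weighted entropy after splitting on the predicate."""
    labels = [r["task"] for r in rows]
    yes = [r["task"] for r in rows if predicate(r)]
    no = [r["task"] for r in rows if not predicate(r)]
    n = len(rows)
    return entropy(labels) - len(yes) / n * entropy(yes) - len(no) / n * entropy(no)


def conjunction(predicates):
    return lambda r: all(p(r) for p in predicates)


def greedy_conjunction(rows, atoms):
    """Add atoms one at a time as long as the IG of the conjunction strictly improves."""
    chosen, best_gain = [], 0.0
    remaining = dict(atoms)
    while remaining:
        gains = {
            name: information_gain(rows, conjunction([atoms[c] for c in chosen] + [pred]))
            for name, pred in remaining.items()
        }
        name = max(gains, key=gains.get)
        if gains[name] <= best_gain + 1e-12:
            break
        chosen.append(name)
        best_gain = gains[name]
        del remaining[name]
    return chosen, best_gain


# Hypothetical observations with two outcomes (NR and NE).
rows = [
    {"installment": 2000, "salary": 1500, "amount": 20000, "task": "NR"},
    {"installment": 1800, "salary": 1700, "amount": 6000, "task": "NR"},
    {"installment": 1200, "salary": 3500, "amount": 25000, "task": "NE"},
    {"installment": 450, "salary": 2500, "amount": 9000, "task": "NE"},
    {"installment": 750, "salary": 2000, "amount": 4000, "task": "NE"},
]
atoms = {
    "installment > salary": lambda r: r["installment"] > r["salary"],
    "amount >= 5000": lambda r: r["amount"] >= 5000,
}
chosen, gain = greedy_conjunction(rows, atoms)
print(chosen, round(gain, 3))
```

As on the slide, adding the second atom to the first does not raise the gain, so the sketch keeps the shorter conjunction.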

Page 12: Discovering Branching Conditions from Business Process Execution Logs

Disjunctions?

• With conjunctive discovery alone, a disjunction arises only as the negation of a conjunctive expression, by De Morgan's laws:

P ∧ Q

¬(P ∧ Q) ≡ ¬P ∨ ¬Q

Page 13: Discovering Branching Conditions from Business Process Execution Logs

BranchMiner (Disjunctive)

[Figure: the event log is split into Partition 1, Partition 2, …, Partition n. The conjunctive BranchMiner is run on each partition and yields one conjunction per partition: CONJ1, CONJ2, …, CONJn.]

Page 14: Discovering Branching Conditions from Business Process Execution Logs

BranchMiner (Disjunctive)

[Figure: a decision tree with the leaves Notify Rejection, Notify Eligibility, and Notify Rejection is used on the event log to obtain Partition 1 and Partition 2. The conjunctive BranchMiner is run on each partition and yields CONJ1 and CONJ2.]

IG(CONJ1) = 0.4
IG(CONJ2) = 0.45
IG(CONJ3) = 0.5
…

IG(CONJ1 ∨ CONJ2) = 0.78
IG(CONJ1 ∨ CONJ3) = 0.6
…
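Choosing which conjunctions to join can be sketched as a search over disjunctions scored by information gain. The example below only tries pairs and uses made-up observations and conjunctions. How the partitions are obtained and how many conjunctions are combined in the actual method is not reproduced here.

```python
from collections import Counter
from itertools import combinations
from math import log2


def entropy(labels):
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values()) if n else 0.0


def information_gain(rows, predicate):
    labels = [r["task"] for r in rows]
    yes = [r["task"] for r in rows if predicate(r)]
    no = [r["task"] for r in rows if not predicate(r)]
    n = len(rows)
    return entropy(labels) - len(yes) / n * entropy(yes) - len(no) / n * entropy(no)


def best_disjunction(rows, conjunctions):
    """Return the pair of conjunctions whose disjunction has the highest IG."""
    scores = {
        (a, b): information_gain(rows, lambda r, a=a, b=b: conjunctions[a](r) or conjunctions[b](r))
        for a, b in combinations(conjunctions, 2)
    }
    pair = max(scores, key=scores.get)
    return pair, scores[pair]


# Hypothetical observations: rejection (NR) has two unrelated causes.
rows = [
    {"installment": 2000, "salary": 1500, "age": 30, "task": "NR"},
    {"installment": 1800, "salary": 1700, "age": 40, "task": "NR"},
    {"installment": 500, "salary": 3000, "age": 75, "task": "NR"},
    {"installment": 600, "salary": 2500, "age": 80, "task": "NR"},
    {"installment": 1200, "salary": 3500, "age": 35, "task": "NE"},
    {"installment": 450, "salary": 2500, "age": 27, "task": "NE"},
    {"installment": 750, "salary": 2000, "age": 25, "task": "NE"},
    {"installment": 900, "salary": 4000, "age": 50, "task": "NE"},
]
# Hypothetical conjunctions, e.g. as returned for different partitions.
conjunctions = {
    "CONJ1": lambda r: r["installment"] > r["salary"],
    "CONJ2": lambda r: r["age"] >= 70,
    "CONJ3": lambda r: r["salary"] >= 3000 and r["age"] >= 35,
}
pair, gain = best_disjunction(rows, conjunctions)
print(pair, round(gain, 3))
```

Each conjunction explains only part of the rejections on its own, so its individual gain is modest. The disjunction of the two complementary ones separates the outcomes much better, which is the pattern the IG values on the slide show.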

Page 15: Discovering Branching Conditions from Business Process Execution Logs

Linear and polynomial expressions

• Approach
  • Select all numerical variables and generate some derived (a.k.a. latent) variables using an arithmetic operator, e.g., salary_div_installment, meaning “salary/installment”
  • Augment the event log with the values for latent variables
  • Run the discovery method for conjunctive/disjunctive conditions

CID    Amount  Installm  Salary  Age  Len  Task
13210  20000   2000      2000    25   1    NR
13220  25000   1200      3500    35   2    NE
13221  9000    450       2500    27   2    NE
13219  8500    750       2000    25   1    ASA
13220  25000   1200      3500    35   2    ACA
13221  9000    450       2500    27   2    ASA
…      …       …         …       …    …    …

CID    Amount  Installm  Salary  Sal/Inst  Age  Len  Age+Len  Task
13210  20000   2000      2000    1.00      25   1    26       NR
13220  25000   1200      3500    2.92      35   2    37       NE
13221  9000    450       2500    5.56      27   2    29       NE
13219  8500    750       2000    2.67      25   1    26       ASA
13220  25000   1200      3500    2.92      35   2    37       ACA
13221  9000    450       2500    5.56      27   2    29       ASA
…      …       …         …       …         …    …    …        …
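Augmenting the table with latent variables can be sketched as below. The name salary_div_installment follows the slide. The name age_plus_length, the `augment` helper, the dictionary-based row format, and the handling of division by zero are assumptions made for this example.

```python
import operator

# Arithmetic operators available for building latent variables.
OPERATORS = {"div": operator.truediv, "plus": operator.add}


def augment(rows, derived):
    """Return copies of the rows extended with latent variables.

    derived is a list of (left, operator name, right) triples; the latent
    variable is named '<left>_<operator name>_<right>'.
    """
    augmented = []
    for row in rows:
        new_row = dict(row)
        for left, op, right in derived:
            try:
                value = OPERATORS[op](row[left], row[right])
            except ZeroDivisionError:
                value = None  # assumption: undefined ratios are treated as missing
            new_row[f"{left}_{op}_{right}"] = value
        augmented.append(new_row)
    return augmented


rows = [
    {"cid": 13210, "installment": 2000, "salary": 2000, "age": 25, "length": 1, "task": "NR"},
    {"cid": 13220, "installment": 1200, "salary": 3500, "age": 35, "length": 2, "task": "NE"},
    {"cid": 13221, "installment": 450, "salary": 2500, "age": 27, "length": 2, "task": "NE"},
    {"cid": 13219, "installment": 750, "salary": 2000, "age": 25, "length": 1, "task": "ASA"},
]
latent = [("salary", "div", "installment"), ("age", "plus", "length")]
augmented = augment(rows, latent)
for row in augmented:
    print(row["cid"], round(row["salary_div_installment"], 2), row["age_plus_length"], row["task"])
```

Once the latent columns exist, an atom such as salary_div_installment ≥ c over the augmented table corresponds to a linear inequality over the original variables.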

Page 16: Discovering Branching Conditions from Business Process Execution Logs

Assessment


Page 17: Discovering Branching Conditions from Business Process Execution Logs

Conclusions

• We developed a technique for discovering branching conditions from event logs
  • Complex expressions (e.g., “v op v”, linear inequalities, etc.)
  • More compact than those mined with conventional decision trees
• Integration into ProM
  • Implemented as a command line tool
• Validation with real-life logs
  • Assessed with synthetically generated event logs
• Areas for extensions
  • Coping with noise in the event logs
  • Handling of null values
  • Extending the coverage to more complex types of expressions