Process-oriented System Analysis Process Mining
Jan 05, 2016

Transcript
Page 1: Process-oriented System Analysis Process Mining. BPM Lifecycle.

Process-oriented System Analysis: Process Mining

Page 2: Process-oriented System Analysis Process Mining. BPM Lifecycle.

BPM Lifecycle

Page 3: Process-oriented System Analysis Process Mining. BPM Lifecycle.

Motivation

Up until now: designed or pre-defined models

Assumption that they are appropriate

Process Mining

Consideration of information from the execution of processes

This information is recorded in log data

Logs

Sequence of log entries, which capture events in a company that relate to processes

Page 4: Process-oriented System Analysis Process Mining. BPM Lifecycle.

Log entries

Examples of log entries

Check Invoice for Invoice No. 4567 completed on 12.11.2010 at 9:19:57

Function StoreCustomerData(„Müller“, c1987, „Bad Bentheim“) completed on 12.11.2010 at 9:22:24

Send Invoice for Invoice No. 4567 completed on 12.11.2010 at 9:23:18

Function ContactCustomer(c1987, PromoMailing) completed on 12.11.2010 at 9:24:10

Function StoreCustomerData(„Miller“, c1988, „Osnabrück“) completed on 12.11.2010 at 9:26:08

Check Invoice for Invoice No. 4568 completed on 12.11.2010 at 9:26:38

Function ContactCustomer(c1988, PromoMailing) completed on 12.11.2010 at 9:27:32

Page 5: Process-oriented System Analysis Process Mining. BPM Lifecycle.

Logs bear valuable information

Logs bear valuable information to answer questions like:

When and how many process instances have been executed?

Are there recurring patterns in the execution of activities?

Can process models be derived from the data?

Which paths of execution are used how often in the process models?

Are there paths which are never taken?

Page 6: Process-oriented System Analysis Process Mining. BPM Lifecycle.

Process Discovery

Process Discovery is a technique for deriving a process model from log data

Input: execution logs as ordered lists of activities with time stamp and case id

Output: process model which could have generated the execution logs

The case id is often not directly available in the data and needs to be generated during pre-processing

Page 7: Process-oriented System Analysis Process Mining. BPM Lifecycle.

Process Conformance

Process Conformance is a technique to analyze the relationship between log data and process models

Input: Logs and process model

Output: information on the relationship, e.g. fitness
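
Fitness is not defined further on this slide. As a rough illustration only, one naive reading is the share of cases in the log whose execution sequence the model can reproduce. The sketch below assumes, hypothetically, that the model is given simply as the set of execution sequences it allows. The name `naive_fitness`, this model representation, and the deviating sequence `ABD` are illustrative assumptions, not the measure or data used in this lecture.

```python
def naive_fitness(case_sequences, allowed_sequences):
    """Share of cases whose execution sequence is allowed by the model.

    Assumption: the model is represented as a plain set of allowed
    sequences (a hypothetical simplification for illustration).
    """
    if not case_sequences:
        return 1.0  # an empty log contradicts nothing
    fitting = sum(1 for seq in case_sequences if seq in allowed_sequences)
    return fitting / len(case_sequences)


# Illustrative data: the model allows ABCD and ACBD; one of three cases deviates.
model = {"ABCD", "ACBD"}
cases = ["ABCD", "ACBD", "ABD"]
print(naive_fitness(cases, model))
```

A value of 1.0 would mean every observed case fits the model; lower values point to cases that deviate from it.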

Page 8: Process-oriented System Analysis Process Mining. BPM Lifecycle.

Overview

Page 9: Process-oriented System Analysis Process Mining. BPM Lifecycle.

Execution Logs

Assumption

Execution log defines a complete order of events, which can all be related to process activities

All events in the execution log relate to process instances of the considered process

Hint

Often log entries refer to different process models

This warrants filtering activities

Abstraction

Techniques often work on an abstraction of logs

Focus on case id and activities

Page 10: Process-oriented System Analysis Process Mining. BPM Lifecycle.

Execution Log Format

Log format

(caseID, activity)

Example

Check Invoice for Invoice No. 4567 completed on 12.11.2010 at 9:19:57

Function StoreCustomerData(„Müller“, c1987, „Bad Bentheim“) completed on 12.11.2010 at 9:22:24

Send Invoice for Invoice No. 4567 completed on 12.11.2010 at 9:23:18

Resulting log

(4567, Check Invoice), (c1987, StoreCustomerData), (4567, Send Invoice), etc.
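
The step from raw entries to (caseID, activity) pairs could be sketched as follows. This is a minimal sketch under stated assumptions: the regular expressions are tailored to the two entry shapes shown on the slides, the invoice number serves as case id for invoice entries, and the customer id (e.g. c1987) serves as case id for `Function` entries, as in the resulting log above. The names `to_event`, `INVOICE`, `FUNCTION` and `CUSTOMER_ID` are hypothetical.

```python
import re

# Assumed entry shapes, taken from the example entries on the slides.
INVOICE = re.compile(r"^(?P<activity>.+?) for Invoice No\. (?P<case>\d+) completed on ")
FUNCTION = re.compile(r"^Function (?P<activity>\w+)\((?P<args>.*)\) completed on ")
CUSTOMER_ID = re.compile(r"\bc\d+\b")  # assumption: customer ids look like c1987


def to_event(entry):
    """Map one raw log entry to a (caseID, activity) pair, or None if unknown."""
    m = INVOICE.match(entry)
    if m:
        return (m.group("case"), m.group("activity"))
    m = FUNCTION.match(entry)
    if m:
        customer = CUSTOMER_ID.search(m.group("args"))
        if customer:
            return (customer.group(0), m.group("activity"))
    return None  # entry does not fit either assumed shape


entries = [
    "Check Invoice for Invoice No. 4567 completed on 12.11.2010 at 9:19:57",
    "Function StoreCustomerData(„Müller“, c1987, „Bad Bentheim“) completed on 12.11.2010 at 9:22:24",
    "Send Invoice for Invoice No. 4567 completed on 12.11.2010 at 9:23:18",
]
log = [to_event(e) for e in entries]
print(log)
# [('4567', 'Check Invoice'), ('c1987', 'StoreCustomerData'), ('4567', 'Send Invoice')]
```

Entries that match neither shape yield `None`, which is one simple way to filter out activities that belong to other processes.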

Page 11: Process-oriented System Analysis Process Mining. BPM Lifecycle.

Execution Log

Further abstraction

A's and B's

(case id, task id)

Additional information

Event type, time, resource, data

Not considered here

Assumption

Activity execution captured by one event

No intermediate activities

case 1 : task A
case 2 : task A
case 3 : task A
case 3 : task B
case 1 : task B
case 1 : task C
case 2 : task C
case 4 : task A
case 2 : task B
case 2 : task D
case 5 : task E
case 4 : task C
case 1 : task D
case 3 : task C
case 3 : task D
case 4 : task B
case 5 : task F
case 4 : task D

Page 12: Process-oriented System Analysis Process Mining. BPM Lifecycle.

The Alpha Algorithm

Page 13: Process-oriented System Analysis Process Mining. BPM Lifecycle.

Process Discovery Algorithms

Simplest algorithm: the α-Algorithm

Relatively simple; some of its properties can be proven

Affected by noise, therefore not the first choice in practice

Noise refers to incomplete or erroneous logs

Furthermore, the α+ and α++ Algorithms

α+ and α++ are extensions of the α-Algorithm for recognizing more fine-grained structure in the process model

Also affected by noise

Finally, techniques for dealing with noise

Page 14: Process-oriented System Analysis Process Mining. BPM Lifecycle.

Definitions

Let T be a set of activities (tasks) and T* the set of all sequences of arbitrary length over T. Then:

σ ∈ T* is called an execution sequence if all activities in σ belong to the same process instance

W ⊆ T* is called an execution log (workflow log)

Assumptions

In each process model, each activity appears at most once

Each direct neighbor relation between activities is represented at least once

Page 16: Process-oriented System Analysis Process Mining. BPM Lifecycle.

Execution Logs

Execution sequences:

Case 1: ABCD

Case 2: ACBD

Case 3: ABCD

Case 4: ACBD

Case 5: EF

Resulting workflow log: W = {ABCD, ACBD, EF}
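
The grouping of log entries into execution sequences can be sketched in a few lines. This is a minimal illustration, assuming the abstracted log is available as (case id, task id) pairs in log order and that task ids are single letters, so a sequence can be written as a string; the function name `execution_sequences` is hypothetical.

```python
from collections import defaultdict

# The abstracted log from the slides as (case id, task id) pairs, in log order.
events = [
    (1, "A"), (2, "A"), (3, "A"), (3, "B"), (1, "B"), (1, "C"),
    (2, "C"), (4, "A"), (2, "B"), (2, "D"), (5, "E"), (4, "C"),
    (1, "D"), (3, "C"), (3, "D"), (4, "B"), (5, "F"), (4, "D"),
]


def execution_sequences(events):
    """Collect the tasks of each case in the order they appear in the log."""
    by_case = defaultdict(list)
    for case_id, task in events:
        by_case[case_id].append(task)
    # Single-letter task ids are assumed, so joining them is unambiguous.
    return {case_id: "".join(tasks) for case_id, tasks in by_case.items()}


sequences = execution_sequences(events)
workflow_log = set(sequences.values())  # duplicates collapse: W is a set
print(sequences)
print(sorted(workflow_log))
```

Because W is a set, cases 1 and 3 (both ABCD) and cases 2 and 4 (both ACBD) each contribute only one element.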

Page 17: Process-oriented System Analysis Process Mining. BPM Lifecycle.

Order relations

Log-based order relations for pairs of activities a, b ∈ T in a workflow log W:

Direct successor

a >w b, i.e. in an execution sequence b directly follows a

Causality

a →w b, i.e. a >w b and not b >w a

Concurrency

a ║w b, i.e. a >w b and b >w a

Exclusiveness

a #w b, i.e. not a >w b and not b >w a

Activity pairs which never succeed each other

Page 18: Process-oriented System Analysis Process Mining. BPM Lifecycle.

Execution log analysis

W = {ABCD, ACBD, EF}

• Direct successor
• Causality
• Concurrency

Page 19: Process-oriented System Analysis Process Mining. BPM Lifecycle.

Execution log analysis

W = {ABCD, ACBD, EF}

1) Direct successor: A>B, A>C, B>C, B>D, C>B, C>D, E>F

2) Causality: A→B, A→C, B→D, C→D, E→F

3) Concurrency: B||C, C||B
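
The analysis above can be reproduced mechanically. The following is a minimal sketch, assuming the workflow log is given as a set of strings with one letter per task; the function name `order_relations` is hypothetical. Note that the exclusiveness set computed here also contains pairs (a, a) for tasks that never directly follow themselves.

```python
def order_relations(workflow_log):
    """Derive the log-based order relations from a set of execution sequences."""
    tasks = {t for seq in workflow_log for t in seq}
    # Direct successor: b directly follows a in some execution sequence.
    succ = {(a, b) for seq in workflow_log for a, b in zip(seq, seq[1:])}
    # Causality: a >w b and not b >w a.
    causal = {(a, b) for (a, b) in succ if (b, a) not in succ}
    # Concurrency: a >w b and b >w a.
    concurrent = {(a, b) for (a, b) in succ if (b, a) in succ}
    # Exclusiveness: neither a >w b nor b >w a.
    exclusive = {(a, b) for a in tasks for b in tasks
                 if (a, b) not in succ and (b, a) not in succ}
    return succ, causal, concurrent, exclusive


succ, causal, concurrent, exclusive = order_relations({"ABCD", "ACBD", "EF"})
print(sorted(succ))        # direct successor pairs
print(sorted(causal))      # causality pairs
print(sorted(concurrent))  # concurrency pairs
```

All three derived relations are computed from the direct successor relation alone.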

Page 20: Process-oriented System Analysis Process Mining. BPM Lifecycle.

α-Algorithm

The idea is to utilize order relations for deriving a workflow net that is compliant with these relations

More precisely, each order relation results in a Petri net fragment, which imposes the respective relationship

Page 21: Process-oriented System Analysis Process Mining. BPM Lifecycle.

α-Algorithm

Idea (a)

a → b

Page 22: Process-oriented System Analysis Process Mining. BPM Lifecycle.

α-Algorithm

Idea (b)

a → b, a → c and b # c

Page 23: Process-oriented System Analysis Process Mining. BPM Lifecycle.

α-Algorithm

Idea (c)

b → d, c → d and b # c

Page 24: Process-oriented System Analysis Process Mining. BPM Lifecycle.

α-Algorithm

Idea (d)

a → b, a → c and b || c

Page 25: Process-oriented System Analysis Process Mining. BPM Lifecycle.

α-Algorithm

Idea (e)

b → d, c → d and b || c

Page 26: Process-oriented System Analysis Process Mining. BPM Lifecycle.

The Alpha-Algorithm (simplified)

1. Identify the set of all tasks in the log as TL.

2. Identify the set of all tasks that have been observed as the first task in some case as TI.

3. Identify the set of all tasks that have been observed as the last task in some case as TO.

4. Identify the set of all connections to be potentially represented in the process model as a set XL. Add the following elements to XL:

a. Pattern (a): all pairs for which a→b holds.

b. Pattern (b): all triples for which a→(b#c) holds.

c. Pattern (c): all triples for which (b#c)→d holds.

Note that triples for which Pattern (d) a→(b||c) or Pattern (e) (b||c)→d holds are not included in XL.

Page 27: Process-oriented System Analysis Process Mining. BPM Lifecycle.

The Alpha-Algorithm (cont.)

5. Construct the set YL as a subset of XL by:

a. Eliminating a→b and a→c if there exists some a→(b#c).

b. Eliminating b→d and c→d if there exists some (b#c)→d.

6. Connect start and end events in the following way:

a. If there are multiple tasks in the set TI of first tasks, then draw a start event leading to an XOR-split, which connects to every task in TI. Otherwise, directly connect the start event with the only first task.

b. For each task in the set TO of last tasks, add an end event and draw an arc from the task to the end event.

Page 28: Process-oriented System Analysis Process Mining. BPM Lifecycle.

The Alpha-Algorithm (cont.)

7. Construct the flow arcs in the following way:

a. Pattern (a): For each a→b in YL, draw an arc from a to b.

b. Pattern (b): For each a→(b#c) in YL, draw an arc from a to an XOR-split, and from there to b and c.

c. Pattern (c): For each (b#c)→d in YL, draw an arc from b and c to an XOR-join, and from there to d.

d. Patterns (d) and (e): If a task in the process model constructed so far has multiple outgoing or multiple incoming arcs, bundle these arcs with an AND-split or AND-join, respectively.

8. Return the newly constructed process model.
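
Steps 1 to 5 of the simplified algorithm can be sketched in code; the drawing steps 6 and 7 are left out. This is an illustration under stated assumptions, not a reference implementation: the log is a collection of non-empty strings with one letter per task, every element of XL and YL is stored as a pair (sources, targets) of frozensets, and triples are limited to two exclusive tasks as in patterns (b) and (c). The function name `alpha_steps_1_to_5` is hypothetical.

```python
from itertools import combinations


def alpha_steps_1_to_5(workflow_log):
    traces = [tuple(seq) for seq in workflow_log if seq]
    tl = {t for trace in traces for t in trace}        # step 1: all tasks
    ti = {trace[0] for trace in traces}                # step 2: first tasks
    to = {trace[-1] for trace in traces}               # step 3: last tasks

    succ = {(a, b) for trace in traces for a, b in zip(trace, trace[1:])}
    causal = {(a, b) for (a, b) in succ if (b, a) not in succ}

    def exclusive(b, c):
        return (b, c) not in succ and (c, b) not in succ

    # Step 4: collect XL; each element is stored as (sources, targets).
    xl = {(frozenset([a]), frozenset([b])) for a, b in causal}   # pattern (a)
    for b, c in combinations(sorted(tl), 2):
        if not exclusive(b, c):
            continue
        for x in tl:
            if (x, b) in causal and (x, c) in causal:            # pattern (b)
                xl.add((frozenset([x]), frozenset([b, c])))
            if (b, x) in causal and (c, x) in causal:            # pattern (c)
                xl.add((frozenset([b, c]), frozenset([x])))

    # Step 5: YL is XL without the pairs that are covered by a triple.
    yl = set(xl)
    for sources, targets in xl:
        if len(targets) == 2:                                    # a→(b#c)
            yl -= {(sources, frozenset([t])) for t in targets}
        if len(sources) == 2:                                    # (b#c)→d
            yl -= {(frozenset([s]), targets) for s in sources}
    return tl, ti, to, xl, yl


tl, ti, to, xl, yl = alpha_steps_1_to_5({"ABCD", "ACBD", "EF"})
print(sorted(ti), sorted(to), len(yl))
```

For W = {ABCD, ACBD, EF}, B and C are concurrent rather than exclusive, so no triples arise and YL consists of the five causal pairs; the AND-split after A and the AND-join before D only appear in step 7d.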

Page 30: Process-oriented System Analysis Process Mining. BPM Lifecycle.

α-Algorithm Example

case 1 : task A
case 2 : task A
case 3 : task A
case 3 : task B
case 1 : task B
case 1 : task C
case 2 : task C
case 4 : task A
case 2 : task B
case 2 : task D
case 5 : task E
case 4 : task C
case 1 : task D
case 3 : task C
case 3 : task D
case 4 : task B
case 5 : task F
case 4 : task D

α(W): [figure: resulting process model]

Page 31: Process-oriented System Analysis Process Mining. BPM Lifecycle.

Log Completeness

Level of completeness required for a log

Assume the execution sequence EF is missing from the log

Then, the correct process model cannot be derived

Basic assumption: each execution sequence must be part of the log

Consequence: the complete behaviour is visible

Problem: the number of required process instances grows dramatically

Example:

10 activities are executed in parallel

Number of potential execution sequences: 10! = 3,628,800

Page 32: Process-oriented System Analysis Process Mining. BPM Lifecycle.

Log Completeness

Result

For the α-Algorithm, it is sufficient for the log to be complete with respect to the direct successor relation (>w)

Reason

All other relations are derived from direct successorship

Interpretation

Each time two activities may succeed each other, this must be visible in at least one execution sequence

Hint

For highly concurrent process models, this dramatically reduces the number of required execution sequences
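
The size of this reduction can be made concrete for the example of 10 parallel activities. The arithmetic below is a rough sketch: it counts the ordered pairs that must be observed for >w and derives only a lower bound on the number of execution sequences needed, under the assumption that each sequence contains every activity exactly once. It does not show how to pick sequences that reach this bound.

```python
import math

n = 10  # number of activities executed in parallel (example from the slides)

all_sequences = math.factorial(n)   # every possible interleaving: 10!
successor_pairs = n * (n - 1)       # ordered pairs a >w b with a != b
pairs_per_sequence = n - 1          # one sequence of n tasks shows n - 1 pairs
# Lower bound on sequences needed to observe every direct successor pair.
min_sequences = math.ceil(successor_pairs / pairs_per_sequence)

print(all_sequences, successor_pairs, min_sequences)
# 3628800 90 10
```

So instead of millions of distinct sequences, completeness with respect to >w asks for 90 observed pairs, which cannot be covered by fewer than 10 sequences.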

Page 33: Process-oriented System Analysis Process Mining. BPM Lifecycle.

Summary

• Execution Logs
• Process Mining using the Alpha-Algorithm