Provided courtesy of BCS – Dr. Jürgen Pitschke Page 2
Content
1 Do we need “Textual Analysis”?
2 Origins of Textual Analysis
2.1 Textual Analysis – The Abbott Article
2.2 Textual Analysis and UML Models
2.3 Textual Analysis and BPMN Models
2.4 Textual Analysis and other Model Elements
3 Textual Analysis in the Modeling Process
3.1 The Zachman Framework
3.2 Process Models and the BCS Modeling Process
3.2.1 Identify Concepts
3.2.2 Building a Fact Model
3.2.3 Business Rules, RuleSpeak® and Textual Analysis
3.2.4 Textual Analysis and Process Decomposition
4 How Textual Analysis Really Works
4.1 Evaluation of Candidate Objects for Model Content
4.2 Language and Textual Analysis
5 Tool Support for Textual Analysis
6 Summary
Literature
Annex A: Sample Service Desk
  Unsorted Information Collection (Abridged)
  Collection of Terms (Macro Level, Abridged)
  Fact Model (Macro Level)
  Process Model (Macro process Incident Management)
2 Origins of Textual Analysis
The basic idea of textual analysis originates in an article by Russell J. Abbott¹, “Program
Design by Informal English Descriptions”, published in Communications of the ACM in
November 1983. Approaches for the systematic analysis of textual information existed before.
Abbott had the idea of extracting data types, variables, operators and control structures from
natural language text in order to develop Ada programs. I do not know how successful his
approach was. About 20 years later, with the rise of visual modeling, his idea found new users
and was enhanced. The article is referenced, for example, in “Mentoring Object Technology
Projects”² by Richard Dué, published in 2002.
2.1 Textual Analysis – The Abbott Article
Abbott wrote in his article:
“We identify the data types, objects, operators, and control structures by looking at the
English words and phrases in the informal strategy.
1. A common noun in the informal strategy suggests a data type.
2. A proper noun or direct reference suggests an object.
3. A verb, attribute, predicate, or descriptive expression suggests an operator.
4. The control structures are implied in a straightforward way by the English.”
He also points out that the process of formalizing the information cannot currently be
automated. Creativity and experience are needed to apply it successfully.
Table 1 gives an overview of the Abbott approach.
Part of Text       Model Component                                                       Example
Proper Noun        Instance, Object                                                      J. Smith, Euro
Common Noun        Class, Type, Role                                                     toy, currency, seller
Doing Verb         Operation                                                             buy
Being Verb         Classification                                                        is an
Having Verb        Composition                                                           has an
Stative Verb       Invariance-Condition                                                  are owned
Modal Verb         Data Semantics, Pre Condition, Post Condition or Invariance Condition must be
Adjective          Attribute Value or Class                                              unsuitable
Adjective Phrase   Association, Operation                                                The customer with children, the customer who bought the kite
Transitive Verb    Operation                                                             enter
Intransitive Verb  Exception or Event                                                    depend
Table 1: Textual Analysis according to Abbott
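The mapping in Table 1 can be read as a simple lookup. The following minimal Python sketch assumes that the text has already been tagged by hand with the categories from the "Part of Text" column. The function name `candidate_components` and the sample input are illustrative assumptions and do not come from Abbott's article. The sketch only proposes candidates; deciding which of them enter the model remains a manual, creative step.

```python
# Lookup table taken from Table 1: part of text -> candidate model components.
ABBOTT_MAPPING = {
    "proper noun": ["Instance", "Object"],
    "common noun": ["Class", "Type", "Role"],
    "doing verb": ["Operation"],
    "being verb": ["Classification"],
    "having verb": ["Composition"],
    "stative verb": ["Invariance Condition"],
    "modal verb": ["Data Semantics", "Pre Condition", "Post Condition",
                   "Invariance Condition"],
    "adjective": ["Attribute Value", "Class"],
    "adjective phrase": ["Association", "Operation"],
    "transitive verb": ["Operation"],
    "intransitive verb": ["Exception", "Event"],
}


def candidate_components(tagged_phrases):
    """Return (phrase, candidates) pairs for hand-tagged phrases.

    tagged_phrases is a list of (phrase, part_of_text) tuples. Unknown
    categories yield an empty candidate list instead of raising an error.
    """
    return [
        (phrase, ABBOTT_MAPPING.get(part.lower(), []))
        for phrase, part in tagged_phrases
    ]


# Hypothetical hand-tagged input, using examples from Table 1.
sample = [("J. Smith", "Proper Noun"), ("toy", "Common Noun"), ("buy", "Doing Verb")]
result = candidate_components(sample)
for phrase, candidates in result:
    print(f"{phrase}: {', '.join(candidates)}")
```

A lookup table keeps the mapping easy to review with business experts and easy to extend, for example with the BPMN elements discussed later.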
2.2 Textual Analysis and UML Models
The Unified Modeling Language (UML) is the standard notation for object-oriented system
design today. Abbott’s idea can be applied to model development with UML, especially for
1 Russell J. Abbott, Program Design by Informal English Descriptions, Communications of the ACM, Volume 26, Number 11, November 1983
2 Richard T. Dué, Mentoring Object Technology Projects, Just Enough Series / Yourdon Press, Prentice Hall, 2002
A common misunderstanding about the Zachman Framework is that the perspectives (rows)
contain more detail from top to bottom. That’s wrong. Each row defines a new perspective. An
4 John Zachman, The Zachman Framework For Enterprise Architecture: Primer for Enterprise Engineering and Manufacturing, Zachman International, 2006, Electronic Book
Fact Type             Sentence Form                              Sample
Unary Fact Type       Subject Predicate                          Service Desk Engineer is available.
Binary Fact Type      Subject Predicate Object                   User reports Incident.
Ternary Fact Type     Subject Predicate Object Attribute         User reports Incident for Asset.
Quaternary Fact Type  Subject Predicate Object Attribute Adverb  User reports Incident for Asset at Date.
Table 5: Fact Types and Samples⁷
In real projects you will mainly find unary and binary fact types. Ternary fact types occur
occasionally. I have never seen a quaternary fact type in a real-world project. Such fact
types are often reduced to binary fact types when the fact types are simplified and
normalized.
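The classification in Table 5 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the class `FactType` and its fields are hypothetical names chosen for illustration, and the arity is taken to be the number of noun concepts that take part in the sentence, which matches the samples in the table. The sketch does not perform the reduction to binary fact types; it only flags the fact types that are candidates for it.

```python
from dataclasses import dataclass

ARITY_NAMES = {1: "unary", 2: "binary", 3: "ternary", 4: "quaternary"}


@dataclass(frozen=True)
class FactType:
    """A fact type: a sentence form connecting one or more noun concepts."""
    reading: str     # the sample sentence, e.g. "User reports Incident."
    concepts: tuple  # the noun concepts (terms) that take part

    def arity(self):
        return len(self.concepts)

    def kind(self):
        return ARITY_NAMES.get(self.arity(), f"{self.arity()}-ary")


# The four samples from Table 5.
facts = [
    FactType("Service Desk Engineer is available.", ("Service Desk Engineer",)),
    FactType("User reports Incident.", ("User", "Incident")),
    FactType("User reports Incident for Asset.", ("User", "Incident", "Asset")),
    FactType("User reports Incident for Asset at Date.",
             ("User", "Incident", "Asset", "Date")),
]

# Fact types with more than two concepts are candidates for simplification.
to_review = [fact.reading for fact in facts if fact.arity() > 2]

for fact in facts:
    print(f"{fact.kind():<11} {fact.reading}")
```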
The techniques mentioned work better with English than with, for example, everyday German,
because German allows more variation in how sentences are constructed. In any language it is
good practice to choose simple sentence constructions as early as the information collection
phase. At the same time, the reader should still accept the text as normal, understandable
text. Textual analysis is a creative technique, not pure mechanics.
3.2.3 Business Rules, RuleSpeak® and Textual Analysis
RuleSpeak® is a method for expressing business rules in natural language⁸. Expressing rules
in natural language ensures that the rule statements are understandable to a business person.
At the same time, using a regulated vocabulary and the RuleSpeak sentence patterns makes it
possible to check the quality and consistency of the rules.
Using textual analysis to identify and formulate business rules includes the following
steps:
1. Identify statements representing business rules and advice.
2. Transform (or better, “normalize”) the statements to make them conform to the RuleSpeak
guidelines (use of rule keywords, decomposition of rules).
The document “Basic RuleSpeak® Guidelines - Do’s and Don’ts in Expressing Natural-Language
Business Rules in English” is a good introduction to this topic.
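The first step, finding statements that may represent rules or advice, can be supported by a simple keyword scan. The following Python sketch is only an illustration: the keyword lists are assumptions made for this example and are not the official RuleSpeak keyword set, the function name `classify_statement` is hypothetical, and the sample sentences are invented. The result is a list of candidates for review; normalizing them according to the RuleSpeak guidelines remains manual work.

```python
import re

# Illustrative keyword lists (assumptions for this sketch, not the official
# RuleSpeak keyword set): obligation words hint at rules, permission words
# hint at advice.
RULE_HINTS = ("must", "shall", "only", "has to", "have to")
ADVICE_HINTS = ("may", "need not", "can")


def _contains(sentence, phrase):
    # Whole-word, case-insensitive match, so "can" does not match "cannot".
    pattern = rf"\b{re.escape(phrase)}\b"
    return re.search(pattern, sentence, re.IGNORECASE) is not None


def classify_statement(sentence):
    """Return 'rule', 'advice' or 'other' for a single sentence."""
    if any(_contains(sentence, hint) for hint in RULE_HINTS):
        return "rule"
    if any(_contains(sentence, hint) for hint in ADVICE_HINTS):
        return "advice"
    return "other"


# Hypothetical statements from an information collection.
statements = [
    "An incident must be assigned to a service desk engineer.",
    "A user may report an incident by phone.",
    "The user reports an incident.",
]
labels = [classify_statement(statement) for statement in statements]
for statement, label in zip(statements, labels):
    print(f"{label:<7} {statement}")
```

Rule hints are checked before advice hints, so a sentence such as "A user may close an incident only if it is resolved" is reported as a rule candidate rather than as advice.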
3.2.4 Textual Analysis and Process Decomposition
The BCS Modeling Process is a process-driven approach. This means the business process is
the central element of the approach; it is the “motivating” element. Other models and model
elements are used to describe the business environment (the enterprise) or are derived from
the business activities.
Business processes are presented at three levels of detail:
1. Structure Level: General process structure
2. Management Level: Refinement of the sub processes of the structure level
3. Task Level: Refinement of the activities of level two down to the (atomic) task level
This three-level presentation of processes can be reached by decomposition (top-down) or
composition (bottom-up). Our intended approach is decomposition from the macro level
to the task level.
7 See Semantics of Business Vocabulary and Business Rules (SBVR), v1.0, OMG Document Number: formal/2008-01-02, Annex I
8 Details about RuleSpeak® can be found at www.rulespeak.com.
Process models are presented using the Business Process Modeling Notation (BPMN). Textual
analysis for BPMN elements was shown in section 2.3.
Here we change the approach again to be more goal-oriented: we ask for content, not for
model elements. In our projects we identify the model elements in the following order:
1. General participants in the process (presented by pools)
2. Roles (presented by lanes)
3. Process events, Milestones (presented by start, intermediate and end events)
4. Business Activities (presented by sub processes and tasks)
5. Order (sequence flow) of Business Activities, including alternative, parallel and optional flows (presented by sequence flows and gateways)
6. Information Exchange and Interfaces between Process Participants (presented by message flows)
7. Information Objects (presented by data objects)
The order of the items in the list above is not a coincidence. For both theoretical and
practical reasons, this order has proven to lead to a well-structured and maintainable
process model.
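The order described above can be captured as data and used to sort the findings of an information collection before modeling starts. The following Python sketch is a minimal illustration: the category labels, the function name `sort_findings` and the sample findings are assumptions made for this example, while the pairing of content with BPMN elements follows the list above.

```python
# Content categories in the order listed above, with the BPMN elements
# that present them.
IDENTIFICATION_ORDER = [
    ("participant", "pool"),
    ("role", "lane"),
    ("event or milestone", "start, intermediate or end event"),
    ("business activity", "sub process or task"),
    ("order of activities", "sequence flow or gateway"),
    ("information exchange", "message flow"),
    ("information object", "data object"),
]

RANK = {category: index for index, (category, _) in enumerate(IDENTIFICATION_ORDER)}
BPMN_ELEMENT = dict(IDENTIFICATION_ORDER)


def sort_findings(findings):
    """Sort (name, category) findings into the recommended working order.

    sorted() is stable, so findings of the same category keep their
    original relative order.
    """
    return sorted(findings, key=lambda finding: RANK[finding[1]])


# Hypothetical findings in the order in which they were collected.
findings = [
    ("Incident", "information object"),
    ("Service Desk Engineer", "role"),
    ("Incident reported", "event or milestone"),
    ("Service Desk", "participant"),
]
ordered = sort_findings(findings)
for name, category in ordered:
    print(f"{name}: {category} -> {BPMN_ELEMENT[category]}")
```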
Our questionnaire for collecting information about a process under investigation contains
the following questions:
- What is the benefit of the process for the customer of the process? What is the intended result of the process?
- Name the process participants.
- Classify the participants as internal or external.
- Name roles⁹ within the internal participants.
- Which business events cause the process to start?
- Which business events represent reaching the intended result of the process?
- Are there important milestones within the process?
- What are the activities within the process?
- Assign the activities to the process phases: Which activities are executed between start and milestone 1, between milestone 1 and 2 …
- Give a short description of the activities. What is the expected result of the activity¹⁰?
- Which preconditions exist for the execution of the activity? Which exceptions can occur during execution of the activity?
- Which information objects are needed to execute the activity? Which information objects are changed or produced by the activity?
We define guidelines for how many activities we want to see in a process model, depending
on the level of detail (macro process level, sub process level, task level). For the macro
level, for example, the guideline is to name no more than 10 sub processes. Tasks are not
allowed on the macro process level.
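Guidelines like these can be checked mechanically once a model exists. The following Python sketch assumes a very small, hypothetical data structure (`ProcessDiagram` and `Activity`) that is not tied to any modeling tool, and it only encodes the two macro-level guidelines stated above. Limits for the other levels are not given in the text, so the sketch does not check them. The activity names in the usage example are invented.

```python
from dataclasses import dataclass, field

MAX_MACRO_SUB_PROCESSES = 10  # guideline stated in the text for the macro level


@dataclass
class Activity:
    name: str
    kind: str  # "sub process" or "task"


@dataclass
class ProcessDiagram:
    name: str
    level: str  # "macro", "sub process" or "task"
    activities: list = field(default_factory=list)


def check_guidelines(diagram):
    """Return a list of guideline violations (empty if there are none)."""
    problems = []
    if diagram.level != "macro":
        return problems  # the text states limits for the macro level only
    sub_processes = [a for a in diagram.activities if a.kind == "sub process"]
    tasks = [a for a in diagram.activities if a.kind == "task"]
    if len(sub_processes) > MAX_MACRO_SUB_PROCESSES:
        problems.append(
            f"{diagram.name}: {len(sub_processes)} sub processes, "
            f"guideline is at most {MAX_MACRO_SUB_PROCESSES}"
        )
    for task in tasks:
        problems.append(
            f"{diagram.name}: task '{task.name}' is not allowed on the macro level"
        )
    return problems


# Hypothetical macro process with one misplaced task.
macro = ProcessDiagram(
    "Incident Management",
    "macro",
    [Activity("Record Incident", "sub process"), Activity("Call User", "task")],
)
problems = check_guidelines(macro)
for problem in problems:
    print(problem)
```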
There are strong logical connections between the questions. For example, to name roles it is
very helpful to know and classify the activities within the process. I point again to our
iterative approach: the order of the questions is not a dogma.
9 This question assumes a common understanding of the term “Role”. This is not discussed here.
10 We see a strong connection to the fact model. The result must be presented in the fact model.