Institute of Scientific Computing – University of Vienna. Data Analysis for Decision and Management Processes. Univ.-Prof. Dr. Peter Brezany, Institute of Scientific Computing, Faculty for Information Science, University of Vienna. E-mail: [email protected] WWW: http://www.par.univie.ac.at/~brezany http://artemis.wszib.edu.pl/~brezany/

Dec 19, 2015

Data Analysis for Decision and Management Processes

Univ.-Prof. Dr. Peter Brezany

Institute of Scientific Computing

Faculty for Information Science

University of Vienna

E-mail : [email protected]

WWW: http://www.par.univie.ac.at/~brezany

http://artemis.wszib.edu.pl/~brezany/

Institute of Scientific Computing – Research Profile

The primary objectives of the Institute are

- to conduct research in high-performance advanced data analysis, knowledge management, programming languages, compilers, programming environments and software tools for high performance computing systems,

- to actively contribute to the transfer of technology to industry,

- to disseminate knowledge in the fields of parallel and distributed computing and software technology.

Institute for Software Science – Main Research Projects and Cooperations

Participation in 14 EU projects (coordination of 1 project)

The European Centre of Excellence for Parallel Computing, a department of the Institute, founded by the EU

Coordination of the CEI-PACT project (Austria, Slovakia, Czech Republic, Poland, Italy, Hungary, Slovenia)

Special Research Program AURORA of the Austrian Science Fund (1997-2007)

Many international cooperations (NASA, CalTech, CERN, ...)

New Research Field: GRID COMPUTING

The Grid – a new distributed computing infrastructure for science and engineering.

The Grid consists of physical resources (computers, disks, networks, databases, sensors, laboratory equipment) and “middleware” software that ensures access to and the coordinated use of such resources.

Media That Radically Influenced Society

• 1500s: Printing Press
• 1840s: Penny Post
• 1850s: Telegraph
• 1920s: Telephone
• 1930s: Radio
• 1950s: TV
• 1990s: Web
• 20xx: Grid

Outline

• Business Intelligence, knowledge management

• Relation: data, information, knowledge

• Knowledge discovery process – System Architectures

• Data warehousing and data webhousing

• Data preparation:
– selection, preprocessing (cleaning, transformation), integration

• Data mining techniques
– association rules, sequences, classification, prediction, neural networks, clustering, meta-learning

• Advanced topics
– Multi-agent and mobile agent systems
– Web mining
– intelligent search engines
– semantic web
– information and knowledge management on computing grids
– security issues

Basic Literature

Mark and Mary Whitehorn: Business Intelligence: The IBM Solution. Springer-Verlag, 2000.

R. Kimball: The Data Warehouse Toolkit. John Wiley, 1996.

J. Han, M. Kamber: Data Mining: Concepts and Techniques. Morgan Kaufmann Publishers, 2000.

M. Ester, J. Sander: Knowledge Discovery in Databases. Springer-Verlag, 2000 (in German).

I.H. Witten, E. Frank: Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations. Morgan Kaufmann Publishers, 2000.

Time Schedule

• Monday, Feb 27: 17.15 – 20.30 (4 hours)

• Tuesday, Feb 28: 10.00 – 13.15 (4 hours)

• Wednesday, Mar 01: 15.30 – 18.45 (3 hours)

• Thursday, Mar 02: 16.00 – 18.15 (3 hours)

Location: s.1 AK4

Business Intelligence

Definition:

Business Intelligence is an umbrella term that broadly covers two things: the processes involved in extracting valuable business information and knowledge from the mass of data that exists within a typical enterprise, and knowledge management (storing knowledge in an appropriate form and distributing it).

What is meant by information and knowledge? This is best understood by imagining a chain linking data → information → knowledge.

Data → Information → Knowledge

• Data are the facts about events or processes.

• Information is the organization of, associations between, and constraints upon data that allow it to be used by a user or a machine.

• Knowledge is the interpretation of information and its use in a problem solving context. Knowledge can lead to new insights, which in turn lead to new innovations and ultimately to wealth creation and improvements in the quality of life.

• Wisdom arises when one understands the foundational principles responsible for the patterns that represent knowledge. (She or he can answer questions like “Why ...?” and knows how to find or derive new knowledge.)

Data

Example: When a customer visits a gas station and buys petrol, the transaction can be described with the following data: date/time, volume, price.

However, these data do not say why the customer chose this station rather than another, and they reveal neither whether the customer will come again nor whether the station is good or bad.

Data alone possess almost no meaning or purpose. They are the raw material from which information is obtained.

Information

• A piece of information can be described as a message.

• Like all messages, information has a sender and a receiver.

• Information is meant to shape the receiver's opinion of, or attitude to, a problem and to influence his or her behavior.

• We can also think of information as data that changes, forms, or influences something.

• The word “inform” originally meant “to give form to a thing or a person”.

Information (2)

• Data become information when the receiver adds meaning to them. This upgrading of data can be done in different ways, for example:

– Contextualizing: We know for what purpose the data were collected.

– Calculation: The data are analyzed mathematically and enriched statistically.

– Correction: Errors are removed from the data.

– Condensing: The data are transformed into a more compact form; the main components of the data have to be identified.
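The four operations above can be sketched on the gas station transactions from the earlier slide. This is a minimal illustration only: the `Transaction` class, its field names, and all values are hypothetical and not part of the lecture material.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Transaction:
    timestamp: str  # date/time of the purchase
    litres: float   # volume of petrol sold
    price: float    # total price paid


# Raw data (hypothetical values); the second record holds an obvious entry error.
raw = [
    Transaction("day 1, 08:10", 40.0, 48.0),
    Transaction("day 1, 08:45", -5.0, 6.0),
    Transaction("day 1, 17:30", 30.0, 36.0),
]

# Correction: drop records that cannot be valid.
clean = [t for t in raw if t.litres > 0 and t.price > 0]

# Calculation: derive a statistic that no single record contains.
avg_price_per_litre = mean(t.price / t.litres for t in clean)

# Compact form: reduce the records to a few key figures.
summary = {
    "purpose": "daily sales report",  # contextualizing: why the data was collected
    "transactions": len(clean),
    "litres_sold": sum(t.litres for t in clean),
    "avg_price_per_litre": round(avg_price_per_litre, 2),
}
print(summary)
```

The raw records stay unchanged; the summary is what a receiver would actually read, which is the sense in which the data have been "upgraded".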

Information Management

Information management: all management tasks that deal with information and communication in an enterprise.

Knowledge

• Knowledge is the production factor of the future, which will replace energy and materials.

• Knowledge is produced by mental activity and by processes that model mental activity.

• Transformation process Information → Knowledge:

– Comparison: How should I assess information about the current situation in comparison with other known situations?

– Consequence: How will the information influence decisions and activities?

– Connections: Which relations exist between one concrete information element and another?

– Conversation: What do other people think about a given piece of information?

Knowledge (2)

• People gain knowledge through experience – they see, hear, touch, and taste the world around them.

• We can associate something we see with something we hear, thereby gaining new knowledge about the world.

• Suppose we know that the sun is hot, balls are round, and the sky is blue. These facts are knowledge about the world. How do we store this knowledge in our brain? How could we store this knowledge in a computer?

• This problem, called knowledge representation, is one of the first, most fundamental issues that researchers in artificial intelligence had to face.
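One simple way to store the three facts above in a computer is as (object, attribute, value) triples. The sketch below is only one of many possible representations; the attribute names and the `lookup` helper are assumptions made for illustration.

```python
# Each fact is an (object, attribute, value) triple.
facts = {
    ("sun", "temperature", "hot"),
    ("ball", "shape", "round"),
    ("sky", "color", "blue"),
}


def lookup(obj, attribute):
    """Return the stored value for an object's attribute, or None if unknown."""
    for o, a, v in facts:
        if o == obj and a == attribute:
            return v
    return None


print(lookup("sky", "color"))  # blue
print(lookup("sun", "color"))  # None: nothing is stored about the sun's color
```

The second query shows the limit of such a store: it can only answer what was explicitly entered, which is why richer representations such as rules and semantic nets are studied.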

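Knowledge Pyramid

[Figure: pyramid, levels from bottom to top]

• Characters

• Data (characters + syntax)

• Information (data + semantics, i.e. meaning)

• Knowledge (information + pragmatics, i.e. association with context and experience)

• Action (knowledge + decision)

Knowledge has 3 dimensions: syntax, semantics, and pragmatics.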

Example

• Characters: t i s n i o r o l a n i l w

• Data: With the right syntax (here, the order of the letters), the characters above form the statement “It will rain soon”.

• Information: The above statement means: “Water drops will fall from the sky”.

• Knowledge: The information “Water drops will fall from the sky” is connected with experience and expectations such as: “One can get wet; it can rain into the flat”.

• Action: Based on this knowledge, actions are planned: “I will take an umbrella, I will close the window, etc.”

Knowledge Management

Knowledge management: all management tasks of the enterprise that deal with acquiring, using, and further developing knowledge.

Knowledge Representation

• Procedural representation
– Perhaps the most common technique for representing knowledge in computers is to use procedural knowledge. Procedural code not only encodes facts (constants and variables) but also defines a sequence of operations for using and manipulating those facts. Thus, program code is a perfectly natural way of encoding procedural knowledge. This “hardcoded” logic is typically not considered to be part of artificial intelligence per se.

• Declarative representation
– A user simply states facts, rules, and relationships. However, declarative knowledge must be processed by some procedural code. Most of the knowledge representation techniques studied in artificial intelligence are declarative. Some of them are shown on the following slides.
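The difference between the two representations can be sketched by encoding the same small piece of knowledge twice. The rain scenario and all function and fact names below are hypothetical, chosen only for illustration.

```python
# Procedural: the knowledge is buried in the control flow of the program.
def advise_procedural(raining: bool, window_open: bool) -> list[str]:
    actions = []
    if raining:
        actions.append("take an umbrella")
        if window_open:
            actions.append("close the window")
    return actions


# Declarative: the knowledge is stated as plain data (conditions, action) ...
RULES = [
    ({"raining"}, "take an umbrella"),
    ({"raining", "window_open"}, "close the window"),
]


# ... and a small generic procedure interprets it.
def advise_declarative(facts: set[str]) -> list[str]:
    return [action for conditions, action in RULES if conditions <= facts]


print(advise_procedural(True, True))
print(advise_declarative({"raining", "window_open"}))
```

Both calls give the same advice, but in the declarative version a new rule is added by extending `RULES`, without touching the interpreting code.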

Knowledge Representation - Rules

General form of a predicate logic rule:

if antecedent(s) then consequent(s)

(Other names are also used: precondition instead of antecedent, and conclusion, action, or hypothesis instead of consequent.)

Rules can have the following forms:

• if P then Q

• if P1 and P2 and ... and Pn then Q1 and Q2 and ... and Qm

• if P1 and P2 or ... or Pn then Q

Rules that produce new facts are called production rules.
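The three rule forms can be written down as data, for instance as follows. This is a sketch under two assumptions that the slide does not state: facts are plain strings, and "and" binds more tightly than "or" in the third form. The `Rule` class is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    # Antecedent in disjunctive form: the rule fires if ANY inner set is fully
    # contained in the known facts (each inner set is a conjunction).
    antecedent: tuple[frozenset[str], ...]
    consequents: frozenset[str]

    def fires(self, facts: set[str]) -> bool:
        return any(conjunction <= facts for conjunction in self.antecedent)


# if P then Q
r1 = Rule((frozenset({"P"}),), frozenset({"Q"}))
# if P1 and P2 then Q1 and Q2
r2 = Rule((frozenset({"P1", "P2"}),), frozenset({"Q1", "Q2"}))
# if (P1 and P2) or P3 then Q
r3 = Rule((frozenset({"P1", "P2"}), frozenset({"P3"})), frozenset({"Q"}))

facts = {"P3"}
if r3.fires(facts):
    facts |= r3.consequents  # a production rule adds new facts
print(sorted(facts))  # ['P3', 'Q']
```

Here `r3` behaves as a production rule: applying it to the fact `P3` produces the new fact `Q`.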

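Rules (2): Architecture of a Production System

[Figure: a production system. The knowledge base holds the rules, the fact base holds the facts, and the inference mechanism repeatedly runs the cycle recognize → select → act.]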
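The recognize, select, act cycle of such a system can be sketched in a few lines. The rules, the fact names, and the selection strategy (take the first applicable rule) are assumptions made for this illustration; real production systems use more elaborate conflict resolution.

```python
# Knowledge base: (name, antecedents, consequents); all names are hypothetical.
RULES = [
    ("r1", {"clouds", "falling_pressure"}, {"rain_soon"}),
    ("r2", {"rain_soon"}, {"take_umbrella"}),
    ("r3", {"rain_soon", "window_open"}, {"close_window"}),
]


def run(initial_facts):
    facts = set(initial_facts)  # fact base (working copy)
    trace = []
    while True:
        # Recognize: rules whose antecedents hold and that would add a new fact.
        conflict_set = [r for r in RULES if r[1] <= facts and not r[2] <= facts]
        if not conflict_set:
            return facts, trace
        # Select: simplest possible strategy, the first rule in listed order.
        name, _, consequents = conflict_set[0]
        # Act: add the consequents of the selected rule to the fact base.
        facts |= consequents
        trace.append(name)


facts, trace = run({"clouds", "falling_pressure", "window_open"})
print(trace)  # ['r1', 'r2', 'r3']
```

The loop terminates because every pass adds at least one new fact and the set of derivable facts is finite.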

Semantic Nets

Semantic nets are used to define the meaning of a concept by its relationships to other concepts.

A graph data structure is used, with nodes holding concepts and links with natural language labels showing the relationships.

A portion of a semantic net representation of the vehicle domain is shown in the next slide.

Remark: The standard relationships such as is-a, has-part, and instance should be familiar to readers with object-oriented design experience.

A Semantic Net Example

[Figure: semantic net of the vehicle domain. Nodes: Vehicle, Automobile, Sports Car, Corvette, Motor, Wheels, Doors, Small, 2, 4. Link labels: is-a (twice), instance, has-part (three times), size, num-doors, num-wheels.]
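A semantic net of this kind maps naturally onto a list of labeled edges. The transcript preserves only the node names and link labels, not which link connects which nodes, so the pairing in `EDGES` below is an assumed reading of the figure, and the `lookup` helper is hypothetical.

```python
from collections import defaultdict

# Labeled, directed edges: (source concept, relation, target concept).
# The pairing of labels with nodes is an assumption, not taken from the figure.
EDGES = [
    ("Automobile", "is-a", "Vehicle"),
    ("Sports Car", "is-a", "Automobile"),
    ("Corvette", "instance", "Sports Car"),
    ("Automobile", "has-part", "Motor"),
    ("Automobile", "has-part", "Wheels"),
    ("Automobile", "has-part", "Doors"),
    ("Automobile", "num-wheels", "4"),
    ("Sports Car", "num-doors", "2"),
    ("Sports Car", "size", "Small"),
]

net = defaultdict(list)
for source, relation, target in EDGES:
    net[source].append((relation, target))


def lookup(concept, relation):
    """Collect values for a relation, inheriting along instance and is-a links."""
    values = []
    while concept is not None:
        values += [t for r, t in net[concept] if r == relation]
        parents = [t for r, t in net[concept] if r in ("instance", "is-a")]
        concept = parents[0] if parents else None
    return values


print(lookup("Corvette", "num-doors"))  # ['2']
print(lookup("Corvette", "has-part"))   # ['Motor', 'Wheels', 'Doors']
```

Nothing is stored directly on Corvette except its instance link; both answers are inherited from more general concepts, which is the main benefit of the is-a hierarchy.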

Business Intelligence Tools

• Data warehouses

• OLAP (On-Line Analytical Processing) tools

• Data mining tools

• Text mining tools

• Data joiners

• Business Intelligence portals, etc.

Business Intelligence Tools (cont.)

• Data warehouse – a repository of multiple heterogeneous data sources, organized under a unified schema at a single site in order to facilitate management decision making.

• OLAP – analysis techniques with functionalities such as summarization, consolidation, and aggregation, as well as the ability to view information from different angles.

• Data mining – extracting or “mining” knowledge from large data sets.

• Text mining – “mining” large textual (document) databases. Related term – web mining.

• Data joiner – working with data from disparate, heterogeneous data sources.

• Business Intelligence portal – a Web site designed to be the first point of entry for visitors to information about a company. With the help of the portal's personalization functions, users can choose the information sources they need for a specific task. The portal gives trouble-free access to valuable information and data analyses, which improves the basis for competent decisions.
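The OLAP idea of aggregating data and viewing it from different angles can be sketched with plain Python. The sales records, the dimension names, and the `roll_up` helper are hypothetical; real OLAP tools operate on a data warehouse rather than on an in-memory list.

```python
from collections import defaultdict

# Hypothetical fact table: (region, product, quarter, revenue).
SALES = [
    ("North", "petrol", "Q1", 100),
    ("North", "diesel", "Q1", 60),
    ("South", "petrol", "Q1", 80),
    ("North", "petrol", "Q2", 120),
    ("South", "diesel", "Q2", 40),
]
DIMENSIONS = {"region": 0, "product": 1, "quarter": 2}


def roll_up(*dims):
    """Sum revenue grouped by the chosen dimensions (one 'angle' on the data)."""
    totals = defaultdict(int)
    for row in SALES:
        key = tuple(row[DIMENSIONS[d]] for d in dims)
        totals[key] += row[3]
    return dict(totals)


print(roll_up("region"))              # {('North',): 280, ('South',): 120}
print(roll_up("product", "quarter"))  # revenue per product and quarter
print(roll_up())                      # {(): 400}, the grand total
```

Each call groups the same records by a different set of dimensions; calling `roll_up` with no dimensions consolidates everything into a single total.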