The Information School of the University of Washington INFO-340: Database Management & Information Retrieval David Hendry Class L-02

Dec 30, 2015

Winifred Lamb
Transcript
Page 1: The Information School of the University of Washington INFO-340: Database Management & Information Retrieval David Hendry Class L-02.

The Information School of the University of Washington

INFO-340: Database Management & Information Retrieval

David Hendry
Class L-02

Page 2: The Information School of the University of Washington INFO-340: Database Management & Information Retrieval David Hendry Class L-02.

Topics

• Information Systems
• Database systems: Short History
• Three-level ANSI-SPARC Architecture
• Functions of a DBMS

Page 3: The Information School of the University of Washington INFO-340: Database Management & Information Retrieval David Hendry Class L-02.

Q & A

Syllabus

Assignment #1

Page 4: The Information School of the University of Washington INFO-340: Database Management & Information Retrieval David Hendry Class L-02.

Information Systems

Page 5: The Information School of the University of Washington INFO-340: Database Management & Information Retrieval David Hendry Class L-02.

Information Systems

• Examples
  – Airline reservation system
  – ATM network
  – File system on a PC
  – CD collection at home
  – Museum or art gallery
  – Website
  – File sharing system
  – A personal stamp collection or family scrapbook

Page 6: The Information School of the University of Washington INFO-340: Database Management & Information Retrieval David Hendry Class L-02.

• An Information System: The resources that enable the collection, management, control, and dissemination of information throughout an organization

Page 7: The Information School of the University of Washington INFO-340: Database Management & Information Retrieval David Hendry Class L-02.

Components of an Information System

• Stakeholders
  – Management
  – Division workers
  – Customers
  – Partners

• Inputs & Outputs
  – Traffic
  – Sales

• Data
  – Plans
  – Calendars & events
  – Part assemblies
  – Business transactions

• Procedures
  – Updating data
  – Transferring data

Page 8: The Information School of the University of Washington INFO-340: Database Management & Information Retrieval David Hendry Class L-02.

Components of Systems

[Diagram: a system in its environment. Labels: Supplier, Input, System, Process, Output, Customer, Stakeholder, Environment]

Page 9: The Information School of the University of Washington INFO-340: Database Management & Information Retrieval David Hendry Class L-02.

[Diagram. Labels: System, Sub-system, Boundary]

Page 10: The Information School of the University of Washington INFO-340: Database Management & Information Retrieval David Hendry Class L-02.

Three Key Ideas

• Systems are hierarchical
  – Systems consist of sub-systems

• Systems are nearly decomposable
  – Interaction between subsystems is weak

• System boundaries are arbitrary
  – Where you set a boundary requires judgment

Page 11: The Information School of the University of Washington INFO-340: Database Management & Information Retrieval David Hendry Class L-02.

Class Exercise: Museum as Information System

• What questions should you answer?

Page 12: The Information School of the University of Washington INFO-340: Database Management & Information Retrieval David Hendry Class L-02.

Museum as Information System

• Who are the stakeholders?
• What is the environment?
• What are the inputs, processes & outputs?
• Where are the system boundaries?
• How does the system decompose hierarchically?
• Where does the strict decomposition fail?
• Where are the feedback loops?

Page 13: The Information School of the University of Washington INFO-340: Database Management & Information Retrieval David Hendry Class L-02.

Components of Systems

• Environment: Where the system operates
• System: Interacting components that work together to complete a function
• Subsystem: A system is made up of other systems (HIERARCHICAL)
• Boundary: What is inside and outside the system
• Inputs & Outputs: Material flowing into and out of a system
• Process: What gets done?

Page 14: The Information School of the University of Washington INFO-340: Database Management & Information Retrieval David Hendry Class L-02.

Development Lifecycle

Page 15: The Information School of the University of Washington INFO-340: Database Management & Information Retrieval David Hendry Class L-02.

Development Lifecycle

• Define: Vision/scope, needs assessment
  – Deliverable: Vision/scope document

• Design: Invent the technological solution
  – Deliverable: Design specifications document

• Develop: Build the technology
  – Deliverable: Beta software

• Deploy: Deliver stable technology
  – Deliverable: Version release

Page 16: The Information School of the University of Washington INFO-340: Database Management & Information Retrieval David Hendry Class L-02.

Database Development

1. Analysis of functional requirements
2. Conceptual design
3. Logical design
4. Physical design
5. Implement
6. Test
7. Maintain

Page 17: The Information School of the University of Washington INFO-340: Database Management & Information Retrieval David Hendry Class L-02.

Database Systems

Page 18: The Information School of the University of Washington INFO-340: Database Management & Information Retrieval David Hendry Class L-02.

Evolution of Database Systems

• File-based systems (1950s – now)
  – Application programs process files

• 1st Generation (mid 1960s – mid 1980s)
  – Hierarchical & Network databases

• 2nd Generation (mid 1970s – now)
  – Relational database systems

• 3rd Generation (early 1990s – now)
  – Object-oriented database systems

Page 19: The Information School of the University of Washington INFO-340: Database Management & Information Retrieval David Hendry Class L-02.

File Systems

• Application programs manage their own data files and produce reports

• Collections of programs were often based on functional areas (e.g., payroll vs. personnel)

Page 20: The Information School of the University of Washington INFO-340: Database Management & Information Retrieval David Hendry Class L-02.

File-Based Data Processing

[Diagram: two file-based systems, labeled S1 and S2. Labels: Payroll System (Personal Data, Tax Data); Project Management System (Projects Data, Personal Data)]

Page 21: The Information School of the University of Washington INFO-340: Database Management & Information Retrieval David Hendry Class L-02.

Weaknesses

• Program-data dependence
• Separation and isolation of data
• Duplication of data
• Incompatibility of files
• Many, many application programs
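
As a rough illustration of program-data dependence and file incompatibility, the sketch below hard-codes a record layout inside the program. The layout, field sizes, and names (`PAYROLL_FORMAT`, `write_record`, `read_record`) are assumptions made up for this example and are not from the slides.

```python
import struct

# Hypothetical fixed-width payroll record: 10-byte name, 4-byte integer salary.
PAYROLL_FORMAT = "<10si"

def write_record(name, salary):
    """Pack one employee record in the layout the payroll program expects."""
    return struct.pack(PAYROLL_FORMAT, name.encode("ascii"), salary)

def read_record(raw):
    """Unpack a record; this only works if the bytes follow PAYROLL_FORMAT."""
    name, salary = struct.unpack(PAYROLL_FORMAT, raw)
    return name.rstrip(b"\x00").decode("ascii"), salary

employee = read_record(write_record("Ada", 5000))

# A second program that stores a 20-byte name produces an incompatible file.
other_record = struct.pack("<20si", b"Ada", 5000)
try:
    read_record(other_record)
    incompatible = False
except struct.error:
    incompatible = True
```

Because the format lives in the program, any change to the file layout means editing and redeploying every program that reads it.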

Page 22: The Information School of the University of Washington INFO-340: Database Management & Information Retrieval David Hendry Class L-02.

Key Lessons Learned

1. Program-data independence is good
  – Programs should not be responsible for the definition of data formats

2. Centralized control of data access is good
  – Programs should not be responsible for security, access control, and certain kinds of data integrity

Page 23: The Information School of the University of Washington INFO-340: Database Management & Information Retrieval David Hendry Class L-02.

1st Generation: Record-Based DBMS

• To address these problems, two types of databases were developed in the 1960s and early 1970s
  – Network data models
  – Hierarchical data models

Page 24: The Information School of the University of Washington INFO-340: Database Management & Information Retrieval David Hendry Class L-02.

Hierarchical/Network Data Model

[Diagram. Labels: Courses, Students]

• Collections of ‘records’
• Pointers used to create ‘sets’

Page 25: The Information School of the University of Washington INFO-340: Database Management & Information Retrieval David Hendry Class L-02.

Lessons Learned

• Better on
  – Data independence
  – Sharing data

• However, complex application programming
  – Chasing ‘pointers’ to navigate data
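
To give a feel for what chasing pointers means for the application programmer, here is a minimal sketch that imitates navigational access with plain Python dictionaries. The record layout and the names `first_student` and `next` are hypothetical stand-ins for stored pointers, not the format of any real hierarchical or network DBMS.

```python
# Hypothetical in-memory records; the "next" fields stand in for stored pointers.
students = {
    "s1": {"name": "Ana", "next": "s2"},
    "s2": {"name": "Ben", "next": None},
}
courses = {
    "c1": {"title": "INFO-340", "first_student": "s1"},
}

def students_in(course_id):
    """Walk the pointer chain from a course to each of its student records."""
    names = []
    pointer = courses[course_id]["first_student"]
    while pointer is not None:
        record = students[pointer]
        names.append(record["name"])
        pointer = record["next"]
    return names

roster = students_in("c1")
```

The program has to know how the records are linked, so the navigation logic changes whenever the linkage does.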

Page 26: The Information School of the University of Washington INFO-340: Database Management & Information Retrieval David Hendry Class L-02.

2nd Generation: Relational Model

• Data modeled as tables, rows, columns
• No pointer chasing
• Grounded in theory (relational algebra)
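
A small sketch of the relational approach, using Python's built-in `sqlite3` module as a stand-in DBMS. The table and column names are assumptions chosen for illustration. The query states which rows are wanted and leaves the access path to the DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE course (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE enrollment (student_id INTEGER, course_id INTEGER);
    INSERT INTO student VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO course VALUES (1, 'INFO-340');
    INSERT INTO enrollment VALUES (1, 1), (2, 1);
""")

# Declarative query: no pointers are followed by the application.
rows = conn.execute("""
    SELECT s.name
    FROM student AS s
    JOIN enrollment AS e ON e.student_id = s.id
    JOIN course AS c ON c.id = e.course_id
    WHERE c.title = 'INFO-340'
    ORDER BY s.name
""").fetchall()
roster = [name for (name,) in rows]
```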

Page 27: The Information School of the University of Washington INFO-340: Database Management & Information Retrieval David Hendry Class L-02.

3rd Generation: Object-Oriented Database Management Systems

• Domain objects (entities, relationships, etc.) modeled directly rather than with tables, rows, columns

• Very important in Engineering Domains

Page 28: The Information School of the University of Washington INFO-340: Database Management & Information Retrieval David Hendry Class L-02.

Three-level ANSI-SPARC architecture

Page 29: The Information School of the University of Washington INFO-340: Database Management & Information Retrieval David Hendry Class L-02.

Page 30: The Information School of the University of Washington INFO-340: Database Management & Information Retrieval David Hendry Class L-02.

External Level

• Different users require different data views
  – Specific information for goals, job roles, etc.

• Some information is derived/calculated
  – Dynamic calculations (age)
  – Complex combinations of data
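
One common way to realize an external view is an SQL view. The sketch below is only illustrative: the `staff` table, the `staff_directory` view, and the fixed reference year 2015 (used so the derived age is deterministic, and only approximate) are assumptions, not part of the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staff (
        id INTEGER PRIMARY KEY, name TEXT, birth_year INTEGER, salary INTEGER
    );
    INSERT INTO staff VALUES (1, 'Ana', 1980, 5000);

    -- External view for a hypothetical directory user:
    -- salary is hidden and age is derived rather than stored.
    CREATE VIEW staff_directory AS
        SELECT name, 2015 - birth_year AS age FROM staff;
""")

cursor = conn.execute("SELECT * FROM staff_directory")
columns = [d[0] for d in cursor.description]
directory = cursor.fetchall()
```

A payroll user could be given a different view over the same table, which is the sense in which different users see different data.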

Page 31: The Information School of the University of Washington INFO-340: Database Management & Information Retrieval David Hendry Class L-02.

Conceptual Level

• What data is stored and the relationships between the data

• Key concerns:
  – Entities, attributes, relationships
  – Data types
  – Constraints
  – Security and integrity info

Page 32: The Information School of the University of Washington INFO-340: Database Management & Information Retrieval David Hendry Class L-02.

Internal Level

• How the data is stored
  – Optimal run-time performance
  – Optimal space utilization

• Key concerns:
  – Storage space for data and indices
  – Record size and placement
  – Data compression and encryption

Page 33: The Information School of the University of Washington INFO-340: Database Management & Information Retrieval David Hendry Class L-02.

Schemas: Contain information for mapping from one level to the next

Page 34: The Information School of the University of Washington INFO-340: Database Management & Information Retrieval David Hendry Class L-02.

Data Independence

• Logical data independence
  Changes in the conceptual schema do not cause the external schemas to ‘break’ (if they fail, they fail gracefully)

• Physical data independence
  Changes to the internal schema do not cause the conceptual schema to ‘break’
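
Both kinds of independence can be sketched with `sqlite3`, under the assumption that an index stands for an internal-level change and an added column stands for a conceptual-level change. The table, view, and index names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staff (id INTEGER PRIMARY KEY, name TEXT, dept TEXT);
    INSERT INTO staff VALUES (1, 'Ana', 'Sales'), (2, 'Ben', 'IT');
    CREATE VIEW staff_names AS SELECT name FROM staff;
""")
query = "SELECT name FROM staff_names ORDER BY name"
before = conn.execute(query).fetchall()

# Internal-level change: add an index (a storage and performance decision).
conn.execute("CREATE INDEX idx_staff_dept ON staff (dept)")
# Conceptual-level change: add a new attribute to the base table.
conn.execute("ALTER TABLE staff ADD COLUMN phone TEXT")

# The external view and the query written against it are unchanged.
after = conn.execute(query).fetchall()
```

Dropping or renaming the `name` column would be a case where the view does break, which is an example of data dependence.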

Page 35: The Information School of the University of Washington INFO-340: Database Management & Information Retrieval David Hendry Class L-02.

Class Exercise

1. Working in teams of 3-4, select an example database application and sketch a picture of:
  – External schema
  – Conceptual schema
  – Internal schema

2. Give an example of data independence and data dependence

Page 36: The Information School of the University of Washington INFO-340: Database Management & Information Retrieval David Hendry Class L-02.

Functions of DBMS

(See Chapter #2)

Page 37: The Information School of the University of Washington INFO-340: Database Management & Information Retrieval David Hendry Class L-02.

Functions of DBMS

1. Data storage, retrieval, and update
2. A user-accessible catalog
3. Transaction support
4. Concurrency control
5. Recovery services
6. Authorization services
7. Support for data communication
8. Integrity services

Page 38: The Information School of the University of Washington INFO-340: Database Management & Information Retrieval David Hendry Class L-02.

Summary

• Evolution of Database Systems
  – File-based
  – 1st – 3rd generation systems

• Three-level ANSI-SPARC Architecture

• Functions of a DBMS

Page 39: The Information School of the University of Washington INFO-340: Database Management & Information Retrieval David Hendry Class L-02.

Data storage, retrieval, and update

Ability to store, retrieve and update data

Key idea: Hide internal representation of how this is achieved

Page 40: The Information School of the University of Washington INFO-340: Database Management & Information Retrieval David Hendry Class L-02.

A user-accessible catalog

• Provide users with a catalog that completely describes the database
  – Tables and relationships
  – Names, types and sizes of data items
  – Etc.

• Purposes:
  – “Self revealing” for understanding data
  – Data integrity and security are enforced
  – Store auditing information
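
As one concrete example of a user-accessible catalog, SQLite exposes its schema through the `sqlite_master` table and the `PRAGMA table_info` command. The sketch below queries both for a hypothetical `staff` table. Other DBMSs expose their catalogs differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# sqlite_master is SQLite's built-in catalog of tables, indexes, views, triggers.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]

# PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(staff)")]
```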

Page 41: The Information School of the University of Washington INFO-340: Database Management & Information Retrieval David Hendry Class L-02.

Transaction support

• A transaction is a series of actions
  – Example: Staff member quits
    1. Delete staff member from database
    2. Re-assign responsibilities to another staff member

• Issue: Must avoid putting the database into an inconsistent state

• Thus: All steps of a transaction are completed or none are completed
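
The staff-member-quits example can be sketched with `sqlite3`, whose connection object works as a context manager that commits on success and rolls back on an exception. The schema and the function name `staff_member_quits` are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staff (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE task (id INTEGER PRIMARY KEY, owner_id INTEGER NOT NULL);
    INSERT INTO staff VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO task VALUES (10, 1);
""")

def staff_member_quits(conn, leaver_id, new_owner_id):
    """Delete a staff member and reassign their tasks as one transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("DELETE FROM staff WHERE id = ?", (leaver_id,))
            conn.execute("UPDATE task SET owner_id = ? WHERE owner_id = ?",
                         (new_owner_id, leaver_id))
        return True
    except sqlite3.IntegrityError:
        return False

def staff_count(conn):
    return conn.execute("SELECT COUNT(*) FROM staff").fetchone()[0]

# Reassigning to None violates NOT NULL, so the earlier delete is undone too.
first_attempt = staff_member_quits(conn, 1, None)
count_after_failure = staff_count(conn)

second_attempt = staff_member_quits(conn, 1, 2)
count_after_success = staff_count(conn)
```

After the failed attempt both staff rows are still present, so the database never shows a deleted staff member whose tasks have no owner.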

Page 42: The Information School of the University of Washington INFO-340: Database Management & Information Retrieval David Hendry Class L-02.

Concurrency control

• Ensuring that multiple users do not conflict with each other and put the database into an inconsistent state

• Easy for read-only situations
• Hard when multiple users can read and write
• See lost-update problem
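
The lost-update problem can be imitated in a single script by interleaving two users' steps by hand. This is a simplified sketch on one connection, not a real multi-user test, and the `account` table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100)")

def read_balance():
    return conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]

# Lost update: users A and B both read 100, then each writes back a value
# computed from that stale read.
seen_by_a = read_balance()
seen_by_b = read_balance()
conn.execute("UPDATE account SET balance = ? WHERE id = 1", (seen_by_a + 10,))
conn.execute("UPDATE account SET balance = ? WHERE id = 1", (seen_by_b + 20,))
lost_update_balance = read_balance()  # A's +10 has been overwritten by B

# One way to avoid it: let the DBMS do the read-modify-write in one statement.
conn.execute("UPDATE account SET balance = 100 WHERE id = 1")
conn.execute("UPDATE account SET balance = balance + 10 WHERE id = 1")
conn.execute("UPDATE account SET balance = balance + 20 WHERE id = 1")
safe_balance = read_balance()
```

Real systems generally rely on locking or similar concurrency-control mechanisms for the same purpose. The single-statement update is just the simplest case to show.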

Page 43: The Information School of the University of Washington INFO-340: Database Management & Information Retrieval David Hendry Class L-02.

Recovery services

• Databases ‘crash’
  – Power goes out
  – Disks and CPUs fail
  – Intruders cause systems to fail
  – Etc.

• Provide a method for recovering the database and returning it to a consistent state

Page 44: The Information School of the University of Washington INFO-340: Database Management & Information Retrieval David Hendry Class L-02.

Authorization services

• Depending on job role, users have access to different information and operations
  – Querying data
  – Changing data
  – Deleting data
  – Adding data

• Must be able to give ‘access permissions’ to people
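
The idea of role-based access permissions can be sketched in a few lines. The roles, operation names, and the `is_allowed` helper below are hypothetical. Many SQL DBMSs provide this through `GRANT` and `REVOKE` statements rather than application code.

```python
# Hypothetical roles and the operations each one has been granted.
PERMISSIONS = {
    "clerk": {"query", "add"},
    "manager": {"query", "add", "change", "delete"},
}

def is_allowed(role, operation):
    """Return True if the role has been granted the operation."""
    return operation in PERMISSIONS.get(role, set())

clerk_can_query = is_allowed("clerk", "query")
clerk_can_delete = is_allowed("clerk", "delete")
guest_can_query = is_allowed("guest", "query")  # unknown roles get nothing
```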

Page 45: The Information School of the University of Washington INFO-340: Database Management & Information Retrieval David Hendry Class L-02.

Support for data communication

• Ability to access central databases from remote client locations
  – This idea, of course, ‘powers the web’

• Databases must handle requests and responses

Page 46: The Information School of the University of Washington INFO-340: Database Management & Information Retrieval David Hendry Class L-02.

Integrity services

• Rules that specify the valid states of the data within the database

• Examples
  – Every employee must have a manager
  – Managers supervise a max of 10 employees
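
Both example rules can be expressed declaratively in `sqlite3`, as a rough sketch: a `NOT NULL` foreign key for the first rule and a trigger for the second. The schema, the trigger name `max_ten_reports`, and the `try_insert` helper are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE manager (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE employee (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        -- Rule 1: every employee must have an existing manager.
        manager_id INTEGER NOT NULL REFERENCES manager (id)
    );
    -- Rule 2: a manager supervises at most 10 employees.
    CREATE TRIGGER max_ten_reports BEFORE INSERT ON employee
    WHEN (SELECT COUNT(*) FROM employee WHERE manager_id = NEW.manager_id) >= 10
    BEGIN
        SELECT RAISE(ABORT, 'manager already supervises 10 employees');
    END;
    INSERT INTO manager VALUES (1, 'Ana');
""")

INSERT_SQL = "INSERT INTO employee (name, manager_id) VALUES (?, ?)"

def try_insert(name, manager_id):
    """Attempt an insert; return False if the DBMS rejects it."""
    try:
        with conn:
            conn.execute(INSERT_SQL, (name, manager_id))
        return True
    except sqlite3.DatabaseError:
        return False

no_manager = try_insert("Ben", None)      # rejected: manager is required
unknown_manager = try_insert("Ben", 99)   # rejected: no such manager
results = [try_insert(f"E{i}", 1) for i in range(11)]  # the 11th is rejected
```

Because the rules live in the database, every application that writes to these tables is held to them.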

Page 47: The Information School of the University of Washington INFO-340: Database Management & Information Retrieval David Hendry Class L-02.

4GLs

• High-level applications that are ‘closer’ to users' goals

• Example types (e.g., Access):
  – Form generators
  – Report generators
  – Graphics generators
  – Application generators