IS 257 – Spring 2004 2004.01.20 - SLIDE 1 Database Management: Introduction Ray R. Larson University of California, Berkeley School of Information Management and Systems SIMS 257: Database Management

Dec 22, 2015

IS 257 – Spring 2004 2004.01.20 - SLIDE 1

Database Management: Introduction

Ray R. Larson

University of California, Berkeley

School of Information Management and Systems

SIMS 257: Database Management

IS 257 – Spring 2004 2004.01.20 - SLIDE 2

Overview

• Announcements
  – TA: Mayjane Co

• Course Description

• Database Concepts and Terminology

• Database Models

IS 257 – Spring 2004 2004.01.20 - SLIDE 3

Course Overview

• Description of the Course

• Assignments

• Readings

• Grading

• Schedule

• Web site: http://sims.berkeley.edu/courses/is257/s04

IS 257 – Spring 2004 2004.01.20 - SLIDE 4

Course Description

• This course is concerned with the design of the database itself -- not with the design of database system software.
  – We will discuss DBMS internals only as they relate to the database and its design and structure

• We will spend a fair amount of time on database application design, especially on options for Web application database support -- but this will not be the primary focus.

IS 257 – Spring 2004 2004.01.20 - SLIDE 5

Assignments

• Two kinds of assignments
  – Using a pre-built database for search and retrieval and database modification queries
  – Designing, populating, and running queries against your own personal database

• Types of database project
  – Individual
    » Work related
    » Course only
    » Projects from around campus that need doing…
  – Group
    » Course related
    » SIMS Final project

IS 257 – Spring 2004 2004.01.20 - SLIDE 6

Readings

• Textbook is:
  – Jeffrey A. Hoffer, Mary B. Prescott and Fred R. McFadden. Modern Database Management (Sixth Edition). Prentice Hall (Pearson Education): Upper Saddle River, NJ, 2002.
  – ISBN 0-13-061183-2

IS 257 – Spring 2004 2004.01.20 - SLIDE 7

Grading

• Grades will be based on:
  – Assignments (30%)
  – Personal/Group Database project (60%)
  – Class participation (10%)
  – (No midterm or final)

IS 257 – Spring 2004 2004.01.20 - SLIDE 8

Schedule

• On the course website:
  – http://sims.berkeley.edu/courses/is257/s04/Schedule.html

IS 257 – Spring 2004 2004.01.20 - SLIDE 9

What is a Database?

IS 257 – Spring 2004 2004.01.20 - SLIDE 10

Files and Databases

• File: A collection of records or documents dealing with one organization, person, area or subject. (Rowley)
  – Manual (paper) files
  – Computer files

• Database: A collection of similar records with relationships between the records. (Rowley)
  – bibliographic, statistical, business data, images, etc.

IS 257 – Spring 2004 2004.01.20 - SLIDE 11

Database

• A Database is a collection of stored operational data used by the application systems of some particular enterprise. (C.J. Date)
  – Paper “Databases”
    • Still contain a large portion of the world’s knowledge
  – File-Based Data Processing Systems
    • Early batch processing of (primarily) business data
  – Database Management Systems (DBMS)

IS 257 – Spring 2004 2004.01.20 - SLIDE 12

Terms and Concepts

• Database Management System -- DBMS
  – Software system used to define, create, maintain and provide controlled access to the database and repository

IS 257 – Spring 2004 2004.01.20 - SLIDE 13

Terms and Concepts

• Repository
  – AKA Data Dictionary
  – The place where all metadata for a particular database is stored
  – May also include information on relationships between files or tables in a particular database

IS 257 – Spring 2004 2004.01.20 - SLIDE 14

Terms and Concepts

• Metadata
  – Data about data
    • In a DBMS, this means all of the characteristics describing the attributes of an entity, e.g.:
      – name of attribute
      – data type of attribute
      – size of the attribute
      – format or special characteristics
  – Characteristics of files or relations
    • name, content, notes, etc.
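
As an illustration of metadata in a relational DBMS, the sketch below asks SQLite (through Python's standard sqlite3 module) for the attribute descriptions it keeps for one table. The `book` table and its columns are hypothetical, chosen only for this example; other DBMSs expose their data dictionary through different commands.

```python
import sqlite3

# Hypothetical table, used only to illustrate metadata.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE book (book_id INTEGER PRIMARY KEY, title VARCHAR(80) NOT NULL)"
)

# PRAGMA table_info returns one row per attribute:
# (cid, name, declared type, notnull flag, default value, pk flag)
metadata = conn.execute("PRAGMA table_info(book)").fetchall()

for cid, name, col_type, notnull, default, pk in metadata:
    flags = []
    if notnull:
        flags.append("NOT NULL")
    if pk:
        flags.append("PRIMARY KEY")
    print(name, col_type, " ".join(flags))
```

The name, declared type (including the size in `VARCHAR(80)`), and special characteristics correspond to the attribute-level metadata listed on the slide.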

IS 257 – Spring 2004 2004.01.20 - SLIDE 15

Why DBMS?

• History
  – 50’s and 60’s: all applications were custom built for particular needs
  – File based
  – Many similar/duplicative applications dealing with collections of business data
  – Early DBMS were extensions of programming languages
  – 1970 - E.F. Codd and the Relational Model
  – 1979 - Ashton-Tate & first Microcomputer DBMS

IS 257 – Spring 2004 2004.01.20 - SLIDE 16

File Based Systems

[Figure: file-based systems, with columns labeled "Application" and "File". Labels: Naughty; Nice; Just what asked for; Coal Estimation; Delivery List; Toys; Addresses; Toys.]

IS 257 – Spring 2004 2004.01.20 - SLIDE 17

From File Systems to DBMS

• Problems with File Processing systems
  – Inconsistent Data
  – Inflexibility
  – Limited Data Sharing
  – Poor enforcement of standards
  – Excessive program maintenance

IS 257 – Spring 2004 2004.01.20 - SLIDE 18

DBMS Benefits

• Minimal Data Redundancy
• Consistency of Data
• Integration of Data
• Sharing of Data
• Ease of Application Development
• Uniform Security, Privacy, and Integrity Controls
• Data Accessibility and Responsiveness
• Data Independence
• Reduced Program Maintenance

IS 257 – Spring 2004 2004.01.20 - SLIDE 19

Terms and Concepts

• Data Independence
  – Physical representation and location of data and the use of that data are separated
    • The application doesn’t need to know how or where the database has stored the data, but just how to ask for it.
    • Moving a database from one DBMS to another should not have a material effect on application programs
    • Recoding, adding fields, etc. in the database should not affect applications
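
A minimal sketch of data independence, assuming SQLite via Python's sqlite3 module and a hypothetical `book` table: the application function names only the attributes it needs, so adding a field and an index to the database leaves its result unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (book_id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO book (book_id, title) VALUES (1, 'New stuff')")


def list_titles(connection):
    """Application code: it says what data it wants, not how it is stored."""
    query = "SELECT book_id, title FROM book ORDER BY book_id"
    return connection.execute(query).fetchall()


before = list_titles(conn)

# Change the database underneath the application:
# add a field and change the physical access path by adding an index.
conn.execute("ALTER TABLE book ADD COLUMN pub_year INTEGER")
conn.execute("CREATE INDEX book_title_idx ON book (title)")

after = list_titles(conn)
print(before == after)
```

An application that used `SELECT *` and relied on column positions would be more exposed to such changes, which is one reason to name attributes explicitly.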

IS 257 – Spring 2004 2004.01.20 - SLIDE 20

Database Environment

[Figure: database environment. Components: CASE Tools; DBMS; User Interface; Application Programs; Repository; Database.]

IS 257 – Spring 2004 2004.01.20 - SLIDE 21

Database Components

[Figure: database components]

• DBMS
  – Design tools
    • Table Creation
    • Form Creation
    • Query Creation
    • Report Creation
    • Procedural language compiler (4GL)
  – Run time
    • Form processor
    • Query processor
    • Report Writer
    • Language Run time
• User Interface
• Applications
• Application Programs
• Database contains:
  – User’s Data
  – Metadata
  – Indexes
  – Application Metadata

IS 257 – Spring 2004 2004.01.20 - SLIDE 22

Types of Database Systems

• PC Databases

• Centralized Database

• Client/Server Databases

• Distributed Databases

• Database Models

IS 257 – Spring 2004 2004.01.20 - SLIDE 23

PC Databases

E.g.: Access, FoxPro, dBase, etc.

IS 257 – Spring 2004 2004.01.20 - SLIDE 24

Centralized Databases

[Figure: Central Computer]

IS 257 – Spring 2004 2004.01.20 - SLIDE 25

Client Server Databases

[Figure: client/server database. Labels: Network; three Clients; Database Server.]

IS 257 – Spring 2004 2004.01.20 - SLIDE 26

Distributed Databases

[Figure: Homogeneous Databases. Labels: three computers; Location A; Location B; Location C.]

IS 257 – Spring 2004 2004.01.20 - SLIDE 27

Distributed Databases

[Figure: Heterogeneous or Federated Databases. Labels: Local Network; Database Server; two Clients; Comm Server; two Remote Comp.]

IS 257 – Spring 2004 2004.01.20 - SLIDE 28

Terms and Concepts

• Database Application
  – An application program (or set of related programs) that is used to perform a series of database activities:
    • Create
    • Read
    • Update
    • Delete
  – On behalf of database users

IS 257 – Spring 2004 2004.01.20 - SLIDE 29

Range of Database Applications

• PC databases
  – Usually for individual use

• WorkGroup databases
  – Small group use where everyone has access to the database over a LAN

• Departmental databases
  – Larger than a workgroup, but similar

• Enterprise databases
  – For the entire organization over an intranet (or sometimes the internet)

IS 257 – Spring 2004 2004.01.20 - SLIDE 30

Terms and Concepts

• Database activities:
  – Create
    • Add new data to the database
  – Read
    • Read current data from the database
  – Update
    • Update or modify current database data
  – Delete
    • Remove current data from the database
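
The four activities map directly onto SQL statements. The following is a small sketch using SQLite through Python's sqlite3 module; the `author` table and its values are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE author (author_id INTEGER PRIMARY KEY, name TEXT)")

# Create: add new data to the database.
conn.execute("INSERT INTO author (author_id, name) VALUES (?, ?)", (1, "Smith"))

# Read: read current data from the database.
created = conn.execute(
    "SELECT name FROM author WHERE author_id = ?", (1,)
).fetchone()

# Update: modify current database data.
conn.execute("UPDATE author SET name = ? WHERE author_id = ?", ("Jones", 1))
updated = conn.execute(
    "SELECT name FROM author WHERE author_id = ?", (1,)
).fetchone()

# Delete: remove current data from the database.
conn.execute("DELETE FROM author WHERE author_id = ?", (1,))
remaining = conn.execute("SELECT COUNT(*) FROM author").fetchone()[0]

print(created, updated, remaining)
```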

IS 257 – Spring 2004 2004.01.20 - SLIDE 31

Terms and Concepts

• Enterprise
  – Organization

• Entity
  – Person, Place, Thing, Event, Concept...

• Attributes
  – Data elements (facts) about some entity
  – Also sometimes called fields or items or domains

• Data values
  – Instances of a particular attribute for a particular entity

IS 257 – Spring 2004 2004.01.20 - SLIDE 32

Terms and Concepts

• Records
  – The set of values for all attributes of a particular entity
  – AKA “tuples” or “rows” in relational DBMS

• File
  – Collection of records
  – AKA “Relation” or “Table” in relational DBMS

IS 257 – Spring 2004 2004.01.20 - SLIDE 33

Terms and Concepts

• Key
  – An attribute or set of attributes used to identify or locate records in a file

• Primary Key
  – An attribute or set of attributes that uniquely identifies each record in a file
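
A hedged sketch of the difference, assuming SQLite via Python's sqlite3 module and a hypothetical `publisher` table: the primary key is enforced as unique by the DBMS, while an ordinary attribute can still be used as a key to locate records.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE publisher (pubid INTEGER PRIMARY KEY, pubname TEXT)")
conn.execute("INSERT INTO publisher VALUES (1, 'Harper')")

# A second record with the same primary key value is rejected.
try:
    conn.execute("INSERT INTO publisher VALUES (1, 'Oxford')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# pubname is used here as a (non-primary) key to locate records.
rows = conn.execute(
    "SELECT pubid, pubname FROM publisher WHERE pubname = ?", ("Harper",)
).fetchall()

print(duplicate_rejected, rows)
```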

IS 257 – Spring 2004 2004.01.20 - SLIDE 34

Terms and Concepts

• DA
  – Data Administrator - person responsible for the Data Administration function in an organization
  – Sometimes may be the CIO -- Chief Information Officer

• DBA
  – Database Administrator - person responsible for the Database Administration function

IS 257 – Spring 2004 2004.01.20 - SLIDE 35

Terms and Concepts

• Data Administration
  – Responsibility for the overall management of data resources within an organization

• Database Administration
  – Responsibility for physical database design and technical issues in database management

• Data Steward
  – Responsibility for some subset of the organization’s data, and all of the interactions (applications, user access, etc.) for that data

IS 257 – Spring 2004 2004.01.20 - SLIDE 36

Terms and Concepts

• Models
  – (1) Levels or views of the Database
    • Conceptual, logical, physical
  – (2) DBMS types
    • Relational, Hierarchic, Network, Object-Oriented, Object-Relational

IS 257 – Spring 2004 2004.01.20 - SLIDE 37

Models (1)

[Figure: levels of database models. Labels: Conceptual requirements for Applications 1-4; an External Model for each of Applications 1-4; Conceptual Model; Logical Model; Internal Model.]

IS 257 – Spring 2004 2004.01.20 - SLIDE 38

Data Models(2): History

• Hierarchical Model (1960’s and 1970’s)
  – Similar to data structures in programming languages.

[Figure: hierarchy diagram. Labels: Books (id, title); Publisher; Subjects; Authors (first, last).]

IS 257 – Spring 2004 2004.01.20 - SLIDE 39

Data Models(2): History

• Network Model (1970’s)
  – Provides for single entries of data and navigational “links” through chains of data.

[Figure: network diagram. Labels: Subjects; Books; Authors; Publishers.]

IS 257 – Spring 2004 2004.01.20 - SLIDE 40

Data Models(2): History

• Relational Model (1980’s)
  – Provides a conceptually simple model for data as relations (typically considered “tables”) with all data visible.

Book ID  Title          pubid  Author id
1        Introductio    2      1
2        The history    4      2
3        New stuff ab   3      3
4        Another title  2      4
5        And yet more   1      5

pubid  pubname
1      Harper
2      Addison
3      Oxford
4      Que

Authorid  Author name
1         Smith
2         Wynar
3         Jones
4         Duncan
5         Applegate

Subid  Subject
1      cataloging
2      history
3      stuff

Book ID  Subid
1        2
2        1
3        3
4        2
4        3
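
To show how the relations on this slide are combined through their shared id columns, here is a sketch that loads three of them into SQLite (via Python's sqlite3 module) and joins them. The table and column names (`book`, `author`, `publisher`, `bookid`, `authorid`) are assumptions adapted from the slide's headings, and the titles are copied as they appear on the slide, truncation included.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE publisher (pubid INTEGER PRIMARY KEY, pubname TEXT);
CREATE TABLE author (authorid INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE book (
    bookid   INTEGER PRIMARY KEY,
    title    TEXT,
    pubid    INTEGER REFERENCES publisher (pubid),
    authorid INTEGER REFERENCES author (authorid)
);
""")

conn.executemany(
    "INSERT INTO publisher VALUES (?, ?)",
    [(1, "Harper"), (2, "Addison"), (3, "Oxford"), (4, "Que")],
)
conn.executemany(
    "INSERT INTO author VALUES (?, ?)",
    [(1, "Smith"), (2, "Wynar"), (3, "Jones"), (4, "Duncan"), (5, "Applegate")],
)
conn.executemany(
    "INSERT INTO book VALUES (?, ?, ?, ?)",
    [
        (1, "Introductio", 2, 1),
        (2, "The history", 4, 2),
        (3, "New stuff ab", 3, 3),
        (4, "Another title", 2, 4),
        (5, "And yet more", 1, 5),
    ],
)

# Join the relations on their shared key values to answer:
# which books (and authors) were published by Addison?
rows = conn.execute("""
    SELECT book.title, author.name, publisher.pubname
    FROM book
    JOIN author ON author.authorid = book.authorid
    JOIN publisher ON publisher.pubid = book.pubid
    WHERE publisher.pubname = 'Addison'
    ORDER BY book.bookid
""").fetchall()

for row in rows:
    print(row)
```

Nothing links the tables except matching values in `pubid` and `authorid`, which is the sense in which all data is visible in the relational model.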

IS 257 – Spring 2004 2004.01.20 - SLIDE 41

Data Models(2): History

• Object Oriented Data Model (1990’s)
  – Encapsulates data and operations as “Objects”

[Figure: object diagram. Labels: Books (id, title); Publisher; Subjects; Authors (first, last).]
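
The idea of bundling data with operations can be sketched with ordinary Python classes. This is only an in-memory illustration, not an object-oriented DBMS; the class names follow the figure's labels, while the methods and the sample author's first name are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Author:
    first: str
    last: str


@dataclass
class Book:
    # Data held by the object.
    id: int
    title: str
    publisher: str
    subjects: List[str] = field(default_factory=list)
    authors: List[Author] = field(default_factory=list)

    # Operations packaged together with the data (hypothetical methods).
    def add_author(self, first: str, last: str) -> None:
        self.authors.append(Author(first, last))

    def byline(self) -> str:
        return "; ".join(f"{a.last}, {a.first}" for a in self.authors)


book = Book(id=1, title="Introductio", publisher="Addison")
book.add_author("A.", "Smith")
print(book.byline())
```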

IS 257 – Spring 2004 2004.01.20 - SLIDE 42

Data Models(2): History

• Object-Relational Model (1990’s)
  – Combines the well-known properties of the Relational Model with such OO features as:
    • User-defined datatypes
    • User-defined functions
    • Inheritance and sub-classing

IS 257 – Spring 2004 2004.01.20 - SLIDE 43

Database System Life Cycle

[Figure: database system life cycle, six numbered phases]

1. Design
2. Physical Creation
3. Conversion
4. Integration
5. Operations
6. Growth, Change, & Maintenance

IS 257 – Spring 2004 2004.01.20 - SLIDE 44

Design

• Determination of the needs of the organization

• Development of the Conceptual Model of the database
  – Typically using Entity-Relationship diagramming techniques

• Construction of a Data Dictionary

• Development of the Logical Model

IS 257 – Spring 2004 2004.01.20 - SLIDE 45

Physical Creation

• Development of the Physical Model of the Database
  – data formats and types
  – determination of indexes, etc.

• Load a prototype database and test

• Determine and implement security, privacy and access controls

• Determine and implement integrity constraints
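
Several of these steps can be sketched in a few lines, assuming SQLite via Python's sqlite3 module and hypothetical `publisher` and `book` tables: choose data types, declare integrity constraints, add an index, then load a little prototype data and test that a bad record is rejected. SQLite has no user accounts or GRANT statement, so access controls are not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this is switched on per connection.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE publisher (
    pubid   INTEGER PRIMARY KEY,
    pubname TEXT NOT NULL UNIQUE
);
CREATE TABLE book (
    bookid INTEGER PRIMARY KEY,
    title  TEXT NOT NULL,
    pubid  INTEGER NOT NULL REFERENCES publisher (pubid)
);
CREATE INDEX book_title_idx ON book (title);
""")

# Load prototype data.
conn.execute("INSERT INTO publisher VALUES (1, 'Harper')")
conn.execute("INSERT INTO book VALUES (1, 'And yet more', 1)")

# Test an integrity constraint: a book must point at an existing publisher.
try:
    conn.execute("INSERT INTO book VALUES (2, 'Orphan', 99)")
    violation_caught = False
except sqlite3.IntegrityError:
    violation_caught = True

# PRAGMA index_list rows start with (seq, name, ...).
index_names = [row[1] for row in conn.execute("PRAGMA index_list(book)")]
print(violation_caught, index_names)
```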

IS 257 – Spring 2004 2004.01.20 - SLIDE 46

Conversion

• Convert existing data sets and applications to use the new database
  – May need programs, conversion utilities to convert old data to new formats.
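
A conversion utility can be quite small. The sketch below assumes a hypothetical legacy flat file in CSV form (publisher name repeated on every line) and loads it into two related SQLite tables using Python's csv and sqlite3 modules; the file layout and table names are invented for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical legacy export: one line per book, publisher name repeated.
legacy_file = io.StringIO(
    "title,publisher\n"
    "Another title,Addison\n"
    "And yet more,Harper\n"
    "Introductio,Addison\n"
)

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE publisher (pubid INTEGER PRIMARY KEY, pubname TEXT UNIQUE);
CREATE TABLE book (
    bookid INTEGER PRIMARY KEY,
    title  TEXT,
    pubid  INTEGER REFERENCES publisher (pubid)
);
""")

for record in csv.DictReader(legacy_file):
    # Store each publisher once; repeated names are ignored.
    conn.execute(
        "INSERT OR IGNORE INTO publisher (pubname) VALUES (?)",
        (record["publisher"],),
    )
    pubid = conn.execute(
        "SELECT pubid FROM publisher WHERE pubname = ?",
        (record["publisher"],),
    ).fetchone()[0]
    conn.execute(
        "INSERT INTO book (title, pubid) VALUES (?, ?)",
        (record["title"], pubid),
    )

publisher_count = conn.execute("SELECT COUNT(*) FROM publisher").fetchone()[0]
book_count = conn.execute("SELECT COUNT(*) FROM book").fetchone()[0]
print(publisher_count, book_count)
```

The repeated publisher names in the old file end up as single rows in `publisher`, which is the kind of redundancy a conversion step typically removes.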

IS 257 – Spring 2004 2004.01.20 - SLIDE 47

Integration

• Overlaps with Phase 3

• Integration of converted applications and new applications into the new database

IS 257 – Spring 2004 2004.01.20 - SLIDE 48

Operations

• All applications run full-scale

• Privacy, security, access control must be in place.

• Recovery and Backup procedures must be established and used

IS 257 – Spring 2004 2004.01.20 - SLIDE 49

Growth, Change & Maintenance

• Change is a way of life
  – Applications, data requirements, reports, etc. will all change as new needs and requirements are found
  – The database and applications will need to be modified to meet these changing needs

IS 257 – Spring 2004 2004.01.20 - SLIDE 50

Another View of the Life Cycle

[Figure: another view of the life cycle. Labels: 1 Design; 2 Physical Creation; 3 Conversion; 4 Integration; 5 Operations; 6 Growth, Change.]

IS 257 – Spring 2004 2004.01.20 - SLIDE 51

Next Time