IST 210 Database Design Process IST 210 Todd S. Bacastow
IST 210 Todd S. Bacastow - libvolume2.xyzlibvolume2.xyz/biotechnology/semester7/fundamentalsofosanddbms/... · a framework known as the Systems Development Life Cycle (SDLC) Databases

Jul 25, 2018


IST 210 Database Design Process

IST 210

Todd S. Bacastow

Key points

• Database design must reflect the information system of which the database is a part
• Information systems undergo evaluation and revision within a framework known as the Systems Development Life Cycle (SDLC)
• Databases also undergo evaluation and revision within a framework known as the Database Life Cycle (DBLC)
• Two general design strategy choices exist:
  • top-down vs. bottom-up design
  • centralized vs. decentralized design

References

• Standish Group (1995). The CHAOS Report into Project Failure. The Standish Group International Inc.
• Standish Group (1996). Unfinished Voyages. The Standish Group International Inc.
• Croswell, P. (1991). "Obstacles to GIS implementation and guidelines to increase the opportunity for success." Journal of the Urban and Regional Information Systems Association, 3(1): 43-56.

Lessons from Business Automation

• Era of finance and operations: ’60s - ’70s
  • Business accounting systems
• Manufacturing software: ’70s - ’80s
  • Separate applications for inventory, ordering, forecasting, shop floor operations, logistics, etc.
• Era of the business enterprise: ’90s - ’00s
  • Separate applications get rolled into “enterprise resource planning” system
  • Sales force automation, customer service center, campaign management, automated email response, etc. get rolled into “customer relationship management”

Successful automation requires an interlocking of the:

• User (someone doing “real work”)
• Infrastructure (computer and human)
• Management (organization)

American Airlines

American Airlines settled a lawsuit with Budget Rent-A-Car, Marriott Corp. and Hilton Hotels after the $165 million CONFIRM car rental and hotel reservation system project collapsed into chaos.

Project Outcomes

Outcome                 Share of projects
Success                 16%
Significant Problems    53%
Major Problems          31%

• 84% of all automation projects have significant or major problems

Source: Standish Group (1995), The CHAOS Report into Project Failure, The Standish Group International Inc.
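
The 84% figure is the sum of the two problem categories in the chart. A minimal Python sketch of that arithmetic, using only the three percentages shown on the slide (the variable names are illustrative, not from the source):

```python
# Project outcome shares from the slide (Standish Group, 1995), in percent.
outcomes = {
    "Success": 16,
    "Significant Problems": 53,
    "Major Problems": 31,
}

# Every category except "Success" counts as a troubled project.
troubled = sum(share for name, share in outcomes.items() if name != "Success")

print(f"Projects with significant or major problems: {troubled}%")
```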

Percent Over Budget

Cost overrun    Share of projects
<20%            16%
21%-50%         31%
51%-100%        30%
101%-200%       10%
201%-400%       9%
>400%           4%

• 53% of all automation projects are more than 50% over budget
• 23% of all automation projects are more than 100% over budget

Source: Standish Group (1995), The CHAOS Report into Project Failure, The Standish Group International Inc.

Percent of Time Underestimated

Time overrun    Share of projects
<20%            14%
21%-50%         18%
51%-100%        20%
101%-200%       36%
201%-400%       11%
>400%           1%

• 49% of all automation projects take twice as long to complete as planned

Source: Standish Group (1995), The CHAOS Report into Project Failure, The Standish Group International Inc.

Percent Planned Functionality

[Pie chart. Categories: <25%, 25-49%, 50-74%, 75-99%, 100%. Slice values as extracted: 27%, 22%, 39%, 7%, 5%.]

• 54% of all automation projects deliver less than half of the promised functionality

Source: Standish Group (1995), The CHAOS Report into Project Failure, The Standish Group International Inc.

Most problems are non-technical

• Poorly selected data
• Badly organized data
• Incorrect data models
• Software has limited capability (oversell)
• Systems managers underestimate time requirements
• Systems can be underutilized
• Systems can be (and have been) abandoned
• Personnel problems

Changing Data into Information

• Data
  • Raw facts stored in databases
  • Need additional processing to become useful
• Information
  • Required by decision maker
  • Data processed and presented in a meaningful form
• Transformation
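
A minimal Python sketch of the data-to-information transformation, assuming a handful of made-up sales rows as the raw facts. The row layout and the `summarize_by_region` helper are hypothetical and only illustrate the idea of processing data into a form a decision maker can use:

```python
from collections import defaultdict

# Raw facts as they might sit in a table: (region, product, units_sold).
# The rows are made-up sample data for illustration only.
raw_sales = [
    ("East", "Widget", 12),
    ("West", "Widget", 7),
    ("East", "Gadget", 5),
    ("West", "Gadget", 9),
    ("East", "Widget", 3),
]

def summarize_by_region(rows):
    """Transform raw rows into total units sold per region."""
    totals = defaultdict(int)
    for region, _product, units in rows:
        totals[region] += units
    return dict(totals)

# The summary is the "information" a decision maker would read.
summary = summarize_by_region(raw_sales)
for region, units in sorted(summary.items()):
    print(f"{region}: {units} units")
```

The individual rows say little on their own; the per-region totals are what a manager would act on.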

The Information System

• Database
  • Carefully designed and constructed repository of facts
  • Part of an information system
• Information System
  • Provides data collection, storage, and retrieval
  • Facilitates data transformation
  • Includes people, hardware, and software
    • Software: database(s), application programs, and procedures

The Information System (cont.)

• System analysis
  • Establishes need and extent of an information system
  • Refer to Recommended Requirements Gathering Practices
  • We are NOT DOING A SYSTEM REQUIREMENTS ANALYSIS!!
• Systems development
  • Process of creating an information system
• Database development
  • Process of database design and implementation
  • Creation of database models
• Implementation
  • Creating storage structure
  • Loading data into database
  • Providing for data management

Systems Development Life Cycle

System Analysis

Database Organization (IST 210)

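Database Life Cycle (DBLC)

• Phase 1: Database Initial Study
• Phase 2: Database Design
• Phase 3: Implementation and Loading
• Phase 4: Testing and Evaluation
• Phase 5: Operation
• Phase 6: Maintenance and Evaluation

Database Organization (IST 210)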

Phase 1: Database Initial Study

• Purposes
  • Analyze company situation
    • Operating environment
    • Organizational structure
  • Define problems and constraints
  • Define objectives
  • Define scope and boundaries

Initial Study Activities

Phase 2: Database Design

• Most critical DBLC phase
• Makes sure final product meets requirements
• Focus on data requirements
• Subphases
  • I. Create conceptual design
  • II. DBMS software selection
  • III. Create logical design
  • IV. Create physical design

Two Views of Data

I. Conceptual Design

• Data modeling creates abstract data structure to represent real-world items
• High level of abstraction
• Four steps
  • Data analysis and requirements
  • *Entity relationship modeling and normalization*
  • *Data model verification*

Data Analysis and Requirements

• Focus on:
  • Information needs
  • Information users
  • Information sources
• Data sources
  • Developing and gathering end-user data views
  • Direct observation of current system
  • Interfacing with systems design group
• Business rules

Entity Relationship Modeling and Normalization

E-R Modeling is Iterative

Concept Design: Tools and Sources

Data Model Verification

• E-R model is verified against proposed system processes
  • End user views and required transactions
  • Access paths, security, concurrency control
  • Business-imposed data requirements and constraints
• Reveals additional entity and attribute details
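
One way to picture verification against required transactions is a simple coverage check: each proposed transaction lists the entity attributes it needs, and anything the model lacks is reported. The sketch below is a toy illustration under that assumption. The entities, transactions, and the `verify` helper are all hypothetical, and real verification also covers access paths, security, and concurrency, which this does not model.

```python
# A toy E-R model: entity name -> set of attribute names (hypothetical).
er_model = {
    "CUSTOMER": {"cust_id", "name", "phone"},
    "INVOICE": {"inv_id", "cust_id", "inv_date"},
}

# Each proposed transaction lists the (entity, attribute) pairs it needs.
transactions = {
    "create_invoice": [("CUSTOMER", "cust_id"), ("INVOICE", "inv_date")],
    "mail_statement": [("CUSTOMER", "name"), ("CUSTOMER", "address")],
    "record_payment": [("PAYMENT", "amount")],
}

def verify(model, required):
    """Return {transaction: [missing (entity, attribute) pairs]}."""
    gaps = {}
    for name, needs in required.items():
        missing = [
            (entity, attr)
            for entity, attr in needs
            if attr not in model.get(entity, set())
        ]
        if missing:
            gaps[name] = missing
    return gaps

gaps = verify(er_model, transactions)
for name, missing in gaps.items():
    print(name, "is missing", missing)
```

Here the check surfaces a missing attribute (`address`) and a missing entity (`PAYMENT`), which is the kind of additional entity and attribute detail that verification is meant to reveal.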

E-R Model Verification Process

Iterative Process of Verification

II. DBMS Software Selection

• DBMS software selection is critical
• Advantages and disadvantages need study
• Factors affecting purchasing decision
  • Cost
  • DBMS features and tools
  • Underlying model
  • Portability
  • DBMS hardware requirements

III. Logical Design

• Translates conceptual design into internal model
• Maps objects in model to specific DBMS constructs
• Design components
  • Tables
  • Indexes
  • Views
  • Transactions
  • Access authorities
  • Others
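
To make the mapping to DBMS constructs concrete, here is a minimal sketch using Python's built-in `sqlite3` module with an in-memory database. The `customer` and `invoice` tables, the index, and the view are hypothetical examples, not taken from the course. SQLite is assumed only because it ships with Python; it has no user accounts, so access authorities (for example `GRANT` in other DBMSs) are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Tables, an index, and a view: three of the design components listed above.
conn.executescript("""
    CREATE TABLE customer (
        cust_id INTEGER PRIMARY KEY,
        name    TEXT NOT NULL
    );
    CREATE TABLE invoice (
        inv_id   INTEGER PRIMARY KEY,
        cust_id  INTEGER NOT NULL REFERENCES customer(cust_id),
        total    REAL NOT NULL
    );
    CREATE INDEX idx_invoice_cust ON invoice(cust_id);
    CREATE VIEW customer_totals AS
        SELECT c.name AS name, SUM(i.total) AS total_billed
        FROM customer AS c JOIN invoice AS i ON i.cust_id = c.cust_id
        GROUP BY c.cust_id;
""")

# Transaction: the inserts commit together, or roll back together on error.
with conn:
    conn.execute("INSERT INTO customer VALUES (1, 'Acme')")
    conn.executemany(
        "INSERT INTO invoice (cust_id, total) VALUES (?, ?)",
        [(1, 100.0), (1, 50.0)],
    )

rows = conn.execute("SELECT name, total_billed FROM customer_totals").fetchall()
print(rows)
```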

IV. Physical Design

• Selection of data storage and access characteristics
  • Very technical
  • More important in older hierarchical and network models
• Becomes more complex for distributed systems
• Designers favor software that hides physical details

Phase 3: Implementation and Loading

• Creation of special storage-related constructs to house end-user tables
• Data loaded into tables
• Other issues
  • Performance
  • Security
  • Backup and recovery
  • Integrity
  • Company standards
  • Concurrency controls
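
A small sketch of the create, load, and back up sequence, assuming SQLite through Python's `sqlite3` module. The `part` table and the CSV text are invented stand-ins for a real schema and data file; a production load would read from actual files and follow company standards.

```python
import csv
import io
import sqlite3

# Stand-in for a real data file; contents are made up for illustration.
csv_text = "part_id,description,qty\n1,bolt,500\n2,nut,750\n3,washer,1200\n"

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE part ("
    " part_id INTEGER PRIMARY KEY,"
    " description TEXT NOT NULL,"
    " qty INTEGER NOT NULL CHECK (qty >= 0))"
)

# Load all rows inside one transaction so a bad row rolls back the batch.
reader = csv.DictReader(io.StringIO(csv_text))
with conn:
    conn.executemany(
        "INSERT INTO part (part_id, description, qty)"
        " VALUES (:part_id, :description, :qty)",
        reader,
    )

# Copy the loaded database to a second connection as a simple backup.
backup = sqlite3.connect(":memory:")
conn.backup(backup)

loaded = conn.execute("SELECT COUNT(*) FROM part").fetchone()[0]
backed_up = backup.execute("SELECT COUNT(*) FROM part").fetchone()[0]
print(loaded, backed_up)
```

Loading in a single transaction ties into the integrity issue above: either the whole batch is stored or none of it is.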

Phase 4: Testing and Evaluation

• Database is tested and fine-tuned for performance, integrity, concurrent access, and security constraints
• Done in parallel with application programming
• Actions taken if tests fail
  • Fine-tuning based on reference manuals
  • Modification of physical design
  • Modification of logical design
  • Upgrade or change DBMS software or hardware
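
Integrity testing can be sketched as a set of deliberately bad inserts that the database is expected to reject. The example below assumes SQLite via Python's `sqlite3` module. The `department` and `employee` tables and the `violates_integrity` helper are hypothetical, and performance, concurrency, and security tests are out of scope here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
    CREATE TABLE department (
        dept_id INTEGER PRIMARY KEY,
        name    TEXT NOT NULL UNIQUE
    );
    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,
        dept_id INTEGER NOT NULL REFERENCES department(dept_id),
        salary  REAL CHECK (salary > 0)
    );
    INSERT INTO department VALUES (10, 'Sales');
""")

def violates_integrity(sql):
    """Return True if the statement is rejected by a constraint."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False

results = {
    "valid row": violates_integrity("INSERT INTO employee VALUES (1, 10, 50000)"),
    "unknown department": violates_integrity("INSERT INTO employee VALUES (2, 99, 50000)"),
    "negative salary": violates_integrity("INSERT INTO employee VALUES (3, 10, -5)"),
    "duplicate key": violates_integrity("INSERT INTO employee VALUES (1, 10, 60000)"),
}

for case, rejected in results.items():
    print(f"{case}: {'rejected' if rejected else 'accepted'}")
```

If a bad row were accepted, that would point back to the logical design (a missing constraint), which is one of the corrective actions listed above.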

Phase 5: Operation

• Database considered operational
• Starts process of system evaluation
• Unforeseen problems may surface
• Demand for change is constant

Phase 6: Maintenance and Evaluation

• Preventative maintenance
• Corrective maintenance
• Adaptive maintenance
• Assignment of access permissions
• Generation of database access statistics to monitor performance
• Periodic security audits based on system-generated statistics
• Periodic system usage summaries
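
As a rough illustration of access statistics, the sketch below tallies executed statements by their leading SQL keyword using `sqlite3`'s `set_trace_callback`. This is an assumption-laden toy: the `log` table and the `record` function are hypothetical, and a server DBMS would normally expose such statistics through its own monitoring views rather than a client-side hook. The driver may also report implicit `BEGIN` and `COMMIT` statements, so those can appear in the tally.

```python
import sqlite3
from collections import Counter

stats = Counter()

def record(statement):
    """Tally each executed statement by its leading SQL keyword."""
    words = statement.split()
    if words:
        stats[words[0].upper()] += 1

conn = sqlite3.connect(":memory:")
conn.set_trace_callback(record)

# A small workload to generate some statistics.
conn.execute("CREATE TABLE log (id INTEGER PRIMARY KEY, msg TEXT)")
with conn:
    conn.executemany("INSERT INTO log (msg) VALUES (?)", [("a",), ("b",), ("c",)])
conn.execute("SELECT * FROM log").fetchall()
conn.execute("SELECT COUNT(*) FROM log").fetchone()

# A periodic usage summary could be built from counts like these.
for keyword, count in stats.most_common():
    print(keyword, count)
```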

DB Design Strategy Notes

• Top-down
  • 1) Identify data sets
  • 2) Define data elements
• Bottom-up
  • 1) Identify data elements
  • 2) Group them into data sets
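
The two strategies can be contrasted with a toy example in Python. The `STUDENT` and `COURSE` data sets are invented, and grouping by name prefix is only a stand-in for the designer's judgment when assembling elements into data sets:

```python
# Top-down: start from data sets (entities), then define their elements.
top_down = {
    "STUDENT": ["student_id", "student_name"],
    "COURSE": ["course_id", "course_title"],
}

# Bottom-up: start from a flat list of data elements ...
elements = ["student_id", "student_name", "course_id", "course_title"]

def group_by_prefix(names):
    """Group elements into data sets using the text before the first '_'."""
    data_sets = {}
    for name in names:
        prefix = name.split("_", 1)[0].upper()
        data_sets.setdefault(prefix, []).append(name)
    return data_sets

# ... and group them into data sets.
bottom_up = group_by_prefix(elements)

# In this tidy example both routes arrive at the same design.
print(bottom_up == top_down)
```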

Top-Down vs. Bottom-Up

Centralized vs. Decentralized Design

• Centralized design
  • Typical of simple databases
  • Conducted by single person or small team
• Decentralized design
  • Larger numbers of entities and complex relations
  • Spread across multiple sites
  • Developed by teams

Decentralized Design