Preparing to Automate Data Management
Chapter 1
“You can use all the quantitative data you can get, but you still have to distrust it and use your own intelligence and judgment.” – Alvin Toffler
Chapter Introduction
• Discovery phase includes
  – Gathering all existing data
  – Researching missing and incomplete data
  – Talking with users about data output needs
• Subsequent steps in process include
  – Putting data into groups called tables
  – Identifying unique values for each record in those tables
  – Designing database to produce desired output
Succeeding in Business with Microsoft Office Access 2007: A Problem-Solving Approach 2
Database Design Process: The Discovery Phase
Level 1 Objectives: Examining Existing and Missing Sources of Data
• Discover and evaluate sources of existing business data
• Research sources of missing or incomplete data
• Assign data to tables and use field types and sizes to define data
Discovering and Evaluating Sources of Existing Data
• Identify information that organization needs to manage and organize
• Might begin to see patterns that indicate how to organize data
• Database management system (DBMS)
  – Includes:
    • Oracle
    • ColdFusion
    • Microsoft Access
    • MySQL
Discovering and Evaluating Sources of Existing Data (continued)
• Data duplication
  – Undesirable
    • Additional space required in database to store extra records
    • Leads to inconsistent and inaccurate data
• Data redundancy
  – Same data repeated for different records
Researching Sources of Missing Data
• Part of discovery phase
• Must ask right questions of right people to get right answers
Assimilating the Available Information and Planning the Database
• First step in database design
  – Determine best way to organize data into logical groups of fields
• Field
  – Single characteristic of entity
  – Also called column
• Record
  – Values in each field in table
  – Also called row
Assimilating the Available Information and Planning the Database (continued)
• Table
  – Collection of fields that describe one entity
  – Also called entity or relation
• Database
  – Collection of one or more tables
• Relational database
  – Contains tables that are related through fields containing identical data
Evaluating Field Values and Assigning Appropriate Data Types
• Data type
  – Determines how to store data in field
• DBMSs use different names for some data types
• How do you determine which data type to assign each field?
  – Depends on what function you want to derive from data
  – Each data type has different properties
Common Data Types and Their Descriptions
The Text and Memo Data Types
• Text data type
  – Letters and numbers
  – Not used in calculations or formulas
  – Stores maximum of 255 characters
  – Default for all fields created in Access database
• Memo data type
  – Stores long passages of text
  – Displays only 64,000 characters
The Number Data Type
• Stores both positive and negative numbers
• Contains up to 15 digits
• Use for values used in calculations
The Currency Data Type
• Includes two decimal places and displays values with dollar sign
• Use for monetary values
The Date/Time Data Type
• Displays values in format mm/dd/yyyy
  – Can also include time in different formats
• Can be used in calculations if necessary
The AutoNumber Data Type
• Unique to Access
• Number automatically generated by Access
• Produces unique values for each record
• Useful to distinguish two records that share identical information
• Produces values of up to nine digits
The Yes/No Data Type
• Assigned to fields requiring
  – Yes/no
  – True/false
  – On/off
• Takes up one character of storage space
• Makes data entry easy
  – Check box
The OLE Object Data Type
• Used to identify files created in another program
  – Then linked or embedded in database
• Abbreviation for object linking and embedding
The Hyperlink Data Type
• Assigned to fields that contain hyperlinks to
  – Web pages
  – E-mail addresses
  – Files that open in
    • Web browser
    • E-mail client
    • Another application
The Lookup Wizard Data Type
• Creates fields to look up data in
  – Another table
  – Or list of values created for field
• Makes data entry easy
• Ensures that valid data is entered into field
The Attachment Data Type
• New to Access 2007
• Lets you store one or more files for each record in the database
Selecting the Correct Data Type
• Helps store correct data in correct format while using least amount of space
• Eases data entry and interactivity with data
• Choosing certain data types results in user-friendly interactive features
  – Drop-down menus
  – Check boxes
  – Hyperlinks
• Lets you manipulate data correctly
Assigning the Correct Field Size for Text Fields
• Important to consider field size when assigning data types
  – Minimize space reserved for each record by assigning smallest data type that will store data
• Be conservative when assigning field sizes
  – But not too conservative
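One way to act on "conservative, but not too conservative" is to size a Text field from sample data plus a margin. The following is only a sketch: the function name, the 25% headroom, and the sample names are assumptions for illustration; the 255-character cap comes from the Text data type limit stated earlier.

```python
import math

TEXT_FIELD_MAXIMUM = 255  # Text fields store at most 255 characters


def suggest_text_field_size(values, headroom=1.25):
    """Suggest a field size: longest sample value plus headroom, capped at 255.

    The headroom factor is an illustrative assumption, not an Access rule.
    """
    longest = max((len(value) for value in values), default=0)
    padded = math.ceil(longest * headroom)
    return min(TEXT_FIELD_MAXIMUM, max(1, padded))


last_names = ["Garcia", "Nguyen", "Papadopoulos"]
suggested = suggest_text_field_size(last_names)
```

Sample data only shows what has been seen so far, so the suggestion should still be checked against what users say they expect to enter.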
Assigning the Correct Field Size for Number Fields
Dividing the Existing and Missing Data into Tables
• Tables
  – Single most important component of database
  – Most databases contain
    • Multiple tables
    • Hundreds or even thousands of records
• Primary key
  – One field that creates unique value in each record
  – Used to identify each record in table
  – May be a combination of fields
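The role of a primary key can be sketched with Python's built-in sqlite3 module, used here as a stand-in for Access. The table tblCustomer and its fields are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblCustomer (CustID INTEGER PRIMARY KEY, LastName TEXT NOT NULL)"
)

# Two customers share a last name; the primary key still tells them apart.
conn.execute("INSERT INTO tblCustomer VALUES (1, 'Garcia')")
conn.execute("INSERT INTO tblCustomer VALUES (2, 'Garcia')")

# Reusing an existing primary key value is rejected by the database.
try:
    conn.execute("INSERT INTO tblCustomer VALUES (1, 'Nguyen')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM tblCustomer").fetchone()[0]
```

The rejected insert leaves the table with its two original records.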
Database Design Process: Planning the Tables
Naming Conventions
• Database tables must
  – Have unique names
  – Follow established naming conventions
• General rules for naming objects
  – Object names cannot exceed 64 characters
  – Object names cannot include period, exclamation point, accent grave, or brackets
  – Object names should not include spaces
  – Most developers capitalize first letter of each word when table name includes two words
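The general naming rules above can be expressed as a small checker. This is a sketch: the function name and the wording of the messages are made up for illustration, and it covers only the three rules listed on the slide.

```python
FORBIDDEN_CHARACTERS = ".!`[]"  # period, exclamation point, accent grave, brackets
MAX_NAME_LENGTH = 64


def check_object_name(name):
    """Return a list of rule violations for a proposed object name."""
    problems = []
    if len(name) > MAX_NAME_LENGTH:
        problems.append("exceeds 64 characters")
    bad = sorted({ch for ch in name if ch in FORBIDDEN_CHARACTERS})
    if bad:
        problems.append("contains forbidden characters: " + " ".join(bad))
    if " " in name:
        problems.append("contains spaces")
    return problems


ok = check_object_name("tblCustomer")
spaced = check_object_name("Customer List")
```

An empty list means the name passes; "Customer List" is flagged only for its space.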
Leszynski/Reddick Naming Conventions for Database Objects
Level 1 Summary
• Discovery phase
• Identify existing and missing data
• Determine tables
  – Determine data types
• Follow naming conventions
Level 2 Objectives: Understanding and Creating Table Relationships
• Understand relational database objects and concepts
• Create table relationships
• Understand referential integrity
Understanding Relational Database Objects
• Users can view data in tables by:
  – Opening table
  – Creating other objects
• Four main objects in database
  – Tables
  – Queries
  – Forms
  – Reports
Tables
• Data in relational database stored in one or more tables
• View data in table
  – Open it and scroll through records
• Most of the time, the three other main database objects are used to display data
Queries
• Query
  – Question asked about data stored in database
• Query results
  – Look similar to table
  – Fields displayed in columns
  – Records displayed in rows
Queries (continued)
• Select query
  – Most commonly used query
  – Data selected from table on which query based
• Action query
  – Performs action on table
  – Select specific records in table and update them
• Crosstab query
  – Performs calculations on values in field and displays results in datasheet
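The three query types can be sketched in SQL through Python's sqlite3 module, standing in for Access. The table tblSale and its data are hypothetical. SQLite has no crosstab statement, so the crosstab is approximated here with conditional sums; Access builds crosstab queries differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblSale (SaleID INTEGER PRIMARY KEY, Region TEXT, Quarter TEXT, Amount REAL)"
)
conn.executemany(
    "INSERT INTO tblSale (Region, Quarter, Amount) VALUES (?, ?, ?)",
    [("East", "Q1", 100.0), ("East", "Q2", 150.0),
     ("West", "Q1", 80.0), ("West", "Q2", 120.0), ("West", "Q2", 30.0)],
)

# Select query: pick records from the table the query is based on.
east_sales = conn.execute(
    "SELECT Quarter, Amount FROM tblSale WHERE Region = 'East' ORDER BY Quarter"
).fetchall()

# Action query: select specific records and update them.
updated = conn.execute(
    "UPDATE tblSale SET Amount = Amount * 2 WHERE Region = 'East'"
).rowcount

# Crosstab-style query: total Amount per Region (rows) and Quarter (columns).
crosstab = conn.execute(
    """
    SELECT Region,
           SUM(CASE WHEN Quarter = 'Q1' THEN Amount ELSE 0 END) AS Q1,
           SUM(CASE WHEN Quarter = 'Q2' THEN Amount ELSE 0 END) AS Q2
    FROM tblSale
    GROUP BY Region
    ORDER BY Region
    """
).fetchall()
```

The select query leaves the data unchanged, while the action query modifies two records, which the crosstab totals then reflect.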
Forms
• Used to view, add, delete, update, and print records in database
• Based on table or query
• Interface more attractive than table datasheet
• Customize form’s appearance with instructions and command buttons
• Switchboard
  – Form displayed when database opened
  – Provides controlled method for users to open objects in database
Form Based on a Table
Reports
• Formatted presentation of data from table or query
• Created as printout or to be viewed on screen
• Data displayed by report usually based on query
• Dynamic
  – Reflects latest data from object on which report is based
• Cannot be used to modify data
Accounts Receivable Report
Other Database Objects
• Macro
  – Set of instructions
  – Automates certain database tasks
  – Usually automates simple tasks
• Module
  – Contains instructions to automate database task
  – Written in Visual Basic for Applications (VBA)
  – Performs more sophisticated actions than macro
Understanding Relational Database Concepts
• Flat file database
  – Simple database
  – Contains single table of information
• Relational database
  – Contains multiple tables to store related information
• Common field
  – Field that appears in two or more tables and contains identical data to relate tables
  – Primary key in first table
  – Foreign key in second table
Creating Table Relationships
• Take advantage of interrelated objects
• Goal in good database design
  – Create separate tables for each entity
  – Ensure each table has primary key
  – Use common field to relate tables
• Relate two (or more) tables
  – Query them as though they are one big table
• Join
  – Specifies relationship between tables and properties of relationship
One-to-Many Relationships
• Abbreviated as 1:M
• One record in first table matches zero, one, or many records in related table
• Primary table
  – One side
• Related table
  – Many side
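A one-to-many relationship can be sketched with two tables joined on a common field, again using sqlite3 as a stand-in for Access. The tables tblCustomer and tblRx, their fields, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tblCustomer (CustID INTEGER PRIMARY KEY, LastName TEXT);
    CREATE TABLE tblRx (
        RxID   INTEGER PRIMARY KEY,
        CustID INTEGER REFERENCES tblCustomer(CustID),  -- foreign key (many side)
        Drug   TEXT
    );
    INSERT INTO tblCustomer VALUES (1, 'Garcia'), (2, 'Nguyen'), (3, 'Patel');
    INSERT INTO tblRx VALUES (10, 1, 'Drug A'), (11, 1, 'Drug B'), (12, 2, 'Drug C');
    """
)

# Join the tables on the common field and count prescriptions per customer.
# LEFT JOIN keeps customers that match zero records in the related table.
rx_counts = conn.execute(
    """
    SELECT c.LastName, COUNT(r.RxID)
    FROM tblCustomer AS c
    LEFT JOIN tblRx AS r ON r.CustID = c.CustID
    GROUP BY c.CustID
    ORDER BY c.CustID
    """
).fetchall()
```

The three customers illustrate the "zero, one, or many" cases: Patel has no prescriptions, Nguyen one, and Garcia two.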
One-to-Many Relationship Between Customers and Prescriptions
One-to-One Relationships
• Abbreviated as 1:1
• Exists when each record in one table matches exactly one record in related table
One-to-One Relationship Between Physical and Billing Addresses
Many-to-Many Relationships
• Abbreviated as M:N
• Each record in first table matches many records in second table
• Each record in second table matches many records in first table
• Junction table
  – Intermediate table that contains primary key fields of both tables
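A many-to-many relationship is usually implemented as two one-to-many relationships that meet in a junction table. The sketch below uses sqlite3 in place of Access; the table names, fields, and rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tblEmployee (EmpID INTEGER PRIMARY KEY, LastName TEXT);
    CREATE TABLE tblClass (ClassID INTEGER PRIMARY KEY, Title TEXT);
    -- Junction table: one record per (employee, class) pairing.
    CREATE TABLE tblEmployeeTraining (
        EmpID   INTEGER REFERENCES tblEmployee(EmpID),
        ClassID INTEGER REFERENCES tblClass(ClassID),
        PRIMARY KEY (EmpID, ClassID)
    );
    INSERT INTO tblEmployee VALUES (1, 'Garcia'), (2, 'Nguyen');
    INSERT INTO tblClass VALUES (100, 'CPR'), (101, 'First Aid');
    INSERT INTO tblEmployeeTraining VALUES (1, 100), (1, 101), (2, 100);
    """
)

# One employee matches many classes...
classes_for_garcia = [row[0] for row in conn.execute(
    """
    SELECT c.Title
    FROM tblEmployeeTraining AS t
    JOIN tblClass AS c ON c.ClassID = t.ClassID
    WHERE t.EmpID = 1
    ORDER BY c.Title
    """
)]

# ...and one class matches many employees.
employees_in_cpr = [row[0] for row in conn.execute(
    """
    SELECT e.LastName
    FROM tblEmployeeTraining AS t
    JOIN tblEmployee AS e ON e.EmpID = t.EmpID
    WHERE t.ClassID = 100
    ORDER BY e.LastName
    """
)]
```

The junction table's composite primary key also prevents recording the same employee in the same class twice.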
Many-to-Many Relationship Between Employees and Classes
Understanding Referential Integrity
• Null value
  – Field does not contain any value
• Entity integrity
  – Guarantee that there are no duplicate records in table
  – Each record unique
  – No primary key field contains null values
• Referential integrity
  – Applies when foreign key in one table matches primary key in second table
  – Values in foreign key must match values in primary key
Understanding Referential Integrity (continued)
• When database does not enforce referential integrity
  – Problems occur that lead to inaccurate and inconsistent data
• Orphaned record
  – Record in related table whose foreign key no longer matches any primary key in primary table
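Enforced referential integrity can be sketched with sqlite3, which checks foreign keys only after `PRAGMA foreign_keys = ON`. Access enforces the same idea through its relationship settings rather than this statement. The tables and rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.executescript(
    """
    CREATE TABLE tblCustomer (CustID INTEGER PRIMARY KEY, LastName TEXT);
    CREATE TABLE tblRx (
        RxID   INTEGER PRIMARY KEY,
        CustID INTEGER NOT NULL REFERENCES tblCustomer(CustID),
        Drug   TEXT
    );
    INSERT INTO tblCustomer VALUES (1, 'Garcia');
    INSERT INTO tblRx VALUES (10, 1, 'Drug A');
    """
)


def is_rejected(sql):
    """Run a statement and report whether the database refused it."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# A prescription for a customer that does not exist would be an orphan.
orphan_insert_rejected = is_rejected("INSERT INTO tblRx VALUES (11, 99, 'Drug B')")

# Deleting a customer who still has prescriptions would orphan them.
parent_delete_rejected = is_rejected("DELETE FROM tblCustomer WHERE CustID = 1")

# Look for orphans: related records with no matching primary key.
orphans = conn.execute(
    """
    SELECT r.RxID
    FROM tblRx AS r
    LEFT JOIN tblCustomer AS c ON c.CustID = r.CustID
    WHERE c.CustID IS NULL
    """
).fetchall()
```

Because both risky statements are refused, the orphan check finds nothing.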
Referential Integrity Errors
Overriding Referential Integrity
• Might want to override referential integrity
  – Intentionally change primary key
  – Delete parent record
• Cascade updates
  – Change primary key value so that DBMS automatically updates appropriate foreign key values in related table
• Cascade deletes
  – Delete parent record so that DBMS automatically deletes related records in related table
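Cascade updates and cascade deletes can be sketched in SQLite with `ON UPDATE CASCADE` and `ON DELETE CASCADE`. In Access the equivalent choices are options on the relationship, so this only illustrates the behavior. The tables and rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE tblCustomer (CustID INTEGER PRIMARY KEY, LastName TEXT);
    CREATE TABLE tblRx (
        RxID   INTEGER PRIMARY KEY,
        CustID INTEGER REFERENCES tblCustomer(CustID)
               ON UPDATE CASCADE ON DELETE CASCADE,
        Drug   TEXT
    );
    INSERT INTO tblCustomer VALUES (1, 'Garcia'), (2, 'Nguyen');
    INSERT INTO tblRx VALUES (10, 1, 'Drug A'), (11, 1, 'Drug B'), (12, 2, 'Drug C');
    """
)

# Cascade update: changing the primary key updates matching foreign keys.
conn.execute("UPDATE tblCustomer SET CustID = 5 WHERE CustID = 1")
after_update = conn.execute("SELECT RxID, CustID FROM tblRx ORDER BY RxID").fetchall()

# Cascade delete: deleting the parent record deletes its related records.
conn.execute("DELETE FROM tblCustomer WHERE CustID = 2")
after_delete = conn.execute("SELECT RxID, CustID FROM tblRx ORDER BY RxID").fetchall()
```

Cascade deletes remove related records without a further prompt from the database, so they deserve more caution than cascade updates.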
Level 2 Summary
• Main database objects:
  – Table
  – Query
  – Form
  – Report
• Relationship types:
  – One-to-many
  – One-to-one
  – Many-to-many
Level 3 Objectives: Identifying and Eliminating Database Anomalies by Normalizing Data
• Learn the techniques for normalizing data
• Evaluate fields that are used as keys
• Test the database design
Normalizing the Tables in the Database
• Normalization
  – Design process
  – Goals
    • Reduces space required to store data by eliminating duplicate data in database
    • Reduces inconsistent data in database by storing data only once
    • Reduces chance of deletion, update, and insertion anomalies
Normalizing the Tables in the Database (continued)
• Deletion anomaly
  – User deletes data from database
  – Unintentionally deletes only occurrence of data in database
• Update anomaly
  – Due to redundant data in database
  – User fails to update some records or updates records erroneously
• Insertion anomaly
  – User cannot add data to database unless preceded by entry of other data
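The anomalies are easiest to see in a single table that mixes two entities. In this sketch (sqlite3 standing in for Access) the hypothetical table tblRxFlat stores each doctor's phone number on every prescription record.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tblRxFlat (
        RxID INTEGER PRIMARY KEY, Drug TEXT, DoctorName TEXT, DoctorPhone TEXT
    );
    INSERT INTO tblRxFlat VALUES
        (1, 'Drug A', 'Dr. Lee', '555-0100'),
        (2, 'Drug B', 'Dr. Lee', '555-0100'),
        (3, 'Drug C', 'Dr. Kim', '555-0199');
    """
)

# Update anomaly: the phone number is changed on one record but not the other,
# so the database now holds two different numbers for the same doctor.
conn.execute("UPDATE tblRxFlat SET DoctorPhone = '555-0111' WHERE RxID = 1")
phones_for_lee = sorted(row[0] for row in conn.execute(
    "SELECT DISTINCT DoctorPhone FROM tblRxFlat WHERE DoctorName = 'Dr. Lee'"
))

# Deletion anomaly: deleting Dr. Kim's only prescription also deletes
# the only record of Dr. Kim's phone number.
conn.execute("DELETE FROM tblRxFlat WHERE RxID = 3")
kim_records = conn.execute(
    "SELECT COUNT(*) FROM tblRxFlat WHERE DoctorName = 'Dr. Kim'"
).fetchone()[0]

# Insertion anomaly (not executed): a new doctor cannot be recorded here
# until a prescription exists to carry the doctor's details.
```

Moving doctors into their own table, related to prescriptions by a common field, removes all three problems.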
Normalizing the Tables in the Database (continued)
• Functional dependency
  – Column in table considered functionally dependent on another column
    • If each value in second column associated with exactly one value in first column
• Partial dependency
  – Field dependent on only part of primary key
• Composite primary key
  – Primary key uses two or more fields to create unique records in table
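A functional dependency can be tested against sample records: every value of the determining fields must map to exactly one value of the dependent fields. The function and the training records below are hypothetical, and sample data can only disprove a dependency, never prove that it always holds.

```python
def is_functionally_dependent(rows, determinant, dependent):
    """Return True if, in these rows, `determinant` values fix `dependent` values."""
    seen = {}
    for row in rows:
        key = tuple(row[field] for field in determinant)
        value = tuple(row[field] for field in dependent)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Composite primary key: (EmpID, ClassID)
training = [
    {"EmpID": 1, "ClassID": 100, "ClassTitle": "CPR", "Date": "2007-01-10"},
    {"EmpID": 2, "ClassID": 100, "ClassTitle": "CPR", "Date": "2007-02-14"},
    {"EmpID": 1, "ClassID": 101, "ClassTitle": "First Aid", "Date": "2007-03-01"},
]

# ClassTitle depends on ClassID alone: a partial dependency on the composite key.
title_on_class = is_functionally_dependent(training, ["ClassID"], ["ClassTitle"])

# Date needs the whole key: ClassID alone does not determine it.
date_on_class = is_functionally_dependent(training, ["ClassID"], ["Date"])
date_on_key = is_functionally_dependent(training, ["EmpID", "ClassID"], ["Date"])
```

Here ClassTitle is the field with a partial dependency, because only part of the composite primary key determines it.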
Normalizing the Tables in the Database (continued)
• Determinant
  – Field or collection of fields whose value determines value in another field
  – Inverse of dependency
• Natural key
  – Primary key that details obvious and innate trait of record
• Artificial key
  – Field whose sole purpose is to create primary key
  – Usually visible to users
Normalizing the Tables in the Database (continued)
• Surrogate key
  – Computer-generated primary key
  – Usually invisible to users
First Normal Form
• Repeating group
  – Field contains more than one value
• First normal form
  – 1NF
  – Does not contain any repeating groups
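Removing a repeating group means giving each value its own record. A minimal sketch in plain Python, with hypothetical field names and data:

```python
# Not in 1NF: the Phones field holds more than one value.
unnormalized = [
    {"CustID": 1, "Phones": "555-0100, 555-0101"},
    {"CustID": 2, "Phones": "555-0199"},
]


def to_first_normal_form(records):
    """Split the repeating group so each record holds a single phone number."""
    rows = []
    for record in records:
        for phone in record["Phones"].split(","):
            rows.append({"CustID": record["CustID"], "Phone": phone.strip()})
    return rows


normalized = to_first_normal_form(unnormalized)
```

In the result, CustID alone no longer identifies a record, so the table needs a composite primary key of CustID and Phone, or a key field of its own.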
Second Normal Form
• 2NF
• Table must be in 1NF
• Must not contain any partial dependencies on composite primary key
• Tables that are in 1NF and have primary key with only one field
  – Automatically in 2NF
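Moving a table to 2NF means taking each partially dependent field out into a table keyed by the part of the key it depends on. A sketch in plain Python, assuming hypothetical training records whose composite primary key is (EmpID, ClassID):

```python
# In 1NF but not 2NF: ClassTitle depends only on ClassID, part of the key.
training = [
    {"EmpID": 1, "ClassID": 100, "ClassTitle": "CPR", "Date": "2007-01-10"},
    {"EmpID": 2, "ClassID": 100, "ClassTitle": "CPR", "Date": "2007-02-14"},
    {"EmpID": 1, "ClassID": 101, "ClassTitle": "First Aid", "Date": "2007-03-01"},
]


def to_second_normal_form(rows):
    """Split rows into a class table and a training table without ClassTitle."""
    classes = {}          # ClassID -> ClassTitle, stored once per class
    attendance = []       # (EmpID, ClassID, Date), keyed by the full composite key
    for row in rows:
        classes[row["ClassID"]] = row["ClassTitle"]
        attendance.append((row["EmpID"], row["ClassID"], row["Date"]))
    return classes, attendance


classes, attendance = to_second_normal_form(training)
```

Each class title is now stored once, so renaming a class is a single change.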
Third Normal Form
• 3NF
• Table must be in 2NF
• The only determinants in table must be candidate keys
• Candidate key
  – Field or collection of fields that could function as primary key but was not chosen to do so
• Transitive dependency
  – Occurs between two nonkey fields both dependent on third field
• Tables in 3NF should not have transitive dependencies
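A transitive dependency can be removed by moving the dependent nonkey fields into their own table. In this sketch (sqlite3 standing in for Access, with hypothetical tables and data), PlanName depends on PlanID, which is itself a nonkey field of the customer table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Not in 3NF: PlanID determines PlanName, but PlanID is not a candidate key.
    CREATE TABLE tblCustomerFlat (
        CustID INTEGER PRIMARY KEY, LastName TEXT, PlanID TEXT, PlanName TEXT
    );
    INSERT INTO tblCustomerFlat VALUES
        (1, 'Garcia', 'A', 'Basic'),
        (2, 'Nguyen', 'B', 'Premium'),
        (3, 'Patel',  'A', 'Basic');

    -- 3NF: plan details are stored once, in a table keyed by PlanID.
    CREATE TABLE tblHealthPlan (PlanID TEXT PRIMARY KEY, PlanName TEXT);
    INSERT INTO tblHealthPlan
        SELECT DISTINCT PlanID, PlanName FROM tblCustomerFlat;

    CREATE TABLE tblCustomer (
        CustID INTEGER PRIMARY KEY, LastName TEXT,
        PlanID TEXT REFERENCES tblHealthPlan(PlanID)
    );
    INSERT INTO tblCustomer
        SELECT CustID, LastName, PlanID FROM tblCustomerFlat;
    """
)

plans = conn.execute(
    "SELECT PlanID, PlanName FROM tblHealthPlan ORDER BY PlanID"
).fetchall()

# Joining the two tables on the common field rebuilds the original records.
rebuilt = conn.execute(
    """
    SELECT c.CustID, c.LastName, c.PlanID, p.PlanName
    FROM tblCustomer AS c
    JOIN tblHealthPlan AS p ON p.PlanID = c.PlanID
    ORDER BY c.CustID
    """
).fetchall()
original = conn.execute("SELECT * FROM tblCustomerFlat ORDER BY CustID").fetchall()
```

Because the join reproduces the original records, no information is lost by splitting the table.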
Level 3 Summary
• Normal forms
  – First (1NF)
  – Second (2NF)
  – Third (3NF)
Chapter Summary
• Discovery:
  – Identify existing and missing data
  – Organize data into tables
  – Determine data types for each field
• Table relationships
  – Established through common fields
  – Types
    • 1:M
    • 1:1
    • M:N
Chapter Summary (continued)
• Normalization
  – Reduces duplication and inconsistency
  – Forms:
    • 1NF
    • 2NF
    • 3NF