Chapter 9: Database Management Systems. Accounting Information Systems, 5th edition, James A. Hall
Dec 17, 2015
Flat-File Versus Database Environments
Computer processing involves two components: data and instructions (programs).
Conceptually, there are two methods for designing the interface between programs and data:
file-oriented (flat-file) processing: a specific data file was created for each application [legacy systems]
data-oriented processing: a single data repository is created to support numerous applications [current systems]
Disadvantages of file-oriented processing include redundant data and programs and varying formats for storing the redundant data.
File-Oriented (Flat-File) Environment
[Figure: three users (User 1, Sales; User 2, A/R; User 3, Payroll) each send transactions to their own program (Program 1, Program 2, Program 3). Each program owns a separate data file: A,B,C; X,B,Y; and L,B,M.]
Where B is a customer record. Assume B’s address changes. Each user that works with the B record must modify the address, increasing the chance of error.
Database Approach
[Figure: the same three users (User 1, Sales; User 2, A/R; User 3, Payroll) send transactions to Program 1, Program 2, and Program 3. All three programs go through the DBMS to a single database holding A,B,C,X,Y,L,M.]
With the database approach, any user could change B’s address, and it would be changed for all users. [Assume all users have authority to change the address for this example.]
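The contrast between the two environments can be sketched in a few lines of Python. This is only an illustration: the dictionaries stand in for data files and for the shared database, and the addresses are made-up values, not data from the text.

```python
# Flat-file environment: each application keeps its own copy of record B.
sales_file = {"B": {"address": "12 Elm St"}}
ar_file = {"B": {"address": "12 Elm St"}}
payroll_file = {"B": {"address": "12 Elm St"}}

# Only the Sales user records the change of address.
sales_file["B"]["address"] = "99 Oak Ave"
flat_addresses = {f["B"]["address"] for f in (sales_file, ar_file, payroll_file)}

# Database approach: one shared record that every application reads.
database = {"B": {"address": "12 Elm St"}}
database["B"]["address"] = "99 Oak Ave"
shared_addresses = {database["B"]["address"] for _ in ("sales", "ar", "payroll")}

print(len(flat_addresses))    # number of distinct addresses seen across the flat files
print(len(shared_addresses))  # number of distinct addresses seen through the database
```

In the flat-file case the three applications now disagree about B's address; with the single shared record they cannot.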
Data Redundancy & Flat-File Problems
Data Storage - storing the same data multiple times creates excessive storage costs, whether on paper documents or in magnetic form
Data Updating - any changes or additions must be performed multiple times (see example of B’s address)
Currency of Information - potential problem of failing to update all affected files
Task-Data Dependency - user’s inability to obtain additional information as his or her needs change
Advantages of the Database Approach
Data sharing/centralized database resolves flat-file problems:
No data redundancy - Data stored only once, eliminating data redundancy and reducing storage costs.
Single update - Because data are in only one place, requires only single update procedure, reducing time/cost of keeping database current.
Current values - Change to database made by any (authorized) user yields current data values for all other (authorized) users.
Task-data independence - As users’ information needs expand beyond immediate domain, new needs can be more easily satisfied than under flat-file approach.
Disadvantages of the Database Approach
Can be costly to implement: additional hardware, software, storage, and network resources are required, yet these costs continue to drop.
Can only run in certain operating environments: may be unsuitable for some existing system configurations (e.g., legacy systems).
Internal Controls and DBMS (database management system)
Purpose of the DBMS is to provide controlled access to the database.
The DBMS is special software programmed to know which data elements each user is authorized to access and to deny unauthorized requests for data.
Query Language
Query capability permits both end users and professional programmers to access data (information) in the database without writing conventional programs.
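As an illustration of query capability, the sketch below uses SQL through Python's built-in sqlite3 module. SQL is chosen here as a widely used query language; the slide itself does not name one. The customer table, its columns, and the sample rows are hypothetical.

```python
import sqlite3

# In-memory database with a small, made-up customer table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (cust_number INTEGER PRIMARY KEY, name TEXT, balance REAL)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(1, "Acme", 250.0), (2, "Baker", 0.0), (3, "Cole", 975.5)],
)

# The query states WHAT is wanted (customers with an open balance);
# the DBMS decides HOW to find the rows.
rows = conn.execute(
    "SELECT name, balance FROM customer WHERE balance > 0 ORDER BY balance DESC"
).fetchall()
print(rows)
```

The user writes one declarative statement instead of a program that opens a file, loops over records, and filters them.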
Three Steps in Designing a Database
1. Prepare conceptual model
   Identify entities
   Identify relationships between entities
   Prepare ER diagram (ERD)
2. Specify logical design
   Select logical database model (in current practice, almost always relational)
   Transform conceptual data model using logical database model
3. Implement physical design
   Physical structures
   Access methods
Example of a Relationship Linking Two Entities
[Figure: ER diagram. CUSTOMER (Cust Number, Name) places ORDER (Order Number, Date).]
Example of 1:M relationship, using Crow’s Feet notation. We usually don’t cover the “0” relationship.
Logical Data Structures
Objective is to develop the structure efficiently so data can be accessed quickly and easily.
Four types of database structures are:
hierarchical (tree structure)
network
relational
object-oriented
Concentrate on relational DBs. They are the systems being created at the current time.
The Relational Model
Relational model portrays data in the form of two-dimensional tables (looks like an Excel worksheet):
relation - database table
attributes/fields (data elements) - columns
tuples (records) - rows
data - intersection of rows and columns
Properly Designed Relational Tables
No repeating values – Primary Key cannot repeat in a table.
Attribute values in any column must all be of same class. [can’t put text in a date field]
Each column (field name) in a table must be uniquely named. [can’t have 2 address fields both named Address – use Addr1 and Addr2]
Each row (record) in a table must be unique in at least one attribute (primary key)
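Two of these rules can be seen being enforced in the minimal sketch below, which uses Python's built-in sqlite3 module. The customer table and its values are hypothetical. Column-type checking is not demonstrated because SQLite applies loose typing by default; other relational DBMSs are stricter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE customer (
        cust_number INTEGER PRIMARY KEY,  -- unique in every row
        name  TEXT NOT NULL,
        addr1 TEXT,                       -- two address fields, uniquely named
        addr2 TEXT
    )
    """
)
conn.execute("INSERT INTO customer VALUES (1, 'Acme', '12 Elm St', 'Suite 4')")

# A second row with the same primary key is rejected.
try:
    conn.execute("INSERT INTO customer VALUES (1, 'Baker', '7 Pine Rd', NULL)")
    duplicate_key_rejected = False
except sqlite3.IntegrityError:
    duplicate_key_rejected = True

# A table with two columns of the same name cannot be created.
try:
    conn.execute("CREATE TABLE bad (address TEXT, address TEXT)")
    duplicate_column_rejected = False
except sqlite3.OperationalError:
    duplicate_column_rejected = True

print(duplicate_key_rejected, duplicate_column_rejected)
```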
Relational Model Data Linkages (>1 table)
No explicit pointers are present – data are viewed as collection of independent tables.
“Relations” formed by attribute/field that is common to both tables in relation. This field is a “foreign key.”
How to assign foreign keys:
if 1:1 association, either table’s primary key may be the foreign key.
if 1:M association, primary key of the ONE side is embedded as foreign key in the MANY side.
if M:M (M:N) association, create separate linking table.
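The 1:M and M:N rules above can be sketched with sqlite3 as follows. The CUSTOMER/ORDER pairing follows the earlier ER example; the item and order_item tables, all column names, and the sample rows are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript(
    """
    CREATE TABLE customer (cust_number INTEGER PRIMARY KEY, name TEXT);

    -- 1:M: the primary key of the ONE side (customer) is embedded in the MANY side.
    CREATE TABLE orders (
        order_number INTEGER PRIMARY KEY,
        order_date   TEXT,
        cust_number  INTEGER NOT NULL REFERENCES customer(cust_number)
    );

    -- M:N between orders and items: a separate linking table holds both keys.
    CREATE TABLE item (item_number INTEGER PRIMARY KEY, description TEXT);
    CREATE TABLE order_item (
        order_number INTEGER REFERENCES orders(order_number),
        item_number  INTEGER REFERENCES item(item_number),
        quantity     INTEGER,
        PRIMARY KEY (order_number, item_number)
    );
    """
)
conn.execute("INSERT INTO customer VALUES (1, 'Acme')")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(100, "2015-12-01", 1), (101, "2015-12-10", 1)],
)

# The tables are linked only by the shared cust_number value, not by pointers.
orders_for_acme = conn.execute(
    "SELECT o.order_number FROM customer c "
    "JOIN orders o ON o.cust_number = c.cust_number "
    "WHERE c.name = 'Acme' ORDER BY o.order_number"
).fetchall()

# An order for a customer that does not exist is rejected.
try:
    conn.execute("INSERT INTO orders VALUES (102, '2015-12-11', 999)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(orders_for_acme, orphan_rejected)
```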
Three Types of Anomalies (anomalies are found in unnormalized tables)
Insertion Anomaly: New item cannot be added to table until at least one entity uses particular attribute item.
Deletion Anomaly: If attribute item used by only one entity is deleted, all information about that attribute item is lost.
Update Anomaly: Modification on attribute must be made in each of rows in which attribute appears.
Anomalies can be corrected by normalizing the data into properly designed relational tables.
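The three anomalies can be reproduced on a small unnormalized table. The sketch below is hypothetical: the inventory rows, supplier codes, and phone numbers are invented, and the table is modeled as a plain Python list with "part" assumed to be the primary key.

```python
# One unnormalized table: every inventory row repeats its supplier's details.
inventory = [
    {"part": "P1", "supplier": "S1", "supplier_phone": "555-0101"},
    {"part": "P2", "supplier": "S1", "supplier_phone": "555-0101"},
    {"part": "P3", "supplier": "S2", "supplier_phone": "555-0202"},
]

# Update anomaly: S1's phone changes, but only one of its rows is edited.
inventory[0]["supplier_phone"] = "555-0999"
s1_phones = {row["supplier_phone"] for row in inventory if row["supplier"] == "S1"}

# Deletion anomaly: removing part P3 also erases everything known about S2.
inventory = [row for row in inventory if row["part"] != "P3"]
known_suppliers = {row["supplier"] for row in inventory}

# Insertion anomaly: supplier S3 supplies no part yet, and a row
# without a primary key value ("part") cannot be stored.
def can_insert(row):
    return row.get("part") is not None

s3_insertable = can_insert(
    {"part": None, "supplier": "S3", "supplier_phone": "555-0303"}
)

print(len(s1_phones), sorted(known_suppliers), s3_insertable)
```

Moving supplier details into their own table, keyed by supplier, would remove all three problems.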
Relational Tables
Various items of interest (customers, inventory, sales) are stored in SEPARATE tables in the database.
Advantages:
Removes all three anomalies
Efficient use of space
Flexible: users can form ad hoc relationships for queries
Normalization Process
Process that breaks up large, complex tables into smaller tables that meet two conditions:
all nonkey attributes (fields) in the table are dependent on the primary key (PK)
all nonkey attributes (fields) are independent of other nonkey attributes (fields)
When unnormalized tables are split and reduced to third normal form, they are linked together by foreign keys (secondary keys).
StudentID  Lname      Fname    Major  CRN    Course    Instructor ID
80012      Garcia     Maria    ACCT   17256  ACCT3321  G001
80012      Garcia     Maria    ACCT   17535  ACCT3320  G003
80012      Garcia     Maria    ACCT   17004  MKT3311   G005
80014      Henry      Hank     MKT    17155  MKT3311   G005
80020      Hernandez  Barbara  CIS    17200  CIS2320   G010
80020      Hernandez  Barbara  CIS    17155  MKT3311   G005
80033      Johnson    Kris     ECON   16256  ACCT3321  S001
80033      Johnson    Kris     ECON   17005  ECON3315  S005
80033      Johnson    Kris     ECON   16500  CIS3545   M001
Condition to check: all nonkey attributes (fields) in a table are dependent on the primary key (PK).
This is an unnormalized table. Only Lname & Fname depend on the PK of Student ID. Other fields belong in other table(s).
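One possible decomposition of the table above is sketched below. It rests on two assumptions that the slide does not state: that each student has a single major, so Major is stored with the student, and that CRN identifies a course section and therefore determines Course and Instructor ID. The names students, sections, and enrollments are illustrative.

```python
# Rows copied from the unnormalized table:
# (StudentID, Lname, Fname, Major, CRN, Course, InstructorID)
rows = [
    (80012, "Garcia", "Maria", "ACCT", 17256, "ACCT3321", "G001"),
    (80012, "Garcia", "Maria", "ACCT", 17535, "ACCT3320", "G003"),
    (80012, "Garcia", "Maria", "ACCT", 17004, "MKT3311", "G005"),
    (80014, "Henry", "Hank", "MKT", 17155, "MKT3311", "G005"),
    (80020, "Hernandez", "Barbara", "CIS", 17200, "CIS2320", "G010"),
    (80020, "Hernandez", "Barbara", "CIS", 17155, "MKT3311", "G005"),
    (80033, "Johnson", "Kris", "ECON", 16256, "ACCT3321", "S001"),
    (80033, "Johnson", "Kris", "ECON", 17005, "ECON3315", "S005"),
    (80033, "Johnson", "Kris", "ECON", 16500, "CIS3545", "M001"),
]

students = {}        # PK StudentID -> (Lname, Fname, Major)
sections = {}        # PK CRN       -> (Course, InstructorID)
enrollments = set()  # linking table: (StudentID, CRN)

for sid, lname, fname, major, crn, course, instructor in rows:
    students[sid] = (lname, fname, major)
    sections[crn] = (course, instructor)
    enrollments.add((sid, crn))

print(len(students), len(sections), len(enrollments))
```

Each student's name is now stored once, each section once, and the enrollments table carries only the two keys that link them.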
Steps in Normalization
Start: unnormalized table with repeating groups
1. Remove repeating groups to reach first normal form (1NF). Does the field depend on the PK? If no, remove it and put it in another table.
2. Remove partial dependencies to reach second normal form (2NF). Only concerns tables with composite PKs. See Fig. 9-38.
3. Remove transitive dependencies to reach third normal form (3NF). Applies when a field depends on another non-key field in the table. See Fig. 9-40.
Accountants and Data Normalization
Update anomalies can generate conflicting and obsolete database values.
Insertion anomalies can result in unrecorded transactions and incomplete audit trails.
Deletion anomalies can cause loss of accounting records and destruction of audit trails.
Accountants should understand data normalization process and be able to determine whether database is properly normalized.
[Figure: CENTRALIZED COMPUTER SERVICES FUNCTION. Organization chart: President; VP Marketing, VP Computer Services, VP Operations, VP Finance. Next level: Systems Development, Database Administration, Data Processing. Lowest level: New Systems Development, Systems Maintenance, Data Control, Data Preparation, Computer Operations, Data Library.]
[Figure: DISTRIBUTED ORGANIZATIONAL STRUCTURE. Organization chart: President; VP Marketing, VP Finance, VP Operations, VP Administration; Treasurer, Controller, Manager Plant X, Manager Plant Y; six IPUs at the lowest level.]
Characteristics of DDP (Distributed Data Processing)
Advantages:
Cost reductions in hardware and data entry tasks
Improved cost control responsibility
Improved user satisfaction because control is closer to user level
Backup of data can be improved through use of multiple data storage sites
Disadvantages:
Loss of control (organization-wide)
Mismanagement of organization-wide resources
Hardware/software incompatibility
Redundant tasks/data
Incompatible tasks may be consolidated
Lack of standards
Centralized Databases in DDP Environment
Data are retained in a central location.
Remote IPUs (workstations) send requests for data.
Central site services the needs of remote workstations.
Actual data processing is performed at the remote workstation.
Data Currency
Can be an issue in DDP with a centralized database.
During transaction processing, data will temporarily be inconsistent as a record is being read and updated.
Database lockout procedures are necessary to keep workstations from reading inconsistent data and/or from writing over a transaction being written by another workstation.
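The need for lockout can be sketched as follows. This is an illustration only: the Account class, the balance figures, and the use of Python's threading.Lock to stand in for a database lockout are all assumptions. The first function replays a bad interleaving step by step; the second lets two workstation threads update the same record under a lock.

```python
import threading

class Account:
    """A single shared record with a lock that models database lockout."""
    def __init__(self, balance):
        self.balance = balance
        self.lock = threading.Lock()

def lost_update_demo():
    # No lockout: both workstations read before either one writes.
    acct = Account(1000)
    read_a = acct.balance         # workstation A reads 1000
    read_b = acct.balance         # workstation B reads 1000
    acct.balance = read_a + 200   # A writes 1200
    acct.balance = read_b - 100   # B writes 900, overwriting A's update
    return acct.balance

def locked_update(acct, amount):
    # Lockout: the read and the write happen as one unit.
    with acct.lock:
        current = acct.balance
        acct.balance = current + amount

def lockout_demo():
    acct = Account(1000)
    threads = [
        threading.Thread(target=locked_update, args=(acct, amount))
        for amount in (200, -100)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return acct.balance

print(lost_update_demo())
print(lockout_demo())
```

Without lockout, workstation A's update of +200 is lost; with it, both updates are reflected whichever thread runs first.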
Distributed Databases: Partitioning
Splits the central database into segments that are distributed to their primary users.
Advantages:
users’ control is increased by having data stored at local sites
transaction processing response time is improved
volume of transmitted data between workstations is reduced
potential data loss from a disaster is reduced
Distributed Databases: Replication
Duplication of entire database for multiple workstations
Effective for situations with a high degree of data sharing but no primary user. Supports read-only queries.
Data traffic between sites is reduced considerably.
The Deadlock Phenomenon
Especially a problem with partitioned databases.
Occurs when multiple sites lock each other out of data that they are currently using.
Special software is needed to analyze and resolve conflicts.
[Figure: three sites holding data A,B; C,D; and E,F. One site has locked A and is waiting for C, one has locked C and is waiting for E, and one has locked E and is waiting for A.]
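One common way for software to detect this situation is to look for a cycle in a "wait-for" graph. The sketch below applies that idea to the three locks in the figure. The site names and the find_deadlock function are hypothetical, and the sketch assumes each site waits for at most one resource at a time.

```python
# Lock state taken from the figure; the site names are invented.
holds = {"site1": "A", "site2": "C", "site3": "E"}   # site -> resource it has locked
waits = {"site1": "C", "site2": "E", "site3": "A"}   # site -> resource it is waiting for

def find_deadlock(holds, waits):
    """Return the sites that form a wait-for cycle, or [] if there is none."""
    owner = {resource: site for site, resource in holds.items()}
    # Edge: a waiting site points to the site that holds what it needs.
    graph = {site: owner.get(resource) for site, resource in waits.items()}
    for start in graph:
        seen = []
        node = start
        while node is not None and node not in seen:
            seen.append(node)
            node = graph.get(node)
        if node is not None:              # walked back onto a visited site
            return seen[seen.index(node):]
    return []

print(find_deadlock(holds, waits))
```

Once a cycle is found, the conflict can be resolved by ending one of the transactions in it so that its locks are released.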