WRDC-TR-90-8007
Volume V
Part 1

AD-A250 448

INTEGRATED INFORMATION SUPPORT SYSTEM (IISS)
Volume V - Common Data Model Subsystem
Part 1 - CDM Administrator's Manual

M. Apicella, R. Palumbo, S. Singh

Control Data Corporation
Integration Technology Services
2970 Presidential Drive
Fairborn, OH 45324-6209

September 1990

Final Report for Period 1 April 1987 - 31 December 1990

Approved for Public Release; Distribution is Unlimited

MANUFACTURING TECHNOLOGY DIRECTORATE
WRIGHT RESEARCH AND DEVELOPMENT CENTER
AIR FORCE SYSTEMS COMMAND
WRIGHT-PATTERSON AIR FORCE BASE, OHIO 45433-6533
NOTICE

When Government drawings, specifications, or other data are used
for any purpose other than in connection with a definitely related
Government procurement operation, the United States Government
thereby incurs no responsibility nor any obligation whatsoever,
regardless whether or not the government may have formulated,
furnished, or in any way supplied the said drawings,
specifications, or other data. It should not, therefore, be
construed or implied by any person, persons, or organization that
the Government is licensing or conveying any rights or permission
to manufacture, use, or market any patented invention that may in
any way be related thereto.

This technical report has been reviewed and is approved for
publication. This report is releasable to the National Technical
Information Service (NTIS). At NTIS, it will be available to the
general public, including foreign nations.

DAVID L. JUDSON, Project Manager                DATE
Wright-Patterson AFB, OH 45433-6533

FOR THE COMMANDER:

BRUCE A. RASMUSSEN, Chief                       DATE
WRDC/MTI
Wright-Patterson AFB, OH 45433-6533

If your address has changed, if you wish to be removed from our
mailing list, or if the addressee is no longer employed by your
organization, please notify WRDC/MTI, Wright-Patterson Air Force
Base, OH 45433-6533 to help us maintain a current mailing list.

Copies of this report should not be returned unless return is
required by security considerations, contractual obligations, or
notice on a specific document.

Unclassified
SECURITY CLASSIFICATION OF THIS PAGE

REPORT DOCUMENTATION PAGE

1a. REPORT SECURITY CLASSIFICATION: Unclassified
1b. RESTRICTIVE MARKINGS:
2a. SECURITY CLASSIFICATION AUTHORITY:
2b. DECLASSIFICATION/DOWNGRADING SCHEDULE:
3.  DISTRIBUTION/AVAILABILITY OF REPORT: Approved for Public
    Release; Distribution is Unlimited.
4.  PERFORMING ORGANIZATION REPORT NUMBER(S): UM 620341001
5.  MONITORING ORGANIZATION REPORT NUMBER(S): WRDC-TR-90-8007
    Vol. V, Part 1
6a. NAME OF PERFORMING ORGANIZATION: Control Data Corporation;
    Integration Technology Services
6b. OFFICE SYMBOL (if applicable):
6c. ADDRESS (City, State, and ZIP Code): 2970 Presidential Drive,
    Fairborn, OH 45324-6209
7a. NAME OF MONITORING ORGANIZATION: WRDC/MTI
7b. ADDRESS (City, State, and ZIP Code): WPAFB, OH 45433-6533
8a. NAME OF FUNDING/SPONSORING ORGANIZATION: Wright Research and
    Development Center, Air Force Systems Command, USAF
8b. OFFICE SYMBOL (if applicable): WRDC/MTI
8c. ADDRESS (City, State, and ZIP Code): Wright-Patterson AFB,
    Ohio 45433-6533
9.  PROCUREMENT INSTRUMENT IDENTIFICATION NUM.: F33600-87-C-0464
10. SOURCE OF FUNDING NOS. (PROGRAM ELEMENT NO., PROJECT NO.,
    TASK NO., WORK UNIT NO.): 78011F, 595600, F95600, 20950607
11. TITLE (Include Security Classification): See Block 19
12. PERSONAL AUTHOR(S): Control Data Corporation: Apicella, M. L.,
    Palumbo, R., and Singh, S.
13a. TYPE OF REPORT: Final Report
13b. TIME COVERED: 4/1/87 - 12/31/90
14. DATE OF REPORT (Yr., Mo., Day): 1990 September 30
15. PAGE COUNT: 222
16. SUPPLEMENTARY NOTATION: WRDC/MTI Project Priority 6203
17. COSATI CODES (FIELD, GROUP, SUB GR.): 1308 10905
18. SUBJECT TERMS (Continue on reverse if necessary and identify
    block no.):
19. ABSTRACT (Continue on reverse if necessary and identify block
    number):

    This document is the Common Data Model Administrator's User
    Manual. Its purposes are several and include:

    o Describing the philosophical and practical objectives of the
      CDM Administrator.
    o Discussing the CDM, its design, and its role in the IISS
      environment.
    o Describing the steps necessary to entering and maintaining
      data kept in the CDM.

    Block 11 - INTEGRATED INFORMATION SUPPORT SYSTEM (IISS)
    Vol V - Common Data Model Subsystem
    Part 1 - CDM Administrator's Manual

20. DISTRIBUTION/AVAILABILITY OF ABSTRACT: UNCLASSIFIED/UNLIMITED x
    SAME AS RPT.  DTIC USERS
21. ABSTRACT SECURITY CLASSIFICATION: Unclassified
22a. NAME OF RESPONSIBLE INDIVIDUAL: David L. Judson
22b. TELEPHONE NO. (Include Area Code): (513) 255-7371
22c. OFFICE SYMBOL: WRDC/MTI

DD FORM 1473, 83 APR          EDITION OF 1 JAN 73 IS OBSOLETE

Unclassified
SECURITY CLASSIFICATION OF THIS PAGE

UM 620341001
30 September 1990
FOREWORD

This technical report covers work performed under Air Force
Contract F33600-87-C-0464, DAPro Project. This contract is
sponsored by the Manufacturing Technology Directorate, Air Force
Systems Command, Wright-Patterson Air Force Base, Ohio. It was
administered under the technical direction of Mr. Bruce A.
Rasmussen, Branch Chief, Integration Technology Division,
Manufacturing Technology Directorate, through Mr. David L. Judson,
Project Manager. The Prime Contractor was Integration Technology
Services, Software Programs Division, of the Control Data
Corporation, Dayton, Ohio, under the direction of Mr. W. A.
Osborne. The DAPro Project Manager for Control Data Corporation
was Mr. Jimmy P. Maxwell.

The DAPro project was created to continue the development, test,
and demonstration of the Integrated Information Support System
(IISS). The IISS technology work comprises enhancements to IISS
software and the establishment and operation of IISS test bed
hardware and communications for developers and users.

The following list names the Control Data Corporation
subcontractors and their contributing activities:

SUBCONTRACTOR                 ROLE

Control Data Corporation      Responsible for the overall Common
                              Data Model design development and
                              implementation, IISS integration and
                              test, and technology transfer of
                              IISS.

D. Appleton Company           Responsible for providing software
                              information services for the Common
                              Data Model and IDEF1X integration
                              methodology.

ONTEK                         Responsible for defining and testing
                              a representative integrated system
                              base in Artificial Intelligence
                              techniques to establish fitness for
                              use.

Simpact Corporation           Responsible for Communication
                              development.

Structural Dynamics           Responsible for User Interfaces,
Research Corporation          Virtual Terminal Interface, and
                              Network Transaction Manager design,
                              development, implementation, and
                              support.

Arizona State University      Responsible for test bed operations
                              and support.

Table of Contents

SECTION 1. INTRODUCTION
  1.1   Managing Data as a Corporate Resource

SECTION 2. CDM OVERVIEW
  2.1   The Fundamental Approach
  2.1.1 The Three-Schema Architecture
  2.1.2 Representation of the Three Types of Schemas
  2.1.3 Integration Methodology
  2.1.4 Contributions to IRRIASSPA
  2.2   Basic Components of the Design
  2.2.1 The CDM Database
  2.2.2 CDM1
  2.2.3 The CDM Processor

SECTION 3. RESPONSIBILITIES OF THE CDM ADMINISTRATOR
  3.1   Establishing Data Standards
  3.2   Maintaining the CDM
  3.3   Protecting the CDM
  3.4   Facilitating Use of the CDM

SECTION 4. MAINTAINING THE CONCEPTUAL SCHEMA
  4.1   Methodology Overview
  4.1.1 CS Structure
  4.1.2 Basic Approach
  4.1.3 Modeling Forms
  4.2   Building the Initial CS
  4.2.1 Phase 0: Starting the Project
  4.2.2 Phase 1: Defining Entity Classes
  4.2.3 Phase 2: Defining Relation Classes
  4.2.4 Phase 3: Defining Key Classes
  4.2.5 Phase 4: Defining Nonkey Attribute Classes
  4.3   Expanding the CS
  4.3.1 Phase 0: Starting the Project
  4.3.2 Phase 1: Defining Entity Classes
  4.3.3 Phase 2: Defining Relation Classes
  4.3.4 Phase 3: Defining Key Classes
  4.3.5 Phase 4: Defining Nonkey Attribute Classes

SECTION 5. MAINTAINING THE CDM
  5.1   Methodology Overview
  5.1.1 Generic NDDL Commands
  5.1.2 Transaction NDDL Commands
  5.2   Loading the Initial CS Description
  5.2.1 Loading Domains
  5.2.2 Defining the Model
  5.2.3 Loading Attribute Classes
  5.2.4 Loading Entity Classes
  5.2.5 Loading Key Classes and Relation Classes
  5.3   Modifying/Deleting CS Objects
  5.3.1 Domain Class Changes
  5.3.2 Model Changes/Deletes
  5.3.3 Attribute Class Changes/Deletes
  5.3.4 Entity Class Changes/Deletes
  5.3.5 Relation Class Changes/Deletes
  5.4   Modeling & Validating Tools
  5.5   Reviewing the Contents of the CDM

SECTION 6. MAINTAINING INTERNAL SCHEMAS AND MAPPINGS
  6.1   Methodology Overview
  6.1.1 Internal Schema and CS-IS Mapping Structure
  6.1.2 CS-IS Mapping Modeling Forms
  6.2   Loading The Initial Internal Schema
  6.2.1 Loading The Distributed Database Environment
  6.2.2 Loading User-Defined data types
  6.2.3 Loading Databases
  6.2.4 Loading Record Types And Data Fields
  6.3   Loading the Initial CS-IS Mapping Definition
  6.3.1 Loading CS to IS Mappings
  6.3.2 Loading Record Unions
  6.3.3 Loading Horizontal Partitions
  6.3.4 Loading Transformational Algorithms
  6.4   Modifying/Deleting IS Objects
  6.4.1 Distributed Database Environment Changes
  6.4.2 Modifying User-Defined datatypes
  6.4.3 Database Changes/Deletes
  6.4.4 Record Type Changes/Deletes
  6.4.5 Datafield Changes/Deletes
  6.4.6 Modifying/Deleting CS-IS Mappings
  6.4.7 Record Union Changes/Deletes
  6.4.8 Horizontal Partition Changes/Deletes
  6.5   Specific Considerations
  6.5.1 IMS Specific Considerations
  6.5.2 VSAM Specific Considerations
  6.5.3 Sequential Files Specific Considerations

SECTION 7. MAINTAINING EXTERNAL SCHEMAS MAPPINGS
  7.1   Methodology Overview
  7.1.1 External Schemas and CS-ES Mapping Structure
  7.1.2 Modeling Forms
  7.2   Loading the Initial ES & CS-ES Mapping Definition
  7.2.1 Loading User-Defined data types
  7.2.2 Loading User Views and Data Items
  7.2.3 Loading Transformation Algorithms
  7.3   Modifying/Deleting ES Elements and CS-ES Mappings
  7.3.1 Modifying User-Defined data types
  7.3.2 User View Changes/Deletes

APPENDIX A GLOSSARY
APPENDIX B USEFUL REFERENCES

List of Illustrations

Figure  Title

1-1     Data as an Integral Part of the Decision-Making Process
2-1     Two Fundamentally Different Views of Data: Logical and
        Physical
2-2     Direct Mapping of Logical and Physical Views
2-3     The Three-Schema Architecture
4-1     Relation Classes Form
4-2     Relation Classes Form Example
4-3     Owned Attribute Classes Form
4-4     Owned Attribute Classes Form Example
4-5     Inherited Attribute Classes Form
4-6     Inherited Attribute Classes Form Example
4-7     Refinements of Nonspecific Relation Classes Example
4-8     Triads and Other Dual-Path Structures
4-9     Migration Through Two Relation Classes
4-10    Guidelines for Determining Key Classes of Dependent Entity
        Classes
5-1     CDM Objects
5-2     CDM Object Description
5-3     CDM Conceptual Schema
5-4     Owned Attribute Classes Form Example
5-5     Figure Entity Class Glossary Form Example
5-6     Inherited Attribute Classes Form Example
5-7     Relation Classes Form Example
6-1     Entity Class/Record Type Mapping
6-2     Join Examples
6-3     Join Structures
6-4     Record Type/Entity Class Mapping Form
6-5     Record Type/Entity Class Mapping Form Example
6-6     Record Type Join Structures Diagram
6-7     Record Type Join Structures Diagram Example
6-8     Data Field/Attribute Use Class Mapping
6-9     Data Field/Attribute Use Class Mapping Example
6-10    Set Type/Relation Class Mapping
6-11    Set Type/Relation Class Mapping Example
6-12    Data Field/Attribute Use Class Mapping Example
6-13    Record Type Join Structure Diagram Example
6-14    Incomplete Join Structure Example
6-15    CDM Tables Distributed Data Bases
6-16    CDM Tables Domains and Data Types for Internal Schema
6-17    CDM Tables Relational Database Internal Schema
6-18    CODASYL Internal Schema
6-19    CS to IS Entity Mapping
6-20    Record Type/Entity Class Mapping
6-21    CS to IS Attribute and Relation Mapping
6-22    Datafield to Attribute Use Class Mapping
6-23    Set Type to Relation Class Mapping
6-24    Record Union
6-25    Horizontal Partition
6-26    Complex Mapping Algorithm
6-27    IMS Internal Schema
6-28    IMS Internal Schema
7-1     Data Item/Attribute Use Class Mappings
7-2     Vertical Partition
7-3     Entity Joins
7-4     ES-CS Join Examples
7-5     ES-CS Join Structures
7-6     Single Entity Views
7-7     Domains and Data Types External Schema
7-8     External Schema and CS/ES Mapping

SECTION 1

INTRODUCTION

The purposes of this document are several and include:

a) Describing the philosophical and practical objectives of the
   Common Data Model (CDM) Administrator;

b) Discussing the CDM itself, its underlying design, and its role
   in the IISS environment;

c) Describing in detail the steps necessary in entering and
   maintaining data kept in the CDM.

After reading and understanding this document, the CDM
Administrator should be able not only to collect, enter, and
maintain CDM-related data, but also to understand the reasons why
such activities are performed.

The NDDL statements used to perform the actual CDM maintenance
activities are described in detail in the NDDL User Guide.

1.1 Managing Data as a Corporate Resource

Managing data as a corporate resource is a philosophy about the
importance of data to an organization. The approach recognizes
that data are assets to be managed along with the other more
generally recognized resources of an enterprise, including its
personnel, inventories, capital, and so forth. Organizations spend
tremendous sums of money collecting and manipulating data, trying
to extract information needed to support decision making. The CDM
Administrator has as one of his or her primary objectives the
preservation of that continuing, substantial investment in data
resources. The CDM Administrator plays a major role in protecting
and properly managing that investment by managing common data
rather than just managing applications that access data.

Data management includes all the activities that ensure that
quality data are available to produce needed information and
knowledge. The objective of data management is to keep data assets
resilient, flexible, and adaptable to supporting decision-making
activities in the business. Data management responsibilities
include: 1) the representation, storage, and organization of data
so that they can be selectively and efficiently accessed, 2) the
manipulation and presentation of data so that they support the
user environment effectively, and 3) the protection of data so
that they retain their value.

The philosophy of the CDM recognizes that data are absolutely
necessary to the decision-making cycles of organizations (Figure
1-1). Individuals must not only be able to collect and retain data
for their own use, but also be able to share data and pool their
knowledge resources. The ability to correlate information across
traditional applications boundaries and to provide information
that supports all levels of decision making, from operational
through tactical through strategic, is increasingly important as
management at all levels is becoming more aware of the potential
power of information systems.

The CDM provides the capability to pull the enterprise's database
resources together to form an integrated, common source of
information to support decision making.

The objectives of data management include the following:

o Independence of data access from data descriptions
o Increased data accessibility
o Improved data integrity
o Improved data shareability
o Improved data resiliency
o Improved data administration and control
o Improved data security
o Improved performance

The CDM Administrator needs to understand each of these
objectives.

Independence between data access and data descriptions improves
control over the data descriptions, facilitates standardization of
data-naming conventions, and reduces the programming effort
required to accommodate modified data descriptions. Data
independence is perhaps the single most important factor in
determining the long-range success of a data-driven environment.

[Figure 1-1 diagram labels: Data Pool, Facts, Information,
Knowledge, Decisions, Actions]

Figure 1-1. Data as an Integral Part of the Decision-Making
Process.

Data accessibility refers to the capability for a user to extract
needed information from the data resource. Data accessibility is
enhanced by user-friendly interface languages and well-designed
screens. Good accessibility is characterized by being able to
relate data in many different ways to produce information, and by
being able to represent that information in a variety of suitable
forms. Data accessibility is improved by the CDM in its support of
multiple access paths and retrieval sequences through the physical
databases. Programming effort for data manipulation is decreased
and cost-effective, general-purpose query facilities such as the
NDML become possible.

Data integrity is essential to maintain the quality of the data
resource. Data integrity is measured by the completeness and
consistency of the data resource. Does it contain the data that
are relevant to the decision-making needs of the user? Does it
contain all required interrelationships among types of data, and
are all consistency constraints satisfied?

Data shareability is needed to keep common data truly common.
Without shareability, data proliferate and their quality becomes
uncontrollable. Without shareability, data are private and
personal; their quality is each individual user's responsibility.
The main difficulty with this distribution and redundancy of
control is that it results in no control at all. Improved
shareability can be achieved by supporting multiple access paths
through the physical databases, thereby enabling them to serve
many diverse needs. Shareability is also achieved by separating
individual users' views of the data resource from the actual
physical implementation of databases.

Data shareability refers not just to database contents, but also
to logic that accesses and manages data. Reduced data duplication
streamlines data access, reduces the programming effort required
for updating data, and reduces the potential for inconsistent
data. Reduced redundancy in the data management effort improves
the productivity of data processing personnel.

Data recoverability is needed to keep the data resource resilient
in the wake of errors. Error conditions need to be detected and
corrected. Better yet, errors should be prevented from occurring
in the first place. Part of the difficulty in providing a
resilient data resource is continuing to make the data available
to users while recovering from errors.

The CDM Administrator should help to ensure that the data resource
continues to satisfy users' information needs, even as those needs
change through time. Many organizations have successfully
established data administration functions to help develop and
protect data assets. The CDM Administrator plays a similar role
for the integrated, overall data resource.

Data security is essential to prevent unauthorized access to data.
Certainly not all environments require the same, elaborate
security schemes, but nearly all organizations' data assets need
to have some degree of access protection. Some data are wide open
to public retrieve-only access; others require strict
authentication to provide retrieval. Many databases have more
stringent restrictions on accesses that will change database
contents than on accesses that only read database contents.

Performance of the data resource has two facets: efficiency and
effectiveness. Efficiency is a measure of how well the data system
utilizes physical computer support, while effectiveness is a
measure of how well the data system meets users' information
needs. The characteristics are closely related; for example, a
user may be totally dissatisfied with the system if response time
is measured in hours rather than seconds. Response time is
generally considered to be an efficiency measure, but it certainly
has an impact on effectiveness.

SECTION 2
CDM OVERVIEW
2.1 The Fundamental Approach
2.1.1 The Three-Schema Architecture
A key to implementing effective data-oriented environments lies in
a framework that is called the Three-Schema Architecture. This
approach was proposed in the mid-1970s, then developed, and
finally published in 1977 in a report from a committee of the
American National Standards Institute, "The ANSI/X3/SPARC DBMS
Framework: Report of the Study Group on Data Base Management
Systems."

The basic concepts proposed in the report have the power to lead
us to more effective information resource management. They are
implemented in the CDM.

The Three-Schema Architecture is based upon several fundamental
facts:

o Computers and users need to be able to view the same data in
  different ways

o Different users need to be able to view the same data in
  different ways

o It is (more or less) frequently desirable for users and
  computers to change the ways they view data

o It is undesirable for the computer to dictate or constrain the
  ways that users view data

Thus, it is necessary to be able to support different types of
views of a data resource. Users need to be able to work with
logical representations of data, which are independent of any
physical considerations of how the data are actually stored and
managed on computer facilities. Users view data in terms of
high-level entities, e.g., staff members, tools, vehicles,
products, orders, and customers. Meanwhile, computer facilities,
access methods, operating systems, and DBMSs, for example, need to
be able to work with more physical representations. They view data
in terms of records and files, with index structures, B-trees,
linked lists, pointers, addresses, pages, and so forth.

These requirements lead us to conclude first that there are two
fundamentally different types of data views: logical and physical.
The logical views are user-oriented, while the physical views are
computer-oriented (Figure 2-1).

A second conclusion is that there must be a mapping or
transformation between the logical and physical views. After all,
the ultimate objective is to enable users to gain access to their
data that reside on computerized media. This mapping might be
simple if there were only one user view and one database, but that
is not the real-world situation. Rather, there are multitudes of
user views and commonly many databases (sometimes hundreds or
thousands) in an enterprise.

Each user view could be mapped directly to the underlying
databases (Figure 2-2). This solution suffers, however, when
change is introduced in either type of view. If a physical
database is restructured on a disk to provide more efficient
performance, then the mapping to each of the user views that
references that database can be affected. If a logical view is
revised to present information in a somewhat different way, then
the mapping to each of the referenced databases may be affected.
Independence of logical and physical considerations would not have
been achieved, and we would find that physical computer factors
would constrain the ways that users logically view their data.
This is undesirable.

Using three-schema architecture terminology, "external schemas"
represent user views of data, while "internal schemas" represent
physical implementations of databases. Schemas are metadata, i.e.,
they are data about data. As a simple example, CUSTOMER-NAME and
CHARACTER (17) are metadata describing the data value CHRISTOPHER
ROBIN.
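
The distinction between a data value and the metadata that
describes it can be made concrete with a short sketch. The Python
below is illustrative only: the class name FieldDescription and
its attributes are assumptions introduced here, and they do not
correspond to any CDM table or NDDL statement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldDescription:
    """Hypothetical metadata record: data about a data value."""
    name: str        # e.g. CUSTOMER-NAME
    data_type: str   # e.g. CHARACTER
    length: int      # declared size, e.g. 17

    def describes(self, value: str) -> bool:
        """Return True if the value fits the declared type and length."""
        return self.data_type == "CHARACTER" and len(value) <= self.length


# Metadata taken from the example in the text.
customer_name = FieldDescription("CUSTOMER-NAME", "CHARACTER", 17)

# The data value itself, which the metadata describes.
value = "CHRISTOPHER ROBIN"
fits = customer_name.describes(value)
```

The metadata object says nothing about which customer is stored;
it only constrains and names whatever value occupies the field.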

To enable multiple users to share a data resource that is
implemented on potentially many physical databases, we insert
between the users' views and the physical views a neutral,
integrated view of the data resource. This view is called a
"conceptual schema" in three-schema architecture terminology.
Others sometimes call it an "enterprise view."
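
A small worked count suggests why the intermediate view helps.
The sketch below assumes the worst case in which every user view
touches every database; the function names and the figures of 50
views and 10 databases are hypothetical and chosen only for
illustration.

```python
def direct_mapping_count(num_views: int, num_databases: int) -> int:
    """Upper bound when each user view maps directly to each database."""
    return num_views * num_databases


def three_schema_mapping_count(num_views: int, num_databases: int) -> int:
    """One mapping per external schema plus one per internal schema,
    each to the single conceptual schema."""
    return num_views + num_databases


views, databases = 50, 10  # hypothetical enterprise sizes
direct = direct_mapping_count(views, databases)
via_conceptual = three_schema_mapping_count(views, databases)
```

Under these assumptions, restructuring one database can disturb
up to 50 direct mappings, whereas with a conceptual schema only
that database's single mapping has to be revised.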

[Figure 2-1 diagram labels: Logical Data Views, Physical Data
Views]

Figure 2-1. Two Fundamentally Different Views of Data: Logical
and Physical

[Figure 2-2 diagram labels: User View 1, User View 2, User View 3,
Database A, Database 0]

Figure 2-2. Direct Mapping of Logical and Physical Views

As the vehicle for data integration and sharing, the conceptual
schema also carries metadata for enforcement of data integrity
rules. It is extensible, consistent, accessible, shareable, and
enables the data resource to evolve as needs change and mature.

Figure 2-3 illustrates the relationships between the three types
of schemas. The schemas and the mappings between them are the
mechanism for achieving both data independence and support of
multiple views. An internal schema can be changed to improve
efficiency and take advantage of new technical developments
without altering the conceptual schema.

The conceptual schema represents knowledge of shareable data.
There may be access controls and security restrictions placed upon
these common data, but they are not restricted to access by only
one user. The conceptual schema does not describe personal data.

The scope of the conceptual schema expands through time. The
conceptual schema extension methodology continually expands the
conceptual schema to include knowledge of more shared data. The
external-conceptual mappings protect the external schemas and the
transactions/programs that depend on them from most modifications
incurred in evolving the conceptual schema.

Adding data to the integrated, common resource does not start over
in defining the data resource, nor does it create another
stand-alone database. Rather, development of its database must
examine questions of how those data relate to what is already
known by the conceptual schema. The result will be an integrated
data resource whose scope is expanded gradually. It is absolute
folly to approach integration of the data resources of an
organization all at once; the job must be taken on piecemeal. The
conceptual schema is the integrator.

The CDM contains all three types of schemas, as well as the
interschema mappings. It not only documents these metadata, but
also supplies appropriate metadata to support transaction
processing.

[Figure 2-3 diagram labels: External Schemas, Conceptual Schema,
Internal Schemas]

Figure 2-3. The Three-Schema Architecture: One Conceptual Schema
That Provides for Integration and Independence of Many External
Schemas and Many Internal Schemas

2.1.2 Representation of the Three Types of Schemas

In the IISS, the Three-Schema Architecture is implemented through
the CDM facilities to store each of the three types of schemas and
the interschema mappings. An appropriate representation mode has
been selected for each of the three types of schemas.

The conceptual schema is represented by an IDEF1 model. The CDM
stores this model in terms of entity classes, attribute classes,
and relation classes.

The external schemas are represented by tables. The user views the
common data resource in terms of flat, simple tables. The mappings
between these tables and the IDEF1 model of the conceptual schema
are part of the CDM database.

The internal schemas are represented in terms of physical database
components, including record types and inter-record relationships.
The CDM Processor routines convert the users' data access
requests, which are phrased in terms of tables, into requests
against the conceptual schema IDEF1 model, then into requests
against the physical database structures described in the internal
schema part of the CDM.
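
The two-stage conversion described above can be pictured with a
minimal sketch. It is not the CDM Processor's actual logic: the
table STAFF_LIST, the entity class EMPLOYEE, the record type
EMP_REC, the database PERSONNEL_DB, and both dictionaries are
hypothetical names invented for this illustration.

```python
# External-to-conceptual mapping (hypothetical):
# (table, column) -> (entity class, attribute class)
ES_TO_CS = {
    ("STAFF_LIST", "NAME"): ("EMPLOYEE", "EMPLOYEE_NAME"),
    ("STAFF_LIST", "DEPT"): ("EMPLOYEE", "DEPARTMENT_NAME"),
}

# Conceptual-to-internal mapping (hypothetical):
# (entity class, attribute class) -> (database, record type, data field)
CS_TO_IS = {
    ("EMPLOYEE", "EMPLOYEE_NAME"): ("PERSONNEL_DB", "EMP_REC", "ENAME"),
    ("EMPLOYEE", "DEPARTMENT_NAME"): ("PERSONNEL_DB", "EMP_REC", "DNAME"),
}


def to_conceptual(table, columns):
    """Rephrase a table-oriented request in conceptual schema terms."""
    return [ES_TO_CS[(table, column)] for column in columns]


def to_internal(conceptual_refs):
    """Rephrase conceptual references as physical data field references."""
    return [CS_TO_IS[ref] for ref in conceptual_refs]


def translate(table, columns):
    """Chain both stages: external -> conceptual -> internal."""
    return to_internal(to_conceptual(table, columns))


physical = translate("STAFF_LIST", ["NAME", "DEPT"])
```

In this sketch, moving a field to a different record type changes
only an entry in CS_TO_IS; the user's request against STAFF_LIST
and the ES_TO_CS entries stay as they were.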
2.1.3 Integration Methodology
The Integration Methodology is the set of procedures
andquidelines that are used to expand the conceptual schema and
toincrease the sphere of common data available to support users
andapplications. The schemas and schema mappings in the CDM
arebuilt, maintained, and accessed using the Integration
Methodologyand the CDM Processors. (CDMP)
The Integration Methodology is intended to guide the CDM
Administrator in building and maintaining the conceptual schema
and in keeping its mappings to the internal and external schemas
highly accurate. This methodology consists of a set of techniques
for building the conceptual schema in gradual increments, for
building external and internal schemas from portions of the
conceptual schema, for developing schema mappings, and for keeping
these various CDM components current.
The first step in populating the CDM is to select a portion of
the data and to document it in the conceptual schema. Then external
and internal schemas for those data are built and mapped to the
conceptual schema. Subsequently, other portions of the data resource
are incorporated into the conceptual schema, and new external and
internal schemas and mappings are developed. The CDM is populated
gradually, in increments, rather than all at once. It evolves
through time.
A conceptual schema is represented by a semantic data model. The
IISS uses the IDEF1 methodology, with certain extensions from
DACOM's Data Modeling Technique. (Subsequent to the development
of the CDM subsystem, IDEF1 was formally extended. See Appendix B for
references.) The data model reflects business policy, provides a
rigorous view of the meaning of the data resource, and is
independent of the physical implementation of the data resource.
Building a data model is a rigorous procedure whose objective is
to discover and document the semantic data structure in its most
fundamental terms. The modeling is a multi-step process that
requires substantial input from users who are expert in the subject
area.
The fundamental steps of the CDM Integration Methodology are as
follows:
1. Identify the scope of the initial increment of the conceptual
   schema.
2. Develop the data model for that initial increment of the
   conceptual schema.
3. Load the data model into the CDM database.
4. Identify any physical databases or files within the scope of
   data in the conceptual schema.
5. Load their internal schemas into the CDM database.
6. Build the conceptual-to-internal schema mappings for the
   internal schemas loaded in Step 5.
7. Load the conceptual-to-internal schema mappings into the CDM
   database.
8. Determine which users/application programs should have
   external schemas mapped from the conceptual schema.
9. Design the external schemas identified in Step 8, and their
   mappings to the conceptual schema.
10. Load the external schemas and external-to-conceptual schema
    mappings into the CDM database.
11. Identify the scope of the next increment to the conceptual
    schema.
12. Develop the data model for the next increment of the
    conceptual schema.
13. Integrate the data model from Step 12 with the data model of
    the existing conceptual schema.
14. Load the integrated data model into the CDM database.
15. Verify that the conceptual-to-internal and
    external-to-conceptual schema mappings are still valid,
    correcting them as needed.
16. Identify any additional physical databases or files that are
    now within the scope of the extended conceptual schema.
17. Load their internal schemas into the CDM database.
18. Build the conceptual-to-internal schema mappings for the
    incremented portions of the conceptual schema.
19. Load the conceptual-to-internal schema mappings into the CDM
    database.
20. Identify any additional users or application programs that
    should be supported by the extended conceptual schema.
21. Design external schemas to support the users/application
    programs identified in Step 20, and develop their
    external-to-conceptual schema mappings.
22. Load the external schemas and external-to-conceptual schema
    mappings from Step 21 into the CDM database.
23. Repeat Steps 11 through 22 for each increment to the
    conceptual schema.
The evolutionary strategy for the conceptual schema should be
developed early in the life of the above cycle. The strategy
should ensure that the common data resource evolves in a manner
that serves the enterprise's need for controlled, shared data.
One tactic is to define the initial scope by that of an existing
database that has a corresponding data model. Ideally, that
database would contain core information of high interest to the
target user community.
Perhaps the most important point to understand about the CDM
Integration Methodology is that the incorporation of additional
data into the common data resource MUST be done in conjunction
with the existing conceptual schema. No data can be accessed
using the CDM integrated facilities, including the Neutral Data
Manipulation Language, unless they are known to the CDM. Adding
data causes the conceptual schema to expand in a consistent manner
that enables integration to occur. By contrast, adding data to an
environment that does not use conceptual schema technology just
adds more fragmentation to what is probably already at best an
interfaced (not integrated) system.
Applying the CDM Integration Methodology is not like swallowing a
pill. It requires precise knowledge of the meanings of the data that
are to be available in the integrated common data resource. It means
not just building IDEF1 models for those databases, but also
analyzing the models for overlap, synonyms, homonyms, and all the
incipient anomalies and quirks that somehow have crept into our
database structures over the years. The cost is measured in
man-months of effort; the benefits are integration and a knowledge
base that can be built on and evolved in the future.
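As a rough illustration of the analysis for synonyms and homonyms, the sketch below compares two sets of named definitions. It is an assumption-laden simplification: the function name and the dictionary layout are hypothetical, and matching on identical definition text is only a crude stand-in for the judgment of users who are expert in the subject area.

```python
def find_naming_conflicts(model_a, model_b):
    """Compare two {name: definition} dictionaries from two data models.

    Returns (homonyms, synonyms):
      homonyms - names used in both models with different definitions
      synonyms - (name in A, name in B) pairs that share one definition
    """
    def norm(text):
        # Ignore case and spacing differences when comparing definitions.
        return " ".join(text.lower().split())

    homonyms = sorted(
        name
        for name in model_a.keys() & model_b.keys()
        if norm(model_a[name]) != norm(model_b[name])
    )
    by_definition = {norm(text): name for name, text in model_b.items()}
    synonyms = sorted(
        (name, by_definition[norm(text)])
        for name, text in model_a.items()
        if norm(text) in by_definition and by_definition[norm(text)] != name
    )
    return homonyms, synonyms


# Hypothetical models for illustration.
model_a = {
    "Part": "An item that is manufactured",
    "Order": "A customer request for parts",
}
model_b = {
    "Item": "an item that is  manufactured",
    "Order": "An internal work order",
}
homonyms, synonyms = find_naming_conflicts(model_a, model_b)
```

Here "Order" is reported as a homonym and ("Part", "Item") as a synonym pair; each finding would still need review before the models are integrated.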
2.1.4 Contributions to IRRIASSPA
The use of the Common Data Model and the Three-Schema
Architecture allows an organization to benefit from contributions
to IRRIASSPA, which are part of the objectives of the USAF's
Integrated Computer Aided Manufacturing (ICAM) project to develop
the Integrated Information Support System (IISS). The
contributions can best be summarized as follows:
Independence - the IISS allows the separation of the description
and manipulation of logical data structures from the actual physical
data representations and isolates implementation changes from user
views and programs.
Relatability - the NDDL used in building the CDM allows the CDM
Administrator to define, modify, and maintain relationships among
data.
Resiliency/Recoverability - although not specifically addressed
by the CDM, the design of the CDM Processor provides the ability to
recover from failures without damage to the data resource.
Integrity - is provided through the use of data integrity
constraints, which the application may specify and the CDM
Processor enforces.
Accessibility - the NDDL allows the definition of data that
resides not only in different databases but also on different
computers.
Security - not expressly addressed by the CDM.
Shareability - is provided by support of multiple user views
(i.e., external schemas) of the data resource.
Performance - the NDML, by use of the CDM, allows data from
multiple resources to be addressed in a cost-effective manner in a
distributed environment.
Administration - is supported by providing a means of documenting
the meanings in the data resource and a vehicle by which consistency
can be maintained even as the scope of the CDM is extended. It also
allows the maintenance of information about data in different
databases.
2.2 Basic Components of the Design
The Common Data Model (CDM) subsystem is composed of three
components:
1. The CDM database, which is the database dictionary of the
   IISS
2. A logical model of the CDM database, called CDM1
3. The CDM Processor (CDMP), which is the distributed database
   manager of the IISS
This section briefly discusses each of these basic components
and shows how they interrelate.
2.2.1 The CDM Database
The CDM database is the database dictionary of the IISS. It
captures knowledge of the locations, characteristics, and
interrelationships of all shared data in the system. The most
significant feature of the CDM database is that it implements the
ANSI/X3/SPARC concepts of the three-schema approach to data
management. These three types of schemas are the conceptual
schema (CS), the internal schemas (IS), and the external schemas
(ES).
The conceptual schema describes a neutral, integrated view of the
shared data resource. There is one conceptual schema in an
enterprise. It is independent of physical database structures and
boundaries and is neutral to biases of individual applications.
Each external schema represents a user or application view of
data. Requests are made against external schemas. Each internal
schema represents an external schema to the local DBMS.
The CDM database is implemented as a relational database, which
presently resides on a VAX 11/780 computer. It is accessed by the
CDMP at compile-time to generate appropriate local DBMS calls
against internal schemas to process a user's NDML request against an
external schema.
The CDM database is represented logically using a semantic data
modeling technique called IDEF1. This method of data modeling is a
hybrid of the entity-relationship approach, the relational model,
and the Smith's 2D data abstraction approach. This logical model of
the CDM database is called CDM1.
2.2.2 CDM1
CDM1 is a model of metadata, i.e., data about data. It gives the
logical structure of the CDM database, which maintains the metadata.
These metadata describe the meanings and characteristics of user
data.
The conceptual schema portion of the CDM1 model is related to
portions that describe internal and external schemas. An internal
schema describes a local database structure in just enough detail
to give the CDMP adequate information to generate code that can be
processed by the pertinent local DBMS. Because one of the
requirements of the IISS is that it provide integration of data in
existing databases, the mappings between the conceptual schema
metadata and the internal schema metadata are not simple. IISS
does not have the luxury of supporting only certain clean database
structures. It is very likely that an attribute may be
represented by one or more data files, which may be in different
databases and even on different computers, or by relationships
between record types.
An external schema describes the portion of the conceptual schema
that is within the purview of a user or application. An external
schema is equivalent to a view in the relational model.
The conceptual-to-external schema mapping part of the CDM1 is
straightforward. The present implementation of the CDM subsystem
supports any external schema that can be formed by joining
conceptual schema entities and selecting attributes.
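The idea of forming an external schema by joining entities and selecting attributes can be sketched as follows. The entity and attribute names are hypothetical, and plain Python lists stand in for conceptual schema entities; the sketch does not reflect how the CDM subsystem stores or evaluates these mappings.

```python
def join(left_rows, right_rows, key):
    """Equi-join two lists of dict rows on a shared attribute name."""
    index = {}
    for row in right_rows:
        index.setdefault(row[key], []).append(row)
    joined = []
    for left in left_rows:
        for right in index.get(left[key], []):
            joined.append({**left, **right})
    return joined


def select_attributes(rows, attributes):
    """Keep only the named attributes, in the given order."""
    return [{name: row[name] for name in attributes} for row in rows]


# Two hypothetical conceptual schema entities.
employees = [
    {"emp_no": 1, "emp_name": "Lee", "dept_no": 10},
    {"emp_no": 2, "emp_name": "Kim", "dept_no": 20},
]
departments = [{"dept_no": 10, "dept_name": "Tooling"}]

# The external schema: a flat table of employee and department names.
view = select_attributes(
    join(employees, departments, "dept_no"), ["emp_name", "dept_name"]
)
```

The resulting table contains one row, pairing "Lee" with "Tooling"; the employee in department 20 has no matching department row and is therefore absent from this inner join.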
Thus, the CDM1 model is a semantic data model that describes the
logical structure of the CDM database. The CDM1 represents the
conceptual schema, the internal schemas and their mappings from the
conceptual schema, and the external schemas and their mappings from
the conceptual schema.
2.2.3 The CDM Processor
The CDMP is the distributed database manager of the IISS. It
builds on top of local DBMS services to provide data access. The
CDMP plays both a compile-time and a run-time role in the
processing of transactions. The compile-time component is called
the CDMP Precompiler. The run-time components are called the CDMP
Distributed Request Supervisor (DRS) and the CDMP Aggregator.
2.2.3.1 CDMP Precompiler
The CDMP Precompiler performs the following functions for each
data request:
1. Parses the request
2. Transforms the request from an external schema access to a
   conceptual schema access
3. Decomposes the request into subrequests, each of which
   accesses one internal schema
4. Determines an appropriate access path for each subrequest,
   generating code that can be processed by the pertinent
   local DBMS
5. Generates code to transform any data to be extracted from
   local databases from internal to conceptual schema format
   (this code is called a Request Processor or RP)
6. Generates code to transform any data results from conceptual
   to external schema format and to perform statistical
   calculations (this code is called a C/E Transformer or CEX)
7. Generates code to invoke appropriate RPs and CEXs at run-time,
   via calls to the NTM Subsystem
The CDMP Precompiler accesses the CDM database to find metadata
for the inter-schema transforms and integrity constraints for update
requests.
After successful precompilation of a user's program, which
contains embedded data requests in a SQL-like language called
the Neutral Data Manipulation Language (NDML), the CDMP
has produced the following code modules:
1. A modified user program, which will activate the appropriate
   processes (RPs and CEXs) at run-time.
2. One Request Processor (RP) per DBMS that manages data to be
   accessed by the user program.
3. One Conceptual-to-External Transformer (CEX), which will
   deliver query results to the modified user program at run-time.
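The decomposition of one request into per-database subrequests can be illustrated with a short sketch. It is a simplification under stated assumptions: the attribute, database, and field names are hypothetical, and the real Precompiler generates code for each local DBMS rather than returning a dictionary.

```python
from collections import defaultdict

# Hypothetical conceptual-to-internal mapping:
# attribute class -> (database, field)
LOCATION = {
    "Part Number": ("DB_A", "PNUM"),
    "Part Description": ("DB_A", "PDESC"),
    "Quantity On Hand": ("DB_B", "QOH"),
}


def decompose(requested_attributes):
    """Split one conceptual request into one subrequest per database."""
    subrequests = defaultdict(list)
    for attribute in requested_attributes:
        database, field = LOCATION[attribute]
        subrequests[database].append(field)
    return dict(subrequests)
```

A request for all three attributes yields two subrequests, one against DB_A and one against DB_B, which corresponds to one Request Processor for each database involved.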
2.2.3.2 Distributed Request Supervisor
There are presently two CDMP Distributed Request Supervisors
(DRSs), one residing on the IBM node and the other on the VAX, which
have responsibility for scheduling and coordinating the various
subrequests of user transactions. The DRS uses request graphs
produced by the CDMP Precompiler to determine which operations are
to be performed where. The DRS also uses knowledge of
communications costs and intermediate result volumes in its
algorithm for scheduling RPs.
Request Processors always deliver results as relations. The
relations are operated upon by the Aggregators.
2.2.3.3 Aggregators
An Aggregator is called to perform a single function; for
example, a union, a join, or an outer join on two sets of data,
each of which exists in a single sequential file. These data sets
are the results of an RP or another Aggregator.
An Aggregator always deals with data in conceptual schema
format.
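Two of the Aggregator functions named above, the union and the outer join, can be sketched on small in-memory relations. The sketch makes several assumptions: lists of dictionaries stand in for the sequential files, the outer join shown is a left outer join, and unmatched attributes are filled with None. None of this is taken from the Aggregator's actual code.

```python
def union(left, right):
    """Union of two relations (lists of dict rows), dropping duplicates."""
    seen = set()
    result = []
    for row in left + right:
        marker = tuple(sorted(row.items()))
        if marker not in seen:
            seen.add(marker)
            result.append(row)
    return result


def left_outer_join(left, right, key, right_attributes):
    """Join on key; left rows without a match get None for right attributes."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    result = []
    for row in left:
        matches = index.get(row[key])
        if matches:
            result.extend({**row, **match} for match in matches)
        else:
            padding = {name: None for name in right_attributes}
            result.append({**row, **padding})
    return result
```

Each function takes two inputs and produces one relation, so the output of one call can serve as an input to another, in the same way that one Aggregator can consume the results of another.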
SECTION 3
RESPONSIBILITIES OF THE CDM ADMINISTRATOR
The role that the CDM Administrator (CDMA) plays in the IISS
environment is not unlike that of the database administrator in
that the CDMA is responsible for the following:
1. Establishing Data Standards
2. Maintaining the CDM
3. Protecting the CDM
4. Facilitating Use of the CDM
Each of these areas is of major importance to the organization,
and a failure to properly administer any of these areas of
responsibility can cost the organization dearly.
3.1 Establishing Data Standards
One of the early roles of the CDMA is the establishment of data
standards. Part of this work has already been initiated during the
development of the CDM1. The work that remains is to determine what
types of standards to implement and to gain acceptance for the use
of these standards. It should be noted that, without acceptable
standards, it will be difficult, if not impossible, for the CDMA to
enforce any level of standardization.
3.2 Maintaining the CDM
The CDMA must maintain the CDM. This entails the building of the
initial conceptual schema (CS), internal schemas (IS), CS to IS
mappings, external schemas (ES), and ES to CS mappings, as well as
extending the model and modifying and deleting elements as needed.
It is to be expected that the need for extending and modifying the
CDM will grow over time, slowly at first, then growing rapidly as
the benefits of the concept are proved before leveling off after
several years.
3.3 Protecting the CDM
One of the most important responsibilities of the CDMA is the
protection of the CDM against loss, theft, and corruption, be it
intentional or not. At issue is the substantial investment that went
into the development of the CDM and the potential damage that can be
caused to the enterprise should the data fall into the wrong
hands.
3.4 Facilitating Use of the CDM
The CDMA must make the CDM available to all those who can
potentially gain from the use of the CDM and have legitimate
reason to do so. This may involve making the CDM available on
other computers in the network. It also involves communicating
with CDM users and potential users as to the contents and
performance of the CDM, as well as the usability of the data.
Part of this communication will involve solving problems and
answering questions and reporting the status of the CDM.
SECTION 4
MAINTAINING THE CONCEPTUAL SCHEMA
4.1 Methodology Overview
This section and its subsections (4.2 - 4.3) introduce the
methodology for building and updating a conceptual schema. The
portion of the CDM database that contains a conceptual schema is
described, and the basic approach to developing a conceptual
schema is presented. Detailed instructions for filling out the
modeling forms are included.
4.1.1 CS Structure
A conceptual schema is essentially a single IDEF1 model that
describes all of the common data in an enterprise. Consequently, its
components are those of any IDEF1 model:
   Entity Classes
   Relation Classes
   Attribute Classes
   Attribute Use Classes
   Inherited Attribute Use Classes
   Key Classes
   Key Class Members
Detailed explanations of these can be found in the IDEF1
documentation. (Extensions to the IDEF1 language, referenced in
Appendix C, simplify the IDEF1 terminology used here.)
In addition to the usual metadata (data about data) contained in
any IDEF1 model, the conceptual schema requires certain new elements
of metadata. Key class numbers are assigned to enable alternate key
classes for the same entity class to be distinguished from one
another. Tag numbers, tags (names), and tag labels are assigned to
enable attribute use classes within the same entity class to be
distinguished from one another. Data types and sizes are identified
for all attribute classes.
The conceptual schema must conform to several rules that cause
the data relationships and descriptions to be as explicit as
possible. (Note: In these rules the phrase "any number" includes the
possibility of zero.)
1. Single-Owner Rule: An entity class can own any number of
   attribute classes. Every attribute class is owned by exactly one
   entity class.
2. Every entity class contains one or more attribute use classes.
   Every attribute use class is contained in exactly one entity
   class.
3. Every attribute class appears as exactly one attribute use
   class in its owner entity class. An attribute class can also
   appear as any number of attribute use classes in any number of
   other entity classes. Every attribute use class corresponds to
   exactly one attribute class.
4. Every entity class has one or more key classes. Every key
   class is for exactly one entity class.
5. Every key class is composed of one or more key class members.
   Every key class member is in exactly one key class.
6. An attribute use class can be used as a member of any number
   of key classes for the entity class in which it is contained. An
   attribute use class cannot be used as more than one member of the
   same key class; i.e., every member of a key class must be a
   different attribute use class. An attribute use class in one
   entity class cannot be used as a member of a key class for any
   other entity class. Every key class member is exactly one
   attribute use class.
7. An entity class can be independent in any number of relation
   classes and dependent in any number. An entity class cannot be
   both independent and dependent in the same relation class. Every
   relation class has exactly two entity classes: one independent,
   one dependent.
8. A key class can migrate through any number of relation classes
   in which its entity class is independent. A key class cannot
   migrate through a relation class in which its entity class is
   dependent or one in which its entity class is not involved. Every
   relation class has exactly one key class from the independent
   entity class migrating through it into the dependent entity class.
9. Every relation class is a migration path for one or more
   inherited attribute use classes, one for each member of the key
   class that migrates through it. Every inherited attribute use
   class has exactly one relation class as its migration path.
10. Every member of the key class that migrates through a
    relation class creates exactly one inherited attribute use class
    in the dependent entity class for that relation class. Every
    inherited attribute use class is created from exactly one key
    class member.
11. Every attribute use class in an entity class represents
    either one attribute class that is owned by that entity class or
    one inherited attribute use class that migrated into that entity
    class. Every inherited attribute use class is represented by
    exactly one attribute use class.
12. Unique-Key Rule: No two entity instances in an entity class
    can have identical values in the same key class for that entity
    class. For a multi-member key class, instances can have identical
    values for some members, but not for all.
13. No-Null Rule: Every entity instance in an entity class has a
    value in each attribute use class in that entity class.
14. No-Repeat Rule: No entity instance in an entity class can
    have more than one value in any attribute use class in that
    entity class. This rule is equivalent to the first normal form in
    the relational database model.
15. Full-Functional-Dependency Rule: No entity instance in an
    entity class can have a value in an owned, nonkey attribute use
    class that can be identified by less than the entire key value
    for that entity instance. This rule applies only to entity
    classes with multi-member key classes and is equivalent to the
    second normal form in the relational database model.
16. No-Transitive-Dependency Rule: No entity instance in an
    entity class can have a value in an owned, nonkey attribute use
    class that can be identified by the value in another owned or
    inherited, nonkey attribute use class in that entity class. This
    rule is equivalent to the third normal form in the relational
    database model.
17. Smallest-Key-Class Rule: No entity class with a multi-member
    key class can be split into two or more entity classes, each with
    fewer members in its key class, without losing some information.
    This rule is a combination and extension of the fourth and fifth
    normal forms in the relational database model.
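Rules 12 through 14 constrain entity instances and lend themselves to a mechanical check. The sketch below is one hypothetical way to test sample instances against them; the function name, the data layout (one dictionary per instance), and the message wording are assumptions, and the CDM is not stated to perform this check in this form.

```python
def check_entity_instances(instances, attribute_uses, key_classes):
    """Check instances of one entity class against Rules 12, 13, and 14.

    instances      - list of dicts, one per entity instance
    attribute_uses - names of every attribute use class in the entity class
    key_classes    - {key class number: [member attribute use class names]}
    Returns a list of violation messages (empty if none are found).
    """
    violations = []
    for position, instance in enumerate(instances):
        for name in attribute_uses:
            value = instance.get(name)
            if value is None:
                # Rule 13: every attribute use class must have a value.
                violations.append(f"No-Null: instance {position} lacks {name}")
            elif isinstance(value, (list, tuple, set)):
                # Rule 14: at most one value per attribute use class.
                violations.append(
                    f"No-Repeat: instance {position} repeats {name}"
                )
    for number, members in key_classes.items():
        seen = {}
        for position, instance in enumerate(instances):
            # Rule 12: the combined value of all key members must be unique.
            key_value = tuple(repr(instance.get(name)) for name in members)
            if key_value in seen:
                violations.append(
                    f"Unique-Key: instances {seen[key_value]} and {position} "
                    f"share key class {number}"
                )
            else:
                seen[key_value] = position
    return violations
```

With a two-member key class, two instances that agree on only one member pass the Unique-Key check, as Rule 12 permits; only instances that agree on every member are reported.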
4.1.2 Basic Approach (Onion Concept)
The complete conceptual schema for an enterprise contains
thousands of entity classes and a corresponding number of relation
classes, attribute classes, etc. It is much too large to be built
all at once. Instead, it must be built in increments -- each one
building on the prior ones, until the conceptual schema is
complete. The increments are like the layers of an onion; as each
layer is added, the onion gets a little larger.
The process of "growing" the conceptual schema involves two
procedures, both of which are enhanced versions of the IDEF1
modeling procedure. The first is used to build the initial
increment only. The second is used to build each additional
increment. The only difference between the two is that the second
must be concerned about the integration of the new increment with
the existing conceptual schema. This involves being continually
aware of which components of the conceptual schema are within the
scope of the new increment and how any of those components will be
affected by the addition of the new increment. These two
procedures are in Sections 4.2 and 4.3, respectively.
4.1.3 Modeling Forms
Because the methodology for maintaining the conceptual schema is
based on the IDEF1 information modeling methodology, it uses most of
the IDEF1 forms:
   Source Material Log
   Source Data List
   Entity Class Pool
   Entity Class Definition
   Relation Class Matrix
   Attribute Class Pool
   Kit Cover Sheet
   Entity Class Diagram (optional)
   Relation Class Definition (optional)
   Attribute Class Diagram (optional)
   Entity Class/Attribute Class Matrix (optional)
   Attribute Class Migration Index (optional)
   Author Page Control Log (optional)
   Index Control Log (optional)
   Kit Control Log (optional)
   Text Control Log (optional)
   FEO Control Log (optional)
   Entity Class Set Control Log (optional)
   Entity Class/Function View Matrix (optional)
Please refer to the IDEF1 documentation for detailed descriptions
of these forms.
A few of the regular IDEF1 forms have certain shortcomings that
make them unsuitable for use in directly loading the conceptual
schema tables into the CDM database. The forms listed below were
designed to eliminate those shortcomings:
   Relation Classes
   Owned Attribute Classes
   Inherited Attribute Classes
The rest of this section contains a detailed description and two
samples (one blank, one filled in) of each of these forms.
NOTE: When using the NDDL (see Neutral Data Definition Language
Users Guide, Pub. No. UM 620341100) for maintaining the
conceptual schema in the CDM database, names should be substituted
for any/all numbers on the modeling forms. A discussion of the
NDDL can be found in Subsection 5.1.1.
Relation Classes Form
Purpose:
To provide a single source of information about relation classes
that are to be described in the conceptual schema.
Instructions:
Fill in one or more pages for each entity class that is
independent in a relation class. List only those relation classes
in which the entity class is independent; do not list any relation
classes in which it is dependent. Do not fill in a page for an
entity class that is dependent in all of its relation classes.
   Form Area                Explanation

1. Independent Entity       Name of the entity class that is
   Class Name               independent in the relation class.
                            This will be the same for all relation
                            classes entered on a page. It is
                            included only to make the entry
                            readable; it is not used in loading
                            the conceptual schema.
2. Relation Class Label     Label of the relation class. This is
                            part of the unique identification of a
                            relation class.
3. R.C. Card.               Symbol for the cardinality of the
                            relation class.
4. Dependent Entity         Name of the entity class that is
   Class Name               dependent in the relation class. It is
                            included only to make the entry
                            readable; it is not used in loading
                            the conceptual schema.
5. Dep. E.C. No.            Number of the entity class that is
                            dependent in the relation class.
6. Ind. K.C. No.            Number of the key class in the
                            independent entity class that migrates
                            through the relation class into the
                            dependent entity class.
7. Node                     Number of the entity class that is
                            independent in all of the relation
                            classes listed on the page.
All other form areas correspond to areas on the regular IDEF1
forms. Please refer to the IDEF1 documentation for details about
those areas.
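For readers who keep the form contents in machine-readable form before loading, one line of the Relation Classes form might be modeled as shown below. This is a sketch under stated assumptions: the class name, field names, and sample values are hypothetical, and the assumption that every area except the two readability-only names is needed for loading is inferred from the explanations above rather than stated outright.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelationClassEntry:
    """One line of the Relation Classes form, plus the page's Node area."""

    independent_entity_name: str   # area 1, readability only
    relation_class_label: str      # area 2
    cardinality: str               # area 3, cardinality symbol
    dependent_entity_name: str     # area 4, readability only
    dependent_entity_no: str       # area 5
    independent_key_class_no: str  # area 6
    node: str                      # area 7, independent entity class number

    def load_fields(self):
        """Return the areas other than the two readability-only names."""
        return {
            "node": self.node,
            "relation_class_label": self.relation_class_label,
            "cardinality": self.cardinality,
            "dependent_entity_no": self.dependent_entity_no,
            "independent_key_class_no": self.independent_key_class_no,
        }


# Hypothetical sample values for illustration.
entry = RelationClassEntry("Part", "Is Used In", "M", "Assembly", "E2", "K1", "E1")
```

Keeping the entity class names on the record but out of `load_fields` mirrors the form, where areas 1 and 4 exist only to make an entry readable.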
[Blank form. Column headings: Independent Entity Class Name;
Relation Class Label; R.C. Card.; Dependent Entity Class Name;
Dep. E.C. No.; Ind. K.C. No.]
Figure 4-1. Relation Classes Form
[Filled-in form. Columns as in Figure 4-1; the sample rows list
relation classes in which an "Op Exec Plan" entity class is
independent.]
Figure 4-2. Relation Classes Form Example
Owned Attribute Classes Form
Purpose:
To provide a single source of information about owned attribute
use classes that are to be described in the conceptual schema.
Instructions:
Fill in one or more pages for each entity class that owns an
attribute use class, either key or nonkey. List only those attribute
use classes that are owned by the entity class; do not list any
attribute use classes that are inherited by the entity class. Do not
fill in a page for an entity class that contains only inherited
attribute use classes.
   Form Area                Explanation

1. Tag No.                  Tag number for the attribute use
                            class.
2. A.C. Name & Label        Name, label, and any synonyms of the
                            attribute use class. The name is
                            listed first. The label is enclosed in
                            parentheses and placed on the line
                            below the name. If the name and label
                            are identical, the label can be
                            omitted. If the attribute use class
                            has any synonyms, the term "Synonyms:"
                            is placed below the name and label and
                            the synonyms are listed under it.
3. A.C. No.                 Attribute class number for the
                            attribute use class.
4. A.C. Definition          Definition of the attribute use class.
5. Type ID.                 Format description for the attribute
                            use class indicating data type
                            (numeric, character, etc.), length,
                            and decimal length (if applicable).
                            The data type must be one from the CDM
                            Data Type Table.
6. Mbr. of K.C. No.         Number(s) of the key class(es) to
                            which the attribute use class belongs,
                            if any.
7. Node                     Number of the entity class that owns
                            all of the attribute use classes
                            listed on the page.
All other form areas correspond to areas on the regular IDEF1
forms. Please refer to the IDEF1 documentation for details about
those areas.
Inherited Attribute Classes Form
Purpose:
To provide a single source of information about inherited
attribute use classes that are to be described in the
conceptual schema.
Instructions:
Fill in one or more pages for each entity class that inherits an
attribute use class. List only those attribute use classes that are
inherited by the entity class; do not list any attribute use classes
that are owned by the entity class. Do not fill in a page for an
entity class that contains only owned attribute use classes.
   Form Area                Explanation

1. Tag No.                  Tag number for the attribute use
                            class.
2. Tag & Label              Name, label, and any synonyms of the
                            attribute use class. The name is
                            listed first. The label is enclosed in
                            parentheses and placed on the line
                            below the name. If the name and label
                            are identical, the label can be
                            omitted. If the attribute use class
                            has any synonyms, the term "Synonyms:"
                            is placed below the name and label,
                            and the synonyms are listed under it.
3. A.C. No.                 Attribute class number for the
                            attribute use class.
4. Ind. E.C. No.            Number of the independent entity class
                            from which the attribute use class was
                            inherited.
5. Ind. K.C. No.            Number of the key class in the
                            independent entity class that migrated
                            through the relation class named in
                            the "Migration Path R.C. Label" area.
6. Ind. Tag No.             Tag number of the attribute use class
                            in the independent entity class that
                            migrated to become this attribute use
                            class.
7. Migration Path           Label of the relation class through
   R.C. Label               which the attribute use class was
                            inherited.
8. Mbr. of K.C. No.         Number(s) of the key class(es) to
                            which the attribute use class belongs,
                            if any.
9. Node                     Number of the entity class that
                            contains all of the attribute use
                            classes listed on the page.
All other form areas correspond to areas on the regular IDEF1
forms. Please refer to the IDEF1 documentation for details about
those areas.
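The areas of the Inherited Attribute Classes form follow directly from the key migration described in Rules 9 and 10: each member of the migrating key class produces one line. The sketch below illustrates that correspondence. The function name, the dictionary keys, the "T" tag-number format, and the sample values are all assumptions made for illustration.

```python
def migrate_key(relation_label, ind_entity_no, ind_key_class_no,
                key_members, next_tag_no):
    """Create one inherited attribute use class per migrating key member.

    key_members - list of (tag number, tag name, attribute class number)
                  for the members of the migrating key class
    next_tag_no - first unused tag number in the dependent entity class
    Returns rows shaped like lines on the Inherited Attribute Classes form.
    """
    rows = []
    for offset, (ind_tag_no, tag_name, ac_no) in enumerate(key_members):
        rows.append({
            "tag_no": f"T{next_tag_no + offset}",  # area 1 (assumed format)
            "tag": tag_name,                       # area 2
            "ac_no": ac_no,                        # area 3
            "ind_ec_no": ind_entity_no,            # area 4
            "ind_kc_no": ind_key_class_no,         # area 5
            "ind_tag_no": ind_tag_no,              # area 6
            "migration_path": relation_label,      # area 7
            "mbr_of_kc_no": [],                    # area 8, assigned later
        })
    return rows


# Hypothetical two-member key class migrating through one relation class.
rows = migrate_key(
    "Is Used In", "E1", "K1",
    [("T1", "Part Number", "A1"), ("T2", "Plant Code", "A2")],
    next_tag_no=10,
)
```

The number of rows always equals the number of key class members, which is the one-to-one correspondence that Rule 10 requires.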
[Blank form. Column headings: Tag No.; A.C. Name & Label; A.C. No.;
A.C. Definition; Type ID.; Mbr. of K.C. No.]
Figure 4-3. Owned Attribute Classes Form
[Filled-in form. Columns as in Figure 4-3; the sample rows describe
three owned attribute use classes: Operation Execution Plan Group
Identification, Status, and Total Operation Execution Plans.]
Figure 4-4. Owned Attribute Classes Form Example
[Blank form. Column headings: Tag No.; Tag & Label; A.C. No.;
Ind. E.C. No.; Ind. K.C. No.; Ind. Tag No.; Migration Path R.C.
Label; Mbr. of K.C. No.]
Figure 4-5. Inherited Attribute Classes Form
[Filled-in form. Columns as in Figure 4-5; the sample rows list
three inherited attribute use classes.]
Figure 4-6. Inherited Attribute Classes Form Example
4.2 Building the Initial CS
This section and its subsections (4.2.1 - 4.2.5) describe the
procedure for initiating an enterprise's conceptual schema. The
procedure is concerned with creating a detailed description (an
information model) of a portion of the enterprise's common data
and with collecting the data required to place that description in
the CDM database as the first piece of the conceptual schema (the
first layer of the onion). It is not concerned with deciding which
portion of the common data to describe nor with setting up the CDM
database and its utilities; these things must be done before
starting the procedure. The procedure consists of six phases, the
first five of which are patterned after those in IDEF1. The five
IDEF1 phases are as follows:
o Phase 0 - Starting the Project
o Phase 1 - Defining Entity Classes
o Phase 2 - Defining Relation Classes
o Phase 3 - Defining Key Classes
o Phase 4 - Defining Nonkey Attribute Classes
The procedure for the sixth phase, which consists of populating
the CDM database with the conceptual schema, is described in
Section 5. Each IDEF1 phase is described in a subsequent
subsection.
4.2.1 Phase 0: Starting the Project
Objectives:
o State the purpose, scope, and viewpoint for the information
model.
o Establish the project team.
o Develop a phase-level project schedule.
o Collect and catalog relevant source material.
This phase is patterned after Phase 0 of IDEF1, and the
description presented here is less detailed than the one in the
IDEF1 documentation. Please refer to that documentation for
further information.
Tasks:
1. The CDM Administrator appoints a project manager.
Usually, this will be the CDM Administrator.
2. The project manager states the purpose for building the
information model.
This explains why the model is needed, i.e., what it will be used
for. A model built with this procedure is primarily used to
initiate the enterprise's conceptual schema. (It is not necessary
to explain why the conceptual schema is needed.) If the model has
other purposes, they should be mentioned also.
3. The project manager states the scope of the information
model.
This sets the boundary of the model. It should be specific enough
to be useful in deciding whether or not a particular element of
common data should be included in the model. Some of the things
that can be used as the basis for scoping a model are the
following:
o Information subjects: parts, employees, sales orders, etc.
o Functions: engineering release, shop floor control, etc.
o Existing computer files or databases
o Existing computer application systems
4. The project manager states the viewpoint for the information
model.
This explains the mental attitude or role that people should
adopt when looking at and thinking about the model, i.e., in whose
place they should put themselves. Usually, this will be the job
title of someone who is intimately involved with the common data
being modeled.
5. The project manager appoints the project team members.
The four roles to be filled are as follows:
o Modeler - one or two IDEF1 experts.
o Source - several subject experts, i.e., people who have
in-depth knowledge about some or all of the common data being
modeled.
o Reviewer - several subject experts; some sources may also serve
as reviewers. The CDM Administrator must also serve as a reviewer
to ensure that the model, as it is developed, is properly
documented for loading into the CDM database tables.
o Librarian - a person who is trained and experienced in
coordinating kit reviews and in maintaining files of model
documentation; a modeler may also serve as the librarian.
6. The project manager appoints the acceptance review committee
members.
This committee should consist of subject experts from the
area being modeled and from other, related areas.
7. The project manager schedules the project phases.
Estimate the amount of effort needed to complete each phase
(usually in man-weeks or man-months) and then convert those
estimates to elapsed times and milestones based on the
availability of the project team members. At this point, only the
phases are scheduled; the individual tasks within a phase will be
scheduled when that phase is started.
8. The project manager schedules the remaining Phase 0 tasks.
Estimate the amount of effort needed to perform each remaining
task in this phase (usually in man-hours or man-days) and then
convert those estimates to elapsed times and milestones based on
the availability of the project team members who will perform
those tasks. The schedules for the subsequent phases should be
adjusted if they are inconsistent with these task schedules.
9. The modeler develops a data collection plan.
Determine what kinds of source material are needed and where and
how to get that material.
10. The project manager conducts a project kick-off meeting
attended by the project team members.
The objectives of the meeting are as follows:
o To introduce the team members to one another and to the roles
they will be performing.
o To determine which members need IDEF1 training.
o To present, discuss, and finalize the statements of purpose,
scope, and viewpoint.
o To present and discuss the project schedule.
o To present, discuss, and finalize the data
collection plan.
11. The modeler collects source material from the sources.
Gather the documents, policies, procedures, database designs,
etc., and interview the sources in accordance with the data
collection plan (Task 9).
12. The modeler catalogs the source material.
Prepare Source Material Log Forms and Source Data List Forms. If
a database design is among the source material, the record names
and data field names should be included in the source data list.
13. The modeler explains any author conventions.
These are deviations from or additions to the regular IDEF1
methodology. Mention the use of the three specially designed
modeling forms: Relation Classes Form, Owned Attribute Classes
Form, and Inherited Attribute Classes Form.
Deviation from IDEF1:
Usually, kits are not used to accomplish the review of the Phase
0 model documentation; the essentials are reviewed during the
kick-off meeting (Task 10). However, the project manager may
require that kits be used to supplement or replace the kick-off
meeting.
4.2.2 Phase 1: Defining Entity Classes
Objective:
o Identify and define the apparent entity classes that are within
the scope of the model.
This phase is patterned after Phase 1 of IDEF1, and the
description presented here is less detailed than the one in the
IDEF1 documentation. Please refer to that documentation for
further information.
Tasks:
1. The project manager decides what method to use to review the
Phase 1 model.
The options are to distribute review kits, to hold a walk-through
meeting, or to do both. The factors to consider are the
following:
o Some team members may have to travel to attend a walk-through.
How many trips can the project budget afford?
o A review can usually be accomplished faster with a walk-through
than with kits. Is there enough time to circulate kits, perhaps
two or three times?
o Some reviewers may have very limited time to spend on the
project. How can their time be used most effectively, by reviewing
a kit or by attending a walk-through? Will they devote time to
reviewing a kit on their own?
2. The project manager schedules the Phase 1 tasks.
Estimate the amount of effort needed to perform each task in this
phase (usually in man-hours or man-days) and then convert those
estimates to elapsed times and milestones based on the
availability of the project team members who will perform those
tasks. The schedules for the subsequent phases should be adjusted
if they are inconsistent with these task schedules.
3. The modeler builds an entity class pool.
Examine the entries in the source data list and deduce what sort
of thing each entry identifies, describes, refers to, etc. For
example:
o Employee number, name, birth date, and salary are data elements
about an employee; hence, an "Employee" entity class.
o Part number, description, and dimensions are all about a part;
hence, a "Part" entity class.
Each sort of thing is represented by an entity class. Talk to the
sources when additional information is needed. The entity
instances within an entity class should be distinguishable from
one another by some unique identifier. Assign an entity class
number to each entity class, and record it on an Entity Class Pool
Form.
When examining record names from a database design, be careful to
think about the "real-world thing" that each kind of record
represents. Realize that several kinds of records may represent
the same thing or, conversely, that one kind of record may
represent several different things. Also, realize that certain
kinds of records may be present for technical reasons only
(performance, backup/recovery, etc.). Such records do not
represent "real-world things" and should not result in entity
classes being added to the pool.
4. The modeler defines each entity class.
Fill out an Entity Class Definition Form for each entity class in
the pool. Talk to the sources when additional information about an
entity class is needed. Check off each pool entry as it is dealt
with.
Watch for synonyms (different names for the same thing) and
homonyms (same name for different things). When there are synonyms
for something, there is only one entity class to define. Use the
most commonly used name as the "official" entity class name, and
record it and the corresponding entity class number on an Entity
Class Definition Form. Record the other names as synonyms on the
form. In the pool, add a note to each synonym entry referring to
the official name or number.
For a homonym, there are two or more entity classes to define,
one for each thing that the term represents. Pick a new name for
each thing to clarify the differences. Record the new names in the
entity class pool along with a new entity class number for each,
and fill out Entity Class Definition Forms. For example, if an
order can be either something received by an enterprise from a
customer, or something sent by an enterprise to a vendor, call the
first a sales order and the second a purchase order, and fill out
two definition forms.
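The synonym and homonym handling described in Task 4 can be
sketched in Python. This is only an illustration: the record
layout, the usage counts, the entity class numbers, and the
function names are assumptions and are not part of IDEF1 or of the
CDM software.

```python
from dataclasses import dataclass, field


@dataclass
class EntityClass:
    number: str                      # e.g. "E1" (numbering here is hypothetical)
    name: str                        # the "official" entity class name
    synonyms: list = field(default_factory=list)


def merge_synonyms(names_by_usage):
    """Pick the most commonly used name as official; the rest become synonyms.

    names_by_usage maps each name for the same thing to how often the
    sources use it (the counts are an assumed input).
    """
    official = max(names_by_usage, key=names_by_usage.get)
    others = sorted(n for n in names_by_usage if n != official)
    return official, others


def split_homonym(new_names, next_number):
    """Create one entity class per distinct thing hidden behind a homonym."""
    return [EntityClass(f"E{next_number + i}", name)
            for i, name in enumerate(new_names)]


# Three names for the same thing yield a single entity class.
official, synonyms = merge_synonyms(
    {"Employee": 12, "Worker": 3, "Staff Member": 1})
employee = EntityClass("E1", official, synonyms)

# "Order" means two different things, so it becomes two entity classes.
orders = split_homonym(["Sales Order", "Purchase Order"], next_number=2)
```

Each resulting EntityClass corresponds to one Entity Class
Definition Form; the synonyms list corresponds to the synonym
entries recorded on that form.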
5. The modeler, reviewers, and librarian participate in reviewing
the Phase 1 model.
The method of review was selected in Task 1. The modelers prepare
the review materials (kits or walk-through handouts), the
reviewers read and comment on the materials, and the modelers
respond to the comments. If kits are used, the librarian
coordinates their circulation. The CDM Administrator reviews the
model to ensure that all model documents are prepared properly for
loading the CDM database tables.
4.2.3 Phase 2: Defining Relation Classes
Objective:
o Identify and define the apparent relation classes that are
within the scope of the model.
This phase is patterned after Phase 2 of IDEF1, and the
description presented here is less detailed than the one in the
IDEF1 documentation. Please refer to that documentation for
further information.
Tasks:
1. The project manager decides what method to use to review the
Phase 2 model.
See Phase 1, Task 1, for the options and factors to consider.
2. The project manager schedules the Phase 2 tasks.
See Phase 1, Task 2, for details.
3. The modeler builds a relation class matrix.
List all of the entity classes across the top and down the left
side of Relation Class Matrix Forms or on a large sheet of grid
paper; the matrix is easier to work with when it is all on one
sheet of paper. Then, determine which pairs of entity classes are
related to each other. Look for data about one thing that is also
data about another. For example:
o Customer and Sales Order
A sales order has some data about the customer that placed it,
such as customer number, name, address, etc.
o Part and Purchase Order
A purchase order contains some data about the parts being
ordered, such as part numbers, descriptions, dimensions, etc.
o Department and Employee
One element of data about an employee is the department to which
he/she is assigned, such as department number, name, etc.
o Manufacturing Order and Employee
A manufacturing order has some data about the employees who
performed its operations, such as employee numbers, names, etc.
Such sharing of data implies a relationship of some sort. Talk to
the sources when additional information about such sharing of data
is needed. If a database design is among the source material, the
relationships it depicts may be useful. Place an "X" in the matrix
at the intersection of each pair of related entity classes.
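The matrix-building step can be sketched in Python as follows. The
sketch assumes that the source data list has already been
annotated with the entity classes each data element says something
about; that annotation, the sample data, and the function names
are hypothetical. A shared data element only suggests a candidate
relation class, which the sources should still confirm.

```python
from itertools import combinations

# Hypothetical input: the entity classes each source data element is about.
element_subjects = {
    "customer number": {"Customer", "Sales Order"},
    "part number": {"Part", "Purchase Order"},
    "department number": {"Department", "Employee"},
    "employee number": {"Employee", "Manufacturing Order"},
    "birth date": {"Employee"},
}


def build_matrix(element_subjects):
    """Return the unordered entity class pairs that share a data element."""
    related = set()
    for subjects in element_subjects.values():
        # Sorting makes every pair appear in one canonical order.
        for a, b in combinations(sorted(subjects), 2):
            related.add((a, b))
    return related


def print_matrix(entity_classes, related):
    """Print an 'X' at the intersection of each related pair."""
    width = max(len(name) for name in entity_classes)
    for row in entity_classes:
        cells = ["X" if tuple(sorted((row, col))) in related else "."
                 for col in entity_classes]
        print(row.ljust(width), " ".join(cells))


entity_classes = sorted(set().union(*element_subjects.values()))
related = build_matrix(element_subjects)
print_matrix(entity_classes, related)
```

Columns are printed in the same order as the rows, so the matrix
is symmetric about its diagonal.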
4. The modeler prepares overview diagrams (FEOs).
These diagrams are intended to show all of the entity and
relation classes on just a few pages. Reviewers can usually
understand overview diagrams better than individual entity class
diagrams, so they will be the primary (or sole) depiction of the
model. Each diagram should focus on a particular subject with
which the reviewers will be comfortable (e.g., major activities),
and each should contain about 10 to 20 entity classes and their
relation classes. Use large sheets of paper (e.g., 11x17) and
photo-reduction, if necessary.
Every entity and relation class in the matrix must appear in at
least one diagram. Use some author convention to signify the
entity classes that appear in more than one diagram (e.g., by
broadening or double-lining the entity class boxes) and to
identify which other diagrams they are in (e.g., by listing the
diagram numbers near the entity class boxes). For example, if
entity class E27 is in diagrams F1, F3, and F4:
o List F3 and F4 near E27's box on F1.
o List F1 and F4 near E27's box on F3.
o List F1 and F3 near E27's box on F4.
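The cross-referencing convention in the E27 example can be
computed mechanically once the contents of each diagram are known.
The Python sketch below assumes a simple mapping from diagram
number to the entity classes drawn on it; the mapping, the extra
entity class numbers, and the function name are illustrative
only.

```python
def cross_references(diagram_contents):
    """For each diagram, list the other diagrams each shared entity class is in."""
    # First pass: record every diagram in which each entity class appears.
    appears_in = {}
    for diagram, classes in diagram_contents.items():
        for entity_class in classes:
            appears_in.setdefault(entity_class, set()).add(diagram)

    # Second pass: for each box, note the other diagrams that contain it.
    notes = {}
    for diagram, classes in diagram_contents.items():
        for entity_class in classes:
            others = sorted(appears_in[entity_class] - {diagram})
            if others:
                notes.setdefault(diagram, {})[entity_class] = others
    return notes


# Hypothetical diagram contents; E27 appears on F1, F3, and F4.
diagram_contents = {
    "F1": {"E27", "E3"},
    "F2": {"E5"},
    "F3": {"E27", "E8"},
    "F4": {"E27"},
}
notes = cross_references(diagram_contents)
```

Here notes["F1"] holds the list to write near E27's box on F1,
and so on for F3 and F4. Diagram F2 has no entry because none of
its entity classes appear elsewhere.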
Add the appropriate cardinality and a meaningful label to each
relation class as it is drawn in a diagram. Talk to the sources
when additional information about a relation class label and
cardinality is needed. Cardinalities may be either specific or
nonspecific; derived entity classes should not be introduced yet
to avoid getting ahead of the reviewers. Check off each relation
class in the matrix as it is drawn in a diagram (e.g., by circling
the X in the matrix).
5. The modeler defines any additional entity classes that are
introduced during this phase.
Whenever a new entity class is introduced, immediately document
it by performing the tasks in Phases 1 and 2 that are needed to:
o Update the entity class pool.
o Prepare an Entity Class Definition Form.
o Update the relation class matrix if it has been started.
o Update the overview diagrams if they have been started.
6. The modeler, reviewers, and librarian participate in reviewing
the Phase 2 model.
See Phase 1, Task 5, for details.
Deviation from IDEF1:
Usually, individual entity class diagrams are not prepared
because the overview diagrams are easier to understand and review,
and Relation Class Definition Forms are not filled out because the
relation class labels are supposed to be self-descriptive. Also,
the Related Entity Class Node Cross-Reference Form is replaced by
the specially designed Relation Classes Form, which is called for
in Phase 3. However, the project manager may require the use of
any or all of these to supplement the model documentation called
for above.
4.2.4 Phase 3: Defining Key Classes
Objectives:
o Refine all nonspecific relation classes in the model.
o Identify the apparent attribute classes that are within the
scope of the model.
o Identify and define a key class for each entity class in the
model.
o Validate every relation class in the model via key class
migration.
This phase is patterned after Phase 3 of IDEF1, and the
description presented here is less detailed than the one in the
IDEF1 documentation. Please refer to that documentation for
further information. Also, please refer to Subsection 5.2.2.1 for
details on how to fill out the Relation Classes, Owned Attribute
Classes, and Inherited Attribute Classes Forms.
Tasks:
1. The project manager decides what method to use to review the
Phase 3 model.
See Phase 1, Task 1, for the options and factors to consider.
2. The project manager schedules the Phase 3 tasks.
See Phase 1, Task 2, for details.
3. The modeler refines the nonspecific relation classes.
Introduce a derived entity class for each nonspecific relation
class and convert that relation class to a pair of specific
relation classes as shown in Figure 4-7 at the end of this
section. Assign entity class numbers to the derived entity
classes, record them in the entity class pool, and fill out Entity
Class Definition Forms. The sources may be able to recommend
appropriate names and definitions for some derived entity classes.
Remove the nonspecific relation classes from the relation class
matrix and the overview diagrams. Add the derived entity classes
and the specific relation classes to the matrix and the diagrams.
Retain the same focus for each diagram unless the reviewers
suggested a change.
Also, update any optional documents that are affected.
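The refinement in Task 3 can be sketched in Python. The sketch
assumes the usual resolution of a nonspecific (many-to-many)
relation class, in which the derived entity class is the dependent
end of both new specific relation classes; Figure 4-7 remains the
authoritative picture. The class layout, the sample names, and the
labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelationClass:
    independent: str   # for a nonspecific relation class the order is arbitrary
    dependent: str
    label: str
    specific: bool


def refine(nonspecific, derived_name, label_a, label_b):
    """Replace a nonspecific relation class with two specific ones.

    Both new relation classes point at the derived entity class, which
    is dependent on each of the two original entity classes.
    """
    if nonspecific.specific:
        raise ValueError("relation class is already specific")
    return (
        RelationClass(nonspecific.independent, derived_name, label_a, True),
        RelationClass(nonspecific.dependent, derived_name, label_b, True),
    )


# Hypothetical example: a part can be on many purchase orders and a
# purchase order can list many parts.
ordered = RelationClass("Part", "Purchase Order", "Is Ordered On", specific=False)
part_side, order_side = refine(
    ordered, "Purchase Order Line", "Is Requested By", "Contains")
```

After this step the nonspecific relation class is dropped from the
matrix and diagrams, and the derived entity class and the two
returned relation classes are added in its place.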
4. The modeler eliminates any unneeded triads or other dual-path
structures.
A dual-path structure is one composed of two or more related
entity classes in which:
o There are two paths connecting one entity class to another
o One path is a single relation class
o The other path is a series of relation classes (unless the
structure has only two entity classes, in which case the second
path is a single relation class also)
See the examples in Figure 4-8 at the end of this section. Talk
to the sources to determine whether the two paths are equal,
unequal, or indeterminate. The paths are equal if, for each
dependent entity instance, they both lead to the same independent
entity instance. The paths are unequal if, for each dependent
entity instance, they each lead to a different independent entity
instance. The paths are indeterminate if they are
equal for some dependent entity instances and unequal for others.
If the paths are equal, the single-relation-class path is
redundant and must be removed from the relation class matrix and
the overview diagrams (and from any optional documents in which it
appears).
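Where sample instance data is available, the equal, unequal, and
indeterminate cases can be checked with a short Python sketch. It
assumes that each path has already been traced for every dependent
entity instance and recorded as a mapping from that instance to
the independent entity instance reached; the mappings, the sample
identifiers, and the function name are hypothetical. Such a check
supplements, and does not replace, talking to the sources.

```python
def classify_paths(direct_path, indirect_path):
    """Classify a dual-path structure as equal, unequal, or indeterminate.

    Each argument maps a dependent entity instance to the independent
    entity instance reached by following that path.
    """
    if not direct_path or direct_path.keys() != indirect_path.keys():
        raise ValueError("both paths must cover the same non-empty set of instances")
    same = [direct_path[inst] == indirect_path[inst] for inst in direct_path]
    if all(same):
        return "equal"          # the single-relation-class path is redundant
    if not any(same):
        return "unequal"
    return "indeterminate"


# Hypothetical sample: the department reached from each employee directly
# and by way of the employee's work center.
direct = {"emp-1": "dept-A", "emp-2": "dept-B"}
via_work_center = {"emp-1": "dept-A", "emp-2": "dept-B"}
print(classify_paths(direct, via_work_center))  # prints "equal"
```

Only the "equal" result calls for removing the
single-relation-class path.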
5. The modeler fills out Relation Classes Forms.
Record each relation class on a Relation Classes Form. Leave the
Ind. K.C. No. column blank for now. As each relation class is
recorded on a form, check it off on a copy of each overview
diagram in which it appears (e.g., by circling the relation class
labels).
6. The modeler builds an attribute class pool.
Examine the entries in the source data list and deduce what sort
of characteristic each represents, where a characteristic is a
data element that identifies, describes, refers to, etc., a thing
being modeled. Each sort of characteristic is represented by an
attribute class. Talk to the sources when additional information
is needed. Assign an attribute class number to each attribute
class, and record it on an Attribute Class Pool Form.
When examining data field names from a database design, realize
that several data fields may represent the same kind of
"real-world characteristic" or, conversely, that one data field
may represent several different characteristics. For example:
o SALES-ORDER-CUSTOMER-NUMBER, INVOICE-CUSTOMER-NUMBER, and
ACCOUNTS-RECEIVABLE-CUSTOMER-NUMBER all represent the same
characteristic of a customer, i.e., customer number.
o SALESMAN-ASSIGNMENT-CODE may represent both the territory and
the product for which the salesman is responsible.
Also, realize that certain data fields may be present for
technical reasons only (e.g., record codes) and should not be
included in the attribute class pool.
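The mapping from data fields to attribute classes can be sketched
in Python. The field names come from the examples above; the
attribute class names "Sales Territory" and "Product", the
RECORD-CODE field, the numbering, and the function name are
assumptions made for illustration.

```python
# Each data field maps to zero, one, or several attribute classes.
field_to_attribute_classes = {
    "SALES-ORDER-CUSTOMER-NUMBER": ["Customer Number"],
    "INVOICE-CUSTOMER-NUMBER": ["Customer Number"],
    "ACCOUNTS-RECEIVABLE-CUSTOMER-NUMBER": ["Customer Number"],
    "SALESMAN-ASSIGNMENT-CODE": ["Sales Territory", "Product"],
    "RECORD-CODE": [],   # present for technical reasons only; not modeled
}


def build_attribute_class_pool(mapping):
    """Assign one attribute class number to each distinct characteristic."""
    pool = {}
    for names in mapping.values():
        for name in names:
            if name not in pool:
                pool[name] = f"A{len(pool) + 1}"   # hypothetical numbering
    return pool


pool = build_attribute_class_pool(field_to_attribute_classes)
```

Three fields collapse into the single "Customer Number" attribute
class, one field yields two attribute classes, and the technical
field yields none.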
7. The modeler defines the key classes of the totally independent
entity classes.
A totally independent entity class is one that is not dependent
in any relation class. Select any one and find the attribute
classes in the pool that make up its key class. Watch for
attribute class synonyms and homonyms, and handle them like those
for entity classes (Phase 1, Task 4). A few totally independent
entity classes have two or more alternate key classes (e.g.,
employees can be uniquely identified by either employee numbers or
Social
Security Numbers). Be sure to identify all key classes for such
an entity class. Also, be sure each key class conforms to the
following rules:
o Single-Owned Rule
o Unique-Key Rule
o No-Null Rule
o No-Repeat Rule
o Smallest-Key-Class Rule
See Section 4.1.1 for explanations of these rule