  • 1

    DATABASE MANAGEMENT SYSTEM

  • 2

    What is a database management system? Explain the three-level architecture with a
    diagram.

    A database management system (DBMS) is a software package designed to define,
    store, retrieve, manipulate and manage data in a database: it lets a computer add, delete
    and modify data as well as query it. Relational database management systems (RDBMS)
    implement the relational model of tables and relationships. A DBMS generally manages
    the data itself, the data format, field names, record structure and file structure. It also
    defines rules to validate and manipulate this data. A DBMS relieves users of writing their
    own programs for data maintenance. Fourth-generation query languages, such as SQL,
    are used along with the DBMS package to interact with a database.

    Three Level Architecture of DBMS

    An early proposal for a standard terminology and general architecture for database systems
    was produced in 1971 by the DBTG (Data Base Task Group) appointed by the Conference
    on Data Systems Languages (CODASYL). The DBTG recognized the need for a two-level
    approach with a system view called the schema and user views called subschemas. The
    American National Standards Institute (ANSI) Standards Planning and Requirements
    Committee (SPARC) produced a similar terminology and architecture in 1975. ANSI-SPARC
    recognized the need for a three-level approach with a system catalog.

    The design of a database management system depends heavily on its architecture, which can
    be centralized, decentralized or hierarchical. A DBMS architecture can be single-tier or
    multi-tier. An n-tier architecture divides the whole system into n related but independent
    modules, which can be independently modified, altered, changed or replaced.

    In a 1-tier architecture, the DBMS is the only entity: the user works directly on the DBMS,
    and any changes made there are applied to the DBMS itself. It does not provide handy
    tools for end users, so single-tier architecture is preferred mainly by database designers
    and programmers.

    In a 2-tier architecture, there must be an application through which the DBMS is
    accessed. Programmers use 2-tier architecture when they access the DBMS by means of an
    application. Here the application tier is entirely independent of the database in terms of
    operation, design and programming.

    3-tier architecture

    The most widely used architecture is the 3-tier architecture, which separates its tiers
    from each other on the basis of their users. It is described as follows:

  • 3

    [3-tier DBMS architecture]

    Database (Data) Tier: Only the database resides at this tier. The database, along with
    its query processing languages, sits in layer 3 of the 3-tier architecture. This tier
    also contains all relations and their constraints.

    Application (Middle) Tier: The application server and the programs that access the
    database reside at this tier. For a user, the application tier presents an abstracted view
    of the database; users are unaware of any existence of the database beyond the
    application. For the database tier, the application tier is its user; the database tier is
    not aware of any other user beyond the application tier. This tier therefore works as a
    mediator between the two.

    User (Presentation) Tier: The end user sits at this tier. From the user's perspective this
    tier is everything: he or she does not know about the existence or form of the database
    beyond this layer. At this layer the application can provide multiple views of the
    database. All views are generated by applications that reside in the application tier.

    A multi-tier database architecture is highly modifiable, as almost all of its components
    are independent and can be changed independently.

  • 4

    The DBMS architecture has the following three levels, or layers:

    1. External Level

    2. Conceptual Level

    3. Internal Level

    1. External Level: The external level is described by an external schema, which consists
    of the definitions of the logical records and relationships in the external view. It also
    contains the method of deriving the objects in the external view from the objects in the
    conceptual view.

    2. Conceptual Level: The conceptual level represents the entire database. The conceptual
    schema describes the records and relationships included in the conceptual view. It also
    contains the method of deriving the objects in the conceptual view from the objects in the
    internal view.

    3. Internal Level: The internal level indicates how the data will be stored and describes
    the data structures and access methods to be used by the database. It contains the
    definitions of the stored records, the methods of representing the data fields and the
    access aids used.

    A mapping between the external and conceptual views gives the correspondence among the
    records and relationships of the two views. The external view is an abstraction of the
    conceptual view, which in turn is an abstraction of the internal view. The external view
    describes the contents of the database as perceived by the user or application program of
    that view.
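
    The three levels can be loosely illustrated with SQLite through Python's sqlite3 module.
    This is only a sketch under assumptions: the employee table, the employee_directory view
    and the index name are hypothetical, and SQLite is used purely for illustration. The
    table stands in for the conceptual schema, the index for an internal-level storage
    decision, and the view for one external schema derived from the conceptual objects.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: the logical definition of the whole database.
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO employee (id, name, dept, salary) VALUES (?, ?, ?, ?)",
    [(1, "Asha", "Sales", 52000.0), (2, "Ravi", "HR", 48000.0)],
)

# Internal level: an access-path decision that leaves the conceptual schema unchanged.
conn.execute("CREATE INDEX idx_employee_dept ON employee (dept)")

# External level: one user's view, derived from the conceptual objects,
# that hides the salary column.
conn.execute("CREATE VIEW employee_directory AS SELECT name, dept FROM employee")

cursor = conn.execute("SELECT * FROM employee_directory ORDER BY name")
view_columns = [column[0] for column in cursor.description]
directory_rows = cursor.fetchall()

print(view_columns)
print(directory_rows)
```

    Dropping or adding the index changes nothing that a user of employee_directory can
    observe, which is the kind of independence between levels that the mappings provide.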

  • 5

    Explain all DDL & DML commands with syntax and output.

    DML vs. DDL

    Data Manipulation Language (also known as DML) is a family of computer languages
    used by computer programs and/or database users to manipulate data in a database, that
    is, to insert, delete and update this data.

    Data Definition Language (also known as DDL) is a computer language used to define
    data structures, as its name suggests. It first made its appearance in the CODASYL
    database model (named after the information technology industry consortium known as the
    Conference on Data Systems Languages). DDL was used within the schema of the database
    to describe the records, fields, and sets that made up the user data model. The term was
    later used for the subset of SQL that declares tables, columns, data types and
    constraints. Now, however, it is used generically to refer to any formal language used to
    describe data or information structures (for example, XML schemas).

    The most popular form of DML is that of the Structured Query Language (SQL), a language
    designed specifically for managing data in relational database management systems
    (RDBMS). Other forms of DML are used by, for instance, IMS/DL/I and CODASYL
    databases (IDMS, for example). DML comprises the SQL data change statements, meaning that
    stored data is modified but the schema and database objects remain the same. The
    functional capability of the DML is organised by the initial word in a statement, which is
    almost always a verb naming the action to perform. There are four such verbs:
    SELECT ... INTO, INSERT, UPDATE, and DELETE.

    The DDL is used mainly to create, that is, to make a new database, table, index or stored
    query. A CREATE statement in SQL creates an object inside an RDBMS, so the types of
    objects that can be created depend on which RDBMS is in use. Most RDBMS support
    creating tables, indexes, users, synonyms and databases. Some systems allow CREATE and
    other DDL commands inside a transaction, which means that these commands can be
    rolled back. The most common CREATE command is CREATE TABLE.

    DMLs vary considerably: their functions and capabilities differ between database
    vendors. There are, however, only two types of DML: procedural and declarative. While
    there are multiple standards established for SQL, most vendors provide their own
    extensions to the standard without implementing it entirely.

  • 6

    Summary:

    1. DML is a grouping of computer languages used by computer programs to manipulate

    data in a database; DDL is a computer language used specifically to define data structures.

    2. The most popular form of DML is SQL, which comprises various data change statements;

    DDL mainly uses the CREATE command.

    DDL

    Data Definition Language (DDL) statements are used to define the database structure or

    schema. Some examples:

    o CREATE - creates objects in the database
    o ALTER - alters the structure of the database
    o DROP - deletes objects from the database
    o TRUNCATE - removes all records from a table, including all space allocated for
      the records
    o COMMENT - adds comments to the data dictionary
    o RENAME - renames an object
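
    Some of the DDL statements above can be tried out with SQLite through Python's sqlite3
    module. This is a minimal sketch, not a vendor-neutral reference: the student and pupil
    table names and the two helper functions are hypothetical, and SQLite has no TRUNCATE or
    COMMENT statement, so only CREATE, ALTER (including a rename) and DROP are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def table_names(connection):
    """List user tables recorded in SQLite's catalog table."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]


def column_names(connection, table):
    """Column names of a table, read from the cursor metadata."""
    cursor = connection.execute(f"SELECT * FROM {table}")
    return [column[0] for column in cursor.description]


# CREATE: make a new object.
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")
after_create = table_names(conn)

# ALTER: change the structure of an existing object.
conn.execute("ALTER TABLE student ADD COLUMN city TEXT")
after_alter = column_names(conn, "student")

# RENAME is spelled ALTER TABLE ... RENAME TO in SQLite.
conn.execute("ALTER TABLE student RENAME TO pupil")
after_rename = table_names(conn)

# DROP: delete the object together with its data.
conn.execute("DROP TABLE pupil")
after_drop = table_names(conn)

print(after_create, after_alter, after_rename, after_drop)
```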

    DML

    Data Manipulation Language (DML) statements are used for managing data within

    schema objects. Some examples:

    o SELECT - retrieves data from a database
    o INSERT - inserts data into a table
    o UPDATE - updates existing data within a table
    o DELETE - deletes records from a table; the space for the records remains
    o MERGE - UPSERT operation (insert or update)
    o CALL - calls a PL/SQL or Java subprogram
    o EXPLAIN PLAN - explains the access path to data
    o LOCK TABLE - controls concurrency
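
    The four core DML statements can be sketched with SQLite through Python's sqlite3
    module. The student table and its rows are hypothetical, chosen only for illustration;
    MERGE, CALL, EXPLAIN PLAN and LOCK TABLE are vendor-specific and are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, city TEXT)")

# INSERT: add rows to the table.
conn.executemany(
    "INSERT INTO student (roll_no, name, city) VALUES (?, ?, ?)",
    [(1, "Asha", "Pune"), (2, "Ravi", "Delhi"), (3, "Meena", "Pune")],
)

# UPDATE: change existing rows that match a condition.
conn.execute("UPDATE student SET city = ? WHERE roll_no = ?", ("Mumbai", 2))

# DELETE: remove rows that match a condition.
conn.execute("DELETE FROM student WHERE roll_no = ?", (3,))

# SELECT: read the data back.
remaining = conn.execute(
    "SELECT roll_no, name, city FROM student ORDER BY roll_no"
).fetchall()
print(remaining)
```

    None of these statements changes the table definition, which is the distinction the
    text draws between DML and DDL.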

    DCL

    Data Control Language (DCL) statements control access to the database. Some examples:

    o GRANT - gives users access privileges to the database
    o REVOKE - withdraws access privileges given with the GRANT command

  • 7

    TCL

    Transaction Control Language (TCL) statements are used to manage the changes made by
    DML statements. They allow statements to be grouped together into logical transactions.

    o COMMIT - saves work done
    o SAVEPOINT - identifies a point in a transaction to which you can later roll back
    o ROLLBACK - restores the database to its state as of the last COMMIT
    o SET TRANSACTION - changes transaction options such as the isolation level and which
      rollback segment to use
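
    COMMIT, SAVEPOINT and ROLLBACK can be sketched with SQLite through Python's sqlite3
    module. The account table and the savepoint name are hypothetical. The connection is
    opened with isolation_level=None so that the module does not open transactions
    implicitly and the statements below are the only transaction control in play. SQLite has
    no SET TRANSACTION statement, so that command is not shown.

```python
import sqlite3

# isolation_level=None: no implicit BEGIN, transactions are managed explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")

conn.execute("BEGIN")
conn.execute("INSERT INTO account (id, balance) VALUES (1, 100)")
conn.execute("SAVEPOINT before_bonus")
conn.execute("UPDATE account SET balance = balance + 50 WHERE id = 1")
conn.execute("ROLLBACK TO SAVEPOINT before_bonus")  # undo only the bonus
conn.execute("COMMIT")  # the insert becomes permanent

conn.execute("BEGIN")
conn.execute("DELETE FROM account")
conn.execute("ROLLBACK")  # back to the state as of the last COMMIT

balances = conn.execute("SELECT id, balance FROM account").fetchall()
print(balances)
```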

  • 8

    Explain the various types of data models with the help of diagrams.

    In software engineering, the term data model is used in two related senses. In the first
    sense, it is a description of the objects represented by a computer system
    together with their properties and relationships; these are typically "real world" objects
    such as products, suppliers, customers, and orders. In the second sense, often called a
    database model, it means a collection of concepts and rules used in defining data
    models: for example, the relational model uses relations and tuples, while the network
    model uses records, sets, and fields.

    Overview of the data modeling context: a data model is based on data, data relationships,
    data semantics and data constraints. A data model provides the details of information to be

    stored, and is of primary use when the final product is the generation of computer software

    code for an application or the preparation of a functional specification to aid a computer

    software make-or-buy decision. The figure is an example of the interaction between

    process and data models.

    Data models are often used as an aid to communication between the business people

    defining the requirements for a computer system and the technical people defining the

    design in response to those requirements. They are used to show the data needed and

    created by business processes.

    According to Hoberman (2009), "A data model is a wayfinding tool for both business and

    IT professionals, which uses a set of symbols and text to precisely explain a subset of real

    information to improve communication within the organization and thereby lead to a more

    flexible and stable application environment."

  • 9

    A data model explicitly determines the structure of data. Data models are specified in a

    data modeling notation, which is often graphical in form.

    A data model can be sometimes referred to as a data structure, especially in the context of

    programming languages. Data models are often complemented by function models,

    especially in the context of enterprise models.

    Relationships and functions

    A given database management system may provide one or more of the five models. The

    optimal structure depends on the natural organization of the application's data, and on the

    application's requirements, which include transaction rate (speed), reliability,

    maintainability, scalability, and cost. Most database management systems are built around

    one particular data model, although it is possible for products to offer support for more

    than one model.

    Various physical data models can implement any given logical model. Most database

    software will offer the user some level of control in tuning the physical implementation,

    since the choices that are made have a significant effect on performance.

    A model is not just a way of structuring data: it also defines a set of operations that can be

    performed on the data. The relational model, for example, defines operations such as

    select, project and join. Although these operations may not be explicit in a particular

    query language, they provide the foundation on which a query language is built.
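
    As a rough illustration of such operations, the sketch below implements select (filter
    rows), project (keep some columns) and join (combine rows on a shared column) over
    relations held as plain Python lists of dictionaries. The function names and the sample
    employee and department data are assumptions made for this example and are not taken
    from any particular DBMS.

```python
def select(relation, predicate):
    """Keep the rows for which the predicate holds."""
    return [row for row in relation if predicate(row)]


def project(relation, attributes):
    """Keep only the named attributes and drop duplicate rows."""
    result = []
    for row in relation:
        reduced = {name: row[name] for name in attributes}
        if reduced not in result:
            result.append(reduced)
    return result


def join(left, right, attribute):
    """Combine every pair of rows that agree on the shared attribute."""
    return [
        {**l_row, **r_row}
        for l_row in left
        for r_row in right
        if l_row[attribute] == r_row[attribute]
    ]


employees = [
    {"name": "Asha", "dept_id": 10},
    {"name": "Ravi", "dept_id": 20},
    {"name": "Meena", "dept_id": 10},
]
departments = [
    {"dept_id": 10, "dept": "Sales"},
    {"dept_id": 20, "dept": "HR"},
]

in_dept_10 = select(employees, lambda row: row["dept_id"] == 10)
dept_ids = project(employees, ["dept_id"])
staff = join(employees, departments, "dept_id")
print(in_dept_10, dept_ids, staff, sep="\n")
```

    In SQL these correspond roughly to the WHERE clause, the column list of SELECT, and
    JOIN ... ON.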

    Flat model

    Flat File Model.

    The flat (or table) model consists of a single, two-dimensional array of data elements,
    where all members of a given column are assumed to be similar values, and all members
    of a row are assumed to be related to one another. For instance, a system security
    database might have columns for name and password; each row would hold the specific
    password associated with an individual user. Columns of the table often have a type
    associated with them, defining them as character data, date or time information,
    integers, or floating point numbers. This tabular format is a precursor to the relational
    model.

  • 10

    Early data models

    These models were popular in the 1960s and 1970s, but nowadays can be found primarily in

    old legacy systems. They are characterized primarily by being navigational with strong

    connections between their logical and physical representations, and deficiencies in data

    independence.

    Hierarchical model

    In a hierarchical model, data is organized into a tree-like structure, implying a single

    parent for each record. A sort field keeps sibling records in a particular order. Hierarchical

    structures were widely used in the early mainframe database management systems, such as

    the Information Management System (IMS) by IBM, and now describe the structure of

    XML documents. This structure allows a one-to-many relationship between two types
    of data. It is very efficient for describing many relationships in the real world:
    recipes, tables of contents, the ordering of paragraphs or verses, and any nested and sorted
    information.

    This hierarchy is used as the physical order of records in storage. Record access is done by

    navigating through the data structure using pointers combined with sequential accessing.

    Because of this, the hierarchical structure is inefficient for certain database operations

    when a full path (as opposed to upward link and sort field) is not also included for each

    record. Such limitations have been compensated for in later IMS versions by additional

    logical hierarchies imposed on the base physical hierarchy.
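
    The single-parent tree, the sort field that orders siblings, and navigational access can
    be sketched in a few lines of Python. The Record class, the find function and the
    cookbook data are hypothetical and only mimic the idea; they do not reproduce how IMS
    stores or addresses records.

```python
class Record:
    """One record in a hierarchy: exactly one parent, ordered children."""

    def __init__(self, key, parent=None):
        self.key = key
        self.parent = parent
        self.children = []

    def add_child(self, key):
        child = Record(key, parent=self)
        self.children.append(child)
        # The key doubles as the sort field that keeps siblings in order.
        self.children.sort(key=lambda record: record.key)
        return child

    def path(self):
        """Follow the upward links to the root and return the full path."""
        keys = []
        node = self
        while node is not None:
            keys.append(node.key)
            node = node.parent
        return keys[::-1]


def find(root, key):
    """Navigate from the root, visiting children in sibling order."""
    if root.key == key:
        return root
    for child in root.children:
        match = find(child, key)
        if match is not None:
            return match
    return None


cookbook = Record("Cookbook")
soups = cookbook.add_child("Soups")
breads = cookbook.add_child("Breads")
soups.add_child("Tomato soup")
breads.add_child("Focaccia")

sibling_order = [child.key for child in cookbook.children]
tomato_path = find(cookbook, "Tomato soup").path()
print(sibling_order, tomato_path)
```

    Because find has to walk down from the root, a lookup that does not start from a known
    path can touch many records, which is the inefficiency described above.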

  • 11

    Network model

    The network model expands upon the hierarchical structure, allowing many-to-many

    relationships in a tree-like structure that allows multiple parents. It was the most popular

    model before being replaced by the relational model, and is defined by the CODASYL

    specification.

    The network model organizes data using two fundamental concepts, called records and

    sets. Records contain fields (which may be organized hierarchically, as in the

    programming language COBOL). Sets (not to be confused with mathematical sets) define

    one-to-many relationships between records: one owner, many members. A record may be

    an owner in any number of sets, and a member in any number of sets.

    A set consists of circular linked lists where one record type, the set owner or parent,

    appears once in each circle, and a second record type, the subordinate or child, may appear

    multiple times in each circle. In this way a hierarchy may be established between any two

    record types, e.g., type A is the owner of B. At the same time another set may be defined

    where B is the owner of A. Thus all the sets comprise a general directed graph (ownership

    defines a direction), or network construct. Access to records is either sequential (usually in

    each record type) or by navigation in the circular linked lists.

    The network model is able to represent redundancy in data more efficiently than in the

    hierarchical model, and there can be more than one path from an ancestor node to a

    descendant. The operations of the network model are navigational in style: a program
    maintains a current position, and navigates from one record to another by following the
    relationships in which the record participates. Records can also be located by supplying
    key values.

  • 12
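
    The owner/member sets described above can be mimicked in Python. This sketch is an
    assumption-laden illustration, not CODASYL syntax: the Record and SetType classes, the
    set names and the student/course data are all hypothetical. A many-to-many relationship
    is built from two one-to-many sets that share a link record, and queries are answered by
    navigating from owner to members and from member back to owner.

```python
class Record:
    def __init__(self, record_type, key):
        self.record_type = record_type
        self.key = key


class SetType:
    """A set in the network-model sense: one owner, many members."""

    def __init__(self, name):
        self.name = name
        self._members = {}  # owner key -> member records, in insertion order
        self._owner = {}    # member key -> its single owner in this set

    def connect(self, owner, member):
        self._members.setdefault(owner.key, []).append(member)
        self._owner[member.key] = owner

    def members(self, owner):
        return list(self._members.get(owner.key, []))

    def owner(self, member):
        return self._owner[member.key]


student_enrolment = SetType("student-enrolment")
course_enrolment = SetType("course-enrolment")


def enrol(student, course):
    # The link record is a member of two sets, so it has two owners overall.
    link = Record("enrolment", (student.key, course.key))
    student_enrolment.connect(student, link)
    course_enrolment.connect(course, link)


def courses_of(student):
    links = student_enrolment.members(student)
    return [course_enrolment.owner(link).key for link in links]


def students_in(course):
    links = course_enrolment.members(course)
    return [student_enrolment.owner(link).key for link in links]


asha, ravi = Record("student", "Asha"), Record("student", "Ravi")
dbms, networks = Record("course", "DBMS"), Record("course", "Networks")
enrol(asha, dbms)
enrol(asha, networks)
enrol(ravi, dbms)

asha_courses = courses_of(asha)
dbms_students = students_in(dbms)
print(asha_courses, dbms_students)
```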

    Although it is not an essential feature of the model, network databases generally

    implement the set relationships by means of pointers that directly address the location of a

    record on disk. This gives excellent retrieval performance, at the expense of operations

    such as database loading and reorganization.

    Popular DBMS products that utilized it were Cincom Systems' Total and Cullinet's IDMS.

    IDMS gained a considerable customer base; in the 1980s, it adopted the relational model

    and SQL in addition to its original tools and languages.

    Most object databases (invented in the 1990s) use the navigational concept to provide fast

    navigation across networks of objects, generally using object identifiers as "smart"

    pointers to related objects. Objectivity/DB, for instance, implements named one-to-one,

    one-to-many, many-to-one, and many-to-many relationships that can cross

    databases. Many object databases also support SQL, combining the strengths of both

    models.

    Inverted file model

    In an inverted file or inverted index, the contents of the data are used as keys in a lookup

    table, and the values in the table are pointers to the location of each instance of a given

    content item. This is also the logical structure of contemporary database indexes, which
    might use only the contents of particular columns in the lookup table. The inverted
    file data model can put indexes in a second set of files next to existing flat database files,
    in order to access the needed records in these files directly and efficiently.
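
    A minimal sketch of the idea in Python, assuming a small hypothetical list of records
    that stands in for a flat file: the index maps each value of one column to the positions
    of the records that contain it, so a lookup goes straight to the matching records
    instead of scanning the whole file.

```python
from collections import defaultdict

# Stand-in for a flat file; list positions play the role of record addresses.
records = [
    {"name": "Asha", "city": "Pune"},
    {"name": "Ravi", "city": "Delhi"},
    {"name": "Meena", "city": "Pune"},
]


def build_inverted_index(rows, column):
    """Map each value found in the column to the positions where it occurs."""
    index = defaultdict(list)
    for position, row in enumerate(rows):
        index[row[column]].append(position)
    return dict(index)


city_index = build_inverted_index(records, "city")

# Look up by content, then follow the pointers to the records themselves.
pune_names = [records[position]["name"] for position in city_index.get("Pune", [])]
print(city_index, pune_names)
```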

    Notable for using this data model is the ADABAS DBMS of Software AG, introduced in
    1970. ADABAS has gained a considerable customer base and is still supported
    today. In the 1980s it adopted the relational model and SQL in addition to its original
    tools and languages.

  • 13

    Relational model

    The relational model was introduced by E.F. Codd in 1970 as a way to make database

    management systems more independent of any particular application. It is a mathematical

    model defined in terms of predicate logic and set theory, and systems implementing it

    have been used by mainframe, midrange and microcomputer systems.

    The products that are generally referred to as relational databases in fact implement a

    model that is only an approximation to the mathematical model defined by Codd. Three

    key terms are used extensively in relational database models: relations, attributes, and

    domains. A relation is a table with columns and rows. The named columns of the relation

    are called attributes, and the domain is the set of values the attributes are allowed to take.

    The basic data structure of the relational model is the table, where information about a

    particular entity (say, an employee) is represented in rows (also called tuples) and

    columns. Thus, the "relation" in "relational database" refers to the various tables in the

    database; a relation is a set of tuples. The columns enumerate the various attributes of the

    entity (the employee's name, address or phone number, for example), and a row is an

    actual instance of the entity (a specific employee) that is represented by the relation. As a

    result, each tuple of the employee table represents various attributes of a single employee.

    All relations (and, thus, tables) in a relational database have to adhere to some basic rules

    to qualify as relations. First, the ordering of columns is immaterial in a table. Second, there

    can't be identical tuples or rows in a table. And third, each tuple will contain a single value

    for each of its attributes.

    A relational database contains multiple tables, each similar to the one in the "flat" database

    model. One of the strengths of the relational model is that, in principle, any value

    occurring in two different records (belonging to the same table or to different tables),

    implies a relationship among those two records. Yet, in order to enforce explicit integrity

    constraints, relationships between records in tables can also be defined explicitly, by

    identifying or non-identifying parent-child relationships characterized by assigning

    cardinality (1:1, (0)1:M, M:M). Tables can also have a designated single attribute or a set
    of attributes that can act as a "key", which can be used to uniquely identify each tuple in
    the table.

  • 14

    A key that can be used to uniquely identify a row in a table is called a primary key. Keys

    are commonly used to join or combine data from two or more tables. For example, an

    Employee table may contain a column named Location which contains a value that

    matches the key of a Location table. Keys are also critical in the creation of indexes,

    which facilitate fast retrieval of data from large tables. Any column can be a key, or

    multiple columns can be grouped together into a compound key. It is not necessary to

    define all the keys in advance; a column can be used as a key even if it was not originally

    intended to be one.
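
    The Employee and Location example can be sketched with SQLite through Python's sqlite3
    module. Column names such as employee_id and location_id and the sample rows are
    assumptions added for illustration. The sketch shows a join on the key and the rejection
    of a second row with an existing primary key value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE location (location_id INTEGER PRIMARY KEY, city TEXT)")
# REFERENCES declares the link to the other table's key; SQLite enforces it
# only when PRAGMA foreign_keys = ON is set, which this sketch does not need.
conn.execute(
    "CREATE TABLE employee ("
    " employee_id INTEGER PRIMARY KEY,"
    " name TEXT,"
    " location INTEGER REFERENCES location (location_id))"
)
conn.executemany("INSERT INTO location VALUES (?, ?)", [(1, "Pune"), (2, "Delhi")])
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(100, "Asha", 1), (101, "Ravi", 2), (102, "Meena", 1)],
)

# Join: match the employee's location value against the location table's key.
joined = conn.execute(
    "SELECT e.name, l.city FROM employee AS e"
    " JOIN location AS l ON e.location = l.location_id"
    " ORDER BY e.employee_id"
).fetchall()

# A primary key must identify one row only, so a duplicate value is rejected.
try:
    conn.execute("INSERT INTO employee VALUES (100, 'Duplicate', 1)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(joined, duplicate_rejected)
```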

    A key that has an external, real-world meaning (such as a person's name, a book's ISBN,

    or a car's serial number) is sometimes called a "natural" key. If no natural key is suitable

    (think of the many people named Brown), an arbitrary or surrogate key can be assigned

    (such as by giving employees ID numbers). In practice, most databases have both

    generated and natural keys, because generated keys can be used internally to create links

    between rows that cannot break, while natural keys can be used, less reliably, for searches

    and for integration with other databases. (For example, records in two independently

    developed databases could be matched up by social security number, except when the

    social security numbers are incorrect, missing, or have changed.)

    The most common query language used with the relational model is the Structured Query

    Language (SQL).

    Dimensional model

    The dimensional model is a specialized adaptation of the relational model used to

    represent data in data warehouses in a way that data can be easily summarized using

    online analytical processing, or OLAP queries. In the dimensional model, a database

    schema consists of a single large table of facts that are described using dimensions and

    measures. A dimension provides the context of a fact (such as who participated, when and

    where it happened, and its type) and is used in queries to group related facts together.

    Dimensions tend to be discrete and are often hierarchical; for example, the location might

    include the building, state, and country. A measure is a quantity describing the fact, such

    as revenue. It is important that measures can be meaningfully aggregated; for example,

    the revenue from different locations can be added together.

  • 15

    In an OLAP query, dimensions are chosen and the facts are grouped and aggregated

    together to create a summary.

    The dimensional model is often implemented on top of the relational model using a star

    schema, consisting of one highly normalized table containing the facts, and surrounding

    denormalized tables containing each dimension. An alternative physical implementation,

    called a snowflake schema, normalizes multi-level hierarchies within a dimension into

    multiple tables.
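
    A star schema and an OLAP-style summary can be sketched with SQLite through Python's
    sqlite3 module. The table names (fact_sales, dim_location, dim_date), the columns and
    the figures are invented for this example. The query picks two dimensions, country and
    month, and aggregates the revenue measure over the facts grouped by them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dim_location (
        location_id INTEGER PRIMARY KEY, building TEXT, state TEXT, country TEXT);
    CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
    -- Fact table: keys pointing at the dimensions plus the numeric measure.
    CREATE TABLE fact_sales (location_id INTEGER, date_id INTEGER, revenue INTEGER);

    INSERT INTO dim_location VALUES
        (1, 'HQ', 'Maharashtra', 'India'),
        (2, 'Annex', 'Maharashtra', 'India'),
        (3, 'Depot', 'Texas', 'USA');
    INSERT INTO dim_date VALUES (1, 2023, 1), (2, 2023, 2);
    INSERT INTO fact_sales VALUES
        (1, 1, 100), (2, 1, 50), (3, 1, 70), (1, 2, 30), (3, 2, 20);
    """
)

# Group the facts by the chosen dimensions and add up the measure.
by_country_month = conn.execute(
    "SELECT l.country, d.month, SUM(f.revenue)"
    " FROM fact_sales AS f"
    " JOIN dim_location AS l ON f.location_id = l.location_id"
    " JOIN dim_date AS d ON f.date_id = d.date_id"
    " GROUP BY l.country, d.month"
    " ORDER BY l.country, d.month"
).fetchall()

# Rolling up to a coarser level simply drops a dimension from the grouping.
by_country = conn.execute(
    "SELECT l.country, SUM(f.revenue)"
    " FROM fact_sales AS f"
    " JOIN dim_location AS l ON f.location_id = l.location_id"
    " GROUP BY l.country ORDER BY l.country"
).fetchall()

print(by_country_month, by_country)
```

    In a snowflake variant, the state and country columns would move out of dim_location
    into their own tables.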

    A data warehouse can contain multiple dimensional schemas that share dimension tables,

    allowing them to be used together. Coming up with a standard set of dimensions is an

    important part of dimensional modeling.

    Its high performance has made the dimensional model the most popular database structure

    for OLAP.

    Post-relational database models

    Products offering a more general data model than the relational model are sometimes

    classified as post-relational. Alternate terms include "hybrid database", "Object-enhanced

    RDBMS" and others. The data model in such products incorporates relations but is not

    constrained by E.F. Codd's Information Principle, which requires that "all information in
    the database must be cast explicitly in terms of values in relations and in no other way".

    Some of these extensions to the relational model integrate concepts from technologies that

    pre-date the relational model. For example, they allow representation of a directed graph

    with trees on the nodes. The German company sones implements this concept in its

    GraphDB.

    Some post-relational products extend relational systems with non-relational features.

    Others arrived in much the same place by adding relational features to pre-relational

    systems. Paradoxically, this allows products that are historically pre-relational, such as

    PICK and MUMPS, to make a plausible claim to be post-relational.

    The resource space model (RSM) is a non-relational data model based on
    multi-dimensional classification.

    Graph model

    Graph databases allow even more general structure than a network database; any node

    may be connected to any other node.

  • 16

    Multivalue model

    Multivalue databases hold "lumpy" data: they can store data in exactly the same way as
    relational databases, but they also permit a level of depth which the relational model can
    only approximate using sub-tables. This is nearly identical to the way XML expresses
    data, where a given field/attribute can have multiple valid values at the same time.
    Multivalue can be thought of as a compressed form of XML.

    An example is an invoice, which in either multivalue or relational data could be seen as

    (A) Invoice Header Table - one entry per invoice, and (B) Invoice Detail Table - one entry

    per line item. In the multivalue model, we have the option of storing the data as one table,

    with an embedded table to represent the detail: (A) Invoice Table - one entry per invoice,

    no other tables needed.
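
    The two layouts can be compared in a short Python sketch. The field names (invoice_no,
    customer, item, qty, price), the sample invoice and the to_multivalue function are
    assumptions made for illustration. The relational form keeps header and detail rows
    apart, linked by the invoice number, while the multivalue form embeds the line items in
    the invoice record itself.

```python
# Relational layout: two tables linked by invoice_no.
invoice_header = [{"invoice_no": 1001, "customer": "Acme"}]
invoice_detail = [
    {"invoice_no": 1001, "item": "Bolt", "qty": 10, "price": 2},
    {"invoice_no": 1001, "item": "Nut", "qty": 10, "price": 1},
]


def to_multivalue(headers, details):
    """Fold each invoice's detail rows into the invoice record itself."""
    invoices = []
    for header in headers:
        lines = [
            {name: value for name, value in detail.items() if name != "invoice_no"}
            for detail in details
            if detail["invoice_no"] == header["invoice_no"]
        ]
        invoices.append({**header, "lines": lines})
    return invoices


invoices = to_multivalue(invoice_header, invoice_detail)

# One read of the invoice record is enough to compute its total.
invoice_total = sum(line["qty"] * line["price"] for line in invoices[0]["lines"])
print(invoices, invoice_total)
```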

    The advantage is that the atomicity of the Invoice (conceptual) and the Invoice (data

    representation) are one-to-one. This also results in fewer reads, fewer referential integrity

    issues, and a dramatic decrease in the hardware needed to support a given transaction

    volume.

    Object-oriented database models

    In the 1990s, the object-oriented programming paradigm was applied to database

    technology, creating a new database model known as object databases. This aims to avoid

    the object-relational impedance mismatch - the overhead of converting information

    between its representation in the database (for example as rows in tables) and its

    representation in the application program (typically as objects). Even further, the type

    system used in a particular application can be defined directly in the database, allowing the

    database to enforce the same data integrity invariants. Object databases also introduce the
    key ideas of object programming, such as encapsulation and polymorphism, into the world
    of databases.

  • 17

    A variety of ways have been tried for storing objects in a database. Some products

    have approached the problem from the application programming end, by making the

    objects manipulated by the program persistent. This typically requires the addition of some

    kind of query language, since conventional programming languages do not have the ability

    to find objects based on their information content. Others have attacked the problem from

    the database end, by defining an object-oriented data model for the database, and defining

    a database programming language that allows full programming capabilities as well as

    traditional query facilities.

    Object databases suffered because of a lack of standardization: although standards were

    defined by ODMG, they were never implemented well enough to ensure interoperability

    between products. Nevertheless, object databases have been used successfully in many

    applications: usually specialized applications such as engineering databases or molecular

    biology databases rather than mainstream commercial data processing. However, object

    database ideas were picked up by the relational vendors and influenced extensions made to

    these products and indeed to the SQL language.

    An alternative to translating between objects and relational databases is to use an
    object-relational mapping (ORM) library.

  • 18

    Explain all types of DBMS. List the syntax and commands along with output.

    A DBMS provides data independence: changes in storage mechanisms and formats can be
    made without modifying the entire application. There are four main types
    of database organization:

    Relational Database: Data is organized as logically independent tables.

    Relationships among tables are shown through shared data. The data in one table

    may reference similar data in other tables, which maintains the integrity of the

    links among them. This feature is referred to as referential integrity - an important

    concept in a relational database system. Operations such as "select" and "join" can

    be performed on these tables. This is the most widely used system of database

    organization.

    Flat Database: Data is organized in a single kind of record with a fixed number of

    fields. This database type encounters more errors due to the repetitive nature of

    data.

    Object Oriented Database: Data is organized in a way similar to object-oriented
    programming concepts. An object consists of data and methods, while classes

    group objects having similar data and methods.

    Hierarchical Database: Data is organized with hierarchical relationships. It

    becomes a complex network if the one-to-many relationship is violated.

    Data management models

    The data management systems (also called data base management systems) introduced

    several new ways of organizing data. That is, they introduced several new ways of linking

    record fragments (or segments) together to form larger records for processing. Although

    many different methods were tried, only three major methods became popular: the

    hierarchic method, the network method, and the newest, the relational method.

    Each of these methods reflected the manner in which the vendor constructed and

    physically managed data within the file. The systems designer and the programmer had to

    understand these methods so that they could retrieve and process the data in the files.

    These models depicted the way the record fragments were tied to each other and thus the

    manner in which the chain of pointers had to be followed to retrieve the fragments in the

    correct order.

    Each vendor introduced a structural model to depict how the data was organized and tied

    together. These models also depicted what options were chosen to be implemented by the

  • 19

    development team, data record dependencies, data record occurrence frequencies, and the

    sequence in which data records had to be accessed - also called the navigation sequence.

    The hierarchic model

    The hierarchic model (figure) is used to describe those record structures in which the

    various physical records which make up the logical record are tied together in a sequence

    which looks like an inverted tree. At the top of the structure is a single record. Beneath

    that are one or more records each of which can occur one or more times. Each of these can

    in turn have multiple records beneath them. In diagrammatic form the top to bottom set of

    records looks like an inverted tree or a pyramid of records. To access the set of records

    associated with the identifier one started at the top record and followed the pointers from

    record to record.

  • 20

    The various records in the lower part of the structure are accessed by first accessing the

    records above them and then following the chain of pointers to the records at the next

    lower level. The records at any given level are referred to as the parent records, and the

    records at the next lower level that are connected to them, or dependent on them, are referred to as

    their children or the child records. There can be any number of records at any level, and each

    record can have any number of children. Each occurrence of the structure normally

    represents the collection of data about a single subject. This parent-child repetition can be

    repeated through several levels.
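
    The parent-child navigation described above can be sketched in Python as follows. This is only an illustration under assumed names: the Record class, the navigate function and the customer/order/item records are hypothetical, and in-memory object references stand in for the physical pointers a hierarchic DBMS would maintain.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """One record (segment) in a hierarchy; children are reachable only via the parent."""
    name: str
    children: list["Record"] = field(default_factory=list)


def navigate(root: Record) -> list[str]:
    """Visit records top-down, each parent before its children."""
    visited = [root.name]
    for child in root.children:
        visited.extend(navigate(child))
    return visited


# Hypothetical occurrence of the structure: one root record with two levels below it.
root = Record("customer-1001", [
    Record("order-A", [Record("item-A1"), Record("item-A2")]),
    Record("order-B", [Record("item-B1")]),
])

order_of_access = navigate(root)
```

    Every lower-level record is reached only by starting at the root and passing through its parent, which is the access pattern the hierarchic model imposes.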

    The data model for this type of structural representation usually depicts each segment or

    record fragment only once and uses lines to show the connection between a parent record

    and its children. This depiction of record types and lines connecting them looks like an

    inverted tree or an organizational hierarchy chart.

    Each file is said to consist of a number of repetitions of this tree structure. Although the

    data model depicts all possible record types within a structure, in any given occurrence,

    record types may or may not be present. Each occurrence of the structure represents a

    specific subject occurrence and is identified by a unique identifier in the single, topmost

    record type (the root record).

    Designers employing this type of data management system would have to develop a

    unique record hierarchy for each data storage subject. A given application may have

    several different hierarchies, each representing data about a different subject, associated

    with it and a company may have several dozen different hierarchies of record types as

    components of its data model. A characteristic of this type of model is that each hierarchy

    is normally treated as separate and distinct from the other hierarchies, and various

    hierarchies can be mixed and matched to suit the data needs of the particular application.

    The network model

    The network data model (figure) has no implicit hierarchic relationship between the

    various records, and in many cases no implicit structure at all, with the records seemingly

    placed at random. The network model does not make a clear distinction between subjects,

    mingling all record types in an overall schematic. The network model may have many

    different records containing unique identifiers, each of which acts as an entry point into

    the record structure. Record types are grouped into sets of two, one or both of which can in

    turn be part of another set of two record types. Within a given set, one record type is said

    to be the owner record and one is said to be the member record. Access to a set is always

  • 21

    accomplished by first locating the specific owner record and then following the chain of

    pointers to the member records of the set. The network can be traversed or navigated by

    moving from set to set. Various different data structures can be constructed by selecting

    sets of records and excluding others.

    Each record type is depicted only once in this type of data model and the relationship

    between record types is indicated by a line between them. The line joining the two records

    contains the name of the set. Within a set, a record can have only one owner, but multiple

    owner-member sets can be constructed using the same two record types.

    The network model has no explicit hierarchy and no explicit entry point. Whereas the

    hierarchic model has several different hierarchy structures, the network model employs a

  • 22

    single master network or model, which when completed looks like a web of records. As

    new data is required, records are added to the network and joined to existing sets.
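
    A rough Python sketch of owner-member sets and set-to-set navigation is given below. The NetworkSet class and the CUSTOMER-ORDER and ORDER-ITEM sets are hypothetical names chosen for illustration, and dictionaries stand in for the pointer chains of a real network DBMS.

```python
class NetworkSet:
    """A named set linking one owner record type to one member record type."""

    def __init__(self, name):
        self.name = name
        self.members_of = {}  # owner key -> list of member keys
        self.owner_of = {}    # member key -> owner key

    def connect(self, owner, member):
        # Within a set, a member record can have only one owner.
        if member in self.owner_of:
            raise ValueError(f"{member} already has an owner in set {self.name}")
        self.owner_of[member] = owner
        self.members_of.setdefault(owner, []).append(member)

    def members(self, owner):
        # Access starts at the owner and follows the chain to its members.
        return list(self.members_of.get(owner, []))


# Two hypothetical sets; the ORDER record type is a member of one and the owner of the other.
customer_order = NetworkSet("CUSTOMER-ORDER")
order_item = NetworkSet("ORDER-ITEM")
customer_order.connect("cust-1", "order-A")
customer_order.connect("cust-1", "order-B")
order_item.connect("order-A", "item-1")
order_item.connect("order-B", "item-2")

# Navigate from set to set: customer -> orders -> items.
items = [
    item
    for order in customer_order.members("cust-1")
    for item in order_item.members(order)
]
```

    Adding new data amounts to connecting further records into existing sets, or defining another NetworkSet between two record types.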

    The relational model

    The relational model (figure), unlike the network or the hierarchic models, did not rely on

    pointers to connect records and chose to view individual records in sets regardless of the subject

    occurrence they were associated with. This is in contrast to the other models which sought

    to depict the relationships between record types. In the relational model, records are

    portrayed as residing in tables with no physical pointer between these tables. Each table is

    thus portrayed independently from each other table. This made the data model itself a

    model of simplicity, but it in turn made the visualization of all the records associated with

    a particular subject somewhat difficult.

  • 23

    Data records were connected using logic and by using data that was redundantly

    stored in each table. Records on a given subject occurrence could be selected from

    multiple tables by matching the contents of these redundantly stored data fields.
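
    To illustrate selecting records from multiple tables by matching redundantly stored fields, here is a small Python sketch. The join_on function, the two tables and all of their values are made up for this example; only the idea of matching on a shared field comes from the text.

```python
# Two independent "tables" with no pointers between them; train_no is stored in both.
passengers = [
    {"pnr": 1, "name": "Asha", "train_no": 101},
    {"pnr": 2, "name": "Ravi", "train_no": 102},
]
trains = [
    {"train_no": 101, "train_name": "Train A"},
    {"train_no": 102, "train_name": "Train B"},
]


def join_on(left, right, key):
    """Pair rows from two tables whose values for `key` are equal."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


joined = join_on(passengers, trains, "train_no")
```

    The connection between a passenger and a train exists only because both rows hold the same train_no value; nothing in either table points at the other.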

    The impact of data management systems

    The use of these products to manage data introduced a new set of tasks for the data

    analysis personnel. In addition to developing record layouts, they also had the new task of

    determining how these records should be structured, or arranged and joined by pointer

    structures.

    Once those decisions were made they had to be conveyed to the members of the

    implementation team. The hierarchic and network models were necessary because without

    them the occurrence sequences and the record to record relationships designed into the

    files could not be adequately portrayed. Although the relational "model" design choices

    also needed to be conveyed to the implementation team, the relational model was always

    depicted in much the same format as standard record layouts, and any other access or

    navigation related information could be conveyed in narrative form.

    Data as a corporate Resource

    One additional concept was introduced during the period when these new file management

    systems were being developed - the concept that data was a corporate resource. The

    implications of this concept were that data belonged to the corporation as a whole, and not to

    individual user areas. This implied that data should somehow be shared, or used in

    common by all members of the firm.

    Data sharing required data planning. Data had to be organized, sized and formatted to

    facilitate use by all who needed it. This concept of data sharing was diametrically opposed

    to the application orientation where data records and data files were designed for, and data

    owned by the application and the users of that application.

    This concept also introduced a new set of participants in the data analysis process and a

    new set of users of the data models. These new people were business area personnel who

    were now drawn into the data analysis process. The data record models which had sufficed

    for the data processing personnel no longer conveyed the right information or

    information with the correct perspective to be meaningful for these new participants.

    The primary method of data planning is the development of the data model. Much of the

    early data planning was accomplished within the context of the schematics used by the

    design team to describe the data management file structures.

  • 24

    These models were used as analysis and requirements tools, and as such were moderately

    effective. They were limited in one respect, that being that organizations tended to use the

    implementation model, which also contained information about pointer use, navigation

    information, or in the case of the network models, owner-member set information, access

    choice information and other information which was important to the data processing

    implementation team, but not terribly relevant to the user.

    Normalization

    Concurrent with the introduction of the relational data model another concept was

    introduced - that of normalization. Although it was introduced in the early nineteen-

    seventies its full impact did not begin to be felt until almost a decade later, and even today

    its concepts are not well understood. The various record models gave the designer a way

    of presenting to the user not only the record layout but also the connections between

    the data records, in a sense allowing the designer to show the user what data could be

    accessed with what other data. Determination of record content, however, was not

    addressed in any methodical manner. Data elements were collected into records in a

    somewhat haphazard manner. That is, there was no rationale or predetermined reason why

    one data element was placed in the same record as another. Nor was there any need to do

    so since the physical pointers between records prevented data on one subject from being

    confused with data about another, even at the occurrence level.

    The relational model, however, lacked these pointers and relied on logic to assemble a

    complete set of data from its tables. Because it was logic driven (based upon mathematics)

    the notion was proposed that placement of data elements in records could also be guided

    by a set of rules. If followed, these rules would eliminate many of the design mistakes

    which arose from the meaning of data being inadvertently changed due to totally unrelated

    changes. It also set forth rules which if followed would arrange the data within the records

    and within the files more logically and more consistently.

    Previously, data analysts and file and record designers relied on intuition and experience to

    construct record layouts. As the design progressed, data was moved from record to record,

    records were split and others combined until the final model was pleasing, relatively

    efficient and satisfied the processing needs of the application that needed the data that

    these models represented. Normalization offered the hope that the process of record

    layout, and thus model development could be more procedurally driven, more rule driven

    such that relatively inexperienced users could also participate in the process. It was also

  • 25

    hoped that these rules would also assist the experienced designer and eliminate some of

    the iterations, and thus make the process more efficient.

    The first rule of normalization was that data should depend on (or be collected by) key. That

    is, data should be organized by subject, as opposed to previous methods which collected

    data by application or system. This notion was obvious to hierarchic model users, whose

    models inherently followed this principle, but was somewhat foreign and novel to network

    model developers where the aggregation of data about a data subject was not as

    commonplace.
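
    One possible reading of "data should depend on key" is sketched below in Python: fields that describe a train are moved out of the passenger records into a table keyed by train number. The split_by_key function and the sample rows are assumptions made for illustration, not a procedure given in this document.

```python
# A flat layout in which the train name is repeated on every passenger record.
flat = [
    {"pnr": 1, "name": "Asha", "train_no": 101, "train_name": "Train A"},
    {"pnr": 2, "name": "Ravi", "train_no": 101, "train_name": "Train A"},
    {"pnr": 3, "name": "Meena", "train_no": 102, "train_name": "Train B"},
]


def split_by_key(rows, key, dependent):
    """Move the fields that depend on `key` into their own table, one entry per key value."""
    lookup = {}
    remaining = []
    for row in rows:
        lookup[row[key]] = {f: row[f] for f in dependent}
        remaining.append({k: v for k, v in row.items() if k not in dependent})
    return remaining, lookup


passengers, trains = split_by_key(flat, "train_no", ["train_name"])
```

    After the split, each train name is stored once under its own key, so correcting it no longer requires touching every passenger record.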

    This notion of subject organized data led to the development of non-DBMS oriented data

    models.

    The Entity-Relationship model

    While the record data models served many purposes for the system designers, these

    models had little meaning or relevance to the user community. Moreover, much of the

    information the users needed to evaluate the effectiveness of the design was missing.

    Several alternative data model formats were introduced to fill this void. These models

    attempted to model data in a different manner. Rather than look at data from a record

    perspective, they began to look at the entities or subjects about which data was being

    collected and maintained. They also realized that the relationship between these data

    subjects was also an area that needed to be modeled and subjected to user scrutiny. These

    relationships were important because in many respects they reflected the business rules

    under which the firm operated. This modeling of relationships was particularly important

    when relational data management systems were being used because the relationship

    between the data tables was not explicitly stated, and the design team required some

    method for describing those relationships to the user.

    As we shall see later on, the Entity-Relationship model has one other important advantage.

    In as much as it is non-DBMS specific, and is in fact not a DBMS model at all, data

    models can be developed by the design team without first having to make a choice as to

    which DBMS to use. In those firms where multiple data management systems are both in

    use and available, this is a critical advantage in the design process.

  • 26

    IRCTC TABLES

    Layout of the railway reservation form and the connection of this form with the database required

    to store information.

    PASSENGERS DATABASE: the passengers database contains the following fields

    1. Name

    2. Age

    3. Gender

    4. Total Number of Passengers Travelling

       Number of Adults

       Number of Children

       Senior Citizen

    5. Date of Travel

    6. Class of Travel

    TRAIN DATABASE: the train database contains the following fields

    1. Train Name

    2. Train Number

    3. Route (From, To)

    4. Train Time

    5. Number of Compartments

       AC First Class

       AC 2 Tier

       AC 3 Tier

       Sleeper

       General

    6. Number of Employees

    DESIGN OF TABLES

    The passenger database will contain the following fields

    PNR NO (Primary key)

    NAME

    AGE

    GENDER

    TOTAL PASSENGER

    DATE OF TRAVEL

  • 27

    CLASS

    TRAIN NO.

    The train database will contain the following fields

    Train name

    Train no. (Primary key)

    Route from-to

    Departure time

    No of compartments

    1AC

    2AC

    3AC

    SLEEPER

    GENERAL

    SLR
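
    The two table designs above could be expressed roughly as follows, using Python's standard sqlite3 module rather than MS Access. The column types, the snake_case column names, the reading of 1AC to SLR as per-class compartment counts, the foreign key from passenger to train, and the sample row values are all assumptions added for this sketch; only the field lists and the two primary keys come from the design above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for SQLite to enforce REFERENCES

# Train table: train_no is the primary key, as in the design above.
conn.execute("""
CREATE TABLE train (
    train_no           INTEGER PRIMARY KEY,
    train_name         TEXT NOT NULL,
    route_from         TEXT,
    route_to           TEXT,
    departure_time     TEXT,
    no_of_compartments INTEGER,
    ac1 INTEGER, ac2 INTEGER, ac3 INTEGER,      -- assumed per-class counts
    sleeper INTEGER, general INTEGER, slr INTEGER
)""")

# Passenger table: pnr_no is the primary key; train_no is assumed to reference train.
conn.execute("""
CREATE TABLE passenger (
    pnr_no          INTEGER PRIMARY KEY,
    name            TEXT NOT NULL,
    age             INTEGER,
    gender          TEXT,
    total_passenger INTEGER,
    date_of_travel  TEXT,
    travel_class    TEXT,
    train_no        INTEGER REFERENCES train(train_no)
)""")

# Made-up sample rows, purely for demonstration.
conn.execute(
    "INSERT INTO train (train_no, train_name, route_from, route_to, departure_time)"
    " VALUES (101, 'Train A', 'X', 'Y', '06:00')"
)
conn.execute(
    "INSERT INTO passenger VALUES (5001, 'Asha', 30, 'F', 1, '2020-01-15', 'SLEEPER', 101)"
)

row = conn.execute(
    "SELECT p.name, t.train_name FROM passenger AS p"
    " JOIN train AS t ON t.train_no = p.train_no WHERE p.pnr_no = 5001"
).fetchone()
```

    The TRAIN NO. field stored in the passenger table is what allows a reservation to be matched to its train.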

    SNAPSHOTS OF TABLES

    TABLE FOR PASSENGERS

  • 28

    This is the original snapshot from MS Access. The primary key here is PNR NO. This

    table also contains the name of the passenger, age, gender, total passengers travelling, date of

    travel, class and the train no. in which they are travelling.

    TABLE FOR TRAINS

    This is the original snapshot from MS Access. The primary key here is train no. This

    table also contains the train name, route, departure time from the originating station, no. of

    compartments in the whole train and class-wise segmentation of compartments.

  • 29

    SNAPSHOTS OF FORMS

    PASSENGER RESERVATION FORM

    This form contains the same data labels as the MS Access database, i.e. the

    name of the passenger, age, gender, total passengers travelling, date of travel, class and the train

    no. in which they are travelling.

  • 30

    FORM FOR TRAINS

    This form contains the same data labels as the MS Access database, i.e. the

    train name, route, departure time from the originating station, no. of compartments in the

    whole train and class-wise segmentation of compartments.