
CHAPTER-1: INTRODUCTION


This document describes the procedure required to migrate a database from Windows to Linux using the RMAN Convert Database command. Both Windows and Linux platforms have the same endian format, which makes it possible to transfer the whole database and makes the migration process very straightforward and simple. To migrate between platforms that have a different endian format, Cross-Platform Transportable Tablespaces (XTTS) must be used instead.
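
As a quick illustration of the two steps involved (a minimal sketch; the platform names, paths, and file names below are examples and vary by Oracle release and installation), the endian formats can first be compared in V$TRANSPORTABLE_PLATFORM from SQL*Plus, and the database, opened read only, can then be converted from RMAN:

SQL> SELECT platform_name, endian_format
     FROM   v$transportable_platform
     WHERE  platform_name LIKE '%Microsoft Windows%'
     OR     platform_name LIKE '%Linux%';

RMAN> CONVERT DATABASE NEW DATABASE 'orcl'
        TRANSPORT SCRIPT '/tmp/transport_orcl.sql'
        TO PLATFORM 'Linux x86 64-bit'
        DB_FILE_NAME_CONVERT 'C:\oracle\oradata\orcl\' '/u01/app/oracle/oradata/orcl/';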

For IT managers, database migration has become one of the most routine and challenging facts of life. Workload balancing, technology refresh, server and storage consolidation, data center relocation, data classification, and mergers/acquisitions are all drivers to move data from one storage device to another on a regular basis. You would think that any operation performed routinely would become easy, but it is not as simple as it seems. Migration, that is, moving data from one device to another (especially as the capacity of storage devices continues to grow) and then redirecting all I/O to the new device, is an inherently disruptive process. With the increase in the percentage of mission-critical data and the proportionate increase in data availability demands, downtime, with its huge impact on a company's financial bottom line, becomes unacceptable. In addition, business, technical and operational requirements impose challenging restrictions on the migration process itself. Resource demands (staff, CPU cycles, and bandwidth) and risks (application downtime, performance impact on production environments, technical incompatibilities, and data corruption/loss) make migration one of IT's biggest challenges.


1.1 DATA

The term data means groups of information that represent the qualitative or quantitative

attributes of a variable or set of variables. Data (plural of "datum", which is seldom used) are

typically the results of measurements and can be the basis of graphs, images, or observations of a

set of variables. Data are often viewed as the lowest level of abstraction from which information

and knowledge are derived. Raw data refers to a collection of numbers, characters, images, or other outputs from devices that collect information to convert physical quantities into symbols; such raw data is unprocessed. Such data

is typically further processed by a human or input into a computer, stored and processed there, or

transmitted (output) to another human or computer (possibly through a data cable). Raw data is a

relative term; data processing commonly occurs by stages, and the "processed data" from one

stage may be considered the "raw data" of the next. The terms information and knowledge are

frequently used for overlapping concepts. The main difference is in the level of abstraction being

considered. Data is the lowest level of abstraction, information is the next level, and finally,

knowledge is the highest level among all three.

1.2 INFORMATION

Information as a concept bears a diversity of meanings, from everyday usage to technical

settings. Generally speaking, the concept of information is closely related to notions of

constraint, communication, control, data, form, instruction, knowledge, meaning, mental

stimulus, pattern, perception, and representation.


CHAPTER-2: INTRODUCTION TO

DATABASE


DATABASE

A database consists of an organized collection of data for one or more uses. Databases consist of software-based "containers" that are structured to collect and store information so users can retrieve, add, update or remove such information in an automatic fashion. They do so through the use of tables. Database programs are designed so that users can add or delete any information needed. The structure of a database is tabular, consisting of rows and columns of information.

APPLICATIONS

Databases function in many applications, spanning virtually the entire range of computer

software. Databases have become the preferred method of storage for large multiuser

applications, where coordination between many users is needed. Even individual users find them

convenient, and many electronic mail programs and personal organizers are based on standard

database technology. Software database drivers are available for most database platforms so that

application software can use a common API to retrieve the information stored in a database.

Commonly used database APIs include JDBC and ODBC.

DATABASES IN NEW MEDIA

Within new media, databases collect items on which the user can carry out various operations

such as viewing, navigating, creating, and searching. Though there are various types of items

within the database, each item has the same significance. Unlike a narrative or film, the

collections are computerized and therefore may offer a unique experience with each view. This

form of data may present a unique presentation of what the world is like. Databases can be seen

as a symbolic form of the computer age.


2.1 COMPONENTS OF DATABASE

A database is developed with several components, all linked to each other. Each has its own role to play in the development and working of the database. Some of the main components used in developing and working with a database are: columns, rows, tables, keys, forms, reports, and queries. Tables, rows, and columns form the building blocks of a database; they store the data that we want to save in our database.

2.1.1 COLUMNS

Columns are akin to fields, that is, individual items of data that we wish to store. A customer's name, the price of a part, and the date of an invoice are all examples of columns. They are also similar to the columns found in spreadsheets (the A, B, C, etc. along the top).

2.1.2 ROWS

Rows are akin to records, as they contain the data of multiple columns (like rows 1, 2, 3, etc. in a spreadsheet). Unlike file records, though, it is possible to extract only the columns you want to make up a row of data. Old "records" that computers read forced the computer to read everything, even if you only wanted a tiny portion of the record. In databases, a row can be made up of as many or as few columns as you want. This makes reading data much more efficient - you fetch only what you want.


2.1.3 TABLE

A table is a logical group of columns. For example, you may have a table that stores details of customers' names and addresses. Another table would be used to store details of parts, and yet another would be used for suppliers' names and addresses. It is the tables that make up the entire database, and it is important that we do not duplicate data at all. Only key values are duplicated (and even then, in some tables these must be unique).

2.1.4 KEYS

A key is a single field or a combination of fields. Its purpose is to access or retrieve data rows from a table according to the requirement. Keys are defined in tables to access or sequence the stored data quickly and smoothly. They are also used to create links between different tables.

TYPES OF KEYS

The different types of keys are described below.

PRIMARY KEY

A primary key is a value that can be used to identify a unique row in a table. Attributes are

associated with it. Examples of primary keys are Social Security numbers (associated to a

specific person) or ISBNs (associated to a specific book). In the relational model of data, a

primary key is a candidate key chosen as the main method of uniquely identifying a tuple in a

relation.

SECONDARY KEY

A field or combination of fields that is the basis for retrieval is known as a secondary key. A secondary key is a non-unique field; one secondary key value may refer to many records.


COMPOSITE KEY

A primary key that consists of two or more attributes is known as a composite key.

CANDIDATE KEY

A candidate key is a field or combination of fields that can act as a primary key field for that

table to uniquely identify each record in that table.

ALTERNATE KEY

An alternate key is any candidate key which is not selected to be the primary key.

SUPER KEY

A super key is defined in the relational model as a set of attributes of a relation variable (relvar)

for which it holds that in all relations assigned to that variable there are no two distinct tuples

(rows) that have the same values for the attributes in this set. Equivalently a superkey can also be

defined as a set of attributes of a relvar upon which all attributes of the relvar are functionally

dependent.

FOREIGN KEY

A foreign key (FK) is a field or group of fields in a database record that points to a key field or

group of fields forming a key of another database record in some (usually different) table.

Usually a foreign key in one table refers to the primary key (PK) of another table. This way

references can be made to link information together and it is an essential part of database

normalization.
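
As a brief, hedged sketch of how these keys are declared in SQL (the customers and orders tables here are hypothetical examples, not part of this document's case study):

CREATE TABLE customers (
    customer_id  NUMBER        PRIMARY KEY,     -- primary key: uniquely identifies each row
    name         VARCHAR2(100) NOT NULL,
    city         VARCHAR2(50)                   -- a non-unique field that could act as a secondary key
);

CREATE TABLE orders (
    order_id     NUMBER,
    line_no      NUMBER,
    customer_id  NUMBER REFERENCES customers (customer_id),   -- foreign key to the customers table
    order_date   DATE,
    CONSTRAINT pk_orders PRIMARY KEY (order_id, line_no)      -- composite primary key (two attributes)
);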


2.1.5 FORMS

Forms are used to enter or look at data. They are the screens that make up the application that

allows us to enter and retrieve data. Each "field" on a form usually relates to a column on the

database but this is not always the case.

For example, suppose you have a field that shows a total of two or more items on a form. The total would NOT be stored in the database. Why? Because if one of the columns were incorrectly updated, you wouldn't know which one was incorrect. For example, suppose you had three columns on a table - A, B and C - where A is equal to B + C, and suppose the columns contain the following values:

A = 11

B = 7

C = 5

Which column is wrong? You might think that A is wrong, but perhaps B should be equal to 6 or

maybe C should be 4. You just don't know. This is why derived items are not stored on the

database but can be shown on a form. The form should have some form of validation so that the

user cannot enter "rubbish" onto the database. Numbers must store numbers; dates should store

dates and so on. Depending on the application, there can be some very complex validation.
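
To make the derived-value point concrete, one common approach is to compute such totals in a view or query at display time instead of storing them; a minimal sketch, assuming a hypothetical order_lines table:

CREATE VIEW order_totals AS
SELECT order_id,
       SUM(quantity * unit_price) AS order_total   -- derived each time; never stored in a base table
FROM   order_lines
GROUP  BY order_id;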

2.1.6 REPORTS

It is all very well placing all this data into the database, but if you cannot access it in a format

that you want (usually on paper), then it is extremely limited in use. Every database should have

some kind of reporting facility and I usually find that the report side of things is what makes a

good (or bad) database application/package. Reports should be easy to produce and be quick.

Simple reports should be so easy to produce that anyone can do it. More complex reports will


need someone with programming/database knowledge, as it is possible to produce really silly

results.

2.1.7 QUERY

Queries are little programs that let the user ask questions, such as "Give me a list of all the customers who ordered something last month" or "List all customers in a specific area". Queries can be, and often are, the foundation of reports (in that a report can be based on the results of a query). Depending on the database, queries can be constructed on screen, or you may even be able to type in a query command directly. Queries often use a language called SQL (Structured Query Language), a computer language that makes accessing databases simple. These, then, are the general components that make up a database; basic design concepts for tables are covered later in this chapter.
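
For instance, the first question above could be written in SQL roughly as follows (a sketch against the hypothetical customers and orders tables used earlier; the date arithmetic is Oracle-style):

SELECT DISTINCT c.customer_id, c.name
FROM   customers c
JOIN   orders    o ON o.customer_id = c.customer_id
WHERE  o.order_date >= ADD_MONTHS(TRUNC(SYSDATE, 'MM'), -1)   -- first day of last month
AND    o.order_date <  TRUNC(SYSDATE, 'MM');                  -- first day of this month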

2.2 TYPES OF DATABASE

There are various kinds of databases in the computer world. They are differentiated by their functionality and by how they model the data.

2.2.1 BASED ON FUNCTIONALITY

OPERATIONAL DATABASE

These databases store detailed data needed to support the operations of an entire organization.

They are also called subject-area databases (SADB), transaction databases, and production

databases. For example:

Customer database

Personal database


Inventory database

Accounting database

ANALYTICAL DATABASE

Analytic databases (a.k.a. OLAP- On Line Analytical Processing) are primarily static, read-only

databases which store archived, historical data used for analysis. For example, a company might

store sales records over the last ten years in an analytic database and use that database to analyze

marketing strategies in relation to demographics. On the web, you will often see analytic databases in the form of inventory catalogs such as those used by Amazon.com.

An inventory catalog analytical database usually holds descriptive information about all

available products in the inventory. Web pages are generated dynamically by querying the list of

available products in the inventory against some search parameters. The dynamically-generated

page will display the information about each item (such as title, author, ISBN) which is stored in

the database.

DATA WAREHOUSE

A data warehouse stores data from current and previous years, extracted from the various

operational databases of an organization. It becomes the central source of data that has been

screened, edited, standardized and integrated so that it can be used by managers and other end-

user professionals throughout an organization. Data warehouses are characterized by being slow

to insert into but fast to retrieve from. Recent developments in data warehousing have led to the

use of a shared-nothing architecture to facilitate extreme scaling.


DISTRIBUTED DATABASE

These are databases of local work-groups and departments at regional offices, branch offices,

manufacturing plants and other work sites. These databases can include segments of both

common operational and common user databases, as well as data generated and used only at a

user’s own site.

END-USER DATABASE

These databases consist of a variety of data files developed by end-users at their workstations.

Examples of these are collections of documents in spreadsheets, word processing and even

downloaded files.

EXTERNAL DATABASE

These databases provide access to external, privately-owned data online — available for a fee to

end-users and organizations from commercial services. Access to a wealth of information from

external databases is available for a fee from commercial online services, and with or without charge from many sources on the Internet.

HYPERMEDIA DATABASES ON THE WEB

These are a set of interconnected multimedia pages at a web-site. They consist of a home page

and other hyperlinked pages of multimedia or mixed media such as text, graphics, photographic images, video clips, audio, etc.


NAVIGATIONAL DATABASE

In navigational databases, queries find objects primarily by following references from other

objects. Traditionally navigational interfaces are procedural, though one could characterize some

modern systems like XPath as being simultaneously navigational and declarative.

IN-MEMORY DATABASES

In-memory databases primarily rely on main memory for computer data storage. This contrasts

with database management systems which employ a disk-based storage mechanism. Main

memory databases are faster than disk-optimized databases since the internal optimization

algorithms are simpler and execute fewer CPU instructions. Accessing data in memory provides

faster and more predictable performance than disk. In applications where response time is

critical, such as telecommunications network equipment that operates emergency systems, main

memory databases are often used.

DOCUMENT-ORIENTED DATABASES

Document-oriented databases are computer programs designed for document-oriented

applications. These systems may be implemented as a layer above a relational database or an

object database. As opposed to relational databases, document-based databases do not store data

in tables with uniform sized fields for each record. Instead, they store each record as a document

that has certain characteristics. Any number of fields of any length can be added to a document.

Fields can also contain multiple pieces of data.

REAL-TIME DATABASES

A real-time database is a processing system designed to handle workloads whose state may

change constantly. This differs from traditional databases containing persistent data, mostly


unaffected by time. For example, a stock market changes rapidly and dynamically. Real-time

processing means that a transaction is processed fast enough for the result to come back and be

acted on right away. Real-time databases are useful for accounting, banking, law, medical

records, multi-media, process control, reservation systems, and scientific data analysis. As

computers increase in power and can store more data, real-time databases become integrated into

society and are employed in many applications.

2.2.2 MODEL BASED

HIERARCHICAL DATABASES

As its name implies, the Hierarchical Database Model defines hierarchically-arranged data.

Perhaps the most intuitive way to visualize this type of relationship is by visualizing an upside

down tree of data. In this tree, a single table acts as the "root" of the database from which other

tables "branch" out. You will be instantly familiar with this relationship because that is how all

windows-based directory management systems (like Windows Explorer) work these days.

Relationships in such a system are thought of in terms of children and parents such that a child

may only have one parent but a parent can have multiple children. Parents and children are tied

together by links called "pointers" (perhaps physical addresses inside the file system). A parent

will have a list of pointers to each of their children.


This child/parent rule assures that data is systematically accessible. To get to a low-level table,

you start at the root and work your way down through the tree until you reach your target. Of

course, as you might imagine, one problem with this system is that the user must know how the

tree is structured in order to find anything.

NETWORK DATABASES

In many ways, the Network Database model was designed to solve some of the more serious

problems with the Hierarchical Database Model. Specifically, the Network model solves the

problem of data redundancy by representing relationships in terms of sets rather than hierarchy.

The model had its origins in the Conference on Data Systems Languages (CODASYL) which

had created the Data Base Task Group to explore and design a method to replace the hierarchical

model. The network model is very similar to the hierarchical model actually. In fact, the

hierarchical model is a subset of the network model. However, instead of using a single-parent

tree hierarchy, the network model uses set theory to provide a tree-like hierarchy with the

exception that child tables were allowed to have more than one parent. This allowed the network

model to support many-to-many relationships. Visually, a Network Database looks like a

hierarchical Database in that you can see it as a type of tree. However, in the case of a Network

Database, the look is more like several trees which share branches. Thus, children can have

multiple parents and parents can have multiple children.


Nevertheless, though it was a dramatic improvement, the network model was far from perfect.

Most profoundly, the model was difficult to implement and maintain. Most implementations of

the network model were used by computer programmers rather than real users. What was needed

was a simple model which could be used by real end users to solve real problems.

2.3 ARCHITECTURE OF DATABASE

A number of database architectures exist. Many databases use a combination of strategies.


Online Transaction Processing (OLTP) systems often use a "row-oriented" or an "object-oriented" data store architecture, whereas data warehouses and other retrieval-focused applications like Google's BigTable, or bibliographic database (library catalog) systems, may use a column-oriented DBMS architecture. Document-oriented databases, XML databases, knowledge bases, as well as frame databases and RDF stores (also known as triple stores), may also use a combination of these architectures in their implementation. Not all databases have or need a database schema ("schema-less databases"). Over many years, general-purpose database systems have dominated

the database industry. These offer a wide range of functions, applicable to many, if not most

circumstances in modern data processing. These have been enhanced with extensible data types

(pioneered in the PostgreSQL project) to allow development of a very wide range of

applications. There are also other types of databases which cannot be classified as relational

databases. Most notable is the object database management system, which stores language

objects natively without using a separate data definition language and without translating into a

separate storage schema. Unlike relational systems, these object databases store the relationship

between complex data types as part of their storage model in a way that does not require runtime

calculation of related data using relational algebra execution algorithms.


2.3.1 THREE LEVEL DATABASE ARCHITECTURE

DATA AND RELATED STRUCTURES

Data are actually stored as bits, or numbers and strings, but it is difficult to work with data at this

level. It is necessary to view data at different levels of abstraction.

Schema: a description of the data at some level. Each level has its own schema.

We will be concerned with three forms of schema:

Physical

Conceptual

External


PHYSICAL DATA LEVEL

The physical schema describes details of how data is stored: files, indices, etc. on the random access disk system. It also typically describes the record layout of files and the type of files (hash, b-tree, flat). Early applications worked at this level and explicitly dealt with such details, e.g., minimizing physical distances between related data and organizing the data structures within the file (blocked records, linked lists of blocks, etc.).

Problem:

Routines are hardcoded to deal with the physical representation.

Changes to data structures are difficult to make.

Application code becomes complex since it must deal with details.

Rapid implementation of new features is very difficult.

CONCEPTUAL DATA LEVEL

Hides details of the physical level.

In the relational model, the conceptual schema presents data as a set of tables.

The DBMS maps data access between the conceptual and physical schemas automatically.

Physical schema can be changed without changing application:

DBMS must change mapping from conceptual to physical.

Referred to as physical data independence.

EXTERNAL DATA LEVEL


In the relational model, the external schema also presents data as a set of relations. An external

schema specifies a view of the data in terms of the conceptual level. It is tailored to the needs of

a particular category of users. Hiding portions of the stored data from some users begins to implement a level of security and simplifies the view for those users.

Examples:

Students should not see faculty salaries.

Faculty should not see billing or payment data.

Information that can be derived from stored data might be viewed as if it were stored.

GPA not stored, calculated when needed.

Applications are written in terms of an external schema. The external view is computed when

accessed.  It is not stored. Different external schemas can be provided to different categories of

users. Translation from external level to conceptual level is done automatically by DBMS at run

time. The conceptual schema can be changed without changing application:

Mapping from external to conceptual must be changed.

Referred to as conceptual data independence
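
In a relational DBMS, an external schema is typically implemented with views; a minimal sketch of the two examples above, assuming hypothetical faculty and enrollments tables:

-- Faculty salaries are hidden simply by omitting the column from the view.
CREATE VIEW faculty_directory AS
SELECT faculty_id, name, department
FROM   faculty;

-- GPA is not stored; it is calculated when the view is queried.
CREATE VIEW student_gpa AS
SELECT student_id,
       SUM(grade_points * credits) / SUM(credits) AS gpa
FROM   enrollments
GROUP  BY student_id;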

WHAT IS A DATA MODEL?

Well, essentially a data model is a "description" of both a container for data and a methodology

for storing and retrieving data from that container. Actually, there isn't really a data model

"thing". Data models are abstractions, oftentimes mathematical algorithms and concepts. You

cannot really touch a data model. But nevertheless, they are very useful. The analysis and design

of data models has been the cornerstone of the evolution of databases. As models have advanced

so has database efficiency.

2.4 DATABASE MODELS


Besides differentiating databases according to function, databases can also be differentiated

according to how they model the data. Before the 1980's, the two most commonly used Database

Models were the hierarchical and network systems. Let's take a quick look at these two models

and then move on to the more current models.

2.4.1 FLAT MODEL

The flat (or table) model consists of a single, two-dimensional array of data elements, where all

members of a given column are assumed to be similar values, and all members of a row are

assumed to be related to one another. For instance, a flat model might contain columns for name and password as part of a system security database, where each row holds the specific password associated with an individual user.

defining them as character data, date or time information, integers, or floating point numbers.

This may not strictly qualify as a data model, as defined above.

2.4.2 HIERARCHICAL MODEL


In a hierarchical model, data is organized into a tree-like structure, implying a single upward link

in each record to describe the nesting, and a sort field to keep the records in a particular order in

each same-level list. Hierarchical structures were widely used in the early mainframe database

management systems, such as the Information Management System (IMS) by IBM, and now

describe the structure of XML documents. This structure allows a 1:N relationship between two types of data. It is very efficient for describing many relationships in the real world: recipes, tables of contents, the ordering of paragraphs/verses, and any nested and sorted information.

However, the hierarchical structure is inefficient for certain database operations when a full path

(as opposed to upward link and sort field) is not also included for each record. Parent–child

relationship: Child may only have one parent but a parent can have multiple children. Parents

and children are tied together by links called "pointers". A parent will have a list of pointers to

each of their children.

2.4.3 NETWORK MODEL


The network model (defined by the CODASYL specification) organizes data using two

fundamental constructs, called records and sets. Records contain fields (which may be organized

hierarchically, as in the programming language COBOL). Sets (not to be confused with

mathematical sets) define one-to-many relationships between records: one owner, many

members. A record may be an owner in any number of sets, and a member in any number of sets.

The network model is a variation on the hierarchical model, to the extent that it is built on the

concept of multiple branches (lower-level structures) emanating from one or more nodes (higher-

level structures), while the model differs from the hierarchical model in that branches can be

connected to multiple nodes. The network model is able to represent redundancy in data more

efficiently than in the hierarchical model. The operations of the network model are navigational

in style: a program maintains a current position, and navigates from one record to another by

following the relationships in which the record participates. Records can also be located by

supplying key values. Although it is not an essential feature of the model, network databases

generally implement the set relationships by means of pointers that directly address the location

of a record on disk. This gives excellent retrieval performance, at the expense of operations such

as database loading and reorganization. Most object databases use the navigational concept to

provide fast navigation across networks of objects, generally using object identifiers as "smart"

pointers to related objects. Objectivity/DB, for instance, implements named 1:1, 1:many, many:1 and many:many relationships that can cross databases. Many object databases also

support SQL, combining the strengths of both models.


2.4.4 RELATIONAL MODEL

The relational model was introduced by E.F. Codd in 1970 as a way to make database

management systems more independent of any particular application. It is a mathematical model

defined in terms of predicate logic and set theory. The products that are generally referred to as

relational databases in fact implement a model that is only an approximation to the mathematical

model defined by Codd. Three key terms are used extensively in relational database models:

Relations, attributes, and domains. A relation is a table with columns and rows. The named

columns of the relation are called attributes, and the domain is the set of values the attributes are

allowed to take. All relations (and, thus, tables) in a relational database have to adhere to some

basic rules to qualify as relations. First, the ordering of columns is immaterial in a table. Second,

there can't be identical tuples or rows in a table. And third, each tuple will contain a single value

for each of its attributes.

A relational database contains multiple tables, each similar to the one in the "flat" database

model. One of the strengths of the relational model is that, in principle, any value occurring in

two different records (belonging to the same table or to different tables), implies a relationship

among those two records. Yet, in order to enforce explicit integrity constraints, relationships

between records in tables can also be defined explicitly, by identifying or non-identifying parent-

child relationships characterized by assigning cardinality (1:1, (0)1:M, M:M). Tables can also


have a designated single attribute or a set of attributes that can act as a "key", which can be used

to uniquely identify each tuple in the table. A key that can be used to uniquely identify a row in a

table is called a primary key. Keys are commonly used to join or combine data from two or more

tables. For example, an Employee table may contain a column named Location which contains a

value that matches the key of a Location table. Keys are also critical in the creation of indexes,

which facilitate fast retrieval of data from large tables. Any column can be a key, or multiple

columns can be grouped together into a compound key. It is not necessary to define all the keys

in advance; a column can be used as a key even if it was not originally intended to be one. A key

that has an external, real-world meaning (such as a person's name, a book's ISBN, or a car's serial

number) is sometimes called a "natural" key. If no natural key is suitable (think of the many

people named Brown), an arbitrary or surrogate key can be assigned (such as by giving

employees ID numbers). In practice, most databases have both generated and natural keys,

because generated keys can be used internally to create links between rows that cannot break,

while natural keys can be used, less reliably, for searches and for integration with other

databases. (For example, records in two independently developed databases could be matched up

by social security number, except when the social security numbers are incorrect, missing, or

have changed.)
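
The Employee/Location example above corresponds to a simple key-based join; a hedged sketch with hypothetical table and column names:

SELECT e.employee_name, l.city
FROM   employee e
JOIN   location l ON l.location_id = e.location_id;   -- the foreign key in employee matches the key of location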

2.4.5 DIMENSIONAL MODEL

The dimensional model is a specialized adaptation of the relational model used to represent data

in data warehouses in a way that data can be easily summarized using OLAP queries. In the

dimensional model, a database consists of a single large table of facts that are described using

dimensions and measures. A dimension provides the context of a fact (such as who participated,

when and where it happened, and its type) and is used in queries to group related facts together.

Dimensions tend to be discrete and are often hierarchical; for example, the location might

include the building, state, and country. A measure is a quantity describing the fact, such as

revenue. It's important that measures can be meaningfully aggregated - for example, the revenue

from different locations can be added together. In an OLAP query, dimensions are chosen and

the facts are grouped and added together to create a summary. The dimensional model is often


implemented on top of the relational model using a star schema, consisting of one table

containing the facts and surrounding tables containing the dimensions. Particularly complicated

dimensions might be represented using multiple tables, resulting in a snowflake schema. A data

warehouse can contain multiple star schemas that share dimension tables, allowing them to be

used together. Coming up with a standard set of dimensions is an important part of dimensional

modeling.
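
A typical OLAP query over a star schema groups the fact table by dimension attributes and aggregates a measure; a minimal sketch with hypothetical fact and dimension tables:

SELECT d.calendar_year,
       l.country,
       SUM(f.revenue) AS total_revenue                       -- the measure, aggregated
FROM   sales_fact    f
JOIN   date_dim      d ON d.date_key     = f.date_key        -- dimensions give the facts their context
JOIN   location_dim  l ON l.location_key = f.location_key
GROUP  BY d.calendar_year, l.country;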

2.4.6 OBJECT-ORIENTED MODELS

In recent years, the object-oriented paradigm has been applied to database technology, creating a

new programming model known as object databases. These databases attempt to bring the

database world and the application programming world closer together, in particular by ensuring

that the database uses the same type system as the application program. This aims to avoid the

overhead (sometimes referred to as the impedance mismatch) of converting information between

its representation in the database (for example as rows in tables) and its representation in the

application program (typically as objects). At the same time, object databases attempt to

introduce the key ideas of object programming, such as encapsulation and polymorphism, into

the world of databases. A variety of ways have been tried for storing objects in a database.

Some products have approached the problem from the application programming end, by making

the objects manipulated by the program persistent. This also typically requires the addition of

some kind of query language, since conventional programming languages do not have the ability

to find objects based on their information content. Others have attacked the problem from the

database end, by defining an object-oriented data model for the database, and defining a database

programming language that allows full programming capabilities as well as traditional query.


Object databases suffered because of a lack of standardization: although standards were defined

by ODMG, they were never implemented well enough to ensure interoperability between

products. Nevertheless, object databases have been used successfully in many applications:

usually specialized applications such as engineering databases or molecular biology databases

rather than mainstream commercial data processing. However, object database ideas were picked

up by the relational vendors and influenced extensions made to these products and indeed to the

SQL language.

2.5 DATABASE DESIGN

Database design is the process of producing a detailed data model of a database. This logical data

model contains all the needed logical and physical design choices and physical storage

parameters needed to generate a design in a Data Definition Language, which can then be used to

create a database. A fully attributed data model contains detailed attributes for each entity. The

term database design can be used to describe many different parts of the design of an overall

database system. Principally, and most correctly, it can be thought of as the logical design of the

base data structures used to store the data. In the relational model these are the tables and views.


In an object database the entities and relationships map directly to object classes and named

relationships. However, the term database design could also be used to apply to the overall

process of designing, not just the base data structures, but also the forms and queries used as part

of the overall database application within the database management system (DBMS). The

process of doing database design generally consists of a number of steps which will be carried

out by the database designer. Usually, the designer must:

* Determine the relationships between the different data elements.

* Superimpose a logical structure upon the data on the basis of these relationships

2.5.1 The Design Process

The design process consists of the following steps; a brief SQL sketch follows the list.

Determine the purpose of your database - This helps prepare you for the remaining

steps.

Find and organize the information required - Gather all of the types of information

you might want to record in the database, such as product name and order number.

Divide the information into tables - Divide your information items into major

entities or subjects, such as Products or Orders. Each subject then becomes a table.

Turn information items into columns - Decide what information you want to store

in each table. Each item becomes a field, and is displayed as a column in the table.

For example, an Employees table might include fields such as Last Name and Hire

Date.

Specify primary keys - Choose each table’s primary key. The primary key is a

column that is used to uniquely identify each row. An example might be Product ID

or Order ID.

Set up the table relationships - Look at each table and decide how the data in one

table is related to the data in other tables. Add fields to tables or create new tables to

clarify the relationships, as necessary.


Refine your design - Analyze your design for errors. Create the tables and add a few

records of sample data. See if you can get the results you want from your tables.

Make adjustments to the design, as needed.

Apply the normalization rules - Apply the data normalization rules to see if your

tables are structured correctly. Make adjustments to the tables, as needed.
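
As a minimal SQL sketch of the steps above (the Products and Orders tables and their columns are illustrative, not prescriptive):

CREATE TABLE products (
    product_id   NUMBER        PRIMARY KEY,    -- specify the primary key
    product_name VARCHAR2(100) NOT NULL        -- information items turned into columns
);

CREATE TABLE orders (
    order_id     NUMBER PRIMARY KEY,
    product_id   NUMBER REFERENCES products (product_id),   -- set up the table relationship
    order_date   DATE,
    quantity     NUMBER
);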

2.5.2 TYPES OF DATABASE DESIGN

CONCEPTUAL SCHEMA

Once a database designer is aware of the data which is to be stored within the database, they must then determine where the dependencies are within the data. Sometimes when data is changed you can be changing other data that is not visible. For example, in a list of names and addresses, assuming a situation where multiple people can have the same address but one person cannot have more than one address, the address is dependent upon the name: two rows with the same name must hold the same address, whereas two rows with the same address may hold different names. The other way around does not hold; one attribute can change and not the other.

LOGICALLY STRUCTURING DATA

Once the relationships and dependencies amongst the various pieces of information have been

determined, it is possible to arrange the data into a logical structure which can then be mapped

into the storage objects supported by the database management system. In the case of relational

databases the storage objects are tables which store data in rows and columns. Each table may

represent an implementation of either a logical object or a relationship joining one or more

instances of one or more logical objects. Relationships between tables may then be stored as

links connecting child tables with parents. Since complex logical relationships are themselves

tables they will probably have links to more than one parent.

PHYSICAL DATABASE DESIGN


The physical design of the database specifies the physical configuration of the database on the

storage media. This includes detailed specification of data elements, data types, indexing options

and other parameters residing in the DBMS data dictionary. It is the detailed design of a system

that includes modules and the database's hardware and software specifications.

2.6 DATABASE MANAGEMENT SYSTEM

Database Management System (DBMS) consists of software that organizes the storage of data.

A DBMS controls the creation, maintenance, and use of the database storage structures of organizations and of their users. It allows organizations to place control of organization-wide database development in the hands of Database Administrators (DBAs) and other specialists. In

large systems, a DBMS allows users and other software to store and retrieve data in a structured

way. Database management systems are usually categorized according to the database model that

they support, such as the network, relational or object model. The model tends to determine the

query languages that are available to access the database. One commonly used query language

for the relational database is SQL, although SQL syntax and function can vary from one DBMS

to another. A common query language for the object database is OQL, although not all vendors of object databases implement it; the majority of them, however, do. A great deal of

the internal engineering of a DBMS is independent of the data model, and is concerned with

managing factors such as performance, concurrency, integrity, and recovery from hardware

failures. In these areas there are large differences between the products. The database holds the records, fields, and cells containing the data; the DBMS is the set of "tools" used to manipulate the data in the database. An example of a DBMS is SQL Server. The DBMS is the database engine itself: the core, the actual program that contains the logic to execute your SQL statements, and it performs the actual data access based upon these commands (relationally).

2.6.1 COMPONENTS OF DBMS


DBMS Engine accepts logical request from the various other DBMS subsystems,

converts them into physical equivalents, and actually accesses the database and data

dictionary as they exist on a storage device.

Data Definition Subsystem helps users create and maintain the data dictionary and define the structure of the files in a database.

Data Manipulation Subsystem helps users add, change, and delete information in a database and query it for valuable information. Software tools within the data manipulation subsystem are most often the primary interface between the user and the information contained in a database. It allows users to specify their logical information requirements.

Application Generation Subsystem contains facilities to help users develop transaction-intensive applications. Processing a transaction usually requires that the user perform a detailed series of tasks. The subsystem provides easy-to-use data entry screens, programming languages, and interfaces.

Data Administration Subsystem helps users to manage the overall database

environment by providing facilities for backup and recovery, security management,

query optimization, concurrency control, and change management.

2.6.2 PRIMARY TASKS OF DBMS PACKAGES

Database Development: used to define and organize the content, relationships, and

structure of the data needed to build a database.

Database Interrogation: can access the data in a database for information retrieval and

report generation. End users can selectively retrieve and display information and

produce printed reports and documents.

Database Maintenance: used to add, delete, update, correct, and protect the data in a

database.


Application Development: used to develop prototypes of data entry screens, queries,

forms, reports, tables, and labels for a prototyped application, or to use a 4GL (fourth-generation language) or application generator to develop program code.

2.6.3 DBMS FEATURES AND CAPABILITIES

Alternatively, and especially in connection with the relational model of database management,

the relation between attributes drawn from a specified set of domains can be seen as being

primary. For instance, the database might indicate that a car that was originally "red" might fade

to "pink" in time, provided it was of some particular "make" with an inferior paint job. Such

higher entity relationships provide information on all of the underlying domains at the same

time, with none of them being privileged above the others. Thus, the DBMSs of today roll

together frequently needed services or features of attribute management. By externalizing such

functionality to the DBMS, applications effectively share code with each other and are relieved

of much internal complexity. Features commonly offered by database management systems:

QUERY ABILITY  

Querying is the process of requesting attribute information from various perspectives and

combinations of factors. Example: "How many 2-door cars in Texas are green?" A

database query language and report writer allow users to interactively interrogate the

database, analyze its data, and update it according to the user's privileges on the data.
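
The example question maps to a simple aggregate query; a sketch against a hypothetical cars table:

SELECT COUNT(*)
FROM   cars
WHERE  doors  = 2
AND    state  = 'Texas'
AND    colour = 'green';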

BACKUP AND REPLICATION  

Copies of attributes need to be made regularly in case primary disks or other equipment

fails. A periodic copy of attributes may also be created for a distant organization that

cannot readily access the original. DBMS usually provide utilities to facilitate the process


of extracting and disseminating attribute sets. When data is replicated between database servers,

so that the information remains consistent throughout the database system and users cannot tell

or even know which server in the DBMS they are using, the system is said to exhibit replication

transparency.

RULE ENFORCEMENT  

Often one wants to apply rules to attributes so that the attributes are clean and reliable. For

example, we may have a rule that says each car can have only one engine associated with it

(identified by Engine Number). If somebody tries to associate a second engine with a given car,

we want the DBMS to deny such a request and display an error message. However, with changes

in the model specification such as, in this example, hybrid gas-electric cars, rules may need to

change. Ideally such rules should be able to be added and removed as needed without significant

data layout redesign.
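
One way such a rule can be declared (a sketch, assuming a hypothetical engines table keyed by engine_number and carrying a car_id column):

ALTER TABLE engines
  ADD CONSTRAINT one_engine_per_car UNIQUE (car_id);   -- at most one engine row may reference a given car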

SECURITY  

Often it is desirable to limit who can see or change which attributes or groups of attributes. This

may be managed directly per individual, or by the assignment of individuals and privileges to

groups, or (in the most elaborate models) through the assignment of individuals and groups to

roles which are then granted entitlements.
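
In SQL terms this is typically done with GRANT statements, often via roles; a hedged sketch with hypothetical role, table, and user names:

CREATE ROLE order_clerk;
GRANT SELECT, INSERT, UPDATE ON orders TO order_clerk;   -- privileges assigned to the role
GRANT order_clerk TO alice;                              -- the role (entitlements) granted to a user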

COMPUTATION  

There are common computations requested on attributes such as counting, summing, averaging,

sorting, grouping, cross-referencing, etc. Rather than have each computer application implement

these from scratch, they can rely on the DBMS to supply such calculations.


CHANGE AND ACCESS LOGGING  

Often one wants to know who accessed what attributes, what was changed, and when it was

changed. Logging services allow this by keeping a record of access occurrences and changes.

AUTOMATED OPTIMIZATION  

If there are frequently occurring usage patterns or requests, some DBMS can adjust themselves

to improve the speed of those interactions. In some cases the DBMS will merely provide tools to

monitor performance, allowing a human expert to make the necessary adjustments after

reviewing the statistics collected.

2.7 RELATIONAL DATABASE MANAGEMENT SYSTEM

RDBMS stands for Relational Database Management System. RDBMS data is structured in

database tables, fields and records. Each RDBMS table consists of database table rows. Each

database table row consists of one or more database table fields. RDBMS store the data in a collection of tables, which might be related by common fields (database table columns). RDBMS

also provide relational operators to manipulate the data stored into the database tables. Most

RDBMS use SQL as database query language. Edgar Codd introduced the relational database

model. Many modern DBMS do not conform to Codd's definition of an RDBMS, but

nonetheless they are still considered to be RDBMS. The most popular RDBMS are MS SQL

Server, DB2, Oracle and MySQL.

2.7.1 RDBMS COMPONENTS

INTERFACE DRIVERS - A user or application program initiates either schema

modification or content modification. These drivers are built on top of SQL. They provide


methods to prepare statements, execute statements, fetch results, etc. Examples include DDL, DCL, DML, ODBC, and JDBC. Some vendors provide language-specific proprietary interfaces; for example, MySQL and Firebird provide drivers for PHP and Python.

SQL ENGINE - This component interprets and executes the SQL query. It comprises three

major components (compiler, optimizer, and execution engine).

TRANSACTION ENGINE - Transactions are sequences of operations that read or write

database elements, which are grouped together.

RELATIONAL ENGINE - Relational objects such as Table, Index, and Referential

integrity constraints are implemented in this component.

STORAGE ENGINE - This component stores and retrieves data records. It also provides a

mechanism to store metadata and control information such as undo logs, redo logs, lock

tables, etc.

2.8 DATABASE - ADVANTAGES & DISADVANTAGES

   

ADVANTAGES

Reduced data redundancy

Reduced updating errors and increased consistency

Greater data integrity and independence from applications programs

Improved data access to users through use of host and query languages

Improved data security


Reduced data entry, storage, and retrieval costs

Facilitated development of new applications program

DISADVANTAGES

Database systems are complex, difficult, and time-consuming to design

Substantial hardware and software start-up costs

Damage to database affects virtually all applications programs

Extensive conversion costs in moving from a file-based system to a database

system

Initial training required for all programmers and users


CHAPTER-3: ORACLE

ARCHITECTURAL COMPONENTS

AND DATABASE CREATION

3.1 OVERVIEW OF PRIMARY COMPONENTS


1. Oracle Server:- There are several files, processes, and memory structures in an

Oracle server, however, not all of them are used when processing a SQL

statement. Some are used to improve the performance of the database, ensure that

the database can be recovered in the event of a software or hardware error, or

perform other tasks necessary to maintain the database. The Oracle server consists

of an Oracle instance and an Oracle database.

2. Oracle Instance: - An Oracle instance is the combination of the background

processes and memory structures. The instance must be started to access the data in

the database. Every time an instance is started, a System Global Area (SGA) is

allocated and Oracle background processes are started. Background processes

perform functions on behalf of the invoking process. They consolidate functions

that would otherwise be handled by multiple Oracle programs running for each

user. The background processes perform input/output (I/O) and monitor other

Oracle processes to provide increased parallelism for better performance and

reliability.

3. Oracle Database: - An Oracle database consists of operating system files, also

known as database files, that provide the actual physical storage for database

information. The database files are used to ensure that the data is kept consistent

and can be recovered in the event of a failure of the instance.

4. Other Key Files: - Non-database files are used to configure the instance,

authenticate privileged users, and recover the database in the event of a disk

failure.

5. User and Server Processes: - The user and server processes are the primary processes

involved when a SQL statement is executed; however, other processes may help

the server complete the processing of the SQL statement.

6. Other Processes:- Many other processes exist that are used by options within

Oracle, such as Advanced Queuing, Real Application Clusters, Shared Server,

Advanced Replication, and so on.

3.2 ORACLE SERVER


An Oracle Server:

Is a database management system that provides an open, comprehensive, integrated

approach to information management.

Consists of an Oracle Instance and an Oracle Database.

The Oracle server can run on different computers in one of the following ways:

1. Client-Application Server-server

2. Client-Server

3. Host-Based


1. Client-Application Server-Server:- (Three-Tier) Users access the database from

their personal computers (client) through an application server, which is used for

the application’s requirements.

2. Client-Server:- (Two-Tier) Users access the database from their personal

computer (client) over a network, and the database sits on a separate computer

(server).

3. Host-Based:- Users are connected directly to the same computer on which the

database resides.

3.3 ORACLE INSTANCE

An Oracle Instance:

Is a means to access an Oracle database

Always opens one and only one database

Consists of memory and background process structures


An Oracle instance consists of the System Global Area (SGA) memory structure and the

background processes used to manage a database. An instance is identified by using methods

specific to each operating system. The instance can open and use only one database at a time.

3.4 ESTABLISHING A CONNECTION AND CREATING A SESSION

Connection to an Oracle Instance:

Establishing a user connection

Creating a session


Connection to an Oracle Instance

Before users can submit SQL statements to an Oracle database, they must connect to an instance.

The user starts a tool such as SQL*Plus or runs an application developed using a tool

such as Oracle Forms. This application or tool is executed as a user process.

In the most basic configuration, when a user logs on to the Oracle Server, a process is

created on the computer running the Oracle Server. This process is called a server

process. The server process communicates with the Oracle instance on behalf of the

user process that runs on the client. The server process executes SQL statements on

behalf of the user.

Connection

A connection is a communication pathway between a user process and an Oracle Server. A

database user can connect to an Oracle Server in one of three ways:

1. The user logs on to the operating system running the Oracle instance and

starts an application or tool that accesses the database on that system. The

communication pathway is established using the interprocess

communication mechanisms available on the host operating system.

2. The user starts the application or tool on a local computer and connects

over a network to the computer running the Oracle instance. In this

configuration, called client-server, network software is used to

communicate between the user and the Oracle Server.

3. In a three-tier connection, the user’s computer communicates over the

network to an application or a network server, which is connected through

a network to the machine running the Oracle instance. For example, the

user runs a browser on a network to use an application residing on an NT

server that retrieves data from an Oracle database running on a UNIX

host.


Session

A session is a specific connection of a user to an Oracle server. The session starts when the user

is validated by the Oracle Server, and it ends when the user logs out or when there is an

abnormal termination. For a given database user, many concurrent sessions are possible if the

user logs on from many tools, applications, or terminals at the same time. Except for some

specialized database administration tools, starting a database session requires that the Oracle

Server be available for use.

3.5 ORACLE DATABASE

An Oracle database:

Is a collection of data that is treated as a unit

Consists of three file types


An Oracle Database

The general purpose of a database is to store and retrieve information. An Oracle database has a

logical and a physical structure. The physical structure of the database is the set of operating

system files that make up the database. An Oracle database consists of three file types:

1. Data files containing the actual data in the database.

2. Redo logs containing a record of changes made to the database to enable recovery

of the data in case of failures.

3. Control files containing information necessary to maintain and verify database

integrity.

Other Key File Structures

The Oracle Server also uses other key files that are not part of the database:

The parameter file defines the characteristics of an Oracle instance. For example, it

contains parameters that size some of the memory structures in the SGA.

The password file authenticates users privileged to start up and shut down an Oracle instance.

Archived redo log files are offline copies of the redo log files that may be necessary to

recover from media failures.

3.6 PHYSICAL STRUCTURE

The physical structure includes three types of files:

Control files

Data files

Redo log files

Other key files exist and are required to start up and use a database, for example: parameter

files, configuration files, password files and so on. However, the physical structure of an Oracle

database includes only three types of files: control files, data files, and redo log files.


3.7 MEMORY STRUCTURE

Oracle’s memory structure consists of two memory areas known as:

1. System Global Area (SGA) : Allocated at instance startup, and is a fundamental

component of an Oracle instance

2. Program Global Area (PGA) : Allocated when the server process is started

The SGA is used to store database information that is shared by database processes. It contains data

and control information for the Oracle Server and is allocated in the virtual memory of the

computer where Oracle resides.


The following statement can be used to view SGA memory allocation

SHOW SGA;

Sizing the SGA

The size of SGA is determined by several initialization parameters. The parameters that most

affect the size of the SGA are:

DB_CACHE_SIZE: The size of the cache of standard blocks.

LOG_BUFFER: The number of bytes allocated for the redo log buffer cache.

SHARED_POOL_SIZE: The size in bytes of the area devoted to shared SQL and PL/SQL.

LARGE_POOL_SIZE: The size of the large pool; the default size is zero.

Dynamic SGA

A dynamic SGA implements an infrastructure that allows the SGA configuration to change

without shutting down the instance. This allows the size of the database buffer cache, shared

pool, and large pool to change while the instance is running. Conceivably, the database

buffer cache, shared pool, and large pool could initially be underconfigured and would grow and

shrink depending upon their respective workloads, up to a maximum of

SGA_MAX_SIZE.
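As a hedged illustration of this dynamic behavior, the current SGA allocation and its limit can be inspected and an individual component resized online (the 96 MB value is only an example, and the total must stay below SGA_MAX_SIZE):

SQL> SHOW PARAMETER sga_max_size
SQL> SELECT * FROM v$sga;
SQL> ALTER SYSTEM SET DB_CACHE_SIZE = 96M;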

3.7.1 System Global Area (SGA)

The SGA consists of several memory structures:

Shared pool

Database Buffer Cache

Redo Log buffer

Other structures (for example, lock and latch management, statistical data)


There are two additional memory structures that can be configured within the SGA:

Large Pool

Java Pool

SGA is dynamic

Sized by the SGA_MAX_SIZE parameter

Allocated and tracked in granules by SGA components

Contiguous virtual memory allocation

Granule size based on total estimated SGA_MAX_SIZE

Unit of Allocation

A granule is a unit of contiguous virtual memory allocation. The size of a granule depends on the

estimated total SGA size whose calculation is based on the value of the parameter

SGA_MAX_SIZE:

- 4 MB if estimated SGA size is < 128 MB

- 16 MB otherwise

3.7.2 SHARED POOL

Used to store:

Most recently executed SQL statements

Most recently used data definitions

It consists of two key performance-related memory structures:

Library Cache

Data Dictionary Cache


Sized by the parameter

SHARED_POOL_SIZE

The shared pool environment contains both fixed and variable structures. The fixed structures

remain relatively the same size, whereas the variable structures grow and shrink based on user

and program requirements. The actual sizing for the fixed and variable structures is based on an

initialization parameter and the work of an Oracle internal algorithm.

Sizing the Shared Pool

Since the shared pool is used for objects that can be shared globally, such as reusable SQL

execution plans; PL/SQL packages, procedures, and functions; and cursor information, it must be

sized to accommodate the needs of both the fixed and variable areas. Memory allocation for the

shared pool is determined by the SHARED_POOL_SIZE initialization parameter. It can be

dynamically resized using ALTER SYSTEM SET. After performance analysis, this can be

adjusted but the total SGA size cannot exceed SGA_MAX_SIZE.
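A minimal sketch of such an online adjustment (the 64 MB value is illustrative, not a recommendation):

SQL> ALTER SYSTEM SET SHARED_POOL_SIZE = 64M;
SQL> SHOW PARAMETER shared_pool_size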

3.7.3 LIBRARY CACHE

Stores information about the most recently used SQL and PL/SQL statements

Enables the sharing of commonly used statements

Is managed by a least recently used (LRU) algorithm


Consists of two structures:

Shared SQL area

Shared PL/SQL area

Size is determined by the shared pool sizing.

The library cache size is based on the sizing defined for the shared pool. Memory is allocated

when a statement is parsed or a program unit is called. If the size of the shared pool is too small,

statements are continually reloaded into the library cache, which affects performance. The library

cache is managed by a least recently used (LRU) algorithm. As the cache fills, less recently used

execution paths and parse trees are removed from the library cache to make room for the new

entries. If the SQL or PL/SQL statements are not reused, they eventually are aged out.

The library cache consists of two structures:

Shared SQL:- The shared SQL stores and shares the execution plan and parse tree for

SQL statements run against the database. The second time that an identical SQL

statement is run, it is able to take advantage of the parse information available in the

shared SQL to expedite its execution. To ensure that SQL statements use a shared SQL

area whenever possible, the text, schema, and bind variables must be exactly the same.

Shared PL/SQL: - The shared PL/SQL area stores and shares the most recently executed

PL/SQL statements. Parsed and compiled program units and procedures are stored in this

area.
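Library cache efficiency can be observed through the V$LIBRARYCACHE view; a high number of RELOADS relative to PINS suggests that statements are being aged out and re-parsed because the shared pool is too small. A minimal sketch:

SQL> SELECT namespace, gets, pins, reloads
  2  FROM v$librarycache;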

3.7.4 DATA DICTIONARY CACHE

A collection of the most recently used definitions in the database

Includes information about database files, tables, indexes, columns, users, privileges, and

other database objects

During the parse phase, the server process looks in the data dictionary for information to

resolve object names and validate access

Caching data dictionary information into memory improves response time on queries and

DML


Size is determined by the shared pool sizing.

The data dictionary cache is also referred to as the dictionary cache or row cache. Caching data

dictionary information in memory improves response time. Information about the database is

stored in the data dictionary tables. When this information is needed by the database, the data

dictionary tables are read, and the data that is returned is stored in the data dictionary cache.

Sizing the Data Dictionary

The overall size depends on the size of the shared pool and is managed internally by the database.

If the data dictionary cache is too small, then the database has to query the data dictionary tables

repeatedly for information needed by the database. These queries are called recursive calls and

are slower than the queries that are handled by the data dictionary cache.
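Dictionary cache activity is exposed through the V$ROWCACHE view; a high ratio of GETMISSES to GETS indicates frequent recursive calls. A minimal sketch:

SQL> SELECT parameter, gets, getmisses
  2  FROM v$rowcache;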

3.7.5 DATABASE BUFFER CACHE

Stores copies of data blocks that have been retrieved from the data files

Enables great performance gains when you obtain and update data

Managed through an LRU algorithm

DB_BLOCK_SIZE determines the primary block size.

When a query is processed, the Oracle server process looks in the database buffer cache for any

blocks it needs. If the block is not found in the database buffer cache, the server process reads the

block from the data files and places a copy in the database buffer cache. Because subsequent

requests for the same block may find it in memory, those requests might not require physical

reads. The Oracle server uses a least recently used algorithm to age out buffers

that have not been accessed recently to make room for new blocks in the database buffer cache.


Database Buffer Cache consists of independent sub-caches

DB_CACHE_SIZE

DB_KEEP_CACHE_SIZE

DB_RECYCLE_CACHE_SIZE

Can be dynamically resized:

ALTER SYSTEM SET DB_CACHE_SIZE=96M;

DB_CACHE_ADVICE set to gather statistics for predicting different cache size

behavior

Statistics displayed by V$DB_CACHE_ADVICE

Sizing the Buffer Cache

The size of each buffer in the buffer cache is equal to the size of an Oracle block, which is specified by the

DB_BLOCK_SIZE parameter. The database buffer cache consists of independent sub-caches for

buffer pools and for multiple block sizes. The parameter DB_BLOCK_SIZE determines the

primary block size, which is used for the SYSTEM tablespace.

Three parameters define the sizes of the buffer caches:


1. DB_CACHE_SIZE:- Sizes the default buffer cache only; it always exists and

cannot be set to zero.

2. DB_KEEP_CACHE_SIZE:- Sizes the keep buffer cache, which is used to retain blocks in

memory that are likely to be reused.

3. DB_RECYCLE_CACHE_SIZE:- Sizes the recycle buffer cache, which is used to

eliminate blocks from memory that have little chance of being reused.

Multiple Block Sizes

An Oracle database can be created with a standard block size and up to four non-standard block

sizes. Non-standard block sizes can have any power-of-two values between 2 KB and 32 KB.
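A tablespace with a non-standard block size can be created only after the matching buffer sub-cache has been configured; a minimal sketch (the tablespace name, file path, and sizes are illustrative):

SQL> ALTER SYSTEM SET DB_16K_CACHE_SIZE = 16M;
SQL> CREATE TABLESPACE ts_16k
  2  DATAFILE 'G:\oracle\lalit\ts_16k01.dbf' SIZE 100M
  3  BLOCKSIZE 16K;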

Data Buffer Advisory Parameter

The buffer cache advisory feature enables and disables statistics gathering for predicting

behavior with different cache sizes. The information provided by these statistics can help the DBA

size the buffer cache optimally for a given workload. The buffer cache advisory information is

collected and displayed through the V$DB_CACHE_ADVICE view.

The buffer cache advisory is enabled via the initialization parameter DB_CACHE_ADVICE.

It is a dynamic parameter that can be changed via ALTER SYSTEM. Three values (OFF, ON, READY) are

available.

DB_CACHE_ADVICE Parameter Values

OFF:- Advisory is turned off and the memory for the advisory is not allocated

ON:- Advisory is turned on, and both CPU and memory overhead are incurred.

READY:- Advisory is turned off but the memory for the advisory remains allocated.
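Once the advisory is ON, the predictions it gathers can be queried; a minimal sketch for the default buffer pool:

SQL> SELECT size_for_estimate, buffers_for_estimate, estd_physical_read_factor
  2  FROM v$db_cache_advice
  3  WHERE name = 'DEFAULT';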


3.7.6 REDO LOG BUFFER

Records all changes made to the database blocks

Primary purpose is recovery

Changes recorded within are called redo entries

Redo entries contain information to reconstruct or redo changes

Size defined by LOG_BUFFER

The redo log buffer cache is a circular buffer that contains changes made to data file blocks.

This information is stored in redo entries. Redo entries contain the information necessary to

reconstruct, or redo, changes made to the database by INSERT, UPDATE, DELETE, CREATE, ALTER, or DROP

operations.

Sizing the Redo Log Buffer Cache

The size of the redo log buffer cache is defined by the initialization parameter LOG_BUFFER.


3.7.7 LARGE POOL

An optional area of memory in the SGA

Relieves the burden placed on the Shared Pool

Used for:

Session memory (UGA) for the Shared Server

I/O server processes

Backup and restore operations (RMAN)

Parallel execution message buffers

PARALLEL_AUTOMATIC_TUNING set to TRUE

Does not use an LRU list

Sized by LARGE_POOL_SIZE

When users connect through the shared server, Oracle needs to allocate additional space in the

shared pool for storing information about the connections between the user processes, dispatchers,

and servers. The large pool relieves the burden on areas within the shared pool. The shared pool

does not have to give up memory for caching SQL parse trees in favor of shared server session

information, I/O, and backup and recovery processes. The performance gain is from the

reduction of overhead from increasing and shrinking of the shared SQL cache.

Backup and Restore:- Recovery Manager (RMAN) uses the large pool when the

BACKUP_DISK_IO=n and BACKUP_TAPE_IO_SLAVES=TRUE parameters are set. If the

large pool is configured but is not large enough, the allocation of memory from the large pool

fails. RMAN writes an error message to the alert log file and does not use I/O slaves for backup

or restore.

Sizing the Large Pool

The large pool is sized, in bytes, by the LARGE_POOL_SIZE parameter.
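A minimal sketch of setting it, assuming a release in which the large pool can be resized online (the 32 MB value is illustrative):

SQL> ALTER SYSTEM SET LARGE_POOL_SIZE = 32M;
SQL> SHOW PARAMETER large_pool_size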


3.7.8 JAVA POOL

Services parsing requirements for Java commands

Required if installing and using Java

Sized by JAVA_POOL_SIZE parameter

The Java pool is an optional setting but is required if installing and using Java. Its size is set, in

bytes, using the JAVA_POOL_SIZE parameter.

3.7.9 PROGRAM GLOBAL AREA (PGA)

Memory reserved for each user process connecting to an Oracle database

Allocated when a process is created

De-allocated when the process is terminated

Used by only one process


The Program Global Area or Process Global Area (PGA) is a memory region that

contains data and control information for a single server process or a single background

process. The PGA is allocated when a process is created and de-allocated when the

process is terminated. In contrast to the SGA, which is shared by several processes, the

PGA is an area that is used by only one process. In a dedicated server configuration, the

PGA includes these components:

Sort Area: - Used for any sorts that may be required to process the SQL

statement.

Session Information: - Includes user privileges and performance statistics for the

session.

Cursor State: - Indicates the stage in the processing of the SQL statements that

are currently used by the session.

Stack Space: - Contains other session variables.

3.7.10 PROCESS STRUCTURE

Oracle takes advantage of various types of processes:

USER PROCESS: Started at the time a database user requests connection to the Oracle

server

SERVER PROCESS: Connects to the Oracle Instance and is started when a user

establishes a session

BACKGROUND PROCESSES: Started when an Oracle Instance is started

3.7.10.1 USER PROCESS

A program that requests interaction with the Oracle server

Must first establish a connection

Does not interact directly with the Oracle server


A database user who needs to request information from the database must first make a

connection with the Oracle server. The connection is requested using a database interface tool,

such as SQL*Plus, beginning the user process. The user process does not interact directly

with the Oracle server. Rather, it generates calls through the user program interface (UPI), which

creates a session and starts a server process.

3.7.10.2 SERVER PROCESS

A program that directly interacts with the Oracle server

Fulfills calls generated and returns results

Can be dedicated or Shared Server


Once a user has established a connection, a server process is started to handle the user process's

requests. A server process can be either a dedicated server process or a shared server process. In a

dedicated server environment, the server process handles the requests of a single user process and is

terminated when that user disconnects. In a shared server environment, a server process can handle

requests from several user processes. In either case, the server process communicates with the

Oracle server using the Oracle Program Interface (OPI).

3.7.11 BACKGROUND PROCESSES

Maintains and enforces relationships between physical and memory structures

Mandatory background processes:

DBWn

PMON

CKPT

LGWR

SMON

Optional background processes:

ARCn


LMDn

RECO

CJQ0

LMON

Snnn

Dnnn

Pnnn

LCKn

QMNn
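The background processes actually running in an instance can be listed from the dynamic performance views; a minimal sketch (processes that have not been started carry a null process address):

SQL> SELECT name, description
  2  FROM v$bgprocess
  3  WHERE paddr <> '00';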

3.7.11.1 DATABASE WRITER (DBWn)

DBWn writes when:

Checkpoint occurs

Dirty buffers reach threshold

There are no free buffers

Timeout occurs

RAC ping request is made

Tablespace OFFLINE

Tablespace READ ONLY

Table DROP or TRUNCATE

Tablespace BEGIN BACKUP

The server process records changes to rollback and data blocks in the buffer cache. Database

Writer (DBWn) writes the dirty buffers from the database buffer cache to the data files. It

ensures that a sufficient number of free buffers (buffers that can be overwritten when server

processes need to read in blocks from the data files) are available in the database buffer cache.

Database performance is improved because server processes make changes only in the buffer

cache.


3.7.11.2 LOG WRITER (LGWR)

LGWR writes:

At commit

When one-third full

When there is 1 MB of redo

Every three seconds

Before DBWn writes

Because the redo is needed for recovery, LGWR confirms the commit only after the redo is

written to disk.

LGWR can also call on DBWn to write to the data files.


3.7.11.3 SYSTEM MONITOR (SMON)

Responsibilities:

Instance recovery

Rolls forward changes in redo logs

Opens database for user access

Rolls back uncommitted transactions

Coalesces free space

De-allocates temporary segments

If the Oracle instance fails, any information in the SGA that has not been written to disk is lost.

For example, the failure of the operating system causes an instance failure. After the loss of the

instance, the background process SMON automatically performs instance recovery when the

database is reopened. Instance recovery consists of the following steps:

1. Rolling forward to recover data that has not been recorded in the data files but that has been

recorded in the online redo log. This data has not been written to disk because of the loss of

the SGA during instance failure. During this process, SMON reads the redo log files and

applies the changes recorded in the redo log to the data blocks. Because all committed

transactions have been written to the redo logs, this process completely recovers these

transactions.

2. Opening the database so that users can log on. Any data that is not locked by unrecorded

transactions is immediately available.

3. Rolling back uncommitted transactions. They are rolled back by SMON or by the

individual server processes as they access locked data.

3.7.11.4 PROCESS MONITOR (PMON)

Cleans up after failed processes by:

Rolling back the transaction

Releasing locks

Releasing other resources

Restarting dead dispatchers


3.7.11.5 CHECKPOINT (CKPT)

Responsible for:

Signaling DBWn at checkpoints

Updating data files headers with checkpoint information

Updating control files with checkpoint information


An event called a checkpoint occurs when the Oracle background process DBWn writes all the

modified database buffers in the SGA, including both committed and uncommitted data, to the

data files.

Checkpoints are implemented for the following reasons:

Checkpoints ensure that data blocks in memory that change frequently are written to

data files regularly. Because of the least recently used algorithm of DBWn, a data block

that changes frequently might never qualify as the least recently used block and thus

might never be written to disk if checkpoints did not occur.

Because all database changes up to the checkpoint have been recorded in the data files,

redo log entries before the checkpoint no longer need to be applied to the data files if

instance recovery is required. Therefore, checkpoints are useful because they can

expedite instance recovery.

At a checkpoint, the following information is written:

Checkpoint number into the data file headers

Checkpoint number, log sequence number, archive log name, and system change number

into the control files.

CKPT does not write data blocks to disk or redo blocks to the online redo logs.

3.7.11.6 ARCHIVER (ARCn)

Optional background process

Automatically archives online redo logs when ARCHIVELOG mode is set

Preserves the record of all changes made to the database


The Archive Process

All other background processes are optional, depending on the configuration of the database;

however, one of them, ARCn, is crucial to recovering a database after the loss of a disk. As

online redo log files fill, the Oracle server begins writing to the next online redo log file. The

process of switching from one redo log to another is called a log switch. The ARCn process

initiates backing up, or archiving, of the filled log group at every log switch. It automatically

archives the online redo log before the log can be reused, so that all of the changes made to the

database are preserved. This enables the DBA to recover the database to the point of failure, even

if a disk drive is damaged.

Archiving Redo Log Files

One of the important decisions that a DBA has to make is whether to configure the database to

operate in ARCHIVELOG or in NOARCHIVELOG mode.

NOARCHIVELOG MODE:- In NOARCHIVELOG mode, the online redo log files are

overwritten each time a log switch occurs. LGWR does not overwrite a redo log group until the

checkpoint for that group is complete. This ensures that committed data can be recovered if there

is an instance crash. During the instance crash, only the SGA is lost. There is no loss of disks,

only memory. For example, an operating system crash causes an instance crash.

ARCHIVELOG MODE:- If the database is configured to run in ARCHIVELOG mode,

inactive group of filled online redo log files must be archived before they can be used again.

Since changes made to the database are recorded in the online redo log files, the database

administrator can use the physical backup of the data files and the archived online redo log files

to recover the database without losing any committed data because of any single point of failure,

including the loss of a disk. Usually, a production database is configured to run in

ARCHIVELOG mode.
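Switching a database into ARCHIVELOG mode is done with the database mounted but not open; a minimal sketch, run as SYSDBA:

SQL> SHUTDOWN IMMEDIATE
SQL> STARTUP MOUNT
SQL> ALTER DATABASE ARCHIVELOG;
SQL> ALTER DATABASE OPEN;
SQL> ARCHIVE LOG LIST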


3.7.12 LOGICAL STRUCTURE

Dictates how the physical space of a database is used

Hierarchy consisting of tablespaces, segments, extents, and blocks

A logical structure hierarchy exists as follows:

An Oracle database is a group of tablespaces.

A tablespace may consist of one or more segments.

A segment is made up of extents.

An extent is made up of logical blocks.

A block is the smallest unit for read and write operations.
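This hierarchy can be seen in the data dictionary; a minimal sketch listing the extents of one segment (the segment name EMP is illustrative):

SQL> SELECT segment_name, tablespace_name, extent_id, blocks, bytes
  2  FROM dba_extents
  3  WHERE segment_name = 'EMP';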

3.7.13 PROCESSING SQL STATEMENTS

Connect to an instance using:

User process

Server process


The Oracle server components that are used depend on the type of SQL statement:

Queries return rows

DML statements log changes

Commit ensures transaction recovery

Some Oracle server components do not participate in SQL statement processing.

A data manipulation language (DML) statement requires only two phases of processing:

1. Parse is the same as the parse phase used for processing a query.

2. Execute requires additional processing to make data changes.

DML Execute Phase

To execute a DML statement:

If the data and rollback blocks are not already in the buffer cache, the server process

reads them from the data files into the buffer cache.

The server process places locks on the rows that are to be modified. The rollback block is

used to store the before-image of the data, so that the DML statements can be rolled back

if necessary.

The data block records the new values of the data.

Processing a DML Statement

The server process records the before-image to the undo block and updates the data block.

Both of these changes are done in the database buffer cache. Any changed blocks in the buffer

cache are marked as dirty buffers; that is, buffers that are not the same as the corresponding block

on the disk.

The processing of a DELETE or INSERT command uses similar steps. The before-image for a

DELETE contains the column values in the deleted rows, and the before image of an INSERT

contains the row's location information.
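For example, a simple UPDATE goes through exactly these steps: it is parsed, the before-image is written to the undo block, and the new values are written to the data block in the buffer cache (the EMP table used here is illustrative):

SQL> UPDATE emp SET sal = sal * 1.1 WHERE deptno = 10;
SQL> COMMIT;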


3.7.14 DATABASE CREATION

Before creating the database, we have to define some initialization parameters according to the computer's

configuration so that performance improves and less I/O is used. If we do not configure these

parameters, Oracle automatically takes the default values.

Db_name=lalit

Db_block_size=8192

Control_files=’G:\oracle\lalit\control.ctl’

Shared_pool_size=200m

Sga_target=300m

Undo_management=auto

Undo_tablespace=undo1

Here the file locations are given as per the user's need; the user can choose any location to store the

parameter file. If no location is chosen, the parameter file is stored in the default location, the

DATABASE folder under the Oracle home, and is saved using the "init<database_name>.ora" naming convention.

When we start creating the database, the control files are read first and then the database is created.
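A minimal sketch of the creation itself; the file names, sizes, and clauses below are illustrative only, and a real script would normally add more log groups and sizing options:

SQL> STARTUP NOMOUNT
SQL> CREATE DATABASE lalit
  2  DATAFILE 'G:\oracle\lalit\system01.dbf' SIZE 300M
  3  SYSAUX DATAFILE 'G:\oracle\lalit\sysaux01.dbf' SIZE 200M
  4  DEFAULT TEMPORARY TABLESPACE temp TEMPFILE 'G:\oracle\lalit\temp01.dbf' SIZE 50M
  5  UNDO TABLESPACE undo1 DATAFILE 'G:\oracle\lalit\undo01.dbf' SIZE 100M
  6  LOGFILE GROUP 1 ('G:\oracle\lalit\redo01.log') SIZE 50M,
  7          GROUP 2 ('G:\oracle\lalit\redo02.log') SIZE 50M;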


Along with this, the parameter file is put to use with the non-default values we have already set.


Some base tables in Oracle cannot be changed by the DBA, so views of those tables have to be created:

@?/rdbms/admin/catalog.sql

The above command creates the data dictionary views. After that, the DBA runs the following script,

which creates the standard PL/SQL packages and procedures needed to work with the database:

@?/rdbms/admin/catproc.sql


CHAPTER-4: INTRODUCTION TO DATABASE MIGRATION


INTRODUCTION TO DATA MIGRATION

DATA MIGRATION

Data Migration is the process of transferring data between storage types, formats, or computer

systems. Data migration is usually performed programmatically to achieve an automated

migration, freeing up human resources from tedious tasks. It is required when organizations or

individuals change computer systems or upgrade to new systems, or when systems merge (such

as when the organizations that use them undergo a merger/takeover). To achieve an effective data

migration procedure, data on the old system is mapped to the new system providing a design for

data extraction and data loading. The design relates old data formats to the new system's formats

and requirements. Programmatic data migration may involve many phases but it minimally

includes data extraction where data is read from the old system and data loading where data is

written to the new system. If a decision has been made to provide a set input file specification for

loading data onto the target system, this allows a pre-load 'data validation' step to be put in place,

interrupting the standard E(T)L process. Such a data validation process can be designed to

interrogate the data to be migrated, to ensure that it meets the pre-defined criteria of the target

environment, and the input file specification. An alternative strategy is to have on-the-fly data

validation occurring at the point of loading, which can be designed to report on load rejection

errors as the load progresses. However, in the event that the extracted and transformed data

elements are highly 'integrated' with one another, and the presence of all extracted data in the

target system is essential to system functionality, this strategy can have detrimental, and not

easily quantifiable effects. After loading into the new system, results are subjected to data

verification to determine whether data was accurately translated, is complete, and supports

processes in the new system. During verification, there may be a need for a parallel run of both

systems to identify areas of disparity and forestall erroneous loss. Automated and manual data

cleaning is commonly performed in migration to improve data quality, eliminate redundant or

obsolete information, and match the requirements of the new system. Data migration phases

(design, extraction, cleansing, load, verification) for applications of moderate to high complexity

are commonly repeated several times before the new system is deployed.


4.1 DATA MIGRATION CATEGORIES

Data is stored on various media in files or databases, is generated and consumed by software

applications which in turn support business processes. The need to migrate data can be driven by

multiple business requirements and the approach taken to the migration depends on those

requirements. Four major migration categories are proposed on this basis.

STORAGE MIGRATION

A business may choose to rationalize the physical media to take advantage of more efficient

storage technologies. This will result in having to move physical blocks of data from one tape or

disk to another, often using virtualization techniques. The data format and content itself will not

usually be changed in the process and can normally be achieved with minimal or no impact to

the layers above.

DATABASE MIGRATION

Similarly, it may be necessary to move from one database vendor to another, or to upgrade the

version of database software being used. The latter case is less likely to require a physical data

migration, but this can happen with major upgrades. In these cases a physical transformation

process may be required since the underlying data format can change significantly.

APPLICATION MIGRATION

Changing application vendor, for instance moving to a new CRM or ERP platform, will inevitably involve

substantial transformation as almost every application or suite operates on its own specific data

model. Further, to allow the application to be sold to the widest possible market, commercial off-

the-shelf packages are generally configured for each customer using metadata. Application

programming interfaces (APIs) are supplied to protect the integrity of the data they have to


handle. Use of the API is normally a condition of the software warranty, although a waiver may

be allowed if the vendor's own or certified partner professional services and tools are used.

BUSINESS PROCESS MIGRATION

Business processes operate through a combination of human and application systems actions,

often orchestrated by business process management tools. When these change they can require

the movement of data from one store, database or application to another to reflect the changes to

the organization and information about customers, products and operations. Examples of such

migration drivers are mergers and acquisitions, business optimization, and reorganization to

attack new markets or respond to competitive threats.

4.2 DATABASE MIGRATION

Like data migration it may be necessary to move from one database vendor to another, or to

upgrade the version of database software being used. The latter case is less likely to require a

physical data migration, but this can happen with major upgrades. In these cases a physical

transformation process may be required since the underlying data format can change

significantly. This may or may not affect behavior in the applications layer, depending largely on

whether the data manipulation language or protocol has changed - but modern applications are

written to be agnostic to the database technology so that a change from Oracle to MySQL, DB2

or SQL Server should only require a testing cycle to be confident that both functional and non-

functional performance has not been adversely affected. Carefully examine the database and the

data. Determine what you are trying to accomplish with the database. Only then can you plan

what the new database environment should look like. Database migration is necessary when a

company upgrades its database or system software, either from one version to another or from

one program to an entirely different program. Software can be specifically written, using entire

programs or just scripts, to facilitate data migration. Such upgrades or program switches can take


place as a result of regular company practices or as a result of directives mandated in the wake of

a company takeover.

4.3 WINDOWS TO LINUX MIGRATION GUIDE

For many companies, migrating from Windows to Linux makes sense. The reasons are

compelling: greater stability and reliability, lower cost, access to application source code, greater

security, and conformity with open standards, according to numerous independent studies and

industry experts. But for those charged with spearheading the actual migration, the project may

seem difficult, unwieldy, or untenable. They may fear unanticipated technological roadblocks as

well as employees reticent to give up the tools they are used to using. Before launching into a

migration, make sure your reasons for migrating are business-driven. "You'll fail if you start a

migration from the ideological side instead," notes Stefan Werden, leader of the Linux

architecture team for Novell SUSE for Europe, the Middle East, and Africa. To make that

decision, conduct a thorough cost-benefit analysis, projected out several years, based on what it

would take to migrate and what it would cost not to migrate--not only financial costs, but other

costs, such as the amount of overall control you want over your organization's software

requirements. With a Windows-to-Linux migration, "nobody can force you to upgrade or deny

you a security fix. You always call the shots, so you are guaranteed to save money and retain

more control in the long run," says Nicholas Petreley, a Linux analyst at IT market intelligence

firm Evans Data Corp. of Santa Cruz, Calif. Of course, reducing costs can be another driving

factor in the decision-making process. According to Forrester Research, 68% of companies in

North America consider lowering overall operating costs a top priority, and Linux is one means

to that end. To achieve success in a Windows to Linux desktop migration, experts advise

planning, planning, and more planning. Before starting, take these STEPS:

GET EXECUTIVE BUY-IN- This step is crucial. Without executive support, your project may

be doomed to failure. By gathering this support before launching your project, you can be sure

the project will be supported, both financially and otherwise.


EXAMINE EXISTING ENVIRONMENT - Also, consider whether it makes sense to invest in

new hardware, or whether you will be migrating with existing hardware.

CHOOSE LINUX VERSION - It doesn't matter much whether you settle on Red Hat, SUSE,

Mandrake, Gentoo, Debian, or another flavor altogether. It's more important that your migration

team is comfortable with the Linux distribution chosen.

IDENTIFY THE EXPERTS - Often, these include transactional workers and developers –

employees who rely on basic applications like Web browsers, productivity suites, and email –

and not workers who routinely use powerful programs like Photoshop or Macintosh-based

applications. "It's about identifying needs or pockets where you can save the greatest amount of

money with minimal disruption," notes Susan Jeffries, marketing manager for the Java Desktop

System at Sun Microsystems Inc.

CONSIDER USING OUTSIDE CONSULTANTS- Unless you or members of your team have

attempted such a migration before, it can be well worth the money to hire an outside consultant.

Doing so can save countless hours of frustration. "When you run into a problem like not being

able to get something to print, someone familiar with Linux will be able to fix it in 10 minutes. If

you don't have such a person, it could take a week to fix the problem," Petreley says. Consultants

charge in various ways--by the hour, or by the project. A reasonable hourly price range for

general consulting with an experienced systems engineer costs $125 to $225 per hour. Total cost

depends on geographic location, the cost of consultants in the area, and the number of

employees, environment and number of applications to migrate. Typically, a migration

consulting service may begin with a workshop, proof of concept and architecture and

implementation review.

Here are some tips you can use to enhance your Windows to Linux migration experience,

along with solutions to some daunting challenges.


CHOOSING A LINUX DISTRIBUTION

When it comes to choosing a Linux distribution, there is a dizzying array of choices, including

the de facto standard Red Hat, Mandrake, SUSE, a Debian variant, or Fedora, a Red Hat-

and community-supported open source project to build a complete, general purpose operating

system exclusively from free software. Businesses usually can't lose by going with a de facto

standard, which today is SUSE, Red Hat, or Fedora. SUSE is backed by Novell, while the Java

Desktop System is supported by Sun. Both are solid companies, and both distributions also are

solid choices. SUSE can make a great deal of sense for companies used to managing their

desktop environment with systems management tools like ZenWorks and SMS. But for

businesses that have deployed Red Hat servers and have experience with the Red Hat network,

extending their IT infrastructure with Red Hat desktops also can make good sense.

CHOOSING A WINDOW MANAGER

Unlike Windows, Linux supports GNOME and KDE, as well as a wide variety of window

managers. Red Hat makes GNOME and KDE look and work alike. While GNOME has the

fewest features and is least likely to confuse users with options, KDE is more feature-rich and

powerful, but is so configurable that it could lead some users to spend too much time working with it.

DATABASE MIGRATION

Carefully examine the database and the data. Determine what you are trying to accomplish with

the database. Only then can you plan what the new database environment should look like.

Microsoft Access databases can be ported fairly easily to MySQL Database Server and

PostgreSQL. In situations where migration problems occur, some enterprising vendors are

creating solutions, such as Versora's ProgressionDB, which deals with some of the most common

conversion issues. Another option is Shareplex from Quest Software Inc. of Irvine, Calif. This


data replication software, mostly used to migrate Oracle to Linux, provides support for thousands

of rows per second for load distribution, disaster recovery, and migrations.

4.4 ORACLE RECOVERY MANAGER (RMAN)

Oracle Recovery Manager satisfies the most pressing demands of performant, manageable

backup and recovery, for all Oracle data formats. A complete high availability and disaster

recovery strategy requires dependable data backup, restore, and recovery procedures. Oracle

Recovery Manager (RMAN) provides a comprehensive foundation for efficiently backing up and

recovering the Oracle database. It is designed to work intimately with the server, providing

block-level corruption detection during backup and restore. RMAN optimizes performance and

space consumption during backup with file multiplexing and backup set compression, and

integrates with Oracle Secure Backup, as well as third party media management products, for

tape backup. RMAN takes care of all underlying database procedures before and after backup or

restore, freeing dependency on OS and SQL*Plus scripts. It provides a common interface, via

command line and Enterprise Manager, for backup tasks across different host operating systems

and offers features not available through user-managed methods, such as parallelization of

backup/restore data streams, backup file retention policies, and a detailed history of all backups.

Database Migration from Windows to Linux Using RMAN

This section describes the procedure required to migrate a database from Windows to Linux using

the RMAN Convert Database command. Both Windows and Linux platforms have the same

endian format, which makes it possible to transfer the whole database, making the migration

process very straightforward and simple. To migrate between platforms that have a different

endian format, Cross Platform Transportable Tablespaces (XTTS) needs to be used instead.


4.5 LIST OF STEPS NEEDED TO COMPLETE THE MIGRATION

1. Check platform compatibility between source and target OS

2. Start the database in read only mode

3. Check database readiness for transport from Windows to Linux using DBMS_TDB.CHECK_DB

4. Check if there are any external objects

5. Execute the RMAN Convert Database command

6. Copy converted datafiles, generated Transport Script and Parameter File to Linux

7. Edit the init.ora for the new database

8. Edit the Transport Script and Parameter File changing the windows paths to Linux Paths

9. Execute the Transport Script

10. Change the Database ID

11. Check database integrity

4.6 COMMAND NOTATION

desc V$db_transportable_platform

Step 1-->shutdown database

Step 2-->startup mount

Alter database open read only

Step 3-->connect to rman and run a script


RMAN> convert database new database 'vikas1'

transport script 'E:\trans'

to platform 'Linux IA (32-bit)';

//this script generates a converted file for every datafile in the database folder, plus a parameter file and

a transport script in the E: drive; copy these files and paste them on the Linux machine.

Step 4-->copy the parameter file to the default location and edit it;

Step 5-->edit the transport script file (change to the new paths).

Step 6-->open SQL*Plus and run the transport script.
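Step 10 of the migration, changing the database ID, can then be completed with the DBNEWID (nid) utility; a minimal sketch, assuming the new database is mounted and that the SYS password and connect string are illustrative:

shutdown immediate
startup mount
nid target=sys/oracle@vikas1        (run from the operating system prompt)
startup mount
alter database open resetlogs;

NID shuts the database down after changing the DBID, which is why the database is mounted again and opened with RESETLOGS.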

4.7 SYSTEM REQUIREMENTS

HARDWARE REQUIREMENT

RAM—512 MB (DDR2) or higher

CPU—2.86 GHz or higher

MOTHER BOARD -- D945 GNT or Higher version

HARD DISK—10 GB free or as per requirement

KEYBOARD-- TVS GOLD

MOUSE—Optical Mouse

MONITOR—as per requirement

SOFTWARE REQUIREMENT

OPERATING SYSTEM—Windows 98, 2000, NT, XP (32-bit), Linux Red Hat 5

DATABASE -- Oracle 9i, 10g

TOOL—Oracle Recovery Manager

FRONT END—SQL Navigator

BACK END – Oracle 10g


CHAPTER-5: DATA DESIGN


DATA DESIGN

5.1 DATA FLOW DIAGRAM

A data-flow diagram (DFD) is a graphical representation of the "flow" of data through an

information system. DFDs can also be used for the visualization of data processing (structured

design). On a DFD, data items flow from an external data source or an internal data store to an

internal data store or an external data sink, via an internal process. A DFD provides no

information about the timing of processes, or about whether processes will operate in sequence

or in parallel. It is therefore quite different from a flowchart, which shows the flow of control

through an algorithm, allowing a reader to determine what operations will be performed, in what

order, and under what circumstances, but not what kinds of data will be input to and output from

the system, nor where the data will come from and go to, nor where the data will be stored (all of

which are shown on a DFD). Data-flow diagrams (DFDs) are one of the three essential

perspectives of the structured-systems analysis and design method SSADM. The sponsor of a

project and the end users will need to be briefed and consulted throughout all stages of a system's

evolution. With a data-flow diagram, users are able to visualize how the system will operate,

what the system will accomplish, and how the system will be implemented. The old system's

dataflow diagrams can be drawn up and compared with the new system's data-flow diagrams to

draw comparisons to implement a more efficient system. Data-flow diagrams can be used to

provide the end user with a physical idea of where the data they input ultimately has an effect

upon the structure of the whole system from order to dispatch to report. How any system is

developed can be determined through a data-flow diagram.

DEVELOPING A DATA-FLOW DIAGRAM

In the course of developing a set of leveled data-flow diagrams, the analyst/designer is forced to

address how the system may be decomposed into component sub-systems, and to identify the

transaction data in the data model. There are different notations to draw data-flow diagrams,

defining different visual representations for Processes, Data Stores, Data Flow, And External


Entities. Data-flow diagrams ("bubble charts") are directed graphs in which the nodes specify

processing activities and the arcs specify data items transmitted between processing nodes.

TOP-DOWN APPROACH

1. The system designer makes "a context level DFD" or Level 0, which shows the

"interaction" (data flows) between "the system" (represented by one process) and "the

system environment" (represented by terminators).

2. The system is "decomposed in lower-level DFD (Level 1)" into a set of "processes, data

stores, and the data flows between these processes and data stores".


3. Each process is then decomposed into an "even-lower-level diagram containing its sub

processes".

4. This approach "then continues on the subsequent sub processes", until a necessary and

sufficient level of detail is reached which is called the primitive process (aka chewable in

one bite).

DFD is also a virtually designable diagram that technically or diagrammatically describes the

inflow and outflow of data or information that is provided by the external entity.

In Level 0, the diagram does not contain any data stores.

EVENT PARTITIONING APPROACH

LEVEL 0 (CONTEXT DIAGRAM)

This level shows the overall context of the system and its operating environment and shows the

whole system as just one process. It does not usually show data stores, unless they are "owned"

by external systems, e.g. are accessed by but not maintained by this system, however, these are

often shown as external entities.


LEVEL 1 (HIGH LEVEL DIAGRAM)

A Level 1 Data flow diagram for the same system. This level (level 1) shows all processes at the

first level of numbering, data stores, external entities and the data flows between them. The

purpose of this level is to show the major high-level processes of the system and their

interrelation. A process model will have one, and only one, level-1 diagram. A level-1 diagram

must be balanced with its parent context level diagram, i.e. there must be the same external

entities and the same data flows, these can be broken down to more detail in the level 1, example

the "inquiry" data flow could be split into "inquiry request" and "inquiry results" and still be

valid.


LEVEL 2 (LOW LEVEL DIAGRAM)

A Level 2 Data flow diagram showing the "Process Enquiry" process for the same system. This

level is decomposition of a process shown in a level-1 diagram, as such there should be a level-2

diagram for each and every process shown in a level-1 diagram. In this example processes 1.1,

1.2 & 1.3 are all children of process 1, together they wholly and completely describe process 1,

and combined must perform the full capacity of this parent process. As before, a level-2 diagram

must be balanced with its parent level-1 diagram.

5.2 ENTITY-RELATIONSHIP MODEL

In software engineering, an entity-relationship model (ERM) is an abstract and conceptual

representation of data. Entity-relationship modeling is a database modeling method, used to

produce a type of conceptual schema or semantic data model of a system, often a relational

database, and its requirements in a top-down fashion. Diagrams created by this process are called

entity-relationship diagrams, ER diagrams, or ERDs. The first stage of information system

design uses these models during the requirements analysis to describe information needs or the

type of information that is to be stored in a database. The data modeling technique can be used to

describe any ontology (i.e. an overview and classifications of used terms and their relationships)

for a certain area of interest. In the case of the design of an information system that is based on a


database, the conceptual data model is, at a later stage (usually called logical design), mapped to

a logical data model, such as the relational model; this in turn is mapped to a physical model

during physical design. Note that sometimes, both of these phases are referred to as "physical

design”. There are a number of conventions for entity-relationship diagrams (ERDs). The

classical notation mainly relates to conceptual modeling. There are a range of notations

employed in logical and physical database design, such as IDEF1X.

THE BUILDING BLOCKS: ENTITIES, RELATIONSHIPS, AND ATTRIBUTES

(Figures: two related entities; an entity with an attribute; a relationship with an attribute.)

An entity may be defined as a thing which is recognized as being capable of an independent

existence and which can be uniquely identified. An entity is an abstraction from the complexities

of some domain. When we speak of an entity we normally speak of some aspect of the real world

which can be distinguished from other aspects of the real world. An entity may be a physical

object such as a house or a car, an event such as a house sale or a car service, or a concept such


as a customer transaction or order. Although the term entity is the one most commonly used,

following Chen we should really distinguish between an entity and an entity-type. An entity-type

is a category. An entity, strictly speaking, is an instance of a given entity-type. There are usually

many instances of an entity-type. Because the term entity-type is somewhat cumbersome, most

people tend to use the term entity as a synonym for this term. A relationship captures how two or

more entities are related to one another. Relationships can be thought of as verbs, linking two or

more nouns. Examples: an owns relationship between a company and a computer, a supervises

relationship between an employee and a department, a performs relationship between an artist

and a song, a proved relationship between a mathematician and a theorem.

DIAGRAMMING CONVENTIONS

Entity sets are drawn as rectangles, relationship sets as diamonds. If an entity set participates in a

relationship set, they are connected with a line. Attributes are drawn as ovals and are connected

with a line to exactly one entity or relationship set. Cardinality constraints are expressed as

follows:

a double line indicates a participation constraint, totality or surjectivity: all entities in the

entity set must participate in at least one relationship in the relationship set;

an arrow from an entity set to a relationship set indicates a key constraint, i.e. injectivity: each

entity of the entity set can participate in at most one relationship in the relationship set;

A thick line indicates both, i.e. bijectivity: each entity in the entity set is involved in

exactly one relationship.

An underlined name of an attribute indicates that it is a key: two different entities or

relationships with this attribute always have different values for this attribute.

Attributes are often omitted as they can clutter up a diagram; other diagram techniques often list

entity attributes within the rectangles drawn for entity sets


CHAPTER-6: CODING


The Windows Database

This exercise was done using a small default 10.2.0.4 database

SQL> select bytes, file_name from dba_data_files;

BYTES FILE_NAME

---------- ------------------------------------------------------------

440401920 G:\TECHNOLOGY\ORCLW\DATAFILE\O1_MF_SYSTEM_44QYBFPF_.DBF

246415360 G:\TECHNOLOGY\ORCLW\DATAFILE\O1_MF_UNDOTBS1_44QYCFQW_.DBF

136314880 G:\TECHNOLOGY\ORCLW\DATAFILE\O1_MF_SYSAUX_44QYCZ3D_.DBF

5242880 G:\TECHNOLOGY\ORCLW\DATAFILE\O1_MF_USERS_44QYDHY7_.DBF

SQL> select name from v$controlfile;

NAME

----------------------------------------------------------------------------------

G:\TECHNOLOGY\ORCLW\CONTROLFILE\O1_MF_44QY9SXR_.CTL

C:\ORACLE\PRODUCT\10.2.0\FLASH_RECOVERY_AREA\ORCLW\CONTROLFILE\

O1_MF_44QY9TPX_.CTL

SQL> select member from v$logfile;

MEMBER

----------------------------------------------------------------------------------

G:\TECHNOLOGY\ORCLW\ONLINELOG\O1_MF_1_44QY9VJL_.LOG

C:\ORACLE\PRODUCT\10.2.0\FLASH_RECOVERY_AREA\ORCLW\ONLINELOG\

O1_MF_1_44QY9ZZ7_.LOG

G:\TECHNOLOGY\ORCLW\ONLINELOG\O1_MF_2_44QYB14V_.LOG

C:\ORACLE\PRODUCT\10.2.0\FLASH_RECOVERY_AREA\ORCLW\ONLINELOG\

O1_MF_2_44QYB5L1_.LOG

G:\TECHNOLOGY\ORCLW\ONLINELOG\O1_MF_3_44QYB6OY_.LOG

C:\ORACLE\PRODUCT\10.2.0\FLASH_RECOVERY_AREA\ORCLW\ONLINELOG\

O1_MF_3_44QYBC2F_.LOG


1. Check platform compatibility between source and target OS

You need to check the platforms to be sure they have the same endian format; also you need to

save the platform_name string to use it later as part of the convert database command.

SQL> select * from V$DB_TRANSPORTABLE_PLATFORM

2 where PLATFORM_NAME='Microsoft Windows IA (32-bit)' or

3 PLATFORM_NAME like 'Linux%'

4 /

PLATFORM_ID PLATFORM_NAME ENDIAN_FORMAT

----------- --------------------------------------------- --------------

7 Microsoft Windows IA (32-bit) Little

10 Linux IA (32-bit) Little

11 Linux IA (64-bit) Little

13 Linux x86 64-bit Little

2. Start the database in read only mode

In order to execute dbms_tdb.check_db the database must be opened on read only mode.

SQL> shutdown immediate;

Database closed.

Database dismounted.

ORACLE instance shut down.

SQL> startup mount;

ORACLE instance started.

Total System Global Area 167772160 bytes

Fixed Size 1295608 bytes

Variable Size 71305992 bytes


Database Buffers 88080384 bytes

Redo Buffers 7090176 bytes

Database mounted.

SQL> alter database open read only;

Database altered.

3. Check database readiness for transport from Windows to Linux

If the execution of dbms_tdb.check_db does not return any exceptions, that means the database is

ready for transport to the target platform.

SQL> set serveroutput on

SQL> declare

2 db_ready boolean;

3 begin

4 db_ready:= dbms_tdb.check_db('Linux IA (32-bit)');

5 end;

6 /

PL/SQL procedure successfully completed

4. Check if there are any external objects

If there are any external objects, take note of them; they will need to be taken care of manually.

SQL> set serveroutput on

SQL> declare

2 external boolean;

3 begin

4 /* value of external is ignored, but with SERVEROUTPUT set to ON

5 * dbms_tdb.check_external displays report of external objects


6 * on console */

7 external := dbms_tdb.check_external;

8 end;

9 /

The following directories exist in the database:

SYS.DATA_PUMP_DIR, SYS.ORACLE_OCM_CONFIG_DIR, SYS.ADMIN_DIR,

SYS.WORK_DIR

PL/SQL procedure successfully completed.

5. Using the RMAN CONVERT DATABASE COMMAND

Having successfully executed the check steps, and with the database open in read-only

mode, the convert database command can be executed with RMAN.

In this example, I'm not using the db_file_name_convert '<source-path>' '<target-

path>' option because the database is using Oracle Managed Files (OMF); when using

OMF, RMAN generates and displays the new filenames in the output of the convert

database command. At the end of the convert process, RMAN displays information

about how to complete the conversion on the target platform.

C:\Documents and Settings\avargas>Rman target sys/oracle@orclw nocatalog

Recovery Manager: Release 10.2.0.4.0 - Production on Thu May 19 17:26:22 2008

Copyright (c) 1982, 2007, Oracle. All rights reserved.

connected to target database: ORCLW (DBID=1718464921)

using target database control file instead of recovery catalog

RMAN> CONVERT DATABASE NEW DATABASE 'orcllnx'

2> transport script 'G:\TECHNOLOGY\ORCLLNX\transportscript'

3> to platform 'Linux IA (32-bit)';

Starting convert at 19-MAY-11

allocated channel: ORA_DISK_1

channel ORA_DISK_1: sid=154 devtype=DISK


Directory SYS.DATA_PUMP_DIR found in the database

Directory SYS.ORACLE_OCM_CONFIG_DIR found in the database

Directory SYS.ADMIN_DIR found in the database

Directory SYS.WORK_DIR found in the database

User SYS with SYSDBA and SYSOPER privilege found in password file

channel ORA_DISK_1: starting datafile conversion

input datafile fno=00001 name=G:\TECHNOLOGY\ORCLW\DATAFILE\

O1_MF_SYSTEM_44QYBFPF_.DBF

converted datafile=G:\TECHNOLOGY\ORCLW\DATAFILE\

O1_MF_SYSTEM_44TM3OPF_.DBF

channel ORA_DISK_1: datafile conversion complete, elapsed time: 00:00:55

channel ORA_DISK_1: starting datafile conversion

input datafile fno=00002 name=G:\TECHNOLOGY\ORCLW\DATAFILE\

O1_MF_UNDOTBS1_44QYCFQW_.DBF

converted datafile=G:\TECHNOLOGY\ORCLW\DATAFILE\

O1_MF_UNDOTBS1_44TM5F98_.DBF

channel ORA_DISK_1: datafile conversion complete, elapsed time: 00:00:35

channel ORA_DISK_1: starting datafile conversion

input datafile fno=00003 name=G:\TECHNOLOGY\ORCLW\DATAFILE\

O1_MF_SYSAUX_44QYCZ3D_.DBF

converted datafile=G:\TECHNOLOGY\ORCLW\DATAFILE\

O1_MF_SYSAUX_44TM6JTB_.DBF

channel ORA_DISK_1: datafile conversion complete, elapsed time: 00:00:25

channel ORA_DISK_1: starting datafile conversion

input datafile fno=00004 name=G:\TECHNOLOGY\ORCLW\DATAFILE\

O1_MF_USERS_44QYDHY7_.DBF

converted datafile=G:\TECHNOLOGY\ORCLW\DATAFILE\

O1_MF_USERS_44TM7BD5_.DBF

channel ORA_DISK_1: datafile conversion complete, elapsed time: 00:00:01

Run SQL script G:\TECHNOLOGY\ORCLLNX\TRANSPORTSCRIPT on the target platform

to create

database

Edit init.ora file C:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\

INIT_00JIGSKL_1_0.ORA. This

PFILE will be used to create the database on the target platform

To recompile all PL/SQL modules, run utlirp.sql and utlrp.sql on the target platform

To change the internal database identifier, use DBNEWID Utility

Finished backup at 19-MAY-11

6. Copy Converted Datafiles, Generated Transport Script and Parameter File to the Linux Server

We need to copy the generated files over to the Linux server; they include all converted datafiles, the transport script and the generated pfile. If needed, create at this point the directories you will need on the Linux server for the dump destinations and the flash recovery area, e.g.:

mkdir –p /oradisk/oracle/app/admin/ORCLLNX/adump

mkdir –p /oradisk/oracle/app/admin/ORCLLNX/bdump

mkdir –p /oradisk/oracle/app/admin/ORCLLNX/cdump

mkdir –p /oradisk/oracle/app/admin/ORCLLNX/udump

mkdir –p /oradisk/database/ORCLLNX/FRA

All converted datafiles contain the string 44TM in their OMF filenames, for example O1_MF_SYSTEM_44TM3OPF_.DBF, so they can be moved with a single wildcard command:

G:\TECHNOLOGY\ORCLW\DATAFILE>move *44TM* ..\..\ORCLLNX

G:\TECHNOLOGY\ORCLW\DATAFILE\O1_MF_SYSTEM_44TM3OPF_.DBF

G:\TECHNOLOGY\ORCLW\DATAFILE\O1_MF_UNDOTBS1_44TM5F98_.DBF

G:\TECHNOLOGY\ORCLW\DATAFILE\O1_MF_SYSAUX_44TM6JTB_.DBF

G:\TECHNOLOGY\ORCLW\DATAFILE\O1_MF_USERS_44TM7BD5_.DBF

G:\TECHNOLOGY\ORCLW\DATAFILE>move

C:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\INIT_00JIGSKL_1_0.ORA

..\..\ORCLLNX\initORCLLNX.ora

Check that all required files are located in the ORCLLNX directory:

G:\TECHNOLOGY\ORCLW\DATAFILE>dir ..\..\ORCLLNX

Volume in drive G is TECHNOLOGY

Volume Serial Number is 1877-B4EA

Directory of G:\TECHNOLOGY\ORCLLNX

19/05/11 05:27 PM <DIR>

19/05/11 05:27 PM <DIR>

19/05/11 05:31 PM 2,616 TRANSPORTSCRIPT

19/05/11 05:31 PM 440,410,112 O1_MF_SYSTEM_44TM3OPF_.DBF

19/05/11 05:31 PM 246,423,552 O1_MF_UNDOTBS1_44TM5F98_.DBF

19/05/11 05:31 PM 146,808,832 O1_MF_SYSAUX_44TM6JTB_.DBF

19/05/11 05:31 PM 5,251,072 O1_MF_USERS_44TM7BD5_.DBF

19/05/11 05:31 PM 1,556 initORCLLNX.ora

6 File(s) 838,897,740 bytes

2 Dir(s) 18,968,444,928 bytes free

7. Edit init.ora for new database

The RMAN convert command executed on Windows generated a parameter file that needs to be edited before it can be used on the target Linux server.

The pfile generated by Rman:

# Please change the values of the following parameters:

control_files = "C:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\CF_DORCLLNX_

ID-1718464921_00JIGSKL"

db_create_file_dest = "C:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\TECHNOLOGY"

db_recovery_file_dest =

"C:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\flash_recovery_area"

db_recovery_file_dest_size= 2147483648

audit_file_dest = "C:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\ADUMP"

background_dump_dest = "C:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\BDUMP"

user_dump_dest = "C:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\UDUMP"

core_dump_dest = "C:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\CDUMP"

db_name = "ORCLLNX"

# Please review the values of the following parameters:

__shared_pool_size = 62914560

__large_pool_size = 4194304

__java_pool_size = 4194304

__streams_pool_size = 0

__db_cache_size = 88080384

remote_login_passwordfile= "EXCLUSIVE"

db_domain = ""

dispatchers = "(PROTOCOL=TCP) (SERVICE=orclwXDB)"

# The values of the following parameters are from source database:

processes = 150

sga_target = 167772160

db_block_size = 8192

compatible = "10.2.0.3.0"

db_file_multiblock_read_count= 16

undo_management = "AUTO"

undo_tablespace = "UNDOTBS1"

job_queue_processes = 10

open_cursors = 300

pga_aggregate_target = 16777216

The pfile edited to be used on Linux:

# Please change the values of the following parameters:

control_files = '/oradisk/database/ORCLLNX/orcllnx_control1.ctl' ,

'/oradisk/database/ORCLLNX/orcllnx_control2.ctl'

db_create_file_dest = '/oradisk/database/ORCLLNX'

db_recovery_file_dest = '/oradisk/database/ORCLLNX/FRA'

db_recovery_file_dest_size= 2147483648

audit_file_dest = '/oradisk/oracle/app/admin/ORCLLNX/adump'

background_dump_dest = '/oradisk/oracle/app/admin/ORCLLNX/bdump'

user_dump_dest = '/oradisk/oracle/app/admin/ORCLLNX/udump'

core_dump_dest = '/oradisk/oracle/app/admin/ORCLLNX/cdump'

db_name = 'ORCLLNX'

# Please review the values of the following parameters:

__shared_pool_size = 62914560

__large_pool_size = 4194304

__java_pool_size = 4194304

__streams_pool_size = 0

__db_cache_size = 88080384

remote_login_passwordfile= 'EXCLUSIVE'

db_domain = ''

dispatchers = '(PROTOCOL=TCP) (SERVICE=orcllnxXDB)'

# The values of the following parameters are from source database:

processes = 150

sga_target = 167772160

db_block_size = 8192

compatible = '10.2.0.3.0'

db_file_multiblock_read_count= 16

undo_management = 'AUTO'

undo_tablespace = 'UNDOTBS1'

job_queue_processes = 10

open_cursors = 300

pga_aggregate_target = 16777216
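Once the new instance has been started on Linux with this edited pfile (step 9), the pfile can optionally be converted into a server parameter file so that later parameter changes persist across restarts. A minimal sketch, assuming the pfile sits under $ORACLE_HOME/dbs as in the transport script:

-- Optional: create an spfile from the edited pfile (run after the database is open).
CREATE SPFILE FROM
  PFILE='/oradisk/oracle/app/oracle/product/10.2.0/db_1/dbs/initORCLLNX.ora';

-- Restart so the instance picks up the spfile by default.
SHUTDOWN IMMEDIATE
STARTUP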

8. Edit the Transport Script

Before running the transport script on the target Linux server we need to edit it to set the correct paths for the datafiles, controlfiles and dump directories; we may also want to change the values of tuning parameters. The script generated by RMAN:

-- The following commands will create a new control file and use it

-- to open the database.

-- Data used by Recovery Manager will be lost.

-- The contents of online logs will be lost and all backups will

-- be invalidated. Use this only if online logs are damaged.

-- After mounting the created controlfile, the following SQL

-- statement will place the database in the appropriate

-- protection mode:

-- ALTER DATABASE SET STANDBY DATABASE TO MAXIMIZE PERFORMANCE

STARTUP NOMOUNT

PFILE='C:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\INIT_00JIGSKL_1_0.ORA'

CREATE CONTROLFILE REUSE SET DATABASE "ORCLLNX" RESETLOGS

NOARCHIVELOG

MAXLOGFILES 16

MAXLOGMEMBERS 3

MAXDATAFILES 100

MAXINSTANCES 8

MAXLOGHISTORY 292

LOGFILE

GROUP 1 SIZE 50M,

GROUP 2 SIZE 50M,

GROUP 3 SIZE 50M

DATAFILE

'C:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\DATA_D-ORCLW_I-1718464921_TSSYSTEM_FNO-1_07JIGSKL',

'C:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\DATA_D-ORCLW_I-1718464921_TSUNDOTBS1_FNO-2_08JIGSMD',

'C:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\DATA_D-ORCLW_I-1718464921_TSSYSAUX_FNO-3_09JIGSNG',

'C:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\DATA_D-ORCLW_I-1718464921_TSUSERS_FNO-4_0AJIGSOA'

CHARACTER SET AL32UTF8

;

-- Database can now be opened zeroing the online logs.

ALTER DATABASE OPEN RESETLOGS;

-- Commands to add tempfiles to temporary tablespaces.

-- Online tempfiles have complete space information.

-- Other tempfiles may require adjustment.

ALTER TABLESPACE TEMP ADD TEMPFILE

SIZE 20971520 AUTOEXTEND ON NEXT 655360 MAXSIZE 33554431M;

-- End of tempfile additions.

--

set echo off

prompt

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

prompt * Your database has been created successfully!

prompt * There are many things to think about for the new database. Here

prompt * is a checklist to help you stay on track:

prompt * 1. You may want to redefine the location of the directory objects.

prompt * 2. You may want to change the internal database identifier (DBID)

prompt * or the global database name for this database. Use the

prompt * NEWDBID Utility (nid).

prompt

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

SHUTDOWN IMMEDIATE

STARTUP UPGRADE

PFILE='C:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\INIT_00JIGSKL_1_0.ORA'

@@?/rdbms/admin/utlirp.sql

SHUTDOWN IMMEDIATE

STARTUP PFILE='C:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\

INIT_00JIGSKL_1_0.ORA'

-- The following step will recompile all PL/SQL modules.

-- It may take several hours to complete.

@@ ?/rdbms/admin/utlrp.sql

set feedback 6;

The script edited to be run on Linux:

STARTUP NOMOUNT

PFILE='/oradisk/oracle/app/oracle/product/10.2.0/db_1/dbs/initORCLLNX.ora'

CREATE CONTROLFILE REUSE SET DATABASE "ORCLLNX" RESETLOGS

NOARCHIVELOG

MAXLOGFILES 16

MAXLOGMEMBERS 3

MAXDATAFILES 100

MAXINSTANCES 8

MAXLOGHISTORY 292

LOGFILE

GROUP 1 SIZE 50M,

GROUP 2 SIZE 50M,

GROUP 3 SIZE 50M

DATAFILE

'/oradisk/database/ORCLLNX/O1_MF_SYSTEM_44TM3OPF_.DBF',

'/oradisk/database/ORCLLNX/O1_MF_UNDOTBS1_44TM5F98_.DBF',

'/oradisk/database/ORCLLNX/O1_MF_SYSAUX_44TM6JTB_.DBF',

'/oradisk/database/ORCLLNX/O1_MF_USERS_44TM7BD5_.DBF'

CHARACTER SET AL32UTF8

;

-- Database can now be opened zeroing the online logs.

ALTER DATABASE OPEN RESETLOGS;

-- Commands to add tempfiles to temporary tablespaces.

-- Online tempfiles have complete space information.

-- Other tempfiles may require adjustment.

ALTER TABLESPACE TEMP ADD TEMPFILE

SIZE 20971520 AUTOEXTEND ON NEXT 655360 MAXSIZE 33554431M;

-- End of tempfile additions.

--

set echo off

prompt

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

prompt * Your database has been created successfully!

prompt * There are many things to think about for the new database. Here

prompt * is a checklist to help you stay on track:

prompt * 1. You may want to redefine the location of the directory objects.

prompt * 2. You may want to change the internal database identifier (DBID)

prompt * or the global database name for this database. Use the

prompt * NEWDBID Utility (nid).

prompt

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

SHUTDOWN IMMEDIATE

STARTUP UPGRADE

PFILE='/oradisk/oracle/app/oracle/product/10.2.0/db_1/dbs/initORCLLNX.ora'

@@?/rdbms/admin/utlirp.sql

SHUTDOWN IMMEDIATE

STARTUP

PFILE='/oradisk/oracle/app/oracle/product/10.2.0/db_1/dbs/initORCLLNX.ora';

-- The following step will recompile all PL/SQL modules.

-- It may take several hours to complete.

@@?/rdbms/admin/utlrp.sql

Set feedback 6;

9. Execute the Transport Script

This is the log of the transport script execution:

avargas-pc:/oradisk/database/ORCLLNX> sqlplus / as sysdba

SQL*Plus: Release 10.2.0.4.0 - Production on Wed Jun 11 10:11:54 2008

Copyright (c) 1982, 2007, Oracle. All Rights Reserved.

Connected to:

Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Production

With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> @transport

SQL> STARTUP NOMOUNT

PFILE='/oradisk/oracle/app/oracle/product/10.2.0/db_1/dbs/initORCLLNX.ora';

ORACLE instance started.

Total System Global Area 167772160 bytes

Fixed Size 1266368 bytes

Variable Size 71306560 bytes

Database Buffers 88080384 bytes

Redo Buffers 7118848 bytes

SQL> CREATE CONTROLFILE REUSE SET DATABASE "ORCLLNX" RESETLOGS

NOARCHIVELOG

2 MAXLOGFILES 16

3 MAXLOGMEMBERS 3

4 MAXDATAFILES 100

5 MAXINSTANCES 8

6 MAXLOGHISTORY 292

7 LOGFILE

8 GROUP 1 SIZE 50M,

9 GROUP 2 SIZE 50M,

10 GROUP 3 SIZE 50M

11 DATAFILE

12 '/oradisk/database/ORCLLNX/O1_MF_SYSTEM_44TM3OPF_.DBF',

13 '/oradisk/database/ORCLLNX/O1_MF_UNDOTBS1_44TM5F98_.DBF',

14 '/oradisk/database/ORCLLNX/O1_MF_SYSAUX_44TM6JTB_.DBF',

15 '/oradisk/database/ORCLLNX/O1_MF_USERS_44TM7BD5_.DBF'

16 CHARACTER SET AL32UTF8

17 ;

Control file created.

SQL>

SQL> -- Database can now be opened zeroing the online logs.

SQL> ALTER DATABASE OPEN RESETLOGS;

Database altered.

SQL>

SQL> -- Commands to add tempfiles to temporary tablespaces.

SQL> -- Online tempfiles have complete space information.

SQL> -- Other tempfiles may require adjustment.

SQL> ALTER TABLESPACE TEMP ADD TEMPFILE

2 SIZE 20971520 AUTOEXTEND ON NEXT 655360 MAXSIZE 33554431M;

Tablespace altered.

SQL> -- End of tempfile additions.

SQL> --

SQL>

SQL> set echo off

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* Your database has been created successfully!

* There are many things to think about for the new database. Here

* is a checklist to help you stay on track:

* 1. You may want to redefine the location of the directory objects.

* 2. You may want to change the internal database identifier (DBID)

* or the global database name for this database. Use the

* NEWDBID Utility (nid).

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Database closed.

Database dismounted.

ORACLE instance shut down.

ORACLE instance started.

Total System Global Area 167772160 bytes

Fixed Size 1266368 bytes

Variable Size 71306560 bytes

Database Buffers 88080384 bytes

Redo Buffers 7118848 bytes

Database mounted.

Database opened.

SQL> WHENEVER SQLERROR EXIT;

SQL>

SQL> DOC

DOC>#######################################################################

DOC>#######################################################################

DOC> The following statement will cause an "ORA-01722: invalid number"

DOC> error if the database was not opened in UPGRADE mode

DOC>

DOC> If you encounter this error, execute "SHUTDOWN", "STARTUP UPGRADE" and

DOC> re-execute utlirp.sql

DOC>#######################################################################

DOC>#######################################################################

DOC>#

SQL> SELECT TO_NUMBER ('MUST_BE_OPEN_UPGRADE') FROM v$instance

2 WHERE status != 'OPEN MIGRATE';

no rows selected

SQL> Rem Store object numbers of all valid PL/SQL-based functional indexes

SQL> DROP TABLE utlirp_enabled_func_indexes;

Table dropped.

SQL> CREATE TABLE utlirp_enabled_func_indexes AS

2 SELECT obj# FROM ind$

3 WHERE bitand(property, 2048) != 0 AND bitand(flags, 1024) = 0;

Table created.

>>>>>>>>>>>>>> many output lines followed

These are the final lines of the logfile:

PL/SQL procedure successfully completed.

SQL>

SQL> DOC

DOC> the following query reports the number of objects that have compiled

DOC> with errors (objects that compile with errors have status set to 3 in

DOC> obj$). If the number is higher than expected, please examine the error

DOC> messages reported with each object (using SHOW ERRORS) to see if they

DOC> point to system misconfiguration or resource constraints that must be

DOC> fixed before attempting to recompile these objects.

DOC>#

SQL> select COUNT (*) "OBJECTS WITH ERRORS" from obj$ where status = 3;

OBJECTS WITH ERRORS

-------------------

0

SQL>

SQL>

SQL> DOC

DOC> the following query reports the number of errors caught during

DOC> recompilation. If this number is non-zero, please query the error

DOC> messages in the table UTL_RECOMP_ERRORS to see if any of these errors

DOC> are due to misconfiguration or resource constraints that must be

DOC> fixed before objects can compile successfully.

DOC>#

SQL> select COUNT (*) "ERRORS DURING RECOMPILATION" from utl_recomp_errors;

ERRORS DURING RECOMPILATION

---------------------------

0

SQL>

SQL>

SQL> Rem

=====================================================================

SQL> Rem Run component validation procedure

SQL> Rem

=====================================================================

SQL>

SQL> SET serveroutput on

SQL> EXECUTE dbms_registry_sys.validate_components;

PL/SQL procedure successfully completed.

SQL> SET serveroutput off

SQL>

SQL>

SQL> Rem

=====================================================================

======

SQL> Rem END utlrp.sql

SQL> Rem

=====================================================================

======

SQL> set feedback 6;

10. Change database identifier

To change the database identifier you need to use the DBNEWID utility, "nid". It is run from the operating system command line with the database mounted:

sqlplus "/ as sysdba"

startup mount

exit

To change the DBID:

cd $ORACLE_HOME/bin

./nid target=/

To verify the DBID and database name:

SELECT dbid, name FROM v$database;

DBID Change, Execution Log:

avargas-pc:~/app/oracle/product/10.2.0/db_1/bin> sqlplus / as sysdba

SQL*Plus: Release 10.2.0.4.0 - Production on Wed Jun 11 10:25:09 2008

Copyright (c) 1982, 2007, Oracle. All Rights Reserved.

Connected to an idle instance.

SQL> startup mount;

ORACLE instance started.

Total System Global Area 167772160 bytes

Fixed Size 1266368 bytes

Variable Size 71306560 bytes

Database Buffers 88080384 bytes

Redo Buffers 7118848 bytes

Database mounted.

SQL> exit

Disconnected from Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 -

Production

With the Partitioning, OLAP, Data Mining and Real Application Testing options

avargas-pc:~/app/oracle/product/10.2.0/db_1/bin> ./nid target=/

DBNEWID: Release 10.2.0.4.0 - Production on Wed Jun 11 10:25:50 2008

Copyright (c) 1982, 2007, Oracle. All rights reserved.

Connected to database ORCLLNX (DBID=1718464921)

Connected to server version 10.2.0

Control Files in database:

/oradisk/database/ORCLLNX/orcllnx_control1.ctl

/oradisk/database/ORCLLNX/orcllnx_control2.ctl

Change database ID of database ORCLLNX? (Y/[N]) => Y

Proceeding with operation

Changing database ID from 1718464921 to 1179074095

Control File /oradisk/database/ORCLLNX/orcllnx_control1.ctl - modified

Control File /oradisk/database/ORCLLNX/orcllnx_control2.ctl - modified

Datafile /oradisk/database/ORCLLNX/O1_MF_SYSTEM_44TM3OPF_.DBF - dbid changed

Datafile /oradisk/database/ORCLLNX/O1_MF_UNDOTBS1_44TM5F98_.DBF - dbid

changed

Datafile /oradisk/database/ORCLLNX/O1_MF_SYSAUX_44TM6JTB_.DBF - dbid changed

Datafile /oradisk/database/ORCLLNX/O1_MF_USERS_44TM7BD5_.DBF - dbid changed

Datafile /oradisk/database/ORCLLNX/ORCLLNX/datafile/o1_mf_temp_44yxofkr_.tmp

- dbid changed

Control File /oradisk/database/ORCLLNX/orcllnx_control1.ctl - dbid changed

Control File /oradisk/database/ORCLLNX/orcllnx_control2.ctl - dbid changed

Instance shut down

Database ID for database ORCLLNX changed to 1179074095.

All previous backups and archived redo logs for this database are unusable.

Database is not aware of previous backups and archived logs in Recovery Area.

Database has been shutdown, open database with RESETLOGS option.

Successfully changed database ID.

DBNEWID - Completed succesfully.

avargas-pc:~/app/oracle/product/10.2.0/db_1/bin> sqlplus / as sysdba

SQL*Plus: Release 10.2.0.4.0 - Production on Wed Jun 11 10:28:22 2008

Copyright (c) 1982, 2007, Oracle. All Rights Reserved.

Connected to an idle instance.

SQL> startup mount;

ORACLE instance started.

Total System Global Area 167772160 bytes

Fixed Size 1266368 bytes

Variable Size 71306560 bytes

Database Buffers 88080384 bytes

Redo Buffers 7118848 bytes

Database mounted.

SQL> alter database open resetlogs;

Database altered

11. Check database integrity

SQL> select tablespace_name from dba_tablespaces;

TABLESPACE_NAME

------------------------------

SYSTEM

UNDOTBS1

SYSAUX

TEMP

USERS

SQL> select file_name from dba_data_files;

FILE_NAME

-------------------------------------------------------------------------------

/oradisk/database/ORCLLNX/O1_MF_USERS_44TM7BD5_.DBF

/oradisk/database/ORCLLNX/O1_MF_SYSAUX_44TM6JTB_.DBF

/oradisk/database/ORCLLNX/O1_MF_UNDOTBS1_44TM5F98_.DBF

/oradisk/database/ORCLLNX/O1_MF_SYSTEM_44TM3OPF_.DBF

Checking component status after transport:

SQL> SELECT COMP_NAME, STATUS FROM DBA_REGISTRY;

COMP_NAME STATUS

-------------------------------------- ------

Oracle XML Database VALID

Oracle Expression Filter VALID

Oracle Rules Manager VALID

Oracle Workspace Manager VALID

Oracle interMedia VALID

Oracle Database Catalog Views VALID

Oracle Database Packages and Types VALID

JServer JAVA Virtual Machine VALID

Oracle XDK VALID

Oracle Database Java Packages VALID
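In addition to DBA_REGISTRY, a couple of simple dictionary queries (a sketch, not part of the original log) can confirm that no objects were left invalid by the recompilation and that all datafiles are available on the new platform:

-- Objects still invalid after utlirp.sql / utlrp.sql (expect no rows).
SELECT owner, object_type, COUNT(*) AS invalid_count
  FROM dba_objects
 WHERE status = 'INVALID'
 GROUP BY owner, object_type;

-- Datafile availability (STATUS should be AVAILABLE; ONLINE_STATUS ONLINE or SYSTEM).
SELECT file_name, status, online_status
  FROM dba_data_files;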

CHAPTER-7: DATABASE MIGRATION TESTING

DATABASE MIGRATION TESTING

The approach to testing data and content migrations relies upon sampling, where some subset of random data or content is selected and inspected to ensure the migration was completed "as designed". Those who have tested migrations using this approach are familiar with the typical iterative test, debug and retest method, where subsequent executions of the testing process reveal different error conditions as new samples are reviewed. Sampling works, but it relies upon an acceptable level of error and an assumption about repeatability. An acceptable level of error implies that less than 100% of the data will be migrated without error, and the level of error is inversely proportional to the number of samples tested (refer to sampling standards such as ANSI/ASQ Z1.4). As for the assumption of repeatability, the fact that many migrations require four, five or more iterations of testing with differing results implies that one of the key tenets of sampling is not upheld, i.e., "non-conformities occur randomly and with statistical independence."

o We have to check database integrity.

o We have to check referential integrity.

o We have to check whether the data is loaded into the target database correctly.

o We have to check the constraints (see the query sketch below).
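A minimal SQL sketch for the constraint check mentioned above; it assumes the application data lives outside the Oracle-supplied schemas and lists any constraints that are not enabled and validated after the migration:

-- Constraints that are disabled or not validated after the migration (expect no rows).
SELECT owner, table_name, constraint_name, constraint_type, status, validated
  FROM dba_constraints
 WHERE owner NOT IN ('SYS', 'SYSTEM', 'SYSMAN', 'XDB', 'WMSYS')
   AND (status <> 'ENABLED' OR validated <> 'VALIDATED');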

Even with these shortcomings, sampling has a role in a well-defined testing strategy, but what are the other testing options? The following lists the options for testing by phase of the migration process:

7.1 PRE-MIGRATION TESTING

These tests occur early in the migration process, before any migration, even migration for testing

purposes, is completed. The pre-migration testing options include:

Verify scope of source systems and data with user community and IT. Verification should

include data to be included as well as excluded and, if applicable, tied to the specific

queries being used for the migration.

Define the source to target high-level mappings for each category of data or content and

verify that the desired type has been defined in the destination system.

Verify destination system data requirements such as the field names, field type,

mandatory fields, valid value lists and other field-level validation checks.

Using the source to destination mappings, test the source data against the requirements of the destination system. For example, if the destination system has a mandatory field, ensure that the corresponding source field is not null, or if the destination system field has a list of valid values, test to ensure that the corresponding source fields contain only these valid values (see the SQL sketch after this list).

Test the fields that uniquely link source and target records and ensure that there is a definitive mapping between the record sets.

Test source and target system connections from the migration platform.

Test the tool configuration against the migration specification, which can often be completed via black-box testing on a field-by-field basis. Done carefully, testing here can also be used to verify that the migration specification's mappings are complete and accurate.
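As an illustration of the field-level pre-migration checks listed above, the following sketch uses a hypothetical source table SRC_CUSTOMERS with a mandatory CUSTOMER_NAME column and a STATUS column restricted to a list of valid values; the table, column names and value list are placeholders, not part of the original document:

-- Hypothetical example: source rows that would violate a mandatory destination field.
SELECT COUNT(*) AS null_names
  FROM src_customers
 WHERE customer_name IS NULL;

-- Hypothetical example: source values outside the destination's valid-value list.
SELECT status, COUNT(*) AS bad_rows
  FROM src_customers
 WHERE status NOT IN ('ACTIVE', 'INACTIVE', 'PENDING')
 GROUP BY status;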

FORMAL DESIGN REVIEW

Conduct a formal design review of the migration specification when the pre-migration testing is nearly complete, or during the earliest stages of the migration tool configuration.

The specification should include:

A definition of the source systems

The source system’s data sets and queries

The mappings between the source system fields and the destination system

Number of source records

Number of source system records created per unit time (to be used to define the migration timing and downtime window)

Identification of supplementary sources

Data cleansing requirements

Performance requirements

Testing requirements

The formal design review should include representatives from the appropriate user communities, IT and management. The outcome of a formal design review should include a list of open issues, the means to close each issue, approval of the migration specification, and a process to keep the specification in sync with the migration tool configuration (which tends to change continuously until the production migration).

7.2 POST-MIGRATION TESTING

Once a migration has been executed, additional end-to-end testing can be performed. Expect a significant number of errors to be identified during the initial test runs, although this will be minimized if sufficient pre-migration testing has been carried out. Post-migration testing is typically performed in a test environment and includes:

Test the throughput of the migration process (number of records per unit time). This

testing will be used to verify that the planned downtime is sufficient. For planning

purposes, consider the time to verify that the migration process was completed

successfully.

Compare Migrated Records to Records Generated by the Destination System – Ensure

that migrated records are complete and of the appropriate context.

Summary Verification – There are several techniques that provide summary information, including record counts and checksums. Here, the number of records migrated is compiled from the destination system and then compared to the number of records in the source system (a record-count comparison sketch appears after this list). This approach provides only summary information and, if an issue exists, it often does not provide insight into the issue's root cause.

Compare Migrated Records to Sources – Tests should verify that field values are migrated as per the migration specification. In short, source values and the field-level mappings are used to calculate the expected results at the destination. This testing can be completed using sampling if appropriate, or, if the migration includes data that poses significant business or compliance risk, 100% of the migrated data can be verified using an automated testing tool.
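A record-count comparison sketch for the summary verification described above; the table name CUSTOMERS, the key/name columns and the database link SOURCE_DB are placeholders used purely for illustration:

-- Hypothetical summary check: compare row counts between source and target.
SELECT (SELECT COUNT(*) FROM customers@source_db) AS source_rows,
       (SELECT COUNT(*) FROM customers)           AS target_rows
  FROM dual;

-- Hypothetical checksum per table, computed on both sides and compared.
SELECT SUM(ORA_HASH(customer_id || '|' || customer_name)) AS table_checksum
  FROM customers;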

The advantages of the automated approach include the ability to identify errors that are less

likely to occur (the proverbial needles in a haystack). Additionally, as an automated testing tool

can be configured in parallel with the configuration of the migration tool, the ability to test 100%

of the migrated data is available immediately following the first test migration. When compared

to sampling approaches, it is easy to see that automated testing saves significant time and minimizes the typical iterative test, debug and retest cycle found with sampling.

7.3 USER ACCEPTANCE TESTING

Functional subtleties related to the co-mingling of migrated data and data created in the

destination system may be difficult to identify early in the migration process. User acceptance

testing provides an opportunity for the user community to interact with legacy data in the

destination system prior to production release, and most often, this is the first such opportunity

for the users. Attention should be given to reporting, downstream feeds, and other system

processes that rely on migrated data.

CHAPTER-8: SCREENSHOTS

CHAPTER-9: CONCLUSION OF PROJECT

CONCLUSION

Database migration is a complex project that has to be thoroughly analyzed. Determine the

migration type and the necessary tools and resources to complete it. Various approaches have to

be fully reviewed and understood prior to choosing one. Migration can be done in-house or

outsourced to a consulting practice. Consider using third-party solutions to prevent hand-coding

migration scripts that will be time consuming and rather complex if data cleansing is involved. In

addition, these solutions can reduce or eliminate the downtime and facilitate the replication

between the systems. Outsourcing will bring in the best practices and appropriate tools to

complete the project. This project over database migration is a value-add tool that enables you to

identify any issues with your database migration before the project is signed off as complete. The

combination of Oracle Recovery manager, Oracle JDeveloper Application Migration Assistant,

and Oracle Database Migration Verifier provides a comprehensive solution for database and

application migration to the Oracle platform.

Migration Task                     % of Effort   DB Best Product                           % of Task
Migration Analysis                 7%            Migration Analyzer                        100%
Schema and Content Conversion      3%            Schema & Data Migrator                    100%
Application Component Conversion   37%           SQL Converter & Runtime Query Converter   95%
Environment/Utility Conversion     1%            SQL Converter                             50%
Product Test Planning              6%            Migration Tester                          50%
Product Test Execution             39%           Migration Tester                          66%
Deployment                         7%            N/A                                       0%

Appendix 1

INDEX

A
Analytical database 11
Architecture of database 16

B
Backup And Replication 31
Business Migration 73

C
Command Notation 78
Coding 88
Choosing Linux 76

D
Data migration 71
Database migration 73
DBMS Features 30
Data Flow Diagram 81

E
External database 12
External data level 18

K
Key 7

R
Relational Model 22
RDBMS 32

S
Step of migration 78
System requirement 79

T
Three-Level Architecture 17
Task of DBMS Package 30
Tables 7

W
Window Manager 76

Appendix 2

GLOSSARY

A

Attribute - A single data item related to a database object

C

Candidate Key - A candidate key is a combination of attributes that can be used to uniquely identify a database record without any extraneous data.

Column- Database tables are composed of individual columns corresponding to the attributes of

the object.

D

Data - Data consists of a series of facts or statements that may have been collected, stored,

processed and/or manipulated but have not been organized or placed into context.

DB2- DB2 is a relational database system developed by IBM Corporation, originally for use on

large mainframe computer systems.

E

ER Diagram- An entity-relationship (ER) diagram is a specialized graphic that illustrates the

interrelationships between entities in a database.

K

Key - A database key is an attribute used to sort and/or identify data in some manner.

R

Report- A database report presents information retrieved from a table or query in a preformatted,

attractive manner.

S

Schema - A database schema is a collection of metadata that describes the relations in a database.
