Management Information Systems Unit 5. Database Management Systems Data warehousing Data mining Data Definition Language Data Control Language Data Manipulation Language 2-12-10 & 3-12-10 & 4-12-10 prepared by A Sanjeev raj
Transcript
Page 1: Unit 5

Management Information Systems

Unit 5. Database Management Systems

Data warehousing

Data mining

Data Definition Language

Data Control Language

Data Manipulation Language

2-12-10 & 3-12-10 & 4-12-10

prepared by A Sanjeev raj

Page 2: Unit 5

Today we are going to discuss:

• Data Warehousing.

• Data Mining.

• Data Definition Language.

• Data Control Language.

• Data Manipulation Language.

5. Database Management Systems

Page 3: Unit 5

Data warehousing

Page 4: Unit 5

Data warehouse

• A collection of corporate information, derived directly from operational systems and some external data sources.

• Its specific purpose is to support business decisions, not business operations.

• A data warehouse maintains its functions in three layers:

1. Staging.

2. Integration.

3. Access.

Page 5: Unit 5

• Staging is used to store raw data for use by developers (analysis and support).

• The integration layer is used to integrate data and to provide a level of abstraction between the raw data and users.

• The access layer is for getting data out for users.
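The three layers can be sketched in a few lines. This is only an illustration under stated assumptions: it uses an in-memory SQLite database, and the names staging_orders, orders and v_sales_by_region are hypothetical. A real warehouse would use dedicated tools for each layer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Staging layer: raw rows exactly as they arrive (everything is text).
conn.execute("CREATE TABLE staging_orders (region TEXT, amount TEXT)")
conn.executemany(
    "INSERT INTO staging_orders VALUES (?, ?)",
    [(" north ", "100.50"), ("SOUTH", "20"), ("north", "n/a"), ("South", "30.25")],
)

# Integration layer: cleaned, typed data (normalize region, skip bad amounts).
conn.execute("CREATE TABLE orders (region TEXT NOT NULL, amount REAL NOT NULL)")
raw_rows = conn.execute("SELECT region, amount FROM staging_orders").fetchall()
for region, amount in raw_rows:
    try:
        value = float(amount)
    except ValueError:
        continue  # unparseable amounts stay in staging only
    conn.execute("INSERT INTO orders VALUES (?, ?)", (region.strip().lower(), value))

# Access layer: a view that users query instead of the underlying tables.
conn.execute(
    "CREATE VIEW v_sales_by_region AS "
    "SELECT region, SUM(amount) AS total FROM orders GROUP BY region"
)
sales_by_region = dict(conn.execute("SELECT region, total FROM v_sales_by_region"))
print(sales_by_region)
```

Users of the access layer see only region totals; the messy staging rows and the cleaning rules stay hidden behind the view.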

Page 6: Unit 5

Purpose

• Realize the value of data:

a) Data/information is an asset.

b) Methods to realize the value (reporting, analysis, etc.).

• Make better decisions:

a) Turn data into information.

b) Create competitive advantage.

c) Methods to support the decision-making process (executive information systems (EIS), decision support systems (DSS), etc.).

Page 7: Unit 5

Data warehouse Components

• Staging Area:

- A preparatory repository where transaction data can be transformed for use in the data warehouse.

• Data Mart:

- Traditional dimensionally modeled set of dimension and fact tables.

- Per Kimball, a data warehouse is the union of a set of data marts.

• Operational Data Store (ODS):

- Modeled to support near real time reporting needs.
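To make "dimension and fact tables" concrete, here is a small sketch of a dimensionally modeled data mart. The table names (dim_product, dim_date, fact_sales) and the sample rows are hypothetical, and SQLite is used only because it runs without a server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    date_id INTEGER REFERENCES dim_date(date_id),
    quantity INTEGER,
    revenue REAL
);
INSERT INTO dim_product VALUES (1, 'pen', 'stationery'), (2, 'lamp', 'furniture');
INSERT INTO dim_date VALUES (10, 2010, 11), (11, 2010, 12);
INSERT INTO fact_sales VALUES (1, 10, 5, 10.0), (1, 11, 3, 6.0), (2, 11, 1, 40.0);
""")

# A typical data mart query: join the fact table to its dimensions and aggregate.
rows = conn.execute("""
    SELECT p.category, d.month, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    JOIN dim_date AS d ON d.date_id = f.date_id
    GROUP BY p.category, d.month
    ORDER BY p.category, d.month
""").fetchall()
print(rows)
```

The fact table holds the measurements (quantity, revenue); the dimension tables hold the descriptive attributes used to group and filter them.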

Page 8: Unit 5

Architecture

• One possible simple conceptualization of a data warehouse architecture consists of the following interconnected layers:

a) Operational Database Layer:

The source data for the data warehouse. An organization's enterprise resource planning (ERP) systems fall into this layer.

b) Data Access Layer:

The interface between the operational and informational access layers. Tools to extract, transform, and load (ETL) data into the warehouse fall into this layer.

Page 9: Unit 5

c) Metadata Layer:

I) The data directory. This is usually more detailed than an operational system data directory.

II) There are dictionaries for the entire warehouse and sometimes dictionaries for the data that can be accessed by a particular reporting and analysis tool.

Page 10: Unit 5

d) Informational Access Layer:

I) The data accessed for reporting and analysis, and the tools for reporting and analyzing data. Business intelligence tools fall into this layer.

II) The differences between the Inmon and Kimball design methodologies have to do with this layer.
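As an illustration of the metadata layer, the sketch below builds a very small data directory by asking the database about its own tables. It assumes SQLite, whose sqlite_master table and PRAGMA table_info command expose this information; the customers and orders tables are hypothetical, and other database systems expose their catalogs differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL, city TEXT)")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)")

def data_directory(connection):
    """Return {table name: [(column name, declared type), ...]} for every table."""
    tables = [row[0] for row in connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    directory = {}
    for table in tables:
        # Table names come from the catalog itself, so formatting them in is safe here.
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        columns = connection.execute(f"PRAGMA table_info({table})").fetchall()
        directory[table] = [(col[1], col[2]) for col in columns]
    return directory

directory = data_directory(conn)
for table, columns in directory.items():
    print(table, columns)
```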

Page 11: Unit 5

Evolution in organization use

• Organizations generally start off with relatively simple use of data warehousing.

• Over time, more sophisticated use of data warehousing evolves.

• The following general stages of use of the data warehouse can be distinguished:

-> Offline Operational Data Warehouse:

Data warehouses in this initial stage are developed by simply copying the data from an operational system to another server, where the processing load of reporting against the copied data does not affect the operational system's performance.

Page 12: Unit 5

-> Offline Data Warehouse:

Data warehouses at this stage are updated from data in the operational systems on a regular basis and the data warehouse data are stored in a data structure designed to facilitate reporting.

-> Real Time Data Warehouse:

Data warehouses at this stage are updated every time an operational system performs a transaction (e.g. an order or a delivery or a booking).

-> Integrated Data Warehouse:

These data warehouses assemble data from different areas of business, so users can look up the information they need across other systems.
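The difference between the offline and real-time stages can be sketched as follows. This is a toy example, assuming SQLite and hypothetical table names: a trigger stands in for the real-time update on every transaction, and a function that is called on demand stands in for the regular batch refresh.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE op_orders (id INTEGER PRIMARY KEY, amount REAL);   -- operational system
CREATE TABLE dw_offline (id INTEGER, amount REAL);              -- refreshed in batches
CREATE TABLE dw_realtime (id INTEGER, amount REAL);             -- refreshed per transaction

-- Real-time stage: every operational insert is copied immediately.
CREATE TRIGGER copy_order AFTER INSERT ON op_orders
BEGIN
    INSERT INTO dw_realtime VALUES (NEW.id, NEW.amount);
END;
""")

def refresh_offline(connection):
    """Offline stage: rebuild the warehouse copy on a schedule (here, on demand)."""
    connection.execute("DELETE FROM dw_offline")
    connection.execute("INSERT INTO dw_offline SELECT id, amount FROM op_orders")

def row_count(connection, table):
    return connection.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]

conn.execute("INSERT INTO op_orders VALUES (1, 25.0)")
refresh_offline(conn)
conn.execute("INSERT INTO op_orders VALUES (2, 40.0)")  # arrives after the last batch

offline_rows = row_count(conn, "dw_offline")
realtime_rows = row_count(conn, "dw_realtime")
print(offline_rows, realtime_rows)
```

The offline copy misses the second order until the next refresh, while the real-time copy already contains it.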

Page 13: Unit 5

The Advantages of Data Warehousing

• Data warehouses tend to have a very high query success rate, as they have complete control over the four main areas of data management systems:

-> Clean data.

-> Indexes: multiple types.

-> Query processing: multiple options.

-> Security: data and access.

• Further advantages:

-> Easy report creation.

-> Enhanced access to data and information.

Page 14: Unit 5

Disadvantages of Data Warehousing

There are considerable disadvantages involved in moving data from multiple, often highly disparate, data sources to one data warehouse that translate into long implementation time, high cost, lack of flexibility, dated information, and limited capabilities.

-> Preparation may be time consuming.

-> Compatibility with existing systems.

-> Security issues.

-> Long initial implementation time and associated high cost.

-> Limited flexibility of use and types of users.

-> Difficult to accommodate changes in data types and ranges, data source schema, indexes and queries.

Page 15: Unit 5

Data Mining

Page 16: Unit 5

Introduction

• With the increased and widespread use of technologies, interest in data mining has increased rapidly.

• Companies now use data mining techniques to examine their databases, looking for trends, relationships, and outcomes that can enhance their overall operations, and to discover new patterns that may allow them to better serve their customers.

Page 17: Unit 5

• Data mining provides numerous benefits to businesses, governments, and society, as well as to individuals.

• An ethical analysis of data mining is provided in the accompanying figure (figure not included in this transcript).

Page 18: Unit 5

What is Data Mining?

• Data Mining, also known as Knowledge-Discovery in Databases (KDD), is the process of automatically searching large volumes of data for patterns.

• Data Mining applies many older computational techniques from statistics, machine learning and pattern recognition.

Page 19: Unit 5

• With data mining, the information obtained from banks, credit card companies, and department stores can be put to good use.

Page 20: Unit 5

Data mining consists of five major elements:

• Extract, transform, and load transaction data onto the data warehouse system.

• Store and manage the data in a multidimensional database system.

• Provide data access to business analysts and information technology professionals.

• Analyze the data by application software.

• Present the data in a useful format, such as a graph or table.

Page 21: Unit 5

Data Mining Goal

• The ultimate goal of data mining is prediction. Predictive data mining is the most common type of data mining and the one with the most direct business applications.

Page 22: Unit 5

How Data mining works

• Data mining is a component of a wider process called “knowledge discovery in databases”.

• It involves scientists and statisticians, as well as those working in other fields such as machine learning, artificial intelligence, information retrieval and pattern recognition.

• How data mining can assist bankers in enhancing their businesses is illustrated in this example.

Records that include information such as the age, sex, marital status, occupation, and number of children of the bank’s customers over the years are used in the mining process.

Page 23: Unit 5

3-Step Data Mining Process

• Stage 1: Exploration. This stage usually starts with data preparation, which may involve cleaning data, data transformations, and selecting subsets of records.

• Stage 2: Model building and validation. This stage involves considering various models and choosing the best one based on their predictive performance.

• Stage 3: Deployment. This final stage involves using the model selected as best in the previous stage and applying it to new data in order to generate predictions or estimates of the expected outcome.
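The three stages can be walked through on a toy data set. Everything in this sketch is an assumption made for illustration: the customer records are invented, the two candidate models (a constant average and a straight-line fit) are deliberately simple, and validation uses a single held-out slice rather than a proper procedure.

```python
from statistics import mean

# Hypothetical records: (age, yearly card spend); None marks a missing value.
raw = [(25, 1200.0), (30, 1500.0), (None, 900.0), (35, 1800.0),
       (40, 2100.0), (45, 2400.0), (50, None), (55, 3000.0)]

# Stage 1: exploration and preparation (drop incomplete records, split the data).
clean = [(age, spend) for age, spend in raw if age is not None and spend is not None]
train, holdout = clean[:4], clean[4:]

# Stage 2: build candidate models and validate them on the held-out records.
def fit_mean(data):
    avg = mean(spend for _, spend in data)
    return lambda age: avg

def fit_line(data):
    xs = [age for age, _ in data]
    ys = [spend for _, spend in data]
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in data)
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    return lambda age: y_bar + slope * (age - x_bar)

def mean_abs_error(model, data):
    return mean(abs(model(age) - spend) for age, spend in data)

candidates = {"mean": fit_mean(train), "line": fit_line(train)}
errors = {name: mean_abs_error(model, holdout) for name, model in candidates.items()}
best_name = min(errors, key=errors.get)

# Stage 3: deployment (apply the selected model to a new customer aged 60).
prediction = candidates[best_name](60)
print(best_name, prediction)
```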

Page 24: Unit 5

Tools used for data mining

• Artificial neural networks - Non-linear predictive models that learn through training and resemble biological neural networks in structure.

• Decision trees - Tree-shaped structures that represent sets of decisions. These decisions generate rules for the classification of a dataset.

• Genetic algorithms - Optimization techniques based on the concepts of genetic combination, mutation, and natural selection.
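Of the tools listed above, a decision tree is the easiest to show in a few lines. The sketch below learns a one-level tree (often called a decision stump), which is the smallest possible case and not a full tree-building algorithm. The loan records are invented for illustration.

```python
def best_stump(rows):
    """Find the single rule `value <= threshold -> left_label` with the fewest errors.

    rows is a list of (numeric value, label) pairs with labels True/False.
    Returns (errors, threshold, left_label).
    """
    best = None
    for threshold in sorted({value for value, _ in rows}):
        for left_label in (True, False):
            errors = sum(
                1 for value, label in rows
                if (left_label if value <= threshold else not left_label) != label
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, left_label)
    return best

# Hypothetical loan data: (yearly income in thousands, loan repaid?).
loans = [(18, False), (22, False), (25, False), (31, True), (40, True), (52, True)]
errors, threshold, left_label = best_stump(loans)
print(f"if income <= {threshold}: repaid = {left_label}, else repaid = {not left_label}")
```

The learned rule is exactly the kind of classification rule the slide refers to. A full decision tree repeats this split search inside each branch.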

Page 25: Unit 5

Reasons for the growing popularity of Data Mining

• Growing Data Volume.

• Limitations of Human Analysis.

• Low Cost of Machine Learning.

Page 26: Unit 5

ADVANTAGES OF DATA MINING

• Marketing/Retailing: Data mining can aid direct marketers by providing them with useful and accurate trends about their customers’ purchasing behavior.

• Banking/Crediting: Data mining can assist financial institutions in areas such as credit reporting and loan information.

Page 27: Unit 5

DISADVANTAGES OF DATA MINING

• Privacy issues: For example, according to The Washington Post, in 1998 CVS sold its patients’ prescription purchase records to a different company. American Express also sold its customers’ credit card purchase records to another company.

• Security issues.

• Misuse of information.

Page 28: Unit 5

Data Definition language

Page 29: Unit 5

• Data Definition Language (DDL) is a standard for commands that define the different structures in a database.

• DDL statements create, modify, and remove database objects such as tables, indexes, and users. Common DDL statements are:

-> CREATE

-> ALTER

-> DROP

Page 30: Unit 5

Create

• Create - To make a new database, table, index, or stored query.

• A CREATE statement in SQL creates an object inside a relational database management system (RDBMS).

• The types of objects that can be created depend on which RDBMS is being used, but most support the creation of tables, indexes, users, synonyms and databases.

Syntax:

CREATE [TEMPORARY] TABLE [table name] ( [column definitions] ) [table parameters]

Page 31: Unit 5

Example

CREATE TABLE employees (
    id INTEGER PRIMARY KEY,
    first_name CHAR(50) NULL,
    last_name CHAR(75) NOT NULL,
    date_of_birth DATE NULL
);
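The CREATE TABLE example can be tried out directly. The sketch below runs it against an in-memory SQLite database (an assumption made so that it runs without a server; the column is written as date_of_birth because a column name cannot contain spaces unless quoted) and shows what NULL and NOT NULL mean in practice. The inserted names are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employees (
        id INTEGER PRIMARY KEY,
        first_name CHAR(50) NULL,
        last_name CHAR(75) NOT NULL,
        date_of_birth DATE NULL
    )
""")

# first_name may be omitted because the column is nullable.
conn.execute("INSERT INTO employees (id, last_name) VALUES (1, 'Raj')")

# last_name is NOT NULL, so leaving it out is rejected.
try:
    conn.execute("INSERT INTO employees (id, first_name) VALUES (2, 'Asha')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM employees").fetchone()[0]
print(rejected, row_count)
```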

Page 32: Unit 5

Drop

• Drop - To destroy an existing database, table, index, or view.

• Syntax:

DROP object_type object_name

Example:

• The command to drop a table named employees would be:

DROP TABLE employees;

• The DROP statement is distinct from the DELETE and TRUNCATE statements, in that DELETE and TRUNCATE do not remove the table itself; they only remove its rows.
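The difference between removing rows and removing the table can be checked with a short experiment. The sketch assumes SQLite, which has no TRUNCATE statement, so only DELETE and DROP are compared. The table_exists helper is written for this example and is not part of any library.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, last_name TEXT NOT NULL)")
conn.execute("INSERT INTO employees VALUES (1, 'Raj')")

def table_exists(connection, name):
    query = "SELECT COUNT(*) FROM sqlite_master WHERE type = 'table' AND name = ?"
    return connection.execute(query, (name,)).fetchone()[0] == 1

# DELETE removes the rows; the (now empty) table is still there.
conn.execute("DELETE FROM employees")
exists_after_delete = table_exists(conn, "employees")

# DROP removes the table itself.
conn.execute("DROP TABLE employees")
exists_after_drop = table_exists(conn, "employees")
print(exists_after_delete, exists_after_drop)
```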

Page 33: Unit 5

Alter statement

• Alter - To modify an existing database object.

• Syntax:

ALTER object_type object_name parameters

Example:

• The command to add (then remove) a column named bubbles for an existing table named sink would be:

ALTER TABLE sink ADD bubbles INTEGER;

ALTER TABLE sink DROP COLUMN bubbles;
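The two ALTER TABLE statements can be run as written. The sketch assumes SQLite, where DROP COLUMN is only available from version 3.35.0 onward, so that step is guarded. The column_names helper is written for this example.

```python
import sqlite3

def column_names(connection, table):
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    return [row[1] for row in connection.execute(f"PRAGMA table_info({table})")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sink (id INTEGER PRIMARY KEY)")

conn.execute("ALTER TABLE sink ADD bubbles INTEGER")
after_add = column_names(conn, "sink")

# DROP COLUMN needs SQLite 3.35.0 or newer, so it is skipped on older versions.
if sqlite3.sqlite_version_info >= (3, 35, 0):
    conn.execute("ALTER TABLE sink DROP COLUMN bubbles")
after_drop = column_names(conn, "sink")
print(after_add, after_drop)
```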

Page 34: Unit 5

Data Control Language

Page 35: Unit 5

Introduction

• A Data Control Language (DCL) is the part of Structured Query Language (SQL) used to control access to data in a database.

• For any in-house or outsourced software development process, Data Control Language plays the critical role of setting up the authorization mechanism of the database.

DCL Commands:

• GRANT: To grant users the right to access the database or perform certain tasks.

• REVOKE: To cancel any previously granted or denied permission.

Page 36: Unit 5

The GRANT and REVOKE commands can be used with the following privileges for a database user:

• INSERT (To insert records)

• SELECT (To return a result set from a table)

• DELETE (To delete specific records)

• EXECUTE (To execute a procedure)

• USAGE (Meaning varies by RDBMS; in MySQL, for example, it stands for "no privileges")

• CONNECT (To set a database connection)

• UPDATE (To modify a record in a table)

Page 37: Unit 5

• GRANT command:

The GRANT command sets permissions (for INSERT, SELECT, UPDATE etc.) on any specific or all tables to different users using the database.

Syntax:

GRANT {ALL/SPECIFIC PERMISSIONS} ON {TABLENAME} TO {USER ACCOUNT} [WITH GRANT OPTION]

Example:

In a real-world database-driven environment (such as an offshore software development company), a user who receives permission on a specific object (such as a table) from another user WITH GRANT OPTION can in turn grant permission on that same table to other users, acting as if they were the owner.

Page 38: Unit 5

REVOKE Command:

The REVOKE command revokes or cancels permissions (for INSERT, SELECT, UPDATE etc.) on any specific or all tables from different users using the database.

Syntax:

REVOKE {ALL/SPECIFIC PERMISSIONS} ON {TABLENAME} FROM {USER ACCOUNT} [CASCADE]

The CASCADE option revokes the permission from the user and also from any other users to whom that user had granted it.

Wherever the software is developed, correct permission settings through DCL are of paramount importance.
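How WITH GRANT OPTION and CASCADE interact can be shown with a toy model. This is not how any real RDBMS stores privileges: the PrivilegeTable class and the user names are invented for this example, it tracks a single privilege on a single table, and it assumes each user has exactly one grantor.

```python
class PrivilegeTable:
    """Toy model of GRANT ... WITH GRANT OPTION and REVOKE ... CASCADE."""

    def __init__(self, owner):
        # user -> (grantor, may_grant_to_others); the owner has no grantor.
        self.grants = {owner: (None, True)}

    def grant(self, grantor, user, with_grant_option=False):
        if grantor not in self.grants or not self.grants[grantor][1]:
            raise PermissionError(f"{grantor} may not grant this privilege")
        self.grants[user] = (grantor, with_grant_option)

    def revoke_cascade(self, user):
        # Remove the user's privilege, then everything that user granted onward.
        self.grants.pop(user, None)
        dependents = [u for u, (grantor, _) in self.grants.items() if grantor == user]
        for dependent in dependents:
            self.revoke_cascade(dependent)

    def has_privilege(self, user):
        return user in self.grants

select_on_employees = PrivilegeTable(owner="dba")
select_on_employees.grant("dba", "alice", with_grant_option=True)
select_on_employees.grant("alice", "bob")        # allowed: alice holds the grant option
try:
    select_on_employees.grant("bob", "carol")    # refused: bob lacks the grant option
    carol_granted = True
except PermissionError:
    carol_granted = False
select_on_employees.revoke_cascade("alice")      # also removes bob's privilege
print(carol_granted, sorted(select_on_employees.grants))
```

After the cascading revoke only the owner still holds the privilege, because bob's access depended on alice's.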

Page 39: Unit 5

Data Manipulation Language

Page 40: Unit 5

Introduction:

• Data Manipulation Language (DML) is a family of computer languages used by computer programs and/or database users to insert, delete and update data in a database.

• Read-only querying, i.e. SELECT, of this data may be considered to be either part of DML or outside it, depending on the context.

• Currently the most popular data manipulation language is that of SQL, which is used to retrieve and manipulate data in a relational database.

Page 41: Unit 5

DML Commands

• INSERT.

• SELECT.

• UPDATE.

• DELETE.

INSERT:

The INSERT command in SQL is used to add records to an existing table.

Consider a personal_info table that holds employee records, and imagine that our HR department needs to add a new employee to the database. They could use a command similar to the one shown below:

INSERT INTO personal_info VALUES ('bart', 'simpson', 12345, 45000)
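When an INSERT is issued from a program, the values are usually passed separately from the SQL text. The sketch below shows this with SQLite placeholders. The column names first_name and last_name are assumptions, since the slides name only employee_id and salary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE personal_info ("
    "first_name TEXT, last_name TEXT, employee_id INTEGER PRIMARY KEY, salary REAL)"
)

# Placeholders (?) keep the values separate from the SQL text.
new_employee = ("bart", "simpson", 12345, 45000)
conn.execute("INSERT INTO personal_info VALUES (?, ?, ?, ?)", new_employee)

inserted = conn.execute("SELECT * FROM personal_info").fetchall()
print(inserted)
```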

Page 42: Unit 5

SELECT:

The SELECT command is the most commonly used command in SQL. It allows database users to retrieve the specific information they desire from an operational database.

SELECT * FROM personal_info

This literally means "Select everything from the personal_info table."

UPDATE:

The UPDATE command can be used to modify information contained within a table, either in bulk or individually. Each year, our company gives all employees a 3% cost-of-living increase in their salary. The following SQL command could be used to quickly apply this to all of the employees stored in the database:

UPDATE personal_info SET salary = salary * 1.03
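The bulk UPDATE can be verified with a SELECT afterwards. The sketch assumes SQLite and a cut-down personal_info table with two invented salary rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE personal_info (employee_id INTEGER PRIMARY KEY, salary REAL)")
conn.executemany(
    "INSERT INTO personal_info VALUES (?, ?)",
    [(12345, 45000), (12346, 50000)],
)

# No WHERE clause: the raise applies to every row in the table.
cursor = conn.execute("UPDATE personal_info SET salary = salary * 1.03")
rows_changed = cursor.rowcount

salaries = dict(conn.execute("SELECT employee_id, salary FROM personal_info"))
print(rows_changed, salaries)
```

Adding a WHERE clause to the UPDATE would restrict the raise to the matching employees only.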

Page 43: Unit 5

DELETE:

The DELETE command removes records from a table. Used with a WHERE clause, it can remove a single employee's record from the personal_info table:

DELETE FROM personal_info WHERE employee_id = 12345
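The effect of the WHERE clause is easiest to see by running DELETE with and without it. The sketch assumes SQLite and two invented employee rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE personal_info (employee_id INTEGER PRIMARY KEY, last_name TEXT)")
conn.executemany(
    "INSERT INTO personal_info VALUES (?, ?)",
    [(12345, "simpson"), (12346, "flanders")],
)

# With WHERE, only the matching row is removed.
conn.execute("DELETE FROM personal_info WHERE employee_id = ?", (12345,))
remaining = [row[0] for row in conn.execute("SELECT employee_id FROM personal_info")]

# Without WHERE, every row is removed (the table itself stays).
conn.execute("DELETE FROM personal_info")
remaining_after_full_delete = conn.execute(
    "SELECT COUNT(*) FROM personal_info"
).fetchone()[0]
print(remaining, remaining_after_full_delete)
```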

Page 44: Unit 5

Overview

• ???

Thank You
