IT420: Database Management and Organization Adina Crăiniceanu adina@usna.edu

Dec 25, 2015

Page 1: IT420: Database Management and Organization Adina Crăiniceanu adina@usna.edu.

IT420: Database Management and Organization

Adina Crăiniceanu

adina@usna.edu


Instructor

Adina Crăiniceanu

M.S. and Ph.D. Cornell University
Area of Specialization: Databases
Research: search in peer-to-peer systems


Database Management and Organization

How does Wal-Mart manage its 200 TB data warehouse?
What is the database technology behind eBay's website?
How do you build an Oracle 9i, IBM DB2 or Microsoft SQL Server database?


Course Goals

Understand the functionality of modern database systems

Understand where database systems fit into an enterprise data management infrastructure

Design and build data-driven applications and websites

Learn several important technologies: SQL, PHP, XML, XQuery, web services


Course Workload

Labs + Lectures

Grade:
25%: Final Exam
30%: 6-Week and 12-Week Exams
20%: Homeworks, Labs, Quizzes
20%: Projects
5%: Class Participation


Evaluation Policies

Assignments: No late submissions

Exams: comprehensive, closed book / closed notes

Re-grade requests: up to 7 days after grade


Academic Integrity - Honor

Honor Concept of the Brigade of Midshipmen
Policies Concerning Graded Academic Work
USNA CS: http://www.cs.usna.edu/academics/honor.htm
Collaboration on homeworks is possible, but submitted work should be your own. Cite any assistance, from any sources.
Collaboration on projects, exams, quizzes is prohibited


Resources

Textbook: Database Processing by David Kroenke
Database Management Systems by R. Ramakrishnan and J. Gehrke
MySQL/PHP Database Applications by B. Bulger
Microsoft Access reference book
Lecture slides
Course website: www.cs.usna.edu/~adina/teaching/it420spring2006


Classroom

No food permitted in classroom
No use of computer equipment for any purpose other than as outlined in the class activity


Course Topics

Database design
Relational model
SQL
Normalization
Database administration
PHP, MySQL
XML
Three-tier concepts


Database Management Systems (DBMS)

Information is one of the most valuable resources in this information age
How do we effectively and efficiently manage this information?
Relational database management systems
Dominant data management paradigm today
A 6-billion-dollar-a-year industry!


Why not Files?


Classes

string JobName
class Contractor
class Equipment
class Date
double charge


‘Query Processing’

class Rental {
public:
    string job;
    Contractor Con_data;
    Equipment Equip_data;
    Date rent_data;
    double charge;
};

Q: All jobs with charge > x?
A:
Rental allRentals[10];

// Scan every rental and print those whose charge exceeds x.
chargesGreaterThan(double x) {
    for (i…) {
        if (allRentals[i].charge > x)
            cout << …
    }
}


Problems

Changes to Data
Data inconsistencies
Access Control
Security of information (views)
Loss of info due to deletion
“on the fly” Queries?


Why Database Management Systems?

Benefits
Transactions (concurrent data access, recovery from system crashes)
High-level abstractions for data access, manipulation, and administration
Data integrity and security
Performance and scalability


What is a Transaction?

The execution of a program that performs a function by accessing a database.

Examples:
Reserve an airline seat.
Buy an airline ticket.
Withdraw money from an ATM.
Verify a credit card sale.
Order an item from an Internet retailer.


Transactions

A transaction is an atomic sequence of actions
Each transaction must leave the system in a consistent state (if system is consistent when the transaction starts).
The ACID Properties:
Atomicity
Consistency
Isolation
Durability


Example Transaction: Online Store

Your purchase transaction:
Atomicity: Either the complete purchase happens, or nothing
Consistency: The inventory and internal accounts are updated correctly
Isolation: It does not matter whether other customers are also currently making a purchase
Durability: Once you have received the order confirmation number, your order information is permanent, even if the site crashes


Transactions (cont.)

A transaction will commit after completing all its actions, or it could abort (or be aborted by the DBMS) after executing some actions.


Example Transactions: ATM

You withdraw money from the ATM
Atomicity
Consistency
Isolation
Durability

Commit versus Abort?
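
Commit versus abort for the ATM withdrawal can be tried out with a small sketch like the one below. It uses Python's built-in sqlite3 module; the accounts and withdrawals tables, the withdraw function, and the starting balance are assumptions made up for illustration, not part of the course material.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: the CHECK keeps a balance from going negative.
conn.execute(
    "CREATE TABLE accounts ("
    " id INTEGER PRIMARY KEY,"
    " balance REAL NOT NULL CHECK (balance >= 0))"
)
conn.execute("CREATE TABLE withdrawals (account_id INTEGER, amount REAL)")
conn.execute("INSERT INTO accounts VALUES (1, 100.0)")
conn.commit()


def withdraw(conn, account_id, amount):
    """Log the withdrawal and debit the account as one transaction."""
    try:
        with conn:  # commits if the block succeeds, rolls back on an exception
            conn.execute(
                "INSERT INTO withdrawals VALUES (?, ?)", (account_id, amount)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, account_id),
            )
        return True   # committed
    except sqlite3.IntegrityError:
        return False  # aborted: neither statement left a trace


ok = withdraw(conn, 1, 30.0)         # enough money: the transaction commits
rejected = withdraw(conn, 1, 500.0)  # overdraft: the transaction aborts
balance = conn.execute("SELECT balance FROM accounts WHERE id = 1").fetchone()[0]
logged = conn.execute("SELECT COUNT(*) FROM withdrawals").fetchone()[0]
```

After the aborted withdrawal the log row inserted by its first statement is gone as well, which is the atomicity property: either both changes happen or neither does.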


What Makes Transaction Processing Hard?

Reliability - system should rarely fail
Availability - system must be up all the time
Response time - within a few seconds
Throughput - thousands of transactions/second
Scalability - start small, ramp up to Internet-scale
Security - for confidentiality and high finance
Configurability - for above requirements + low cost
Atomicity - no partial results
Durability - a transaction is a legal contract
Distribution - of users and data


What Makes TP Important?

It is at the core of electronic commerce
Most medium-to-large businesses use TP for their production systems.
It is a huge slice of the computer system market


Why Database Management Systems?

Benefits
Transactions (concurrent data access, recovery from system crashes)
High-level abstractions for data access, manipulation, and administration
Data integrity and security
Performance and scalability


Data Model

A data model is a collection of concepts for describing data.

Examples:
ER model (used for conceptual modeling)
Relational model, object-oriented model, object-relational model (actually implemented in current DBMS)


The Relational Data Model

A relational database is a set of relations.
Turing Award (“Nobel Prize” in CS) for Codd in 1981
Example relation:
Student(cid: integer, name: string, byear: integer, state: string)
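
As a rough illustration, the Student relation can be declared and filled using Python's built-in sqlite3 module. The sample rows are invented, and mapping "string" to SQLite's TEXT type is an assumption of this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Schema: the relation name plus the name and type of each field.
conn.execute(
    "CREATE TABLE Student ("
    " cid INTEGER, name TEXT, byear INTEGER, state TEXT)"
)

# Instance: the set of records currently stored (sample data is made up).
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?, ?)",
    [(1, "Jones", 1984, "MD"), (2, "Smith", 1985, "NY")],
)

# PRAGMA table_info reports one row per column; fields 1 and 2 are name and type.
schema = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(Student)")]
instance = conn.execute("SELECT * FROM Student ORDER BY cid").fetchall()
```

Here schema describes the shape of the relation and stays fixed, while instance changes every time a record is inserted, updated, or deleted.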


The Relational Model: Terminology

Relation instance and schema (table)
Field (column)
Record or tuple (row)
Primary key
Foreign key


The Object-Oriented Data Model

Richer data model.
Goal: Bridge mismatch between programming languages and the database system.
Example components of the data model: Relationships between objects directly as pointers.
Result: Can store abstract data types directly in the DBMS
Pictures
Geographic coordinates
Movies
CAD objects


Object-Oriented DBMS

Advantages: Engineering applications (CAD, CAM, and CASE, i.e. computer-aided software engineering), multimedia applications.
Disadvantages: Querying is much harder


Object-Relational DBMS

Mixture between the object-oriented and the relational data model
Combines ease of querying with ability to store abstract data types
Conceptually, the relational model, but every field
All major relational vendors are currently extending their relational DBMS to the object-relational model


Query Languages

We need a high-level language to describe and manipulate the data

Requirements:
Precise semantics
Easy integration into applications written in C++/Java/Visual Basic/etc.
Easy to learn
DBMS needs to be able to efficiently evaluate queries written in the language


SQL: Structured Query Language

Developed by IBM (System R) in the 1970s

ANSI standard since 1986:
SQL-86
SQL-89 (minor revision)
SQL-92 (major revision, current standard)
SQL-99 (major extensions)

More about SQL in later lectures


Example Query

SELECT Customers.cid, Customers.name, Customers.byear, Customers.state
FROM Customers
WHERE Customers.cid = 3
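
The example query can be run end to end with Python's built-in sqlite3 module, as in the sketch below. The Customers table definition and its three sample rows are assumptions; only the query itself comes from the slide, with the constant 3 passed as a parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table matching the columns named in the query.
conn.execute(
    "CREATE TABLE Customers ("
    " cid INTEGER PRIMARY KEY, name TEXT, byear INTEGER, state TEXT)"
)
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?, ?, ?)",
    [
        (1, "Jones", 1984, "MD"),
        (2, "Smith", 1985, "NY"),
        (3, "Lee", 1980, "VA"),
    ],
)

query = """
SELECT Customers.cid, Customers.name, Customers.byear, Customers.state
FROM Customers
WHERE Customers.cid = ?
"""
# The ? placeholder keeps the value separate from the SQL text.
rows = conn.execute(query, (3,)).fetchall()
```

The query says what data is wanted, not how to find it; the DBMS decides whether to scan the table or use the primary key.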


Why Database Management Systems?

Benefits
Transactions (concurrent data access, recovery from system crashes)
High-level abstractions for data access, manipulation, and administration
Data integrity and security
Performance and scalability


Integrity Constraints

Integrity Constraints (ICs): Condition that must be true for any instance of the database.
ICs are specified when schema is defined.
ICs are checked when relations are modified.
A legal instance of a relation is one that satisfies all specified ICs.
DBMS should only allow legal instances.
Example: Domain constraints.
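
A minimal sketch of ICs in action, using Python's built-in sqlite3 module. The particular constraints (a year range for byear, a two-letter state, a primary key on cid) and the try_insert helper are assumptions chosen for illustration; SQLite expresses domain restrictions through CHECK clauses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# ICs are specified when the schema is defined.
conn.execute(
    "CREATE TABLE Student ("
    " cid INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " byear INTEGER CHECK (byear BETWEEN 1900 AND 2100),"
    " state TEXT CHECK (length(state) = 2))"
)


def try_insert(row):
    """Attempt one insert; the DBMS checks every IC before accepting it."""
    try:
        with conn:
            conn.execute("INSERT INTO Student VALUES (?, ?, ?, ?)", row)
        return "accepted"
    except sqlite3.IntegrityError as err:
        return f"rejected: {err}"


results = [
    try_insert((1, "Jones", 1984, "MD")),  # legal record
    try_insert((2, "Smith", 984, "NY")),   # byear outside its domain
    try_insert((1, "Brown", 1985, "VA")),  # duplicate primary key
]
stored = conn.execute("SELECT COUNT(*) FROM Student").fetchone()[0]
```

Only the first record is stored, so the table never leaves the set of legal instances.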


Security

Secrecy: Users should not be able to see things they are not supposed to. E.g., A student can’t see other students’ grades.
Integrity: Users should not be able to modify things they are not supposed to. E.g., Only instructors can assign grades.
Availability: Users should be able to see and modify things they are allowed to.


Why Database Management Systems?

Benefits
Transactions (concurrent data access, recovery from system crashes)
High-level abstractions for data access, manipulation, and administration
Data integrity and security
Performance and scalability


DBMS and Performance

Efficient implementation of all database operations
Indexes
Query optimization
Automatic high-performance concurrent query execution, query parallelization
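
To see an index change what the optimizer does, one option is to compare query plans before and after creating it. The sketch below uses Python's built-in sqlite3 module and SQLite's EXPLAIN QUERY PLAN statement; the Customers data, the index name idx_customers_byear, and the plan helper are assumptions for illustration, and the exact plan wording differs between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customers ("
    " cid INTEGER PRIMARY KEY, name TEXT, byear INTEGER, state TEXT)"
)
# Made-up data: 1000 customers with birth years spread over 50 values.
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?, ?, ?)",
    [(i, f"name{i}", 1950 + i % 50, "MD" if i % 2 else "NY") for i in range(1000)],
)


def plan(sql):
    """Return the optimizer's plan; the last column of each row is the description."""
    return " | ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


sql = "SELECT name FROM Customers WHERE byear = 1984"
before = plan(sql)  # without an index the whole table is scanned
conn.execute("CREATE INDEX idx_customers_byear ON Customers (byear)")
after = plan(sql)   # with the index the optimizer searches it instead
```

The query text is identical in both cases; only the access path chosen by the DBMS changes, which is why indexes can be added or dropped without touching application code.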


Summary Of DBMS Benefits

Transactions: ACID properties, concurrency control, recovery
High-level abstractions for data access: Data models
Data integrity and security: Key constraints, foreign key constraints, access control
Performance and scalability: Parallel DBMS, distributed DBMS, performance tuning


The Three-Tier Architecture

Presentation tier: Client Program (Web Browser)
Middle tier: Application Server
Data management tier: Database Management System


Presentation Tier

Primary interface to the user
Needs to adapt to different display devices (PC, PDA, cell phone, voice access?)


Middle Tier

Application Programs:
Create and process forms
Create and transmit queries
Create and process reports
Execute application logic: implement complex actions, maintain state between different steps of a workflow
Access different data management systems


Database Management Tier

One or more standard database management systems: Oracle, DB2, SQL Server, MySQL


Example 1: Airline reservations

Build a system for making airline reservations

Database System

Application Server

Client Program


Example 1: Airline reservations

Build a system for making airline reservations
Database System: Airline info, available seats, customer info, etc.
Application Server: Logic to make reservations, cancel reservations, add new airlines, etc.
Client Program: Log in different users, display forms and human-readable output
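
The split of responsibilities can be sketched in a few lines, with each tier reduced to one function. This is only a toy under stated assumptions: the flights and reservations tables, the flight number XY100, and the function names are invented, and a real system would run the tiers as separate programs rather than in one script. It uses Python's built-in sqlite3 module.

```python
import sqlite3


# Data management tier: stores flight and reservation data.
def open_database():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE flights ("
        " flight_no TEXT PRIMARY KEY, seats_left INTEGER NOT NULL)"
    )
    conn.execute("CREATE TABLE reservations (flight_no TEXT, customer TEXT)")
    conn.execute("INSERT INTO flights VALUES ('XY100', 1)")  # one seat left
    conn.commit()
    return conn


# Middle tier: application logic for making a reservation.
def make_reservation(conn, flight_no, customer):
    with conn:  # one transaction per reservation attempt
        row = conn.execute(
            "SELECT seats_left FROM flights WHERE flight_no = ?", (flight_no,)
        ).fetchone()
        if row is None or row[0] == 0:
            return False
        conn.execute(
            "UPDATE flights SET seats_left = seats_left - 1 WHERE flight_no = ?",
            (flight_no,),
        )
        conn.execute(
            "INSERT INTO reservations VALUES (?, ?)", (flight_no, customer)
        )
        return True


# Presentation tier: turns the result into human-readable output.
def show_result(customer, flight_no, reserved):
    if reserved:
        return f"{customer}: seat reserved on flight {flight_no}"
    return f"{customer}: flight {flight_no} is full or unknown"


conn = open_database()
messages = [
    show_result(name, "XY100", make_reservation(conn, "XY100", name))
    for name in ("Ann", "Bob")
]
booked = conn.execute("SELECT COUNT(*) FROM reservations").fetchone()[0]
```

Only make_reservation knows the business rule (no seat, no booking) and only show_result knows how output looks, so either can be replaced without touching the other.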


Three-Tier Architecture: Advantages

Heterogeneous systems: Tiers can be independently maintained, modified, and replaced
Thin clients: Only presentation layer at clients (web browsers)
Integrated data access: Several database systems can be handled transparently at the middle tier; central management of connections
Scalability: Replication at middle tier permits scalability of business logic
Software development: Code for business logic is centralized; interaction between tiers through well-defined APIs; can reuse standard components at each tier


Technologies

Client Program (Web Browser): HTML, Javascript, XSLT
Application Server: XML, C#, Cookies, XPath, web services
Database Management System: SQL, Stored Procedures


Next: Microsoft Access

DBMS + Application Server


Relational DB => “relate tables”

Tables are related by “keys” which uniquely identify a record in a table
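
How keys relate tables can be shown with two small tables and a join. The sketch below uses Python's built-in sqlite3 module; the Customers and Orders tables and their rows are assumptions made up for illustration. Note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

# cid is the primary key: it uniquely identifies a record in Customers.
conn.execute("CREATE TABLE Customers (cid INTEGER PRIMARY KEY, name TEXT)")
# Orders.cid is a foreign key: it must match some Customers.cid.
conn.execute(
    "CREATE TABLE Orders ("
    " oid INTEGER PRIMARY KEY,"
    " cid INTEGER NOT NULL REFERENCES Customers(cid),"
    " item TEXT)"
)
conn.executemany("INSERT INTO Customers VALUES (?, ?)", [(1, "Jones"), (2, "Smith")])
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?)",
    [(10, 1, "book"), (11, 2, "pen"), (12, 1, "lamp")],
)
conn.commit()

# The join follows the key from each order to its customer.
joined = conn.execute(
    "SELECT Customers.name, Orders.item"
    " FROM Orders JOIN Customers ON Orders.cid = Customers.cid"
    " ORDER BY Orders.oid"
).fetchall()

# An order pointing at a customer that does not exist is rejected.
try:
    conn.execute("INSERT INTO Orders VALUES (13, 99, 'desk')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    conn.rollback()
    orphan_rejected = True
```

The customer name is stored once in Customers and reached through the key, rather than being copied into every order.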