Top Banner
1 Data Models How to structure data
42

Data Models

Dec 31, 2015

Download

Documents

Tobias Richard

Data Models. How to structure data. What is a Data Model?. Having formed a model of the enterprise, we now need to represent the data. The data model tells us the structure of the database. Historically, three data models: Hierarchical data model Network data model Relational data model. - PowerPoint PPT Presentation
Welcome message from author
This document is posted to help you gain knowledge. Please leave a comment to let me know what you think about it! Share it to your friends and learn new things together.
Transcript
Page 1: Data Models

1

Data Models

How to structure data

Page 2: Data Models

2

What is a Data Model?

Having formed a model of the enterprise, we now need to represent the data.

The data model tells us the structure of the database.

Historically, three data models: Hierarchical data model Network data model Relational data model

Page 3: Data Models

3

Hierarchical and Network Data Models

Hierarchical and network data models have been superseded by the relational data model.

Reasons: Lack of expressive power

E.g., one cannot express many-to-many relationships in the hierarchical model

More closely tied to the underlying implementation. Hence, less data independence.

Relational data model has a clean mathematical basis.

Page 4: Data Models

4

The Relational Model

Due to Codd. Everything is represented as a

relation in the mathematical sense. Also called tables.

A database therefore is a collection of tables, each of which has a unique name, and each of which is described by a schema.

In addition, Codd defined a data manipulation language.

Page 5: Data Models

5

Example of Schemas in the Relational Model

Example of a representation of entity sets:Student(sid,name,addr)

Course(cid,title,eid)

Empl(eid, ename, deptid)

Dept(deptid, dname, loc) Primary keys are underlined. Recall that a primary key is one

that uniquely identifies an entity. An entity is a row in a table.

Page 6: Data Models

6

More Example Schemas

Relationship sets between entity sets are also represented in tables.

Example of a table corresponding to a relationship:Enrol(sid, cid, grade)

Again, a relationship is represented by a row (or a tuple) in a relation.

Page 7: Data Models

7

Relational Databases:Basic Concepts I

Attribute: A column in a table

Domain The set of values from which the

values of an attribute are drawn. Null value

A special value, meaning “not known” or “not applicable”.

Relation schema A set of attribute names

Page 8: Data Models

8

Relational Databases: Basic Concepts II

Tuple A set of values, one for each

attribute in the relation scheme over which the tuple is defined, i.e. a mapping from attributes to the appropriate domains

Relation instance A set of tuples over the scheme

of the relation

Page 9: Data Models

9

Relational Databases:Basic Concepts III

Relational Database A set of relations, each with a

unique name Normalized Relation

A relation in which every value is atomic (non-decomposable). Hence, every attribute in every tuple has a single value.

Page 10: Data Models

10

Keys

Candidate Key A minimal set of attributes that

uniquely identifies a tuple Primary Key

The candidate key chosen as the identifying key of the relation

Alternate Key Candidate keys which are not

primary keys

Page 11: Data Models

11

Foreign Key An attribute (or set of attributes)

in table R1 which also occurs as the primary key of relation R2.

R2 is called the referenced relation.

Foreign keys are also called connection keys or reference attributes.

Page 12: Data Models

12

Integrity Rules: Entity Constraint

Entity constraint All attributes in a primary key

must be non-null. Motivation: If the primary key

uniquely identifies an entity in an entity set, then we must ensure that we have all the relevant information

Page 13: Data Models

13

Integrity Rules:Referential Integrity

Referential integrity A database cannot contain a

tuple with a value for a foreign key that does not match a primary key value in the referenced relation.

Or, a foreign key must refer to a tuple that exists.

Motivation: If referential integrity were violated, we could have relationships between entities that we do not have any information about.

Page 14: Data Models

14

Data Manipulation Languages

In order for a database to be useful, it should be possible to store and retrieve information from it. This is the role of the data manipulation language.

One of the attractions of the relational data model is that it comes with a well-defined data manipulation language.

Page 15: Data Models

15

Types of DML

Two types of data manipulation languages Navigational (procedural)

The query specifies (to some extent) the strategy used to find the desired result e.g. relational algebra.

Non-navigational(non-procedural)

The query only specifies what data is wanted, not how to find it e.g. relational calculus.

Page 16: Data Models

16

Relational Algebra

Codd defined a number of algebraic operations for the relational model.

Unary operations take as input a single table and produce as output another table.

Binary operations take as input two tables and produce as output another table.

Page 17: Data Models

17

Unary Operations: Select

Select produces a table that only contains the tuples that satisfy a particular condition, in other words a “horizontal” subset.

Appearance:C(R) where C is a selection condition and R is the relation over which

the selection takes place

Page 18: Data Models

18

Example of Select

Student

sid name addr

123 Fred 3 Oxford

345 John 6 Hope Rd.

567 Ann 5 Garden

sid > 300(Student) yields

345 John 6 Hope Rd.

567 Ann 5 Garden

Page 19: Data Models

19

Unary Operations:Project

Project produces a table consisting of only some of the attributes. It creates a “vertical” subset.

Note that a project eliminates duplicates.

Appearance:

ПA(R) where A is a set of attributes of R and R is the relation over which

the project takes place.

Page 20: Data Models

20

Example of Project

Enrol

sid cid grade

123 CS51T 76

234 CS52S 50

345 CS52S 55

Пcid(Enrol) yields

CS51T

CS52S

Page 21: Data Models

21

Binary Operations

Two relations are (union) compatible if they have the same set of attributes.

Example, one table may represent suppliers in one country, while another table with same schema represents suppliers in another country.

For the union, intersection and set-difference operations, the relations must be compatible.

Page 22: Data Models

22

Union, Intersection, Set-difference

R1 R2 The union is the table comprised

of all tuples in R1 or R2. R1 R2

The intersection is the table comprised of all tuples in R1 and R2

R1 - R2 The set-difference between R1

and R2 is the table consisting of all tuples in R1 but not in R2.

Page 23: Data Models

23

Cartesian Product

R1 R2 The Cartesian product is the

table consisting of all tuples formed by concatenating each tuple in R1 with a tuple in R2, for all tuples in R2.

Page 24: Data Models

24

Example of a Cartesian Product

R1 A B

1 x

2 y

R2 C D

a s

b t

c u

R1 R2 A B C D

1 x a s

1 x b t

1 x c u

2 y a s

2 y b t

2 y c u

Page 25: Data Models

25

Natural Join

R1 R2 Assume R1 and R2 have

attributes A in common. Natural join is formed by concatenating all tuples from R1 and R2 with same values for A, and dropping the occurrences of A in R2

R1 R2 = ПA’(C(R1 R2)) where C is the condition that the

values for R1 and R2 are the same for all attributes in A and A’ is all attributes in R1 and R2 apart from the occurrences of A in R2.

hence, natural join is syntactic sugar

Page 26: Data Models

26

Example of a Natural Join I

Course

cid title eid

CS51T DBMS 123

CS52S OS 345

CS52T Networking 345

CS51S ES 456

Instructor

eid ename

123 Rao

345 Allen

456 Mansingh

Page 27: Data Models

27

Example of a Natural Join II

Course Instructorcid title eid ename

CS51T DBMS 123 Rao

CS52S OS 345 Allen

CS52T Net... 345 Allen

CS51S ES 456 Mansingh

Page 28: Data Models

28

Division

R1 R2 Assume that the schema for R2

is a proper subset of the one for R1.

We form the division by Ordering the tuples in R1 so that

all the tuples with the same value for the non-common attributes are grouped together.

Each group contributes a tuple to the result if the group’s values on the common attributes form a superset of the values of these attributes in R2.

Page 29: Data Models

29

Example of Division I

Enrol cid sid grade

CS51T 123 A

CS52S 123 A

CS51T 234 C

CS52S 234 B

CS51T 345 C

CS52S 345 C

Temp sid grade

123 A

234 B

Page 30: Data Models

30

Example of Division II

Enrol cid sid grade

CS51T 123 A

CS51T 234 C

CS51T 345 C

CS52S 123 A

CS52S 234 B

CS52S 345 C

Enrol Temp cid

CS52S

Thus, the division gives all courses for which 123 got an A and 234 a B.

Page 31: Data Models

31

Assignment

Allows the expression to be written in parts.

Assigns the part to a temporary variable.

This variable can be used in subsequent expressions.

E.g. sid(title = ‘DBMS’ (Enrol Course)

Could be re-written as: r Enrol Course sid(title = ‘DBMS’(r))

Page 32: Data Models

32

Rename Operation

Names the result of an expression.

x(A1,A2,…,An) (E)

returns the result of expression E under the name x with the attributes renamed as A1,A2,…,An.

E.g. S (Student) Renames Student table to S.

Page 33: Data Models

33

Database Modification

Insert r r E e.g.

Course Course {(‘CS51T’,’DBMS’)}

Delete r r - E e.g.

Student Student - sid=‘1’(Student)

Update r

F1,F2,…,Fn (r)

e.g. Enrol sid,cid,grade grade + 2 (Enrol)

Page 34: Data Models

34

Examples

Assume the following schema: Student(sid,sname,saddr)

Course(cid,title,lid)Enrol(sid, cid, grade)Lecturer(lid,lname,deptname)

Query 1: Find the name of all students that have taken the course entitled ‘Expert Systems’.

Query 2: Find the titles of all courses that student ‘Mark Smith’ has done.

Query 3: Find the id of students that have enrolled in all the courses that lecturer with id. = ‘234’ has taught.

Query 4: Find the highest grade for ‘CS51T’.

Page 35: Data Models

35

Relational Calculus

A relational calculus expression defines a new relation in terms of other relations.

A tuple variable ranges over a named relation. So, its values are tuples from that relation.

Example: Get grades for CS51T e(Enrol)

{<e.grade>: e.cid = ‘CS51T’ }

Page 36: Data Models

36

Basic Syntax for Relational Calculus Expressions

r(R),…,s(S)

{ <target> : predicate} where

R,..,S are tables r,..,s are tuple variables target specifies the attributes of

the resulting relation predicate is a formula giving a

condition that tuples must satisfy to qualify for the resulting relation.

Page 37: Data Models

37

The Predicate

Predicate is constructed from attribute names constants comparison operators

logical connectives

quantified tuple variables

t(R), t(R)

Page 38: Data Models

38

Examples of Relational Calculus

Example 2 Get names and grades for

students enrolled in CS51T e(Enrol), s(Student)

{<s.name, e.grade>:

e.cid = ‘CS51T’ s.sid = e.sid}

In relation algebraПcid, name( CID =‘ CS51T’(Grade Student))

Page 39: Data Models

39

Example 3

Give the names of all students who got at least one A. s(Student)

{<s.name>:

e(Enrol)

(e.grade = ‘A’ s.sid = e.sid)}

Tuple variables not mentioned in the target list must be bound in the predicate.

Page 40: Data Models

40

Example 4

Get the names of all students who only got A’s s(Student)

{<s.name>:

e(Enrol)( s.sid = e.sid e.grade

= ‘A’)

e2(Enrol) (s.sid = e2.sid)}

Page 41: Data Models

41

Example 5

Get the names of all students who got an A and a B s(Student)

{<s.name>:

e(Enrol) (e.grade = ‘B’ s.sid =

e.sid)

e2(Enrol) (e2.grade = ‘A’

s.sid = e2.sid)}

Page 42: Data Models

42

Example 6

Get the course titles and names for the courses for which the student did not get an A c(Course), s(Student)

{<s.name, c.title>:

g(Enrol) s.sid = g.sid g.cid = c.cid

g.grade ‘A’}