Page 1
1
Advanced Database Topics
Copyright © Ellis Cohen 2002-2005
Modeling, Storing & Querying
Object-Oriented Data
These slides are licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License.
For more information on how you may use them, please see http://www.openlineconsult.com/db
Page 2
© Ellis Cohen 2002-2005 2
Overview of Object / Database Topics
Modeling OO Data (ODL)Querying OO Data (OQL)OO Persistent Storage (OODBs)Client Access to Persistent OO
Data Modifying Persistent OO DataObject-Relational MappingIntegrated Access to Relational
and OO DatabasesOORDBs
Page 3
© Ellis Cohen 2002-2005 3
Topics for This LectureObject Modeling & MappingODL (Object Definition Language)Introduction to OQL (Object Query
Language) OODB's (Object-Oriented Databases)Inheritance in ODLBasic OQLOQL Functions and Named QueriesSubqueriesCollection OperationsPartitioningFlattening
Page 4
© Ellis Cohen 2002-2005 4
Object Modeling & Mapping
Page 5
© Ellis Cohen 2002-2005 5
Rows as Objects/Entities
empno: 7654name: MARTINsal: 1250comm: 1400
an Employee Object
It can be useful to think of each row as an object or entity (i.e. an instance of an entity class) and the table as a collection of these objects
The columns of the table correspond to the instance variables for each object
It can be useful to think of each row as an object or entity (i.e. an instance of an entity class) and the table as a collection of these objects
The columns of the table correspond to the instance variables for each object
empno name sal comm
Emps
7499 ALLEN 1600 300
7654 MARTIN 1250 1400
7698 BLAKE 2850
7839 KING 5000
7844 TURNER 1500 0
7986 STERN 1500
Page 6
© Ellis Cohen 2002-2005 6
The Object Database Challenge
Why not model persistent data the same way we represent program data – as collections of objects which point to one another?
If we model data as objects, can we design a query language as powerful and robust as SQL?
Page 7
© Ellis Cohen 2002-2005 7
Object Model
Object Classes & Instances– Correspond to entity classes
and instances
CollectionsSet (of specified type, no duplicates)Bag (of specified type, duplicates) List (of specified type, ordered, duplicates)Array (efficiently indexed list,
names are integers or enumerated values)Dictionary (both values & names have a
specified type )
Pointers between objects correspond to relationships
Page 8
© Ellis Cohen 2002-2005 8
Object Mapping
Relational MappingMapping an ER model to a Relational Model
Object MappingMapping an ER model to an Object Model
Dept
works for
deptnodname
empnonamejobaddr
Employee
manages
Page 9
© Ellis Cohen 2002-2005 9
Representing an Employee
Employeeempno: intname: stringjob: stringaddr: Addressdept: ptr to a Dept
(generally called a reference)
Page 10
© Ellis Cohen 2002-2005 10
Employee References its Dept
dept1
dept2
dept3
emp1
emp3
emp5
emp4
emp2
dept
Page 11
© Ellis Cohen 2002-2005 11
Representing a Dept
Deptdeptno: intdname: stringempls:
Collection of ptrs to Employee
Page 12
© Ellis Cohen 2002-2005 12
Dept References its Employees
dept1
dept2
dept3
emp1
emp3
emp5
emp4
emp2
empls
Page 13
© Ellis Cohen 2002-2005 13
Mapping Decision for Object Models
A relationship between two ER classes A and B can be represented in 3 different ways
1. ptrs from A's to B's2. ptrs from B's to A's3. both (one is the inverse of
the other) In mapping an ER model to an Object model,
you may be able to decide how to map each relationship
Page 14
© Ellis Cohen 2002-2005 14
Inverse Relationships
dept1
dept2
dept3
emp1
emp3
emp5
emp4
emp2
empls
dept
Page 15
© Ellis Cohen 2002-2005 15
ODL(Object Definition
Language)
Page 16
© Ellis Cohen 2002-2005 16
Object Definition Language (ODL)
class Employee {attribute int empno;attribute string name;attribute string job;attribute Address addr;relationship Dept dept inverse Dept::empls;
}
class Dept {attribute int deptno;attribute string dname;relationship Set<Employee> empls
inverse Employee::dept;
Page 17
© Ellis Cohen 2002-2005 17
ODL Exercises
1. Represent the manages relationship in ODL.
2. Add a Project class. A project has multiple employees; an employee can be assigned to multiple project. Represent this in the ER diagram and in ODL.
3. There is a relationship between employees who are spouses. Represent this in the ER diagram and in ODL.
Page 18
© Ellis Cohen 2002-2005 18
Representing Manages
class Employee {attribute int empno;attribute string name;attribute string job;attribute Address addr;relationship Dept dept inverse Dept::empls;relationship Employee mgr inverse managees;relationship Set<Employee> managees
inverse mgr;}
class Dept {attribute int deptno;attribute string dname;relationship Set<Employee> empls
inverse Employee::dept;
Page 19
© Ellis Cohen 2002-2005 19
Updated ER Model
Dept
works for
deptnodname
empnonamejobaddr
Employee
manages
Project
pnopname
spouse
Page 20
© Ellis Cohen 2002-2005 20
Representing Projects & Spouses
class Employee {attribute int empno;attribute string name;attribute string job;attribute Address addr;relationship Dept dept inverse Dept::empls;relationship Employee mgr inverse managees;relationship Set<Employee> managees
inverse mgr;relationship Employee spouse inverse spouse;relationship Set<Project> assignments
inverse Project::members;}
class Project {attribute int pno;attribute string pname;relationship Set<Employee> members
inverse Employee::assignments;}
Page 21
© Ellis Cohen 2002-2005 21
About RelationshipsSome object systems automatically maintain inverse relationships.
In Employeerelationship Set<Project> assignments
inverse Project::members;
When a project is added to an employee's assignments, the employee can automatically be added to that project's members.
1:M and M:N relationships can be represented by any type of collection, not just by a set.
In Employeerelationship List<Project> assignments
inverse Project::members;
This allows each employee's projects to be maintained in some order; e.g. in order of importance to that employee. Makes it harder to automatically maintain inverse relationships (if a members is added to a project, in what order should that project be added to the employee's assignments)
Page 22
© Ellis Cohen 2002-2005 22
Containment vs Reference Example
111-33-6174 SMITH
Kidney 23 cm
Spleen 25 cm
Liver 29 cm
111-33-6174 SMITH
Kidney 23 cm
Spleen 25 cm
Liver 29 cm
Person wOrganStruct attribute
ContainmentReference(sharable)
Page 23
© Ellis Cohen 2002-2005 23
Modeling Containment vs ReferenceRelationships always represent
sharable references to objects
Attributes usually hold primitive typesAttributes may hold objects
(or structs or collections containing objects)
Is an attribute contained in its parent objector does it hold a reference?
If an attribute is contained in its parent object,can some other attribute share it,i.e. hold a reference to it?
Approaches differ from system to system.Keywords may be used to specify the approach
We'll assume that attributes cannot be shared
Page 24
© Ellis Cohen 2002-2005 24
1st & 2nd Class Objects
1st Class ObjectsManaged independentlyAccessed through relationships
2nd Class ObjectsManaged by a parent objectAccessed through attributes
Page 25
© Ellis Cohen 2002-2005 25
Objects & Methods
class Rectangle {attribute Point topleft;attribute Point botright;float height() returns float;float width() returns float;float area() returns float;boolean overlaps( Rectangle rect )
}
Classes can also declare methods, whose definitions are stored as part of the database.
Depending upon the database, these may be classic "stored procedures", meant to be executed at the database (esp for functions used in queries), or could be meant to be
returned to the client and executed on the client-side
Page 26
© Ellis Cohen 2002-2005 26
Introduction to OQL(Object Query
Language)
Page 27
© Ellis Cohen 2002-2005 27
Referencing / Navigation
Given an employee ee.name – e's namee.dept – department of ee.dept.deptno – the # of e's depte.dept.dname – the name of e's dept
Given a department dd.empls – the set of employees
who work in dept d
d.empls.age – NOT LEGAL because d.empls is a collection
Page 28
© Ellis Cohen 2002-2005 28
OQL (Object Query Language)
Given a department dd.empls
the collection of employees who work in dept d
SELECT e.empno FROM e IN d.empls
the employee numbers of the employees who work in department d
SELECT e.empno, e.name FROM e IN d.empls
the employee numbers & names of the employees who work in department d
OQL SELECT iterates through the elements of a collection
Page 29
© Ellis Cohen 2002-2005 29
Extents
How do you iterate through all the persistent employees?
We can associate an extent with a class, which corresponds to the set of persistent instances in that class.
Page 30
© Ellis Cohen 2002-2005 30
ODL with Extents
class Employee (extent emps) {attribute int empno;attribute string name;attribute string job;attribute Address addr;relationship Dept dept inverse Dept::empls;relationship Employee mgr inverse managees;relationship Set<Manager> managees
inverse mgr;}
class Dept (extent depts) {attribute int deptno;attribute string dname;relationship Set<Employee> empls
inverse Employee::dept;}
Page 31
© Ellis Cohen 2002-2005 31
Collections and Relationships
deptsemps
dept1
dept2
dept3
emp1
emp3
emp5
emp4
emp2
empls
dept
Page 32
© Ellis Cohen 2002-2005 32
Queries involving Extents
SELECT e.empnoFROM e IN empsWHERE e.name = "SMITH"
SELECT e.empno, e.nameFROM e IN empsWHERE e.job = "CLERK"
Page 33
© Ellis Cohen 2002-2005 33
Navigation Eliminates Joins
SQL:SELECT e.name, d.dname
FROM Emps e NATURAL JOIN Depts dWHERE d.loc = 'Boston'
OQL:SELECT e.name, e.dept.dname
FROM e IN empsWHERE e.dept.loc = "Boston"
Return the name of every employee and the name of their department, if their department is located in Boston
Page 34
© Ellis Cohen 2002-2005 34
Still Need Ordinary Joins
SELECT e.nameFROM e IN emps, s IN starbucksWHERE e.zip = s.zip
Generate the names of employees who live in zipcodes where there is a Starbucks
SELECT e1.name, e2.nameFROM e1 IN emps, e2 IN empsWHERE e1.empno < e2.empno AND abs(e1.sal - e2.sal) <= 100
Generate pairs of employees whose salaries are within $100 of one another
Page 35
© Ellis Cohen 2002-2005 35
OODB's(Object-Oriented
Databases)
Page 36
© Ellis Cohen 2002-2005 36
Object & Database Notions
DatabasesPersistence
Storage OptimizationIndexing
Queries (SQL)Query Optimization
Constraints & TriggersTransactionsBackup & RecoveryRole-based Security
Object SystemsEncapsulation
Object TypesAttributesMethods
InheritancePolymorphism
Object IdentityReferencesNavigation
CollectionsVersioning
Page 37
© Ellis Cohen 2002-2005 37
Extents and CollectionsClass' Extent
An automatically maintained collection of persistent 1st class objects of that class (creating a new persistent 1st class object automatically adds it to its class's extent)
A class need not have an associated extent
An OODB consists of– persistent objects– their extents– other explicitly defined persistent collections
populace
person-1
person-2
person-3… coolpeople
How would you model this in an
RDB?
Page 38
© Ellis Cohen 2002-2005 38
RDB Persistent Collections
40694 …
15129 …
30276 …
81107 …
60019 …
persno namePopulace
40694
30276
81107
persno
CoolPeople
40694 … T
15129 … F
30276 … T
81107 … T
60019 … F
persno name iscoolPopulace
Could use the ROWID
Page 39
© Ellis Cohen 2002-2005 39
RDB Identity
30 SALES
10 ACCOUNTING
50 SUPPORT
deptno dname
Deptsempno name addr deptno
7499 ALLEN ... 30
7654 MARTIN … 30
7844 TURNER … 30
7212 LAVICH …
7698 BLAKE … 10
7986 STERN … 10
Emps
Identity of an "object" is determined by its primary key, and "references" are represented by foreign keys.
What happens if a department's primary key changes?
Page 40
© Ellis Cohen 2002-2005 40
OODB Identity
empno: 7698name: BLAKEaddr: …dept: 3A47215FE
empno: 7986name: STERNaddr: …dept: 3A47215FE
deptno: 10dname: ACCOUNTING
Note: only showing Employee.dept, not its inverse, Dept.empls
3A47215FE
B67EE3CD4
5C7780112
Every object has an OID (an Object ID) which uniquely identifies it.
A reference can be implemented by a field hold the OID of the referenced object
Page 41
© Ellis Cohen 2002-2005 41
Object Identity Models
Primary KeyIdentifies row in an RDBCan be changed or reusedNot usually globally uniqueReference via foreign key, may not necessarily
always refer to same row
OID (Object ID)Uniquely identifies 1st class object, independent
of all of its values (e.g. my axe)Can't be changed; also can't be reused (unlike
ROWIDs)Often is globally uniqueReference holds (directly or indirectly) the OID,
and always refers/points to the same object (why references are also called pointers)
Page 42
© Ellis Cohen 2002-2005 42
OODB Storage Representation
All objects are stored independently
Could dramatically increase the cost of a full scan through an extent (OODB equivalent of a table)
Many OODB's automatically cluster together (on the same page) objects that are used together (e.g. a dept & all its employees)
Some OODB's allow user to explicitly list/describe objects that should be clustered together.
Page 43
© Ellis Cohen 2002-2005 43
Extents & Objects in ODB's
622 Auditing CHICAGO …
deptno dname loc …
AAAGDxAABAAAH9EAAD
depts
Objects from different extents/classes may all share the same block!
Actually, objects are identified by OIDs, which are
not quite ROWIDs
Page 44
© Ellis Cohen 2002-2005 44
Extents with Keysclass Employee (extent emps, key empno) {
attribute int empno;attribute string name;attribute string job;attribute Address addr;relationship Dept dept inverse Dept::empls;
}
class Dept (exptent depts, key deptno) {attribute int deptno;attribute string dname;relationship Set<Employee> empls
inverse Employee::dept;}
Keys are optional - indicate unique attributes
Relationship between objects is modeled directly, not via correspondence between primary and foreign keys
Relationships reference objects via their OIDs
Page 45
© Ellis Cohen 2002-2005 45
Indexing (non-standard)
Some OODB's support creation of indices on extents & other persistent collections
Assuming emps is the extent for Employee
Attribute indexCreate index sal on emps
Multi-attribute indexCreate index job, sal on emps
Indexing can be especially useful when used with persistent collections other than extents
Page 46
© Ellis Cohen 2002-2005 46
Inheritancein ODL
Page 47
© Ellis Cohen 2002-2005 47
ODL Class Extensionclass Person (extent populace) {
attribute int persid,attribute string name,attribute int age,attribute string kind,…
}
class Employee extends Person (extent emps) {attribute int empno, …
}
How would you model this in an RDB?
An Employee has the attributes/relationships/methods specified for
Employee plus those for Person
Person
Employee
persidnameagekind
empno…
Page 48
© Ellis Cohen 2002-2005 48
RDBs vs OODBs
40694 … …
15129 … …
30276 … …
81107 … …
60019 … …
persid name agePersons
40694 … … …
30276 … … …
81107 … … …
persid empno job …
Emps
populace
person-1
emp-1
emp-2… emps
Some persons are employees
Page 49
© Ellis Cohen 2002-2005 49
Subclass Checking Problem
Suppose coolpeople is a persistent Set<Person>. Some of the persons in coolpeople are Employees.
Write the SQL and then the OQL to list the names of all the employees in coolpeople.
Page 50
© Ellis Cohen 2002-2005 50
SQL Subclass Checking
SELECT p.name FROM Persons NATURAL JOIN CoolPersons NATURAL JOIN Employee e
or
SELECT p.name FROM Persons NATURAL JOIN CoolPersonsWHERE p.isEmployee = 'T'
Page 51
© Ellis Cohen 2002-2005 51
OQL Subclass Checking
SELECT p.nameFROM p IN coolpeopleWHERE p.isEmployee
Many object systems have a built-in mechanism for checking subclassese.g.
SELECT p.nameFROM p IN coolpeopleWHERE p INSTANCE OF Employee
Unfortunately, OQL does not have such a standard mechanism
Not standard
OQL has boolean
attributes
Page 52
© Ellis Cohen 2002-2005 52
Subclass Casting Problem
Suppose coolpeople is a persistent Set<Person>. Some of the persons in coolpeople are Employees.
Write the OQL to list the employee numbers of all the employees in coolpeople.
Page 53
© Ellis Cohen 2002-2005 53
Subclass Casting Non-Solution
These don't work. Why not?
SELECT p.empno FROM p IN coolpeople WHERE p.isEmployee
Or, in a system with INSTANCE OF
SELECT p.empno FROM p IN coolpeople WHERE p INSTANCE OF Employee
Page 54
© Ellis Cohen 2002-2005 54
Subclass Casting
What's the problem with
SELECT p.empnoFROM p IN coolpeopleWHERE p.isEmployee
coolpeople is a Set<Person>, so p identifies a Person instance. But Persons don't have an empno field, so this query will not compile.
We need to claim that the only p's that satisfy the WHERE clause will be Employees. If p is an Employee, it will have an empno field. This is accomplished via casting:
SELECT ((Employee)p).empnoFROM p IN coolpeopleWHERE p.isEmployee
Page 55
© Ellis Cohen 2002-2005 55
ODL ClassesA class defines attributes, relationships &
methods.
A class may extend another class (its immediate superclass) by defining attributes, relationships and methods in addition to those of its superclass.
A class may only extend a single class, but that class may also extend a class (ad nauseum)
Classes are instantiable – i.e. you can create an object of the specified class. It will have the attributes & relationships defined by its class (including those of its superclasses).
Creating an instance of a class adds a reference for the instance to the class's extent (if it has one), and to the extents of its superclasses.
Page 56
© Ellis Cohen 2002-2005 56
ODL InterfacesAn interface also defines attributes, relationships
& methods.
An interface may inherit from any number of interfaces, adding attributes, relationships and methods of all the interfaces it inherits from.
Interfaces are NOT instantiable – i.e. you cannot create an object of the specified interface, and they do not have extents (though this might sometimes be convenient)
A class may be specified to implement any number of interfaces. This is a promise that the class defines (directly or through extension) all the attributes, relationships and methods defined by all the interface it implements.
Page 57
© Ellis Cohen 2002-2005 57
Interfaces & Classes
interface Student {attribute string college,attribute string program,attribute string year }
Class TA extends Person : Student {…
}
A TA extend Person, and also implements the Student interface.
Page 58
© Ellis Cohen 2002-2005 58
Basic OQL
Page 59
© Ellis Cohen 2002-2005 59
SQL vs OQL
SQL returns– result sets
convertible to tables and sometimes to sets or scalar values
– by searching through tables – using joins to search through
multiple tables
OQL returns– collections of elements
convertible sometimes to a single element
– by searching through collections– using navigation (and sometimes joins)
to search through multiple collections
Page 60
© Ellis Cohen 2002-2005 60
Selection as Iteration
Iterate through a collection
Given Bag<int> ages
SELECT a+5FROM a IN agesWHERE a > 20
Bag<int>
If ages is the bag containing {5, 26,42,8,26}, the result will be the bag
containing {31,47,31}
Page 61
© Ellis Cohen 2002-2005 61
Selecting From CollectionsGiven Set<Emp> emps
SELECT e.nameFROM e IN empsWHERE e.age > 40
Bag<string>(perhaps should return Set<string> if name is a key)
Given List<Emp> rankemps
SELECT e.nameFROM e IN rankempsWHERE e.age > 40
Bag<string> (perhaps should return List<string>)
Return the names of the employees in rankemps who
are older than 40
Return the names of the employees in emps who
are older than 40
Page 62
© Ellis Cohen 2002-2005 62
Selecting Collection Elements
Given Set<Emp> emps
SELECT eFROM e IN empsWHERE e.age > 40
Set<Emp>
Given List<Emp> rankemps
SELECT eFROM e IN empsWHERE e.age > 40
Bag<Emp> (perhaps should return List<Emp>)
Return the employees in emps who are older than 40
Return the names of the employees in emps who are older than 40
Page 63
© Ellis Cohen 2002-2005 63
Generating Sets, Lists & BagsSELECT e.name
FROM e IN empsWHERE e.age > 40
Bag<string>
SELECT DISTINCT e.nameFROM e IN empsWHERE e.age > 40
Set<string>
SELECT e.nameFROM e IN empsWHERE e.age > 40 ORDER BY e.age
List<string>
distinct(SELECT e.name FROM e IN empsWHERE e.age > 40)
Page 64
© Ellis Cohen 2002-2005 64
Generating Collections of Structs
SELECT who: e.name, e.ageFROM e IN empsWHERE e.age > 40
Bag<struct( string who; int age )>
SELECT struct(who: e.name, age: e.age )
FROM e IN empsWHERE e.age > 40
Return the names and ages (bundled up in a struct) of the employees in
emps who are older than 40
Renames field
The struct can be specified
explicitly, but then all fields
must be explicitly named
Page 65
© Ellis Cohen 2002-2005 65
Basic OQL Problem
Suppose Project (extent projs) hasattribute string pname
-- the name of the projectrelationship Employee pmgr
-- the employee who manages the project
List the names of projects whose project manager has the job of ANALYST
List the names of project managers who are analysts
Neither query requires a join!
Page 66
© Ellis Cohen 2002-2005 66
Basic OQL Solution
Suppose Project (extent projs) has attribute string pnameand relationship Employee pmgr
List the names of projects whose project manager has the job ANALYST
SELECT p.pnameFROM p IN projsWHERE p.pmgr.job = "ANALYST"
List the names of project managers who are analysts
SELECT DISTINCT p.pmgr.nameFROM p IN projsWHERE p.pmgr.job = "ANALYST"
Page 67
© Ellis Cohen 2002-2005 67
Generating Objects
SELECT struct( name: e.name, age: e.age )FROM e IN empsWHERE e.age > 40 Bag<struct( string name; int age )>
SELECT Person(name: e.name, age: e.age - 20 )
FROM e IN empsWHERE e.age > 40 Bag<Person>
Clone a younger model
Page 68
© Ellis Cohen 2002-2005 68
OQL Functions and Named Queries
Page 69
© Ellis Cohen 2002-2005 69
Aggregate/Collection Functions
SQL:SELECT avg(sal) FROM EmpsSELECT count(*) FROM Emps
OQL:avg( SELECT e.sal FROM e IN emps )count( emps )
Return the average salary of the employees Return the number of the employees
In OQL aggregate functions operate on collections
Page 70
© Ellis Cohen 2002-2005 70
Aggregating Navigated Collections
SQL:SELECT d.dname, (SELECT count(*) FROM Emps e WHERE e.deptno = d.deptno) AS knt FROM Depts d
OQL:SELECT d.dname, knt: count(d.empls) FROM d IN depts
Page 71
© Ellis Cohen 2002-2005 71
Collection Functionsexists( col )
whether a collection has at least one element
unique( col )whether a collection has exactly one element
element( col )the one element in a collection w one element
(returns UNDEFINED otherwise)
pick( col ) – non-standardarbitrarily chooses an element of a collection
(returns UNDEFINED if empty)
distinct( col )the set with all the distinct elements of the
collection (i.e. with duplicates removed)
flatten( col ) A flattened representation of a collection of
collections
SQL's NULL OQL's UNDEFINED
Page 72
© Ellis Cohen 2002-2005 72
Operations on Lists/Arrays
first( lst ), last( lst )The first or last element of a list or an
array.
lst[nth]The nth element of a list or array.
lst[n1:n2]The n1th through n2th elements of a list
or array.
lst1 + lst2The concatenation of lst1 and lst2
listToSet( lst )A set with the same elements as the list
Page 73
© Ellis Cohen 2002-2005 73
Access to Dictionaries(non-standard!)
dict[nam]The element of a dictionary
named nam, for example mydict["joe"]
dictionaryContents( dict )A set with the contents of the dictionary
elements
dictionaryKeys( dict )A set with the keys of the dictionary
elements
Page 74
© Ellis Cohen 2002-2005 74
Named Queries
DEFINE OldEmps asSELECT eFROM e IN empsWHERE e.age >= 30
SELECT x.name, x.jobFROM x IN OldEmps
Define OldEmps as employees 30 or older
OQL does not have views
Page 75
© Ellis Cohen 2002-2005 75
Views vs Named Queries
In SQL, Views are used– As functions/macros
(using view expansion)– To limit modifications– For access control
If OQL is not used for modifications & access control,
then we just need support for macros/functions
Allows us to treat named queries as macros/functions, with parameters
Page 76
© Ellis Cohen 2002-2005 76
Parameterized Named Queries
DEFINE AgedEmps( minage ) asSELECT e
FROM e IN empsWHERE e.age >= minage
SELECT x.name, x.jobFROM x IN AgedEmps( 30 )
Define AgedEmps as employees minage or older
Page 77
© Ellis Cohen 2002-2005 77
Subqueries & Correlation
Page 78
© Ellis Cohen 2002-2005 78
Subqueries
SELECT m.name, m.jobFROM m IN (SELECT e FROM e IN emps WHERE e.age >= 30)
SELECT e.nameFROM e IN empsWHERE e.age = min( SELECT e.age FROM e IN emps)
SELECT d.dname FROM d IN deptsWHERE 25 < avg( SELECT e.age FROM e IN d.empls)
What do these do?
Page 79
© Ellis Cohen 2002-2005 79
Subquery AnswersList the name & job of employees who are 30 or older
SELECT m.name, m.jobFROM m IN (SELECT e FROM e IN emps WHERE e.age >= 30)
List the names of all employees who have the same age as the youngest employee
SELECT e.nameFROM e IN empsWHERE e.age = min( SELECT e.age FROM e IN emps)
List the names of departments whose average employee age is over 25
SELECT d.dname FROM d IN deptsWHERE 25 < avg( SELECT e.age FROM e IN d.empls)
Page 80
© Ellis Cohen 2002-2005 80
Generating CollectionsSELECT d.dname, d.empls
FROM d IN deptsWHERE d.loc <> "BOSTON"
Bag<struct( string dname; Set<Employee> empls )>
SELECT d.dname,enames: (SELECT e.name FROM e IN d.empls)FROM d IN deptsWHERE d.loc <> "BOSTON"
Bag<struct( string dname; Bag<string> enames )>
What's the difference between these two?
Page 81
© Ellis Cohen 2002-2005 81
Generating Collections SolutionSELECT d.dname, d.empls
FROM d IN deptsWHERE d.loc <> "BOSTON"
For each department not in Boston, generates the name of the department, along with the set of that department's employees
SELECT d.dname,enames: (SELECT e.name FROM e IN d.empls)FROM d IN deptsWHERE d.loc <> "BOSTON"
For each department not in Boston, generates the name of the department, along with the set of the names of that department's employees
Note: enames: d.empls.nameis ILLEGAL!
Page 82
© Ellis Cohen 2002-2005 82
Correlated Query Problem
SQL:SELECT e.name, d.dname
FROM Emps e NATURAL JOIN Depts d
Return the name of every employee and the name of their department, for employees who work for departments
How would you write this in OQL?
Page 83
© Ellis Cohen 2002-2005 83
Correlated Query Solutions
SELECT e.name, e.dept.dnameFROM e IN empsWHERE is_defined( e.dept )– Lists every employee & their dept– OQL: is_defined ~ SQL: IS NOT NULL
needed because some employees may not be in depts
SELECT e.name, d.dnameFROM e IN emps, d IN deptsWHERE e.dept = d– Using an OQL join
SELECT e.name, d.dnameFROM d IN depts, e IN empsWHERE e IN d.empls– Also an OQL join
Page 84
© Ellis Cohen 2002-2005 84
Correlated Joins
SELECT e.name, d.dnameFROM d IN depts, e IN empsWHERE e IN d.empls
SELECT e.name, d.dnameFROM d IN depts, e IN d.empls -- emps not needed
Page 85
© Ellis Cohen 2002-2005 85
Correlated Collection Problem
Suppose Employee (extent emps) has relationship Set<Project> projsManaged
Use it to
List the names of projects whose managers are analysts
Also, list the names of analysts who manage projects
Page 86
© Ellis Cohen 2002-2005 86
Correlated Collection Solution
Suppose Employee (extent emps) has relationship Set<Project> projsManaged
List the names of projects whose managers are analysts
SELECT p.pnameFROM e IN emps, p IN e.projsManagedWHERE e.job = "ANALYST"
List the names of employees who manage projects
SELECT e.nameFROM e IN empsWHERE exists( e.projsManaged )
Page 87
© Ellis Cohen 2002-2005 87
CollectionOperations
Page 88
© Ellis Cohen 2002-2005 88
Navigation & Quantification
SQL:SELECT d.dname FROM Depts d
WHERE exists ( SELECT * FROM Emps WHERE e.deptno = d.deptno AND e.age > 50)
OQL:SELECT d.dname FROM d IN depts
WHERE exists ( SELECT e FROM e IN d.empls WHERE e.age > 50 )
SELECT d.dname FROM d IN deptsWHERE EXISTS e IN d.empls : e.age > 50
List the names of the departments that have at least one employee older than 50
Page 89
© Ellis Cohen 2002-2005 89
DISTINCT vs exists
SQL:SELECT DISTINCT d.dname
FROM Depts d NATURAL JOIN Emps eWHERE e.age > 50
OQL:SELECT DISTINCT d.dname
FROM depts d, emps eWHERE e.dept =d AND e.age > 50– Corresponds to SQL, but not the best way to write it in OQL
SELECT DISTINCT d.dnameFROM Depts d, d.empl eWHERE e.age > 50
List the names of the departments that have at least one employee older than 50
Page 90
© Ellis Cohen 2002-2005 90
More Quantification
SELECT d.dname FROM d IN deptsWHERE FOR ALL e IN d.empls : e.age > 50
List the names of the departments that have exactly one employee is older than 50
List the names of the departments where all employees are older than 50
SELECT d.dname FROM d IN deptsWHERE UNIQUE e IN d.empls : e.age > 50
SELECT d.dname FROM d IN deptsWHERE unique (
SELECT e FROM e IN d.empls WHERE e.age > 50 )
Page 91
© Ellis Cohen 2002-2005 91
MembershipClass Person (extent populace) {
attribute string ssno;attribute string name;attribute List<string> kidnames;
}List the names of the people who have a kid named "Yael"
SELECT p.name FROM p IN populaceWHERE EXISTS k IN p.kidnames : k = "Yael"
SELECT p.name FROM p IN populaceWHERE ("Yael" IN p.kidnames)
Page 92
© Ellis Cohen 2002-2005 92
Collection Operators
OQL has union, intersect & except operators
Generate all employees in the accounting department who manage projects(SELECT e
FROM e IN empsWHERE exists( e.projsManaged ))
INTERSECT(SELECT d.empls
FROM d IN deptsWHERE d.dname = "ACCOUNTING")
The result of these operators is a set if both operands are sets, else it is a bag
Page 93
© Ellis Cohen 2002-2005 93
Collection Comparison
SELECT p1, p2FROM p1 IN populace, p2 IN populaceWHERE p1.kidnames = p2.kidnames
Find pairs of people whose kids have the same names in the same order
Find pairs of people whose kids have the same names (in any order)
SELECT p1, p2 FROM p1 IN populace, p2 IN populaceWHERE listToSet(p1.kidnames) = listToSet(p2.kidnames)
Bags & Setsc1 = c2 if c1 & c2 have same elementsc1 < c2 if every element in c1 is in c2
Lists, Arrays, Dictionariesc1 = c2 if, for all j, c1[j] = c2[j]c1 < c2 if, for all j, c1[j] = c2[j] or c1[j] = nil (nil/undefined/null)
Page 94
© Ellis Cohen 2002-2005 94
Partitioning
Page 95
© Ellis Cohen 2002-2005 95
Partitioning
emps
33
emp1 33
emp3 33
emp5 49
emp4 33
emp2 49
age
49
partition
grpage
partition
grpage
Page 96
© Ellis Cohen 2002-2005 96
Partitioning with GROUP BYSELECT grpage, partition
FROM e IN empsGROUP BY grpage: e.age
Set<struct( grpage: int; partition: Set<Employee> )>
SELECT partition FROM e IN empsGROUP BY grpage: e.age
Set<Set<Emp>>
Group the employees by age
Group by age but just return the partitions
Suppose you wanted to calculate the # of employees at each age:
In SQL:SELECT age AS grpage, count(*) AS knt FROM Employees GROUP BY age
How would you get the equivalent result in OQL?
Page 97
© Ellis Cohen 2002-2005 97
GROUP BY with Aggregation
SQL:SELECT age AS grpage, count(*) AS knt FROM Emps GROUP BY age
OQL:SELECT grpage, knt:count(partition)
FROM e IN empsGROUP BY grpage: e.age
Set<struct( grpage: int; knt: int )>
Page 98
© Ellis Cohen 2002-2005 98
Partition Selection
SELECT grpage, grp: (SELECT p.name, p.job FROM p IN partition)FROM e IN empsGROUP BY grpage: e.age
Set<struct( grpage: int; grp: Bag<struct{name: string; job: string}> )>
Divide the employees by age, and collect together the name and job of all employees in
each age group
Can you write this without using GROUP BY?
Page 99
© Ellis Cohen 2002-2005 99
Grouping without GROUP BY
SELECT DISTINCT grpage: e.age,(SELECT ee.name, ee.job FROM ee in emps WHERE ee.age = e.age)FROM e IN emps
Page 100
© Ellis Cohen 2002-2005 100
Grouping Problem
For each job, produce the names of the departments in which some employee has that job
(Use GROUP BY!Also consider a solution withoutusing it)
Page 101
© Ellis Cohen 2002-2005 101
Grouping Solution
For each job, produce the names of the departments in which some employee has that job
SELECT grpjob, jobdepts: (SELECT DISTINCT p.dept.dname FROM p IN partition) FROM e IN empsGROUP BY grpjob: e.job
SELECT job, jobdepts:(SELECT d.dname FROM d IN depts WHERE EXISTS e IN d.empls : e.job = job)FROM job IN (SELECT DISTINCT e.job FROM e IN emps)
Page 102
© Ellis Cohen 2002-2005 102
Grouping by Expression
SELECT partitionFROM e IN empsGROUP BY e.sal < 30000
Group together the employees who have low salaries (< 30000), and separately, the employees who have high salaries
Group together people whose kids have exactly the same set of names
SELECT partitionFROM p IN populaceGROUP BY listToSet(p.kidnames)
Set<Set<Emp>>
Page 103
© Ellis Cohen 2002-2005 103
Group Restriction via Having
SQL: SELECT job FROM Emps
GROUP BY jobHAVING count(*) > 1
OQL:SELECT e.job FROM e IN emps
GROUP BY e.jobHAVING count(partition) > 1
List jobs held by more than one employee
Do in OQL: For each job that is held by more than one employee, list the employees in that job
Page 104
© Ellis Cohen 2002-2005 104
Group Restriction Solution
SELECT e.job, partitionFROM e IN empsGROUP BY e.jobHAVING count(partition) > 1
For each job that is held by more than one employee, list the employees in that job
Page 105
© Ellis Cohen 2002-2005 105
Tricky Grouping Problem
For each job, produce the names of the departments in which more than one employee holds that job
(Use HAVING)
Page 106
© Ellis Cohen 2002-2005 106
Tricky Grouping SolutionFor each job, produce the names of the departments
in which more than one employee holds that job
SELECT grpjob, jobdepts: (SELECT p.dname FROM p IN partition)FROM de IN (SELECT d.dname, e.job FROM d IN depts, e IN d.empls GROUP BY d.dname, e.job HAVING count(partition) > 1)GROUP BY grpjob: de.job
SELECT grpjob, jobdepts: (SELECT dpt.dname FROM p IN partition GROUP BY dpt: p.dept HAVING count(partition) > 1 FROM e IN empsGROUP BY grpjob: e.job
Note use of inner & outer
partition
Page 107
© Ellis Cohen 2002-2005 107
Flattening
Page 108
© Ellis Cohen 2002-2005 108
Flattening
Collection of Collections
Result of Flattening
Page 109
© Ellis Cohen 2002-2005 109
Flattening in OQL
Generate the employees who are in jobs that are done by more than one employee
flatten( SELECT partitionFROM e IN empsGROUP BY e.jobHAVING count(partition) > 1 )
Page 110
© Ellis Cohen 2002-2005 110
Correlation & Flattening
SELECT e.name, d.dnameFROM d IN depts, e IN d.emplsWHERE d.loc = "Boston"
flatten( SELECT (SELECT e.name, d.dname FROM e IN d.empls)FROM d IN deptsWHERE d.loc = "Boston" )
For all employees whose departments are located in Boston, generate their name and their department's name
Page 111
© Ellis Cohen 2002-2005 111
Flattening Sublists
DEFINE SizeDepts ASSELECT d.empls FROM d IN depts ORDER BY count(d.empls) desc
flatten( SizeDepts[1:3] )
What does this do?
Page 112
© Ellis Cohen 2002-2005 112
Flattening Sublists Solution
DEFINE SizeDepts ASSELECT d.empls FROM d IN depts ORDER BY count(d.empls) desc
flatten( SizeDepts[1:3] )
Generate the set of the employees who are in the three largest departments
Page 113
© Ellis Cohen 2002-2005 113
Flattening and Transitive Closure
DEFINE ManageClosure( e ) ASset( e ) -- a set consisting of e
UNIONflatten( SELECT ManageClosure(m) FROM m IN e.managees )
Define ManageClosure( e ) to consist of e plus all the employees who directly or indirectly report to e
Note: e.managees is the set of employees that e directly manages