COMP30311: Database Programming: Architectures and Issues
Norman Paton, University of Manchester
Dec 27, 2015
Observations
Databases are mostly accessed by applications:
  Transaction processing: many small query or update requests (e.g., flight booking, account management).
  Analytical processing: more complex queries, but less frequent updates (e.g., management information systems).
In practice, databases are hardly ever accessed by users typing a query language at a prompt.
Client-Server Architecture
The client:
  Runs the application.
  Invokes requests on the database using the query/update language.
The server:
  Manages concurrency, caching, etc.
[Figure: Client-1 and Client-2 connect over a network to a server, which manages the database.]
Client-Server Issues
The classical model has a thick client: Process flow. Business rules. Constraints. ...
The server is essentially a shared fact store.
Thin clients involve more central code: Process flow. Business rules. Constraints. ...
Database servers are able to be much more than fact stores.
Classical Relational Database
Clients encode most application functionality.
Clients are written using embedded SQL.
Calls to the database use SQL-92.
[Figure: a client (written in C) sends SQL to the DBMS, which holds tables and views.]
Modern Relational Database
Application functionality is divided; there is no strict thin or thick client choice.
Clients are written using embedded languages.
The database has stored programs, using embedded languages or extensions to SQL.
[Figure: a client (written in C) sends SQL and calls to the DBMS, which holds tables, views, triggers and procedures.]
Multi-Tier Environment
[Figure: three layers. User/Application Layer: applications with user interfaces and application libraries. Middleware Layer: middleware with application libraries. Database Server Layer: database servers.]
Multi-Tier Environments
Greater flexibility, and thus potentially scalability.
Data-intensive tasks near the database.
Compute-intensive tasks in the middle layer or on the client.
Example of a multi-tier platform:
  A Web Browser interacts with a Web Server using CGI (Common Gateway Interface).
  The Web Server runs a Java Servlet that interacts with a DBMS using JDBC.
Where Does SQL Fit In?
SQL acts as the API to the database (if relational).
Features of SQL:
  Standardised.
  Declarative.
  Flexible (Queries, Updates, Administration).
Problems with SQL:
  Non-trivial to learn (not good for end users).
  Poor for repetitive tasks (e.g., for manual data entry).
  Of limited computational power (so used with other languages).
Programming Databases
Options include:
  Embed query language in existing programming language (e.g., JDBC, SQLJ).
  Extend query language with programming features (e.g., SQL-99, PL/SQL).
  Extend programming language with database features (no current products?).
  Map database constructs to programming constructs as in Object Databases and JDO (e.g., FastObjects, Objectivity).
  Provide database components for programming environments (e.g., Delphi, ADO.NET).
Embedded SQL Example
EXEC SQL BEGIN DECLARE SECTION;
  VARCHAR name[20];               // Data passed C <-> SQL
  VARCHAR type[20];               // Receives the result (declaration assumed; not shown on the slide)
EXEC SQL END DECLARE SECTION;
EXEC SQL
  SELECT type INTO :type          // Single valued result
  FROM station WHERE name=:name;  // Parameter
type.arr[type.len] = '\0';        // Terminate the string before printing
printf("%s\n",type.arr);          // Note String Format
JDBC Example
Connection conn =
  DriverManager.getConnection(url,args[0],args[1]);
Statement stmt = conn.createStatement();
ResultSet rset = stmt.executeQuery (
  "select T# from TRAIN");
while (rset.next())
  System.out.println(rset.getString(1));
PL/SQL Example
declare
  cursor c1 is
    select t# from train
    where source = 'Edinburgh';
begin
  for ed_train in c1 loop
    insert into edinburgh values (ed_train.t#);
  end loop;
end;
Multi-Language Environments
Where two languages are used together, a mapping is required between their type systems.
  SQL-92     C
  INTEGER    int
  VARCHAR    char*
  tuple      struct
  table      array
Impedance Mismatch
The problems encountered linking two independently developed languages are known as the impedance mismatch, which has two aspects:
  A type system mismatch that affects programmer productivity.
  An evaluation strategy mismatch that affects performance.
Type System Mismatch
Database types are not supported directly in the programming language, so, for example, relations may have to be mapped to iterators.
Programming language types are not supported directly in the database, and thus have to be mapped, for example, to relations for storage.
The programming language type checker cannot check the legality of embedded calls, leading to runtime errors.
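The last point can be illustrated without a database. The sketch below is only an illustration under stated assumptions: an untyped row is modelled as a Map (standing in for a query result), and TypeMismatchDemo, Train, row and failsAtRuntime are hypothetical names. The casts compile, yet fail at runtime when a column does not have the anticipated type.

```java
import java.util.HashMap;
import java.util.Map;

class TypeMismatchDemo {

    /** Hypothetical typed view of one train tuple. */
    static final class Train {
        final String number;
        final String source;

        Train(String number, String source) {
            this.number = number;
            this.source = source;
        }
    }

    /** Builds an untyped row, standing in for one tuple of a query result. */
    static Map<String, Object> row(Object number, Object source) {
        Map<String, Object> r = new HashMap<>();
        r.put("t#", number);
        r.put("source", source);
        return r;
    }

    /** Maps an untyped row to a Train; the compiler cannot check these casts. */
    static Train toTrain(Map<String, Object> row) {
        return new Train((String) row.get("t#"), (String) row.get("source"));
    }

    /** Returns true if the mapping fails at runtime because a column has an unexpected type. */
    static boolean failsAtRuntime(Map<String, Object> row) {
        try {
            toTrain(row);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }
}
```

Both calls to toTrain type-check, but only the row whose t# value is a String maps successfully; a row holding an Integer in that column is rejected only when the program runs.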
Evaluation Strategy Mismatch
Database operations typically act on and return collections.
Programming language operations typically act on and return single values.
Query results may be computed in their entirety and cached before any access from the programming language.
The database may retrieve data that is never consumed by the programming language.
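A rough way to see the cost is to count how many rows are produced when the program reads only a few of them. The following is a minimal sketch, not a model of any particular DBMS: the "query" is simulated in memory, and EvaluationMismatchDemo and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.concurrent.atomic.AtomicInteger;

class EvaluationMismatchDemo {

    /** Collection-at-a-time: the whole result is computed before the program reads any row. */
    static int rowsProducedEagerly(int tableSize, int rowsConsumed) {
        AtomicInteger produced = new AtomicInteger();
        List<String> result = new ArrayList<>();
        for (int i = 0; i < tableSize; i++) {
            produced.incrementAndGet();
            result.add("T" + i);
        }
        // The program then looks at only the first few rows.
        for (int i = 0; i < Math.min(rowsConsumed, result.size()); i++) {
            result.get(i);
        }
        return produced.get();
    }

    /** Value-at-a-time: a cursor-like iterator produces a row only when it is asked for one. */
    static int rowsProducedLazily(int tableSize, int rowsConsumed) {
        AtomicInteger produced = new AtomicInteger();
        Iterator<String> cursor = new Iterator<String>() {
            private int position = 0;

            public boolean hasNext() {
                return position < tableSize;
            }

            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                produced.incrementAndGet();
                return "T" + position++;
            }
        };
        for (int i = 0; i < rowsConsumed && cursor.hasNext(); i++) {
            cursor.next();
        }
        return produced.get();
    }
}
```

With a simulated table of 1000 rows and a program that reads 3, the eager strategy produces 1000 rows and the lazy one produces 3; the difference is the data that is retrieved but never consumed.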
Summary
There are many choices in database programming:
  Which technologies to use.
  Which architecture to use.
Many non-trivial decisions may significantly influence:
  System performance.
  Development and maintenance costs.
Further Reading
Oracle Database Application Developer's Guide – Fundamentals [Chapter 1: Programmatic Environments].
Trains Database Schema
[Figure: schema diagram with entities Station, District, Visit, Train, Booking and Customer, linked by one-to-many (1 to *) relationships.]
See handout for the relational schema and example programs.
JDBC and SQLJ
There are two standard interfaces allowing relational databases to be accessed and manipulated from Java:
  JDBC: A class library that allows dynamic SQL statements to be called from Java.
  SQLJ: A preprocessor that allows static SQL statements to be embedded in Java.
JDBC is much more widely used.
JDBC
JDBC can be used in client applets or applications, or (in some database systems) for implementing server-side functionality.
JDBC involves no extensions to the syntax of Java.
The JDBC package is imported thus:
  import java.sql.*;
Specific database systems are accessed using vendor or third party drivers:
  DriverManager.registerDriver(new oracle.jdbc.driver.OracleDriver());
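JDBC Database Interaction
[Figure: the Application uses the DriverManager, which manages drivers such as mySQLDriver and OracleDriver, to obtain a Connection. A Connection creates Statement, PreparedStatement and CallableStatement objects, each of which yields ResultSets.]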
Connecting to a Database
Statements and transactions are associated with connections.
There are several ways of establishing a connection. An example is:
  String url = "jdbc:oracle:thin:@sr.cs.man.ac.uk:1526:teach";
  Connection conn = DriverManager.getConnection(url,username,password);
Connection URL
The URL is of the form: jdbc:oracle:<drivertype>:@<hoststring>
An example hoststring is: aardvark.cs.man.ac.uk:1526:teach
Different driver types use pure java or include native code, and use generic or custom network protocols.
In the above, 1526 is the port, and teach is the system identifier.
Single Slide Example
import java.sql.*;

class Trains {
  public static void main (String args []) throws SQLException {
    DriverManager.registerDriver(...);
    String url = "...";
    Connection conn = DriverManager.getConnection (url,args[0],args[1]);
    Statement stmt = conn.createStatement();
    ResultSet rset = stmt.executeQuery ("select T# from TRAIN");
    while (rset.next())
      System.out.println (rset.getString(1));
  }
}
Statements
Queries are run against the database through the creation and execution of statements:
  Statement stmt = conn.createStatement();
  ResultSet rset = stmt.executeQuery ("select T# from TRAIN");
Note that the query is a String, which could be constructed at runtime if required.
Note the potential for runtime errors if the query is invalid.
Query Results
The result of executing a query is a ResultSet, which supports:
Iterator functionality, through boolean next(), boolean previous().
Tuple access functionality, as described on the next slide.
Update functionality, for results from simple queries, through deleteRow(), updateXXX().
Control functionality, through setFetchSize(int rows).
Accessing Result Tuples
In JDBC there is no predefined Java type for the result of a query, so attribute values are retrieved by:
  getXXX() functions, where XXX is the result type.
  The argument to the function is either the column position (starting from 1) or its name.
Note the potential for runtime errors if the result is not as anticipated.
Prepared Statements - 1
A PreparedStatement object allows an SQL statement to be run multiple times, with different parameters, without the SQL being recompiled by the database.
Simple example:
  PreparedStatement pstmt = conn.prepareStatement(
    "select t# from train where source = ?");
  pstmt.clearParameters();
  pstmt.setString(1,args[2]);
  ResultSet rset = pstmt.executeQuery();
Prepared Statements - 2
Creating a prepared statement: formal parameters are identified by "?"s:
  PreparedStatement pstmt = conn.prepareStatement(
    "insert into booking values (?,?,?)");
Parameters are bound using setXXX(pos,val), where pos starts from 1:
  pstmt.setString(1,args[2]);
The request is executed using executeQuery() or executeUpdate().
Single Slide Update
import java.sql.*;

class MakeBooking {
  public static void main (String args []) throws SQLException {
    DriverManager.registerDriver(...);
    String url = "...";
    Connection conn = DriverManager.getConnection(url,args[0],args[1]);
    PreparedStatement pstmt = conn.prepareStatement(
      "insert into booking values (?, ?, ?)");
    pstmt.clearParameters();
    pstmt.setString(1,args[2]);
    pstmt.setString(2,args[3]);
    pstmt.setDate(3,java.sql.Date.valueOf(args[4]));
    pstmt.executeUpdate();
  }
}
Update Results
Statement and PreparedStatement objects can be associated with queries and updates (as strings).
The result types of the outputs are different, however, so separate ResultSet executeQuery() and int executeUpdate() methods are required.
Transactions
By default, each statement executes in a distinct transaction.
To group statements, where conn is a Connection, use:
  conn.setAutoCommit(false) to override the single-statement default and start a transaction.
  conn.commit() and conn.rollback() to complete a transaction.
Closing Things Down
The close() operation is supported on lots of things: Connection. Statement. ResultSet.
In all cases, close() reclaims resources; it is good practice to close all the above as soon as possible.
Handling Errors - Important!
Connection conn = null;
try {
  ...
} catch (SQLException e) {
  System.out.println("SQL Exception: " + e.getMessage());
} finally {
  if (conn != null) {
    try {
      conn.rollback();
      conn.close();
    } catch (SQLException sqlEx) {
      // ignore
    }
  }
}
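The transaction and error-handling slides can be combined into one helper. What follows is a sketch under stated assumptions rather than a recommended library: TransactionSketch, SqlWork, runInTransaction and recordingConnection are hypothetical names, and recordingConnection builds a stand-in Connection with java.lang.reflect.Proxy purely so that the sketch can be exercised without a database. Unlike the slide, it rolls back only when the work fails.

```java
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.SQLException;
import java.util.List;

class TransactionSketch {

    /** Hypothetical callback type for the statements that make up one transaction. */
    interface SqlWork {
        void run(Connection conn) throws SQLException;
    }

    /**
     * Runs the work as a single transaction: commits if it succeeds, rolls back
     * if it throws, and closes the connection in both cases.
     * Returns true if the transaction committed.
     */
    static boolean runInTransaction(Connection conn, SqlWork work) {
        try {
            conn.setAutoCommit(false); // override the single-statement default
            work.run(conn);
            conn.commit();
            return true;
        } catch (SQLException e) {
            System.out.println("SQL Exception: " + e.getMessage());
            try {
                conn.rollback();
            } catch (SQLException rollbackEx) {
                // ignore: nothing more can be done here
            }
            return false;
        } finally {
            try {
                conn.close();
            } catch (SQLException closeEx) {
                // ignore
            }
        }
    }

    /** Stand-in Connection (an assumption for demonstration) that records the methods called on it. */
    static Connection recordingConnection(List<String> calls) {
        return (Connection) Proxy.newProxyInstance(
            TransactionSketch.class.getClassLoader(),
            new Class<?>[] { Connection.class },
            (proxy, method, args) -> {
                calls.add(method.getName());
                return null; // adequate here: only void methods are invoked
            });
    }
}
```

With a real connection obtained from DriverManager.getConnection, the work passed in would create and execute statements; the stand-in only shows the order of calls, which is setAutoCommit, commit, close on success and setAutoCommit, rollback, close on failure.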
Summary
JDBC is the most widely used means of accessing relational databases from Java.
JDBC is a class library: there are no syntactic extensions to Java.
JDBC supports dynamic SQL (i.e., queries are strings), which is flexible but opens the possibility of runtime type errors.
Impedance mismatches? See tutorial sheet.
Further Reading
Oracle 10g JDBC Developer's Guide and Reference, 2001 [Chapter 1: Overview; Chapter 3: Basic Features].
Sun JDBC Tutorial: http://java.sun.com/docs/books/tutorial/jdbc/
Object-Relational Databases
Weaknesses of vanilla relational databases:
  Limited data modelling facilities.
  Limited application development facilities.
Object-relational databases aim to overcome these weaknesses.
"Object-Relational" is an umbrella term for assorted extensions.
Model:
  Abstract data types (cartridges, blades, ...).
  Object type extensions.
Programming:
  Programming language extensions to SQL.
  Active rules/triggers.
Object Relational Databases
These add to the relational model: Object types. Nested tables. References. Inheritance. Methods. Abstract data types.
The SQL:2003 standard covers all of the above; in what follows, examples are from Oracle 10g.
SQL:1999 and SQL:2003
The SQL-92 standard now characterises basic relational functionality (as taught in CS231).
SQL:1999 was the successor for object-relational databases, developed throughout the '90s.
SQL:2003 refined the many extensions in SQL:1999 and started to add XML support.
SQL:1999/SQL:2003:
  Are not uniformly adopted; many vendors have their own object-relational extensions developed since the early '90s.
  Cover model extensions, type extensions, programming extensions, triggers, etc.
Object-Relational in Oracle
Model:
  Type system extensions to support object types, encapsulation, references.
  Primitive type extensions as cartridges to support multimedia data, spatial data, etc.
Programming:
  PL/SQL adds imperative programming to SQL.
  Triggers allow PL/SQL programs to be executed reactively.
Relational Model and Types
Data type completeness: each type constructor can be applied uniformly to types in the type system.
In the basic relational model:
  There is only one type constructor (i.e. relation).
  That type constructor cannot be applied to itself.
Incorporating data type completeness into the relational model gives nested relations.
In addition, the type relation is essentially: Bag < Tuple >.
Separating out these type constructors provides further flexibility, such as tuple-valued attributes.
Object Types in Oracle
An object type is a user-defined data type, somewhat analogous to a class in object-oriented programming.
Types can be arranged in hierarchies, and instances of types can be referenced.
  create type visit_type as object (
    name varchar(20), /* the station */
    thetime number);
Nested Relations
Nested relations involve the storage of one relation as an attribute of another.
  create type visit_tab_type as table of visit_type;

  create table train (
    t# varchar(10) not null,
    type char(1) not null,
    visits visit_tab_type,
    primary key (t#))
    nested table visits store as visits_tab;
Populating Nested Tables
The name of the type can be used as a constructor for values of the type.
  update train
  set visits = visit_tab_type(
    visit_type('Edinburgh',950),
    visit_type('Aberdeen',720))
  where t# = '22403101';
Querying Nested Tables
Query operations such as unnesting allow access to the contents of a nested table.
The following query retrieves details of the trains that visit Inverness.
  select *
  from train t, table(t.visits) v
  where v.name = 'Inverness';
Abstract Data Types
Abstract data types allow new primitive types to be added to a DBMS (a.k.a. data blades, cartridges).
These primitive types can be defined by (skilled) users or vendors.
Oracle-supplied cartridges include: Time. Text. Image. Spatial. Video.
Supporting the Spatial Types
Operations:
  Geometric (area, difference, ...).
  Topological:
Implementation:
  The cartridge uses specialised index structures such as R-trees.
The optimiser knows the properties of the R-tree, and how it can be used to make queries faster.
Programming in SQL
The following PL/SQL program iterates through a query result.
  declare
    cursor c1 is
      select t# from train
      where source = 'Edinburgh' or dest = 'Edinburgh';
  begin
    for ed_train in c1 loop
      insert into edinburgh values (ed_train.t#);
    end loop;
  end;
Example Program: Comments-1
PL/SQL is a block structured language, with structure:
  [declare
    declarations]
  begin
    statements
  [exception handlers]
  end
There is no relation type in PL/SQL, so cursors iterate over query results.
Example Program: Comments-2
The for loop iterates over the result of the query associated with the cursor, fetching results one at a time.
Each tuple retrieved from the cursor has type:
  record(t# varchar(10))
The type of the variable ed_train is inferred.
PL/SQL: More Cursors/Loops
declare
  cursor c1 is <as before>
  ed_tno train.t#%type;
begin
  open c1;
  loop
    fetch c1 into ed_tno;
    exit when c1%notfound;
    insert into edinburgh values (ed_tno);
  end loop;
  close c1;
end;
Loop Example: Comments
The declare section can introduce new cursors, types or variables.
Variables and cursors have attributes, such as %type, %rowtype and %notfound for accessing properties.
The cursor is explicitly opened, closed and fetched from (in contrast with the previous example).
The loop construct can mimic classical while-do and repeat-until loops.
Declaring Types
Types can be declared explicitly:
  As a choice, even if the type is in the database.
  If there is no direct analogue in the database.
Other than records, there are object types and lookup tables.
  declare
    type ed_train_type is record (
      t# varchar(10),
      thetime number);
    ed_train ed_train_type;
Collection Types
Collections tend to be important in databases:
Persistent data types tend to be bulk data types (e.g. relations).
Operations on bulk data types tend to act on complete collections (e.g. there is no operation to update a tuple in SQL-92).
There are normally few built-in collection types in programming languages (e.g. array).
Collections are often provided in class libraries (e.g. java.util.Collection).
PL/SQL Collection Types
Declarations:
  type name is table of type-name.
  type name is varray (size-limit) of type-name.
  type name is table of type-name index by binary_integer.
Unlike tables, varrays:
  Have a maximum size.
  Are dense, so elements cannot be deleted.
Oracle can store varrays and (non-indexed) tables in the database.
Stored varrays cannot be manipulated directly by SQL; they must be retrieved first.
Lots of curious rules...
Collection Type Example
declare
  -- cursor c1 is assumed to be declared here, as in the earlier examples
  type ed_train_type is table of train.t#%type
    index by binary_integer;
  ed_table ed_train_type;
  i binary_integer := 0;
begin
  for ed_train in c1 loop
    i := i + 1;
    ed_table(i) := ed_train.t#;
  end loop;
  ...
end;
Stored Procedures/Functions
Oracle supports stored procedures, functions and packages.
Stored procedures can be called from each other, from triggers, from Java, from Web Services, ...
[Figure: a client (written in C) sends SQL and calls to the DBMS, which holds tables, views, triggers and procedures.]
Example Header
A procedure has no result type, whereas a function returns a result.
Function header from tutorial:
  function FastestTrain
    (src varchar, dst varchar)
    return varchar
The body of a function is a PL/SQL block.
Results are returned using return.
Calling PL/SQL from JDBC
Connection conn = ...
// Create a CallableStatement
CallableStatement cstmt = conn.prepareCall("{? = call FastestTrain(?,?)}");
// Set its two parameters
cstmt.setString(2, args[2]);
cstmt.setString(3, args[3]);
cstmt.registerOutParameter(1, Types.VARCHAR);
// Execute the statement and print its result
cstmt.execute();
System.out.println("Fastest = " + cstmt.getString(1));
Summary
Claims for programming language extensions:
  Reduces impedance mismatches.
  Improves [Portfolio 01]: Performance. Programmer productivity. Portability. Security.
The reality:
  SQL extensions are often not elegant.
  They are widely used.
  They are not portable across products.
  Performance always has many facets.
Further Reading
Oracle 10g PL/SQL User's Guide and Reference [Chapter 1: Overview].
Oracle 9i PL/SQL User's Guide and Reference [Appendix 1: Example Programs].
M. Piattini, O. Diaz (eds), Advanced Database Technology and Design, Artech House, 2000 [Chapter 6: Object-Relational Database Systems].
M. Stonebraker, P. Brown, Object-Relational DBMSs, 2nd Edition, Morgan Kaufmann, 1999.
Triggers
An active database is one that can respond automatically to events.
The events to which a database may want to react are mostly within the database, but could in principle be outside.
Most relational products support active behaviour, and it is in SQL:2003.
Active behaviour is expressed using rules containing an event, an (optional) condition, and an action, a.k.a. ECA-rules.
These active rules are known as triggers in relational products and SQL:2003.
Applications of Triggers
Extending built-in behaviours: integrity constraints, auditing, authorisation, statistics, data derivation.
  Triggers are thus generic mechanisms, powerful, but often harder to use than the built-in behaviour.
Supporting application functionality:
  Alerters: the user is informed when something significant happens.
  Business rules: an organisational behaviour is enforced or carried out as a reaction to database changes.
Business Rules
Recovering business rules:
  Indicate how the organisation recovers from a problem.
  Example: too many people have enrolled on a seminar for the space allocated.
  Reaction: book larger room, run two seminars in parallel, ...
Causal business rules:
  Bring about a behaviour when a condition is satisfied.
  Example: enough people enrol for a seminar to make it viable.
  Reaction: book a room, inform potential attendees, inform tutor.
Trigger Structure
Oracle trigger syntax:
  create or replace trigger name
  event
  [for each row]
  [when condition]
  action
In Oracle:
  The condition is a boolean expression (that does not access the database).
  The action is a PL/SQL block.
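Rule Triggering
Rulebase:
  R1: on U1 when C1 do U2, U4
  R2: on U2 when C2 do U3
[Figure: a transaction performs updates U0 ... U1. U1 triggers R1; if C1 holds, R1 performs U2. U2 triggers R2; if C2 holds, R2 performs U3. R1 then performs U4.]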
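The cascade in this rulebase can be traced with a small simulation. This is a sketch under stated assumptions, not how any DBMS implements triggers: rules are processed depth-first as soon as their event occurs, conditions are simply named as true or false, and RuleCascadeDemo, Rule and run are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class RuleCascadeDemo {

    /** An ECA rule: on event, when the condition holds, do the listed updates in order. */
    static final class Rule {
        final String name;
        final String event;
        final String condition;
        final List<String> actions;

        Rule(String name, String event, String condition, String... actions) {
            this.name = name;
            this.event = event;
            this.condition = condition;
            this.actions = Arrays.asList(actions);
        }
    }

    /** The rulebase from the slide. */
    static final List<Rule> RULEBASE = Arrays.asList(
        new Rule("R1", "U1", "C1", "U2", "U4"),
        new Rule("R2", "U2", "C2", "U3"));

    /** Performs an update, then immediately processes every rule that it triggers. */
    static void execute(String update, Set<String> trueConditions, List<String> log) {
        log.add(update);
        for (Rule rule : RULEBASE) {
            if (rule.event.equals(update) && trueConditions.contains(rule.condition)) {
                for (String action : rule.actions) {
                    execute(action, trueConditions, log);
                }
            }
        }
    }

    /** Returns the order in which updates are performed, given the conditions that hold. */
    static List<String> run(String update, String... trueConditions) {
        List<String> log = new ArrayList<>();
        execute(update, new HashSet<>(Arrays.asList(trueConditions)), log);
        return log;
    }
}
```

Calling run("U1", "C1", "C2") gives the order U1, U2, U3, U4; if C2 does not hold, U3 is skipped, and if C1 does not hold, only U1 is performed.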
Trigger Concepts - 1
Transition granularity. A rule may trigger:
  once per tuple change: row transition granularity.
  once per update statement: statement transition granularity.
Coupling mode. A rule condition may evaluate:
  as soon as the event has taken place: immediate coupling mode.
  at some time after the event took place: deferred coupling mode.
Trigger Concepts - 2
Priorities:
  A single event may trigger multiple rules.
  A collection of deferred rules may be triggered at the same time by different events.
  Priorities may be: unspecified, relative, absolute, by creation date.
Event types:
  A primitive event type is considered an atomic happening (e.g., the update to a tuple, a time of day).
  A composite event type is based on an algebra over primitive events (e.g., E1 OR E2, E1 AND E2, ...).
Oracle Triggers
Transition granularity:
  row triggers: FOR EACH ROW.
  statement triggers: no FOR EACH ROW.
Coupling mode: immediate.
Priorities: unspecified.
Event types:
  primitive (DML, DDL and system (e.g. startup/shutdown)).
  composite (but only OR).
DML Events
Follow database updates:
  [BEFORE|AFTER] INSERT ON table.
  [BEFORE|AFTER] DELETE ON table.
  [BEFORE|AFTER] UPDATE ON table.
  [BEFORE|AFTER] UPDATE OF column ON table.
Plus disjunction, e.g.: BEFORE INSERT OR UPDATE ON visit.
Condition
Row triggers can have conditions that guard the action.
The condition is a boolean expression (AND, OR, NOT, >, <, ...).
The condition refers to literals and to event properties through correlation variables, e.g.:
  WHEN new.age < 21.
Correlation variables available depend on event types:
  Event     new   old
  INSERT    Y     N
  DELETE    N     Y
  UPDATE    Y     Y
Action
An action is a PL/SQL block. An action:
can refer to correlation variables, as :new, :old (row triggers only).
can test the type of event being reacted to using inserting, updating, deleting.
cannot use transaction control commands directly (but can raise exceptions).
Trigger Design Issues
Termination:
  Triggers can trigger each other recursively, which may lead to cycles (or to a cascade threshold being reached, as in Oracle).
Confluence:
  The (arbitrary) order of selection for multiple triggered rules may lead to unanticipated behaviour.
Mutating tables: in Oracle, a row trigger cannot modify a table in mid-update. The following trigger illustrates the problem, as it deletes from the table whose insert fired it:
  create or replace trigger t9
  before insert on visit
  for each row
  begin
    delete from visit where t# = :new.t#;
  end;
Example Triggers
Requirement: maintain a table numBookings of the numbers of bookings of each train on each date.
Events to monitor:
  insert on booking.
  delete on booking.
  update of t# on booking.
  update of date on booking.
  create table numBookings (
    t# varchar(10) references train(t#),
    thedate date,
    num number,
    primary key (t#, thedate));
Insert Case
create or replace trigger numBookings2
after insert on booking
for each row
declare
  numPresent integer;
begin
  select count(*) into numPresent
  from numBookings
  where t# = :new.t# and thedate = :new.thedate;
  if (numPresent = 0) then
    insert into numBookings values (:new.t#, :new.thedate, 1);
  else
    update numBookings
    set num = num + 1
    where t# = :new.t# and thedate = :new.thedate;
  end if;
end;
Comments on Insert Case
An AFTER event is used, as numBookings should only be updated if booking has actually changed.
No condition is used, as the action must be carried out for every insert on booking.
The action creates a numBookings tuple if none was present before (the corresponding delete action should remove it if no bookings remain).
Delete Case
create or replace trigger numBookings1
after delete on booking
for each row
declare
  currentNumber integer;
begin
  select num into currentNumber
  from numBookings
  where t# = :old.t# and thedate = :old.thedate;
  if (currentNumber = 1) then
    delete from numBookings
    where t# = :old.t# and thedate = :old.thedate;
  else
    update numBookings
    set num = num - 1
    where t# = :old.t# and thedate = :old.thedate;
  end if;
end;
Comments on Delete Case
Broadly the inverse of the insert case.
Many references to the :old correlation variable (cf. :new for the insert case).
The update case is broadly a delete then insert (see tutorial).
This problem can also be addressed using statement triggers (see tutorial).
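The bookkeeping performed by the two row triggers can be checked in isolation. The sketch below is an in-memory analogue under stated assumptions, not a replacement for the triggers: a Map stands in for the numBookings table, dates are plain strings, the update case is treated as a delete followed by an insert as suggested above, and NumBookingsDemo and its method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class NumBookingsDemo {

    /** Stand-in for the numBookings table: the key combines t# and thedate, the value is num. */
    final Map<String, Integer> numBookings = new HashMap<>();

    private static String key(String trainNo, String date) {
        return trainNo + "|" + date;
    }

    /** Mirrors the insert case: create the tuple with num = 1, or increment num. */
    void afterInsertOnBooking(String trainNo, String date) {
        numBookings.merge(key(trainNo, date), 1, Integer::sum);
    }

    /** Mirrors the delete case: remove the tuple when the last booking goes, else decrement num. */
    void afterDeleteOnBooking(String trainNo, String date) {
        String k = key(trainNo, date);
        Integer current = numBookings.get(k);
        if (current == null) {
            // the PL/SQL "select ... into" would fail here because no row is found
            throw new IllegalStateException("no numBookings tuple for " + k);
        }
        if (current == 1) {
            numBookings.remove(k);
        } else {
            numBookings.put(k, current - 1);
        }
    }

    /** Update of t# or date, treated as a delete of the old values then an insert of the new. */
    void afterUpdateOnBooking(String oldTrainNo, String oldDate, String newTrainNo, String newDate) {
        afterDeleteOnBooking(oldTrainNo, oldDate);
        afterInsertOnBooking(newTrainNo, newDate);
    }

    /** Number of bookings recorded for a train on a date (0 if there is no tuple). */
    int count(String trainNo, String date) {
        return numBookings.getOrDefault(key(trainNo, date), 0);
    }
}
```

Two inserts for the same train and date followed by one delete leave a count of 1; moving that booking to another date removes the old tuple and creates a new one, so the table never holds a tuple with num = 0.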
Identifying Events
A single application functionality may need to monitor many events.
Example:
  Tables: emp(ename,bname,sal) boss(bname,sal)
  Constraint: no employee is paid more than his/her boss.
Quiz: what events may invalidate the constraint?
  NEW on ??? UPDATE on ??? UPDATE on ??? UPDATE on ???.
Choosing Reactions
Many reactions may be plausible, for example, to restore a constraint.
Different policies may be used in responding to different events: for example, may change boss if boss.sal is reduced, but raise the salary of the boss to match an increase in the employee's salary.
Quiz: what reactions could be used to resatisfy the constraint?
Possible reactions: Decrease ??? Increase ??? Change ??? Delete ??? Delete ???
Selecting Transition Granularity
Tuple:
  Access available to correlation variables.
  Precise response to specific changes possible.
  Often need many triggers to handle fine grained reactions.
Statement:
  No access to correlation variables.
  No possibility of precise response to changes.
  Often need fewer triggers as generic reaction not very fine grained.
Summary on Triggers
Triggers:
Extend the ways in which programming functionality can be stored in the database.
Extend built-in facilities for integrity, security, etc.
Are powerful ... but not always easy to develop or maintain.