DBIx::DataModel 2.0 in detail
YAPC::EU::2011, Riga
[email protected]
Département Office
May 10, 2015
Agenda
• Introduction : Object-Relational Mappings & UML
• DBIx::DataModel 2.0 Architecture
• Modelling the schema
• Selecting data
• Row objects and statement objects
• Joins
• Inserts and updates
• Customization
• Strengths and limitations of DBIx::DataModel
today on CPAN : v 1.99_05
Object-Relational Mappers
Perl and databases
Database
DBD driver
DBI
Object-Relational Mapper
Perl program
ORM principle
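[Figure: in the RDBMS, table1 (columns c1, c2, c3; rows r1, r2, ...) and table2 (columns c3, c4); in RAM, objects r1 : class1 and r2 : class1, where class1 has attributes +c1: String, +c2: String, +c3: class2]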
Impedance mismatch
• SELECT c1, c2 FROM table1
– missing c3, so cannot navigate to class2
– is it a valid instance of class1 ?
• SELECT * FROM table1 LEFT JOIN table2 ON …
– is it a valid instance of class1 ?
– what to do with the c4 column ?
• SELECT c1, c2, length(c2) AS l_c2 FROM table1
– no predeclared method in class1 for accessing l_c2
The Unified Modeling Language (UML)
Example : CPAN model
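[Figure: UML class diagram with classes Author, Distribution and Module. Author 1 to * Distribution (association); Distribution 1 to * Module (composition, assoc. name "contains ►"); Distribution * to * Module (assoc. name "depends_on ►", roles dependent_distribs and prereq_modules). Callouts: class, association, assoc. name, role, multiplicity, composition]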
Modelling a schema
Architecture
[Figure: architecture diagram. DBIx::DataModel classes: Schema, Source, Table, Join, Statement. Application classes: My::DB, My::DB::Table_n, My::DB::AutoJoin::… Objects: schema, row, statement]
quite similar to DBI architecture (dbh, sth)
All definitions in one single file
use DBIx::DataModel;
DBIx::DataModel->Schema("My::DB")                     # creates package My::DB
  ->Table(qw/Author       author       author_id /)   # creates package My::DB::Author
  ->Table(qw/Distribution distribution distrib_id/)
  ->Table(qw/Module       module       module_id /)
  ->Association([qw/Author       author   1   /],     # adds methods into both packages
                [qw/Distribution distribs 0..*/])
  ->Composition([qw/Distribution distrib  1   /],
                [qw/Module       modules  1..*/]);
Multiplicities
"$min..$max"
"*" means "0..POSIX::INT_MAX"
"1" means "1..1"
$min == 0 ? joins are "LEFT OUTER JOIN" : joins are "INNER JOIN"
$max > 1 ? default result is a list : default result is a single object
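The two rules above can be sketched in plain Perl. This is only an illustration: parse_multiplicity() and defaults_for() are hypothetical helpers, not DBIx::DataModel functions, and the sketch assumes that "*" inside a range also stands for POSIX::INT_MAX.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper (not part of DBIx::DataModel): turn a multiplicity
# string such as "1", "*", "0..*" or "1..*" into a ($min, $max) pair.
sub parse_multiplicity {
    my ($spec) = @_;
    $spec = "0..*"         if $spec eq '*';        # "*" means "0..*"
    $spec = "$spec..$spec" if $spec !~ /\.\./;     # "1" means "1..1"
    my ($min, $max) = split /\.\./, $spec;
    $max = POSIX::INT_MAX() if $max eq '*';
    return ($min, $max);
}

# Derive the default join type and default result shape from the multiplicity.
sub defaults_for {
    my ($spec) = @_;
    my ($min, $max) = parse_multiplicity($spec);
    return {
        join_type => $min == 0 ? 'LEFT OUTER JOIN' : 'INNER JOIN',
        result    => $max > 1  ? 'list'            : 'single object',
    };
}

for my $spec ('1', '0..1', '0..*', '1..*', '*') {
    my $d = defaults_for($spec);
    printf "%-5s => %-15s / %s\n", $spec, $d->{join_type}, $d->{result};
}
```

With the CPAN model, "distribs 0..*" would give a LEFT OUTER JOIN returning a list, while "author 1" would give an INNER JOIN returning a single object.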
Meta-Architecture
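[Figure: meta-architecture diagram. Classes Schema, Table, Join and application classes My::DB, My::DB::Table_n, My::DB::Auto_join, each with a metaobject: meta::schema, meta::table, meta::join. Metaclasses: Meta::Schema, Meta::Source, Meta::Table, Meta::Join, Meta::Association, Meta::Path, Meta::Type, with instances meta::assoc, meta::path, meta::type]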
Data retrieval
Fetching one single record
# fetch from primary key
# by default, retrieves all columns ('*')
my $author = My::DB->table('Author')->fetch('DAMI');
# reach columns through the hashref API
while (my ($k, $v) = each %$author) {
  print "$k : $v\n";
}
Multi-schema mode
# create a schema
my $dbh    = DBI->connect(…);
my $schema = My::DB->new(dbh => $dbh);
# fetch data
my $author = $schema->table('Author')->fetch('DAMI');
Fetching a list of records
# select multiple records
my $recent_distribs = My::DB->table('Distribution')->select(
  -columns  => [qw/distrib_name d_release/],
  -where    => {d_release => {'>' => $some_date}},
  -order_by => [qw/-d_release +distrib_name/],
);
foreach my $distrib (@$recent_distribs) {
  ...
}
Select API : overview
my $result = $source->select(
  -columns      => \@columns,
  -where        => \%where,
  -group_by     => \@groupings,
  -having       => \%criteria,
  -order_by     => \@order,
  -for          => 'read only',
  -post_SQL     => sub { … },
  -pre_exec     => sub { … },
  -post_exec    => sub { … },
  -post_bless   => sub { … },
  -page_size    => …,
  -page_index   => …,
  -limit        => …,
  -offset       => …,
  -column_types => \%types,
  -result_as    => 'rows' || 'sth' || 'sql' || 'statement' || 'hashref',
);
Arguments to select()
SQL::Abstract::More : named parameters
# SQL::Abstract::More : named parameters
my $result = $source->select(
  -columns  => \@columns,
  -where    => \%where,
  -order_by => \@order,
);
# SQL::Abstract : positional parameters
SQL::Abstract->new->select($table, \@columns, \%where, \@order)
SQL::Abstract::More : extensions
• -columns => [qw/col1|alias1 max(col2)|alias2/]
– SELECT col1 AS alias1, max(col2) AS alias2
• -columns => [-DISTINCT => qw/col1 col2 col3/]
– SELECT DISTINCT col1, col2, col3
• -order_by => [qw/col1 +col2 -col3/]
– SELECT … ORDER BY col1, col2 ASC, col3 DESC
• -for => "update" || "read only"
– SELECT … FOR UPDATE
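To make the -order_by prefix convention concrete, here is a minimal sketch in plain Perl. order_by_clause() is a hypothetical helper written for this illustration; it is not the SQL::Abstract::More implementation and only handles the flat list form shown above.

```perl
use strict;
use warnings;

# Hypothetical helper: translate "+col" / "-col" prefixes into ASC / DESC.
sub order_by_clause {
    my @cols = @_;
    my @parts;
    for my $col (@cols) {
        if    ($col =~ /^\+(.+)/) { push @parts, "$1 ASC"  }   # explicit ascending
        elsif ($col =~ /^-(.+)/)  { push @parts, "$1 DESC" }   # descending
        else                      { push @parts, $col      }   # database default
    }
    return "ORDER BY " . join(", ", @parts);
}

print order_by_clause(qw/col1 +col2 -col3/), "\n";
# prints: ORDER BY col1, col2 ASC, col3 DESC
```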
Grouping
• -group_by => [qw/col1 col2 …/]
• -having => { col1 => {"<" => val1} , col2 => ... }
– SELECT … GROUP BY col1, col2 HAVING col1 < ? AND col2 …
– separate call to SQL::Abstract and re-injection into the SQL
Paging
• -page_size => $num_rows, -page_index => $page_index
– page index starts at 1
# or
• -limit => $num_rows, -offset => $row_index
– offset starts at 0
either new call to $sth->execute(), or use scrollable cursors (DBIx::DataModel::Statement::JDBC)
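The relation between the two forms can be sketched as simple arithmetic. page_to_limit_offset() is a hypothetical helper, and the formula assumes the usual convention that page 1 starts at offset 0; it is an illustration, not the library's code.

```perl
use strict;
use warnings;

# Hypothetical helper: convert (-page_size, -page_index) into (-limit, -offset).
# Assumes page_index starts at 1 and offset starts at 0.
sub page_to_limit_offset {
    my ($page_size, $page_index) = @_;
    die "page_index starts at 1\n" if $page_index < 1;
    return (
        limit  => $page_size,
        offset => ($page_index - 1) * $page_size,
    );
}

my %args = page_to_limit_offset(20, 3);
print "-limit => $args{limit}, -offset => $args{offset}\n";
# prints: -limit => 20, -offset => 40
```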
Callbacks
-post_SQL => sub { … }, -pre_exec => sub { … }, -post_exec => sub { … }, -post_bless => sub { … },
• hooks to various states within the statement lifecycle (see later)
• sometimes useful for DB-specific features
Polymorphic result
-result_as =>
– 'rows' (default) : arrayref of row objects
– 'firstrow' : a single row object (or undef)
– 'hashref' : hashref keyed by primary keys
– [hashref => @cols] : cascaded hashref
– 'flat_arrayref' : flattened values from each row
– 'statement' : a statement object (iterator)
– 'fast_statement' : statement reusing same memory
– 'sth' : DBI statement handle
– 'sql' : ($sql, @bind_values)
– 'subquery' : \["($sql)", @bind]
don't need method variants : select_hashref(), select_arrayref(), etc.
Row objects
A row object …
• is just a hashref
– keys are column names
– values are column values
– nothing else
• actually, when in multi-schema mode, there is an additional __schema field
• is blessed into the table class
– has a metadm method (accessor to the metaclass)
– has a schema method (accessor to the schema)
– has methods for navigating to related tables
• can be dumped as is
– to Dumper / YAML / JSON / XML
– to Perl debugger
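A minimal sketch of what "a blessed hashref and nothing else" means in practice. My::Sketch::Author and its full_name() method are hypothetical stand-ins for a generated table class; no database or DBIx::DataModel code is involved.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical stand-in for a table class generated by the schema declaration.
package My::Sketch::Author;
sub full_name { my $self = shift; return "$self->{firstname} $self->{lastname}" }

package main;

# A row is a plain hashref (column name => column value), blessed into its class.
my $row = bless {firstname => 'Larry', lastname => 'Wall'}, 'My::Sketch::Author';

print join(", ", sort keys %$row), "\n";   # ordinary hash API on the columns
print $row->full_name, "\n";               # methods come from the class

$Data::Dumper::Sortkeys = 1;
print Dumper($row);                        # dumps as is: only the columns appear
```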
Columns …
• basically, are plain scalars, not objects
• but can be "inflated/deflated" through a Type()
• programmer chooses the column list, at each select()
    -columns => \@columns    # arrayref
    -columns => "col1, col2" # string
    -columns => "*"          # default
• objects have variable size !
– if missing keys : runtime error
    • when following joins
    • when updating and deleting
Navigation to associated tables
• Method names come from association declarations
• Exactly like a select()
– automatically chooses -result_as => 'rows' || 'firstrow' from multiplicity information
# ->Association([qw/Author       author   1   /],
#               [qw/Distribution distribs 0..*/])
my $author         = $distrib->author();
my $other_distribs = $author->distribs(
  -columns  => [qw/. . ./],
  -where    => { . . . },
  -order_by => [qw/. . ./],
);
Statement objects
Statement: an encapsulated query
[Figure: UML diagram of a statement and its links: meta::source / My::Source (1), schema with its dbh (1), rows obtained through next() / all() (*), My::Schema and meta::schema. Callouts: in single-schema mode, in multi-schema mode, singleton()]
Statement lifecycle
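[Figure: statement lifecycle. States: new → sqlized → prepared → executed. new() takes schema + source and yields a statement in state "new"; sqlize(), prepare() and execute() each move it to the next state; refine() is available in state "new"; bind() is available in every state; an executed statement can go through bind() / execute() again; next() / all() return data row(s), blessed and with column types applied. Callbacks along the way: -post_SQL, -pre_exec, -post_exec, -post_bless]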
When to explicitly use a statement
• as iterator
    my $statement = $source->select(..., -result_as => 'statement');
    while (my $row = $statement->next) {
      . . .
    }
• for paging
    $statement->goto_page(123);
• for loop efficiency
    my $statement = My::Table->join(qw/role1 role2/);
    $statement->prepare(-columns => ..., -where => ...);
    my $list = My::Table->select(...);
    foreach my $obj (@$list) {
      my $related_rows = $statement->execute($obj)->all;
      ...
    }
Fast statement
• like a regular statement
– but reuses the same memory location for each row
– see DBI::bind_col()
my $statement = $source->select(
  . . . ,
  -result_as => 'fast_statement',
);
while (my $row = $statement->next) {
  . . .
  # DO THIS :
  print $row->{col1}, $row->{col2};
  # BUT DON'T DO THIS :
  push @results, $row;
}
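Why pushing $row is a mistake can be shown without any database. The sketch below uses a hypothetical iterator, make_fast_iterator(), that imitates the "same memory location" behaviour by reusing one hash for every row; it is an illustration of the pitfall, not the fast_statement implementation.

```perl
use strict;
use warnings;

# Hypothetical iterator imitating a fast statement: a single hash is
# overwritten for each row, and the same reference is returned every time.
sub make_fast_iterator {
    my @data = @_;
    my %row;
    return sub {
        my $src = shift @data or return;
        %row = %$src;      # overwrite in place
        return \%row;      # always the same reference
    };
}

my @source = ({col1 => 'a'}, {col1 => 'b'}, {col1 => 'c'});
my $next   = make_fast_iterator(@source);

my (@wrong, @right);
while (my $row = $next->()) {
    push @wrong, $row;        # stores the same reference three times
    push @right, { %$row };   # stores a shallow copy of the current row
}

print join(",", map { $_->{col1} } @wrong), "\n";   # prints: c,c,c
print join(",", map { $_->{col1} } @right), "\n";   # prints: a,b,c
```

If rows must be kept after the loop, copying them as in @right works, but a regular statement is then probably the simpler choice.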
Database joins
Basic join
$rows = My::DB->join(qw/Author distribs modules/) ->select(-where => ...);
[Figure: the join over Author, Distrib and Module is a new class My::DB::AutoJoin::…, created on the fly as a subclass of DBIDM::Source::Join]
Left / inner joins
->Association([qw/Author author 1 /], [qw/Distribution distribs 0..* /])
# default : LEFT OUTER JOIN
->Composition([qw/Distribution distrib 1 /], [qw/Module modules 1..* /]);
# default : INNER JOIN
# but defaults can be overridden
My::DB->join(qw/Author       <=> distribs/)-> . . .
My::DB->join(qw/Distribution =>  modules /)-> . . .
Join from an instance
$rows = $author->join(qw/distribs modules/)->select(
  -columns => [qw/distrib_name module_name/],
  -where   => {d_release => {'<' => $date} },
);

SELECT distrib_name, module_name
FROM distribution INNER JOIN module
  ON distribution.distrib_id = module.distrib_id
WHERE distribution.author_id = $author->{author_id}
  AND d_release < $date
Insert / Update
Insert
@ids = MyDB::Author->insert(
  { firstname => 'Larry',  lastname => 'Wall'   },
  { firstname => 'Damian', lastname => 'Conway' },
);

INSERT INTO author(firstname, lastname) VALUES (?, ?)
Bulk insert
@ids = MyDB::Author->insert(
  [qw/firstname lastname/],
  [qw/Larry     Wall    /],
  [qw/Damian    Conway  /],
);
Insert into / cascaded insert
@id_trees = $author->insert_into_distribs(
  {distrib_name => 'DBIx-DataModel',
   modules      => [
     {module_name => 'DBIx::DataModel', ..},
     {module_name => 'DBIx::DataModel::Statement', ..},
   ]},
  {distrib_name => 'Pod-POM-Web', … },
  -returning => {},
);
Update
$obj->{col1} = $new_val_1;
$obj->{col2} = $new_val_2;
. . .
$obj->update;
# or
MyDB::Author->update({author_id => $id, col => $new_val});
# or
MyDB::Author->update($id, {col => $new_val});
# or (bulk update)
MyDB::Author->update(-set   => {col => $new_val},
                     -where => \%condition);
Transaction
MyDB->do_transaction(sub {
  my $author   = MyDB::Author->fetch($author_id, {-for => "read only"});
  my $distribs = $author->distribs(-for => 'update');
  foreach my $distrib (@$distribs) {
    my $id = $distrib->{distrib_id};
    MyDB::Distrib->update($id, {col => $val});
  }
});
• can be nested
• can involve several dbh
• no savepoints (yet)
Other features
Named placeholders / bind()
# introduce named placeholders
$statement->prepare(-where => {
  col1 => '?:foo',
  col2 => {"<" => '?:bar'},
  col3 => 1234,
});
# fill placeholders with values
$statement->bind(foo => 99, bar => 88, other => 77);
$statement->bind($hashref);
$sql  : SELECT * FROM .. WHERE col1 = ? AND col2 < ? AND col3 = ?
@bind : (?:foo, ?:bar, 1234)
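The idea behind bind() can be sketched in plain Perl: the bind list keeps '?:name' markers until values are supplied. resolve_named() is a hypothetical helper for this illustration only; leaving unknown names untouched and ignoring unused values (such as "other") are assumptions of the sketch, not documented library behaviour.

```perl
use strict;
use warnings;

# Hypothetical resolver: replace each '?:name' entry of the bind list by the
# value supplied for that name; other entries are passed through unchanged.
sub resolve_named {
    my ($bind, %values) = @_;
    return map {
        my $v = $_;
        (defined $v && $v =~ /^\?:(\w+)$/ && exists $values{$1})
            ? $values{$1}
            : $v;
    } @$bind;
}

my @bind     = ('?:foo', '?:bar', 1234);
my @resolved = resolve_named(\@bind, foo => 99, bar => 88, other => 77);
print "@resolved\n";
# prints: 99 88 1234
```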
Types (inflate/deflate)
# declare a Type
My::DB->Type(Multivalue =>
  from_DB => sub {$_[0] = [split /;/, $_[0]] },
  to_DB   => sub {$_[0] = join ";", @{$_[0]} },
);
# apply it to some columns in a table
My::DB::Author->metadm->define_column_type(
  Multivalue => qw/hobbies languages/,
);
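The handlers work by assigning to $_[0], which is an alias to the caller's value, so the column is converted in place. A minimal stand-alone sketch follows; the %handlers hash and the guards on defined / ref are additions for this illustration, not part of the declaration above.

```perl
use strict;
use warnings;

# Same idea as the Multivalue type: both handlers modify $_[0] in place.
my %handlers = (
    from_DB => sub { $_[0] = [split /;/, $_[0]] if defined $_[0] },
    to_DB   => sub { $_[0] = join ";", @{$_[0]} if ref $_[0] },
);

my %row = (name => 'Larry', hobbies => 'chess;hiking');

# inflate: the hash value itself becomes an arrayref, through @_ aliasing
$handlers{from_DB}->($row{hobbies});
print scalar(@{$row{hobbies}}), " hobbies\n";   # prints: 2 hobbies

# deflate: back to the string stored in the database
$handlers{to_DB}->($row{hobbies});
print "$row{hobbies}\n";                        # prints: chess;hiking
```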
Auto_expand
# declare auto-expansions
MyDB::Author->define_auto_expand(qw/distributions/);
MyDB::Distribution->define_auto_expand(qw/modules/);
# apply to an object (automatically fetches all modules of all distributions of that author)
$author->auto_expand();
# use the data tree
use YAML; print Dump($author);
Schema localization
{
  # a kind of "local MyDB";
  my $guard = MyDB->localize_state();
  # temporary change class data
  MyDB->dbh($new_dbh, %new_options);
  do_some_work_with_new_dbh();
} # automatically restore previous state
Schema generator
perl -MDBIx::DataModel::Schema::Generator \
     -e "fromDBI('dbi:connection:string')" -- \
     -schema My::New::Schema > My/New/Schema.pm

perl -MDBIx::DataModel::Schema::Generator \
     -e "fromDBIxClass('Some::DBIC::Schema')" -- \
     -schema My::New::Schema > My/New/Schema.pm
Auto_insert / Auto_update / No_update
$table->auto_insert_columns(created_by => sub {$ENV{REMOTE_USER} . ", " . localtime });
$table->auto_update_columns( modified_by => sub {…} );
$table->no_update_columns(qw/row_id/);
can also be declared for the whole schema
Extending / customizing DBIx::DataModel
• Schema hooks for
– SQL dialects (join syntax, alias syntax, limit / offset, etc.)
– last_insert_id
• Ad hoc subclasses for
– SQL::Abstract
– Table
– Join
– Statements
• Statement callbacks
• Extending table classes
– additional methods
– redefining _singleInsert method
Conclusion
Strengths
• centralized definitions of tables & associations
• efficiency
• improved API for SQL::Abstract
• clear conceptual distinction between
– data sources (tables and joins),
– database statements (stateful objects representing SQL queries)
– data rows (lightweight blessed hashrefs)
• concise and flexible syntax for joins
• used in production for mission-critical app
– (running Geneva courts)
Limitations
• tiny community
• no schema versioning
• no object caching nor 'dirty columns'
• no 'cascaded update' nor 'insert or create'
Lots of documentation
• SYNOPSIS AND DESCRIPTION
• DESIGN
• QUICKSTART
• REFERENCE
• COOKBOOK
• MISC
• INTERNALS
• GLOSSARY
THANK YOU FOR YOUR ATTENTION
Bonus slides
ORM: What for ?
[catalyst list] On Thu, 2006-06-08, Steve wrote:
Not intending to start any sort of rancorous discussion, but I was wondering whether someone could illuminate me a little?
I'm comfortable with SQL, and with DBI. I write basic SQL that runs just fine on all databases, or more complex SQL when I want to target a single database (usually postgresql).
What value does an ORM add for a user like me?
ORM useful for …
• dynamic SQL
– navigation between tables
– generate complex SQL queries from Perl datastructures
– better than phrasebook or string concatenation
• automatic data conversions (inflation / deflation)
• expansion of tree data structures coded in the relational model
• transaction encapsulation
• data validation
• computed fields
• caching
• …
See Also : http://lists.scsys.co.uk/pipermail/catalyst/2006-June
Many-to-many implementation
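[Figure: UML diagram of the many-to-many implementation. Author (author_id, author_name, e_mail) 1 to * Distribution (distrib_id, distrib_name, d_release, author_id); Distribution 1 to * Module (module_id, module_name, distrib_id); Dependency (distrib_id, module_id) is the link table for the n-to-n association between Distribution and Module]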
Writing SQL
"SQL is too low-level, I don't ever want to see it"
"SQL is the most important part of my application, I won't let anybody write it for me"
Why hashref instead of OO accessors ?
• Perl builtin rich API for hashes (keys, values, slices, string interpolation)
• good for import / export in YAML/XML/JSON
• easier to follow steps in Perl debugger
• faster than OO accessor methods
• visually clear distinction between lvalue / rvalue
– my $val = $hashref->{column};
– $hashref->{column} = $val;
• visually clear distinction between
– $row->{column} / $row->remote_table()
Callback example
WITH RECURSIVE nodetree(level, id, pid, sort) AS (
  SELECT 1, id, parent, '{1}'::int[]
    FROM nodes WHERE parent IS NULL
  UNION
  SELECT level+1, p.id, parent, sort||p.id
    FROM nodetree pr JOIN nodes p ON p.parent = pr.id
)
SELECT * FROM nodetree ORDER BY sort;

my $with_clause = "WITH RECURSIVE …";
DBIx::DataModel->Schema('Tst')
  ->Table(qw/Nodetree nodetree id/);
my $result = Tst::Nodetree->select(
  -post_SQL => sub { my $sql = shift;
                     $sql =~ s/^/$with_clause/;
                     return $sql, @_ },
  -order_by => 'sort',
);
Verbose form for definitions
DBIx::DataModel->define_schema(
  class => "My::DB",
);
My::DB->metadm->define_table(
  class       => "Author",
  db_name     => "author",
  primary_key => "author_id",
);
New features in 2.0
• metaclasses
– client can query about tables, associations, types, etc.
– method namespace for regular objects is not polluted by meta-methods
• single-schema / multi-schema mode
• misc additions
– bulk update & bulk delete
– support for table inheritance
– arbitrary clauses in joins
• API changes
– perlish_method_names()
– SQL generation moved to SQL::Abstract::More
– Params::Validate everywhere
– deprecated Autoload()
– compatibility layer : use DBIx::DataModel -compatibility => 1.0
Migration to v2.0
• deploy DBIx::DataModel 2.0
– -compatibility => 1.0
• test
• change client code according to new API
• test
• suppress compatibility layer
DBIx::DataModel history
• 2005 : 1st CPAN publication (v0.10, 16.09.05)
• 2006 : YAPC::EU::06 Birmingham presentation
• 2008 : heavy refactoring (v1.03, 23.09.08)
– statement object
– implicit schema name
• 2011 : heavy refactoring
– metaobject layer
– multi-schema mode
– API renaming