Welcome to lecture 5: Object-Oriented Programming in Perl. IGERT-Sponsored Bioinformatics Workshop Series. Michael Janis and Max Kopelevich, Ph.D., Dept. of Chemistry & Biochemistry, UCLA
Jan 10, 2016
We’ve been cruising! (Whew!)
Two weeks to go… After this lecture we’ll have a bit of payoff: now we’ll start using some of our knowledge! (Don’t give up!)
Last time…
• We covered a bit of material…
• Try to keep up with the reading – it’s all in there!
• We’ve covered subroutines and modules…
– Now we’ll cover OOP in Perl
– We’ll create classes of our own to use
– We’ll take our previous example of a biological problem (gene finding) and see how OOP can help us…
– We are preparing for the introduction to BioPerl next week – which is object-oriented!
Object-Oriented Perl (A gentle introduction)
We have been dealing with increasingly complex data structures.
A result is that we always need to be concerned about the data state.
Let’s turn this around and see if there is a better way to think about biological data.
– We’d like to keep the data redundancy to a minimum (only one strand needed?)
• Reduces errors
• Reduces space
• Easier to maintain / update the information (just need a function)
– Increasing our data means increasing the complexity of our data structures
• We might like to define ways of interacting with the data; an API-like approach (like we saw with our FASTA file handler example…)
• We could then concentrate on *what to do* with the data rather than *how to get at* the data…
Reversing the way we think about data (and the way we program)
[Diagram: a DATA structure type with separate functions that retrieve the data, access the data, do something, and put the data into a structure. Need to do something else? Another DATA structure, with more functions. Update…]
Maybe we can tie it all together
[Diagram: DATA AND FUNCTIONS bundled together]
– Retrieve the data
– Can be a data type of our own creation (beyond scalars, arrays, hashes)
– Access the data via functions associated with the data
Interface remains constant; we don’t have to worry about the code (modular Perl again!)
AND…
– Can have different data types stored together (hashes, scalars, arrays)
– We treat all of the data types and functions together as a NEW DATA TYPE
– We can use the new data type in ANY type of data structure we wish to build
Defining our own data types
[Diagram: DATA AND FUNCTIONS]
– Sequence data, ontology data, promoter regions data, expression analysis data
– Interactions, homologies, enrichment, pattern recognition
– An entire genome with annotation, microarray correlations, and built-in functions for positional analysis, pathway analysis, …
– Our new data type can be stored in any other data structure (like an array of genomes [0] [1] [2] [3], each with the same functionality possible)
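The "array of genomes" idea can be sketched as follows. This is only an illustration: the Genome class, its gene_count method, and the toy gene lists are hypothetical names invented for this example, and the constructor syntax (bless) is covered later in this lecture.

```perl
use strict;
use warnings;

# Hypothetical Genome class, for illustration only.
package Genome;

sub new {
    my ($class, %arg) = @_;
    my $self = {
        _name  => $arg{name},
        _genes => $arg{genes} || [],    # array reference of gene names
    };
    return bless $self, $class;
}

sub get_name   { $_[0]->{_name} }
sub gene_count { scalar @{ $_[0]->{_genes} } }

package main;

# An ordinary Perl array whose elements are objects of our new data type.
my @genomes = (
    Genome->new(name => 'yeast', genes => ['GAL4', 'CDC28']),
    Genome->new(name => 'phage', genes => ['cI']),
);

# Every element offers the same methods.
for my $g (@genomes) {
    printf "%s: %d gene(s)\n", $g->get_name, $g->gene_count;
}
```

Each element of @genomes carries both its data and the functions that act on it, so the loop never needs to know how a genome is stored internally.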
This is what OOP promises!
We start thinking about the functionality of our data. It’s another layer of abstraction, but it makes our lives easier as programmers…
WHAT IS AN OBJECT?
– Data structure bundled with functions to set, access, and process the data structure
• A new data type!
– Rigorous definitions of organizing code
• All (most) interactions are defined and must obey certain rules
– A module (as a class)
• Instead of importing a series of subroutines that are called directly, these modules define a series of object types that you can create and use.
– A level of abstraction – data that logically belongs together
• That lets us focus on using the object
Perl Object Syntax
Perl objects are special references that come bundled with a set of functions that know how to act on the contents of the reference.
• For example, there may be a Sequence class definition.
– Internally, the Sequence object is an instance of the Sequence class definition
• It’s a hash reference that has keys that point to the DNA string, the name and source of the sequence, and other attributes.
– The object is bundled with functions that know how to manipulate the sequence, such as revcom(), translate(), subseq(), etc.
Perl Object Syntax
When talking about objects, the bundled functions are known as methods. This terminology derives from one of the granddaddies of object-oriented languages, Smalltalk.
• You invoke a method using the -> operator, a syntax that looks a lot like getting at the value that a reference points to.
– For example, if we have a Sequence object stored in the scalar variable $sequence, we can call its methods like this:
    $reverse_complement = $sequence->revcom();
    $first_10_bases = $sequence->subseq(1,10);
    $protein = $sequence->translate;
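To make those method calls concrete, here is a minimal sketch of a class that could sit behind them. ToySequence is a hypothetical stand-in, not the Sequence class of any real library; the 1-based, inclusive coordinates of subseq() are an assumption, and translate() is left out to keep the example short.

```perl
use strict;
use warnings;

# Hypothetical toy class; not a real library module.
package ToySequence;

sub new {
    my ($class, %arg) = @_;
    return bless { _seq => $arg{seq} }, $class;
}

# Reverse complement of the stored DNA string.
sub revcom {
    my ($self) = @_;
    my $rc = reverse $self->{_seq};    # scalar context: reverses the string
    $rc =~ tr/ACGTacgt/TGCAtgca/;      # complement each base
    return $rc;
}

# Subsequence using 1-based, inclusive coordinates (an assumption).
sub subseq {
    my ($self, $start, $end) = @_;
    return substr($self->{_seq}, $start - 1, $end - $start + 1);
}

package main;

my $sequence           = ToySequence->new(seq => 'ATGGCCATTGTAATGGGCC');
my $reverse_complement = $sequence->revcom();
my $first_10_bases     = $sequence->subseq(1, 10);

print "revcom: $reverse_complement\n";
print "first 10: $first_10_bases\n";
```

The calling code only uses the arrow syntax; how the DNA string is stored inside the object stays hidden behind the methods.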
Parts of an object
[Diagram: an empty class (data definition; a class is a package – modular!) containing methods (functions) and attributes (parts of the data structure, usually as keys for a hash)]
    # interaction with code
    # (main part of program here)
    # $obj = $class->new();
    # $obj->method();
The scalar “obj” is of type class; it is a new instance of our class.
We create a scalar reference using a method called “new”.
We already know how to build a class; we’ll use our key/value pairs and subroutines
– An empty class (data definition) (a class is a package) – modular!
    package exClass;
    use strict;
    use warnings;

    sub new {
        my ($class, %arg) = @_;
        return bless {
            _name     => $arg{accession},
            _organism => $arg{organism},
        }, $class;
    }

    sub get_name     { $_[0]->{_name} }
    sub get_organism { $_[0]->{_organism} }

    1;
– Our methods are just subroutines (we’ve seen these before)
– The class definition is a package (module)
– We pass arguments as @_
– We create an instance of the class, as a reference to an anonymous hash we define
Building a class: attributes
Sets the parts of the internal data structure
• For example
– Name of the organism
– DNA sequence
– Exon/Intron boundaries
– …
• Passed to the constructor as a list of key/value pairs (a hash of arguments)
• Instantiated using a method (subroutine) – a constructor (we usually call it “new”)
• By convention, internal to the class and preceded with “_” to denote this
– Should only access class (object) data through methods!!!
    sub new {
        my ($class, %arg) = @_;
        return bless {
            _name     => $arg{accession},
            _organism => $arg{organism},
        }, $class;
    }
Building a class: constructor
The method that sets the attributes
• For example
– Name of the organism
– DNA sequence
– Exon/Intron boundaries
– …
• Initializes an object
• Marks the object as a member of the class (an ‘instance’ of the class definition)
    sub new {
        my ($class, %arg) = @_;
        return bless {
            _name     => $arg{accession},
            _organism => $arg{organism},
        }, $class;
    }
– We pass arguments as @_; the class name is automatically passed as the first argument; our hash of arguments follows
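One way to see the class name arriving as the first argument is to compare a method call with the equivalent plain subroutine call. The sketch below reuses the exClass constructor from this lecture; the print statement inside new() and the variable names $obj1 and $obj2 are additions for illustration.

```perl
use strict;
use warnings;

package exClass;

sub new {
    my ($class, %arg) = @_;
    print "new() received class name: $class\n";    # added for illustration
    return bless {
        _name     => $arg{accession},
        _organism => $arg{organism},
    }, $class;
}

package main;

# Method call: Perl supplies 'exClass' as the first argument for us.
my $obj1 = exClass->new(accession => 'AC00243', organism => 'Homo sapiens');

# Plain subroutine call: here we must pass the class name ourselves.
my $obj2 = exClass::new('exClass', accession => 'AC00243', organism => 'Homo sapiens');

print ref($obj1), " ", ref($obj2), "\n";
```

Both calls build the same kind of object; the arrow form is preferred because it also works with inheritance, which the plain call bypasses.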
Building a class: bless
Creates an object of the class definition from a given data structure (usually a hash)
• Takes two arguments:
– A reference to the data structure (here an anonymous hash, that is, a reference to an unnamed hash)
– The name of the class for which the object will be marked
• bless returns the reference, now marked as an object; we return this to a scalar variable which is a reference to the object.
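A small sketch of what bless does to a reference, using Perl's built-in ref() function to inspect it before and after. The variable names are chosen for this example only.

```perl
use strict;
use warnings;

my $data   = { _name => 'AC00243' };    # a plain anonymous hash
my $before = ref($data);                # what kind of reference is it now?

my $obj   = bless $data, 'exClass';     # mark the hash as an exClass object
my $after = ref($obj);                  # ref() now reports the class name

print "before: $before, after: $after\n";

# bless marks the existing structure; it does not copy it.
print "same underlying hash\n" if $obj == $data;
```

Before the call ref() reports HASH, afterwards it reports exClass, while the hash contents are untouched.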
Building a class: accessors
Methods (subroutines) which return values of the class attributes (attribute/value pairs; key/value pairs in our hash)
    sub get_name { $_[0]->{_name} }
– We pass arguments as @_; the first argument is therefore the object
    my $species = $obj->get_name;
– The call to the object accessor method
Building a class: mutators
Methods (subroutines) which change or update values of the class attributes (attribute/value pairs; key/value pairs in our hash)
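A mutator for exClass might look like the sketch below. The method name set_organism and its behavior of returning the stored value are assumptions made for this example; the rest of the class follows the definition used in this lecture.

```perl
use strict;
use warnings;

package exClass;

sub new {
    my ($class, %arg) = @_;
    return bless {
        _name     => $arg{accession},
        _organism => $arg{organism},
    }, $class;
}

sub get_organism { $_[0]->{_organism} }

# Hypothetical mutator: updates the attribute and returns the stored value.
sub set_organism {
    my ($self, $organism) = @_;
    $self->{_organism} = $organism if defined $organism;
    return $self->{_organism};
}

package main;

my $obj = exClass->new(accession => 'AC00243', organism => 'Homo sapiens');
$obj->set_organism('Mus musculus');

my $organism = $obj->get_organism;
print "$organism\n";
```

Routing every change through a method gives one place to add validation later without touching the code that uses the object.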
Using Objects
Before you can start using objects, you must load their definitions from the appropriate module(s).
• This is just like loading subroutines from modules;
– you use the use statement in both cases.
– For example, if we want to load our “exClass” class definitions, we load the appropriate module, which in this case is called exClass (or lib::exClass, or whatever file hierarchy you’ve imposed).
    use exClass;
• Now you'll probably want to create a new object.
– There are a variety of ways to do this, and details vary from module to module, but most modules, including ours, do it using the new() method:
    use exClass;
    my $obj = exClass->new(
        accession => "AC00243",
        organism  => "Homo sapiens",
    );
Passing Arguments to Methods
When you call object methods, you can pass a list of arguments, just as you would to a regular function.
• We’ve seen this a number of times; for example, using the substr function.
– As methods get more complex, argument lists can get quite long and have possibly dozens of optional arguments. To make this manageable, many object-oriented modules use a named parameter style of argument passing, that looks like this:
    my $result = $object->method(-arg1 => $value1, -arg2 => $value2, -arg3 => $value3);
– We utilize the arrow (->) notation:
• We saw this with references
• Used on an object to call a method in the class
• Perl automatically passes the object itself as the first argument to the method
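On the receiving side, a method can read named parameters into a hash and fall back to defaults for anything omitted. The sketch below is illustrative only: ToySequence, the -start and -end parameter names, and the 1-based inclusive coordinates are all assumptions for this example.

```perl
use strict;
use warnings;

# Hypothetical toy class; not a real library module.
package ToySequence;

sub new {
    my ($class, %arg) = @_;
    return bless { _seq => $arg{-seq} }, $class;
}

# Named, optional arguments; the defaults cover the whole sequence.
sub subseq {
    my ($self, %arg) = @_;
    my $start = exists $arg{-start} ? $arg{-start} : 1;
    my $end   = exists $arg{-end}   ? $arg{-end}   : length $self->{_seq};
    return substr($self->{_seq}, $start - 1, $end - $start + 1);
}

package main;

my $seq  = ToySequence->new(-seq => 'ATGGCCATTG');
my $all  = $seq->subseq();                          # no arguments: defaults
my $head = $seq->subseq(-end => 3);                 # only one argument given
my $mid  = $seq->subseq(-end => 6, -start => 4);    # order does not matter

print "$all $head $mid\n";
```

Because the arguments are matched by name, callers can leave out what they do not need and never have to remember positions.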
We already know how to use a class; references and arguments
    #!/usr/bin/perl -w
    use strict;
    use lib "/home/mako/devel/lib";
    use exClass;

    my $obj = exClass->new(
        accession => "AC00243",
        organism  => "Homo sapiens",
    );
    my $species = $obj->get_name;
– use lib / use exClass: the class definition is a package (module) in a location
– exClass->new(...): we pass arguments to the subroutine new
– $obj: we create an instance of the class, as a reference to an anonymous hash we define
– $obj->get_name: an accessor method (subroutine)
Passing Arguments to Methods; a BioPerl example
As a practical example, Bio::PrimarySeq->new() actually takes multiple optional arguments that allow you to specify the alphabet, the source of the sequence, and so forth. Rather than create a humongous argument list which forces you to remember the correct position of each argument, Bio::PrimarySeq lets you create a new Sequence this way:
    use Bio::PrimarySeq;
    my $sequence = Bio::PrimarySeq->new(
        -seq              => 'gattcgattccaaggttccaaa',
        -id               => 'oligo23',
        -alphabet         => 'dna',
        -is_circular      => 0,
        -accession_number => 'X123',
    );
Perl Object Syntax
Don't be put off by this syntax!
• $sequence is really just a hash reference!
– you can get its keys using keys %$sequence
– you can look at the contents of the "_seq_length" key by using $sequence->{_seq_length}, and so forth.
– the syntax $sequence->translate is just a fancy way of writing translate($sequence), except that the object knows what module the translate() function is defined in.
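The same points can be tried out without installing BioPerl by using the exClass definition from this lecture in place of Bio::PrimarySeq. The variable names below are chosen for this sketch only.

```perl
use strict;
use warnings;

package exClass;

sub new {
    my ($class, %arg) = @_;
    return bless {
        _name     => $arg{accession},
        _organism => $arg{organism},
    }, $class;
}

sub get_name { $_[0]->{_name} }

package main;

my $obj = exClass->new(accession => 'AC00243', organism => 'Homo sapiens');

# The object is a hash reference, so ordinary hash syntax works on it.
my @keys   = sort keys %$obj;
my $direct = $obj->{_name};    # possible, but bypasses the interface

# A method call, and the fully qualified function call it stands for.
my $via_method   = $obj->get_name;
my $via_function = exClass::get_name($obj);

print "keys: @keys\n";
print "$direct $via_method $via_function\n";
```

Peeking inside like this is handy for debugging, but normal code should go through the methods so the internal layout can change freely.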
Back to our task (geneFinding)
I’ve rewritten the FASTA file reader and some associated functions as part of our gene finding programming exercise in object-oriented, modular Perl
• It’s on the website (http://www.chem.ucla.edu/~mjanis/readFasta.pm)
• Three tasks:
– Create a file hierarchy for storage of your library files and implement my code (using the example code for using the class readFasta)
– Comment the readFasta class at every line, describing what each component of the class (constructor, accessors, etc.) is doing
– The comments I’ve made in the code point out that the code is actually unfinished, although it will run as is. Complete the code and adapt the existing gene finding subroutines we’ve used as methods for the class readFasta