Advanced Perl For Bioinformatics Part 1 2/23/06 1-4pm Module structure Module path Module export Object oriented programming Part 2 2/24/06 1-4pm Bioperl modules Sequence access Sequence manipulation Parsing BLAST records

Jan 20, 2018


Hannah Morris

Page 1:

Advanced Perl For Bioinformatics Part 1 2/23/06 1-4pm

• Module structure
• Module path
• Module export
• Object oriented programming

Part 2 2/24/06 1-4pm

• Bioperl modules
• Sequence access
• Sequence manipulation
• Parsing BLAST records

Page 2:

Module and main program

Hello1.pm

package Hello1;

sub greet {
    return "Hello, World!";
}

1;

test1.pl

#!/usr/bin/perl

use Hello1;

print Hello1::greet();

Page 3:

Why use a module?

• A module can be reused by different programs.

• Modules keep your code well organized.

Page 4:

Module structure

package Hello1;                  # Declare a package; the file must be saved as Hello1.pm

sub greet {                      # Contents of the package: functions and variables
    return "Hello, World!\n";
}

1;                               # Return a true value at the end

Page 5:

Path to module

• Default path to look for modules: @INC

perl -e 'print "$_\n" for @INC'

• If your module is placed under one of the paths in @INC, you can refer to it by its path relative to that directory. E.g. if @INC contains /usr/my/lib, and

(1) your Mod.pm is /usr/my/lib/Mod.pm, you can load it with "use Mod;" (no .pm extension).

(2) your Mod.pm is /usr/my/lib/Mymod/Seq/Mod.pm, then you say: use Mymod::Seq::Mod;

• If your module is not placed under any directory in @INC, e.g. /some/dir/Mod.pm, then:

use lib "/some/dir";   # this adds the path to the beginning of @INC

use Mod;
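
The name-to-path rule above can be sketched in a few lines. This is only an illustration of the lookup, not how perl itself is implemented; the helper names module_to_path and find_in_inc are made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Convert a module name to the relative file path that "use" looks for.
# (Hypothetical helper, for illustration only.)
sub module_to_path {
    my ($module) = @_;
    (my $path = $module) =~ s{::}{/}g;   # Mymod::Seq::Mod -> Mymod/Seq/Mod
    return "$path.pm";
}

# Return the first directory in @INC that contains the module, or undef.
sub find_in_inc {
    my ($module) = @_;
    my $rel = module_to_path($module);
    for my $dir (@INC) {
        next if ref $dir;                # skip code-ref hooks in @INC
        return $dir if -e "$dir/$rel";
    }
    return;
}

print module_to_path('Mymod::Seq::Mod'), "\n";
my $dir = find_in_inc('File::Spec');     # a core module, so normally found
print "File::Spec found under: ", (defined $dir ? $dir : 'not found'), "\n";
```

Because "use lib" puts its directory at the front of @INC, a Mod.pm under /some/dir would be found before any Mod.pm in the default directories.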

Page 6:

Variable scope in module

• my $var --- accessible only inside the module file
• our $var --- package variable, accessible from outside as $Module::var
• $var --- without "use strict", an undeclared variable is a package variable, same as "our $var"
• use strict; --- This forces all variables to be declared with 'my' or 'our' (or written fully qualified).

Hello2.pm

package Hello2;
use strict;
our $var1 = 1;
my $var2 = 3;
my $str = "Hello World!\n";
sub greet {
    return $str;
}
1;

test2.pl

#!/usr/bin/perl
use Hello2;
print "var1= $Hello2::var1\n";   # prints 1
print "var2= $Hello2::var2\n";   # prints an empty value: $var2 is a 'my' variable
print Hello2::greet();
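
The same scoping behaviour can be tried in a single file. The sketch below puts the package in a block instead of a separate Hello2.pm; the accessor get_var2 is an addition for illustration and is not part of the slides.

```perl
#!/usr/bin/perl
use strict;
use warnings;
no warnings 'once';    # $Hello2::var2 is deliberately looked up although never set

{
    package Hello2;
    our $var1 = 1;                  # package variable: visible as $Hello2::var1
    my  $var2 = 3;                  # lexical: visible only inside this block
    sub get_var2 { return $var2 }   # a sub in the same scope can still return it
}

print "var1 = $Hello2::var1\n";
print "var2 is ",
      (defined $Hello2::var2 ? "visible" : "not visible"),
      " from outside\n";
print "get_var2() = ", Hello2::get_var2(), "\n";
```

If outside code needs a 'my' variable, an accessor sub like this is the usual way to expose it.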

Page 7:

Export

Export functions and variables so that they can be accessed without a package qualifier.

Hello3.pm

package Hello3;
use strict;
require Exporter;
our @ISA = ("Exporter");
our @EXPORT_OK = qw(greet);
our $var1 = 1;
my $var2 = 3;
my $str = "Hello World!\n";
sub greet {
    return $str;
}
1;

test3.pl

#!/usr/bin/perl
use Hello3 qw(greet);
print "var1= $Hello3::var1\n";
print "var2= $Hello3::var2\n";
print greet();

Page 8:

Hello3.pm

package Hello3;
use strict;
use Exporter;                  # Need functionality in Exporter.pm to do exporting.
our @ISA = ("Exporter");       # This module inherits functions from the Exporter module rather than creating its own.
our @EXPORT_OK = qw(greet);    # Export this subroutine upon request by another program.
our $var1 = 1;
my $var2 = 3;
my $str = "Hello World!\n";
sub greet {
    return $str;
}

1;

Page 9:

test3.pl

#!/usr/bin/perl
use Hello3 qw(greet);          # Request "greet"
print "var1= $Hello3::var1\n";
print "var2= $Hello3::var2\n";
print greet();

Page 10:

Hello4.pm

package Hello4;
use strict;
use Exporter;
our @ISA = ("Exporter");
our @EXPORT_OK = qw(greet);
our @EXPORT = qw(greet2);      # Export this automatically
our $var1 = 1;
my $var2 = 3;
my $str = "Hello World!";
sub greet {
    return $str;
}

sub greet2 {
    return "Hi.\n";
}
1;

Page 11:

test4.pl

#!/usr/bin/perl
use Hello4 qw(greet);          # Request "greet"
use Hello4;                    # This automatically imports whatever is in @EXPORT.
print "var1= $Hello4::var1\n";
print "var2= $Hello4::var2\n";
print greet();
print greet2();
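
To try @EXPORT and @EXPORT_OK without creating a separate file, the module can be declared inline. The sketch below assumes one trick that is not in the slides: setting $INC{'Hello4.pm'} inside a BEGIN block so that "use Hello4" treats the module as already loaded.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Inline stand-in for Hello4.pm so the example fits in a single file.
BEGIN {
    package Hello4;
    use Exporter;
    our @ISA       = ('Exporter');
    our @EXPORT    = qw(greet2);    # exported automatically
    our @EXPORT_OK = qw(greet);     # exported only on request
    sub greet  { return "Hello World!\n" }
    sub greet2 { return "Hi.\n" }
    $INC{'Hello4.pm'} = __FILE__;   # mark the module as already loaded
}

use Hello4;              # imports greet2 (everything in @EXPORT)
use Hello4 qw(greet);    # additionally imports greet on request

print greet();
print greet2();
```

Listing names in @EXPORT_OK rather than @EXPORT is often preferred, since the calling program then states explicitly which names it wants in its namespace.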

Page 12:

Exercise 1

• Create a module with functions that calculate the area and the boundary (perimeter) of a rectangle. The width and length are to be supplied in your main program and passed into your module. Practice using @EXPORT and @EXPORT_OK.

Page 13:

Object Oriented Programming

•A package (or module) is a class.

•A reference to a hash becomes an object of this class once it is blessed into the class with bless.

•The object contains member variables which are stored in the hash.

•The subroutines in the package serve as the object's member functions (methods).
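
A minimal sketch of these points, using a hypothetical class named Greeter that is not part of the slides: it compares a plain hash reference with one that has been blessed into a class.

```perl
#!/usr/bin/perl
use strict;
use warnings;

{
    package Greeter;      # hypothetical class name, used only for this sketch

    sub new {
        my ($class, $name) = @_;
        my $self = { name => $name };   # member variables live in the hash
        return bless $self, $class;     # bless ties the hash reference to the class
    }

    sub greet {                         # member function (method)
        my $self = shift;               # the object arrives as the first argument
        return "Hello, $self->{name}!\n";
    }
}

my $plain = { name => 'World' };        # an ordinary hash reference
my $obj   = Greeter->new('World');      # a blessed hash reference

print ref($plain), "\n";                # HASH
print ref($obj), "\n";                  # Greeter
print $obj->greet;
```

The arrow call $obj->greet works only because the reference is blessed: perl looks up greet in the package named by ref($obj).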

Page 14:

Hello5.pm

package Hello5;    # the package name must match the file name Hello5.pm
use strict;

sub new {
    my $class = shift;
    my $ref = {};
    bless($ref, $class);
    return $ref;
}

sub greet {
    my ($ref, $str) = @_;
    return $str;
}

sub greet2 {
    return "Hi\n";
}
1;

test5.pl

#!/usr/local/bin/perl
use Hello5;
my $h = Hello5->new;

print $h->greet("Good morning\n");
print $h->greet2;

Page 15:

Rectangle.pm

package Rectangle;
sub new {
    my ($class, $width, $length) = @_;
    my $hashref = { W => $width, L => $length };
    bless($hashref, $class);
    return $hashref;
}

sub getArea {
    my $self = shift;
    return $self->{W} * $self->{L};
}

sub getBoundary {
    my $self = shift;
    return 2 * ($self->{W} + $self->{L});
}

1;

recttest.pl

#!/usr/bin/perl
use Rectangle;
my $w = 3;
my $l = 4;

my $rect = Rectangle->new($w, $l);
my $area = $rect->getArea();
print "Area = $area\n";

my $boundary = $rect->getBoundary();   # renamed from $b, which sort() uses
print "Boundary=$boundary\n";

Page 16:

Exercise 2

• Create a class called “Cube”. It should have methods to calculate volume based on the cube’s width, length and height.

Page 17:

More Practice with Classes

• Sequence.pm: clean, wrap, reverse complement, shuffle, GC content, translate

• Main program: seq.pl
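
The slides do not include Sequence.pm itself. As a starting point, a class covering three of the listed operations might look like the sketch below; the method names (clean, revcom, gc_content) and the internal hash key are assumptions, and wrap, shuffle and translate are left out.

```perl
#!/usr/bin/perl
use strict;
use warnings;

{
    package Sequence;

    # Constructor: store the raw sequence string in a blessed hash.
    sub new {
        my ($class, $seq) = @_;
        return bless { seq => $seq }, $class;
    }

    # Remove everything that is not a letter and convert to upper case.
    sub clean {
        my $self = shift;
        (my $s = $self->{seq}) =~ s/[^A-Za-z]//g;
        $self->{seq} = uc $s;
        return $self->{seq};
    }

    # Reverse complement of a DNA sequence (A<->T, C<->G).
    sub revcom {
        my $self = shift;
        my $rc = scalar reverse $self->{seq};
        $rc =~ tr/ACGTacgt/TGCAtgca/;
        return $rc;
    }

    # Fraction of G and C bases in the sequence (0 for an empty sequence).
    sub gc_content {
        my $self = shift;
        my $len = length $self->{seq};
        return 0 unless $len;
        my $gc = ($self->{seq} =~ tr/GCgc//);   # tr returns the match count
        return $gc / $len;
    }
}

my $s = Sequence->new("atg c\ngg");
print $s->clean, "\n";
print $s->revcom, "\n";
printf "%.2f\n", $s->gc_content;
```

In a real project the package block would be saved as Sequence.pm with a trailing "1;", and seq.pl would load it with "use Sequence;".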

Page 18:

Bioperl

• A collection of perl modules for bioinformatics

• Facilitates sequence retrieval, sequence manipulation, and parsing the results of programs like blast and clustalw.

• See http://bioperl.org for download and documentation.

• Each .pm file contains documentation on how to use that module.

• Usually installed under: /usr/local/lib/perl5/site_perl/5.8.0/Bio

Page 19:

Some Bioperl modules

• Bio::Perl, Bio::DB -- access sequence databases. E.g. seqret.pl

• Bio::Seq -- a sequence and its annotation. E.g. seqio.pl

• Bio::SeqIO -- read sequences from a file and write them to a file. E.g. seqio.pl

• Bio::Tools::SeqStats -- molecular weight, etc. E.g. seqmw.pl

• Bio::SearchIO -- parse blast results.

Page 20:

Accessing Remote Databases

use Bio::Perl;
$seqobj = get_sequence('swiss', "ROA1_HUMAN");
write_sequence(">roa1.fasta", 'fasta', $seqobj);   # leading ">" as in the Bio::Perl synopsis

Databases can be: swiss, genbank, genpept, refseq, etc.

Page 21:

Bio::Seq

• Contains a sequence and its annotation
• Methods: display_id, desc, seq, revcom, translate, etc.

The revcom and translate methods each return a new Bio::Seq object.

One way to create a Bio::Seq object:

$seq = Bio::Seq->new(-seq => 'actgtggcgtcaact',
                     -desc => 'Sample Bio::Seq object',
                     -display_id => 'something',
                     -accession_number => 'accnum',
                     -alphabet => 'dna' );

Another way: read the sequence from a file via a Bio::SeqIO object.

Page 22:

Parsing blast results

• Module: Bio::SearchIO

use Bio::SearchIO;

my $in = Bio::SearchIO->new(-format => 'blast', -file => 'report.bls');
while ( my $result = $in->next_result ) {            # one result per query
    while ( my $hit = $result->next_hit ) {          # one hit per matching sequence
        while ( my $hsp = $hit->next_hsp ) {         # one HSP per aligned segment
            if ( $hsp->length('total') > 100 ) {
                if ( $hsp->percent_identity >= 75 ) {
                    print "Hit= ", $hit->name,
                          ",Length=", $hsp->length('total'),
                          ",Percent_id=", $hsp->percent_identity, "\n";
                }
            }
        }
    }
}

Example: blastparse.pl