Page 1: 2012 03 08_dbi
Page 2: 2012 03 08_dbi

FBW21-02-2008

Wim Van Criekinge

RELOADED2

Page 3: 2012 03 08_dbi

Three Basic Data Types

• Scalars - $

• Arrays of scalars - @

• Associative arrays of scalars, or hashes - %

Page 4: 2012 03 08_dbi

• [m]/PATTERN/[g][i][o]

• s/PATTERN/PATTERN/[g][i][e][o]

• tr/PATTERNLIST/PATTERNLIST/[c][d][s]
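
The three operators above (match, substitute, transliterate) can be tried on a short string. This is a minimal sketch; the sample sequence and variable names are made up for illustration and do not come from the slides.

```perl
use strict;
use warnings;

my $dna = "ATGcgtATGaaa";                 # made-up sample sequence

# m//: match. /g finds every match, /i ignores case.
my @starts   = $dna =~ /atg/gi;           # list context returns all matches
my $n_starts = scalar @starts;

# s///: substitute. /g replaces every occurrence.
(my $rna = uc $dna) =~ s/T/U/g;

# tr///: transliterate character by character.
(my $complement = uc $dna) =~ tr/ACGT/TGCA/;

# tr/// with an empty replacement list only counts the listed characters.
my $upper = uc $dna;
my $gc    = ($upper =~ tr/GC//);

print "ATG found $n_starts times\n";
print "RNA: $rna\n";
print "Complement: $complement\n";
print "G+C count: $gc\n";
```

Note that s/// takes a pattern on the left but a replacement string on the right, while tr/// takes two character lists and no regular expression at all.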

Page 5: 2012 03 08_dbi

The ‘structure’ of a Hash

• An array looks something like this:

@array =
Index: 0 1 2
Value: 'val1' 'val2' 'val3'

• A hash looks something like this:

%phone =
Key (name): Rob Matt Joe_A
Value: 353-7236 353-7122 555-1212
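
The same two structures written out in Perl, as a small sketch using the values from the slide. Arrays are indexed by position, hashes by key.

```perl
use strict;
use warnings;

# An array: values are found by numeric index, starting at 0.
my @array = ('val1', 'val2', 'val3');

# A hash: values are found by key (here, a name).
my %phone = (
    Rob   => '353-7236',
    Matt  => '353-7122',
    Joe_A => '555-1212',
);

my $second = $array[1];       # element at index 1
my $matt   = $phone{Matt};    # value stored under the key 'Matt'

# Hashes have no inherent order, so sort the keys for stable output.
for my $name (sort keys %phone) {
    print "$name: $phone{$name}\n";
}
```

Note the sigils: a single element is a scalar, so it is written $array[1] or $phone{Matt} even though the containers are @array and %phone.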

Page 6: 2012 03 08_dbi

Subroutine

$a = 5;
$b = 9;
$sum = Optellen($a, $b);    # "optellen" is Dutch for "to add"
print "The SUM is $sum\n";

sub Optellen {
    my $d = $_[0];
    my $e = $_[1];    # alternatively we could do this: my ($d, $e) = @_;
    my $answer = $d + $e;
    return $answer;
}

Page 7: 2012 03 08_dbi

Overview

• Advanced data structures in Perl

• Object-oriented Programming in Perl

• BioPerl is a large collection of Perl software for bioinformatics

• Motivation:
– Simple extension: "Multiline parsing" is more difficult than expected

• Goal: to make software modular, easier to maintain, more reliable, and easier to reuse

Page 8: 2012 03 08_dbi

Multi-line parsing

use strict;
use Bio::SeqIO;

my $filename = "sw.txt";
my $sequence_object;

my $seqio = Bio::SeqIO->new('-format' => 'swiss', '-file' => $filename);

while ($sequence_object = $seqio->next_seq) {
    my $sequentie = $sequence_object->seq();    # "sequentie" is Dutch for "sequence"
    print $sequentie . "\n";
}

Page 9: 2012 03 08_dbi

Perl OO

• A class is a package

• An object is a reference to a data structure (usually a hash) in a class

• A method is a subroutine in the class
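
The three statements above fit in a few lines of code. This is a minimal sketch: the class name Sequentie is borrowed from the module path mentioned later in the slides, but its fields and methods here are hypothetical.

```perl
use strict;
use warnings;

package Sequentie;    # a class is a package

# Constructor: builds a hash, then blesses a reference to it into the class.
sub new {
    my ($class, %args) = @_;
    my $self = { id => $args{id}, seq => $args{seq} };
    return bless $self, $class;    # an object is a blessed reference
}

# A method is a subroutine in the class; the object arrives as the first argument.
sub seq {
    my ($self) = @_;
    return $self->{seq};
}

sub seq_length {
    my ($self) = @_;
    return length $self->{seq};
}

package main;

my $obj = Sequentie->new(id => 'demo1', seq => 'ATGC');
print ref($obj), " has sequence ", $obj->seq, " of length ", $obj->seq_length, "\n";
```

In a real project the package would live in its own file (Sequentie.pm) ending in 1; and be loaded with use.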

Page 10: 2012 03 08_dbi

Perl Classes

• Modules/Packages
– A Perl module is a file that uses a package declaration
– Packages provide a separate namespace for different parts of a program
– A namespace protects the variables of one part of a program from unwanted modification by another part of the program
– The module must always have a last line that evaluates to true, e.g. 1;
– The module must be in a "known" directory (one listed in @INC, which can be extended with the PERL5LIB environment variable)

• E.g. … site/lib/bio/Sequentie.pm

Page 11: 2012 03 08_dbi

Installation on Windows (ActiveState)

• Using the PPM shell to install BioPerl
– Get the number of the BioPerl repository:
   PPM> repository
– Set the BioPerl repository, find BioPerl, install BioPerl:
   PPM> repository set <BioPerl repository number>
   PPM> search *
   PPM> install <BioPerl package number>

• Download BioPerl in archive form from http://www.BioPerl.org/Core/Latest/index.shtml
– Use WinZip to uncompress and install

Page 12: 2012 03 08_dbi

Directory Structure

• BioPerl directory structure organization:

– Bio/ BioPerl modules

– models/ UML for BioPerl classes

– t/ Perl built-in tests

– t/data/ Data files used for the tests

– scripts/ Reusable scripts that use BioPerl

– scripts/contributed/ Contributed scripts not necessarily integrated into BioPerl.

– doc/ "How To" files and the FAQ as XML

Page 13: 2012 03 08_dbi
Page 14: 2012 03 08_dbi

Live.pl

#!e:\Perl\bin\perl.exe -w
# script for looping over genbank entries, printing out name
use Bio::DB::GenBank;
use Data::Dumper;

$gb = Bio::DB::GenBank->new();

$sequence_object = $gb->get_Seq_by_id('MUSIGHBA1');
print Dumper($sequence_object);

$seq1_id = $sequence_object->display_id();
$seq1_s = $sequence_object->seq();
print "seq1 display id is $seq1_id \n";
print "seq1 sequence is $seq1_s \n";

Page 15: 2012 03 08_dbi

File converter

#!/opt/perl/bin/perl -w

#genbank_to_fasta.pl

use Bio::SeqIO;

my $input = Bio::SeqIO->new('-file' => $ARGV[0],
                            '-format' => 'GenBank');

my $output = Bio::SeqIO->new('-file' => '>output.fasta',
                             '-format' => 'Fasta');

while (my $seq = $input->next_seq()) {
    $output->write_seq($seq);
}

Page 16: 2012 03 08_dbi

• Bptutorial.pl

• It includes the written tutorial as well as runnable scripts

• 2 ESSENTIAL TOOLS
– Data::Dumper to find out what class you're in
– perl bptutorial.pl 100 Bio::Seq to find the available methods for that class

Page 17: 2012 03 08_dbi

Exercise 1

Run Needleman-Wunsch-monte-carlo.pl

– my $MATCH = 1; # +1 for letters that match
– my $MISMATCH = -1; # -1 for letters that mismatch
– my $GAP = -1; # -1 for any gap

Score (-64)

Score = f($MATCH,$MISMATCH,$GAP)

f ? Implement convergence criteria. Store in DATABASE, make graphs in Excel.
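
One way to read Score = f($MATCH,$MISMATCH,$GAP) is as the optimal global alignment score of a fixed pair of sequences, which changes when the three parameters change. The sketch below is not the Needleman-Wunsch-monte-carlo.pl script from the exercise. The function nw_score and the example sequences are hypothetical, and it assumes the simple linear gap penalty implied by a single $GAP value.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Optimal global alignment score of $s and $t (Needleman-Wunsch recurrence).
# Only the score is computed, so two rows of the table are enough.
sub nw_score {
    my ($s, $t, $match, $mismatch, $gap) = @_;
    my @a = split //, $s;
    my @b = split //, $t;

    # Row 0: aligning the empty prefix of $s with j letters of $t costs j gaps.
    my @prev = map { $_ * $gap } 0 .. scalar(@b);

    for my $i (1 .. scalar(@a)) {
        my @cur = ($i * $gap);    # column 0: i letters of $s against nothing
        for my $j (1 .. scalar(@b)) {
            my $pair = $a[$i - 1] eq $b[$j - 1] ? $match : $mismatch;
            $cur[$j] = max(
                $prev[$j - 1] + $pair,    # align the two letters
                $prev[$j]     + $gap,     # gap in $t
                $cur[$j - 1]  + $gap,     # gap in $s
            );
        }
        @prev = @cur;
    }
    return $prev[-1];
}

# The same pair of sequences scored under different parameter choices.
for my $params ([1, -1, -1], [2, -1, -2], [1, -3, -1]) {
    my ($MATCH, $MISMATCH, $GAP) = @$params;
    my $score = nw_score("GATTACA", "GCATGCA", $MATCH, $MISMATCH, $GAP);
    print "MATCH=$MATCH MISMATCH=$MISMATCH GAP=$GAP score=$score\n";
}
```

Sampling the three parameters at random and recording each score is then a small loop around nw_score, and each (parameters, score) row could be stored in a database table for plotting.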

Page 18: 2012 03 08_dbi

A Guide to MySQL & DBI

Page 19: 2012 03 08_dbi

Objectives

• Start MySQL and learn how to use the MySQL Reference Manual

• Create a database

• Change (activate) a database

• Create tables using MySQL

• Create and run SQL commands in MySQL

Page 20: 2012 03 08_dbi

Objectives (continued)

• Identify and use data types to define columns in tables

• Understand and use nulls

• Add rows to tables

• View table data

• Correct errors in a database

Page 21: 2012 03 08_dbi
Page 22: 2012 03 08_dbi
Page 23: 2012 03 08_dbi

Successor to MySQL-Front

• MySQL-Front was once one of the most popular MySQL management applications. What phpMyAdmin is for web applications, MySQL-Front was for the desktop. Unfortunately, the author could not or would not continue the project, and it was shut down.

• In early April 2006 the original author decided to make the latest MySQL-Front source code available under the name HeidiSQL; the first beta can be downloaded from the new site: http://www.heidisql.com .

Page 24: 2012 03 08_dbi
Page 25: 2012 03 08_dbi

Starting MySQL

• Windows XP

– Click Start button

– Point to All Programs

– Point to MySQL on menu

– Point to MySQL Server 4.1

– Click MySQL Command Line Client

• Must enter password in Command Line Client window

Page 26: 2012 03 08_dbi

Obtaining Help in MySQL

• Type \h at the mysql> prompt

• Type “help” followed by name of command

– help contents

– help union

Page 27: 2012 03 08_dbi
Page 28: 2012 03 08_dbi

Creating a Database

• Must create a database before creating tables

• Use CREATE DATABASE command

• Include database name

Page 29: 2012 03 08_dbi

Creating a Database (continued)

Page 30: 2012 03 08_dbi

Changing the Default Database

• Default database: database to which all subsequent commands pertain

• USE command, followed by database name:

– Changes the default database

– Execute at the start of every session

Page 31: 2012 03 08_dbi

Creating a Table

• Describe the layout of each table in the database

• Use CREATE TABLE command

• TABLE is followed by the table name

• Follow this with the names and data types of the columns in the table

• Data types define type and size of data

Page 32: 2012 03 08_dbi

Table and Column Name Restrictions

• Names cannot exceed 18 characters (a conservative, portable limit; MySQL itself allows identifiers of up to 64 characters)

• Must start with a letter

• Can contain letters, numbers, and underscores (_)

• Cannot contain spaces

Page 33: 2012 03 08_dbi

Creating the REP Table

Page 34: 2012 03 08_dbi

Entering Commands in MySQL

• Commands are free-format; no rules stating specific words in specific positions

• Press ENTER to move to the next line in a command

• Indicate the end of a command by typing a semicolon

• Commands are not case sensitive

Page 35: 2012 03 08_dbi

Running SQL Commands

Page 36: 2012 03 08_dbi

Editing SQL Commands

• Statement history: stores most recently used command

• Editing commands:
– Use arrow keys to move up, down, left, and right
– Use Ctrl+A to move to beginning of line
– Use Ctrl+E to move to end of line
– Use Backspace and Delete keys

Page 37: 2012 03 08_dbi

Errors in SQL Commands

Page 38: 2012 03 08_dbi

Editing MySQL Commands

• Press Up arrow key to go to top line

• Press Enter key to move to next line if line is correct

• Use Right and Left arrow keys to move to location of error

• Press ENTER key when line is correct

• If Enter is not pressed on a line, that line is not part of the revised command

Page 39: 2012 03 08_dbi

Dropping a Table

• Can correct errors by dropping (deleting) a table and starting over

• Useful when table is created before errors are discovered

• DROP TABLE command is followed by the name of the table to be dropped and a semicolon

• Any data in table also deleted

Page 40: 2012 03 08_dbi

Data Types

• For each table column, type of data must be defined

• Common data types:

– CHAR(n)

– VARCHAR(n)

– DATE

– DECIMAL(p,q)

– INT

– SMALLINT

Page 41: 2012 03 08_dbi

Nulls

• A special value to represent situation when actual value is not known for a column

• Can specify whether to allow nulls in the individual columns

• Should not allow nulls for primary key columns

Page 42: 2012 03 08_dbi

Implementation of Nulls

• Use NOT NULL clause in CREATE TABLE command to exclude the use of nulls in a column

• Default is to allow null values

• If a column is defined as NOT NULL, system will reject any attempt to store a null value there

Page 43: 2012 03 08_dbi

Adding Rows to a Table

• INSERT command:

– INSERT INTO followed by table name

– VALUES command followed by specific values in parentheses

– Values for character columns in single quotation marks

Page 44: 2012 03 08_dbi

The Insert Command

Page 45: 2012 03 08_dbi

Modifying the INSERT Command

• To add new rows modify previous INSERT command

• Use same editing techniques as those used to correct errors

Page 46: 2012 03 08_dbi

Adding Additional Rows

Page 47: 2012 03 08_dbi

The INSERT Command with Nulls

• Use a special format of INSERT command to enter a null value in a table

• Identify the names of the columns that accept non-null values, then list only the non-null values after the VALUES command

Page 48: 2012 03 08_dbi

The INSERT Command with Nulls

• Enter only non-null values

• Precisely indicate values you are entering by listing the columns

Page 49: 2012 03 08_dbi

The INSERT Command with Nulls (continued)

Page 50: 2012 03 08_dbi

Viewing Table Data

• Use SELECT command to display all the rows and columns in a table

• SELECT * FROM followed by the name of the table

• Ends with a semicolon

Page 51: 2012 03 08_dbi

Viewing Table Data (continued)

Page 52: 2012 03 08_dbi

Viewing Table Data (continued)

Page 53: 2012 03 08_dbi

Correcting Errors In the Database

• UPDATE command is used to update a value in a table

• DELETE command allows you to delete a record

• INSERT command allows you to add a record

Page 54: 2012 03 08_dbi

Correcting Errors in the Database

• UPDATE: change the value in a table

• DELETE: delete a row from a table

Page 55: 2012 03 08_dbi

Correcting Errors in the Database (continued)

Page 56: 2012 03 08_dbi

Correcting Errors in the Database (continued)

Page 57: 2012 03 08_dbi

Saving SQL Commands

• Allows you to use commands again without retyping

• Different methods for each SQL implementation you are using

– Oracle SQL*Plus and SQL*Plus Worksheet use a script file

– Access saves queries as objects

– MySQL uses an editor to save text files

Page 58: 2012 03 08_dbi

Saving SQL Commands

• Script file:
– File containing SQL commands
– Use a text editor or word processor to create
– Save with a .txt file name extension
– Run in MySQL:
   • SOURCE file name
   • \. file name
– Include full path if file is in folder other than default

Page 59: 2012 03 08_dbi

Creating the Remaining Database Tables

• Execute appropriate CREATE TABLE and INSERT commands

• Save these commands to a secondary storage device

Page 60: 2012 03 08_dbi

Describing a Table

Page 61: 2012 03 08_dbi

Summary

• Use MySQL Command Line Client window to enter commands

• Type \h or help to obtain help at the mysql> prompt

• Use MySQL Reference Manual for more detailed help

Page 62: 2012 03 08_dbi

Summary (continued)

• Use the CREATE DATABASE command to create a database

• Use the USE command to change the default database

• Use the CREATE TABLE command to create tables

• Use the DROP TABLE command to delete a table

Page 63: 2012 03 08_dbi

Summary (continued)

• CHAR, VARCHAR, DATE, DECIMAL, INT and SMALLINT data types

• Use INSERT command to add rows

• Use NOT NULL clause to identify columns that cannot have a null value

• Use SELECT command to view data in a table

Page 64: 2012 03 08_dbi

Summary (continued)

• Use UPDATE command to change the value in a column

• Use DELETE command to delete a row

• Use SHOW COLUMNS command to display a table's structure

Page 65: 2012 03 08_dbi

• DBI

Page 66: 2012 03 08_dbi

use DBI;

my $dbh = DBI->connect('dbi:mysql:guestdb', 'root', '')
    || die "Database connection not made: $DBI::errstr";

my $sth = $dbh->prepare('SELECT * FROM demo');
$sth->execute();
while (my @row = $sth->fetchrow_array) {
    print join(":", @row), "\n";
}
$sth->finish();

$dbh->disconnect();

Page 67: 2012 03 08_dbi

The Players

• Perl – a programming language

• DBMS – software to manage data storage

• SQL – a language to talk to a DBMS

• DBI – Perl extensions to send SQL to a DBMS

• DBD – software DBI uses for specific DBMSs

• $dbh – a DBI object for coarse-grained access

• $sth – a DBI object for fine-grained access

Page 68: 2012 03 08_dbi

• What is DBI ?

• DBI is a DataBase Interface– It is the way Perl talks to Databases

• DBI is a module by Tim Bunce

• DBI is a community of modules & developers

Page 69: 2012 03 08_dbi

• What is an interface ?

• The overlap where two phenomena affect each other

• A point at which independent systems interact

• A boundary across which two systems communicate

Page 70: 2012 03 08_dbi

• A Sample Interface (the bedrock of DBI)

[Diagram: Fred sends a message (Bone) to Wilma via Dino]

Page 71: 2012 03 08_dbi

• Characteristics of the DINO interface

• Separation of knowledge
– Fred doesn't need to know how to find Wilma
– Dino doesn't need to know how to read

• Generalizability
– Fred can send any message
– Fred can communicate with anyone

Page 72: 2012 03 08_dbi

• The DBI interface

[Diagram: Perl sends SQL to the DBMS via DBI]

Page 73: 2012 03 08_dbi

• Characteristics of the DBI interface

• Separation of knowledge
– You don't need to know how to connect
– DBI doesn't need to know SQL

• Generalizability
– You can send any SQL
– You can communicate with any DBMS

Page 74: 2012 03 08_dbi

• The ingredients of a DBI App
– 1: A Perl script that uses DBI
– 2: A DBMS
– 3: SQL statements

Page 75: 2012 03 08_dbi

Outline of a basic DBI script

Set the Perl Environment

Connect to a DBMS

Perform data-affecting SQL instructions

Perform data-returning SQL requests

Disconnect from the DBMS

Page 76: 2012 03 08_dbi

• $dbh = DataBase Handle

• Done by DBI
– Connect

• Done by $dbh, the Database Handle
– Perform SQL instructions
– Perform SQL requests
– Disconnect

Page 77: 2012 03 08_dbi

• Set the Perl Environment
– use warnings;
– use strict;
– use DBI;

Page 78: 2012 03 08_dbi

• Connect to a DBMS

my $dbh = DBI->connect('dbi:DBM:');

$dbh is a Database Handle

An object created by DBI to handle access to this specific connection

Page 79: 2012 03 08_dbi

• Perform data-affecting Instructions

• $dbh->do($sql_string);

• $dbh->do("INSERT INTO geography VALUES ('Nepal','Asia')");

Page 80: 2012 03 08_dbi

• Perform data-returning requests

• my @row = $dbh->selectrow_array($sql_string);

• Disconnect from DBMS

• $dbh->disconnect()

Page 81: 2012 03 08_dbi

A complete script

use strict;
use warnings;
use DBI;

my $dbh = DBI->connect("dbi:mysql:test", "root", "");
$dbh->do("CREATE TABLE geography (country Text, region Text)");
$dbh->do("INSERT INTO geography VALUES ('Nepal','Asia')");
$dbh->do("INSERT INTO geography VALUES ('Portugal','Europe')");
print $dbh->selectrow_array("SELECT * FROM geography");
$dbh->disconnect;

Page 82: 2012 03 08_dbi

• The script output

• Only one row

• No separation of the fields

• No metadata

Page 83: 2012 03 08_dbi

• Improvements

• DBI
– Connects to the DBMS
– Creates a database handle ($dbh)

• $dbh
– Provides coarse-grained access to the DBMS
– Creates a statement handle ($sth)

• $sth
– Provides fine-grained access to the DBMS

Page 84: 2012 03 08_dbi

• Life-cycle of a statement handle ($sth)

• Prepare
– Creates the handle, sends SQL to the DBMS to be analyzed and optimized

• Execute
– Instructs the DBMS to perform operations

• Fetch
– Brings data from the DBMS into a script

Page 85: 2012 03 08_dbi

• Life-cycle of a statement handle ($sth)

• my $sth = $dbh->prepare($sql_string);

• $sth->execute();

• print $sth->fetchrow_array();

Page 86: 2012 03 08_dbi

• Fetching rows in a loop – the snippet

• my $sth = $dbh->prepare("SELECT * FROM geography");

• $sth->execute();

• while (my @row = $sth->fetchrow_array) {

•     print join(":", @row), "\n";

• }

Page 87: 2012 03 08_dbi

• Output
– Nepal:Asia
– Portugal:Europe

• All data retrieved

• Columns separated

• Rows separated

• Still no metadata

Page 88: 2012 03 08_dbi

• Finding Metadata – Handle Attributes

• $handle->{$key}=$value;

• print $handle->{$key};

• $dbh->{RaiseError} = 1;

• print $dbh->{RaiseError};

• my $column_names = $sth->{NAME};

Page 89: 2012 03 08_dbi

• Finding Metadata with $sth->{NAME}

• my $sth = $dbh->prepare("SELECT * FROM geography");

• $sth->execute();

• my @column_names = @{$sth->{NAME}};

• my $num_cols = scalar @column_names;

• print join ":", @column_names;

• print "(there are $num_cols columns)";

Page 90: 2012 03 08_dbi

• Errors

• $dbh->do("Junk");

• print "I Got here!";

Page 91: 2012 03 08_dbi

• Checking Errors with RaiseError

• my $dbh = DBI->connect( ... );

• $dbh->{RaiseError} = 1;

• $dbh->do("Junk");

• print "Here ?";
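
With RaiseError set, a failing do() dies instead of returning quietly, so the print is only reached if the error is caught. The sketch below wraps the call in eval to show this. It assumes DBD::SQLite is installed and uses an in-memory SQLite database as a stand-in for the MySQL server in the slides; the connection string is therefore an assumption, not the one from the deck.

```perl
use strict;
use warnings;
use DBI;

# Assumption: DBD::SQLite is available; ":memory:" needs no server or file.
my $dbh = DBI->connect("dbi:SQLite:dbname=:memory:", "", "",
                       { RaiseError => 1, PrintError => 0 });

# "Junk" is not valid SQL, so do() throws an exception under RaiseError.
my $ok = eval { $dbh->do("Junk"); 1 };
my $error = $ok ? '' : $@;

print "Caught: $error\n" unless $ok;
print "Here ?\n";    # reached only because the exception was trapped

$dbh->disconnect;
```

Without the eval the script would stop at the do() call and "Here ?" would never be printed.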

Page 92: 2012 03 08_dbi

Number of rows affected

$rows = $dbh->do("DELETE FROM user WHERE age < 42");

# undef = error

# 3 = 3 rows affected

# 0E0 = no error; no rows affected

# -1 = unknown
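
The return values listed above can be told apart in plain Perl, because "0E0" is a string that counts as true in a boolean test yet equals zero numerically. This is a minimal sketch in which describe_do_result is a hypothetical helper and no database is contacted.

```perl
use strict;
use warnings;

# Interpret a value of the kind returned by $dbh->do().
sub describe_do_result {
    my ($rows) = @_;
    return "error"                  unless defined $rows;
    return "unknown number of rows" if $rows == -1;
    return "no rows affected"       if $rows == 0;     # matches "0E0"
    return "$rows rows affected";
}

my $zero_but_true = "0E0";
my $is_true   = $zero_but_true ? 1 : 0;    # true: non-empty and not the string "0"
my $as_number = $zero_but_true + 0;        # numerically zero

print describe_do_result($_), "\n" for (undef, 3, "0E0", -1);
```

This is why the common idiom $dbh->do($sql) or die ... only dies on a real error and not when a statement simply affects no rows.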

Page 93: 2012 03 08_dbi

• Summary so far

• DBI
– connect($data_source)

• $dbh
– do($sql_instruction)
– prepare($sql_request)
– disconnect()
– {RaiseError}

• $sth
– execute()
– fetchrow_array()
– {NAME}

Page 94: 2012 03 08_dbi

• A Deeper look at connection

[Diagram: Perl → DBI → DBD #1 → MySQL, and DBI → DBD #2 → Oracle]

Page 95: 2012 03 08_dbi

• DBDs- Database Drivers

DRIVER         DBMS
DBD::DBM       DBM
DBD::Pg        PostgreSQL
DBD::mysql     MySQL
DBD::Oracle    Oracle
DBD::ODBC      MS Access, MS SQL Server
…

Page 96: 2012 03 08_dbi

• Variation in DBDs & DBMSs

• Driver-specific connection parameters

• Driver-specific attributes and methods

• SQL implementation

• Optimization Plans

Page 97: 2012 03 08_dbi

• Driver-Specific Connection Params
– driver name
– user name and password

my $dbh = DBI->connect(
    "DBI:$driver:",
    "root",
    "password",
    {
        RaiseError => 1,
        PrintError => 0,
        AutoCommit => 1,
    }
);

Page 98: 2012 03 08_dbi

finish() – fetchus interruptus

while (my @row = $sth->fetchrow_array) {
    last if $row[0] eq $some_conditions;
}

$sth->finish();

Page 99: 2012 03 08_dbi

• Alternate fetches

• my @row = $sth->fetchrow_array();
– print $row[1];

• my $row = $sth->fetchrow_arrayref();
– print $row->[1];

• my $row = $sth->fetchrow_hashref();
– print $row->{region};

Page 100: 2012 03 08_dbi

• Placeholders !

• my $sth = $dbh->prepare("SELECT name FROM user WHERE country = ? AND city = ? AND age > ?");

• $sth->execute('Venezuela', 'Caracas', 21);
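
Placeholders let one prepared statement be executed with different values, and the driver takes care of quoting. The sketch below runs the query from this slide end to end. It assumes DBD::SQLite is installed and uses an in-memory database in place of MySQL; the table layout and the three sample rows are made up for illustration.

```perl
use strict;
use warnings;
use DBI;

# Assumption: DBD::SQLite is available; the slides themselves use MySQL.
my $dbh = DBI->connect("dbi:SQLite:dbname=:memory:", "", "",
                       { RaiseError => 1, PrintError => 0 });
$dbh->do("CREATE TABLE user (name TEXT, country TEXT, city TEXT, age INTEGER)");

# One prepared INSERT, executed once per row. The apostrophe in O'Neil
# needs no manual escaping because the value is bound, not pasted into the SQL.
my $ins = $dbh->prepare("INSERT INTO user VALUES (?, ?, ?, ?)");
$ins->execute(@$_) for (
    ['Ana',    'Venezuela', 'Caracas', 34],
    ['Luis',   'Venezuela', 'Caracas', 19],
    ["O'Neil", 'Venezuela', 'Maracay', 40],
);

# The query from the slide: each ? is filled in by execute(), in order.
my $sth = $dbh->prepare(
    "SELECT name FROM user WHERE country = ? AND city = ? AND age > ?");
$sth->execute('Venezuela', 'Caracas', 21);

my @names;
while (my $row = $sth->fetchrow_hashref) {
    push @names, $row->{name};
}
print "Matches: @names\n";

$dbh->disconnect;
```

Binding values this way also avoids building SQL by string concatenation, which is the usual source of SQL injection problems.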

Page 101: 2012 03 08_dbi

• DBDs that don’t need a separate DBMS

• DBD::CSV, DBD::Excel

• DBD::Amazon DBD::Google

use DBI;
my $dbh = DBI->connect("dbi:Google:", $KEY);
my $sth = $dbh->prepare(qq[ SELECT title, URL FROM google WHERE q = "perl" ]);
$sth->execute();
while (my $r = $sth->fetchrow_hashref) { ...

Page 102: 2012 03 08_dbi

Step 1: Getting Drivers
Essential for SQL Querying

• A driver is a piece of software that lets your operating system talk to a database
– Installed drivers visible in ODBC manager

• "data connectivity" tool

• Each database engine (Oracle, MySQL, etc.) requires its own driver
– Generally must be installed by user

• Drivers are needed by Data Source Name tool and querying programs

• Require (simple) installation

Page 103: 2012 03 08_dbi

MySQL Driver: Needed to Query MySQL Databases

• Windows: Download MySQL Connector/ODBC 3.51

• Must be installed for direct querying using e.g. Excel
– Not necessary if you are using the MySQL Query Browser

Page 104: 2012 03 08_dbi

Rat versus mouse RBP

Rat versus bacterial lipocalin

Exercise 2

Fetch a sequence by adapting live.pl and do a remote BLAST using 3 different scoring matrices (summarize results) and perform "controls" using an adaptation of shuffle …

Page 105: 2012 03 08_dbi

Parsing BLAST Using BPlite, BPpsilite, and BPbl2seq

• Similar to Search and SearchIO in basic functionality

• However:
– Older and will likely be phased out in the near future
– Substantially limited advanced functionality compared to Search and SearchIO

– Important to know about because many legacy scripts utilize these objects and either need to be converted

Page 106: 2012 03 08_dbi

Parse BLAST output

#!/opt/perl/bin/perl -w

#bioperl_blast_parse.pl

# program prints out query, and all hits with scores for each blast result

use Bio::SearchIO;

my $record = Bio::SearchIO->new(-format => 'blast', -file => $ARGV[0]);

while (my $result = $record->next_result) {
    print ">", $result->query_name, " ", $result->query_description, "\n";
    my $seen = 0;
    while (my $hit = $result->next_hit) {
        print "\t", $hit->name, "\t", $hit->bits, "\t", $hit->significance, "\n";
        $seen++;
    }
    if ($seen == 0) { print "No Hits Found\n" }
}

Page 107: 2012 03 08_dbi

Parse BLAST in a little more detail

#!/opt/perl/bin/perl -w

#bioperl_blast_parse_hsp.pl

# program prints out query, and all hsps with scores for each blast result

use Bio::SearchIO;

my $record = Bio::SearchIO->new(-format => 'blast', -file => $ARGV[0]);

while (my $result = $record->next_result) {
    print ">", $result->query_name, " ", $result->query_description, "\n";
    my $seen = 0;
    while (my $hit = $result->next_hit) {
        $seen++;
        while (my $hsp = $hit->next_hsp) {
            print "\t", $hit->name, " has an HSP with an evalue of: ", $hsp->evalue, "\n";
        }
    }
    if ($seen == 0) { print "No Hits Found\n" }
}

Page 108: 2012 03 08_dbi

Shuffle

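#!/usr/bin/perl -w
use strict;

# Read a FASTA file: first line is the definition line, the rest is sequence.
my ($def, @seq) = <>;
print $def;
chomp @seq;
@seq = split(//, join("", @seq));
my $count = 0;
while (@seq) {
    my $index = rand(@seq);                 # pick a random remaining position
    my $base = splice(@seq, $index, 1);     # remove and return that letter
    print $base;
    print "\n" if ++$count % 60 == 0;       # wrap output at 60 characters
}
print "\n" unless $count % 60 == 0;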

Page 109: 2012 03 08_dbi

Searching for Sequence Similarity

• BLAST with BioPerl

• Parsing Blast and FASTA Reports
– Search and SearchIO
– BPLite, BPpsilite, BPbl2seq

• Parsing HMM Reports

• Standalone BioPerl BLAST

Page 110: 2012 03 08_dbi

Remote Execution of BLAST

• BioPerl has built in capability of running BLAST jobs remotely using RemoteBlast.pm

• Runs these jobs at NCBI automatically
– NCBI has dynamic configurations (server side) to "always" be up and ready
– Automatically updated for new BioPerl releases

• Convenient for independent researchers who do not have access to huge computing resources

• Quick submission of Blast jobs without tying up local resources (especially if working from standalone workstation)

• Legal Restrictions!!!

Page 111: 2012 03 08_dbi

Example of Remote Blast

A script to run a remote blast would be something like the following skeleton:

$remote_blast = Bio::Tools::Run::RemoteBlast->new(
    '-prog'   => 'blastp',
    '-data'   => 'ecoli',
    '-expect' => '1e-10',
);

$r = $remote_blast->submit_blast("t/data/ecolitst.fa");
while (@rids = $remote_blast->each_rid) {
    foreach $rid (@rids) {
        $rc = $remote_blast->retrieve_blast($rid);
    }
}

In this example we are running a blastp (protein query against a protein database) using the ecoli database and an e-value threshold of 1e-10. The sequences being compared are located in the file "t/data/ecolitst.fa".

Page 112: 2012 03 08_dbi

Example

It is important to note that all command line options that fall under the blastall umbrella are available under RemoteBlast.pm.

For example you can change some parameters of the remote job.

Consider the following example:

$Bio::Tools::Run::RemoteBlast::HEADER{'MATRIX_NAME'} = 'BLOSUM25';

This basically allows you to change the matrix used to BLOSUM 25, rather than the default of BLOSUM 62.

Page 113: 2012 03 08_dbi

Parsing Blast Reports

• One of the strengths of BioPerl is its ability to parse complex data structures, like a BLAST report.

• Unfortunately, there is a bit of arcane terminology.

• Also, you have to ‘think like bioperl’, in order to figure out the syntax.

• This next script might get you started

Page 114: 2012 03 08_dbi

Sample Script to Read and Parse BLAST Report

# Get the report
$searchio = Bio::SearchIO->new(-format => 'blast', -file => $blast_report);
$result = $searchio->next_result;

# Get info about the entire report
$result->database_name;
$algorithm_type = $result->algorithm;

# get info about the first hit
$hit = $result->next_hit;
$hit_name = $hit->name;

# get info about the first hsp of the first hit
$hsp = $hit->next_hsp;
$hsp_start = $hsp->query->start;