SMRT-Portal Exercises J Fass UCD Genome Center Bioinformatics Core Thursday April 16, 2015

SMRT-Portal Exercises

Dec 16, 2016

Page 1: SMRT-Portal Exercises

SMRT-Portal Exercises

J Fass

UCD Genome Center Bioinformatics Core

Thursday April 16, 2015

Page 2: SMRT-Portal Exercises

Running SMRT-Portal in AWS

see PacBio documentation

We’ll be running a virtual machine (VM) in the Amazon Web Services “Cloud” (a server farm somewhere in the region you’ve selected). On this VM is a web server, serving you pages created by the SMRT-Portal application.

UC Davis Genome Center | Bioinformatics Core | J Fass SMRT-Portal Exercises 2015-04-16

Page 3: SMRT-Portal Exercises

Running SMRT-Portal in AWS

Launch an m3.2xlarge instance using ami-953fddd1.

Generate or re-use a key pair - you will need it!

Once running, find the public IP address (#.#.#.#), and open a browser tab with the URL: #.#.#.#:8080/smrtanalysis … or … #.#.#.#:8080/smrtportal
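The two candidate URLs can be sketched as a small shell helper. The function name portal_urls is hypothetical, the http:// scheme is an assumption (the slide gives only host, port, and path), and 203.0.113.10 is a documentation-only address standing in for your instance's public IP.

```shell
# Hypothetical helper: print both candidate SMRT-Portal URLs for one public IP.
# Assumption: the portal is served over plain http on port 8080.
portal_urls() {
    for path in smrtanalysis smrtportal; do
        printf 'http://%s:8080/%s\n' "$1" "$path"
    done
}

# Example with a placeholder address; substitute your instance's public IP.
portal_urls 203.0.113.10
```

Try the first URL in a browser tab; if it does not load, try the second.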

Page 4: SMRT-Portal Exercises

Running SMRT-Portal in AWS

On a “vanilla” PacBio SMRT-Portal instance (U.S. East / N. Virginia), you would need to create one administrator account. This AMI already has one, but feel free to change the password, add non-admin accounts, etc.

user: administrator

pwd: 5MRT-P0rtal

Note: a password needs at least one symbol, at least one number, and more than 8 characters.
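The password rule can be checked before typing a new password into the portal. This is a minimal sketch, not the portal's own validation: check_pwd is a hypothetical name, and "symbol" is assumed to mean any character that is not a letter or digit.

```shell
# Return 0 if the candidate satisfies the rule on this slide:
# more than 8 characters, at least one number, at least one symbol.
# Assumption: a "symbol" is any non-alphanumeric character.
check_pwd() {
    candidate="$1"
    [ "${#candidate}" -gt 8 ] || return 1
    case "$candidate" in *[0-9]*) ;; *) return 1 ;; esac
    case "$candidate" in *[!A-Za-z0-9]*) ;; *) return 1 ;; esac
    return 0
}
```

Usage: check_pwd 'myCandidate-1' && echo "looks acceptable"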

Page 5: SMRT-Portal Exercises

Running SMRT-Portal in AWS

Log in as administrator (special user), then create separate accounts if desired.

Page 6: SMRT-Portal Exercises

How I imported 8 SMRT Cells (E coli)

Page 7: SMRT-Portal Exercises

SSH to AWS instance

ssh -i ~/.ssh/yourKey.pem [username]@#.#.#.#

ssh command / option block (supplies private key in this case) / destination (username@computername)

… (or use PuTTY) …
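OpenSSH refuses a private key file that other users can read, so it helps to tighten the key's permissions before the first connection. A minimal sketch follows; protect_key and ssh_command are hypothetical helper names, and the username and address in the usage line are placeholders, since the login name depends on the AMI.

```shell
# Restrict a private key file to owner read-only, as OpenSSH expects.
protect_key() {
    chmod 400 "$1"
}

# Print (without running) the ssh command for a key file, username, and host.
ssh_command() {
    printf 'ssh -i %s %s@%s\n' "$1" "$2" "$3"
}
```

Usage: protect_key ~/.ssh/yourKey.pem, then ssh_command ~/.ssh/yourKey.pem username 203.0.113.10 to preview the command before running it.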

Page 8: SMRT-Portal Exercises

PacBio Public Datasets

https://github.com/PacificBiosciences/DevNet/wiki/Datasets

look for “Data supporting publications” …

Page 9: SMRT-Portal Exercises

PacBio Public Datasets

https://github.com/PacificBiosciences/DevNet/wiki/Datasets

look for “Data supporting publications” … look for the first MG1655 xml & bas.h5 files …

Page 10: SMRT-Portal Exercises

Enter “dropbox” directory

cd /opt/smrtanalysis/userdata/inputs_dropbox

cd command / destination directory

Page 11: SMRT-Portal Exercises

Pull in data

mkdir MG1655

cd MG1655

wget [xml file link]

mkdir Analysis_Results

cd Analysis_Results

wget [bas.h5 file link, + bax.h5’s if present]

command, then directory / destination / source
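The layout these commands produce is one directory per SMRT Cell, holding the xml file, with an Analysis_Results subdirectory holding the h5 files. A minimal sketch of that layout, with hypothetical helper names (make_cell_dir, list_cell_files) and no downloading:

```shell
# Create the per-cell layout: <dropbox>/<cell>/Analysis_Results
make_cell_dir() {
    mkdir -p "$1/$2/Analysis_Results"
}

# List the xml and h5 files present for one cell, to check the layout before importing.
list_cell_files() {
    find "$1/$2" -type f \( -name '*.xml' -o -name '*.h5' \) | sort
}
```

Usage: make_cell_dir /opt/smrtanalysis/userdata/inputs_dropbox MG1655, fetch the files with wget as above, then list_cell_files /opt/smrtanalysis/userdata/inputs_dropbox MG1655 to confirm the xml sits at the top level and the h5 files sit under Analysis_Results.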

Page 12: SMRT-Portal Exercises

Import SMRT Cell data

Back in SMRT Portal, click through “Home” (upper left), then “Import and Manage” (third image), then “Input SMRT Cells.”

Page 13: SMRT-Portal Exercises

Import another SMRT Cell (exercise)

Page 14: SMRT-Portal Exercises

SSH to AWS instance

ssh -i ~/.ssh/yourKey.pem [username]@#.#.#.#

ssh command / option block (supplies private key in this case) / destination (username@computername)

… (or use PuTTY) …

Page 15: SMRT-Portal Exercises

PacBio Public Datasets

https://github.com/PacificBiosciences/DevNet/wiki/Datasets

look for “E. coli size selected 20kb library” …

Page 16: SMRT-Portal Exercises

PacBio Public Datasets

Find the SMRT Cell data files “tarball,” and copy the link (don’t download; you’ll break our wireless!).

Page 17: SMRT-Portal Exercises

Feeding Data to the SMRT-Portal

Back in a shell (terminal) on your instance, navigate to SMRT-Portal’s input dropbox.

cd /opt/smrtanalysis/userdata/inputs_dropbox

wget [link]

mkdir Ecoli20kb

cd Ecoli20kb

mkdir Analysis_Results

tar -xzvf ecoliK12.tar.gz
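The slide does not show where the tarball sits relative to the new directory, or how its contents are arranged inside. One way to sidestep both questions is to extract with tar's -C option, which names the destination directly. A minimal sketch, with unpack_cell as a hypothetical helper name:

```shell
# Extract a gzipped tarball into a destination directory, creating it if needed.
# -C tells tar to change into the destination before extracting.
unpack_cell() {
    mkdir -p "$2"
    tar -xzf "$1" -C "$2"
}
```

Usage, assuming the tarball was downloaded into the dropbox directory: unpack_cell ecoliK12.tar.gz Ecoli20kb. Inspect the result afterwards; the files may need moving so that the h5 files end up under Analysis_Results.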

Page 18: SMRT-Portal Exercises

Feeding Data to the SMRT-Portal

Back in SMRT Portal, click through “Home” (upper left), then “Import and Manage” (third image), then “Input SMRT Cells.”

Page 19: SMRT-Portal Exercises

Feeding Data to the SMRT-Portal

Via Home, Import and Manage, and [Import] SMRT cells, get to import page. Select directory, and Scan.

Page 20: SMRT-Portal Exercises

HGAP Assembly

Page 21: SMRT-Portal Exercises

Running HGAP

Click Design Job, then Create New (deal with the design wizard; I usually select “display all protocols”). You should see 9 SMRT Cells available (we just imported the 9th).

Page 22: SMRT-Portal Exercises

Running HGAP

Select the “RS_HGAP_Assembly.3” Protocol from the drop-down menu, enter name and (if desired) comments, select 20kb cell and click right arrowhead to add cell to the job you’re designing, then Save and Start!

Page 23: SMRT-Portal Exercises

Running HGAP

Early results just assess reads, subreads ...

Page 24: SMRT-Portal Exercises

Running HGAP

Final results include pre-assembly, realigned reads, etc.

Page 25: SMRT-Portal Exercises

HGAP output

Find the Polished Assembly Fasta link, right-click and Save link as … (to avoid troublesome name).
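Once the polished assembly is saved locally, a quick sanity check is to count the contigs and total bases. A minimal sketch using awk; fasta_summary is a hypothetical helper name, and the file name in the usage line is whatever you chose under "Save link as".

```shell
# Count FASTA records and total sequence length in one pass.
fasta_summary() {
    awk '
        /^>/ { contigs++; next }                      # header line: one more record
        { gsub(/[ \t\r]/, ""); bases += length($0) }  # sequence line: add its length
        END { printf "%d contigs, %d bases\n", contigs, bases }
    ' "$1"
}
```

Usage: fasta_summary polished_assembly.fasta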

Page 26: SMRT-Portal Exercises

HGAP output

Notice the BAM and BAI links; these allow you to view the original reads aligned back to the assembly (e.g. in IGV).

Page 27: SMRT-Portal Exercises

Check assembly via homology

Using Mauve, we’ll align our assembled genome to the trusted E. coli K-12 MG1655 reference assembly, from GenBank (link).

Page 28: SMRT-Portal Exercises

Check assembly via homology

Launch Mauve, then select File → Align with progressiveMauve. Then Add Sequence (click to add GenBank reference, then our assembly), click Align (and add a place to save output).

Page 29: SMRT-Portal Exercises

Check assembly via homology

(see Mauve site for details on viewer, etc. … we’ll explore during Workshop)

Page 30: SMRT-Portal Exercises

Check for circularity (if appropriate)

Launch Gepard, select the polished genome assembly file twice (once for the horizontal axis, once for the vertical), then create the dotplot.

Page 31: SMRT-Portal Exercises

Check for circularity (if appropriate)

Looks fine, right? But the overlaps will be on the size scale of the reads … not visible at this scale.

Page 32: SMRT-Portal Exercises

Check for circularity (if appropriate)

Use the Advanced mode, Plot tab, to specify the first ~20kb on the horizontal, and the last ~20kb on the vertical. Then Update dotplot.
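The same first-versus-last comparison can be prepared outside Gepard by cutting the two ends out of the assembly. A minimal sketch, assuming the FASTA holds a single contig; fasta_head and fasta_tail are hypothetical helper names, and the helpers print bare sequence, so add a header line before feeding the output to a dotplot tool.

```shell
# Print the first N bases of a single-record FASTA file (header and line breaks dropped).
fasta_head() {
    grep -v '^>' "$1" | tr -d '\n\r' | cut -c "1-$2"
}

# Print the last N bases of a single-record FASTA file.
fasta_tail() {
    grep -v '^>' "$1" | tr -d '\n\r' | tail -c "$2"
}
```

Usage: fasta_head polished_assembly.fasta 20000 > first20kb.txt and fasta_tail polished_assembly.fasta 20000 > last20kb.txt. An off-diagonal line in a dotplot of the two ends would suggest the contig overlaps itself.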

Page 33: SMRT-Portal Exercises

Alignment / Resequencing Protocols

Page 34: SMRT-Portal Exercises

Align to your own reference

In SMRT-Portal, go Home, then Import and Manage, then Reference Sequences. Select New to upload our downloaded reference (note there’s also a Scan option; to use it, upload first to /opt/smrtanalysis/userdata/references_dropbox/).

Page 35: SMRT-Portal Exercises

Align to your own reference

Design a job using the same reads and the RS_Resequencing.1 protocol. Specify your uploaded reference sequence, save, and start the job. (I’m using E. albertii in this case, RefSeq ID NZ_CP007025.1.)

Page 36: SMRT-Portal Exercises

Viewing Read Alignments with IGV