Amazon resource for bioinformatics

May 11, 2015

Brad Chapman

A walkthrough of using CloudBioLinux, CloudMan, and BioCloudCentral to run custom biological analyses on Amazon EC2 hardware.
Page 1: Amazon resource for bioinformatics

Amazon resources for bioinformatics

Brad Chapman

Bioinformatics Interest Group, 18 Oct 2012

Page 2: Amazon resource for bioinformatics

Goals

Automate: Reduce steps, remove activation energy, increase abstraction

Improve: Sharing, reproducibility, teaching

Page 3: Amazon resource for bioinformatics

Installation

Page 4: Amazon resource for bioinformatics

Easier installation

Page 5: Amazon resource for bioinformatics

No installation

Page 6: Amazon resource for bioinformatics

Challenge

Biology computing platform

Widely accessible

Customizable

Community driven

Page 7: Amazon resource for bioinformatics

General cloud frameworks

http://aws.amazon.com/

Page 9: Amazon resource for bioinformatics

CloudBioLinux

Amazon image with bioinformatics software and libraries

Automated build framework

Community effort to maintain and extend

http://cloudbiolinux.org

Page 10: Amazon resource for bioinformatics

CloudMan

SGE cluster plus automation

Web interface and monitoring

Persistence and sharing

Powers the Galaxy Cloud offering

http://usecloudman.org/

Page 11: Amazon resource for bioinformatics

BioCloudCentral

Automate setup of Amazon instance

Launch CloudBioLinux and CloudMan

Provide easy ssh access, no key pairs

http://biocloudcentral.org

Page 12: Amazon resource for bioinformatics

Galaxy

http://usegalaxy.org

Page 13: Amazon resource for bioinformatics

Acknowledgments

CloudBioLinux: Ntino Krampis, Tim Booth, Dawn Field, Pjotr Prins, John Chilton and the CloudBioLinux community.

CloudMan: Enis Afgan, James Taylor

BioCloudCentral: Enis Afgan, John Chilton, Dannon Baker

Page 14: Amazon resource for bioinformatics

Documentation

http://cda.currentprotocols.com/WileyCDA/CPUnit/refId-bi1109.html

Page 15: Amazon resource for bioinformatics

What we'll do

1 Sign up for Amazon

2 Start a CloudBioLinux/CloudMan instance

3 Add nodes to create a compute cluster

4 Run variant calling pipeline

Everything done through the web

Page 16: Amazon resource for bioinformatics

Getting started

Sign up for Amazon Web Services: http://aws.amazon.com

Get security credentials (Access Key and Secret Key): http://portal.aws.amazon.com/gp/aws/securityCredentials
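
The Access Key and Secret Key from this step are usually kept out of scripts and notebooks. A minimal sketch, assuming the keys are exported under the standard AWS environment variable names `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`; the helper names `load_credentials` and `mask` are hypothetical and not part of CloudMan or BioCloudCentral:

```python
import os
from typing import Mapping, NamedTuple


class AwsCredentials(NamedTuple):
    access_key: str
    secret_key: str


def load_credentials(env: Mapping[str, str] = os.environ) -> AwsCredentials:
    """Read the Access Key and Secret Key from environment variables.

    Assumption: the keys were exported as AWS_ACCESS_KEY_ID and
    AWS_SECRET_ACCESS_KEY before this code runs.
    """
    names = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return AwsCredentials(env[names[0]], env[names[1]])


def mask(secret: str, visible: int = 4) -> str:
    """Hide all but the last few characters so a key can be shown safely."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to try without touching real keys, and `mask` avoids pasting a full Secret Key into a terminal or log.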

Page 17: Amazon resource for bioinformatics

Launch: http://biocloudcentral.org

Page 18: Amazon resource for bioinformatics

Ready two minutes later

Page 19: Amazon resource for bioinformatics

Log in to CloudMan

Page 20: Amazon resource for bioinformatics

Shared CloudMan images

Package a complete analysis environment: data, customizations

Sharable with other users

Share string with NGS analysis platform:

cm-b53c6f1223f966914df347687f6fc818/shared/2012-07-23--19-23/
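
The share string above can be checked before pasting it into the launch form. A minimal sketch, assuming the string has the shape `<bucket>/shared/<timestamp>/`, that the first component names the storage bucket of the shared cluster, and that the last component is a year-month-day--hour-minute stamp. These interpretations and the helper `parse_share_string` are assumptions, not a documented CloudMan format:

```python
from datetime import datetime
from typing import NamedTuple


class ShareString(NamedTuple):
    bucket: str
    created: datetime


def parse_share_string(share: str) -> ShareString:
    """Split a CloudMan share string into its bucket and timestamp parts."""
    parts = share.strip().strip("/").split("/")
    if len(parts) != 3 or parts[1] != "shared":
        raise ValueError("expected '<bucket>/shared/<timestamp>/': %r" % share)
    bucket, _, stamp = parts
    if not bucket.startswith("cm-"):
        raise ValueError("bucket name does not start with 'cm-': %r" % bucket)
    # Assumed layout of the final component: YYYY-MM-DD--HH-MM
    created = datetime.strptime(stamp, "%Y-%m-%d--%H-%M")
    return ShareString(bucket, created)


parsed = parse_share_string(
    "cm-b53c6f1223f966914df347687f6fc818/shared/2012-07-23--19-23/"
)
print(parsed.bucket, parsed.created.isoformat())
```

A malformed string raises `ValueError`, which catches truncated copy-and-paste before an instance is launched.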

Page 21: Amazon resource for bioinformatics

Start CloudMan

Page 22: Amazon resource for bioinformatics

CloudMan console

Page 23: Amazon resource for bioinformatics

CloudMan admin page

Page 24: Amazon resource for bioinformatics

CloudMan: managing a cluster

Page 25: Amazon resource for bioinformatics

Associated Galaxy instance

Page 26: Amazon resource for bioinformatics

Analysis data on shared instance

Page 27: Amazon resource for bioinformatics

Graphical variant-calling pipeline

Page 28: Amazon resource for bioinformatics

Analysis data linked to pipeline

Page 29: Amazon resource for bioinformatics

Configure pipeline

Page 30: Amazon resource for bioinformatics

Run pipeline

Page 31: Amazon resource for bioinformatics

Shut everything down

Page 32: Amazon resource for bioinformatics

What happened

1 Sign up for Amazon

2 Start a CloudBioLinux/CloudMan instance

3 Add nodes to create a compute cluster

4 Run variant calling pipeline

Everything done through the web

Page 33: Amazon resource for bioinformatics

ssh to the machine

$ ssh [email protected]

[email protected]'s password:

Welcome to Ubuntu 12.04 LTS

(GNU/Linux 3.2.0-23-virtual x86_64)

ubuntu@ip-10-72-197-11:~$
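
The session above logs in with a password rather than a key pair. A minimal sketch of building the same kind of `ssh` invocation from a script, assuming OpenSSH is the client; the host name below is a hypothetical placeholder, the `ubuntu` user is taken from the prompt shown above, and `build_ssh_command` is an illustrative helper:

```python
import shlex
from typing import List


def build_ssh_command(host: str, user: str = "ubuntu") -> List[str]:
    """Return an ssh argument list that asks for password authentication."""
    return [
        "ssh",
        # OpenSSH client options: prefer the password method, skip key pairs.
        "-o", "PreferredAuthentications=password",
        "-o", "PubkeyAuthentication=no",
        f"{user}@{host}",
    ]


# Hypothetical public DNS name; use the one shown in the CloudMan console.
command = build_ssh_command("ec2-203-0-113-10.compute-1.amazonaws.com")
print(shlex.join(command))
```

The list form can be handed to `subprocess.run(command)` for an interactive login, and the printed form can be pasted into a terminal.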

Page 34: Amazon resource for bioinformatics

NX graphical client: login

http://www.nomachine.com/download.php

Page 35: Amazon resource for bioinformatics

NX graphical client: desktop

Page 36: Amazon resource for bioinformatics

Summary

Use cloud resources to build:

Machines with standard software

Cluster management

Analysis pipelines

Reproducible, sharable instances

Web-based interfaces