Taverna and SoapLab Experience @ Elda Rossi – CINECA (Italy)
Page 1: Taverna and SoapLab Experience @ Elda Rossi – CINECA (Italy)

Taverna and SoapLab

Experience @

Elda Rossi – CINECA (Italy)


What is CINECA

CINECA is a consortium of Italian universities and the CNR

Founded in 1969, now under the control of the Research and University Ministry


Resources

The most important national infrastructure in Italy for the computational support to scientific research

Mission:

promoting the use of the most advanced computing systems to support public and private scientific and technological research


R & Bioconductor

Bioconductor is an open source and open development software project for the analysis and comprehension of genomic data.

It is based on R, a language and environment for statistical computing and graphics.


R & Bioconductor

Bioconductor is a collection of “packages”

Two main types:
1. Provides basic infrastructure support
2. Provides innovative methodology

We chose a function in the affy package (type 2)


The affy package

Package: affy
Description: The package contains functions for exploratory oligonucleotide array analysis. The dependance to tkWidgets only concerns few convenience functions. 'affy' is fully functional without it.
Version: 1.5.8-1
Author: Rafael A. Irizarry, Laurent Gautier, Benjamin Milo Bolstad, and Crispin Miller with contributions from …
Maintainer: Rafael A. Irizarry
Dependencies: R (>= 1.9.0), Biobase (>= 1.4.22), reposTools
Suggests: tkWidgets (>= 1.2.2), affydata
SystemRequirements: None
License: LGPL version 2 or newer
URL: None available

Function: expresso. From raw probe intensities to expression values


The expresso function

Expression measures

The most common operation is certainly to convert probe level data to expression values.

1. Reading in probe level data
2. Background correction (4 methods)
3. Normalization (7 methods)
4. Probe specific background correction, e.g. subtracting MM (3 methods)
5. Summarizing the probe set values into one expression measure and, in some cases, a standard error for this summary (5 methods)
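To make step 5 more concrete, here is a minimal Python sketch of a generic Tukey median polish, the technique that the `medianpolish` summary option used later in this deck is named after. It is not the affy implementation: the function name `median_polish`, the toy matrix, and the choice of "overall + column effect" as the per-array expression value are assumptions made for illustration only.

```python
from statistics import median


def median_polish(matrix, max_iter=10, tol=1e-6):
    """Generic Tukey median polish on a probes x arrays matrix.

    Returns (overall, row_effects, col_effects, residuals) such that
    matrix[i][j] == overall + row_effects[i] + col_effects[j] + residuals[i][j].
    """
    resid = [list(row) for row in matrix]
    n_rows, n_cols = len(resid), len(resid[0])
    overall = 0.0
    row_eff = [0.0] * n_rows
    col_eff = [0.0] * n_cols

    for _ in range(max_iter):
        # Sweep out row (probe) medians.
        row_med = [median(row) for row in resid]
        for i in range(n_rows):
            for j in range(n_cols):
                resid[i][j] -= row_med[i]
            row_eff[i] += row_med[i]
        shift = median(col_eff)
        col_eff = [c - shift for c in col_eff]
        overall += shift

        # Sweep out column (array) medians.
        col_med = [median(resid[i][j] for i in range(n_rows)) for j in range(n_cols)]
        for j in range(n_cols):
            for i in range(n_rows):
                resid[i][j] -= col_med[j]
            col_eff[j] += col_med[j]
        shift = median(row_eff)
        row_eff = [r - shift for r in row_eff]
        overall += shift

        # Stop once a full sweep no longer changes anything.
        if max(abs(m) for m in row_med + col_med) < tol:
            break

    return overall, row_eff, col_eff, resid


# Toy log2 intensities: 3 probes (rows) measured on 3 arrays (columns).
toy = [[8, 9, 10],
       [9, 10, 11],
       [7, 8, 9]]
overall, row_eff, col_eff, resid = median_polish(toy)
expression = [overall + c for c in col_eff]  # one summary value per array
print(expression)
```

The median is used instead of the mean so that a single outlying probe has little influence on the summary value of the probe set.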


How to run expresso

Interactive session (input: Data.CEL, output: Data.out):

    $ R
    > library(affy)
    > data <- ReadAffy()
    > data.mas <- expresso(data, bgcorrect.method="mas", pmcorrect.method="mas", normalize.method="constant", summary.method="medianpolish")
    > write.exprs(data.mas, file="Data.out")

Batch run (input: script, output: Report):

    $ R CMD BATCH script

where script contains:

    library(affy)
    data <- ReadAffy()
    data.mas <- expresso(data, bgcorrect.method="mas", pmcorrect.method="mas", normalize.method="constant", summary.method="medianpolish")
    write.exprs(data.mas, file="Data.out")


The files

[CEL]
Version=3

[HEADER]
Cols=126
Rows=126
TotalX=126
TotalY=126
Baseline=Not normalized
DatHeader=ctrl150:CLS=1167 …

[INTENSITY]
NumberCells=15876
CellHeader=X Y MEAN
0 0 551.0
1 0 10651.0
2 0 642.0
3 0 10855.0
4 0 278.0
5 0 452.0
6 0 11139.0

Sample001.cel Sample002.cel Sample003.cel

100084_at 2.68016528652511 2.75619854567269 3.82550383255225

101482_at 2.41830136307405 2.19230548692681 3.4173900695363

31962_at 12.3667390890414 12.4534076075796 12.8658623516881

32466_at 12.4078453130306 12.5262787728982 13.2129784659009

35201_at 6.73875347104673 6.36824635919863 7.53465018481639

36189_at 6.91195864883172 6.77835938949316 7.94585515997792

36678_at 10.0269997503136 9.76893096184106 11.1443619988943

37001_at 8.7690698709579 8.57322443505215 9.80956768540462

37029_at 7.58176898579828 7.24297853600119 8.67002397585278

37046_at 4.7250160934765 4.7250160934765 5.68254863921313

37189_at 7.08125646141077 7.0999566997911 7.92512679504857

37719_at 5.33679629782696 5.33679629782696 6.39140386282694

37725_at 7.634367429284 7.41050271151406 8.85664197069339

38437_at 7.54693596951725 7.16216316289552 8.3816810916508

38730_at 7.61959398527742 7.65907193898742 9.00657184492387

39425_at 6.07663839694708 6.03298499862286 7.14769809957403

40276_at 6.33983152588017 6.21300599988174 6.85968858773872

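Shown above: an excerpt of a CEL file (input), then the OUT file (output), with one row per probe set and one column per sample.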
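The two file layouts above can be read with a few lines of code. The following Python sketch is only an illustration based on the excerpts shown on this slide: the function names `parse_cel_text` and `parse_expression_table` are hypothetical, it assumes the text-format CEL layout with `[SECTION]` headers, and it assumes a whitespace-separated OUT file whose first line lists the sample names.

```python
def parse_cel_text(text):
    """Split a text-format CEL excerpt into sections.

    Each section maps to {"fields": {key: value}, "rows": [[col, ...], ...]}.
    """
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            current = {"fields": {}, "rows": []}
            sections[line[1:-1]] = current
        elif current is None:
            continue  # ignore anything before the first section header
        elif "=" in line:
            key, value = line.split("=", 1)
            current["fields"][key] = value
        else:
            current["rows"].append(line.split())
    return sections


def parse_expression_table(text):
    """Read the OUT file: header of sample names, then probe set rows."""
    lines = [line for line in text.splitlines() if line.strip()]
    samples = lines[0].split()
    table = {}
    for line in lines[1:]:
        probe, *values = line.split()
        table[probe] = dict(zip(samples, (float(v) for v in values)))
    return table


cel_excerpt = """[CEL]
Version=3

[HEADER]
Cols=126
Rows=126

[INTENSITY]
NumberCells=15876
CellHeader=X Y MEAN
0 0 551.0
1 0 10651.0
"""

out_excerpt = """Sample001.cel Sample002.cel Sample003.cel
100084_at 2.68016528652511 2.75619854567269 3.82550383255225
31962_at 12.3667390890414 12.4534076075796 12.8658623516881
"""

cel = parse_cel_text(cel_excerpt)
expr = parse_expression_table(out_excerpt)
print(cel["HEADER"]["fields"]["Cols"], cel["INTENSITY"]["rows"][1])
print(expr["31962_at"]["Sample002.cel"])
```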


Setting up SoapLab

A Linux-based server was chosen: vega.cineca.it
Tomcat was installed: Tomcat 5.0.28
Java was upgraded: Java 1.4
Axis was installed: Axis 1.1
SoapLab was installed: SoapLab precompiled for SUSE Linux

Up to here: no problems!


Defining the Application

1. Write the application wrapper

2. Write the ACD file for the application

3. Convert ACD to XML

4. Start up the SoapLab server

5. Deploy the new service


1. Write the application wrapper

#!/usr/bin/perl
use Getopt::Long;

# command arguments (with default)
GetOptions("bgcorrect=s"=>\$bgcorrect, "normalize=s"=>\$normalize);
$bgcorrect="mas" if $bgcorrect eq "";
$normalize="constant" if $normalize eq "";

# location of R executable
$rexe="/biotools/R/R-2.1.0/bin/R";

# data directory
$datadir="/biotools/services/data";

# R code to run analysis
open(AFFY,">$datadir/affy");
print AFFY <<EOF ;
library(affy)
data<-ReadAffy()
data.mas<-expresso(data, bgcorrect.method="$bgcorrect", pmcorrect.method="mas", normalize.method="$normalize", summary.method="medianpolish")
write.exprs(data.mas,file="data.txt")
EOF
close(AFFY);

# now run program
system "cd $datadir; $rexe CMD BATCH affy";

# print output
open(OUT,"$datadir/data.txt");
while (<OUT>) {print $_;}
close(OUT);

/biotools/services/affy-expresso.pl
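For readers who do not use Perl, the same wrapper logic can be sketched in Python. This is not the script deployed at CINECA: the names `parse_args`, `render_script` and `batch_command` are hypothetical, and the character check on the option values is an added precaution that the Perl wrapper does not perform. The sketch only builds the R script and the command line; it does not run R.

```python
import argparse
import re

# R code written to the script file; placeholders are filled from the options.
R_TEMPLATE = """library(affy)
data <- ReadAffy()
data.mas <- expresso(data,
    bgcorrect.method="{bgcorrect}",
    pmcorrect.method="mas",
    normalize.method="{normalize}",
    summary.method="medianpolish")
write.exprs(data.mas, file="data.txt")
"""

# Assumption: method names consist of letters, digits, dots and underscores.
SAFE_NAME = re.compile(r"[A-Za-z0-9._]+")


def parse_args(argv):
    """Parse the two options, with the same defaults as the Perl wrapper."""
    parser = argparse.ArgumentParser(description="affy/expresso wrapper (sketch)")
    parser.add_argument("-bgcorrect", "--bgcorrect", default="mas")
    parser.add_argument("-normalize", "--normalize", default="constant")
    return parser.parse_args(argv)


def render_script(bgcorrect, normalize):
    """Return the R script text, rejecting values that could break out of the string."""
    for value in (bgcorrect, normalize):
        if not SAFE_NAME.fullmatch(value):
            raise ValueError(f"unexpected characters in option value: {value!r}")
    return R_TEMPLATE.format(bgcorrect=bgcorrect, normalize=normalize)


def batch_command(rexe, script_name):
    """Command line equivalent to '$rexe CMD BATCH affy' in the Perl wrapper."""
    return [rexe, "CMD", "BATCH", script_name]


# Example: one option given explicitly, the other left at its default.
args = parse_args(["--normalize", "quantiles"])
print(render_script(args.bgcorrect, args.normalize))
print(batch_command("/biotools/R/R-2.1.0/bin/R", "affy"))
```

Because the option values are pasted into R source code, checking them before rendering the script avoids passing arbitrary R code through the web service.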


2. Write the ACD file

appl: bioconductor [
  documentation: "affy/expresso function of BioConductor"
  version: "1.0"
  groups: "Microarrays"
  nonemboss: "Y"
  executable: affy-expresso.pl
]
string: bgcorrect [
  additional: "Y"
  parameter: "Y"
  default: "mas"
]
string: normalize [
  additional: "Y"
  parameter: "Y"
  default: "constant"
]
outfile: output [
  additional: "Y"
  default: "stdout"
]

/biotools/soapbin/analysis-interfaces/metadata/affy.acd

Notes on the ACD file:
executable: the path is defined in the shell
bgcorrect (input 1): background correction
normalize (input 2): normalization method
output: standard output
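To show how the ACD file is structured (one `type: name [ attributes ]` block per item), here is a small Python sketch that reads this simple subset of the syntax into a list. It is not the SoapLab acd2xml generator and it does not cover the full ACD language: the function name `parse_acd` is hypothetical, and it assumes that attribute values never contain a `]` character.

```python
import re

# One block: "<type>: <name> [ ...attributes... ]"
BLOCK_RE = re.compile(r"(\w+):\s*(\w+)\s*\[(.*?)\]", re.S)
# One attribute: key: "quoted value"   or   key: bareword
ATTR_RE = re.compile(r'(\w+):\s*(?:"([^"]*)"|(\S+))')


def parse_acd(text):
    """Return a list of (type, name, attributes) tuples, in file order."""
    entries = []
    for kind, name, body in BLOCK_RE.findall(text):
        attrs = {}
        for key, quoted, bare in ATTR_RE.findall(body):
            attrs[key] = quoted if quoted else bare
        entries.append((kind, name, attrs))
    return entries


acd_text = """
appl: bioconductor [
  documentation: "affy/expresso function of BioConductor"
  version: "1.0"
  groups: "Microarrays"
  nonemboss: "Y"
  executable: affy-expresso.pl
]
string: bgcorrect [ additional: "Y" parameter: "Y" default: "mas" ]
string: normalize [ additional: "Y" parameter: "Y" default: "constant" ]
outfile: output [ additional: "Y" default: "stdout" ]
"""

for kind, name, attrs in parse_acd(acd_text):
    print(kind, name, attrs.get("default", attrs.get("executable")))
```

Read this way, the file declares one application (`appl`) followed by its two string inputs and one output, which is what the service exposes as input and output ports.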


3, 4, 5: Final steps

3. Convert ACD to XML
   /biotools/soapbin/analysis-interfaces/generator/acd2xml
   From: ../metadata/affy.acd
   To: ../metadata/microarrays/affy-al.xml

4. Start up the SoapLab server
   /biotools/soapbin/analysis-interfaces/run-AppLab-server
   Open question: how to shut down the server?

5. Deploy the new service
   /biotools/soapbin/analysis-interfaces/ws/deploy-web-services


Using the service from Taverna

From the "Available services" window, select "Add new SoapLab scavenger" and enter our server address: http://vega.cineca.it:8082/axis/services


Using the service … (2)

The new processor appears: in the microarrays folder you can find the affy service.

After connecting input & output ports, the service can be launched.


Problems encountered

The documentation is not very clear or complete
How can we transfer (large) files from the personal workstation to the server machine?
We need a permanent and private data area for storing data
We would like to monitor the service while it is running (asynchronous services?)
How can we return data in addition to stdOut and stdErr?
…


A possible (future) workflow

1. Upload one or more CEL files on the server (WS-upload)
2. Analyse the data and get expression levels (WS-expresso)
3. Verify the output data (WS-plot). OK? NO / YES
4. Download the output data and clear the personal space
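The control flow of this possible workflow can be sketched in Python as follows. Everything here is hypothetical: the slide does not say what happens on the "NO" branch, so the sketch assumes that the analysis is repeated with the next parameter setting, and the five callables stand in for services (upload, expresso, plot/verify, download, clear) that are only proposed in this deck.

```python
def run_workflow(cel_files, settings, upload, expresso, verify, download, clear):
    """Upload, analyse, verify, then download; always clear the personal space.

    settings is a sequence of (bgcorrect, normalize) pairs to try in order
    (assumption: a rejected result leads to a new run with other parameters).
    """
    upload(cel_files)
    try:
        for bgcorrect, normalize in settings:
            result = expresso(bgcorrect, normalize)
            if verify(result):          # the "OK?" decision: YES
                return download(result)
            # NO: fall through and try the next setting
        raise RuntimeError("no parameter setting produced acceptable output")
    finally:
        clear()


# Stand-in functions that only record what would be called.
log = []


def fake_upload(files):
    log.append(("upload", tuple(files)))


def fake_expresso(bgcorrect, normalize):
    log.append(("expresso", bgcorrect, normalize))
    return {"bgcorrect": bgcorrect, "normalize": normalize}


def fake_verify(result):
    log.append(("verify",))
    return result["normalize"] == "quantiles"  # pretend only this one looks fine


def fake_download(result):
    log.append(("download",))
    return result


def fake_clear():
    log.append(("clear",))


output = run_workflow(
    ["Sample001.cel", "Sample002.cel"],
    [("mas", "constant"), ("mas", "quantiles")],
    fake_upload, fake_expresso, fake_verify, fake_download, fake_clear,
)
print(output)
print(log)
```

Clearing the personal space in a `finally` block means the server-side data area is released even when no result passes verification.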