Tutorial: analysing Microarray data using BioConductor
Guiyuan Lei
Centre for Integrated Systems Biology of Ageing and Nutrition (CISBAN)
School of Mathematics & Statistics
Newcastle University
http://www.mas.ncl.ac.uk/~ngl9/
4 Feb, 2008
Why Bioconductor
Bioconductor: open source software for bioinformatics
Provides innovative methodology for analyzing genomic data using the R statistical computing environment
R: powerful graphics features and cutting-edge statistical techniques; around 800 packages available, around 60 basic packages (like affy, limma) in Bioconductor
Published papers using Bioconductor: http://www.bioconductor.org/pub
Google Scholar Beta, PubMed, BEPress (Berkeley Electronic Press), Biostatistics, BioMed Central, Bioinformatics and Ingenta
For example, in Bioinformatics, 161 papers found with 'Bioconductor' in title
Pre-processing of data
Entering data into Bioconductor
Extraction of Cerevisiae probesets
Exploratory data analysis
Normalising Microarray data
Probeset level expression to gene level expression
Principal Component Analysis
1. W.G. Alvord et al., "A microarray analysis for differential gene expression in the soybean genome using Bioconductor and R", Briefings in Bioinformatics, September 2007
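Yeast gene names
YeastGeneName<-character()
for(i in 1:length(YeastProbeID)){
  #Use the gene name for probeset i
  YeastGeneName[i]=genenames[i][[1]]
  #If there is no gene name, fall back to the transcript ID
  if(is.na(YeastGeneName[i])){
    YeastGeneName[i]=YeastTranscriptID[i]
  }
}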
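The same fallback can be written without an explicit loop. The following is only a minimal sketch: it assumes the gene names are held in a list with one entry per probeset, and the probe IDs, transcript IDs and gene names are made up for illustration.

```r
# Toy stand-ins for the tutorial's objects (all values are hypothetical)
YeastProbeID      <- c("p1", "p2", "p3")
YeastTranscriptID <- c("T001", "T002", "T003")
genenames         <- list(p1 = "GENE1", p2 = NA, p3 = c("GENE3", "ALIAS3"))

# Take the first name of each entry; vapply guarantees a character vector
firstName <- vapply(genenames, function(x) as.character(x[1]), character(1))

# Fall back to the transcript ID where no gene name is available
YeastGeneName <- ifelse(is.na(firstName), YeastTranscriptID, firstName)
print(YeastGeneName)
```

Taking only the first name of each entry is a choice made for this sketch; entries with several names may need different handling.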
Probeset level expression to gene level expression
In Affymetrix arrays there are usually several probesets that map to one gene.
CerevisiaeGeneNameLevels<-factor(CerevisiaeGeneName)
#Function to average the expression of probesets
#which map to same gene
probeset2genelevel<-function(onesample){
  return(tapply(onesample,CerevisiaeGeneNameLevels,mean))
}
#Do the average for each column/array
CerevisiaeGeneData<-apply(CerevisiaeProbeData,2,probeset2genelevel)
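To see what the averaging does, here is a small self-contained sketch on made-up data. The matrix `probeData` and the vector `geneName` are hypothetical stand-ins for `CerevisiaeProbeData` and `CerevisiaeGeneName`; the `rowsum` variant is an alternative shown for comparison, not part of the original tutorial.

```r
# Toy probeset-level matrix: 5 probesets (rows) x 2 arrays (columns)
probeData <- matrix(c(1, 3, 10, 4, 6,
                      2, 4, 20, 8, 10),
                    nrow = 5,
                    dimnames = list(paste0("ps", 1:5), c("array1", "array2")))
# Gene that each probeset maps to (hypothetical names)
geneName <- c("geneA", "geneA", "geneB", "geneC", "geneC")

# Same approach as the tutorial: average within gene, one array at a time
byTapply <- apply(probeData, 2,
                  function(onesample) tapply(onesample, geneName, mean))

# Alternative: sum the rows per gene in one call, then divide by the
# number of probesets per gene (both are ordered alphabetically by gene)
counts   <- as.vector(table(geneName))
byRowsum <- rowsum(probeData, group = geneName) / counts

print(byTapply)
```

Both versions return a matrix with one row per gene and one column per array. On large arrays `rowsum` avoids calling `tapply` once per column, which may be faster.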
Network Inference
GeneNet
Strimmer's VAR model
GeneNet: partial correlation network
Figure (Yeast Network): Inferred network by GeneNet package for the top 100 differentially expressed Yeast genes
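As background for the network above, the sketch below shows how partial correlations can be computed and how edges can be ranked by them. This is not GeneNet's own estimator: GeneNet relies on a shrinkage estimator suited to having fewer arrays than genes, whereas this illustration swaps in the plain sample covariance on simulated data with many more samples than variables. All variable names are invented for the example.

```r
set.seed(1)
# Toy data: 200 samples (rows) of 4 variables (columns);
# x3 depends on x1 and x2, x4 is independent of the rest
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- x1 + x2 + rnorm(n, sd = 0.5)
x4 <- rnorm(n)
X  <- cbind(x1, x2, x3, x4)

# Partial correlations from the inverse covariance (precision) matrix:
# pcor[i,j] = -omega[i,j] / sqrt(omega[i,i] * omega[j,j])
omega <- solve(cov(X))
pcor  <- -cov2cor(omega)
diag(pcor) <- 1

# Rank candidate edges (upper triangle) by absolute partial correlation
idx   <- which(upper.tri(pcor), arr.ind = TRUE)
edges <- data.frame(node1 = colnames(X)[idx[, 1]],
                    node2 = colnames(X)[idx[, 2]],
                    pcor  = pcor[idx])
edges <- edges[order(-abs(edges$pcor)), ]

# Keep a fixed number of top edges, analogous to a "number" cutoff
topEdges <- head(edges, 2)
print(topEdges)
```

In this toy setup the edges involving x4 should rank at the bottom, since x4 is generated independently of the other variables.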
GeneNet step 1: build longitudinal object
#Need to transpose data matrix, rows to be arrays
m = t(m)
#Need to rearrange the rows so that the rows are ordered by time points
#using the property of design matrix
#the entry design[i,j] with value 1 means array i is for time point j!!!
mnew = t(matrix(nrow=ngenes,ncol=30))
#Get the index of array ordered by time point
arrayindx<-numeric(0)
ntime=5
for(j in 1:ntime){
  arrayindx<-c(arrayindx,grep(1,design[,j]),grep(1,design[,j+ntime]))
}
mnew<-m[arrayindx,]
library("GeneNet")
# step 1: create longitudinal object #
mlong = as.longitudinal(mnew,repeats=c(6,6,6,6,6),time=c(0,1,2,3,4))
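The reordering step is easier to follow on a small example. The sketch below uses a made-up design with 8 arrays, 2 time points and 2 groups. The column layout (first `ntime` columns for one group, next `ntime` columns for the other) is an assumption inferred from the indexing `design[,j+ntime]` above, and all names are hypothetical.

```r
# Toy design: 8 arrays (rows), 2 time points x 2 groups = 4 columns.
# Columns 1..ntime belong to group 1, columns ntime+1..2*ntime to group 2
# (this column layout is an assumption made for the illustration).
ntime  <- 2
design <- matrix(0, nrow = 8, ncol = 2 * ntime,
                 dimnames = list(paste0("array", 1:8),
                                 c("g1.t1", "g1.t2", "g2.t1", "g2.t2")))
# Arrays are deliberately stored out of time order
design[cbind(1:8, c(1, 2, 1, 2, 3, 4, 3, 4))] <- 1

# Toy expression matrix with arrays in rows and 3 genes in columns
m <- matrix(seq_len(8 * 3), nrow = 8,
            dimnames = list(rownames(design), paste0("gene", 1:3)))

# For each time point, collect group-1 arrays, then group-2 arrays
arrayindx <- unlist(lapply(seq_len(ntime), function(j) {
  c(which(design[, j] == 1), which(design[, j + ntime] == 1))
}))
mnew <- m[arrayindx, ]

print(rownames(mnew))
```

After reordering, all arrays for the first time point come first, followed by those for the second. With this toy layout the `repeats` argument of `as.longitudinal` would be `c(4, 4)`, in the same way that the tutorial passes `c(6,6,6,6,6)` for its 30 arrays.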
GeneNet: step 2 to step 5
# step 2: compute partial correlations #
pcor.dyn <- ggm.estimate.pcor(mlong, method = "dynamic")
# step 3: assign (local) fdr values to all possible edges #
m.edges <- network.test.edges(pcor.dyn,direct=TRUE)
dim(m.edges)
# step 4: construct graph containing the 150 top edges #
m.net <- extract.network(m.edges, method.ggm="number", cutoff.ggm=150)
# step 5: plot graph using graphviz #
#If rnames has no "", Need for Graphviz
for(i in 1:ngenes){