Bioinformàtica per a la Recerca Biomèdica Ricardo Gonzalo Sanz [email protected] 20/05/14 Hospital Universitari Vall d’Hebron Institut de Recerca - VHIR Institut d’Investigació Sanitària de l’Instituto de Salud Carlos III (ISCIII) Basic aspects of Microarray technology

Basic Aspects of Microarray Technology and Data Analysis (UEB-UAT Bioinformatics Course - Session 3.2 - VHIR, Barcelona)

May 25, 2015



Dorin Pop

Course: Bioinformatics for Biomedical Research (2014).
Session: 3.2- Basic Aspects of Microarray Technology and Data Analysis.
Statistics and Bioinformatics Unit (UEB) & High Technology Unit (UAT) from Vall d'Hebron Research Institute (www.vhir.org), Barcelona.
Transcript
Page 1: Basic Aspects of Microarray Technology and Data Analysis (UEB-UAT Bioinformatics Course - Session 3.2 - VHIR, Barcelona)

Bioinformàtica per a la Recerca Biomèdica Ricardo Gonzalo Sanz

[email protected] 20/05/14

Hospital Universitari Vall d’Hebron Institut de Recerca - VHIR

Institut d’Investigació Sanitària de l’Instituto de Salud Carlos III (ISCIII)

Basic aspects of Microarray technology

Page 2

1 Introduction

2 Different types of arrays. Manufacturing. DNA/RNA/Protein

3 Affymetrix microarrays manufacture.

4 Different types of Affymetrix arrays.

5 Microarray experiment workflow.

6 Quality Controls.

Page 3

1 Introduction

reproducibility

only show you what you’re looking for

what about ‘indels’, inversions, translocations...

accuracy

sensitivity

Page 4

1 Introduction

Page 5

1 Introduction

RNA-Seq was superior in detecting low abundance transcripts

RNA-Seq was also better at differentiating biologically critical isoforms.

RNA-Seq demonstrated a broader dynamic range than microarray.

Page 6

1 Introduction

• Molecular biology offers many techniques for measuring gene expression (e.g. Northern blot).

• The main contribution of microarrays (Schena et al. (1995) Science 270:467-70) was not what could be measured, but the number of measurements that could be made simultaneously.

• Before microarrays: genes were studied one by one.

• After microarrays: all genes can be studied together.

Page 7

1 Introduction

• But... what is a microarray, in a few words?

DNA fixed to a solid surface (nylon, silica, glass, ...)

The sample RNA is labeled and has to bind specifically to the DNA fixed on the solid surface.

The DNA fixed to the surface is usually called the “probe”

The labeled RNA is usually called the “target”
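
The idea of specific binding between probe and target can be illustrated with a toy sketch. It assumes perfect Watson-Crick complementarity and writes both strands in the DNA alphabet for simplicity; real hybridization tolerates some mismatches, and the sequences and function names below are invented for illustration.

```python
# Toy model of specific hybridization (illustrative only, not a physical model).
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def hybridizes(probe, target):
    """True when the target is perfectly complementary to the probe."""
    return target == reverse_complement(probe)


probe = "ATGGCGTT"                           # hypothetical probe fixed on the surface
matching_target = reverse_complement(probe)  # the labeled molecule this probe captures
unrelated_target = "GGGTTTAA"                # a labeled molecule from another gene

print(hybridizes(probe, matching_target))    # binds
print(hybridizes(probe, unrelated_target))   # does not bind
```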

Page 8

Important to know in advance...

1 Introduction

• Microarrays are usually hypothesis-generating:

They highlight specific genes or features that are particularly

interesting for follow-up experiments.

An exception would be biomarker discovery studies.

• This does not reduce the importance of experimental design

Page 9

2

Two color microarrays (cDNA)

• Probes are usually long (typically hundreds of nucleotides for cDNA probes)

• The probe is fixed to a glass slide

• Samples are labeled with two fluorochromes (Cy3/Cy5).

• The two samples are compared directly because they are hybridized to the same array.

• Each gene appears only a few times on the array

• Long probes facilitate cross-hybridization

• Reproducibility is not very good.

Different types of arrays. Manufacturing. DNA/RNA

Page 10

2

One color microarrays

• Short probes (20-25 nt)

• The target is labeled with only one fluorochrome

• Only one sample is hybridized to each array.

• Each gene is represented by many probes on the array

Different types of arrays. Manufacturing. DNA/RNA

Page 11

2 Different types of arrays. Manufacturing. DNA/RNA

DNA

• DNA Polymorphism (GWAS)

• Transcription Factors

• Resequencing

• Cytogenetics

RNA

• Expression

• Alternative splicing

• microRNA

Page 12

2 Different types of Affymetrix arrays.

3’ 5’

3’ IVT Arrays

• Biased measurement of gene expression

• The most widely used array type in the literature. Available for many species.

Only transcripts with a poly(A) tail and an intact 3’ end are amplified and have the chance to hybridize correctly.

Page 13

2 Different types of Affymetrix arrays.

3’ 5’

Gene Arrays

Exon Arrays

Gene/Exon Arrays

• Gene arrays are the most widely used (good quality/price ratio)

• Gene 2.0 arrays have a more up-to-date library and also include lncRNAs

Page 14

2 Different types of expression arrays.

• 153 organisms on the array (human, mouse, rat, canine, ...)

• 100% coverage of miRBase v17

• 2,216 snoRNAs and scaRNAs (human small nucleolar and small Cajal body-specific RNAs)

• Low input amounts (130 ng total RNA)

• 2,999 probe sets unique to pre-miRNA hairpins

• Able to differentiate pre-miRNAs from mature miRNAs

• Useful for FFPE samples

miRNA

Page 15

2 Different types of expression arrays.

HTA array

Page 16

3 Affymetrix microarrays manufacture.

Photolithography

Page 17

3 Affymetrix microarrays manufacture.

Page 18

5 Microarray experiment workflow

Page 19

5 Microarray experiment workflow

Page 20

5 Microarray experiment workflow

Page 21

6 Quality Controls

Page 22

6 Quality Controls

Page 23

6 Quality Controls

Length of amplified cRNA

Page 24

6 Quality Controls

Length of fragmented cRNA

Page 25

Bioinformàtica per a la Recerca Biomèdica Ricardo Gonzalo Sanz

[email protected] 20/05/14

Hospital Universitari Vall d’Hebron Institut de Recerca - VHIR

Institut d’Investigació Sanitària de l’Instituto de Salud Carlos III (ISCIII)

Basic aspects of Microarray Data Analysis

Page 26

1 Introduction. Experimental design

2 Quality control

3 Normalization

4 Filtering

5 Statistical inference of differential expression

6 Clustering

7 Annotation

8 Biological interpretation

Page 27

1 Introduction. Experimental design

Page 28

1 Introduction. Experimental design

Page 29

1 Introduction. Experimental design

Page 30

1 Introduction. Experimental design

Page 31

1 Introduction. Experimental design

Page 32

1 Introduction. Experimental design

Microarrays Analysis

Workflow

Page 33

2 Quality Control

Page 34

2 Quality Control

Was the experiment a success???

• Microarray experiments generate huge quantities of data

• The standard statistical approach uses plots to check quality. Plots:

show all data together

highlight structures

may help to detect problems (“unusual patterns”)

It is hard to decide if things “seem to be all right” just by looking at the numbers.

Page 35

2 Quality Control

Diagnostic plots for microarrays:

• Microarray data are usually considered at two levels

1. Low level. Data coming directly from the scanner

2. High level. Processed from low-level data. Expression values, normalized or not.

• Some plots are specific to certain types of arrays or to one level

Page 36

2 Quality Control

Diagnostic plots for microarrays:

1. Low level:

Layout image

Degradation plots (only in 3’IVT)

Histogram/density plots

PCA, Boxplot

2. High level:

MA plots

Model-based plots (NUSE, RLE)

PCA, Boxplot

Page 37

2 Quality Control

Diagnostic plots for microarrays. Low level. Layout image.

Page 38

2 Quality Control

Diagnostic plots for microarrays. Low level. RNA degradation plot (3’IVT arrays)

Page 39

2 Quality Control

Diagnostic plots for microarrays. Low level. Histogram/density plot

Page 40

2 Quality Control

Diagnostic plots for microarrays. Low level. Boxplot

Page 41

2 Quality Control

Page 42

2 Quality Control

Diagnostic plots for microarrays. Low level. PCA

Page 43

2 Quality Control

Diagnostic plots for microarrays. Low level. PCA

Page 44

2 Quality Control

Page 45

2 Quality Control

Diagnostic plots for microarrays. High level. RLE

Page 46

2 Quality Control

Page 47

2 Quality Control

Diagnostic plots for microarrays. High level. NUSE

Page 48

2 Quality Control

Diagnostic plots for microarrays. High level. MA plots

• MA plots allow pairwise comparison of the log-intensities of each array against a reference array, and the identification of intensity-dependent biases.

• The Y axis shows the log-ratio of the intensity of one array to that of the reference median array, called “M”; the X axis shows the average log-intensity of both arrays, called “A”.

• Probe levels are not expected to differ much between arrays, so we expect an MA plot centered on the Y=0 line from low to high intensities.
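
The M and A values described above can be computed as in the following minimal sketch. It assumes raw intensities on a linear scale, log2 as the logarithm, and a reference built as the probe-wise median across arrays; the intensity values and function names are made up for illustration.

```python
import math
from statistics import median


def median_reference(arrays):
    """Probe-wise median intensity across all arrays (the reference array)."""
    return [median(values) for values in zip(*arrays)]


def ma_values(array, reference):
    """Return (M, A) for one array against the reference.

    M = log2(array) - log2(reference)       (log-ratio, Y axis)
    A = (log2(array) + log2(reference)) / 2 (mean log-intensity, X axis)
    """
    m_vals, a_vals = [], []
    for x, r in zip(array, reference):
        lx, lr = math.log2(x), math.log2(r)
        m_vals.append(lx - lr)
        a_vals.append((lx + lr) / 2)
    return m_vals, a_vals


# Three hypothetical arrays, four probes each (invented intensities).
arrays = [
    [100.0, 200.0, 400.0, 800.0],
    [128.0, 256.0, 512.0, 1024.0],
    [64.0, 256.0, 512.0, 2048.0],
]
reference = median_reference(arrays)
m_vals, a_vals = ma_values(arrays[1], reference)
for m, a in zip(m_vals, a_vals):
    print(f"A = {a:.3f}, M = {m:.3f}")
```

Plotting M against A for every array, and checking that the cloud stays centered on M = 0, gives the diagnostic described on this slide.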

Page 49

2 Quality Control

Diagnostic plots for microarrays. High level. MA plots

Page 50

2 Quality Control

Page 51

3 Normalization

The goal of normalization is to adjust for the effects that are due to variations in the

technology rather than the biology.
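
The transcript does not name a normalization method at this point. As one common choice (used, for example, as a step of RMA), quantile normalization is sketched below under that assumption: every array is forced to share the same intensity distribution, so array-wide technical shifts are removed. Ties are handled naively and the data are invented.

```python
def quantile_normalize(arrays):
    """Quantile-normalize a list of equal-length arrays (one list per array).

    Each value is replaced by the mean, across arrays, of the values
    that share its rank. Ties are broken by position (naive handling).
    """
    n_probes = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    # Mean of the k-th smallest value over all arrays.
    rank_means = [
        sum(s[k] for s in sorted_arrays) / len(arrays) for k in range(n_probes)
    ]
    normalized = []
    for a in arrays:
        order = sorted(range(n_probes), key=lambda i: a[i])
        out = [0.0] * n_probes
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


# Two hypothetical arrays with three probes each.
raw = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
norm = quantile_normalize(raw)
print(norm)
```

After normalization both arrays contain the same set of values, while the ranking of probes within each array is preserved.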

Page 52

3 Normalization

Page 53

3 Normalization

Page 54

3 Normalization

Page 55

4 Filtering

• In a microarray experiment only a few hundred to a few thousand genes change their expression between conditions.

• The researcher wants to keep the number of tests (genes) as low as possible while keeping the interesting genes in the selected subset.

• If the truly differentially expressed genes are over-represented among those selected in the filtering step, the FDR associated with a given threshold of the test statistic will be lowered by the filtering.

Genes that do not change introduce noise, so it is better to exclude them before the statistical analysis is done.
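
The effect of filtering on the FDR can be illustrated with a small sketch. It assumes the Benjamini-Hochberg procedure as the FDR method (the slides do not say which one is used) and invented p-values; the filter is assumed to have been applied without looking at the group labels.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values: the first three genes truly change, the rest do not.
p_all = [0.001, 0.004, 0.012, 0.2, 0.3, 0.4, 0.5, 0.6, 0.8, 0.9]

# Hypothetical outcome of a filter that removed five unchanged genes.
kept = [0, 1, 2, 3, 7]
p_filtered = [p_all[i] for i in kept]

adj_all = bh_adjust(p_all)
adj_filtered = bh_adjust(p_filtered)
print([round(p, 3) for p in adj_all[:3]])
print([round(p, 3) for p in adj_filtered[:3]])
```

With fewer tests, the same raw p-values receive smaller adjusted p-values, which is the gain the slide refers to.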

Page 56

4 Filtering

There are different types of filtering:

• Annotation features (specific):

Specific gene features (e.g. GO term, presence of transcriptional regulatory elements in promoters, etc.)

Data derived from IPA

• Signal features (non-specific):

Percentage of intensities greater than a user-defined value

Interquartile range (IQR) greater than a defined value

Page 57

4 Filtering

Signal filtering: This technique has as its premise the removal of genes that are

deemed to be not expressed or unchanged according to some specific criterion that

is under the control of the user.
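
A minimal sketch of such a signal filter follows. It combines the two non-specific criteria listed earlier (fraction of intensities above a threshold, and IQR above a threshold); requiring both at once is a choice made for this example, and the thresholds, gene names and values are all hypothetical.

```python
from statistics import quantiles


def iqr(values):
    """Interquartile range (Q3 - Q1) of a list of expression values."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


def fraction_above(values, threshold):
    """Fraction of arrays in which the gene exceeds the intensity threshold."""
    return sum(v > threshold for v in values) / len(values)


def signal_filter(expr, min_intensity, min_fraction, min_iqr):
    """Keep genes that are both expressed and variable across arrays.

    expr maps gene name -> list of log2 expression values (one per array).
    """
    kept = {}
    for gene, values in expr.items():
        expressed = fraction_above(values, min_intensity) >= min_fraction
        variable = iqr(values) >= min_iqr
        if expressed and variable:
            kept[gene] = values
    return kept


# Invented log2 expression values for three genes on five arrays.
expr = {
    "geneA": [8.0, 8.5, 9.0, 11.0, 11.5],    # expressed and variable
    "geneB": [3.0, 3.1, 3.2, 3.0, 3.1],      # not expressed
    "geneC": [10.0, 10.1, 10.0, 10.2, 10.1], # expressed but unchanged
}
kept = signal_filter(expr, min_intensity=5.0, min_fraction=0.5, min_iqr=0.5)
print(list(kept))
```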

Page 58
Page 59

5 Statistical inference of differential expression

• Indirect comparisons: 2 groups, unpaired

• Direct comparisons: 2 groups, paired
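
The difference between the unpaired and the paired comparison can be sketched for a single gene with ordinary t statistics. This is only an illustration: it uses the classical Welch and paired t statistics rather than the moderated statistics of limma, and the expression values are invented.

```python
from math import sqrt
from statistics import mean, variance


def unpaired_t(group_a, group_b):
    """Welch t statistic for two independent groups (group_b minus group_a)."""
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    return (mean(group_b) - mean(group_a)) / se


def paired_t(before, after):
    """Paired t statistic computed on the within-pair differences."""
    diffs = [a - b for a, b in zip(after, before)]
    return mean(diffs) / sqrt(variance(diffs) / len(diffs))


# Hypothetical log2 expression of one gene in three subjects.
control = [7.0, 8.0, 9.0]
treated = [8.0, 9.1, 9.9]

t_unpaired = unpaired_t(control, treated)
t_paired = paired_t(control, treated)
print(round(t_unpaired, 3), round(t_paired, 3))
```

When the same subjects are measured in both conditions, the paired statistic removes the between-subject variability, so the same shift of about one log2 unit gives a much larger t value.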

Page 60

5 Statistical inference of differential expression

limma package (Gordon Smyth)

Page 61

5 Statistical inference of differential expression

Page 62

5 Statistical inference of differential expression

Page 63

5 Statistical inference of differential expression

Page 64

5 Statistical inference of differential expression

Page 65

6 Clustering

Types:

Supervised clustering tries to find the best partition for data that belong to a known set of classes.

Unsupervised clustering tries to define the number and the size of the classes into which the transcription profiles can be fitted.

Page 66

6 Clustering

Page 67

6 Clustering

Hierarchical Clustering (HCL)

• HCL is an agglomerative/divisive clustering method.

• The iterative process continues until all groups are connected in a hierarchical tree.

• Samples that are more similar to each other are placed closer together.
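
The agglomerative variant can be sketched in a few lines. Euclidean distance and average linkage are assumptions made for this example (the slides do not specify either), and the sample profiles are invented.

```python
from math import dist  # Euclidean distance, Python 3.8+


def average_linkage(c1, c2, samples):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(dist(samples[i], samples[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def agglomerative(samples):
    """Merge the two closest clusters until one remains.

    Returns the merges in order as (cluster_a, cluster_b, distance),
    where clusters are tuples of sample indices.
    """
    clusters = [(i,) for i in range(len(samples))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_linkage(clusters[a], clusters[b], samples)
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return merges


# Four hypothetical samples described by the expression of two genes.
samples = [[1.0, 1.0], [1.0, 2.0], [8.0, 8.0], [9.5, 8.0]]
merges = agglomerative(samples)
for left, right, d in merges:
    print(left, right, round(d, 2))
```

The list of merges is the information a dendrogram displays: the most similar samples are joined first, and the last merge connects all groups into one tree.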

Page 68

6 Clustering

Page 69

7 Annotation

Page 70

8 Biological interpretation

Gene Ontology

Page 71

8 Biological interpretation