Paper Review on Cross-species Microarray Comparison
Hong Lu
2008-10-14
Title: Conservation of Regional Gene Expression in Mouse and Human Brain
Authors: Strand AD, Olson JM, et al.
Year: 2007
Journal: PLoS Genetics
Purpose
Within-species comparison: to find the expression differences that distinguish resistant from sensitive tissues and cell types.
Cross-species comparison: to provide a framework for exploring how well the mouse can model diseases of the human brain.
Human samples
                     Group I                                Group II                 Total
Tissues              3: caudate, cerebellum, motor cortex   2: caudate, cerebellum
Persons, men         8                                      7                        15
Persons, women       4                                      2                        6
Persons, total       12                                     9                        21
Total slides         12 x 3 = 36                            9 x 2 = 18               54
Age range (years)    36 ~ 77                                22 ~ 72                  22 ~ 77
Age mean (years)     58                                     49                       54
Array: Affymetrix HG-U133A
Probesets #: 22,283
Materials
Species              Human                                  Mouse (C57BL)
Tissues              3: caudate, cerebellum, motor cortex   3: caudate, cerebellum, motor cortex
Samples, male        8                                      1
Samples, female      4                                      5
Samples, total       12                                     6
Total slides         12 x 3 = 36                            6 x 3 = 18
Age range            36 ~ 77 (years)                        35 (days)
Age mean             58 (years)                             35 (days)
Affymetrix array     HG-U133A                               MOE_430A_2
Probesets #          22,283                                 22,690
Microarray analysis
1) Normalize the CEL files with Robust Multi-array Average (RMA).
2) Fit a linear model for each of the three tissue pairs with LIMMA (Bioconductor package):
gene expression ≈ donor + tissue type
• Caudate/Cerebellum
• BA4 Cortex/Cerebellum
• BA4 Cortex/Caudate
3) Get the log ratio, paired t-statistic and p-value for each probeset.
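A minimal sketch of step 3 for a single probeset, assuming the RMA signals are on the log2 scale and that each donor contributes one sample per tissue. It computes an ordinary paired t-statistic in plain Python, not LIMMA's moderated statistic, and it omits the p-value. The function name and the input values are hypothetical.

```python
import math
from statistics import mean, stdev


def paired_log_ratio_and_t(tissue_a, tissue_b):
    """Return (log ratio, paired t) for one probeset.

    tissue_a[i] and tissue_b[i] are log2 signals from the same donor i.
    Assumes the per-donor differences are not all identical (non-zero spread).
    """
    if len(tissue_a) != len(tissue_b) or len(tissue_a) < 2:
        raise ValueError("need at least two donors with both tissues")
    # On the log2 scale a ratio is a difference, taken within each donor.
    diffs = [a - b for a, b in zip(tissue_a, tissue_b)]
    log_ratio = mean(diffs)
    # Standard error of the mean difference (n - 1 in the variance).
    standard_error = stdev(diffs) / math.sqrt(len(diffs))
    return log_ratio, log_ratio / standard_error


# Hypothetical log2 signals for three donors.
caudate = [10.0, 11.0, 12.0]
cerebellum = [8.0, 9.5, 9.5]
log_ratio, t_stat = paired_log_ratio_and_t(caudate, cerebellum)
```

Pairing by donor removes the donor term of the model, which is why only the within-donor differences enter the statistic.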
Sample result (human)
             Score                                   Caudate/Cerebellum …
Caudate   Cerebellum   Motor cortex   Probeset ID   Log Ratio   t      P.value    …
106.05    -89.15       -16.9          215241_at     6.08        65.1   1.65E-21   …
103.2     -62.01       -41.19         220313_at     5.95        71.9   3.13E-22   …
93.7      -51.66       -42.04         207307_at     5.04        71.9   3.16E-22   …
Caudate score = t-score(Caudate/Cerebellum) + t-score(Caudate/BA4 Cortex)
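The score definition can be sketched as follows. Only the caudate formula is given above, so extending it to the other two regions by analogy is an assumption, as is the use of t(B/A) = -t(A/B) for a reversed pair. The two t-scores not shown in the sample table (40.95 and -24.05) are values implied by the first row under that definition, not reported numbers, and the function name is hypothetical.

```python
def regional_scores(t_cau_cer, t_cau_ctx, t_cer_ctx):
    """Sum the two relevant pairwise t-scores for each region.

    t_cau_cer: t-score of Caudate/Cerebellum
    t_cau_ctx: t-score of Caudate/BA4 Cortex
    t_cer_ctx: t-score of Cerebellum/BA4 Cortex
    Reversing a pair flips the sign of its paired t-score.
    """
    return {
        "caudate": t_cau_cer + t_cau_ctx,
        "cerebellum": -t_cau_cer + t_cer_ctx,
        "motor_cortex": -t_cau_ctx - t_cer_ctx,
    }


# First sample row (215241_at): 65.1 is from the table; the other two
# t-scores are back-calculated from the listed regional scores.
scores = regional_scores(65.1, 40.95, -24.05)
```

Under this definition the three regional scores of a probeset always sum to zero, which matches each row of the sample table.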
Different Regions of the Brain Show Many Statistically Significant Differentially Expressed Genes
To select sets of genes whose expression was highly enriched in one of the three regions (caudate, cerebellum, BA4 cortex):
1) Require p < 0.001 and log ratio ≥ 1 in both relevant pair-wise comparisons.
2) Sum the log ratios of the two relevant comparisons; for example, log2(BA4/caudate) + log2(BA4/cerebellum) is the regional score of candidate BA4 genes.
3) Order the probesets by the sum of log ratios.
4) Cull probesets whose summed regional score is > 2 in more than one region.
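The four selection steps can be sketched as one filter. The input layout (a dictionary mapping each probeset ID to its two relevant (log ratio, p-value) pairs per region), the function name and the parameter names are all hypothetical; only the thresholds come from the steps above.

```python
def select_enriched(probesets, region, p_cut=0.001, lr_cut=1.0, cull_cut=2.0):
    """Return [(probeset_id, summed log ratio)] for one region, best first.

    probesets: {probeset_id: {region_name: [(log_ratio, p_value),
                                            (log_ratio, p_value)]}}
    Each region lists its two relevant pair-wise comparisons.
    """
    selected = []
    for probeset_id, by_region in probesets.items():
        # Step 1: both relevant comparisons must pass both thresholds.
        if not all(p < p_cut and lr >= lr_cut for lr, p in by_region[region]):
            continue
        # Step 2: summed log ratio (regional score) for every region.
        sums = {name: sum(lr for lr, _ in pairs)
                for name, pairs in by_region.items()}
        # Step 4: cull probesets scoring > cull_cut in more than one region.
        if sum(1 for value in sums.values() if value > cull_cut) > 1:
            continue
        selected.append((probeset_id, sums[region]))
    # Step 3: order by the summed log ratio, largest first.
    selected.sort(key=lambda item: item[1], reverse=True)
    return selected
```

A probeset can pass step 1 for one region and still be culled in step 4, for example when its signal is high in caudate, slightly lower in cerebellum and much lower in cortex.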
Table 3: Selected Regionally Enriched Genes in Human and Mouse Brain Tissues
Gene Expression Variation between Tissues and Individuals
gene expression ≈ donor + tissue type
Within-tissue variance vs. between-tissue variance
The variance for a probeset across n samples was calculated from the values x_i, where x_i is the RMA signal for that probeset on array i (i = 1, ..., n).
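The variance expression itself is not reproduced in this text. As a sketch, assuming the usual unbiased sample variance (the n - 1 denominator is an assumption, not confirmed here), with x_i the RMA signal of the probeset on array i and \bar{x} the mean over the n arrays, it would read:

```latex
s^2 = \frac{1}{n-1} \sum_{i=1}^{n} \left( x_i - \bar{x} \right)^2,
\qquad
\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i
```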
The between-tissue variability was greater for 89% of the human probesets and 85% of the mouse probesets.
Conclusion: Compared with the expression differences dictated by regional identity, age and gender appear to have either effects of small magnitude, or effects of large magnitude on only a small fraction of genes, even in humans.
Cross-Species Comparison of Regional Gene Expression
What’s the relationship between mouse probesets and human probesets?
Probesets of both species are linked through ENSEMBL:
Mouse probesets (example: 1415688_at) → Mouse ENSEMBL identities
Human probesets (example: 209141_at) → Human ENSEMBL identities
dN/dS
dN = number of nonsynonymous substitutions / number of nonsynonymous sites
dS = number of synonymous substitutions / number of synonymous sites
dN/dS was computed with codeml (PAML package, pair-wise maximum-likelihood method) using the F3 × 4 codon model.
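The two definitions can be illustrated with a few lines of arithmetic. This is only a sketch of the ratios as defined above: codeml estimates dN and dS by maximum likelihood rather than from raw counts, and the function name and the counts used here are hypothetical.

```python
def dn_ds(nonsyn_subs, nonsyn_sites, syn_subs, syn_sites):
    """Return (dN, dS, dN/dS) from raw substitution and site counts.

    The ratio is None when dS is zero, because it is then undefined.
    """
    dn = nonsyn_subs / nonsyn_sites
    ds = syn_subs / syn_sites
    ratio = dn / ds if ds > 0 else None
    return dn, ds, ratio


# Hypothetical counts for one orthologous gene pair.
dn, ds, ratio = dn_ds(nonsyn_subs=3, nonsyn_sites=300,
                      syn_subs=10, syn_sites=100)
```

A low ratio means nonsynonymous changes are rare relative to synonymous ones, which is read as stronger sequence conservation.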
Select 2,998 one-to-one orthologous pairs.
Compute the normalized Euclidean distance between all possible nonself pairs of tissues, over g probesets, where x and y are any two mouse or human samples. Euclidean distances between regions were calculated using the mean RMA probeset signals for each tissue.
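The distance expression is not reproduced in this text. The sketch below assumes that "normalized" means dividing the summed squared differences by the number of probesets g before taking the square root; that choice, the function names and the profile labels are assumptions made for illustration.

```python
import math
from itertools import combinations


def normalized_euclidean(x, y):
    """Distance between two profiles over the same g probesets.

    Assumed normalization: sqrt(sum((x_i - y_i)^2) / g).
    """
    if len(x) != len(y) or not x:
        raise ValueError("profiles must be non-empty and of equal length")
    g = len(x)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / g)


def pairwise_distances(profiles):
    """Distances for all nonself pairs of named mean-signal profiles."""
    return {
        (name_a, name_b): normalized_euclidean(profiles[name_a], profiles[name_b])
        for name_a, name_b in combinations(profiles, 2)
    }
```

Each profile would hold the mean RMA signal per orthologous probeset for one tissue of one species, so the comparison of interest is whether same-region, cross-species pairs come out closer than different-region, same-species pairs.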
Conclusion: Orthologous Brain Regions between Species Are More Similar to Each Other than to Different Regions within a Species
Analysis of GO categories
Human: 70.6% of the probesets had an assigned GO category.
Mouse: 66.2% of the probesets had an assigned GO category.
For each GO category, compare:
(a) the total number of probes in that category, vs.
(b) the number of probes in that category appearing on a list of differentially expressed probes (p < 0.05).
Test used to detect which categories are over-represented:
• Fisher's exact test if a or b < 10
• Pearson chi-square otherwise
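A sketch of the test selection in plain Python. The 2 x 2 table layout, the one-sided form of Fisher's test, and the extra inputs (the total number of annotated probes and the total number of differentially expressed probes) are assumptions; all function names are hypothetical. It needs Python 3.8 or later for `math.comb`.

```python
import math


def fisher_enrichment_p(b, a, de_total, total):
    """One-sided Fisher's exact test: P(at least b category probes on the list)."""
    upper = min(a, de_total)
    favorable = sum(math.comb(a, k) * math.comb(total - a, de_total - k)
                    for k in range(b, upper + 1))
    return favorable / math.comb(total, de_total)


def chi_square_p(b, a, de_total, total):
    """Pearson chi-square test on the 2 x 2 table (1 degree of freedom).

    Assumes every row and column total is non-zero.
    """
    observed = [[b, a - b],
                [de_total - b, total - a - de_total + b]]
    rows = [sum(row) for row in observed]
    cols = [sum(col) for col in zip(*observed)]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))


def category_p_value(a, b, de_total, total):
    """Pick the test by the rule above: Fisher if a or b < 10, else chi-square."""
    if a < 10 or b < 10:
        return "fisher", fisher_enrichment_p(b, a, de_total, total)
    return "chi-square", chi_square_p(b, a, de_total, total)
```

The chi-square p-value is not directional, so a category flagged by it still needs a check that b exceeds its expected count before it is called over-represented.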
Conclusion: Mouse and Human Brain Regions Share a Higher Number of Overrepresented Functional Groups than Would Be Expected by Chance
Relationships between Tissue-Specific Expression, Conservation of Sequence, and Conservation of Expression
(A) X-axis: dN/dS ratios, least conserved (left) to most conserved (right). Y-axis: Correlation coefficient between human and mouse log ratios.
(B) X-axis: The percent nucleotide identity, low (left) to high (right). Y-axis: Correlation coefficient between human and mouse log ratios.
Conclusion: Genes with High Variance across Tissues Have Greater Conservation of Nucleotide Sequence
Conclusion
1) Within-species comparison: The different brain regions have distinctly different expression profiles.
2) Cross-species comparison: Region-specific genes are conserved at both the sequence and gene expression levels (the two are positively correlated).
Advantages and shortcomings?
Thanks