Post-Genomics Experimental Design
CSC8309 - Gene Expression and Proteomics
Simon Cockell & Cedric Simillion
Slide 2
Outline
- Introduction
  - Post-Genomic Technologies
  - The Importance of Design
- Experimental Design
  - When Design Goes Bad
  - More Commonly Made Mistakes
  - Things Done Right
- Types of Experiment
Slide 3
Post-Genomic Technologies
- Set of technologies that have become prevalent since the advent of genome sequencing
- Also referred to as functional genomics technologies
  - Transcriptomics
  - Proteomics
  - Metabolomics
- 'High-throughput' techniques: generate lots of data, fast
Slide 4
Importance of Design
- Functional genomics experiments are expensive
- The quantity of data can mask interesting biological variation in noise
- Bad design can increase noise, or at least fail to minimise it
Slide 5
When Design Goes Wrong: A trivial example
- Bill and Ben want to identify proteins upregulated in response to water starvation in a drought-resistant plant
- So, Bill went away and grew some plants, and so did Ben
Slide 6
When Design Goes Wrong: continued
- Bill chose 3 plants, and Ben chose 4
- Bill grew his at home in normal conditions, and Ben grew his in the lab with minimal water
- Then, after a few days of growth, they each took samples from their plants and ran 2D-PAGE
Slide 7
When Design Goes Wrong: analysis
- They used average gels of the 2 groups of plants to find differentially expressed proteins
- They did t-tests for every spot on the gels, and found 400 of 2500 proteins (95% level) with significantly altered expression in drought conditions
- What now? They only wanted 10-20
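A rough sanity check on those numbers, as a sketch only: testing 2500 spots at the 95% level means a 5% false positive rate per test, so some hits are expected by chance even if no protein changed at all. The variable names are illustrative, and the calculation assumes the worst case in which no spot is truly differentially expressed.

```python
# Figures taken from the slide; variable names are illustrative.
n_spots = 2500        # spots tested on the gels
alpha = 0.05          # "95% level" = 5% false positive rate per test
n_significant = 400   # spots reported as significant

# Assumption: no protein truly changed. Each test still has probability
# alpha of coming up "significant" by chance alone.
expected_false_positives = n_spots * alpha

# Rough worst-case share of the reported hits that could be chance findings.
rough_false_share = expected_false_positives / n_significant

print("expected false positives:", expected_false_positives)
print("worst-case share of hits due to chance:", round(rough_false_share, 2))
```

So on the order of a hundred or more of the 400 spots could be chance findings, before the design problems are even considered.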
Slide 8
When Design Goes Wrong: What did they do wrong?
- Confounding
  - Experiment cannot distinguish between a number of factors:
    - Drought
    - Experimenter effects
    - Difference between home and lab
- Selection
  - Bill or Ben could be biased in how they selected plants, even unconsciously
  - Randomised selection is preferred
- Unbalanced
  - Better to have equal numbers in each group for many statistical analyses
Slide 9
When Design Goes Wrong: How to improve
- Grow plants together under the same conditions
- Select an equal number randomly for both Bill and Ben
- Both halve their plants and grow normal and drought plants to the same protocol
- Better still, either Bill or Ben should do the whole experiment
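One way the random, balanced selection could be done in practice is sketched below. The function name `assign_groups`, the plant labels, and the seed are all hypothetical; the slides do not prescribe a procedure.

```python
import random

def assign_groups(plant_ids, groups=("normal", "drought"), seed=None):
    """Randomly split plants into equally sized treatment groups."""
    if len(plant_ids) % len(groups) != 0:
        raise ValueError("number of plants must divide evenly between groups")
    rng = random.Random(seed)      # seeded so the allocation can be reproduced
    shuffled = list(plant_ids)
    rng.shuffle(shuffled)          # random order removes selection bias
    size = len(shuffled) // len(groups)
    return {
        group: sorted(shuffled[i * size:(i + 1) * size])
        for i, group in enumerate(groups)
    }

# Hypothetical labels for eight plants grown together under the same conditions.
plants = [f"plant_{i:02d}" for i in range(1, 9)]
design = assign_groups(plants, seed=42)
for group, members in design.items():
    print(group, members)
```

Shuffling before splitting addresses the selection problem, and the even split keeps the design balanced.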
Slide 10
When Design Goes Wrong: Post mortem
- Even with a rigorously designed experiment, Bill and Ben may still have obtained confusing results
- It is common to identify many differentially expressed genes/proteins
  - This can be a true reflection of the biology
  - False discovery rate is necessarily high in post-genomic experiments, because of the number of hypotheses being tested
- Good experimental design could have reduced the complexity of their output, providing a base for a robust statistical analysis of the data
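The slides do not name a method for controlling the false discovery rate. As an illustration only, the sketch below uses the Benjamini-Hochberg step-up procedure, which is one common choice; the p-values are invented for the example.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Indices of hypotheses rejected by the Benjamini-Hochberg procedure."""
    m = len(p_values)
    # Rank hypotheses from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose p-value is at most (k / m) * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below that cutoff.
    return sorted(order[:cutoff_rank])

# Invented p-values for ten spots (not data from the slides).
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]

uncorrected = [i for i, value in enumerate(p) if value <= 0.05]
print("significant without correction:", uncorrected)
print("significant after Benjamini-Hochberg:", benjamini_hochberg(p))
```

In this toy example, five spots pass an uncorrected 5% threshold but only two survive the correction.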
Slide 11
Choice of Technology
- Microarray or proteomics?
- Affy or two-colour arrays? Reference sample?
- 2D gels or LC-MS? Single stain or DIGE? Reference sample?
- No easy (or correct) answers
- Depends very much on the individual experiment
Slide 12
Further Pitfalls: Fahrenheit and the Cow
- Based on urban myth
- Still an important message
  - No individual is typical
  - Biological, as well as technical, replicates required
Slide 13
Further Pitfalls: The pester problem
Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy?
- Ask a question often enough, and eventually you'll get the answer you're after
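The pester problem can be put in numbers. As a minimal sketch, assuming the repeated tests are independent and each is run at the same significance level, the chance of at least one false positive grows quickly with the number of tests. The function name is illustrative.

```python
def prob_at_least_one_false_positive(n_tests, alpha=0.05):
    """Chance of one or more false positives in n independent tests."""
    # Each test avoids a false positive with probability (1 - alpha);
    # all n tests avoid one with probability (1 - alpha) ** n.
    return 1.0 - (1.0 - alpha) ** n_tests

for n in (1, 5, 20, 100):
    print(n, "tests:", round(prob_at_least_one_false_positive(n), 3))
```

With enough askings, a "yes" is close to guaranteed even when the true answer is "no".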
Slide 14
Further Pitfalls
- The universe doesn't exist -- on average
  - Pooling samples makes little sense: it gives no information about the distribution, and a standard deviation is needed for a significance test
- "My machine/technique is so accurate, I don't need replicates"
  - Accuracy has little effect on biological variance
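A small sketch of why pooling defeats a significance test. The intensity values are invented, and pooling is idealised as a single measurement equal to the group mean; `sd_or_none` and `welch_t` are illustrative helper names.

```python
import math
import statistics

# Invented spot intensities for one protein, four plants per group.
control = [10.1, 12.3, 9.8, 11.6]
drought = [13.9, 15.2, 12.8, 16.1]

def sd_or_none(values):
    """Sample standard deviation, or None when it cannot be estimated."""
    try:
        return statistics.stdev(values)
    except statistics.StatisticsError:
        return None  # fewer than two measurements

def welch_t(a, b):
    """Welch's t statistic for two independent groups."""
    se = math.sqrt(statistics.variance(a) / len(a)
                   + statistics.variance(b) / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se

# Assumption: a pooled sample yields one value, roughly the group mean.
pooled_control = [statistics.mean(control)]
pooled_drought = [statistics.mean(drought)]

print("individual SDs:", sd_or_none(control), sd_or_none(drought))
print("pooled SDs:", sd_or_none(pooled_control), sd_or_none(pooled_drought))
print("Welch t from individual plants:", welch_t(control, drought))
```

The pooled samples leave one number per group, so there is no variance estimate and no t statistic can be formed.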
Slide 15
Doing Things Right: Some ideas for good design
- Blocking
- Replicates
- Calculating power
Slide 16
Doing Things Right: Blocking
[Figure: blocking diagram, labelled Flask, Gel, IEF, PAGE]
Slide 17
Doing Things Right Replication
Slide 18
Doing Things Right: Calculating power
[Figure: probability density curves for the null hypothesis and the alternative hypothesis]
- α = probability of false positive (Type I Error)
- β = probability of false negative (Type II Error)
- Power = 1 - β
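A minimal sketch of a power calculation, taking the false positive rate as the significance level and power as one minus the false negative rate. It assumes a two-sided, two-sample comparison with equal group sizes, a known common standard deviation, and a normal approximation; a real gel or array experiment would more likely use a t-based calculation. The function names and example numbers are hypothetical.

```python
import math
from statistics import NormalDist

def power_two_sample(effect, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test."""
    z = NormalDist()
    se = sd * math.sqrt(2.0 / n_per_group)   # standard error of the difference
    z_crit = z.inv_cdf(1.0 - alpha / 2.0)    # rejection threshold under the null
    shift = abs(effect) / se                 # centre of the alternative curve
    # Area of the alternative distribution beyond either threshold.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

def replicates_needed(effect, sd, power=0.8, alpha=0.05):
    """Smallest number of replicates per group that reaches the target power."""
    if effect == 0:
        raise ValueError("effect must be non-zero")
    n = 2
    while power_two_sample(effect, sd, n, alpha) < power:
        n += 1
    return n

# Hypothetical example: detect a difference of one standard deviation.
print("power with 4 per group:", round(power_two_sample(1.0, 1.0, 4), 2))
print("replicates per group for 80% power:", replicates_needed(1.0, 1.0))
```

Running a calculation like this before the experiment shows whether the planned number of biological replicates has a realistic chance of detecting the effect of interest.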
Slide 22
Types of Experiment
- Time course
  - Cell cycle
  - Following drug challenge
  - Following external stimulus
  - Following release of mutant
- Mutant vs Wild-Type
- Normal vs Diseased
- Developmental Changes
- Different Tissues
- Within-cell differences
Slide 23
Types of Experiment
- Novel microarray techniques
  - Genotyping
  - SNP detection
  - Copy Number Assessment
- Novel proteomics techniques
  - High-throughput interaction detection
  - Phospho-proteomics
- Also
  - Protein binding arrays
  - Ligand binding arrays
Slide 24
A couple of quotes
"You know, the most amazing thing happened to me tonight. I was coming here, on the way to the lecture, and I came in through the parking lot. And you won't believe what happened. I saw a car with the license plate ARW 357. Can you imagine? Of all the millions of license plates in the state, what was the chance that I would see that particular one tonight? Amazing!"
Richard P. Feynman
"To consult a statistician after an experiment is finished is often merely to ask him to conduct a post-mortem examination. He can perhaps say what the experiment died of."
R.A. Fisher, 1938
Slide 25
Summary
- Post-genomics technologies are powerful, but expensive
- Good design gives maximum return for minimum effort