Page 1: Post-Genomics Experimental Design CSC8309 - Gene Expression and Proteomics Simon Cockell & Cedric Simillion.

Post-Genomics Experimental Design

CSC8309 - Gene Expression and Proteomics

Simon Cockell & Cedric Simillion

Outline

• Introduction
  – Post-Genomic Technologies
  – The Importance of Design

• Experimental Design
  – When Design Goes Bad
  – More Commonly Made Mistakes
  – Things Done Right
  – Types of Experiment

Post-Genomic Technologies

• Set of technologies that have become prevalent since the advent of genome sequencing

• Also referred to as ‘functional genomics’ technologies
  – Transcriptomics
  – Proteomics
  – Metabolomics

• ‘High-throughput’ techniques: generate lots of data, fast

Importance of Design

• Functional Genomics experiments are expensive

• The sheer quantity of data, and the noise within it, can mask interesting biological variation

• Bad design can increase noise
• Or at least fail to minimise it

When Design Goes Wrong: A trivial example

• Bill and Ben want to identify proteins upregulated in response to water starvation in a drought-resistant plant

• So, Bill went away and grew some plants, and so did Ben

When Design Goes Wrong: continued

• Bill chose 3 plants, and Ben chose 4
• Bill grew his at home in normal conditions, and Ben grew his in the lab with minimal water

• Then, after a few days of growth, they each took samples from their plants and ran 2D-PAGE

When Design Goes Wrong: analysis

• They used average gels of the 2 groups of plants to find differentially expressed proteins

• They did t-tests for every spot on the gels, and found that 400 of 2500 proteins had significantly altered expression in drought conditions (at the 95% confidence level)

• What now? They only wanted 10-20
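A back-of-the-envelope calculation shows why 400 hits is less informative than it looks. The sketch below assumes each spot was tested independently at a per-test threshold of 0.05 (the "95% level") and, as a worst case, that no protein truly changed; the variable names are illustrative only.

```python
# Expected false positives when many hypotheses are tested at once.
n_tests = 2500   # spots tested (figure from the example above)
alpha = 0.05     # per-test false positive rate, i.e. the "95% level"
n_hits = 400     # spots called significant

# If no protein truly changed, each test still "hits" with probability alpha.
expected_false_hits = n_tests * alpha

# Worst-case share of the observed hits that chance alone could explain.
chance_share = expected_false_hits / n_hits

print(f"Expected false positives: {expected_false_hits:.0f}")
print(f"Share of hits explicable by chance: {chance_share:.0%}")
```

Under these assumptions about 125 spots would be flagged by chance alone, which is close to a third of the 400 reported.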

When Design Goes Wrong: What did they do wrong?

• Confounding
  – Experiment can’t distinguish between a number of factors:
    • Drought
    • Experimenter effects
    • Difference between home and lab

• Selection
  – Bill or Ben could be biased in how they selected plants, even unconsciously
  – Randomised selection is preferred

• Unbalanced
  – Better to have equal numbers in each group for many statistical analyses

When Design Goes Wrong: How to improve

• Grow plants together under same conditions

• Select an equal number randomly for both Bill and Ben

• Both split their plants in half, and grow normal and drought plants to the same protocol

• Better still, either Bill or Ben should do the whole experiment
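Random, balanced allocation is easy to do in software rather than by eye. The following is a minimal sketch, assuming plants are identified by simple labels; the function name `allocate`, the plant labels and the seed are all hypothetical.

```python
import random

def allocate(plant_ids, groups=("normal", "drought"), seed=None):
    """Randomly split plants into equally sized treatment groups."""
    if len(plant_ids) % len(groups) != 0:
        raise ValueError("number of plants must divide evenly between groups")
    rng = random.Random(seed)      # seeded so the allocation can be reproduced
    shuffled = list(plant_ids)
    rng.shuffle(shuffled)          # random order guards against selection bias
    size = len(shuffled) // len(groups)
    return {g: sorted(shuffled[i * size:(i + 1) * size])
            for i, g in enumerate(groups)}

plants = [f"plant_{i:02d}" for i in range(1, 9)]   # 8 hypothetical plants
allocation = allocate(plants, seed=1)
for group, members in allocation.items():
    print(group, members)
```

Recording the seed alongside the lab notes lets anyone reconstruct which plant went into which group.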

When Design Goes Wrong: Post mortem

• Even with a rigorously designed experiment, Bill and Ben may still have obtained confusing results
  – It is common to identify many differentially expressed genes/proteins
  – This can be a true reflection of the biology
  – The number of false discoveries is necessarily high in post-genomic experiments, because of the number of hypotheses being tested

• Good experimental design could have reduced the complexity of their output
  – Providing a base for a robust statistical analysis of the data
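One common way to keep false discoveries in check when many hypotheses are tested is the Benjamini-Hochberg procedure. The slides do not name a specific correction, so this is a sketch of one standard option and not the authors' method; the p-values are invented for illustration.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans: True where the hypothesis is rejected."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    # Find the largest rank k with p_(k) <= (k / m) * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected

# Invented p-values, for illustration only.
p_vals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
calls = benjamini_hochberg(p_vals, fdr=0.05)
naive = sum(p < 0.05 for p in p_vals)
print(f"uncorrected p < 0.05: {naive}; after correction: {sum(calls)}")
```

With these made-up numbers, five tests pass the uncorrected threshold but only two survive the correction.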

Choice of Technology

• Microarray or proteomics?
• Affy or two-colour arrays?
  – Reference sample?

• 2D gels or LC-MS?
• Single stain or DIGE?
  – Reference sample?

• No easy (or correct) answers
  – Depends very much on the individual experiment

Further Pitfalls

• Fahrenheit and the Cow
• Based on urban myth
• Still an important message
  – No individual is typical
  – Biological, as well as technical, replicates required

Further Pitfalls

• The pester problem
  – Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy?

• Ask a question often enough, eventually you’ll get the answer you’re after
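The pester problem can be put in numbers. Assuming each repeat of the question is an independent test at the same threshold (an idealisation), the chance of at least one spurious "yes" is 1 - (1 - alpha)^n. The function name below is hypothetical.

```python
def prob_at_least_one_false_positive(n_tests, alpha=0.05):
    """Chance of one or more false positives in n independent tests."""
    return 1 - (1 - alpha) ** n_tests

# The probability climbs quickly as the same question is asked again and again.
for n in (1, 10, 50, 100):
    print(f"{n:>3} tests: {prob_at_least_one_false_positive(n):.2f}")
```

At a 0.05 threshold, ten attempts already give roughly a 40% chance of a false positive, and a hundred attempts make one almost certain.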

Further Pitfalls

• The universe doesn’t exist -- on average
  – Pooling samples makes little sense: it gives no information about the distribution, and the standard deviation is needed for a significance test

• “My machine/technique is so accurate, I don’t need replicates”
  – Accuracy has little effect on biological variance
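The cost of pooling can be illustrated with Python's standard `statistics` module. The intensities below are hypothetical, and treating a pooled measurement as exactly the mean of its parts is a simplifying assumption.

```python
import statistics

# Hypothetical spot intensities from four biological replicates.
replicates = [10.2, 11.9, 9.4, 12.5]

mean = statistics.mean(replicates)
spread = statistics.stdev(replicates)   # sample standard deviation, needs n >= 2
print(f"replicates: mean={mean:.2f}, stdev={spread:.2f}")

# Pooling the same four samples before measurement yields a single number
# (assumed here to equal the mean) and nothing about the distribution.
pooled = [mean]
try:
    statistics.stdev(pooled)
    pooled_has_spread = True
except statistics.StatisticsError:
    pooled_has_spread = False           # one value cannot give a variance
print("variance estimable from pooled sample:", pooled_has_spread)
```

Without a variance estimate there is no denominator for a t-statistic, so no significance test can be run on the pooled value.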

Doing Things Right: Some ideas for good design

• Blocking
• Replicates
  – Calculating power

Doing Things Right: Blocking

[Figure: blocking diagram with labels Flask (×3), Gel (×3), IEF and PAGE]

Doing Things Right: Replication

Doing Things Right: Calculating power

[Figure: probability density curves under the null hypothesis and the alternative hypothesis]

α = probability of false positive (Type I Error)

1 − β = Power

β = probability of false negative (Type II Error)
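Power calculations are usually run in reverse, to find how many replicates are needed. The sketch below uses the textbook normal approximation for a two-sided, two-sample comparison of means, n = 2((z_{1-α/2} + z_{power})·σ/δ)² per group, where α is the Type I error rate and power is the probability of detecting a true effect. The function name, effect sizes and standard deviation are assumptions for illustration, and an exact t-based calculation would give slightly larger numbers.

```python
from math import ceil
from statistics import NormalDist

def replicates_per_group(effect, sd, alpha=0.05, power=0.8):
    """Approximate n per group for a two-sided, two-sample comparison of means.

    Normal approximation: n = 2 * ((z_{1-alpha/2} + z_{power}) * sd / effect)^2
    """
    z = NormalDist()                       # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)     # critical value for the Type I error rate
    z_power = z.inv_cdf(power)             # quantile for the desired power
    n = 2 * ((z_alpha + z_power) * sd / effect) ** 2
    return ceil(n)                         # round up to whole replicates

# Hypothetical numbers: differences of 1.0 and 0.5 units, sd of 1.0.
print(replicates_per_group(effect=1.0, sd=1.0))
print(replicates_per_group(effect=0.5, sd=1.0))
```

Halving the effect size to be detected roughly quadruples the number of replicates required, which is why the expected effect and the biological variance need to be estimated before the experiment starts.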

Types of Experiment

• Time course
  – Cell cycle
  – Following drug challenge
  – Following external stimulus
  – Following release of mutant

• Mutant vs Wild-Type
• Normal vs Diseased
• Developmental Changes
• Different Tissues
• Within-cell differences

Types of Experiment

• Novel microarray techniques
  – Genotyping
  – SNP detection
  – Copy Number Assessment

• Novel proteomics techniques
  – High-throughput interaction detection
  – Phospho-proteomics

• Also…
  – Protein binding arrays
  – Ligand binding arrays

A couple of quotes

• You know, the most amazing thing happened to me tonight. I was coming here, on the way to the lecture, and I came in through the parking lot. And you won’t believe what happened. I saw a car with the license plate ARW 357. Can you imagine? Of all the millions of license plates in the state, what was the chance that I would see that particular one tonight? Amazing!
  – Richard P. Feynman

• To consult a statistician after an experiment is finished is often merely to ask him to conduct a post-mortem examination. He can perhaps say what the experiment died of.
  – R.A. Fisher, 1938

Summary

• Post-genomics technologies are powerful, but expensive

• Good design gives maximum return for minimum effort

Any questions?

After the fact questions: [email protected]

