Experimental Design. SPH 247 Statistical Analysis of Laboratory Data. April 2, 2013.

Experimental Design

Feb 24, 2016


Jon

Experimental Design. SPH 247 Statistical Analysis of Laboratory Data. Basic Principles of Experimental Investigation: sequential experimentation, comparison, manipulation, randomization, blocking, simultaneous variation of factors, main effects and interactions, sources of variability.
Transcript

Experimental Design

SPH 247 Statistical Analysis of Laboratory Data
April 2, 2013

Basic Principles of Experimental Investigation
Sequential experimentation
Comparison
Manipulation
Randomization
Blocking
Simultaneous variation of factors
Main effects and interactions
Sources of variability

Sequential Experimentation
No single experiment is definitive.
Each experimental result suggests other experiments.
Scientific investigation is iterative.
"No experiment can do everything; every experiment should do something." (George Box)

[Diagram: Plan Experiment, Perform Experiment, Analyze Data from Experiment]

Comparison
Usually absolute data are meaningless; only comparative data are meaningful.
The level of mRNA in a sample of liver cells is not meaningful.
The comparison of the mRNA levels in samples from normal and diseased liver cells is meaningful.

Internal vs. External Comparison
Comparison of an experimental result with historical results is likely to mislead.
Many factors other than the intended treatment can influence results.
It is best to include controls or other comparisons in each experiment.

Manipulation
Different experimental conditions need to be imposed by the experimenters, not just observed, if at all possible.
The rate of complications in coronary artery bypass graft surgery may depend on many factors which are not controlled (for example, characteristics of the patient) and which may be hard to measure.

Randomization
Randomization limits the differences between groups that are due to irrelevant factors.
Such differences will still exist, but can be quantified by analyzing the randomization.
This is a method of controlling for unknown confounding factors.

Suppose that 50% of a patient population is female.
A sample of 100 patients will not generally have exactly 50% females.
Numbers of females between 40 and 60 would not be surprising.
In two groups of 100, the disparity between the number of females in the two groups can be as big as 20% simply by chance, but not much larger.
This also holds for factors we don't know about.

Randomization does not exactly balance against any specific factor.
To do that, one should employ blocking.
Instead, it provides a way of quantifying possible imbalance, even of unknown factors.
Randomization even provides an automatic method of analysis that depends on the design and randomization technique.

The Farmer from Whidbey Island
Visited the University of Washington with a whalebone water douser.
10 Dixie cups, 5 with water and 5 empty, each covered with plywood.
Placed in a random order defined by generating 10 random numbers and sorting the cups by the random number.
If he gets all 10 right, is chance a reasonable explanation?

The randomness is produced by the process of randomly choosing which 5 of the 10 are to contain water.
There are no other assumptions.
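Two of the numerical claims above can be checked by exact counting. The sketch below is an illustration, not part of the original slides. It assumes the number of females in a sample of 100 follows a binomial distribution with p = 0.5, and that every choice of which 5 of the 10 cups hold water is equally likely. The function and variable names are hypothetical.

```python
from math import comb

def prob_count_between(n, p, low, high):
    """Exact binomial probability that the count lies in [low, high]."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(low, high + 1))

# Assumption: sample of 100 patients, each female with probability 0.5.
p_40_to_60 = prob_count_between(100, 0.5, 40, 60)

# Number of equally likely ways to choose which 5 of 10 cups hold water.
n_arrangements = comb(10, 5)  # 252

# Chance of naming all 10 cups correctly by pure guessing under this scheme.
p_all_right = 1 / n_arrangements

print(f"P(40 <= females <= 60) = {p_40_to_60:.4f}")
print(f"arrangements = {n_arrangements}, P(all right) = {p_all_right:.4f}")
```

Under these assumptions a perfect score by guessing has probability 1/252, roughly 0.004, and a count of females between 40 and 60 is the typical outcome rather than a surprise.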

If the randomization had been to flip a coin for each of the 10 cups, then the probability of getting all 10 right by chance is different.
There are 2^10 = 1024 ways for the randomization to come out, only one of which corresponds to the choices, so the chance is 1/1024, or about .001.
The method of randomization matters.
If the farmer could observe condensation on the cups, then this is still evidence of non-randomness, but not of the effectiveness of dousing!

Randomization Inference
20 tomato plants are divided into 10 groups of 2 placed next to each other in the greenhouse (to control for temperature and insolation).
In each group of 2, one is chosen using a random number table to receive fertilizer A; the other receives fertilizer B.
The yield of each plant in pounds of tomatoes is measured.
The null hypothesis is that the fertilizers are equal in promoting tomato growth.

Pounds of yield of tomatoes for 20 plants
Pair            1    2    3    4    5    6    7    8    9   10
A             132   82  109  143  107   66   95  108   88  133
B             140   88  112  142  118   64   98  113   93  136
diff (B - A)    8    6    3   -1   11   -2    3    5    5    3

The average yield for fertilizer A is 106.3 pounds.
The average yield for fertilizer B is 110.4 pounds.
The average difference is 4.1.
Could this have happened by chance? Is it statistically significant?
If A and B do not differ in their effects (the null hypothesis is true), then each plant's yield would have been the same whether A or B was applied.
The difference would be the negative of what it was if the coin flip had come out the other way.
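The averages quoted above can be reproduced from the yield table. This is a minimal sketch for checking the arithmetic; the variable names are hypothetical, and the difference is taken as B minus A to match the sign of the diff row.

```python
# Yields in pounds for the 10 pairs, copied from the table above.
yield_a = [132, 82, 109, 143, 107, 66, 95, 108, 88, 133]
yield_b = [140, 88, 112, 142, 118, 64, 98, 113, 93, 136]

# Within-pair differences, B minus A.
diffs = [b - a for a, b in zip(yield_a, yield_b)]

mean_a = sum(yield_a) / len(yield_a)
mean_b = sum(yield_b) / len(yield_b)
mean_diff = sum(diffs) / len(diffs)

print("diffs:", diffs)
print(f"mean A = {mean_a:.1f}, mean B = {mean_b:.1f}, mean diff = {mean_diff:.1f}")
```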

Actual: Fert A 132 lb, Fert B 140 lb, difference = 8
Hypothetical: Fert A 140 lb, Fert B 132 lb, difference = -8

In pair 1, the yields were 132 and 140. The difference was 8, but it could have been -8.
With 10 coin flips, there are 2^10 = 1024 possible outcomes of + or - on the differences.
These outcomes are possible outcomes from our action of randomization, and carry no assumptions.
The measurements don't have to be normally distributed or have the same variance.

Of the 1024 possible outcomes that are all equally likely under the null hypothesis, only 3 had greater values of the average difference, and only 4 (including the one observed) had the same value of the average difference.
The probability of this happening by chance is [3 + 4/2]/1024 = .005.
This does not depend on any assumptions other than that the randomization was correctly done.
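The counts of 3 greater and 4 equal outcomes can be verified by brute force. The sketch below is one way to do it, not necessarily how the original analysis was run: it enumerates all 2^10 sign assignments of the ten observed differences, compares each total with the observed total, and counts ties at half weight as in the [3 + 4/2]/1024 calculation. The comparison is one-sided, and the variable names are hypothetical.

```python
from itertools import product

# Observed within-pair differences (B minus A) for the 10 tomato pairs.
diffs = [8, 6, 3, -1, 11, -2, 3, 5, 5, 3]
observed_total = sum(diffs)  # comparing totals is equivalent to comparing averages

greater = 0
equal = 0
# Under the null hypothesis each difference is equally likely to be + or -.
for signs in product((1, -1), repeat=len(diffs)):
    total = sum(s * d for s, d in zip(signs, diffs))
    if total > observed_total:
        greater += 1
    elif total == observed_total:
        equal += 1

n_outcomes = 2 ** len(diffs)
# Ties count half, matching the slide's calculation.
p_value = (greater + equal / 2) / n_outcomes

print(f"greater = {greater}, equal = {equal}, outcomes = {n_outcomes}")
print(f"p = {p_value:.4f}")
```

Integer totals are compared instead of averages so that ties are detected exactly, without floating-point rounding.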

Paired t-test

Randomization in practice
Whenever there is a choice, it should be made using a formal randomization procedure, such as Excel's rand() function.
This protects against unexpected sources of variability such as day, time of day, operator, reagent, etc.

Pair Number   First Sample Treatment
1             A or B?
2             A or B?
3             A or B?
4             A or B?
5             A or B?
6             A or B?
7             A or B?
8             A or B?
9             A or B?
10            A or B?

Pair Num   First Sample Treatment   random number
1          A or B?                  0.871413
2          A or B?                  0.786036
3          A or B?                  0.889785
4          A or B?                  0.081120
5          A or B?                  0.297614
6          A or B?                  0.540483
7          A or B?                  0.824491
8          A or B?                  0.624133
9          A or B?                  0.913187
10         A or B?                  0.001599

=rand() in first cell
Copy down the column
Highlight entire column
^c (Edit/Copy)
Edit/Paste Special/Values
This fixes the random numbers so they do not recompute each time
=IF(C3