Vanderbilt University Medical Center SRC Presentation Vincent Kokouvi Agboto Assistant Professor/Director of Biostatistics, Meharry Medical College Assistant Professor of Biostatistics, Vanderbilt University Medical Center
Introduction to Experimental Designs in Biological and Clinical Settings.
Overview
1. Introduction
2. Examples of Classical Designs
3. Optimal Experimental Design
4. Other Design Issues
5. Conclusion
1. Introduction
Experiment: Investigation in which the investigator applies some treatments to experimental units and then observes the effects of the treatments on the experimental units by measuring one or more responses.
1. Introduction
Treatment: Set of conditions applied to experimental units in an experiment.
Experimental Unit: Physical entity to which a treatment is randomly assigned and independently applied.
1. Introduction
Response variable: Characteristic of an experimental unit that is measured after treatment and analyzed to assess the effects of treatments on experimental units.
Observational Unit: Unit on which a response variable is measured.
1. Introduction
Experimental design procedure: Decisions made before data collection.
Basic idea: Appropriate selection of values of control variables.
Three fundamental concepts of experimental design: Randomization, Blocking, Replication (R. A. Fisher).
1. Introduction
Important stages of experimental research:
- Background of the experiment
- Choice of factors
- Reduction of error
- Choice of model
- Design criterion and size of the design
- Choice of an experimental design
- Conduct of the experiment
- Analysis of the data
1. Introduction
Classical (Standard) Designs
Optimal Experimental Design: Only alternative when the standard designs do not provide us with adequate answers.
2. Examples of Classical Designs
Example 1: Soil moisture and gene expression in maize seedlings.
Example 2: Drug and feed consumption effects on gene expression in rats.
Example 3: Treatment effects on gene expression in dairy cattle.
Example 1
Experiment: Effect of three soil moisture levels on gene expression in maize seedlings.
A total of 36 seedlings were grown in 12 pots with 3 seedlings per pot.
Three soil moisture levels (low, medium, high) were randomly assigned to the 12 pots.
After three weeks, RNA was extracted from the above-ground tissues of each seedling.
Each of the 36 RNA samples was hybridized to a microarray slide to measure gene expression.
Example 1 (continued)
Treatment: The three moisture levels.
Experimental units: Moisture levels were randomly assigned to the pots, so the pots are the experimental units. A pot consisting of 3 seedlings is one experimental unit.
Observational units: Gene expression was measured for each seedling, so the seedlings are the observational units.
Response variable: Each probe on the microarray slide provides one response variable.
This is a standard experimental design: the completely randomized design (CRD).
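The random assignment of moisture levels to pots can be sketched in a few lines of Python. This is only an illustration: the slides do not say how many pots received each level, so equal replication (4 pots per level) is an assumption, and the function name and pot labels are hypothetical.

```python
import random
from collections import Counter


def completely_randomized_design(units, treatments, seed=None):
    """Randomly assign treatments to units with equal replication (assumed)."""
    if len(units) % len(treatments) != 0:
        raise ValueError("number of units must be a multiple of the number of treatments")
    reps = len(units) // len(treatments)
    # One label per unit: each treatment repeated `reps` times, then shuffled.
    labels = [t for t in treatments for _ in range(reps)]
    rng = random.Random(seed)
    rng.shuffle(labels)
    return dict(zip(units, labels))


# Hypothetical pot labels; the pots (not the seedlings) are the experimental units.
pots = [f"pot{i}" for i in range(1, 13)]
assignment = completely_randomized_design(pots, ["low", "medium", "high"], seed=2024)
counts = Counter(assignment.values())

for pot, level in assignment.items():
    print(pot, level)
```

All 3 seedlings in a pot share that pot's moisture level, which is why the seedlings are observational units rather than experimental units.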
Example 2
Experiment: Gauge the effects of a drug and feed consumption on gene expression in rats.
A total of 40 rats were housed in individual cages. Half of them were fed a calorie-restricted diet (R); the other half were provided with access to feeders that were full, so calorie intake was unrestricted (U).
Within each diet group, four doses of an experimental drug (1, 2, 3, 4) were randomly assigned to the rats, with 5 rats per dose within each diet group.
Example 2 (continued)
At the conclusion of the study, gene expression was measured for each rat using microarrays.
Example 2 (continued)
Treatment (factors): Diet and Drug.
Factor Diet (R, U); Factor Drug (1, 2, 3, 4).
Each combination of diet and drug is a treatment (R1, R2, R3, R4, U1, U2, U3, U4).
Each rat: Experimental unit and observational unit.
Response variable: Each probe on the microarray slide.
This is a full factorial treatment design because all possible combinations of diet and drug were considered.
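A short Python sketch of this layout: it enumerates the 2 x 4 treatment combinations and randomizes doses within each diet group. The rat labels, the split of rats 1-20 and 21-40 into the two diet groups, and the function name are assumptions made for illustration; the slides do not describe how diets were allocated.

```python
import itertools
import random

diets = ["R", "U"]
doses = [1, 2, 3, 4]

# Full factorial: every diet crossed with every dose gives 8 treatments.
treatments = [f"{diet}{dose}" for diet, dose in itertools.product(diets, doses)]


def assign_doses_within_diet(rats_by_diet, doses, rats_per_dose, seed=None):
    """Randomly assign doses to rats separately within each diet group."""
    rng = random.Random(seed)
    plan = {}
    for diet, rats in rats_by_diet.items():
        labels = [dose for dose in doses for _ in range(rats_per_dose)]
        if len(labels) != len(rats):
            raise ValueError("group size must equal doses x rats_per_dose")
        rng.shuffle(labels)
        for rat, dose in zip(rats, labels):
            plan[rat] = f"{diet}{dose}"
    return plan


# Hypothetical labels: rats 1-20 on the restricted diet, rats 21-40 unrestricted.
rats_by_diet = {
    "R": [f"rat{i}" for i in range(1, 21)],
    "U": [f"rat{i}" for i in range(21, 41)],
}
plan = assign_doses_within_diet(rats_by_diet, doses, rats_per_dose=5, seed=7)
print(treatments)
```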
Example 3
Experiment: Study the effects of 5 treatments (A, B, C, D, E) on gene expression in dairy cattle.
A total of 25 GeneChips and a total of 25 cows, located on 5 farms with 5 cows on each farm are available for the experiment.
Which of the following designs is better from a statistical standpoint?
Example 3 (Continued)
Design 1: To reduce variability within treatment groups, randomly assign the 5 treatments to the 5 farms so all 5 cows on any one farm receive the same treatment. Measure gene expression using one GeneChip for each cow.
Design 2: Randomly assign the 5 treatments to the 5 cows within each farm so that all 5 treatments are represented on each farm. Measure gene expression using one GeneChip for each cow.
Example 3 (continued)
Design 1                 Design 2
Farm 1: B B B B B        Farm 1: A B E D C
Farm 2: D D D D D        Farm 2: E D A C B
Farm 3: A A A A A        Farm 3: C D E A B
Farm 4: E E E E E        Farm 4: A B E C D
Farm 5: C C C C C        Farm 5: C A D B E
Example 3 (continued)
Observational units: Cows in both designs.
Experimental units: Farms in Design 1 and cows in Design 2.
Design 2: A randomized complete block design (RCBD) with a group of 5 cows on a farm serving as a block of experimental units.
Design 1 has no replication because there is only 1 experimental unit for each treatment. Design 2 has 5 replications per treatment.
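The Design 2 randomization can be sketched as follows: within each farm (block), the 5 treatments are shuffled independently and assigned to the 5 cows. The function name, farm labels, and seed are illustrative assumptions.

```python
import random


def randomized_complete_block_design(blocks, treatments, seed=None):
    """Return, for each block, an independent random ordering of all treatments."""
    rng = random.Random(seed)
    layout = {}
    for block in blocks:
        order = list(treatments)
        rng.shuffle(order)  # a fresh randomization inside every block
        layout[block] = order
    return layout


farms = [f"Farm {i}" for i in range(1, 6)]
layout = randomized_complete_block_design(farms, "ABCDE", seed=11)

for farm, order in layout.items():
    # Position k in `order` is the treatment given to cow k on that farm.
    print(f"{farm}: {' '.join(order)}")
```

Every treatment appears exactly once on every farm, so each treatment is replicated 5 times and farm-to-farm differences do not bias the treatment comparisons.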
Example 3 (continued)
Design 2 is by far the better design.
We can compare treatments directly among cows that share the same environment.
With Design 1, it is impossible to separate differences in expression due to treatment effects from differences in expression due to farm effects.
3. Optimal Experimental Design
3.1. Motivating Example
3.2. Comments on Orthogonal Designs
3.3. Some Examples of Non-Orthogonal Designs
3.4. Optimal Designs
3.1. Motivating Example
Suppose that the yield is linearly related to temperature whose range is [50, 150]: Y= a + b X
If we want to conduct experiments at two points, which of the following should we choose: Design 1 at 50 and 150? Design 2 at 70 and 130? Design 3 at 90 and 110?
3.1. Motivating Example
What is the optimal design in this case, that is, the best of the three designs mentioned?
3.1. Motivating Example
It is Design 1, because it gives the smallest confidence region for the parameters (D-optimality) and also gives the smallest maximum variance for the predicted responses (G-optimality).
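The comparison can be checked numerically. The sketch below assumes one run at each of the two points, the model Y = a + bX, and a grid of 101 temperatures over [50, 150] on which the prediction variance (in units of the error variance) is evaluated; the function names are illustrative.

```python
import numpy as np


def model_matrix(points):
    """Rows f(x)' = (1, x) for the straight-line model Y = a + b X."""
    x = np.asarray(points, dtype=float)
    return np.column_stack([np.ones_like(x), x])


def d_criterion(points):
    """|X'X|: larger means a smaller confidence region for (a, b)."""
    X = model_matrix(points)
    return float(np.linalg.det(X.T @ X))


def max_prediction_variance(points, grid):
    """max over the grid of d(x) = f'(x) (X'X)^{-1} f(x)."""
    X = model_matrix(points)
    F = model_matrix(grid)
    inv = np.linalg.inv(X.T @ X)
    d = np.einsum("ij,jk,ik->i", F, inv, F)
    return float(d.max())


grid = np.linspace(50, 150, 101)
designs = {"Design 1": [50, 150], "Design 2": [70, 130], "Design 3": [90, 110]}
summary = {
    name: (d_criterion(pts), max_prediction_variance(pts, grid))
    for name, pts in designs.items()
}
for name, (d_val, g_val) in summary.items():
    print(f"{name}: |X'X| = {d_val:.1f}, max d(x) = {g_val:.3f}")
```

For two runs, |X'X| equals the squared distance between the two points (10000, 3600, and 400 here), so spreading the points to the ends of the range maximizes the D-criterion and, in this example, also minimizes the worst-case prediction variance.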
3.2. Comments on Orthogonal Designs
Pros (Many desirable properties)
- Easy to calculate
- Easy to interpret
- Maximum precision (in some sense)
- Tabled designs widely available
3.2. Comments on Orthogonal Designs
Cons: Not applicable with
- Irregular design space
- Mixture experiments
- Sample size not a power of 2
- Mixed qualitative and quantitative factors
- Fixed covariates
- Nonlinear models
3.3. Some Examples of Non-Orthogonal Designs
16-run design with 8 two-level factors, with main effects and 6 interactions: BC, CH, BH, DE, EF, DF
12-run mixed-level design with one three-level factor and 9 two-level factors
3.4. Optimal Designs
Optimal Experimental Design (OED): Standard alternative when classical designs are not applicable.
Choice of a particular experimental design: Depends on the experimenter’s design criterion (optimization problem).
OED: Reduce costs of experimentation by allowing statistical models to be estimated with fewer experimental runs; Evaluated using statistical criteria.
3.4. Optimal Designs
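$Y_{n \times 1} \sim N(X\beta, \sigma^2 I)$, where $X_{n \times p}$ is the design matrix, $\beta$ is the unknown $p \times 1$ parameter vector, and $\sigma^2$ is known.
$y(x_i) = f'(x_i)\beta + \varepsilon_i$
$X = [f(x_1), \dots, f(x_n)]'$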
3.4. Optimal Designs
Design $\xi$: Probability measure over a compact region $\mathcal{X}$ with $\xi(x_i) = w_i$.
$\xi$ places weight $\xi(x_i)$ on $x_i$.
Problem: $n\xi(x_i)$ is not necessarily an integer.
3.4. Optimal Designs
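Approximate design: $\xi = \begin{Bmatrix} x_1 & x_2 & \dots & x_n \\ w_1 & w_2 & \dots & w_n \end{Bmatrix}$ with $\int_{\mathcal{X}} \xi(dx) = 1$ and $0 \le w_i \le 1$.
Exact design: $n\xi(x_i)$ must be an integer.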
3.4. Optimal Designs
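$M(\xi) = \int_{\mathcal{X}} m(x)\,\xi(dx) = \int_{\mathcal{X}} f(x) f'(x)\,\xi(dx) = \sum_i w_i f(x_i) f'(x_i)$: Information matrix of $\xi$. For an exact design with $n$ runs, $nM(\xi) = X'X$.
Optimality criteria: $\xi^* = \arg\max_{\xi} \Psi(M(\xi))$, where $\Psi$ is the criterion function.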
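As a small numerical illustration of the information matrix, the sketch below builds M for a three-point design with equal weights and checks that n times M reproduces X'X for the matching exact design. The quadratic model f(x) = (1, x, x^2), the support points, and the run sizes are assumptions chosen only for this example.

```python
import numpy as np


def f(x):
    """Assumed regression functions for a quadratic model: f(x) = (1, x, x^2)'."""
    return np.array([1.0, x, x**2])


def information_matrix(points, weights):
    """M = sum_i w_i f(x_i) f(x_i)' for a design with weights w_i on points x_i."""
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0) or not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights must be non-negative and sum to 1")
    return sum(w * np.outer(f(x), f(x)) for x, w in zip(points, weights))


points = [-1.0, 0.0, 1.0]
weights = [1 / 3, 1 / 3, 1 / 3]
M = information_matrix(points, weights)

# With n = 6 runs, n * w_i = 2 for every point, so an exact design exists.
n = 6
runs = [x for x in points for _ in range(2)]
X = np.array([f(x) for x in runs])
matches = np.allclose(n * M, X.T @ X)

# With n = 4 runs, n * w_i = 4/3 is not an integer: the approximate design
# cannot be realized exactly and has to be rounded.
replicates_n4 = [4 * w for w in weights]
print(matches, replicates_n4)
```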
3.5. Some Useful Criteria
D-Optimality: $\max |X'X|$
A-Optimality: $\min \operatorname{trace}\{(X'X)^{-1}\}$
G-Optimality: $\min\{\max_x d(x)\}$, where $d(x) = f'(x)(X'X)^{-1}f(x)$
V-Optimality: $\min\{\operatorname{average}_x d(x)\}$
3.5. Some Useful Criteria
D and A-Optimality: Estimation based criteria.
G and V-Optimality: Prediction based criteria.
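The four criteria can be evaluated for any candidate design with a few lines of NumPy. In this sketch the quadratic model, the two three-point designs, and the 201-point grid on [-1, 1] are assumptions for illustration; the V value is the average of d(x) over that grid, which only approximates an average over the whole region.

```python
import numpy as np


def quadratic(x):
    """Model matrix with rows f(x)' = (1, x, x^2); an assumed example model."""
    x = np.asarray(x, dtype=float)
    return np.column_stack([np.ones_like(x), x, x**2])


def design_criteria(X, F):
    """Evaluate D, A, G and V for design matrix X over candidate rows F."""
    inv = np.linalg.inv(X.T @ X)
    d = np.einsum("ij,jk,ik->i", F, inv, F)  # d(x) = f'(x)(X'X)^{-1} f(x)
    return {
        "D": float(np.linalg.det(X.T @ X)),  # to be maximized
        "A": float(np.trace(inv)),           # to be minimized
        "G": float(d.max()),                 # to be minimized
        "V": float(d.mean()),                # to be minimized
    }


grid = quadratic(np.linspace(-1, 1, 201))
symmetric = design_criteria(quadratic([-1, 0, 1]), grid)
skewed = design_criteria(quadratic([-1, -0.5, 1]), grid)

for name, crit in [("{-1, 0, 1}", symmetric), ("{-1, -0.5, 1}", skewed)]:
    print(name, {k: round(v, 3) for k, v in crit.items()})
```

D and A depend only on X'X (estimation), while G and V also depend on the region over which predictions are wanted (prediction).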
3.6. Algorithms for Optimal Designs
Development of efficient computing methods and high-power computer systems has led to great interest in algorithmic approaches.
In general: Difficult to find exact designs analytically.
Finding exact designs amounts to solving a large nonlinear mixed integer programming problem.
In practice: Find designs close to the best design (locally optimal), which led to the introduction of exact design algorithms.
3.6. Algorithms for Optimal Designs
Typical Exact Design Algorithm steps:
- Choose an initial feasible solution (design).
- Modify the solution slightly by exchanging a point in the design for a point in the design space $\mathcal{X}$.
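The two steps above can be sketched as a simple greedy point-exchange search for a D-optimal exact design. This is a simplified illustration, not the Fedorov algorithm or any of its published variants: it assumes a quadratic model, a finite candidate grid standing in for the design space, and an acceptance rule (keep a swap whenever it increases |X'X|) that the slide does not spell out. All names are hypothetical.

```python
import numpy as np


def model_matrix(x):
    """Rows f(x)' = (1, x, x^2); the quadratic model is an assumption."""
    x = np.asarray(x, dtype=float)
    return np.column_stack([np.ones_like(x), x, x**2])


def d_value(points):
    X = model_matrix(points)
    return float(np.linalg.det(X.T @ X))


def point_exchange(candidates, n_runs, seed=0, max_passes=50):
    """Greedy exchange: swap one design point for a candidate if |X'X| grows."""
    rng = np.random.default_rng(seed)
    # Step 1: initial design drawn at random from the candidate set.
    design = rng.choice(candidates, size=n_runs, replace=True)
    best = d_value(design)
    for _ in range(max_passes):
        improved = False
        # Step 2: try exchanging each design point for each candidate point.
        for i in range(n_runs):
            for c in candidates:
                trial = design.copy()
                trial[i] = c
                value = d_value(trial)
                if value > best + 1e-12:
                    design, best, improved = trial, value, True
        if not improved:  # no single exchange helps: a local optimum
            break
    return np.sort(design), best


candidates = np.linspace(-1, 1, 21)
design, value = point_exchange(candidates, n_runs=6, seed=1)
print(design, value)
```

Because the search only accepts improving swaps, it can stop at a locally optimal design; running it from several random starts and keeping the best result is a common safeguard.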
3.6. Algorithms for Optimal Designs
Fedorov algorithm (Fedorov, 1969).
Modified Fedorov algorithm (Johnson and Nachtsheim, 1983).
K-L exchange algorithm (Donev and Atkinson, 1988).
Coordinate exchange algorithm (Meyer and Nachtsheim, 1995).
Columnwise-Pairwise (CP) algorithm (Wu and Li, 1999).
3.7. Software for the Computation of Optimal Designs
SAS, JMP, Matlab, R, C++
4. Other Design Issues
Supersaturated Designs
Bayesian Designs
Model Robust Designs
Model Discrimination Designs
5. Conclusion
All problems are different.
Statistical knowledge will help improve the design.
Get involved with the statistician (biostatistician) early in the process.
Collaborate closely with people who know the background of the study.
Even the most sophisticated statistical analysis cannot do much to save a study based on a "bad design".
References
Agboto, V. (2006). Bayesian approaches to model robust and model discrimination designs. Unpublished Ph.D. dissertation, School of Statistics, University of Minnesota.
Agboto, V., Nachtsheim, C. & Li, W. (2010). "Screening designs for model discrimination". Journal of Statistical Planning and Inference 140(3), 766-780.
Atkinson, A. C. & Donev, A. N. (1992). "Optimum Experimental Designs". Oxford Statistical Science Series 8, 1-328.
Chaloner, K. & Verdinelli, I. (1995). "Bayesian experimental design: A review". Statistical Science 10, 273-304.
Cook, R. D. & Nachtsheim, C. J. (1980). "A comparison of algorithms for constructing exact D-optimal designs". Technometrics 22, 315-324.
Li, W. & Wu, C. F. J. (1997). "Columnwise-pairwise algorithms with applications to the construction of supersaturated designs". Technometrics 39, 171-179.