Sampling based approximation of confidence intervals for functions of genetic covariance matrices
Karin Meyer 1, David Houle 2
1 Animal Genetics and Breeding Unit, University of New England, Armidale NSW 2351
2 Department of Biological Science, Florida State University, Tallahassee, FL 32306-4295
AAABG 2013
Sampling standard errors | Introduction
REML sampling variances
REML estimates of covariance components
– multivariate normal distribution: θ̂ ∼ N(θ, I(θ)⁻¹)
– inverse of information matrix −→ sampling errors
– large sample theory; asymptotic lower bounds
Linear functions of estimates
– sampling variances readily obtained
Non-linear functions
– obtain 1st-order Taylor series expansion (the “Delta method”)
– evaluate sampling variance of linear approximation
– needs partial derivatives w.r.t. all variables
−→ can be complicated / tedious
−→ options for evaluating in REML software limited
Confidence intervals: ± zα s.e.
– misleading at boundary of parameter space?
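The Delta method sketched above can be made concrete in a few lines of NumPy. This is a minimal illustration only: the estimates, their sampling covariance matrix `V`, and the choice of heritability h² = σ²A / (σ²A + σ²E) as the non-linear function are all invented numbers, not values from the analyses in this talk.

```python
import numpy as np

# Hypothetical REML estimates of additive and residual variance
var_a, var_e = 40.0, 60.0               # σ²_A, σ²_E
h2 = var_a / (var_a + var_e)            # heritability: a non-linear function

# Hypothetical sampling covariance of (σ²_A, σ²_E),
# i.e. the relevant block of the inverse information matrix
V = np.array([[25.0, -10.0],
              [-10.0, 20.0]])

# Partial derivatives of h² w.r.t. (σ²_A, σ²_E)
# for the 1st-order Taylor series expansion
s = var_a + var_e
grad = np.array([var_e / s**2, -var_a / s**2])

# Delta-method approximation of the sampling variance of ĥ²
var_h2 = grad @ V @ grad
se_h2 = np.sqrt(var_h2)
print(h2, se_h2)
```

Even for this two-parameter example the derivatives must be worked out by hand; for more complicated functions of a covariance matrix this is exactly the tedium the talk refers to.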
Sampling standard errors | Introduction
Alternatives
Dealing with boundary conditions
– Derive confidence intervals from profile likelihood
– Bayesian estimation
General procedure
– Sample data, repeat analysis −→ distribution over reps
– slow & laborious!
Objectives
1 Propose new scheme
– sample from (theoretical) distribution of estimates
– simple & fast
2 Examine quality of approximation of sampling errors
Sampling standard errors | Method
Sampling scheme
Large sample theory
– (RE)ML estimates have MVN distribution
– Sampling covariance ∝ inverse of information matrix
Sample from this distribution: θ̃ ∼ N(θ̂, H(θ̂)⁻¹)
Information matrix
– Use same parameterisation as REML analysis −→ eliminate linear approximation, account for constraints
– Evaluate function(s) of interest for θ̃
– Examine distribution over replicates
Mandel, M. (2013) Simulation-based confidence intervals for functions with complicated derivatives. American Statistician 67, 76–81.
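The sampling scheme above can be sketched in a few lines of NumPy. The estimates θ̂ and the inverse average-information matrix below are hypothetical placeholders, and heritability again stands in for the "function of interest":

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical REML estimates and inverse information matrix H(θ̂)⁻¹
theta_hat = np.array([40.0, 60.0])      # (σ²_A, σ²_E)
H_inv = np.array([[25.0, -10.0],
                  [-10.0, 20.0]])       # sampling covariance of θ̂

# Draw replicate parameter vectors θ̃ ∼ N(θ̂, H(θ̂)⁻¹)
theta_tilde = rng.multivariate_normal(theta_hat, H_inv, size=200_000)

# Evaluate the (non-linear) function of interest for each replicate: h²
h2_reps = theta_tilde[:, 0] / theta_tilde.sum(axis=1)

# Approximate s.e. and 95% confidence interval
# from the empirical distribution over replicates
se = h2_reps.std(ddof=1)
lo, hi = np.percentile(h2_reps, [2.5, 97.5])
print(se, lo, hi)
```

No derivatives are needed: any function of θ̃, however complicated, is simply evaluated once per replicate.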
Sampling standard errors | Simulation
Does it work?
Simulate two data sets
– 4000 animals, 6 traits
– h² = 2 × (0.2, 0.3, 0.4)
– σ²P = 100
– rE = 0.3
– a) rG = 0.5, b) rG = 0.7^|i−j|
REML analysis
– AI algorithm
– Cholesky factor
Estimates
– θ̂
– H(θ̂)
Compare estimates of sampling variances
REML: based on H(θ̂), “Delta” method
Empirical: re-sample data using estimates as population values, repeat analysis; 10 000 replicates
Approx.: sample from MVN distribution N(θ̂, H(θ̂)⁻¹); 200 000 replicates
Sampling standard errors | Results
Sampling covariances for Σ̂G, case a
[Figure: scatter plots of sampling covariances — Empirical vs. REML, Approximate vs. REML, Approximate vs. Empirical]
– accommodates arbitrary functions
– yields good approximation of sampling variances
– easier than Delta method for complicated derivatives
– more appropriate confidence interval at boundary of parameter space
– but:
−→ relies on large sample theory
−→ information matrix needs to be safely p.d.
−→ assumes θ̂ ≈ θ