ESTIMATION OF STATISTICAL PARAMETERS
Estimation theory is a branch of statistics based on measured/empirical data that has a random component.
An estimator attempts to approximate the unknown parameters using the measurements.
In statistics, estimation refers to the process by which one makes inferences about a population, based on information obtained from a sample.
OUTLINE
Objectives:
• Describe the characteristics of the normal distribution in statistical terms
• Explain the concept of a confidence interval and how it relates to an estimated parameter
Point Estimate vs. Interval Estimate
Statisticians use sample statistics to estimate population parameters.
For example:
• sample means are used to estimate population means;
• sample proportions are used to estimate population proportions.
An estimate of a population parameter may be expressed in two ways: as a point estimate or as an interval estimate.
Point estimate. A point estimate of a population parameter is a single value of a statistic.
For example, the sample mean x̄ is a point estimate of the population mean μ.
Similarly, the sample proportion p is a point estimate of the population proportion P.
Point Estimate vs. Interval Estimate
Interval estimate. An interval estimate is defined by two numbers, between which a population parameter is said to lie.
For example, a < μ < b is an interval estimate of the population mean μ. It indicates that the population mean is greater than a but less than b.
Confidence Intervals
Statisticians use a confidence interval to express the precision and uncertainty associated with a particular sampling method.
A confidence interval consists of three parts.
1. A confidence level.
2. A statistic.
3. A margin of error.
The confidence level describes the uncertainty of a sampling method.
The statistic and the margin of error define an interval estimate that describes the precision of the method.
The interval estimate of a confidence interval is defined by:
the sample statistic ± margin of error.
The probability part of a confidence interval is called a confidence level.
The confidence level describes how strongly we believe that a particular sampling method will produce a confidence interval that includes the true population parameter.
Standard Error
• To compute a confidence interval for a statistic, you need to know the standard deviation or the standard error of the statistic.
• This lesson describes how to find the standard deviation and standard error, and shows how the two measures are related.
Notation
The following notation is helpful when we talk about the standard deviation and the standard error.
Population parameter                          Sample statistic
N: Number of observations in the population   n: Number of observations in the sample
μ: Population mean                            x̄: Sample estimate of the population mean
σ: Population standard deviation              s: Sample estimate of σ
Standard Deviation of Sample Estimates
• Statisticians use sample statistics to estimate population parameters. Naturally, the value of a statistic may vary from one sample to the next.
• The variability of a statistic is measured by its standard deviation.
Statistic: sample mean x̄ (the estimator of the population mean μ)
Standard deviation: $\sigma_{\bar{x}} = \sigma / \sqrt{n}$
Standard error: $SE_{\bar{x}} = s / \sqrt{n}$
The equations for the standard error are identical to the equations for the standard deviation, except for one thing - the standard error equations use statistics where the standard deviation equations use parameters. Specifically, the standard error equations use p in place of P, and s in place of σ.
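As a minimal sketch of this distinction, the Python snippet below computes the sample standard deviation s and the estimated standard error s/√n of the mean. The function name `standard_error` and the data values are hypothetical, chosen only for illustration.

```python
import math
import statistics

def standard_error(sample):
    """Estimated standard error of the sample mean: s / sqrt(n)."""
    n = len(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    return s / math.sqrt(n)

# Hypothetical measurements, for illustration only.
data = [4.1, 3.8, 4.4, 4.0, 3.9, 4.3, 4.2, 4.1]
s = statistics.stdev(data)
se = standard_error(data)
print(f"s = {s:.3f}, SE = {se:.3f}")
```

Because the standard error divides s by √n, it shrinks as the sample grows even when s itself stays about the same.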
Central Limit Theorem
• The distribution of sample means (the sampling distribution) from a population is approximately normal if the sample size is large.
1. The population distribution can be non-normal.
2. If the population has mean μ, then the mean of the sampling distribution is $\mu_{\bar{x}} = \mu$.
3. If the population has variance σ², the standard deviation of the sampling distribution, or the standard error (a measure of the amount of sampling error), is $\sigma_{\bar{x}} = \sigma / \sqrt{n}$.
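The theorem can be checked empirically. The sketch below assumes an exponential population with mean 1 and standard deviation 1 (a deliberately non-normal choice made for this illustration, not taken from the slides) and draws many samples of size n. The helper name `simulate_sample_means` is hypothetical.

```python
import math
import random
import statistics

def simulate_sample_means(n, trials, rng):
    """Draw `trials` samples of size n from an exponential population
    (mean 1, standard deviation 1) and return the list of sample means."""
    return [
        sum(rng.expovariate(1.0) for _ in range(n)) / n
        for _ in range(trials)
    ]

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 40
means = simulate_sample_means(n, 5000, rng)

print("mean of sample means:", round(statistics.mean(means), 3))
print("sd of sample means:  ", round(statistics.stdev(means), 3))
print("sigma / sqrt(n):     ", round(1 / math.sqrt(n), 3))
```

The mean of the sample means should land close to the population mean of 1, and their standard deviation close to σ/√n, even though the population itself is skewed.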
Estimation & Confidence Intervals
• Normal distribution:
  • Gaussian distribution
  • Symmetric
  • Not skewed
  • Unimodal
  • Described by two parameters: μ and σ
• Probability density function: $f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$
  • μ & σ are parameters
  • μ = mean
  • σ = standard deviation
  • π, e = constants
• Normal distribution: why do we use it?
  • Many biological variables approximately follow a normal distribution
  • The normal distribution is well understood mathematically
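The normal probability density function with parameters μ and σ can be evaluated directly. The sketch below writes it out by hand (the function name `normal_pdf` is hypothetical) and cross-checks it against `statistics.NormalDist` from the Python standard library.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Density of the normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

# Cross-check the hand-written formula against the standard library.
print(normal_pdf(1.0, 0.0, 1.0))
print(NormalDist(mu=0.0, sigma=1.0).pdf(1.0))
```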
• Point estimation
  • Is a single value that estimates a theoretical parameter
  • x̄ (sample mean) is a point estimate of μ (population mean)
  • Is influenced by sampling fluctuations
  • Could be very far from the true value of the estimated parameter
Point Estimations
Why Confidence Intervals?
We are not only interested in finding the point estimate for the mean, but also in determining how accurate the point estimate is.
The Central Limit Theorem plays a key role here. We assume that the sample standard deviation is close to the population standard deviation (which will almost always be true for large samples).
Then the Central Limit Theorem tells us that the standard deviation of the sampling distribution is $\sigma_{\bar{x}} = \sigma / \sqrt{n} \approx s / \sqrt{n}$.
We will be interested in finding an interval around x̄ such that there is a large probability that the actual mean falls inside of this interval.
This interval is called a confidence interval and the large probability is called the confidence level.
Definitions
A confidence interval is a range around the sample estimate within which the population parameter is expected to fall with a specified degree of confidence, usually 95% (corresponding to a significance level of 5%).
α = significance level
The range defined by the critical values will contain the population parameter with a probability of 1 − α.
It is applied when the variables are normally distributed.
Confidence Intervals
95% Confidence Interval for μ:
Definition 1:
You can be 95% sure that the true mean (μ) will fall within the upper and lower bounds.
Definition 2:
95% of the intervals constructed using sample means (x̄) will contain the true mean (μ).
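Definition 2 can be illustrated with a small simulation. The sketch below assumes a normal population with a known σ and builds the interval x̄ ± 1.96·σ/√n for many repeated samples. The population values (μ = 50, σ = 10, n = 25) and the function name `coverage` are hypothetical choices for illustration.

```python
import math
import random

def coverage(mu, sigma, n, trials, rng, z=1.96):
    """Fraction of intervals xbar ± z * sigma / sqrt(n) that contain the true mean mu."""
    half_width = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(trials):
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / trials

rng = random.Random(1)  # fixed seed so the run is repeatable
print(coverage(mu=50.0, sigma=10.0, n=25, trials=4000, rng=rng))
```

The printed fraction should be close to 0.95: any single interval either contains μ or does not, but about 95% of intervals built this way do.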
Confidence Intervals
• It is calculated taking into consideration:
  • The sample or population size
  • The type of investigated variable (qualitative or quantitative)
The formula comprises two parts:
I. An estimator of the quality of the sample on which the population estimate was computed (standard error)
  • Standard error: a measure of how good our best guess is.
  • Standard error: the bigger the sample, the smaller the standard error.
  • Standard error: always smaller than the standard deviation (for n > 1).
II. Degree of confidence (Zα score)
It can be calculated for any estimator but is most frequently used for the mean.
Confidence Intervals for Means
• The standard error of the mean is equal to the standard deviation divided by the square root of the number of observations:
  • If the standard deviation is high, the chance of error in the estimator is high.
  • If the sample size is large, the chance of error in the estimator is small.
$\left[\bar{X} - Z_{\alpha}\frac{s}{\sqrt{n}},\ \bar{X} + Z_{\alpha}\frac{s}{\sqrt{n}}\right]$
Confidence Intervals for Means
• The lower confidence limit is smaller than the mean
• The upper confidence limit is higher than the mean
• For the 95% confidence intervals: Z5% = 1.96
• For the 99% confidence intervals: Z1% = 2.58
$\left[\bar{X} - Z_{\alpha}\frac{s}{\sqrt{n}},\ \bar{X} + Z_{\alpha}\frac{s}{\sqrt{n}}\right]$
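The critical values quoted on this slide can be reproduced from the standard normal distribution. The sketch below uses `statistics.NormalDist` from the Python standard library. The function names `z_critical` and `ci_mean_z` are hypothetical helpers, not part of the original slides.

```python
import math
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical value of the standard normal distribution."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

def ci_mean_z(xbar, s, n, confidence=0.95):
    """Interval xbar ± z * s / sqrt(n) for the mean."""
    margin = z_critical(confidence) * s / math.sqrt(n)
    return xbar - margin, xbar + margin

for level in (0.90, 0.95, 0.99):
    print(level, round(z_critical(level), 3))
```

The 95% and 99% values round to 1.96 and 2.58. The 90% value is about 1.645, which printed tables round to either 1.64 or 1.65.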
Confidence Interval for a Mean When the Population Standard Deviation is Unknown
When the population is normal or if the sample size is large, then the sampling distribution will also be normal, but the use of s to replace σ is not that accurate.
The smaller the sample size the worse the approximation will be. Hence we can expect that some adjustment will be made based on the sample size. The adjustment we make is that we do not use the normal curve for this approximation.
Instead, we use the Student t distribution that is based on the sample size. We proceed as before, but we change the table that we use. This distribution looks like the normal distribution, but as the sample size decreases it spreads out. For large n it nearly matches the normal curve. We say that the distribution has n - 1 degrees of freedom.
Confidence Intervals
CI for μ if n>120:
90% CI: $\bar{x} \pm 1.65\,(s/\sqrt{n})$
95% CI: $\bar{x} \pm 1.96\,(s/\sqrt{n})$
99% CI: $\bar{x} \pm 2.58\,(s/\sqrt{n})$
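CI for μ if n<120:
90% CI: $\bar{x} \pm t_{\alpha,\,n-1}\,(s/\sqrt{n})$, with α = 0.10
95% CI: $\bar{x} \pm t_{\alpha,\,n-1}\,(s/\sqrt{n})$, with α = 0.05
99% CI: $\bar{x} \pm t_{\alpha,\,n-1}\,(s/\sqrt{n})$, with α = 0.01
where $t_{\alpha,\,n-1}$ is read from the t table at the two-tailed significance level α and n − 1 degrees of freedom, or obtained with the Excel function T.INV.2T(probability, deg_freedom).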
A fellow wanted to determine the average serum creatinine level among healthy elderly adult male subjects from Timisoara city. From the literature she could not find any information on μ or σ of serum creatinine among local healthy elderly males.
She measured 15 healthy elderly male volunteers from Timisoara city; the sample mean sCr was 0.94 mg/dL with a sample standard deviation of 0.15 mg/dL.
What is the 95% CI for μ?
Confidence Intervals
• Solution:
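A minimal sketch of how the interval for the serum creatinine example could be computed, assuming the t-based interval x̄ ± t·s/√n for small samples. The critical value is hard-coded from a standard t table (two-sided 95%, 14 degrees of freedom, about 2.145) rather than computed, and the variable names are illustrative.

```python
import math

n = 15            # sample size
xbar = 0.94       # sample mean serum creatinine, mg/dL
s = 0.15          # sample standard deviation, mg/dL
t_crit = 2.145    # two-sided 95% critical value for n - 1 = 14 df (table value)

se = s / math.sqrt(n)          # standard error of the mean
margin = t_crit * se           # margin of error
lower, upper = xbar - margin, xbar + margin

print(f"SE = {se:.4f}, margin = {margin:.3f}")
print(f"95% CI for mu: ({lower:.2f}, {upper:.2f}) mg/dL")
```

Under these assumptions the 95% confidence interval runs from roughly 0.86 to 1.02 mg/dL.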
Example
• Suppose a student measuring the boiling temperature of a certain liquid observes the readings (in degrees Celsius) 102.5, 101.7, 103.1, 100.9, 100.5, and 102.2 on 6 different samples of the liquid.
• He calculates the sample mean to be 101.82.
• If he knows that the standard deviation for this procedure is 1.2 degrees, what is the confidence interval for the population mean at a 95% confidence level?
• In other words, the student wishes to estimate the true mean boiling temperature of the liquid using the results of his measurements. If the measurements follow a normal distribution, then the sample mean will have the distribution N(μ, σ/√n). Since the sample size is 6, the standard deviation of the sample mean is equal to 1.2/sqrt(6) = 0.49.
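The remaining step can be sketched as follows, assuming the known-σ interval x̄ ± z·σ/√n with z taken from the standard normal distribution. The variable names are illustrative.

```python
import math
from statistics import NormalDist, mean

readings = [102.5, 101.7, 103.1, 100.9, 100.5, 102.2]  # degrees Celsius
sigma = 1.2                                            # known standard deviation

xbar = mean(readings)
se = sigma / math.sqrt(len(readings))   # standard deviation of the sample mean
z = NormalDist().inv_cdf(0.975)         # about 1.96 for a 95% confidence level
lower, upper = xbar - z * se, xbar + z * se

print(f"mean = {xbar:.2f}, SE = {se:.2f}")
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```

With these numbers the 95% confidence interval for the true mean boiling temperature is roughly 100.86 to 102.78 degrees Celsius.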
Remember!
1. Correct estimation of a statistical parameter is done with confidence intervals (CI).
2. Confidence intervals depend on the sample size and the standard error.
3. The confidence interval is wider for:
• High values of the standard error
• Small sample sizes