A New Approach for Comparing Means of Two Populations By Brad Moss And Sponsored by Dr. Chand Chauhan.

The Problem

• The research that is being presented is about a current problem in statistics.

• The problem: how do you compare the means of two populations when the variance of each population is unknown and the two variances are unequal?

• There are many ways to deal with this problem. We will compare a new method to one of the other methods during this presentation.

The Idea!

• After consulting with Dr. Chauhan and reading “A Note on the Ratio of Two Normally Distributed Variables”, we decided to try to get around the unknown/unequal variance problem by taking a ratio of the estimated means, which has not been done before.

• We would approximate this ratio as a normal distribution and from that, decide whether or not the means are different.

A Context for the Idea

• Let’s say we are employees of “Get Better Drug Company” and that we are in the process of developing a drug for weight loss.

• Our scientists have developed two drugs, A and B.

• The company can only mass produce one of these drugs.

Things We Will Want to Know

• Overall, what is the mean weight loss for people taking each drug?

• Is one drug more effective than the other?

How To Answer

• First we approximate the means of the two populations with the sample averages, known as $\bar{Y}$ and $\bar{X}$.

• Then we compare the means by taking either the ratio of the averages or the difference of the averages.

• Let's consider the ratio of the averages, which is what my research is about.

How this Works

• If the two means are equal, their ratio (otherwise known as a fraction) should be 1 or close to 1.

• How close to 1 is close enough?

• For this, we create what is known as a confidence interval, using the estimated ratio and an estimate of its variance.

• If 1 falls into this interval, we will say that the means are statistically the same.

The Ratio Can Be Approximated by a Normal Distribution.

• According to the paper mentioned on slide 3, a ratio of two normally distributed random variables can be approximated as a normal distribution.

• This happens when the standard deviation of one of the random variables is significantly smaller than that of the other.

• For this to work, the ratio of the standard deviations should exceed 19 to 9 (about 2.1), assuming the means are equal.

How Can We Use This

• The ratio of two normally distributed random variables is approximately normal.

• The average values of normally distributed variables are normally distributed as well (see Theorem 8, page 187 in “An Introduction to Probability and Statistical Inference”).

• That is, if Y and X are normally distributed, then $\bar{Y}$ and $\bar{X}$ are normally distributed, so the ratio $\bar{Y}/\bar{X}$ is approximately normal too.

How Can We Use This

• The last fact is important: we don’t know the true means, so we approximate them with the sample averages.

• According to the paper, when Y and X are independent, the mean of $Y/X$ is approximately $E(Y/X) \approx \frac{\mu_Y}{\mu_X}\left(1 + CV_X^2\right)$.

• $CV_X = \sigma_X/\mu_X$ is the ratio of the standard deviation of X to its mean. The $\mu$’s stand for the true means.
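The approximate mean of a ratio can be checked numerically. The sketch below assumes the approximation takes the form $(\mu_Y/\mu_X)(1 + CV_X^2)$, which is the usual second-order expansion and may differ in detail from the formula in the paper; the function names and the example parameters (both means 100, standard deviations 10 and 5) are illustrative choices, not taken from the slides' derivation.

```python
import random
import statistics


def simulated_ratio_mean(mu_y, sd_y, mu_x, sd_x, draws=200_000, seed=1):
    """Average of Y/X over many independent draws of normal Y and X."""
    rng = random.Random(seed)
    ratios = [rng.gauss(mu_y, sd_y) / rng.gauss(mu_x, sd_x) for _ in range(draws)]
    return statistics.fmean(ratios)


def approximate_ratio_mean(mu_y, mu_x, sd_x):
    """Assumed approximation: (mu_y / mu_x) * (1 + CV_x^2)."""
    cv_x = sd_x / mu_x  # coefficient of variation of the denominator
    return (mu_y / mu_x) * (1 + cv_x ** 2)


if __name__ == "__main__":
    print("simulated  :", simulated_ratio_mean(100, 10, 100, 5))
    print("approximate:", approximate_ratio_mean(100, 100, 5))
```

With a small $CV_X$ the two numbers should agree closely, and both sit only slightly above $\mu_Y/\mu_X$.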

How Can We Use This

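• The variance of $Y/X$ is approximately $\mathrm{Var}(Y/X) \approx \left(\frac{\mu_Y}{\mu_X}\right)^2\left(CV_Y^2 + CV_X^2\right)$, where $CV_Y = \sigma_Y/\mu_Y$ and $CV_X = \sigma_X/\mu_X$.

• Once again we are assuming Y and X are independent.

• So, now we want to replace $Y/X$ with the ratio of our approximated means, $\bar{Y}/\bar{X}$.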

How Does that Change Things

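• From basic statistics, $E(\bar{Y}) = \mu_Y$ and $E(\bar{X}) = \mu_X$, and $\mathrm{Var}(\bar{Y}) = \sigma_Y^2/n_Y$ and $\mathrm{Var}(\bar{X}) = \sigma_X^2/n_X$.

• Here E(…) stands for the expected value of (…) and Var(…) stands for the variance of (…).

• The n’s stand for the sample sizes, with the subscript indicating which sample each belongs to.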
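The behavior of a sample average (same expected value as a single observation, variance divided by the sample size) can be illustrated with a short simulation. This is a minimal sketch; the function name, seed, and the example values (mean 100, standard deviation 10, sample size 30) are illustrative assumptions.

```python
import random
import statistics


def sample_mean_moments(mu, sd, n, reps=5000, seed=2):
    """Draw `reps` samples of size n from N(mu, sd) and return the
    average and the variance of the resulting sample means."""
    rng = random.Random(seed)
    means = [
        statistics.fmean([rng.gauss(mu, sd) for _ in range(n)])
        for _ in range(reps)
    ]
    return statistics.fmean(means), statistics.variance(means)


if __name__ == "__main__":
    avg, var = sample_mean_moments(100, 10, 30)
    print("average of sample means :", avg)          # compare with mu = 100
    print("variance of sample means:", var)          # compare with sd^2 / n
    print("sd^2 / n                :", 10 ** 2 / 30)
```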

Now Substituting

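• $E(\bar{Y}/\bar{X}) \approx \frac{\mu_Y}{\mu_X}\left(1 + \frac{CV_X^2}{n_X}\right)$

• Since the standard deviation of X is small, $CV_X$ is small, and $CV_X^2$ is being divided by a bigger number, $n_X$.

• Therefore we can ignore $CV_X^2/n_X$ in the approximate mean. Thus $E(\bar{Y}/\bar{X}) \approx \mu_Y/\mu_X$.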

The Confidence Interval!

• Now we need to determine how close to 1 the ratio has to be in order to say that the means are the same.

• So we will create an interval around our approximate mean.

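• In statistics, we convert a normal variable to a standard normal one by subtracting its mean and then dividing by its standard deviation.

• So $Z = \dfrac{\bar{Y}/\bar{X} - \mu_Y/\mu_X}{\sqrt{\mathrm{Var}(\bar{Y}/\bar{X})}}$ is approximately standard normal.

• Now we want to choose two numbers “a” and “b” on the real line such that $P(a \le Z \le b)$ is 95%.

• Since we are working with a standard normal distribution, we will choose “a” and “b” to be -1.96 and 1.96, because in a standard normal, $P(-1.96 \le Z \le 1.96) = 0.95$.

• So, $P\left(-1.96 \le \dfrac{\bar{Y}/\bar{X} - \mu_Y/\mu_X}{\sqrt{\mathrm{Var}(\bar{Y}/\bar{X})}} \le 1.96\right) = 0.95$.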
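The choice of -1.96 and 1.96 can be confirmed with the standard normal distribution in Python's standard library. This is only a quick check of the 95% figure, not part of the original analysis, which was done in Minitab.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# Probability that a standard normal value lies between -1.96 and 1.96.
central_probability = standard_normal.cdf(1.96) - standard_normal.cdf(-1.96)

# The cutoff that leaves 2.5% in the upper tail (and 2.5% in the lower tail).
upper_cutoff = standard_normal.inv_cdf(0.975)

if __name__ == "__main__":
    print(central_probability)
    print(upper_cutoff)
```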

Doing Some Algebra

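• After some algebra, writing $\hat{c} = \widehat{CV}_Y^2/n_Y + \widehat{CV}_X^2/n_X$ (each CV estimated from its sample, so that $\mathrm{Var}(\bar{Y}/\bar{X}) \approx (\mu_Y/\mu_X)^2\,\hat{c}$), we get: $\dfrac{\bar{Y}/\bar{X}}{1 + 1.96\sqrt{\hat{c}}} \le \dfrac{\mu_Y}{\mu_X} \le \dfrac{\bar{Y}/\bar{X}}{1 - 1.96\sqrt{\hat{c}}}$

• Thus the endpoints of the interval are: $\dfrac{\bar{Y}/\bar{X}}{1 + 1.96\sqrt{\hat{c}}}$ and $\dfrac{\bar{Y}/\bar{X}}{1 - 1.96\sqrt{\hat{c}}}$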

Results

• So, given those two endpoints, if 1 is between them, then the means are statistically the same.

• Otherwise the means are different.
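The decision rule can be sketched in code. The endpoint formula used here is an assumption: it takes the variance of the ratio of averages to be $(\mu_Y/\mu_X)^2(CV_Y^2/n_Y + CV_X^2/n_X)$, plugs in sample estimates, and solves the standardized inequality for $\mu_Y/\mu_X$; the exact form on the slides may differ. The function names are hypothetical, and both sample averages are assumed to be positive.

```python
import math
import statistics


def ratio_confidence_interval(y, x, z=1.96):
    """Approximate interval for mu_y / mu_x (assumed endpoint form, see lead-in)."""
    ybar, xbar = statistics.fmean(y), statistics.fmean(x)
    if ybar <= 0 or xbar <= 0:
        raise ValueError("this sketch assumes positive sample averages")
    cv_y = statistics.stdev(y) / ybar  # estimated coefficient of variation of Y
    cv_x = statistics.stdev(x) / xbar  # estimated coefficient of variation of X
    half = z * math.sqrt(cv_y ** 2 / len(y) + cv_x ** 2 / len(x))
    if half >= 1:
        raise ValueError("samples too variable for this approximation")
    ratio = ybar / xbar
    return ratio / (1 + half), ratio / (1 - half)


def means_statistically_same(y, x, z=1.96):
    """True when 1 lies between the two endpoints."""
    low, high = ratio_confidence_interval(y, x, z)
    return low <= 1 <= high


if __name__ == "__main__":
    drug_a = [98, 102, 100, 101, 99]       # made-up weight-loss values
    drug_b = [100, 100.5, 99.5, 100, 100]  # made-up weight-loss values
    print(ratio_confidence_interval(drug_a, drug_b))
    print(means_statistically_same(drug_a, drug_b))
```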

Results

• Using a program called Minitab, I ran 5000 simulations for several cases and the results are on the following slides.
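For readers without Minitab, a simulation of this kind can be sketched in Python. This is not the original Minitab procedure: the endpoint formula is the same assumed form $\frac{\bar{Y}/\bar{X}}{1 \pm 1.96\sqrt{\hat{c}}}$ with $\hat{c} = \widehat{CV}_Y^2/n_Y + \widehat{CV}_X^2/n_X$, the second parameter of N(·,·) is assumed to be a standard deviation, and the function names and seed are illustrative.

```python
import math
import random
import statistics


def ratio_interval(y, x, z=1.96):
    """Assumed interval for mu_y / mu_x built from two samples."""
    ybar, xbar = statistics.fmean(y), statistics.fmean(x)
    cv_y = statistics.stdev(y) / ybar
    cv_x = statistics.stdev(x) / xbar
    half = z * math.sqrt(cv_y ** 2 / len(y) + cv_x ** 2 / len(x))
    ratio = ybar / xbar
    return ratio / (1 + half), ratio / (1 - half)


def simulate_coverage(mu_y, sd_y, n_y, mu_x, sd_x, n_x, n_sims=5000, seed=3):
    """Fraction of simulated intervals containing the true ratio,
    together with the mean interval length."""
    rng = random.Random(seed)
    true_ratio = mu_y / mu_x
    hits = 0
    total_length = 0.0
    for _ in range(n_sims):
        y = [rng.gauss(mu_y, sd_y) for _ in range(n_y)]
        x = [rng.gauss(mu_x, sd_x) for _ in range(n_x)]
        low, high = ratio_interval(y, x)
        hits += low <= true_ratio <= high
        total_length += high - low
    return hits / n_sims, total_length / n_sims
```

Calling `simulate_coverage(100, 10, 30, 100, 0.5, 30)` mirrors the first equal-sample-size scenario; the numbers will not match the Minitab runs exactly because the random draws differ.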

Case 1: Equal Sample Sizes

• Y~N(100,10), X~N(100,0.5), Sample Size 30
• Confidence interval coverage: 94.04%
• Mean Length: 0.0711821

• Y~N(100,10), X~N(100,2), Sample Size 30
• Confidence interval coverage: 94.42%
• Mean Length: 0.0726839

• Y~N(100,10), X~N(100,5), Sample Size 30
• Confidence interval coverage: 94.54%
• Mean Length: 0.0795576

Case 2: Y has smaller sample size

• Y~N(100,10) Sample Size 15, X~N(100,0.5) Sample Size 20
• Confidence interval coverage: 93.34%
• Mean Length: 0.100127

• Y~N(100,10) Sample Size 15, X~N(100,2) Sample Size 20
• Confidence interval coverage: 92.88%
• Mean Length: 0.101451

• Y~N(100,10) Sample Size 15, X~N(100,5) Sample Size 20
• Confidence interval coverage: 93.68%
• Mean Length: 0.108929

Case 3: X has smaller sample size

• Y~N(100,10) Sample Size 20, X~N(100,0.5) Sample Size 15
• Confidence interval coverage: 93.8%
• Mean Length: 0.0869802

• Y~N(100,10) Sample Size 20, X~N(100,2) Sample Size 15
• Confidence interval coverage: 93.56%
• Mean Length: 0.0891369

• Y~N(100,10) Sample Size 20, X~N(100,5) Sample Size 15
• Confidence interval coverage: 93.84%
• Mean Length: 0.100611

Sources

• Hayya, Jack, Donald Armstrong, and Nicolas Gressis. “A Note on the Ratio of Two Normally Distributed Variables.” Management Science 21.11 (1975). 14 Jan 2011 <http://www.jstor.org/stable/2629897>

• Roussas, George. An Introduction to Probability and Statistical Inference. San Diego: Elsevier Science, 2003.
