Would You Change the Channel?

A survey by a known organization found that 45% of the people who
were offended by a television program would change the channel, while
15% would turn off their television sets. The survey further stated
that the margin of error is 3 percentage points, and 4000 adults were
interviewed.
Several questions arise:
1. How do these estimates compare with the true population
   percentage?
2. What is meant by a margin of error of 3 percentage points?
3. Is the sample of 4000 large enough to represent the population of
   all adults who watch television in the Philippines?

STATISTICAL INFERENCE: Estimation

Estimation
The process of estimating the value of a parameter from information
obtained from a sample.

Two Types of Estimates
1. Point Estimates
2. Interval Estimates

Point Estimate
A specific numerical value used as an estimate of a parameter.

Interval Estimate
An interval or range of values used to estimate the parameter. This
estimate may or may not contain the value of the parameter being
estimated.

Three Properties of a Good Estimator
1. The estimator should be an unbiased estimator.
2. The estimator should be consistent. For a consistent estimator, as
   the sample size increases, the value of the estimator approaches
   the value of the parameter estimated.
3. The estimator should be a relatively efficient estimator (it has
   the smallest variance).

Confidence Level
The degree of assurance that a particular statistical statement is
correct, under specified conditions.

Confidence Interval
A specific interval estimate of a parameter, determined by using data
obtained from a sample and the specific confidence level of the
estimate.

Significance Level
The degree of uncertainty about the statistical statement, under the
same conditions used to determine the confidence level. Significance
levels are symbolized by $\alpha$.

Mathematically,
Confidence level + Significance level = 1

Confidence Intervals
Used to estimate a range of possible values of a parameter, rather
than a single value. When you use a confidence interval instead of a
point estimator, you lose a degree of precision but you gain a large
degree of confidence. In general form:
lower limit < parameter < upper limit

Where:
lower limit = point estimator - error of estimate
upper limit = point estimator + error of estimate

Formula for the Confidence Interval of the Mean for a Specific
$\alpha$
$\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
The term $z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$ is the maximum error of
estimate.

Maximum Error of Estimate
The maximum likely difference between the point estimate of a
parameter and the actual value of the parameter.

Examples:
1. A researcher wishes to estimate the average amount of money a
   person spends on lottery tickets each month. A sample of 50 people
   who play the lottery found the mean to be 19 dollars and the
   standard deviation to be 6.8. Find the best point estimate of the
   population mean and the 95% confidence interval of the population
   mean.
2. A survey of 30 adults found that the mean age of a person's
   primary vehicle is 5.6 years. Assuming the standard deviation of
   the population is 0.8 year, find the best point estimate of the
   population mean and the 99% confidence interval of the population
   mean.

Formula for the Confidence Interval of the Mean for a Specific
$\alpha$ (using the t distribution)
$\bar{X} - t_{\alpha/2}\frac{s}{\sqrt{n}} < \mu < \bar{X} + t_{\alpha/2}\frac{s}{\sqrt{n}}$
The degrees of freedom (df) are n - 1.

STATISTICAL HYPOTHESIS TESTING

How much better is better?
Suppose a school superintendent reads an article which states that
the overall mean entrance exam score is 85. Furthermore, suppose
that, for a sample of students, the average of the entrance exam
scores in the superintendent's school district is 88. Can the
superintendent conclude that the students in his school district
scored higher than the average?

Questions arise:
Is there a real difference in the means?
Is the difference simply due to chance?

Statistical Hypothesis
An assertion or conjecture concerning one or more populations. This
conjecture may or may not be true.

Types of Hypothesis
1. Null Hypothesis ($H_0$) is the hypothesis that is being tested; it
   represents what the experimenter doubts to be true.
2. Alternative Hypothesis ($H_a$) is the operational statement of the
   theory that the experimenter believes to be true and wishes to
   prove. It is the contradiction of the null hypothesis. It
   specifies the existence of a difference or a relationship, and it
   may be non-directional (two-tailed) or directional (one-tailed).

Illustration of how hypotheses should be stated:

Situation A: A medical researcher is interested in finding out
whether a new medication will have any undesirable side effects. The
researcher is particularly concerned with the pulse rate of the
patients who take the medication. Will the pulse rate increase,
decrease, or remain unchanged after a patient takes the medication?
Since the researcher knows that the mean pulse rate for the
population under study is 82 beats per minute, the hypotheses for
this situation are
$H_0: \mu = 82$ and $H_a: \mu \neq 82$
The null hypothesis specifies that the mean will remain unchanged,
and the alternative states that it will be different. This test is
called a TWO-TAILED TEST.

Situation B: A chemist invents an additive to increase the life of
an automobile battery. If the mean lifetime of the automobile battery
without the additive is 36 months, then the hypotheses are:
$H_0: \mu \leq 36$ and $H_a: \mu > 36$
In this situation, the chemist is interested only in increasing the
lifetime of the batteries, so her alternative hypothesis is that the
mean is greater than 36 months. This test is called a RIGHT-TAILED
TEST.

Situation C: A contractor wishes to lower heating bills by using a
special type of insulation in houses. If the average of the monthly
heating bills is 500 pesos, her hypotheses about heating costs with
the use of insulation are:
$H_0: \mu \geq 500$ and $H_a: \mu < 500$
This test is called a LEFT-TAILED TEST.

Summary: two-tailed test (Situation A), right-tailed test
(Situation B), left-tailed test (Situation C).

Exercises: State the null and alternative hypotheses for each
conjecture.
A. A researcher thinks that if expectant mothers use vitamin pills,
   the birth weight of the babies will increase. The average birth
   weight of the population is 8.6 pounds.
B. An engineer hypothesizes that the mean number of defects can be
   decreased in a manufacturing process of compact disks by using
   robots instead of humans for certain tasks. The mean number of
   defective disks per 1000 is 18.
C. A psychologist feels that playing soft music during a test will
   change the results of the test. The psychologist is not sure
   whether the grades will be higher or lower. In the past, the mean
   of the scores was 73.

Solution:
A. $H_0: \mu \leq 8.6$ and $H_a: \mu > 8.6$
B. $H_0: \mu \geq 18$ and $H_a: \mu < 18$
C. $H_0: \mu = 73$ and $H_a: \mu \neq 73$

Test Statistic
A statistic whose value is calculated from sample measurements and on
which the statistical decision will be based.

Types of Error
1. Type I Error is the error made by rejecting the null hypothesis
   when it is true. The probability of a Type I error is denoted by
   $\alpha$.
2. Type II Error is the error made by accepting (not rejecting) the
   null hypothesis when it is false. The probability of a Type II
   error is denoted by $\beta$.

Level of Significance ($\alpha$)
The maximum probability of committing a Type I error that the
researcher is willing to accept. Three levels are commonly used:
a. 0.10
b. 0.05
c. 0.01

Critical Value
Separates the critical region from the non-critical region. The
symbol is C.V.

Critical Region or Rejection Region
The set of values of the test statistic for which the null hypothesis
will be rejected.

Acceptance Region
The set of values of the test statistic for which the null hypothesis
will not be rejected. The acceptance and rejection regions are
separated by a critical value of the test statistic.

Finding Critical Values:
Find the critical value(s) for each situation and draw the
appropriate figure, showing the critical region.
a. A left-tailed test with $\alpha = 0.10$
b. A two-tailed test with $\alpha = 0.02$
c. A right-tailed test with $\alpha = 0.005$

Factors to Be Considered in Selecting Statistical Tests
Each test is appropriate under certain conditions. When selecting a
test, consider four factors:
1. the structure of the null hypothesis
2. the level of measurement allowed, or required, by the test
3. the sample size
4. the distribution of the responses (whether the distribution is
   normal or not)

Steps in Hypothesis Testing
1. Formulate the hypotheses and identify the claim.
2. Determine the critical value.
3. Determine the computed value of the test statistic from the given
   conditions.
4. Make a decision. In making a decision we compare the absolute
   value of the computed value with the critical value. There are two
   possibilities:
   If the computed value is less than the critical value, we accept
   (do not reject) the null hypothesis and reject the alternative
   hypothesis.
   If the computed value is greater than the critical value, we
   reject the null hypothesis and accept the alternative hypothesis.
5. Summarize the results.

Types of Statistical Test
Z Test
T-Test
Chi-Square Analysis
ANOVA
Correlation Coefficient

Z Test
The simplest and most common test on the significance of sample
data. The application of the Z test requires normality of the
distribution. The sample size should be greater than or equal to 30.
This test is one of the parametric tests since it utilizes the two
population parameters $\mu$ and $\sigma$. If the population standard
deviation is not known, then the sample standard deviation can be
used. The Z-test can be applied in two ways:

One Sample Mean Test
Formula:
$Z_{computed} = \frac{(\bar{X} - \mu)\sqrt{n}}{\sigma}$
where:
$\bar{X}$ = sample mean
$\mu$ = hypothesized value of the population mean
$\sigma$ = population standard deviation
$n$ = sample size

Example:
1. A researcher reports that the average salary of assistant
   professors is more than 42,000 dollars. A sample of 30 assistant
   professors has a mean salary of 43,260 dollars. At
   $\alpha = 0.05$, test the claim that the assistant professors earn
   more than 42,000 dollars a year. The standard deviation of the
   population is 5230 dollars.
Solution:
Step 1: $H_0: \mu \leq 42{,}000$ and $H_a: \mu > 42{,}000$ (claim)
Step 2: Since $\alpha = 0.05$ and the test is a right-tailed test,
the critical value is z = +1.65.
Step 3: $Z_{computed} = (43{,}260 - 42{,}000)\sqrt{30}/5230 \approx 1.32$
Step 4: Do not reject the null hypothesis, since 1.32 < 1.65.
Step 5: There is not enough evidence to support the claim that
assistant professors earn more than 42,000 dollars a year.

Example:
2. The Medical Rehabilitation Education Foundation reports that the
   average cost of rehabilitation for stroke victims is 24,672
   dollars. To see if the average cost of rehabilitation is different
   at a particular hospital, a researcher selects a random sample of
   35 stroke victims at the hospital and finds that the average cost
   of their rehabilitation is 25,226 dollars. The standard deviation
   of the population is 3251 dollars. At $\alpha = 0.01$, can it be
   concluded that the average cost of stroke rehabilitation at the
   particular hospital is different from 24,672 dollars?

Two Sample Mean Test
Formula:
$Z_{computed} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$
where:
$\sigma_1^2$ = the variance of sample 1
$\sigma_2^2$ = the variance of sample 2
$n_1$ = size of sample 1
$n_2$ = size of sample 2

Critical Values of Z at Different Levels of Significance

Test type     Level of significance
              .01      .025     .05      .10
One-tailed    2.33     1.96     1.645    1.28
Two-tailed    2.575    2.24     1.96     1.645

Example:
1. A supplier sells ropes. He claims that the ropes have a mean
   strength of 34 lbs and a variance of 64 lbs. A random sample of 32
   ropes selected from a shipment yields a mean strength of 31 lbs.
   Are you going to reject the claim of the supplier at the .05
   level?
2. An admission test was administered to incoming freshmen in two
   colleges. Two independent samples of 150 students each are
   randomly selected, and the mean scores of the given samples are 88
   and 85. Assume that the variances of the test scores are 40 and
   35, respectively. Is the difference between the mean scores
   significant, or can it be attributed to chance? Use the .01 level
   of significance.
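
The two examples above can be checked numerically. What follows is a minimal sketch, not a definitive solution: the helper names (one_sample_z, two_sample_z, critical_z) are hypothetical, both tests are assumed to be two-tailed, and the critical values are taken from Python's statistics.NormalDist instead of the table.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z(sample_mean, mu0, sigma, n):
    """One-sample Z: (X-bar - mu0) * sqrt(n) / sigma."""
    return (sample_mean - mu0) * sqrt(n) / sigma


def two_sample_z(mean1, mean2, var1, var2, n1, n2):
    """Two-sample Z: (x1-bar - x2-bar) / sqrt(var1/n1 + var2/n2)."""
    return (mean1 - mean2) / sqrt(var1 / n1 + var2 / n2)


def critical_z(alpha, two_tailed=True):
    """Positive critical value of the standard normal distribution."""
    tail = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail)


# Rope example: claimed mean 34 lbs, variance 64 (sigma = 8), n = 32,
# sample mean 31 lbs, alpha = .05 (two-tailed is an assumption).
z_rope = one_sample_z(31, 34, sqrt(64), 32)      # about -2.12
cv_rope = critical_z(0.05)                       # about 1.96
reject_rope = abs(z_rope) > cv_rope

# Admission test example: means 88 and 85, variances 40 and 35,
# n = 150 in each sample, alpha = .01 (two-tailed is an assumption).
z_scores = two_sample_z(88, 85, 40, 35, 150, 150)  # about 4.24
cv_scores = critical_z(0.01)                       # about 2.576
reject_scores = abs(z_scores) > cv_scores

print(f"rope:   z = {z_rope:.2f}, C.V. = {cv_rope:.3f}, reject H0: {reject_rope}")
print(f"scores: z = {z_scores:.2f}, C.V. = {cv_scores:.3f}, reject H0: {reject_scores}")
```

Under these assumptions both computed values exceed the critical value in absolute value, so both null hypotheses would be rejected.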

T-Test
When the sample is small (n < 30) and only the sample variance is
known, use the t-test. The use of the t-test involves the degrees of
freedom of the distribution. The degrees of freedom (df) vary
according to the particular type of t-test to be used.

Degrees of Freedom (df)
The number of values that are free to vary after a sample statistic
has been computed; they tell the researcher which specific curve to
use when a distribution consists of a family of curves.

One Sample Mean Test
Formula:
$t_{computed} = \frac{(\bar{X} - \mu)\sqrt{n}}{s}$
where: df = n - 1

Steps in Hypothesis Testing
1. State the hypotheses and identify the claim.
2. Find the critical values.
3. Compute the test value.
4. Make the decision to reject or not reject the null hypothesis.
5. Summarize the results.

Examples:
1. A job placement director claims that the average starting salary
   for nurses is 24,000 dollars. A sample of 10 nurses' salaries has
   a mean of 23,450 dollars and a standard deviation of 400 dollars.
   Is there enough evidence to reject the director's claim at
   $\alpha = 0.05$?
Solution:
Step 1: $H_0: \mu = 24{,}000$ (claim) and $H_a: \mu \neq 24{,}000$
Step 2: The critical values are +2.262 and -2.262 for
$\alpha = 0.05$ and d.f. = 9.
Step 3: $t_{computed} = (23{,}450 - 24{,}000)\sqrt{10}/400 \approx -4.35$
Step 4: Reject the null hypothesis, since -4.35 < -2.262.
Step 5: There is enough evidence to reject the claim that the
starting salary of nurses is 24,000 dollars.

Examples:
2. An educator claims that the average salary of substitute teachers
   in a school district is less than 60 dollars per day. A random
   sample of eight school districts is selected, and the daily
   salaries (in dollars) are shown. Is there enough evidence to
   support the educator's claim at $\alpha = 0.10$?
   60  56  60  55  70  55  60  55

Two Sample Mean Test
Formula:
$t_{computed} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$
where: df = $n_1 + n_2 - 2$

Exercises:
1. ABC Company, a manufacturer of automobile tires, claims that the
   average life of its product is 45,600 miles. A random sample of 15
   tires was chosen and resulted in a mean life of 43,500 miles with
   a standard deviation of 3,000 miles.
2. It is claimed that the mean drying time of a certain brand of nail
   polish is less than or equal to 25 minutes. Would you agree with
   this claim if a random sample of 16 bottles shows a mean drying
   time of 26 minutes with a standard deviation of 2.4 minutes, using
   the .01 level of significance?
3. A random sample of 25 cartons of a certain brand of powdered milk
   showed a mean content of 237 grams with a standard deviation of
   8.56 grams, while a sample of 20 cartons of another brand of
   powdered milk showed a mean content of 240 grams with a standard
   deviation of 9.75 grams. Using the .05 level of significance, is
   there a difference in the mean content of the two brands of
   powdered milk?
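
The one-sample and two-sample t formulas can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the lecture's own solution: the function names are hypothetical, the two-sample version assumes equal population variances (the pooled form), and critical values still have to be read from a t table. The nurses example and the powdered milk exercise supply the numbers.

```python
from math import sqrt


def one_sample_t(sample_mean, mu0, s, n):
    """Return (t, df) for t = (X-bar - mu0) * sqrt(n) / s, df = n - 1."""
    t = (sample_mean - mu0) * sqrt(n) / s
    return t, n - 1


def two_sample_t(mean1, mean2, s1, s2, n1, n2):
    """Return (t, df) for the pooled two-sample t, df = n1 + n2 - 2."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    t = (mean1 - mean2) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, df


# Nurses example: claimed mean 24,000, sample mean 23,450, s = 400, n = 10.
t_nurses, df_nurses = one_sample_t(23450, 24000, 400, 10)   # t about -4.35
# 2.262 is the two-tailed critical value quoted in the text for df = 9.
reject_nurses = abs(t_nurses) > 2.262

# Powdered milk exercise: brand 1 (n = 25, mean 237, s = 8.56) versus
# brand 2 (n = 20, mean 240, s = 9.75).
t_milk, df_milk = two_sample_t(237, 240, 8.56, 9.75, 25, 20)  # t about -1.10

print(f"nurses: t = {t_nurses:.2f}, df = {df_nurses}, reject H0: {reject_nurses}")
print(f"milk:   t = {t_milk:.2f}, df = {df_milk}")
```

For the milk exercise the computed t would then be compared with the table value for 43 degrees of freedom at the .05 level.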

CHI-SQUARE TEST
The objective of the chi-square test is to compare the differences
between the sample frequencies and the expected frequencies. As in
the case of the t-test, the tabular/critical value of the chi-square
statistic depends on two factors: the level of significance and the
degrees of freedom. The level of significance in this test need not
be divided by two.

TEST FOR INDEPENDENCE
The test for independence is used to determine whether two variables
are related or not. Since two variables are involved, the frequencies
are entered in a bivariate table or contingency table. The dimension
of such a table is defined by the expression r x c, where r indicates
the number of rows and c indicates the number of columns. If the null
hypothesis of independence is rejected, then a relationship between
the two variables exists.

Formula:
$\chi^2 = \sum_{i}\sum_{j}\frac{(O_{ij} - E_{ij})^2}{E_{ij}}$
Where:
$O_{ij}$ = observed number of cases in the ith row of the jth column
$E_{ij}$ = expected number of cases under $H_0$
df = (r - 1)(c - 1)

Note:
The test is valid if at least 80% of the cells have expected
frequencies of at least 5 and no cell has an expected frequency less
than 1.
If many expected frequencies are very small, researchers commonly
combine categories of variables to obtain a table having larger cell
frequencies. Generally, one should not pool categories unless there
is a natural way to combine them.
For a 2 x 2 contingency table, a correction called Yates' correction
for continuity is applied. The formula then becomes
$\chi^2 = \sum_{i}\sum_{j}\frac{(|O_{ij} - E_{ij}| - 0.5)^2}{E_{ij}}$
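
The test for independence can be sketched as follows. This is a rough illustration, not the lecture's worked solution: the function name chi_square_independence is hypothetical, expected frequencies are computed as (row total x column total) / grand total, and the critical values 6.635 (df = 1, 1% level) and 9.488 (df = 4, 5% level) are taken from a standard chi-square table, not from the text. The cell counts are those of the gender-by-age and music-preference examples that follow.

```python
def chi_square_independence(observed, yates=False):
    """Return (chi-square, df) for an r x c table of observed counts.

    With yates=True, 0.5 is subtracted from each |O - E| before
    squaring (Yates' correction, intended for 2 x 2 tables).
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            # Expected count under H0 (independence).
            e = row_totals[i] * col_totals[j] / grand_total
            diff = abs(o - e)
            if yates:
                diff -= 0.5
            chi2 += diff ** 2 / e

    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df


# Rows: under 30, 30 and over; columns: male, female.
gender_age = [[60, 50], [80, 10]]
chi2_plain, df_ga = chi_square_independence(gender_age)
chi2_yates, _ = chi_square_independence(gender_age, yates=True)

# Rows: classical, pop, rock; columns: high, medium, low IQ.
music_iq = [[40, 26, 17], [47, 59, 25], [83, 104, 79]]
chi2_music, df_music = chi_square_independence(music_iq)

print(f"gender x age: chi2 = {chi2_plain:.2f}, with Yates = {chi2_yates:.2f}, df = {df_ga}")
print(f"music x IQ:   chi2 = {chi2_music:.2f}, df = {df_music}")
print("reject independence (gender x age, 1%):", chi2_yates > 6.635)
print("reject independence (music x IQ, 5%):", chi2_music > 9.488)
```

The function recomputes row and column totals from the cell counts, so only the cells themselves need to be entered.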

Example:
A survey was conducted to determine whether gender and age are
related among stereo shop customers. A total of 200 respondents was
taken and the results are presented below. Conduct a test of whether
gender and age of stereo shop customers are independent at the 1%
level of significance.

                 Gender
Age              Male    Female    Total
Under 30           60        50      110
30 and over        80        10       90
TOTAL             140        60      200

Test whether a person's music preference is related to his
intelligence as measured by IQ at the 5% level of significance. The
observed frequencies are presented below.

                     IQ
Music Preference     High    Medium    Low    Total
Classical              40        26     17       83
Pop                    47        59     25      131
Rock                   83       104     79      266
TOTAL                 170       189    121      480

Correlational Analysis
You are interested in testing the null hypothesis that two variables
are not correlated. Both variables are at the interval level of
measurement or higher. A normal distribution of responses is not
required.

FORMULAS
Pearson r
$r = \frac{\frac{\sum XY}{N} - \left(\frac{\sum X}{N}\right)\left(\frac{\sum Y}{N}\right)}{\sqrt{\left[\frac{\sum X^2}{N} - \left(\frac{\sum X}{N}\right)^2\right]\left[\frac{\sum Y^2}{N} - \left(\frac{\sum Y}{N}\right)^2\right]}}$
Where:
X = the scores in one test
Y = the scores in the other test
N = the number of examinees

Interpretation of the Pearson r

0.90 to 1.00 (-0.90 to -1.00)    Very high positive/negative correlation
0.70 to 0.90 (-0.70 to -0.90)    High positive/negative correlation
0.50 to 0.70 (-0.50 to -0.70)    Moderate positive/negative correlation
0.30 to 0.50 (-0.30 to -0.50)    Low positive/negative correlation
0.00 to 0.30 (0.00 to -0.30)     Little, if any, correlation

To know whether the obtained correlation coefficient is significant,
i.e., that a real correlation exists and the obtained r is not merely
due to sampling variation, a t-test for the significance of r can be
used.

FORMULA:
$t = r\sqrt{\frac{n - 2}{1 - r^2}}$
df = n - 2
Where:
r = the obtained Pearson r
n = sample size

Example:
A study was made to determine the relationship between the grade in
Calculus and the grade in Fortran Computer Language. A random sample
of 10 computer students in a certain university was taken, with the
following results. Is the relationship significant at the 0.05 level?

Student no.     1   2   3   4   5   6   7   8   9  10
Calculus (x)   75  83  80  77  89  78  92  86  93  84
Fortran (y)    78  87  78  76  92  81  89  89  91  84
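
The Pearson r and the t-test for its significance can be computed directly from the ten pairs of grades. The sketch below is illustrative only: the function names pearson_r and t_for_r are hypothetical, and the two-tailed critical value 2.306 (df = 8, 0.05 level) comes from a standard t table, not from the text.

```python
from math import sqrt

calculus = [75, 83, 80, 77, 89, 78, 92, 86, 93, 84]
fortran = [78, 87, 78, 76, 92, 81, 89, 89, 91, 84]


def pearson_r(x, y):
    """Pearson r from the mean of products minus the product of means."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum(a * b for a, b in zip(x, y)) / n - mean_x * mean_y
    var_x = sum(a * a for a in x) / n - mean_x ** 2
    var_y = sum(b * b for b in y) / n - mean_y ** 2
    return cov / sqrt(var_x * var_y)


def t_for_r(r, n):
    """t = r * sqrt((n - 2) / (1 - r^2)), with df = n - 2."""
    return r * sqrt((n - 2) / (1 - r ** 2))


r = pearson_r(calculus, fortran)          # about 0.91
t = t_for_r(r, len(calculus))             # about 6.09
df = len(calculus) - 2
significant = abs(t) > 2.306              # table value, df = 8, alpha = 0.05

print(f"r = {r:.3f}, t = {t:.2f}, df = {df}, significant: {significant}")
```

On the interpretation scale above, an r of about 0.91 falls in the "very high positive correlation" band.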

Analysis of Variance (ANOVA)
Used to test the null hypothesis that the means of more than two
samples are the same.
Very similar to the t-test (the two-sample t-test is in fact a
special case of ANOVA).
Used to compare the means of more than two groups.
Can be used with small samples.
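
The link between ANOVA and the t-test can be illustrated with a short one-way ANOVA sketch. Everything here is an assumption made for illustration: the three groups of scores are hypothetical data, not taken from the text, and the function names are invented. With only two groups, the F statistic equals the square of the pooled two-sample t.

```python
from math import sqrt


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


def pooled_t(a, b):
    """Pooled two-sample t statistic computed from raw data."""
    n1, n2 = len(a), len(b)
    m1, m2 = sum(a) / n1, sum(b) / n2
    ss1 = sum((x - m1) ** 2 for x in a)
    ss2 = sum((x - m2) ** 2 for x in b)
    pooled_var = (ss1 + ss2) / (n1 + n2 - 2)
    return (m1 - m2) / sqrt(pooled_var * (1 / n1 + 1 / n2))


# Hypothetical test scores for three small groups (illustration only).
group_a = [82, 85, 88, 90, 79]
group_b = [75, 80, 78, 84, 73]
group_c = [91, 87, 94, 89, 92]

f_three, df_b, df_w = one_way_anova_f([group_a, group_b, group_c])
f_two, _, _ = one_way_anova_f([group_a, group_b])
t_two = pooled_t(group_a, group_b)

print(f"three groups: F = {f_three:.2f}, df = ({df_b}, {df_w})")
print(f"two groups:   F = {f_two:.4f}, t squared = {t_two ** 2:.4f}")
```

The computed F would be compared with a critical value from an F table at the chosen level of significance.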