Lecture SamplingJJJJJKK

8/14/2019 Lecture SamplingJJJJJKK


BAHADAR SHAH

CHAIRMAN, DEPARTMENT OF MANAGEMENT SCIENCES



SAMPLING: A Scientific

Method of Data Collection



SAMPLE

•It is a unit selected from the population

•Representative of the population

•Its purpose is to draw inferences about the population



It is very difficult to study each and every unit of the population when the population units are heterogeneous

Time Constraints

Finance





When the population has significant variation (is heterogeneous), multiple individuals must be observed to find all the characteristics that may exist



Population

The entire group of people of interest from whom the

researcher needs to obtain information

Element (sampling unit)

One unit from a population

Sampling

The selection of a subset of the population through various sampling techniques

Sampling Frame

Listing of the population from which a sample is chosen. The sampling frame for any probability sample is a complete list of all the cases in the population from which your sample will be drawn





Population Vs. Sample

Population of Interest

Sample

Population: described by a parameter

Sample: described by a statistic

We measure the sample using statistics in order to draw inferences about the

population and its parameters.





Representative

Accessible

Low cost



SAMPLING



[Figure: Sampling process. Population (what you want to talk about) → Sampling frame → Sample (what you actually observe in the data). Inference runs from the sample back to the population.]





Define the population

Identify the sampling frame

Select a sampling design or procedure

Determine the sample size

Draw the sample





Classification of Sampling Methods

Sampling methods

Probability samples: Simple random, Cluster, Systematic, Stratified, Multistage

Non-probability samples: Quota, Judgment, Convenience, Snowball



Each and every unit of the population has a known, non-zero chance of selection as a sampling unit (an equal chance under simple random sampling)

Also called formal sampling or random sampling

Probability samples are generally more accurate than non-probability samples

Probability samples allow us to estimate the accuracy of the sample



Simple Random Sampling

Stratified Sampling

Cluster Sampling

Systematic Sampling

Multistage Sampling



Simple Random Sampling

The purest form of probability sampling

Assures each element in the population has an

equal chance of being included in the sample

Random number generators



Simple random sampling



Types of Simple Random Sample

With replacement

Without replacement



With replacement

A unit, once selected, is returned to the population and can be selected again

Without replacement

A unit, once selected, cannot be selected again
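The two types of simple random sample can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical sampling frame of 20 numbered units and a fixed seed; neither comes from the lecture.

```python
import random

# Hypothetical sampling frame: unit IDs 1..20 (illustrative only).
frame = list(range(1, 21))
rng = random.Random(42)  # fixed seed so the draw can be repeated

# Without replacement: a unit can appear at most once in the sample.
without_replacement = rng.sample(frame, k=5)

# With replacement: a unit may be drawn more than once.
with_replacement = rng.choices(frame, k=5)

print(without_replacement)
print(with_replacement)
```

Using a random number generator here plays the same role as the lottery method or a random number table.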



Tippett's method

Lottery Method

Random Table



6 8 4 2 5 7 9 5 4 1 2 5 6 3 2 1 4 0

5 8 2 0 3 2 1 5 4 7 8 5 9 6 2 0 2 4

3 6 2 3 3 3 2 5 4 7 8 9 1 2 0 3 2 5

9 8 5 2 6 3 0 1 7 4 2 4 5 0 3 6 8 6





Disadvantage

High cost; low frequency of use

Requires sampling frame

Does not use researchers’ expertise

Larger risk of random error than stratified



The population is divided into two or more groups called strata, according to some criterion, such as geographic location, grade level, age, or income, and subsamples are randomly selected from each stratum.

Elements within each stratum are homogeneous, but are heterogeneous across strata



Stratified Random Sampling



Types of Stratified Random Sampling

Proportionate Stratified Random Sampling

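The same proportion of sample units is selected from each stratum, so each stratum's share of the sample matches its share of the population

Disproportionate Stratified Random Sampling

The number of sample units per stratum is decided according to analytical considerations rather than stratum size; also called the equal allocation technique when the same number of units is taken from each stratum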
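Proportionate allocation can be sketched as follows. The strata names, their sizes, and the helper names `proportionate_allocation` and `stratified_sample` are hypothetical, chosen only to illustrate the idea.

```python
import random

def proportionate_allocation(stratum_sizes, n):
    """Allocate n sample units across strata in proportion to stratum size."""
    total = sum(stratum_sizes.values())
    # Rounding can make the allocations sum to slightly more or less than n;
    # this sketch does not correct for that.
    return {name: round(n * size / total) for name, size in stratum_sizes.items()}

def stratified_sample(strata, allocation, seed=0):
    """Draw a simple random sample (without replacement) inside each stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(units, allocation[name]) for name, units in strata.items()}

# Hypothetical population of 1,000 units in two strata.
strata = {
    "urban": [f"U{i}" for i in range(600)],
    "rural": [f"R{i}" for i in range(400)],
}
allocation = proportionate_allocation({k: len(v) for k, v in strata.items()}, n=50)
sample = stratified_sample(strata, allocation)
print(allocation)  # {'urban': 30, 'rural': 20}
```

For a disproportionate design, the `allocation` dictionary would simply be set by hand (for example 25 and 25) instead of being computed from the stratum sizes.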



Advantage

Assures representation of all groups in

sample population needed

Characteristics of each stratum can be

estimated and comparisons made

Reduces variability from systematic



Disadvantage

Requires accurate information on proportions of each stratum

Stratified lists costly to prepare



Cluster Sampling





[Figure: population divided into clusters, labelled Section 1 to Section 5]



Advantage

Low cost/high frequency of use

Requires list of all clusters, but only of individuals within chosen

clusters

Can estimate characteristics of both cluster and population

For multistage, has strengths of used methods

Researchers lack a good sampling frame for a dispersed population



Disadvantage

The cost to reach an element to sample is very high

Usually less expensive than SRS but not as accurate

Each stage in cluster sampling introduces sampling error—the more stages there are, the more error there tends to be



Systematic Random Sampling

Order all units in the sampling frame based on some variable, and then select every nth unit on the list

Gaps between selected elements are equal and constant, so the sample has periodicity.

n = sampling interval
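A minimal sketch of systematic selection follows. It assumes the sampling interval is the frame size divided by the desired sample size and that the starting point is chosen at random within the first interval; both are common conventions rather than something stated on the slide, and the frame is hypothetical.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Pick a random start in the first interval, then every k-th unit."""
    k = len(frame) // n        # sampling interval (assumed: frame size // sample size)
    rng = random.Random(seed)
    start = rng.randrange(k)   # random start position in 0..k-1
    return frame[start::k][:n]

frame = list(range(1, 101))    # hypothetical ordered frame of 100 units
sample = systematic_sample(frame, n=10, seed=1)
print(sample)
```

Because the gap between selected units is constant, any cycle in the ordering of the frame with the same period as the interval would be carried straight into the sample.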







Advantage

Moderate cost; moderate usage

External validity high; internal validity high;

statistical estimation of error

Simple to draw sample; easy to verify



Disadvantage

Periodic ordering of the list can bias the sample

Requires sampling frame



Multistage Random Sampling

[Figure: Multistage random sampling. Primary clusters (1 to 10); secondary clusters (1 to 15); simple random sampling within secondary clusters.]



Select all schools; then sample within schools

Sample schools; then measure all students

Sample schools; then sample students
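The third option (sample schools, then sample students) is a two-stage design and can be sketched as below. The school and student identifiers and the stage sizes are hypothetical.

```python
import random

rng = random.Random(7)

# Hypothetical population: 10 schools with 30 students each.
schools = {
    f"school_{s}": [f"s{s}_student_{i}" for i in range(30)]
    for s in range(10)
}

# Stage 1: sample schools (the clusters).
chosen_schools = rng.sample(sorted(schools), k=3)

# Stage 2: simple random sample of students within each chosen school.
two_stage_sample = {name: rng.sample(schools[name], k=5) for name in chosen_schools}

for name, students in two_stage_sample.items():
    print(name, students)
```

Measuring all students in the chosen schools (the second option) would replace the stage 2 draw with `schools[name]` itself.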



Non Probability Sampling



Involves non-random methods in the selection of the sample

Not all units have an equal chance of being selected

Selection depends upon the situation

Considerably less expensive

Convenient

Sample chosen in many ways



Purposive Sampling

Quota sampling (larger populations)

Snowball sampling

Self-selection sampling

Convenience sampling



Purposive Sampling

Also called judgment sampling

A sampling procedure in which an experienced researcher selects the sample based on some appropriate characteristic of the sample members, to serve a purpose

When taking the sample, reject people who do not fit a particular profile

Start with a purpose in mind





Demerit

Biased selection of the sample may occur

Time consuming process



Quota Sampling

The population is divided into cells on the basis of

relevant control characteristics.

A quota of sample units is established for each cell

A convenience sample is drawn for each cell until the

quota is met

It is entirely non random and it is normally used for

interview surveys
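The quota procedure above can be sketched as a loop that accepts cases in whatever order they turn up until each cell is full. The cells, quotas, and the stream of arrivals are hypothetical, and `quota_sample` is an illustrative helper name.

```python
def quota_sample(arrivals, cell_of, quotas):
    """Accept cases in arrival order until every cell's quota is filled."""
    filled = {cell: [] for cell in quotas}
    for case in arrivals:
        cell = cell_of(case)
        if cell in filled and len(filled[cell]) < quotas[cell]:
            filled[cell].append(case)
        if all(len(filled[c]) == quotas[c] for c in quotas):
            break  # every quota met, stop interviewing
    return filled

# Hypothetical stream of passers-by tagged with one control characteristic.
arrivals = [{"id": i, "gender": "F" if i % 3 == 0 else "M"} for i in range(200)]
result = quota_sample(arrivals, lambda c: c["gender"], {"F": 10, "M": 10})
print({cell: len(cases) for cell, cases in result.items()})  # {'F': 10, 'M': 10}
```

Nothing in the loop is random: who ends up in the sample depends entirely on who happens to arrive first, which is why variability and bias cannot be measured.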



Used when research budget limited

Very extensively used/understood

No need for list of population elements

Introduces some elements of stratification

Demerit

Variability and bias cannot be measured or controlled

Time Consuming

Projecting data beyond sample not justified



The researcher starts with a key person, who introduces the next one, and so on, forming a chain

Make contact with one or two cases in the population

Ask these cases to identify further cases.

Stop when either no new cases are given or the sample is

as large as manageable



Merit

Low cost

Useful in specific circumstances

Useful for locating rare populations

Demerit

Bias because sampling units are not independent




It occurs when you allow each case, usually an individual, to identify their desire to take part in the research. You therefore:

Publicize your need for cases, either by advertising through appropriate media or by asking them to take part

Collect data from those who respond



Merit

More accurate

Useful in specific circumstances to serve the purpose

Demerit

More costly due to advertising

Most of the population is left out



Also called accidental or incidental sampling

Selecting haphazardly those cases that are easiest to obtain

The most readily available cases are chosen

It is done at the “convenience” of the researcher

Convenience Sampling






Merit

Very low cost

Extensively used/understood

No need for list of population elements

Demerit

Variability and bias cannot be measured or controlled

Restriction of generalization



The ever-increasing demand for research has created a need for an efficient method of determining the sample size needed to be representative of a given population. In the article “Small Sample Techniques,” the research division of the National Education Association has published a formula for determining sample size. Regrettably, a table has not been available for ready, easy reference, although one could have been constructed using the following formula:

$$ s = \frac{X^2 N P (1 - P)}{d^2 (N - 1) + X^2 P (1 - P)} $$

s = required sample size.

X² = the table value of chi-square for 1 degree of freedom at the desired confidence level (3.841).

N = the population size.

P = the population proportion (assumed to be .50 since this would provide the maximum sample size).

d = the degree of accuracy expressed as a proportion (.05).
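The formula can be evaluated directly. The sketch below uses the default values stated in the definitions (X² = 3.841, P = .50, d = .05); the function name `required_sample_size` is illustrative, and rounding to the nearest whole unit is an assumption that happens to agree with the table entries checked here.

```python
def required_sample_size(N, chi2=3.841, P=0.5, d=0.05):
    """s = X^2 N P (1-P) / (d^2 (N-1) + X^2 P (1-P))."""
    numerator = chi2 * N * P * (1 - P)
    denominator = d**2 * (N - 1) + chi2 * P * (1 - P)
    return numerator / denominator

for N in (100, 1000, 10000):
    print(N, round(required_sample_size(N)))
# 100 80
# 1000 278
# 10000 370
```

As N grows, the result levels off near 384, which is why the table's last rows change so slowly.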





N = population size; S = required sample size

N S N S N S

10 10 220 140 1200 291

15 14 230 144 1300 297

20 19 240 148 1400 302

25 24 250 152 1500 306

30 28 260 155 1600 310

35 32 270 159 1700 313

40 36 280 162 1800 317

45 40 290 165 1900 320

50 44 300 169 2000 322

55 48 320 175 2200 327

60 52 340 181 2400 331

65 56 360 186 2600 335

70 59 380 191 2800 338

75 63 400 196 3000 341

80 66 420 201 3500 346

85 70 440 205 4000 351

90 73 460 210 4500 354

95 76 480 214 5000 357

100 80 500 217 6000 361

110 86 550 226 7000 364

120 92 600 234 8000 367

130 97 650 242 9000 368

140 103 700 248 10000 370

150 108 750 254 15000 375

160 113 800 260 20000 377

170 118 850 265 30000 379

180 123 900 269 40000 380

190 127 950 274 50000 381

200 132 1000 278 75000 382

210 136 1100 285 1000000 384





In addition to the purpose of the study and population size, three criteria usually will need to be specified to determine the appropriate sample size: the level of precision, the level of confidence or risk, and the degree of variability in the attributes being measured (Miaoulis and Michener, 1976). Each of these is reviewed below.



The level of precision, sometimes called sampling error, is the range in which the true value of the population is estimated to be. This range is often expressed in percentage points (e.g., ±5 percent) in the same way that results for political campaign polls are reported by the media. Thus, if a researcher finds that 60% of farmers in the sample have adopted a recommended practice with a precision rate of ±5%, then he or she can conclude that between 55% and 65% of farmers in the population have adopted the practice.



The confidence or risk level is based on ideas encompassed under the Central Limit Theorem. The key idea encompassed in the Central Limit Theorem is that when a population is repeatedly sampled, the average value of the attribute obtained by those samples is equal to the true population value. Furthermore, the values obtained by these samples are distributed normally about the true value, with some samples having a higher value and some obtaining a lower score than the true population value. In a normal distribution, approximately 95% of the sample values are within two standard deviations of the true population value (e.g., mean).

In other words, this means that if a 95% confidence level is selected, 95 out of 100 samples will have the true population value within the range of precision specified earlier (Figure 1). There is always a chance that the sample you obtain does not represent the true population value. Such samples with extreme values are represented by the shaded areas in Figure 1. This risk is reduced for 99% confidence levels and increased for 90% (or lower) confidence levels.



The third criterion, the degree of variability in the attributes being measured, refers to the distribution of attributes in the population. The more heterogeneous a population, the larger the sample size required to obtain a given level of precision. The less variable (more homogeneous) a population, the smaller the sample size. Note that a proportion of 50% indicates a greater level of variability than either 20% or 80%. This is because 20% and 80% indicate that a large majority do not or do, respectively, have the attribute of interest. Because a proportion of .5 indicates the maximum variability in a population, it is often used in determining a more conservative sample size, that is, the sample size may be larger than if the true variability of the population attribute were used.



There are several approaches to determining the sample size. These include using a census for small populations, imitating a sample size of similar studies, using published tables, and applying formulas to calculate a sample size. Each strategy is discussed below.



1. Using a Census for Small Populations

One approach is to use the entire population as the sample. Although cost considerations make this impossible for large populations, a census is attractive for small populations (e.g., 200 or less). A census eliminates sampling error and provides data on all the individuals in the population. In addition, some costs such as questionnaire design and developing the sampling frame are “fixed,” that is, they will be the same for samples of 50 or 200. Finally, virtually the entire population would have to be sampled in small populations to achieve a desirable level of precision.

2. Using a Sample Size of a Similar Study

Another approach is to use the same sample size as those of studies similar to the one you plan. Without reviewing the procedures employed in these studies you may run the risk of repeating errors that were made in determining the sample size for another study. However, a review of the literature in your discipline can provide guidance about “typical” sample sizes that are used.



3. Using Published Tables

A third way to determine sample size is to rely on published tables, which provide the sample size for a given set of criteria. Table 1 and Table 2 present sample sizes that would be necessary for given combinations of precision, confidence levels, and variability. Please note two things. First, these sample sizes reflect the number of obtained responses and not necessarily the number of surveys mailed or interviews planned (this number is often increased to compensate for nonresponse). Second, the sample sizes in Table 2 presume that the attributes being measured are distributed normally or nearly so. If this assumption cannot be met, then the entire population may need to be surveyed.





Sample Size (n) for Precision (e) of:

Size of Population    ±5%    ±7%    ±10%
100                    81     67     51
125                    96     78     56
150                   110     86     61
175                   122     94     64
200                   134    101     67
225                   144    107     70
250                   154    112     72
275                   163    117     74
300                   172    121     76
325                   180    125     77
350                   187    129     78
375                   194    132     80
400                   201    135     81
425                   207    138     82
450                   212    140     82



4. Using Formulas to Calculate a Sample Size

Although tables can provide a useful guide for determining the sample size, you may need to calculate the necessary sample size for a different combination of levels of precision, confidence, and variability. The fourth approach to determining sample size is the application of one of several formulas (Equation 5 was used to calculate the sample sizes in Table 1 and Table 2).



A. Formula For Calculating A Sample For Proportions

For populations that are large, Cochran (1963:75) developed Equation 1 to yield a representative sample for proportions:

$$ n_0 = \frac{Z^2 p q}{e^2} \tag{1} $$

where n₀ is the sample size, Z is the abscissa of the normal curve that cuts off an area α at the tails (1 − α equals the desired confidence level, e.g., 95%), e is the desired level of precision, p is the estimated proportion of an attribute that is present in the population, and q is 1 − p. The value for Z is found in statistical tables which contain the area under the normal curve.

To illustrate, suppose we wish to evaluate a state-wide Extension program in which farmers were encouraged to adopt a new practice. Assume there is a large population but that we do not know the variability in the proportion that will adopt the practice; therefore, assume p = .5 (maximum variability). Furthermore, suppose we desire a 95% confidence level and ±5% precision. The resulting sample size is demonstrated in Equation 2.

$$ n_0 = \frac{Z^2 p q}{e^2} = \frac{(1.96)^2 (.5)(.5)}{(.05)^2} = 385 \text{ farmers} \tag{2} $$
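The farmer example can be checked numerically. This is a minimal sketch, assuming ±5% precision is entered as e = 0.05 and that the result is rounded up to the next whole respondent; the function name `cochran_n0` is illustrative.

```python
import math

def cochran_n0(z, p, e):
    """Cochran's sample size for a proportion in a large population."""
    q = 1 - p
    return z**2 * p * q / e**2

n0 = math.ceil(cochran_n0(z=1.96, p=0.5, e=0.05))
print(n0)  # 385
```

Changing `p` away from 0.5 makes the product p × q smaller, so the required sample size can only go down.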



B. Finite Population Correction For Proportions

If the population is small then the sample size can be reduced slightly. This is because a given sample size provides proportionately more information for a small population than for a large population. The sample size (n₀) can be adjusted using Equation 3.

$$ n = \frac{n_0}{1 + \dfrac{n_0 - 1}{N}} \tag{3} $$

where n is the sample size and N is the population size.

Suppose our evaluation of farmers' adoption of the new practice only affected 2,000 farmers. The sample size that would now be necessary is shown in Equation 4.

$$ n = \frac{n_0}{1 + \dfrac{n_0 - 1}{N}} = \frac{385}{1 + \dfrac{385 - 1}{2000}} = 323 \text{ farmers} \tag{4} $$

As you can see, this adjustment (called the finite population correction) can substantially reduce the necessary sample size for small populations.
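The correction is a one-line calculation. The sketch below reuses the figures from the example (n₀ = 385, N = 2,000) and assumes rounding up to the next whole farmer; the function name is illustrative.

```python
import math

def finite_population_correction(n0, N):
    """Adjust a large-population sample size n0 for a population of size N."""
    return n0 / (1 + (n0 - 1) / N)

n = math.ceil(finite_population_correction(n0=385, N=2000))
print(n)  # 323
```

For a very large N the denominator approaches 1 and the corrected size approaches n₀ again.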



A Simplified Formula For Proportions

Yamane (1967:886) provides a simplified formula to calculate sample sizes. This formula was used to calculate the sample sizes in Tables 2 and 3 and is shown below. A 95% confidence level and P = .5 are assumed for Equation 5.

$$ n = \frac{N}{1 + N e^2} \tag{5} $$

where n is the sample size, N is the population size, and e is the level of precision. When this formula is applied to the above sample, we get Equation 6.

$$ n = \frac{N}{1 + N e^2} = \frac{2000}{1 + 2000 (.05)^2} = 333 \text{ farmers} \tag{6} $$
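A short sketch of the simplified formula, using the same population of 2,000 farmers and e = 0.05 for ±5% precision. The function name `yamane` and the rounding to the nearest whole number are illustrative choices.

```python
def yamane(N, e):
    """Simplified sample size: n = N / (1 + N * e^2)."""
    return N / (1 + N * e**2)

print(round(yamane(N=2000, e=0.05)))  # 333
```

The result is slightly larger than the 323 obtained from the finite population correction because the simplified formula in effect rounds Z = 1.96 up to 2.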



Formula For Sample Size For The Mean

The use of tables and formulas to determine sample size in the above discussion employed proportions that assume a dichotomous response for the attributes being measured. There are two methods to determine sample size for variables that are polytomous or continuous. One method is to combine responses into two categories and then use a sample size based on proportion (Smith, 1983). The second method is to use the formula for the sample size for the mean. The formula of the sample size for the mean is similar to that of the proportion, except for the measure of variability. The formula for the mean employs σ² instead of (p × q), as shown in Equation 7.

$$ n_0 = \frac{Z^2 \sigma^2}{e^2} \tag{7} $$

where n₀ is the sample size, Z is the abscissa of the normal curve that cuts off an area α at the tails, e is the desired level of precision (in the same unit of measure as the variance), and σ² is the variance of an attribute in the population. The disadvantage of the sample size based on the mean is that a “good” estimate of the population variance is necessary.

Often, an estimate is not available. Furthermore, the sample size can vary widely from one attribute to another because each is likely to have a different variance. Because of these problems, the sample size for the proportion is frequently preferred.
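To show how sensitive the mean formula is to the variance estimate, here is a small sketch. The inputs (a standard deviation of 10 or 20 and a precision of ±2, in the same units) are hypothetical illustration values, and the function name `n0_for_mean` is not from the source.

```python
import math

def n0_for_mean(z, sigma, e):
    """Sample size for estimating a mean: n0 = z^2 * sigma^2 / e^2."""
    return z**2 * sigma**2 / e**2

print(math.ceil(n0_for_mean(z=1.96, sigma=10, e=2)))  # 97
print(math.ceil(n0_for_mean(z=1.96, sigma=20, e=2)))  # 385
```

Doubling the assumed standard deviation quadruples the required sample, which is why a poor variance estimate is such a problem for this method.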





In addition, an adjustment in the sample size may be needed to accommodate a comparative analysis of subgroups (e.g., such as an evaluation of program participants with nonparticipants). Sudman (1976) suggests that a minimum of 100 elements is needed for each major group or subgroup in the sample and for each minor subgroup, a sample of 20 to 50 elements is necessary. Similarly, Kish (1965) says that 30 to 200 elements are sufficient when the attribute is present 20 to 80 percent of the time (i.e., the distribution approaches normality).

On the other hand, skewed distributions can result in serious departures from normality even for moderate size samples (Kish, 1965:17). Then a larger sample or a census is required.

Finally, the sample size formulas provide the number of responses that need to be obtained. Many researchers commonly add 10% to the sample size to compensate for persons that the researcher is unable to contact. The sample size also is often increased by 30% to compensate for nonresponse. Thus, the number of mailed surveys or planned interviews can be substantially larger than the number required for a desired level of confidence and precision.
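As a rough sketch of that last adjustment, the two allowances can be applied to a required number of responses. Applying them one after the other (compounding) is an assumption made here for illustration; the text does not say whether the percentages should be compounded or simply added, and the function name is hypothetical.

```python
import math

def surveys_to_send(n_required, uncontactable=0.10, nonresponse=0.30):
    """Inflate the required responses for non-contact, then for nonresponse."""
    return math.ceil(n_required * (1 + uncontactable) * (1 + nonresponse))

print(surveys_to_send(385))  # 551
```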



A hypothesis is a kind of truth claim about some aspect of the world: for instance, the attitudes of patients or the prevalence of a disease in a population. Research sets out to try to prove this truth claim (or, more properly, to reject the null hypothesis - a truth claim phrased as a negative). For example, let us think about the following hypothesis:

Levels of Efficiency are affected by Satisfaction

and the related null hypothesis:

Levels of Efficiency are not affected by Satisfaction

Let us imagine that we have this as our research hypothesis, and we are planning research to test it. We will undertake a trial, comparing groups of employees who are working in different organizations, to assess the extent of efficiency in these different groupings. Obviously the findings of a study, while interesting in themselves, only have value if they can be generalised, to discover something about the topic which can be applied in other organizations. If we find an association, then we will want to do something to increase efficiency (by increasing satisfaction). So our study has to have external validity, that is, the capacity to be generalised beyond the subjects actually in the study.





Sampling Errors

Non-Sampling Errors



Error caused by the act of taking a sample

They cause sample results to differ from the results of a census

Differences between the sample and the population that exist only because of the observations that happened to be selected for the sample

Statistical errors are sampling errors

We have no control over them



Non Response Error

Response Error

Not controlled by sample size







respondent gives an incorrect answer, e.g. due to prestige or competence

implications, or due to sensitivity or social undesirability of question

respondent misunderstands the requirements

lack of motivation to give an accurate answer

“lazy” respondent gives an “average” answer

question requires memory/recall

proxy respondents are used, i.e. taking answers from someone other than

the respondent





The question is unclear, ambiguous or difficult to answer

The list of possible answers suggested in the recording instrument is incomplete

Requested information assumes a framework unfamiliar to the respondent

The definitions used by the survey are different from those used by the respondent, e.g. how many part-time employees do you have? (See next slide for an example)



Non-sampling errors are inevitable in the production of national statistics. It is important that:

At the planning stage, all potential non-sampling errors are listed and steps to minimise them are considered.

If data are collected from other sources, question the procedures adopted for data collection and data verification at each step of the data chain.

Critically view the data collected and attempt to resolve queries immediately as they arise.

Document sources of non-sampling errors so that results presented can be interpreted meaningfully.



What any researcher wants is to be right! They want to discover that there is an association between two variables: say, asthma and traffic pollution, but only if such an association really exists. If there is no such association, then they want their study to support the null hypothesis that the two are not related. (While the former may be more exciting, both are important findings).

What no researcher wants is to be wrong! No-one wants to find an association which does not really exist, or - just as importantly - not find an association which does exist.

Both such situations can arise in any piece of research. The first (finding an association which is not really there) is called a Type I error. It is the error of falsely rejecting a true null hypothesis. (Think through this carefully. What we are talking about here could also be called a false positive. An example would be a study which rejects the null hypothesis that there is no association between ill-health and deprivation. The findings suggest such an association, but in reality, no such relationship exists.)



The measurement of such generalisability of a study is done by statistical tests of inference. You may be familiar with some such tests: tests such as the chi-squared test, the t-test, and tests of correlation. We will not look at these tests in any detail, but we need to understand that the purpose of these and other tests of statistical inference is to assess the extent to which the findings of a study can be accepted as valid for the population from which the study sample has been drawn. If the statistics we use suggest that the findings are 'true', then we can be happy to conclude (within certain limits of probability), that the study's findings can be generalised, and we can act on them (to improve nutrition among children under five years, for instance).

From common sense, we see that the larger the sample is, the easier it is to be satisfied that it is representative of the population from which it is drawn: but how large does it need to be? This is the question that we need to answer, and to do so, we need to think a little more about the possibilities that our findings may not reflect reality: that we have committed an error in our conclusions.





For any piece of research that tries to make inferences from a sample to a population there are four possible outcomes: two are desirable, two render the research worthless.

Figure 1 shows these four possible outcomes diagrammatically.

Figure 1. Null hypothesis is:

                  POPULATION: False                POPULATION: True
STUDY: False      Cell 1: Correct result           Cell 2: Type I error (alpha)
STUDY: True       Cell 3: Type II error (beta)     Cell 4: Correct result





Cell 3. This cell similarly reflects an undesirable outcome of a study. Here, as in Cell 4, a study supports the null hypothesis, implying that there is no association between ill health and deprivation in the population under investigation. But in reality, the null hypothesis is false and there is an association in the real world which the study does not find. This mistake is the Type II error of accepting a false null hypothesis, and is the result of having a sample size which is too small to allow detection of the association by statistical tests at an acceptable level of significance (say p = 0.05). The likelihood of committing a Type II error is the beta (β) value of a statistical test, and the value (1 - β) is the statistical power of the test. Thus the statistical power of a test is the likelihood of avoiding a Type II error, i.e. the probability that the test will reject the null hypothesis when the null hypothesis is false. Conventionally, a value of 0.80 or 80% is the target value for statistical power, representing a likelihood that four times out of five a study will reject a false null hypothesis, although values greater than 80%, e.g. 90%, are also sometimes used. Outcomes of studies which fall into Cell 3 are incorrect; β or its complement (1 - β) are the measures of the likelihood of such an outcome of a study.





Not all quantitative studies involve hypothesis-testing; some studies merely seek to describe the phenomena under examination. Whereas hypothesis testing will involve comparing the characteristics of two or more groups, a descriptive survey may be concerned solely with describing the characteristics of a single group. The aim of this type of survey is often to obtain an accurate estimate of a particular figure, such as a mean or a proportion. For example, we may want to know how many times, in an average week, a general practitioner sees patients newly presenting with asthma. In addition we may also want to know what proportion of these patients admit to smoking five or more cigarettes a day. In these circumstances, the aim is not to compare this figure with another group, but rather, to accurately reflect the real figure in the wider population.





2. The degree of precision which we can accept. This is often presented in the form of a confidence interval. For example, a survey of a sample of patients indicates that 35 per cent smoke. Are we willing to accept that the figure for the wider population lies between 25 and 45 per cent (allowing a margin for random error (MRE) of 10% either way), or do we want to be more precise, such that the confidence interval is three per cent each way, and the true figure falls between 32 and 38 per cent? As we can see from the following table, the smaller the allowed margin for random error, the larger the sample must be.

Margin for random error    Sample size
+ or – 10%                   88
+ or – 5%                   350
+ or – 3%                   971
+ or – 2%                  2188
+ or – 1%                  8750



How large must a sample be to estimate the mean value of the population?

Suppose we wish to measure the number of times that the average patient with asthma consults her/his general practitioner for treatment.

a) First, the SE (standard error) is calculated by deciding upon the accuracy level which you require. If, for instance, you wish your survey to produce a very accurate answer with only a small confidence interval, then you might decide that you want to be 95% confident that the mean average figure produced by your survey is no more than plus or minus two visits to the GP.

For example, if you thought that your survey might produce a mean estimate of 12.5 visits per year, then your confidence interval in this case would be 12.5 ± two visits. Your confidence interval would then tell you that you could reasonably (more detail on what 'reasonably' means below!) expect the true average rate of visits in the population to be somewhere between 10.5 and 14.5 visits per year.

Now decide on your required significance level. If you decide on 95% (meaning that 19 times out of 20 the true population mean falls within the confidence limits of 10.5 and 14.5 visits), the standard error is calculated by dividing the MRE by 1.96. So, in this case, the standard error is 2 divided by 1.96 = 1.02.

If you want a 95% confidence interval, then divide the maximum acceptable MRE (margin for random error) by 1.96 to calculate the SE.

If instead you want a 99% confidence interval, then divide the maximum



b) The formula to calculate the sample size for a mean (or point) estimate is:

$$ N = \left( \frac{SD}{SE} \right)^2 $$

where N = the required sample size, SD = the standard deviation, and SE = the standard error of the mean.

The standard deviation could be estimated either by looking at some previous study or by carrying out a pilot study. Suppose that previous data showed that the standard deviation of the number of visits made to a GP in a year was 10, then we would input this into the formula as follows:

$$ N = \left( \frac{SD}{SE} \right)^2 = \left( \frac{10}{1.02} \right)^2 = (9.804)^2 = 96.12 \rightarrow 97 \text{ (rounded up to the next whole patient)} $$

If we are to be 95% confident that the answer achieved is correct ± two visits, then the required sample is 97 - before making allowance for a proportion of the people leaving the study early and failing to provide outcome data.
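The two steps (margin for random error to standard error, then standard error to sample size) can be combined in a short sketch. The function name is illustrative, the result is rounded up, and the standard error is kept unrounded, so intermediate figures differ slightly from the hand calculation while the final answer is the same. The 2.58 multiplier for 99% confidence is the usual normal-table value and is an assumption here, not a figure from this slide.

```python
import math

def sample_size_for_mean(sd, mre, z=1.96):
    """Sample size so that the mean is estimated within +/- mre."""
    se = mre / z                  # standard error implied by the margin for random error
    return math.ceil((sd / se) ** 2)

print(sample_size_for_mean(sd=10, mre=2))          # 97  (95% confidence)
print(sample_size_for_mean(sd=10, mre=2, z=2.58))  # 167 (99% confidence, assumed z)
```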





$$ N = \frac{P (100\% - P)}{(SE)^2} $$

With P = 70% and SE = 2.55 (a margin for random error of 5% divided by 1.96), we have:

$$ N = \frac{70 \times (100 - 70)}{(2.55)^2} = \frac{2100}{6.50} = 323.08 \rightarrow 324 \text{ (rounded upwards)} $$

So, in order to be 95% confident that the true proportion of people saying they are satisfied lies within ± 5% of the answer, we will require a sample size of 324. This assumes that the likely answer is around 70% with a range between 65% and 75%.

Of course, in real life, we often have absolutely no idea what the likely proportion is going to be. There may be no previous data and no time to carry out a pilot. In these circumstances, it is safer to assume the worst case scenario and assume that the proportion is likely to be 50%. Other things being equal, this will allow for the largest possible sample size - and in most circumstances it is preferable to have a slight overestimate of the number of people needed, rather than an underestimate.

(If we wished to use a 99% level of significance, so we might be 99% confident that our confidence parameters include the true figure, then we need to divide the margin for random error by 2.58. In this case, the standard error would be 5/2.58 = 1.94. Using the formula above, we find that this would require a sample size of 558.)



As we saw earlier in this pack, studies which test hypotheses (seeking to generalise from a study to a population) need sufficient power to minimise the likelihood of Type I and Type II errors. Both statistical significance and statistical power are affected by sample size. The chances of gaining a statistically significant result will be increased by enlarging a study's sample. Put another way, the statistical power of a study is enhanced as sample size increases. Let us look at each of these aspects of inferential research in turn.



When a researcher uses a statistical test, what they are doing is testing their results against a gold standard. If the test gives a positive result (this is usually known as 'achieving statistical significance'), then they can be relatively satisfied that their results are 'true', and that the real world situation is that discovered in the study (Cell 1 in Fig 1). If the test does not give significant results (non-significant or NS), then they can be reasonably satisfied that the results reflect Cell 4, where they have found no association and no such association exists.

However, we can never be absolutely certain that we have a result which falls in Cells 1 or 4. Statistical significance represents the likelihood of committing a Type I error (Cell 2). Let us imagine that we have results suggesting an association between ill-health and deprivation, and a t-test (a test to compare the results of two different groups) gives a value which indicates that at the 5% or 0.05 level of statistical significance, there is more ill-health among a group of high scorers on the Jarman Index of deprivation than among a group of low scorers.

What this means is that, if there were in fact no difference between the groups, a result at least this extreme would arise by chance, from random associations in the sample we chose, only five per cent of the time. We can therefore be reasonably confident that the result reflects a true effect (Cell 1). If the t-test value is higher, we might reach 1% or 0.01 significance. Now, such a result would arise as a chance association only one per cent of the time.

Tests of statistical significance are designed to account for sample size; thus the larger a sample, the 'easier' it is for results to reach significance. A study which compares two groups of 10 patients will have to demonstrate a much greater difference between the groups than a study with 1000 patients in each group. This is fair: the larger study is much more likely to be 'representative' of a population than the smaller one. To summarize: statistical significance is a measure of the likelihood that positive results reflect a real effect, and that the findings can be used to make conclusions about differences which really exist.



Because of the way statistical tests are designed, as we have just seen, they build in a safety margin to avoid generalising false positive results which could have disastrous or expensive consequences. But researchers who use small samples also run the risk of not being able to demonstrate differences or associations which really do exist. Thus they are in danger of committing a Type II error (Cell 3 in Fig 1), of accepting a false null hypothesis. Such studies are 'under-powered', not possessing sufficient statistical power to detect the effects they set out to detect. Conventionally, the target is a power of 80% or 0.8, meaning that a study has an 80 per cent likelihood of detecting a difference or association which really exists.

Examination of research undertaken in various fields of study suggests that many studies do not meet this 0.8 conventional target for power (Fox and Mathers 1997).

What this means is that many studies have a much reduced likelihood of being able to discern the effects which they set out to seek: a study with a power of 0.66 for some specified treatment effect will only detect that effect (if true) two times out of three. A non-significant finding of a study may thus simply reflect the inadequate power of the study to detect differences or associations at levels which are conventionally accepted as statistically significant.



When a study has only small (say less than 50%) power to detect a useful result, one must ask the simple question of such research: 'Why did you bother when your study had little chance of finding what you set out to find?'

Sample size calculations need to be undertaken prior to a study to avoid both the wasteful consequences of under-powering and those of over-powering, in which sample sizes are excessively large, with higher than necessary study costs and, perhaps, the needless involvement of too many patients, which has ethical implications.

Statistical power calculations are also sometimes undertaken after a study has been completed, to assess the likelihood of a study having discovered effects.

Statistical power is a function of three variables: sample size, the chosen level of statistical significance (α) and effect size. While calculation of power entails recourse to tables of values for these variables, the calculation is relatively straightforward in most cases.
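How the three variables interact can be illustrated with a rough Python sketch. It is not the tabulated method the text refers to: it assumes a two-sided comparison of two equal-sized groups, a standardised effect size (difference in means divided by the common standard deviation) and a normal approximation, none of which are specified in the lecture. The function name is hypothetical.

```python
import math
from statistics import NormalDist

def approx_power(n_per_group, effect_size, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison of means.

    Uses the normal approximation and ignores the negligible chance of
    rejecting in the wrong direction.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)          # e.g. about 1.96 for alpha = .05
    shift = effect_size * math.sqrt(n_per_group / 2)      # expected size of the test statistic
    return NormalDist().cdf(shift - z_crit)

# Same effect size and alpha, three different sample sizes per group
for n in (10, 64, 1000):
    print(n, round(approx_power(n, effect_size=0.5), 2))
```

Holding the effect size and α fixed, power rises with sample size; under these assumptions roughly 64 people per group are needed to reach the conventional 0.8 for a medium standardised effect of 0.5.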



As was mentioned earlier, there is a trade-off between significance and power, because as one tries to reduce the chances of generating false negative results, the likelihood of a false positive result increases. Researchers need to decide which is more crucial, and set the significance level accordingly. In Exercise 3 you were asked to decide, in various situations, whether a Type I or Type II error was more serious, based on clinical and other criteria.

Fortunately, both statistical significance and power are increased by increasing sample size, so increasing sample size will reduce the likelihood of both Type I and Type II errors. However, that does not mean that researchers necessarily need to vastly increase the size of their samples, at great expense of time and resources.

The other factor affecting the power of a study is the effect size (ES) which is under investigation in the study. This is a measure of 'how wrong the null hypothesis is'. For example, we might compare the efficacy of two bronchodilators for treating an asthma attack. The ES is the difference in efficacy between the two drugs. An effect size may be a difference between groups or the strength of an association between variables such as ill-health and deprivation.

If an ES is small, then many studies with small sample sizes are likely to be underpowered. But if an ES is large, then a relatively small scale study could have sufficient power to identify the effect under investigation. It is sometimes possible to increase the effect size (for example, by making more extreme comparisons, or undertaking a longer or more powerful intervention), but usually this is the intractable element in the equation, and accurate estimation of the effect size is essential for calculating power before a study begins, and hence the necessary sample size.
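The dependence of the necessary sample size on effect size can be sketched as follows. As before, this is a hedged illustration rather than the lecture's own procedure: it assumes two equal groups, a standardised effect size and the normal approximation n = 2((z_α/2 + z_β)/ES)², and the function name is made up for the example.

```python
import math
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.8):
    """Approximate number of subjects per group for a two-group comparison."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided significance level
    z_beta = NormalDist().inv_cdf(power)            # target power
    return math.ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)

# Small, medium and large standardised effects
for es in (0.2, 0.5, 0.8):
    print(es, n_per_group(es))
```

Because the effect size is squared in the denominator, halving it roughly quadruples the required sample, which is why a poor estimate of the ES can badly under- or over-power a study.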



However, when critiquing business education research, Wunsch (1986) stated that “two of the most consistent flaws included (1) disregard for sampling error when determining sample size, and (2) disregard for response and nonresponse bias” (p. 31).



The question then is, how large a sample is required to infer research findings back to a population? Standard textbook authors and researchers offer tested methods that allow studies to take full advantage of statistical measurements, which in turn give researchers the upper hand in determining the correct sample size. Sample size is one of the four inter-related features of a study design that can influence the detection of significant differences, relationships or interactions (Peers, 1996). Generally, these survey designs try to minimize both alpha error (finding a difference that does not actually exist in the population) and beta error (failing to find a difference that actually exists in the population) (Peers, 1996).



However, improvement is needed. Researchers are learning experimental statistics from highly competent statisticians and then doing their best to apply the formulas and approaches they learn to their research design. A simple survey of published manuscripts reveals numerous errors and questionable approaches to sample size selection, and serves as proof that improvement is needed. Many researchers could benefit from a real-life primer on the tools needed to properly conduct research, including, but not limited to, sample size selection.



Primary Variables of Measurement

The researcher must make decisions as to which variables will be incorporated into formula calculations. For example, if the researcher plans to use a seven-point scale to measure a continuous variable, e.g., job satisfaction, and also plans to determine if the respondents differ by certain categorical variables, e.g., gender, tenure, educational level, etc., which variable(s) should be used as the basis for sample size? This is important because the use of gender as the primary variable will result in a substantially larger sample size than if one used the seven-point scale as the primary variable of measure. Cochran (1977) addressed this issue by stating that “One method of determining sample size is to specify margins of error for the items that are regarded as most vital to the survey. An estimation of the sample size needed is first made separately for each of these important items” (p. 81). When these calculations are completed, researchers will have a range of n's, usually ranging from smaller n's for scaled, continuous variables, to larger n's for dichotomous or categorical variables. The researcher should make sampling decisions based on these data. If the n's for the variables of interest are relatively close, the researcher can simply use the largest n as the sample size and be confident that the sample size will provide the desired results.



More commonly, there is a sufficient variation among the n's so that we are reluctant to choose the largest, either from budgetary considerations or because this will give an over-all standard of precision substantially higher than originally contemplated. In this event, the desired standard of precision may be relaxed for certain of the items, in order to permit the use of a smaller value of n (Cochran, 1977, p. 81).

The researcher may also decide to use this information in deciding whether to keep all of the variables identified in the study. “In some cases, the n's are so discordant that certain of them must be dropped from the inquiry; . . .” (Cochran, 1977, p. 81).



Cochran's (1977) formula uses two key factors: (1) the risk the researcher is willing to accept in the study, commonly called the margin of error, or the error the researcher is willing to accept, and (2) the alpha level, the level of acceptable risk the researcher is willing to accept that the true margin of error exceeds the acceptable margin of error; i.e., the probability that differences revealed by statistical analyses really do not exist; also known as Type I error. Another type of error will not be addressed further here, namely, Type II error, also known as beta error. Type II error occurs when statistical procedures result in a judgment of no significant differences when these differences do indeed exist.





Acceptable Margin of Error. The general rule relative to acceptable margins of error in educational and social research is as follows: For categorical data, 5% margin of error is acceptable, and, for continuous data, 3% margin of error is acceptable (Krejcie & Morgan, 1970). For example, a 3% margin of error would result in the researcher being confident that the true mean of a seven point scale is within ±.21 (.03 times seven points on the scale) of the mean calculated from the research sample. For a dichotomous variable, a 5% margin of error would result in the researcher being confident that the proportion of respondents who were male was within ±5% of the proportion calculated from the research sample.

Researchers may increase these values when a higher margin of error is acceptable or may decrease these values when a higher degree of precision is needed.



A critical component of sample size formulas is the estimation of variance in the primary variables of interest in the study. The researcher does not have direct control over variance and must incorporate variance estimates into research design. Cochran (1977) listed four ways of estimating population variances for sample size determinations: (1) take the sample in two steps, and use the results of the first step to determine how many additional responses are needed to attain an appropriate sample size based on the variance observed in the first step data; (2) use pilot study results; (3) use data from previous studies of the same or a similar population; or (4) estimate or guess the structure of the population assisted by some logical mathematical results. The first three ways are logical and produce valid estimates of variance; therefore, they do not need to be discussed further.

However, in many educational and social research studies, it is not feasible to use any of the first three ways and the researcher must estimate variance using the fourth method.



A researcher typically needs to estimate the variance of scaled and categorical variables. To estimate the variance of a scaled variable, one must determine the inclusive range of the scale, and then divide by the number of standard deviations that would include all possible values in the range, and then square this number. For example, if a researcher used a seven-point scale and given that six standard deviations (three to each side of the mean) would capture 98% of all responses, the calculations would be as follows:

$$S = \frac{7 \text{ (number of points on the scale)}}{6 \text{ (number of standard deviations)}} = 1.167$$

When estimating the variance of a dichotomous (proportional) variable such as gender, Krejcie and Morgan (1970) recommended that researchers should use .50 as an estimate of the population proportion. This proportion will result in the maximization of variance, which will also produce the maximum sample size. This proportion can be used to estimate variance in the population. For example, squaring .50 (that is, p × q = .50 × .50) will result in a population variance estimate of .25 for a dichotomous variable.
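Both variance estimates can be checked with a few lines of Python. The helper names `sd_from_scale` and `proportion_variance` are illustrative assumptions; the sketch simply tabulates p(1 - p) over a grid to show where the variance of a dichotomous variable peaks.

```python
def sd_from_scale(points, n_sd=6):
    """Estimated standard deviation: inclusive scale range / number of SDs spanned."""
    return points / n_sd

def proportion_variance(p):
    """Variance of a dichotomous variable with population proportion p."""
    return p * (1 - p)

s = sd_from_scale(7)    # seven-point scale, six standard deviations
print(round(s, 3), round(s ** 2, 3))   # estimated SD and variance

# Variance of a proportion for p = 0.1, 0.2, ..., 0.9
variances = {p / 10: proportion_variance(p / 10) for p in range(1, 10)}
p_max = max(variances, key=variances.get)
print(p_max, variances[p_max])
```

The grid confirms that .50 is the most conservative choice: any other proportion gives a smaller variance and therefore a smaller computed sample size.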



Before proceeding with sample size calculations, assuming continuous data, the researcher should determine if a categorical variable will play a primary role in data analysis. If so, the categorical sample size formulas should be used. If this is not the case, the sample size formulas for continuous data described in this section are appropriate.

Assume that a researcher has set the alpha level a priori at .05, plans to use a seven point scale, has set the level of acceptable error at 3%, and has estimated the standard deviation of the scale as 1.167. Cochran's sample size formula for continuous data and an example of its use are presented here along with explanations as to how these decisions were made.

$$n_0 = \frac{(t)^2 \, (s)^2}{(d)^2} = \frac{(1.96)^2 (1.167)^2}{(7 \times .03)^2} = 118$$

Where t = value for selected alpha level of .025 in each tail = 1.96 (the alpha level of .05 indicates the level of risk the researcher is willing to take that true margin of error may exceed the acceptable margin of error).



Where s = estimate of standard deviation in the population = 1.167 (estimate of standard deviation for 7 point scale calculated by using 7 [inclusive range of scale] divided by 6 [number of standard deviations that include almost all (approximately 98%) of the possible values in the range]).

Where d = acceptable margin of error for mean being estimated = .21 (number of points on primary scale * acceptable margin of error; points on primary scale = 7; acceptable margin of error = .03 [error researcher is willing to accept]).

Therefore, for a population of 1,679, the required sample size is 118. However, since this sample size exceeds 5% of the population (1,679*.05=84), Cochran's (1977) correction formula should be used to calculate the final sample size. These calculations are as follows:

$$n_1 = \frac{n_0}{1 + n_0 / \text{Population}} = \frac{118}{1 + 118/1679} = 111$$

Where population size = 1,679.

Where n0 = required return sample size according to Cochran's formula = 118.

Where n1 = required return sample size because sample > 5% of population.
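The continuous-data example can be reproduced with a short Python sketch. The function names are assumptions made for illustration, and the functions return unrounded values so the rounding choices stay visible: the uncorrected result is about 118.6, which the text reports as 118, and the corrected figure is rounded up to 111.

```python
import math

def cochran_continuous(t, s, d):
    """Cochran's formula for continuous data: n0 = t^2 * s^2 / d^2."""
    return (t ** 2) * (s ** 2) / (d ** 2)

def finite_population_correction(n0, population):
    """Cochran's correction, used when n0 exceeds 5% of the population."""
    return n0 / (1 + n0 / population)

population = 1679
n0 = cochran_continuous(t=1.96, s=7 / 6, d=7 * 0.03)
print(round(n0, 1))                      # uncorrected sample size, before rounding

n0_reported = 118                        # value used in the text
needs_correction = n0_reported > 0.05 * population
n1 = finite_population_correction(n0_reported, population)
print(needs_correction, math.ceil(n1))
```

Rounding the corrected value up rather than to the nearest integer keeps the result a true minimum returned sample size.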



These procedures result in the minimum returned sample size. If a researcher has a captive audience, this sample size may be attained easily. However, since many educational and social research studies often use data collection methods such as surveys and other voluntary participation methods, the response rates are typically well below 100%. Salkind (1997) recommended oversampling when he stated that “If you are mailing out surveys or questionnaires, . . . . Count on increasing your sample size by 40%-50% to account for lost mail and uncooperative subjects” (p. 107). Fink (1995) stated that “Oversampling can add costs to the survey but is often necessary” (p. 36). Cochran (1977) stated that “A second consequence is, of course, that the variances of estimates are increased because the sample actually obtained is smaller than the target sample. This factor can be allowed for, at least approximately, in selecting the size of the sample” (p. 396).

However, many researchers criticize the use of over-sampling to ensure that this minimum sample size is achieved, and suggestions on how to secure the minimal sample size are scarce.



If the researcher decides to use oversampling, four methods may be used to determine the anticipated response rate: (1) take the sample in two steps, and use the results of the first step to estimate how many additional responses may be expected from the second step; (2) use pilot study results; (3) use response rates from previous studies of the same or a similar population; or (4) estimate the response rate. The first three ways are logical and will produce valid estimates of response rates; therefore, they do not need to be discussed further. Estimating response rates is not an exact science. A researcher may be able to consult other researchers or review the research literature in similar fields to determine the response rates that have been achieved with similar and, if necessary, dissimilar populations.

Therefore, in this example, it was anticipated that a response rate of 65% would be achieved based on prior research experience. Given a required minimum sample size (corrected) of 111, the following calculations were used to determine the drawn sample size required to produce the minimum sample size:

Where anticipated return rate = 65%.

Where n2 = sample size adjusted for response rate.

Where minimum sample size (corrected) = 111.

Therefore, n2 = 111/.65 = 171.
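The oversampling adjustment is a single division, sketched here in Python with a hypothetical helper name. It rounds up so that the expected number of returns is never below the minimum.

```python
import math

def drawn_sample_size(min_returned, response_rate):
    """Sample to draw so that the expected returns reach min_returned."""
    if not 0 < response_rate <= 1:
        raise ValueError("response_rate must be a fraction between 0 and 1")
    return math.ceil(min_returned / response_rate)

n2 = drawn_sample_size(min_returned=111, response_rate=0.65)
print(n2)
```

The result is only as good as the anticipated response rate; a lower actual rate leaves the study short of its minimum returned sample.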



The sample size formulas and procedures used for categorical data are very similar, but some variations do exist. Assume a researcher has set the alpha level a priori at .05, plans to use a proportional variable, has set the level of acceptable error at 5%, and has estimated the standard deviation of the scale as .5. Cochran's sample size formula for categorical data and an example of its use are presented here along with explanations as to how these decisions were made.

$$n_0 = \frac{(t)^2 \, (p)(q)}{(d)^2} = \frac{(1.96)^2 (.5)(.5)}{(.05)^2} = 384$$






As in the continuous data example, this sample size exceeds 5% of the population (1,679*.05=84), so Cochran's (1977) correction formula should be used. These calculations are as follows:

$$n_1 = \frac{n_0}{1 + n_0 / \text{Population}} = \frac{384}{1 + 384/1679} = 313$$

Where population size = 1,679.

Where n0 = required return sample size according to Cochran's formula = 384.

Where n1 = required return sample size because sample > 5% of population.

These procedures result in a minimum returned sample size of 313. Using the same oversampling procedures as cited in the continuous data example, and again assuming a response rate of 65%, a minimum drawn sample size of 482 should be used. These calculations were based on the following:

Where anticipated return rate = 65%.

Where n2 = sample size adjusted for response rate.

Where minimum sample size (corrected) = 313.

Therefore, n2 = 313/.65 = 482.
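The whole categorical-data chain (Cochran's formula, the 5% check, the correction and the response-rate adjustment) can be strung together in one Python sketch. The function names are illustrative assumptions and the rounding mirrors the worked example: the uncorrected figure is rounded to the nearest integer, later steps are rounded up.

```python
import math

def cochran_categorical(t, p, d):
    """Cochran's formula for categorical data: n0 = t^2 * p * q / d^2."""
    q = 1 - p
    return (t ** 2) * p * q / (d ** 2)

def finite_population_correction(n0, population):
    """Cochran's correction for samples above 5% of the population."""
    return n0 / (1 + n0 / population)

population = 1679
response_rate = 0.65

n0 = round(cochran_categorical(t=1.96, p=0.5, d=0.05))
if n0 > 0.05 * population:
    n1 = math.ceil(finite_population_correction(n0, population))
else:
    n1 = n0                      # correction not needed for small sampling fractions
n2 = math.ceil(n1 / response_rate)

print(n0, n1, n2)
```

Changing `population`, `d` or `response_rate` shows how sensitive the drawn sample size is to each assumption.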



Table 1 presents sample size values that will be appropriate for many common sampling problems. The table includes sample sizes for both continuous and categorical data assuming alpha levels of .10, .05, or .01. The margins of error used in the table were .03 for continuous data and .05 for categorical data. Researchers may use this table if the margin of error shown is appropriate for their study; however, the appropriate sample size must be calculated if these error rates are not appropriate.



Situations exist where the procedures described in the previous paragraphs will not satisfy the needs of a study and two examples will be addressed here. One situation is when the researcher wishes to use multiple regression analysis in a study. To use multiple regression analysis, the ratio of observations to independent variables should not fall below five. If this minimum is not followed, there is a risk for overfitting, “. . . making the results too specific to the sample, thus lacking generalizability” (Hair, Anderson, Tatham, & Black, 1995, p. 105). A more conservative ratio, of ten observations for each independent variable, was reported optimal by Miller and Kunce (1973) and Halinski and Feldt (1970).

These ratios are especially critical in using regression analyses with continuous data because sample sizes for continuous data are typically much smaller than sample sizes for categorical data. Therefore, there is a possibility that the random sample will not be sufficient if multiple variables are used in the regression analysis. For example, in the continuous data illustration, a population of 1,679 was utilized and it was determined that a minimum returned sample size of 111 was required. The sample size for a population of 1,679 in the categorical data example was 313. Table 2, developed based on the recommendations cited in the previous paragraph, uses both the five to one and ten to one ratios.





Table 2

Sample size for:               Maximum number of regressors if ratio is:
                               5 to 1          10 to 1
Continuous data: n = 111       22              11
Categorical data: n = 313      62              31



As shown in Table 2, if the researcher uses the optimal ratio of ten to one with continuous data, the number of regressors (independent variables) in the multiple regression model would be limited to 11. Larger numbers of regressors could be used with the other situations shown. It should be noted that if a variable such as ethnicity is incorporated into the categorical example, this variable must be dummy coded, which will result in multiple variables utilized in the model rather than a single variable. One variable for each ethnic group, e.g., White, Black, Hispanic, Asian, American Indian, would each be coded as 1=yes and 2=no in the regression model, which would result in five variables rather than one in the regression model.

In the continuous data example, if a researcher planned to use 14 variables in a multiple regression analysis and wished to use the optimal ratio of ten to one, the returned sample size must be increased from 111 to 140. This sample size of 140 would be calculated from taking the number of independent variables to be entered in the regression (fourteen) and multiplying them by the number of the ratio (ten). Caution should be used when making this decision because raising the sample size above the level indicated by the sample size formula will increase the probability of Type I error.
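The ratio rules behind Table 2 and the 14-variable example can be expressed as two small Python helpers. The names are hypothetical and the sketch only encodes the five-to-one and ten-to-one rules of thumb quoted in the text.

```python
def max_regressors(n, ratio):
    """Largest number of independent variables allowed for n observations."""
    return n // ratio

def sample_size_for_regression(n_formula, n_regressors, ratio=10):
    """Returned sample size satisfying both the formula and the ratio rule."""
    return max(n_formula, n_regressors * ratio)

# Reproduce Table 2 for the continuous (n = 111) and categorical (n = 313) examples
for label, n in (("continuous", 111), ("categorical", 313)):
    print(label, n, max_regressors(n, 5), max_regressors(n, 10))

# 14 regressors at the ten-to-one ratio in the continuous example
n_needed = sample_size_for_regression(n_formula=111, n_regressors=14)
print(n_needed)
```

When counting regressors, each dummy-coded category counts as its own variable, so a single five-group ethnicity item uses five of the available slots under the coding described above.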



If the researcher plans to use factor analysis in a study, the same ratio considerations discussed under multiple regression should be used, with one additional criterion, namely, that factor analysis should not be done with fewer than 100 observations. It should be noted that an increase in sample size will decrease the level at which an item loading on a factor is significant. For example, assuming an alpha level of .05, a factor would have to load at a level of .75 or higher to be significant in a sample size of 50, while a factor would only have to load at a level of .30 to be significant in a sample size of 350 (Hair et al., 1995).

Sampling non-respondents. Donald (1967), Hagbert (1968), Johnson (1959), and Miller and Smith (1983) recommend that the researcher take a random sample of 10-20% of non-respondents to use in non-respondent follow-up analyses. If non-respondents are treated as a potentially different population, it does not appear that this recommendation is valid or adequate. Rather, the researcher could consider using Cochran's formula to determine an adequate sample of non-respondents for the non-respondent follow-up response analyses.

Budget, time and other constraints. Often, the researcher is faced with various constraints that may force them to use inadequate sample sizes for practical rather than statistical reasons. These constraints may include budget, time, personnel, and other resource limitations. In these cases, researchers should report both the appropriate sample sizes and the sample sizes actually used in the study, the reasons for using inadequate sample sizes, and a discussion of the effect the inadequate sample sizes may have on the results of the study. The researcher should exercise caution when making programmatic recommendations based on research conducted with inadequate sample sizes.


