Complex sampling design & analysis. A revision Assoc. Prof. Dr. Jamalludin Ab Rahman MD MPH Department of Community Medicine Kulliyyah of Medicine

Feb 08, 2017

Transcript
Page 1: Complex sampling design & analysis

Complex sampling design & analysis. A revision
Assoc. Prof. Dr. Jamalludin Ab Rahman MD MPH
Department of Community Medicine
Kulliyyah of Medicine

Page 2: Complex sampling design & analysis

(C) Jamalludin Ab Rahman. All rights reserved.

Content
• Sampling method & sample size for survey
• What is complex sampling method
• Sampling weight
• Complex sampling analysis

6-7th April 2016

Page 3: Complex sampling design & analysis


About sampling
• Not feasible to select ALL of the population
• The best sampling should be able to represent the population
• Sampling error occurs when statistics ≠ parameters
• Sampling error is not sampling bias
• Sampling error is random; sampling bias is predictable (systematic)
• Sampling design affects sampling error
• Standard error measures sampling error
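The points on sampling error and standard error can be illustrated with a small simulation. This is only a sketch: the population of 100 values, the sample size of 25 and the 2,000 repeats are all made-up assumptions, not figures from the slides.

```python
import random
import statistics

def sample_means(population, n, repeats, seed=0):
    """Draw `repeats` simple random samples of size n and return their means."""
    rng = random.Random(seed)
    return [statistics.mean(rng.sample(population, n)) for _ in range(repeats)]

population = list(range(1, 101))          # hypothetical population, N = 100
parameter = statistics.mean(population)   # the true mean (parameter)
means = sample_means(population, n=25, repeats=2000)

# Each sample mean (statistic) misses the parameter by a random amount:
# that gap is sampling error. The spread of the means is the standard error.
empirical_se = statistics.stdev(means)

# Textbook SE for SRS without replacement (finite population correction).
N, n = len(population), 25
sigma = statistics.pstdev(population)
theoretical_se = sigma / n ** 0.5 * ((N - n) / (N - 1)) ** 0.5

print(round(empirical_se, 2), round(theoretical_se, 2))
```

The two printed values should be close to each other, which is the sense in which the standard error "measures" sampling error. Sampling bias, by contrast, would shift the average of the sample means away from the parameter.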


Page 4: Complex sampling design & analysis


The aim of any sampling plan is to reduce sampling error and to avoid sampling bias.

Page 5: Complex sampling design & analysis


Describe the sample
• Target population: the inferred population
• Study population: representative of the target population
• Sampling frame: list of sampling units
• Sampling unit: unit to be sampled
• Observation unit: unit to be observed/measured


Page 6: Complex sampling design & analysis


Sampling method
• Random vs. non-random
• Random sampling supports representativeness
• Simple vs. complex
• SRS = every unit has an equal chance of being selected, i.e. equal probability of selection
• Anything that is not SRS is complex sampling


Page 7: Complex sampling design & analysis


• Simple Random Sampling
• Systematic Random Sampling
• Stratified Random Sampling
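The three designs named on this slide can be sketched in a few lines. The sampling frame of 100 numbered units and the two strata below are hypothetical, and the function names are illustrative only.

```python
import random

def simple_random_sample(frame, n, rng):
    """SRS: every unit has the same chance of selection."""
    return rng.sample(frame, n)

def systematic_sample(frame, n, rng):
    """Random start, then every k-th unit, with k = N // n."""
    k = len(frame) // n
    start = rng.randrange(k)
    return [frame[start + i * k] for i in range(n)]

def stratified_sample(strata, n_per_stratum, rng):
    """SRS of a fixed size inside every stratum."""
    return {name: rng.sample(units, n_per_stratum)
            for name, units in strata.items()}

rng = random.Random(1)
frame = list(range(1, 101))                           # hypothetical frame
strata = {"male": frame[:50], "female": frame[50:]}   # hypothetical strata

srs = simple_random_sample(frame, 10, rng)
systematic = systematic_sample(frame, 10, rng)
stratified = stratified_sample(strata, 5, rng)
print(srs, systematic, stratified, sep="\n")
```

Note that every stratum contributes units in the stratified draw, whereas an SRS of 10 could by chance contain very few units from one group.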

Page 8: Complex sampling design & analysis


Stratified versus cluster sampling
• Stratified: for heterogeneous groups, e.g. male-female, age groups
• Cluster: for homogeneous groups (rarely homogeneous, only in an ideal situation), e.g. schools, districts


Page 9: Complex sampling design & analysis


Cluster
• There are clusters not selected at all
• Large variance

Stratified
• All strata selected
• Smaller variance

Page 10: Complex sampling design & analysis


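Design Effect (deff)
• Design effect = variance of an estimate under the actual (complex) design ÷ variance under SRS of the same size
• Its value differs from variable to variable
• For a complex survey, deff is usually > 1
• If deff = 1.5, the variance is 50% larger than under SRS, so about 50% more respondents are needed to match the precision of an SRS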
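A minimal sketch of the deff arithmetic, assuming a hypothetical prevalence of 20% from 1,000 respondents and a made-up design-based variance; none of these numbers come from the slides.

```python
def design_effect(var_complex, var_srs):
    """deff: variance under the complex design relative to SRS of the same size."""
    return var_complex / var_srs

def effective_sample_size(n, deff):
    """Size of the SRS that would give the same precision as the complex sample."""
    return n / deff

p, n = 0.20, 1000                 # hypothetical prevalence and sample size
var_srs = p * (1 - p) / n         # SRS variance of a proportion
var_complex = 1.5 * var_srs       # assumed design-based variance

deff = design_effect(var_complex, var_srs)
n_eff = effective_sample_size(n, deff)
print(round(deff, 2), round(n_eff))
```

With deff = 1.5, the 1,000 respondents carry roughly the information of 667 respondents drawn by SRS.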


Page 11: Complex sampling design & analysis


Design Factor (deft)
• Design factor (deft) = sqrt(deff): the effect of the sampling design on the standard error
• If deft = 2, the SE is twice as large as it would be under SRS
• deff or deft is used as a guide, either a priori to adjust the sample size, or post hoc to check whether an adequate sample size has been achieved
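The a priori use of deff and deft can be sketched as below. The SRS sample size of 385 and the anticipated deff values are assumptions chosen for illustration.

```python
import math

def design_factor(deff):
    """deft = sqrt(deff): multiplier on the SRS standard error."""
    return math.sqrt(deff)

def inflate_sample_size(n_srs, deff):
    """Sample size needed under the complex design to match SRS precision."""
    return math.ceil(n_srs * deff)

deft = design_factor(4.0)                     # SE doubles when deff = 4
n_required = inflate_sample_size(385, 1.5)    # hypothetical SRS size of 385
print(deft, n_required)
```

Used post hoc, the same factor converts an SRS standard error into a design-adjusted one: SE(complex) = SE(SRS) × deft.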


Page 12: Complex sampling design & analysis

Sampling Weight
• Also known as Probability Weight (PW)
• SW = N/n (inverse of the sampling fraction)
• Two stage: SW = (N1/n1)*(N2/n2)
• The sum of PW = population size
• Weighting can increase the standard error


Page 13: Complex sampling design & analysis

Sampling weight… Why?
• There is always imperfection in sampling
• Weighting will try to correct:
1. Unequal probability of selection (base/design weight)
2. Non-response bias
3. Stratification in the population (post-stratification): trying to represent the true characteristics of the population, e.g. by sex, ethnicity etc.


Page 14: Complex sampling design & analysis


Example
• N = 100,000 people
• Sample (n) = 1,000
• Therefore, SW = 100,000/1,000 = 100
• Every sampled person represents 100 people in that region
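The same arithmetic as a short sketch, using the N and n from this example. The function name is illustrative.

```python
def sampling_weight(population_size, sample_size):
    """Base weight: inverse of the sampling fraction n/N."""
    return population_size / sample_size

sw = sampling_weight(100_000, 1_000)

# Every respondent carries the same weight here, so the weights
# add back up to the population size.
weights = [sw] * 1_000
print(sw, sum(weights))
```

The check that the weights sum to the population size is a useful habit before any weighted analysis.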


Page 15: Complex sampling design & analysis


Example – two stage

         Class         Students       SW1      SW2      SW
Grade    N1    n1      N2     n2      N1/n1    N2/n2    SW1*SW2
1        5     3       150    30      1.7      5.0      8.3
2        6     3       180    30      2.0      6.0      12.0
3        6     3       175    30      2.0      5.8      11.7
4        7     3       185    30      2.3      6.2      14.4
5        4     3       170    30      1.3      5.7      7.6

* Non-proportionate distribution
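The two-stage table can be recomputed from its N1, n1, N2 and n2 columns. This is a sketch with illustrative names; the figures are the ones on the slide.

```python
# (grade, classes N1, classes sampled n1, students N2, students sampled n2)
GRADES = [
    (1, 5, 3, 150, 30),
    (2, 6, 3, 180, 30),
    (3, 6, 3, 175, 30),
    (4, 7, 3, 185, 30),
    (5, 4, 3, 170, 30),
]

def two_stage_weight(N1, n1, N2, n2):
    """Overall weight = first-stage weight x second-stage weight."""
    return (N1 / n1) * (N2 / n2)

two_stage = {grade: two_stage_weight(N1, n1, N2, n2)
             for grade, N1, n1, N2, n2 in GRADES}

for grade, sw in two_stage.items():
    print(grade, round(sw, 1))
```

Rounded to one decimal place the results reproduce the SW column of the table.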

Page 16: Complex sampling design & analysis


Example – stratified, one-stage

Population Size
            District 1         District 2
            Urban    Rural     Urban    Rural
Under 18    10000    13000     20000    15000
18-60       30000    25000     60000    45000
Above 60     5000     7000      5000    10000
Total       45000    45000     85000    70000

Sample Size
            District 1         District 2
            Urban    Rural     Urban    Rural
Under 18      100      100       100      100
18-60         100      100       100      100
Above 60      100      100       100      100
Total         300      300       300      300

Sampling Weight
            District 1         District 2
            Urban    Rural     Urban    Rural
Under 18      100      130       200      150
18-60         300      250       600      450
Above 60       50       70        50      100

• 1 sampled person aged under 18 from District 1 urban represents 100 people
• 1 sampled person aged under 18 from District 2 urban represents 200 people

* Non-proportionate distribution
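The stratum weights in this example follow from dividing each population cell by its sample size. A sketch using the population figures and the 100-per-cell sample from the slide, with illustrative names:

```python
AGE_GROUPS = ["Under 18", "18-60", "Above 60"]
STRATA = [("District 1", "Urban"), ("District 1", "Rural"),
          ("District 2", "Urban"), ("District 2", "Rural")]

# Population sizes, one row per age group, columns in STRATA order.
POPULATION = [
    [10000, 13000, 20000, 15000],
    [30000, 25000, 60000, 45000],
    [5000, 7000, 5000, 10000],
]
SAMPLE_PER_CELL = 100

def stratum_weights(population, n_per_cell):
    """Weight for each age x district x location cell = N_cell / n_cell."""
    weights = {}
    for age, row in zip(AGE_GROUPS, population):
        for (district, location), size in zip(STRATA, row):
            weights[(age, district, location)] = size / n_per_cell
    return weights

weights = stratum_weights(POPULATION, SAMPLE_PER_CELL)

# Sum of all respondents' weights should recover the total population.
estimated_total = sum(w * SAMPLE_PER_CELL for w in weights.values())
print(weights[("Under 18", "District 2", "Urban")], estimated_total)
```

Because every cell has the same sample size but a different population, the weights vary widely, which is what "non-proportionate" refers to.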

Page 17: Complex sampling design & analysis


Complex sampling analysis
• Accommodate sampling weight
• Adjust for standard error
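What "accommodating the sampling weight" does to a point estimate can be seen in a toy example. The four respondents and their weights below are invented for illustration.

```python
def weighted_proportion(outcomes, weights):
    """Weighted estimate of a proportion: sum(w * y) / sum(w)."""
    return sum(w * y for y, w in zip(outcomes, weights)) / sum(weights)

# Hypothetical data: 1 = has the condition, 0 = does not.
outcomes = [1, 0, 1, 1]
weights = [100, 100, 200, 200]   # assumed sampling weights

unweighted = sum(outcomes) / len(outcomes)
weighted = weighted_proportion(outcomes, weights)
print(unweighted, round(weighted, 3))
```

The weighted figure is pulled towards the respondents who represent more people. The weights fix the point estimate only; the standard error still needs a design-based method.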


Page 18: Complex sampling design & analysis


Estimating standard error
• Linearization method (Taylor series): approximates a non-linear estimator with a linear function
• Replication method: draw sub-samples, recompute the estimate for each, and calculate the variance across them, e.g. BRR (Balanced Repeated Replication), jackknife, bootstrapping
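As one instance of the replication idea, here is a sketch of a delete-one-cluster jackknife for a weighted mean. It assumes a single stratum and ignores finite population corrections, so it is a simplification of what survey software does; the records are hypothetical.

```python
def weighted_mean(records):
    """records: iterable of (cluster_id, weight, value)."""
    total_weight = sum(w for _, w, _ in records)
    return sum(w * y for _, w, y in records) / total_weight

def jackknife_se(records):
    """Delete-one-cluster jackknife SE of the weighted mean (single stratum)."""
    clusters = sorted({c for c, _, _ in records})
    k = len(clusters)
    full = weighted_mean(records)
    # Recompute the estimate k times, each time leaving one cluster out.
    replicates = [weighted_mean([r for r in records if r[0] != c])
                  for c in clusters]
    variance = (k - 1) / k * sum((rep - full) ** 2 for rep in replicates)
    return variance ** 0.5

# Hypothetical data: three clusters, one respondent each, equal weights.
records = [("A", 1.0, 1.0), ("B", 1.0, 2.0), ("C", 1.0, 3.0)]
se = jackknife_se(records)
print(round(se, 4))
```

In this equal-weight case the result equals the familiar s/sqrt(n), which is a quick sanity check on the formula.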


Page 19: Complex sampling design & analysis


Practical Session


Page 20: Complex sampling design & analysis


Practical
• Sampling distribution
• Calculating sampling weight
• Preparing data for analysis
• Complex sample analysis (using SPSS)


Page 21: Complex sampling design & analysis


Sampling distribution
• Using the 2016 adult household population by location (urban/rural) in Malaysia, prepare a sampling distribution that is representative down to the Malaysian urban/rural level, given a calculated sample size of 10,000 respondents
• Take 12 LQ (living quarters) per EB (enumeration block) and 2 adults per LQ
• Proportionate to size


Page 22: Complex sampling design & analysis


Population Size by census ('000)*

No.  State               Urban    Rural    Total
1    Johor               1,682      537    2,219
2    Kedah                 905      433    1,338
3    Kelantan              508      543    1,050
4    Melaka                537       47      584
5    Negeri Sembilan       492      198      690
6    Pahang                564      427      991
7    Perak               1,260      394    1,653
8    Perlis                102       66      167
9    Pulau Pinang        1,069       69    1,138
10   Sabah               1,064      597    1,661
11   Sarawak             1,009      694    1,703
12   Selangor            3,583      274    3,857
13   Terengganu            450      250      700
14   WP Kuala Lumpur     1,133        -    1,133
15   WP Labuan              50        6       57
16   WP Putrajaya           46        -       46
     Total              14,454    4,533   18,987
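One way to start the exercise is to split the 10,000 respondents between urban and rural in proportion to the totals in the table, then convert respondents into enumeration blocks at 12 LQ × 2 adults per block. Rounding to whole respondents and rounding blocks up are assumptions of this sketch, not instructions from the slides.

```python
import math

def allocate(total_n, populations):
    """Proportionate-to-size allocation of total_n across strata."""
    total_pop = sum(populations.values())
    return {name: round(total_n * pop / total_pop)
            for name, pop in populations.items()}

# National totals ('000) from the last row of the table.
populations = {"Urban": 14_454, "Rural": 4_533}
allocation = allocate(10_000, populations)

RESPONDENTS_PER_EB = 12 * 2   # 12 living quarters x 2 adults each
blocks = {name: math.ceil(n / RESPONDENTS_PER_EB)
          for name, n in allocation.items()}

print(allocation, blocks)
```

The same `allocate` call can be repeated with the state-level urban and rural figures to spread each share across the 16 states.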

Page 23: Complex sampling design & analysis


Calculating sampling weight

Total = Total population *; Visited = Kindergarten visited; Examined = Children Examined

                   PSU (Kindergarten)                   SSU (Children)
                   Urban             Rural              Urban               Rural
State              Total   Visited   Total   Visited    Total    Examined   Total    Examined
FT Kuala Lumpur      471        34       -         -    10,940        687        -          -
Perlis                65         5     222         7     1,007         97    2,557        113
Kedah                164        19     757        69     1,913        203    9,154        846
Penang               297        21     316        24     4,845        402    4,496        366
Perak                356        19   1,040        55     6,382        412   12,627        819
Selangor           1,051        93     607        55    22,951      2,204    7,994        815
Negeri Sembilan      206        15     420        30     2,924        253    4,850        373
Melaka               131         8     384        22     1,941        125    5,111        316
Johor                586        42   1,121        80     9,389        779   13,594      1,163
Pahang               235        13     873        45     4,188        224   12,092        642
Terengganu           400        21     813        35     6,979        336    9,308        427
Kelantan             144         9   1,042        58     2,924        178   14,882        934
FT Putrajaya          71         4       -         -     2,170        127        -          -
Sabah                395        32   1,230       101    10,330        998   13,837      1,006
Sarawak              590        30   1,493        67    13,395        644   14,936        725
FT Labuan             74         8       -         -     1,400        135        -          -
Total              5,236       373  10,318       648   103,678      7,804  125,438      8,545

Page 24: Complex sampling design & analysis


Preparing data for analysis
• Merge SW into dataset
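Merging the weights means attaching to every respondent the weight of the stratum they were sampled from. A plain-Python sketch follows; the field names (`state`, `location`, `sw`) and the weight values are hypothetical.

```python
def merge_weights(respondents, weight_table, keys=("state", "location")):
    """Return a copy of each respondent row with its sampling weight attached."""
    merged = []
    for row in respondents:
        stratum = tuple(row[k] for k in keys)
        # A KeyError here flags a respondent whose stratum has no weight.
        merged.append({**row, "sw": weight_table[stratum]})
    return merged

# Hypothetical weight table, one entry per stratum.
weight_table = {("Perlis", "Urban"): 10.4, ("Perlis", "Rural"): 22.6}

respondents = [
    {"id": 1, "state": "Perlis", "location": "Urban"},
    {"id": 2, "state": "Perlis", "location": "Rural"},
]

dataset = merge_weights(respondents, weight_table)
print(dataset)
```

After the merge, every record should carry a non-missing weight before the data go into a complex samples analysis.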


Page 25: Complex sampling design & analysis


Page 26: Complex sampling design & analysis


Complex sample analysis
• Preparing the complex samples (CS) plan
• Analysis using SPSS
