Lab 1-1 Ka-fu Wong © 2003 Dr. Ka-fu WONG ECON1003 Analysis of Economic Data

Dec 21, 2015


Counting Green Beans in the Bottle

We are interested in knowing the number of green beans in the bottle.

Tools:
• We do not have a balance. If we had one, we could take out a small number of beans, weigh them, and then estimate the number of beans in the bottle.
• We do have a pack of red beans.

What do we need to do to obtain a reasonable estimate?

Capture/Re-capture

GOAL:

1. Illustrate how to estimate the population size when the cost of counting all individuals is prohibitive.

2. Illustrate how easy and intuitive statistics could be. Statistics need not be completely deep, murky, and mysterious. Our common sense can help us to negotiate our way through the course.

In-class Lab

History and examples of capture / recapture method

Capture-recapture methods were originally developed in wildlife biology to monitor bird, fish, and insect populations, where counting all individuals is prohibitive. More recently, these methods have been used extensively in disease and event monitoring.

http://www.pitt.edu/~yuc2/cr/history.htm

The fish example

Estimating the number of fish in a lake or pond:
• C fish are caught, tagged, and returned to the lake.
• Later on, R fish are caught and checked for tags. Say T of them have tags.
• The numbers C, R, and T are used to estimate the fish population.

Green beans in a bottle

The objective is to estimate the number of green beans in a bottle.

Capture one cup of green beans. Count them and call the count C. Replace these green beans with the same number of red beans, and put the red beans into the bottle.

Capture another cup of beans. Count the total number of beans (R) and the number of red beans (T).

Based on this information, how can we obtain a reasonable estimate of the number of beans in the bottle?

Green beans in a bottle

We know that C/N ≈ T/R, where N is the unknown number of beans in the bottle. Hence, a simple estimate of N is

CR/T

• C = the number of beans captured in the first round.
• R = the total number of beans captured in the second round.
• T = the number of red beans captured in the second round.
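
The estimate CR/T can be written as a one-line function. The following is a minimal sketch; the function name and the counts in the example are hypothetical and chosen only for illustration, not taken from the lab.

```python
def simple_estimate(c: int, r: int, t: int) -> float:
    """Estimate the population size N as C*R/T.

    c: number captured (and marked) in the first round
    r: total number captured in the second round
    t: number of marked items found in the second round
    """
    if t <= 0:
        # With no marked items in the recapture, C*R/T is undefined.
        raise ValueError("no marked items recaptured; C*R/T is undefined")
    return c * r / t


# Hypothetical counts: 60 beans marked, 80 recaptured, 10 of them red.
print(simple_estimate(c=60, r=80, t=10))  # 480.0
```

The guard for T = 0 is a design choice of this sketch; the slides only report how often that case occurred in the simulations.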

Simulations to see the properties of this proposed estimator

How good is the proposed estimator?

To see the properties of this proposed estimator, I used MATLAB to simulate our capture-recapture experiment with different numbers of captures (C) and different numbers of recaptures (R), relative to the total number of fish in the pond.

Throughout, N = 500 and each setting is simulated 1000 times.

Simulation design – via MATLAB

Individual simulation experiment:
• Create 500 fish, labelled 1 to 500.
• Capture a random sample of C fish and mark them by converting their labels to zero.
• Capture another random sample of R fish.
• Count the number of marked fish in the sample. Call it T.
• Compute the estimate as CR/T.

Repeat this experiment 1000 times. Hence, we have 1000 estimates. Compute the mean and standard deviation of these 1000 estimates.
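
The design above can be sketched in Python rather than MATLAB. This is not the author's code: the function name, the seed, the choice to draw both samples without replacement, the choice to skip runs with T = 0, and the use of the sample standard deviation are all assumptions of this sketch, so the figures it produces need not match the tables that follow.

```python
import random
import statistics


def simulate(n=500, c=40, r=40, reps=1000, seed=0):
    """Return (S, mean, std) of the CR/T estimates over `reps` experiments."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(reps):
        fish = list(range(1, n + 1))        # fish labelled 1..n
        for i in rng.sample(range(n), c):   # first capture: mark by zeroing the label
            fish[i] = 0
        recaptured = rng.sample(fish, r)    # second capture, without replacement
        t = recaptured.count(0)             # marked fish in the recapture
        if t > 0:                           # assumption: skip runs where CR/T is undefined
            estimates.append(c * r / t)
    s = len(estimates)                      # runs with at least one marked fish
    return s, statistics.mean(estimates), statistics.stdev(estimates)


s, mean, std = simulate(n=500, c=40, r=40)
print(s, round(mean, 2), round(std, 2))
```

Changing `c` and `r` in the call gives the other settings; a different `seed` gives a different set of 1000 experiments.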

Properties of our estimator: Increasing C and R

N    C    R    S     Mean    Std
500  40   40   971   640.76  401.57
500  60   60   1000  579.22  321.54
500  80   80   1000  533.61  154.67
500  100  100  1000  522.85  104.29
500  120  120  1000  513.82  77.41
500  140  140  1000  507.04  60.98
500  250  250  1000  500.64  22.93
500  500  500  1000  500.00  0.00

• N = total number of fish in the pond.
• C = number of captured fish.
• R = number of re-captured fish.
• S = number of simulations with non-zero marked fish in the recapture.

Properties of our estimator: Constant C and increasing R

N    C    R    S     Mean    Std
500  120  40   1000  507.86  75.07
500  120  60   1000  513.40  79.55
500  120  80   1000  508.19  73.56
500  120  100  1000  511.24  74.55
500  120  120  1000  510.93  75.41
500  120  140  1000  511.21  75.63
500  120  250  1000  510.49  74.04
500  120  500  1000  507.47  77.32

• N = total number of fish in the pond.
• C = number of captured fish.
• R = number of re-captured fish.
• S = number of simulations with non-zero marked fish in the recapture.

Properties of our estimator: Increasing C and constant R

N    C    R    S     Mean    Std
500  40   120  961   646.59  405.72
500  60   120  1000  582.17  327.97
500  80   120  1000  533.28  142.23
500  100  120  1000  512.28  95.40
500  120  120  1000  508.78  78.75
500  140  120  1000  507.50  60.61
500  250  120  1000  500.86  22.38
500  500  120  1000  500.00  0.00

• N = total number of fish in the pond.
• C = number of captured fish.
• R = number of re-captured fish.
• S = number of simulations with non-zero marked fish in the recapture.

Conclusion from the simulations

The proposed estimator generally overestimates the number of fish in the pond, i.e., the estimate is larger than the true number of fish in the pond. That is, there is a bias.

Holding R constant, increasing the number of captures (C) helps:
• Bias is reduced, i.e., the mean is closer to the true population.
• The estimator is more precise, i.e., the standard deviation of the estimator is smaller.

Holding C constant, increasing the number of recaptures (R) does not help:
• Bias is more or less unchanged.
• The precision of the estimator is more or less unchanged.
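
The bias statement can be checked with simple arithmetic on the reported means: bias is the mean estimate minus the true N. The sketch below uses the Mean column of the "Increasing C and constant R" table for the original estimator; the variable names are illustrative.

```python
N = 500  # true number of fish in the simulations

# (C, mean estimate) pairs from the table with R held at 120.
reported_means = [
    (40, 646.59), (60, 582.17), (80, 533.28), (100, 512.28),
    (120, 508.78), (140, 507.50), (250, 500.86), (500, 500.00),
]

# Bias = mean of the estimates minus the true population size.
biases = [(c, round(mean - N, 2)) for c, mean in reported_means]

for c, bias in biases:
    print(f"C = {c:3d}: bias = {bias:7.2f}")
```

Every bias is non-negative and shrinks as C grows, which is the pattern the conclusion describes.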

Additional issues

Our proposed estimator is serviceable, but it can be improved. Alternative estimators have been developed to reduce or eliminate the bias in estimating N.

For instance, Seber (1982, p.60) suggests the following estimator of N:

(C+1)(R+1)/(T+1) - 1

(Note that our proposed formula is CR/T.)

Seber, G. (1982): The Estimation of Animal Abundance and Related Parameters, second edition, Charles.
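
As a minimal sketch, the modified formula can be coded next to the original one to see how the two differ on the same counts. The function names and the example counts are hypothetical.

```python
def simple_estimate(c: int, r: int, t: int) -> float:
    """Original estimator C*R/T; requires t > 0."""
    return c * r / t


def modified_estimate(c: int, r: int, t: int) -> float:
    """Modified estimator (C+1)(R+1)/(T+1) - 1 as quoted from Seber (1982)."""
    return (c + 1) * (r + 1) / (t + 1) - 1


# Hypothetical counts: 60 marked, 80 recaptured, 10 of them marked.
print(simple_estimate(60, 80, 10))              # 480.0
print(round(modified_estimate(60, 80, 10), 2))  # 448.18

# Unlike C*R/T, the modified formula is still defined when no marked
# item is recaptured (T = 0).
print(modified_estimate(60, 80, 0))             # 4940.0
```

Because T + 1 is never zero, no simulation run has to be discarded, which is consistent with S = 1000 in every row of the modified-estimator tables.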

Simulations to see the properties of this modified estimator

How good is the modified estimator?

To see the properties of this modified estimator, we repeat the above simulation exercise with this new formula:

(C+1)(R+1)/(T+1) - 1
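
One way to repeat the exercise is to evaluate both formulas on the same simulated recaptures. The sketch below is in Python, not the author's MATLAB; the function name, the seed, sampling without replacement, and dropping T = 0 runs for CR/T only are assumptions, so its output need not match the tables that follow.

```python
import random
import statistics


def compare_estimators(n=500, c=40, r=40, reps=1000, seed=0):
    """Return {name: (S, mean, std)} for the original and modified estimators."""
    rng = random.Random(seed)
    simple, modified = [], []
    for _ in range(reps):
        # Treat fish 0..c-1 as the marked ones. Since the recapture is a
        # uniform random sample, which c fish were marked does not matter.
        recaptured = rng.sample(range(n), r)
        t = sum(1 for fish in recaptured if fish < c)
        modified.append((c + 1) * (r + 1) / (t + 1) - 1)
        if t > 0:  # C*R/T is undefined when no marked fish is recaptured
            simple.append(c * r / t)

    def summary(values):
        return len(values), statistics.mean(values), statistics.stdev(values)

    return {"simple": summary(simple), "modified": summary(modified)}


for name, (s, mean, std) in compare_estimators(c=40, r=40).items():
    print(f"{name:8s} S={s:4d} mean={mean:7.2f} std={std:7.2f}")
```

Using the same recaptures for both formulas means any difference in the summaries comes from the formulas themselves rather than from different random draws.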

Properties of modified estimator: Increasing C and R

N    C    R    S     Mean    Std
500  40   40   1000  488.60  271.05
500  60   60   1000  504.39  202.16
500  80   80   1000  498.88  121.47
500  100  100  1000  501.72  91.20
500  120  120  1000  498.10  72.01
500  140  140  1000  501.14  58.44
500  250  250  1000  498.60  21.72
500  500  500  1000  500.00  0.00

• N = total number of fish in the pond.
• C = number of captured fish.
• R = number of re-captured fish.
• S = number of simulations with non-zero marked fish in the recapture.

Properties of modified estimator: Constant C and increasing R

N    C    R    S     Mean    Std
500  120  40   1000  498.55  67.38
500  120  60   1000  500.05  71.54
500  120  80   1000  495.58  69.22
500  120  100  1000  497.01  71.14
500  120  120  1000  498.45  71.05
500  120  140  1000  495.17  67.46
500  120  250  1000  500.41  75.29
500  120  500  1000  496.73  74.27

• N = total number of fish in the pond.
• C = number of captured fish.
• R = number of re-captured fish.
• S = number of simulations with non-zero marked fish in the recapture.

Properties of modified estimator: Increasing C and constant R

N    C    R    S     Mean    Std
500  40   120  1000  491.84  291.00
500  60   120  1000  499.33  216.81
500  80   120  1000  496.51  117.05
500  100  120  1000  493.50  87.53
500  120  120  1000  503.24  73.65
500  140  120  1000  498.59  56.30
500  250  120  1000  499.76  22.58
500  500  120  1000  500.00  0.00

• N = total number of fish in the pond.
• C = number of captured fish.
• R = number of re-captured fish.
• S = number of simulations with non-zero marked fish in the recapture.

Conclusion from the simulations

The modified estimator performs better than the original estimator:
• There is no apparent bias.
• The estimator is more precise.

Holding R constant, increasing the number of captures (C) helps:
• The estimator is more precise, i.e., the standard deviation of the estimator is smaller.

Holding C constant, increasing the number of recaptures (R) does not help:
• The precision of the estimator is more or less unchanged.
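
The precision comparison can be read off the reported standard deviations. The sketch below takes the Std columns of the two "Increasing C and R" tables (original and modified estimator) and computes their ratio row by row; the variable names are illustrative.

```python
# C (= R) values and reported standard deviations from the two
# "Increasing C and R" tables.
c_values = [40, 60, 80, 100, 120, 140, 250, 500]
std_original = [401.57, 321.54, 154.67, 104.29, 77.41, 60.98, 22.93, 0.00]
std_modified = [271.05, 202.16, 121.47, 91.20, 72.01, 58.44, 21.72, 0.00]

# Ratio below 1 means the modified estimator has the smaller spread.
# The last row is skipped because both standard deviations are zero.
ratios = [
    (c, round(m / o, 2))
    for c, o, m in zip(c_values, std_original, std_modified)
    if o > 0
]

for c, ratio in ratios:
    print(f"C = R = {c:3d}: std(modified) / std(original) = {ratio}")
```

In these rows the ratio stays below 1 and moves toward 1 as C and R grow, so the gain in precision is largest when the samples are small.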

What to take away today

• Statistics could be easy and intuitive.
• Statistics need not be completely deep, murky, and mysterious.
• Our common sense can help us to negotiate our way through the course.

Syllabus will be distributed and discussed on Wednesday 22 January 2003.

- END -

In-class Lab

Capture / recapture