Computing in Archaeology
Session 9. Sampling Assemblages
© Richard Haddlesey www.medievalarchitecture.net
Mar 27, 2015
Aims
To become familiar with sampling practices in an archaeological context
Introduction to Sampling
An area of excavation is a sample of the complete site, which in itself is a sample of all sites of that type. The same goes for artefact assemblages.
The essence of all sampling is to gain the maximum amount of information by measuring or testing just a part of the available material.
Fletcher & Lock 2005, 66
Archaeological sample
Sampled population
Target population
Formal definitions
Population: the whole group or set of objects about which inference is to be made
Sampling frame: a list of the items, units or objects that could be sampled
Variable: a characteristic which is to be measured for the units, such as the weight of spearheads
Fletcher & Lock 2005, 66
Formal definitions
Sample: the subset or part of the population that is selected
Sample size: the number in the sample. A sample size of 5 is considered small, while, formally, a sample size of 50 is large. The sample size may be stated as a percentage of the sampling frame, e.g. a 10% sample
Fletcher & Lock 2005, 67
Sampling strategies
• a simple random sample (called a probability sample in the USA)
• a systematic sample
• a stratified sample
• a cluster sample
Population – 100 units (100 obsidian spearheads, numbered 1 to 100)
1 2 3 4 5 6 7 8 9 10
11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30
31 32 33 34 35 36 37 38 39 40
41 42 43 44 45 46 47 48 49 50
51 52 53 54 55 56 57 58 59 60
61 62 63 64 65 66 67 68 69 70
71 72 73 74 75 76 77 78 79 80
81 82 83 84 85 86 87 88 89 90
91 92 93 94 95 96 97 98 99 100
A simple random number sample
Random sampling
If we have a population of 100 spearheads, we simply pick 10 of them using random numbers (i.e. a 10% sample)
Computers can help generate random sequences, but are not necessary
You must avoid bias in your selection, as a biased sample invites criticism from others
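As a rough illustration of the random draw described above, here is a minimal Python sketch. The list `frame` (labels 1 to 100 standing in for the spearheads) and the seed value are assumptions made for this example only; they are not part of the session material.

```python
import random

# Hypothetical sampling frame: the 100 spearheads labelled 1..100.
frame = list(range(1, 101))

# A fixed seed (arbitrary choice) makes the draw repeatable, so others
# can check that the selection was not hand-picked.
rng = random.Random(2015)

# Draw 10 distinct units without replacement: a 10% simple random sample.
sample = sorted(rng.sample(frame, k=10))
print(sample)
```

Recording the seed alongside the sample is one way to show that the selection was free of personal bias.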
[Figure: a simple random number sample drawn from the grid of units 1 to 100]
A systematic sample
Systematic sampling
To take a systematic approach, we could choose every number ending in 4. Once again this would give us our 10%
This method has the advantage of being easy to design, but it can give misleading results if the units have inherent patterning in their order
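The "every number ending in 4" rule can be sketched in Python as follows. The helper `systematic_sample` and its parameters `interval` and `start` are hypothetical names introduced for this example.

```python
# Hypothetical sampling frame: units labelled 1..100.
frame = list(range(1, 101))

# Every unit whose label ends in 4.
sample = [u for u in frame if u % 10 == 4]
print(sample)  # [4, 14, 24, 34, 44, 54, 64, 74, 84, 94]


def systematic_sample(units, interval, start):
    """Take every `interval`-th unit, beginning at position `start` (1-based)."""
    return units[start - 1::interval]


# The same selection expressed as "start at unit 4, then every 10th unit".
same_sample = systematic_sample(frame, interval=10, start=4)
```

If the order of the units repeats with the same period as the interval (here 10), every selected unit falls at the same point in the cycle. That is the patterning problem noted above.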
[Figure: a systematic sample drawn from the grid of units 1 to 100]
A stratified sample
Stratified sampling
Here we take a random sample of 5 from the top half and 5 from the bottom half
Or 5 from the left, 5 from the right, etc.
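A minimal sketch of the two stratifications just described, assuming the units are laid out in a 10 x 10 grid numbered row by row. The strata names, the seed and the helper `draw` are assumptions for this example.

```python
import random

# Hypothetical sampling frame: units 1..100 in a 10 x 10 grid, numbered row by row.
frame = list(range(1, 101))
rng = random.Random(9)  # arbitrary fixed seed for a repeatable draw


def draw(strata, per_stratum):
    """Take an independent simple random sample from each stratum."""
    return {name: sorted(rng.sample(units, k=per_stratum))
            for name, units in strata.items()}


# Top half (rows 1-5, units 1-50) versus bottom half (rows 6-10, units 51-100).
top_bottom = draw({"top": frame[:50], "bottom": frame[50:]}, per_stratum=5)

# Left half (columns 1-5) versus right half (columns 6-10).
left = [u for u in frame if (u - 1) % 10 < 5]
right = [u for u in frame if (u - 1) % 10 >= 5]
left_right = draw({"left": left, "right": right}, per_stratum=5)

print(top_bottom)
print(left_right)
```

Either way the total is still 10 units (a 10% sample), but each stratum is guaranteed to be represented.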
[Figures: two stratified samples drawn from the grid of units 1 to 100]
A cluster sample
Cluster sampling
Rather than select individual items, select clusters or groups of items that are close together
This may result in biased values
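One possible way to sketch cluster sampling in Python is shown below. The choice of clusters (runs of 5 neighbouring units) and of 2 clusters per sample is an assumption for this example. In practice the clusters would follow whatever grouping the material already has, such as trenches or contexts.

```python
import random

# Hypothetical sampling frame: units labelled 1..100.
frame = list(range(1, 101))

# Group neighbouring units into 20 clusters of 5 (assumed cluster size).
cluster_size = 5
clusters = [frame[i:i + cluster_size]
            for i in range(0, len(frame), cluster_size)]

# Select whole clusters at random, then keep every unit inside them.
rng = random.Random(9)  # arbitrary fixed seed for a repeatable draw
chosen = rng.sample(clusters, k=2)
sample = sorted(u for cluster in chosen for u in cluster)
print(sample)
```

Because the 10 selected units come from only two places, units that resemble their neighbours will make the sample less representative than 10 independently drawn units.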
[Figures: two cluster samples drawn from the grid of units 1 to 100]
Downside to systematic sampling
A systematic sample can totally miss a context that lies between the sampled units
Common sample statistics:
x̄ – the sample mean
s – the sample standard deviation
p – the sample proportion (i.e. the proportion of the sample having a particular characteristic)
Stats
The true population values for these statistics are usually unknown, and are formally denoted by Greek letters
Common sample statistics:

Known value (sample)                   Estimate for (population)
x̄ – the sample mean                    μ – the population mean
s – the sample standard deviation      σ – the population standard deviation
p – the sample proportion              π – the population proportion
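To make the three sample statistics concrete, here is a short Python sketch. The weights and the 45 g threshold are invented for illustration only and are not real measurements.

```python
import statistics

# Hypothetical weights (in grams) of a sample of 10 spearheads.
weights = [41.2, 38.5, 45.0, 39.8, 52.3, 47.1, 36.9, 44.4, 40.6, 49.2]

# Sample mean: the known value used as an estimate of the population mean (mu).
x_bar = statistics.mean(weights)

# Sample standard deviation (n - 1 denominator): estimate of sigma.
s = statistics.stdev(weights)

# Sample proportion with a chosen characteristic, here "heavier than 45 g"
# (an assumed threshold): estimate of the population proportion (pi).
threshold = 45.0
p = sum(w > threshold for w in weights) / len(weights)

print(x_bar, s, p)
```

Note that `statistics.stdev` divides by n - 1, which is the usual choice when a sample is used to estimate the population value. `statistics.pstdev` divides by n and applies when the whole population has been measured.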
The central-limit theorem (the law of averages)
In order to comment on how good an estimate the sample statistics are, the nature of their distribution needs to be known
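The distribution of a sample statistic can be explored by simulation. The sketch below assumes an artificial, strongly skewed population (exponentially distributed values), draws many samples of size 50 and looks at how the sample means behave. All names and numbers are illustrative assumptions.

```python
import random
import statistics

rng = random.Random(42)  # arbitrary fixed seed for repeatability

# Hypothetical skewed population of 10,000 values with a mean of about 40.
population = [rng.expovariate(1 / 40) for _ in range(10_000)]
mu = statistics.mean(population)       # population mean
sigma = statistics.pstdev(population)  # population standard deviation

# Draw 2,000 samples of size n and record each sample mean.
n = 50
means = [statistics.mean(rng.sample(population, n)) for _ in range(2_000)]

# The sample means centre on mu, with a spread close to sigma / sqrt(n).
print(mu, statistics.mean(means))
print(sigma / n ** 0.5, statistics.stdev(means))
```

Even though the population itself is far from symmetric, the sample means cluster around μ with a spread of roughly σ/√n. This is the behaviour the central-limit theorem describes for reasonably large samples.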
See
• Fletcher & Lock 2005 (2nd ed.), Digging Numbers, Oxbow, 70-9