An Automated Record Linkage System for the Canadian Census, 1871-1881
L. Antonie (University of Guelph), P. Baskerville (Universities of Alberta and Victoria), K. Inwood (University of Guelph), J. A. Ross (University of Guelph)
Record Linkage Workshop, May 24th-25th, 2010, University of Guelph
What we are working towards
A comprehensive infrastructure of longitudinal data
‘Unbiased’ links connecting individuals/households over several census years
[Diagram: linked censuses: 1851, 1871, 1881, 1891, 1901, 1906, 1911 and 1916, plus US 1880 and US 1900]
Current Work
[Diagram: 100% of 1871 Census (3,601,663 records), Automatic Linking, 100% of 1881 Census (4,277,807 records)]
Partners and collaborators: FamilySearch, Church of Latter Day Saints, Minnesota Population Center, Université de Montréal, University of Alberta
Existing (True) Links
• Ontario Industrial Proprietors – 8429 links
• Logan Township – 1760 links
• St. James Church, Toronto – 232 links
• Quebec City Boys – 1403 links
• Bias – family – context – others?
[Map labels: Logan Twp, Guelph]
Attributes for Automatic Linking
• Last Name – string
• First Name – string
• Gender – binary
• Age – number
• Birthplace – number
• Marital status – single, married, divorced, widowed, unknown
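The attribute list above can be pictured as a small record type plus a per-attribute comparison. The following is only a sketch under stated assumptions: the class and function names, the "M"/"F" gender coding, the birthplace code values and the use of `difflib.SequenceMatcher` as a string similarity measure are all illustrative choices, not details taken from the slides.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from enum import Enum


class MaritalStatus(Enum):
    SINGLE = "single"
    MARRIED = "married"
    DIVORCED = "divorced"
    WIDOWED = "widowed"
    UNKNOWN = "unknown"


@dataclass(frozen=True)
class CensusRecord:
    """One person as described by the six linking attributes (hypothetical layout)."""
    last_name: str
    first_name: str
    gender: str                    # binary attribute; "M"/"F" coding is assumed
    age: int
    birthplace: int                # numeric code; the values used here are invented
    marital_status: MaritalStatus  # may legitimately change between censuses


def name_similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1]; a stand-in for any string measure."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def compare(r1871: CensusRecord, r1881: CensusRecord, years_apart: int = 10) -> dict:
    """Turn a pair of records into per-attribute agreement features."""
    return {
        "last_name": name_similarity(r1871.last_name, r1881.last_name),
        "first_name": name_similarity(r1871.first_name, r1881.first_name),
        "gender": float(r1871.gender == r1881.gender),
        # Deviation from the age gap expected between the two census years.
        "age_error": abs((r1881.age - r1871.age) - years_apart),
        "birthplace": float(r1871.birthplace == r1881.birthplace),
    }


a = CensusRecord("Macdonald", "John", "M", 30, 1, MaritalStatus.SINGLE)
b = CensusRecord("McDonald", "John", "M", 41, 1, MaritalStatus.MARRIED)
features = compare(a, b)
```

Marital status is stored but left out of the comparison here because a person can move from single to married between 1871 and 1881 without the pair being a mismatch; a real system might still use it as a weak signal.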
Automatic Linkage
• The challenges:
  1) Identify the same person
  2) Deal with attribute characteristics
  3) Manage computational expense
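To give a feel for the third challenge: comparing every 1871 record with every 1881 record would mean roughly 1.5 × 10^13 pairs. One common way to cut this down is blocking, where only records sharing a cheap key are compared. The sketch below is an illustration, not the authors' method; the blocking key (first letter of the last name plus gender) and the dictionary record layout are assumptions made for the example.

```python
from collections import defaultdict

# Size of the naive comparison space for the two full censuses.
full_comparisons = 3_601_663 * 4_277_807


def blocking_key(record: dict) -> tuple:
    """Hypothetical key: first letter of the last name and the gender code."""
    return (record["last_name"][:1].upper(), record["gender"])


def candidate_pairs(records_a, records_b):
    """Yield only the pairs whose records fall into the same block."""
    blocks = defaultdict(list)
    for rec_b in records_b:
        blocks[blocking_key(rec_b)].append(rec_b)
    for rec_a in records_a:
        # .get avoids creating empty blocks for keys that never occur in records_b
        for rec_b in blocks.get(blocking_key(rec_a), []):
            yield rec_a, rec_b


census_1871 = [
    {"last_name": "Ross", "gender": "M"},
    {"last_name": "Inwood", "gender": "F"},
]
census_1881 = [
    {"last_name": "Rose", "gender": "M"},
    {"last_name": "Ross", "gender": "F"},
    {"last_name": "Inwood", "gender": "F"},
]
pairs = list(candidate_pairs(census_1871, census_1881))
```

In this toy input, blocking leaves 2 candidate pairs out of 6 possible ones. The trade-off is that any true match whose key differs between censuses, for instance a last name transcribed with a different first letter, is never compared at all.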