GESIS – Vocabulary, Statistics, Time and Geography
Combining Statistics and Text for a View of Irish Cultural Heritage
IASSIST 2009, Tampere, Finland, May 27, 2009
• Fredric C. Gey
– UC Data Archive & Technical Assistance, University of California, Berkeley
– http://ucdata.berkeley.edu/gey.html
• Institute for Museum and Library Services Grants:
– Seamless search of textual and numeric databases (1999-2002)
– Going places in the catalog: Improved Geographic Access (2002-2004)
– What, Where, When and Why: support for the learner (2004-2006)
– Bringing Lives to Light: Biography in Context (2006-2008)
– Context and Relationships: Ireland and Irish Studies (2007-2010)
• Colleagues: Michael Buckland, Ray Larson, Kim Carl, Jeanette Zerneke, and a host of students including Ryan Shaw and Vivien Petras
• Collaboration with Centre for Digitisation, Queen's University Belfast
– Paul Ell, collaborating PI
Heterogeneous Digital Information Search: Current Search Technology
(multiple independent searches without search aids)
[Diagram: one QUERY, sent separately to each information type]
• Bibliography
• Full Text
• Maps and other geospatial data
• Music and other media
• Numeric statistical databases
• Patents
Heterogeneous Digital Information Search: Direct Mappings and Search Between Multiple Information Types
[Diagram labels: QUERY, EVMs (EVM, mEVM, pEVMt, EVMg), QUERYplus; information types searched:]
• Bibliography
• Full Text
• Numeric statistical databases
• Patents
• Maps and other geospatial data
• Music and other media
Context and Relationships: Ireland and Irish Studies (Goals)
(2007-2010 NEH/IMLS Grant)
• Enable automatic and manual editorial markup of scanned scholarly materials for personal names and geography
• Recognition of place and person names in Middle English and Gaelic
• Combine historical statistics with external search of documents by geographic commonality
• Utilize Hogan's Onomasticon Goedelicum locorum et tribuum Hiberniae et Scotiae: An index, with identifications, to the Gaelic names of places and tribes (Edmund Hogan, SJ, 1909), a kind of concordance of Irish documents by place
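The place-name recognition goal can be pictured as a gazetteer lookup: historical spellings are mapped to a modern identification, which is roughly the role a concordance such as the Onomasticon could play. The sketch below is only an illustration under that assumption. The tiny `GAZETTEER` table and the function names are hypothetical and are not taken from the Onomasticon or from the project's software.

```python
import re
import unicodedata

# Hypothetical mini-gazetteer: variant spelling -> modern place name.
# A real table would be derived from a source such as Hogan's Onomasticon.
GAZETTEER = {
    "baile átha cliath": "Dublin",
    "áth cliath": "Dublin",
    "corcaigh": "Cork",
    "doire": "Derry",
}


def normalize(name: str) -> str:
    """Apply Unicode NFC and case-fold so accented forms compare equal."""
    return unicodedata.normalize("NFC", name).casefold()


def tag_places(text: str) -> list[tuple[str, str]]:
    """Return (variant, modern name) pairs found in text, longest variants first."""
    found = []
    remaining = normalize(text)
    for variant in sorted(GAZETTEER, key=len, reverse=True):
        # Match the variant only as a whole word or phrase.
        pattern = r"(?<!\w)" + re.escape(normalize(variant)) + r"(?!\w)"
        if re.search(pattern, remaining):
            found.append((variant, GAZETTEER[variant]))
            # Blank out the match so shorter variants inside it are not counted again.
            remaining = re.sub(pattern, " ", remaining)
    return found
```

Matching longest variants first keeps "Baile Átha Cliath" from also being reported through a shorter overlapping form.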
Who, What, Where, When IMLS Project (2004-200 IMLS grant)
Developed multi-genre search using common geography (data/books)
Biography Markup and Search Goals (2006-2006 IMLS grant)
• To develop tools for editors, archivists and compilers of historical papers
– Emma Goldman papers
• To develop display in time/space to facilitate historical discovery, i.e. who lived there at the same time and what important events occurred there
• To visualize biography as an ordered sequence of 4-tuple events (activity, time, place, other-people)
– developing biographical markup standards
• Congressional Biography
– automatic markup of place, date, time-range
<biog source="cong_dict" page_start="19" page_end="19">
<name> ADAMS, JOHN QUINCY. </name>
<text> Born in Braintree, now Quincy, Mass., July 11, 1767. When ten years of age, he accompanied his father to France ; and when fifteen, was private secretary to the American Minister in Russia. He was graduated at Harvard University in 1787 ; studied law in Newburyport, and settled in Boston. From 1794 to 1801 he was American Minister to Holland, England, Sweden, and Prussia. He was a Senator in Congress from 1803 to 1808
</text></biog>
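The idea of biography as an ordered sequence of 4-tuple events, together with automatic markup of time ranges, can be sketched in a few lines of Python. This is a minimal illustration, assuming a simple "from YYYY to YYYY" pattern. The `Event` type and `extract_range_events` are hypothetical names, not the project's markup tools, and place and other-people are left empty because tagging them is a separate step.

```python
import re
from typing import NamedTuple, Optional


class Event(NamedTuple):
    """One biographical event as a 4-tuple (activity, time, place, other-people)."""
    activity: str
    time: tuple[int, int]            # (start_year, end_year)
    place: Optional[str]             # not filled in by this sketch
    other_people: tuple[str, ...]    # not filled in by this sketch


# Matches phrases such as "From 1794 to 1801" or "from 1803 to 1808".
RANGE = re.compile(r"\bfrom (\d{4}) to (\d{4})\b", re.IGNORECASE)


def extract_range_events(text: str) -> list[Event]:
    """Return one Event per sentence containing a year range, ordered by time."""
    events = []
    # Naive sentence split: a period followed by whitespace.
    for sentence in re.split(r"(?<=\.)\s+", text):
        match = RANGE.search(sentence)
        if match:
            years = (int(match.group(1)), int(match.group(2)))
            events.append(Event(sentence.strip(), years, None, ()))
    return sorted(events, key=lambda e: e.time)


# Text taken from the John Quincy Adams sample entry.
SAMPLE = (
    "He was graduated at Harvard University in 1787 ; studied law in "
    "Newburyport, and settled in Boston. From 1794 to 1801 he was American "
    "Minister to Holland, England, Sweden, and Prussia. He was a Senator in "
    "Congress from 1803 to 1808"
)
```

Calling `extract_range_events(SAMPLE)` yields two events, one for the diplomatic posting and one for the Senate term. Single dates such as "in 1787" would need an additional pattern.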
Biography Markup: Emma Goldman Travels (2006-2009 IMLS grant)
The Atom format feeds directly into Google Maps
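As a rough illustration of how located events might be serialized as an Atom feed, the sketch below builds entries with Python's standard `xml.etree.ElementTree`. It assumes GeoRSS-Simple points for the coordinates, since the slides do not say which geographic encoding the project used. The identifiers, feed title, timestamps and approximate coordinates are placeholders, and the example event is taken from the John Quincy Adams sample rather than from the Goldman data.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
GEORSS = "http://www.georss.org/georss"   # GeoRSS-Simple namespace (assumed encoding)
ET.register_namespace("", ATOM)
ET.register_namespace("georss", GEORSS)


def event_entry(entry_id, title, updated, lat, lon):
    """Build one Atom entry carrying a GeoRSS point written as "lat lon"."""
    entry = ET.Element(f"{{{ATOM}}}entry")
    ET.SubElement(entry, f"{{{ATOM}}}id").text = entry_id
    ET.SubElement(entry, f"{{{ATOM}}}title").text = title
    # atom:updated is when the entry was last modified, not the event date.
    ET.SubElement(entry, f"{{{ATOM}}}updated").text = updated
    ET.SubElement(entry, f"{{{GEORSS}}}point").text = f"{lat} {lon}"
    return entry


def build_feed(feed_id, feed_title, updated, entries):
    """Wrap entries in an Atom feed and return it as an XML string."""
    feed = ET.Element(f"{{{ATOM}}}feed")
    ET.SubElement(feed, f"{{{ATOM}}}id").text = feed_id
    ET.SubElement(feed, f"{{{ATOM}}}title").text = feed_title
    ET.SubElement(feed, f"{{{ATOM}}}updated").text = updated
    feed.extend(entries)
    return ET.tostring(feed, encoding="unicode")


# Placeholder identifiers and approximate coordinates, for illustration only.
EXAMPLE_FEED = build_feed(
    "urn:example:feed",
    "Example biography events",
    "2009-05-27T00:00:00Z",
    [
        event_entry(
            "urn:example:event-1",
            "Born in Braintree, now Quincy, Mass., July 11, 1767",
            "2009-05-27T00:00:00Z",
            42.25,
            -71.0,
        )
    ],
)
```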
From Publishing Context to Building Context
Context and Relationships: Ireland and Irish Studies
(2007-2010 NEH/IMLS Grant)
• Collaboration with Centre for Digitisation, Queen's University Belfast
– Digitizing ~500,000 pages of Irish Historical and Cultural Studies
• To develop display and contextual search in time/space to facilitate scholarly discovery: http://gray.ischool.berkeley.edu/oldw4/irish/
Digital Library of Core Materials on Ireland exemplar
• £620,000 grant from JISC to digitise journals, monographs and manuscripts relating to Irish Studies and create the foundations of a digital library resource
• Initial archive of around 470,000 pages
• 100 journals covering a 200-year period and about 400,000 pages
• 2,500 pages of manuscript
• 205 key monographs
• Machine-readable text for all journals and monographs and some manuscripts
• Detailed ‘object’ level metadata
Project Imperatives
• Access to rare resources without visiting Belfast
• Resource discovery – use of less common journals
• New, complex searching using detailed metadata and semantic searching
• Serendipity
• A one stop shop for journals – and more
• Enhanced research developing from better access
Ireland and Irish Studies: Statistical Data about Ireland
• Centre for Digitisation, Queen's University Belfast, has digitized 200 years of Irish Historical Statistics
• We wish to integrate statistical data display with scholarly search and browsing by time and place
The Database of Irish Historical Statistics
• 32,934,018 data values from 1821 to 1971, and then linked to contemporary digital sources
• Mostly census data but also annual agricultural statistics, civil registration information, crime statistics . . .
• Topics include population statistics, crop and stock data, language, literacy, religion, occupations, employment, housing, emigration, industry and industrial structure, trade and commerce, wages, pauperism, etc.
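Integrating statistical data with search and browsing by time and place amounts to joining two kinds of records on a shared place name and a nearby date. The sketch below shows one simple way to do that, assuming both sides already carry a normalized place name and a year. The record layouts, the ten-year window and the sample records are hypothetical placeholders, not values from the Database of Irish Historical Statistics.

```python
from collections import defaultdict


def index_by_place(records):
    """Group records by case-folded place name."""
    index = defaultdict(list)
    for record in records:
        index[record["place"].casefold()].append(record)
    return index


def related_documents(stat, doc_index, window=10):
    """Documents about the same place dated within `window` years of the statistic."""
    candidates = doc_index.get(stat["place"].casefold(), [])
    return [d for d in candidates if abs(d["year"] - stat["year"]) <= window]


# Placeholder records for illustration only (no real data values).
EXAMPLE_STAT = {"place": "Antrim", "year": 1841, "variable": "population"}
EXAMPLE_DOCS = [
    {"title": "Placeholder article A", "place": "antrim", "year": 1845},
    {"title": "Placeholder article B", "place": "Antrim", "year": 1901},
    {"title": "Placeholder article C", "place": "Cork", "year": 1840},
]
EXAMPLE_MATCHES = related_documents(EXAMPLE_STAT, index_by_place(EXAMPLE_DOCS))
```

With these placeholders only article A is returned: B is about the same place but falls outside the time window, and C is about a different place.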
Ireland and Irish Studies: Our new approach
Utilize the capabilities of Google Earth
• Obtain historic Irish sub-county boundary files (Baronies and Poor Law Unions)
Ireland and Irish Studies: Our new approach
Utilize the capabilities of Google Earth (2)
• Utilize the KML markup language to integrate statistical data display with scholarly search and browsing by time and place
Ireland and Irish Studies: Google Earth (3)
Search links added to statistical data display
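To make the KML approach concrete, the sketch below generates a KML Placemark whose description shows a statistic and a search link, and whose TimeSpan lets Google Earth filter by year. It is only an illustration: the search endpoint `https://example.org/search`, the function names, the approximate coordinates and the placeholder value are assumptions, and a Point stands in for the barony or Poor Law Union polygons that real boundary files would supply.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

KML = "http://www.opengis.net/kml/2.2"
ET.register_namespace("", KML)

SEARCH_BASE = "https://example.org/search"  # hypothetical search endpoint


def placemark(name, year, variable, value, lon, lat):
    """Build a Placemark showing one statistic plus a search link for the place."""
    pm = ET.Element(f"{{{KML}}}Placemark")
    ET.SubElement(pm, f"{{{KML}}}name").text = name
    link = SEARCH_BASE + "?" + urlencode({"place": name, "year": year})
    # HTML in the description is escaped on output and rendered in the balloon.
    ET.SubElement(pm, f"{{{KML}}}description").text = (
        f'{variable} ({year}): {value}<br/>'
        f'<a href="{link}">Search documents about {name}</a>'
    )
    span = ET.SubElement(pm, f"{{{KML}}}TimeSpan")
    ET.SubElement(span, f"{{{KML}}}begin").text = str(year)
    ET.SubElement(span, f"{{{KML}}}end").text = str(year)
    point = ET.SubElement(pm, f"{{{KML}}}Point")
    # KML coordinates are written longitude first, then latitude.
    ET.SubElement(point, f"{{{KML}}}coordinates").text = f"{lon},{lat}"
    return pm


def document(placemarks):
    """Wrap placemarks in a KML Document and return the XML string."""
    kml = ET.Element(f"{{{KML}}}kml")
    doc = ET.SubElement(kml, f"{{{KML}}}Document")
    doc.extend(placemarks)
    return ET.tostring(kml, encoding="unicode")


# Placeholder value and approximate coordinates, for illustration only.
EXAMPLE_KML = document(
    [placemark("Antrim", 1841, "population", "(value from database)", -6.2, 54.7)]
)
```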
Ireland and Irish Studies: next steps
• Add more statistics
– Religion (percent Catholic, Protestant, other)
– Agriculture
• Add more resources to search
• Begin working with and geographically indexing the 500k pages of Irish journals and books
• Refine our user interfaces and develop more prototype demonstrations
References
• M. Buckland and L. Lancaster, 2004. "Combining Place, Time, and Topic." D-Lib Magazine, May 2004, Volume 10, Number 5. http://www.dlib.org/dlib/may04/buckland/05buckland.html
• M. Buckland, A. Chen, F. Gey and R. Larson, 2006. "Search Across Different Media: Numeric Data Sets and Text Files." Information Technology and Libraries, December 2006, pp. 181-189.
• M. Buckland, A. Chen, F. Gey, R. Larson, R. Mostern and V. Petras, 2007. "Geographic Search: Catalogs, Gazetteers, and Maps." College & Research Libraries, Sept 2007.
• F. Gey, R. Shaw, R. Larson, M. Buckland, B. Pateman and D. Melia, 2008. "Marking Up Cultural Materials for Time and Geography." In Proceedings of the Workshop on Information Access to Cultural Heritage, Aarhus, Denmark, Sept 28, 2008.
• F. Gey, R. Shaw, R. Larson and B. Pateman, 2008. "Biography as events in time and space." Proceedings of ACM GIS Conference, Irvine, California, Nov 4-7, 2008.
• Emma Goldman papers: http://sunsite.berkeley.edu/Goldman/
• Onomasticon: http://www.ucc.ie:8080/cocoon/doi/locus