The TARO Project: Texas Archival Resources Online
Fred Gilmore, Sr Operating Systems Specialist, UT Austin General Libraries
[email protected]
April 30, 2004
Feb 11, 2016
What It Is . . .
A project to make Texas archive and manuscript collection finding aids available through the Web.
“finding aid”: descriptive summary and inventory of a material collection housed at a specific archive; not the materials themselves.
Currently: 1500+ searchable, browsable finding aids, 5000+ hits / day
How it came to be . . .
Two grant-funded phases:
– Outsourced scanning, OCR, XML tagging of existing paper finding aids
– Training/hardware/software for creation of new finding aids
– Phase I (2000 – 2001): 14 participating repositories
– Phase II (2002 – 2003): additional 11 repositories
Participating Repositories
Alexander Architectural Archive (UT Austin)
Center For American History (UT Austin)
Benson Latin America Collection (UT Austin)
Ransom Humanities Research Center (UT Austin)
Texas State Library
Texas Tech Southwest Collection/University Archives
University of Houston Special Collections/University Archives
Rice University
Texas A&M
Houston Public Library
Austin History Center
UT San Antonio
Texas State University
Southern Methodist University
UT Medical Branch – Galveston
MD Anderson
UT El Paso
UT Pan American
UT Arlington
How It Came To Be . . .
Why XML?
– Compose once, format many
– XML and related standards make data exchange/reuse and description easier by separating content from presentation.
Creating content for TARO
Archives staff:
– Edit or compose XML tagged electronic version of finding aid (new finding aids are created using a text/XML editor such as Corel XMetaL)
– Submit file to UT Austin server
.
.
<unittitle label="Title:" encodinganalog="245$a">Thomas J. Rollins Papers, <unitdate type="inclusive" encodinganalog="245$f" label="Dates:"
    era="ce" calendar="gregorian">1875-1997 and undated</unitdate></unittitle>
<abstract label="Abstract:" encodinganalog="520$a">The personal papers of Thomas J. Rollins from 1875-1997 and undated.</abstract>
<unitid countrycode="us" repositorycode="TxLT-SW"
    encodinganalog="099" label="Collection #">S 1261.1</unitid>
<repository label="Repository:" encodinganalog="852$a">
  <corpname>
    <subarea>Southwest Collection/Special Collections Library,</subarea>
.
.
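The tagged excerpt above can be read back by any XML parser. The sketch below is only an illustration, not part of the TARO toolchain: it assumes the excerpt is wrapped in a `<did>` element, leaves out the truncated `<repository>` element so the sample parses, and uses a hypothetical helper named `labeled_fields` to pull out each labeled field.

```python
import xml.etree.ElementTree as ET

# Assumption: the slide excerpt wrapped in a <did> element, with the
# truncated <repository> element left out so the sample is well-formed.
SAMPLE = """<did>
  <unittitle label="Title:" encodinganalog="245$a">Thomas J. Rollins Papers,
    <unitdate type="inclusive" encodinganalog="245$f" label="Dates:"
      era="ce" calendar="gregorian">1875-1997 and undated</unitdate></unittitle>
  <abstract label="Abstract:" encodinganalog="520$a">The personal papers of Thomas J. Rollins from 1875-1997 and undated.</abstract>
  <unitid countrycode="us" repositorycode="TxLT-SW"
    encodinganalog="099" label="Collection #">S 1261.1</unitid>
</did>"""


def labeled_fields(xml_text):
    """Return (label, text) pairs for each direct child with a label attribute."""
    root = ET.fromstring(xml_text)
    fields = []
    for child in root:
        label = child.get("label")
        if label is None:
            continue
        # itertext() also gathers nested text (e.g. <unitdate> inside
        # <unittitle>); split/join collapses whitespace from line breaks.
        text = " ".join("".join(child.itertext()).split())
        fields.append((label, text))
    return fields


if __name__ == "__main__":
    for label, text in labeled_fields(SAMPLE):
        print(label, text)
```

Because the labels travel with the data as attributes, the same file can drive several displays without being edited again.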
Creating Content For TARO
UT Austin technical staff:
– XML file is moved into production, error checked, translated into three HTML varieties for viewing.
– HTML content is indexed for searching (keyword and fielded), sorted into repository lists for browsing
http://www.lib.utexas.edu/taro/ttusw/00054/tsw-00054.html
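As a rough illustration of the error-check and translate steps, the sketch below tests a file for well-formedness and then renders two HTML views from the same XML. It is not the TARO pipeline itself: the function names, the `view` parameter, and the "brief"/"full" views are assumptions made for the example, and a production system would more likely use stylesheets than hand-written Python.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a submitted file, based on the excerpt shown earlier.
SAMPLE = """<did>
  <unittitle label="Title:">Thomas J. Rollins Papers</unittitle>
  <abstract label="Abstract:">The personal papers of Thomas J. Rollins from 1875-1997 and undated.</abstract>
</did>"""


def check_well_formed(xml_text):
    """Return None if the text parses, otherwise the parser's error message."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return str(exc)
    return None


def to_html(xml_text, view="full"):
    """Render labeled fields as an HTML definition list.

    view="brief" keeps only the first field; view="full" keeps all of them.
    """
    root = ET.fromstring(xml_text)
    rows = []
    for child in root:
        label = child.get("label")
        if label is None:
            continue
        text = " ".join("".join(child.itertext()).split())
        rows.append((label, text))
    if view == "brief":
        rows = rows[:1]
    items = "\n".join(
        f"  <dt>{html.escape(label)}</dt><dd>{html.escape(text)}</dd>"
        for label, text in rows
    )
    return f"<dl>\n{items}\n</dl>"


if __name__ == "__main__":
    error = check_well_formed(SAMPLE)
    if error is None:
        print(to_html(SAMPLE, view="brief"))
        print(to_html(SAMPLE, view="full"))
    else:
        print("rejected:", error)
```

Both views come from one source file, which is the "compose once, format many" idea in miniature.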
Advantages
Pages are picked up by Google, giving content higher visibility.
Multiple views of content including ability to customize view by running the XML document against a personal stylesheet.
Processing fully automated. HTML translated files can be available within hours.
DC metadata and OAI records provide additional access points.
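To show how a Dublin Core record might be derived from the same XML, here is a minimal sketch. The namespace URIs are the standard Dublin Core and `oai_dc` ones; the element mapping in `EAD_TO_DC` and the function name `ead_to_dc` are assumptions for illustration and may not match the crosswalk TARO actually uses.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
ET.register_namespace("dc", DC_NS)
ET.register_namespace("oai_dc", OAI_DC_NS)

# Hypothetical crosswalk from finding-aid elements to Dublin Core elements.
EAD_TO_DC = {
    "unittitle": "title",
    "abstract": "description",
    "unitid": "identifier",
}

# Assumption: the slide excerpt wrapped in a <did> element.
SAMPLE = """<did>
  <unittitle label="Title:">Thomas J. Rollins Papers,
    <unitdate type="inclusive">1875-1997 and undated</unitdate></unittitle>
  <abstract label="Abstract:">The personal papers of Thomas J. Rollins from 1875-1997 and undated.</abstract>
  <unitid label="Collection #">S 1261.1</unitid>
</did>"""


def ead_to_dc(xml_text):
    """Build an oai_dc record (as a string) from the mapped child elements."""
    source = ET.fromstring(xml_text)
    record = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for child in source:
        dc_name = EAD_TO_DC.get(child.tag)
        if dc_name is None:
            continue  # element has no Dublin Core counterpart in this sketch
        elem = ET.SubElement(record, f"{{{DC_NS}}}{dc_name}")
        elem.text = " ".join("".join(child.itertext()).split())
    return ET.tostring(record, encoding="unicode")


if __name__ == "__main__":
    print(ead_to_dc(SAMPLE))
```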
Challenges
Relationships
– Mediating local needs with federated site requirements.
– Encouraging supplemental metadata creation.
Resources
– Introducing improvements without dedicated staff on either end.
Challenges
Realities of the Web
– User education. Practically a meta-site. Content expectations not met.
– Finding aids can be large. Load times a problem.
– XML Unicode requirements make special characters tricky.
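One way the special-character problem shows up: an XML file with no encoding declaration is read as UTF-8, so an accented letter saved in another encoding such as Latin-1 makes the file fail to parse. The sketch below demonstrates that failure and one possible workaround, numeric character references. The helper names `parses` and `to_ascii_xml` and the sample title are hypothetical.

```python
import xml.etree.ElementTree as ET


def parses(data):
    """Return True if the bytes parse as XML, False otherwise."""
    try:
        ET.fromstring(data)
    except ET.ParseError:
        return False
    return True


def to_ascii_xml(text):
    """Replace every non-ASCII character with a numeric character reference."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


if __name__ == "__main__":
    title = "Colecci\u00f3n"  # hypothetical title with an accented letter
    document = f"<unittitle>{title}</unittitle>"

    # The same text saved in two encodings, with no XML declaration.
    print(parses(document.encode("utf-8")))    # UTF-8 bytes are accepted
    print(parses(document.encode("latin-1")))  # Latin-1 bytes are rejected

    # Character references keep the file pure ASCII and still round-trip.
    safe = to_ascii_xml(document)
    print(safe)
    print(ET.fromstring(safe).text == title)
```

Character references sidestep editor and transfer encoding mismatches, at the cost of making the source harder to read for staff editing by hand.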
Future Plans
Searching: search XML directly
Content: fund the creation, serving of pictures, sound, video
Participation: more repositories = more content
Access: Open Archives, RDF metadata
Flexibility: provide stylesheet for direct XML browsing, PDF creation for hardcopy