Technology, workflow, and protocols in collaboratively edited digital editions
Juan Garcés, British Library
eIS, 20 June 2007
Overview
• Technology
  – XML
• Workflow
  – quality control
  – quality improvement
• Protocols
  – author attribution
  – identification and retrieval
• What is ‘text’?
Technology
XML
• Text Encoding Initiative
  – open standard and guidelines
  – de facto standard for Humanities texts
  – crucial: consistency (ODD), separation of critical perspective (?)
• challenge: OHCO data model only allows one hierarchy
• encoding disagreement
• texts are more complex
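The single-hierarchy limit can be illustrated with a minimal sketch. It assumes two ways of dividing the same text, by sentence and by physical line, each recorded as character offsets. The sample text, the `Span` type, and the function names are hypothetical and not part of TEI. Two spans fit into one tree only if they are disjoint or one contains the other; any pair that crosses needs a second hierarchy.

```python
from typing import NamedTuple


class Span(NamedTuple):
    """A labelled stretch of text as character offsets [start, end)."""
    label: str
    start: int
    end: int


def can_share_tree(a: Span, b: Span) -> bool:
    """True if a and b are disjoint or one lies entirely inside the other."""
    disjoint = a.end <= b.start or b.end <= a.start
    a_in_b = b.start <= a.start and a.end <= b.end
    b_in_a = a.start <= b.start and b.end <= a.end
    return disjoint or a_in_b or b_in_a


def crossing_pairs(layer_a: list[Span], layer_b: list[Span]) -> list[tuple[Span, Span]]:
    """Return every pair of spans that crosses and so cannot sit in one hierarchy."""
    return [(a, b) for a in layer_a for b in layer_b if not can_share_tree(a, b)]


# Hypothetical sample: two sentences written across two physical lines.
text = "one two three. four five six."
sentences = [Span("s1", 0, 14), Span("s2", 15, 29)]
lines = [Span("l1", 0, 19), Span("l2", 20, 29)]

crossings = crossing_pairs(sentences, lines)
for sentence, line in crossings:
    print(f"{sentence.label} {text[sentence.start:sentence.end]!r} crosses "
          f"{line.label} {text[line.start:line.end]!r}")
```

Here the second sentence begins on the first line and ends on the second, so sentence and line elements cannot both be properly nested in one XML tree.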
Desideratum
• simple editing environment that allows:
  – encoding of heterogeneous aspects of the text
  – multiple instances of the same ‘layer’ (disagreement)
  – analysis of interrelation between instances and layers
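One possible data model for this desideratum is stand-off annotation: the base text stays untouched, and each editor's encoding is a separate record pointing into it. The sketch below is only an illustration under that assumption. The `Annotation` fields, layer names, editor names, and values are all hypothetical. It stores several instances of the same layer and reports where the instances disagree.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    layer: str   # aspect of the text being encoded, e.g. "reading" or "hand"
    editor: str  # who produced this instance of the layer
    start: int   # character offset into the base text (inclusive)
    end: int     # character offset into the base text (exclusive)
    value: str   # the editor's encoding for that stretch


def disagreements(annotations):
    """Group by (layer, start, end) and keep groups whose editors differ in value."""
    grouped = defaultdict(dict)
    for a in annotations:
        grouped[(a.layer, a.start, a.end)][a.editor] = a.value
    return {key: by_editor for key, by_editor in grouped.items()
            if len(set(by_editor.values())) > 1}


# Hypothetical sample: two editors encode the "reading" layer, one the "hand" layer.
annotations = [
    Annotation("reading", "editor1", 0, 5, "kai"),
    Annotation("reading", "editor2", 0, 5, "kai"),
    Annotation("reading", "editor1", 6, 9, "ho"),
    Annotation("reading", "editor2", 6, 9, "to"),
    Annotation("hand", "editor1", 0, 9, "scribe A"),
]

conflicts = disagreements(annotations)
for (layer, start, end), by_editor in conflicts.items():
    print(layer, start, end, by_editor)
```

Because every record carries its own layer and editor, heterogeneous layers and competing instances coexist without forcing them into one hierarchy.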
Workflow
Quality control: peer review/refereeing
• uphold standards of academic disciplines
• stricter application since the middle of the twentieth century
• anonymity (seldom ‘double-masked’) and independence
• criticisms:
  – slow process (sometimes iterative process)
  – susceptible to control by elites and to personal jealousy
  – lacks accountability
  – may be biased and inconsistent
  – failure to catch all fundamental errors
  – fraud
Quality control: Wikipedia model
• mass-publication tool converted into mass-authoring tool
• everyone can edit contents
• mistakes are eradicated by community
• advantages:
  – timeliness
  – impressive workforce
  – democracy
• problems:
  – susceptible to spam and vandalism
  – always a work in progress
  – downplays individual contribution
  – deters participation by scholars
Quality control: hybrids
• alternatives to traditional peer review:
  – open peer review (reviewers’ names made known)
  – parallel open peer review
  – voluntary peer review (publication first)
  – extended peer review (beyond publication date)
• true hybrids:
  – content-appropriate marriage of community-oriented, collaborative editing and scholarly editorial process
Quality improvement: sequential print publication
[Diagram. Labels: Manuscript/surrogate; Editor 1, Editor 2, Editor 3; Edition 1, Edition 2, Edition 3; improved Edition]
Quality improvement: simultaneous digital publication
[Diagram. Labels: Manuscript/surrogate; Editor 1, Editor 2, Editor 3; improved Edition]
Protocols
Author attribution
• social, legal, and technological genealogy
  – social: the 18th c. introduced a new concept of individualised authorship based on the idea of a creative genius working alone, the “privileged moment of individualization in the history of ideas, knowledge, literature, philosophy, and the sciences” (Foucault)
  – legal: “1710 Copyright Act”, or “Act for the Encouragement of Learning and the Securing the Property of Copies of Books to the Rightful Owners Thereof”
  – technological: coincides with the perfection of the movable-type printing press
• essential for evaluating professional output of Humanists (grant application, tenure, etc.)
• solutions for collaborative ‘authoring’:
  – hierarchy of authors (lead, assistant, etc. – pre-assigned?)
  – editing profile (contribution broken down into modular or granular input – how to quantify quality?)
  – peer assessment
• any solution needs to be accepted in professional evaluation scenarios!
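As a rough illustration of an ‘editing profile’, the sketch below tallies granular contributions per editor from a revision log. The log format, the editor names, and the contribution kinds are hypothetical. It counts quantity only and leaves the question of how to quantify quality open.

```python
from collections import Counter, defaultdict


def editing_profile(log):
    """Count contributions per editor, broken down by kind of input."""
    profile = defaultdict(Counter)
    for editor, kind in log:
        profile[editor][kind] += 1
    return {editor: dict(counts) for editor, counts in profile.items()}


# Hypothetical revision log: (editor, kind of contribution).
log = [
    ("editor1", "transcription"),
    ("editor2", "correction"),
    ("editor1", "transcription"),
    ("editor1", "annotation"),
]

profile = editing_profile(log)
for editor, counts in profile.items():
    print(editor, counts)
```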
The Canonical Text Services (CTS) Protocol
• developed by Neel Smith in conjunction with the Center for Hellenic Studies (Washington, DC)
• defines a network service for identifying and working with texts
• permanence and citability of scholarly published works – they are “works possessing an explicitly identified edition and explicitly identified citation scheme, that can be irrevocably and identically replicated”
• digital library distributed objects accessible via a suite of network services (simple identification and retrieval)
The Canonical Text Services (CTS) Protocol
• hierarchical TextInventory (following FRBR, includes identification of how to validate a document):
  – TextGroup+ (author, collection)
  – Work+ (notional)
  – Edition/Translation* (specific versions)
  – Exemplar* (specific physical copies)
• hierarchical model for citation of sections of a work (recursively nesting <citation>, mapping XPath expression)
• requests
  – requests expressed as URL parameters
  – replies formatted as well-formed XML
  – requests: GetCapabilities, GetWorks, GetValidReff, GetDocumentMetadata, GetPassage, DownloadText
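The request/reply pattern can be sketched as follows. This is a minimal illustration, not the protocol's reference client. The endpoint `http://example.org/cts`, the parameter names `request` and `urn`, the identifier value, and the shape of the sample reply are all assumptions; only the request name `GetPassage` comes from the list above. No network call is made.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

BASE_URL = "http://example.org/cts"  # hypothetical endpoint


def build_request(request: str, **params: str) -> str:
    """Express a request as URL parameters (parameter names are assumed)."""
    query = {"request": request, **params}
    return f"{BASE_URL}?{urlencode(query)}"


def passage_text(reply_xml: str) -> str:
    """Parse a reply and return its text; raises ParseError if not well-formed."""
    root = ET.fromstring(reply_xml)
    return "".join(root.itertext()).strip()


# Hypothetical identifier for section 1.1 of a work.
url = build_request("GetPassage", urn="urn:cts:example:group.work:1.1")
print(url)

# Hypothetical reply; a real service defines its own element names.
sample_reply = "<reply><passage>first line of the work</passage></reply>"
print(passage_text(sample_reply))
```

Keeping the request in the URL makes every passage addressable by a plain link, which is what the citability requirement asks for.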
Desiderata
• impermanence (time stamps, editions)
• new entities (data repository vs. VRE scenario)