What’s the difference between Tony Blair and Mother Theresa?
(Human Language Technology for Preservation return on investment)
http://gate.ac.uk/ http://nlp.shef.ac.uk/
Hamish Cunningham, Dept. Computer Science, University of Sheffield
Alghero, March 2004
GATE team projects
Past:
• Conceptual indexing: MUMIS: automatic semantic indices for sports video
• MUSE, cross-genre entity finder
• HSL, Health-and-safety IE
• Old Bailey: collaboration with HRI on 17th century court reports
• Multiflora: plant taxonomy text analysis for biodiversity research e-science
Present:
• Advanced Knowledge Technologies: €12m UK five site collaborative project
• EMILLE: S. Asian languages corpus
• ACE / TIDES: Arabic, Chinese NE
• JHU summer workshop on semtagging
Future:
• Five new projects inc. PrestoSpace

A bit of a nuisance (GATE users)
Thousands of users at hundreds of sites. A representative sample:
• the American National Corpus project
• the Perseus Digital Library project, Tufts University, US
• Longman Pearson publishing, UK
• Merck KGaA, Germany
• Canon Europe, UK
• Knight Ridder, US
• BBN (leading HLT research lab), US
• SMEs inc. Sirma AI Ltd., Bulgaria
• Imperial College, London, the University of Manchester, UMIST, the University of Karlsruhe, Vassar College, the University of Southern California and a large number of other UK, US and EU Universities
• UK and EU projects inc. MyGrid, CLEF, dotkom, AMITIES, Cub Reporter, EMILLE, Poesia...
GATE – infrastructure for semantic metadata extraction
• Combines learning and rule-based methods (new work on mixed-initiative learning)
• Allows combination of IE and IR
• Enables use of large-scale linguistic resources for IE, such as WordNet
• Supports ontologies as part of IE applications: Ontology-Based IE
• Supports languages from Hindi to Chinese, Italian to German
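
To make the rule-based side of the list above concrete, here is a minimal sketch of entity finding with a gazetteer lookup plus one pattern rule, returning character-offset annotations. It is an illustration only and not GATE's API (GATE is a Java framework): the `Annotation` class, the `GAZETTEER` entries, the `TITLE_RULE` pattern and the `annotate` function are all hypothetical names chosen for this example.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """A typed span over the source text, identified by character offsets."""
    start: int
    end: int
    kind: str
    text: str


# Hypothetical gazetteer: known names mapped to an entity type.
GAZETTEER = {
    "Tony Blair": "Person",
    "Mother Theresa": "Person",
    "Sheffield": "Location",
}

# Hypothetical rule: a title followed by a capitalised word is a Person.
TITLE_RULE = re.compile(r"\b(?:Mr|Mrs|Ms|Dr|Prof)\.? [A-Z][a-z]+")


def annotate(text):
    """Return annotations from gazetteer lookup and the title rule, in text order."""
    annotations = []
    # Gazetteer pass: exact string matches against the known-name list.
    for name, kind in GAZETTEER.items():
        for match in re.finditer(re.escape(name), text):
            annotations.append(
                Annotation(match.start(), match.end(), kind, match.group())
            )
    # Rule pass: pattern-based matches for names the gazetteer does not list.
    for match in TITLE_RULE.finditer(text):
        annotations.append(
            Annotation(match.start(), match.end(), "Person", match.group())
        )
    return sorted(annotations, key=lambda a: (a.start, a.end))


if __name__ == "__main__":
    for ann in annotate("Tony Blair met Dr Smith in Sheffield."):
        print(ann.start, ann.end, ann.kind, ann.text)
```

Keeping the annotations separate from the text, as offsets, is one simple way to let several extraction passes (lookup, rules, or a learned model) contribute to the same document without rewriting it.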
(Not the) MAD Semantics Architecture
[Architecture diagram. Recoverable labels: Sources (EN, IT, AV Signals); IE; Formal Text; Signal md, Transcriptions, ASR, etc.; Merging; Final Annotations; Multilingual Conceptual Q & A; Ontology-Based Metadata]
Archiving is not a luxury
• C21st: all the C20th mistakes but bigger & better?
• If you don’t know where you’ve been, how can you know where you’re going?
• Archives: ammunition in the war on ignorance
• Ammunition is useless if you can’t find it: new technology must make our history accessible to all, for all our futures
More information: http://gate.ac.uk/ http://www.prestospace.org/