Temporal Information Retrieval
Stefano Giovanni Rizzo 1, Danilo Montesi 2, Matteo Brucato 3
1,2 Department of Computer Science and Engineering - University of Bologna
3 School of Computer Science - University of Massachusetts Amherst
Information Retrieval
§ Information Retrieval (IR) is defined as the activity of
§ finding information resources (usually documents)
§ of an unstructured nature (usually text)
§ relevant to an information need (the user query)
§ from within a large collection (e.g. the web).
§ But… is the text truly unstructured?
§ As we’ll see, structured information, like temporal expressions, can be extracted from text data resources by means of NLP tools.
§ The temporal information in documents can be used to improve the IR effectiveness.
IR Effectiveness
§ To assess the effectiveness of an IR system (the quality of its search results), two measures are computed over the results the system returns for a query:
§ Precision: What fraction of the returned results are relevant to the information need?
§ Recall: What fraction of the relevant documents in the collection were returned by the system?
IR Evaluation
§ To measure ad hoc IR effectiveness in the standard way, we need a test collection consisting of three things:
1. A document collection
2. A set of information needs, in the form of user queries
3. A set of relevance judgments, in the form of a binary assessment (relevant or nonrelevant) for each query–document pair.
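The three components listed above can be pictured as a small data structure. The following is only a minimal sketch: the class name `EvalCollection`, its field names, and the toy documents and queries are all hypothetical and do not come from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class EvalCollection:
    """Hypothetical container for the three parts of a test collection."""
    documents: dict[str, str]   # 1. document collection: doc_id -> text
    queries: dict[str, str]     # 2. information needs: query_id -> query text
    # 3. binary relevance judgments: the (query_id, doc_id) pairs judged
    #    relevant; every pair not listed here is treated as nonrelevant.
    relevant_pairs: set[tuple[str, str]] = field(default_factory=set)

    def is_relevant(self, query_id: str, doc_id: str) -> bool:
        return (query_id, doc_id) in self.relevant_pairs

    def relevant_docs(self, query_id: str) -> set[str]:
        return {d for (q, d) in self.relevant_pairs if q == query_id}


# Toy data, invented purely for illustration.
collection = EvalCollection(
    documents={"d1": "olympic games 2012", "d2": "cooking pasta", "d3": "london olympics"},
    queries={"q1": "olympics"},
    relevant_pairs={("q1", "d1"), ("q1", "d3")},
)
```

Storing only the relevant pairs is a design choice of this sketch; a real test collection may also record explicit nonrelevant judgments and leave other pairs unjudged.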
Precision and Recall (1/3)
[Figure: Venn diagram of the document collection, showing the documents relevant to a specific query, the retrieved documents, and their overlap, the relevant retrieved documents.]
Precision and Recall (2/3)
Relevant Retrieved Documents = Relevant Documents ∩ Retrieved Documents
Precision and Recall (3/3)
$$\text{Precision} = \frac{|\text{Relevant Retrieved Documents}|}{|\text{Retrieved Documents}|}$$
$$\text{Recall} = \frac{|\text{Relevant Retrieved Documents}|}{|\text{Relevant Documents}|}$$
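As a concrete illustration of the two ratios, here is a minimal set-based sketch. The function names and the document identifiers are assumptions made for this example, and returning 0.0 for an empty denominator is a convention chosen here, not something the slides specify.

```python
def precision(relevant: set, retrieved: set) -> float:
    """Fraction of the retrieved documents that are relevant."""
    if not retrieved:
        return 0.0  # convention assumed here for an empty result list
    return len(relevant & retrieved) / len(retrieved)


def recall(relevant: set, retrieved: set) -> float:
    """Fraction of the relevant documents that were retrieved."""
    if not relevant:
        return 0.0  # convention assumed here when nothing is relevant
    return len(relevant & retrieved) / len(relevant)


# Hypothetical judgments and results for a single query.
relevant = {"d1", "d2", "d3", "d4"}
retrieved = {"d1", "d2", "d5"}

p = precision(relevant, retrieved)  # 2 relevant retrieved out of 3 retrieved
r = recall(relevant, retrieved)     # 2 relevant retrieved out of 4 relevant
```

Both functions share the same numerator, the size of the intersection; only the denominator changes.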
Precision @ K
▪ Set a rank threshold K
▪ Compute % relevant in top K
▪ Ignores documents ranked lower than K
▪ Ex:
▪ Precision@3 is 2/3
▪ Precision@4 is 2/4
▪ Precision@5 is 3/5
▪ In similar fashion we have Recall@K
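The rank-threshold idea can be sketched in a few lines. The ranking below is hypothetical: it is simply one ordering of relevant and nonrelevant results that reproduces the Precision@3, Precision@4 and Precision@5 values quoted above. The total number of relevant documents used for Recall@K (3) is also an assumption.

```python
def precision_at_k(rels: list[bool], k: int) -> float:
    """Share of the top-k results that are relevant.

    rels[i] is True when the result at rank i + 1 is relevant.
    A ranking shorter than k is treated as padded with nonrelevant results.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(rels[:k]) / k


def recall_at_k(rels: list[bool], k: int, total_relevant: int) -> float:
    """Share of all relevant documents that appear in the top-k results."""
    if k <= 0:
        raise ValueError("k must be positive")
    if total_relevant <= 0:
        return 0.0
    return sum(rels[:k]) / total_relevant


# Hypothetical ranking: relevant, nonrelevant, relevant, nonrelevant, relevant.
ranking = [True, False, True, False, True]

p3 = precision_at_k(ranking, 3)   # 2/3
p4 = precision_at_k(ranking, 4)   # 2/4
p5 = precision_at_k(ranking, 5)   # 3/5
r3 = recall_at_k(ranking, 3, total_relevant=3)
```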
Example: Average Precision
Example: MAP
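The worked examples for these two slides are not part of the transcript, so the sketch below falls back on the usual textbook definitions rather than the authors' own figures: average precision (AP) averages Precision@K over the ranks K at which a relevant document appears, and mean average precision (MAP) averages AP over a set of queries. The rankings are hypothetical, and the sketch assumes that every relevant document appears somewhere in the ranking unless `total_relevant` is passed explicitly.

```python
from typing import Optional


def average_precision(rels: list[bool], total_relevant: Optional[int] = None) -> float:
    """Mean of Precision@K over the ranks K that hold a relevant document."""
    hits = 0
    precision_sum = 0.0
    for rank, is_rel in enumerate(rels, start=1):
        if is_rel:
            hits += 1
            precision_sum += hits / rank  # Precision@rank at this hit
    # Divide by the number of relevant documents; if it is not given,
    # assume all of them were retrieved (an assumption of this sketch).
    denom = total_relevant if total_relevant is not None else hits
    return precision_sum / denom if denom else 0.0


def mean_average_precision(runs: list[list[bool]]) -> float:
    """Average of the per-query AP values."""
    if not runs:
        return 0.0
    return sum(average_precision(r) for r in runs) / len(runs)


# Two hypothetical queries with their ranked relevance flags.
query_1 = [True, False, True, False, True]     # hits at ranks 1, 3, 5
query_2 = [False, True, False, False, False]   # single hit at rank 2

ap_1 = average_precision(query_1)   # (1/1 + 2/3 + 3/5) / 3
ap_2 = average_precision(query_2)   # (1/2) / 1
map_score = mean_average_precision([query_1, query_2])
```

Unlike Precision@K, AP rewards placing relevant documents early: moving the single relevant result of `query_2` from rank 2 to rank 1 would raise its AP from 0.5 to 1.0.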
Temporal Relevance
§ The set of temporal properties of a document is one key aspect that determines a document’s relevance for a query. Despite this, traditional IR still treats temporal information as just another textual keyword.
§ Temporal Information Retrieval aims at satisfying these temporal needs and at combining traditional notions of document relevance with the so-called temporal relevance.
Time and Temporal Scope
§ Time is a ubiquitous dimension of nearly every collection of documents
§ Digital libraries, news stories, tweets, the Web, …