Chances and Challenges in Comparing Cross-Language Retrieval Tools
Giovanna Roda, Vienna, Austria
IRF Symposium 2010 / June 3, 2010
CLEF-IP: the Intellectual Property track at CLEF
CLEF-IP is an evaluation track within the Cross-Language Evaluation Forum (CLEF). [1]
organized by the IRF
first track ran in 2009
running this year for the second time
[1] http://www.clef-campaign.org
What is an evaluation track?
An evaluation track in Information Retrieval is a cooperative action aimed at comparing different techniques on a common retrieval task.
produces experimental data that can be analyzed and used to improve existing systems
fosters exchange of ideas and cooperation
produces a reusable test collection, sets milestones
Test collection
A test collection traditionally consists of target data, a set of queries, and relevance assessments for each query.
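The three components of a test collection map directly onto a data structure, and a recall-style score can be computed against the assessments. A minimal sketch, with invented names and toy data (not from any CLEF-IP tooling):

```python
from dataclasses import dataclass

@dataclass
class TestCollection:
    """A classical IR test collection: target documents, queries,
    and relevance assessments (qrels) naming the documents judged
    relevant for each query."""
    documents: dict[str, str]    # doc_id -> text
    queries: dict[str, str]      # query_id -> query text
    qrels: dict[str, set[str]]   # query_id -> relevant doc_ids

    def recall(self, query_id: str, ranked_doc_ids: list[str]) -> float:
        """Fraction of the judged-relevant documents that a run retrieved."""
        relevant = self.qrels[query_id]
        found = relevant.intersection(ranked_doc_ids)
        return len(found) / len(relevant) if relevant else 0.0

collection = TestCollection(
    documents={"d1": "...", "d2": "...", "d3": "..."},
    queries={"q1": "prior art for some invention"},
    qrels={"q1": {"d1", "d3"}},
)
print(collection.recall("q1", ["d3", "d2"]))  # 0.5
```

Submitted runs are then scored by comparing their ranked lists against the qrels, which is what makes the collection reusable after the track ends.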
CLEF-IP 2009: the task
The main task in the CLEF-IP track was to find prior art for a given patent.
Prior art search
Prior art search consists in identifying all information (including non-patent literature) that might be relevant to a patent's claim of novelty.
Participants - 2009 track
1 Tech. Univ. Darmstadt, Dept. of CS, Ubiquitous Knowledge Processing Lab (DE)
2 Univ. Neuchatel - Computer Science (CH)
3 Santiago de Compostela Univ. - Dept. Electronica y Computacion (ES)
4 University of Tampere - Info Studies (FI)
5 Interactive Media and Swedish Institute of Computer Science (SE)
6 Geneva Univ. - Centre Universitaire d'Informatique (CH)
7 Glasgow Univ. - IR Group Keith (UK)
8 Centrum Wiskunde & Informatica - Interactive Information Access (NL)
Participants - 2009 track
9 Geneva Univ. Hospitals - Service of Medical Informatics (CH)
10 Humboldt Univ. - Dept. of German Language and Linguistics (DE)
11 Dublin City Univ. - School of Computing (IE)
12 Radboud Univ. Nijmegen - Centre for Language Studies & Speech Technologies (NL)
13 Hildesheim Univ. - Information Systems & Machine Learning Lab (DE)
14 Technical Univ. Valencia - Natural Language Engineering (ES)
15 Al. I. Cuza University of Iasi - Natural Language Processing (RO)
Participants - 2009 track
15 participants
48 experiments submitted for the main task
10 experiments submitted for the language tasks
2009-2010: participants
2009-2010: evolution of the CLEF-IP track

2009                          | 2010
1 task: prior art search      | prior art candidate search and classification task
targeting granted patents     | patent applications
15 participants               | 20 participants
all from academia             | 4 industrial participants
families and citations        | include forward citations
manual assessments            | expanded lists of relevant docs
standard evaluation measures  | new measure: PRES, more recall-oriented
What are relevance assessments?
A test collection (also known as a gold standard) consists of a target dataset, a set of queries, and relevance assessments corresponding to each query.
The CLEF-IP test collection:
target data: 2 million EP patents
queries: full-text patents (without images)
relevance assessments: extended citations
Relevance assessments
We used patents cited as prior art as relevance assessments.
Sources of citations:
1 applicant's disclosure: the USPTO requires applicants to disclose all known relevant publications
2 patent office search report: each patent office will do a search for prior art to judge the novelty of a patent
3 opposition procedures: patents cited to prove that a granted patent is not novel
Extended citations as relevance assessments
direct citations and their families
direct citations of family members, and their families
Patent families
A patent family consists of patents granted by different patent authorities but related to the same invention.
simple family: all family members share the same priority number
extended family: there are several definitions; in the INPADOC database, all documents which are directly or indirectly linked via a priority number belong to the same family
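Under the simple-family definition, grouping reduces to bucketing publications by their shared priority number. A minimal sketch with invented record data (real patent data would come from a family database such as INPADOC):

```python
from collections import defaultdict

# Hypothetical patent records: (publication_id, priority_number)
patents = [
    ("EP1000001", "P-001"),
    ("US7000001", "P-001"),   # same priority -> same simple family
    ("JP2000001", "P-002"),
    ("EP1000002", "P-002"),
]

def simple_families(records):
    """Group publications into simple families: one family per
    shared priority number."""
    families = defaultdict(list)
    for pub_id, priority in records:
        families[priority].append(pub_id)
    return dict(families)

print(simple_families(patents))
# {'P-001': ['EP1000001', 'US7000001'], 'P-002': ['JP2000001', 'EP1000002']}
```

An INPADOC-style extended family would instead require transitively merging any groups that share a priority number (e.g. with a union-find structure), since documents can carry several priorities.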
Patent families
Patent documents are linked by priorities (INPADOC family).
CLEF-IP uses simple families.
Relevance assessments 2010
Expanding the 2009 extended citations:
1 include citations of forward citations ...
2 ... and their families
This is apparently a well-known method among patent searchers.
Zig-zag search?
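The two expansion steps amount to set operations over a citation graph. A sketch on a toy graph (the lookup tables and patent IDs are invented for illustration, not CLEF-IP data):

```python
# Toy citation graph: F1 cites A and C; A cites B.
CITATIONS = {           # patent -> patents it cites
    "A": {"B"},
    "F1": {"A", "C"},
}
FORWARD = {"A": {"F1"}}  # patent -> patents citing it (forward citations)
FAMILY = {"B": {"B", "B2"}, "C": {"C", "C2"}}

def family_of(p):
    return FAMILY.get(p, {p})

def expanded_citations(patent):
    """2010-style expansion: direct citations, plus citations of
    forward citations, plus the families of everything collected."""
    relevant = set(CITATIONS.get(patent, set()))   # direct citations
    for fwd in FORWARD.get(patent, set()):         # forward citations...
        relevant |= CITATIONS.get(fwd, set())      # ...contribute their citations
    relevant.discard(patent)                       # drop the topic patent itself
    expanded = set()
    for p in relevant:                             # step 2: add families
        expanded |= family_of(p)
    return expanded

print(sorted(expanded_citations("A")))  # ['B', 'B2', 'C', 'C2']
```

Here C enters the relevant set only via the forward citation F1, mirroring the "zig-zag" (out, back, out again) traversal that patent searchers perform by hand.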
How good are the CLEF-IP relevance assessments?
CLEF-IP uses families + citations:
how complete are extended citations as relevance assessments?
will every prior art patent be included in this set?
and if not, what percentage of prior art items is captured by extended citations?
when considering forward citations, how good are extended citations as a prior art candidate set?
Feedback from patent experts needed
The quality of prior art candidate sets has to be assessed; the know-how of patent search experts is needed.
at CLEF-IP 2009, 7 patent search professionals assessed 12 search results
the task was not well defined and there were misunderstandings about the concept of relevance
the amount of data was not sufficient to draw conclusions
Some initiatives associated with CLEF-IP
The results of evaluation tracks are mostly useful for the research community.
This community often produces prototypes that are of little interest to the end user.
Next I'd like to present two concrete outcomes - not of CLEF-IP directly, but arising from work in patent retrieval evaluation.
Soire
developed at Matrixware
service-oriented architecture - available as a Web service
allows replication of IR experiments based on the classical evaluation model
tested on the CLEF-IP data
customized for the evaluation of machine translation
Spinque
a spin-off (2010) from CWI (the Dutch National Research Center in Computer Science and Mathematics)
introduces search-by-strategy
provides optimized strategies for patent search - tested on CLEF-IP data
transparency: understand your search results to improve your strategy
CLEF-IP 2009 learnings
Humboldt University implemented a model for patent search that produced the best results.
The model combined several strategies:
using metadata (IPC, ECLA)
indexes built at lemma level
an additional phrase index for English
a cross-lingual concept index (multilingual terminological database)
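The slides name the components but not how their evidence was merged. One common way to combine several indexes is a weighted linear fusion of per-index retrieval scores; the sketch below is illustrative only (weights, scores, and function names are invented, not the Humboldt system's actual method):

```python
def fuse_scores(per_index_results, weights):
    """Weighted linear fusion of scored results from several indexes.
    per_index_results: {index_name: {doc_id: score}}
    weights:           {index_name: float}
    Returns (doc_id, fused_score) pairs, best first."""
    fused = {}
    for index_name, results in per_index_results.items():
        w = weights.get(index_name, 0.0)
        for doc_id, score in results.items():
            fused[doc_id] = fused.get(doc_id, 0.0) + w * score
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)

# Toy scores from three hypothetical indexes (lemma, phrase, concept).
results = {
    "lemma":   {"EP1": 0.9, "EP2": 0.4},
    "phrase":  {"EP2": 0.8},
    "concept": {"EP1": 0.2, "EP3": 0.7},
}
ranking = fuse_scores(results, {"lemma": 0.5, "phrase": 0.3, "concept": 0.2})
print([doc for doc, _ in ranking])  # ['EP1', 'EP2', 'EP3']
```

The point of the sketch is that a document missing from one index (EP3 appears only in the concept index) can still enter the final ranking, which is exactly what a cross-lingual concept index contributes for documents in other languages.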
Some additional investigations
Some citations were hard to find

% of runs     class
x ≤ 5         hard
5 < x ≤ 10    very difficult
10 < x ≤ 50   difficult
50 < x ≤ 75   medium
75 < x ≤ 100  easy
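The classes in the table can be assigned mechanically: for each citation, take the percentage of submitted runs that retrieved it and bin it. A sketch using the table's boundaries (the citation IDs and run counts are invented):

```python
def difficulty_class(pct_of_runs):
    """Map the percentage of runs that found a citation to the
    difficulty classes from the table above."""
    if pct_of_runs <= 5:
        return "hard"
    elif pct_of_runs <= 10:
        return "very difficult"
    elif pct_of_runs <= 50:
        return "difficult"
    elif pct_of_runs <= 75:
        return "medium"
    return "easy"

# found_by: citation -> set of runs that retrieved it (toy data)
found_by = {"cit-1": {"run1"}, "cit-2": {"run1", "run2", "run3"}}
total_runs = 4
for cit, runs in found_by.items():
    pct = 100 * len(runs) / total_runs
    print(cit, difficulty_class(pct))  # cit-1 -> difficult, cit-2 -> medium
```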
Some additional investigations
We looked at the content of citations and citing patents. These investigations are ongoing.
Thank you for your attention.