> design > publish > search! http://www.spinque.com/ How to Search Annotated Text by Strategy? Roberto Cornacchia Wouter Alink Arjen P. De Vries Spinque B.V. CLIN 2013, 18 January 2013

How to Search Annotated Text by Strategy?

Jun 27, 2015

Arjen de Vries

Spinque presentation at the 23rd Meeting of Computational Linguistics in the Netherlands (CLIN 2013).
Transcript
Page 1: How to Search Annotated Text by Strategy?

> design > publish > search!

http://www.spinque.com/

How to Search Annotated Text by Strategy?

Roberto Cornacchia, Wouter Alink, Arjen P. De Vries

Spinque B.V.

CLIN 2013, 18 January 2013

Page 2: How to Search Annotated Text by Strategy?

Search by Strategy

Design the way you would like to search

● A search engine design framework

● Custom search engines built from “Strategies”, which:
  ● are designed as graphs
  ● abstract data processing
  ● combine different data sources
  ● incorporate probabilistic reasoning
  ● translate to database queries
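
A minimal sketch of what "strategies as graphs with probabilistic reasoning" could look like. This is not Spinque's implementation: the block names (`rank_fulltext`, `select`, `union`, `difference`), the in-memory dictionaries, and the independence assumption behind the probability formulas are all assumptions made for illustration. A real strategy would translate to database queries instead of running in Python.

```python
from typing import Dict, List

# Output of every block: object id -> probability that the object is relevant.
Scores = Dict[str, float]


def rank_fulltext(index: Dict[str, Dict[str, float]], query_terms: List[str]) -> Scores:
    """Sum the (hypothetical) term weights per object, capped at 1.0."""
    scores: Scores = {}
    for term in query_terms:
        for obj, weight in index.get(term, {}).items():
            scores[obj] = min(1.0, scores.get(obj, 0.0) + weight)
    return scores


def select(scores: Scores, attributes: Dict[str, Dict[str, str]], key: str, value: str) -> Scores:
    """Keep only objects whose attribute `key` equals `value`."""
    return {o: p for o, p in scores.items() if attributes.get(o, {}).get(key) == value}


def union(a: Scores, b: Scores) -> Scores:
    """Probabilistic OR, assuming the two inputs are independent."""
    return {o: 1 - (1 - a.get(o, 0.0)) * (1 - b.get(o, 0.0)) for o in a.keys() | b.keys()}


def difference(a: Scores, b: Scores) -> Scores:
    """Probabilistic 'a AND NOT b', assuming independence."""
    return {o: p * (1 - b.get(o, 0.0)) for o, p in a.items()}
```

Blocks compose by plain function application, for example `difference(union(x, y), z)`, so the graph structure is simply the call structure.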

http://www.spinque.com/

Page 3: How to Search Annotated Text by Strategy?

Search by Strategy

[Strategy diagram. Blocks: "All houses", "Query terms", "Crime map", "Rank full-text", "Rank on location" (twice), "Select on attribute", "Difference", "Union"]

Don't try and program the ultimate search engine

Design a number of domain-specific search strategies

Click. Generate Web search engines on top of a probabilistic DB

Page 4: How to Search Annotated Text by Strategy?

Multiple domains, custom UIs

Page 5: How to Search Annotated Text by Strategy?

Multiple domains, custom UIs

Page 6: How to Search Annotated Text by Strategy?

Multiple domains, custom UIs

Page 7: How to Search Annotated Text by Strategy?

Multiple domains, custom UIs

Page 8: How to Search Annotated Text by Strategy?

Strategy Editor

Page 9: How to Search Annotated Text by Strategy?

Not only "documents"

Page 10: How to Search Annotated Text by Strategy?

What's in the DB?

Full-text search

term  obj  freq
t0    o3   0.03
t0    o5   0.21
t1    o2   0.08

Annotation search

subj     pred / attr  obj / val  p
Roberto  speaks_to    You        0.95
You      listen_to    Roberto    0.6
speech   minutes      15         0.8

Feature-vectors (CBIR, SVM)

obj  f1    ...  fN
o0   0.12  ...  0.84
o1   0.54  ...  0
o2   0.23  ...  0.31

Hierarchical search

obj  pre  size  level
o0   100  50    0
o1   110  20    1
o2   144  16    2
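
As a rough illustration of how two of these tables could be queried, the sketch below loads the slide's sample rows into an in-memory SQLite database. The table names (`term_obj`, `triples`), the column names, and the choice to score objects by summing `freq` are assumptions for this example; the slides do not say which database or ranking formula is used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE term_obj (term TEXT, obj TEXT, freq REAL);
INSERT INTO term_obj VALUES ('t0', 'o3', 0.03), ('t0', 'o5', 0.21), ('t1', 'o2', 0.08);

CREATE TABLE triples (subj TEXT, pred TEXT, obj TEXT, p REAL);
INSERT INTO triples VALUES
  ('Roberto', 'speaks_to', 'You', 0.95),
  ('You', 'listen_to', 'Roberto', 0.6),
  ('speech', 'minutes', '15', 0.8);
""")


def rank_fulltext(conn, terms):
    """Rank objects by the summed frequency of the query terms (a simplification)."""
    if not terms:
        return []
    placeholders = ",".join("?" for _ in terms)
    sql = (
        "SELECT obj, SUM(freq) AS score FROM term_obj "
        f"WHERE term IN ({placeholders}) GROUP BY obj ORDER BY score DESC"
    )
    return conn.execute(sql, list(terms)).fetchall()


def select_annotation(conn, pred, min_p):
    """Return (subj, obj, p) for triples with the given predicate and p >= min_p."""
    sql = "SELECT subj, obj, p FROM triples WHERE pred = ? AND p >= ? ORDER BY p DESC"
    return conn.execute(sql, (pred, min_p)).fetchall()
```

Because every row carries a probability or weight, the result of one query can feed the next block of a strategy as another scored relation.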

Page 11: How to Search Annotated Text by Strategy?

Choose hot topics from (kid-)news

http://www.opstel.eu

[Strategy diagram. Blocks: "Kid news", "Rank on date", "Extract terms", "Expand"]
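
One way to read the "Rank on date" and "Extract terms" blocks is as a recency-weighted term count. The sketch below is an assumption about how such a step might work, not the strategy used on opstel.eu: the exponential decay, the `half_life_days` parameter, and the naive tokenizer (no stop-word removal) are all hypothetical.

```python
import re
from collections import Counter
from datetime import date


def recency_weight(published: date, today: date, half_life_days: float = 7.0) -> float:
    """Weight that halves every `half_life_days` days (hypothetical decay)."""
    age_days = max((today - published).days, 0)
    return 0.5 ** (age_days / half_life_days)


def hot_terms(articles, today: date, k: int = 3):
    """articles: iterable of (published_date, text). Returns the k heaviest terms."""
    weights = Counter()
    for published, text in articles:
        w = recency_weight(published, today)
        for term in re.findall(r"\w+", text.lower()):
            weights[term] += w
    return [term for term, _ in weights.most_common(k)]
```

Terms from older articles still count, but a term needs many old mentions to outrank one that appears in today's news.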

Page 12: How to Search Annotated Text by Strategy?

Use POS annotations

Text:

<abstract date="2013-01-15"> Lilly de pitbull is een held. De hond uit de Amerikaanse staat Massachusetts heeft …</abstract>

(English gloss: "Lilly the pit bull is a hero. The dog from the American state of Massachusetts has …")

Annotated text (we are interested in NPs):

<abstract date="2013-01-15"> <NP>Lilly de pitbull</NP> is <NP>een held</NP>. <NP>De hond uit de Amerikaanse staat Massachusetts</NP> heeft …</abstract>
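
Pulling the noun phrases out of text annotated this way can be sketched with Python's standard XML parser. The function name `extract_nps` is made up for this example, and the input is the truncated abstract exactly as it appears on the slide.

```python
import xml.etree.ElementTree as ET

ANNOTATED = (
    '<abstract date="2013-01-15"> <NP>Lilly de pitbull</NP> is <NP>een held</NP>. '
    '<NP>De hond uit de Amerikaanse staat Massachusetts</NP> heeft …</abstract>'
)


def extract_nps(xml_text: str):
    """Return the text of every <NP> element, in document order."""
    root = ET.fromstring(xml_text)
    # itertext() also collects text of nested elements, should NPs be nested.
    return ["".join(np.itertext()).strip() for np in root.iter("NP")]


def abstract_date(xml_text: str):
    """Return the value of the `date` attribute on the root element."""
    return ET.fromstring(xml_text).get("date")
```

The extracted phrases can then be counted or ranked like ordinary terms.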

Page 13: How to Search Annotated Text by Strategy?

"Lilly de held" on Alpino

Page 14: How to Search Annotated Text by Strategy?

Choose hot topics from (kid-)news

http://www.opstel.eu

[Strategy diagram. Blocks: "Kid news", "Rank on date", "Top terms", "Top NPs", "Expand"]

Page 15: How to Search Annotated Text by Strategy?

Topic suggestion for kids

http://www.opstel.eu

Page 16: How to Search Annotated Text by Strategy?

Topic suggestion for kids

Data: Wikipedia, magazines for children, ...

Left branch: rank data sources on annotations, e.g.:
  ● Most seen content: hot topics
  ● Seen during night-time? Probably not for kids

Right branch: query expansion using recent (hot) content

Can we improve this by adding...?
  ● Text reading level (machine learning)
  ● Handling of spelling mistakes in query expansion
  ● Syntactic dependencies
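
The left branch's two heuristics (popular content is hot, night-time content is probably not for kids) could be combined into a single score, for instance as below. The `Item` fields, the view counts, and the multiplication of the two factors are assumptions for illustration; the slides do not give the actual formula.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Item:
    title: str
    views: int        # total number of views (hypothetical usage annotation)
    night_views: int  # views that happened during night-time


def kid_hotness(item: Item, max_views: int) -> float:
    """Popularity relative to the most viewed item, damped by the night-time share."""
    if item.views == 0 or max_views == 0:
        return 0.0
    popularity = item.views / max_views
    p_daytime = 1 - item.night_views / item.views
    return popularity * p_daytime


def rank_for_kids(items: List[Item]) -> List[Item]:
    """Sort items from most to least suitable hot topic."""
    max_views = max((i.views for i in items), default=0)
    return sorted(items, key=lambda i: kid_hotness(i, max_views), reverse=True)
```

With this scoring, an item seen mostly at night drops below a less popular item that is seen during the day.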

Page 17: How to Search Annotated Text by Strategy?

Example: syntactic dependencies

AEGIR dependency parser for English (Koster et al.)

Parses text, outputs dependency triples:

"PGs prevent the mucosal damage .."

[PG,SUBJ,prevent] [prevent,OBJ,damage] [damage,ATTR,mucosal]

...

CLEF-IP 2011: "Combining document representations for prior-art retrieval", Eva D'hondt, Suzan Verberne, Wouter Alink, Roberto Cornacchia
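
Once documents are represented as sets of dependency triples, similar documents can be found by comparing those sets. The sketch below uses Jaccard overlap, which is a stand-in chosen for this example and not necessarily the measure used in the cited work. The triples are hard-coded from the slide; no parser is called.

```python
from typing import Dict, FrozenSet, List, Tuple

Triple = Tuple[str, str, str]  # (head, relation, dependent)

QUERY_TRIPLES: FrozenSet[Triple] = frozenset({
    ("PG", "SUBJ", "prevent"),
    ("prevent", "OBJ", "damage"),
    ("damage", "ATTR", "mucosal"),
})


def jaccard(a: FrozenSet[Triple], b: FrozenSet[Triple]) -> float:
    """Size of the intersection divided by size of the union (0.0 if both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_by_triples(query: FrozenSet[Triple],
                    docs: Dict[str, FrozenSet[Triple]]) -> List[Tuple[str, float]]:
    """Return (doc_id, score) for documents sharing at least one triple, best first."""
    scored = [(doc_id, jaccard(query, triples)) for doc_id, triples in docs.items()]
    return sorted((s for s in scored if s[1] > 0), key=lambda s: s[1], reverse=True)
```

Matching on whole triples is stricter than matching on words: a document that mentions "damage" and "prevent" in unrelated sentences shares no triple with the query.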

Page 18: How to Search Annotated Text by Strategy?

Prior art search. Designed by Eva D'hondt, Nijmegen

Page 19: How to Search Annotated Text by Strategy?

Find patents containing similar triples

Page 20: How to Search Annotated Text by Strategy?

Recap

[Strategy diagram. Blocks: "All houses", "Query terms", "Crime map", "Rank full-text", "Rank on location" (twice), "Select on attribute", "Difference", "Union"]

Strategies encapsulate domain expert knowledge (how to find)

Strategies abstract away search expert knowledge (how to search)

Strategies facilitate knowledge management

Store / share / publish / refine

Minimise the effort needed to design/update complex domain-specific search engines

YOU can easily experiment with (new) data representations, ranking formulas, annotations, etc.

Page 21: How to Search Annotated Text by Strategy?

Thank you

www.spinque.com