© FINDWISE 2012 Implementing and designing search solutions Gothenburg University – Gothenburg – 2012-03-08
Page 1: Designing and Implementing Search Solutions

© FINDWISE 2012

Implementing and designing search solutions

Gothenburg University – Gothenburg – 2012-03-08

Page 2: Designing and Implementing Search Solutions

Agenda

• Introduction to Findwise

•Technical approach

•DIY UX design

•Research

Page 3: Designing and Implementing Search Solutions

• Founded in 2005

• Offices in Sweden, Denmark, Norway and Poland

• 72 employees (February 2012)

• Our objective is to be a leading provider of Findability solutions utilising the full potential of search technology to create customer business value

About Findwise

Page 4: Designing and Implementing Search Solutions

Technology independent

  Creating search-driven Findability solutions based on market-leading commercial and open source search technology platforms:

• Autonomy IDOL
• Microsoft (SharePoint and FAST Search products)
• Google GSA
• IBM ICA/OmniFind
• LucidWorks
• Apache Lucene/Solr (open source)
• and more…

Page 5: Designing and Implementing Search Solutions

Findability Challenges

Employee productivity (DN article, March 2011):

“The effort to find the right information costs an average company 80,000 SEK per employee per year”

Customer Service quality and efficiency (Accenture report, March 2011):

“69% of agents don't have answers to help service customers”

E-commerce conversion rate (Google survey, December 2010):

“77% of those surveyed used search within an e-commerce website to find products”

Page 6: Designing and Implementing Search Solutions

Information overload?

Page 7: Designing and Implementing Search Solutions

A search engine alone is not enough

Page 8: Designing and Implementing Search Solutions

Technical approach

Page 9: Designing and Implementing Search Solutions

RE-USE

Page 10: Designing and Implementing Search Solutions

STANDARD

Page 11: Designing and Implementing Search Solutions

Standard architecture

Page 12: Designing and Implementing Search Solutions

Search core

Page 13: Designing and Implementing Search Solutions

Search core - overview

Documents

Document 1
Title: Brown fox
Content: The quick brown fox jumps over the lazy dog
Author: Tobias Berg

Document 2
Title: My dog
Content: My old dog cannot jump anymore
Author: Svetoslav Marinov

Inverted index

Term      Documents
…         …
fox       1
jump      1,2
lazy      1
dog       1,2
tobias    1
berg      1
…         …

Tokenization, Stemming, Stop-word…
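The slide's two documents can be turned into the inverted index shown above with a few lines of Java. This is a minimal sketch, not how any particular search engine does it: the class name, the stop-word list and the one-rule stemmer are all assumptions chosen so the output matches the slide.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

class InvertedIndexSketch {
    // Hypothetical stop-word list; real lists are language-specific and much longer.
    static final Set<String> STOP_WORDS = Set.of("the", "my", "over", "a", "an");

    // Term -> sorted ids of the documents that contain it.
    final Map<String, SortedSet<Integer>> postings = new TreeMap<>();

    // Tokenization: lower-case, then split on anything that is not a letter or digit.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Deliberately naive stemming: drop a trailing "s" that follows a consonant
    // ("jumps" -> "jump", but "tobias" stays as it is).
    static String stem(String token) {
        int n = token.length();
        if (n > 3 && token.charAt(n - 1) == 's' && "aeious".indexOf(token.charAt(n - 2)) < 0) {
            return token.substring(0, n - 1);
        }
        return token;
    }

    // Index every field value of one document under the given id.
    void add(int docId, String... fieldValues) {
        for (String value : fieldValues) {
            for (String token : tokenize(value)) {
                if (STOP_WORDS.contains(token)) {
                    continue; // stop-word removal
                }
                postings.computeIfAbsent(stem(token), k -> new TreeSet<>()).add(docId);
            }
        }
    }

    // Apply the same normalisation to the query term as to the indexed text.
    SortedSet<Integer> lookup(String term) {
        return postings.getOrDefault(stem(term.toLowerCase(Locale.ROOT)), new TreeSet<>());
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add(1, "Brown fox", "The quick brown fox jumps over the lazy dog", "Tobias Berg");
        index.add(2, "My dog", "My old dog cannot jump anymore", "Svetoslav Marinov");
        System.out.println("jump -> " + index.lookup("jump"));
        System.out.println("fox  -> " + index.lookup("fox"));
    }
}
```

Because "jumps" and "jump" are stemmed to the same term, a query for either one finds both documents, which is the point of normalising text before indexing.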

Page 14: Designing and Implementing Search Solutions

Relevancy

Retrieved documents

Relevant documents

•Precision – how many of the retrieved documents are relevant?

•Recall – how many of the relevant documents were retrieved?
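Both measures are simple ratios over document-id sets. The sketch below assumes documents are identified by integers; the class and method names are made up for illustration.

```java
import java.util.HashSet;
import java.util.Set;

class PrecisionRecallSketch {
    // Number of documents that are both retrieved and relevant.
    static int overlap(Set<Integer> retrieved, Set<Integer> relevant) {
        Set<Integer> common = new HashSet<>(retrieved);
        common.retainAll(relevant);
        return common.size();
    }

    // Precision: share of the retrieved documents that are relevant.
    static double precision(Set<Integer> retrieved, Set<Integer> relevant) {
        if (retrieved.isEmpty()) {
            return 0.0; // convention chosen for this sketch
        }
        return (double) overlap(retrieved, relevant) / retrieved.size();
    }

    // Recall: share of the relevant documents that were retrieved.
    static double recall(Set<Integer> retrieved, Set<Integer> relevant) {
        if (relevant.isEmpty()) {
            return 0.0; // convention chosen for this sketch
        }
        return (double) overlap(retrieved, relevant) / relevant.size();
    }

    public static void main(String[] args) {
        Set<Integer> retrieved = Set.of(1, 2, 3, 4);
        Set<Integer> relevant = Set.of(2, 4, 5, 6, 7, 8, 9, 10);
        System.out.println("precision = " + precision(retrieved, relevant));
        System.out.println("recall    = " + recall(retrieved, relevant));
    }
}
```

In the example, two of the four retrieved documents are relevant (precision 0.5), but only two of the eight relevant documents were found (recall 0.25).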

Page 15: Designing and Implementing Search Solutions

Relevancy

Recall: find everything related to the query

- lemmatization
- synonyms
- wildcards
- anti-phrasing
- or-operator

Precision: find only entities related to the query

- exact word matching
- exact phrase matching
- and-operator

Goal: improve precision without sacrificing recall
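The effect of the or-operator and the and-operator can be seen directly on posting lists. A minimal sketch, assuming each term maps to a set of document ids as in the inverted index from the earlier slide; the class name is hypothetical.

```java
import java.util.Set;
import java.util.TreeSet;

class BooleanOperatorsSketch {
    // OR: union of the posting lists. Matches more documents, which favours recall.
    static Set<Integer> or(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new TreeSet<>(a);
        result.addAll(b);
        return result;
    }

    // AND: intersection of the posting lists. Matches fewer documents, which favours precision.
    static Set<Integer> and(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new TreeSet<>(a);
        result.retainAll(b);
        return result;
    }

    public static void main(String[] args) {
        Set<Integer> jump = Set.of(1, 2); // posting list for "jump" on the slide
        Set<Integer> fox = Set.of(1);     // posting list for "fox" on the slide
        System.out.println("jump OR fox  -> " + or(jump, fox));
        System.out.println("jump AND fox -> " + and(jump, fox));
    }
}
```

For the query "jump fox", OR returns documents 1 and 2, while AND returns only document 1.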

Page 16: Designing and Implementing Search Solutions

Search core – relevance score

•TF/IDF

•Field length

•Field weight
  • Title *2
  • Author *4
  • Content *1

•Freshness

•…
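One way these factors might be combined into a single score is sketched below. The field weights are the ones on the slide; the exact TF and IDF formulas, the whitespace tokenizer and all names are assumptions for illustration, and freshness is left out. Real engines use more refined variants.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;

class RelevanceScoreSketch {
    // Field weights from the slide: Title *2, Author *4, Content *1.
    static final Map<String, Double> FIELD_WEIGHTS =
            Map.of("title", 2.0, "author", 4.0, "content", 1.0);

    // Whitespace tokenizer, kept trivial on purpose.
    static List<String> tokens(String text) {
        return Arrays.asList(text.toLowerCase(Locale.ROOT).split("\\s+"));
    }

    // Term frequency divided by field length: a hit in a short field counts more.
    static double tf(String term, String fieldText) {
        List<String> tokens = tokens(fieldText);
        long hits = tokens.stream().filter(term::equals).count();
        return (double) hits / tokens.size();
    }

    // Inverse document frequency: terms found in fewer documents get a higher value.
    static double idf(String term, List<Map<String, String>> docs) {
        long df = docs.stream()
                .filter(d -> d.values().stream().anyMatch(v -> tokens(v).contains(term)))
                .count();
        return Math.log(1.0 + (double) docs.size() / (1 + df));
    }

    // Score = IDF * sum over fields of (field weight * length-normalised TF).
    static double score(String term, Map<String, String> doc, List<Map<String, String>> docs) {
        double sum = 0.0;
        for (Map.Entry<String, String> field : doc.entrySet()) {
            double weight = FIELD_WEIGHTS.getOrDefault(field.getKey(), 1.0);
            sum += weight * tf(term, field.getValue());
        }
        return sum * idf(term, docs);
    }

    public static void main(String[] args) {
        Map<String, String> doc1 = Map.of("title", "Brown fox",
                "content", "The quick brown fox jumps over the lazy dog", "author", "Tobias Berg");
        Map<String, String> doc2 = Map.of("title", "My dog",
                "content", "My old dog cannot jump anymore", "author", "Svetoslav Marinov");
        List<Map<String, String>> docs = List.of(doc1, doc2);
        System.out.println("dog in doc 1: " + score("dog", doc1, docs));
        System.out.println("dog in doc 2: " + score("dog", doc2, docs));
    }
}
```

For the query "dog", document 2 scores higher than document 1: it matches in the short, double-weighted title as well as in the content.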

Page 17: Designing and Implementing Search Solutions

Search Core

•Optimized for full-text search

•Sub-second responses

•Tunable relevance

•Scalable

•Configurable & Extendable

{query}

Find matching documents

Score documents

{result}

Page 18: Designing and Implementing Search Solutions

Standard architecture

Page 19: Designing and Implementing Search Solutions

Connectors

Page 20: Designing and Implementing Search Solutions

Connectors – fetch data

Database connector

Id  Product name  Description                         Price
1   Wheel         Makes the bus go round round round  45
2   Window        A shield of glass                   12

Id  Book name             Abstract       Author
1   Ulysses               Irish novel    James Joyce
2   Crime and Punishment  Russian novel  Dostoevsky, Fyodor

Database connector

Page 21: Designing and Implementing Search Solutions

Connector framework – code example

public void execute() {
    // Insert code to fetch content
}

public void interrupt() {
    // Insert code to handle interrupt signal
}

public void init() {
    // Insert code to initialize connector
}
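To make the three lifecycle methods concrete, here is a minimal sketch of a connector that reads rows and hands them on. The `Connector` interface, the class name and the use of an in-memory list in place of a real database are all assumptions; the slide does not show the framework's actual types.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical interface mirroring the three methods on the slide.
interface Connector {
    void init();
    void execute();
    void interrupt();
}

class InMemoryTableConnector implements Connector {
    private final List<Map<String, String>> table;     // stands in for a database table
    private final Consumer<Map<String, String>> sink;  // receives each fetched document
    private volatile boolean interrupted;

    InMemoryTableConnector(List<Map<String, String>> table, Consumer<Map<String, String>> sink) {
        this.table = table;
        this.sink = sink;
    }

    @Override
    public void init() {
        // A real connector would open its database connection here.
        interrupted = false;
    }

    @Override
    public void execute() {
        for (Map<String, String> row : table) {
            if (interrupted) {
                return; // stop between rows when asked to
            }
            sink.accept(new LinkedHashMap<>(row)); // copy, so later stages can modify it
        }
    }

    @Override
    public void interrupt() {
        interrupted = true;
    }
}
```

A caller would create the connector with a row source and a sink (for example `documents::add`), call `init()` and then `execute()`; calling `interrupt()` from another thread makes `execute()` return before the next row.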

Page 22: Designing and Implementing Search Solutions

Connector Frameworks

http://incubator.apache.org/connectors/

http://code.google.com/p/google-enterprise-connector-manager/

• Existing connectors
• Re-usable
• Configuration interfaces
• Standardized implementation

Page 23: Designing and Implementing Search Solutions

Standard architecture

Page 24: Designing and Implementing Search Solutions

Pipeline

Page 25: Designing and Implementing Search Solutions

Pipeline - overview

• PDF/Office -> Text
• Lemmatization
• Language identification
• NER
• Phonetic search
• Keyword extraction
• External calls
• …

Page 26: Designing and Implementing Search Solutions

Pipeline framework – code example

protected void addAction(Document doc) throws PipelineException {
    // Insert code
    doc.addField("Title", "Hello world!");
}

protected void updateAction(Document doc) throws PipelineException {
    // Insert code
    addAction(doc);
}

protected void deleteAction(Document doc) throws PipelineException {
    // Insert code
}
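A complete stage built on these three hooks might look like the sketch below. `Document`, `PipelineException` and `Stage` are hypothetical stand-ins for the framework classes named on the slide, and the language guesser is intentionally crude (it only counts a few common words); it is not how a production language identifier works.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-ins for the framework types used on the slide.
class PipelineException extends Exception {
    PipelineException(String message) {
        super(message);
    }
}

class Document {
    private final Map<String, String> fields = new HashMap<>();

    void addField(String name, String value) {
        fields.put(name, value);
    }

    String getField(String name) {
        return fields.get(name);
    }
}

abstract class Stage {
    protected abstract void addAction(Document doc) throws PipelineException;

    // By default an update is processed like an add, as on the slide.
    protected void updateAction(Document doc) throws PipelineException {
        addAction(doc);
    }

    // By default nothing needs to happen when a document is deleted.
    protected void deleteAction(Document doc) throws PipelineException {
    }
}

// Example stage: adds a "Language" field based on common-word counts.
class NaiveLanguageStage extends Stage {
    private static final List<String> ENGLISH = Arrays.asList("the", "and", "of", "is");
    private static final List<String> SWEDISH = Arrays.asList("och", "att", "det", "som");

    @Override
    protected void addAction(Document doc) throws PipelineException {
        String content = doc.getField("Content");
        if (content == null) {
            throw new PipelineException("Document has no Content field");
        }
        int english = 0;
        int swedish = 0;
        for (String token : content.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (ENGLISH.contains(token)) {
                english++;
            }
            if (SWEDISH.contains(token)) {
                swedish++;
            }
        }
        doc.addField("Language", english >= swedish ? "en" : "sv");
    }
}
```

The stage only reads and writes fields on the document it is given, which is what lets stages be re-used and chained in any order a configuration calls for.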

Page 27: Designing and Implementing Search Solutions

NLP tools and approaches

• Open source: GATE, OpenNLP, UIMA, StanfordNLP, Mallet, Apache Mahout

• Proprietary: IBM LanguageWare

• Own components: e.g. KeywordExtraction Service; LanguageIdentify

• POS taggers – Hunpos, OpenNLP, Mallet

• Dependency Parsers – MaltParser, StanfordParser

• NER – rule-based + statistical models

• Document summarization

• Document clustering

Page 28: Designing and Implementing Search Solutions

Pipeline – configuration example

Page 29: Designing and Implementing Search Solutions

Pipeline frameworks

Findwise Hydra

http://www.pypes.org/

http://www.openpipeline.com/

• Re-usable stages
• Configuration interface
• Focus on task

Page 30: Designing and Implementing Search Solutions

Putting it all together

Page 31: Designing and Implementing Search Solutions

What the frell is UX design?

Page 32: Designing and Implementing Search Solutions

What the frell is UX design?

• Interaction design

•Usability Engineering

• Information Architecture

•Visual Design

Page 33: Designing and Implementing Search Solutions

Findwise UX design principles

Users want results

Dialogue not monologue

Participation builds trust

Answer frequent questions

Simple but powerful

Page 34: Designing and Implementing Search Solutions

Users want results

Page 35: Designing and Implementing Search Solutions

Dialogue not monologue

Page 36: Designing and Implementing Search Solutions

Participation builds trust

Page 37: Designing and Implementing Search Solutions

Answer frequent questions

Page 38: Designing and Implementing Search Solutions

Simple but powerful

Page 39: Designing and Implementing Search Solutions

Findwise UX design principles

Users want results

Dialogue not monologue

Participation builds trust

Answer frequent questions

Simple but powerful

Page 40: Designing and Implementing Search Solutions

DIY UX design

Page 41: Designing and Implementing Search Solutions

DIY UX design

Design research

Analytics

Usability tests

Iterate!

Page 42: Designing and Implementing Search Solutions

Design research

•Be easy to reach – keep contact

•Let users' requests guide you when prioritizing new features

•Listen & try to discover the underlying problem

•Try to find out what users need, not what they say they want

Page 43: Designing and Implementing Search Solutions

Analytics

•Web analytics

•Search analytics

•A/B testing
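A common starting point for search analytics is a list of the most frequent queries. The sketch below assumes the query log is available as a plain list of query strings; the class and method names are hypothetical, and real logs would also carry timestamps, result counts and clicks.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class TopQueriesSketch {
    // Returns the n most frequent queries, most frequent first (ties in alphabetical order).
    static List<Map.Entry<String, Long>> topQueries(List<String> queryLog, int n) {
        Map<String, Long> counts = queryLog.stream()
                .map(q -> q.trim().toLowerCase(Locale.ROOT)) // treat "Solr" and "solr " as one query
                .filter(q -> !q.isEmpty())
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
        return counts.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue(Comparator.reverseOrder())
                        .thenComparing(Map.Entry.<String, Long>comparingByKey()))
                .limit(n)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> log = List.of("Solr", "solr ", "lucene", "hydra", "SOLR", "lucene");
        for (Map.Entry<String, Long> entry : topQueries(log, 2)) {
            System.out.println(entry.getKey() + ": " + entry.getValue());
        }
    }
}
```

Checking what the top queries actually return is a cheap way to find the problems that affect the most users.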

Page 44: Designing and Implementing Search Solutions

Usability tests

•Test early - test often

•Use sketches, paper prototypes, static prototypes and working prototypes!

•Create real tasks or problems

•Don’t ask them how they would want it

•Test on friends and family or colleagues

Page 45: Designing and Implementing Search Solutions

Iterate!

Page 46: Designing and Implementing Search Solutions

Why UX design?

•Improved requirements

•Better feedback

•Eliminate bias

•Less development time

Page 47: Designing and Implementing Search Solutions

Summary

•Listen & try to discover the underlying problem

•Search analytics – Top queries

•Do usability tests early & often

• Iterate!

Page 48: Designing and Implementing Search Solutions

Research

•Collaboration with Universities

GU, Borås, KTH, Copenhagen U.

•EU projects

RUSHES

• Master’s Thesis supervision

Chalmers, KTH, Lund

Page 49: Designing and Implementing Search Solutions

Master’s Thesis projects

•A way to test ideas

•A way to recruit people

•A way to cooperate with Universities

•Keyword Extraction

•Document Clustering

•NER

•Document summarization

•Extracting structural information from text

•Query log analysis

Page 50: Designing and Implementing Search Solutions

Resources - books

•The design of everyday things

•Don’t make me think

•Search analytics for your site

•ManifoldCF in Action

•Taming Text

Page 52: Designing and Implementing Search Solutions

Tobias Berg

Björn Klockljung Johansson

Svetoslav Marinov


Thanks!