2003.12.02 - SLIDE 1 IS 202 – FALL 2003 Lecture 23: Interfaces for Information Retrieval II Prof. Ray Larson & Prof. Marc Davis UC Berkeley SIMS Tuesday and Thursday 10:30 am - 12:00 pm Fall 2003 http://www.sims.berkeley.edu/academics/courses/is202/f03/ SIMS 202: Information Organization and Retrieval

Dec 21, 2015


2003.12.02 - SLIDE 1 IS 202 – FALL 2003

Lecture 23: Interfaces for Information Retrieval II

Prof. Ray Larson & Prof. Marc Davis

UC Berkeley SIMS

Tuesday and Thursday 10:30 am - 12:00 pm

Fall 2003

http://www.sims.berkeley.edu/academics/courses/is202/f03/

SIMS 202: Information Organization and Retrieval

2003.12.02 - SLIDE 2 IS 202 – FALL 2003

Lecture Overview

• Review of Last Time
  – Introduction to HCI
  – Why Interfaces Don’t Work
  – Early Visions: Memex

• Interfaces for Information Retrieval II

• Discussion Questions

• Action Items for Next Time

Credit for some of the slides in this lecture goes to Marti Hearst and Warren Sack

2003.12.02 - SLIDE 4 IS 202 – FALL 2003

“Drawing the Circles”


2003.12.02 - SLIDE 5

Human-Computer Interaction (HCI)

• Human
  – The end-users of a program
  – The others in the organization
  – The designers of the program

• Computer
  – The machines the programs run on

• Interaction
  – The users tell the computers what they want
  – The computers communicate results
  – The computer may also tell users what the computer wants them to do

2003.12.02 - SLIDE 6 IS 202 – FALL 2003

Shneiderman’s Design Principles

• Provide informative feedback

• Permit easy reversal of actions

• Support an internal locus of control

• Reduce working memory load

• Provide alternative interfaces for expert and novice users

2003.12.02 - SLIDE 7 IS 202 – FALL 2003

HCI for IR

• Information seeking is an imprecise process

• UI should aid users in understanding and expressing their information needs
  – Help formulate queries
  – Select among available information sources
  – Understand search results
  – Keep track of the progress of their search


2003.12.02 - SLIDE 8

How to Design and Build UIs

• Task analysis

• Rapid prototyping

• Evaluation

• Implementation

Design

Prototype

Evaluate

Iterate at every stage!

2003.12.02 - SLIDE 9 IS 202 – FALL 2003

Evaluation Techniques

• Qualitative vs. quantitative methods

• Qualitative (non-numeric, discursive, ethnographic)
  – Focus groups
  – Interviews
  – Surveys
  – User observation
  – Participatory design sessions

• Quantitative (numeric, statistical, empirical)
  – User testing
  – System testing

2003.12.02 - SLIDE 10 IS 202 – FALL 2003

Why Interfaces Don’t Work

• Because…
  – We still think of using the interface
  – We still talk of designing the interface
  – We still talk of improving the interface

• “We need to aid the task, not the interface to the task.”

• “The computer of the future should be invisible.”

2003.12.02 - SLIDE 11 IS 202 – FALL 2003

“What Dr. Bush Foresees”

Cyclops Camera: Worn on forehead, it would photograph anything you see and want to record. Film would be developed at once by dry photography.

Microfilm: It could reduce Encyclopaedia Britannica to volume of a matchbox. Material cost: 5¢. Thus a whole library could be kept in a desk.

Vocoder: A machine which could type when talked to. But you might have to talk a special phonetic language to this mechanical supersecretary.

Thinking machine: A development of the mathematical calculator. Give it premises and it would pass out conclusions, all in accordance with logic.

Memex: An aid to memory. Like the brain, Memex would file material by association. Press a key and it would run through a “trail” of facts.

2003.12.02 - SLIDE 12 IS 202 – FALL 2003

Interaction Paradigms for IR

• Direct manipulation
  – Query specification
  – Query refinement
  – Result selection

• Delegation
  – Agents
  – Recommender systems
  – Filtering

2003.12.02 - SLIDE 14 IS 202 – FALL 2003

HCI For IR

• Browsing
  – Visualizing collections and documents
  – Navigating collections and documents

• Searching
  – Formulating queries
  – Visualizing results
  – Navigating results
  – Refining queries
  – Selecting results

2003.12.02 - SLIDE 15 IS 202 – FALL 2003

Information Visualization

• Utility
  – Inherently visual data
  – Making the abstract concrete
  – Making the invisible visible

• Techniques
  – Icons
  – Color highlighting
  – Brushing and linking
  – Panning and zooming
  – Focus-plus-context
  – Magic lenses
  – Animation

2003.12.02 - SLIDE 16 IS 202 – FALL 2003

Mapping

• Logical structure of the information
  – Hierarchy
  – Rank
  – Proximity
  – Similarity distance
  – Term frequency
  – History of changes
  – Etc.

• Perceptual representation of the information
  – Outlines, trees, graphs
  – Color, size, shape, distance
  – Symbolic icons
  – Animation, interaction
  – Etc.

2003.12.02 - SLIDE 17 IS 202 – FALL 2003

Task = Information Access

• The standard interaction model for information access

1) Start with an information need
2) Select a system and collections to search on
3) Formulate a query
4) Send the query to the system
5) Receive the results
6) Scan, evaluate, and interpret the results
7) Stop, or
8) Reformulate the query and go to Step 4
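The numbered steps above can be read as a loop. The sketch below is only an illustration of that reading: the function names, the toy collection, and the "drop the last term" reformulation rule are all assumptions made for the example, not part of the lecture.

```python
from typing import Callable, List


def information_access_loop(
    need: str,
    search: Callable[[str], List[str]],
    is_satisfied: Callable[[List[str]], bool],
    reformulate: Callable[[str, List[str]], str],
    max_rounds: int = 5,
) -> List[str]:
    """Formulate, send, receive, evaluate, then stop or reformulate."""
    query = need  # Step 3: the simplest formulation is the need itself
    results: List[str] = []
    for _ in range(max_rounds):
        results = search(query)              # Steps 4-5: send query, receive results
        if is_satisfied(results):            # Steps 6-7: evaluate, then stop
            break
        query = reformulate(query, results)  # Step 8: reformulate, back to Step 4
    return results


# Toy collection and callbacks (hypothetical, for illustration only).
DOCS = ["memex trails", "tilebars term overlap", "relevance feedback terms"]


def toy_search(query: str) -> List[str]:
    terms = query.lower().split()
    return [d for d in DOCS if all(t in d.split() for t in terms)]


def toy_satisfied(results: List[str]) -> bool:
    return len(results) > 0


def toy_reformulate(query: str, results: List[str]) -> str:
    # Broaden an over-specified query by dropping its last term.
    return " ".join(query.split()[:-1])


hits = information_access_loop(
    "relevance feedback interfaces", toy_search, toy_satisfied, toy_reformulate
)
```

Here the first query finds nothing, so the loop reformulates once and stops when the broader query returns a document. Step 2 (choosing a system and collection) is left out; in this sketch it is fixed by whichever `search` callback is passed in.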

2003.12.02 - SLIDE 18 IS 202 – FALL 2003

HCI Questions for IR

• Where does a user start?
  – Faced with a large set of collections, how can a user choose one to begin with?

• How will a user formulate a query?

• How will a user scan, evaluate, and interpret the results?

• How can a user reformulate a query?

2003.12.02 - SLIDE 19 IS 202 – FALL 2003

HCI for IR: Collection Selection

Question 1: Where does the user start?

2003.12.02 - SLIDE 20 IS 202 – FALL 2003

Starting Points for Search

• Faced with a prompt or an empty entry form … how to start?
  – Lists of sources
  – Overviews
    • Clusters
    • Category Hierarchies/Subject Codes
    • Co-citation links
  – Examples, Wizards, and Guided Tours
  – Automatic source selection

2003.12.02 - SLIDE 21 IS 202 – FALL 2003

List of Sources

• Have to guess based on the name

• Requires prior exposure/experience

2003.12.02 - SLIDE 22 IS 202 – FALL 2003

Old Lexis-Nexis Interface

2003.12.02 - SLIDE 23 IS 202 – FALL 2003

Overviews

• Supervised (manual) category overviews
  – Yahoo!
  – HiBrowse
  – MeSHBrowse

• Unsupervised (automated) groupings
  – Clustering
  – Kohonen feature maps

2003.12.02 - SLIDE 24 IS 202 – FALL 2003

Yahoo! Interface

2003.12.02 - SLIDE 25 IS 202 – FALL 2003

Summary: Category Labels

• Advantages
  – Interpretable
  – Capture summary information
  – Describe multiple facets of content
  – Domain dependent, and so descriptive

• Disadvantages
  – Do not scale well (for organizing documents)
  – Domain dependent, so costly to acquire
  – May mismatch users’ interests

2003.12.02 - SLIDE 26 IS 202 – FALL 2003

Text Clustering

• What clustering does
  – Finds overall similarities among groups of documents
  – Finds overall similarities among groups of tokens
  – Picks out some themes, ignores others

• How clustering works
  – Cluster entire collection
  – Find cluster centroid that best matches the query
  – Problems with clustering
    • It is expensive
    • It doesn’t work well
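The "find cluster centroid that best matches the query" step can be sketched as follows. This is a minimal illustration that assumes the collection has already been clustered (the `CLUSTERS` dictionary is made-up data) and that documents are compared as raw term-frequency vectors with cosine similarity; the slide does not say which representation or similarity measure is intended.

```python
import math
from collections import Counter
from typing import Dict, List, Mapping


def tf_vector(text: str) -> Counter:
    """Raw term-frequency vector for a whitespace-tokenized text."""
    return Counter(text.lower().split())


def centroid(docs: List[str]) -> Dict[str, float]:
    """Average term-frequency vector of the documents in one cluster."""
    total: Counter = Counter()
    for doc in docs:
        total.update(tf_vector(doc))
    return {term: count / len(docs) for term, count in total.items()}


def cosine(a: Mapping[str, float], b: Mapping[str, float]) -> float:
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def best_cluster(query: str, clusters: Dict[str, List[str]]) -> str:
    """Return the name of the cluster whose centroid is closest to the query."""
    q = tf_vector(query)
    return max(clusters, key=lambda name: cosine(q, centroid(clusters[name])))


# Hypothetical pre-clustered collection.
CLUSTERS = {
    "databases": ["dbms reliability recovery", "dbms query optimizer"],
    "finance": ["bank loans interest", "bank layoffs market"],
}

match = best_cluster("dbms reliability", CLUSTERS)
```

The expensive part the slide warns about is the clustering itself, which this sketch takes as given; matching a query against a handful of centroids is cheap by comparison.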

2003.12.02 - SLIDE 27 IS 202 – FALL 2003

Scatter/Gather Interface

2003.12.02 - SLIDE 28 IS 202 – FALL 2003

“ThemeScapes” Clustering

2003.12.02 - SLIDE 29 IS 202 – FALL 2003

Kohonen Feature Maps on Text

2003.12.02 - SLIDE 30 IS 202 – FALL 2003

Summary: Clustering

• Advantages
  – Get an overview of main themes
  – Domain independent

• Disadvantages
  – Many of the ways documents could group together are not shown
  – Not always easy to understand what they mean
  – Can’t see what documents are about
  – Documents may be forced into one position in semantic space
  – Hard to view titles

• Perhaps more suited for pattern discovery
  – Problem: often only one view on the space

2003.12.02 - SLIDE 31 IS 202 – FALL 2003

HCI for IR: Query Formulation

• Question 2: How will a user formulate a query?

2003.12.02 - SLIDE 32 IS 202 – FALL 2003

Query Specification

• Interaction styles (Shneiderman 97)
  – Command language
  – Form fill
  – Menu selection
  – Direct manipulation
  – Natural language

• What about gesture, eye-tracking, or implicit inputs like reading habits?

2003.12.02 - SLIDE 33 IS 202 – FALL 2003

Command-Based Query Specification

• COMMAND ATTRIBUTE value CONNECTOR …
  – FIND PA shneiderman AND TW interface

• What are the ATTRIBUTE names?

• What are the COMMAND names?

• What are allowable values?
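To make the COMMAND ATTRIBUTE value CONNECTOR pattern concrete, here is a small parser sketch for queries shaped like the FIND example. It is not the grammar of any particular catalog system: the connector set is an assumption (the slide shows only AND), and the parser treats PA and TW as opaque attribute codes because the slide does not define them, which is exactly the problem the three questions above point at.

```python
from typing import List, NamedTuple, Optional, Tuple

# Assumed connector set; the slide's example uses only AND.
CONNECTORS = {"AND", "OR", "NOT"}


class Clause(NamedTuple):
    connector: Optional[str]  # connector joining this clause to the previous one
    attribute: str            # opaque attribute code, e.g. "PA" or "TW"
    value: str


def parse_command(line: str) -> Tuple[str, List[Clause]]:
    """Parse 'COMMAND ATTRIBUTE value [CONNECTOR ATTRIBUTE value ...]'."""
    tokens = line.split()
    if len(tokens) < 3:
        raise ValueError("expected: COMMAND ATTRIBUTE value ...")
    command, rest = tokens[0].upper(), tokens[1:]
    clauses: List[Clause] = []
    connector: Optional[str] = None
    i = 0
    while i < len(rest):
        attribute = rest[i].upper()
        i += 1
        value_tokens: List[str] = []
        # Connectors are matched in upper case only, so a lower-case
        # "and" inside a value is kept as part of the value.
        while i < len(rest) and rest[i] not in CONNECTORS:
            value_tokens.append(rest[i])
            i += 1
        if not value_tokens:
            raise ValueError(f"attribute {attribute} has no value")
        clauses.append(Clause(connector, attribute, " ".join(value_tokens)))
        if i < len(rest):
            connector = rest[i]
            i += 1
            if i >= len(rest):
                raise ValueError("query ends with a dangling connector")
    return command, clauses


command, clauses = parse_command("FIND PA shneiderman AND TW interface")
```

Even in this toy form, the user has to know the command name, the attribute codes, and the legal values in advance, and the parser can only reject a query after the fact.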

2003.12.02 - SLIDE 34 IS 202 – FALL 2003

Form-Based Query Specification

2003.12.02 - SLIDE 35 IS 202 – FALL 2003

Form-Based Query Specification

2003.12.02 - SLIDE 36 IS 202 – FALL 2003

Direct Manipulation Query Specification

2003.12.02 - SLIDE 37 IS 202 – FALL 2003

Menu-Based Query Specification

2003.12.02 - SLIDE 38 IS 202 – FALL 2003

Natural Language Query

• AskJeeves
  – http://www.ask.com/

2003.12.02 - SLIDE 39 IS 202 – FALL 2003

HCI for IR: Viewing Results

• Question 3: How will a user scan, evaluate, and interpret the results?

2003.12.02 - SLIDE 40 IS 202 – FALL 2003

Display of Retrieval Results

• Goal
  – Minimize time/effort for deciding which documents to examine in detail

• Idea
  – Show the roles of the query terms in the retrieved documents, making use of document structure

2003.12.02 - SLIDE 41 IS 202 – FALL 2003

Putting Results in Context

• Interfaces should
  – Give hints about the roles terms play in the collection
  – Give hints about what will happen if various terms are combined
  – Show explicitly why documents are retrieved in response to the query
  – Summarize compactly the subset of interest

2003.12.02 - SLIDE 42 IS 202 – FALL 2003

Putting Results in Context

• Visualizations of query term distribution
  – KWIC, TileBars, SeeSoft, Virtual Shakespeare

• Visualizing shared subsets of query terms
  – InfoCrystal, VIBE

• Table of contents as context
  – SuperBook, Cha-Cha

2003.12.02 - SLIDE 43 IS 202 – FALL 2003

KWIC (Keyword in Context)
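A keyword-in-context display shows each occurrence of a query term together with a few words on either side. The sketch below is one simple way to produce such lines; the window width, the punctuation stripping, and the sample sentence are choices made for this example.

```python
from typing import List, Tuple


def kwic(text: str, keyword: str, width: int = 3) -> List[Tuple[str, str, str]]:
    """Return (left context, matched token, right context) for each hit."""
    tokens = text.split()
    lines = []
    for i, token in enumerate(tokens):
        # Compare case-insensitively, ignoring surrounding punctuation.
        if token.lower().strip(".,;:!?\"'") == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, token, right))
    return lines


sample = (
    "Information seeking is an imprecise process. "
    "The interface should aid information seeking."
)
for left, hit, right in kwic(sample, "seeking", width=2):
    # Right-align the left context so the keyword lines up in a column.
    print(f"{left:>20}  {hit}  {right}")
```

Aligning the keyword in a column is what lets a reader scan many hits quickly and judge each one by its immediate context.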

2003.12.02 - SLIDE 44 IS 202 – FALL 2003

TileBars

• Graphical representation of term distribution and overlap

• Simultaneously indicate
  – Relative document length
  – Query term frequencies
  – Query term distributions
  – Query term overlap
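The four properties listed above can all be read off a small grid of per-segment term counts. The sketch below computes such a grid and renders it as text. Two simplifications are assumed here: the document is cut into fixed-size word windows (TileBars itself segments text by subtopic with TextTiling), and counts are shown as characters rather than shaded squares.

```python
from typing import Dict, List


def tile_counts(text: str, terms: List[str], segment_size: int = 5) -> Dict[str, List[int]]:
    """Count each query term in each fixed-size segment of the document."""
    tokens = [t.strip(".,;:!?").lower() for t in text.split()]
    segments = [tokens[i:i + segment_size] for i in range(0, len(tokens), segment_size)]
    return {term: [seg.count(term.lower()) for seg in segments] for term in terms}


def overlap_segments(counts: Dict[str, List[int]]) -> List[int]:
    """Indices of segments in which every query term occurs."""
    rows = list(counts.values())
    if not rows:
        return []
    return [i for i in range(len(rows[0])) if all(row[i] > 0 for row in rows)]


def render(counts: Dict[str, List[int]]) -> str:
    """One row per term; darker characters mean higher frequency."""
    shades = " .:#"
    return "\n".join(
        f"{term:>12} |" + "".join(shades[min(c, len(shades) - 1)] for c in row) + "|"
        for term, row in counts.items()
    )


doc = (
    "dbms reliability matters. dbms logs help. "
    "banks use dbms daily for reliability checks today"
)
counts = tile_counts(doc, ["dbms", "reliability"])
print(render(counts))
```

Row width reflects relative document length, character darkness reflects term frequency, the position of marks along a row shows distribution, and columns marked in every row (here reported by `overlap_segments`) show where the terms co-occur.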

2003.12.02 - SLIDE 45 IS 202 – FALL 2003

TileBars Example

• Mainly about both DBMS & reliability

• Mainly about DBMS, discusses reliability

• Mainly about, say, banking, with a subtopic discussion on DBMS/Reliability

• Mainly about high-tech layoffs

Query terms: DBMS (Database Systems), Reliability

What roles do they play in retrieved documents?

2003.12.02 - SLIDE 46 IS 202 – FALL 2003

TileBars Example

2003.12.02 - SLIDE 47 IS 202 – FALL 2003

SeeSoft (Eick & Wills 95)

2003.12.02 - SLIDE 48 IS 202 – FALL 2003

David Small: Virtual Shakespeare

2003.12.02 - SLIDE 49 IS 202 – FALL 2003

Other Approaches

• Show how often each query term occurs in sets of retrieved documents
  – VIBE (Korfhage ‘91)
  – InfoCrystal (Spoerri ‘94)

2003.12.02 - SLIDE 50 IS 202 – FALL 2003

VIBE (Olson et al. 93, Korfhage 93)

2003.12.02 - SLIDE 51 IS 202 – FALL 2003

InfoCrystal (Spoerri 94)

2003.12.02 - SLIDE 52 IS 202 – FALL 2003

Problems with InfoCrystal

• Can’t see proximity or frequency of terms within documents

• Quantities not represented graphically

• More than 4 terms hard to handle

• No help in selecting terms to begin with

2003.12.02 - SLIDE 53 IS 202 – FALL 2003

Cha-Cha (Chen & Hearst 98)

• Shows “Table-Of-Contents”-like view, like SuperBook

• Focus+Context using hyperlinks to create the TOC

• Integrates Web Site structure navigation with search

2003.12.02 - SLIDE 54 IS 202 – FALL 2003

HCI for IR: Query Reformulation

• Question 4: How can a user reformulate a query?

2003.12.02 - SLIDE 55 IS 202 – FALL 2003

Query Reformulation

• Thesaurus expansion
  – Suggest terms similar to query terms

• Relevance feedback
  – Suggest terms (and documents) similar to retrieved documents that have been judged to be relevant
  – “More like this” interaction
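Thesaurus expansion can be illustrated with a simple lookup. The thesaurus below is a made-up dictionary for this example only; a real system would draw on a curated or automatically built thesaurus, and the slide does not say which.

```python
from typing import Dict, List

# Hypothetical thesaurus entries, for illustration only.
THESAURUS: Dict[str, List[str]] = {
    "interface": ["ui", "front-end"],
    "retrieval": ["search", "lookup"],
    "search": ["retrieval"],
}


def suggest_terms(query: str, thesaurus: Dict[str, List[str]]) -> List[str]:
    """Suggest related terms that are not already in the query."""
    query_terms = query.lower().split()
    seen = set(query_terms)
    suggestions: List[str] = []
    for term in query_terms:
        for related in thesaurus.get(term, []):
            if related not in seen:
                seen.add(related)
                suggestions.append(related)
    return suggestions


suggestions = suggest_terms("interface retrieval", THESAURUS)
```

The function only suggests terms and leaves the decision to add them with the user, in keeping with the "suggest" wording on the slide.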

2003.12.02 - SLIDE 56 IS 202 – FALL 2003

Relevance Feedback

• Modify existing query based on relevance judgements
  – Extract terms from relevant documents and add them to the query
  – And/or re-weight the terms already in the query

• Two main approaches
  – Automatic (pseudo-relevance feedback)
  – Users select relevant documents

• Users/system select terms from an automatically generated list
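Both operations on the slide, adding terms from relevant documents and re-weighting existing query terms, appear in the well-known Rocchio formulation of relevance feedback. The slide does not name a specific method, so the sketch below is an assumption: it uses a positive-only Rocchio-style update (the non-relevant term is omitted), raw term frequencies, and illustrative values for `alpha`, `beta`, and `top_k`.

```python
from collections import Counter
from typing import Dict, List


def rocchio_update(
    query: Dict[str, float],
    relevant_docs: List[str],
    alpha: float = 1.0,
    beta: float = 0.75,
    top_k: int = 2,
) -> Dict[str, float]:
    """Re-weight query terms and add the top_k strongest new terms."""
    # Average term-frequency vector of the documents judged relevant.
    centroid: Counter = Counter()
    for doc in relevant_docs:
        for term, count in Counter(doc.lower().split()).items():
            centroid[term] += count / len(relevant_docs)

    # Re-weight the terms already in the query.
    updated = {t: alpha * w + beta * centroid.get(t, 0.0) for t, w in query.items()}

    # Candidate expansion terms, strongest first (ties broken alphabetically).
    ranked = sorted(centroid.items(), key=lambda kv: (-kv[1], kv[0]))
    new_terms = [t for t, _ in ranked if t not in query][:top_k]
    for term in new_terms:
        updated[term] = beta * centroid[term]
    return updated


new_query = rocchio_update(
    {"dbms": 1.0},
    ["dbms reliability recovery", "dbms reliability logs"],
)
```

In the automatic (pseudo-relevance) approach, `relevant_docs` would simply be the top-ranked results; in the user-driven approach, they are the documents the user marked. Showing the `ranked` candidates to the user before adding them corresponds to the last bullet on the slide.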

2003.12.02 - SLIDE 57 IS 202 – FALL 2003

Revealing Internals

• Opaque (black box)
  – (Like web search engines)

• Transparent
  – (See used terms after Relevance Feedback)

• Penetrable
  – (Choose suggested terms before Relevance Feedback)

• Which do you think worked best?

2003.12.02 - SLIDE 58 IS 202 – FALL 2003

Effectiveness Results

• Subjects using Relevance Feedback showed 17% - 34% better performance than without Relevance Feedback

• Subjects in the penetrable case did 15% better as a group than those in the opaque and transparent cases

2003.12.02 - SLIDE 59 IS 202 – FALL 2003

Summary: Relevance Feedback

• Iterative query modification can improve precision and recall for a standing query

• In at least one study, users were able to make good choices by seeing which terms were suggested for Relevance Feedback and selecting among them

• So … “more like this” can be useful!

2003.12.02 - SLIDE 60 IS 202 – FALL 2003

Summary: HCI for IR

• Focus on the task, not the tool

• Be aware of
  – User abilities and differences
  – Prior work and innovations
  – Design guidelines and rules-of-thumb

• Iterate, iterate, iterate

• It is very difficult to design good UIs

• It is very difficult to evaluate search UIs

• Better interfaces in future should produce better IR experiences

2003.12.02 - SLIDE 62 IS 202 – FALL 2003

Discussion Questions

• Arthur Law on Interfaces for IR
  – Using visualization in web information retrieval revealed poor results for navigation. However, this study was conducted in 1998. Are people more accustomed to these tools now with websites such as "http://www.smartmoney.com/marketmap/"? Perhaps this method of navigation will be better for the computer generation and their higher comfort level for using the web.

2003.12.02 - SLIDE 63 IS 202 – FALL 2003

Discussion Questions

• Arthur Law on Interfaces for IR
  – There are various examples of command line approaches and visual approaches. Individuals perform differently with each method so will the next step involve combining these methods to optimize each person's task of information retrieval? Or will a dominant company, i.e., LexisNexis or Google enforce one method of doing queries?

2003.12.02 - SLIDE 64 IS 202 – FALL 2003

Discussion Questions

• Paul Laskowski on Interfaces for IR
  – MIR describes at least six sources of contextual information for the documents returned by a query: metadata, term scores, location of terms in each document, combinations of terms present in each document, tables of contents, and hyperlink structure. Which of these sources provides the most help for selecting relevant documents (or does it depend on the task)? Which types of context can help with reformulating a query? In the case of the location of terms, several tools are listed that graphically show where terms are placed in each document. I imagine using this to select documents where the terms appear in the same paragraph. Should this process be automated so that documents score higher when the search terms are near to each other? In what other ways might I use this information?
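The automation this question asks about, scoring documents higher when search terms are near each other, can be sketched in a few lines. The scoring formula below, 1 / (1 + smallest word distance), is an arbitrary choice made for illustration and is not taken from MIR or the lecture.

```python
from typing import Optional


def min_term_distance(text: str, a: str, b: str) -> Optional[int]:
    """Smallest distance in words between any occurrence of a and of b."""
    tokens = [t.strip(".,;:!?").lower() for t in text.split()]
    best: Optional[int] = None
    last_a: Optional[int] = None
    last_b: Optional[int] = None
    for i, token in enumerate(tokens):
        if token == a.lower():
            last_a = i
        if token == b.lower():
            last_b = i
        if last_a is not None and last_b is not None:
            distance = abs(last_a - last_b)
            if best is None or distance < best:
                best = distance
    return best


def proximity_score(text: str, a: str, b: str) -> float:
    """Higher when the two terms occur closer together; 0.0 if either is absent."""
    distance = min_term_distance(text, a, b)
    return 0.0 if distance is None else 1.0 / (1 + distance)


near = "dbms reliability is key"
far = "dbms is used by banks for reliability"
```

With these two sample texts, `near` scores higher than `far`. Such a score could be blended with a term-frequency ranking, or shown graphically so that the user makes the judgment, which is the trade-off the question raises.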

2003.12.02 - SLIDE 65 IS 202 – FALL 2003

Discussion Questions

• Brooke Maury on Interfaces for IR
  – In chapter 10.7, Hearst discusses an application developed by Kozierok and Maes that keeps track of a user’s activities and makes recommendations based on previous action or situations. What impact does this “assistant/agent” application have on privacy? Is this too heavy a price to pay for achieving a positive human computer exchange or a more successful retrieval? If a system is charged with “looking over the shoulder” of a user, is there an ethical imperative to encrypt that information or otherwise provide safeguards against the misuse or abuse of that information?

2003.12.02 - SLIDE 66 IS 202 – FALL 2003

Discussion Questions

• Brooke Maury on Interfaces for IR
  – The study by Koenemann and Belkin suggests that the most effective systems will allow users total control and access to what information is used for decision-making (They call such applications ‘penetrable.’). The system developed by Kozierok & Maes makes a number of important decisions without input from the user. Should K & M’s application be more ‘penetrable’?

2003.12.02 - SLIDE 67 IS 202 – FALL 2003

Discussion Questions

• Dan Perkel on Interfaces for IR
  – While the web "has suddenly made vast quantities of information available globally" (MIR, 322) some would argue that it also comes at the price of a giant step backwards in terms of interfaces (As one example, compare the functionality of and types of interaction allowed by an email web app such as YahooMail/HotMail with an email client such as Eudora/Outlook/AppleMail). What does this say about the future of visualization techniques for IR? What needs to happen (technically, business-wise, other) for a top search engine to add an interactive visualization component to its search results?

2003.12.02 - SLIDE 68 IS 202 – FALL 2003

Discussion Questions

• Joseph Hall on Interfaces for IR
  – In section 10.9 of MIR: "The field of information visualization needs some new ideas about how to display large, abstract information spaces intuitively." This seems to be the "holy grail" of HCI. Something that can intuitively deal with large information spaces... with feeble human brains providing imperfect queries. For example, a nowhere-near feeble brain and pretty direct query is evidenced by danah boyd's most recent blog entry: turtles all the way down http://www.zephoria.org/thoughts/archives/000889.html#000889 In this blog entry, danah has already queried the state-of-the-art search tool, Google, and unfortunately came across conflicting results.
  – While Google can handle large information spaces, sometimes the PageRank algorithm is just not enough. Seeing as humans tend to think in terms of "concentration"[1], what are some of the "penetrable" ways that IR tools could more effectively facilitate the human thought process instead of simply retrieving information?
  – [1] An old card game that requires remembering exactly where you saw a certain card for retrieval later.

2003.12.02 - SLIDE 70 IS 202 – FALL 2003

Next Time

• Wishter DEMO!

• Final Exam Review