© Tefko Saracevic

Search strategy & tactics

Governed by effectiveness & feedback


Some definitions

• Search statement (query):
– set of search terms with logical connectors and attributes
  file and system dependent

• Search strategy (big picture):
– overall approach to searching of a question
  selection of systems, files, search statements & tactics, sequence, output formats; cost, time aspects


Some definitions (cont.)

• Search tactics (action choices):
– choices & variations in search statements
  terms, connectors, attributes

• Move:
– modifications of search strategies or tactics that are aimed at improving the results

• Cycle (particularly applicable to systems such as DIALOG):

– set of commands from start (begin) to viewing (type) results, or from a viewing to a viewing command


Some definitions (cont.)

• Effectiveness:
– performance as to objectives
  to what degree did a search accomplish what was desired?
  how well was it done in terms of relevance?

• Efficiency:
– performance as to costs
  at what cost and/or effort, time?

Both are KEY concepts & criteria for selection of strategy, tactics & evaluation


Effectiveness criteria

• Search tactics chosen & changed following some criteria of accomplishment, such as:
– none - no thought given
– relevance (very often)
– magnitude (also very often)
– output attributes
– topic/strategy

• Tactics altered interactively
– role & types of feedback

Knowing what tactics may produce what results is key to the professional searcher


Relevance: key concept in IR

• Attribute/criterion reflecting effectiveness of exchange of information between people (users) & IR systems in communication contacts, based on valuation by people

• Some attributes:
– in IR - user dependent
– multidimensional or faceted
– dynamic
– measurable - somewhat
– intuitively well understood


Types of relevance

• Several types considered:
– Systems or algorithmic relevance: relation between a query as entered and objects in the file of a system as retrieved or failed to be retrieved by a given procedure or algorithm. Comparative effectiveness.
– Topical or subject relevance: relation between topic in the query & topic covered by the retrieved objects, or objects in the file(s) of the system, or even in existence. Aboutness.


Types of relevance (cont.)

– Cognitive relevance or pertinence: relation between state of knowledge & cognitive information need of a user and the objects provided or in the file(s). Informativeness, novelty ...
– Motivational or affective relevance: relation between intents, goals & motivations of a user & objects retrieved by a system or in the file, or even in existence. Satisfaction ...
– Situational relevance or utility: relation between the task or problem-at-hand and the objects retrieved (or in the files). Relates to usefulness in decision-making, reduction of uncertainty ...


Effectiveness measures

• Precision:
– probability that, given that an object is retrieved, it is relevant; or the ratio of relevant items retrieved to all items retrieved

• Recall:
– probability that, given that an object is relevant, it is retrieved; or the ratio of relevant items retrieved to all relevant items in a file

• Precision is easy to establish, recall is not
– union of retrievals as a “trick” to establish recall
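The union "trick" can be sketched in a few lines: pool the relevant items found by several searches on the same question and treat the pool as a stand-in for all relevant items in the file. This is only a sketch under that assumption. The function name `relative_recall` and the document ids are hypothetical, and the result is an estimate relative to the pool, not true recall.

```python
def relative_recall(own_relevant, all_searches_relevant):
    """Estimate recall of one search against the pooled relevant items."""
    # Union of the relevant items found by any of the searches.
    pool = set().union(*all_searches_relevant)
    if not pool:
        return None  # nothing relevant found anywhere, recall undefined
    return len(set(own_relevant) & pool) / len(pool)


# Hypothetical relevant items found by three searches on one question.
search_1 = {"d1", "d2", "d3"}
search_2 = {"d2", "d4"}
search_3 = {"d3", "d5", "d6"}
pooled = [search_1, search_2, search_3]

print(relative_recall(search_1, pooled))  # 3 of 6 pooled items -> 0.5
```

The estimate can only overstate recall: relevant items that none of the searches retrieved are missing from the pool.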


Calculation

                      Judged RELEVANT                 Judged NOT RELEVANT
Items RETRIEVED       a = relevant & retrieved        b = not relevant & retrieved
Items NOT RETRIEVED   c = relevant & not retrieved    d = not relevant & not retrieved

(a, b, c, d = no. of items in each cell)

Precision = a / (a + b)

Recall = a / (a + c)

High precision = maximize a, minimize b

High recall = maximize a, minimize c
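A small worked example of the calculation, using the cell names a, b and c from the table. The counts are made up for illustration and the function names are assumptions.

```python
def precision(a, b):
    """a = relevant & retrieved, b = not relevant & retrieved."""
    return a / (a + b) if (a + b) else None


def recall(a, c):
    """a = relevant & retrieved, c = relevant & not retrieved."""
    return a / (a + c) if (a + c) else None


# Hypothetical search: 40 items retrieved, 30 of them judged relevant,
# and 20 relevant items left behind in the file.
a, b, c = 30, 10, 20

print(precision(a, b))  # 30 / 40 = 0.75
print(recall(a, c))     # 30 / 50 = 0.6
```

Note that c is rarely known in practice, which is why recall is the harder of the two to establish.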


Interpretation: PRECISION

• Precision = percent of relevant stuff you have in your answer
– or conversely percent of junk
– high precision = most stuff relevant
– low precision = a lot of junk

• Some users demand high precision
– do not want to wade through much stuff
– but it comes at a price: relevant stuff may be missed
  tradeoff


Interpretation: RECALL

• A file may have a lot of relevant stuff

• Recall = percent of that relevant stuff in the file that you retrieved
– conversely percent of stuff you missed
– high recall = you missed little
– low recall = you missed a lot

• Some users demand high recall (e.g. PhD students doing dissertation)
– want to make sure that important stuff is not missed
– but will have to pay a price of wading through a lot of junk
  tradeoff


Precision-recall trade-off

• USUALLY: precision & recall are inversely related
– higher recall usually means lower precision & vice versa

[Figure: precision (vertical axis, 0 to 100 %) plotted against recall (horizontal axis, 0 to 100 %), with labels "Ideal", "Usual" and "Improvements"]


Interpretation: TRADE-OFF

• It is like in life, usually:
– you get some, lose some

• Usually, but not always
  keep in mind these are probabilities
– when you have high precision, most stuff you got is relevant or on target, but you missed stuff that is also relevant; it was left behind
– when you have high recall, you did not miss much, but you also got a lot of junk to wade through

You use different tactics for high recall from those for high precision


Search tactics

• What variations possible?
– several ‘things’ in a query can be selected or changed that affect effectiveness
– each variation has consequence in output
  if I do X then Y will happen

1. LOGIC
– choice of connectors among terms (AND, OR, NOT, W …)

2. SCOPE
– no. of terms linked - ANDs
  (A AND B vs. A AND B AND C)


Search tactics (cont.)

3. EXHAUSTIVITY
– for each concept, no. of related terms - OR connections
  (A OR B vs. A OR B OR C)

4. TERM SPECIFICITY
– for each concept, level in hierarchy
  (broader vs. narrower terms)

5. SEARCHABLE FIELDS
– choice for text terms & non-text attributes
  e.g. titles only, limit as to years

6. FILE OR SYSTEM SPECIFIC CAPABILITIES
– e.g. ranking, sorting


Effectiveness “laws”

Tactic                               Output size   Recall   Precision
SCOPE - adding more ANDs             down          down     up
EXHAUSTIVITY - adding more ORs       up            up       down
USE OF NOTs - adding more NOTs       down          down     up
BROAD TERM USE - low specificity     up            up       down
PHRASE USE - high specificity        down          down     up
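The first three "laws" can be seen on a toy file, treating AND, OR and NOT as set intersection, union and difference. Everything here is an illustrative assumption: the documents, the index terms, the relevance judgments and the helper names `having` and `evaluate`. The data was chosen so that the usual pattern shows; a real file will not always behave this way.

```python
# Hypothetical toy file: document id -> set of index terms.
docs = {
    1: {"dogs", "training"},
    2: {"dogs", "training", "puppies"},
    3: {"canines", "training", "wolves"},
    4: {"dogs", "nutrition"},
    5: {"canines", "wolves"},
    6: {"canines", "wolves", "evolution"},
    7: {"dogs", "obedience"},
    8: {"cats", "training"},
}
# Assumed relevance judgments for a question about training dogs.
relevant = {1, 2, 3, 7}


def having(term):
    """Ids of documents indexed with the given term."""
    return {doc_id for doc_id, terms in docs.items() if term in terms}


def evaluate(retrieved):
    """Return (output size, precision, recall) for a retrieved set."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else None
    recall = hits / len(relevant)
    return len(retrieved), precision, recall


baseline = having("dogs")                        # dogs
more_ands = having("dogs") & having("training")  # dogs AND training
more_ors = having("dogs") | having("canines")    # dogs OR canines
more_nots = more_ors - having("wolves")          # (dogs OR canines) NOT wolves

for label, result in [("baseline", baseline), ("more ANDs", more_ands),
                      ("more ORs", more_ors), ("more NOTs", more_nots)]:
    print(label, evaluate(result))
```

Compare "more ANDs" and "more ORs" with the baseline, and "more NOTs" with the OR query it narrows: output size, recall and precision move in the directions the table gives.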


Tactics: What to do?

• To increase precision:
– use precision devices

• To increase recall:
– use recall devices

• Each will also affect magnitude of output

• With experience, use of these devices will become second nature


Recall, precision devices

BROADENING - higher recall:
– Fewer ANDs
– More ORs
– Fewer NOTs
– More free text
– Fewer controlled
– More synonyms
– Broader terms
– Less specific
– More truncation
– Fewer qualifiers
– Fewer limits
– Citation growing

NARROWING - higher precision:
– More ANDs
– Fewer ORs
– More NOTs
– Less free text
– More controlled
– Fewer synonyms
– Narrower terms
– More specific
– Less truncation
– More qualifiers
– More limits
– Building blocks


Other tactics

• Citation growing:
– find a relevant document
– look for documents cited in it
– look for documents citing it
– repeat on newly found relevant documents

• Building blocks
– find documents with term A
– review
– add term B & so on

• Using different feedbacks
– a most important tool
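Citation growing, as listed above, is a loop over a citation network: from each relevant document follow references in both directions, keep what is judged relevant, and repeat. A minimal sketch follows. The citation data is hypothetical, and the set `judged_relevant` stands in for the searcher's own judgments, which in a real search are made by a person.

```python
from collections import deque

# Hypothetical citation data: document -> documents it cites.
cites = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": [],
    "D": [],
    "E": ["A"],
    "F": ["E", "X"],
    "X": [],
}
# Stand-in for the searcher's relevance judgments (an assumption).
judged_relevant = {"A", "B", "D", "E", "F"}


def cited_by(doc):
    """Documents that cite the given document."""
    return [d for d, refs in cites.items() if doc in refs]


def grow(seed):
    """Follow citations both ways from a relevant seed document."""
    found = {seed}
    queue = deque([seed])
    while queue:
        doc = queue.popleft()
        # Documents cited in it, then documents citing it.
        for candidate in cites.get(doc, []) + cited_by(doc):
            if candidate in judged_relevant and candidate not in found:
                found.add(candidate)
                queue.append(candidate)  # repeat on newly found documents
    return found


print(sorted(grow("A")))  # ['A', 'B', 'D', 'E', 'F']
```

Documents judged not relevant (C and X here) are seen but not expanded, which keeps the growth from drifting off topic.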


Feedback in searching

• Any feedback implies loops
– a completion of a process provides information for modification, if any, for the next process
– information from output is used to change previous or create new input

• In searching:
– some information taken from output of a search is used to do something with next query (search statement)
  examine what you got to decide what to do next in searching
– a basic tactic in searching

• Several feedback types used in searching
– each used for different decisions


Feedback types

• Content relevance feedback
– judge relevance of items retrieved
– make decision what to do next
  switch files, change exhaustivity …

• Term relevance feedback
– find relevant documents
– examine what other terms are used in those documents
– search using additional terms
  also called query modification & in some systems done automatically

• Magnitude feedback
– on the basis of size of output make tactical decisions
  often the size is so big that documents are not examined but next search is done to limit size
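Term relevance feedback can be sketched as counting the terms that occur in documents judged relevant but are not yet in the query. This is one simple way to do it, assumed here for illustration; it is not the method of any particular system, and the function name, query and documents are hypothetical.

```python
from collections import Counter


def suggest_terms(query_terms, relevant_docs, top_n=3):
    """Suggest new query terms seen most often in relevant documents."""
    counts = Counter()
    for doc_terms in relevant_docs:
        # Count each new term once per document.
        counts.update(set(doc_terms) - set(query_terms))
    # Most frequent first; ties broken alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:top_n]]


# Hypothetical query and index terms of three documents judged relevant.
query = ["dogs", "training"]
relevant_docs = [
    ["dogs", "training", "obedience", "puppies"],
    ["dogs", "obedience", "behavior"],
    ["training", "obedience", "puppies"],
]

print(suggest_terms(query, relevant_docs, top_n=2))  # ['obedience', 'puppies']
```

The suggested terms would typically be added with OR to the matching concept, which raises exhaustivity and, usually, recall.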


Feedback types (cont.)

• Tactical review feedback
– after a number of queries (search statements) in the same search, review tactics as to getting desired outputs
  review terms, logic, limits …
– change tactics accordingly

• Strategic review feedback
– after a while (or after consultation with user) review the “big” picture on what was searched and how
  sources, terms, relevant documents, need satisfaction, changes in question, query …
– do next searches accordingly
– used in reiterative searching

• There is a difference between reviewing strategy & tactics
– but they can be combined


Bates Berry-picking model of searching

“…moving through many actions towards a general goal of satisfactory completion of research related to information need.”

– query is shifting (continually)
  as search progresses queries are changing
  different tactics are used
– searcher (user) may move through a variety of sources
  new files, resources may be used
  strategy may change


Berry-picking …

– new information may provide new ideas, new directions
  feedback is used in various ways
– question is not satisfied by a single set of answers, but by a series of selections & bits of information found along the way
  results may vary & may have to be provided in appropriate ways & means
