Basics of Information Retrieval and Query Formulation Bekele Negeri Duresa Nuclear Information Specialist

Jan 20, 2016


Emmeline Doyle

Basics of Information Retrieval and Query Formulation

Bekele Negeri Duresa
Nuclear Information Specialist


Outline

- Information Retrieval
- The INIS Collection
- Planning searches
- Formulating queries
- Measure of search effectiveness
- Using search results and queries


Information Retrieval

Information Retrieval (IR) is finding material (usually documents) that satisfies an information need from within large collections (usually stored on computers).

Today we frequently think first of “web search”, but IR also includes:

- E-mail search
- Searching your laptop
- Corporate knowledge bases
- Structured bibliographic databases


The INIS Collection

The INIS Collection comprises:

- The INIS bibliographic database (3,842,516 records)
- The INIS Nonconventional Literature (NCL) collection (510,458 total and 369,140 publicly available)

A bibliographic database consists of data (records) whose attributes are described in fields, and a means by which to search these fields: a search engine.

Databases may look different on screen, but the underlying principles for searching and formulating search strategies are common to all.


Bibliographic Databases (Types)

- Bibliographic databases: provide only a citation or reference (author(s), title, subject(s), publisher, etc.). With this information you should be able to locate the item in the library.

- Bibliographic databases with some full-text content: these enhanced databases include keywords and abstracts and often, but not always, include the full text of a set of records.

- Bibliographic databases with full-text content: these databases include the entire full text of all articles and other documents indexed.


Planning your search

“Understanding the Problem is Half of the Solution”

Define precisely the information you are seeking

Identify the concepts that represent the problem


Planning a search: example

Risk of medical radiation exposure?

- To whom: to patients, or to doctors and medical technicians?
- If to patients: are you concerned about exposure due to radiodiagnostics (CT, X-ray, etc.), due to radiation therapy, or both?
- If radiotherapy: are you concerned about radionuclide therapy or external irradiation therapy?
- If about personnel: are you interested in the safety policy of medical establishments?


Planning your search (Cont’d)

Topic: Risk of radiation exposure of medical staff in a radiotherapy department

Concepts:

- Exposure to radiation
- Medical staff
- Radiotherapy

Are we looking for documents:

- In a certain language, or from a certain country?
- Latest publications?
- Only records with full-text documents, or journal citations?


INIS Database Fields

Numerical (exact or range search)

- Year of publication (PY)
- Reference Number (RN)

Free text

- Title (TI)
- Authors (AU)
- Source (SO)
- Abstract (AB)

Controlled vocabulary

- Language (LA)
- Country of Input (CO)
- Descriptors (DE)
  - Indexer-assigned descriptors (DEI)
  - Computer-upposted descriptors (DEC)
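The three field types behave differently when searched. The following is a minimal Python sketch of that difference, not the INIS implementation: the `Record` class, the helper functions, and the sample records are all hypothetical, and only the field codes (PY, TI, LA, DE) come from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """Toy bibliographic record (hypothetical); attribute names mirror field codes."""
    py: int                                  # PY: year of publication (numerical)
    ti: str                                  # TI: title (free text)
    la: str                                  # LA: language (controlled vocabulary)
    de: set = field(default_factory=set)     # DE: descriptors (controlled vocabulary)


def year_in_range(rec, start, end):
    """Numerical field: range search, inclusive at both ends."""
    return start <= rec.py <= end


def title_contains(rec, word):
    """Free-text field: case-insensitive match on a whole word of the title."""
    return word.lower() in rec.ti.lower().split()


def has_descriptor(rec, term):
    """Controlled-vocabulary field: exact match against an assigned term."""
    return term.upper() in rec.de


# Invented sample data, for illustration only.
records = [
    Record(2012, "Occupational exposure in radiotherapy departments", "English",
           {"RADIOTHERAPY", "OCCUPATIONAL EXPOSURE"}),
    Record(2005, "Dose limits for medical personnel", "French",
           {"MEDICAL PERSONNEL", "DOSE LIMITS"}),
    Record(1998, "Reactor safety analysis", "English", {"REACTOR SAFETY"}),
]

# Combine a range search on PY with an exact descriptor match on DE.
hits = [r.ti for r in records
        if year_in_range(r, 2000, 2015) and has_descriptor(r, "radiotherapy")]
print(hits)
```

A free-text match depends on the words the author happened to use, while a descriptor match depends on the term the indexer assigned, which is why the same concept can be found in one field and missed in another.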


Search Strategy

Simple search

- Single search term or phrase: “Oncology”, “nuclear safety”

Advanced search (combining concepts)

- Boolean operators: OR, AND, NOT
- Text operators: any (includes any), all (includes all), exact phrase
- Numeric operators: equal, more, less, more or equal, less or equal
- Truncation, wildcard
- Multilingual search
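One way to picture what these operators do is as set operations on the records matched by each term. The sketch below is an assumption-laden illustration in Python: the four sample documents, the tiny inverted index, and the helper names `term`, `truncated`, and `phrase` are invented here and say nothing about how the INIS search engine works internally.

```python
# Invented sample documents: id -> text.
docs = {
    1: "nuclear safety in oncology departments",
    2: "radiation protection and nuclear safety",
    3: "oncology treatment planning",
    4: "radiological protection of patients",
}

# Inverted index: word -> set of ids of the documents that contain it.
index = {}
for doc_id, text in docs.items():
    for word in text.split():
        index.setdefault(word, set()).add(doc_id)


def term(word):
    """Documents containing the exact word."""
    return index.get(word, set())


def truncated(prefix):
    """Truncation: documents containing any word that starts with the prefix."""
    result = set()
    for word, ids in index.items():
        if word.startswith(prefix):
            result |= ids
    return result


def phrase(text):
    """Exact phrase: the words must appear adjacent and in order."""
    return {doc_id for doc_id, body in docs.items() if f" {text} " in f" {body} "}


both = term("nuclear") & term("oncology")       # AND: intersection
either = term("nuclear") | term("oncology")     # OR: union
without = term("oncology") - term("nuclear")    # NOT: difference
radi = truncated("radi")                        # matches radiation, radiological
safety_phrase = phrase("nuclear safety")

print(both, either, without, radi, safety_phrase)
```

AND can only shrink a result set and OR can only grow it, which is the basis for the precision and recall trade-offs discussed later in the deck.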


Query Syntax (Google Search Appliance)


Query Formulation

Translating your search concepts into proper search syntax

For the topic: Risk of radiation exposure of medical staff in a radiotherapy department

- Simple to complex:
  - MEDICAL PERSONNEL (a broader term, BT, for RADIOLOGICAL PERSONNEL)
  - OCCUPATIONAL EXPOSURE or OCCUPATIONAL SAFETY or RADIATION PROTECTION
  - RADIOTHERAPY
- Try searching for individual terms and explore the database; you may identify other key concepts such as radiation doses, ALARA, dose limits…
- Then combine them using Boolean operators (AND, OR, NOT)
- Some databases allow you to combine searches, while others allow you to combine your results during selection of records
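A common pattern for combining the concepts above is to join alternative terms for one concept with OR, then join the concepts with AND. The Python sketch below builds such a query string; the helper names `or_group` and `build_query` are hypothetical, and the quoted, parenthesised output is a generic Boolean notation assumed for illustration, not necessarily the exact syntax the INIS search interface accepts.

```python
def or_group(terms):
    """Join alternative terms for one concept with OR; quote each as a phrase."""
    quoted = [f'"{t}"' for t in terms]
    if len(quoted) == 1:
        return quoted[0]
    return "(" + " OR ".join(quoted) + ")"


def build_query(concepts):
    """Join the concept groups with AND so every concept must be present."""
    return " AND ".join(or_group(terms) for terms in concepts)


# One inner list per concept, using the descriptors from the slide.
concepts = [
    ["MEDICAL PERSONNEL"],
    ["OCCUPATIONAL EXPOSURE", "OCCUPATIONAL SAFETY", "RADIATION PROTECTION"],
    ["RADIOTHERAPY"],
]

query = build_query(concepts)
print(query)
```

Keeping each concept as its own list makes it easy to go from simple to complex: start with one term per concept, then add synonyms or related terms to a group as you discover them.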


Measuring Search Effectiveness

Precision & Recall

Recall: the ratio of the number of relevant records retrieved to the total number of relevant records in the database.


Precision: the ratio of the number of relevant records retrieved to the total number of records retrieved (relevant and irrelevant).

Precision and recall tend to be inversely related:

- High recall = comprehensive retrieval, but high noise
- High precision = only relevant records, but some good records are missed

Source: http://www.creighton.edu/fileadmin/user/HSL/docs/ref/Searching_-_Recall_Precision.pdf
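Both ratios can be computed directly from two sets of record identifiers. The following Python sketch assumes the set of relevant records is known, which in a real database is rarely the case; the function name `precision_recall` and the sample identifiers are invented for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for two collections of record ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant                       # relevant records retrieved
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: 10 relevant records exist in the database;
# the search returns 8 records, of which 4 are relevant and 4 are noise.
relevant = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
retrieved = {1, 2, 3, 4, 11, 12, 13, 14}

p, r = precision_recall(retrieved, relevant)
print(f"precision = {p:.2f}, recall = {r:.2f}")
```

Here precision is 4/8 and recall is 4/10: half of what was retrieved is useful, and more than half of the useful records were missed.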


Optimize your search strategy

Precision

- Search in a particular field
- Search in DEI (indexer-assigned descriptors)
- Use exact phrase
- Combine using “AND”

Recall

- Search across fields
- Combine synonyms, related terms, broad or general terms
- Use “any” or “all” words

Optimise

- Use your best judgment
- From simple to complex
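The effect of these choices can be seen on a toy collection. In the Python sketch below, a narrow strategy (descriptor field only, terms combined with AND) is compared with a broad one (any of several words, searched across title and descriptors). The four records, the `relevant` set, and the helpers `words_of` and `score` are all invented for illustration.

```python
# Invented records: id -> title (ti) and descriptors (de).
records = {
    1: {"ti": "staff doses in radiotherapy",
        "de": {"RADIOTHERAPY", "OCCUPATIONAL EXPOSURE"}},
    2: {"ti": "occupational exposure of radiotherapy nurses",
        "de": {"RADIOTHERAPY", "MEDICAL PERSONNEL"}},
    3: {"ti": "patient doses in radiotherapy",
        "de": {"RADIOTHERAPY", "PATIENTS"}},
    4: {"ti": "occupational exposure in uranium mines",
        "de": {"OCCUPATIONAL EXPOSURE", "URANIUM MINES"}},
}
relevant = {1, 2}   # assumed: records about staff exposure in radiotherapy


def words_of(rec):
    """All lower-cased words from the title and the descriptors."""
    words = set(rec["ti"].lower().split())
    for descriptor in rec["de"]:
        words |= set(descriptor.lower().split())
    return words


def score(retrieved):
    """Return (precision, recall) against the assumed relevant set."""
    hits = len(retrieved & relevant)
    return hits / len(retrieved), hits / len(relevant)


# Precision-oriented: descriptor field only, both terms required (AND).
required = {"RADIOTHERAPY", "OCCUPATIONAL EXPOSURE"}
narrow = {i for i, rec in records.items() if required <= rec["de"]}

# Recall-oriented: any of the words, searched across fields (OR).
any_of = {"radiotherapy", "occupational", "exposure"}
broad = {i for i, rec in records.items() if any_of & words_of(rec)}

print("narrow:", sorted(narrow), score(narrow))
print("broad: ", sorted(broad), score(broad))
```

The narrow query returns only record 1 (every hit is relevant, but record 2 is missed), while the broad query returns all four records (nothing relevant is missed, but half the results are noise).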


Using your Query and search results

Selecting relevant records

- Select format (PDF, HTML, Excel, etc.)
- Printing/saving
- Email search results

Storing query

- Save and run query
- Subscribe to feeds


Thank you!
