Bioinformatics: Information Retrieval - II

Information Retrieval - II

Information retrieval (IR) is the science of searching for documents, for information within documents and for metadata about documents, as well as that of searching relational databases and the World Wide Web.

IR is interdisciplinary, based on computer science, mathematics, library science, information science, information architecture, cognitive psychology, linguistics, statistics and physics.

Information Storage and Retrieval (ISAR):

Operations performed by the hardware and software used in indexing and storing a file of machine-readable records whenever a user queries the system for information relevant to a specific topic. For records to be retrieved, the search statement must be expressed in syntax executable by the computer.
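The indexing and querying described above can be illustrated with a small sketch. It is not taken from any particular ISAR system: the function names `build_index` and `search_all_terms`, the sample records, and the choice of a simple inverted index with Boolean AND matching are all assumptions made for illustration.

```python
from collections import defaultdict


def build_index(records):
    """Map each lowercase word to the set of record ids that contain it."""
    index = defaultdict(set)
    for record_id, text in records.items():
        for word in text.lower().split():
            index[word].add(record_id)
    return index


def search_all_terms(index, query):
    """Return the ids of records containing every word of the query (Boolean AND)."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical machine-readable records, keyed by record id.
records = {
    1: "genomic information retrieval",
    2: "legal information retrieval",
    3: "health information system",
}
index = build_index(records)
print(sorted(search_all_terms(index, "information retrieval")))  # [1, 2]
```

Here the search statement is a plain string of words, which the program turns into set intersections; a real system would typically define a richer query syntax.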

Information Storage and Retrieval (ISAR):

An information system (IS) is a computer hardware and software system designed to accept, store, manipulate, and analyze data and to report results, usually on a regular, ongoing basis. An IS usually consists of a data input subsystem, a data storage and retrieval subsystem, a data analysis and manipulation subsystem, and a reporting subsystem.


Such systems are widely used in scientific research, business management, medicine and health, resource management, and other fields that require statistical reporting.

Information Retrieval Process:

An information retrieval process begins when a user enters a query into the system. Queries are formal statements of information needs, for example search strings in web search engines. In information retrieval, a query does not uniquely identify a single object in the collection. Instead, several objects may match the query, perhaps with different degrees of relevancy.


An object is an entity which keeps or stores information in a database. User queries are matched to objects stored in the database. Depending on the application the data objects may be, for example, text documents, images or videos. Often the documents themselves are not kept or stored directly in the IR system, but are instead represented in the system by document surrogates.


Most IR systems compute a numeric score for how well each object in the database matches the query, and rank the objects according to this value. The top-ranking objects are then shown to the user. The process may then be iterated if the user wishes to refine the query.
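The score-then-rank step can be sketched as follows. The scoring rule used here (the fraction of distinct query terms found in a document) is an assumption chosen for simplicity, not the method of any specific IR system, and the names `score`, `rank`, and `top_k` as well as the sample documents are hypothetical.

```python
def score(query, document):
    """Fraction of distinct query terms that occur in the document (0.0 to 1.0)."""
    query_terms = set(query.lower().split())
    doc_terms = set(document.lower().split())
    if not query_terms:
        return 0.0
    return len(query_terms & doc_terms) / len(query_terms)


def rank(query, documents, top_k=3):
    """Return up to top_k (doc_id, score) pairs with a non-zero score, best first."""
    scored = [(doc_id, score(query, text)) for doc_id, text in documents.items()]
    matches = [pair for pair in scored if pair[1] > 0]
    # Sort by descending score; ties are broken by document id for stable output.
    matches.sort(key=lambda pair: (-pair[1], pair[0]))
    return matches[:top_k]


# Hypothetical collection of text documents, keyed by document id.
documents = {
    "d1": "image retrieval for medical images",
    "d2": "music retrieval",
    "d3": "district health information system",
}
print(rank("medical image retrieval", documents))
```

In this example d1 contains all three query terms, d2 contains only one, and d3 contains none, so d1 is ranked first and d3 is not returned. Refining the query and calling `rank` again corresponds to the iteration described above.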

Performance measures:

Many different measures for evaluating the performance of information retrieval systems have been proposed. The measures require a collection of documents and a query. All common measures described here assume a ground truth notion of relevancy: every document is known to be either relevant or non-relevant to a particular query. In practice queries may be ill-posed and there may be different shades of relevancy. Precision and Recall are two widely used measures for evaluating the quality of results in domains such as Information Retrieval and statistical classification.


Precision can be seen as a measure of exactness or fidelity, whereas Recall is a measure of completeness.

In an Information Retrieval scenario, Precision is defined as the number of relevant documents retrieved by a search divided by the total number of documents retrieved by that search, and Recall is defined as the number of relevant documents retrieved by a search divided by the total number of existing relevant documents (which should have been retrieved).


Precision

Precision is the fraction of the documents retrieved that are relevant to the user's information need.

Recall

Recall is the fraction of the documents that are relevant to the query that are successfully retrieved.

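In the formulas below, consistent with the definitions above, A is the number of relevant documents retrieved, B is the number of non-relevant documents retrieved, and D is the number of relevant documents not retrieved.

Recall = A/(A+D)

Proportion of documents relevant to a search question that are retrieved by a given search formulation.

Precision = A/(A+B)

Proportion of documents retrieved by a given search formulation that are relevant to the search question.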
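As a worked illustration of the two formulas, the sketch below computes both measures from a set of retrieved documents and a set of relevant documents. It reads A as relevant documents retrieved, B as non-relevant documents retrieved, and D as relevant documents not retrieved; this reading is inferred from the definitions rather than stated explicitly in the slides. The function name `precision_recall` and the sample document ids are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query, given two collections of doc ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    a = len(retrieved & relevant)   # A: relevant documents that were retrieved
    b = len(retrieved - relevant)   # B: retrieved documents that are not relevant
    d = len(relevant - retrieved)   # D: relevant documents that were missed
    # Guard against empty sets to avoid division by zero (a convention, not a rule).
    precision = a / (a + b) if retrieved else 0.0
    recall = a / (a + d) if relevant else 0.0
    return precision, recall


# Hypothetical search result: 4 documents retrieved, 3 relevant documents exist.
retrieved = {"d1", "d2", "d3", "d4"}
relevant = {"d1", "d2", "d5"}
precision, recall = precision_recall(retrieved, relevant)
print(precision, recall)
```

With these sets, A = 2, B = 2 and D = 1, so precision is 2/4 = 0.5 and recall is 2/3, about 0.67: half of what was retrieved is relevant, and two thirds of what is relevant was retrieved.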

General applications of information retrieval

* Digital libraries
* Information filtering
* Media search
    o Blog search
    o Image retrieval
    o Music retrieval
    o News search
    o Speech retrieval
    o Video retrieval
* Search engines
    o Desktop search
    o Enterprise search
    o Federated search
    o Mobile search
    o Social search
    o Web search

Domain specific applications of information retrieval

* Expert search finding
* Genomic information retrieval
* Geographic information retrieval
* Information retrieval for chemical structures
* Information retrieval in software engineering
* Legal information retrieval
* Vertical search

District Health Information System (DHIS)

The District Health Information System (DHIS) is a highly flexible, open-source health management information system and data warehouse. It is developed by the Health Information Systems Programme (HISP) project.


The solution covers aggregated routine data, semi-permanent data (staffing, equipment, infrastructure, population estimates), survey/audit data, and certain types of case-based or patient-based data (for instance disease notification or patient satisfaction surveys). The system supports the capture of data linked to any level in an organizational hierarchy, any data collection frequency, and a high degree of customization on both the input and output sides. It has been translated into a number of languages.

Health Information Systems Program (HISP)

Practicals

* EHR (Google Health and MS HealthVault)
* MRS (OpenMRS)
* HIS (DHIS v 2.0)
* CPOE
* Bioinformatics Portal (PU)

Thank you...
