Product: BioXM Knowledge Management Environment
Applications: knowledge management and semantic data integration, research collaboration, information publishing and project management
Product contact: [email protected]
Biomax Informatics AG, Lochhamer Str. 9, 82152 Martinsried, Germany, +49 89 895574-0
Sophic Systems Alliance, Inc., 200 Main Street, Suite 201, Falmouth, MA 02536, USA, (508) 495-3801
Internet: www.biomax.com, www.sophicalliance.com

Efficient biomedical literature mining
Scientists involved in disease-specific research, target-gene identification, target validation, chemical-compound development, diagnostics and treatment spend valuable time screening scientific literature. This screening comes at the expense of productive research time, and the results are often unstructured and notably incomplete.
The BioLT™ Literature Mining Tool provides an alternative. The intuitive software performs structured text mining using a number of highly curated biological and medical term dictionaries. The tool extracts relations from search terms and their synonyms to terms in selected dictionaries. More than 166 million pre-calculated relations and free-text search capabilities ensure comprehensive research area coverage. The resulting structured information can be easily shared, extended and updated. The results provide a starting point to generate and refine knowledge and hypotheses.
The BioLT tool allows researchers to save time and produce significantly better output than common PubMed searches. Integration of the BioLT tool in research infrastructures, for example the BioXM™ Knowledge Management Environment, can considerably improve the efficiency and outcomes of R&D projects.

Building a knowledge base for oncology with BioLT linguistics
The BioLT tool is the central data mining component used to create a manually curated, up-to-date index covering all cancer genes, including their compound and disease relationships. Preliminary results were published at ISMB 2005. Biomax also carries out customized text-mining projects for other disease areas and biological contexts.
The BioLT tool provides comprehensive, structured and ranked answers to the following types of questions:
• Which genes and proteins are known to be related to breast cancer? (For example, the BioLT tool presents a sorted list of about 3,000 gene/protein terms, compared to over 130,000 abstracts in PubMed.)
• For obesity, which genes show genetic variation and which variants are described (e.g., nutrigenomics and pharmacogenomics use cases)?
• Which diseases and drug compounds are potentially related to Alzheimer’s disease?
The BioLT tool with query results for diseases related to the protein apoE
Automatically generated expert knowledge
The BioLT tool delivers clearly
structured results with extraordinary
recall and precision, as shown in the
following benchmark example. The
BioLT results were compared to a
manually curated list of "all major
pathways and hereditary cancer
predisposition types" each related to
one of 57 representative predisposition
genes (Vogelstein and Kinzler, 2004*).
With 100% recall, all 57 genes and 57
cancer types were represented in the
BioLT dictionaries. 95% of the
relationships were ranked in the top
three results of up to thousands of hits.
For the remaining three genes, the
corresponding diseases were found in
positions four and five.
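The benchmark figures above can be checked with a short calculation. The Python sketch below is illustrative only: the rank list is a hypothetical reconstruction (54 relationships in the top three and three at positions four or five, which is what the text implies for 57 genes), and the function names are assumptions rather than part of the BioLT tool. Here recall is computed over relationships found, whereas the brochure reports it for dictionary coverage.

```python
from typing import Optional, Sequence


def recall(ranks: Sequence[Optional[int]]) -> float:
    """Fraction of expected relationships that appear anywhere in the results."""
    return sum(r is not None for r in ranks) / len(ranks)


def top_k_rate(ranks: Sequence[Optional[int]], k: int) -> float:
    """Fraction of expected relationships ranked at position k or better."""
    return sum(r is not None and r <= k for r in ranks) / len(ranks)


# Hypothetical ranks for the 57 gene-disease pairs. The text only implies
# that 54 fall in the top three and the remaining three at positions 4 or 5;
# the exact positions used here are assumptions.
benchmark_ranks = [1] * 54 + [4, 4, 5]

if __name__ == "__main__":
    print(f"recall:     {recall(benchmark_ranks):.0%}")
    print(f"top-3 rate: {top_k_rate(benchmark_ranks, 3):.0%}")
    print(f"top-5 rate: {top_k_rate(benchmark_ranks, 5):.0%}")
```

With these assumed ranks, 54 of 57 relationships (about 95%) fall in the top three, which matches the figure quoted in the text.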
The BioLT tool automatically generates
comprehensive results comparable to
the knowledge of expert scientists. The
BioLT text-mining approach works for
other disease areas (such as
cardiovascular, neurological and
infectious diseases) and for additional
biological research areas as well.
Integration into biological and medical project management
The BioLT tool uses high-quality
thematic dictionaries to identify
relationships between research objects.
The dictionaries can be extended and
customized. The following dictionaries
are currently available:
• Disease — 260,000 entries
• Gene name — 130,000 human gene
names, including name variants
• Compound — 82,000 entries
• Pathway — 61,000 entries
• Organism — 275,000 entries
• Other subdomains (e.g.,
polymorphism, therapy, tissues, cells)
These relationship data sets can be
imported into the BioXM Knowledge
Management Environment for further
curation. With the upload, they are
automatically integrated into a user-
defined biological or medical context.
Thus, BioLT results become part of an
efficient infrastructure even for large
distributed R&D projects.
* Vogelstein B and Kinzler KW (2004) Cancer genes and the pathways they control. Nat Med 10(8):789–99
Text-mining technology
In contrast to classical information
retrieval systems, the BioLT software
preprocesses the underlying text
databases (such as scientific or patent
information) with specific background
information. The system first recognizes
all chunks of text (phrases), special
patterns for scientific notations and
words belonging to terminology
dictionaries. After the syntactic analysis,
the system tries to determine the
meaning of ambiguous terms. To
ensure the most complete results,
potentially false meanings are marked,
but are not deleted from the knowledge
database. The resulting text databases
are manually curated by experts to
create the thematic dictionaries used
by the BioLT system.
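The general idea of dictionary-based term recognition described above can be sketched in a few lines of Python. This is a minimal illustration, not the BioLT implementation: the dictionaries, the function names and the sample abstracts are all hypothetical, phrase chunking and syntactic analysis are omitted, and a term is simply flagged as ambiguous (rather than removed) when the same surface form occurs in more than one dictionary.

```python
import re
from collections import Counter
from itertools import product

# Hypothetical miniature dictionaries: lowercase surface form -> canonical term.
DICTIONARIES = {
    "gene": {"apoe": "APOE", "apolipoprotein e": "APOE", "cat": "CAT"},
    "disease": {"alzheimer's disease": "Alzheimer's disease"},
    "organism": {"cat": "Felis catus"},
}


def tag_terms(text):
    """Return sorted (category, canonical term, ambiguous) tuples found in text."""
    lowered = text.lower()
    hits = set()
    for category, entries in DICTIONARIES.items():
        for surface, canonical in entries.items():
            if re.search(r"\b" + re.escape(surface) + r"\b", lowered):
                # Ambiguous meanings are marked, not discarded.
                ambiguous = sum(surface in d for d in DICTIONARIES.values()) > 1
                hits.add((category, canonical, ambiguous))
    return sorted(hits)


def relations(abstracts, source_category, target_category):
    """Count abstracts in which a source term and a target term co-occur."""
    counts = Counter()
    for abstract in abstracts:
        hits = tag_terms(abstract)
        sources = {term for cat, term, _ in hits if cat == source_category}
        targets = {term for cat, term, _ in hits if cat == target_category}
        counts.update(product(sources, targets))
    return counts.most_common()


# Hypothetical sample abstracts, written for this example.
ABSTRACTS = [
    "Apolipoprotein E (APOE) is a risk factor for Alzheimer's disease.",
    "APOE variants and Alzheimer's disease progression.",
    "CAT activity was measured in the cat.",
]

if __name__ == "__main__":
    print(tag_terms(ABSTRACTS[2]))
    print(relations(ABSTRACTS, "gene", "disease"))
```

Synonyms resolve to one canonical term, so "Apolipoprotein E" and "APOE" in the same abstract count once. Simple co-occurrence counting stands in here for whatever ranking statistics the real system applies.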
The BioLT tool uses the BioRS™
Integration and Retrieval System to add
Boolean free-text search capabilities.
Diverse analysis parameters including
the scope of the search, the level of
precision, the resolution of terms with
multiple meanings and the statistical
representation of the results can be
selected.
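As a rough illustration of what Boolean free-text search means in practice, the following Python sketch builds a small inverted index and evaluates AND, OR and NOT conditions against it. It is a generic textbook technique under stated assumptions, not the BioRS system or its query syntax; the function names, parameters and sample documents are hypothetical.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each lowercase token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in re.findall(r"[a-z0-9]+", text.lower()):
            index[token].add(doc_id)
    return index


def search(index, all_of=(), any_of=(), none_of=()):
    """Return ids matching every all_of term, at least one any_of term
    (if any are given) and no none_of term."""
    result = set().union(*index.values())  # start from all indexed documents
    for term in all_of:
        result &= index.get(term.lower(), set())
    if any_of:
        result &= set().union(*(index.get(t.lower(), set()) for t in any_of))
    for term in none_of:
        result -= index.get(term.lower(), set())
    return sorted(result)


# Hypothetical sample documents, written for this example.
DOCUMENTS = {
    1: "APOE and Alzheimer disease risk",
    2: "BRCA1 mutations in breast cancer",
    3: "APOE in cardiovascular disease",
}

if __name__ == "__main__":
    index = build_index(DOCUMENTS)
    # APOE AND disease NOT cardiovascular
    print(search(index, all_of=["APOE", "disease"], none_of=["cardiovascular"]))
    # BRCA1 OR APOE
    print(search(index, any_of=["BRCA1", "APOE"]))
```

Set intersection, union and difference map directly onto AND, OR and NOT, which keeps the sketch short; a production retrieval system would add phrase queries, ranking and the other analysis parameters listed above.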
Dictionary terms in all abstracts
BioLT results in the context of a clinical study, displayed in the BioXM software
Biomax, BioLT, BioRS and BioXM are registered trademarks of Biomax Informatics AG in Germany and other countries. Registered names, trademarks, etc., used in this document, even when not specifically marked as such, are not to be considered unprotected by law. BIOLTPPR0602
FREE TRIAL
The BioLT tool for efficient text mining of the MEDLINE database is available through a standard Web browser at www.biomax.com/products/biolt/biolt.htm. Contact us for a free demo account and see how the BioLT tool can speed your research.