• text mining https://store.theartofservice.com/the-text-mining-toolkit.html

Text mining .

Jan 01, 2016


Prudence Fowler
Transcript
Page 1: Text mining .

• text mining

https://store.theartofservice.com/the-text-mining-toolkit.html

Page 2: Text mining .

Text mining

Typical text mining tasks include text categorization, text clustering, concept/entity extraction, production of granular taxonomies, sentiment analysis, document summarization, and entity relation modeling (i.e., learning relations between named entities).
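As a rough illustration of the first task in that list, text categorization, the sketch below assigns a document the label of the most similar labelled example, using bag-of-words vectors and cosine similarity. This is only one simple approach among many; the function names and the toy data are invented for this sketch and do not come from the slides.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words Counters (0.0 to 1.0)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is similar to nothing
    return dot / (norm_a * norm_b)


def categorize(document, labelled_examples):
    """Return the label of the labelled example most similar to the document."""
    doc_vec = Counter(tokenize(document))
    scored = [
        (label, cosine_similarity(doc_vec, Counter(tokenize(text))))
        for label, text in labelled_examples
    ]
    best_label, _ = max(scored, key=lambda pair: pair[1])
    return best_label


# Hypothetical toy data, invented for this sketch.
EXAMPLES = [
    ("biology", "gene expression protein cell genome"),
    ("finance", "stock market shares revenue profit"),
]

print(categorize("The protein regulates gene expression in the cell", EXAMPLES))
```

Real systems typically weight terms (for example with TF-IDF) and train a statistical classifier rather than comparing against single examples.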


Page 3: Text mining .

Text mining - Text mining and text analytics

The term 'text analytics' is now used more frequently in business settings, while 'text mining' is used in some of the earliest application areas, dating to the 1980s, notably life-sciences research and government intelligence.


Page 4: Text mining .

Text mining - History

Labor-intensive manual text mining approaches first surfaced in the mid-1980s, but technological advances have enabled the field to advance during the past decade. Text mining is an interdisciplinary field that draws on information retrieval, data mining, machine learning, statistics, and computational linguistics. As most information (common estimates say over 80%) is currently stored as text, text mining is believed to have high potential commercial value.


Page 5: Text mining .

Text mining - Security applications

Many text mining software packages are marketed for security applications, especially monitoring and analysis of online plain text sources such as Internet news, blogs, etc. for national security purposes. It is also involved in the study of text encryption/decryption.


Page 6: Text mining .

Text mining - Biomedical applications

A range of text mining applications in the biomedical literature has been described.


Page 7: Text mining .

Text mining - Biomedical applications

One online text mining application in the biomedical literature is PubGene, which combines biomedical text mining with network visualization as an Internet service. TPX is a concept-assisted search and navigation tool for biomedical literature analyses. It runs on PubMed/PMC and can be configured, on request, to run on local literature repositories too.


Page 8: Text mining .

Text mining - Software applications

Text mining methods and software are also being researched and developed by major firms, including IBM and Microsoft, to further automate the mining and analysis processes, and by different firms working in the area of search and indexing in general as a way to improve their results.


Page 9: Text mining .

Text mining - Online media applications

Text mining is being used by large media companies, such as the Tribune Company, to clarify information and to provide readers with better search experiences, which in turn increases site stickiness and revenue. Additionally, on the back end, editors benefit from being able to share, associate and package news across properties, significantly increasing opportunities to monetize content.


Page 10: Text mining .

Text mining - Marketing applications

Text mining is starting to be used in marketing as well, more specifically in analytical customer relationship management. Coussement and Van den Poel (2008) apply it to improve predictive analytics models for customer churn (customer attrition).


Page 11: Text mining .

Text mining - Academic applications

Initiatives have been taken, such as Nature's proposal for an Open Text Mining Interface (OTMI) and the National Institutes of Health's common Journal Publishing Document Type Definition (DTD), that would provide semantic cues to machines to answer specific queries contained within text without removing publisher barriers to public access.


Page 12: Text mining .

Text mining - Academic applications

Academic institutions have also become involved in the text mining initiative:


Page 13: Text mining .

Text mining - Academic applications

With an initial focus on text mining in the biological and biomedical sciences, research has since expanded into the areas of social sciences.


Page 14: Text mining .

Text mining - Academic applications

* In the United States, the School of Information at University of California, Berkeley is developing a program called BioText to assist biology researchers in text mining and analysis.


Page 15: Text mining .

Text mining - Academic applications

Further, private initiatives also offer tools for academic text mining:


Page 16: Text mining .

Text mining - Software and applications

Text mining computer programs are available from many commercial and open source companies and sources.


Page 17: Text mining .

Text mining - Commercial

* AeroText – a suite of text mining applications for content analysis. Content used can be in multiple languages.


Page 18: Text mining .

Text mining - Commercial

* Attensity – hosted, integrated and stand-alone text mining (analytics) software that uses natural language processing technology to address collective intelligence in social media and forums; the voice of the customer in surveys and emails; customer relationship management; e-services; research and e-discovery; risk and compliance; and intelligence analysis.

Page 19: Text mining .

Text mining - Commercial

* Autonomy – text mining, clustering and categorization software


Page 20: Text mining .

Text mining - Commercial

* Clarabridge – text analytics (text mining) software, including natural language processing (NLP), machine learning, clustering and categorization. Provides SaaS, hosted and on-premise text and sentiment analytics that enable companies to collect, listen to, analyze, and act on the Voice of the Customer (VOC) from both external sources (Twitter, Facebook, Yelp!, product forums, etc.) and internal sources (Call Center notes, CRM, Enterprise Data Warehouse, BI, surveys, emails, etc.).


Page 21: Text mining .

Text mining - Commercial

* WordStat - Content analysis and text mining add-on module of QDA Miner for analyzing large amounts of text data.


Page 22: Text mining .

Text mining - Open source

* The programming language R provides a framework for text mining applications in the package tm.


Page 23: Text mining .

Text mining - Open source

* KH Coder - For content analysis, text mining or corpus linguistics.


Page 24: Text mining .

Text mining - Implications

Until recently, websites most often used text-based searches, which only found documents containing specific user-defined words or phrases. Now, through use of a semantic web, text mining can find content based on meaning and context (rather than just by a specific word).
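The difference between the two kinds of search can be sketched with a toy example. The first function matches only the literal query word; the second expands the query with related terms before matching. The hand-written `RELATED_TERMS` map is a hypothetical stand-in for the far richer knowledge a semantic approach would draw on, and all names and documents here are invented for illustration.

```python
import re

# Hypothetical toy documents, invented for this sketch.
DOCUMENTS = {
    "doc1": "The physician prescribed a new medication.",
    "doc2": "Our doctor recommended more exercise.",
    "doc3": "The stock market closed higher today.",
}

# Hypothetical concept map standing in for real semantic knowledge.
RELATED_TERMS = {"doctor": {"doctor", "physician"}}


def tokens(text):
    """Return the set of lowercase word tokens in the text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def keyword_search(query, documents):
    """Find documents that contain the literal query word."""
    word = query.lower()
    return sorted(doc_id for doc_id, text in documents.items()
                  if word in tokens(text))


def concept_search(query, documents, related_terms):
    """Find documents that contain the query word or any related term."""
    word = query.lower()
    terms = related_terms.get(word, {word})
    return sorted(doc_id for doc_id, text in documents.items()
                  if terms & tokens(text))


print(keyword_search("doctor", DOCUMENTS))
print(concept_search("doctor", DOCUMENTS, RELATED_TERMS))
```

The keyword search misses the document that says "physician", while the expanded search finds both documents about the same concept.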


Page 25: Text mining .

Text mining - Implications

Additionally, text mining software can be used to build large dossiers of information about specific people and events. For example, large datasets based on data extracted from news reports can be built to facilitate social network analysis or counter-intelligence. In effect, the text mining software may act in a capacity similar to an intelligence analyst or research librarian, albeit with a more limited scope of analysis.

Page 26: Text mining .

Text mining - Implications

Text mining is also used in some email spam filters as a way of determining the characteristics of messages that are likely to be advertisements or other unwanted material.
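One common way to learn such characteristics is a naive Bayes classifier over word counts. The slides do not say which technique spam filters use, so the sketch below is an assumption chosen for illustration; the class name, method names, and training messages are all invented.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesSpamFilter:
    """Toy word-count naive Bayes filter (an illustrative assumption)."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = Counter()

    def train(self, text, label):
        """Record the words of one message labelled 'spam' or 'ham'."""
        self.word_counts[label].update(tokenize(text))
        self.message_counts[label] += 1

    def _log_score(self, words, label):
        """Log prior plus summed log word likelihoods, add-one smoothed."""
        vocabulary = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_words = sum(self.word_counts[label].values())
        total_messages = sum(self.message_counts.values())
        score = math.log(self.message_counts[label] / total_messages)
        for word in words:
            count = self.word_counts[label][word]
            score += math.log((count + 1) / (total_words + len(vocabulary)))
        return score

    def is_spam(self, text):
        """True if the spam score beats the ham score.

        Assumes at least one spam and one ham message have been trained.
        """
        words = tokenize(text)
        return self._log_score(words, "spam") > self._log_score(words, "ham")


# Hypothetical training messages, invented for this sketch.
spam_filter = NaiveBayesSpamFilter()
spam_filter.train("win a free prize now", "spam")
spam_filter.train("free money click now", "spam")
spam_filter.train("meeting agenda for monday", "ham")
spam_filter.train("lunch with the project team", "ham")

print(spam_filter.is_spam("claim your free prize"))
print(spam_filter.is_spam("project meeting on monday"))
```

Working in log space avoids numerical underflow on long messages, and add-one smoothing keeps words that were never seen in training from zeroing out a score.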


Page 27: Text mining .

Intelligent text analysis - Text mining and text analytics

The term 'text analytics' describes a set of linguistic, statistical, and machine learning techniques that model and structure the information content of textual sources for business intelligence, exploratory data analysis, research, or investigation. [http://intelligent-enterprise.informationweek.com/blog/archives/2007/02/defining_text_a.html Defining Text Analytics] The term is roughly synonymous with text mining; indeed, Ronen Feldman modified a 2000 description of text mining [http://www.cs.cmu.edu/~dunja/CFPWshKDD2000.html KDD-2000 Workshop on Text Mining] in 2004 to describe text analytics. [http://www.ir.iit.edu/cikm2004/tutorials.html#T2 Text Analytics: Theory and Practice] The latter term is now used more frequently in business settings while text mining is used in some of the earliest application areas, dating to the 1980s, notably life-sciences research and government intelligence.


Page 28: Text mining .

National Centre for Text Mining

The National Centre for Text Mining (NaCTeM) is a publicly funded text mining (TM) centre. It was established to provide support, advice, and information on TM technologies and to disseminate information from the larger TM community, while also providing tailored services and tools in response to the requirements of the United Kingdom academic community.


Page 29: Text mining .

National Centre for Text Mining

The software tools and services which NaCTeM supplies allow researchers to apply text mining techniques to problems within their specific areas of interest. Examples of these tools are highlighted below. In addition to providing services, the Centre is also involved in, and makes significant contributions to, the text mining research community both nationally and internationally in initiatives such as Europe PubMed Central.


Page 30: Text mining .

National Centre for Text Mining - Resources

[http://www-tsujii.is.s.u-tokyo.ac.jp/GENIA/home/wiki.cgi?page=GENIA+corpus GENIA] - a collection of reference materials for the development of biomedical text mining systems


Page 31: Text mining .

List of text mining software - Commercial

* AUTINDEX - a commercial text mining software package based on sophisticated linguistics by IAI (Institute for Applied Information Sciences), Saarbrücken.


Page 32: Text mining .

List of text mining software - Commercial

* Eduworks – software and solutions providing analytics and text mining in education, competency management, and training.


Page 33: Text mining .

Biomedical text mining

Biomedical text mining (also known as BioNLP) refers to text mining applied to texts and literature of the biomedical and molecular biology domain. It is a rather recent research field at the intersection of natural language processing, bioinformatics, medical informatics and computational linguistics.


Page 34: Text mining .

Biomedical text mining

There is an increasing interest in text mining and information extraction strategies applied to the biomedical and molecular biology literature due to the increasing number of electronically available publications stored in databases such as PubMed.


Page 35: Text mining .

Biomedical text mining - Main applications

Information extraction and text mining methods have been explored to extract information related to biological processes and diseases.


Page 36: Text mining .

Biomedical text mining - Examples

* [http://u-compare.org/index.html U-Compare] - an integrated text mining/natural language processing system based on the UIMA Framework, with an emphasis on components for biomedical text mining.


Page 37: Text mining .

Biomedical text mining - Examples

* [http://www-tsujii.is.s.u-tokyo.ac.jp/medie/ MEDIE] - an intelligent search engine to retrieve biomedical correlations from MEDLINE, based on indexing by Natural Language Processing and Text Mining techniques


Page 38: Text mining .

Biomedical text mining - Examples

* [http://www.nextbio.com NextBio] - Life sciences search engine with a text mining functionality that utilizes PubMed abstracts [http://www.nextbio.com/b/home/generalSearch.nb?q=breast+cancer (ex: literature search)] and clinical trials [http://www.nextbio.com/b/home/generalSearch.nb?q=breast+cancer#sitype=TRIALS (example)] to return concepts relevant to the query based on a number of heuristics including ontology relationships, journal impact, publication date, and authorship.


Page 39: Text mining .

Biomedical text mining - Examples

* [http://brainarray.mbni.med.umich.edu/Brainarray/prototype/PubAnatomy/ PubAnatomy] - An interactive visual search engine that provides new ways to explore relationships among Medline literature, text mining results, anatomical structures, gene expression and other background information.


Page 40: Text mining .

Biomedical text mining - Examples

* [http://anote-project.org @Note2] - A workbench for Biomedical Text Mining (including Information Retrieval, Named Entity Recognition and Relation Extraction plugins)
