Beth Golden Manager, Editorial Services Factiva Intelligent Indexing™ SLA 2004

Mar 27, 2015


Maria Todd
Page 1

Beth Golden

Manager, Editorial Services

Factiva Intelligent Indexing™

SLA 2004

Page 2

Agenda

• Factiva Intelligent Indexing™

• Application of Factiva Intelligent Indexing™

• Pros and Cons

• Quality Control

Page 3

Factiva Intelligent Indexing™

Factiva Taxonomy

320,000 companies

760+ industries

450+ news subjects

370+ regions

22 languages

Page 4

FII Structure

• One universal taxonomy

• Building blocks

• Inclusive hierarchy

• Polyarchy

• Synonyms and alias names

• Full descriptions

• Variable depth and breadth

Page 5

Polyarchy

• Internet/Online services

    • E-commerce

    • Internet browsers

    • Internet portals

    • Internet search engines

    • Internet service providers

    • etc.

• Computers

    • Computer hardware

    • Computer services

    • Computer stores

    • Networking

    • Semiconductors

• Software

    • Applications software

    • GroupWare

    • Intelligent agents

    • Internet browsers

    • etc.
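
A polyarchy lets one term sit under more than one broader term, as "Internet browsers" does under both "Internet/Online services" and "Software". The sketch below is only an illustration of that idea: the `PARENTS` table and the `ancestors` helper are hypothetical names, not part of Factiva's system.

```python
# Hypothetical in-memory taxonomy: each term maps to the set of its broader terms.
PARENTS = {
    "Internet browsers": {"Internet/Online services", "Software"},
    "E-commerce": {"Internet/Online services"},
    "GroupWare": {"Software"},
}

def ancestors(term):
    """Return every broader term reachable from `term`."""
    seen = set()
    stack = list(PARENTS.get(term, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(PARENTS.get(node, ()))
    return seen

def matches(search_term, document_codes):
    """Inclusive hierarchy: a search on a broad term also finds narrower codes."""
    return any(
        code == search_term or search_term in ancestors(code)
        for code in document_codes
    )
```

With this structure, a document coded only "Internet browsers" would be found by a search on either "Software" or "Internet/Online services".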

Page 6

Factiva Intelligent Indexing™

Company Codes

Industry Codes

Subject Codes

Region Codes

[Diagram labels: Codes; On documents; Search]

Page 7

FII Application

• Code mapping

• Entity extraction

• Rule-based system

• Linguistic analysis software

• Manual review

Page 8

Code Mapping

• Most information providers supply some form of metadata, which is matched to relevant Factiva indexing terms.

• Advantages:

• Easy and quick

• Efficient use of existing data

• Disadvantages:

• Mismatches between coding schemes

• Different interpretations of same concepts

• Variable quality – which sources do you trust?
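
Code mapping can be pictured as a lookup table from a provider's own codes to the target indexing terms. This is a minimal sketch under stated assumptions: the provider names, provider codes, and the `CODE_MAP` table are invented for illustration, and only the target terms echo names from the slides.

```python
# Hypothetical mapping table: (provider, provider_code) -> internal indexing term.
CODE_MAP = {
    ("provider_a", "TECH"): "Computers",
    ("provider_a", "ECOM"): "E-commerce",
    ("provider_b", "software"): "Software",
}

def map_codes(provider, provider_codes):
    """Split a provider's codes into mapped indexing terms and unmapped leftovers."""
    mapped, unmapped = [], []
    for code in provider_codes:
        term = CODE_MAP.get((provider, code))
        if term is None:
            unmapped.append(code)  # a mismatch between coding schemes
        else:
            mapped.append(term)
    return mapped, unmapped
```

Keeping the unmapped codes visible, rather than dropping them silently, gives editors a list of scheme mismatches to resolve.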

Page 9

Entity extraction

• This tool finds company names, which are then compared to our controlled vocabulary.

• Advantages:

• Consistent

• Precise

• Disadvantages:

• Ambiguous names

• High maintenance costs
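
The comparison against a controlled vocabulary, and the ambiguous-name problem, can be sketched as follows. The alias table and company codes here are hypothetical examples, and simple whole-word matching stands in for whatever extraction tool is actually used.

```python
import re

# Hypothetical controlled vocabulary: lowercase alias -> set of candidate company codes.
ALIASES = {
    "international business machines": {"IBM"},
    "ibm": {"IBM"},
    "apple": {"APPLE_COMPUTER", "APPLE_CORPS"},  # one name, two companies
}

def extract_companies(text):
    """Return (resolved company codes, aliases that need disambiguation)."""
    lowered = text.lower()
    resolved, ambiguous = set(), set()
    for alias, codes in ALIASES.items():
        # Whole-word match so "ibm" does not fire inside a longer token.
        if re.search(r"\b" + re.escape(alias) + r"\b", lowered):
            if len(codes) == 1:
                resolved |= codes
            else:
                ambiguous.add(alias)
    return resolved, ambiguous
```

The alias table is also where the maintenance cost shows up: every new company, rename, or merger means another entry to add or revise.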

Page 10

Symbology Snapshot

Page 11

Rule-based system

• Sets of IF-THEN statements established by editors, information architects, or subject-matter experts.

• Advantages:

• Good at highly formulaic content

• Precise

• Disadvantages:

• Need thousands of rules for a complete system

• Maintenance of the rules themselves becomes VERY expensive!

• Only captures explicit concepts
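
A set of IF-THEN rules might look like the sketch below. The trigger words and code names are made up for illustration and are not Factiva's actual rules.

```python
# Each hypothetical rule reads: IF all trigger words appear THEN assign the code.
RULES = [
    ({"quarterly", "earnings"}, "Earnings"),
    ({"merger"}, "Mergers/Acquisitions"),
]

def apply_rules(text):
    """Return the codes of every rule whose trigger words all occur in the text."""
    words = set(text.lower().split())
    return [code for triggers, code in RULES if triggers <= words]
```

A story such as "Profits climbed last quarter" is about earnings but contains neither trigger word, so no rule fires. This is the "only captures explicit concepts" limitation, and covering such variants is what pushes the rule count into the thousands.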

Page 12

Example

Page 13

Linguistics-based categorization

• This tool is currently employed across all English, French, German and Spanish language publications. A combination of linguistic analysis and statistical algorithms allows new content to be compared to example data and coded appropriately.

• Advantages:

• Scales to millions of documents, thousands of categories, multiple languages

• Copes well with change

• Fits editorial workflow

• Good fine-tuning tools – editorial control

• Codes implicit as well as explicit concepts

• Disadvantages:

• Training time and cost
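
Comparing new content to example data can be illustrated with a toy statistical categorizer. This is a stand-in technique (bag-of-words profiles scored by cosine similarity), not the software described on the slide, and the training texts are invented.

```python
import math
from collections import Counter

def vectorize(text):
    """Bag-of-words term counts for a piece of text."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

# Hypothetical training set: code -> example texts chosen by editors.
TRAINING = {
    "Software": ["new browser release adds features", "software update fixes bugs"],
    "Semiconductors": ["chip maker expands wafer plant", "new processor chip announced"],
}

def build_profiles(training):
    """Merge each code's examples into a single term-count profile."""
    return {code: vectorize(" ".join(texts)) for code, texts in training.items()}

PROFILES = build_profiles(TRAINING)

def categorize(text, profiles=PROFILES):
    """Return the best-scoring code and the score for every code."""
    doc = vectorize(text)
    scores = {code: cosine(doc, profile) for code, profile in profiles.items()}
    return max(scores, key=scores.get), scores
```

Because the profiles are built from examples rather than hand-written rules, adapting to change means adding or replacing training texts, which is also where the training time and cost arise.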

Page 14

Editorial Control

• Set relevance levels

• Maintain training set

• Stop words: correct for spurious correlation and multiple meanings

• Example: "Chechnya" was added as a stop word to the industries model, as it was triggering the freelance journalist code (because so many of them were dying there)
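
The effect of a relevance level and a stop word can be shown with a deliberately simple scorer. Everything here is an assumption made for illustration: the scoring formula, the model terms, and the threshold of 0.2 are invented, and only the "Chechnya" case comes from the slide.

```python
def score(text, model_terms, stop_words=frozenset()):
    """Fraction of the text's non-stop-word tokens that appear in the model."""
    tokens = [t for t in text.lower().split() if t not in stop_words]
    if not tokens:
        return 0.0
    return sum(t in model_terms for t in tokens) / len(tokens)

def assign(text, model_terms, relevance_level, stop_words=frozenset()):
    """Assign the code only if the score reaches the editor-set relevance level."""
    return score(text, model_terms, stop_words) >= relevance_level

# Hypothetical model for the freelance journalist code, which has picked up
# "chechnya" from its training examples as a correlated term.
JOURNALIST_TERMS = {"freelance", "journalist", "chechnya"}
```

In this sketch, "fighting continues in chechnya" scores 0.25 and is wrongly assigned at a relevance level of 0.2. Passing `stop_words={"chechnya"}` drops the score to 0.0, while "freelance journalist reports from chechnya" still scores 0.5 and keeps the code.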

Page 15

Manual coding

• About 200 editors spread across main time zones

• Advantages:

• Humans easily grasp the gist of the story

• Cope well with exceptions

• Visible/Controllable

• Disadvantages:

• Very resource-intensive = Expensive

• Slow

• Inconsistent (subjective and temporal)

• Not scalable

Page 16

Review process

• Lists reviewed every three months: redefinitions, new codes, expansion changes

• Market research/customer feedback and behavior

• Changes to parent schemes/standards

• Editorial/Quality control feedback

• Internal coding forum

• 45-day notice period

Page 17

Quality control

• Sampling by editors

• Scoring for precision and recall

• Analysis by source, language, code, editor etc.

• Feedback to editors and systems

• Corrective action
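
Scoring a sampled article for precision and recall compares the codes the system assigned with the codes a reviewing editor judges correct. The function below is a generic sketch of those two standard measures, and the function and parameter names are illustrative.

```python
def precision_recall(assigned, correct):
    """Precision and recall of assigned codes against an editor's reference codes."""
    assigned, correct = set(assigned), set(correct)
    true_positives = len(assigned & correct)
    # Precision: share of assigned codes that were right.
    precision = true_positives / len(assigned) if assigned else 0.0
    # Recall: share of the right codes that were actually assigned.
    recall = true_positives / len(correct) if correct else 0.0
    return precision, recall
```

For example, if the system assigns three codes and two of them are among the four codes the editor expected, precision is 2/3 and recall is 2/4. Grouping such scores by source, language, code, or editor supports the analysis listed above.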

Page 18

Results

• Three million articles coded a month

• All receive a level of autocoding

• Seventy-nine percent automation: more than two million articles are autocoded with no further manual review
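
The figures on this slide are consistent with each other, as a quick check shows. It assumes the three million and seventy-nine percent values are exact, which the slide does not claim.

```python
articles_per_month = 3_000_000
automation_percent = 79

# Articles autocoded with no further manual review.
fully_autocoded = articles_per_month * automation_percent // 100
# Articles that still receive some manual review.
manually_reviewed = articles_per_month - fully_autocoded

print(fully_autocoded)    # 2370000
print(manually_reviewed)  # 630000
```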

Page 19

Recap

• Factiva’s taxonomy is Factiva Intelligent Indexing™

• Factiva uses a hybrid methodology for application

• Factiva has a coding team for governance and maintenance

• End result: Factiva Intelligent Indexing™ leverages our editorial strengths, combining human experience and expertise with the latest automation software to implement a completely flexible and granular indexing system across all of our content.