Integrating an Enterprise Taxonomy with Local Variations
Tom Reamy
Chief Knowledge Architect, KAPS Group
Program Chair – Text Analytics World
Knowledge Architecture Professional Services
http://www.kapsgroup.com
Dec 27, 2015
Agenda
Introduction – Information Environment
Research Approach
Integrated Solution – governance, technology – text analytics
Conclusions
Introduction: KAPS Group
Knowledge Architecture Professional Services – Network of Consultants
Applied Theory – Faceted & emotion taxonomies, natural categories
Services:
– Strategy – IM & KM - Text Analytics, Social Media, Integration
– Taxonomy/Text Analytics, Social Media development, consulting
– Text Analytics Quick Start – Audit, Evaluation, Pilot
Partners – Smart Logic, Expert Systems, SAS, SAP, IBM, FAST, Concept Searching, Attensity, Clarabridge, Lexalytics
Clients: Genentech, Novartis, Northwestern Mutual Life, Financial Times, Hyatt, Home Depot, Harvard Business Library, British Parliament, Battelle, Amdocs, FDA, GAO, World Bank, Dept. of Transportation, etc.
Program Chair – Text Analytics World – March 29-April 1, San Francisco
Presentations, Articles, White Papers – www.kapsgroup.com
Current – Book – Text Analytics: How to Conquer Information Overload, Get Real Value from Social Media, and Add Smart Text to Big Data
Information Environment
Multi-National Financial Institution – 10,000+
Diversity – multiple languages, cultures, information needs and behaviors, organizational cultures
Initial Application – knowledge management networks
– Network definition – somewhat by subject area, but also political
Multiple applications – search, browse, web sites
– Expertise location, Accounting-resource, analysis
Multiple audiences – internal and external, expert and non-expert (everyone a non-expert in something)
Approach
First step – research into variations
– Use cases, levels of granularity
– Common terms with different meanings
Interviews with multiple groups, roles, levels
– Contextual interviews, information interviews
– Taxonomy interviews – suggested terms and relationships
Analysis – taxonomies, search logs suggest facets, HR expertise descriptions, local web sites, keywords, clustering, new terms
Group sessions – representatives of multiple constituencies – talking out the differences
Current Environment Overview
Current form of Topics: Long and flat – 2 levels
– Difficult to build on, desire for more specificity for experts and content, usability issues, no place for new topics
Multiple taxonomies – topics, organizational, Web site browse, industry codes
– Partial overlaps, conflicting
– Political – Social Development & Gender
Variations – official term, relationships of terms
– New terms mostly at lower levels and stable structure
Cross-cutting topics – Finance of Education, Poverty
Elements of the Solution
Taxonomy is only one part of the solution
– Faceted metadata and text analytics
– Enterprise taxonomy – death of?
Analysis of taxonomy – suitable for categorization & views
– Structure – not too flat, not too large
– Orthogonal categories – easier to tag and easier to map variations
Idea of Views – browse by local variations – map to official topics
– Supported by software – Pool Party
– Role-based views, Activity-based views
Solution: integration of multiple components – two critical: Governance and Text Analytics
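The idea of views can be sketched in a few lines of Python: each group browses by its own labels, and every local label resolves to one official topic. This is only an illustrative sketch. The topic IDs, view names, and labels below are hypothetical, and it does not use or represent the API of any taxonomy tool.

```python
# Hypothetical official topics; the IDs and labels are invented for illustration.
OFFICIAL_TOPICS = {
    "T100": "Education",
    "T110": "Education Finance",
    "T200": "Social Development",
}

# Each view maps the label a group prefers onto an official topic ID.
# View names and local labels are assumptions, not taken from the project.
VIEWS = {
    "education_network": {"Finance of Education": "T110", "Schooling": "T100"},
    "social_dev_network": {"Education": "T100", "Community Development": "T200"},
}


def to_official(view, local_label):
    """Return (topic_id, official_label) for a local label, or None if unmapped."""
    topic_id = VIEWS.get(view, {}).get(local_label)
    if topic_id is None:
        return None
    return topic_id, OFFICIAL_TOPICS[topic_id]


def browse_labels(view):
    """Local labels a user of this view would browse by, sorted for display."""
    return sorted(VIEWS.get(view, {}))


print(to_official("education_network", "Finance of Education"))
print(browse_labels("social_dev_network"))
```

Keeping the official topic ID as the join key means local labels can change without retagging content.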
Reference Architecture
[Figure: layered reference architecture diagram. Recoverable labels:]
Front-end application – Dashboard, Ad-hoc query, Mobile Apps, Portals/web, LoB Client UI, Mashups, Enterprise Search UI
Mid-tier layer – Information Services/Semantic Layer, Enterprise Information Integration, Analysis & Reporting, Enterprise search engine, Data mining, Text Analytics engine, Statistical Analysis, Predictive Analytics
Metadata Mgt – Vocabularies, Taxonomies, Reference Data, Core Metadata, Corporate data Model
Consolidated Repositories – Metadata Repository/Registry, Operational Data Store, Data Warehouse, Master Data, Data marts, Unstructured data
Data Movement – ETL, Data Services, ESB …; Data processing engine (e.g. Hadoop)
Data Sources – Web content, Email, Documents, Business data, Statistical data, Structured data, Unstructured data, Shared drives, External data; example systems: e-publish, Day, Drupal, blogs etc.; Wbdocs, Jolis; SAP, PeopleSoft, Finance etc.; Jive, Sharepoint; Factiva, News, Research dbs; DDP
Cross-cutting – Governance, Information Standards & Policy, Multilingual
Text Analytics – power and flexibility
Critical – Text Analytics tool
– Same taxonomy term but different criteria, rules
– Documents tagged for different uses, audiences
Education – for specialists
– Deep complex rules, very fine granularity, specialist jargon, acronyms
Education – for generalists
– High level rules, general terms, simple
Education within Social Development
– Generalist rules plus social development terms – birth weight
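A minimal sketch of the pattern above: one taxonomy term ("Education") backed by a different rule set per audience. The keyword lists and thresholds are invented for illustration (only "birth weight" comes from the slide), and plain keyword matching stands in for the much richer rules a real text analytics tool would provide.

```python
import re

# Hypothetical rule sets for the single term "Education".
# Keywords and thresholds are assumptions made up for this sketch.
RULES = {
    "specialist": {
        "terms": ["pedagogy", "curriculum reform", "learning outcomes", "tvet"],
        "min_hits": 2,  # stricter: needs several specialist signals
    },
    "generalist": {
        "terms": ["education", "school", "teacher", "student"],
        "min_hits": 1,
    },
}
# Education within Social Development: generalist rules plus domain terms.
RULES["social_development"] = {
    "terms": RULES["generalist"]["terms"] + ["birth weight", "child nutrition"],
    "min_hits": 1,
}


def count_hits(text, terms):
    """Count how many distinct rule terms occur in the text as whole words or phrases."""
    lowered = text.lower()
    return sum(1 for t in terms if re.search(r"\b" + re.escape(t) + r"\b", lowered))


def tag_education(text):
    """Return the audiences for which this document qualifies as 'Education'."""
    return [
        audience
        for audience, rule in RULES.items()
        if count_hits(text, rule["terms"]) >= rule["min_hits"]
    ]


print(tag_education("Low birth weight and later outcomes"))
print(tag_education("The school hired a new teacher"))
```

The same document can therefore carry the term for one audience and not for another, which is how a single official term can serve several local uses.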
Proposed Model for a Taxonomy Eco-System
New Topic Taxonomy
– Enhanced structure and coverage – deeper framework to build on
– Implemented in new software – flexible, solves old debates
– Facets – remove complexity, increase coverage – 10 X 10
– Powered by auto-categorization – tagging, advanced applications
– Combine with current data – HR (Expertise), other
Taxonomy is part of integrated information management
– Search, Content management, IM Policy, External Web, etc.
Facets – Subject (topic), Industry, Program, Methods, Business Activities, Organization, Skills, Document Type, Project, Product, Geography
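As a rough illustration of faceted metadata, the sketch below stores each facet as an independent field and filters on any combination of them. The `Document` class, the `matches` helper, and the sample values are hypothetical. The facet names follow the list above.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    title: str
    # facet name -> set of values, e.g. {"subject": {"Education"}}
    facets: dict = field(default_factory=dict)


def matches(doc, **wanted):
    """True if the document carries the requested value for every requested facet."""
    return all(value in doc.facets.get(facet, set()) for facet, value in wanted.items())


# Sample documents; titles and facet values are invented for illustration.
docs = [
    Document(
        "School finance review",
        {"subject": {"Education"}, "geography": {"Kenya"}, "document_type": {"Report"}},
    ),
    Document(
        "Road safety brief",
        {"subject": {"Transport"}, "geography": {"Kenya"}, "document_type": {"Brief"}},
    ),
]

hits = [d.title for d in docs if matches(d, subject="Education", geography="Kenya")]
print(hits)
```

Because each facet is tagged independently, a combination such as subject plus geography needs no dedicated node in the topic hierarchy.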
Governance Structure
[Figure: governance structure diagram spanning strategic and operational levels, with a feedback loop. Recoverable roles and responsibilities:]
Executive Function / Sponsorship – set overall policy and strategies, drive direct acceptance, resource decisions, organizational structure
Working Group – prioritize changes, cross-cutting
Taxonomy Management Central – revise/approve taxonomy structure, rules for changes, text analytics/research
Taxonomy Management (Anchors & Regions) – manage implementation, gather & make changes, content & tagging analysis, coordinate feedback, communication/training
IT Systems – coordinate changes in dependent systems
Users & SMEs – provide feedback
Taxonomy Management – IM focal point, KM, web etc. (Anchors & Regions, IMT)
Strategic level – integration with existing Information Management
Critical Success Factors: Governance
Governance Policy & Process & Enforcement
– Incorporate enforcement into publishing process / Hybrid Auto-cat
Taxonomy management is part of overall information management with additional taxonomy roles/functions
Best Practice: combination of central and distributed teams
Taxonomy specific: Taxonomy Manager – Central & Networks
– Revise tax structure, rules for changes, manage implementation
– Enforcement – combination of central & Networks
Feedback – metrics – identify need for new terms, remove old terms
– Combination of user feedback in application & periodic analysis
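The periodic analysis behind the feedback metrics could look something like the sketch below: count how often each term is actually applied, and compare frequent search queries against the vocabulary. The function name, its parameters, and the `min_searches` threshold are assumptions for illustration, not a description of the process actually used.

```python
from collections import Counter


def taxonomy_feedback(taxonomy_terms, applied_tags, search_queries, min_searches=2):
    """Suggest terms to review: unused terms, and frequent queries with no matching term."""
    known = {t.lower() for t in taxonomy_terms}
    tag_counts = Counter(t.lower() for t in applied_tags)
    query_counts = Counter(q.strip().lower() for q in search_queries)

    # Terms never applied to content are candidates for removal or merging.
    unused = sorted(t for t in known if tag_counts[t] == 0)
    # Repeated queries that match no existing term hint at missing vocabulary.
    candidates = sorted(
        q for q, n in query_counts.items() if n >= min_searches and q not in known
    )
    return {"review_for_removal": unused, "candidate_new_terms": candidates}


report = taxonomy_feedback(
    taxonomy_terms=["Education", "Poverty", "Gender"],
    applied_tags=["Education", "education", "Poverty"],
    search_queries=["microfinance", "Microfinance ", "poverty", "poverty", "gender"],
)
print(report)
```

The output is only a list of suggestions. Deciding whether to add or retire a term stays with the taxonomy managers and working group.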
Conclusion
Taxonomies are an enterprise resource
Danger of a monolithic taxonomy overriding local variations
– Less useful and/or ignored
Danger of chaos of multiple variations losing ability to coordinate and communicate
Solution: Research into users, use cases, semantic resources
Integrated solution – importance of distributed governance
Integrated solution – text analytics to reflect local variations and provide a means to integrate into a unified solution
Facets, text analytics and browse views solve 75%, rest is manageable
No one was entirely happy – must be doing something right
Questions?
Tom Reamy
KAPS Group
Knowledge Architecture Professional Services
http://www.kapsgroup.com
www.TextAnalyticsWorld.com March 29-April 1, San Francisco