Implementing Advanced Analytics Platform
Post on 09-Jan-2017
© 2015 IBM Corporation
Session 1977: Implementing Advanced Analytics Platform – Successes & Architecture Decisions
Dr. Arvind Sathi
Mathews Thomas
Thomas Eunice
Richard Harken
Please Note:
• IBM’s statements regarding its plans, directions, and intent are subject to change or withdrawal without notice at IBM’s sole discretion.
• Information regarding potential future products is intended to outline our general product direction and it should not be relied on in making a purchasing decision.
• The information mentioned regarding potential future products is not a commitment, promise, or legal obligation to deliver any material, code or functionality. Information about potential future products may not be incorporated into any contract.
• The development, release, and timing of any future features or functionality described for our products remains at our sole discretion.
Performance is based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput or performance that any user will experience will vary depending upon many factors, including considerations such as the amount of multiprogramming in the user’s job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve results similar to those stated here.
Overview
• Motivation for Advanced Analytics Platform
• Business Use Cases
• Application Architecture
• Data Science Discussion
• Data Engineering Discussion
• Q&A
Motivation for Advanced Analytics Platform in the Cognitive Era
Key Disruptive Trends:
• Growing interest in applying the results of advanced analytics to improve business performance
• Rapid growth in available data, particularly new sources such as unstructured data from customer interactions and streaming volumes of machine-generated data
• Increasing requirements for higher data and decision velocity
• Shortage of data science skills: how do we leverage a small number of data scientists across an increasing number of applications?
• Limitations in the use and scaling of existing BI tools
• Open source platforms
Data Sources and Sizes
Data Source | Daily Volumes | Data types available for Customer Experience Analytics
CRM / Billing | 100s of Gigabytes | Subscription and demographics
Call Detail Records / Web Logs | Terabytes | Voice and SMS usage, Web interactions
Product Usage Data | 10s of Terabytes | Data and video usage
IoT Data | 100s of Terabytes | Driving data for connected cars, connected home events
Emerging Users
• Real-time Decision Engines – need real-time data right away and require real-time scoring engines to rank-order and select candidates.
• Operational Dashboards – require data in near real-time across a large cross-section of the enterprise.
• Advanced analytics (data scientist) users – require raw data for complex statistical and text-analytics sandboxes.
• Business Analysts – require curated batch data for standard and ad hoc reporting.
• Stewards – require source data to make governance decisions.
Key Telecom Business Value Cases
Using our catalog of industry use cases, we have prioritized the following use cases for industry solutions.
Themes: Improve Subscriber Insight, Transform Business for Higher Efficiency, Innovate Business Models
• Subscriber Profiling & Enrichment: How can I uncover new insights from subscriber data for better marketing, customer care & network operations?
• Subscriber Analytics (Segmentation): How do I create subscriber micro-segments based on subscriber usage, channel interaction and mobility patterns?
• Social Media Insight: How can I gain insights on brand, product & service reputation, marketing campaign impact on various customer segments?
• Proactive Care: How can I improve revenue from call center and lower costs?
• Counter Fraud Management: How can I better predict, detect and investigate voice and data fraud?
• Network Analytics Based on Customer Insight: How can I innovate and improve my network for better subscriber experience?
• Internet of Things Analytics and Usage: How can I capitalize on insights gathered from IoT to offer personalized value-added services?
• Customer Data Location Monetization: How can I monetize subscriber data for higher revenue & profits?
• KPI Correlation: How do I drive new and deeply correlated insights on key measures enabling new value (NPS, Churn, Cross Sell)?
• Customer Experience Management: How can I measure and improve subscriber quality of experience across all channels and services?
• Vertical Analytics Integration: How do I partner to build value added offerings for other industries (Retail, Transportation, Financial)?
• Proactive Marketing & Sales: How can I deliver targeted marketing campaigns for higher acceptance rate? How can I improve customer care? (NBA, NBO, PBA, PBO, Omni-Channel)
Accelerate Digital Transformation
Real Time Actionable Insight (Value Roadmap)
[Figure: value roadmap plotting data / value against business maturity, moving from insights to decisions to outcomes, with results measured against historical data. Map label: Gotham City.]
• Data source collection & extraction: Social, Network, Trouble Tickets, Billing, Devices, Apps
• Subscriber profiling & enrichment: Hangout, Location, Trends, Behavior, Lifestyle (sample segments: Night Owls, Sports Fans, Lunch Crowd)
• Predictive analytics (scores)
• KPI-driven actionable insights: NPS, Churn, Upsell, Cross-Sell
• Business decisions (sample labels: Upgrade Phone, Bad Device, Low NPS, Wrong Plan)
• Operations transformation / business outcomes: Proactive Care, Enhanced Sales & Marketing, Fraud & Security, Revenue Assurance, Insights Monetization, New Business Models
How is care different?
• Individual subscriber experience: Device, Usage, Customer type, Network, Service Experience
• Customer profile (insights): iPhone 5C, Congested 3G Cell, Heavy Netflix Users
• Reactive Care (inbound, resolve individual call): Customer can’t access Netflix video on their smartphone, so they ring a customer care agent.
  – Issue Resolution: Solve, Steps to solve, None
  – Next Best Action (NBA): compare w/ similar, Tier 2 support, Tier 3 support
  – Next Best Offer (NBO): Sales, Up-sell / Cross-sell
• Proactive Care (outbound, monitor trends): Netflix congestion issue; communicate to impacted subscribers.
  – Omni Channel Outbound Communication: Mobile App Push, Mobile Web Push
Proactive Care – Network
• Customer Insights: Psycholinguistic, NPS, Usage, Mobility, CRM, Experience, Interest, Other Unstructured, Other Structured
• Networking Insights (sample insights): Quality of video/data, Number of dropped calls, Number and type of users, Normal changes vs. abnormalities, Trending spots, Mobility Pattern
• Target Segments: Heavy Video users, Regularly at cell tower, Propensity to Churn, High Value Customer
• Example: monitoring trends reveals a network congestion issue; customer insights identify a high value customer who watches video and is impacted by networking issues; the customer is notified and results are tracked.
• Components shown: Real Time Analytics, Business process, Rules management, Segmentation, Campaign management, Upsell
Key Flows
Data Sources
Real Time Analytics
Predictive Models
Operational Decisions
Management
Business Process Management
Campaign Management
Mobile Channel
Dashboard
Advanced Analytic Platform (AAP) - Architecture Overview
[Figure: AAP architecture diagram with components numbered 1 to 10]
• Data sources: CDR, TDR, Usage; Transactions (CRM, Billing, Care); Conversations (Social Media, Chat, Call Center)
• Platform components: Continuous Ingest / Parsing, Unification, Smart Filter, Batch ETL, Data Governance, Data Lake, Descriptive & Predictive Modeling, Real Time Analytics, Stream and Mediation Analytics, Data Mart, SQL Access
• Delivery: Real-time Action, Batch Action, Analyst Workbench
• Outputs shown: Intelligent Campaigns, Proactive Care, Counter Fraud, Real-time Dashboards, Segmentation, Network Configurations, Dashboards, Reports, Visualization
• Consumers shown: Marketing, Customer Care, NOC/SOC, Network Planning, Network Engg, ...
Architecture Walk Through
Step  Description
1  Continuous Ingest / Parsing: CDR data is parsed from ASN.1 format.
2  Unification: CDR & TDR data is unified into a common format and identified with subscribers.
3  Smart Filter: Data is filtered for real-time, predictive and descriptive analytics. All data is sent to the lake.
4  Batch ETL: Source data from transactions and conversations is ingested and sent to the lake after appropriate transformations.
5  Data Governance: Transactional data is organized into master data, with data quality and matching. Conversation data is aligned to master data.
6  Descriptive & Predictive Modeling: Creates aggregations, derived attributes and scoring models.
7  Real-time Analytics: Various counters are used for real-time aggregations. Scoring engines are used for predictive scoring.
8  Real-time Action: Real-time aggregations and scores are sent to the respective action engines and real-time dashboards.
9  Batch Action: Tables with aggregate data and derived attributes are made available to batch consumers.
10 Analyst Workbench: Governed data, aggregations and derivations are made available to analysts for reports, visualizations and dashboards.
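Step 7 mentions counters for real-time aggregations. The sketch below is one possible way to keep such a counter: a per-subscriber sliding-window count of events. It is an illustration only, not the platform's implementation; the class name, the window length and the assumption that events arrive in timestamp order are all hypothetical.

```python
from collections import defaultdict, deque


class SlidingWindowCounter:
    """Counts events per key over the most recent `window_seconds` seconds.

    Assumption (not from the slides): timestamps are numeric seconds and
    arrive in non-decreasing order for each key.
    """

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._events = defaultdict(deque)  # key -> timestamps, oldest first

    def add(self, key, timestamp):
        """Record one event for `key` and drop events that left the window."""
        self._events[key].append(timestamp)
        self._evict(key, timestamp)

    def count(self, key, now):
        """Return how many events for `key` fall inside the window ending at `now`."""
        self._evict(key, now)
        return len(self._events[key])

    def _evict(self, key, now):
        queue = self._events[key]
        while queue and queue[0] <= now - self.window_seconds:
            queue.popleft()


# Hypothetical usage: dropped-call events for one subscriber, 5-minute window.
counter = SlidingWindowCounter(window_seconds=300)
counter.add("sub-1", 0)
counter.add("sub-1", 100)
counter.add("sub-1", 350)
recent = counter.count("sub-1", now=350)
```

A count like `recent` could then be handed to a scoring engine or a real-time dashboard, as steps 7 and 8 describe.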
Architecture Decision – Bring expertise to data
• In high-velocity or high-volume situations, data cannot be moved across many tools.
• Many filtering decisions have to be made closer to the source to bring down false positives.
• These filters must be dynamic and changeable by business users.
[Figure: Source → Filter → Target, with filter criteria supplied to the filter]
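To make the idea of a dynamic filter concrete, here is a minimal sketch in which the filter criteria are plain data that can be replaced at run time, without redeploying the code that sits between source and target. The class, the `(field, operator, value)` criteria format and the record fields are assumptions made for illustration, not part of the platform.

```python
import operator

# Comparison operators a business user may reference by name.
OPS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


class DynamicFilter:
    """Holds filter criteria as data so they can be swapped while running."""

    def __init__(self, criteria):
        self.criteria = list(criteria)  # list of (field, op, value) tuples

    def update(self, criteria):
        """Replace the criteria; later records are checked against the new set."""
        self.criteria = list(criteria)

    def matches(self, record):
        """A record passes only if every criterion holds."""
        return all(
            field in record and OPS[op](record[field], value)
            for field, op, value in self.criteria
        )


def run(source, flt):
    """Yield only the records from `source` that pass the filter."""
    for record in source:
        if flt.matches(record):
            yield record


# Hypothetical records and criteria.
records = [
    {"type": "data", "mb": 120},
    {"type": "voice", "mb": 0},
]
flt = DynamicFilter([("type", "==", "data")])
first_pass = list(run(records, flt))

flt.update([("mb", "<", 50)])  # criteria changed without touching the code
second_pass = list(run(records, flt))
```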
Architecture Decision – Identity Resolution
• Identity resolution provides a way to connect various facts about an entity and resolve differences.
[Figure: a single identifier, scrila34@msn.com, linked to several records: Job Applicant, Identity Thief, Top 200 Customer, Criminal Investigation]
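One simple way to connect facts about the same entity is to group records that share a normalized identifier. The sketch below does this with a union-find structure. It is a deliberately simplified stand-in for a real identity-resolution engine; the record fields and the phone number are hypothetical.

```python
from collections import defaultdict


def resolve_identities(records, keys=("email", "phone")):
    """Group record indexes that share any identifier listed in `keys`.

    Matching here is exact after trimming and lower-casing; real identity
    resolution would also handle fuzzy and conflicting attributes.
    """
    parent = list(range(len(records)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    def union(i, j):
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[max(root_i, root_j)] = min(root_i, root_j)

    first_seen = {}  # (key, normalized value) -> first record index
    for idx, record in enumerate(records):
        for key in keys:
            value = record.get(key)
            if not value:
                continue
            token = (key, value.strip().lower())
            if token in first_seen:
                union(idx, first_seen[token])
            else:
                first_seen[token] = idx

    groups = defaultdict(list)
    for idx in range(len(records)):
        groups[find(idx)].append(idx)
    return sorted(groups.values())


# Hypothetical records; only the e-mail address comes from the slide.
records = [
    {"role": "Job Applicant", "email": "scrila34@msn.com"},
    {"role": "Top 200 Customer", "email": "SCRILA34@msn.com", "phone": "555-0100"},
    {"role": "Criminal Investigation", "phone": "555-0100"},
    {"role": "Unrelated", "email": "someone@example.com"},
]
entities = resolve_identities(records)
```

In this example the first three records resolve to one entity because they are linked through a shared e-mail address or phone number, while the fourth stays separate.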
Architecture Decision - Feedback and Machine Learning
Predictive models can be compared for their success and fine tuned
using the following steps:
Step 1 – Many predictive models are developed simultaneously
Step 2 – These models are tested using test or real data
Step 3 – Results are compared and used for fine tuning the models
[Figure: Sensor, Predictive Modeler, Scorer and Analytics Engine, handling high-velocity and high-volume data]
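The three steps above (develop several models, test them, compare the results) can be sketched as a small comparison harness. The two rule-based "models", the feature names and the labelled samples below are invented for illustration; in practice the candidates would be trained predictive models.

```python
def accuracy(model, samples):
    """Fraction of samples for which the model's prediction equals the label."""
    hits = sum(1 for features, label in samples if model(features) == label)
    return hits / len(samples)


def compare_models(models, samples):
    """Score every candidate on the same samples and rank best-first."""
    results = {name: accuracy(model, samples) for name, model in models.items()}
    return sorted(results.items(), key=lambda item: item[1], reverse=True)


# Step 1: several candidate churn predictors (hypothetical rules).
models = {
    "dropped_calls_rule": lambda f: f["dropped_calls"] > 3,
    "low_nps_rule": lambda f: f["nps"] < 5,
}

# Step 2: labelled test data (hypothetical; True means the subscriber churned).
samples = [
    ({"dropped_calls": 5, "nps": 3}, True),
    ({"dropped_calls": 0, "nps": 9}, False),
    ({"dropped_calls": 4, "nps": 8}, True),
    ({"dropped_calls": 1, "nps": 4}, False),
]

# Step 3: compare; the ranking feeds the next round of fine-tuning.
ranking = compare_models(models, samples)
```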
Architecture Decision – Integration
• Detect: Detect in real time if a transaction relates to targeted consumers. Identify, align, score, and send for further processing (e.g., a targeted customer driving towards a mall).
• Decide: Use background information, past campaigns, privacy preferences, customer reaction to past campaigns, purchase intent, and preferences expressed in social media to design the campaign.
• Drive: Interact with the customer to seek permission to use location information and send the campaign; record the interaction and results.
• Discover: Collect historical behavioral data, past acts, and success rates. Analyze historical data to formulate patterns and the changes required to campaign selection and design rules.
[Figure: cycle of Drive, Detect, Discover and Decide around the target subscriber, exchanging filter definitions, filtered data, decisions, feedback and interrogations. Captions: Detect observations about a target; Take action in real time – when it matters; Find new targets by analyzing historical data; Identify patterns over time and actions required.]
Advanced Analytics using Hadoop Lake, Streams and SPSS
[Figure: data flow diagram]
• Inputs: Reference Data, Changes to Reference Data, Event Data
• Data at Rest (Historical Data): Data Integration / Movement, Hadoop Lake, Local Appliance, SPSS Analytics Server, SPSS Modeler Server, SPSS Modeler Client
• Data in Motion (Real Time): InfoSphere Streams, Real Time Analytics, Real-time Models
Location Analytics
• Input: 2-6 weeks of CDRs with location info
• Location Affinity: common locations by time of day and day of week (high speed aggregations and calculations on big data)
• Preferred Locations: Home, Work, Weekend locations (location algorithms using SPSS)
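As a rough illustration of how preferred locations could be derived from location affinity, the sketch below buckets CDR events by time of day and day of week and picks the most frequent cell per bucket. The hour boundaries and the record layout are assumptions; the slides attribute the actual location algorithms to SPSS and do not describe them in detail.

```python
from collections import Counter, defaultdict
from datetime import datetime


def time_bucket(ts):
    """Map a timestamp to 'home', 'work', 'weekend', or None.

    The hour ranges are illustrative assumptions, not values from the slides.
    """
    if ts.weekday() >= 5:  # Saturday or Sunday
        return "weekend"
    if 9 <= ts.hour < 17:
        return "work"
    if ts.hour >= 22 or ts.hour < 6:
        return "home"
    return None  # commuting and evening hours are ignored in this sketch


def preferred_locations(cdrs):
    """Return {subscriber: {bucket: most frequent cell}} from (subscriber, cell, ts) tuples."""
    counts = defaultdict(Counter)  # (subscriber, bucket) -> cell frequencies
    for subscriber, cell_id, ts in cdrs:
        bucket = time_bucket(ts)
        if bucket is not None:
            counts[(subscriber, bucket)][cell_id] += 1

    result = defaultdict(dict)
    for (subscriber, bucket), cells in counts.items():
        result[subscriber][bucket] = cells.most_common(1)[0][0]
    return dict(result)


# Hypothetical CDRs for one subscriber (26 October 2015 was a Monday).
cdrs = [
    ("sub-1", "cell-A", datetime(2015, 10, 26, 23, 0)),
    ("sub-1", "cell-A", datetime(2015, 10, 27, 2, 15)),
    ("sub-1", "cell-B", datetime(2015, 10, 27, 10, 30)),
    ("sub-1", "cell-B", datetime(2015, 10, 28, 11, 0)),
    ("sub-1", "cell-C", datetime(2015, 10, 28, 15, 0)),
    ("sub-1", "cell-D", datetime(2015, 10, 31, 14, 0)),
]
locations = preferred_locations(cdrs)
```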
Mobility Profiling Outputs
• Usage Profiles: Heavy Voice; SMS Mostly; No Data
• Quality of Service: Individual QoS measure; detailed relation to ARPU and CLV
• Sentiment analysis: from surveys and comments; contact center data; social media
• Usage Direction: Declining / Increasing for each service, e.g., increasing Data, declining Voice
• Personality Profiles: Commuter; Homebody; Night Owl
• Interests and Preferences: OTT Messaging; Travel, Shopping, Betting; app preferences, e.g., Travel, Games; handset prefs
• Preferred Locations: Hangouts for groups; popular Home and Work locations; mode of travel (train, car, walk); what is the profile of persons in each hangout?
• Social Networks and Best Buddies: Who calls whom? Who hangs out with whom? Who are the influencers?
How it Works – Buddy Model
Physical Relationship / Social Network
• A seven-pass algorithm creates a sparse matrix of all events within space-time boxes (defined as cell masts and 2-minute intervals).
• Subscriber pairs in the same space-time box are counted as a “hit”, then ranked by hits.
• Subscriber pairs that have many hits in many locations or time frames are kept (those above the threshold for a coincidental connection).
• The resulting pairs and hit counts are passed to IBM's SNA algorithm to create the final networks.
Input: 2+ weeks of xDR data for a large metropolitan area
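The hit-counting idea can be sketched in a few lines. This is a single-pass simplification, not the seven-pass algorithm named on the slide, and it stops before the SNA step. The function name, the thresholds and the event layout are assumptions for illustration.

```python
from collections import defaultdict
from itertools import combinations

BOX_SECONDS = 120  # 2-minute intervals, as stated on the slide


def buddy_pairs(events, min_hits=3, min_cells=2):
    """Rank subscriber pairs by how often they share a space-time box.

    `events` holds (subscriber, cell_id, epoch_seconds) tuples. A pair is kept
    only if it has at least `min_hits` shared boxes across at least `min_cells`
    distinct cells; both thresholds are illustrative assumptions.
    """
    boxes = defaultdict(set)  # (cell, interval index) -> subscribers seen there
    for subscriber, cell_id, epoch_seconds in events:
        boxes[(cell_id, epoch_seconds // BOX_SECONDS)].add(subscriber)

    hits = defaultdict(int)   # pair -> number of shared boxes
    cells = defaultdict(set)  # pair -> distinct cells where they co-occurred
    for (cell_id, _interval), subscribers in boxes.items():
        for pair in combinations(sorted(subscribers), 2):
            hits[pair] += 1
            cells[pair].add(cell_id)

    kept = [
        (pair, count)
        for pair, count in hits.items()
        if count >= min_hits and len(cells[pair]) >= min_cells
    ]
    return sorted(kept, key=lambda item: (-item[1], item[0]))


# Hypothetical events: "a" and "b" meet three times at two masts; "c" appears once.
events = [
    ("a", "m1", 0), ("b", "m1", 60),
    ("a", "m1", 130), ("b", "m1", 200),
    ("a", "m2", 1000), ("b", "m2", 1010), ("c", "m2", 1020),
]
pairs = buddy_pairs(events)
```

The surviving pairs and their hit counts are what would then be passed on to a social network analysis step.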
IBM Mobility Lifestyle Definitions
Profile Name | Description
Night Owl | Primarily active at night
Homebody | Does not visit many locations
Delivering the goods | Visits many locations during the day (delivery truck driver, postman, etc.)
Commuter/Daily Grinder | A daily commuter, home → office → home
Predictable/Norm Peterson* | Activity inside the 2nd standard deviation
Active | Active at many times of the day with no clear pattern
* From the television show “Cheers”. Norm was an accountant who went to the same pub every night.
Discovery of Mobility Lifestyle
• A typical discovery uses statistical tools to identify patterns in data.
• Discovery may contribute new derived attributes for further analysis or reporting.
Hangout Analysis
What are the most common Lifestyle Profiles at these places?
• Night Owls at night
• Delivery people during the day
• Quiet weekday people go for dinner on weekends
• Almost no Homebodies at any time
Illustrative Data Engineering Requirements
• Security by user role
• Classification of data
• Taxonomy, Semantics
• Auditability
• Scalability
• Lineage - metadata and data, relationships across business and technical
• Monitoring
• Auto discovery
• Multitenant and enterprise class (separation of orgs or sub-orgs)
• Policies: leverageable, implementable policies and business rules
• Integrable: open API, integration, publishable
• Harvest/ingest metadata from various sources
Illustrative Data Engineering Requirements, cont.
• Regulatory compliant - e.g., auditable under Dodd-Frank, HIPAA, etc.
• Automate as much of this as possible
• Support transactional and analytic workloads
• Realtime updates
• Sensitive data protection
• Needs to support structured and unstructured data
• Search
• History ( point in time relevance)
• Versioning
• Workflow - can't force it through policy and procedure
• Support for MDM
• Open APIs (see content from ING slide below)
Illustrative Data Engineering Requirements, cont.
• Support for JSON, COBOL, nested models and non-relational models; not all data is defined as relational
• Characterize and auto-classify data for self-service; also do this for data lineage
• Usability Improvements
Information Governance
[Figure: information governance architecture with callouts numbered 1 to 5]
• Data Sources: Operational Data, Reporting / Warehouse Data, 3rd Party Data
• Information Fabric: Landing Zone, Discovery Zone, Harmonized Zone, Services Layer
• Analytics / Reporting / Consumption: Cognos, SPSS, R, ML
• Supporting components: Spark, Information Server, Streams, Information Governance Catalog / ATLAS, Optim, Guardium
• Security: LDAP, Kerberos, HTTPS, Certificates
Reading Material
• IBM developerWorks
Explore the advanced analytics platform, Part 1: Support your business requirements using big data and advanced analytics
Explore the advanced analytics platform, Part 2: Explore use cases that cross multiple industries using the advanced analytics platform
Explore the advanced analytics platform, Part 3: Analyze unstructured text using patterns
Explore the advanced analytics platform, Part 4: Analyze location data to determine movement patterns using a mobility profile pattern
Explore the advanced analytics platform, Part 5: Deep dive into discovery and visualization
Explore the advanced analytics platform, Part 6: Dive into orchestration with a combination of SPSS, Operational Decision Management (ODM), and Streams using care and fraud management case studies
• IBM Data Magazine
Mining Data in a High-Performance Sandbox - Fulfill data analysts’ dreams with data warehouse appliances for in-database analytics and data mining
Target Behavior in Real Time for Effective Outcomes: Part 1 - How real-time, adaptive architectures can drive management decisions for specific use cases
Target Behavior in Real Time for Effective Outcomes: Part 2 Drive marketing and business management decisions using a real-time, adaptive architecture
• Books
Big Data Analytics: Disruptive Technologies for Changing the Game
Engaging Customers Using Big Data: How Marketing Analytics Are Transforming Business
Engaging Customers using Big Data by Arvind Sathi
BIG DATA IS RAPIDLY TRANSFORMING HOW COMPANIES MARKET TO THEIR CUSTOMERS.
Dr. Sathi uses a series of examples across many industries, such as retail, telecommunications, financial services, electronics, high tech, and media, to describe how each marketing function is undergoing fundamental changes: how personalized advertising is delivered using online channels where the marketers identify the specific customer and tailor their messaging based on customer behavior, context, and intention; how customer behaviors are collected from a variety of sources across many industries and combined to identify micro segments; and how online and physical stores collaborate to provide a unified shopping experience and deliver product information.
Engaging Customers Using Big Data provides the tools and techniques necessary to effectively implement big data into your marketing strategy, including statistical techniques, qualitative reasoning, and real-time pattern detection, and more.
Come and collect a signed copy of the book at the Book store – Monday October 26, 4:30 to 5:00 PM.
Notices and Disclaimers
Copyright © 2015 by International Business Machines Corporation (IBM). No part of this document may be reproduced or transmitted in any form
without written permission from IBM.
U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM.
Information in these presentations (including information relating to products that have not yet been announced by IBM) has been reviewed for
accuracy as of the date of initial publication and could include unintentional technical or typographical errors. IBM shall have no responsibility to
update this information. THIS DOCUMENT IS DISTRIBUTED "AS IS" WITHOUT ANY WARRANTY, EITHER EXPRESS OR IMPLIED. IN NO
EVENT SHALL IBM BE LIABLE FOR ANY DAMAGE ARISING FROM THE USE OF THIS INFORMATION, INCLUDING BUT NOT LIMITED TO,
LOSS OF DATA, BUSINESS INTERRUPTION, LOSS OF PROFIT OR LOSS OF OPPORTUNITY. IBM products and services are warranted
according to the terms and conditions of the agreements under which they are provided.
Any statements regarding IBM's future direction, intent or product plans are subject to change or withdrawal without notice.
Performance data contained herein was generally obtained in a controlled, isolated environments. Customer examples are presented as
illustrations of how those customers have used IBM products and the results they may have achieved. Actual performance, cost, savings or other
results in other operating environments may vary.
References in this document to IBM products, programs, or services does not imply that IBM intends to make such products, programs or services
available in all countries in which IBM operates or does business.
Workshops, sessions and associated materials may have been prepared by independent session speakers, and do not necessarily reflect the
views of IBM. All materials and discussions are provided for informational purposes only, and are neither intended to, nor shall constitute legal or
other guidance or advice to any individual participant or their specific situation.
It is the customer’s responsibility to insure its own compliance with legal requirements and to obtain advice of competent legal counsel as to the
identification and interpretation of any relevant laws and regulatory requirements that may affect the customer’s business and any actions the
customer may need to take to comply with such laws. IBM does not provide legal advice or represent or warrant that its services or products will
ensure that the customer is in compliance with any law.
Notices and Disclaimers (con’t)
Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly
available sources. IBM has not tested those products in connection with this publication and cannot confirm the accuracy of performance,
compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the
suppliers of those products. IBM does not warrant the quality of any third-party products, or the ability of any such third-party products to
interoperate with IBM’s products. IBM EXPRESSLY DISCLAIMS ALL WARRANTIES, EXPRESSED OR IMPLIED, INCLUDING BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
The provision of the information contained herein is not intended to, and does not, grant any right or license under any IBM patents, copyrights,
trademarks or other intellectual property right.
• IBM, the IBM logo, ibm.com, Aspera®, Bluemix, Blueworks Live, CICS, Clearcase, Cognos®, DOORS®, Emptoris®, Enterprise Document
Management System™, FASP®, FileNet®, Global Business Services ®, Global Technology Services ®, IBM ExperienceOne™, IBM
SmartCloud®, IBM Social Business®, Information on Demand, ILOG, Maximo®, MQIntegrator®, MQSeries®, Netcool®, OMEGAMON,
OpenPower, PureAnalytics™, PureApplication®, pureCluster™, PureCoverage®, PureData®, PureExperience®, PureFlex®, pureQuery®,
pureScale®, PureSystems®, QRadar®, Rational®, Rhapsody®, Smarter Commerce®, SoDA, SPSS, Sterling Commerce®, StoredIQ,
Tealeaf®, Tivoli®, Trusteer®, Unica®, urban{code}®, Watson, WebSphere®, Worklight®, X-Force® and System z® Z/OS, are trademarks of
International Business Machines Corporation, registered in many jurisdictions worldwide. Other product and service names might be
trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at:
www.ibm.com/legal/copytrade.shtml.
Thank You