Enhancing Demand Forecast through Advanced Analytics
Gil Graciani, José Mejías
RapidMiner’s Wisdom, October 2018
Agenda
• Selecting a Data Science Platform
• Enhancing Demand Forecast
  • Problem Statement
  • Key Innovations
  • Results
Analytic IDE Objectives
Tool Objectives | Business Benefits
Single, multi-tenant platform that supports advanced analytics for both Data Scientists and Citizen Data Scientists | Increased data-driven decision making through adoption, greater collaboration & algorithm sharing
Platform that scales (preferably horizontally) by compute capability as well as data size | Faster speed to insights through flexibility & agility to quickly adapt to growth
Platform that interfaces with EA Analytics platform applications, providing in-database, distributed computing via Spark and Hadoop, and streaming analytics | Ability to drive business transformation through embedded analytics
Platform that supports Data Preparation needs for advanced analytics | Increased agility and speed to insights through automation
Data Prep and Algorithms (E2E) that can be automated (scheduled) | Ability to leverage DevOps via automation
Platform output that can be embedded into BI analytic reporting tools and applications | Expanded use of algorithmic intelligence via ability to distribute insights across the broader BI ecosystem
Platform supports self-service | Greater agility, faster speed to insight and lower support costs
Analytic IDE Tool Evaluation Process and Outcome
Research Conducted
• Gartner ratings
• Magic quadrant ratings
• Individual capability scores
• Input from Analytic SMEs
Vendor Demos
• Dataiku
• RapidMiner
• Data Science Inc.
• Knime
• Alteryx
In-house proof-of-concept
• RapidMiner
RapidMiner chosen based on:
• Scalability / Open Source
• Big data ecosystem connectivity (Hadoop)
• Integration with Python and R
• One of the most intuitive UIs
• Templated Business Use Cases (e.g. “Predictive maintenance”, etc.)
• “Wisdom of crowds” social analytic recommendation feature
EA Platform Tools Landscape
[Diagram: tools landscape across the platform layers Data Lake, Domain Views and Consumption]
• Rows: Process, Tools, Framework, IDE (Integrated Development Environment), Skills, Management
• Process steps: Ingestion, Enrichment, Transform, Analysis, Delivery
• Step activities as labeled: Stream, Batch; Proof, Enhance; Merge, Aggregate, Normalize; Descriptive, Diagnostic, ML; BI, Insights, Actions
• Data stores: Raw Data, Atomic Data, Domain DM, Analytics DM, Consumption DM
• Skills: Basic-advanced data management; Basic-advanced analytics skills; Advanced DM/BI/Visualization/WebDev
• Management categories: Scheduler, Workflow, Catalog, Data Governance, Admin, Security, Automation, OLAP
• Governance Cooperative: DEV: IT Enable, Business Produce; PRO: IT Enable, Business/IT Produce
Discovery Layer
[Architecture diagram. Column headings: Data Sources, Data Ingestion, Enterprise Analytics Platform, Data Consumers. Labels: SERP, SFDC, Business Managed Data, External Data, Edge Node, IF, Raw Layer, Refined Layer, Consumption Layer, Discovery Layer, Archive, BI Reports, Data Science (as needed), RM Server]
Toolkits: Big Data Engineer Toolkit (BMT), Data Science Toolkit (BMT)
Personas will determine the tools in the toolkit. Technologies listed are subject to change based on needs of the user community.
Environment
• 4 TB storage limitation for Discovery Layer
• Environment dedicated to Data Science & Big Data Engineers/Analysts
• Access through visualization tools (subscribe through UAM)
Problem Statement
We are building a next gen model...
• Unique, Artificial Intelligence (AI) based Predictive Capability
• with a long-term (15 yrs), econometric, total market perspective,
• leveraging new inputs & next gen analytics
  • Machine & Deep Learning, Customer Analytics, Channel Inventory data & Lifecycle Analytics
• Optimized for accuracy improvement
• Intelligent Design: Will deliver increased value in future years with continuous improvement
Solution: Better & Broader Data Set + Next Gen Advanced Analytics = Improved Planning
Goal/Value: Improve planning accuracy by at least 40%
Problem Statement: Advanced analytics and econometric, market & long-term considerations are excluded from our current planning, which is primarily bottom-up
Stats Models: ARIMA
Machine Learning: Random Forest, KNN, GBM, Neural Nets
Deep Learning: Recurrent Neural Nets
Ensemble Approach for Final Model Selection
Market Data: IDC
Econ Data: OECD, Duke CFO Survey
Internal: Orders, Customer, SC
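The slides do not say how the ensemble step picks the final model. One common pattern, sketched here as an assumption rather than as the ALFA implementation, is to score every candidate on a holdout window by MAPE, rank them, and average the forecasts of the best few. The three forecasters below (`naive_last`, `moving_average`, `linear_trend`) are deliberately simple, hypothetical stand-ins for the ARIMA, machine learning and deep learning models listed above, and the demand series is made up.

```python
from statistics import mean
from typing import Callable, Dict, List, Sequence, Tuple

# A forecaster takes a history and a horizon and returns `horizon` forecasts.
Forecaster = Callable[[Sequence[float], int], List[float]]


def mape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute percentage error in percent (assumes no zero actuals)."""
    return 100.0 * mean(abs(a - f) / abs(a) for a, f in zip(actual, forecast))


def naive_last(history: Sequence[float], horizon: int) -> List[float]:
    """Repeat the last observed value."""
    return [history[-1]] * horizon


def moving_average(history: Sequence[float], horizon: int) -> List[float]:
    """Repeat the mean of the last three observations."""
    return [mean(history[-3:])] * horizon


def linear_trend(history: Sequence[float], horizon: int) -> List[float]:
    """Extrapolate an ordinary least-squares line fitted to the history."""
    n = len(history)
    xs = list(range(n))
    x_bar, y_bar = mean(xs), mean(history)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, history))
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    return [intercept + slope * (n + h) for h in range(horizon)]


def rank_candidates(series: Sequence[float],
                    candidates: Dict[str, Forecaster],
                    holdout: int) -> List[Tuple[str, float]]:
    """Score each candidate on the last `holdout` points; lowest MAPE first."""
    train, test = series[:-holdout], series[-holdout:]
    scores = {name: mape(test, model(train, holdout))
              for name, model in candidates.items()}
    return sorted(scores.items(), key=lambda item: item[1])


def blend_top(series: Sequence[float],
              candidates: Dict[str, Forecaster],
              holdout: int, horizon: int, top_k: int = 2) -> List[float]:
    """Average the forecasts of the `top_k` best candidates, refit on all data."""
    ranked = rank_candidates(series, candidates, holdout)
    chosen = [name for name, _ in ranked[:top_k]]
    forecasts = [candidates[name](series, horizon) for name in chosen]
    return [mean(step) for step in zip(*forecasts)]


# Hypothetical monthly demand, for illustration only.
SERIES = [100.0, 110.0, 120.0, 130.0, 140.0, 150.0, 160.0, 170.0]
CANDIDATES: Dict[str, Forecaster] = {
    "naive_last": naive_last,
    "moving_average": moving_average,
    "linear_trend": linear_trend,
}

if __name__ == "__main__":
    for name, score in rank_candidates(SERIES, CANDIDATES, holdout=2):
        print(f"{name}: holdout MAPE {score:.2f}%")
    print("blended forecast:", blend_top(SERIES, CANDIDATES, holdout=2, horizon=1))
```

Averaging the top few models instead of taking only the winner is a design choice that tends to reduce the risk of overfitting to one holdout window; a production version would use rolling-origin evaluation rather than a single split.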
ALFA Solution Overview
Our Journey
Issue: Excellent concept, but team lacked ingestion, automation & modeling expertise
Initial Solution: Cover gaps with consultants
Final Solution: RapidMiner as an end-to-end analytics solution
Results:
• Aspiring Data Scientist Mentality
• New capabilities developed and leveraged for planning tools
• Saved consulting & SaaS fees
Why RapidMiner?
• Ingestion, Enrichment, Transformation and Analysis in one automation tool
• One vs Many tools to learn and link for automation
• Leveraging for Demand Planning tool dev.
• It’s “where the puck is going”
  • RapidMiner’s first company deployment
• IT supported
  • Investing our time in higher value add tasks vs supporting ad-hoc / unique tools
• Leaders in Gartner Magic Quadrant
RapidMiner Benefits
• Integration with R
• Intuitive UI / Easy to Learn
• Excellent Expert Support
• Big data ecosystem connectivity (Hadoop)
• “Wisdom of crowds” social analytic recommendation feature
• Scalability / Open Source
Taking the capabilities of the team to the next level
Results Overview
[Charts: Current MAPE vs. ALFA Pilot MAPE for Total Line, Fam A and Fam B, shown for the AMS Pilot (Mar-Aug) and the EMEA Pilot (Mar-Aug)]
Improvement labels on the charts: 49% Better, 65% Better, 10% Better, 68% Better, 58% Better, 75% Better
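The slides do not state how the "% Better" labels are computed. A minimal sketch, assuming they are the relative reduction of the pilot MAPE against the current-process MAPE, with a check against the 40% planning-accuracy goal. The function names and the MAPE values in the example are hypothetical and are not taken from the pilot.

```python
def pct_better(current_mape: float, pilot_mape: float) -> float:
    """Relative MAPE reduction in percent (assumed meaning of '% Better')."""
    if current_mape <= 0:
        raise ValueError("current_mape must be positive")
    return 100.0 * (current_mape - pilot_mape) / current_mape


def meets_goal(current_mape: float, pilot_mape: float,
               goal_pct: float = 40.0) -> bool:
    """True when the improvement reaches the stated goal (40% by default)."""
    return pct_better(current_mape, pilot_mape) >= goal_pct


if __name__ == "__main__":
    # Hypothetical MAPE values, for illustration only.
    current, pilot = 40.0, 20.0
    print(f"{pct_better(current, pilot):.0f}% better,",
          "goal met" if meets_goal(current, pilot) else "goal not met")
```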
Key Messages
• >60% Sys Volume Coverage
• up to 75% Improvement
• All Lines Delivering Improvement over Current Process
Accelerated time to Insight
• ~6 months
RapidMiner Key Enabler
• Ingestion, Enrichment, Transformation, Analysis
• Discovery, Modelling, Integration
• Outstanding Support
Model Achieving stable, sustainable performance
• Beating the 40% improvement goal
• Future improvements potential