Business Intelligence Data Mining (Part 2 of 2)
Jun 14, 2015
The End?
How far can I go?
• By storing and analyzing historical data, you see only one part of reality (the past and the present)
• Is there a way to answer questions that have not been asked yet? Can I look into the future?
• Can I predict how my business is going to work? What about the market? And my customers?
Data Mining
• A process for extracting patterns from data
• “We’re drowning in data but thirsty for information”
• Data Mining borrows techniques from statistics, probability, maths, artificial intelligence and other fields
Business Problems
• Recommendations
• Anomaly Detection
• Customer churn (abandonment) analysis
• Risk Management
• Customer segmentation
• Targeted advertising
• Projections
Data Mining Tasks
• Classification
• Estimation / Regression
• Prediction / Projection (Forecasting)
• Association Rules / Affinity Groups
• Clustering
Predictive Models
• Classification
  • Discrete value prediction
    • Yes, No
    • High, Medium, Low
• Estimation / Regression
  • Continuous value prediction
    • Amounts
    • Numbers
• Projection / Forecasting
Descriptive Models
• Association Rules / Affinity
  • Look for correlations among items that occur together
  • Market Basket Analysis
• Clustering
  • Groups items according to similarity
  • “Automatic” classification
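To make the market basket idea concrete, here is a minimal sketch of the three measures usually attached to an association rule: support, confidence and lift. The baskets and the function names are hypothetical, invented for illustration; they are not taken from the slides.

```python
# Hypothetical shopping baskets, invented for illustration only.
BASKETS = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"beer", "chips"},
    {"bread", "butter"},
    {"milk", "chips"},
]


def support(items):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in BASKETS) / len(BASKETS)


def confidence(antecedent, consequent):
    """Of the baskets containing the antecedent, the share that also contain the consequent."""
    return support(set(antecedent) | set(consequent)) / support(antecedent)


def lift(antecedent, consequent):
    """Confidence relative to how common the consequent is on its own (above 1 suggests affinity)."""
    return confidence(antecedent, consequent) / support(consequent)


if __name__ == "__main__":
    print("support(bread, butter)  =", round(support({"bread", "butter"}), 2))
    print("confidence(butter->bread) =", round(confidence({"butter"}, {"bread"}), 2))
    print("lift(butter->bread)       =", round(lift({"butter"}, {"bread"}), 2))
```

A rule is normally kept only when both its support and its confidence pass thresholds chosen for the business problem at hand.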
Work Cycle
• Identify Business Opportunities
• Transform Data to Information
• Act on the Information
• Measure Results
Data Mining and the Data Warehouse (DWh)
• The Data Warehouse unifies diverse data sources in one common repository
• Before the DM process, you must have reliable data sources
• Data must be presented in a way that eases analysis
Project Cycle
• Business Problem Formulation
• Data Gathering
• Data transformation and cleansing
• Model Construction
• Model Evaluation
• Reports and Prediction
• Application Integration
• Model Management
What is a Model?
• The model is a set of conclusions reached (in mathematical format) after data processing
• It is used to extract knowledge, and it is applied to new data to reach new conclusions
• It has a measurable accuracy (its efficiency percentage)
• It must be adjusted to make helpful predictions
• It is time-constrained (it stays valid only for a limited time)
Cases
Outlook    Temperature (C)   Humidity   Wind   Play Golf?
Sunny      29.4              85%        NO     No
Sunny      26.6              90%        YES    No
Overcast   28.3              78%        NO     Yes
Rainy      21.1              96%        NO     Yes
Rainy      20.0              80%        NO     Yes
Rainy      18.3              70%        YES    No
Overcast   17.7              65%        YES    Yes
Sunny      22.2              95%        NO     No
Sunny      20.5              70%        NO     Yes
Rainy      23.8              80%        NO     Yes
Sunny      23.8              70%        YES    Yes
Overcast   22.2              90%        YES    Yes
Overcast   27.2              75%        NO     Yes
Rainy      21.6              80%        YES    No
Model
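Outlook
• Overcast: Play = YES
• Rainy: check Wind
  • Wind = NO: Play = YES
  • Wind = YES: Play = NO
• Sunny: check Humidity
  • Humidity <= 77.5: Play = YES
  • Humidity > 77.5: Play = NO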
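The tree on the Model slide can be read as a handful of nested rules. The sketch below encodes one reading of it (Overcast always plays; Rainy plays only without wind; Sunny plays only when humidity is at most 77.5) and checks that reading against the 14 rows of the Cases table. The function name `predict_play` and the tuple layout are assumptions made for this example.

```python
# Rows copied from the Cases table:
# (outlook, temperature_c, humidity_pct, windy, play_golf)
CASES = [
    ("Sunny", 29.4, 85, False, False),
    ("Sunny", 26.6, 90, True, False),
    ("Overcast", 28.3, 78, False, True),
    ("Rainy", 21.1, 96, False, True),
    ("Rainy", 20.0, 80, False, True),
    ("Rainy", 18.3, 70, True, False),
    ("Overcast", 17.7, 65, True, True),
    ("Sunny", 22.2, 95, False, False),
    ("Sunny", 20.5, 70, False, True),
    ("Rainy", 23.8, 80, False, True),
    ("Sunny", 23.8, 70, True, True),
    ("Overcast", 22.2, 90, True, True),
    ("Overcast", 27.2, 75, False, True),
    ("Rainy", 21.6, 80, True, False),
]


def predict_play(outlook, humidity_pct, windy):
    """Apply the decision tree: Outlook first, then Wind or Humidity."""
    if outlook == "Overcast":
        return True
    if outlook == "Rainy":
        return not windy
    # Remaining branch is Sunny: play only when humidity is low enough.
    return humidity_pct <= 77.5


# Temperature is unused: the tree never splits on it.
hits = sum(predict_play(o, h, w) == play for o, _t, h, w, play in CASES)
accuracy = hits / len(CASES)
print(f"{hits}/{len(CASES)} cases classified correctly ({accuracy:.0%})")
```

Every row matches because these are the same cases the tree was derived from. A realistic accuracy figure needs cases the model has not seen before.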
Data Mining Algorithms
• Naive Bayes
• Decision Trees
• Autoregression (ARTXP trees and ARIMA)
• K-Means
• Kohonen Maps
• Neural Networks
• Logistic regression
• Time Series
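As an illustration of the first algorithm in the list, here is a minimal Naive Bayes sketch over two categorical columns of the Cases table (Outlook and Wind). It is a teaching sketch, not a production implementation: the numeric columns are left out, and the Laplace smoothing parameter `alpha` is an assumption added so that a combination never seen in the data (for example Overcast with "No") does not get a probability of exactly zero.

```python
from collections import Counter

# (outlook, windy, play) taken from the Cases table.
ROWS = [
    ("Sunny", False, "No"), ("Sunny", True, "No"), ("Overcast", False, "Yes"),
    ("Rainy", False, "Yes"), ("Rainy", False, "Yes"), ("Rainy", True, "No"),
    ("Overcast", True, "Yes"), ("Sunny", False, "No"), ("Sunny", False, "Yes"),
    ("Rainy", False, "Yes"), ("Sunny", True, "Yes"), ("Overcast", True, "Yes"),
    ("Overcast", False, "Yes"), ("Rainy", True, "No"),
]


def naive_bayes(outlook, windy, alpha=1.0):
    """Return P(play | outlook, windy), treating the two attributes as independent."""
    class_counts = Counter(play for _, _, play in ROWS)
    n_outlook_values = len({o for o, _, _ in ROWS})
    scores = {}
    for label, n_label in class_counts.items():
        n_outlook = sum(1 for o, _, p in ROWS if p == label and o == outlook)
        n_windy = sum(1 for _, w, p in ROWS if p == label and w == windy)
        prior = n_label / len(ROWS)
        # Smoothed conditional probabilities P(attribute value | class).
        p_outlook = (n_outlook + alpha) / (n_label + alpha * n_outlook_values)
        p_windy = (n_windy + alpha) / (n_label + alpha * 2)
        scores[label] = prior * p_outlook * p_windy
    total = sum(scores.values())
    return {label: score / total for label, score in scores.items()}


if __name__ == "__main__":
    for outlook, windy in [("Overcast", False), ("Sunny", True)]:
        probs = naive_bayes(outlook, windy)
        print(outlook, "windy" if windy else "calm", {k: round(v, 2) for k, v in probs.items()})
```

Unlike the decision tree, the result is a probability for each class, which is useful when the business action depends on how confident the prediction is.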
Where can I use them?
• Marketing: Segmentation, Campaigns, Results, Loyalty,...
• Sales: Behaviour detection, Sales habits
• Finances: Investments, Portfolio Management
• Banking and Insurance: Credit checks
• Security: Fraud Detection
• Medicine: Possible treatment analysis
• Manufacturing: Quality Control
• Internet: Click analysis, Text Mining
Data Mining and CRM (1)
• Detect the best prospects / customers
• Select the best communication channel for prospects / customers
• Select an appropriate message for prospects / customers
• Cross-selling, Up-selling and sales recommendation engines
Data Mining and CRM (2)
• Improve direct marketing campaign results
• Customer base segmentation
• Reduce credit risk exposure
• Customer Lifetime Value
• Customer retention and loss
Clustering
• “Self” Customer Segmentation
• Descriptive Characteristics
• Behavioural Characteristics
• Relationship
• Purchases
• Payments
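A small sketch of how such a segmentation could be computed with K-Means, one of the algorithms listed earlier. The customer figures, the two chosen features and the starting centroids are all hypothetical, picked only to keep the example deterministic and easy to follow.

```python
import math

# Hypothetical customers: (purchases per year, average payment delay in days).
CUSTOMERS = [(2, 40), (3, 35), (1, 45), (20, 5), (22, 3), (18, 7)]


def k_means(points, centroids, iterations=10):
    """Plain K-Means: assign each point to its nearest centroid, then recompute centroids."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))
            clusters[nearest].append(point)
        # New centroid = mean of the cluster; an empty cluster keeps its old centroid.
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Starting centroids are fixed (one customer from each apparent group) for repeatability.
centroids, clusters = k_means(CUSTOMERS, centroids=[CUSTOMERS[0], CUSTOMERS[3]])

for centroid, members in zip(centroids, clusters):
    print("segment centre", centroid, "members", members)
```

In practice the features would be scaled to comparable ranges first, and the number of segments would be chosen by comparing several runs.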
Classification
• Customers by purchase behaviour
• Customers by payment behaviour
• Customers by resources needed to serve them
• Customers by credit profile
• Customers by attention required
Association Rules
• Market Basket Analysis
• Cross Selling
• Up Selling
Prediction / Forecasting
• Revenue Projection
• Payment Projection
• Units Sold Projection
• Cash Flow Projection
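As a minimal illustration of a projection, the sketch below fits a straight-line trend to past values by least squares and extends it forward. This is deliberately simpler than the ARTXP and ARIMA algorithms named earlier: it ignores seasonality and autocorrelation. The revenue figures and function names are hypothetical.

```python
# Hypothetical monthly revenue (in thousands), invented for illustration.
REVENUE = [100.0, 104.0, 109.0, 113.0, 118.0, 121.0]


def linear_trend(values):
    """Least-squares slope and intercept of values against their position 0, 1, 2, ..."""
    n = len(values)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast(values, periods_ahead):
    """Extend the fitted trend line by `periods_ahead` future periods."""
    slope, intercept = linear_trend(values)
    n = len(values)
    return [intercept + slope * (n + k) for k in range(periods_ahead)]


if __name__ == "__main__":
    slope, _ = linear_trend(REVENUE)
    print("average growth per month:", round(slope, 2))
    print("next three months:", [round(v, 1) for v in forecast(REVENUE, 3)])
```

The same function could be pointed at payments, units sold or cash flow, as long as a straight-line trend is a reasonable assumption for the series.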
Some other DM cases
• Key Influencers
• Predictions Calculator
Some Possible Problems (1)
• Learning things that are not true
  • The patterns may not represent any underlying rule
  • The cases used to build the model may not be representative
  • The data may not be at a level of detail suitable for analysis
Some Possible Problems (2)
• Learning things that are true, but not useful
  • Learning things that we already knew
  • Learning things that cannot be applied
Thank you!