Data Mining with SQL Server 2008
Amaryllis Guillot-Plasse, Senior Data Mining/BI consultant
Pierre-Louis Usselmann, Senior BI consultant
Jan 04, 2016
SOGETI
Consulting Services, Outsourcing Services, Technology Services, Local Professional Services
Competence Centers: Business Intelligence, WebTech, SAP, Infrastructure Management
Application Services: Audit, Consulting; Project Management; Development, Migration; Implementation ERP; Testing; Application Management
High Tech Consulting: Engineering Science, Engineering R&D, Embedded Software, Scientific Calculations
Infrastructure Services: Audit, Consulting, Project Management, Insourcing, Security, Helpdesk, Roll out
Agenda
What is Data Mining?
Data Mining with SQL Server 2008
Demo
Conclusion
What is Data Mining?
Exploring and analysing big volumes of data using statistical techniques and computing in order to transform raw data into valuable information
Data Mining is also known as:
Machine Learning
Predictive Analytics
Typical Applications of Data Mining
Customer lifetime value
Predict customer purchasing or behaviour (churn, migration to other products…)
Promotion and sale of additional products
Fraud detection
Financial risk assessment (loans etc.)
Segmentation and clustering of customers to understand them better
Better advertising
Income and profit forecasting
Role of Data Mining in BI
[Figure: business value plotted against time, moving from measurement (historical) to prediction (future)]
Query, Reporting: How many customers did we lose?
OLAP: What was their age?
Data Mining: Which customer types are at risk and why?
Real-time Personalization: What should we offer this customer right now?
Data Mining process
Training: Training Data (DB data, client data, application data) -> Data Mining Engine -> Mining Model
Prediction: Mining Model + Data To Predict (DB data, client data, application data, or “just one row”) -> Data Mining Engine -> Predicted Data
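The prediction half of this process, including the “just one row” case, can be sketched as a DMX singleton query. This is a minimal sketch under assumptions: it presumes a trained model named MyModel with the columns declared in the CREATE MINING MODEL statement shown later in this deck, and the input values ('M', 'Married', 35, 60000) are made up for illustration.

```dmx
// Score a single, hand-typed case against an already trained model.
// Assumption: [MyModel] exists, is trained, and predicts [Home Ownership].
SELECT
    Predict([Home Ownership])            AS [Predicted Ownership],
    PredictProbability([Home Ownership]) AS [Probability]
FROM [MyModel]
NATURAL PREDICTION JOIN
(
    // The "data to predict": one row, built inline (illustrative values)
    SELECT 'M'       AS [Gender],
           'Married' AS [Marital Status],
           35        AS [Age],
           60000     AS [Income]
) AS t
```

NATURAL PREDICTION JOIN matches the input columns to model columns by name, so the inline row only needs to use the model's column names for the attributes it supplies.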
Data Mining with SQL Server 2008
Data Mining Lifecycle – CRISP DM
Business Understanding: Define Business Objectives
Data Understanding: Data Collection
Data Preparation: Cleansing & Transformation
Modeling: Mining Task
Evaluation: Mining Model Assessment
Deployment: Application Integration
Prediction (Scoring)
Putting Data Mining to Work
Making Changes to the Business
Data
SQL Server 2008 Data Mining Process
[Figure: the same CRISP-DM cycle as on the previous slide, annotated with the SQL Server 2008 components used along the way]
Analysis Services, Integration Services, Excel
Analysis Services (Data Mining)
Analysis Services, Integration Services, Reporting Services
Analysis Services (Data Mining)
Server Mining Architecture
Analysis Services Server
Mining Model
Data Mining Algorithm
Data Source
Your Application (OLE DB / ADOMD / XMLA)
BI Dev Studio (Visual Studio): Deploy
App Data
How Microsoft Delivers Predictive Analytics
Application Developer
Data Mining Specialist
Information Worker
BI Analyst
Data Mining SQL extensions (DMX)
Custom Algorithms
Microsoft Dynamics CRM Analytics Foundation
SQL Server 2008 Business Intelligence Development Studio
Data Mining Add-ins for the 2007 Microsoft Office system
Microsoft SQL Server 2008 Analysis Services
Microsoft SQL Server 2008 Data Mining
Nine Data Mining algorithms available
Association rules
Clustering
Decision Trees
Linear regression
Logistic regression
Naïve Bayes
Neural nets
Sequence clustering
Time series
[Figure: viewer screenshots for Decision Trees, Time Series, Association, Naïve Bayes, Clustering, Neural Networks, Sequence Clustering]
Creating a mining model programmatically
CREATE MINING MODEL MyModel
(
    [CustID]         LONG   KEY,               // case key: one case per customer
    [Gender]         TEXT   DISCRETE,
    [Marital Status] TEXT   DISCRETE,
    [Education]      TEXT   DISCRETE,
    [Home Ownership] TEXT   DISCRETE PREDICT,  // column the model predicts
    [Age]            LONG   CONTINUOUS,
    [Income]         DOUBLE CONTINUOUS,
    [Products]       TABLE                     // nested table: products per customer
    (
        [Product Name] TEXT KEY
    )
    …
) USING Microsoft_Decision_Trees
Nested cases are possible: a column can hold a table (here, [Products]) instead of a single value
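To show how a nested case is fed to such a model, here is a hedged sketch of a training statement. Everything about the source is an assumption made for illustration: the data source name MyDataSource, the tables Customers and Purchases, and their column names are hypothetical, and the sketch presumes that the elided part of the model definition adds no further columns.

```dmx
// Train MyModel from two relational queries: one row per customer (the case)
// and many rows per customer (the nested [Products] table).
// Hypothetical names: MyDataSource, Customers, Purchases and their columns.
INSERT INTO MINING MODEL [MyModel]
(
    [CustID], [Gender], [Marital Status], [Education],
    [Home Ownership], [Age], [Income],
    [Products] (SKIP, [Product Name])   // SKIP ignores the foreign key column
)
SHAPE
{
    // Case-level rows; must be ordered by the key used in RELATE
    OPENQUERY([MyDataSource],
        'SELECT CustID, Gender, MaritalStatus, Education,
                HomeOwnership, Age, Income
         FROM Customers
         ORDER BY CustID')
}
APPEND
(
    {
        // Nested rows: one row per product bought by a customer
        OPENQUERY([MyDataSource],
            'SELECT CustID, ProductName
             FROM Purchases
             ORDER BY CustID')
    }
    RELATE [CustID] TO [CustID]
) AS [Products]
```

Source columns are bound to model columns by position, which is why the nested binding starts with SKIP: the first column of the second query is only there to relate each product row to its customer.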
Data Mining Add-Ins for Excel 2007
Data Preparation
Data Modeling
Accuracy and Validation
Model Usage and Management
“What Microsoft has done is to make data mining available on the desktop to everyone” - David Norris, Associate Analyst, Bloor Research
Demo
Data Mining with SQL Server 2008 (Analysis Services & BI Dev Studio)
Date: March 13th, 2008
Id: F-B109
Track: DataBase
Level: 300
Will they buy a bike?
Conclusion
What’s new in SQL Server 2008 for Data Mining?
Enhancement of the Microsoft Time Series algorithm with ARIMA
Easy partitioning of your data into training and test sets
Possibility to build mining models on filtered subsets (e.g., just male customers)
Access to all mining structure columns, not just columns included in the model (drillthrough functionality)
Cross-validation feature added
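Three of the features listed above (holdout partitioning, filtered models, and drillthrough to structure columns) can be sketched in DMX. This is a hedged sketch, not the presenters' demo: the structure name Customers, the model name MaleOwners, and the value 'M' for Gender are assumptions made for illustration. Each statement is meant to be run on its own, and the step that loads data into the structure is omitted.

```dmx
// 1. Partitioning: reserve 30 percent of the cases as a test set.
CREATE MINING STRUCTURE [Customers]
(
    [CustID]         LONG   KEY,
    [Gender]         TEXT   DISCRETE,
    [Education]      TEXT   DISCRETE,
    [Age]            LONG   CONTINUOUS,
    [Income]         DOUBLE CONTINUOUS,
    [Home Ownership] TEXT   DISCRETE
)
WITH HOLDOUT (30 PERCENT)

// 2. Filtered model: trained only on male customers, with drillthrough on.
//    [Education] is deliberately left out of the model.
ALTER MINING STRUCTURE [Customers]
ADD MINING MODEL [MaleOwners]
(
    [CustID],
    [Gender],
    [Age],
    [Income],
    [Home Ownership] PREDICT
)
USING Microsoft_Decision_Trees
WITH DRILLTHROUGH, FILTER([Gender] = 'M')

// 3. Drillthrough: read a structure column the model does not include.
SELECT [CustID], [Age], StructureColumn('Education') AS [Education]
FROM [MaleOwners].CASES

// 4. Inspect the cases that the holdout setting assigned to the test set.
SELECT *
FROM MINING STRUCTURE [Customers].CASES
WHERE IsTestCase()
```

Because the holdout is declared on the structure rather than on a model, every model added to the structure shares the same training and test cases, which keeps accuracy comparisons between models consistent.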
Conclusion
Nine Data Mining algorithms + viewers
BI Dev Studio for developers and analysts
Integration with SSIS, SSAS, and SSRS
New world of “smart applications”
Complete platform for all levels of data mining experience (via interface or programming)
Community Resources
“Data Mining with SQL Server 2005”, book by Jamie MacLennan and ZhaoHui Tang, Wiley 2005, ISBN 0-471-46261-6
SQL Server Data Mining: www.SQLServerDataMining.com
SQL Server Developer Center: http://msdn.microsoft.com/sql
SQL Server Forums: http://forums.microsoft.com/msdn
Trial Software and Virtual Labs: http://www.microsoft.com/technet/downloads/trials/default.mspx
Q&A
Come and visit us at our stand