Getting started with Data Analytics with Azure Machine Learning
Bhakthi Liyanage
Northern VA CodeCamp Spring 2016
30 April 2016
Apr 15, 2017
Bhakthi Liyanage
Bank of America Merrill Lynch
Who am I?
Sr. SharePoint Architect
16+ years in the IT industry
10+ years in SharePoint
@bhakthil
https://www.linkedin.com/pub/bhakthi-liyanage/14/15/912
https://github.com/bhakthil
Agenda
• Introducing machine learning
• Introducing Azure Machine Learning
• Machine Learning Lifecycle
• Demo
• Summary
• Q & A
What is machine learning?
Academic Definition
Machine learning is a subfield of computer science that evolved from the study of pattern recognition and computational learning theory in artificial intelligence. Machine learning explores the study and construction of algorithms that can learn from and make predictions on data.
Simple Definition
Computing systems that become smarter with learning and experience
Experience = Past data + human input
Why machine learning?
• Being able to predict the future with reasonable accuracy
[Figure: chart with axes Predictability and Time (Yesterday, Today, Tomorrow), showing Reports, Business Intelligence, and Predictive Analytics]
Roles in machine learning
Data scientist
A highly educated and skilled person who can solve complex data problems by employing deep expertise in scientific disciplines (mathematics, statistics or computer science)
Data professional
A skilled person who creates or maintains data systems, data solutions, or implements predictive modelling
Roles: Database Administrator, Database Developer, or BI Developer
Software developer
A skilled person who designs and develops programming logic, and can apply machine learning to integrate predictive functionality into applications
Problem Identification
What problems are we trying to solve?
◦ Anomaly detection
◦ Customer churn
◦ Predictive maintenance
◦ Recommendation systems
What data do we have, if any?
◦ Data already available via sensor systems, transactional databases, customer sales databases, etc.
Predictive maintenance
Vision Analytics
Recommendation engines
Advertising analysis
Weather forecasting for business planning
Social network analysis
Legal discovery and document archiving
Pricing analysis
Fraud detection
Churn analysis
Equipment monitoring
Location-based tracking and services
Personalized Insurance
Data
Data consists of
◦ Features (aka input parameters): the data that is fed into the model
◦ Identify which features are relevant to the problem
◦ Labels: the historical result of each observation
Training data
◦ Pairing of features and label
◦ Historical data
Validation data
◦ Used to verify the trained model
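To make the terms above concrete, here is a minimal sketch in plain Python. The churn observations, the feature names (monthly_spend, support_calls) and the helper split_train_validation are all hypothetical and chosen only for illustration; they are not part of Azure Machine Learning.

```python
import random

# Hypothetical observations: each row pairs features with a label.
# Features: (monthly_spend, support_calls). Label: did the customer churn?
observations = [
    ((20.0, 5), True),
    ((75.0, 0), False),
    ((30.0, 4), True),
    ((90.0, 1), False),
    ((25.0, 6), True),
    ((60.0, 1), False),
    ((35.0, 3), True),
    ((80.0, 0), False),
    ((55.0, 2), False),
    ((22.0, 7), True),
]


def split_train_validation(rows, validation_fraction=0.3, seed=42):
    """Shuffle the rows and hold back a fraction of them for validation."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_validation = int(len(shuffled) * validation_fraction)
    return shuffled[n_validation:], shuffled[:n_validation]


training_data, validation_data = split_train_validation(observations)

# Training data keeps features and labels paired, but models usually
# receive them as two parallel sequences.
features = [row_features for row_features, _ in training_data]
labels = [row_label for _, row_label in training_data]

print(len(training_data), "training rows,", len(validation_data), "validation rows")
```

The validation rows are never shown to the model during training, which is what lets them verify the trained model afterwards.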
Learning
Supervised
◦ Machine learning task of inferring a function/model from labeled training data or examples
◦ Training data consists of both features and labels
Unsupervised
◦ Machine learning task of inferring a function to describe hidden structure from unlabeled data
◦ Data contains only features
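The difference between the two kinds of learning can be sketched with a toy example. The data and both helper algorithms below (a nearest-centroid classifier and a one-dimensional two-means clustering) are illustrative assumptions, not the algorithms used in the talk or in Azure Machine Learning.

```python
from statistics import mean

# Supervised: every observation has a feature value AND a label.
labeled = [(1.0, "low"), (1.5, "low"), (2.0, "low"),
           (8.0, "high"), (9.0, "high"), (10.0, "high")]


def fit_centroids(rows):
    """Learn one centroid (mean feature value) per label."""
    by_label = {}
    for x, label in rows:
        by_label.setdefault(label, []).append(x)
    return {label: mean(xs) for label, xs in by_label.items()}


def predict(centroids, x):
    """Return the label whose centroid lies closest to x."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))


centroids = fit_centroids(labeled)
print(predict(centroids, 3.0))

# Unsupervised: only feature values are available, so the algorithm
# has to discover the grouping on its own (1-D k-means with k = 2).
unlabeled = [x for x, _ in labeled]


def two_means(xs, iterations=10):
    """Split xs into two groups around two iteratively updated centers."""
    c1, c2 = min(xs), max(xs)
    for _ in range(iterations):
        g1 = [x for x in xs if abs(x - c1) <= abs(x - c2)]
        g2 = [x for x in xs if abs(x - c1) > abs(x - c2)]
        c1, c2 = mean(g1), mean(g2)
    return sorted(g1), sorted(g2)


groups = two_means(unlabeled)
print(groups)
```

The supervised model can name its prediction ("low" or "high") because the labels told it what the groups mean; the unsupervised one only reports that two groups exist.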
Azure Machine Learning
One solution for machine learning
• Enables powerful cloud-based predictive analytics
• Professionals can easily build, deploy and share advanced analytics solutions
• Browser based, rapid deployment
• Connects seamlessly with other Azure data-related services, including Azure HDInsight (Big Data), Azure SQL Database, and Virtual Machines
• Models are consumed via ML API service
Machine learning lifecycle
• Define Objective
• Collect Data
• Prepare Data
• Train Models
• Evaluate Models
• Deploy
• Manage
• Integrate
Define Objective
It is important to start a machine learning project with a clearly defined objective
◦ I need to predict the customer churn rate for the next 6 months…
◦ I need to suggest relevant products to the customers
◦ I need to know when my manufacturing equipment will fail
Collect Data
Collecting complete data is critical
◦ Garbage in ► Garbage out
Datasets can be sourced from:
◦ Internal sources, e.g. operational systems, data warehouse, etc.
◦ External sources
◦ Different formats, e.g. relational, multidimensional, text, map-reduce
Combining datasets can enrich data
◦ E.g., integrate internal data with external data like weather or market intelligence data
◦ Weather data with flight delay data
◦ Population data with energy consumption data
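As one way to picture how combining datasets enriches data, the sketch below joins flight delay records with weather records on a shared date and airport. All records, field names and the enrich helper are hypothetical.

```python
# Hypothetical internal data: flight delays per day and airport.
flights = [
    {"date": "2016-01-10", "airport": "IAD", "delay_minutes": 45},
    {"date": "2016-01-10", "airport": "DCA", "delay_minutes": 5},
    {"date": "2016-01-11", "airport": "IAD", "delay_minutes": 0},
]

# Hypothetical external data: snowfall per day and airport.
weather = [
    {"date": "2016-01-10", "airport": "IAD", "snow_cm": 12.0},
    {"date": "2016-01-11", "airport": "IAD", "snow_cm": 0.0},
]


def enrich(internal_rows, external_rows, keys):
    """Left-join external rows onto internal rows using the shared keys.

    Internal rows without a match are kept unchanged, so gaps in the
    external source stay visible instead of silently dropping data.
    """
    index = {tuple(row[k] for k in keys): row for row in external_rows}
    enriched = []
    for row in internal_rows:
        match = index.get(tuple(row[k] for k in keys), {})
        enriched.append({**match, **row})
    return enriched


enriched = enrich(flights, weather, keys=("date", "airport"))
for row in enriched:
    print(row)
```

The DCA flight has no weather record, so it comes through without a snow_cm value; that missing value is exactly the kind of issue the data preparation stage has to handle.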
Prepare Data
Prepare data for machine learning
◦ Transform to cleanse, reduce or reformat
◦ Isolate and flag abnormal data
◦ Appropriately substitute missing values
◦ Categorize continuous values into ranges
◦ Normalize continuous values between 0 and 1
Of course, having the required data to begin with is important
◦ When designing systems, give consideration to attributes that may be required as inputs for future modeling, e.g. demographic data: birth date, gender, etc.
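Three of the preparation steps listed above (substituting missing values, categorizing continuous values into ranges, and normalizing between 0 and 1) can be sketched in a few lines. The age values, the range edges and the choice of the mean as the substitute are assumptions made for this example; other substitutes such as the median are equally valid.

```python
from statistics import mean

# Hypothetical raw feature with missing values recorded as None.
ages = [23, None, 35, 58, None, 41]


def fill_missing(values):
    """Substitute missing values with the mean of the known values."""
    known = [v for v in values if v is not None]
    fill = mean(known)
    return [fill if v is None else v for v in values]


def to_range(value, edges=(30, 50)):
    """Categorize a continuous value into one of three ranges."""
    if value < edges[0]:
        return "under 30"
    if value < edges[1]:
        return "30-49"
    return "50 and over"


def min_max(values):
    """Normalize values linearly so the smallest is 0 and the largest is 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


filled = fill_missing(ages)
ranges = [to_range(v) for v in filled]
normalized = min_max(filled)

print(filled)
print(ranges)
print(normalized)
```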
Train Models
This stage is iterative, and experimentation involves:
◦ Selecting a machine learning algorithm
◦ Defining inputs and outputs
◦ Optimizing by configuring algorithm parameters
Evaluate Models
Model evaluation is critical to determine:
◦ Accuracy, Reliability, Usefulness
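The iterate-and-evaluate loop can be illustrated with a deliberately tiny model: a single threshold on one feature. The data, the threshold rule and the candidate parameter values are hypothetical; a real experiment would compare full algorithms, but the shape of the loop is the same.

```python
# Hypothetical data: (support_calls, churned).
training = [(1, False), (2, False), (3, False), (4, True), (5, True), (6, True)]
validation = [(0, False), (2, False), (5, True), (7, True), (3, True)]


def predict(threshold, support_calls):
    """Toy model: predict churn when support calls reach the threshold."""
    return support_calls >= threshold


def accuracy(threshold, rows):
    """Fraction of rows where the prediction matches the label."""
    correct = sum(predict(threshold, x) == y for x, y in rows)
    return correct / len(rows)


# Train: try each candidate parameter value and keep the one that
# fits the training data best.
candidates = range(0, 8)
best_threshold = max(candidates, key=lambda t: accuracy(t, training))

# Evaluate: measure accuracy on data the model has not seen.
validation_accuracy = accuracy(best_threshold, validation)

print("best threshold:", best_threshold)
print("validation accuracy:", validation_accuracy)
```

The model fits the training data perfectly but scores lower on the validation data, which is why evaluation on held-back data is treated as its own stage.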
Deploy
First, add a scoring experiment
– Training logic is replaced with a trained model
– Input and output end-points are added
– Module properties can be parameterized
Publish the experiment to the gallery
– Learn from others by discovering experiments
– Contribute and showcase your experiments
Integrate
Integrate the experiment with external applications
– Integration offers REST web service end points
– Each web service offers two methods:
• Request/Response Service (RRS) ► Low latency, highly scalable web service
• Batch Execution Service (BES) ► High volume, asynchronous scoring of many records
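From the application side, calling a request/response end point is an ordinary authenticated HTTP POST. The sketch below only builds such a request and does not send it. The URL and API key are placeholders, and the payload layout (Inputs, ColumnNames, Values, GlobalParameters) is an assumption modelled on the ML Studio request/response format; the API help page of the published web service is the authority on the exact shape.

```python
import json
import urllib.request

# Placeholders: replace with the URL and key shown for your own web service.
SERVICE_URL = "https://example.invalid/ml-service/execute"
API_KEY = "YOUR-API-KEY"


def build_scoring_request(column_names, rows):
    """Build (but do not send) a POST request that scores the given rows.

    The payload layout is an assumption for illustration; check the
    service's own API help page before relying on it.
    """
    payload = {
        "Inputs": {
            "input1": {"ColumnNames": column_names, "Values": rows},
        },
        "GlobalParameters": {},
    }
    return urllib.request.Request(
        SERVICE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + API_KEY,
        },
        method="POST",
    )


request = build_scoring_request(
    ["monthly_spend", "support_calls"],  # hypothetical feature names
    [["20.0", "5"]],
)
print(request.get_method(), request.full_url)
```

Against a real service, passing the request to urllib.request.urlopen and parsing the JSON body of the response would return the scored result. Batch scoring follows a different, asynchronous submit-and-poll flow that is not shown here.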
Azure Machine Learning
One solution for machine learning
[Figure: architecture overview across four stages: Data, Model Development, Model Deployment, Operationalize. Data sources: Stream Analytics, Blob storage, Azure SQL, HDInsight, Azure Storage, desktop data. Components: Azure ML Studio, Azure ML Services, ML web service end-points, Azure Portal, ML API service. Roles: Data Scientist (ML Studio), Azure Ops Team (Azure Portal & ML API service), Developer (ML API service). Clients: Power BI/Dashboards, Mobile Apps, Web Apps.]
ML Studio and the Data Professional
• Access and prepare data
• Create, test and train models
• Collaborate
• One click to stage for production via the API service
Azure Portal & ML API service and the Azure Ops Team
• Create ML Studio workspace
• Assign storage account(s)
• Monitor ML consumption
• See alerts when model is ready
• Deploy models to web service
ML API service and the Application Developer
• Tested models available as a URL that can be called from any endpoint
Business users easily access results from anywhere, on any device
Azure Machine Learning
One solution for machine learning
• Faster path to solutions
• Mashup of powerful algorithms
• Global scaling of solutions via cloud API
• Elastic, pay-as-you-go model with low operating costs
• Quick and easy extensibility with cloud functions such as Power BI, Hadoop (Azure HDInsight) and cloud storage
Summary
Machine Learning is a subfield of computer science and statistics that deals with the construction and study of systems that can learn from data.
Azure Machine Learning key attributes:
Fully managed ► No hardware or software to buy
Integrated ► Drag, drop, connect and configure
Best-in-class algorithms ► Proven solutions from Xbox and Bing
R built in ► Use over 400 R packages, or bring your own R or Python code
Deploy in minutes ► Operationalize with a click
Flexible consumption ► Any device capable of consuming REST API
Machine Learning is now approachable to developers