
Python and data analytics

Jan 23, 2018

Page 1: Python and data analytics

Python and Data Analytics

• Understand the Problem by Understanding the Data
• Predictive Model Building: Balancing Performance, Complexity, and Big Data

Page 2: Python and data analytics

Machine learning

Machine learning explores the study and construction of algorithms that can learn from and make predictions on data.

Page 4: Python and data analytics

Predictive model building

The process of building a predictive model is called training.

Attributes, the variables used to make predictions, are also known as:
◦ Predictors
◦ Features
◦ Independent variables
◦ Inputs

Labels are also known as:
◦ Outcomes
◦ Targets
◦ Dependent variables
◦ Responses
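To make the vocabulary concrete, here is a minimal sketch of training on a toy dataset. The numbers are invented purely for illustration, and the choice of scikit-learn's DecisionTreeClassifier is an assumption; the slides do not name a particular model at this point.

```python
from sklearn.tree import DecisionTreeClassifier

# Attributes (features / predictors / inputs): one row per example, one column per variable.
# These values are made up for illustration only.
X = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]

# Labels (targets / outcomes / responses): one value per row of X.
y = [0, 0, 0, 1, 1, 1]

# "Training" means fitting the model to the attribute/label pairs.
model = DecisionTreeClassifier(random_state=0)
model.fit(X, y)

# The trained model can then predict labels for attribute values it has not seen.
predictions = model.predict([[1.5], [5.5]])
print(predictions)
```

Small values fall on the side of the learned split that carries label 0 and large values on the side that carries label 1.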

Page 5: Python and data analytics

A machine learning project may not be linear, but it has a number of well known steps:

• Define Problem.
• Prepare Data.
• Evaluate Algorithms.
• Improve Results.
• Present Results.

Page 6: Python and data analytics

The Iris dataset has the following structure

• Attributes are numeric, so you have to figure out how to load and handle the data.
• It is a classification problem, allowing you to practice with perhaps an easier type of supervised learning algorithm.
• It is a multi-class (multinomial) classification problem that may require some specialized handling.
• It has only 4 attributes and 150 rows, meaning it is small and easily fits into memory.
• All of the numeric attributes are in the same units and on the same scale, so no special scaling or transforms are required to get started.
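These properties can be checked directly. The sketch below assumes the copy of the Iris data bundled with scikit-learn (`load_iris`) rather than a CSV download, so the loading step may differ from the one the slides had in mind.

```python
import numpy as np
from sklearn.datasets import load_iris

iris = load_iris()

# 150 rows and 4 numeric attributes.
n_rows, n_attributes = iris.data.shape
print(n_rows, n_attributes)

# The attribute names show that every measurement is in centimetres.
print(iris.feature_names)

# Three classes, so this is a multi-class classification problem.
print(list(iris.target_names))

# Number of rows per class.
class_counts = np.bincount(iris.target)
print(class_counts)
```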

Page 7: Python and data analytics

Machine Learning in Python: Step-By-Step

• Installing the Python and SciPy platform.
• Loading the dataset.
• Summarizing the dataset.
• Visualizing the dataset.
• Evaluating some algorithms.
• Making some predictions.
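The loading, evaluating, and predicting steps can be sketched end to end as follows. This is one possible arrangement, not the slides' own code: the three models, the 80/20 split, the 10-fold cross-validation, and the random seeds are all assumptions, and the summarizing and visualizing steps are left out to keep it short.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Load the dataset (bundled copy; an assumption, see lead-in).
X, y = load_iris(return_X_y=True)

# Hold back 20% of the rows so the final check uses data the models never saw.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1, stratify=y
)

# Evaluate a few algorithms with 10-fold cross-validation on the training part.
models = {
    "LR": LogisticRegression(max_iter=1000),
    "KNN": KNeighborsClassifier(),
    "CART": DecisionTreeClassifier(random_state=1),
}
cv_scores = {}
for name, model in models.items():
    scores = cross_val_score(model, X_train, y_train, cv=10, scoring="accuracy")
    cv_scores[name] = scores.mean()
    print(f"{name}: mean accuracy {scores.mean():.3f} (std {scores.std():.3f})")

# Make predictions with the best-scoring model on the held-out rows.
best_name = max(cv_scores, key=cv_scores.get)
best_model = models[best_name].fit(X_train, y_train)
test_accuracy = accuracy_score(y_test, best_model.predict(X_test))
print(f"Best model: {best_name}, held-out accuracy {test_accuracy:.3f}")
```

Selecting the model on cross-validation scores and only then touching the held-out rows keeps the final accuracy an honest estimate.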

Page 8: Python and data analytics

Basic libraries in Python

NumPy's array type augments the Python language with an efficient data structure useful for numerical work, e.g., manipulating matrices. NumPy also provides basic numerical routines, such as tools for finding eigenvectors.
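As a small illustration of both points, the sketch below builds a matrix as a NumPy array and computes its eigenvectors. The matrix is an arbitrary example, and `np.linalg.eigh` is chosen here because the matrix is symmetric.

```python
import numpy as np

# A small symmetric matrix stored as a NumPy array (arbitrary example values).
A = np.array([[2.0, 0.0],
              [0.0, 3.0]])
v = np.array([1.0, 1.0])

# Matrix-vector product.
product = A @ v
print(product)

# Eigenvalues (ascending) and eigenvectors (as columns) of a symmetric matrix.
eigenvalues, eigenvectors = np.linalg.eigh(A)
print(eigenvalues)
print(eigenvectors)
```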

SciPy contains additional routines needed in scientific work: for example, routines for computing integrals numerically, solving differential equations, optimization, and sparse matrices.
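Two of these routines in use, as a minimal sketch: numerical integration with `scipy.integrate.quad` and one-dimensional minimization with `scipy.optimize.minimize_scalar`. The functions being integrated and minimized are arbitrary examples.

```python
from scipy import integrate, optimize

# Numerically integrate x**2 from 0 to 1; quad returns the estimate and an error bound.
area, abs_error = integrate.quad(lambda x: x ** 2, 0.0, 1.0)
print(area, abs_error)

# Find the x that minimizes (x - 2)**2.
result = optimize.minimize_scalar(lambda x: (x - 2.0) ** 2)
print(result.x)
```

The integral is 1/3 analytically and the minimum sits at x = 2, so both calls can be checked by hand.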

The matplotlib module produces high quality plots. With it you can turn your data or your models into figures for presentations or articles. No need to do the numerical work in one program, save the data, and plot it with another program.
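A minimal sketch of computing and plotting in the same program follows. The non-interactive "Agg" backend and the output file name `sine_example.png` in the system temporary directory are assumptions made so the script runs without a display.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend: render to a file, no window needed
import matplotlib.pyplot as plt
import numpy as np

# Do the numerical work...
x = np.linspace(0.0, 2.0 * np.pi, 200)
y = np.sin(x)

# ...and plot it in the same program.
fig, ax = plt.subplots()
ax.plot(x, y, label="sin(x)")
ax.set_xlabel("x")
ax.set_ylabel("sin(x)")
ax.set_title("A simple line plot")
ax.legend()

# Save the figure for use in a presentation or article (hypothetical file name).
output_path = os.path.join(tempfile.gettempdir(), "sine_example.png")
fig.savefig(output_path, dpi=150)
plt.close(fig)
print("Saved", output_path)
```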

Page 9: Python and data analytics

Pandas is built on top of NumPy and adds its own data structures, chiefly the Series and the DataFrame, which together make it a very powerful module.

Pandas is great for data manipulation, data analysis, and data visualization.

The Pandas module uses these array-backed objects to make data analysis considerably faster than equivalent plain-Python loops. With it, we can easily read from and write to CSV files, or even databases.

From there, we can manipulate the data by columns, create new columns, and even base the new columns on other column data.
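The sketch below shows these operations on a tiny in-memory CSV. The column names and the three rows are illustrative stand-ins, and reading from a string via `io.StringIO` is used only so the example needs no file on disk.

```python
import io

import pandas as pd

# A tiny CSV held in a string (illustrative rows, hypothetical column names).
csv_text = """sepal_length,sepal_width,species
5.1,3.5,setosa
7.0,3.2,versicolor
6.3,3.3,virginica
"""

# Read CSV data into a DataFrame.
df = pd.read_csv(io.StringIO(csv_text))

# Create a new column based on other columns.
df["sepal_ratio"] = df["sepal_length"] / df["sepal_width"]

# Manipulate by column: keep only the rows with wide sepals.
wide = df[df["sepal_width"] > 3.25]
print(wide)

# Write the result back out as CSV text.
csv_out = df.to_csv(index=False)
print(csv_out)
```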

The scikit-learn library offers:
• Simple and efficient tools for data mining and data analysis
• Accessible to everybody, and reusable in various contexts
• Built on NumPy, SciPy, and matplotlib
• Open source, commercially usable

Page 10: Python and data analytics

NumPy: Base n-dimensional array package

SciPy: Fundamental library for scientific computing

Matplotlib: Comprehensive 2D/3D plotting

IPython: Enhanced interactive console

SymPy: Symbolic mathematics

Pandas: Data structures and analysis

Page 11: Python and data analytics

1. Downloading, Installing and Starting Python SciPy

1.1 Install SciPy Libraries

There are 5 key libraries that you will need to install. Below is a list of the Python SciPy libraries required for this tutorial:

• scipy
• numpy
• matplotlib
• pandas
• sklearn
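One way to confirm that the five libraries are installed is to import each and print its version, as sketched below. How they get installed is left open here; with pip, the package that provides the `sklearn` module is named `scikit-learn`.

```python
import sys

import matplotlib
import numpy
import pandas
import scipy
import sklearn

# Collect the version string of each required library.
versions = {
    "scipy": scipy.__version__,
    "numpy": numpy.__version__,
    "matplotlib": matplotlib.__version__,
    "pandas": pandas.__version__,
    "sklearn": sklearn.__version__,
}

print("Python:", sys.version)
for name, version in versions.items():
    print(f"{name}: {version}")
```

If any import fails, that library is missing from the current environment and needs to be installed before continuing.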

Page 23: Python and data analytics

http://machinelearningmastery.com/machine-learning-in-python-step-by-step/