NIT HAMIRPUR TOPIC :- DATA ANALYTICS PRESENTED BY :- Bhanu Pratap EED, NIT Hamirpur
Data analytics

Apr 16, 2017

Data & Analytics

Bhanu Pratap
Transcript
Page 1: Data analytics

NIT HAMIRPUR

TOPIC :-

DATA ANALYTICS

PRESENTED BY :- Bhanu Pratap, EED, NIT Hamirpur

Page 2: Data analytics

Data vs. Information: Data are simply facts or figures: bits of information, but not information itself. When data are processed, interpreted, organized, structured or presented to make them meaningful or useful, they are called information.

Information provides context for data.

Examples of Data and Information: The history of temperature readings all over the world for the past 100 years is data. If this data is organized and analyzed to find that global temperature is rising, then that is information.

Page 3: Data analytics

Data is everywhere: Nowadays, everyone has to deal with mounds of data, whether they call themselves “data analysts” or not. But people who possess a toolbox of data analysis skills have a massive edge on everyone else, because:
• They understand what to do with all that stuff.
• They know how to translate raw numbers into intelligence that drives real-world action.
• They know how to break down and structure complex problems and data sets to get right to the heart of problems in their business.

Page 4: Data analytics

Data Analytics: Data analytics is the science of examining raw data with the purpose of converting it into information that users can apply to make decisions or draw conclusions. Data is collected and analyzed to answer questions, test hypotheses or disprove theories.

Data analytics involves applying an algorithmic or mechanical process to derive insights, for example running through a number of data sets to look for meaningful correlations among them.
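As a minimal sketch of such a correlation search, assuming three small made-up series (the names and numbers are invented for illustration and are not from the presentation), the Pearson correlation of every pair of data sets could be computed like this:

```python
from itertools import combinations
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical data sets, invented for illustration only.
datasets = {
    "advertising": [10, 20, 30, 40, 50],
    "sales": [25, 45, 65, 85, 105],
    "returns": [5, 3, 6, 2, 4],
}

# Correlate every pair of data sets, then keep only the strong relationships.
correlations = {
    (a, b): pearson(datasets[a], datasets[b])
    for a, b in combinations(datasets, 2)
}
strong = {pair: r for pair, r in correlations.items() if abs(r) >= 0.8}

for pair, r in correlations.items():
    print(pair, round(r, 3))
```

The 0.8 cut-off is an arbitrary assumption. A strong correlation only flags a pair worth investigating and does not by itself show that one variable causes the other.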

The focus of Data Analytics lies in inference, which is the process of deriving conclusions that are solely based on what the researcher already knows. 

Page 5: Data analytics

Methodology:
Data collection
1. Calibration
2. Data management
3. Data cleaning

Exploratory data analysis

Modeling and algorithms

Data Mining

Data Visualization

Page 6: Data analytics

Data collection:

Page 7: Data analytics

Data Management: Data management comprises all the disciplines related to managing data as a valuable resource.

Data Cleaning: Data cleansing is hard to do, hard to maintain, and hard to know where to start. There always seem to be errors, duplicates, or format inconsistencies.

One of the most challenging aspects of data cleansing is maintaining a clean list of data, whether it is sourced from multiple vendors, manually entered by your hard-working interns, or a combination of both.

A single mistyped entry can create a myriad of problems within your database and lead to hours of manual cleansing that could easily have been avoided. So what is the solution to these frustrating, time-consuming problems?

Page 8: Data analytics

A simple, five-step data cleansing process can help you target the areas where your data is weak and needs more attention:
1. Plan
2. Analyze to Cleanse
3. Implement Automation
4. Append Missing Data
5. Monitor
From the first planning stage up to the last step of monitoring your cleansed data, the process will help your team zero in on duplicates and other problems within your data. You can start small and make incremental changes, repeating the process several times to continue improving data quality.

Page 9: Data analytics

Plan: When looking at data, you should focus on high-priority data and start small. The fields you will want to identify will be unique to your business and the information you are specifically looking for, but they may include job title, role, email address, phone, industry, revenue, etc.

It is beneficial to create and put in place specific validation rules at this point to standardize and cleanse the existing data, as well as to automate this process for the future. For example, make sure your postal codes and state codes agree and that addresses are all standardized the same way. Seek out your IT team members for help with setting these up! They can help with more than just removing a virus.
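As an illustration of the validation rules mentioned in the planning step, here is a rough sketch. The field names (`email`, `state`, `postal_code`) and the small prefix table are assumptions made for this example; a real rule set would rely on an authoritative postal database.

```python
import re

# Hypothetical, deliberately simplified table of postal-code prefixes per
# state code. It is an assumption for illustration, not a complete rule set.
STATE_ZIP_PREFIXES = {
    "NY": ("10", "11", "12", "13", "14"),
    "CA": ("90", "91", "92", "93", "94", "95", "96"),
}

# Loose shape check only: something@something.something, no spaces.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def validate(record):
    """Return a list of rule violations for one contact record (a dict)."""
    problems = []
    if not EMAIL_PATTERN.match(record.get("email", "")):
        problems.append("invalid email")
    state = record.get("state", "").upper()
    postal = record.get("postal_code", "")
    prefixes = STATE_ZIP_PREFIXES.get(state)
    if prefixes is None:
        problems.append("unknown state code")
    elif not postal.startswith(prefixes):
        problems.append("postal code does not match state")
    return problems


print(validate({"email": "bad", "state": "CA", "postal_code": "10001"}))
```

Returning a list of violations, rather than a simple pass/fail flag, makes it easier to route exceptions to whoever cleanses them manually.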

Page 10: Data analytics

Analyze to Cleanse: After you have an idea of the priority data your company desires, it is important to go through the data you already have in order to see what is missing, what can be thrown out, and what gaps, if any, remain.

You will also need to identify a set of resources to handle and manually cleanse exceptions to your rules. The amount of manual intervention needed depends directly on the level of data quality you consider acceptable. Once you build out a list of rules or standards, it will be much easier to actually begin cleansing.

Page 11: Data analytics

Implement Automation: Once you have begun to cleanse, you should begin to standardize and cleanse the flow of new data as it enters the system by creating scripts or workflows. These can be run in real time or in batch (daily, weekly, monthly), depending on how much data you are working with. These routines can be applied to new data or to previously keyed-in data.

Append Missing Data: Step four is especially important for records that cannot be automatically corrected. Examples include emails, phone numbers, industry, company size, etc. It is important to identify the correct way of getting hold of the missing data, whether from third-party append sites, by reaching out to the contacts, or just via good old-fashioned Google.
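The kind of cleansing script described in the automation step might look roughly like the sketch below. It assumes each record is a dict with `name`, `email` and `phone` fields and that the email address identifies a contact; both are assumptions for illustration.

```python
def standardize(record):
    """Return a copy of a contact record with normalized field formats."""
    return {
        "name": " ".join(record["name"].split()).title(),
        "email": record["email"].strip().lower(),
        "phone": "".join(ch for ch in record["phone"] if ch.isdigit()),
    }


def cleanse_batch(records):
    """Standardize a batch and drop duplicates, keyed on the email address."""
    seen = set()
    cleaned = []
    for record in records:
        row = standardize(record)
        if row["email"] in seen:
            continue  # duplicate of a record kept earlier in the batch
        seen.add(row["email"])
        cleaned.append(row)
    return cleaned


# Hypothetical batch, invented for illustration only.
batch = [
    {"name": "  asha   rao ", "email": "Asha.Rao@Example.com ", "phone": "(555) 010-1234"},
    {"name": "Asha Rao", "email": "asha.rao@example.com", "phone": "555-010-1234"},
    {"name": "ravi kumar", "email": "ravi@example.com", "phone": "555 010 9999"},
]
print(cleanse_batch(batch))
```

The same function can be run on each new record in real time or over a nightly batch. Records it cannot fix, such as those with a missing phone number, are the ones left for the append step.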

Page 12: Data analytics

Monitor: You will want to set up a periodic review so that you can monitor issues before they become a major problem.

You should be monitoring your database as a whole as well as in individual units: the contacts, accounts, etc.

You should also be aware of bounce rates, and keep track of bounced emails as well as response rates.

It’s important to keep up-to-date.
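A periodic review of bounce rates could be sketched as follows. The 5% alert threshold and the weekly figures are assumptions chosen for illustration, not recommended values.

```python
def bounce_rate(sent, bounced):
    """Fraction of sent emails that bounced; 0.0 when nothing was sent."""
    return bounced / sent if sent else 0.0


def review(campaigns, threshold=0.05):
    """Return names of campaigns whose bounce rate exceeds the threshold."""
    return [
        name
        for name, (sent, bounced) in campaigns.items()
        if bounce_rate(sent, bounced) > threshold
    ]


# Hypothetical weekly figures: (emails sent, emails bounced).
campaigns = {"week_1": (1000, 20), "week_2": (1200, 90), "week_3": (0, 0)}
flagged = review(campaigns)
print(flagged)
```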

Page 13: Data analytics

The end of this cycle, or step six if you will, is to bring the whole process full circle. Revisit your plans from the first step and reevaluate. Can your priorities be changed? Do the rules you implemented still fit into your overall business strategy? Pinpointing these necessary changes will equip you to work through the cycle again: make changes that benefit your process, and conduct periodic reviews to make sure that your data cleansing is running smoothly and accurately.

Follow this cycle and you’ll be well on your way to having the cleanest and thus most effective data.

Page 14: Data analytics

Exploratory Data Analysis (EDA):

Once the data is cleaned, it can be analyzed. Analysts may apply a variety of techniques referred to as exploratory data analysis to begin understanding the messages contained in the data. Exploratory data analysis (EDA) is an approach to analyzing data sets to summarize their main characteristics, often with visual methods.

The process of exploration may result in additional data cleaning or additional requests for data, so these activities may be iterative in nature. 

Descriptive statistics such as the average or median may be generated to help understand the data. 
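For instance, a first pass of descriptive statistics might look like the following sketch. The sample of daily order counts is made up for illustration.

```python
from statistics import mean, median, stdev

# Hypothetical sample: daily order counts, with one unusually large day.
orders = [12, 15, 11, 14, 13, 95, 12]

summary = {
    "count": len(orders),
    "min": min(orders),
    "max": max(orders),
    "mean": round(mean(orders), 2),
    "median": median(orders),
    "stdev": round(stdev(orders), 2),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

Here the mean (24.57) sits far above the median (13), which hints at an outlier. That is the kind of finding that can send the analyst back for additional data cleaning.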

Page 15: Data analytics

Modeling and Algorithms: Mathematical formulas or models called algorithms may be applied to the data to identify relationships among the variables, such as correlation or causation. In general terms, models may be developed to evaluate a particular variable in the data based on other variable(s) in the data, with some residual error depending on model accuracy (i.e., Data = Model + Error).

Inferential statistics includes techniques to measure relationships between particular variables. For example, analysis may be used to model whether a change in advertising (independent variable x) explains the variation in sales (dependent variable y). In mathematical terms, y (sales) is a function of x (advertising). It may be described as y = ax + b + error, where the model is designed such that a and b minimize the error when the model predicts y for a given range of values of x. Analysts may attempt to build models that are descriptive of the data to simplify analysis and communicate results.
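As a small worked example of y = ax + b + error, the sketch below fits a and b by ordinary least squares, one common way of minimizing the error. The advertising and sales figures are invented for illustration.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a*x + b; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    b = my - a * mx
    return a, b


# Hypothetical figures: advertising spend (x) and sales (y).
advertising = [1, 2, 3, 4, 5]
sales = [3, 5, 7, 9, 12]

a, b = fit_line(advertising, sales)

# The "error" term: what the fitted line leaves unexplained at each point.
residuals = [y - (a * x + b) for x, y in zip(advertising, sales)]

print(round(a, 3), round(b, 3))
```

For this data the fit is approximately a = 2.2 and b = 0.6. The residuals sum to zero up to rounding, which is a property of a least squares line that includes an intercept.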

Page 16: Data analytics

Data Mining: Data mining is the process of finding anomalies, patterns and correlations within large data sets to predict outcomes. Using a broad range of techniques, you can use this information to increase revenues, cut costs, improve customer relationships, reduce risks and more.

Its foundation comprises three intertwined scientific disciplines:
• Statistics (the numeric study of data relationships),
• Artificial intelligence (human-like intelligence displayed by software and/or machines), and
• Machine learning (algorithms that can learn from data to make predictions).
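To make "finding anomalies" concrete, here is a minimal sketch that uses a z-score rule, one simple statistical technique among many. The threshold of 2 standard deviations and the transaction amounts are assumptions for illustration.

```python
from statistics import mean, pstdev


def find_anomalies(values, threshold=2.0):
    """Return values whose z-score magnitude exceeds the threshold."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical transaction amounts with one value far from the rest.
amounts = [52, 48, 50, 51, 49, 250, 47, 53]
print(find_anomalies(amounts))
```

A single extreme value inflates both the mean and the standard deviation, so on real data sets more robust measures, such as the median-based ones, are often preferred.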

Page 17: Data analytics

Over the last decade, advances in processing power and speed have enabled us to move beyond manual, tedious and time-consuming practices to quick, easy and automated data analysis.

The more complex the data sets collected, the more potential there is to uncover relevant insights.

Retailers, banks, manufacturers, telecommunications providers and insurers, among others, are using data mining to discover relationships among everything from pricing, promotions and demographics to how the economy, risk, competition and social media are affecting their business models, revenues, operations and customer relationships.

Page 18: Data analytics

Data Visualization: Data visualization is the presentation of data in a pictorial or graphical format. It enables decision makers to see analytics presented visually, so they can grasp difficult concepts or identify new patterns.

Computers made it possible to process large amounts of data at lightning-fast speeds. Today, data visualization has become a rapidly evolving blend of science and art that is certain to change the corporate landscape over the next few years.

Patterns, trends and correlations that might go undetected in text-based data can be exposed and recognized easily with data visualization software.

Page 19: Data analytics

Example of Data visualization:

Page 20: Data analytics

Application: Data analytics is used in a number of industries to allow organizations and companies to make better decisions as well as verify or disprove existing theories or models.

Healthcare:
• The main challenge for hospitals, as cost pressures tighten, is to treat as many patients as they can efficiently while improving the quality of care.
• Instrument and machine data is increasingly being used to track and optimize patient flow, treatment, and equipment use in hospitals.
• It is estimated that a 1% efficiency gain could yield more than $63 billion in global health care savings.

Page 21: Data analytics

Travel:
• Data analytics can optimize the buying experience through analysis of mobile/web log and social media data.
• Travel sites can gain insights into customers' desires and preferences.
• Products can be up-sold by correlating current sales with subsequent browsing, increasing browse-to-buy conversions via customized packages and offers.
• Personalized travel recommendations can also be delivered by data analytics based on social media data.

Gaming:
• Data analytics helps in collecting data to optimize spend within as well as across games.
• Game companies gain insight into the likes, dislikes, and relationships of their users.

Page 22: Data analytics

Energy Management:
• Most firms are using data analytics for energy management, including smart-grid management, energy optimization, energy distribution, and building automation in utility companies.
• The application here is centered on controlling and monitoring network devices, dispatching crews, and managing service outages.
• Utilities gain the ability to integrate millions of data points on network performance, which lets engineers use analytics to monitor the network.

Page 23: Data analytics

Meter Data Analytics: Meter data analytics refers to the analysis of data emitted by electric smart meters that record consumption of electric energy.

Replacement of traditional scalar meters with smart meters is a growing trend, primarily in North America and Europe.

These smart meters send usage data to the central head-end systems as often as every minute from each meter, whether installed at a residential, commercial, or industrial customer.

Analyzing this voluminous data is as crucial to utility companies as collecting the data itself. Some of the major reasons for the analysis are:
• Making efficient energy buying decisions based on usage patterns,
• Launching energy efficiency or energy rebate programs,
• Detecting energy theft,
• Comparing and correcting metering service provider performance, and
• Detecting and reducing unbilled energy.
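A very rough sketch of one such analysis is shown below. It sums interval readings into daily totals per meter and flags meters whose latest day falls far below their own earlier average, which could be a starting point for a theft or metering-fault review. The reading format `(meter_id, day, kwh)`, the 50% ratio and the sample values are all assumptions for illustration; a sudden drop can also have innocent causes.

```python
from collections import defaultdict


def daily_usage(readings):
    """Sum (meter_id, day, kwh) interval readings into per-meter daily totals."""
    totals = defaultdict(lambda: defaultdict(float))
    for meter_id, day, kwh in readings:
        totals[meter_id][day] += kwh
    return totals


def flag_sudden_drops(totals, ratio=0.5):
    """Flag meters whose latest day is below `ratio` times their earlier average."""
    flagged = []
    for meter_id, days in totals.items():
        ordered = [days[d] for d in sorted(days)]  # ISO dates sort by time
        if len(ordered) < 2:
            continue  # no baseline to compare against
        baseline = sum(ordered[:-1]) / len(ordered[:-1])
        if ordered[-1] < ratio * baseline:
            flagged.append(meter_id)
    return sorted(flagged)


# Hypothetical interval readings, invented for illustration only.
readings = [
    ("M1", "2017-04-01", 6.0), ("M1", "2017-04-01", 6.5),
    ("M1", "2017-04-02", 12.0), ("M1", "2017-04-03", 11.5),
    ("M2", "2017-04-01", 10.0), ("M2", "2017-04-02", 9.0),
    ("M2", "2017-04-03", 1.5), ("M2", "2017-04-03", 1.0),
]
totals = daily_usage(readings)
print(flag_sudden_drops(totals))
```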

Page 24: Data analytics

References:
http://www.diffen.com/difference/Data_vs_Information
https://en.wikipedia.org/wiki/Meter_data_analytics
http://searchdatamanagement.techtarget.com/definition/data-analytics
https://www.simplilearn.com/data-science-vs-big-data-vs-data-analytics-article
http://www.carboncredentials.com/data-visualization-smart-meters-a-first-hand-account/
http://searchbusinessanalytics.techtarget.com/definition/data-visualization
http://www.sas.com/en_us/insights/big-data/data-visualization.html
https://en.wikipedia.org/wiki/Exploratory_data_analysis

Page 25: Data analytics

Thanks!