Data Mining and Analysis With
Excel PivotTables and
The QI Macros By Jay Arthur, The KnowWare® Man
It’s an old but true saying that what gets measured gets done. That’s why so many
companies are looking for ways to measure their performance that drive better
results. This report will show you how to:
1. Use Pivot Tables to mine your data.
2. Use the QI Macros to chart your data.
3. Apply other tips and tricks.
If you measure the wrong thing, you get wrong-headed behavior. If you measure the
right thing, you get right behavior. It’s that simple. So before you start setting up a pivot
table, you need to make sure you’re measuring the things that will drive your business
forward, not backward.
To succeed at Six Sigma or any process improvement effort, you'll often have to
analyze and summarize text data. Most companies have lots of transaction data from "flat
files" like the one shown below, but because the data consists of text and raw numbers,
they sometimes have a hard time figuring out what to do with it.
Use Pivot Tables to Perform More Detailed Analysis
Many processes produce one code or measurement each time an event happens. These often need to be summarized to simplify your analysis. Do you ever collect event or transaction data like these product defects?
Summarizing these by hand would be insanely time consuming. Excel's Pivot Table tool will help you summarize your data just about any way you want.
1. Select the labels and data to be summarized. In this case, select columns A:J.
2. From Excel's pull-down menu, choose: DATA-Pivot Table and Pivot Chart Report. The Pivot-Table Wizard will guide you. The defaults are to analyze data from an Excel list or database (which is what we’re doing) and create a
6. Click and drag the data labels into the appropriate area of the pivot table to get the summarization you want. In this case I chose to summarize defect code #1 by vendor. I dragged the vendor code into the Page Field so that I can select by vendor code:
7. Notice that the data is organized by a count of defect codes by product, but I might want to identify the most frequent type of error to focus my analysis. To do this, click on PivotTable-Field Settings to reveal this window:
Then click on the Advanced Button and change the AutoSort options to descending and the Using field to Count of # of Defects 1 and click OK:
This will sort the data in descending order by number of defects:
8. To change how the data is summarized, use the pivot table wizard or double click on the top left-hand cell. For online tutorials, Google "Excel Pivot Table". In this case, instead of counts of the number of products with this defect, what if we want to count the total number of defects? Double click on “Count of # of Defects 1” and change the "Summarize by" to “Sum”:
Notice that this view shows a very different error, 354, as key to improvement.
9. Select labels and totals, and draw charts using your summarized data. Now you can just select the data in cells A4:B65, or you can click on the PivotTable button and choose PivotChart:
This starts to give you a Pareto chart perspective: four errors (354, 371, 372, 261) account for a high percentage of the total errors, while dozens of error codes account for very little of the total problem. Time for some root cause analysis on each of these key errors.
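The summarization steps above can also be sketched outside Excel. The Python below is only an illustration: the rows, the field layout (vendor, defect code, number of defects) and every number in it are made up for the example and are not taken from the worksheet in this report. It shows how counting rows per defect code can point to a different top error than summing the number of defects, and it adds the cumulative percentage that a Pareto chart displays.

```python
from collections import defaultdict

# Hypothetical transaction rows: (vendor, defect_code, number_of_defects).
rows = [
    ("V1", "354", 40),
    ("V1", "371", 5),
    ("V2", "371", 6),
    ("V2", "371", 4),
    ("V2", "261", 10),
    ("V1", "100", 1),
]

def summarize(rows, vendor=None):
    """Return {defect_code: (row_count, defect_sum)}, optionally for one vendor."""
    totals = defaultdict(lambda: [0, 0])
    for row_vendor, code, defects in rows:
        if vendor is not None and row_vendor != vendor:
            continue  # plays the role of selecting a vendor in the Page Field
        totals[code][0] += 1        # "Count of" summary
        totals[code][1] += defects  # "Sum of" summary
    return {code: tuple(pair) for code, pair in totals.items()}

def pareto(summary, index):
    """Sort descending by count (index 0) or sum (index 1), with cumulative %."""
    ordered = sorted(summary.items(), key=lambda item: item[1][index], reverse=True)
    grand_total = sum(values[index] for _, values in ordered)
    running = 0
    result = []
    for code, values in ordered:
        running += values[index]
        result.append((code, values[index], round(100 * running / grand_total, 1)))
    return result

summary = summarize(rows)
by_count = pareto(summary, 0)  # most frequent code first
by_sum = pareto(summary, 1)    # code with the most total defects first

for code, total, cumulative in by_sum:
    print(code, total, cumulative)
```

With this made-up data, code 371 leads when rows are counted, while code 354 leads when defects are summed, which mirrors the change of view described in step 8.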
Multiple Column Analysis
What if you have multiple columns of error codes as we have in this example? Excel will let you analyze these as well:
In this example, COUNTIF will look for every cell that begins with any number of
characters (“=*”), followed by whatever is in the keyword cell (&B2&), followed by any
number of characters (“*”). The asterisk is a wildcard character that matches zero or
more characters.
Wildcard characters you can use to find text or numbers
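To find text or numbers that have some characters or digits in common, use a wildcard
character. A wildcard character stands in for unspecified characters: exactly one
character for the question mark, and any number of characters (including none) for the
asterisk.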
Use                                 To find
? (question mark)                   Any single character in the same position as the question mark.
                                    For example, "sm?th" finds "smith" and "smyth".
* (asterisk)                        Any number of characters in the same position as the asterisk.
                                    For example, "*east" finds "Northeast" and "Southeast".
~ (tilde) followed by ?, *, or ~    A question mark, asterisk, or tilde.
                                    For example, "What~?" finds "What?".
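To make the three wildcard rules concrete, here is a small Python sketch that translates a wildcard pattern into a regular expression. It is a simplified model for illustration, not Excel's actual matching code: the function names are hypothetical, it assumes the pattern must match the whole cell, it ignores case, and it treats a tilde that is not followed by ?, * or ~ as an ordinary character.

```python
import re

def wildcard_to_regex(pattern):
    """Translate ?, * and ~-escapes into a case-insensitive regex."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "~" and i + 1 < len(pattern) and pattern[i + 1] in "?*~":
            parts.append(re.escape(pattern[i + 1]))  # literal ?, * or ~
            i += 2
            continue
        if ch == "?":
            parts.append(".")    # exactly one character
        elif ch == "*":
            parts.append(".*")   # zero or more characters
        else:
            parts.append(re.escape(ch))  # any other character matches itself
        i += 1
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)

def matches(pattern, text):
    """Return True if the whole text matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(text) is not None

print(matches("sm?th", "smyth"))      # True
print(matches("*east", "Northeast"))  # True
print(matches("What~?", "What?"))     # True
print(matches("What~?", "Whats"))     # False
```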
I could also use the wildcard characters to match any occurrence of M*D*C*R, which will give me a close approximation to the total number of Medicare errors.
I might also want to know how many were denied and how many were transferred. It’s easy to add keywords and counts using COUNTIF and the keyword formula.
These counts can then be graphed as bar charts (to show comparison in size) or Pareto
charts using the QI Macros. These can become part of your dashboard as well.
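The keyword counts described above can be sketched in a few lines of Python. This is an illustration under stated assumptions: the comment strings and the keyword list are invented for the example, and the helper name count_containing is hypothetical. Like a COUNTIF criterion that wraps a keyword in asterisks, it counts every entry that contains the keyword anywhere in the text, ignoring case.

```python
# Hypothetical comment-field entries; real data would come from the worksheet.
comments = [
    "Claim denied - Medicare",
    "Transferred to billing",
    "MEDICARE claim denied",
    "Patient transferred",
    "Medicare form missing",
    "No issue found",
]

# Hypothetical keyword list, one entry per keyword cell.
keywords = ["medicare", "denied", "transferred"]

def count_containing(comments, keyword):
    """Count comments that contain the keyword anywhere, ignoring case."""
    needle = keyword.lower()
    return sum(1 for text in comments if needle in text.lower())

counts = {keyword: count_containing(comments, keyword) for keyword in keywords}

for keyword, count in counts.items():
    print(keyword, count)
```

A dictionary of keyword counts like this maps directly onto the two columns (label and total) that a bar chart or Pareto chart needs.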
As you can see, this makes comment field analysis much simpler than most people
can imagine. Where pivot tables work on fields that have fixed values, COUNTIF works