8/7/2019 23 Summary New
1/63
Quantitative Methods and Business
Statistics for Decision Making
(MSA606)
A. Ramesh, Department of Mechanical Engineering, NIT Calicut
Email:[email protected], [email protected]
Phone:0495-228-6540
What is a Decision?
Decision
A reasoned choice among alternatives
Examples:
Where to advertise a new product
What stock to buy
What movie to see
Where to go for dinner
Where to locate a new plant
Which mode of transportation to choose
Decision Elements
Decision Statement
What are we trying to decide?
Alternative:
What are the options?
Decision Criteria:
How are we going to judge the merits of each alternative?
Types of Decisions
Type of structure (nature of task): Structured, Unstructured
Level of decision making (scope): Strategic, Managerial, Operational
Observation
We face numerous decisions in life and business.
We can use statistics to analyze the potential outcomes of decision alternatives.
Quantitative Analysis
Quantitative Analysis Process:
Model Development
Data Preparation
Model Solution
Report Generation
General Modeling Scheme
[Figure: diagram linking Reality, Model, Solution, and Interpretation, with arrows labelled Assumptions, Approximations, Algorithm, Heuristic, Analysis, and Implementation]
What Statisticians Do
Statisticians look for patterns in data to help make decisions in business, industry, and the biological, physical, psychological, and social sciences.
Statisticians help make important advances in scientific research and work in opinion polling, market research, survey management, data analysis, statistical experiments, and education.
Statisticians use quantitative abilities, statistical knowledge, and computing and communication skills to collaborate with other scientists on challenging problems.
Statistics
The science of data to answer research questions
Formulate a research question(s) (hypothesis)
Collect data
Analyze and summarize data
Draw conclusions to answer research question(s): statistical inference
In the presence of variation
Variation
What if everyone:
Looked the same
Thought the same
Believed the same
Variation
Populations with variation:
Everyone looks different
Everyone thinks differently
Everyone believes differently
Variation
Variation is everywhere:
Individuals
Repeated measurements on the same individual
Almost everything varies over time
Because variation is everywhere, statistical conclusions are not certain. They are expressed as:
Probability statement
Confidence statement
Margin of error
Where the Data Come From is Important
Good data: intelligent human effort
Bad data: laziness, lack of understanding, or a desire to mislead
Know where the data come from
Understand statistics
Example: Did you know that 45% of statistics are made up on the spot?
Manipulating the Facts
Data collection: sampling and measurement biases, ignoring influential variables
Data summarization: graphically misrepresenting data, choosing misleading statistics
Statistical inference: reporting invalid conclusions and interpretations
Manipulating Data Collection
Sampling biases: one group in a population is overrepresented compared to another.
Manipulating Data Summarization
Graphically misrepresenting data
Understanding Data
Individuals & Variables
Individuals: objects described by a set of data. May be people, animals, or things. Also called subjects or units.
Variables: any characteristic of an individual. A variable can take different values for different individuals.
Statistical Concepts & Tools
Data representation
Various probability distributions
  Discrete (Binomial, Geometric, Poisson, Uniform etc.)
  Continuous (Uniform, Exponential, Normal etc.)
Central Limit Theorem
Distribution of sample means
Point estimates
Confidence interval
Type I and Type II errors
Hypothesis testing
Regression: simple/multiple
ANOVA, non-parametric tests
Population Versus Sample
Population: the whole; a collection of persons, objects, or items under study
Census: gathering data from the entire population
Sample: a portion of the whole; a subset of the population
Parameter vs. Statistic
Parameter: descriptive measure of the population. Usually represented by Greek letters.
Statistic: descriptive measure of a sample. Usually represented by Roman letters.
Levels of Data Measurement
Nominal: lowest level of measurement
Ordinal
Interval
Ratio: highest level of measurement
Producing Data/Collecting Data
Sample surveys vs. experiments
Sample surveys: population snapshot
Experiments: impose treatment on subjects/units; observe response to imposed treatment
Common concern: Bias
Bias: systematically favors certain outcomes
Commonly used tables
Standard normal variate
t
Chi-square
F
Non-parametric
Central Limit Theorem
Most theory about sample means depends on the assumption that the mean comes from a normal distribution.
The Central Limit Theorem says that for any population (with finite variance), if the sample size is large enough, the sample means will be approximately normally distributed, with mean equal to the population mean and standard deviation equal to the population standard deviation divided by the square root of n ($\sigma/\sqrt{n}$).
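The statement above can be checked with a small simulation. The sketch below is only an illustration: the exponential population (mean 1, standard deviation 1), the sample size n = 50, the number of repetitions, and the function name `simulate_sample_means` are all assumptions chosen for the example, not part of the course material.

```python
import random
import statistics

def simulate_sample_means(n, reps, seed=0):
    """Draw `reps` samples of size `n` from an exponential population
    (mean 1, standard deviation 1) and return the list of sample means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = []
    for _ in range(reps):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means

n = 50
means = simulate_sample_means(n=n, reps=5000)

mean_of_means = statistics.fmean(means)   # should be close to the population mean, 1
sd_of_means = statistics.stdev(means)     # should be close to sigma / sqrt(n)
theoretical_sd = 1.0 / n ** 0.5

print(f"mean of sample means: {mean_of_means:.4f}")
print(f"sd of sample means:   {sd_of_means:.4f} (theory: {theoretical_sd:.4f})")
```

Although the exponential population is strongly skewed, the sample means cluster around 1 with a spread near $\sigma/\sqrt{n}$, which is what the theorem predicts.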
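Normal Distribution: Mother of all!
Normal variate: $X \sim N(\mu, \sigma^2)$; standard normal variate: $Z = (X - \mu)/\sigma \sim N(0, 1)$
$\chi^2$ (chi-square): square of Z
t distribution: small sample size
F distribution: ratio of two $\chi^2$ variates, each divided by its degrees of freedom
Approximation to discrete distributions: Binomial etc.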
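Confidence Interval to Estimate $\mu$ when n is Large
Point estimate:
$\bar{X} = \frac{\sum X}{n}$
Interval estimate:
$\bar{X} \pm Z \frac{\sigma}{\sqrt{n}}$
or
$\bar{X} - Z \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + Z \frac{\sigma}{\sqrt{n}}$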
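As a worked illustration of the large-sample interval for the population mean (sample mean plus or minus Z times sigma over root n), here is a minimal Python sketch. The function name `z_confidence_interval` and the numbers (sample mean 100, sigma 15, n = 36) are made up for the example; the Z value comes from the standard library's `statistics.NormalDist` instead of a printed table.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(xbar, sigma, n, confidence=0.95):
    """Return (lower, upper) for the mean: xbar +/- Z * sigma / sqrt(n)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # Z_{alpha/2}, about 1.96 for 95%
    margin = z * sigma / sqrt(n)             # margin of error
    return xbar - margin, xbar + margin

# Hypothetical sample: mean 100, known sigma 15, n = 36
low, high = z_confidence_interval(xbar=100.0, sigma=15.0, n=36)
print(f"95% CI for the mean: ({low:.2f}, {high:.2f})")
```

When sigma is unknown and n is large, the sample standard deviation s can be passed in place of `sigma`, as the hypothesis-testing steps later in these slides also assume.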
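Distribution of Sample Means for $100(1-\alpha)\%$ Confidence
[Figure: normal curve of sample means $\bar{X}$ centred at $\mu$ (Z = 0), with area $\alpha/2$ in each tail beyond $-Z_{\alpha/2}$ and $Z_{\alpha/2}$]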
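Probability Interpretation of the Level of Confidence
$\Pr\left[\bar{X} - Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}\right] = 1 - \alpha$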
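One way to read the probability statement for the level of confidence is as a long-run coverage rate: if the sampling is repeated many times, about a fraction 1 - alpha of the intervals contain the population mean. The simulation below is a sketch under assumed values (a normal population with mean 50 and sigma 10, n = 30, 2000 repetitions); the function name `coverage` is hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def coverage(mu, sigma, n, confidence, reps, seed=1):
    """Fraction of simulated intervals xbar +/- Z * sigma / sqrt(n) that contain mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    hits = 0
    for _ in range(reps):
        xbar = fmean([rng.gauss(mu, sigma) for _ in range(n)])
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / reps

rate = coverage(mu=50.0, sigma=10.0, n=30, confidence=0.95, reps=2000)
print(f"observed coverage: {rate:.3f} (nominal: 0.950)")
```

The observed rate fluctuates around 0.95 from run to run; any single interval either contains the mean or does not.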
Estimating the Population Variance
Population parameter: $\sigma^2$
Estimator of $\sigma^2$:
$S^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$
$\chi^2$ formula for a single variance:
$\chi^2 = \frac{(n - 1) S^2}{\sigma^2}$
degrees of freedom = n - 1
Confidence Interval for $\sigma^2$
$\frac{(n - 1) S^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n - 1) S^2}{\chi^2_{1 - \alpha/2}}$
df = n - 1
$\alpha = 1 - \text{level of confidence}$
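A short numeric sketch of the interval for the population variance, (n - 1)S^2 divided by the upper and lower chi-square critical values, follows. Python's standard library has no chi-square quantile function, so the two critical values are passed in as arguments, looked up from a chi-square table as on the "commonly used tables" slide. The sample values (n = 10, S^2 = 4) and the function name are assumptions for illustration; 19.0228 and 2.7004 are the tabulated values for df = 9 at upper-tail areas 0.025 and 0.975.

```python
def variance_confidence_interval(s2, n, chi2_upper, chi2_lower):
    """Return (lower, upper) for sigma^2.

    chi2_upper: table value with area alpha/2 to its right (the larger value)
    chi2_lower: table value with area 1 - alpha/2 to its right (the smaller value)
    """
    df = n - 1
    return df * s2 / chi2_upper, df * s2 / chi2_lower

# Hypothetical sample: n = 10, S^2 = 4.0, 95% confidence, df = 9
low, high = variance_confidence_interval(
    s2=4.0, n=10, chi2_upper=19.0228, chi2_lower=2.7004
)
print(f"95% CI for the variance: ({low:.3f}, {high:.3f})")
```

Note that the interval is not symmetric around S^2, because the chi-square distribution is skewed.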
Selected $\chi^2$ Distributions
[Figure: $\chi^2$ density curves for df = 3, df = 5, and df = 10, starting at 0]
Statistical Significance
Significance is a statistical term that tells how sure you are that a difference or relationship exists. To say that a significant difference or relationship exists only tells half the story.
We might be very sure that a relationship exists, but is it a strong, moderate, or weak relationship? After finding a significant relationship, it is important to evaluate its strength. Significant relationships can be strong or weak.
Significant differences can be large or small. With a large enough sample, even a small difference can be statistically significant.
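The dependence on sample size can be made concrete with a two-sided z test on the same observed difference at two sample sizes. This is a sketch with invented numbers (a difference of 0.1 with s = 1); the function name `two_sided_p_value` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p_value(diff, s, n):
    """p-value for an observed mean difference `diff`, using z = diff / (s / sqrt(n))."""
    z = diff / (s / sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Same small difference, two sample sizes
p_small_n = two_sided_p_value(diff=0.1, s=1.0, n=100)     # z = 1
p_large_n = two_sided_p_value(diff=0.1, s=1.0, n=10_000)  # z = 10

print(f"n = 100:    p = {p_small_n:.4f}")
print(f"n = 10000:  p = {p_large_n:.2e}")
```

The difference is equally small in both cases; only the certainty that it is non-zero has changed, which is why strength should be judged separately from significance.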
Steps in a Test of Hypothesis
1. Define the problem: determine $H_0$ and $H_A$. Select alpha ($\alpha$).
2. Collect data.
3. Calculate $\bar{x}$ as an estimate of $\mu$ and $s$ as an estimate of $\sigma$.
4. Check assumptions:
   Sample size n is reasonably large ($n \ge 30$), so the normal distribution can be used and $\sigma$ estimated with $s$.
   Check for outliers or strong skewness in the population distribution.
5. Calculate the standard score.
6. Compare with the tabulated value.
7. Make conclusions in the context of the problem.
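Steps 3, 5, and 6 above can be sketched for a large-sample, two-sided test of a mean. The data summary (sample mean 52, s = 6, n = 36, hypothesized mean 50) and the function name `one_sample_z_test` are assumptions made up for the example; the tabulated value is obtained from `statistics.NormalDist` rather than a printed table.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z_test(xbar, mu0, s, n, alpha=0.05):
    """Two-sided large-sample z test of H0: mu = mu0 against HA: mu != mu0.

    Returns (z, critical value, p-value, reject H0?).
    """
    std_normal = NormalDist()
    z = (xbar - mu0) / (s / sqrt(n))             # step 5: standard score
    z_crit = std_normal.inv_cdf(1 - alpha / 2)   # step 6: tabulated value
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    return z, z_crit, p_value, abs(z) > z_crit

# Hypothetical summary statistics with n >= 30
z, z_crit, p_value, reject = one_sample_z_test(xbar=52.0, mu0=50.0, s=6.0, n=36)
print(f"z = {z:.2f}, critical value = {z_crit:.2f}, p = {p_value:.4f}")
print("Reject H0" if reject else "Fail to reject H0")
```

Here the standard score of 2.00 exceeds the critical value of about 1.96, so the null hypothesis is rejected at alpha = 0.05; the final step is still to state what that means for the problem at hand.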
If the statistic is higher than the critical value from the table:
The finding is significant.
Reject the null hypothesis.
The probability is small that the difference or relationship happened by chance, and p is less than the critical alpha level (p < alpha).
If the statistic is lower than the critical value from the table:
The finding is not significant.
One fails to reject the null hypothesis.
The probability is high that the difference or relationship happened by chance, and p is greater than the critical alpha level (p > alpha).
Partition of Total Sum of Squares in RBD
SST (Total Sum of Squares) splits into:
  SSC (Treatment Sum of Squares)
  SSE (Error Sum of Squares), which in the randomized block design splits further into:
    SSR (Sum of Squares Blocks)
    SSE (Sum of Squares Error)
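To make the partition concrete, the sketch below computes the four sums of squares for a randomized block design with one observation per block-treatment cell, so that SST = SSC + SSR + SSE. The tiny data set (2 blocks by 3 treatments) and the function name `rbd_sums_of_squares` are invented for illustration.

```python
def rbd_sums_of_squares(data):
    """Sums of squares for a randomized block design.

    data[i][j] is the single observation for block i (row) under treatment j (column).
    """
    n_blocks = len(data)
    n_treatments = len(data[0])
    grand_mean = sum(sum(row) for row in data) / (n_blocks * n_treatments)

    block_means = [sum(row) / n_treatments for row in data]
    treatment_means = [
        sum(row[j] for row in data) / n_blocks for j in range(n_treatments)
    ]

    sst = sum((x - grand_mean) ** 2 for row in data for x in row)            # total
    ssc = n_blocks * sum((m - grand_mean) ** 2 for m in treatment_means)     # treatments
    ssr = n_treatments * sum((m - grand_mean) ** 2 for m in block_means)     # blocks
    sse = sst - ssc - ssr                                                    # error
    return {"SST": sst, "SSC": ssc, "SSR": ssr, "SSE": sse}

# Hypothetical data: 2 blocks (rows) x 3 treatments (columns)
ss = rbd_sums_of_squares([[1, 2, 3], [3, 4, 8]])
print(ss)
```

Removing the block sum of squares from the error term is the point of blocking: the remaining SSE is smaller than it would be in a one-way layout of the same data.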
Regression and Correlation
Regression analysis is the process of constructing a mathematical model or function that can be used to predict or determine one variable by another variable.
Correlation is a measure of the degree of relatedness of two variables.
Simple Regression Analysis
Bivariate (two variables) linear regression: the most elementary regression model
Dependent variable: the variable to be predicted, usually called Y
Independent variable: the predictor or explanatory variable, usually called X
Regression Models
Deterministic regression model: $Y = \beta_0 + \beta_1 X$
Probabilistic regression model: $Y = \beta_0 + \beta_1 X + \epsilon$
$\beta_0$ and $\beta_1$ are population parameters
$\beta_0$ and $\beta_1$ are estimated by sample statistics $b_0$ and $b_1$
Equation of the Simple Regression Line
$\hat{Y} = b_0 + b_1 X$
where:
$b_0$ = the sample intercept
$b_1$ = the sample slope
$\hat{Y}$ = the predicted value of Y
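Least Squares Analysis
$b_1 = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sum (X - \bar{X})^2} = \frac{\sum XY - n \bar{X} \bar{Y}}{\sum X^2 - n \bar{X}^2} = \frac{\sum XY - \frac{(\sum X)(\sum Y)}{n}}{\sum X^2 - \frac{(\sum X)^2}{n}}$
$b_0 = \bar{Y} - b_1 \bar{X} = \frac{\sum Y}{n} - b_1 \frac{\sum X}{n}$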
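Least Squares Analysis
$SS_{XY} = \sum (X - \bar{X})(Y - \bar{Y}) = \sum XY - \frac{(\sum X)(\sum Y)}{n}$
$SS_{XX} = \sum (X - \bar{X})^2 = \sum X^2 - \frac{(\sum X)^2}{n}$
$b_1 = \frac{SS_{XY}}{SS_{XX}}$
$b_0 = \bar{Y} - b_1 \bar{X} = \frac{\sum Y}{n} - b_1 \frac{\sum X}{n}$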
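The least squares formulas translate directly into code: the slope is the sum of cross-products SS_XY divided by the sum of squares SS_XX, and the intercept is the mean of Y minus the slope times the mean of X. The sketch below uses the computational (raw-sum) forms; the five data points and the function name `least_squares_line` are made up for illustration.

```python
def least_squares_line(xs, ys):
    """Return (b0, b1) for the fitted line Y-hat = b0 + b1 * X."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    # SS_XY = sum(XY) - (sum X)(sum Y) / n
    ss_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    # SS_XX = sum(X^2) - (sum X)^2 / n
    ss_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    b1 = ss_xy / ss_xx                  # sample slope
    b0 = sum_y / n - b1 * sum_x / n     # sample intercept
    return b0, b1

# Hypothetical data
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b0, b1 = least_squares_line(xs, ys)
print(f"Y-hat = {b0:.2f} + {b1:.2f} X")

predicted = b0 + b1 * 6  # predicted Y for a new X = 6
```

For these points SS_XY = 6 and SS_XX = 10, giving a slope of 0.6 and an intercept of 2.2.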
Parametric vs Nonparametric Statistics
Parametric statistics are statistical techniques based on assumptions about the population from which the sample data are collected.
  Assumption that the data being analyzed are randomly selected from a normally distributed population.
  Requires quantitative measurements that yield interval or ratio level data.
Nonparametric statistics are based on fewer assumptions about the population and the parameters.
  Sometimes called distribution-free statistics.
  A variety of nonparametric statistics are available for use with nominal or ordinal data.
  Examples: runs test, Mann-Whitney, chi-square, Kruskal-Wallis, etc.
Which Test to Use?
Goal                                         Measurement (from              Rank, score, or measurement
                                             Gaussian population)           (from non-Gaussian population)
Describe one group                           Mean, SD                       Median, interquartile range
Compare one group to a hypothetical value    One-sample t test              Wilcoxon test
Compare two unpaired groups                  Unpaired t test                Mann-Whitney test
Compare two paired groups                    Paired t test                  Wilcoxon test
Compare three or more unmatched groups       One-way ANOVA                  Kruskal-Wallis test
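The table above is essentially a lookup from (goal, type of data) to a test, which can be written down as a small dictionary. This is only a convenience sketch: the function name `choose_test`, the key strings, and the `gaussian` flag are illustrative assumptions, and the entries are copied from the table without adding any new tests.

```python
# (Gaussian choice, non-Gaussian choice) for each goal, copied from the table
TESTS = {
    "describe one group": ("Mean, SD", "Median, interquartile range"),
    "compare one group to a hypothetical value": ("One-sample t test", "Wilcoxon test"),
    "compare two unpaired groups": ("Unpaired t test", "Mann-Whitney test"),
    "compare two paired groups": ("Paired t test", "Wilcoxon test"),
    "compare three or more unmatched groups": ("One-way ANOVA", "Kruskal-Wallis test"),
}

def choose_test(goal, gaussian):
    """Return the suggested summary or test for a goal.

    gaussian=True  -> measurement from a Gaussian population
    gaussian=False -> rank, score, or measurement from a non-Gaussian population
    """
    parametric, nonparametric = TESTS[goal.strip().lower()]
    return parametric if gaussian else nonparametric

print(choose_test("Compare two unpaired groups", gaussian=False))
```

A lookup like this only encodes the table; checking whether the Gaussian assumption is reasonable still has to be done on the data.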
Web-based Decision Tree to Choose a Statistical Test
http://www.edu.rcsed.ac.uk/statistics/A%20simple%20algorithm%20to%20help%20decide%20the%20statistical%20test%20to%20use.htm
Statistical Software
Able to perform a variety of tests
User friendly (portable, graphics, ability to export/import, fast etc.)
Excel: many useful features
Minitab
SPSS
Checklist for a Statistical Project (1)
Statement of purpose/question of interest
Summary of data collection
  e.g. random sample, stratified sample, available data
  Identify possible sources of bias
  Why do you believe the sample was representative?
Summarize the data (concise, well-labeled, easy to read)
  Numerical or quantitative data
    Graphs: pie diagram or histogram
    Measures of central tendency (e.g. mean or median)
    Measures of spread (e.g. range, SD, IQR)
    A check for outliers (e.g. z scores)
    A check for normality (prob. plot, 68-95-99.7 rule) if needed by your analysis
  Qualitative data
    Proportion in each category
Checklist for a Statistical Project (2)
Statistical inference
  Quantitative data
    e.g. confidence intervals for mean(s), hypothesis test for mean(s), regression, ANOVA
  Qualitative data
  Include a discussion of why your method is appropriate
Diagnostics
  Verification of any assumptions made during statistical inference
Interpretation/Explanation of results
  What does it all mean?
  Use the above summaries to justify your interpretation
  Suggest reasons for what you have observed
Overall conclusion, recommendations, future scope
References
Statistics about the course MSA606
Registered students: 21
Theory sessions: 31
Tutorial sessions: 6
Mid-term evaluation: 1
Mini project: 1
Quotable quotes!!
Every model is an approximation. It is the data that are real!
All models are wrong; some models are useful.
Discovering the unexpected is more important than confirming the known!
Among the factors to be considered there will usually be the vital few and the trivial many (Juran)
There's never been a signal without noise!
PHOENIX
RV SHINTU
STALLON THOMAS VIVEK G
Thanks a lot to all of you
Thanks a lot to all of you
TEJAS
ALBY DAVIS
CYRIL AUGUSTINE SUBODH.M.C
Thanks a lot to all of you
MATRIX
SHEETHANSHU
SHEKHAR BINESH JOSE
FARIHA
Thanks a lot to all of you
SPARK
GALI JAYANTHI
ASWATHY M K.P.SANGEETHA
Thanks a lot to all of you
THREE MUSKETEERS
PRADEEP KUMAR .N
JOSE PIUSNEDUMKALLEL
RAMADAS N
Thanks a lot to all of you
RUSH
RAHUL NAWANI
SURENDRA BABUTALLURI
GOSWAMI PALAKHARSHADPURI
Special Thanks..
Dr.Shaffi
SOMS Office Staff