23 Summary New

8/7/2019 23 Summary New

1/63

Quantitative Methods and Business

Statistics for Decision Making

(MSA606)

A. Ramesh, Department of Mechanical Engineering, NIT Calicut

Email:[email protected], [email protected]

Phone:0495-228-6540


2/63

2

What is a Decision?

Decision

A reasoned choice among alternatives

Examples:

Where to advertise a new product

What stock to buy

What movie to see

Where to go for dinner

Where to locate a new plant

Which mode of transportation to choose


3/63

3

Decision Elements

Decision Statement

What are we trying to decide?

Alternative:

What are the options?

Decision Criteria:

How are we going to judge the merits of each alternative?


4/63

4

Types of Decisions

Type of structure (nature of task): Structured, Unstructured

Level of decision making (scope): Strategic, Managerial, Operational


5/63

5

Observation

We face numerous decisions in life and business.

We can use statistics to analyze the potential outcomes of decision alternatives.


6/63

Quantitative Analysis

Quantitative Analysis Process

Model Development

Data Preparation

Model Solution

Report Generation


7/63

7

General Modeling Scheme

[Figure: diagram linking REALITY, MODEL, SOLUTION, and INTERPRETATION. Labels on the diagram: Assumptions, Approximations, Algorithm, Heuristic, Analysis, Implementation.]


8/63


9/63

9

What Statisticians Do

Statisticians look for patterns in data to help make decisions in business, industry, and the biological, physical, psychological, and social sciences.

Statisticians help make important advances in scientific research and work in opinion polling, market research, survey management, data analysis, statistical experiments, and education.

Statisticians use quantitative abilities, statistical knowledge, and computing and communication skills to collaborate with other scientists on challenging problems.


10/63

10

Statistics

The science of using data to answer research questions

Formulate a research question(s) (hypothesis)

Collect data

Analyze and summarize data

Draw conclusions to answer the research question(s): statistical inference

In the presence of variation


11/63

11

Variation

What if everyone:

Looked the same

Thought the same

Believed the same


12/63

12


13/63

13

Variation

Populations with variation

Everyone looks different

Everyone thinks different

Everyone believes different


14/63

14

Variation

Variation is everywhere:

Individuals

Repeated measurements on the same individual

Almost everything varies over time

Because variation is everywhere, statistical conclusions are not certain:

Probability statement

Confidence statement

Margin of error


15/63

15

Where the Data Come From Is Important

Good data: intelligent human effort

Bad data: laziness, lack of understanding, or a desire to mislead

Know where the data come from

Understand statistics

Example: Did you know that 45% of statistics are made up on the spot?


16/63

16

Manipulating the Facts

Data collection: sampling and measurement biases, ignoring influential variables

Data summarization: graphically misrepresenting data, choosing misleading statistics

Statistical inference: reporting invalid conclusions and interpretations


17/63

17

Manipulating Data Collection

Sampling biases:

One group in a population is overrepresented compared to another.


18/63


19/63

19

Manipulating Data Summarization

Graphically misrepresenting data


20/63

20

Understanding Data

Individuals & Variables

Individuals: objects described by a set of data. May be people, animals, or things. Also called subjects or units.

Variables: any characteristic of an individual. A variable can take different values for different individuals.


21/63

21

Statistical Concepts & Tools

Data representation

Various probability distributions
  Discrete (binomial, geometric, Poisson, uniform, etc.)
  Continuous (uniform, exponential, normal, etc.)

Central Limit Theorem

Distribution of sample means

Point estimates

Confidence interval

Type I and Type II errors

Hypothesis testing

Regression: simple/multiple

ANOVA, non-parametric tests


22/63

22

Population Versus Sample

Population: the whole; a collection of persons, objects, or items under study

Census: gathering data from the entire population

Sample: a portion of the whole; a subset of the population


23/63

23

Parameter vs. Statistic

Parameter: descriptive measure of the population. Usually represented by Greek letters.

Statistic: descriptive measure of a sample. Usually represented by Roman letters.


24/63

24

Levels of Data Measurement

Nominal: lowest level of measurement

Ordinal

Interval

Ratio: highest level of measurement


25/63

25

Producing Data/Collecting Data

Sample surveys vs. experiments

Sample survey: a snapshot of the population

Experiment: impose a treatment on subjects/units and observe the response to the imposed treatment

Common concern: bias

Bias: systematically favors certain outcomes


26/63

26

Commonly used tables

Standard normal variate

t

Chi-square

F

Non-parametric


27/63

27

Central Limit Theorem

Most theory about sample means assumes that the sample mean is normally distributed.

The Central Limit Theorem says that for any population with a finite mean and variance, if the sample size n is large enough, the sample means will be approximately normally distributed, with mean equal to the population mean $\mu$ and standard deviation equal to the population standard deviation divided by the square root of n ($\sigma/\sqrt{n}$).
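As a rough illustration of the theorem, the following Python sketch draws repeated samples from a clearly non-normal population and compares the mean and standard deviation of the sample means with the population mean and $\sigma/\sqrt{n}$. The exponential population (mean 1, standard deviation 1), the sample size of 50, the 4000 repetitions, and the seed are all assumptions chosen for illustration, not values from the course.

```python
import random
import statistics
from math import sqrt

def sample_means(n, trials, rng):
    """Return the means of `trials` samples of size n from an exponential population."""
    # expovariate(1.0) has population mean 1 and population standard deviation 1.
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(trials)
    ]

rng = random.Random(42)   # fixed seed (assumption) so the run is repeatable
n = 50                    # sample size (assumption)
means = sample_means(n, 4000, rng)

mean_of_means = statistics.fmean(means)   # should be close to the population mean, 1
sd_of_means = statistics.stdev(means)     # should be close to sigma / sqrt(n)
expected_sd = 1.0 / sqrt(n)

print(f"mean of sample means: {mean_of_means:.3f}")
print(f"SD of sample means:   {sd_of_means:.3f} (sigma/sqrt(n) = {expected_sd:.3f})")
```

A histogram of `means` would look roughly bell-shaped even though the exponential population itself is strongly skewed.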


28/63

28

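Normal Distribution: Mother of all!

Normal variate: $X \sim N(\mu, \sigma^2)$; standard normal variate: $Z = (X - \mu)/\sigma \sim N(0, 1)$

$\chi^2$ (chi-square): square of Z

t distribution: small sample size

F distribution: ratio of two independent $\chi^2$ variates, each divided by its degrees of freedom

Approximation to discrete distributions: binomial, etc.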


29/63

29

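Confidence Interval to Estimate $\mu$ when n is Large

Point estimate:

$$\bar{X} = \frac{\sum X}{n}$$

Interval estimate:

$$\bar{X} \pm Z \frac{\sigma}{\sqrt{n}}$$

or

$$\bar{X} - Z \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + Z \frac{\sigma}{\sqrt{n}}$$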
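A minimal sketch of the large-sample interval estimate in Python, using only the standard library. The function name `z_interval` and the numbers in the example (sample mean 510, standard deviation 46, n = 85) are hypothetical and chosen only to show the arithmetic.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, confidence=0.95):
    """Return (lower, upper) for xbar +/- Z * sigma / sqrt(n)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical value of Z
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width

# Hypothetical summary statistics, for illustration only.
low, high = z_interval(xbar=510.0, sigma=46.0, n=85, confidence=0.95)
print(f"95% confidence interval for the mean: {low:.2f} to {high:.2f}")
```

When the population standard deviation is unknown and n is large, the sample standard deviation s is commonly substituted for `sigma`.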


30/63

30

Distribution of Sample Means

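for $100(1-\alpha)\%$ Confidence

[Figure: sampling distribution of $\bar{X}$ centered at $\mu$, with $Z = 0$ at the center, critical values $-Z_{\alpha/2}$ and $Z_{\alpha/2}$, and an area of $\alpha/2$ in each tail.]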


31/63

31

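Probability Interpretation of the Level of Confidence

$$\text{Prob}\left[\bar{X} - Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right] = 1 - \alpha$$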
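One way to read this probability statement: if the sampling were repeated many times, about $100(1-\alpha)\%$ of the resulting intervals would contain $\mu$. The simulation below is a sketch of that idea. The normal population with mean 50 and standard deviation 10, the sample size of 36, the 2000 repetitions, and the seed are assumptions made for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def coverage(mu, sigma, n, trials, confidence, rng):
    """Fraction of simulated intervals xbar +/- Z*sigma/sqrt(n) that contain mu."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        xbar = fmean([rng.gauss(mu, sigma) for _ in range(n)])
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / trials

rng = random.Random(7)   # fixed seed (assumption) for repeatability
rate = coverage(mu=50.0, sigma=10.0, n=36, trials=2000, confidence=0.95, rng=rng)
print(f"fraction of intervals containing mu: {rate:.3f}")
```

The observed fraction should land near 0.95, though it varies from run to run when the seed is changed.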


32/63

32

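Estimating the Population Variance

Population parameter: $\sigma^2$

Estimator of $\sigma^2$:

$$S^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$$

$\chi^2$ formula for single variance:

$$\chi^2 = \frac{(n-1) S^2}{\sigma^2}, \qquad \text{degrees of freedom} = n - 1$$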


33/63

33

Confidence Interval for $\sigma^2$

$$\frac{(n-1) S^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n-1) S^2}{\chi^2_{1-\alpha/2}}$$

$df = n - 1$

$\alpha = 1 - \text{level of confidence}$
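The interval for the variance can be sketched in Python as below. The standard library has no chi-square quantile function, so the two critical values are entered by hand from a chi-square table (df = 9: 19.0228 for an upper-tail area of 0.025, 2.7004 for an upper-tail area of 0.975). The sample variance of 4.0 and n = 10 are hypothetical.

```python
def variance_interval(s2, n, chi2_upper, chi2_lower):
    """Confidence interval for the population variance.

    chi2_upper: table value with upper-tail area alpha/2
    chi2_lower: table value with upper-tail area 1 - alpha/2
    """
    df = n - 1
    return df * s2 / chi2_upper, df * s2 / chi2_lower

# Hypothetical sample: n = 10 observations with sample variance S^2 = 4.0.
# Table values for df = 9 and a 95% level of confidence (alpha = 0.05).
var_low, var_high = variance_interval(s2=4.0, n=10,
                                      chi2_upper=19.0228, chi2_lower=2.7004)
print(f"95% confidence interval for the variance: {var_low:.3f} to {var_high:.3f}")
```

Note that the larger table value goes in the denominator of the lower limit, which is why the interval is not symmetric around $S^2$.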


34/63

34

Selected $\chi^2$ Distributions

[Figure: $\chi^2$ density curves for df = 3, df = 5, and df = 10, plotted from 0.]


35/63

35

Statistical Significance

Significance is a statistical term that tells how sure you are that a difference or relationship exists. To say that a significant difference or relationship exists only tells half the story.

We might be very sure that a relationship exists, but is it a strong, moderate, or weak relationship? After finding a significant relationship, it is important to evaluate its strength. Significant relationships can be strong or weak.

Significant differences can be large or small. Whether a given difference reaches significance depends heavily on your sample size.


36/63

Steps in a Test of Hypothesis

1. Define the problem: determine $H_0$ and $H_A$. Select $\alpha$.

2. Collect data.

3. Calculate $\bar{x}$ as an estimate of $\mu$ and s as an estimate of $\sigma$.

4. Check assumptions:

Sample size n is reasonably large ($n \ge 30$), so we can use the normal distribution and estimate $\sigma$ with s.

Check for outliers or strong skewness in the population distribution.

5. Calculate the standard score.

6. Compare with the tabulated value to decide whether to reject $H_0$.

7. Make conclusions in the context of the problem.
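The steps above can be sketched for a two-sided, large-sample test of a mean. This is an illustrative sketch only: the function name `z_test`, the hypothesized mean of 50, and the 35 data values are all made up, and the normal quantile from the standard library stands in for the tabulated value.

```python
from math import sqrt
from statistics import NormalDist, fmean, stdev

def z_test(sample, mu0, alpha=0.05):
    """Two-sided large-sample test of H0: mu = mu0 against HA: mu != mu0."""
    n = len(sample)
    xbar = fmean(sample)                      # step 3: estimate of mu
    s = stdev(sample)                         # step 3: estimate of sigma
    z = (xbar - mu0) / (s / sqrt(n))          # step 5: standard score
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # step 6: "tabulated" value
    p = 2 * (1 - NormalDist().cdf(abs(z)))    # two-sided p-value
    return z, z_crit, p, abs(z) > z_crit

# Hypothetical data: 35 observations cycling through 49..55 (n >= 30).
data = [52 + (i % 7) - 3 for i in range(35)]
z, z_crit, p, reject = z_test(data, mu0=50.0, alpha=0.05)
print(f"z = {z:.2f}, critical value = {z_crit:.2f}, reject H0: {reject}")
```

Step 7 is left to the analyst: the code reports whether the null hypothesis is rejected, but the conclusion still has to be stated in the context of the problem.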


37/63

37

If the statistic is higher than the critical value from the table:

The finding is significant.

Reject the null hypothesis.

The probability is small that the difference or relationship happened by chance, and p is less than the critical alpha level ($p < \alpha$).


38/63

38

If the statistic is lower than the critical value from the table:

The finding is not significant.

One fails to reject the null hypothesis.

The probability is high that the difference or relationship happened by chance, and p is greater than the critical alpha level ($p > \alpha$).


39/63

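Partition of Total Sum of Squares in RBD

SST (Total Sum of Squares) = SSC (Treatment Sum of Squares) + SSE (Error Sum of Squares)

In the randomized block design (RBD), this error term is partitioned further:

SSE (Error Sum of Squares) = SSR (Sum of Squares Blocks) + SSE (Sum of Squares Error)

so that SST = SSC + SSR + SSE, where SSE is now the remaining Sum of Squares Error.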
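A small numerical check of this partition, as a sketch: each sum of squares is computed directly from its definition, and the total is compared with the sum of the parts. The 3-by-3 table (rows as blocks, columns as treatments) is invented for illustration, and the function name is hypothetical.

```python
from statistics import fmean

def rbd_sums_of_squares(table):
    """table[i][j] is the observation for block i (row) and treatment j (column)."""
    n_blocks = len(table)
    n_treatments = len(table[0])
    grand = fmean([x for row in table for x in row])
    block_means = [fmean(row) for row in table]
    treatment_means = [fmean([row[j] for row in table]) for j in range(n_treatments)]

    sst = sum((x - grand) ** 2 for row in table for x in row)
    ssc = n_blocks * sum((m - grand) ** 2 for m in treatment_means)   # treatments
    ssr = n_treatments * sum((m - grand) ** 2 for m in block_means)   # blocks
    sse = sum(
        (table[i][j] - block_means[i] - treatment_means[j] + grand) ** 2
        for i in range(n_blocks)
        for j in range(n_treatments)
    )
    return sst, ssc, ssr, sse

# Made-up data: 3 blocks (rows) by 3 treatments (columns).
table = [
    [3.0, 5.0, 7.0],
    [4.0, 6.0, 8.0],
    [5.0, 7.0, 12.0],
]
sst, ssc, ssr, sse = rbd_sums_of_squares(table)
print(f"SST = {sst:.3f}, SSC + SSR + SSE = {ssc + ssr + sse:.3f}")
```

Because SSE is computed from the residuals rather than by subtraction, agreement between the two printed numbers is a genuine check of the partition.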


40/63

40

Regression and Correlation

Regression analysis is the process of constructing a mathematical model or function that can be used to predict or determine one variable by another variable.

Correlation is a measure of the degree of relatedness of two variables.


41/63

41

Simple Regression Analysis

Bivariate (two variables) linear regression: the most elementary regression model

Dependent variable: the variable to be predicted, usually called Y

Independent variable: the predictor or explanatory variable, usually called X


42/63

42

Regression Models

Deterministic Regression Model

$$Y = \beta_0 + \beta_1 X$$

Probabilistic Regression Model

$$Y = \beta_0 + \beta_1 X + \varepsilon$$

$\beta_0$ and $\beta_1$ are population parameters

$\beta_0$ and $\beta_1$ are estimated by sample statistics $b_0$ and $b_1$


43/63

43

Equation of the Simple Regression Line

$$\hat{Y} = b_0 + b_1 X$$

where:

$b_0$ = the sample intercept

$b_1$ = the sample slope

$\hat{Y}$ = the predicted value of Y


44/63

44

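Least Squares Analysis

$$b_1 = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sum (X - \bar{X})^2} = \frac{\sum XY - n\bar{X}\bar{Y}}{\sum X^2 - n\bar{X}^2} = \frac{\sum XY - \frac{(\sum X)(\sum Y)}{n}}{\sum X^2 - \frac{(\sum X)^2}{n}}$$

$$b_0 = \bar{Y} - b_1 \bar{X} = \frac{\sum Y}{n} - b_1 \frac{\sum X}{n}$$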


45/63

45

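Least Squares Analysis

$$SS_{XY} = \sum (X - \bar{X})(Y - \bar{Y}) = \sum XY - \frac{(\sum X)(\sum Y)}{n}$$

$$SS_{XX} = \sum (X - \bar{X})^2 = \sum X^2 - \frac{(\sum X)^2}{n}$$

$$b_1 = \frac{SS_{XY}}{SS_{XX}}$$

$$b_0 = \bar{Y} - b_1 \bar{X} = \frac{\sum Y}{n} - b_1 \frac{\sum X}{n}$$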
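The least squares formulas translate almost line for line into code. The sketch below uses the computational forms of $SS_{XY}$ and $SS_{XX}$; the function name and the five data points are hypothetical and serve only to show the arithmetic.

```python
def least_squares(xs, ys):
    """Return (b0, b1) for the fitted line Y-hat = b0 + b1 * X."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    ss_xy = sum_xy - sum_x * sum_y / n    # SS_XY
    ss_xx = sum_x2 - sum_x ** 2 / n       # SS_XX
    b1 = ss_xy / ss_xx                    # sample slope
    b0 = sum_y / n - b1 * sum_x / n       # sample intercept
    return b0, b1

# Made-up data for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.0]
b0, b1 = least_squares(xs, ys)
print(f"Y-hat = {b0:.2f} + {b1:.2f} X")
```

For these points, $SS_{XY} = 19.7$ and $SS_{XX} = 10$, so the slope is 1.97 and the intercept is $6 - 1.97 \times 3 = 0.09$.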


46/63

46

Parametric vs Nonparametric Statistics

Parametric statistics are statistical techniques based on assumptions about the population from which the sample data are collected.

Assumption that the data being analyzed are randomly selected from a normally distributed population.

Requires quantitative measurements that yield interval or ratio level data.

Nonparametric statistics are based on fewer assumptions about the population and the parameters.

Sometimes called distribution-free statistics.

A variety of nonparametric statistics are available for use with nominal or ordinal data.

Examples: runs test, Mann-Whitney, chi-square, Kruskal-Wallis, etc.


47/63

47

Which Test to Use?

Goal | Measurement (from Gaussian population) | Rank, score, or measurement (from non-Gaussian population)
Describe one group | Mean, SD | Median, interquartile range
Compare one group to a hypothetical value | One-sample t test | Wilcoxon test
Compare two unpaired groups | Unpaired t test | Mann-Whitney test
Compare two paired groups | Paired t test | Wilcoxon test
Compare three or more unmatched groups | One-way ANOVA | Kruskal-Wallis test


48/63

48

Web-based Decision Tree to Choose a Statistical Test

http://www.edu.rcsed.ac.uk/statistics/A%20simple%20algorithm%20to%20help%20decide%20the%20statistical%20test%20to%20use.htm


49/63


50/63

50

Statistical Software

Able to perform a variety of tests

User friendly (portable, graphics, ability to export/import, fast, etc.)

Excel: many useful features

Minitab

SPSS


51/63

51

Checklist for a Statistical Project (1)

Statement of purpose/question of interest

Summary of data collection
  e.g. random sample, stratified sample, available data
  Identify possible sources of bias
  Why do you believe the sample was representative?

Summarize the data (concise, well-labeled, easy to read)
  Numerical or quantitative data
    Graphs: pie diagram or histogram
    Measures of central tendency (e.g. mean or median)
    Measures of spread (e.g. range, SD, IQR)
    A check for outliers (e.g. z scores)
    A check for normality (prob. plot, 68-95-99.7 rule) if needed by your analysis
  Qualitative data
    Proportion in each category
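The numerical part of this checklist can be sketched with the Python standard library. The data values below are invented, and the cut-off of |z| > 2 for flagging outliers is an assumption chosen for this small sample; other cut-offs (such as 3) are also in common use.

```python
import statistics

def summarize(data, z_cutoff=2.0):
    """Return summary measures and the values flagged as outliers by z score."""
    mean = statistics.fmean(data)
    sd = statistics.stdev(data)
    q1, _, q3 = statistics.quantiles(data, n=4)   # quartile cut points
    summary = {
        "mean": mean,
        "median": statistics.median(data),
        "range": max(data) - min(data),
        "sd": sd,
        "iqr": q3 - q1,
    }
    # Flag values whose z score exceeds the chosen cut-off (an assumption).
    outliers = [x for x in data if abs((x - mean) / sd) > z_cutoff]
    return summary, outliers

# Made-up sample with one unusually large value.
data = [12, 15, 14, 10, 13, 15, 11, 14, 13, 40]
summary, outliers = summarize(data)
print(summary)
print("possible outliers:", outliers)
```

The single large value pulls the mean well above the median, which is the kind of discrepancy the checklist's outlier check is meant to catch.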


52/63

52

Checklist for a Statistical Project (2)

Statistical inference
  Quantitative data
    e.g. confidence intervals for mean(s), hypothesis test for mean(s), regression, ANOVA
  Qualitative data
  Include a discussion of why your method is appropriate

Diagnostics
  Verification of any assumptions made during statistical inference

Interpretation/explanation of results
  What does it all mean?
  Use the above summaries to justify your interpretation
  Suggest reasons for what you have observed

Overall conclusion, recommendations, future scope

References


53/63

53

Statistics about the course MSA606

Registered students : 21

Theory session:31

Tutorial sessions : 6

Mid-Term Evaluation: 1

Mini project : 1


54/63

54

Quotable quotes!

Every model is an approximation. It is the data that are real!

All models are wrong; some models are useful.

Discovering the unexpected is more important than confirming the known!

Among the factors to be considered there will usually be the vital few and the trivial many (Juran)

There's never been a signal without noise!


55/63


56/63

PHOENIX

RV SHINTU

STALLON THOMAS VIVEK G

56

Thanks a lot to all of you


57/63


TEJAS

ALBY DAVIS

CYRIL AUGUSTINE SUBODH.M.C

57


58/63


MATRIX

SHEETHANSHU

SHEKHAR BINESH JOSE

FARIHA

58


59/63


SPARK

GALI JAYANTHI

ASWATHY M K.P.SANGEETHA

59


60/63


61/63


THREE MUSKETEERS

PRADEEP KUMAR .N

JOSE PIUSNEDUMKALLEL

RAMADAS N

61


62/63


RUSH

RAHUL NAWANI

SURENDRA BABUTALLURI

GOSWAMI PALAKHARSHADPURI

62


63/63

Special Thanks..

Dr.Shaffi

SOMS Office Staff
