Quantitative Analysis for Empirical Research
Post on 15-Jun-2015
Jointly organized by JNARDDC & IAPQR, Kolkata
A Condensed Review
By Amit Kamble
Bridging Training and Research for Industry and the Wider Community
An approach for listening to the data with an open mind, using descriptive and graphical tools.
Planning of Experiments
Planning is the first step for human activity – for undertaking any scientific, technological or industrial experiment.
We are told: If you fail to plan, you are only planning to fail !
• An experiment is a means of getting an answer to the question the experimenter has in mind. Broadly, experiments are of two types:
• Experiments for determining the properties of defined sets of things, e.g. assess proof stress, thermal conductivity, yield of product etc.
• Experiments comparative in nature, e.g. to assess effect of different % of an alloying element on tensile strength of aluminum alloy
The experimenter has to have a clear idea about the objective, as many of the facets of planning depend on this.
Objective Includes:
• Response Variable
• Treatment
• Experimental Unit
• Experimental Error
Treatment
The different procedures, objects, or levels of a factor under comparison in an experiment are the different treatments, e.g. different alloying conditions.
Response Variable
The variable(s) on which measurements are to be taken for analysis is (are) the response variable(s), e.g. when studying the effect of chemical composition on the quality of an alloy, it has to be decided whether alloy addition, temperature, thermal conductivity, or a subset of these would be the response variable(s).
Experimental Unit
An experimental unit is the material to which the treatment is applied and on which the response variable(s) is (are) measured, e.g. a specimen of an alloy under defined conditions would be the experimental unit in an experiment to develop a new alloy.
Experimental Error
The unexplained random part of the variation is termed the experimental error, e.g. variation in the measurement of mechanical properties of an alloy under different sets of load, etc.
This is a technical term; it includes all types of uncontrollable extraneous variation due to
(i) inherent variability in experimental units
(ii) errors associated with measurements
(iii) lack of representativeness of the sample to the population under study
This part of variation cannot be totally eliminated – we have to live with this !
Basic Principles of experimental design (by R.A. Fisher)
Fisher’s Diagram
*Replication
* Randomization
* Local Control (desirable)
(vital)
Planning of experiments falls into two almost distinct parts, dealing with the principles that should govern
(a) The choice of treatments to be compared, of observations to be made, and of experimental units to be used
(b) The method of assigning treatments to the experimental units and the decision about how many units to be used
Techniques of local control
• Use of homogeneous blocks
• Use of supporting variables
• Confounding in factorial experiments
The requirements are:
• The treatment comparisons should be as far as possible free from systematic error
• The treatment comparisons should be made sufficiently precisely
• The conclusions should have a wide range of validity
• The experimental design should be as simple as possible
• The uncertainty in the conclusions should be assessable
The precision of an experiment is measured by the reciprocal of the variance of a mean:
$\dfrac{1}{\sigma_{\bar{x}}^2} = \dfrac{n}{\sigma^2}$
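The standard error of the estimate of the difference between two treatments is inversely proportional to the square root of the number of units for each treatment.
With the same number of units on each treatment, the standard error is:
$\text{standard deviation} \times \sqrt{\dfrac{2}{\text{no. of units per treatment}}}$
In general, with different numbers of units on treatments A and B, it is:
$\text{standard deviation} \times \sqrt{\dfrac{1}{\text{no. of units of A}} + \dfrac{1}{\text{no. of units of B}}}$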
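The effect of the number of units on the standard error can be checked numerically. The following is a minimal sketch, assuming a common standard deviation for both treatments; the function name `se_difference` and the numbers used are hypothetical and chosen only for illustration.

```python
import math

def se_difference(sd, n_a, n_b):
    """Standard error of the difference between two treatment means,
    assuming the same standard deviation `sd` for both treatments."""
    return sd * math.sqrt(1.0 / n_a + 1.0 / n_b)

# Hypothetical values, for illustration only.
sd = 8.0
se_small = se_difference(sd, 5, 5)    # 5 units per treatment
se_large = se_difference(sd, 20, 20)  # 20 units per treatment

print(f"SE with 5 units each : {se_small:.3f}")
print(f"SE with 20 units each: {se_large:.3f}")
```

Quadrupling the number of units per treatment halves the standard error, which is the inverse square root relationship stated above.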
• Data Modeling
• Analysis of Dependence
• Categorical Data Analysis
• Testing of Statistical Hypothesis
Describing Variation
No two units of product produced by a manufacturing process are identical, e.g. the net content of a can of soft drink varies slightly from can to can.
A solved example
Data for forged piston-ring inside diameter (mm)
Sample no. Observations
1 74.030 74.002 74.019 73.992 74.008
2 73.995 73.992 74.001 74.011 74.004
3 74.009 73.994 73.997 73.987 73.993
… … … … … …
25 73.982 73.984 73.995 74.017 74.013
A frequency distribution of piston ring data and a histogram of frequencies vs. the ring diameter is as shown:
[Figure: histogram of frequency (axis 0 to 25) against ring diameter in mm, in 0.005 mm classes from 73.965 to 74.030]
Ring Diameter Frequency
73.965-73.970 1
73.970-73.975 0
73.975-73.980 0
73.980-73.985 8
73.985-73.990 10
73.990-73.995 19
73.995-74.000 23
74.000-74.005 22
74.005-74.010 22
74.010-74.015 13
74.015-74.020 4
74.020-74.025 2
74.025-74.030 1
The histogram presents a visual display of the data in which one may see:
1. Shape 2. Location or central tendency 3. Scatter or spread
Numerical summary of data
The histogram is helpful, but it is also useful to use numerical measures of central tendency and scatter. Suppose that x1, x2, x3, ..., xn are the observations in a sample. The central tendency is measured by the sample average
$\bar{x} = \dfrac{x_1 + x_2 + \cdots + x_n}{n} = \dfrac{\sum_{i=1}^{n} x_i}{n}$
The scatter or spread in sample data is measured by the sample variance
$S^2 = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$
The sample average of the piston ring data = 9250.125/125 = 74.001 mm
For the piston ring data we find $S^2$ = 0.000102 mm² and S = 0.010 mm (S.D.)
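Because only part of the raw piston-ring data is listed, the summary figures can be approximated from the frequency table instead. The sketch below assumes every observation sits at the midpoint of its class, which is an approximation and not the calculation used in the slides; the variable names are illustrative.

```python
import math

# (lower bound, upper bound, frequency) taken from the frequency table above.
classes = [
    (73.965, 73.970, 1), (73.970, 73.975, 0), (73.975, 73.980, 0),
    (73.980, 73.985, 8), (73.985, 73.990, 10), (73.990, 73.995, 19),
    (73.995, 74.000, 23), (74.000, 74.005, 22), (74.005, 74.010, 22),
    (74.010, 74.015, 13), (74.015, 74.020, 4), (74.020, 74.025, 2),
    (74.025, 74.030, 1),
]

n = sum(f for _, _, f in classes)

# Grouped-data approximation: treat each observation as the class midpoint.
mean = sum(f * (lo + hi) / 2 for lo, hi, f in classes) / n
var = sum(f * ((lo + hi) / 2 - mean) ** 2 for lo, hi, f in classes) / (n - 1)
sd = math.sqrt(var)

print(f"n = {n}, mean = {mean:.4f} mm, S^2 = {var:.6f} mm^2, S = {sd:.4f} mm")
```

The grouped values agree closely with the figures quoted from the raw data (74.001 mm, 0.000102 mm² and 0.010 mm); small differences are expected because grouping discards the exact readings.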
Probability Distributions
A probability distribution is a mathematical model that relates the value of the variable with the probability of occurrence of that value in the population.
Some discrete distributions
The Hypergeometric Distribution
Suppose a finite population of N items. Say D (D ≤ N) of these items fall into a class of interest. A random sample of n items is selected from the population without replacement, and the number of items in the sample that fall in the class of interest, say x, is observed. Then x is a hypergeometric random variable with probability distribution
$p(x) = \dfrac{\binom{D}{x}\binom{N-D}{n-x}}{\binom{N}{n}}, \quad x = 0, 1, 2, \ldots, n$
The mean and variance of the hypergeometric distribution are
$\mu = \dfrac{nD}{N}, \qquad \sigma^2 = \dfrac{nD}{N}\left(1 - \dfrac{D}{N}\right)\left(\dfrac{N-n}{N-1}\right)$
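The hypergeometric formula and its mean and variance can be verified numerically. This is a minimal sketch; the population values N = 100, D = 5 and n = 10 are hypothetical and chosen only for illustration.

```python
from math import comb

def hypergeom_pmf(x, N, D, n):
    """P(x items of interest in a sample of n drawn without replacement
    from N items, D of which are of interest)."""
    if x < 0 or x > n or x > D or n - x > N - D:
        return 0.0
    return comb(D, x) * comb(N - D, n - x) / comb(N, n)

# Hypothetical population, for illustration only.
N, D, n = 100, 5, 10

pmf = [hypergeom_pmf(x, N, D, n) for x in range(n + 1)]
mean = sum(x * p for x, p in enumerate(pmf))
var = sum((x - mean) ** 2 * p for x, p in enumerate(pmf))

print(f"sum of probabilities = {sum(pmf):.6f}")
print(f"mean = {mean:.4f}  (formula: {n * D / N:.4f})")
print(f"variance = {var:.4f}  (formula: {(n * D / N) * (1 - D / N) * (N - n) / (N - 1):.4f})")
```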
Binomial Distribution
Consider a process that consists of a sequence of n independent trials, where the outcome of each trial is either "success" or "failure" (Bernoulli trials). If the probability of success on any trial, say p, is constant, then the number of successes x in the n Bernoulli trials has the binomial distribution
$p(x) = \binom{n}{x} p^x (1-p)^{n-x}, \quad x = 0, 1, 2, \ldots, n$
The mean and variance of the binomial distribution are
$\mu = np, \qquad \sigma^2 = np(1-p)$
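As a quick numerical check of the binomial formula, the sketch below builds the whole distribution and compares its mean and variance with np and np(1 - p). The values n = 10 and p = 0.3 are hypothetical and used only for illustration.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent Bernoulli trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

# Hypothetical parameters, for illustration only.
n, p = 10, 0.3

pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]
mean = sum(x * q for x, q in enumerate(pmf))
var = sum((x - mean) ** 2 * q for x, q in enumerate(pmf))

print(f"mean = {mean:.4f}  (np = {n * p:.4f})")
print(f"variance = {var:.4f}  (np(1-p) = {n * p * (1 - p):.4f})")
```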
Poisson Distribution
Another important discrete distribution is the Poisson distribution,
$p(x) = \dfrac{e^{-m} m^x}{x!}, \quad m > 0, \; x = 0, 1, \ldots$
The mean and variance of the Poisson distribution are
$\mu = m, \qquad \sigma^2 = m$
Ex: Suppose that the number of wire bonding defects per unit that occur in a semiconductor device is Poisson distributed with parameter m = 4. Then the probability that a randomly selected semiconductor device will contain two or fewer wire bonding defects is
$p(x \le 2) = \sum_{x=0}^{2} \dfrac{e^{-4} 4^x}{x!}$
= 0.0183 + 0.0733 + 0.1465 = 0.2381
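The wire bonding example can be reproduced directly from the Poisson formula. This is a minimal sketch using only the standard library; the function name `poisson_pmf` is illustrative.

```python
from math import exp, factorial

def poisson_pmf(x, m):
    """Probability of exactly x events for a Poisson variable with parameter m."""
    return exp(-m) * m ** x / factorial(x)

m = 4  # parameter from the wire bonding example
terms = [poisson_pmf(x, m) for x in range(3)]  # x = 0, 1, 2
p_at_most_2 = sum(terms)

print("terms:", [round(t, 4) for t in terms])
print(f"P(x <= 2) = {p_at_most_2:.4f}")
```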
Correlation and Regression
Many situations arise in which we may have to study two variables simultaneously, say x and y, and we may be interested in measuring numerically the strength of the association between them. This is the problem of Correlation. Secondly, if one variable is of interest and the other variable is auxiliary, we are interested in using a mathematical equation for making estimates regarding the principal variable. This is known as Regression.
Scatter diagram showing different types of degree of Correlation
[Figure: five scatter plots of y against x, including panels labelled r = +1, r = -1 and r = 0]
Correlation Coefficient (r):
$r = \dfrac{\text{Covariance}(x, y)}{\sqrt{\text{var}(x)\,\text{var}(y)}}$
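The correlation coefficient can be computed straight from its definition. The sketch below is illustrative only: the data sets are made up to reproduce the extreme cases r = +1, r = -1 and r = 0, and the function name `pearson_r` is an assumption.

```python
import math

def pearson_r(xs, ys):
    """Correlation coefficient r = cov(x, y) / sqrt(var(x) * var(y))."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    var_x = sum((x - mx) ** 2 for x in xs) / (n - 1)
    var_y = sum((y - my) ** 2 for y in ys) / (n - 1)
    return cov / math.sqrt(var_x * var_y)

# Made-up data, for illustration only.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y_up = [2 * v + 1 for v in x]      # exact increasing line
y_down = [-3 * v + 10 for v in x]  # exact decreasing line
y_curve = [v ** 2 for v in x]      # symmetric curve, no linear association

print(pearson_r(x, y_up), pearson_r(x, y_down), pearson_r(x, y_curve))
```

The last case is a reminder that r measures only linear association: y is completely determined by x, yet r is zero.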
A categorical variable or attribute is one for which the measurement scale consists of a set of categories.
A categorical variable that does not have a natural ordering is called 'nominal'.
A categorical variable whose categories are ordered is called 'ordinal'.
Nominal: public school, private school
Ordinal: primary school, secondary school, college, university
Interval: years of schooling
Types of data
• Distribution by single attribute
• Distribution of several populations by single attribute
• Distribution by two attributes
• Distribution by more than two attributes
Purpose of study
• Estimation of incidence of levels
• Measurement of association between attributes
• Testing homogeneity of several populations in respect of single attribute
• Testing significance of association between two or more attributes
• Testing goodness of fit
2 x 2 Contingency Table
A has two forms: A (presence or higher level) & α (absence or lower level)
B has two forms: B (presence or higher level) & β (absence or lower level)
Example: Smoking & Lung Cancer
Smoker      Lung Cancer Patient        Total
            Yes (A)      No (α)
Yes (B)     183 (AB)     645 (αB)      828 (B)
No (β)      59 (Aβ)      2113 (αβ)     2172 (β)
Total       242 (A)      2758 (α)      3000 (n)
Relative risk = (AB/B)/(Aβ/β) = (183/828)/(59/2172) = 8.136
Odds(B) = (AB/B)/(1 – (AB/B)) = 0.2210/0.7790 = 0.2837
Odds(β) = (Aβ/β)/(1 – (Aβ/β)) = 0.0272/0.9728 = 0.0280
Odds Ratio = Odds(B)/Odds(β) = (AB * αβ)/(Aβ * αB) = 10.1611
• Independence Implies AB = (A*B) / n
• Positive Association implies AB > (A*B) / n
• Negative Association implies AB < (A*B) / n
Here (A*B) / n = (242*828)/3000 = 66.792, while the observed AB = 183.
Since 183 > 66.792, this implies positive association.
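The measures for the smoking and lung cancer table can be recomputed from the four cell counts. This is a minimal sketch; the variable names are illustrative and not part of the original slides.

```python
# Cell counts from the smoking and lung cancer example.
cancer_smoker = 183         # AB
no_cancer_smoker = 645      # not-A, B
cancer_nonsmoker = 59       # A, not-B
no_cancer_nonsmoker = 2113  # not-A, not-B

smokers = cancer_smoker + no_cancer_smoker
nonsmokers = cancer_nonsmoker + no_cancer_nonsmoker
cancer_total = cancer_smoker + cancer_nonsmoker
n = smokers + nonsmokers

# Relative risk: risk among smokers divided by risk among non-smokers.
relative_risk = (cancer_smoker / smokers) / (cancer_nonsmoker / nonsmokers)

# Odds ratio: cross-product ratio of the 2 x 2 table.
odds_ratio = (cancer_smoker * no_cancer_nonsmoker) / (cancer_nonsmoker * no_cancer_smoker)

# Count expected in the AB cell if smoking and lung cancer were independent.
expected_ab = cancer_total * smokers / n

print(f"relative risk = {relative_risk:.3f}")
print(f"odds ratio    = {odds_ratio:.4f}")
print(f"expected AB under independence = {expected_ab:.3f}, observed = {cancer_smoker}")
```

Working from the unrounded cell counts gives a relative risk of about 8.14; dividing the rounded proportions 0.221 and 0.0272 gives a slightly smaller figure.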
An introduction through examples (Single mean)
Ex. 40 samples of a specimen of an aluminum alloy (Sn 6.1%, Cu 1.2%, Ni 0.9%, rest Al) were tested for density (g/cc). The results obtained were mean $\bar{x}$ = 2.61 and standard deviation S = 0.605. Do the data support the conjecture that the mean density of the alloy is less than 2.84?
Here: H0: $\mu = 2.84$ against H1: $\mu < 2.84$
The test statistic is
$T = \dfrac{(\bar{x} - \mu_0)\sqrt{n}}{S} = \dfrac{(2.61 - 2.84)\sqrt{40}}{0.605} = -2.404$
The lower-tail critical values of the standard normal distribution are -1.645 at the .05 level and -2.326 at the .01 level. Since the observed value of T is less than both these values, we conclude that the mean density of the alloy under reference is significantly lower than 2.84.
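The single-mean test can be reproduced with the standard library. This sketch treats T as approximately standard normal, which is the large-sample approach implied by the critical values -1.645 and -2.326; the variable names are illustrative.

```python
import math
from statistics import NormalDist

# Figures from the density example.
n, xbar, s, mu0 = 40, 2.61, 0.605, 2.84

# Test statistic for H0: mu = mu0 against H1: mu < mu0.
t_stat = (xbar - mu0) * math.sqrt(n) / s

std_normal = NormalDist()            # standard normal, mean 0 and sd 1
crit_05 = std_normal.inv_cdf(0.05)   # lower 5% point, about -1.645
crit_01 = std_normal.inv_cdf(0.01)   # lower 1% point, about -2.326
p_value = std_normal.cdf(t_stat)     # one-sided (lower tail) p-value

print(f"T = {t_stat:.3f}, critical values {crit_05:.3f} and {crit_01:.3f}")
print(f"one-sided p-value = {p_value:.4f}")
```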
Example of mean from two samples
Ex. 32 samples of a specimen of an aluminum alloy (Sn 20.3%, Cu 1.1%, rest Al) were tested for 0.2% compression strength (MN/m²). The results obtained were mean $\bar{x}_1$ = 102.8 and $S_1$ = 7.9. A set of 35 samples of a specimen of an aluminum alloy (Pb 20.6%, Cu 1.1%, rest Al) were tested for the same property; the results obtained were mean $\bar{x}_2$ = 106.5 and $S_2$ = 8.4. Do the data support the conjecture that the two alloys have identical status in respect of this property?
Here: H0: $\mu_1 = \mu_2$ against H1: $\mu_1 \ne \mu_2$
The test statistic is
$T = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}$
Thus
$T = \dfrac{102.8 - 106.5}{\sqrt{\dfrac{7.9^2}{32} + \dfrac{8.4^2}{35}}} = \dfrac{-3.7}{\sqrt{3.9663}} = -1.8578$
The two-sided critical values are 1.960 (.025 in each tail) and 2.576 (.005 in each tail).
Since |T| is less than both these values, we may conclude that in light of the given samples, the alloys may be taken to have identical mean 0.2% compression strength.
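The two-sample comparison can be checked in the same way. This is a minimal sketch that takes the second mean as 106.5, the value used in the computation above, and treats T as approximately standard normal; the variable names are illustrative.

```python
import math
from statistics import NormalDist

# Figures from the compression strength example.
n1, xbar1, s1 = 32, 102.8, 7.9
n2, xbar2, s2 = 35, 106.5, 8.4

# Standard error of the difference between the two sample means.
se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
t_stat = (xbar1 - xbar2) / se

# Two-sided critical values of the standard normal distribution.
crit_05 = NormalDist().inv_cdf(1 - 0.025)  # about 1.960
crit_01 = NormalDist().inv_cdf(1 - 0.005)  # about 2.576

print(f"T = {t_stat:.4f}, critical values {crit_05:.3f} and {crit_01:.3f}")
print("reject H0 at 5% level:", abs(t_stat) > crit_05)
```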
Ex: 40 sample data (double mould) are taken for the study of variation of Mg% in FeSiMg alloy; the observed results were $\bar{x}_1$ = 7.494 and $S_1$ = 0.18004. A second set of 45 sample data (single mould) gave values $\bar{x}_2$ = 7.5949 and $S_2$ = 0.19082. Do the data support the hypothesis that the two alloys have identical mean value of Mg% in the population?
Here H0: $\mu_1 = \mu_2$ against H1: $\mu_1 \ne \mu_2$
By following the test statistic from the previous problem, we have
$T = \dfrac{7.494 - 7.5979}{\sqrt{\dfrac{(0.18004)^2}{45} + \dfrac{(0.19082)^2}{40}}}$
T = – 1.82245
The critical values at the .025 and .005 levels are 2.014 and 2.968. Now, |T| = 1.82245 is less than both these values, so we may conclude that in the light of the given data the two alloys may be taken to have identical mean Mg%.
The Greatest value of a picture is when it forces us to notice what we never expected to see.
– John W. Tukey
Temperature Variation vs. Slag
[Figures: five scatter plots of Slag (vertical axis) against Temp (horizontal axis, roughly 1420 to 1540). Series labels by panel: (1) 30 Oct BIL, 18-Oct; (2) 17 Oct Bright, 27 Oct GS, Same Charging; (3) 17 Oct Magna, 1 Nov 8-10, 1 Nov 5-7, 1 Nov GS, Same Charge; (4) 10 Nov BIL, 10 Nov 5-7, 8 Jan AMTEK; (5) 16 Oct ISRC, 16-Dec]
Comparison of ladle outer temp. of different linings
[Figure: outer temperature of the ladle (Temp, °C, axis 0 to 300) at successive stages: at furnace, after slag removal, after Mg plunging, during tapping, and holding temp for empty ladle till next heat. Series: Silica Lining Results, Al2O3 Lining Results]
Plot of comparison of outer temperature of ladle, ladle lining namely silica & high Al2O3
Lastly Some Facts
The number of human beings killed by hippopotamuses annually is more than the number killed in plane crashes in a year.
No paper of any size can be folded in half more than 8 times.
A human being spends nearly 2 weeks of his life waiting at red traffic signals.
Thank You !!