BUSINESS
STATISTICS
Vol-2
ABF 102
ACeL
In the modern world of computers and information
technology, the importance of statistics is well
recognised by all disciplines. Statistics
originated as a science of statehood and slowly and
steadily found applications in Agriculture,
Economics, Commerce, Biology, Medicine, Industry,
planning, education and so on. As of today, there is no
walk of human life where statistics cannot be
applied.
Amity University
Table of Contents
CHAPTER SIX: REGRESSION ANALYSIS
6.1 Meaning
6.2 Definitions
6.3 Regression Line
6.4 Regression Equations and Regression Coefficient
6.5 Difference between Correlation and Regression Analysis
Chapter Six: End Chapter Quizzes
CHAPTER SEVEN: TIME SERIES ANALYSIS
7.1 Meaning
7.2 Definitions
7.4 Uses or importance of Time-series
7.6 Components of time series
7.6.1 Trend Component
7.6.2 Cyclical Component
7.6.3 Seasonal Component
7.6.4 Irregular Component
7.7 Methods of measuring secular trend or trend
7.8 Measurement of seasonal variations
7.8.2 Ratio to trend method
7.9 Practical Problems
Chapter Seven: End Chapter Quizzes
CHAPTER EIGHT: PROBABILITY
8.1 Introduction
8.2.1 Trial
8.2.2 Random Trial or Random Experiment
8.2.3 Sample space
8.2.4 Event
8.2.5 Equally Likely Events
8.2.7 Exhaustive Events
8.2.8 Independent Events
8.2.9 Dependent Events
8.2.10 Complementary Events
8.3 Definitions
8.3.1 Mathematical (or A Priori or Classic) Definition
8.3.2 Von Mises's Statistical (or Empirical) Definition
8.4 The Law of Probability
8.5 Importance of Probability
8.6 Practical Problems
Chapter Eight: End Chapter Quizzes
BIBLIOGRAPHY
CHAPTER SIX: REGRESSION ANALYSIS
6.1 Meaning
In statistics, regression analysis is a collective name for techniques for
the modeling and analysis of numerical data consisting of values of a
dependent variable (also called response variable or measurement) and of one
or more independent variables (also known as explanatory variables or
predictors). The dependent variable in the regression equation is modeled as a
function of the independent variables, corresponding parameters
("constants"), and an error term.
Regression analysis is thus any statistical method in which the mean of one or
more random variables is predicted from other measured random
variables. There are two types of regression analysis: linear regression, used
when the data approximate a straight line, and non-linear regression, used
when they do not.
Regression can be used for prediction (including forecasting of time-
series data), inference, hypothesis testing, and modeling of causal
relationships. These uses of regression rely heavily on the underlying
assumptions being satisfied. Regression analysis has been criticized as being
misused for these purposes in many cases where the appropriate assumptions
cannot be verified to hold. One factor contributing to the misuse of regression
is that it can take considerably more skill to critique a model than to fit a
model.
6.2 Definitions
“Regression is the measure of the average relationship between two or
more variables in terms of the original units of the data.”
Morris M. Blair
“One of the most frequently used techniques in economics and business
research, to find a relation between two or more variables that are related
causally, is regression analysis.”
Taro Yamane
“It is often more important to find out what the relation actually is, in
order to estimate or predict one variable; and the statistical technique
appropriate to such a case is called regression analysis.”
Wallis and Roberts
6.3 Regression Line
A regression line is a line drawn through a scatterplot of two variables.
The line is chosen so that it comes as close to the points as possible. Regression
analysis, on the other hand, is more than curve fitting. It involves fitting a
model with both deterministic and stochastic components. The deterministic
component is called the predictor and the stochastic component is called the
error term.
The simplest form of a regression model contains a dependent variable,
also called the "Y-variable" and a single independent variable, also called the
"X-variable".
6.4 Regression Equations and Regression Coefficient
Regression equations or estimating equations are algebraic expressions
of regression lines. As there are two regression lines, there are two
regression equations, i.e. the regression equation of X on Y and the regression
equation of Y on X.
The regression equation of X on Y is :
X = a + bY
Here X is the dependent variable and Y is the independent variable; 'a' is the X
intercept and 'b' is the slope of the line, which represents the change in variable X
for a unit change in variable Y. The constants a and b satisfy the following
two normal equations:
∑X = aN + b∑Y (i)
∑XY = a∑Y + b∑Y² (ii)
If we solve these two equations, we can compute the values of the constants
a and b.
Similarly, regression equation of Y on X is :
Y = a + bX
And if we solve the following two equations, we can find the values of
constants a and b.
∑Y = aN + b∑X (i)
∑XY = a∑X + b∑X² (ii)
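The two normal equations for the regression of Y on X can be solved directly. The following is a minimal Python sketch, not the text's own procedure: the function name `fit_y_on_x` and the sample figures are hypothetical, chosen so that the points lie exactly on Y = 2 + 3X.

```python
def fit_y_on_x(xs, ys):
    """Solve the normal equations of Y = a + bX for the constants a and b."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Eliminating a between equations (i) and (ii) gives b ...
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # ... and substituting b back into equation (i) gives a.
    a = (sum_y - b * sum_x) / n
    return a, b


# Hypothetical data lying exactly on the line Y = 2 + 3X
xs = [1, 2, 3, 4, 5]
ys = [5, 8, 11, 14, 17]
a, b = fit_y_on_x(xs, ys)
print(f"Y = {a:.2f} + {b:.2f} X")
```

Calling `fit_y_on_x(ys, xs)` with the arguments swapped solves the other pair of normal equations and gives the constants of the regression equation of X on Y.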
Illustration : Students of a class have obtained the following marks in Paper
I and Paper II of statistics:
Paper I   45 55 56 58 60 65 68 70 75 80 85
Paper II  56 50 48 60 62 64 65 70 74 82 90
Find the means, the coefficient of correlation and the regression coefficients.
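The quantities asked for in this illustration can be checked with a short Python sketch. It assumes the usual deviation-from-mean formulas (r = ∑xy / √(∑x² ∑y²), b of Y on X = ∑xy / ∑x², b of X on Y = ∑xy / ∑y², where x and y are deviations from the means); the variable names are illustrative only.

```python
from math import sqrt

paper1 = [45, 55, 56, 58, 60, 65, 68, 70, 75, 80, 85]  # X: marks in Paper I
paper2 = [56, 50, 48, 60, 62, 64, 65, 70, 74, 82, 90]  # Y: marks in Paper II

n = len(paper1)
mean_x = sum(paper1) / n
mean_y = sum(paper2) / n

# Sums of products and squares of deviations from the means
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(paper1, paper2))
sxx = sum((x - mean_x) ** 2 for x in paper1)
syy = sum((y - mean_y) ** 2 for y in paper2)

r = sxy / sqrt(sxx * syy)   # coefficient of correlation
b_yx = sxy / sxx            # regression coefficient of Y on X
b_xy = sxy / syy            # regression coefficient of X on Y

print(f"mean of X = {mean_x:.2f}, mean of Y = {mean_y:.2f}")
print(f"r = {r:.4f}, b_yx = {b_yx:.4f}, b_xy = {b_xy:.4f}")
```

A useful check on any hand computation is that the product of the two regression coefficients equals r².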
6.5 Difference between Correlation and Regression Analysis
Correlation and regression analysis are two important statistical
tools for studying the relationship between variables. The differences between the
two are as follows:
Correlation versus Regression Analysis
1. Correlation: Correlation measures the relationship between two
variables which vary in the same or opposite direction.
   Regression: Regression means going back or the act of return. It is a
mathematical measure which shows the average relationship between the two
variables.
2. Correlation: Here both X and Y are random variables.
   Regression: Here X is a random variable and Y is a fixed variable. However,
both variables may be random variables.
3. Correlation: There can be nonsense or spurious correlation between two
variables.
   Regression: There is no such nonsense regression equation.
4. Correlation: The coefficient of correlation is a relative measure and it
ranges from -1 to +1.
   Regression: The regression coefficient is an absolute measure. If we know the
value of the independent variable, we can estimate the value of the
dependent variable.
Chapter Six: End Chapter Quizzes
1. The term regression was introduced by
a- R. A. Fisher
b- Sir Francis Galton
c- Karl Pearson
d- none of the above
2. If X and Y are two variates, there can be at most
a- one regression line
b- two regression lines
c- three regression lines
d- an infinite number of regression lines
3. In the regression line of Y on X, the variable X is known as
a- independent variable
b- regressor
c- explanatory variable
d- all the above
4. Regression equation is also named as
a- prediction equation
b- estimating equation
c- line of average relationship
d- all the above
5. Scatter diagram of the variate values (X, Y) gives the idea
about
a- functional relationship
b- regression model
c- distribution of errors
d- none of the above
6. If ρ = 0, the lines of regression are
a- coincident
b- parallel
c- perpendicular to each other
d- none of the above
7. Regression coefficient is independent of
a- origin
b- scale
c- both origin and scale
d- neither origin nor scale
8. Regression analysis can be used for
a- reducing the length of confidence interval
b- for prediction of dependent variate value
c- to know the true effect of certain treatments
d- all the above
9. Probable error is used for
a- measuring the error in r
b- testing the significance of r
c- both (a) and (b)
d- neither (a) nor (b)
10. If ρ = 0, the angle between the two lines of regression is
a- 0 degree
b- 90 degree
c- 60 degree
d- 30 degree
CHAPTER SEVEN: TIME SERIES ANALYSIS
7.1 Meaning
In statistics, signal processing, and many other fields, a time series is a
sequence of data points, measured typically at successive times, spaced at
(often uniform) time intervals. Time series analysis comprises methods that
attempt to understand such time series, often either to understand the
underlying context of the data points (where did they come from? what
generated them?), or to make forecasts (predictions). Time series forecasting
is the use of a model to forecast future events based on known past events: to
forecast future data points before they are measured. A standard example in
econometrics is forecasting the opening price of a share of stock from its past
performance.
The term time series analysis is used to distinguish a problem, firstly
from more ordinary data analysis problems (where there is no natural
ordering of the context of individual observations), and secondly from spatial
data analysis where there is a context that observations (often) relate to
geographical locations. There are additional possibilities in the form of space-
time models (often called spatial-temporal analysis). A time series model will
generally reflect the fact that observations close together in time will be more
closely related than observations further apart. In addition, time series models
will often make use of the natural one-way ordering of time so that values in a
series for a given time will be expressed as deriving in some way from past
values, rather than from future values (see time reversibility.)
So a time series is a sequence of observations which are ordered in time
(or space). If observations are made on some phenomenon throughout time, it
is most sensible to display the data in the order in which they arose,
particularly since successive observations will probably be dependent. Time
series are best displayed in a scatter plot. The series value X is plotted on the
vertical axis and time t on the horizontal axis. Time is called the independent
variable (in this case however, something over which you have little control).
There are two kinds of time series data:
1. Continuous, where we have an observation at every instant
of time, e.g. lie detectors, electrocardiograms. We denote this using
observation X at time t, X(t).
2. Discrete, where we have an observation at (usually
regularly) spaced intervals. We denote this as Xt.
7.2 Definitions
“A set of data depending on the time is called a time series.”
------- Kenny and Keeping
“A time series consists of data arranged chronologically.”
------- Croxton and Cowden
“A time series may be defined as a sequence of repeated measurements of a
variable made periodically through time.”
------- C.H.Mayers
7.3 Applications of time series
The application of time series models is twofold:
1. Obtain an understanding of the underlying forces and
structure that produced the observed data.
2. Fit a model and proceed to forecasting, monitoring or even
feedback and feedforward control.
Time series analysis is used in many applications. A few of them are as
follows:
Economic Forecasting
Sales Forecasting
Budgetary Analysis
Stock Market Analysis
Yield Projections
Process and Quality Control
Inventory Studies
Workload Projections
Utility Studies
Census Analysis
7.4 Uses or importance of Time-series
Analysis of time series is useful in every walk of life, such as business,
economics, science, the state, sociology and research work. Its main
objectives are as follows:
7.4.1 Study of past behaviour: Analysis of time series studies the
past behaviour of data and indicates the changes that have taken place in the
past.
7.4.2 Prediction for future: On the basis of analysis of time series,
future predictions can be made easily. For instance, we can predict future sales
and necessary alterations can be done in the production policy.
7.4.3 Facilitate comparisons: We can compare various time series,
for example of the death rate, birth rate or yield per acre.
7.4.4 Evaluation of actual data: By analysing the deviations between the
actual data and the estimates obtained from analysis of the time series, we can
investigate the causes of these deviations.
7.4.5 Prediction of trade cycle: We can learn about the phases of
cyclical variation, such as boom, recession, depression and recovery, which are very
important to the business community.
7.4.6 Universal utility: The analysis of time series is useful not only
to the business community and economists but equally to agriculturists,
governments, researchers, political and social institutions, scientists etc.
7.5 Difference between seasonal and cyclical variations
The main differences between the two are as follows:
7.5.1 Time period: The duration of a seasonal variation is always one
year, while the duration of a cyclical variation is more than one year and
varies from three to eight years.
7.5.2 Regularity: There is regularity in the components of seasonal
variation, while there is no regularity in the components of cyclical variation;
even the lengths of the phases of cyclical variation, viz. boom,
disinflation, depression and recovery, are not equal.
7.5.3 Causes of variations: Seasonal variation takes place due to
changes in seasons, customs, habits, fashion etc., while cyclical variation takes
place due to changes in economic activity.
7.5.4 Measurement: Both variations can be measured, but
the techniques differ. Seasonal variation can be measured more precisely
because it is regular in nature.
7.5.5 Effect of variation: Seasonal variation affects different people
in different ways, while the effect of cyclical variation is the same across the
whole economy.
7.6 Components of time series
The components of a time series are as follows:
7.6.1 Trend Component
We want to increase our understanding of a time series by picking out
its main features. One of these main features is the trend component.
Descriptive techniques may be extended to forecast (predict) future values.
Trend is a long term movement in a time series. It is the underlying
direction (an upward or downward tendency) and rate of change in a time
series, when allowance has been made for the other components.
A simple way of detecting trend in seasonal data is to take averages over
a certain period. If these averages change with time we can say that there is
evidence of a trend in the series. There are also more formal tests to enable
detection of trend in time series.
It can be helpful to model trend using straight lines, polynomials etc.
7.6.2 Cyclical Component
We want to increase our understanding of a time series by picking out
its main features. One of these main features is the cyclical component.
Descriptive techniques may be extended to forecast (predict) future values.
In weekly or monthly data, the cyclical component describes any
regular fluctuations. It is a non-seasonal component which varies in a
recognisable cycle.
7.6.3 Seasonal Component
We want to increase our understanding of a time series by picking out
its main features. One of these main features is the seasonal component.
Descriptive techniques may be extended to forecast (predict) future values.
In weekly or monthly data, the seasonal component, often referred to as
seasonality, is the component of variation in a time series which is dependent
on the time of year. It describes any regular fluctuations with a period of less
than one year. For example, the costs of various types of fruits and vegetables,
unemployment figures and average daily rainfall, all show marked seasonal
variation. We are interested in comparing the seasonal effects within the
years and from year to year, in removing seasonal effects so that the time series is
easier to work with, and in adjusting a series for seasonal
effects using various models.
7.6.4 Irregular Component
We want to increase our understanding of a time series by picking out
its main features. One of these main features is the irregular component (or
'noise'). Descriptive techniques may be extended to forecast (predict) future
values.
The irregular component is what is left over when the other components of
the series (trend, seasonal and cyclical) have been accounted for.
7.7 Methods of measuring secular trend or trend
Broadly speaking, there are four methods of measuring trend. They are
as follows :
7.7.1 Free hand curve method: This is the easiest and simplest
method of computing secular trend. In this method, time is plotted on the X-axis
and the other variable is plotted on the Y-axis. A free hand curve is then drawn
so as to pass through the centre of the original fluctuations.
Merits:
1. It is the easiest and simplest method of finding trend
values.
2. The trend line is drawn freehand, without using a scale (ruler), so it may be a
straight line or a smooth curve.
3. The method is free from any mathematical formulas.
Demerits:
1. The straight line trends (Yt) drawn on graph will differ
from person to person in the absence of any mathematical formula.
2. If the statistician is biased, the free hand curve will also be
biased.
7.7.2 Semi average method: This is a better technique than the
free hand curve method. Under this method the series of the variable (Y) is divided into two
equal parts and the average of each part is computed separately. The two
averages are plotted against the mid-points of their respective periods and
joined by a straight line, which gives the trend.
Merits:
1. This method is simple and easy to understand compared with the
moving average and least square methods.
2. The trend line (Yt) in this method is a fixed straight line,
unlike the free hand curve method where the trend line depends upon the
personal judgement of the statistician.
Demerits:
1. The method is based on the assumption of a linear trend,
whether it exists or not.
2. The method is affected by the limitations of the arithmetic
mean.
3. This method is not suitable for removing trend from the
original data.
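The semi average calculation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `semi_averages` and the yearly figures are hypothetical, and with an odd number of observations the middle value is left out, which is a common convention rather than something stated above.

```python
def semi_averages(values):
    """Return the averages of the first and second halves of a series
    and the trend increment per period implied by joining them."""
    n = len(values)
    half = n // 2
    first = values[:half]
    second = values[n - half:]      # skips the middle value when n is odd
    avg_first = sum(first) / half
    avg_second = sum(second) / half
    # The mid-points of the two halves are (n - half) periods apart.
    slope = (avg_second - avg_first) / (n - half)
    return avg_first, avg_second, slope


# Hypothetical yearly figures
output = [10, 12, 14, 16, 18, 20]
avg1, avg2, slope = semi_averages(output)
print(avg1, avg2, slope)
```

The straight line through the two averages, each placed at the mid-point of its half, is the semi average trend line; `slope` is its rise per period.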
7.7.3 Moving Average method: This method is a better technique for
finding trend than the semi average method. The trend values are
obtained with a fair degree of accuracy by eliminating cyclical fluctuations. In
this method we calculate a series of averages, each over a fixed number of
consecutive periods, moving forward one period at a time. The
period of the moving average is determined on the basis of the length of the cyclical
fluctuations, which varies from 3 to 11 years.
Merits:
1. This technique is easier than the method of least
square.
2. This technique is effective if the trend of the series is irregular.
Demerits:
1. In this method we cannot obtain trend values for all the
years: for example, the first and last years are left without a value when computing
a three-year moving average, and so on.
2. The basic purpose of trend values is to predict the future
trend. In this method we cannot extend the trend line in either
direction, so this method cannot be used for prediction purposes.
7.7.4 Method of least square: This is the best method of measuring
secular trend. It is a mathematical as well as an analytical tool. A trend line
can be fitted by this method to economic and business time series to make future predictions.
The trend line may be linear or non-linear.
Merits :
1. The method of least square does not suffer from subjectivity
or personal judgement as it is a mathematical method.
2. We can compute the trend value of all the given years by
this method.
Demerits:
1. The method is based on a mathematical technique, so it is not
easily understood by a non-mathematical person.
2. If we add or delete some observations in the data, the values
of the constants 'a' and 'b' will change and a new trend line will follow.
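A linear trend Y = a + bX can be fitted by least squares with the same normal equations used for regression. The sketch below is a hypothetical illustration, not the text's worked method: it measures time X as the deviation of each year from the middle of the period so that ∑X = 0, which reduces the normal equations to a = ∑Y / N and b = ∑XY / ∑X². The function names and figures are assumptions.

```python
def linear_trend(years, values):
    """Fit Y = a + bX by least squares, with X measured from the mean year."""
    n = len(years)
    origin = sum(years) / n
    xs = [year - origin for year in years]          # deviations sum to zero
    a = sum(values) / n                             # from sum(Y) = a * N
    b = sum(x * y for x, y in zip(xs, values)) / sum(x * x for x in xs)
    return origin, a, b


def trend_value(year, origin, a, b):
    """Trend value for any year, inside or beyond the observed period."""
    return a + b * (year - origin)


# Hypothetical sales figures
years = [2001, 2002, 2003, 2004, 2005]
sales = [10, 12, 14, 16, 18]
origin, a, b = linear_trend(years, sales)
forecast_2006 = trend_value(2006, origin, a, b)
print(a, b, forecast_2006)
```

Because the fitted line is an equation, it yields a trend value for every given year and can be extended to future years, which is the advantage over the moving average method noted above.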
7.8 Measurement of seasonal variations
The short term variations within a year in a time series are referred to
as seasonal variations. These variations are periodic in nature, viz., weekly,
monthly or quarterly changes. They may take place due to changes
in seasons such as summer, winter, the rainy season and autumn. Thus, seasonal variations
refer to an annual repetitive pattern in economic and business activity.
The following methods are used to measure seasonal variations:
7.8.1 Method of simple averages: This method involves the following
steps :
1. The given time series is arranged by years, months or
quarters.
2. Totals of each month for the given years are obtained.
3. The average of each month is then obtained by
dividing the monthly totals by the number of years.
4. The total of the monthly averages is obtained and divided by the
number of months in a year, giving the average of the monthly averages.
5. Taking the average of the monthly averages as the base,
the seasonal index is computed for each month by applying the
following formula:
Seasonal index = (Monthly average for the month / Average of monthly
averages) × 100
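The five steps above can be sketched in Python. The example uses quarters instead of months to keep the table small; the function name `simple_average_indices` and the figures are hypothetical.

```python
def simple_average_indices(table):
    """table: one row per year, one column per period (month or quarter).
    Returns the seasonal index of each period by the method of simple averages."""
    n_years = len(table)
    n_periods = len(table[0])
    # Steps 2 and 3: total and average of each period over the years
    period_avgs = [sum(row[p] for row in table) / n_years for p in range(n_periods)]
    # Step 4: average of the period averages
    grand_avg = sum(period_avgs) / n_periods
    # Step 5: seasonal index = period average / grand average * 100
    return [avg / grand_avg * 100 for avg in period_avgs]


# Hypothetical quarterly figures for three years
table = [
    [30, 40, 36, 34],
    [34, 52, 50, 44],
    [40, 58, 54, 48],
]
indices = simple_average_indices(table)
print([round(i, 2) for i in indices])
```

The indices always average 100, so with quarterly data they total 400 and with monthly data 1200.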
7.8.2 Ratio to trend method:
This method is based on the multiplicative model of time series. It assumes
that the seasonal variation for a given period is a constant fraction of the trend
value. The steps for computation of this method are:
1. First of all, trend values are calculated by applying the
method of least square to the yearly averages.
2. Trend values for each quarter are obtained from the
yearly trend values so obtained.
3. The original quarterly data are divided by the trend
value of the corresponding quarter and the quotient is multiplied by
hundred. These values are free from trend.
4. To free the data from cyclical and irregular
variations, these trend-free values are averaged for each quarter.
7.8.3 Link relative method: This is one of the most difficult methods of
obtaining seasonal variations. The steps involved in this method are:
1. Link relatives are calculated from the given quarterly
data by applying the formula:
Link relative = (Current quarter / Previous quarter) × 100
2. The average of the link relatives is obtained for each quarter.
3. Chain relatives are then calculated by using the formula:
Chain index = (Current quarter L.R. × Previous quarter chain index) / 100
4. The I quarter chain index is calculated again on the basis of the IV quarter.
5. Chain relatives are adjusted by
subtracting 1 × quarterly effect, 2 × quarterly effect and 3 × quarterly effect
from the chain relatives of the II, III and IV quarters respectively.
6. The seasonal index is finally computed. Since the total of the
quarterly indices should be 400, while the actual total will usually differ from it, the
seasonal index is computed as
Seasonal index = (Chain index of quarter × 400) / Actual total of chain
indices of four quarters.
7.9 Practical Problems:
Illustration: Find the 3-year moving averages from the following data :
Year   Sales (in lakh Rs.)
1990   3
1991   8
1992   10
1993   9
1994   12
1995   15
1996   13
1997   18
1998   17
1999   20
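The 3-year moving averages for this illustration can be computed with a short Python sketch. The function name `moving_average` is hypothetical; each average is placed against the middle year of its three-year window, so no value is obtained for 1990 or 1999.

```python
def moving_average(values, period=3):
    """Simple moving averages over consecutive windows of `period` values."""
    return [
        sum(values[i:i + period]) / period
        for i in range(len(values) - period + 1)
    ]


years = list(range(1990, 2000))
sales = [3, 8, 10, 9, 12, 15, 13, 18, 17, 20]   # in lakh Rs.

averages = moving_average(sales, period=3)
centre_years = years[1:-1]                       # 1991 to 1998

for year, avg in zip(centre_years, averages):
    print(year, round(avg, 2))
```

The first window (3 + 8 + 10) / 3 gives 7 against 1991, the second (8 + 10 + 9) / 3 gives 9 against 1992, and so on.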
Link relative method :
Illustration : Compute seasonal variations by using Link Relative Method
from the following data:
Year   I Quarter   II Quarter   III Quarter   IV Quarter
I      45          54           72            60
II     48          56           63            56
III    49          63           70            65
IV     52          65           75            72
(iv) Total of corrected chain relatives = 100 + 120.08 + 140.86 + 124.74
= 485.68
(v) Seasonal Index
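The whole link relative calculation for the quarterly data of this illustration can be sketched in Python. Two details are assumptions, since the steps do not spell them out: link relatives are averaged with the arithmetic mean, and the quarterly effect is taken as one quarter of the difference between the second chain index of the I quarter and 100. Intermediate values from this sketch need not agree exactly with hand-rounded figures.

```python
def link_relative_indices(table):
    """table: one row per year, four quarterly values per row."""
    flat = [value for row in table for value in row]
    # Step 1: link relative of each quarter on the preceding quarter
    links = [[] for _ in range(4)]
    for i in range(1, len(flat)):
        links[i % 4].append(flat[i] / flat[i - 1] * 100)
    # Step 2: average link relative for each quarter
    avg = [sum(q) / len(q) for q in links]
    # Step 3: chain relatives, taking the I quarter as 100
    chain = [100.0]
    for q in range(1, 4):
        chain.append(avg[q] * chain[q - 1] / 100)
    # Step 4: second chain index of the I quarter, based on the IV quarter
    first_again = avg[0] * chain[3] / 100
    # Step 5: spread the discrepancy evenly (assumed definition of quarterly effect)
    effect = (first_again - 100) / 4
    corrected = [chain[q] - q * effect for q in range(4)]
    # Step 6: scale so that the four indices total 400
    total = sum(corrected)
    return [c * 400 / total for c in corrected]


table = [
    [45, 54, 72, 60],
    [48, 56, 63, 56],
    [49, 63, 70, 65],
    [52, 65, 75, 72],
]
seasonal = link_relative_indices(table)
print([round(s, 2) for s in seasonal])
```

The final scaling is what forces the total of the four seasonal indices to 400, whatever the total of the corrected chain relatives is.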
Chapter Seven: End Chapter Quizzes
1. A time series is a set of data recorded
a- periodically
b- at time or space intervals
c- at successive points of time
d- all the above
2. The time series analysis helps
a- to compare the two or more series
b- to know the behaviour of business
c- to make predictions
d- all the above
3. A time series is unable to adjust the influences like
a- customs and policy changes
b- seasonal changes
c- long-term influences
d- none of the above
4. A time series consists of
a- two components
b- three components
c- four components
d- five components
5. The forecasts on the basis of a time series are
a- cent per cent true
b- true to a great extent
c- never true
d- none of the above
6. The component of a time series attached to long-term
variations is termed as
a- cyclic variation
b- secular trend
c- irregular variation
d- all the above
7. Secular trend is indicative of long-term variation towards
a- increase only
b- decrease only
c- either increase or decrease
d- none of the above
8. Linear trend of a time series indicates towards
a- constant rate of change
b- constant rate of growth
c- change in geometric progression
d- all the above
9. Seasonal variation means the variations occurring within
a- a number of years
b- parts of year
c- parts of month
d- none of the above
10. Cyclic variations in a time series are caused by
a- lockouts in a factory
b- war in a country
c- floods in the states
d- none of the above
CHAPTER EIGHT: PROBABILITY
8.1 Introduction
The theory of probability began to develop in the 17th
century, and its history suggests that it developed with the study of games of
chance, such as rolling a die, drawing a card, flipping a coin etc. Apart from
these, uncertainty prevails in every sphere of life. For instance, one often
predicts: "It will probably rain tonight." "It is quite likely that there will be a
good yield of cereals this year" and so on. This indicates that, in layman's
terminology, the word 'probability' connotes that there is uncertainty
about the happening of events. To put 'probability' on a better footing we
define it. But before doing so, we have to explain a few terms.
8.2 Concepts of probability calculation
Following are the fundamental concepts of probability calculation:
8.2.1 Trial
A procedure or an experiment to collect any statistical data such as
rolling a dice or flipping a coin is called a trial.
8.2.2 Random Trial or Random Experiment
When the outcome of an experiment cannot be predicted precisely,
the experiment is called a random trial or random experiment. In other
words, if a random experiment is repeated under identical conditions, the
outcome will vary at random, as it is impossible to predict the
result of the experiment. For example, if we toss an honest coin or roll
an unbiased die, we may not get the results we expect.
8.2.3 Sample space
The totality of all the outcomes or results of a random experiment is
denoted by a Greek or English letter and is called the sample
space. Each outcome or element of this sample space is known as a sample
point.
8.2.4 Event
Any subset of a sample space is called an event. A sample space S serves
as the universal set for all questions related to an experiment, and an event
A with respect to it is the set of all possible outcomes favorable to the event A.
For example,
A random experiment :- flipping a coin twice
Sample space :- S = {(HH), (HT), (TH), (TT)}
The question : "both the flips show the same face"
Therefore, the event A : { (HH), (TT) }
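The same example can be enumerated in Python, which makes the idea of an event as a subset of the sample space concrete. The variable names are illustrative only.

```python
from itertools import product

# Sample space for flipping a coin twice: all ordered pairs of H and T
sample_space = set(product("HT", repeat=2))

# Event A: both flips show the same face
event_a = {outcome for outcome in sample_space if outcome[0] == outcome[1]}

print(sorted(sample_space))
print(sorted(event_a))
print(event_a <= sample_space)   # an event is a subset of the sample space
```

Here `sample_space` has four sample points and `event_a` contains two of them, (H, H) and (T, T).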
8.2.5 Equally Likely Events
The possible results of a random experiment are called equally likely
outcomes if we have no reason to expect any one rather than another. For
example, when a card is drawn from a well shuffled pack, any card
may appear in the draw, so that the 52 cards become 52 different events which
are equally likely.
8.2.6 Mutually Exclusive Events
Events are called mutually exclusive or disjoint or incompatible if the
occurrence of one of them precludes the occurrence of all the others. For
example, in tossing a coin there are two mutually exclusive events, viz. turning
up of a head and turning up of a tail, since both these events cannot happen
simultaneously. Note that events are compatible if it is possible for them to
happen simultaneously. For instance, in rolling two dice, the cases of the
face marked 5 appearing on one die and face 5 appearing on the other are
compatible.
8.2.7 Exhaustive Events
Events are exhaustive when they include all the possibilities associated
with the same trial. In tossing a coin, the turning up of a head and of a tail
are exhaustive events, assuming of course that the coin cannot rest on its edge.
8.2.8 Independent Events
Two events are said to be independent if the occurrence of one event
does not affect the occurrence of the other. For example, in tossing a
coin, the events corresponding to two successive tosses are independent.
Likewise, the flip of a penny does not affect in any way the flip of a
nickel.
8.2.9 Dependent Events
If the occurrence or non-occurrence of one event affects the happening
of the other, then the events are said to be dependent events. For example, in
drawing cards from a pack, let the event A be the occurrence of a king in the
first draw and B be the occurrence of a king in the second draw. If the card
drawn at the first trial is not replaced, then events A and B are dependent
events.
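To see why drawing without replacement makes the two draws depend on each other, the chance of a king on the second draw can be computed for each possible first draw. This is a small illustrative sketch in Python; the variable names are assumptions made here, not notation from the text:

```python
from fractions import Fraction

KINGS, DECK = 4, 52

# Second draw WITHOUT replacement: only 51 cards are left.
p_king_after_king = Fraction(KINGS - 1, DECK - 1)   # first card was a king
p_king_after_other = Fraction(KINGS, DECK - 1)      # first card was not a king

# Second draw WITH replacement: the first draw changes nothing.
p_king_with_replacement = Fraction(KINGS, DECK)

print(p_king_after_king, p_king_after_other, p_king_with_replacement)
```

Without replacement the two conditional chances differ (1/17 against 4/51), so the outcome of the first draw affects the second. With replacement the chance stays at 1/13 whatever happened first.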
Note
(1) If an event contains a single sample point, i.e. it is a singleton set,
then this event is called an elementary or a simple event.
(2) An event corresponding to the empty set is an "impossible event".
(3) An event corresponding to the entire sample space is called a
"certain event".
8.2.10 Complementary Events
Let S be the sample space for an experiment and A be an event in S.
Then A is a subset of S. Hence the complement of A in S, denoted $\bar{A}$, is
also an event in S; it contains the outcomes which are not favorable to the
occurrence of A. That is, if A occurs, then the outcome of the experiment
belongs to A, but if A does not occur, then the outcome of the experiment
belongs to $\bar{A}$.
It is obvious that A and $\bar{A}$ are mutually exclusive:
$A \cap \bar{A} = \emptyset$ and $A \cup \bar{A} = S$.
If S contains n equally likely, mutually exclusive and exhaustive points
and A contains m out of these n points, then $\bar{A}$ contains (n - m) sample
points.
8.3 Definitions
We shall now consider two definitions of probability :
8.3.1 Mathematical or a priori or classical.
8.3.2 Statistical or empirical.
8.3.1 Mathematical (or A Priori or Classic) Definition
If there are n exhaustive, mutually exclusive and equally likely cases
and m of them are favorable to an event A, the probability of A happening is
defined as the ratio m/n.
Expressed as a formula :
$$P(A) = \frac{m}{n} = \frac{\text{number of favorable cases}}{\text{total number of equally likely cases}}$$
This definition is due to Laplace. Thus probability is a concept which
measures numerically the degree of certainty or uncertainty of the occurrence
of an event.
For example, the probability of randomly drawing a king from a
well-shuffled deck of cards is 4/52, since 4 is the number of favorable
outcomes (i.e. the 4 kings of diamonds, spades, clubs and hearts) and 52 is
the number of total outcomes (the number of cards in a deck).
If A is any event of a sample space having probability p, then clearly p
is a non-negative number (expressed as a fraction or, usually, as a decimal)
not greater than unity: $0 \le p \le 1$, i.e. from 0 (no chance, for an
impossible event) to a high of 1 (certainty). Since the number of cases not
favorable to A is (n - m), the probability q that event A will not happen is
$q = \frac{n - m}{n}$, or q = 1 - m/n, or q = 1 - p.
Now note that the probability q is nothing but the probability of the
complementary event $\bar{A}$, i.e. $q = P(\bar{A})$.
Thus $P(\bar{A}) = 1 - p$ or $P(\bar{A}) = 1 - P(A)$,
so that $P(A) + P(\bar{A}) = 1$, i.e. p + q = 1.
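The classical ratio m/n and the complement rule p + q = 1 can be illustrated with exact fractions. The sketch below is only an illustration in Python; the helper name `classical_probability` is hypothetical and not taken from the text:

```python
from fractions import Fraction

def classical_probability(favorable: int, total: int) -> Fraction:
    """Return m/n for m favorable cases out of n equally likely cases."""
    if total <= 0 or not 0 <= favorable <= total:
        raise ValueError("need total > 0 and 0 <= favorable <= total")
    return Fraction(favorable, total)

p = classical_probability(4, 52)   # drawing a king
q = 1 - p                          # not drawing a king
print(p, q, p + q)
```

Using `Fraction` rather than floating point keeps the results as exact ratios, so 4/52 is reduced to 1/13 and p + q comes out as exactly 1.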
Relative Frequency Definition
The classical definition of probability has a disadvantage: the words
"equally likely" are vague. In fact, since these words seem to be synonymous
with "equally probable", the definition is circular, as it defines probability
in terms of itself. Therefore, the estimated or empirical probability of an
event is taken as the relative frequency of the occurrence of the event when
the number of observations is very large.
8.3.2 Von Mises' Statistical (or Empirical) Definition
If trials are repeated a great number of times under essentially the
same conditions, then the limit of the ratio of the number of times that an
event happens to the total number of trials, as the number of trials increases
indefinitely, is called the probability of the happening of the event.
It is assumed that the limit exists and is finite and unique. Symbolically,
if the event A happens m times in n trials,
$$P(A) = p = \lim_{n \to \infty} \frac{m}{n}$$
provided the limit is finite and unique.
The two definitions are apparently different, but both of them can be
reconciled in the same sense.
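The empirical definition can be illustrated by simulation: the relative frequency of heads in repeated tosses of a fair coin tends to settle near the classical value 1/2 as the number of tosses grows. The following Python sketch is an illustration under stated assumptions (a pseudo-random generator stands in for a real coin, and the function name and seed are chosen here):

```python
import random

def relative_frequency(trials: int, seed: int = 0) -> float:
    """Fraction of heads in `trials` simulated tosses of a fair coin."""
    rng = random.Random(seed)          # fixed seed so runs are repeatable
    heads = sum(rng.random() < 0.5 for _ in range(trials))
    return heads / trials

for n in (10, 1_000, 100_000):
    print(n, relative_frequency(n))
```

A simulation can only suggest the limit, since any finite number of trials gives an estimate rather than the probability itself.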
Example Find the probability of getting heads in tossing a coin.
Solution : Experiment : Tossing a coin
Sample space : S = { H, T} n (S) = 2
Event A : getting heads
A = { H} n (A) = 1
Therefore, p (A) = n (A) / n (S) = 1/2 or 0.5
Example Find the probability of getting 3 or 5 in throwing a die.
Solution : Experiment : Throwing a die
Sample space : S = {1, 2, 3, 4, 5, 6} n (S) = 6
Event A : getting 3 or 5
A = {3, 5} n (A) = 2
Therefore, p (A) = n (A) / n (S) = 2/6 = 1/3
Example Two dice are rolled. Find the probability that the score on the
second die is greater than the score on the first die.
Solution : Experiment : Two dice are rolled
Sample space : S = {(1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (2, 1),
(2, 2), ..., (6, 5), (6, 6)}
n (S) = 6 × 6 = 36
Event A : The score on the second die > the score on the first die,
i.e. A = {(1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (2, 3), (2, 4), (2, 5), (2, 6),
(3, 4), (3, 5), (3, 6), (4, 5), (4, 6), (5, 6)}
n (A) = 15
Therefore, p (A) = n (A) / n (S) = 15/36 = 5/12
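The count n (A) = 15 out of n (S) = 36 in the example above can be confirmed by enumerating all ordered pairs. This is a minimal Python sketch with illustrative names:

```python
from fractions import Fraction
from itertools import product

# All 36 ordered outcomes (first die, second die).
sample_space = list(product(range(1, 7), repeat=2))

# Event A: second score strictly greater than the first.
event_a = [(first, second) for first, second in sample_space if second > first]

p_a = Fraction(len(event_a), len(sample_space))
print(len(event_a), len(sample_space), p_a)
```

The ratio 15/36 reduces to 5/12.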
Example A coin is tossed three times. Find the probability of getting at
least one head.
Solution : Experiment : A coin is tossed three times.
Sample space : S = {(HHH), (HHT), (HTH), (HTT), (THH), (THT), (TTH),
(TTT)}
n (S) = 8
Event A : getting at least one head
so that $\bar{A}$ : getting no head at all
$\bar{A}$ = {(TTT)}, n ($\bar{A}$) = 1
P ($\bar{A}$) = 1/8
Therefore, P (A) = 1 - P ($\bar{A}$) = 1 - 1/8 = 7/8
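The "at least one" example above uses the complement: it is easier to count the outcomes with no head at all and subtract from 1. The sketch below, in Python with names chosen here for illustration, applies the complement rule and cross-checks it by counting the event directly:

```python
from fractions import Fraction
from itertools import product

outcomes = list(product("HT", repeat=3))            # 8 outcomes
no_head = [o for o in outcomes if "H" not in o]     # complement of A

p_no_head = Fraction(len(no_head), len(outcomes))
p_at_least_one_head = 1 - p_no_head

# Cross-check by counting A directly.
direct = Fraction(sum("H" in o for o in outcomes), len(outcomes))
print(p_at_least_one_head, direct)
```

Both routes give 7/8.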
Example A ball is drawn at random from a box containing 6 red balls, 4
white balls and 5 blue balls. Determine the probability that the ball drawn is
(i) red (ii) white (iii) blue (iv) not red (v) red or white.
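Solution : Let R, W and B denote the events of drawing a red ball, a
white ball and a blue ball respectively. The box contains 6 + 4 + 5 = 15 balls.
(i) P (R) = 6/15 = 2/5
(ii) P (W) = 4/15
(iii) P (B) = 5/15 = 1/3
(iv) P (not R) = 1 - P (R) = 1 - 2/5 = 3/5
(v) P (R or W) = P (R) + P (W) = 6/15 + 4/15 = 10/15 = 2/3
Note : The two events R and W are "disjoint" events.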
Example What is the chance that a leap year selected at random will contain
53 Sundays ?
Solution : A leap year has 52 weeks and 2 more days.
The two days can be :
Monday - Tuesday
Tuesday - Wednesday
Wednesday - Thursday
Thursday - Friday
Friday - Saturday
Saturday - Sunday and
Sunday - Monday.
There are 7 equally likely outcomes and 2 of them (Saturday - Sunday and
Sunday - Monday) are favorable to the 53rd Sunday.
Now for 53 Sundays in a leap year, P(A) = 2/7 = 0.29 (approximately)
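The leap-year argument can be checked by counting Sundays for each possible weekday on which the year might begin. The Python sketch below follows the text's assumption that all 7 starting weekdays are equally likely; the weekday numbering and function name are conventions chosen here:

```python
from fractions import Fraction

DAYS_IN_LEAP_YEAR = 366
SUNDAY = 6  # weekdays numbered 0 = Monday ... 6 = Sunday (a convention chosen here)

def sundays_in_year(first_weekday: int) -> int:
    """Count Sundays in a 366-day year whose first day falls on `first_weekday`."""
    return sum((first_weekday + day) % 7 == SUNDAY for day in range(DAYS_IN_LEAP_YEAR))

counts = [sundays_in_year(start) for start in range(7)]
p_53_sundays = Fraction(counts.count(53), len(counts))
print(counts, p_53_sundays)
```

Only the years starting on a Saturday or a Sunday reach 53 Sundays, which gives 2/7.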
Example If four ladies and six gentlemen sit for a photograph in a row
at random, what is the probability that no two ladies will sit together ?
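Solution : Ten persons can sit in a row in 10! ways. Now if no two ladies
are to be together, the six gentlemen can first be seated in 6! ways; the
ladies then have 7 positions, 2 at the ends and 5 between the gentlemen :
Arrangement : L, G1, L, G2, L, G3, L, G4, L, G5, L, G6, L
The four ladies can occupy 4 of these 7 positions in 7 × 6 × 5 × 4 = 840 ways.
Therefore, required probability = (6! × 840) / 10! = 1/6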
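The seating problem can be checked by brute force. Since adjacency depends only on which seats the ladies occupy, it is enough to enumerate the choices of 4 seats out of 10 and count those with no two seats next to each other. This Python sketch uses that simplification, which is an approach chosen here rather than the method of the text:

```python
from fractions import Fraction
from itertools import combinations

SEATS, LADIES = 10, 4

# Every choice of 4 seats for the ladies is equally likely; the order of
# people within the chosen seats does not affect adjacency.
seatings = list(combinations(range(SEATS), LADIES))
separated = [s for s in seatings if all(b - a > 1 for a, b in zip(s, s[1:]))]

p = Fraction(len(separated), len(seatings))
print(len(separated), len(seatings), p)
```

There are 35 suitable seat choices out of 210, which gives 1/6.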
Example In a class there are 13 students. 5 of them are boys and the rest
are girls. Find the probability that two students selected at random will
both be girls.
Solution : Two students out of 13 can be selected in $\binom{13}{2} = 78$ ways
and two girls out of 8 can be selected in $\binom{8}{2} = 28$ ways.
Therefore, required probability = 28/78 = 14/39
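Selections "without regard to order" are counted with binomial coefficients, which Python exposes as `math.comb`. A short sketch for the class example, with variable names chosen for illustration:

```python
from fractions import Fraction
from math import comb

students, girls = 13, 13 - 5

p_two_girls = Fraction(comb(girls, 2), comb(students, 2))
print(comb(girls, 2), comb(students, 2), p_two_girls)
```

The counts are 28 and 78, so the probability reduces to 14/39.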
Example A box contains 5 white balls, 4 black balls and 3 red balls.
Three balls are drawn randomly. What is the probability that they will be (i)
white (ii) black (iii) red ?
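Solution : Let W, B and R denote the events of drawing three white,
three black and three red balls respectively. Three balls out of
5 + 4 + 3 = 12 can be drawn in $\binom{12}{3} = 220$ ways.
(i) P (W) = $\binom{5}{3} / \binom{12}{3}$ = 10/220 = 1/22
(ii) P (B) = $\binom{4}{3} / \binom{12}{3}$ = 4/220 = 1/55
(iii) P (R) = $\binom{3}{3} / \binom{12}{3}$ = 1/220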
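For the box of 5 white, 4 black and 3 red balls, the three probabilities can be obtained by listing every possible draw of three balls and counting the draws of a single colour. The Python sketch below is illustrative; the helper `p_all` is a hypothetical name introduced here:

```python
from fractions import Fraction
from itertools import combinations

balls = ["W"] * 5 + ["B"] * 4 + ["R"] * 3
draws = list(combinations(range(len(balls)), 3))   # 220 equally likely draws

def p_all(colour: str) -> Fraction:
    """Probability that all three drawn balls have the given colour."""
    hits = sum(all(balls[i] == colour for i in draw) for draw in draws)
    return Fraction(hits, len(draws))

print(p_all("W"), p_all("B"), p_all("R"))
```

The balls are indexed individually so that each of the 220 draws is equally likely, as the classical definition requires.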
8.4 The Law of Probability
So far we have discussed probabilities of single events. In many
situations we come across two or more events occurring together. If A and B
are two events, the event that either A or B or both occur is denoted by
$A \cup B$ or (A + B), and the event that both A and B occur is denoted by
$A \cap B$ or AB. We term these situations compound events or the joint
occurrence of events. We may need the probability that A or B will happen.
It is denoted by $P(A \cup B)$ or P (A + B). Also we may need the
probability that A and B (both) will happen simultaneously. It is denoted by
$P(A \cap B)$ or P (AB).
Consider a situation in which you are asked to choose any 3 or any diamond
or both from a well-shuffled pack of 52 cards. Now you are interested in the
probability of this situation.
Now see the following diagram.
Now count the dots in the area which fulfills the condition
any 3 or any diamond or both. They are 16.
Thus the required probability = 16/52 = 4/13
In the language of set theory, the set "any 3 or any diamond or both" is
the union of the sets "any 3", which contains 4 cards, and "any diamond", which
contains 13 cards. The number of cards in their union is equal to the sum of
these numbers minus the number of cards in the space where they overlap.
Any point in this space, called the intersection of the two sets, is counted
twice (double counting), once in each set; here it is the single card, the 3
of diamonds. Dividing by 52 we get the required probability.
Thus P (any 3 or any diamond or both) = (4 + 13 - 1)/52 = 16/52 = 4/13
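In general, if the letters A and B stand for any two events, then
$$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$
Clearly, the outcomes of A and B here are not mutually exclusive. If A and B
are mutually exclusive, then $P(A \cap B) = 0$ and the law reduces to
$P(A \cup B) = P(A) + P(B)$.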
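The addition law for the card example (any 3 or any diamond or both) maps directly onto set union and intersection. The following Python sketch builds a 52-card deck as a set; the rank and suit labels and the helper `p` are naming choices made here for illustration:

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["spade", "heart", "diamond", "club"]
deck = set(product(RANKS, SUITS))                      # 52 cards

threes = {card for card in deck if card[0] == "3"}
diamonds = {card for card in deck if card[1] == "diamond"}

def p(event: set) -> Fraction:
    """Classical probability of an event within the deck."""
    return Fraction(len(event), len(deck))

lhs = p(threes | diamonds)
rhs = p(threes) + p(diamonds) - p(threes & diamonds)
print(lhs, rhs)
```

Both sides come out as 4/13. Subtracting the intersection removes the 3 of diamonds, which would otherwise be counted twice.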
Example Two dice are rolled. Find the probability that the score is an
even number or multiple of 3.
Solution : Two dice are rolled.
Sample space = {(1, 1), (1, 2), ............, (6, 6)}
n(S) = 6 × 6 = 36
Event E : The score is an even number or a multiple of 3.
Note that here score means the sum of the numbers on both the dice when
they land. For example, (1, 1) has score 1 + 1 = 2.
It is clear that the least score is 2 and the highest score, for (6, 6), is
6 + 6 = 12,
i.e. the score can be 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 or 12.
Let Event A : The score is an even number
A = {(1, 1), (1, 3), (1, 5), (2, 2), (2, 4), (2, 6), (3, 1), (3, 3), (3, 5), (4, 2), (4, 4), (4, 6),
(5, 1), (5, 3), (5, 5), (6, 2), (6, 4), (6, 6)}
Therefore n (A) = 18
Let Event B : The score is a multiple of 3,
i.e. 3, 6, 9 or 12
B = {(1, 2), (1, 5), (2, 1), (2, 4), (3, 3), (3, 6), (4, 2), (4, 5), (5, 1), (5, 4), (6, 3), (6, 6)}
n (B) = 12
Let Event $A \cap B$ (or AB) : The score is an even number and a multiple of 3
(i.e. common to both A and B)
AB = {(1, 5), (2, 4), (3, 3), (4, 2), (5, 1), (6, 6)}
n (AB) = 6
Therefore, P (E) = P (A) + P (B) - P (AB) = 18/36 + 12/36 - 6/36 = 24/36 = 2/3
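The counts n (A) = 18, n (B) = 12 and n (AB) = 6 in the dice example can be verified by enumeration, and the addition law can then be compared with a direct count of the union. A minimal Python sketch, with set names chosen here:

```python
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=2))

even = {r for r in rolls if sum(r) % 2 == 0}        # event A
mult3 = {r for r in rolls if sum(r) % 3 == 0}       # event B

p_union = Fraction(len(even | mult3), len(rolls))
by_law = Fraction(len(even) + len(mult3) - len(even & mult3), len(rolls))
print(len(even), len(mult3), len(even & mult3), p_union)
```

Counting the union directly and applying the addition law both give 24/36 = 2/3.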
Multiplication Law of Probability
If there are two independent events, the respective probabilities of which
are known, then the probability that both will happen is the product of the
probabilities of their happening respectively : P (AB) = P (A) P (B)
To compute the probability of two or more independent events all
occurring (joint occurrence), extend the above law to the required number of
events.
For example, first flip a penny, then a nickel and finally a dime.
On landing, the probability of heads is 1/2 for the penny,
the probability of heads is 1/2 for the nickel,
the probability of heads is 1/2 for the dime.
Thus the probability of landing three heads will be 1/2 × 1/2 × 1/2 = 1/8 or 0.125.
(Note that all three events are independent.)
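The multiplication law for the three coins can be compared against a direct count over all outcomes. This is an illustrative Python sketch; the variable names are chosen here:

```python
from fractions import Fraction
from itertools import product
from math import prod

p_heads = [Fraction(1, 2)] * 3            # penny, nickel, dime
by_law = prod(p_heads)                    # P(A) * P(B) * P(C)

# Cross-check by enumerating the 8 equally likely outcomes.
outcomes = list(product("HT", repeat=3))
by_count = Fraction(outcomes.count(("H", "H", "H")), len(outcomes))
print(by_law, by_count, float(by_law))
```

The product of the three probabilities and the direct count agree at 1/8, i.e. 0.125.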
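Example Three machines I, II and III manufacture respectively 0.4, 0.5
and 0.1 of the total production. The percentage of defective items produced by
I, II and III is 2, 4 and 1 percent respectively. For an item randomly chosen,
what is the probability that it is defective?
Solution: An item is defective if it comes from machine I and is defective,
or from machine II and is defective, or from machine III and is defective.
These three cases are mutually exclusive, so
P (defective) = (0.4 × 0.02) + (0.5 × 0.04) + (0.1 × 0.01) = 0.008 + 0.020 + 0.001 = 0.029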
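The machine example combines the multiplication law (share of production times defect rate) with the addition law for mutually exclusive cases. A small Python sketch of that calculation follows; the dictionary layout and names are assumptions made for illustration:

```python
from fractions import Fraction

# (share of production, defect rate) for machines I, II and III
machines = {
    "I": (Fraction(4, 10), Fraction(2, 100)),
    "II": (Fraction(5, 10), Fraction(4, 100)),
    "III": (Fraction(1, 10), Fraction(1, 100)),
}

# The three shares must cover the whole production.
assert sum(share for share, _ in machines.values()) == 1

p_defective = sum(share * rate for share, rate in machines.values())
print(p_defective, float(p_defective))
```

The result is 29/1000, i.e. 0.029.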
Example In shuffling a pack of cards, 4 are accidentally dropped one
after another. Find the chance that the missing cards should be one from each
suit.
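Solution: Let H, D, C and S denote heart, diamond, club and spade cards
respectively. The first card dropped may be of any suit. For the four cards to
come from different suits, the second card must be one of the 39 cards of the
other three suits out of the 51 remaining, the third must be one of the 26
cards of the two suits still missing out of 50, and the fourth must be one of
the 13 cards of the last suit out of 49.
Therefore, required probability = (52/52) × (39/51) × (26/50) × (13/49)
= 2197/20825 = 0.1055 (approximately)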
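The chance that four dropped cards come one from each suit can be computed in two ways that should agree: by counting unordered sets of four cards, or by multiplying the successive conditional chances. The Python sketch below shows both; it is an illustration with names chosen here, not the derivation printed in the text:

```python
from fractions import Fraction
from math import comb

# Unordered view: choose 1 card from each of the 4 suits, out of all 4-card sets.
p_one_per_suit = Fraction(13 ** 4, comb(52, 4))

# Ordered view: each successive card must avoid the suits already seen.
p_sequential = Fraction(52, 52) * Fraction(39, 51) * Fraction(26, 50) * Fraction(13, 49)
print(p_one_per_suit, p_sequential, float(p_one_per_suit))
```

Both views reduce to 2197/20825, roughly 0.1055.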
Example A problem in statistics is given to three students A, B and C
whose chances in solving it are respectively. What is the
probability that the problem will be solved ?
Solution : The probability that A can solve the problem = 1/2
The probability that A cannot solve the problem = 1 - 1/2 = 1/2
Similarly the probabilities that B and C cannot solve problem are
respectively.
Conditional Probability
In many situations you get more information than simply the total
outcomes and favorable outcomes you already have, and hence you are in a
position to make better-informed judgements regarding the probabilities of
such situations. For example, suppose a card is drawn at random from a deck
of 52 cards. Let B denote the event "the card is a diamond" and A denote the
event "the card is red". We may then consider the following probabilities.
P (A) = P (red) = 26/52 = 1/2
P (B) = P (diamond) = 13/52 = 1/4
Since there are 26 red cards of which 13 are diamonds, the probability
that a red card is a diamond is 13/26 = 1/2. In other words, the probability
of event B knowing that A has occurred is 1/2.
The probability of B under the condition that A has occurred is known
as conditional probability and it is denoted by P (B/A). Thus P (B/A) = 1/2.
It should be observed that the probability of the event B is increased due to
the additional information that the event A has occurred.
Conditional probability is found using the formula
$$P(B/A) = \frac{P(AB)}{P(A)}$$
Justification : $P(B/A) = \frac{n(AB)}{n(A)} = \frac{n(AB)/n(S)}{n(A)/n(S)} = \frac{P(AB)}{P(A)}$
Similarly, $P(A/B) = \frac{P(AB)}{P(B)}$
In both the cases, if A and B are independent events then P (A/B) = P (A) and
P (B/A) = P (B).
Therefore P (A) = P (AB) / P (B), i.e. P (AB) = P (A) . P (B)
or P (B) = P (AB) / P (A), i.e. P (AB) = P (A) . P (B)
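The conditional probability of a diamond given a red card can be computed from a deck represented as a set, using the ratio of P (AB) to P (A). This Python sketch is illustrative only; the deck encoding and the helper `p` are choices made here:

```python
from fractions import Fraction
from itertools import product

RANKS = range(1, 14)
SUITS = ["spade", "heart", "diamond", "club"]
deck = set(product(RANKS, SUITS))

red = {c for c in deck if c[1] in ("heart", "diamond")}      # event A
diamond = {c for c in deck if c[1] == "diamond"}             # event B

def p(event: set) -> Fraction:
    """Classical probability of an event within the deck."""
    return Fraction(len(event), len(deck))

p_b_given_a = p(red & diamond) / p(red)
print(p(diamond), p_b_given_a)
```

Knowing that the card is red raises the chance of a diamond from 1/4 to 1/2, which is the same as counting diamonds among the 26 red cards only.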
8.5 Importance of Probability
The theory of probability has its origin in the seventeenth century, when
it was developed as a quantitative measure of chance for problems related to
games of dice in gambling.
Later, the theory was used by mathematicians on other problems
pertaining to chance, such as the tossing of a coin, the possibility of
getting a card of a specific suit, or the possibility of getting balls of a
specific colour from a bag of balls. Nowadays the law of probability is used
to solve economic and business problems. It is even used to solve the
problems of our day-to-day life.
The utility of probability can be known by its various uses.
Following are the areas where probability theory has been used :
1. The fundamental laws of statistics, like the Law of Statistical Regularity
and the Law of Inertia of Large Numbers, are based on the theory of probability.
2. The various tests of significance, like the Z-test, F-test and Chi-square
test, are derived from the theory of probability.
3. This theory gives solutions to problems relating to games of
chance.
4. The decision theories are based on the fundamental laws of
probability.
5. The theory is generally used in economic and business decision
making. It is very useful in situations where risk and uncertainty
prevail.
6. Subjective probability is widely used in those situations where
actual measurement of probability is not feasible. It has, thus, added a new
dimension to the theory of probability. These probabilities can be revised at
a later stage on the basis of experience.
8.6 Practical Problems:
Illustration: A single letter is selected at random from the word
"PROBABILITY". What is the probability that it is a vowel?
Solution :
Total number of letters in the word "PROBABILITY" = n = 11
Number of favourable cases = m = 4 (the vowels are O, A, I, I)
We know that,
P(A) = m/n = 4/11
Illustration: Find the probability of having at least one son in a family if
there are two children in the family.
Solution:
Two children in a family may be either :
(1) Both sons
or (2) Son and daughter
or (3) Daughter and son
or (4) Both daughters
Thus, total number of equally likely cases = n = 4
At least one son implies that a family may have one son or two sons.
Thus, favourable number of cases = m = 3 (i.e., options 1, 2 and 3)
P(A) = m/n = 3/4
Illustration: Find the chance of getting an ace in a draw from a pack of
52 cards.
Solution:
Total number of cards = n = 52
Number of favourable cases = m = 4 (number of aces)
P(A) = m/n = 4/52 = 1/13
Illustration: Suppose an ideal die is tossed twice. What is the probability
of getting a sum of 10 in the two tosses?
Solution:
A die can be tossed the first time in 6 ways
A die can be tossed the second time in 6 ways
A die can be tossed twice in 6 × 6 = 36 ways (as per the rule of counting)
The number of ways in which we can throw two dice to get a sum of 10 is
m = 3 (i.e., 4 + 6, 5 + 5 and 6 + 4)
P(A) = m/n = 3/36 = 1/12
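Illustrations of this kind all reduce to counting favourable cases and total cases. As a hedged sketch in Python, two of them (the vowel in "PROBABILITY" and the sum of 10 on two dice) can be counted mechanically; the helper name `classical` is hypothetical:

```python
from fractions import Fraction
from itertools import product

def classical(favourable: int, total: int) -> Fraction:
    """Classical probability m/n."""
    return Fraction(favourable, total)

word = "PROBABILITY"
p_vowel = classical(sum(ch in "AEIOU" for ch in word), len(word))

rolls = list(product(range(1, 7), repeat=2))
p_sum_10 = classical(sum(a + b == 10 for a, b in rolls), len(rolls))
print(p_vowel, p_sum_10)
```

The counts are 4 vowels out of 11 letters and 3 favourable rolls out of 36, giving 4/11 and 1/12.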
Chapter Eight: End Chapter Quizzes
1. The outcome of tossing a coin is a
a- simple event
b- mutually exclusive event
c- complementary event
d- compound event
2. Classical probability is measured in terms of
a- an absolute value
b- a ratio
c- absolute value and ratio both
d- none of the above
3. Probability is expressed as
a- ratio
b- proportion
c- percentage
d- all the above
4. Classical probability is also known as
a- Laplace's probability
b- mathematical probability
c- a priori probability
d- all the above
5. Each outcome of a random experiment is called
a- Primary event
b- Compound event
c- Derived event
d- All the above
6. The definition of statistical probability was originally
given by
a- De Moivre
b- Laplace
c- Von-Mises
d- Pascal
7. The definition of a priori probability was originally
given by
a- De Moivre
b- Laplace
c- Von-Mises
d- Feller
8. Probability by classical approach has
a- no lacuna
b- only one lacuna
c- only two lacunae
d- many lacunae
9. An event consisting of those elements which are not in
A is called
a- primary event
b- derived event
c- simple event
d- complementary event
10. The probability of the intersection of two mutually
exclusive events is always
a- infinity
b- zero
c- one
d- none of the above
Answer key to End Chapter Quizzes:
Chapter One
(1) b , (2) c , (3) c , (4) a , (5) b , (6) c , (7) b , (8) c , (9) c , (10) d
Chapter Two
(1) b , (2) b , (3) c , (4) b , (5) d , (6) c , (7) c , (8) c , (9) c , (10) c
Chapter Three
(1) d , (2) c , (3) a , (4) d , (5) b , (6) b , (7) a , (8) d , (9) a , (10) c
Chapter Four
(1) d , (2) c , (3) b , (4) a , (5) a , (6) b , (7) c , (8) b , (9) c , (10) d
Chapter Five
(1) c , (2) b , (3) c , (4) c , (5) a , (6) b , (7) b , (8) b , (9) c , (10) b
Chapter Six
(1) b , (2) b , (3) d , (4) d , (5) a , (6) c , (7) a , (8) d , (9) b , (10) b
Chapter Seven
(1) d , (2) d , (3) a , (4) c , (5) b , (6) b , (7) c , (8) a , (9) b , (10) d
Chapter Eight
(1) a , (2) b , (3) d , (4) d , (5) a , (6) c , (7) b , (8) d , (9) d , (10) b