Business Statistics Vol 2 Online

BUSINESS

STATISTICS

Vol-2

ABF 102

ACeL In the modern world of computers and information

technology, the importance of statistics is very well

recognised by all the disciplines. Statistics has

originated as a science of statehood and found

applications slowly and steadily in Agriculture,

Economics, Commerce, Biology, Medicine, Industry,

planning, education and so on. As on date there is no

other human walk of life, where statistics cannot be

applied.

Amity University

Table of Contents

CHAPTER SIX: REGRESSION ANALYSIS
6.1 Meaning
6.2 Definitions
6.3 Regression Line
6.4 Regression Equations and Regression Coefficient
6.5 Difference between Correlation and Regression Analysis
Chapter Six: End Chapter Quizzes

CHAPTER SEVEN: TIME SERIES ANALYSIS
7.1 Meaning
7.2 Definitions
7.4 Uses or importance of Time-series
7.6 Components of time series
7.6.1 Trend Component
7.6.2 Cyclical Component
7.6.3 Seasonal Component
7.6.4 Irregular Component
7.7 Methods of measuring secular trend or trend
7.8 Measurement of seasonal variations
7.8.2 Ratio to trend method
7.9 Practical Problems
Chapter Seven: End Chapter Quizzes

CHAPTER EIGHT: PROBABILITY
8.1 Introduction
8.2.1 Trial
8.2.2 Random Trial or Random Experiment
8.2.3 Sample space
8.2.4 Event
8.2.5 Equally Likely Events
8.2.7 Exhaustive Events
8.2.8 Independent Events
8.2.9 Dependent Events
8.2.10 Complementary Events
8.3 Definitions
8.3.1 Mathematical (or A Priori or Classic) Definition
8.3.2 Van Mise’s Statistical (or Empirical) Definition
8.4 The Law of Probability
8.5 Importance of Probability
8.6 Practical Problems
Chapter Eight: End Chapter Quizzes

BIBLIOGRAPHY

CHAPTER SIX: REGRESSION ANALYSIS

6.1 Meaning

In statistics, regression analysis is a collective name for techniques for

the modeling and analysis of numerical data consisting of values of a

dependent variable (also called response variable or measurement) and of one

or more independent variables (also known as explanatory variables or

predictors). The dependent variable in the regression equation is modeled as a

function of the independent variables, corresponding parameters

("constants"), and an error term.

So regression analysis is any statistical method where the mean of one or

more random variables is predicted based on other measured random

variables. There are two types of regression analysis, chosen according to

whether the data approximate a straight line, when linear regression is used,

or not, when non-linear regression is used.

Regression can be used for prediction (including forecasting of time-

series data), inference, hypothesis testing, and modeling of causal

relationships. These uses of regression rely heavily on the underlying

assumptions being satisfied. Regression analysis has been criticized as being

misused for these purposes in many cases where the appropriate assumptions

cannot be verified to hold. One factor contributing to the misuse of regression

is that it can take considerably more skill to critique a model than to fit a

model.


6.2 Definitions

“Regression is the measure of the average relationship between two or

more variables in terms of the original units of the data.”

Morris M. Blair

“One of the most frequently used techniques in economics and business

research, to find a relation between two or more variables that are related

causally, is regression analysis.”

Taro Yamane

“It is often more important to find out what the relation actually is, in

order to estimate or predict one variable and the statistical technique

appropriate to such a case is called regression analysis.”

Wallis and Roberts

6.3 Regression Line

A regression line is a line drawn through a scatterplot of two variables.

The line is chosen so that it comes as close to the points as possible. Regression

analysis, on the other hand, is more than curve fitting. It involves fitting a

model with both deterministic and stochastic components. The deterministic

component is called the predictor and the stochastic component is called the

error term.

The simplest form of a regression model contains a dependent variable,

also called the "Y-variable" and a single independent variable, also called the

"X-variable".



6.4 Regression Equations and Regression Coefficient

Regression equations or estimating equations are algebraic expressions

of regression lines. As there are two regression lines, there are two

regression equations, i.e. the regression equation of X on Y and the regression

equation of Y on X.

The regression equation of X on Y is :

X = a + bY

Here X is the dependent variable and Y is the independent variable. 'a' is the X

intercept and 'b' is the slope of the line; it represents the change in variable X

when there is a unit change in variable Y. The constants a and b are obtained from

the following two normal equations, where N is the number of paired observations:

∑X = aN + b∑Y (i)

∑XY = a∑Y + b∑Y² (ii)

If we solve these two equations, we can compute the values of the constants

a and b.

Similarly, regression equation of Y on X is :

Y = a + bX

And if we solve the following two equations, we can find the values of

constants a and b.

∑Y = aN + b∑X (i)

∑XY = a∑X + b∑X² (ii)
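The solution of the two normal equations for Y on X can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `fit_y_on_x` and the small data set are assumptions made for the example and do not come from the text.

```python
def fit_y_on_x(xs, ys):
    """Solve the normal equations for Y = a + bX and return (a, b)."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Eliminating a from (i) and (ii) gives b
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # Substituting b back into (i) gives a
    a = (sum_y - b * sum_x) / n
    return a, b


# Hypothetical data lying exactly on the line Y = 1 + 2X
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
a, b = fit_y_on_x(xs, ys)
print(f"Y = {a:.2f} + {b:.2f}X")
```

The regression equation of X on Y can be obtained from the same function by interchanging the roles of the two variables, i.e. by calling `fit_y_on_x(ys, xs)`.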

Illustration : Students of a class have obtained marks as given below in paper

I and paper II of statistics:

Paper I     45  55  56  58  60  65  68  70  75  80  85

Paper II    56  50  48  60  62  64  65  70  74  82  90

Find the means, the coefficient of correlation and the regression coefficients.
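One possible way of working this illustration by computer is sketched below. It assumes Karl Pearson's coefficient of correlation and the deviation-from-mean formulas for the two regression coefficients; the variable names are chosen for the example only, and no printed answers are quoted here.

```python
from math import sqrt

paper_1 = [45, 55, 56, 58, 60, 65, 68, 70, 75, 80, 85]  # X
paper_2 = [56, 50, 48, 60, 62, 64, 65, 70, 74, 82, 90]  # Y

n = len(paper_1)
mean_x = sum(paper_1) / n
mean_y = sum(paper_2) / n

# Sums of squared deviations and of cross deviations from the means
sxx = sum((x - mean_x) ** 2 for x in paper_1)
syy = sum((y - mean_y) ** 2 for y in paper_2)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(paper_1, paper_2))

r = sxy / sqrt(sxx * syy)  # coefficient of correlation
b_yx = sxy / sxx           # regression coefficient of Y on X
b_xy = sxy / syy           # regression coefficient of X on Y

print(f"Mean of Paper I  = {mean_x:.2f}")
print(f"Mean of Paper II = {mean_y:.2f}")
print(f"r = {r:.4f}, b_yx = {b_yx:.4f}, b_xy = {b_xy:.4f}")
```

A useful check on the arithmetic is that the product of the two regression coefficients equals the square of the coefficient of correlation.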

6.5 Difference between Correlation and Regression Analysis

Correlation and Regression Analysis are two important statistical

tools to study the relationship between variables. The difference between the

two can be analysed as under :

1. Correlation: Correlation measures the relationship between the two variables which vary in the same or opposite direction.

   Regression Analysis: Regression means going back or act of return. It is a mathematical measure which shows the average relationship between the two variables.

2. Correlation: Here both X and Y variables are random variables.

   Regression Analysis: Here X is a random variable and Y is a fixed variable. However, both variables may be random variables.

3. Correlation: There can be nonsense or spurious correlation between two variables.

   Regression Analysis: There is no such nonsense regression equation.

4. Correlation: The coefficient of correlation is a relative measure and it ranges between -1 and +1.

   Regression Analysis: Regression coefficient is an absolute measure. If we know the value of the independent variable, we can estimate the value of the dependent variable.

Chapter Six: End Chapter Quizzes

1. The term regression was introduced by

a- R. A. Fisher

b- Sir Francis Galton

c- Karl Pearson

d- none of the above

2. If X and Y are two variates, there can be at most

a- one regression line

b- two regression lines

c- three regression lines

d- an infinite number of regression lines

3. In regression line of Y on X, the variable X is known as

a- independent variable

b- regressor

c- explanatory variable

d- all the above

4. Regression equation is also named as

a- prediction equation

b- estimating equation

c- line of average relationship

d- all the above

5. Scatter diagram of the variate values (X, Y) gives the idea

about

a- functional relationship

b- regression model

c- distribution of errors


6. If ρ = 0, the lines of regression are

a- coincident

b- parallel

c- perpendicular to each other


7. Regression coefficient is independent of

a- origin

b- scale

c- both origin and scale

d- neither origin nor scale

8. Regression analysis can be used for

a- reducing the length of confidence interval

b- for prediction of dependent variate value

c- to know the true effect of certain treatments

d- all the above

9. Probable error is used for

a- measuring the error in r

b- testing the significance of r

c- both (a) and (b)

d- neither (a) nor (b)

10. If ρ = 0, the angle between the two lines of regression is

a- 0 degree

b- 90 degree

c- 60 degree

d- 30 degree

CHAPTER SEVEN: TIME SERIES ANALYSIS

7.1 Meaning

In statistics, signal processing, and many other fields, a time series is a

sequence of data points, measured typically at successive times, spaced at

(often uniform) time intervals. Time series analysis comprises methods that

attempt to understand such time series, often either to understand the

underlying context of the data points (where did they come from? what

generated them?), or to make forecasts (predictions). Time series forecasting

is the use of a model to forecast future events based on known past events: to

forecast future data points before they are measured. A standard example in

econometrics is the opening price of a share of stock based on its past

performance.

The term time series analysis is used to distinguish a problem, firstly

from more ordinary data analysis problems (where there is no natural

ordering of the context of individual observations), and secondly from spatial

data analysis where there is a context that observations (often) relate to

geographical locations. There are additional possibilities in the form of space-

time models (often called spatial-temporal analysis). A time series model will

generally reflect the fact that observations close together in time will be more

closely related than observations further apart. In addition, time series models

will often make use of the natural one-way ordering of time so that values in a

series for a given time will be expressed as deriving in some way from past

values, rather than from future values (see time reversibility.)


So a time series is a sequence of observations which are ordered in time

(or space). If observations are made on some phenomenon throughout time, it

is most sensible to display the data in the order in which they arose,

particularly since successive observations will probably be dependent. Time

series are best displayed in a scatter plot. The series value X is plotted on the

vertical axis and time t on the horizontal axis. Time is called the independent

variable (in this case however, something over which you have little control).

There are two kinds of time series data:

1. Continuous, where we have an observation at every instant

of time, e.g. lie detectors, electrocardiograms. We denote this using

observation X at time t, X(t).

2. Discrete, where we have an observation at (usually

regularly) spaced intervals. We denote this as Xt.

7.2 Definitions

“A set of data depending on the time is called a time series.”

------- Kenny and Keeping

“A time series consists of data arranged chronologically.”

------- Croxton and Cowden

“A time series may be defined as a sequence of repeated measurements of a

variable made periodically through time.”

------- C.H.Mayers

7.3 Applications of time series

The application of time series models is twofold :

1. Obtain an understanding of the underlying forces and

structure that produced the observed data.

2. Fit a model and proceed to forecasting, monitoring or even

feedback and feed forward control.

Time Series Analysis is used for many applications. A few of them are as

follows:

Economic Forecasting

Sales Forecasting

Budgetary Analysis

Stock Market Analysis

Yield Projections

Process and Quality Control

Inventory Studies

Workload Projections

Utility Studies

Census Analysis

7.4 Uses or importance of Time-series

Analysis of time series is useful in every walk of life like business,

economics, science, state, sociology, research work etc. However, following are

its main objectives :

7.4.1 Study of past behaviour: Analysis of time series studies the

past behaviour of data and indicates the changes that have taken place in the

past.

7.4.2 Prediction for future: On the basis of analysis of time series,

future predictions can be made easily. For instance, we can predict future sales

and necessary alterations can be done in the production policy.

7.4.3 Facilitate comparisons : We can make comparison of

various time series to know the death rate, birth rate, yield per acre etc.

7.4.4 Evaluation of actual data: On the basis of deviation analysis of

actual data and estimated data obtained from analysis of time series, we can

come to know about the causes of this change.

7.4.5 Prediction of trade cycle: We can know about the factors of

cyclical variations like boom, depression, recession and recovery which are very

important to business community.

7.4.6 Universal utility: The analysis of time series is not only useful

to business community and economists but it is equally useful to agriculturists,

government, researchers, political and social institutions, scientists etc.

7.5 Difference between seasonal and cyclical

variations

Following are the main differences between the two:

7.5.1 Time period: The duration of seasonal variations is always one

year while the duration of cyclical variation is more than one year and

it varies from three to eight years.

7.5.2 Regularity: We find regularity in the components of seasonal

variation while there is no regularity in the components of cyclical variations

and even the length of components of cyclical variations, viz., boom,

disinflation, depression and recovery is not equal.

7.5.3 Causes of variations: Seasonal variation takes place due to

change in seasons, customs, habits, fashion etc., while cyclical variation takes

place due to change in economic activity.

7.5.4 Measurement: Both the variations can be measured; however,

their techniques differ. The seasonal variation can be measured more precisely

as its variation is regular in nature.

7.5.5 Effect of variation: Seasonal variation affects different people

in a different manner while the effect of cyclical variation is the same on the

whole economy.

7.6 Components of time series

Following are the components of time series :

7.6.1 Trend Component

We want to increase our understanding of a time series by picking out

its main features. One of these main features is the trend component.

Descriptive techniques may be extended to forecast (predict) future values.

Trend is a long term movement in a time series. It is the underlying

direction (an upward or downward tendency) and rate of change in a time

series, when allowance has been made for the other components.

A simple way of detecting trend in seasonal data is to take averages over

a certain period. If these averages change with time we can say that there is

evidence of a trend in the series. There are also more formal tests to enable

detection of trend in time series.

It can be helpful to model trend using straight lines, polynomials etc.

7.6.2 Cyclical Component

We want to increase our understanding of a time series by picking out

its main features. One of these main features is the cyclical component.


In weekly or monthly data, the cyclical component describes any

regular fluctuations. It is a non-seasonal component which varies in a

recognisable cycle.

7.6.3 Seasonal Component

We want to increase our understanding of a time series by picking out

its main features. One of these main features is the seasonal component.


In weekly or monthly data, the seasonal component, often referred to as

seasonality, is the component of variation in a time series which is dependent

on the time of year. It describes any regular fluctuations with a period of less

than one year. For example, the costs of various types of fruits and vegetables,

unemployment figures and average daily rainfall, all show marked seasonal

variation. We are interested in comparing the seasonal effects within the

years, from year to year; removing seasonal effects so that the time series is

easier to cope with; and, also interested in adjusting a series for seasonal

effects using various models.

7.6.4 Irregular Component

We want to increase our understanding of a time series by picking out

its main features. One of these main features is the irregular component (or

'noise'). Descriptive techniques may be extended to forecast (predict) future

values.

The irregular component is that left over when the other components of

the series (trend, seasonal and cyclical) have been accounted for.

7.7 Methods of measuring secular trend or trend

Broadly speaking there are four methods of measuring trend, they are

as follows :

7.7.1 Free hand curve method: This is the easiest and simplest

method of computing secular trend. In this method, time is plotted on X- axis

and the other variable is plotted on Y- axis. A free hand curve is then drawn

so as to pass from the center of original fluctuations.

Merits:

1. It is the easiest and simplest method of obtaining trend

values.

2. The trend line is drawn without using scale, so it may be a

straight line or a smooth curve line.

3. The method is free from any mathematical formulas.

Demerits:

1. The straight line trends (Yt) drawn on graph will differ

from person to person in the absence of any mathematical formula.

2. If the statistician is biased, the free hand curve will also be

biased.

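7.7.2 Semi average method: It is a better technique in comparison to

the free hand curve method. Under this method the series of the variable (Y) is divided into two

equal parts and the average of each part is computed separately. Each average is

plotted against the mid-point of its period, and the straight line joining the two

points is the trend line.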

Merits:

1. This method is simple and easy to understand in relation to

moving average and least square method.

2. The trend line (Yt) in this method is a fixed straight line

unlike the free hand curve method where trend line depends upon the

personal judgement of the statistician.

Demerits:

1. The method is based on the assumption of linear trend

whether it exists or not.

2. The method is affected by the limitation of the arithmetic

means.

3. This method is not suitable for removing trend from the

original data.
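The semi average method can be sketched as follows. The sketch reuses the sales figures of the moving average illustration in section 7.9; the function name `semi_average_trend` and the convention of leaving out the middle year when the number of years is odd are assumptions made for the example.

```python
def semi_average_trend(years, values):
    """Return trend values from the line joining the two semi-averages."""
    n = len(values)
    half = n // 2
    # With an odd number of years the middle year is left out (assumed convention)
    first_vals, second_vals = values[:half], values[n - half:]
    first_years, second_years = years[:half], years[n - half:]
    avg_1 = sum(first_vals) / half
    avg_2 = sum(second_vals) / half
    # Each average is placed at the mid-point of its own period
    t_1 = sum(first_years) / half
    t_2 = sum(second_years) / half
    slope = (avg_2 - avg_1) / (t_2 - t_1)
    return [avg_1 + slope * (t - t_1) for t in years]


years = list(range(1990, 2000))
sales = [3, 8, 10, 9, 12, 15, 13, 18, 17, 20]  # in lakh Rs.
trend = semi_average_trend(years, sales)
for year, value in zip(years, trend):
    print(year, round(value, 2))
```

The two semi-averages here are 8.4 (centred on 1992) and 16.6 (centred on 1997), so the trend line rises by 1.64 per year.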

7.7.3 Moving Average method: This method is a better technique of

knowing trend in relation to semi average method. The trend values are

obtained with a fair degree of accuracy by eliminating cyclical fluctuations. In

this method we calculate average on the basis of moving technique. This

period of moving average is determined on the basis of length of cyclical

fluctuations which varies from 3 to 11 years.

Merits:

1. This technique is easier in relation to method of least

square.

2. This technique is effective if the trend of series is irregular.

Demerits:

1. In this method we can not obtain the trend values for all the

years as we leave the first and last year value of data while computing

three years moving average and so on.

2. The basic purpose of trend value is to predict the trend of

future. In this method we can not extend the trend line in either

direction, so this method cannot be used for prediction purposes.

7.7.4 Method of least square: This is the best method of measuring

secular trend. It is a mathematical as well as analytical tool. This method

can be fitted to economic and business time series to make future predictions.

The trend line may be linear or non linear.

Merits :

1. The method of least square does not suffer from subjectivity

or personal judgement as it is a mathematical method.

2. We can compute the trend value of all the given years by

this method.

Demerits:

1. The method is based on mathematical technique, so it is not

easily understandable to a non mathematical person.

2. If we add or delete some observations in the data, the value

of constants 'a' and 'b' will change and new trend line will follow.
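A straight line trend Y = a + bX fitted by the method of least square can be sketched as below, again using the sales figures of the illustration in section 7.9. Measuring X as the deviation of each year from the middle of the period is an assumed (though common) simplification that makes ∑X zero; the function name `least_squares_trend` is hypothetical.

```python
def least_squares_trend(years, values):
    """Fit Y = a + bX, where X is the deviation of the year from the mid-period."""
    n = len(years)
    mid = sum(years) / n
    xs = [t - mid for t in years]  # deviations, so that the sum of X is zero
    # With sum(X) = 0 the normal equations reduce to a = sum(Y)/N and b = sum(XY)/sum(X^2)
    a = sum(values) / n
    b = sum(x * y for x, y in zip(xs, values)) / sum(x * x for x in xs)
    return a, b, mid


years = list(range(1990, 2000))
sales = [3, 8, 10, 9, 12, 15, 13, 18, 17, 20]  # in lakh Rs.
a, b, mid = least_squares_trend(years, sales)

trend = [a + b * (t - mid) for t in years]
forecast_2000 = a + b * (2000 - mid)  # the line can be extended beyond the data
print(f"a = {a:.2f}, b = {b:.4f}, estimate for 2000 = {forecast_2000:.2f}")
```

Unlike the moving average method, this gives a trend value for every year and allows the line to be extended for prediction.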

7.8 Measurement of seasonal variations

The short term variations within a year in a time series are referred to

as seasonal variations. These variations are periodic in nature, viz., weekly,

monthly or quarterly changes. These variations may take place due to change

in seasons like summer, winter, rainy, autumn etc. Thus, seasonal variations

refer to annual repetitive pattern in economic and business activity.

Following measures are used to measure the seasonal variations:

7.8.1 Method of simple averages: This method involves the following

steps :

1. The given time series is arranged by years, months or

quarters.

2. Totals of each month for the given years are obtained.

3. The average of each month is then obtained by

dividing the totals of months by no. of years.

4. The total of the monthly averages is obtained and divided by the

number of months in a year.

5. Taking this average of the monthly averages as the base, the

seasonal index is computed for each month by applying the

following formula:

Seasonal index = (Monthly average for the month / Average of monthly averages) × 100
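The steps of the method of simple averages can be sketched as follows. For brevity the sketch works with quarters instead of months and borrows the quarterly figures of the link relative illustration in section 7.9; the function name `seasonal_indices` is an assumption for the example.

```python
def seasonal_indices(table):
    """table: one row per year, one column per period (month or quarter)."""
    n_years = len(table)
    n_periods = len(table[0])
    # Step 2: totals of each period over the given years
    totals = [sum(row[p] for row in table) for p in range(n_periods)]
    # Step 3: average of each period
    averages = [total / n_years for total in totals]
    # Step 4: average of the period averages
    grand_average = sum(averages) / n_periods
    # Step 5: seasonal index of each period
    return [average / grand_average * 100 for average in averages]


quarterly = [
    [45, 54, 72, 60],
    [48, 56, 63, 56],
    [49, 63, 70, 65],
    [52, 65, 75, 72],
]
indices = seasonal_indices(quarterly)
print([round(index, 2) for index in indices])
```

By construction the indices average 100, so for quarterly data their total is 400 and for monthly data it is 1200.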

7.8.2 Ratio to trend method: This method is based on multiplicative model of time series. It assumes

that seasonal variation for a given period is a constant fraction of the trend

value. The steps for computation of this method are:

1. First of all trend values are calculated by applying the

method of least square on the yearly average.

2. Trend values for each quarter are obtained based on

trend values so obtained.

3. Now divide the original quarterly data by the trend

value of corresponding quarter and multiply the quotient by

hundred. These values are free from trend.

4. To free the data from cyclical and irregular

variations, quarterly data are averaged.
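The four steps of the ratio to trend method can be sketched as below, using the quarterly figures of the link relative illustration in section 7.9. The way the yearly trend is spread over the quarters (yearly trend value placed at mid-year, changing by one fourth of the yearly increment per quarter) is an assumed convention, and the variable names are chosen for the example only.

```python
# Quarterly figures: one row per year, columns are quarters I to IV
data = [
    [45, 54, 72, 60],
    [48, 56, 63, 56],
    [49, 63, 70, 65],
    [52, 65, 75, 72],
]
n_years = len(data)

# Step 1: least square trend fitted to the yearly averages
yearly_avg = [sum(row) / 4 for row in data]
mid = (n_years - 1) / 2
xs = [i - mid for i in range(n_years)]  # deviations from the middle, sum is zero
a = sum(yearly_avg) / n_years
b = sum(x * y for x, y in zip(xs, yearly_avg)) / sum(x * x for x in xs)

# Step 2: quarterly trend values (assumed convention: b / 4 per quarter,
# with the yearly trend value lying midway between quarters II and III)
quarter_step = b / 4
trend = [[a + b * x + (q - 1.5) * quarter_step for q in range(4)] for x in xs]

# Step 3: original data as a percentage of the corresponding trend value
ratios = [[100 * data[i][q] / trend[i][q] for q in range(4)] for i in range(n_years)]

# Step 4: average the ratios quarter by quarter
seasonal = [sum(ratios[i][q] for i in range(n_years)) / n_years for q in range(4)]
print([round(s, 2) for s in seasonal])
```

If the four averages do not total exactly 400, they are usually scaled so that they do; that final adjustment is not shown here.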

7.8.3 Link relative method: This is one of the most difficult methods of

obtaining seasonal variations. Steps involved in this method are:

1. Link relatives are calculated from the given quarterly

data by applying the formula:

Link relative = (Current quarter / Previous quarter) × 100

2. The average of the link relatives is obtained for each quarter.

3. Chain relatives are then calculated, taking the chain index of the I quarter as 100, by using the formula:

Chain index = (Current quarter L.R. × Previous quarter chain index) / 100

4. The I quarter chain index is calculated again, based on the IV quarter chain index.

5. Chain relatives are adjusted for each quarter by

subtracting 1 × quarterly effect, 2 × quarterly effect and 3 × quarterly effect

from the chain relatives of the II, III and IV quarters respectively, where the

quarterly effect is one fourth of the difference between the I quarter chain

index obtained in step 4 and 100.

6. The seasonal index is finally computed. Since the total of the

quarterly indices should be 400, while the actual total will generally differ from it, the

seasonal index is computed as

Seasonal index = (Chain index of quarter × 400) / Actual total of chain

indices of four quarters.

7.9 Practical Problems:

Illustration: Find the 3-year moving average from the following data :

Year    Sales (in lakh Rs.)

1990    3

1991    8

1992    10

1993    9

1994    12

1995    15

1996    13

1997    18

1998    17

1999    20
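A short sketch of the 3-year moving average for these figures is given below. The function name `moving_average` is hypothetical, and the sketch handles odd periods only, placing each average against the middle year of its group.

```python
def moving_average(values, period=3):
    """Centred moving averages for an odd period; end years get None."""
    half = period // 2
    result = [None] * len(values)
    for i in range(half, len(values) - half):
        window = values[i - half : i + half + 1]
        result[i] = sum(window) / period
    return result


years = list(range(1990, 2000))
sales = [3, 8, 10, 9, 12, 15, 13, 18, 17, 20]  # in lakh Rs.
ma = moving_average(sales, period=3)
for year, value in zip(years, ma):
    print(year, "-" if value is None else round(value, 2))
```

The first moving average, (3 + 8 + 10) / 3 = 7, is written against 1991; no trend value is obtained for 1990 and 1999, which is the first demerit of the method noted in 7.7.3.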


Illustration : Compute seasonal variations by using Link Relative Method

from the following data:

Year I Quarter II Quarter III Quarter IV Quarter

I 45 54 72 60

II 48 56 63 56

III 49 63 70 65

IV 52 65 75 72

(iv) Total of corrected chain relatives = 100 + 120.08 + 140.86 + 124.74

= 485.68

(v) Seasonal Index
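The whole link relative computation for the above data can be sketched in Python as follows. The sketch follows the six steps of 7.8.3 with arithmetic means of the link relatives and keeps full precision throughout, so its intermediate figures need not agree to the last decimal with hand-worked values; all variable names are assumptions made for the example.

```python
# Quarterly figures: one row per year, columns are quarters I to IV
data = [
    [45, 54, 72, 60],
    [48, 56, 63, 56],
    [49, 63, 70, 65],
    [52, 65, 75, 72],
]
flat = [value for year in data for value in year]  # values in time order

# Step 1: link relatives (the first quarter of the first year has none)
links = [[] for _ in range(4)]
for i in range(1, len(flat)):
    links[i % 4].append(flat[i] / flat[i - 1] * 100)

# Step 2: average link relative of each quarter
avg_lr = [sum(q) / len(q) for q in links]

# Step 3: chain relatives, taking the I quarter as 100
chain = [100.0]
for q in range(1, 4):
    chain.append(avg_lr[q] * chain[q - 1] / 100)

# Step 4: I quarter chain index computed again from the IV quarter
second_q1 = avg_lr[0] * chain[3] / 100

# Step 5: correction for the quarterly effect
effect = (second_q1 - 100) / 4
corrected = [chain[q] - q * effect for q in range(4)]

# Step 6: seasonal indices adjusted to total 400
total = sum(corrected)
seasonal = [c * 400 / total for c in corrected]
print([round(s, 2) for s in seasonal])
```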

Chapter Seven: End Chapter Quizzes

1. A time series is a set of data recorded

a- periodically

b- at time or space intervals

c- at successive points of time

d- all the above

2. The time series analysis helps

a- to compare the two or more series

b- to know the behaviour of business

c- to make predictions

d- all the above

3. A time series is unable to adjust the influences like

a- customs and policy changes

b- seasonal changes

c- long-term influences


4. A time series consists of

a- two components

b- three components

c- four components

d- five components

5. The forecasts on the basis of a time series are

a- cent per cent true

b- true to a great extent

c- never true


6. The component of the time series attached to long-term

variations is termed as

a- cyclic variation

b- secular trend

c- irregular variation

d- all the above

7. Secular trend is indicative of long-term variation towards

a- increase only

b- decrease only

c- either increase or decrease


8. Linear trend of a time series indicates towards

a- constant rate of change

b- constant rate of growth

c- change is geometric progression

d- all the above

9. Seasonal variation means the variations occurring within

a- a number of years

b- parts of year

c- parts of month


10. Cyclic variations in a time series are caused by

a- lockouts in a factory

b- war in a country

c- floods in the states


CHAPTER EIGHT: PROBABILITY

8.1 Introduction

The theory of probability began to develop in the 17th

century and its history suggests that it developed with the study of games of

chance, such as rolling a dice, drawing a card, flipping a coin etc. Apart from

these, uncertainty prevails in every sphere of life. For instance, one often

predicts: "It will probably rain tonight." "It is quite likely that there will be a

good yield of cereals this year" and so on. This indicates that, in layman's

terminology, the word 'probability' connotes that there is an uncertainty

about the happening of events. To put 'probability' on a better footing we

define it. But before doing so, we have to explain a few terms.

8.2 Concepts of probability calculation

Following are the fundamental concepts of probability calculation:

8.2.1 Trial

A procedure or an experiment to collect any statistical data such as

rolling a dice or flipping a coin is called a trial.

8.2.2 Random Trial or Random Experiment

When the outcome of an experiment cannot be predicted precisely,

the experiment is called a random trial or random experiment. In other

words, if a random experiment is repeated under identical conditions, the

outcome will vary at random, as it is impossible to predict the result of any

single performance of the experiment. For example, if we toss an honest coin or roll

an unbiased dice, we may not get the results we expect.

8.2.3 Sample space

The totality of all the outcomes or results of a random experiment is

denoted by a Greek or English letter (here S) and is called the sample

space. Each outcome or element of this sample space is known as a sample

point.

8.2.4 Event

Any subset of a sample space is called an event. A sample space S serves as the universal set for all questions related to an experiment, and an event A with respect to S is the set of all possible outcomes favorable to the event A.

For example,

A random experiment :- flipping a coin twice

Sample space :- Ω or S = {(HH), (HT), (TH), (TT)}

The question : "both the flips show the same face"

Therefore, the event A : { (HH), (TT) }
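The coin example can also be written out programmatically. The following is a minimal Python sketch, assuming each outcome is stored as a two-letter string; the variable names are illustrative and not part of the original text.

```python
from itertools import product

# Sample space for flipping a coin twice: every ordered pair of faces.
sample_space = ["".join(faces) for faces in product("HT", repeat=2)]

# Event A: both flips show the same face.
event_a = [outcome for outcome in sample_space if outcome[0] == outcome[1]]

print(sample_space)  # ['HH', 'HT', 'TH', 'TT']
print(event_a)       # ['HH', 'TT']
```

Because an event is just a subset of the sample space, it is obtained here by filtering the list of outcomes.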

8.2.5 Equally Likely Events

The possible results of a random experiment are called equally likely outcomes if we have no reason to expect any one rather than the other. For example, when a card is drawn from a well-shuffled pack, any card may appear in the draw, so that the 52 cards become 52 different events which are equally likely.

8.2.6 Mutually Exclusive Events

Events are called mutually exclusive, disjoint or incompatible if the occurrence of one of them precludes the occurrence of all the others. For example, in tossing a coin there are two mutually exclusive events, viz. turning up of a head and turning up of a tail, since both these events cannot happen simultaneously. But note that events are compatible if it is possible for them to happen simultaneously. For instance, in rolling two dice, the cases of the face marked 5 appearing on one die and the face marked 5 appearing on the other are compatible.

8.2.7 Exhaustive Events

Events are exhaustive when they include all the possibilities associated

with the same trial. In throwing a coin, the turning up of head and of a tail are

exhaustive events assuming of course that the coin cannot rest on its edge.

8.2.8 Independent Events

Two events are said to be independent if the occurrence of either event does not affect the occurrence of the other. For example, in tossing a coin, the events corresponding to two successive tosses are independent. Likewise, the flip of a penny does not affect in any way the flip of a nickel.

8.2.9 Dependent Events

If the occurrence or non-occurrence of any event affects the happening of the other, then the events are said to be dependent events. For example, in drawing cards from a pack, let the event A be the occurrence of a king in the first draw and B be the occurrence of a king in the second draw. If the card drawn at the first trial is not replaced, then events A and B are dependent events.

Note

(1) If an event contains a single sample point, i.e. it is a singleton set, then this event is called an elementary or a simple event.

(2) An event corresponding to the empty set is an 'impossible event'.

(3) An event corresponding to the entire sample space is called a 'certain event'.

8.2.10 Complementary Events

Let S be the sample space for an experiment and A be an event in S. Then A is a subset of S. Hence, the complement of A in S, denoted $\bar{A}$, is also an event in S; it contains the outcomes which are not favorable to the occurrence of A, i.e. if A occurs, then the outcome of the experiment belongs to A, but if A does not occur, then the outcome of the experiment belongs to $\bar{A}$.

It is obvious that A and $\bar{A}$ are mutually exclusive: $A \cap \bar{A} = \emptyset$ and $A \cup \bar{A} = S$.

If S contains n equally likely, mutually exclusive and exhaustive points and A contains m out of these n points, then $\bar{A}$ contains (n - m) sample points.

8.3 Definitions

We shall now consider two definitions of probability :

8.3.1 Mathematical or a priori or classical.

8.3.2 Statistical or empirical.

8.3.1 Mathematical (or A Priori or Classical) Definition

If there are 'n' exhaustive, mutually exclusive and equally likely cases and m of them are favorable to an event A, the probability of A happening is defined as the ratio m/n.

Expressed as a formula :-

$$P(A) = \frac{m}{n} = \frac{\text{number of favorable cases}}{\text{total number of equally likely cases}}$$

This definition is due to Laplace. Thus probability is a concept which measures numerically the degree of certainty or uncertainty of the occurrence of an event.

For example, the probability of randomly drawing a king from a well-shuffled deck of cards is 4/52, since 4 is the number of favorable outcomes (i.e. the 4 kings of diamonds, spades, clubs and hearts) and 52 is the number of total outcomes (the number of cards in a deck).

If A is any event of a sample space having probability p, then clearly p is a non-negative number (expressed as a fraction or usually as a decimal) not greater than unity: $0 \le p \le 1$, i.e. p ranges from 0 (no chance, for an impossible event) to a high of 1 (certainty). Since the number of cases not favorable to A is (n - m), the probability q that event A will not happen is $q = \frac{n - m}{n}$, or q = 1 - m/n, or q = 1 - p.

Now note that the probability q is nothing but the probability of the complementary event $\bar{A}$, i.e. $q = P(\bar{A})$.

Thus $P(\bar{A}) = 1 - p$, or $P(\bar{A}) = 1 - P(A)$,

so that $P(A) + P(\bar{A}) = 1$, i.e. p + q = 1.

Relative Frequency Definition

The classical definition of probability has a disadvantage: the words 'equally likely' are vague. In fact, since these words seem to be synonymous with "equally probable", the definition is circular, as it defines probability in terms of itself. Therefore, the estimated or empirical probability of an event is taken as the relative frequency of the occurrence of the event when the number of observations is very large.

8.3.2 Von Mises's Statistical (or Empirical) Definition

If trials are repeated a great number of times under essentially the same conditions, then the limit of the ratio of the number of times that an event happens to the total number of trials, as the number of trials increases indefinitely, is called the probability of the happening of the event.

It is assumed that the limit exists and is finite and unique. Symbolically, if the event A happens m times in n trials,

$$P(A) = p = \lim_{n \to \infty} \frac{m}{n}$$

provided it is finite and unique.

The two definitions are apparently different, but both of them can be reconciled in the same sense.
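The empirical definition can be illustrated by simulation. The following is a minimal Python sketch, assuming a fair coin modelled with a pseudo-random number generator; the function name and the seed are illustrative choices, not part of the original text.

```python
import random

def relative_frequency_of_heads(trials, seed=0):
    """Toss a simulated fair coin `trials` times and return heads / trials."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(trials))
    return heads / trials

# The relative frequency m/n for an increasing number of trials n.
for n in (10, 1_000, 100_000):
    print(n, relative_frequency_of_heads(n))
```

As the number of trials grows, the printed ratio should settle close to the classical value 0.5, which is the sense in which the two definitions agree.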

Example Find the probability of getting heads in tossing a coin.

Solution : Experiment : Tossing a coin

Sample space : S = { H, T} n (S) = 2

Event A : getting heads

A = { H} n (A) = 1

Therefore, $P(A) = \frac{n(A)}{n(S)} = \frac{1}{2}$ or 0.5

Example Find the probability of getting 3 or 6 in throwing a die.

Solution : Experiment : Throwing a die

Sample space : S = {1, 2, 3, 4, 5, 6} n (S) = 6

Event A : getting 3 or 6

A = {3, 6} n (A) = 2

Therefore, $P(A) = \frac{n(A)}{n(S)} = \frac{2}{6} = \frac{1}{3}$

Example Two dice are rolled. Find the probability that the score on the

second die is greater than the score on the first die.

Solution : Experiment : Two dice are rolled

Sample space : S = {(1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (2, 1), (2, 2), (2, 3), (2, 4), (2, 5), (2, 6), ..., (6, 1), (6, 2), (6, 3), (6, 4), (6, 5), (6, 6)}

n (S) = 6 × 6 = 36

Event A : The score on the second die > the score on the 1st die.

i.e. A = {(1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (2, 3), (2, 4), (2, 5), (2, 6), (3, 4), (3, 5), (3, 6), (4, 5), (4, 6), (5, 6)}

n (A) = 15

Therefore, $P(A) = \frac{n(A)}{n(S)} = \frac{15}{36} = \frac{5}{12}$
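The count of favorable outcomes in this example can be checked by listing all 36 ordered pairs. The following is a minimal Python sketch, assuming two fair six-sided dice; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All 36 ordered outcomes (first die, second die).
sample_space = list(product(range(1, 7), repeat=2))

# Event A: the score on the second die exceeds the score on the first.
event_a = [(first, second) for first, second in sample_space if second > first]

p_a = Fraction(len(event_a), len(sample_space))
print(len(sample_space), len(event_a), p_a)  # 36 15 5/12
```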

Example A coin is tossed three times. Find the probability of getting at

least one head.

Solution : Experiment : A coin is tossed three times.

Sample space : S = {(HHH), (HHT), (HTH), (HTT), (THH), (THT), (TTH), (TTT)}

n (S) = 8

Event A : getting at least one head

so that $\bar{A}$ : getting no head at all

$\bar{A}$ = {(TTT)} n ($\bar{A}$) = 1

$P(\bar{A}) = \frac{n(\bar{A})}{n(S)} = \frac{1}{8}$

Therefore, $P(A) = 1 - P(\bar{A}) = 1 - \frac{1}{8} = \frac{7}{8}$
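The complement rule used in this example can be compared with a direct count. The following is a minimal Python sketch, assuming a fair coin tossed three times; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# The 8 equally likely outcomes of three tosses.
sample_space = list(product("HT", repeat=3))

# Complementary event: no head at all.
no_head = [outcome for outcome in sample_space if "H" not in outcome]
p_no_head = Fraction(len(no_head), len(sample_space))

# Route 1: complement rule. Route 2: count the outcomes with a head directly.
p_via_complement = 1 - p_no_head
p_direct = Fraction(sum("H" in outcome for outcome in sample_space), len(sample_space))

print(p_via_complement, p_direct)  # 7/8 7/8
```

Working through the complement is shorter here because it contains a single outcome, whereas the event itself contains seven.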

Example A ball is drawn at random from a box containing 6 red balls, 4

white balls and 5 blue balls. Determine the probability that the ball drawn is

(i) red (ii) white (iii) blue (iv) not red (v) red or white.

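Solution : Let R, W and B denote the events of drawing a red ball, a white ball and a blue ball respectively. The box contains 6 + 4 + 5 = 15 balls in all.

(i) $P(R) = \frac{6}{15} = \frac{2}{5}$

(ii) $P(W) = \frac{4}{15}$

(iii) $P(B) = \frac{5}{15} = \frac{1}{3}$

(iv) $P(\text{not red}) = 1 - P(R) = 1 - \frac{2}{5} = \frac{3}{5}$

(v) $P(R \text{ or } W) = P(R) + P(W) = \frac{6}{15} + \frac{4}{15} = \frac{10}{15} = \frac{2}{3}$

Note : The two events R and W are 'disjoint' events.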

Example What is the chance that a leap year selected at random will contain

53 Sundays ?

Solution : A leap year has 52 weeks and 2 more days.

The two days can be :

Monday - Tuesday

Tuesday - Wednesday

Wednesday - Thursday

Thursday - Friday

Friday - Saturday

Saturday - Sunday and

Sunday - Monday.

There are 7 equally likely outcomes and 2 of them (Saturday - Sunday and Sunday - Monday) are favorable to the 53rd Sunday.

Therefore, for 53 Sundays in a leap year, P(A) = 2/7 = 0.29 (approximately).

Example If four ladies and six gentlemen sit for a photograph in a row

at random, what is the probability that no two ladies will sit together ?

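Solution : The ten persons can sit in a row in 10! ways. The six gentlemen can be arranged among themselves in 6! ways. Now if no two ladies are to be together, the ladies have 7 positions, 2 at the ends and 5 between the gentlemen:

Arrangement L, G1, L, G2, L, G3, L, G4, L, G5, L, G6, L

The four ladies can occupy these 7 positions in $^{7}P_{4} = 7 \times 6 \times 5 \times 4 = 840$ ways.

Therefore, required probability = $\frac{6! \times {}^{7}P_{4}}{10!} = \frac{720 \times 840}{3628800} = \frac{1}{6}$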
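The seating example can be checked by counting seat selections rather than full arrangements. The following is a minimal Python sketch, assuming that only the set of seats taken by the ladies matters for adjacency (each such set is equally likely under a random arrangement); the names are illustrative.

```python
from fractions import Fraction
from itertools import combinations

SEATS = 10   # four ladies + six gentlemen in one row
LADIES = 4

# Choose which seats the ladies occupy; who sits where within each group
# does not change whether two ladies are adjacent.
placements = list(combinations(range(SEATS), LADIES))

def no_two_adjacent(seats):
    """True when no two chosen seat numbers are consecutive."""
    return all(b - a > 1 for a, b in zip(seats, seats[1:]))

favourable = [seats for seats in placements if no_two_adjacent(seats)]
probability = Fraction(len(favourable), len(placements))
print(len(favourable), len(placements), probability)  # 35 210 1/6
```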

Example In a class there are 13 students. 5 of them are boys and the rest are girls. Find the probability that two students selected at random will both be girls.

Solution : Two students out of 13 can be selected in $^{13}C_{2} = 78$ ways and two girls out of 8 can be selected in $^{8}C_{2} = 28$ ways.

Therefore, required probability = $\frac{^{8}C_{2}}{^{13}C_{2}} = \frac{28}{78} = \frac{14}{39}$

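Example A box contains 5 white balls, 4 black balls and 3 red balls. Three balls are drawn randomly. What is the probability that they will all be (i) white (ii) black (iii) red ?

Solution : Let W, B and R denote the events of drawing three white, three black and three red balls respectively. Three balls out of the 12 in the box can be drawn in $^{12}C_{3} = 220$ ways.

(i) $P(W) = \frac{^{5}C_{3}}{^{12}C_{3}} = \frac{10}{220} = \frac{1}{22}$

(ii) $P(B) = \frac{^{4}C_{3}}{^{12}C_{3}} = \frac{4}{220} = \frac{1}{55}$

(iii) $P(R) = \frac{^{3}C_{3}}{^{12}C_{3}} = \frac{1}{220}$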
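Selection problems such as the two preceding examples reduce to ratios of combinations. The following is a minimal Python sketch using the standard-library function `math.comb` (Python 3.8 or later); the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

# Class of 13 students, 8 of them girls: both selected students are girls.
p_two_girls = Fraction(comb(8, 2), comb(13, 2))

# Box with 5 white, 4 black and 3 red balls: three balls of one colour.
total_draws = comb(12, 3)
p_white = Fraction(comb(5, 3), total_draws)
p_black = Fraction(comb(4, 3), total_draws)
p_red = Fraction(comb(3, 3), total_draws)

print(p_two_girls, p_white, p_black, p_red)  # 14/39 1/22 1/55 1/220
```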

8.4 The Law of Probability

So far we have discussed probabilities of single events. In many situations we come across two or more events occurring together. If A and B are two events, the event that either A or B or both occur is denoted by $A \cup B$ or (A + B), and the event that both A and B occur is denoted by $A \cap B$ or AB. We term these situations compound events or the joint occurrence of events. We may need the probability that A or B will happen. It is denoted by $P(A \cup B)$ or P (A + B). Also we may need the probability that A and B (both) will happen simultaneously. It is denoted by $P(A \cap B)$ or P (AB).

Consider a situation in which you are asked to choose any 3 or any diamond or both from a well-shuffled pack of 52 cards, and you are interested in the probability of this situation.

Now see the following diagram.

Now count the dots in the area which fulfills the condition any 3 or any diamond or both. They are 16.

Thus the required probability = 16/52 = 4/13.

In the language of set theory, the set 'any 3 or any diamond or both' is the union of the sets 'any 3', which contains 4 cards, and 'any diamond', which contains 13 cards. The number of cards in their union is equal to the sum of these numbers minus the number of cards in the space where they overlap. Any point in this space, called the intersection of the two sets, is counted twice (double counting), once in each set; here it is the single card 3 of diamonds. Dividing by 52 we get the required probability.

Thus P (any 3 or any diamond or both) = $\frac{4 + 13 - 1}{52} = \frac{16}{52} = \frac{4}{13}$

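In general, if the letters A and B stand for any two events, then

$$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$

Here the events A and B are not mutually exclusive. If A and B are mutually exclusive, then $P(A \cap B) = 0$ and the law reduces to $P(A \cup B) = P(A) + P(B)$.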
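The card example can be used to check the addition law with sets. The following is a minimal Python sketch, assuming a standard 52-card pack represented as (rank, suit) pairs; the names `deck`, `threes`, `diamonds` and `prob` are illustrative.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["hearts", "diamonds", "clubs", "spades"]
deck = set(product(RANKS, SUITS))  # 52 (rank, suit) pairs

threes = {card for card in deck if card[0] == "3"}           # 4 cards
diamonds = {card for card in deck if card[1] == "diamonds"}  # 13 cards

def prob(event):
    """Classical probability: favorable cards over total cards."""
    return Fraction(len(event), len(deck))

lhs = prob(threes | diamonds)                                  # P(A or B)
rhs = prob(threes) + prob(diamonds) - prob(threes & diamonds)  # addition law
print(lhs, rhs)  # 4/13 4/13
```

Subtracting the intersection removes the 3 of diamonds, which would otherwise be counted once in each set.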

Example Two dice are rolled. Find the probability that the score is an

even number or multiple of 3.

Solution : Two dice are rolled.

Sample space = {(1, 1), (1, 2), ............, (6, 6)}

n(S) = 6 × 6 = 36

Event E : The score is an even number or multiple of 3.

Note here score means the sum of the numbers on both the dice when

they land. For example (1, 1) has score 1 + 1 = 2.

It is clear that the least score is 2 and the highest score, for (6, 6), is 6 + 6 = 12,

i.e. the possible scores are 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12.

Let Event A : The score is an even number

A = {(1, 1), (1, 3), (1, 5), (2,2), (2, 4), (2, 6), (3, 1), (3, 3) (3, 5), (4, 2), (4, 4), (4, 6),

(5, 1), (5, 3), (5, 5), (6, 2), (6, 4), (6, 6) }

Therefore n (A) = 18

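Let Event B : The score is a multiple of 3

i.e. 3, 6, 9, 12

B = {(1, 2), (1, 5), (2, 1), (2, 4), (3, 3), (3, 6), (4, 2), (4, 5), (5, 1), (5, 4), (6, 3), (6, 6)}

n (B) = 12

Let Event $A \cap B$ (or AB) : The score is an even number and a multiple of 3 (i.e. common to both A and B), that is, a score of 6 or 12

AB = {(1, 5), (2, 4), (3, 3), (4, 2), (5, 1), (6, 6)}

n (AB) = 6

Therefore, $P(E) = P(A \cup B) = P(A) + P(B) - P(A \cap B) = \frac{18}{36} + \frac{12}{36} - \frac{6}{36} = \frac{24}{36} = \frac{2}{3}$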

Multiplication Law of Probability

If there are two independent events, the respective probabilities of which are known, then the probability that both will happen is the product of the probabilities of their happening respectively: $P(AB) = P(A) \times P(B)$

To compute the probability of two or even more independent events all occurring (joint occurrence), extend the above law to the required number of events.

For example, first flip a penny, then a nickel and finally a dime.

On landing, probability of heads is 1/2 for a penny

probability of heads is 1/2 for a nickel

probability of heads is 1/2 for a dime

Thus the probability of landing three heads will be $\frac{1}{2} \times \frac{1}{2} \times \frac{1}{2} = \frac{1}{8}$ or 0.125.

(Note that all three events are independent)

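Example Three machines I, II and III manufacture respectively 0.4, 0.5 and 0.1 of the total production. The percentage of defective items produced by I, II and III is 2, 4 and 1 percent respectively. For an item randomly chosen, what is the probability that it is defective?

Solution: An item is defective if it comes from machine I and is defective, or from machine II and is defective, or from machine III and is defective. These three cases are mutually exclusive, so

P (defective) = (0.4 × 0.02) + (0.5 × 0.04) + (0.1 × 0.01) = 0.008 + 0.020 + 0.001 = 0.029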
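One way to work this example is to weight each machine's defective rate by its share of production and add the results. The following is a minimal Python sketch using exact fractions; the dictionary layout and names are illustrative, not part of the original text.

```python
from fractions import Fraction

# (share of total production, defective rate) per machine, from the example.
machines = {
    "I": (Fraction(4, 10), Fraction(2, 100)),
    "II": (Fraction(5, 10), Fraction(4, 100)),
    "III": (Fraction(1, 10), Fraction(1, 100)),
}

# Multiply share by rate for each machine, then add the exclusive cases.
p_defective = sum(share * rate for share, rate in machines.values())
print(p_defective, float(p_defective))  # 29/1000 0.029
```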

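Example In shuffling a pack of cards, 4 are accidentally dropped one after another. Find the chance that the missing cards are one from each suit.

Solution: Let H, D, C and S denote heart, diamond, club and spade cards respectively. The probability that the four missing cards come from different suits in the particular order H, D, C, S is

$P(H) \times P(D/H) \times P(C/HD) \times P(S/HDC) = \frac{13}{52} \times \frac{13}{51} \times \frac{13}{50} \times \frac{13}{49}$

The four suits can appear in 4! = 24 different orders, each with the same probability.

Therefore, required probability = $24 \times \frac{13 \times 13 \times 13 \times 13}{52 \times 51 \times 50 \times 49} = \frac{2197}{20825}$, or about 0.105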
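One way to check this example numerically is to multiply, draw by draw, the chance that the next dropped card belongs to a suit not yet seen. The following is a minimal Python sketch under that assumption; the variable names are illustrative.

```python
from fractions import Fraction

probability = Fraction(1)
remaining_cards = 52

# Before each drop, 4, 3, 2 and then 1 suits are still unseen,
# and each unseen suit still has all 13 of its cards in the pack.
for unseen_suits in (4, 3, 2, 1):
    probability *= Fraction(13 * unseen_suits, remaining_cards)
    remaining_cards -= 1

print(probability)  # 2197/20825
```

Each factor is a conditional probability, so the product is an application of the multiplication law to dependent events.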

Example A problem in statistics is given to three students A, B and C

whose chances in solving it are respectively. What is the

probability that the problem will be solved ?

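Solution : The probability that A can solve the problem = 1/2

The probability that A cannot solve the problem = 1 - 1/2 = 1/2

Similarly, the probabilities that B and C cannot solve the problem are obtained by subtracting their respective chances of solving it from 1. The problem remains unsolved only if all three students fail, and the three attempts are independent, so

P (problem is solved) = 1 - P (A fails) × P (B fails) × P (C fails)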

Conditional Probability

In many situations you have more information than simply the total outcomes and favorable outcomes, and hence you are in a position to make better-informed judgements regarding the probabilities of such situations. For example, suppose a card is drawn at random from a deck of 52 cards. Let B denote the event 'the card is a diamond' and A denote the event 'the card is red'. We may then consider the following probabilities.

P (A) = P (red card) = 26/52 = 1/2 and P (B) = P (diamond) = 13/52 = 1/4.

Now suppose it is known that the card drawn is red. Since there are 26 red cards of which 13 are diamonds, the probability that the card is a diamond is 13/26 = 1/2. In other words, the probability of event B knowing that A has occurred is 1/2.

The probability of B under the condition that A has occurred is known as conditional probability and it is denoted by P (B/A). Thus P (B/A) = 1/2. It should be observed that the probability of the event B is increased due to the additional information that the event A has occurred.

Conditional probability is found using the formula $P(B/A) = \frac{P(AB)}{P(A)}$

Justification :- $P(B/A) = \frac{n(AB)}{n(A)} = \frac{n(AB)/n(S)}{n(A)/n(S)} = \frac{P(AB)}{P(A)}$

Similarly $P(A/B) = \frac{P(AB)}{P(B)}$

In both the cases, if A and B are independent events then P (A/B) = P (A) and P (B/A) = P (B)

Therefore $P(A) = \frac{P(AB)}{P(B)}$, so that P (AB) = P (A) . P (B)

or $P(B) = \frac{P(AB)}{P(A)}$, so that P (AB) = P (A) . P (B)
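The red card and diamond example can be reproduced by counting. The following is a minimal Python sketch, assuming a standard 52-card pack with hearts and diamonds as the red suits; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

RANKS = range(1, 14)
SUITS = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(RANKS, SUITS))

red = {card for card in deck if card[1] in ("hearts", "diamonds")}  # event A
diamond = {card for card in deck if card[1] == "diamonds"}          # event B

p_a = Fraction(len(red), len(deck))
p_b = Fraction(len(diamond), len(deck))
p_ab = Fraction(len(red & diamond), len(deck))

# Conditional probability of B given A: P(AB) / P(A).
p_b_given_a = p_ab / p_a
print(p_b, p_b_given_a)  # 1/4 1/2
```

Knowing that the card is red halves the sample space, which is why the probability of a diamond rises from 1/4 to 1/2.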

8.5 Importance of Probability

The theory of probability has its origin in the seventeenth century, when a quantitative measure of probability was developed for problems related to games of dice in gambling.

Later, the theory was applied by mathematicians to problems pertaining to chance, such as the tossing of a coin, the possibility of getting a card of a specific suit, or the possibility of getting balls of a specific colour from a bag of balls. Nowadays the laws of probability are used to solve economic and business problems, and even problems of our day-to-day life.

The utility of probability can be known by its various uses.

Following are the areas where probability theory has been used :

1. The fundamental laws of statistics like Law of Statistical Regularity

and Law of Inertia of large numbers are based on the theory of probability.

2. The various tests of significance, like the Z-test, F-test and Chi-square test, are derived from the theory of probability.

3. This theory gives solutions to problems relating to games of chance.

4. The decision theories are based on the fundamental laws of

probability.

5. The theory is generally used in economic and business decision

making. The theory is very useful in the situations where risk and uncertainty

prevails.

6. Subjective probability is widely used in those situations where actual measurement of probability is not feasible. It has thus added a new dimension to the theory of probability. These probabilities can be revised at a later stage on the basis of experience.

8.6 Practical Problems:

Illustration: A single letter is selected at random from the word 'PROBABILITY'. What is the probability that it is a vowel?

Solution :

Total number of letters in the word 'PROBABILITY' = n = 11

Number of favourable cases = m = 4 (vowels are o, a, i, i)

We know that,

P(A) = m/n = 4/11

Illustration: Find the probability of having at least one son in a family with two children.

Solution:

Two children in a family may be either :

(1) Both sons

or (2) Son and daughter

or (3) Daughter and son

or (4) Both daughters

Thus, total number of equally likely cases = n = 4

At least one son implies that a family may have one son or two sons.

Thus, favourable number of cases = m = 3 (i.e., options 1, 2 and 3)

P(A) = m/n = 3/4

Illustration: Find the chance of getting an ace in a draw from a pack of

52 cards.

Solution:

Total number of cards = n = 52

Number of favourable cases = m = 4 (number of aces)

P(A) = m/n = 4/52 = 1/13

Illustration: Suppose an ideal die is tossed twice. What is the probability

of getting a sum of 10 in the two tosses?

Solution:

A die can be tossed first time in = 6 ways

A die can be tossed second time in = 6 ways

A die can be tossed twice in = 6 × 6 = 36 ways (as per rule of counting)

Number of ways in which we can throw two dice to get a sum of 10 = m = 3 ways (i.e., 4 + 6, 5 + 5 and 6 + 4)

P(A) = m/n = 3/36 = 1/12
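Counting problems such as the vowel and sum-of-10 illustrations above can be checked by listing the cases. The following is a minimal Python sketch, assuming the word is written in capitals and the die is fair; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Vowels in the word 'PROBABILITY' (repeated letters count separately).
word = "PROBABILITY"
vowels = [letter for letter in word if letter in "AEIOU"]
p_vowel = Fraction(len(vowels), len(word))

# Two tosses of a die whose scores add up to 10.
rolls = list(product(range(1, 7), repeat=2))
sum_is_ten = [roll for roll in rolls if sum(roll) == 10]
p_sum_ten = Fraction(len(sum_is_ten), len(rolls))

print(p_vowel, p_sum_ten)  # 4/11 1/12
```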

Chapter Eight: End Chapter Quizzes

1. The outcome of tossing a coin is a

a- simple event

b- mutually exclusive event

c- complementary event

d- compound event

2. Classical probability is measured in terms of

a- an absolute value

b- a ratio

c- absolute value and ratio both


3. Probability is expressed as

a- ratio

b- proportion

c- percentage

d- all the above

4. Classical probability is also known as

a- Laplace's probability

b- mathematical probability

c- a priori probability

d- all the above

5. Each outcome of a random experiment is called

a- Primary event

b- Compound event

c- Derived event

d- All the above

6. The definition of statistical probability was originally

given by

a- De Moivre

b- Laplace

c- Von-Mises

d- Pascal

7. The definition of a priori probability was originally

given by

a- De Moivre

b- Laplace

c- Von-Mises

d- Feller

8. Probability by classical approach has

a- no lacuna

b- only one lacuna

c- only two lacunae

d- many lacunae

9. An event consisting of those elements which are not in

A is called

a- primary event

b- derived event

c- simple event

d- complementary event

10. The probability of the intersection of two mutually

exclusive events is always

a- infinity

b- zero

c- one


Answer key to End Chapter Quizzes:

Chapter One

(1) b , (2) c , (3) c , (4) a , (5) b , (6) c , (7) b , (8) c , (9) c , (10) d

Chapter Two

(1) b , (2) b , (3) c , (4) b , (5) d , (6) c , (7) c , (8) c , (9) c , (10) c

Chapter Three

(1) d , (2) c , (3) a , (4) d , (5) b , (6) b , (7) a , (8) d , (9) a , (10) c

Chapter Four

(1) d , (2) c , (3) b , (4) a , (5) a , (6) b , (7) c , (8) b , (9) c , (10) d

Chapter Five

(1) c , (2) b , (3) c , (4) c , (5) a , (6) b , (7) b , (8) b , (9) c , (10) b

Chapter Six

(1) b , (2) b , (3) d , (4) d , (5) a , (6) c , (7) a , (8) d , (9) b , (10) b

Chapter Seven

(1) d , (2) d , (3) a , (4) c , (5) b , (6) b , (7) c , (8) a , (9) b , (10) d

Chapter Eight

(1) a , (2) b , (3) d , (4) d , (5) a , (6) c , (7) b , (8) d , (9) d , (10) b
