
MB0040 Statistics

Oct 28, 2014


Mohammed Ismail

SMU-Assignments 1st sem MBA
Page 1: MB0040 Statistics

Master of Business Administration - MBA Semester 1
MB0040 – Statistics for Management - 4 Credits

(Book ID: B1129)
Assignment Set – 1

Q1. Define “Statistics”. What are the functions of Statistics? Distinguish between Primary data and Secondary data.

Ans:

Definition of Statistics

Statistics is usually and loosely defined as:
1. A collection of numerical data that measure something.
2. The science of recording, organising, analysing and reporting quantitative information.

Professor A.L. Bowley gave several definitions of Statistics. He defined Statistics as: “i) The science of counting ii) The science of averages iii) The science of measurement of social phenomena, regarded as a whole in all its manifestations. iv) A subject not confined to any one science”¹. However, none of these definitions is complete.

According to Horace Secrist, “Statistics may be defined as the aggregate of facts affected to a marked extent by multiplicity of causes, numerically expressed, enumerated or estimated according to a reasonable standard of accuracy, collected in a systematic manner, for a predetermined purpose and placed in relation to each other”². This definition is both comprehensive and exhaustive. Prof. Boddington, on the other hand, defined Statistics as ‘The science of estimates and probabilities’³. This definition is also not complete.

According to Croxton and Cowden, ‘Statistics is the science of collection, presentation, analysis and interpretation of numerical data from logical analysis’.

Functions of Statistics

Statistics is used for various purposes. It is used to simplify mass data and to make comparisons easier. It is also used to bring out trends and tendencies in the data as well as the hidden relations between variables. All this makes decision making much easier. Let us look at each function of Statistics in detail.

1. Statistics simplifies mass data
Statistical concepts help in simplifying complex data, so that managers can make decisions more easily. Statistical methods reduce the complexity of the data and consequently help in understanding any huge mass of data.

Solved Problem 1: Fifty people were interviewed to rate a regional movie on a scale of 1 to 10, with 1 being for the top movie and 10 being for the worst movie. Table 1.1a shows the ratings given by the 50 customers. Simplify the data.

The data in table 1.1a can be condensed using statistical concepts such as frequency and frequency distribution, and is presented as a frequency table in table 1.1b. In this example, the frequency table was prepared from the bulk data consisting of 50 rating scores. The frequency table is in a condensed and simple form. From the tabled data, we can easily see that the most common rating for the regional movie was 7 (given by 11 customers). Only two customers gave a rating of 1, which means that only two of the 50 customers surveyed liked the regional movie the most.
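The condensing step described above can be sketched in a few lines of Python. The raw ratings used here are hypothetical: only two counts (rating 7 given 11 times, rating 1 given twice) come from the text, and the remaining counts are assumptions chosen so that the total is 50.

```python
from collections import Counter
import random

# Hypothetical counts per rating. Only the counts for ratings 7 and 1
# are taken from the text; all other values are assumed for illustration.
assumed_counts = {1: 2, 2: 3, 3: 4, 4: 5, 5: 6, 6: 8, 7: 11, 8: 6, 9: 3, 10: 2}

# Expand into 50 raw scores and shuffle them to mimic unordered survey data.
ratings = [rating for rating, count in assumed_counts.items() for _ in range(count)]
random.Random(0).shuffle(ratings)

# Condense the bulk data into a frequency table.
frequency = Counter(ratings)
for rating in sorted(frequency):
    print(f"{rating:>2}  {frequency[rating]:>2}")

# The most frequent rating and the number of customers who gave it.
modal_rating, modal_count = frequency.most_common(1)[0]
print("Most common rating:", modal_rating, "given by", modal_count, "customers")
```

The frequency table replaces 50 separate scores with ten rows, which is the simplification the solved problem asks for.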

2. Statistics makes comparison easier
Without statistical methods and concepts, data cannot be collected and compared easily. Statistics helps us to compare data collected from different sources. Grand totals, measures of central tendency, measures of dispersion, graphs and diagrams, and the coefficient of correlation all provide ample scope for comparison.

Example

Hence, visual representation of numerical data helps you to compare the data with less effort and to make effective decisions.

3. Statistics brings out trends and tendencies in the data
After data is collected, it is easy to analyse the trends and tendencies in the data by using the various concepts of Statistics.

4. Statistics brings out the hidden relations between variables
Statistical analysis helps in drawing inferences from data. Statistical analysis brings out the hidden relations between variables.

5. Decision making becomes easier
With the proper application of Statistics and statistical software packages to the collected data, managers can take effective decisions, which can increase the profits of a business.

Primary & Secondary Data

Primary data
Data collected for the first time keeping in view the objective of the survey is known as primary data. Such data is likely to be more reliable. However, the cost of collecting it is much higher. Primary data is collected by the census method. In other words, information on each and every individual of the population is observed.

Collection of primary data can be done by any of the following methods.
1. Direct personal observation

2. Indirect oral interview

3. Information through agencies

4. Information through mailed questionnaires

5. Information through schedule filled by investigators

A sample which consists of the entire population is called a census.

Secondary data
Any information that is used for the current investigation but is obtained from data which has been collected and used by some other agency or person in a separate investigation or survey is known as secondary data. It is available in published or unpublished form. In published form, secondary data is available in research papers, newspapers, magazines, government publications, international publications, and websites. Secondary data is collected for different purposes. Therefore, care should be exercised while making use of it. The accuracy, reliability, objectives and scope of secondary data should be examined thoroughly before use. Secondary data may be collected either by census or by sampling methods.

Differences between primary and secondary data:

Q2. Draw a histogram for the following distribution:

Age             0-10   10-20   20-30   30-40   40-50
No. of people   2      5       10      8       4

Ans:
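A minimal sketch of how the histogram for this distribution could be drawn as text, with one bar per class interval and bar length equal to the class frequency. The function name `text_histogram` and the `#` bar symbol are illustrative choices, not part of the assignment.

```python
# Class intervals and frequencies taken from the table in the question.
classes = ["0-10", "10-20", "20-30", "30-40", "40-50"]
frequencies = [2, 5, 10, 8, 4]

def text_histogram(labels, counts, symbol="#"):
    """Return one line per class; the bar length equals the class frequency."""
    width = max(len(label) for label in labels)
    return [f"{label:<{width}} | {symbol * count} ({count})"
            for label, count in zip(labels, counts)]

for line in text_histogram(classes, frequencies):
    print(line)
```

Because all class intervals have the same width (10 years), the bar lengths can be compared directly; the tallest bar belongs to the 20-30 class.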

Q3. Find (i) the arithmetic mean and (ii) the median value of the following set of values: 40, 32, 24, 36, 42, 18, 10.

Ans:

Arranging in ascending order: 10, 18, 24, 32, 36, 40, 42

(i) Arithmetic mean = (10 + 18 + 24 + 32 + 36 + 40 + 42)/7 = 202/7 = 28.86

(ii) Median = the 4th of the 7 ordered values = 32
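The two results can be cross-checked with Python's standard `statistics` module; this is only a verification sketch, not part of the required hand calculation.

```python
import statistics

values = [40, 32, 24, 36, 42, 18, 10]

mean_value = statistics.mean(values)      # sum of values / number of values
median_value = statistics.median(values)  # middle value after sorting

print("Ordered values:", sorted(values))
print("Arithmetic mean:", round(mean_value, 2))
print("Median:", median_value)
```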

Q4. Calculate the standard deviation of the following data:

Marks             78-80   80-82   82-84   84-86   86-88   88-90
No. of students   3       15      26      23      9       4

Ans:

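Each class is represented by its mid-value x, and f is the number of students in the class, so that n = Σf = 80.

Arithmetic mean M = Σfx/n = 6704/80 = 83.8

Marks    x     f     fx     (x – M)   (x – M)²   f(x – M)²
78-80    79    3     237    -4.8      23.04      69.12
80-82    81    15    1215   -2.8      7.84       117.60
82-84    83    26    2158   -0.8      0.64       16.64
84-86    85    23    1955   1.2       1.44       33.12
86-88    87    9     783    3.2       10.24      92.16
88-90    89    4     356    5.2       27.04      108.16
Total          80    6704                        436.80

S = √[Σf(x – M)²/(n – 1)] = √(436.80/79) = √5.529 = 2.35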
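A short sketch of the grouped-data calculation, assuming each class is represented by its mid-value and weighted by the number of students in it. The `ddof` parameter is an illustrative name: `ddof=1` divides by n – 1 as in the formula above, and `ddof=0` divides by n.

```python
import math

# Class limits and frequencies from the question.
classes = [(78, 80), (80, 82), (82, 84), (84, 86), (86, 88), (88, 90)]
freqs = [3, 15, 26, 23, 9, 4]

def grouped_std(classes, freqs, ddof=1):
    """Mean and standard deviation of grouped data using class mid-values."""
    mids = [(low + high) / 2 for low, high in classes]
    n = sum(freqs)
    mean = sum(f * x for f, x in zip(freqs, mids)) / n
    squared_dev = sum(f * (x - mean) ** 2 for f, x in zip(freqs, mids))
    return mean, math.sqrt(squared_dev / (n - ddof))

mean, sd = grouped_std(classes, freqs)
print("Mean:", round(mean, 2))
print("Standard deviation (n - 1):", round(sd, 2))
print("Standard deviation (n):", round(grouped_std(classes, freqs, ddof=0)[1], 2))
```

With 80 observations the two divisors give nearly the same value, so the choice between n and n – 1 matters little here.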

Q5. Explain the following terms with respect to Statistics: (i) Sample, (ii) Variable, (iii) Population.

Ans:

Sample
A sample is a finite subset of a population. A sample is drawn from a population to estimate the characteristics of the population. Sampling is a tool which enables us to draw conclusions about the characteristics of the population. The figure illustrates this:

Illustration of population and sample

Advantages of Sampling
The advantages of sampling are:

In a short time we get maximum information about the population.

It results in a considerable saving of time and labour.

The organisation and administration of a sample survey is relatively much simpler.

The results obtained are reliable, and it is always possible to attach a degree of reliability to them.

There is a possibility of obtaining detailed information. In other words, there is a greater scope.

In the case of an infinite population, it is the only available method.

If the units are destroyed or affected adversely in the course of investigation, then the only method is sampling.

Variable
In a population, some characteristics remain the same for all units and some others vary from unit to unit. A quantitative characteristic is one which is numerically measurable; otherwise it is a qualitative characteristic. The quantitative characteristic that varies from unit to unit is called a variable. The qualitative characteristic that varies from unit to unit is called an attribute. A variable that assumes only some specified values in a given range is known as a discrete variable. A variable that assumes all the values in the range is known as a continuous variable. For example, the number of children per family and the number of petals in a flower are discrete variables. The height and weight of persons are continuous variables.

Universe or Population
Statistical surveys or enquiries deal with studying various characteristics of units belonging to a group. The group consisting of all the units is called the Universe or Population. The figure illustrates the population.

Illustration of population

Example 1

In a statistical survey aimed at determining the average per capita income of the people in the city, all earning individuals in the city form the population.

Types of population
The figure displays the types of population along with the explanation.

Fig: Types of population

Although many populations appear to be exceedingly large, no truly infinite population of physical objects actually exists. However, given limited resources and time, it is practically not possible to count, for example, the number of grains of sand on a beach. Such populations are treated as infinite populations.

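Q6. An unbiased coin is tossed six times. What is the probability that the tosses will result in: (i) at least four heads, and (ii) exactly two heads?

Ans:

Let ‘A’ be the event of getting a head, and let X be the number of heads in the six tosses. Given that: n = 6, p = P(A) = 1/2 and q = 1 – p = 1/2, so that P(X = r) = 6Cr * (1/2)^r * (1/2)^(6-r).

(ii) The probability that the tosses will result in exactly two heads is given by:

P(X = 2) = 6C2 * (1/2)^2 * (1/2)^(6-2) = 15 * (1/4) * (1/16) = 15/64

Therefore, the probability that the tosses will result in exactly two heads is 15/64.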

 

(i) The probability that the tosses will result in at least four heads is given by:

P(X >= 4) = P(X = 4) + P(X = 5) + P(X = 6)
= {6C4 * (1/2)^4 * (1/2)^(6-4)} + {6C5 * (1/2)^5 * (1/2)^(6-5)} + {6C6 * (1/2)^6 * (1/2)^(6-6)}
= {15 * (1/16) * (1/4)} + {6 * (1/2)^6} + {(1/2)^6}
= (15/64) + (6/64) + (1/64)
= (15/64) + (7/64) = 22/64 = 11/32

Therefore, the probability that the tosses will result in at least four heads is 22/64, that is, 11/32.
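Both probabilities can be checked exactly with a small binomial sketch. The helper name `binom_pmf` is illustrative; exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for a binomial variable with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n_tosses = 6
p_head = Fraction(1, 2)  # unbiased coin

p_exactly_two = binom_pmf(2, n_tosses, p_head)
p_at_least_four = sum(binom_pmf(k, n_tosses, p_head) for k in range(4, n_tosses + 1))

print("P(X = 2)  =", p_exactly_two)
print("P(X >= 4) =", p_at_least_four)
```

`Fraction` reduces 22/64 to its lowest terms, so the second result is printed as 11/32.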

Master of Business Administration - MBA Semester 1
MB0040 – Statistics for Management - 4 Credits

(Book ID: B1129)
Assignment Set – 2

Q1. Find Karl Pearson’s correlation co-efficient for the data given in the below table:

X   18   16   12   8    4
Y   22   14   12   10   8

Ans:

The table below displays the sums calculated for the data given in the question:

X     Y     X²    Y²    XY
18    22    324   484   396
16    14    256   196   224
12    12    144   144   144
8     10    64    100   80
4     8     16    64    32

ΣX = 58   ΣY = 66   ΣX² = 804   ΣY² = 988   ΣXY = 876

Solution: Applying the formula for ‘r’ and substituting the respective values from the table (with n = 5), we get r as:

r = [nΣXY – (ΣX)(ΣY)] / √{[nΣX² – (ΣX)²] [nΣY² – (ΣY)²]}
= [5(876) – (58)(66)] / √{[5(804) – 58²] [5(988) – 66²]}
= 552 / √(656 x 584)
= 552 / 618.95
= 0.89

Hence, Karl Pearson’s correlation coefficient is 0.89.
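A minimal sketch of the product-moment formula, applied to the X and Y values stated in the question. The function name `pearson_r` is illustrative; the sums it builds correspond to the columns of the working table.

```python
import math

def pearson_r(xs, ys):
    """Karl Pearson's correlation coefficient from raw sums."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_yy = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_xx - sum_x**2) * (n * sum_yy - sum_y**2))
    return numerator / denominator

x = [18, 16, 12, 8, 4]    # X values as given in the question
y = [22, 14, 12, 10, 8]   # Y values as given in the question

r = pearson_r(x, y)
print("r =", round(r, 2))
```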

Q2. Find the (i) arithmetic mean (ii) range and (iii) median of the following data: 15, 17, 22, 21, 19, 26, 20.

Ans:

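(i) Arithmetic mean = (15 + 17 + 22 + 21 + 19 + 26 + 20)/7 = 140/7 = 20

(ii) Range = highest value – lowest value = 26 – 15 = 11

(iii) Arranging in ascending order: 15, 17, 19, 20, 21, 22, 26. Median = the 4th of the 7 ordered values = 20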
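The three measures can be computed directly from their definitions, as in the sketch below. The function name `summarize` is illustrative.

```python
def summarize(values):
    """Return (arithmetic mean, range, median) of a list of numbers."""
    ordered = sorted(values)
    n = len(ordered)
    mean = sum(ordered) / n
    value_range = ordered[-1] - ordered[0]  # highest value minus lowest value
    middle = n // 2
    if n % 2:  # odd count: the single middle value
        median = ordered[middle]
    else:      # even count: average of the two middle values
        median = (ordered[middle - 1] + ordered[middle]) / 2
    return mean, value_range, median

data = [15, 17, 22, 21, 19, 26, 20]
mean, value_range, median = summarize(data)
print("Mean:", mean, "Range:", value_range, "Median:", median)
```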

Q3. What is the importance of classification of data? What are the types of classification of data?

Ans:

Classification is the first stage in simplification. It can be defined as a systematic grouping of the units according to their common characteristics. Each group is called a class. For example, in a survey of industrial workers of a particular industry, workers can be classified as unskilled, semi-skilled and skilled, each of which forms a class.

Learning Objectives
By the end of this unit, you should be able to:

Describe the functions and methods of classification

Identify the parts of table

Describe the functions of tabulation

Calculate the frequency and frequency distribution for the data

Display the numerical data as graphical representation

Functions of Classification
Classification of data performs many functions.

It condenses the bulk data

It simplifies the data and makes the data more comprehensible

It facilitates comparison of characteristics

It renders the data ready for any statistical analysis

Requisites of a good classification
A good classification should be:

Unambiguous: It should not lead to any confusion

Exhaustive: Every unit should be allotted to one and only one class

Mutually exclusive: There should not be any overlapping

Flexible: It should be capable of adjusting to changing situation

Suitable: It should be suitable to objectives of survey

Stable: It should remain stable throughout the investigation

Homogeneous: There should be similar units in the same class

Revealing: It should bring out essential features of the collected data

Types of classification
The important types of classification are:

Geographical classification
Data classified according to region is geographical classification.

Chronological classification
Data classified according to the time of its occurrence is called chronological classification.

Conditional classification
Classification of data done according to certain conditions is called conditional classification.

Qualitative classification
Classification of data that is immeasurable is called qualitative classification. For example, sex of a person, marital status, color and others.

Quantitative classification
Classification of data that is measurable either in discrete or continuous form is called quantitative classification.

Statistical Series
Data is arranged logically according to size or time of occurrence or some other measurable or non-measurable characteristics.

Q4. The data given in the below table shows the production in three shifts and the number of defective goods that turned out in three weeks. Test at 5% level of significance whether the weeks and shifts are independent.

Shift   1st Week   2nd Week   3rd Week   Total
I       15         5          20         40
II      20         10         20         50
III     25         15         20         60
Total   60         30         60         150

Ans:

Observed and expected values for the data of the above problem

Observed Value (O)   Expected Value (E)     (O – E)²   (O – E)²/E
15                   40 x 60/150 = 16       1          0.0625
20                   50 x 60/150 = 20       0          0.0000
25                   60 x 60/150 = 24       1          0.0417
5                    40 x 30/150 = 8        9          1.1250
10                   50 x 30/150 = 10       0          0.0000
15                   60 x 30/150 = 12       9          0.7500
20                   40 x 60/150 = 16       16         1.0000
20                   50 x 60/150 = 20       0          0.0000
20                   60 x 60/150 = 24       16         0.6667
Total                                                  χ² = 3.6459

The steps followed to calculate χ² are described below.

1. Null hypothesis ‘Ho’: The weeks and shifts are independent
   Alternate hypothesis ‘HA’: The weeks and shifts are dependent

2. Level of significance is 5% and degrees of freedom (D.O.F) = (3 – 1)(3 – 1) = 4

3. Test statistic: χ² = Σ[(O – E)²/E]

4. Test: χ²cal = 3.6459

5. Conclusion: Since χ²cal (3.6459) < χ²tab (9.49), ‘Ho’ is accepted. Hence, the attributes ‘week’ and ‘shifts’ are independent.
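The test above can be reproduced with a short sketch that derives each expected value as (row total x column total)/grand total. The function name is illustrative, and the critical value 9.488 (5% level, 4 degrees of freedom) is hard-coded from a standard chi-square table rather than computed.

```python
# Observed defectives: rows are shifts I-III, columns are weeks 1-3.
observed = [
    [15, 5, 20],
    [20, 10, 20],
    [25, 15, 20],
]

def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (obs - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof

chi2, dof = chi_square_independence(observed)
CRITICAL_5PCT_DOF4 = 9.488  # tabulated value, taken from a chi-square table
reject_null = chi2 > CRITICAL_5PCT_DOF4

print("chi-square:", round(chi2, 4), "dof:", dof)
print("Reject Ho:", reject_null)
```

The unrounded statistic is about 3.6458; the hand calculation gives 3.6459 because each term was rounded to four decimals before summing.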

Q5. What is sampling? Explain briefly the types of sampling.

Ans:

Types of Sampling
By choosing a sampling technique carefully, errors can be minimised. Let us take a look at the different techniques available. The sampling techniques may be broadly classified into:
i) Probability Sampling

ii) Non-Probability Sampling

1 Probability sampling
Probability sampling provides a scientific technique of drawing samples from the population. Samples are drawn according to a law under which each unit has a predetermined probability of being included in the sample. The different ways of assigning probability are:
i) Each unit has the same chance of being selected.

ii) Sampling units have varying probability

iii) Units have probability proportional to their size

We will discuss here some of the important probability sampling designs.

Simple random sampling
Under this technique, sample units are drawn in such a way that each and every unit in the population has an equal and independent chance of being included in the sample. If a sample unit is replaced before drawing the next unit, then it is known as Simple Random Sampling With Replacement [SRSWR]. If the sample unit is not replaced before drawing the next unit, then it is called Simple Random Sampling Without Replacement [SRSWOR]. In the first case, the probability of drawing a unit is 1/N, where N is the population size. In the second case, each possible sample of n units has the same probability, 1/NCn, of being drawn.

Lottery method: In lottery method, we identify each and every unit with distinct numbers by allotting an identical card. The cards are put in a drum and thoroughly shuffled before each unit is drawn. The figure represents a lotto machine through which samples can be selected randomly.

Fig: Lottery Machine

The use of table of random numbers: There are several random number tables. They are Tippett’s random number table, Fisher and Yates tables, Kendall and Babington Smith’s random tables, Rand Corporation random numbers and so on.

Table : Tippett’s random number table

Suppose, we want to select 10 units from a population size of 100. We number the population units from 00 to 99. Then we start taking 2 digits. Suppose, we start with 41 (second row) then the other numbers selected will be 67, 95, 24, 15, 45, 13, 96, 72, 03.
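In place of a printed random number table, a pseudo-random generator can play the same role. The sketch below draws 10 units from a population labelled 00 to 99, both with and without replacement; the function names and the fixed seed are illustrative assumptions, so the units drawn will differ from those read off Tippett's table.

```python
import random

def srswr(population, n, rng):
    """Simple random sampling with replacement: a unit may be drawn more than once."""
    return [rng.choice(population) for _ in range(n)]

def srswor(population, n, rng):
    """Simple random sampling without replacement: each unit appears at most once."""
    return rng.sample(population, n)

rng = random.Random(42)   # fixed seed (an assumption) so the run is repeatable
units = list(range(100))  # population units numbered 00 to 99

with_replacement = srswr(units, 10, rng)
without_replacement = srswor(units, 10, rng)

print("SRSWR :", [f"{u:02d}" for u in with_replacement])
print("SRSWOR:", [f"{u:02d}" for u in without_replacement])
```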

Stratified random sampling
This sampling design is most appropriate if the population is heterogeneous with respect to the characteristic under study or the population distribution is highly skewed. We subdivide the population into several groups or strata such that:
i) Units within each stratum are more homogeneous

ii) Units between strata are heterogeneous

iii) Strata do not overlap, in other words, every unit of population belongs to one and only one stratum

The criteria used for stratification are geographical, sociological, age, sex, income and so on. The population of size ‘N’ is divided into ‘K’ relatively homogeneous strata of sizes ‘N1’, ‘N2’, ..., ‘Nk’ such that ‘N1 + N2 + ... + Nk = N’. Then, we draw a simple random sample from each stratum, either proportional to the size of the stratum or with an equal number of units from each stratum.
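A minimal sketch of proportional allocation followed by a simple random draw within each stratum. The strata names and sizes (500, 300 and 200 units) and the sample size of 50 are hypothetical values chosen only for illustration.

```python
import random

def proportional_allocation(stratum_sizes, n):
    """Units to draw from each stratum, proportional to the stratum size.

    Rounding can make the total differ slightly from n for sizes that do
    not divide evenly; the hypothetical sizes below divide exactly.
    """
    total = sum(stratum_sizes.values())
    return {name: round(n * size / total) for name, size in stratum_sizes.items()}

def stratified_sample(strata, n, rng):
    """Draw a simple random sample (without replacement) from every stratum."""
    sizes = {name: len(units) for name, units in strata.items()}
    allocation = proportional_allocation(sizes, n)
    return {name: rng.sample(units, allocation[name]) for name, units in strata.items()}

# Hypothetical population of N = 1000 units split into three strata.
strata = {
    "urban": list(range(0, 500)),
    "semi-urban": list(range(500, 800)),
    "rural": list(range(800, 1000)),
}

sample = stratified_sample(strata, 50, random.Random(1))
for name, units in sample.items():
    print(name, len(units))
```

Allocating an equal number of units to each stratum, the other option mentioned above, would only require replacing the allocation step.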

Systematic sampling
This design is recommended if we have a complete list of sampling units arranged in some systematic order such as geographical, chronological or alphabetical order. Suppose the population size is ‘N’. The population units are serially numbered ‘1’ to ‘N’ in some systematic order and we wish to draw a sample of ‘n’ units. Then we divide units from ‘1’ to ‘N’ into ‘n’ groups such that each group has ‘K’ units. This implies ‘nK = N’ or ‘K = N/n’. From the first group, we select a unit at random. Suppose the unit selected is the 6th unit; thereafter we select every ‘K’th unit after it, that is, units 6 + K, 6 + 2K and so on. If ‘K’ is 20, ‘n’ is 5 and ‘N’ is 100, then the units selected are 6, 26, 46, 66, 86.
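The selection rule can be sketched as follows, assuming that N is an exact multiple of n so that the sampling interval K = N/n is a whole number. The function name is illustrative.

```python
import random

def systematic_sample(population_size, sample_size, start=None, rng=None):
    """Select every K-th unit after a random start in the first K units."""
    interval = population_size // sample_size  # K = N/n
    if start is None:
        start = (rng or random).randint(1, interval)  # random unit in the first group
    if not 1 <= start <= interval:
        raise ValueError("start must lie in the first group of K units")
    return [start + i * interval for i in range(sample_size)]

# The worked example from the text: N = 100, n = 5, random start = 6.
print(systematic_sample(100, 5, start=6))
```

Only the starting unit is random; once it is fixed, the whole sample is determined.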

Cluster sampling
The total population is divided into recognisable sub-divisions, known as clusters, such that within each cluster units are more heterogeneous and between clusters they are homogeneous. The units are selected from each cluster by suitable sampling techniques. The figure represents cluster sampling, where each packet of candy forms a cluster.

Fig: Cluster sampling

Multi-stage sampling
The total population is divided into several stages, and the sampling process is carried out through these stages, as represented in the figure.

Fig: Multistage sampling

2 Non-probability sampling
Depending upon the object of enquiry and other considerations, a predetermined number of sample units is selected purposely so that they represent the true characteristics of the population. A serious drawback of this sampling design is that it is highly subjective in nature. The selection of sample units depends entirely upon the personal convenience, biases, prejudices and beliefs of the investigator. This method will be more successful if the investigator is thoroughly skilled and experienced.

Judgment Sampling
The choice of sample items depends exclusively on the judgment of the investigator. The investigator’s experience and knowledge about the population will help to select the sample units. It is the most suitable method if the population size is small.

Table: Merits and demerits of judgement sampling

Convenience sampling
The sample units are selected according to the convenience of the investigator. It is also called “chunk”, which refers to the fraction of the population being investigated which is selected neither by probability nor by judgment. Moreover, a list or framework should be available for the selection of the sample. It is used to make pilot studies. However, there is a high chance of bias being introduced.

Quota sampling
It is a type of judgment sampling. Under this design, quotas are set up according to some specified characteristic such as age groups or income groups. From each group a specified number of units are sampled according to the quota allotted to the group. Within the group the selection of sample units depends on personal judgment. It has a risk of personal prejudice and bias entering the process. This method is often used in public opinion studies.

Q6. Suppose two houses in a thousand catch fire in a year and there are 2000 houses in a village. What is the probability that: (i) none of the houses catches fire and (ii) at least one house catches fire?

Ans:
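One way to approach this question is sketched below, under the assumption that the number of houses catching fire in a year follows a Poisson distribution with mean λ = np, where n = 2000 and p = 2/1000 = 0.002, so that λ = 4. The exact binomial value is computed alongside for comparison; the variable and function names are illustrative.

```python
import math

houses = 2000
p_fire = 2 / 1000        # two houses in a thousand catch fire in a year
lam = houses * p_fire    # Poisson mean, lambda = n * p = 4

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson variable with mean lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

p_none = poisson_pmf(0, lam)   # (i) no house catches fire: e^(-4)
p_at_least_one = 1 - p_none    # (ii) complement of (i)

# Exact binomial probability of no fire, for comparison with the approximation.
p_none_binomial = (1 - p_fire) ** houses

print("lambda =", lam)
print("(i)  P(X = 0)  =", round(p_none, 4))
print("(ii) P(X >= 1) =", round(p_at_least_one, 4))
print("Binomial P(X = 0) =", round(p_none_binomial, 4))
```

Under the Poisson assumption, P(X = 0) = e^(-4), approximately 0.0183, and P(X >= 1) = 1 – e^(-4), approximately 0.9817. The binomial figure agrees to about three decimal places, which is why the approximation is reasonable when n is large and p is small.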
