Page 1: Quantitative Techniques in Management New Book

  Quantitative Techniques in Management

  Chapter 1: MANAGEMENT AND DECISION-MAKING

   

  INTRODUCTION

Decision-making is an essential and dominating part of the management process. Although authorities sometimes differ in their definitions of the basic functions of management, everybody agrees that one is not a manager unless he has some authority to plan, organise and control the activities of an enterprise and the behaviour of others. Within this context, decision-making may be viewed as the power to determine what plans will be made and how activities will be organized and controlled. The right to make decisions is an integral part of the right of authority upon which the entire concept of management rests. Essentially then, decision-making pervades the activities of every business manager. Further, since management is engaged in a continuous process of decision-making in carrying out each of the key managerial functions of planning, organizing, directing and controlling, we can go to the extent of saying that management may be regarded as equivalent to decision-making.

Traditionally, decision-making has been considered purely as an art, a talent which is acquired over a period of time through experience. It has been considered so because a variety of individual styles can be traced in handling and successfully solving similar types of managerial problems in actual business. However, the environment in which the management has to operate nowadays is complex and fast changing. There is a greater need for supplementing the art of decision-making by systematic and scientific methods. A systematic approach to decision-making is necessary because today's business and the environment in which it functions are far more complex than in the past, and the cost of making errors is becoming graver with time. Most business decisions cannot be made simply on the basis of rule of thumb, using commonsense and / or snap judgment. Commonsense may be misleading and snap judgments may have painful implications. For a large business, a single wrong decision may not only be ruinous but may also have ramifications in national or even international economies. As such, present-day managements cannot rely solely on a trial and error approach and the managers have to be more sophisticated. They should employ scientific methods to help them make proper choices. Thus, the decision makers in the business world of today must understand scientific methodology for making decisions. This calls for (1) defining the problem in a clear manner, (2) collecting pertinent facts, (3) analyzing the facts thoroughly, and (4) deriving and implementing the solution.

DECISION-MAKING AND QUANTITATIVE TECHNIQUES

Managerial decision-making is a process by which management, when faced with a problem, chooses a specific course of action from a set of possible options. In making a decision, a business manager attempts to choose that course of action which is most effective in the given circumstances in attaining the goals of the organization. The various types of decision-making situations that a manager might encounter can be listed as follows.

1. Decisions under certainty, where all facts are known fully and for sure, or under uncertainty, where the event that would actually occur is not known but probabilities can be assigned to various possible occurrences.

2. Decisions for one time-period only called static decisions, or a sequence of interrelated decisions made either simultaneously or over several time periods called dynamic decisions.

3. Decisions where the opponent is nature (digging an oil well, for example) or a rational opponent (for instance, setting the advertising strategy when the actions of competitors have to be considered).

These classes of decision-making situations are not mutually exclusive and a given situation would exhibit characteristics from each class. Stocking of an item for sale in a certain trade fair, for instance, illustrates a static decision-making situation where uncertainty exists and nature is the opponent.

The elements of any decision are :

a. a decision-maker who could be an individual, group, organization, or society;

b. a set of possible actions that may be taken to solve the decision problem;

c. a set of possible states that might occur;

d. a set of consequences (pay-offs) associated with various combinations of courses of action and the states that may occur; and

e. the relationship between the pay-offs and the values of the decision maker.

In an actual decision-making situation, definition and identification of the alternatives, the states and the consequences are most difficult, albeit not the most crucial, aspects of the decision problem.
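The elements listed above can be laid out as a small pay-off table. The sketch below is only an illustration of the trade-fair stocking situation mentioned earlier: the stock levels, demand levels, probabilities, unit cost and unit price are all assumed figures, and choosing the action with the highest expected pay-off is one possible decision rule among several.

```python
# Hypothetical pay-off table for the trade-fair stocking decision.
# Every number here is an illustrative assumption, not data from the text.
UNIT_COST = 6
UNIT_PRICE = 10

actions = [10, 20, 30]                 # possible actions: units stocked
states = {10: 0.3, 20: 0.5, 30: 0.2}   # possible states: units demanded -> assumed probability


def payoff(stocked, demanded):
    """Consequence of one action/state pair: revenue from units sold minus cost of stock."""
    sold = min(stocked, demanded)
    return UNIT_PRICE * sold - UNIT_COST * stocked


def expected_payoff(stocked):
    """Probability-weighted average pay-off of one action over all states."""
    return sum(prob * payoff(stocked, demand) for demand, prob in states.items())


# The decision-maker's values are represented here by "prefer the higher expected pay-off".
best_action = max(actions, key=expected_payoff)
```

With these assumed figures, stocking 20 units has the highest expected pay-off; different probabilities or a more risk-averse decision-maker could lead to a different choice.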

In real life, some decision-making situations are simple while others are not. Complexities in decision situations arise due to several factors. These include the complicated manner of interaction of the economic, political, technological, environmental and competitive forces in society; the limited resources of an organization; the values, risk attitudes and knowledge of the decision-makers; and the like. For example, a company's decision to introduce a new product will be influenced by such considerations as market conditions, labour rates and availability, and investment requirements and availability of funds. The decision will call for a multidimensional response, including the production methodology, cost and quality of the product, price, package design, and marketing and advertising strategy. The results of the decision would conceivably affect every segment of the organisation. The essential idea of the quantitative approach to decision-making is that if the factors that influence the decisions can be identified and quantified then it becomes easier to resolve the complexity of the decision-making situations. Thus, in dealing with complex problems, we may use the tools of quantitative analysis. In fact, a large number of business problems have been given a quantitative representation with varying degrees of success and it has led to a general approach which is variously designated as operations research (or operational research), management science, systems analysis, decision analysis, decision science, etc. Quantitative analysis is now extended to several areas of business operations and represents probably the most effective approach to handling of some types of decision problems.

A significant benefit of attaining some degree of proficiency with quantitative methods is exhibited in the way the problems are perceived and formulated. A problem has to be well defined before it can be formulated into a well-structured framework for solution. This requires an orderly and organised way of thinking.

Two observations may be made here. First, it should be understood clearly that a decision by itself does not become a good and right decision for adoption merely because it is made within an orderly and mathematically precise framework. Quantification at best is an aid to business judgment and not its substitute. A certain degree of constructive skepticism is as desirable in considering a quantitative analysis of business decisions as it is in any other process of decision-making. Further, some allowances should be made for qualitative factors involving morale, motivation, leadership, etc. which cannot be ignored. But they should not be allowed to dominate to such an extent that the quantitative analysis may look to be an interesting academic exercise, but worthless. In fact, the manager should seek some balance between quantitative and qualitative factors.

Second, it may be noted that the various names for quantitative analysis (operations research, management science, etc.) connote more or less the same general approach. We shall not attempt to discuss the differences among the various labels as it is prone to create more heat than light, but only state that the basic reason for so many titles is that the field is relatively new and there is no consensus regarding which fields of knowledge it includes. We shall now briefly discuss operations research: its historical development, its nature and characteristics, and its methodology. This shall be followed by a discussion of the plan of this book.

HISTORICAL DEVELOPMENT OF OPERATIONS RESEARCH (OR)

While it is difficult to mark the 'beginning' of operations research / management science, the scientific approach to management can be traced back to the era of the Industrial Revolution and even to periods before that. But operations research, as it exists today, was born during the Second World War when the British military management called upon a group of scientists to examine the strategies and tactics of various military operations with the intention of efficient allocation of scarce resources for the war effort.

The name operational research was derived directly from the context in which it was used: research activity on operational areas of the armed forces. British scientists spurred the American military management to similar research activities (where it came to be known as operations research). Among the investigations carried out by them were (i) the determination of the optimum convoy size to minimise losses from submarine attacks, (ii) the determination of the optimal way to deploy radar units in order to maximise potential coverage against possible enemy attacks, and (iii) the invention of new flight patterns and the determination of the correct color of the aircraft in order to minimise the chance of detection by the submarines.

After the war, operations research was adopted by the industry and some of the techniques that had been applied to the complex problems of war were successfully transferred and assimilated for use in the industrialized sector.

The dramatic development and refinement of the techniques of operations research and the advent of digital computers are the two prime factors that have contributed to the growth and application of OR in the post-war period. In the 1950s OR was mainly used to handle management problems that were clear cut, well structured and repetitive in nature. Typically, they were of a tactical and operational nature such as inventory control, resource allocation, scheduling of construction projects, etc. Since the 1960s, however, formal approaches have been increasingly adopted for the less well-structured planning problems as well. These problems are strategic in nature and are the ones that affect the future of the organization. The development of corporate planning models and those relating to the financial aspects of the business, for example, are problems of this type. Thus, in the field of business and industry, operations research helps the management to determine their tactical and strategic decisions more scientifically.


Nature and Characteristic Features of OR

In general terms, operations research attempts to provide a systematic and rational approach to the fundamental problems involved in the control of systems by making decisions which, in a sense, achieve the best results considering all the information that can be profitably used. A classical definition of OR is given by Churchman et al.: "....Operations Research is the application of scientific methods, techniques and tools to problems involving the operations of systems so as to provide those in control of operations with optimum solutions to the problems". Thus, it may be regarded as the scientific method employed for problem solving and decision-making by the management.

The significant features of operations research are given below:

1. Decision Making: Primarily, OR is addressed to managerial decision-making or problem solving. A major premise of OR is that decision-making, irrespective of the situations involved, can be considered a general systematic process that consists of the following steps.

a. Define the problem and establish the criterion which will be used. The criterion may be the maximization of profits or utility, the minimization of costs, etc.

b. Select the alternative courses of action for consideration.

c. Determine the model to be used and the values of the parameters of the process.

d. Evaluate the alternatives and choose the one which is optimum.

2. Scientific Approach: OR employs scientific methods for the purpose of solving problems. It is a formalized process of reasoning and consists of the following steps.

a. The problem to be analyzed is defined clearly and the conditions for observations are determined.

b. Observations are made under varying conditions to determine the behaviour of the system.

c. On the basis of the observations, a hypothesis describing how the various factors involved are believed to interact and the best solution to the problem is formulated.

d. To test the hypothesis, an experiment is designed and executed. Observations are made and measurements are recorded.

e. Finally, the results of the experiments are analyzed and the hypothesis is either accepted or rejected. If the hypothesis is accepted, the best solution to the problem is obtained.

3. Objective: OR attempts to locate the best or optimal solution to the problem under consideration. For this purpose, it is necessary that a measure of effectiveness is defined which is based on the goals of the organization. This measure is then used as the basis to compare the alternative courses of action.

4. Inter-disciplinary Team Approach: OR is inter-disciplinary in nature and requires a team approach to a solution of the problem. No single individual can have a thorough knowledge of the myriad aspects of operations research and how the problems may be addressed. Managerial problems have economic, physical, psychological, biological, sociological and engineering aspects. This requires a blend of people with expertise in the areas of mathematics, statistics, engineering, economics, management, computer science and so on. Of course, it is not always so. Some problem situations may be adequately handled even by one individual.

5. Digital Computer: Use of a digital computer has become an integral part of the operations research approach to decision-making. The computer may be required due to the complexity of the model, volume of data required or the computations to be made. Many OR techniques are available in the form of 'canned' programmes.

Methodology of Operations Research

The basic and dominant characteristic feature of operations research is that it employs mathematical representations or models to analyse problems. This distinctive approach represents an adaptation of the scientific methodology used by the physical sciences. The scientific method translates a given real problem into a mathematical representation which is solved and retransformed into the original context. The OR approach to problem solving consists of the following steps.

1. Formulate the problem.

2. Determine the assumptions (model building) and formulate the problem in a mathematical framework.

3. Solve the model formulated and interpret the results.

4. Validate the model.

5. Implement the solution obtained.

We shall now discuss each of the steps one by one.

1. Problem Formulation: The first and the most important requirement is that the root problem should be identified and understood. Logically speaking, we cannot expect to get the right answer if the problem is identified incorrectly. In that case, the solution derived from it is apt to be useless and all the efforts in that direction shall be a waste. The problem should be identified properly because often what is described as a problem may only be its symptom. For example, excessive costs per se do not constitute a problem. They are only an indication of some problem which may, for instance, be improper inventory levels, excessive wastage and the like. Often the symptoms of a problem may extend beyond a single manager's control to other personnel and other departments in an organization. Thus, it is necessary for an operations researcher to understand that the formulation of a problem develops from a complicated interaction that involves the selection and interpretation of the data between himself and the manager.

Once the problem has been identified, it is categorized as being standard or special. The standard problems are also known as programmed problems. As has already been mentioned, they are the well-structured problems characterized by routine, repetitive decisions which utilise specific decision-making techniques in their solution strategy. Standard solution procedures have been developed to handle such prototype problems. Examples of these problems include the assignment of workers to jobs, fixing the product-mix for a month and determination of the quantity of materials to be bought. On the other hand, there are special or non-programmed problems. They are unique and non-recurrent in nature and, therefore, ill-structured. Undertaking of a research and development project and the merger and consolidation decisions illustrate this type of decision situation.

2. Model Building: Once the problem is defined, the next step is to build a suitable model. As has already been mentioned, the concepts of models and model building lie at the very heart of the operations research approach to problem solving. A model is a theoretical abstraction of a real-life problem. In fact, many real-life situations tend to be very complex because there are literally innumerable inherent factors in any given situation. Thus, the decision maker has to abstract from the empirical situation those factors which are most relevant to the problem. Having selected the critical factors, he combines them in some logical manner so that they form a counterpart or a model of the actual problem.

Thus, a model is a simplified representation of a real-world situation that, ideally, strips a natural phenomenon of its bewildering complexity and replicates its essential behavior. Models may be represented in a variety of ways. They can be classified as physical and symbolic models.

a. Physical Models: A physical model is a physical or a schematic representation of the real thing. There are two types of physical models: iconic and analogue.

i. Iconic Models: They are essentially the scaled-up / down versions of the particular thing they represent. A model airplane in a wind tunnel, a model of a proposed building provided by an architect, models of the sun and its planets housed in a planetarium, and a model of a particular molecular structure of a chemical are examples of iconic models because they look like what they represent (except in size). Maps, pictures or drawings may also be categorised as iconic models since they represent essentially the images of certain things. The chief merit of an iconic model is that it is concrete and specific. It resembles visually the thing it represents and, therefore, there are likely to be fewer problems in translating any finding from the model into the real-life situation. However, the disadvantage of such models is that they often do not lend themselves to manipulation for experimental purposes.

ii. Analogue Models: The analogue models use one set of properties to represent another set. To illustrate, an electrical network model may be used as an analogue model to study the flows in a transportation system. Similarly, a barometer which indicates changes in atmospheric pressure through movements of a needle is an example of an analogue model, and the contour lines on a map are analogues of elevation. In general, the analogue models are less specific and less concrete but they are easier to manipulate as compared to the iconic models.

b. Symbolic Models: Many real-life problems can be described by symbolic models or mathematical forms. These are the most general and abstract type of models. They employ letters, numbers and other types of symbols to represent the variables and their inter-relationships. As such, they are capable of experimental manipulation most easily. The symbolic models can be verbal or mathematical. Whereas the verbal models describe a situation in spoken language or written words, the mathematical models employ mathematical notation to represent, in a precise manner, the variables of the real situations. The mathematical models take the form of mathematical relationships that portray the structure of what they are intended to represent. The use of a verbal versus a mathematical model could be shown by the formula for finding the perimeter of a rectangle. A verbal model would express this problem as follows: the perimeter (P) of a rectangle is equal to the sum total of two times the length (L) and two times the width (W) of the rectangle. In contrast, the mathematical model is the following statement: P = 2L + 2W. If applied to the same rectangle, both models would yield identical results. However, a mathematical model is more precise. Symbolic models are used in operations research because they are easier to manipulate and they yield more accurate results under manipulation than do either the iconic or the analogue models.

Use of Mathematical Models. Various types of mathematical models are used in modern operations research. Two broad categories of these are deterministic and probabilistic models. A deterministic model is the one in which all parameters in the mathematical formulation are fixed at predetermined values so that no uncertainty exists. In a probabilistic model, on the other hand, some or all the basic characteristics may be random variables (capable of assuming different values with given probabilities). In such models, uncertainty and errors are required to be given explicit consideration. Probabilistic models are also termed as stochastic or chance models.

The mathematical models comprise three basic components: decision variables, result variables and uncontrollable variables. The decision variables represent those factors where a choice could be made. These variables can be manipulated and, therefore, are controllable by the decision-maker. The result variables indicate the level of effectiveness of a system. They represent output of the system and are also termed as dependent variables. Finally, the uncontrollable variables are those which influence the result variables but are beyond the control of the decision-maker. To illustrate, in the area of marketing, the decision variables may be the advertising budget, the number of regional salesmen employed, the number of products, etc.; the result variables may be the market share for the company, level of customer satisfaction, etc.; while the uncontrollable variables may be the competitors' strategies, consumer incomes, etc. As mentioned earlier, the different components of a mathematical model are tied together with relationships in the form of equations, inequalities, etc. Such a model consists of an objective function and the constraints under which a given system functions. The objective function describes how a dependent (result) variable is related to independent (decision) variables. For example, the profit function of a firm making two products can be stated as follows.

p = p1 x1 + p2 x2

in which p indicates the total profit of the firm, x1 and x2 are the numbers of units of the two products produced and sold (the independent variables), while p1 and p2 represent the profit per unit on the two products respectively (the uncontrollable variables).

The objective function is to be maximised (or minimised) subject to certain constraints (representing the uncontrollable variables). For example, in this case of production, the firm might be able to sell no more than a certain number of units, say 80. Then the marketing constraint (an uncontrollable variable) can be expressed as follows:

x1 + x2 ≤ 80

Similarly, other constraints, if any, of the system can be expressed.
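As a rough sketch, the two-product model can be written down as two small functions, one for the objective and one for the constraints. Total profit is taken here as the sum of each product's unit profit times its quantity. The unit profits of 5 and 3 are assumed figures (the text gives none), and the non-negativity condition on x1 and x2 is an added assumption that the text does not state explicitly.

```python
# Illustrative values only: the text does not give numbers for p1 and p2.
P1, P2 = 5, 3          # assumed profit per unit of products 1 and 2
MARKET_LIMIT = 80      # marketing constraint from the text: x1 + x2 <= 80


def profit(x1, x2):
    """Objective function: total profit p = p1*x1 + p2*x2."""
    return P1 * x1 + P2 * x2


def is_feasible(x1, x2):
    """A plan is feasible when quantities are non-negative and within the market limit."""
    return x1 >= 0 and x2 >= 0 and x1 + x2 <= MARKET_LIMIT
```

For instance, the plan x1 = 50, x2 = 30 satisfies the marketing constraint and earns 5(50) + 3(30) = 340 under these assumed unit profits, while x1 = 50, x2 = 31 violates the constraint.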

3. Solution of Model: Once an appropriate model has been formulated, the next stage in the analysis calls for its solution and the interpretation of the solution in the context of the given problem. A solution to a model implies determination of a specific set of decision variables that would yield a desired level of output. The desired level of output, in turn, is determined by the principle of choice adopted and represents the level which optimises. Optimisation might mean, for instance, maximising the ratio of goal attainment to cost.

It may be noted that the solutions can be classified as being feasible or infeasible, optimal or non-optimal and unique or multiple.

a. Feasible and Infeasible Solutions: A solution (a set of values of the decision variables, as already mentioned) which satisfies all the constraints of the problem is called a feasible solution, whereas an infeasible solution is the one which does not satisfy all the constraints. Since an infeasible solution fails to meet one or more of the system requirements, it is an unacceptable one. Only feasible solutions are of interest to the decision-maker.

b. Optimal and Non-optimal Solutions: An optimal solution is one of the feasible solutions to a problem that optimises and is, therefore, the best of all of them. For example, for a multiproduct firm working under some given constraints of capacity, marketing, finance, etc., the optimal solution would be that product-mix which meets all the constraints and yields the maximum contribution margin towards profits. The feasible solutions other than the optimal solution are called non-optimal solutions. To continue with the example, several other product-mixes would satisfy the restrictions imposed and hence qualify for acceptance but they would be ignored because lower contribution margins would be associated with them. They would be non-optimal.

c. Unique and Multiple Solutions. If only one optimal solution to a given problem exists, it is called a unique solution. On the other hand, if two or more optimal solutions to a problem exist which are equally efficient then multiple optimal solutions are said to exist. Of course, these are preferable from the management's point of view as they provide a greater flexibility in implementation.

Once the principle of choice has been specified, the model is solved for the optimal solution. For this, the feasible solutions are considered and of them the one (or more) that optimises is chosen. For this purpose, a complete enumeration may be made so that all the possible solutions are checked and evaluated. However, this approach is limited to those situations where the number of alternatives is small. Alternatively, and more commonly, methods involving algorithms may be used to get optimal solutions. It is significant to note that, in contrast to complete enumeration, where all solutions are checked, an algorithm represents a trial and error process where only a part of the feasible solutions are considered and the solutions are gradually improved until an optimal solution is obtained.
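The contrast between complete enumeration and a step-by-step improving procedure can be sketched on the two-product example. The unit profits (5 and 3), the restriction to whole units and the simple neighbour-to-neighbour improvement rule are all assumptions made for illustration; the rule is a toy stand-in for a proper algorithm, not a method named in the text.

```python
P1, P2, LIMIT = 5, 3, 80   # assumed unit profits; market limit from the text


def profit(x1, x2):
    return P1 * x1 + P2 * x2


def feasible(x1, x2):
    return x1 >= 0 and x2 >= 0 and x1 + x2 <= LIMIT


def enumerate_all():
    """Complete enumeration: evaluate every feasible whole-unit plan."""
    plans = [(a, b) for a in range(LIMIT + 1) for b in range(LIMIT + 1) if feasible(a, b)]
    best = max(profit(a, b) for a, b in plans)
    optimal = [p for p in plans if profit(*p) == best]
    return optimal, best, len(plans)


def improve_stepwise(start=(0, 0)):
    """Move to the best neighbouring plan that improves profit until none does."""
    current, checked = start, 1
    while True:
        x1, x2 = current
        neighbours = [(x1 + 1, x2), (x1, x2 + 1), (x1 + 1, x2 - 1), (x1 - 1, x2 + 1)]
        candidates = [n for n in neighbours if feasible(*n)]
        checked += len(candidates)
        better = [n for n in candidates if profit(*n) > profit(*current)]
        if not better:
            return current, profit(*current), checked
        current = max(better, key=lambda n: profit(*n))


optimal_plans, best_profit, n_enumerated = enumerate_all()
plan, plan_profit, n_checked = improve_stepwise()
```

In this small linear example both routes arrive at the same plan, but the stepwise procedure examines only a fraction of the feasible plans. The optimum here is unique; setting the two unit profits equal would make every plan with x1 + x2 = 80 optimal, which is the multiple-solutions case.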

While algorithms exist for most of the standardised problems, there are also some numerical techniques which yield solutions that are not necessarily optimal. Heuristics and simulation illustrate those methods. Heuristics are step-by-step logical rules which, in a certain number of steps, yield some acceptable solutions to a given problem. They are applied in those cases where no algorithms exist. Similarly, the technique of simulation is applied where a given system is sought to be replicated and experimented with. Solutions using simulation need not be optimal because the technique is only descriptive in nature.
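A small sketch may help to show how a heuristic can give an acceptable but non-optimal answer. The project names, returns, costs and budget below are hypothetical, and the rule "take the best return per unit of cost first" is just one common rule of thumb. The enumeration is included only to expose the gap on this tiny case; in practice a heuristic is used precisely when such a check is not available.

```python
from itertools import combinations

# Hypothetical projects: name -> (expected return, cost). The budget is assumed too.
projects = {"A": (60, 10), "B": (100, 20), "C": (120, 30)}
BUDGET = 50


def greedy_by_ratio():
    """Heuristic rule: take projects in order of return per unit cost while they fit."""
    chosen, spent = [], 0
    ranked = sorted(projects, key=lambda n: projects[n][0] / projects[n][1], reverse=True)
    for name in ranked:
        cost = projects[name][1]
        if spent + cost <= BUDGET:
            chosen.append(name)
            spent += cost
    return chosen, sum(projects[n][0] for n in chosen)


def best_by_enumeration():
    """Check every subset of projects and keep the best one within the budget."""
    best_set, best_return = (), 0
    for size in range(len(projects) + 1):
        for subset in combinations(projects, size):
            cost = sum(projects[n][1] for n in subset)
            total = sum(projects[n][0] for n in subset)
            if cost <= BUDGET and total > best_return:
                best_set, best_return = subset, total
    return list(best_set), best_return
```

With these figures the heuristic selects projects A and B for a return of 160, whereas checking every combination shows that B and C together return 220 within the same budget.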

Sensitivity Analysis. In addition to the solution of the model formulation by any technique, sensitivity analysis should also be performed. By sensitivity analysis we mean determining how the behaviour of the system responds to changes in the system inputs and specifications. This is done because the input data and the structural assumptions of the model may not be valid.

4. Model Validation. The validation of a model requires determining if the model can adequately and reliably predict the behaviour of the real system it seeks to represent. Also, it involves testing the structural assumptions of the model to ascertain their validity. Usually, the validity of a model is tested by comparing its performance with the past data available in respect of the actual system. There is, of course, no assurance that the future performance of the system will resemble its past. Therefore, one must take cognisance of the change in the system over time and adjust the model as required.

5. Implementation. No standard prescription can be given which would ensure that the solution obtained would automatically be adopted and implemented. This is because the techniques and models used in operations research may sound impressive and may be detailed in mathematical terms, but they generally do not consider the human aspects which are significant in implementation of a solution. The impact of a decision may cut across various segments of the organization, and factors like resistance to change, desire to be consulted and informed, motivation, etc. may come in the way of implementation. Equally important as the skill and expertise needed in developing a model is the requirement of tackling issues related to the factors which may have a bearing on the implementation of a solution in a given situation. Thus, a model which secures a moderate theoretical benefit and is implemented is better than a model which ranks very high on theoretical advantage but cannot be implemented. In fact, the importance of having managers in the organisation who would act on the results of the study of the team that analyses the problem can hardly be over-emphasised.

 

 


  Chapter 2 : FUNCTIONS AND THEIR APPLICATIONS

   

  Functions

In mathematics and its applications we study certain correspondences between two sets of objects. Examples of this type of correspondence are 'price of a commodity and its demand', 'advertising expenses and sales', 'income and expenditure', etc. This correspondence is captured by the concept of a function. It is a rule for relating particular objects in one set to particular objects in another set. For example, one set of objects could be the side lengths of squares and the other set the areas corresponding to each length.

Length (x) of a square in cm      1    2    3    4    5   ...
Area (A) of the square in cm^2    1    4    9   16   25   ...

A convenient way of expressing this correspondence is through symbols. If x represents the length in cm of a square and A represents the corresponding area, the rule determining the area is written as

$$A = x^2$$

We say that A is a function of x. The symbol x is a variable in the sense that it can assume varying numerical values. It is also conventional to label x as the 'independent' variable and A as the 'dependent' variable.

If to each value of a variable x there corresponds one definite value of another variable y, we say that y is a function of x and denote it by y = f(x). The set of values of x for which the value of the function y is determined is called the domain of the function, while the set of values of y is called the range of the function.

Some examples of functions are given by -

In (i), the domain of x is the closed interval [0, 1] and the range of y is [-15, -14]. In (ii), the domain of x is the interval (1, 2) and the range of y is (5, 14). In (iii), the domain of x is (-∞, 1) ∪ (1, ∞), i.e. the set of all real numbers except the value 1, and the range of y is (0, ∞). A function whose range is a set of real numbers is called a real valued function.

Graphs of functions and Coordinate Geometry

It is possible to represent real valued functions in graphical form using the rectangular Cartesian coordinate system. This system consists of two perpendicular lines OX and OY (Fig. 2.1).

The lines OX and OY are called the X-axis and Y-axis respectively. Their intersection is the point O, called the origin of the system. A point is located by giving its direction and distance from each axis.


The direction is indicated by + or -. Distances measured along the direction of the arrows shown in Fig. 2.1 are positive and those in the opposite direction are negative. Any point in the plane determined by the axes can be represented by an ordered pair (x, y). For example, (-2, 3) means 2 units from the origin on the negative side of the X-axis and 3 units in the positive direction of the Y-axis. The graph of a function y = f(x) is merely the diagram obtained by plotting the points (x, f(x)) for all values of x in the domain of f(x). Normally, a few points are plotted and a smooth curve is drawn through them. Figure 2.1.2 shows the graph of $y = 2x - 3$ and Fig. 2.1.3 shows the graph of $y = x^2$. In Fig. 2.1.2 we have a 'straight line' and in Fig. 2.1.3 we have a 'parabola'. The equation $(y - b)^2 + (x - a)^2 = 4$, when plotted, yields a circle with centre at the point (a, b) and with radius $\sqrt{4} = 2$.

CONSTRUCTION OF FUNCTIONS

In business applications we commonly talk of profit functions, loss functions, cost functions and revenue functions. The functions are usually set up following the definition and calculation of the functional values. We will take up a few examples to illustrate the procedure of constructing such functions.

Example

A banana seller buys bananas at R1 rupees per fruit and sells them at R2 rupees per fruit (R2 > R1). The unsold fruits at the end of the day are sold at R3 rupees per fruit (R3 < R1). What is the profit function for the fruit seller?

The profit depends on how many fruits he is able to sell in relation to the quantity of fruits he bought at the beginning of the day. Let D be the demand for fruits at R2 per fruit and Q the stock of fruits.
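The profit function itself is not reproduced in the text. One reading consistent with the description is that all Q fruits are bought at R1, up to D of them are sold at R2, and any remainder is cleared at R3. The sketch below follows that reading; the function name and the sample prices are illustrative assumptions.

```python
def banana_profit(Q, D, R1, R2, R3):
    """Profit for stock Q and demand D.

    R1: purchase price per fruit
    R2: regular selling price per fruit (R2 > R1)
    R3: end-of-day clearance price per fruit (R3 < R1)
    """
    sold_regular = min(Q, D)            # cannot sell more than stock or demand
    sold_clearance = Q - sold_regular   # leftover fruit cleared at R3
    revenue = R2 * sold_regular + R3 * sold_clearance
    return revenue - R1 * Q


# Hypothetical prices: buy at 2, sell at 3, clear at 1
profit_when_short = banana_profit(Q=100, D=120, R1=2, R2=3, R3=1)
profit_when_surplus = banana_profit(Q=100, D=60, R1=2, R2=3, R3=1)
print(profit_when_short, profit_when_surplus)
```

In closed form this reading gives a profit of (R2 - R1)Q when D ≥ Q, and R2·D + R3(Q - D) - R1·Q when D < Q.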

Example

A factory has 100 items on hand for shipment to a destination at the cost of Rs. 1 a piece to meet a certain demand d. In case the demand d overshoots the supply, it is necessary to meet the unsatisfied demand by purchases on the local market at Rs. 2 a piece. Construct the cost function if x is the number shipped from the factory.

Solution:
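The worked solution is not reproduced in the text. A minimal sketch under the natural reading of the problem: each of the x items shipped (0 ≤ x ≤ 100) costs Rs. 1, and any demand left unmet, d - x, is bought locally at Rs. 2 a piece. The function name and the sample demand are assumptions made for illustration.

```python
def total_cost(x, d, stock=100):
    """Cost of shipping x items and covering any shortfall locally.

    x: number shipped from the factory (0 <= x <= stock)
    d: demand at the destination
    """
    if not 0 <= x <= stock:
        raise ValueError("x must lie between 0 and the stock on hand")
    shortfall = max(d - x, 0)      # unmet demand bought on the local market
    return 1 * x + 2 * shortfall   # Rs. 1 per shipped item, Rs. 2 per local item


# Hypothetical demand of 120 units
for shipped in (0, 50, 100):
    print(shipped, total_cost(shipped, d=120))
```

Under this reading the cost is x + 2(d - x) when d > x and simply x otherwise, so shipping as much as possible is cheaper whenever demand exceeds supply.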

Example:

(Quantity Discounts and Price Breaks) A retailer offers the following price breaks on an item.

Rs. 10 per kg for any amount ordered up to 10 kg.
Rs. 7 per additional kg above 10 kg and up to 100 kg.
Rs. 5 per additional kg beyond 100 kg.

Construct the cost function for x kg ordered.

 Solution :

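Let C(x) be the cost function. Then from the definition of the discounts, we have

$$C(x) = \begin{cases} 10x, & 0 \le x \le 10 \\ 100 + 7(x - 10), & 10 < x \le 100 \\ 730 + 5(x - 100), & x > 100 \end{cases}$$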

The graph of this function is given in Fig. 2.2.1.

The curve obtained in Fig. 2.2.1 is not a straight line over the entire domain of x, but it is piecewise linear.
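The price breaks can be turned into a small function to check individual order sizes. This is a sketch that reads the schedule as marginal rates (Rs. 10 on the first 10 kg, Rs. 7 on the next 90 kg, Rs. 5 thereafter); the function name is an assumption.

```python
def order_cost(x):
    """Cost in rupees of ordering x kg under the stated price breaks."""
    if x < 0:
        raise ValueError("quantity cannot be negative")
    if x <= 10:
        return 10 * x                  # Rs. 10 per kg up to 10 kg
    if x <= 100:
        return 100 + 7 * (x - 10)      # first 10 kg cost 100, then Rs. 7 per kg
    return 730 + 5 * (x - 100)         # first 100 kg cost 730, then Rs. 5 per kg


for kg in (5, 10, 50, 100, 150):
    print(kg, order_cost(kg))
```

The values at 10 kg and 100 kg agree from either side of each break, which is why the graph is continuous and piecewise linear.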

Example:

The simplest model relating revenue (R) in '000s of Rs and advertising expenditure (A) in '000 of Rs can be given as

The difference between revenue and advertising expenditure may be written as

D is a function of A. Suppose, we have three alternative decisions of advertising expenditure.

Which is the best decision?

We calculate D for each of the three decisions and we have

Decision    D (in thousands of rupees)
d1          126
d2          150
d3          150

The best decision is clearly either d2 or d3, since both yield the same value of D. Suppose we increase the advertising expenditure to 256; then D = 14, which shows that an increase in advertising expenditure does not always lead to an increase in net revenue D.

LINEAR AND QUADRATIC FUNCTIONS

A linear function is defined as $y = f(x) = ax + b$, where a and b are given real numbers and x is a variable taking all numerical values in an interval. It is called linear in x because the graph of such a function is a straight line (Fig. 2.3.1).

Also, note that in the definition of $y = ax + b$, the power of x is 1. The straight line cuts the X-axis at a distance -b/a units from the origin (for a ≠ 0) and the Y-axis at a distance of b units from the origin. In Fig. 2.1.2, the straight line $y = 2x - 3$ has an intercept equal to -3 on the Y-axis and 3/2 on the X-axis.

A quadratic function is defined by

$$y = ax^2 + bx + c$$

where a, b and c are any real numbers and a ≠ 0. Here the maximum power of x is 2. Figure 2.1.3 gives the quadratic function $y = x^2$. In general, a function of the form

$$y = a_1 x^n + a_2 x^{n-1} + \dots + a_n x + a_{n+1}$$

where the $a_i$ are real numbers, $a_1 \neq 0$, and n is a positive integer, is called a polynomial of degree n. Thus, a polynomial of degree 1 is a linear function and that of degree 2 is a quadratic function.

Fig 2.3.1

Example

Assume that for certain values of the advertising expenditure, sales are a linear function of the expenditure. It is known that for an advertising expenditure of Rs. 50,000 the sales would be Rs. 4,00,000, and if no advertising expenditure is incurred the sales would be Rs. 2,00,000. Construct the sales function.

Let x denote the advertising expenditure and y the sales, both in thousands of rupees. Since y is assumed to be linear in x, let y = ax + b.

When x = 0, we are given that y = 200, and y = ax + b gives y = b for x = 0. Hence b = 200. When x = 50, we are given that y = 400, so 400 = 50a + 200 and hence a = 4.

The sales function is therefore given by

$$y = 4x + 200$$

Example

The demand for a certain item is given by

where q denotes the amount demanded and p the price per unit. It costs Rs. 4 to produce each unit. What is the profit function of the firm for this item?

which is a quadratic function in p.

Example:

(Depreciation calculation)

Consider an asset costing C. At the end of n years it has a scrap value of S. The difference C - S is the depreciation of the asset in n years. In financial planning, a firm has to provide for depreciation and has to set apart a certain amount every year to account for it. There are several methods of determining this amount. In the straight-line method of calculating the annual contribution to the depreciation fund, we assume that the fund earns no interest. Further, equal contributions are to be made at the end of each year throughout the life of the asset. The formula for the annual contribution R is then

$$R = \frac{C - S}{n}$$

The depreciation fund F at the end of the kth year is F = kR. This is a linear function of k and hence is a straight line, though we calculate it only for integral k. The book value of an asset on a given date is the difference between the original cost and the amount in the depreciation fund at that time. Denoting the book value at the end of the kth year by BV(k), we have

$$BV(k) = C - kR$$

This is also a straight line as a function of k. Suppose, for example, C = 25,000, S = 5,000 and n = 10. Then

$$R = \frac{25{,}000 - 5{,}000}{10} = 2{,}000$$

The depreciation schedule and the book value are given in Table 2.3.1.

Figure 2.3.2. shows the graphic representation of the depreciation by the straight-line method
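A depreciation schedule of this kind can be generated directly from the straight-line assumptions (equal year-end contributions, no interest on the fund). The sketch below uses the figures of the example, C = 25,000, S = 5,000 and n = 10; the function and variable names are illustrative.

```python
def straight_line_schedule(cost, scrap, years):
    """Return the annual contribution and a list of (year, fund, book value)."""
    annual = (cost - scrap) / years       # equal contribution each year
    rows = []
    for k in range(years + 1):
        fund = k * annual                 # depreciation fund F = kR
        rows.append((k, fund, cost - fund))  # book value = cost minus fund
    return annual, rows


annual, rows = straight_line_schedule(25_000, 5_000, 10)
print(f"Annual contribution: {annual:.0f}")
print(f"{'Year':>4} {'Fund':>10} {'Book value':>12}")
for k, fund, book in rows:
    print(f"{k:>4} {fund:>10.0f} {book:>12.0f}")
```

The book value falls by the same amount every year and reaches the scrap value at the end of the asset's life.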

Other methods of depreciation calculation are given in later chapters of the book.

SOME SPECIAL FUNCTIONS

Absolute Value Functions

First, we define the absolute value of a real number x, denoted by |x|:

$$|x| = \begin{cases} x, & x \ge 0 \\ -x, & x < 0 \end{cases}$$

Thus |x| is the magnitude of the number x without regard to its sign. For example, |8| = 8 and |-3| = 3. The function f(x) = |x| is called the absolute value function. The graph of the absolute value function is shown in Fig. 2.4.1.

Step Functions

A function f(x) that takes a constant value for values of x within an interval, but possibly takes different values in different intervals, is called a step function.

Fig. 2.4.1

Example 2.4.1.

The unit price of a commodity is shown below which depends on the quantity ordered.

The graph of the function giving the unit price in terms of quantity ordered is shown in Fig.2.4.2.


Convex Sets and Convex Functions

Convex Set. Consider a set S of points in the two-dimensional plane (Figs. 2.4.3 and 2.4.4).

Take any two points (x1, y1) and (x2, y2) falling within the set. The line segment joining (x1, y1) and (x2, y2) is the set of all points lying on the straight line between these two points. If, for every such pair of points, the line segment is also wholly contained in the set S, we call the set S a convex set. In Fig. 2.4.4, S is not a convex set since the line segment joining the points within the set as shown in the figure is not wholly contained in the set. Loosely speaking, a convex set cannot have holes in it.

Given two points (x1, y1) and (x2, y2), how does one represent the co-ordinates of a point lying on the line segment joining them? A typical point is given by the co-ordinates

$$(\Delta x_1 + (1 - \Delta) x_2,\ \Delta y_1 + (1 - \Delta) y_2)$$

where Δ is a number with 0 ≤ Δ ≤ 1. Note that when Δ = 0, this representation gives the point (x2, y2) and when Δ = 1, we get the point (x1, y1). In particular, when Δ = 1/2, the corresponding point on the line segment is

$$\left(\frac{x_1 + x_2}{2},\ \frac{y_1 + y_2}{2}\right)$$

which is the mid point of the line segment joining (x1, y1) and (x2, y2). Examples of convex sets are a circle and a triangle, each taken together with its interior. There are other examples which are important in operations research and economics and these will be taken up later in the book.
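The representation of a point on a line segment can be checked numerically. In this sketch the weight is called t instead of Δ, and the two end points are arbitrary values chosen for illustration.

```python
def point_on_segment(p1, p2, t):
    """Point t*p1 + (1 - t)*p2 on the segment joining p1 and p2, 0 <= t <= 1."""
    if not 0 <= t <= 1:
        raise ValueError("the weight must lie between 0 and 1")
    x1, y1 = p1
    x2, y2 = p2
    return (t * x1 + (1 - t) * x2, t * y1 + (1 - t) * y2)


p1, p2 = (0, 0), (4, 2)
print(point_on_segment(p1, p2, 0))     # weight 0 gives the second point
print(point_on_segment(p1, p2, 1))     # weight 1 gives the first point
print(point_on_segment(p1, p2, 0.5))   # weight 1/2 gives the mid point
```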

Convex Function. In some areas, e.g. operations research and optimization problems, a class of functions called convex functions assumes a great importance. We give the definition and a few basic properties.

Definition: A function f(x) defined over a convex set S is said to be a convex function if for any two points x1, x2 lying in S and for any 0 ≤ Δ ≤ 1

$$f(\Delta x_1 + (1 - \Delta) x_2) \le \Delta f(x_1) + (1 - \Delta) f(x_2)$$

Geometrically, this means that the curve f(x) lies on or below the chord joining the points (x1, f(x1)) and (x2, f(x2)) (Fig. 2.4.5).
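The chord condition can be spot-checked numerically for a given pair of points. The sketch below tests the inequality on a grid of weights; it is a numerical check on one pair of points, not a proof of convexity, and the function name, tolerance and test points are assumptions.

```python
def satisfies_chord_condition(f, x1, x2, steps=100, tol=1e-9):
    """Check f(t*x1 + (1-t)*x2) <= t*f(x1) + (1-t)*f(x2) for t on a grid in [0, 1]."""
    for i in range(steps + 1):
        t = i / steps
        curve = f(t * x1 + (1 - t) * x2)        # value of the function
        chord = t * f(x1) + (1 - t) * f(x2)     # value on the chord
        if curve > chord + tol:
            return False
    return True


square_ok = satisfies_chord_condition(lambda x: x * x, -3, 5)
absolute_ok = satisfies_chord_condition(abs, -3, 5)
negative_square_ok = satisfies_chord_condition(lambda x: -x * x, -3, 5)
print(square_ok, absolute_ok, negative_square_ok)
```

The functions x² and |x| pass the check on this interval, while -x² fails it because its curve lies above the chord.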

  

 


  Chapter 3 : CALCULUS AND MANAGERIAL APPLICATION

   

  Many of the decisions facing managers fall into the category of optimisation problems. For example decisions relating to maximizing profit or minimizing cost clearly involve optimization. Often such problems can be solved graphically or by using algebra. In other cases, however, the solution requires the use of calculus. More importantly, in many cases calculus can be used to solve such problems more easily and with greater insight into the economic principles underlying the solution.

Consider a firm whose total revenues from sales are given by the function

Where Q represents the rate of output. Assume that the total cost of producing any rate of output is given by the equation

Given the firm's objective of maximizing profits (i.e. total revenue minus total cost), how much output should be produced? One approach would be to graph both the total revenue and total cost functions. The vertical distance between the two functions is profit. By identifying the point where this vertical distance is largest, the profit-maximizing output is found. In this example, profit is maximized by producing eight units of output (Q = 8).

An alternative approach is to develop a table showing revenue, cost and profit at each rate of output. Such data are provided in Table 2A-1. The tabular method has the advantage of being somewhat more precise. That is, at an output rate of 8, total revenue is 96, total cost is 82, and profit is 14. From the graph in Figure 2A-1 these values can only be approximated.
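The tabular method is easy to reproduce in code. The revenue and cost equations are not reproduced in the text, so the sketch below assumes, purely for illustration, TR = 20Q - Q² and TC = 50 + 4Q. This hypothetical pair was chosen because it matches the figures quoted above at Q = 8 (revenue 96, cost 82, profit 14).

```python
def total_revenue(q):
    """Assumed total revenue function (hypothetical)."""
    return 20 * q - q * q


def total_cost(q):
    """Assumed total cost function (hypothetical)."""
    return 50 + 4 * q


def profit(q):
    return total_revenue(q) - total_cost(q)


outputs = range(0, 13)
print(f"{'Q':>3} {'TR':>5} {'TC':>5} {'Profit':>7}")
for q in outputs:
    print(f"{q:>3} {total_revenue(q):>5} {total_cost(q):>5} {profit(q):>7}")

best_q = max(outputs, key=profit)   # rate of output with the largest profit
print("Profit-maximizing output:", best_q)
```

Scanning a table works for a dozen output rates, but the number of rows grows with the range of outputs to be examined.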

Fig. 2A-1

The problem is that neither method is very efficient. What if the profit-maximizing output level is 8,000 or 80,00,000 instead of 8? The answer could be found using either approach, but finding that answer could take considerable time. What if there had been two outputs (Q1 and Q2) and three inputs (land, labour and capital)? In this case, there would be no practical way to determine profit-maximizing output rates for Q1 and Q2 using these methods.

A more powerful technique is needed so that the solution process can be both precise and straightforward. Elementary calculus is easily adapted to optimization problems in economics. Indeed, a few basic principles can be used in many different kinds of problems. The profit-maximization problem just outlined could have been solved quickly using the most elementary calculus. In the following pages, some basic principles of calculus are outlined and their application to economic problems is demonstrated.

The Derivative of a Function

From algebra recall that for the function y = f(x), the slope of that function is the change in y (denoted by Δy) divided by a change in x (i.e. Δx). The slope sometimes is referred to as the rise (the change in the variable measured on the vertical axis) over the run (the change in the variable measured on the horizontal axis). The slope is positive if the curve slopes upward from left to right and negative if the function slopes downward from left to right. A horizontal line has a zero slope and a vertical line is said to have an infinite slope. For a positive change in x (i.e. Δx > 0), a positive slope implies that Δy is positive and a negative slope implies that Δy is negative.

The function $y = 10 + x^2$ is graphed in Figure 2A-2. To determine the average slope of this function over the range x = 1 to x = 2, first find the corresponding y values. If x1 = 1 then y1 = 11, and if x2 = 2 then y2 = 14. Then the slope is found by using the formula

$$\frac{\Delta y}{\Delta x} = \frac{y_2 - y_1}{x_2 - x_1} = \frac{14 - 11}{2 - 1} = 3$$

In reality, this method determines the slope of a straight line through the points a and b in Figure 2A-2.

Thus, it is only a rough approximation of the slope of the function $y = 10 + x^2$, which actually changes at every point on that function. By making the interval smaller, a better estimate of the slope is determined. For example, consider the slope over the interval x = 1 to x = 1.1. If x1 = 1, then y1 = 11, and x2 = 1.1 implies that y2 = 11.21. Thus the slope is

$$\frac{\Delta y}{\Delta x} = \frac{11.21 - 11}{1.1 - 1} = 2.1$$

By using calculus, the exact slope at any point on the function can be determined. The first derivative of a function (denoted as dy/dx) is simply the slope of a function when the interval along the horizontal axis is infinitesimally small. Technically, the derivative is the limit of the ratio Δy / Δx as Δx approaches zero, that is,

$$\frac{dy}{dx} = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x}$$

Thus the calculus term dy/dx is analogous to Δy / Δx, but dy/dx is the precise slope at a point, whereas Δy / Δx is the average slope over an interval of the function. The derivative can be thought of as the slope of a straight line drawn tangent to the function at that point. For example, the slope of the function $y = 10 + x^2$ at point a in Figure 2A-2 is the slope of the straight line drawn tangent to that function at a. The derivative of y = f(x) is sometimes written as f'(x).
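The way the average slope settles down as the interval shrinks can be seen numerically. The sketch below uses the function y = 10 + x² from the text, starting at x = 1; the helper name and the list of interval widths are illustrative.

```python
def f(x):
    return 10 + x ** 2


def average_slope(func, x1, x2):
    """Change in y divided by change in x between x1 and x2."""
    return (func(x2) - func(x1)) / (x2 - x1)


# Shrink the interval that starts at x = 1
for width in (1, 0.1, 0.01, 0.001):
    print(f"width {width:>6}: average slope {average_slope(f, 1, 1 + width):.4f}")
```

The printed slopes move from 3 toward 2, the value of the derivative 2x at x = 1.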

What is the significance of this concept for management economics? Recall from the discussion of total and marginal relationships that the marginal function is simply the slope of the total function. Calculus offers an easy way to find the marginal function by taking the first derivative of the total function. Calculus also offers a set of rules for using these derivatives to make optimizing decisions such as minimizing cost or maximizing profit.

Standard calculus texts present numerous formulas for the derivative of various functions. In the hands of the skilled mathematician, these formulas allow the derivative of virtually any function to be found. However, only a few of these rules are necessary to solve most of the relevant problems in managerial economics. In this section each of these basic rules is explained and its use demonstrated.

The Derivative of a Constant

The derivative of any constant is zero. When plotted, the equation of a constant (such as y = 5) is a horizontal line. For any Δx, the change in y is always zero. Thus for any equation y = a, where a is a constant,

$$\frac{dy}{dx} = 0$$

The Derivative of a Constant Times a Function

The derivative of a constant times a function is that constant times the derivative of the function. Thus the derivative of y = af(x), where a is a constant, is

$$\frac{dy}{dx} = a f'(x)$$

For example, if y = 3x, the derivative is

$$\frac{dy}{dx} = 3$$

The Derivative of a Power Function

For the general power function of the form $y = ax^b$, the derivative is

$$\frac{dy}{dx} = b \cdot a x^{b-1}$$

The function $y = x^2$ is a specific case of a power function where a = 1 and b = 2. Hence the derivative of this function is

$$\frac{dy}{dx} = 2x$$

The interpretation of this derivative is that the slope of the function $y = x^2$ at any point x is 2x. For example, the slope at x = 4 would be found by substituting x = 4 into the derivative. That is

$$\frac{dy}{dx} = 2(4) = 8$$

Thus when x = 4, the change in y is 8 times a small change in x. The function $y = x^2$ is shown in Figure 2A-3. Note that the slope changes continually. The slope at x = 4 is 8. As x increases, the slope becomes steeper. For negative values of x, the slope is negative. For example, if x = -3, the slope is -6.

The Derivative of a Sum or Difference

The derivative of a function that is a sum or difference of several terms is the sum or difference of the derivatives of each of the terms. That is, if y = f(x) + g(x), then

$$\frac{dy}{dx} = f'(x) + g'(x)$$

For example, the derivative of the function

$$y = 10 + 5x + 6x^2$$

is equal to the sum of the derivatives of each of the three terms on the right-hand side. Note that the rules for the derivative of a constant, a constant times a function, and a power function must be used. Thus

$$\frac{dy}{dx} = 0 + 5 + 12x = 5 + 12x$$
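For polynomials, the constant, constant-multiple, power and sum rules combine into a single mechanical step: multiply each coefficient by its power and lower the power by one. A minimal sketch follows, storing a polynomial as a list of coefficients in increasing powers of x; this representation and the function names are choices made for illustration.

```python
def differentiate(coeffs):
    """Derivative of a polynomial; coeffs[i] is the coefficient of x**i."""
    derived = [power * c for power, c in enumerate(coeffs)][1:]
    return derived or [0]   # the derivative of a constant is zero


def evaluate(coeffs, x):
    """Value of the polynomial at x."""
    return sum(c * x ** power for power, c in enumerate(coeffs))


y = [10, 5, 6]            # y = 10 + 5x + 6x^2
dy = differentiate(y)     # coefficients of 5 + 12x
print(dy)

square = [0, 0, 1]        # y = x^2
slope = differentiate(square)
print(evaluate(slope, 4), evaluate(slope, -3))   # slopes at x = 4 and x = -3
```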

The Derivative of the Product of Two Functions

Given a function of the form

$$y = f(x) \cdot g(x)$$

the derivative is

$$\frac{dy}{dx} = f'(x)\, g(x) + f(x)\, g'(x)$$

That is, the derivative of the product of two functions is the derivative of the first function times the second function plus the first function times the derivative of the second.

For example, the derivative of the function

The Derivative of a Quotient of Two Functions

For a function of the form y = f(x) / g(x), the derivative is

$$\frac{dy}{dx} = \frac{f'(x)\, g(x) - f(x)\, g'(x)}{[g(x)]^2}$$

Given the function

The derivative would be

The Derivative of a Function of a Function

The function

Is really two functions combined. That is, by writing

It is seen that y is a function of a function. That is

This derivative is the derivative of y with respect to u multiplied by the derivative of u with respect to x, or

$$\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx}$$

Now using the rule for the derivative of a power function yields

Consider another example

Which can be rewritten as

In this case

So the solution is

Substituting $x^5 + 2x + 6$ for u and multiplying the two derivatives just given yields

These seven rules of differentiation are sufficient to determine the derivatives of all the functions used in this book. However, sometimes two or more of the rules must be used at the same time.

Example

Finding the Marginal Function

Given a total revenue function

And a total cost function

Find the marginal revenue and marginal cost functions.

Solution:

Recall that a marginal function is simply the slope of the corresponding total function. For example, marginal revenue is the slope of total revenue. Thus, by finding the derivative of the total revenue function, the marginal revenue function will be obtained.

Similarly, the marginal cost function will be found by taking the first derivative of the total cost function:

KEY CONCEPTS

The slope of a function y = f(x) is the change in y (i.e., Δy) divided by the corresponding change in x (i.e., Δx). For a function y = f(x), the derivative, written as dy/dx or f'(x), is the slope of the function at a particular point on the function.

Equivalently, the derivative is the slope of a straight line drawn tangent to the function at that point. By using one or more of the seven formulas outlined in this appendix, the derivative of most functions encountered in managerial economics can be found.

Higher -Order Derivatives

The derivative of a function sometimes is called the first derivative to indicate that there are higher-order derivatives. The second derivative of a function is simply the first derivative of the first derivative; it is written $d^2y/dx^2$ or f''(x). In the context of economics, the first derivative of a total function is the marginal function. The second derivative of the total function is the slope of the first derivative, or the rate at which the marginal function is changing.

Higher-order derivatives are easy to find. One simply keeps taking the first derivative again. Given the function

The second derivative has an important application in finding the maximum and / or minimum of a function. This concept is explained in the following section.  

 


  Chapter 4 : CALCULUS AND OPTIMIZATION

 

Recall from the discussion of total and marginal relationships that if the marginal function is positive, the total function must be increasing; if the marginal function is negative, the total function must be decreasing. It was shown that if the marginal function is zero, then the total function must be at either a maximum or a minimum. In Figure 2A-4, a total function and its associated marginal function are shown. At point a, which corresponds to x = x1, the total function is at a maximum and the marginal function is zero. At a second point, corresponding to x = x2, the total function is minimized and the marginal function is again zero. Thus the marginal curve is zero at both x1 and x2. However, note the difference in the slope of the marginal function at these points. At x1, the marginal curve has a negative slope whereas at x2, the slope is positive.

Fig 2A-4

Because the total function is at a maximum or a minimum (i.e., an extremum) when its slope is zero, one way to find the value of x that results in a maximum or a minimum is to set the first derivative of the total function equal to zero and solve for x. This is a better approach than the trial-and-error method used earlier. In that example, a total revenue function,

And a total cost function

were given. The problem was to find the rate of output, Q, that maximized profit. The total profit function (π) is found by subtracting total cost from total revenue.

The profit function will have a slope of zero where that function is at a maximum and also at its minimum point. To find the profit-maximizing output, take the first derivative of the profit function, set it equal to zero, and solve for output. That is, find the rate (or rates) of output where the slope of the profit function is zero.

But will this output rate result in a profit maximum or a profit minimum? Remember, setting the first derivative equal to zero and solving results in an extremum, but it could be a maximum or a minimum. Figure 2A-4 shows that if the total function is maximised, the marginal function has a negative slope. Conversely, a minimum point on a total function is associated with a positive slope of the marginal function.

Because the slope of the marginal function is the first derivative of the marginal function (that is, the second derivative of the total function), a simple test to determine whether a point is a maximum or a minimum is suggested. Find the second derivative of the total function and evaluate it at the point where the slope of the total function is zero. If the second derivative is negative (i.e. the marginal function is decreasing), the total function is at a maximum; if the second derivative is positive at that point (i.e. the marginal function is increasing), the total function is at a minimum point.

In the profit-maximization problem just discussed, the second derivative of the profit function is -2, which is negative. Therefore, profit is maximized at Q = 8. When finding the extremum of any function, setting the first derivative equal to zero is called the first-order condition, meaning that this condition is necessary for an extremum but not sufficient to determine if the function is at a maximum or a minimum. The test for a maximum or a minimum using the second derivative is called the second-order condition. The first- and second-order conditions together are said to be sufficient to test for either a maximum or a minimum point. These conditions are summarized as follows:

                          Maximum             Minimum
First-order condition     dy/dx = 0           dy/dx = 0
Second-order condition    d^2y/dx^2 < 0       d^2y/dx^2 > 0
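The first- and second-order conditions can be applied mechanically to a quadratic profit function. The revenue and cost equations are not reproduced in the text, so the sketch below assumes, for illustration only, the profit function π = -50 + 16Q - Q². It is a hypothetical choice that agrees with the figures quoted in the text: a maximum at Q = 8, a profit of 14 and a second derivative of -2.

```python
def differentiate(coeffs):
    """Derivative of a polynomial; coeffs[i] is the coefficient of Q**i."""
    derived = [power * c for power, c in enumerate(coeffs)][1:]
    return derived or [0]


profit = [-50, 16, -1]                 # assumed profit: -50 + 16Q - Q^2
marginal = differentiate(profit)       # first derivative: 16 - 2Q
second = differentiate(marginal)       # second derivative: -2

# First-order condition: the marginal function is linear, so solve 16 - 2Q = 0
q_star = -marginal[0] / marginal[1]

# Second-order condition: the sign of the second derivative
kind = "maximum" if second[0] < 0 else "minimum"

profit_at_q_star = sum(c * q_star ** power for power, c in enumerate(profit))
print(q_star, kind, profit_at_q_star)
```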


 

Fig 2A-5 : Graph of a function having Several Extrema

In some problems there will be two or possibly more points where the first derivative is zero. Therefore, all these points will have to be evaluated using the second-order condition to test for a minimum or a maximum. As shown in Figure 2A-5, a function could have several points, such as a, b, c and d, where the slope is zero. Points a and c are relative maxima and b and d are relative minima. The point a is a maximum relative to other points on the function around it. It is not the maximum point for the entire function because the value of y at point c exceeds that at point a. To find the absolute maximum, the value of the function must be determined for all relative maxima within the range over which the function is defined and also at each of the end points.

The Partial Derivative

Many economic phenomena are described by multivariate functions (i.e. equations that have two or more independent variables). Given a general multivariate function such as

The first partial derivative of y with respect to x, denoted as ∂y/∂x or f_x, indicates the slope relationship between y and x when z is held constant. This partial derivative is found by considering z to be fixed and taking the derivative of y with respect to x in the usual way. Similarly, the partial derivative of y with respect to z (i.e. ∂y/∂z or f_z) is found by considering x to be a constant and taking the first derivative of y with respect to z.

For example, consider the function

To find the partial derivative ∂y/∂x or f_x, consider z as fixed and take the derivative. Thus

$$\frac{\partial y}{\partial x} = 2x + 3z$$

This partial derivative means that a small change in x is associated with y changing at the rate 2x + 3z when z is held constant at a specified level. For example, if z = 2, the slope associated with y and x is 2x + 3(2), or 2x + 6. If z = 5, the slope associated with y and x would be 2x + 3(5), or 2x + 15.

Similarly, the partial derivative of y with respect to z would be

$$\frac{\partial y}{\partial z} = 3x + 2z$$

This means that a small change in z is associated with y changing at the rate 3x + 2z when x is held constant.
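Partial derivatives can be checked numerically by changing one variable slightly while holding the other fixed. The example function is not reproduced in the text; the sketch below assumes y = x² + 3xz + z², a hypothetical function whose partial derivatives are the 2x + 3z and 3x + 2z quoted above. The helper names and the evaluation point are also assumptions.

```python
def y(x, z):
    """Assumed example function (hypothetical): x^2 + 3xz + z^2."""
    return x ** 2 + 3 * x * z + z ** 2


def partial_x(func, x, z, h=1e-6):
    """Central-difference estimate of the partial derivative in x, z held fixed."""
    return (func(x + h, z) - func(x - h, z)) / (2 * h)


def partial_z(func, x, z, h=1e-6):
    """Central-difference estimate of the partial derivative in z, x held fixed."""
    return (func(x, z + h) - func(x, z - h)) / (2 * h)


x0, z0 = 1.0, 2.0
print(partial_x(y, x0, z0), 2 * x0 + 3 * z0)   # estimate vs 2x + 3z
print(partial_z(y, x0, z0), 3 * x0 + 2 * z0)   # estimate vs 3x + 2z
```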

Optimization and Multivariate functions

The approach to finding the maximum or minimum value of a multivariate function involves three steps. First find the partial derivative of the function with respect to each independent variable. Second, set all the partial derivatives equal to zero. Finally, solve the system of equations determined in the second step for the values of each of the independent variables. That is, if

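Then the partial derivatives are

$$\frac{\partial y}{\partial x} = -2x + z, \qquad \frac{\partial y}{\partial z} = -2 + x + 4z$$

Setting these derivatives equal to zero

$$-2x + z = 0$$

$$-2 + x + 4z = 0$$

and solving these two equations simultaneously for x and z yields

$$x = 2/9, \qquad z = 4/9$$

These values of x and z satisfy the first-order conditions, but that alone does not establish a maximum or a minimum. Here the partial derivative with respect to x falls as x increases (its slope is -2) while the partial derivative with respect to z rises as z increases (its slope is 4), so the function is at a maximum along x and at a minimum along z: the point is a saddle point rather than a minimum. The general approach to testing whether an optimizing solution results in a maximum or minimum for a multivariate function is complex and beyond the scope of this book. In this text, the context of the problem will indicate whether a maximum or minimum has been determined.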
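The two first-order conditions form a pair of linear equations, so the solution can be verified with exact fractions. The sketch below rewrites them as -2x + z = 0 and x + 4z = 2 and applies Cramer's rule; the helper name is an assumption.

```python
from fractions import Fraction


def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*z = c1 and a2*x + b2*z = c2 by Cramer's rule."""
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("the equations do not have a unique solution")
    x = Fraction(c1 * b2 - c2 * b1, det)
    z = Fraction(a1 * c2 - a2 * c1, det)
    return x, z


# -2x + z = 0  and  x + 4z = 2
x, z = solve_2x2(-2, 1, 0, 1, 4, 2)
print(x, z)

# Substitute back into the original first-order conditions
print(-2 * x + z, -2 + x + 4 * z)
```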

KEY CONCEPTS

Higher-order derivatives are found by repeatedly taking the first derivative of each resultant derivative.

The maximum or minimum point of a function y = f(x) can be found by setting the first derivative of the function equal to zero and solving for the value or values of x.

When the first derivative of a function is zero, the function is at a maximum if the second derivative is negative or at a minimum if the second derivative is positive.

For a function having two or more independent variables (e.g. y = f(x, z)), the partial derivative $\partial y/\partial x$ is the slope relationship between y and x, assuming z to be held constant.

Optimizing a multivariate function requires setting each partial derivative equal to zero and then solving the resulting system of equations simultaneously for the values of each independent variable.

PROBLEMS

2A - 1. Determine the first and second derivatives of each of the following functions

2A - 2. Determine all the first order partial derivatives for each of the following functions.

2A - 3. Given the multivariate function

Determine the values of X and Z that maximize the function.

2A - 4. The total revenue (TR) function for a firm is given by


where Q is the rate of output per period. Determine the rate of output that results in maximum total revenue. (Be sure that you have maximised, not minimised, total revenue.)

2A - 5. Smith and Wesson have written a new managerial economics book for which they receive royalty payments of 15 percent of total revenue from sales of the book. Because their royalty income is tied to revenue, not profit, they want the publisher to set the price so that total revenue is maximised. However, the publisher's objective is maximum profit. If the total revenue function is

and the total cost function is determine

a. The output rate that will maximize total royalty revenue, and the amount of royalty income that Smith and Wesson would receive.

b. The output rate that would maximize profit to the publisher. Based on this rate of output, what is the amount of royalty income that Smith and Wesson would receive? Compare the royalty income of Smith and Wesson to that determined in part (a). (HINT: First determine a function for total profit by subtracting the cost function from the total revenue function.)

2A - 6. A firm has determined that its annual profits depend on the number of salespersons it employs and the amount spent on advertising. Specifically, the relationship between profits π (in millions), salespersons S (in thousands) and advertising expenditures A (in millions) is

Determine the number of salespersons and the amount of advertising expenditures that would maximize the firm's profits.  

 


  Quantitative Techniques in Management

  Chapter 5 : MATRICES AND DETERMINANTS

   

  MATRICES: DEFINITION AND NOTATIONS

A matrix is an array of m x n numbers arranged into m rows and n columns.

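Let these numbers be denoted by $a_{ij}$ ($i = 1, 2, \ldots, m$; $j = 1, 2, \ldots, n$). The matrix with m rows and n columns can be written as

$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}$$

A matrix having m rows and n columns is called a matrix of order m x n (read: m by n). The individual entries of the array, $a_{ij}$, are termed the elements of the matrix A.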

A matrix is indicated by enclosing an array of numbers in brackets [ ], parentheses ( ) or double bars || ||.

In order to locate an element of a matrix, one has to specify the row and the column to which it belongs. For example, the element $a_{34}$ lies in the third row and fourth column.

Representation of Data in Matrix form

Matrices can be used to present a given set of data in a compact form as shown below:

1. The processing time (in hours) of the two products that pass through three processes:

2. The following matrix gives the transportation cost (in Rs. per unit) from each of the three warehouses to each of the five distribution points:

Distribution Point

3. The following matrix gives the productive capacity, i.e. the maximum units that can be produced per week, of a manufacturing company producing two goods A and B, each of which requires stamping, assembly and painting operations.

Productive Capacity (Units / Week)

4. The following matrix gives the distance (in kms) by train between four metropolitan cities of India.


5. The following matrix gives the input requirements (in rupees) of each industry from other industries (including itself) to produce a rupee's worth of output.

Input Receiving Industry

SPECIAL TYPES OF MATRICES

(i) Rectangular Matrix

A matrix consisting of m rows and n columns, where m ≠ n, is called a rectangular matrix. For example,

(ii) Square Matrix

When the number of rows of a matrix is equal to its number of columns, it is said to be a square matrix.

(iii) Row Matrix

A matrix having only one row is called a row matrix or row vector. For example, [4 1 2 7] is a 1 x 4 matrix or row matrix having 4 elements.

(iv) Column Matrix

A matrix having only one column is called a column matrix or column vector.

For example,

is a 3 x 1 matrix or column matrix having 3 elements.

(v) Diagonal Matrix

A square matrix $A = (a_{ij})_{n \times n}$ is said to be a diagonal matrix if $a_{ij} = 0$ for $i \neq j$.

The elements $a_{ij}$ of matrix A for $i = j$ are called the diagonal elements, and the line along which they lie is called the principal diagonal.

(vi) Scalar Matrix

A diagonal matrix in which all its diagonal elements are equal, is called a scalar matrix.

The matrix T =

is a 3 x 3 scalar matrix


(vii) Identity (or Unit) Matrix

A scalar matrix in which all the diagonal elements are unity is called an identity matrix.

The matrix

$$I = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

is a 3 x 3 identity matrix.

(viii) Null (or Zero) Matrix

A matrix (square or rectangular) having all its elements equal to zero is called a null matrix.

The matrix

$$0 = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$$

is a 2 x 3 null matrix.

(ix) Triangular Matrix

A triangular matrix can be : (a) Upper triangular or (b) Lower triangular.

(a) A square matrix $A = (a_{ij})$ is said to be upper triangular if $a_{ij} = 0$ for $i > j$.

(b) A square matrix $A = (a_{ij})$ is said to be lower triangular if $a_{ij} = 0$ for $i < j$.

Remarks: 1. A diagonal matrix is both upper and lower triangular.

2. If the diagonal elements of a triangular matrix are all zero, it is said to be strictly triangular.

MATRIX OPERATIONS

Equality of Matrices

Two matrices, say A and B, are said to be equal if (a) they are of the same order, and (b) the elements in corresponding positions of the two are the same. Thus two matrices $A = (a_{ij})$ and $B = (b_{ij})$ of order m x n are equal if $a_{ij} = b_{ij}$ for all values of i = 1 to m and j = 1 to n.

Remarks: 1.

cannot be considered for equality because their dimensions are different.

2. If $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ and $X = \begin{bmatrix} 3 & 4 \\ 2 & 5 \end{bmatrix}$, and it is given that A = X, then it implies that a = 3, b = 4, c = 2 and d = 5.

3. Two matrices A and B are said to be comparable if they are of same order.

Scalar Multiplication

The multiplication of a matrix A by a scalar k implies the multiplication of every element of A by k.

EXAMPLE 1


SOLUTION

Remarks: If k = -1, then (-1)A = -A, which is known as the negative of matrix A.

EXAMPLE 2

Cars and jeeps are produced in two manufacturing units, M1 and M2, of a company. It is known that unit M1 manufactures 10 cars and 5 jeeps per day and unit M2 manufactures 8 cars and 9 jeeps per day. Write the above information in matrix form. Multiply this by 2 and explain its meaning.

SOLUTION

Let A denote the required matrix. Let first row of A denote the output of M1 and the second row denote the output of M2. Further, let first column represent the number of cars and the second column the number of jeeps.
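The layout described above can be sketched in a few lines. This is an illustrative sketch, not the book's worked solution; the helper name `scalar_multiply` is hypothetical.

```python
def scalar_multiply(k, matrix):
    # Multiply every element of the matrix by the scalar k.
    return [[k * entry for entry in row] for row in matrix]

# Rows: units M1 and M2; columns: cars and jeeps per day.
A = [[10, 5],
     [8, 9]]

two_A = scalar_multiply(2, A)
print(two_A)  # [[20, 10], [16, 18]]
```

One natural reading of 2A is the output of each unit over two days at the same daily rate.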

Addition of Matrices

Two matrices A and B can be added only if they are comparable (i.e. of the same order). Their sum is a matrix C, defined as $C = A + B = (a_{ij} + b_{ij})_{m \times n}$.

Remarks

1. The matrix C = A + B is obtained by adding the elements in corresponding places of A and B.

2. The order of C is the same as that of A and B.

3. The subtraction of B from A can be defined as A + (-1)B.

EXAMPLE 3

SOLUTION

Example 4

SOLUTION

EXAMPLE 5


SOLUTION

Thus X11 = -7, X12 = -12, X21 = -7, etc

Properties of Matrix Addition

(a) Matrix Addition is Commutative

If A and B are two matrices of the same order, then A + B = B + A. As in the case of scalars, this property implies that it is immaterial whether matrix B is added to A or A is added to B.

(b) Matrix Addition is Associative

If A, B and C are three comparable matrices, then A + (B + C) = (A + B) + C = B + (A + C).

According to this property, to add three matrices, we can first add any two of them and then add this result to the third matrix.

(c) Scalar Multiplication is Distributive over Matrix addition

Given a scalar k and matrices A and B of the same order, we can write k(A + B) = kA + kB

(d) Existence of Additive Identity

The null matrix 0 is said to be the additive identity of any matrix A of the same order, because A + 0 = A, i.e. the identity of A does not change when 0 is added to it.

(e) Existence of Additive Inverse

If A and B are two comparable matrices such that A + B = 0, where 0 is a null matrix of the same order as that of A or B, then B is said to be the additive inverse of A and vice versa. Obviously, B = -A, i.e. the negative of a matrix is its additive inverse.

(f) Matrix Addition admits Cancellation Law

Matrix Addition, like scalars, admits cancellation law. If A, B and C are three comparable matrices such that A+ C = B + C, then A = B.
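The properties of matrix addition listed above can be spot-checked on small examples. The sketch below uses hypothetical 2 x 2 matrices and hypothetical helper names (`add`, `scale`); a check on one example illustrates a property but does not prove it.

```python
def add(A, B):
    """Entry-wise sum; defined only for matrices of the same order."""
    if len(A) != len(B) or any(len(r) != len(s) for r, s in zip(A, B)):
        raise ValueError("matrices are not comparable")
    return [[a + b for a, b in zip(r, s)] for r, s in zip(A, B)]

def scale(k, A):
    # Scalar multiplication: every element is multiplied by k.
    return [[k * a for a in row] for row in A]

# Hypothetical matrices chosen only for illustration.
A = [[1, 2], [3, 4]]
B = [[0, -1], [5, 2]]
C = [[7, 7], [-3, 1]]
O = [[0, 0], [0, 0]]  # null matrix of the same order

commutative = add(A, B) == add(B, A)
associative = add(A, add(B, C)) == add(add(A, B), C)
distributive = scale(3, add(A, B)) == add(scale(3, A), scale(3, B))
identity = add(A, O) == A
inverse = add(A, scale(-1, A)) == O
print(commutative, associative, distributive, identity, inverse)
```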


Multiplication of Two Matrices

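Two matrices A and B are said to be conformable for the product AB if the number of columns of A is equal to the number of rows of B. If $A = (a_{ij})_{m \times n}$ and $B = (b_{jk})_{n \times p}$, then the product AB is a matrix C of order m x p such that

$$c_{ik} = \sum_{j=1}^{n} a_{ij} b_{jk}, \qquad i = 1, \ldots, m; \quad k = 1, \ldots, p$$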

Note: If AB is defined then it is not necessary that BA will also be defined.
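The conformability rule and the note above can be illustrated with a short sketch. The matrices are hypothetical and `matmul` is an illustrative helper, not a library function.

```python
def matmul(A, B):
    """Product AB; requires columns of A == rows of B."""
    n = len(B)
    if any(len(row) != n for row in A):
        raise ValueError("A and B are not conformable for the product AB")
    p = len(B[0])
    return [[sum(A[i][j] * B[j][k] for j in range(n)) for k in range(p)]
            for i in range(len(A))]

A = [[1, 2, 3], [4, 5, 6]]    # 2 x 3
B = [[1, 0], [0, 1], [2, 2]]  # 3 x 2
C = [[1, 2], [3, 4]]          # 2 x 2

print(matmul(A, B))  # [[7, 8], [16, 17]]
print(matmul(C, A))  # [[9, 12, 15], [19, 26, 33]]

try:
    matmul(A, C)  # 2 x 3 times 2 x 2: not conformable
except ValueError as err:
    print(err)
```

Here CA is defined while AC is not, which is the situation the note describes.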

EXAMPLE 6

SOLUTION

EXAMPLE 7

and B = (p q r), can you find AB and BA? If yes, find the two products.

SOLUTION

Since the number of columns of A = number of rows of B, therefore AB is defined. On the similar argument, BA is also defined.

Inner Product

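Let $X = (x_1, x_2, \ldots, x_n)$ and $Y = (y_1, y_2, \ldots, y_n)$ be two vectors with n elements each. The inner product (or dot product) of these vectors, denoted by X.Y (read as X dot Y), is given by

$$X \cdot Y = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n$$

The essential requirement for the inner product is that the two vectors must have the same dimensions. However, both can be column (or row) vectors, or one a row vector and the other a column vector.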

EXAMPLE 8

SOLUTION

The inner product is C.D = (3)(1) + (4)(0) + (-2)(5) = -7
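The same calculation can be sketched in code. The vectors C = (3, 4, -2) and D = (1, 0, 5) are inferred from the arithmetic shown in the solution, and `inner` is a hypothetical helper name.

```python
def inner(x, y):
    """Inner (dot) product of two vectors of the same dimension."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same dimension")
    return sum(a * b for a, b in zip(x, y))

C = [3, 4, -2]
D = [1, 0, 5]
print(inner(C, D))  # -7
```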

EXAMPLE 9

A firm produces three products A, B and C which it sells in two markets. Annual sales in units are given below:

If the prices per unit of A, B and C are Rs. 2.50, Rs. 1.25 and Rs. 1.50, and the costs per unit are Rs. 1.70, Rs. 1.20 and Rs. 0.80 respectively, find the total profit in each market by using matrix algebra.


SOLUTION

Further, let P and C be the matrices of prices and costs respectively. Thus, we can write

Note : P and C can also be written as row matrices.

The total revenue (TR) and Total cost (TC) matrices are given by

Hence profits from markets I and II are Rs. 17,800 and Rs. 12,800 respectively.

Note: Alternatively, we can write profit matrix = Q (P-C)
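The equivalence of the two routes to profit, QP - QC and Q(P - C), can be sketched as follows. The prices and costs are those of Example 9, expressed in paise to keep the arithmetic exact; the sales matrix `Q` is hypothetical (it is not the one used in the example), so the resulting figures differ from those quoted above.

```python
def matvec(Q, v):
    # Product of a matrix with a column vector, returned as a flat list.
    return [sum(q * x for q, x in zip(row, v)) for row in Q]

# Hypothetical annual sales (units); rows: markets I and II, columns: A, B, C.
Q = [[100, 200, 300],
     [400, 100, 200]]
P = [250, 125, 150]  # prices per unit in paise (Rs. 2.50, 1.25, 1.50)
C = [170, 120, 80]   # costs per unit in paise (Rs. 1.70, 1.20, 0.80)

TR = matvec(Q, P)                             # total revenue per market
TC = matvec(Q, C)                             # total cost per market
profit = [r - c for r, c in zip(TR, TC)]      # QP - QC
margin = [p - c for p, c in zip(P, C)]        # P - C
profit_alt = matvec(Q, margin)                # Q(P - C)

print(profit, profit_alt)  # [30000, 46500] [30000, 46500]
```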

Properties of Matrix Multiplication.

(a) Matrix Multiplication is not, in general, Commutative

If A and B are two matrices conformable for the products AB and BA, then in general AB ≠ BA.

In view of this fact, the terms premultiply and postmultiply are often used to specify the order of multiplication. For example, BA can be obtained either by premultiplication of A by B or by postmultiplication of B by A. Matrix multiplication is, however, commutative under the following exceptional circumstances:

i. When one matrix is square and the other is an identity matrix of the same order.

ii. When one matrix is the inverse (see section 3.6) of the other.

(b) Matrix Multiplication is Associative

If A, B and C are three matrices such that conformability conditions for the product ABC are satisfied, then

ABC = A(BC) = (AB)C

(c) Matrix Multiplication is Distributive over Addition

The distributive law is A(B + C) = AB + AC, provided that the conformability conditions for addition as well as for multiplication are satisfied.

(d) Existence of Multiplicative Identity Matrix

The identity matrix I is said to be the multiplicative identity matrix, or simply the identity matrix, of any square matrix A of the same order, because IA = AI = A, i.e. the identity of A remains unchanged as a result of its multiplication by I.

(e) Existence of Multiplicative Inverse

If a and b are two scalars, then they are said to be inverses of each other if ab = ba = 1. In a similar way, two square matrices, say A and B, of the same order are said to be multiplicative inverses of each other if AB = BA = I.


(f) Matrix Multiplication does not always admit Cancellation Law

According to this law, if AB = AC, then it does not necessarily mean that B = C. To illustrate this, consider

Thus AB = AC while B ≠ C

(g) The product of two Matrices can be a Null Matrix with none of them being Null Matrix.

Note

i. If AB = 0, then it is not necessary that BA will also be a null matrix.

ii. Properties (f) and (g) hold true only for singular matrices (see section 3.4).
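Properties (f) and (g) can be illustrated with a singular matrix. The matrices below are hypothetical, chosen only so that the arithmetic is easy to follow; `matmul` is an illustrative helper.

```python
def matmul(A, B):
    # Product AB for conformable matrices stored as lists of rows.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[1, 2], [2, 4]]    # singular: the second row is twice the first
B = [[3, 1], [0, 2]]
C = [[1, -1], [1, 3]]
N = [[2, 2], [-1, -1]]  # non-null, and equal to B - C

print(matmul(A, B))  # [[3, 5], [6, 10]]
print(matmul(A, C))  # [[3, 5], [6, 10]]  (AB = AC although B != C)
print(matmul(A, N))  # [[0, 0], [0, 0]]   (a null product of non-null matrices)
print(matmul(N, A))  # [[6, 12], [-3, -6]] (NA is not null)
```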

(h) Positive Integral Powers of Matrices

If A is any square matrix, then the product A.A is denoted by $A^2$. Further, we can write $A^2 . A = A^3$, etc.

We should note that, as for real numbers, we have $A^n . A^m = A^m . A^n = A^{m+n}$ for matrices.

Note: (i) Using the above equation, it is possible to find some matrices which commute with respect to multiplication. If we write $B = A^n$ and $C = A^m$, then obviously BC = CB.

(ii) Matrix A is said to be idempotent if $A = A^2 = \cdots = A^m$.

Transpose of a Matrix

The transpose of a matrix A is a matrix, denoted by $A^T$ (or A'), obtained by the interchange of its rows and columns. Symbolically,

if $A = (a_{ij})_{m \times n}$ then $A' = (a_{ji})_{n \times m}$

EXAMPLE 10

Properties of Transpose

i. The transpose of the transposed matrix is the original matrix, i.e. (A')' = A.

ii. The transpose of the sum of two (or more) matrices is equal to the sum of the transposed matrices, i.e. (A + B)' = A' + B'.

iii. The transpose of the product of two (or more) matrices is equal to the product of the transposed matrices in reverse order, i.e. (AB)' = B'A'.


Note: (i) A matrix A with the property that A = A' is known as a symmetric matrix.

(ii) A matrix A with the property that A = -A' is known as a skew-symmetric matrix.
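The transpose rules and the symmetry definitions can be spot-checked as follows. All matrices are hypothetical examples and the helper names are illustrative.

```python
def transpose(A):
    # Rows become columns and columns become rows.
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def negate(A):
    return [[-a for a in row] for row in A]

A = [[1, 2, 3], [4, 5, 6]]    # 2 x 3
B = [[1, 0], [0, 1], [2, 2]]  # 3 x 2

print(transpose(transpose(A)) == A)                                      # (A')' = A
print(transpose(matmul(A, B)) == matmul(transpose(B), transpose(A)))     # (AB)' = B'A'

S = [[1, 7], [7, 2]]   # symmetric: S = S'
K = [[0, 3], [-3, 0]]  # skew-symmetric: K = -K'
print(S == transpose(S), K == negate(transpose(K)))
```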

DETERMINANTS AND NON -SINGULARITY

Let X, Y and Z be n-dimensional (row or column) vectors. If we can write W = aX + bY + cZ, where a, b and c are scalars, then W is said to be a linear combination of the vectors X, Y and Z; in other words, the set of vectors X, Y, Z and W is said to be linearly dependent. Contrary to this, if none of these vectors can be expressed as a linear combination of the others, the vectors are said to be linearly independent.

The Condition for Linear Independence

A set of vectors $V_1, V_2, \ldots, V_m$ is said to be linearly independent if, for scalars $k_1, k_2, \ldots, k_m$, the linear combination $k_1 V_1 + k_2 V_2 + \cdots + k_m V_m = 0$ only when all $k_i$ (i = 1 to m) = 0.

Non-Singular Matrix

A square matrix consisting of linearly independent rows ( or columns) is said to be non- singular.

Let us examine the linear independence of the rows of matrix

To examine the linear independence, we have to find the values of two scalars k1 and k2 such that

This gives the following system of simultaneous equations:

$$3k_1 - k_2 = 0$$

$$-2k_1 + 3k_2 = 0$$

On solving these equations simultaneously, we get $k_1 = k_2 = 0$.

Thus, the rows of matrix A are linearly independent. Further, since A is a square matrix with linearly independent rows, it is said to be non-singular.

Note : A square matrix with linearly dependent rows (or columns) is said to be a singular matrix.

The above method of examining linear independence of rows (or columns) becomes very cumbersome when the number of rows (or columns) becomes large and each row (or column) contains several elements. An easier method is provided by determinants. The concept of a determinant is discussed below.

Determinant

A determinant is a uniquely defined scalar associated with a square matrix. The determinant of a square matrix $A = (a_{ij})_{n \times n}$ is denoted as det A or |A|.

Determinants of Order One

If a matrix A consists of only one element, i.e. $A = (a_{11})$, its determinant is defined as $|a_{11}| = a_{11}$. It should be noted here that the concept of determinant is entirely different from the concept of absolute value, although the same symbol is used to denote both of them. For example, det (-4) = |-4| = -4; however, the absolute value of -4, also denoted as |-4|, is 4.

Determinants of Order two

If

$$A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$$

is a 2 x 2 matrix, its determinant is defined as

$$|A| = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11} a_{22} - a_{12} a_{21}$$

Determinants of Order Three or Higher

Before giving a rule for the evaluation of this determinant, we introduce the concept of Minors and Cofactors

Minors

A minor of an element $a_{ij}$, denoted by $M_{ij}$, is a sub-determinant of |A| obtained by deleting its ith row and jth column.

Cofactor

A cofactor of an element $a_{ij}$, denoted by $C_{ij}$, is its minor with the appropriate sign. Thus, we can write $C_{ij} = (-1)^{i+j} M_{ij}$.

We note that $C_{11} = M_{11}$ but $C_{12} = -M_{12}$, etc.

Definitions

The determinant of any matrix of order 3 x 3 or higher is defined as the inner product (or dot product) of any row (or column) of that matrix with the corresponding row (or column) of cofactors. It can be shown that the result is independent of the choice of the row (or column).

Thus, expanding along the ith row, we can write

$$|A| = a_{i1} C_{i1} + a_{i2} C_{i2} + \cdots + a_{in} C_{in}$$
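The cofactor definition translates directly into a recursive routine. The sketch below is illustrative: the matrix is hypothetical, the names `minor` and `det` are not from the text, and rows are indexed from 0, so the sign (-1)^(i+j) is unchanged.

```python
def minor(A, i, j):
    # Sub-matrix left after deleting row i and column j (0-based).
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A, row=0):
    """Determinant by cofactor expansion along the given row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** (row + j) * A[row][j] * det(minor(A, row, j))
               for j in range(len(A)))

A = [[2, 0, 1],
     [1, 3, -1],
     [0, 5, 4]]

# The value does not depend on the row chosen for the expansion.
print(det(A, 0), det(A, 1), det(A, 2))  # 39 39 39
```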

EXAMPLE 11

Evaluate the following determinants

SOLUTION

(i) Writing the inner product of the elements of the first row and their cofactors, we have

(ii) Since the third row contains two elements as zero, the calculation work will be minimum if we write the determinant as the inner product of the elements of the third row and their cofactors


Sarrus Diagram

An alternative method, that is very convenient for writing the value of a 3 x 3 determinant is provided by Sarrus Diagram

This diagram consists of 5 columns, in which the first and second columns of |A| are written as the fourth and fifth columns respectively, as shown below

The elements in the indicated directions are multiplied. From the diagram we can write

$$|A| = a_{11} a_{22} a_{33} + a_{12} a_{23} a_{31} + a_{13} a_{21} a_{32} - a_{13} a_{22} a_{31} - a_{11} a_{23} a_{32} - a_{12} a_{21} a_{33}$$

On the basis of this rule, we can write the value of the determinant given in example 11 (i) as
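The Sarrus rule for a 3 x 3 determinant is short enough to write out term by term. The sketch below uses a hypothetical matrix (not the one in example 11) and a hypothetical function name `sarrus`.

```python
def sarrus(A):
    """Determinant of a 3 x 3 matrix by the Sarrus rule."""
    (a, b, c), (d, e, f), (g, h, i) = A
    # Three products along the downward diagonals minus three along the upward ones.
    return a*e*i + b*f*g + c*d*h - c*e*g - a*f*h - b*d*i

A = [[2, 0, 1],
     [1, 3, -1],
     [0, 5, 4]]
print(sarrus(A))  # 39
```

The rule applies only to 3 x 3 determinants; higher orders need the cofactor expansion.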

Properties of Determinants

1. The value of a determinant is not affected by the interchange of its rows and columns.

2. The interchange of any two rows (or columns) of a determinant alters its value by a multiple of -1.

Interchanging the first and second row of |A| considered in property (1) above, we have

3. If all the elements of a row (or column) are multiplied by a scalar k, the determinant thus obtained is k times the original determinant.

by a scalar k, we get

4. The addition (or subtraction) of a multiple of any row (or column) to another row (or column) does not affect the value of the determinant.

5. If any row (or column) is a linear combination of one or more rows (or columns), or if two or more rows (or columns) are identical, the determinant vanishes.

Consider a determinant in which first and the second columns are identical i.e.

Remarks

This property implies that if two or more rows of a matrix are linearly dependent, its determinant will be zero.
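The five properties can be spot-checked on one matrix. This is a sketch on a hypothetical example, which illustrates the properties without proving them; `det` is an illustrative recursive helper.

```python
def det(A):
    # Cofactor expansion along the first row.
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det([row[:j] + row[j + 1:] for row in A[1:]])
               for j in range(len(A)))

A = [[2, 0, 1], [1, 3, -1], [0, 5, 4]]
d = det(A)

transposed = [list(col) for col in zip(*A)]                          # property 1
swapped = [A[1], A[0], A[2]]                                         # property 2
scaled = [[5 * a for a in A[0]], A[1], A[2]]                         # property 3, k = 5
combined = [A[0], [a + 3 * b for a, b in zip(A[1], A[0])], A[2]]     # property 4
repeated = [[row[0], row[0], row[2]] for row in A]                   # property 5

checks = [det(transposed) == d, det(swapped) == -d, det(scaled) == 5 * d,
          det(combined) == d, det(repeated) == 0]
print(d, checks)
```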


Thus, to determine whether a matrix is singular or not, we simply have to evaluate its determinant. Since determinants are defined only for square matrices, we can say that a square matrix is non-singular if its determinant is non-zero.

Rank of Matrix

If the determinant of a matrix is zero, it only indicates that its rows (or columns) are not linearly independent. When not all the rows (or columns) are linearly independent, we are often interested to know how many rows (or columns) out of the total number of rows (or columns) are linearly independent. To answer this question, we introduce the concept of rank of a matrix.

Rank of a matrix A is the maximum number of linearly independent rows (or columns) in it.

If A is an m x n matrix, then its maximum rank can be m or n, whichever is smaller. The maximum possible rank of an n x n matrix is n. We may note here that if the rank of an n x n matrix is n, then it must be non-singular.

Determination of Rank.

(a) By the use of Determinants

The rank of an m x n matrix A is said to be r, if every minor of order greater than r is zero and there is at least one minor of order r which is non-zero.

EXAMPLE 12

SOLUTION

The highest order of the minor is 3. Therefore, we first compute | A |

This implies that there is at least one row (or column) which is linearly dependent, and hence the rank of A, to be denoted as ρ(A), is less than 3.

Further, we consider minors of order 2 x 2. Let us consider the minor

Example 13

SOLUTION

Note that A is a 3 x 4 matrix; therefore its maximum possible rank is 3. To determine whether ρ(A) is 3 or not, we find whether there exists at least one minor of order 3 that is non-zero.

Further, consider another minor of order 3.

Example 14

SOLUTION


First we examine the minors of order 3

Since all the four possible minors of order 3 are zero, the rank of A is less than 3. We now consider minors of order 2.

Hence ρ(A) = 2.

Note:

1. The rank of an identity matrix $I_n$ is n.

2. The rank of a null matrix is zero.

3. The rank of a matrix whose every element is a (≠ 0) is unity.

4. ρ(A) = ρ(A')

(b) By Elementary Row Operations

It is obvious from example 14 that the method of finding the rank of a matrix by the use of determinants may become very cumbersome. Alternatively, the rank of a matrix can be obtained by reducing it to an echelon form (a matrix in which all elements below the diagonal are zero); the rank is then given by the number of non-zero rows in the echelon form.

To obtain an echelon form of a matrix, we introduce the concept of Equivalent Matrices.

Two matrices A and B are said to be equivalent if one can be obtained from the other by a sequence of elementary row (or column) transformations.

An elementary row ( or column) transformation can be anyone of the following transformations:

(i) The interchange of the ith and jth rows (or columns). This transformation is denoted by $R_{ij}$ (or $C_{ij}$).

(ii) The multiplication of the ith row (or column) by a non-zero scalar k, to be denoted as $R_i = kR_i$ (or $C_i = kC_i$).

(iii) The addition (or subtraction) of k times the jth row (or column) to the ith row (or column), to be denoted as $R_i = R_i + kR_j$ (or $C_i = C_i + kC_j$).

We state, without proof, that Equivalent Matrices have the same rank.

Using elementary row operations, a matrix can be reduced to an echelon form. The rank of the matrix is then given by the number of non-zero rows in its echelon form.
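The reduction to echelon form can be sketched with exact fractions, using only row interchanges and the addition of multiples of one row to another. The function name `rank` and the test matrices are hypothetical; the count of pivots found equals the number of non-zero rows in the echelon form.

```python
from fractions import Fraction

def rank(matrix):
    """Rank via reduction to echelon form with elementary row operations."""
    M = [[Fraction(x) for x in row] for row in matrix]
    rows, cols = len(M), len(M[0])
    r = 0  # number of pivot (non-zero) rows found so far
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue                      # no pivot in this column
        M[r], M[pivot] = M[pivot], M[r]   # row interchange
        for i in range(r + 1, rows):
            factor = M[i][c] / M[r][c]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]  # R_i = R_i - factor * R_r
        r += 1
        if r == rows:
            break
    return r

print(rank([[1, 0, 0], [0, 1, 0], [0, 0, 1]]))  # 3
print(rank([[1, 2, 3], [2, 4, 6], [1, 0, 1]]))  # 2
print(rank([[0, 0, 0], [0, 0, 0]]))             # 0
print(rank([[4, 4, 4], [4, 4, 4]]))             # 1
```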

Example 15

Solutions


Since the number of non-zero rows is 3, therefore ρ(A) = 3

Example 16

Solutions

Hence ρ(A) = 2

Example 17

Solution

(ii) To get a non-zero diagonal element at the intersection of the second row and second column, apply $R_{24}$ and write

Thus ρ(B) = 2

Consistency of a System of Equations

A system of linear equations can be written, using matrix algebra, as the following matrix equation:

$$AX = B$$

where A is the matrix of coefficients, X is the column vector of unknowns and B is the column vector of constants.

The condition for consistency of the above system can be expressed by using the concept of rank of a matrix. The above system of equations is said to be consistent if ρ(A) = ρ(A | B), where A | B (read as A augmented B) is the matrix obtained by writing the column B to the right of the columns of A.

Case 1: If A is a square matrix with |A| ≠ 0

Then the rank condition for consistency will always be satisfied. Since ρ(A) will be equal to the number of rows (or columns), the addition of another column to A will not alter the rank of A | B. Further, since all the rows are linearly independent when |A| ≠ 0, the system is consistent and independent and has a unique solution (refer to Sec. 2.4).

Case 2: If ρ(A) = ρ(A | B) and |A| = 0, the system is said to be consistent and dependent. Such a system has an infinite number of solutions (refer to Sec. 2.4).
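The rank condition for consistency can be sketched as a small classifier. The helper names `rank` and `classify` and the three systems are hypothetical; the rank routine is a compact echelon-form reduction with exact fractions.

```python
from fractions import Fraction

def rank(matrix):
    # Number of non-zero rows after reduction to echelon form.
    M = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for c in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            factor = M[i][c] / M[r][c]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == len(M):
            break
    return r

def classify(A, B):
    """Compare rank(A) with rank(A | B) for the system AX = B."""
    augmented = [row + [b] for row, b in zip(A, B)]
    r_a, r_ab = rank(A), rank(augmented)
    if r_a != r_ab:
        return "inconsistent"
    if r_a == len(A[0]):
        return "consistent and independent (unique solution)"
    return "consistent and dependent (infinitely many solutions)"

print(classify([[1, 1], [1, -1]], [2, 0]))  # |A| != 0
print(classify([[1, 2], [2, 4]], [3, 6]))   # |A| = 0, ranks equal
print(classify([[1, 2], [2, 4]], [3, 7]))   # |A| = 0, ranks differ
```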

Summary

This block attempts to introduce the concepts of matrices and determinants and their importance in solving real-world problems of business. While a matrix is an array of numbers arranged into a certain number of rows and columns, a determinant is a scalar associated with a square matrix. Unlike scalars, the basic operations such as addition, subtraction and multiplication can be performed only if certain conditions are satisfied by the participating matrices. Also unlike scalars, division of one matrix by another is not defined. Using matrices, we can write a system of linear equations in a compact form, test their consistency and solve them in an efficient manner.

Review Question

1. What are the dimensions of the following matrices:

1. Indicate whether the following statements are true or false. Give reasons if the statement is false.

1. Perform the indicated operations. State reasons if an operation is not possible.

find A - B, A + B, 4A + 2B, 3A - 3B

1. Find A B and BA, if possible, for the matrices

1. An automobile company has two manufacturing plants located at Delhi and Pune. It manufactures scooters and motorbikes at each plant. Each vehicle is produced in three models A, B and C. The following two matrices give the number of vehicles (in thousands) of each model produced in the two plants during 1997.

i. Write a matrix showing the total production for both plants in 1997.

ii. If the production is increased by 20% in the Delhi plant and 10% in the Pune plant, write the matrix for total production for the following year.

1. Show that, for all values of p, q, r and s, the matrices

1. if possible, find a matrix X such that


1. A manufacturer produces three products A, B and C which are sold in Delhi and Calcutta. The annual sales of these products are given by the following matrix

 

If the sale price of the products A, B and C per unit be Rs. 2, 3 and 4 respectively, calculate total revenue from each centre by using matrices.

15. A firm uses three ingredients to manufacture two products, A and B. The cost (in Rs.) per kg of each ingredient is given by C = (5.0 12.5 15.0).

The requirement of each ingredient (in kgs) to produce one unit of each product is shown in the following matrix.

Find the cost per unit of each product.

 


  Quantitative Techniques in Management

  Chapter 6 : COLLECTION OF DATA

   

  OBJECTIVES

After studying this unit, you should be able to:

- appreciate the need and significance of data collection
- distinguish between primary and secondary data
- know different methods of collecting primary data
- design a suitable questionnaire
- edit the primary data and know the source of secondary data and its use
- understand the concept of census vs. sample.

INTRODUCTION

To make a decision in any business situation you need data. Facts expressed in quantitative form can be termed as data. Success of any statistical investigation depends on the availability of accurate and reliable data. These depend on the appropriateness of the method chosen for data collection. Therefore, data collection is a very basic activity in decision-making. In this unit, we shall be studying the different methods that are used for collecting data. Data may be classified either as primary or secondary.

DESIGNING A QUESTIONNAIRE

The success of collecting data through a questionnaire depends mainly on how skillfully and imaginatively the questionnaire has been designed. A badly designed questionnaire will never be able to gather the relevant data. In designing the questionnaire, some of the important points to be kept in mind are:

- Covering letter
- Number of questions should be kept to the minimum
- Questions should be simple, short and unambiguous
- Questions of sensitive or personal nature should be avoided
- Answers to questions should not require calculations
- Logical arrangement
- Cross-check and footnotes

EDITING PRIMARY DATA

Once the questionnaires have been filled and the data collected, it is necessary to edit this data. Editing of data should be done to ensure completeness, consistency, accuracy and homogeneity.

- Completeness
- Consistency
- Accuracy
- Homogeneity

SOURCES OF SECONDARY DATA


The sources of secondary data may be divided into two broad categories, published and unpublished

- Published Sources
- Unpublished Sources

PRECAUTIONS IN THE USE OF SECONDARY DATA

A careful scrutiny must be made before using published data. The user should be extra cautious in using secondary data and should not accept it at its face value. The reason is that such data may be full of errors because of bias, inadequate sample size, errors of definition, computational errors, etc. Therefore, before using such data, the following aspects should be considered:

- Suitability
- Reliability
- Adequacy

SUMMARY

Statistical data is a set of facts expressed in quantitative form. The use of facts expressed as measurable quantities can help a decision maker to arrive at better decisions. Data can be obtained through a primary source or a secondary source. When the data is collected by the investigator himself, it is called primary data. When the data is collected by others, it is known as secondary data. The most important method for primary data collection is through a questionnaire. A questionnaire refers to a device used to secure answers to questions from the respondents. Another important distinction in considering data is whether the values represent the complete enumeration of some whole, known as the population or universe, or only a part of the population, which is called a sample.

FURTHER READINGS

Clark, T C and E W Jordan, 1985. Introduction to Business and Economic Statistics, South-Western Publishing Co.: Ohio.

Enns, P G, 1985. Business Statistics, Richard D. Irwin Inc.: Homewood.

Gupta, S P and M P Gupta, 1958. Business Statistics, Sultan Chand & Sons: New Delhi.

Levin, R I, 1979. Statistics for Management, Prentice Hall of India: New Delhi.

Moskowitz, H and G P Wright, 1985. Statistics for Management and Economics, Charles E. Merrill Publishing Company: Ohio.

 


  Quantitative Techniques in Management

  Chapter 7 : PRESENTATION OF DATA

   

   OBJECTIVES

After studying this unit, you should be able to :

- understand the need and significance of presentation of data
- know the necessity of classifying data and various types of classification
- construct a frequency distribution of discrete and continuous data
- present a frequency distribution in the form of bar diagram, histogram, frequency polygon and ogives.

STRUCTURE

- Summary
- Key Words
- Self-assessment Exercises
- Further Readings

SUMMARY

Presentation of data is provided through tables and charts. A frequency distribution is the principal tabular summary of either discrete or continuous data. The frequency distribution may show actual, relative or cumulative frequencies. Actual and relative frequencies may be charted as either a histogram (a bar chart) or a frequency polygon. Two graphs of cumulative frequencies are the less than ogive and the more than ogive.
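The tabular quantities named above (actual, relative and cumulative frequencies) can be sketched as follows. The data values, the class limits and the function name `frequency_distribution` are hypothetical; each class includes its lower limit and excludes its upper limit, which is one common convention.

```python
def frequency_distribution(values, lower, width, classes):
    """Count values falling in [lower, lower + width), [lower + width, ...), etc."""
    counts = [0] * classes
    for v in values:
        k = int((v - lower) // width)
        if 0 <= k < classes:  # values outside the class range are ignored
            counts[k] += 1
    return counts

profits = [12, 25, 37, 18, 44, 29, 31, 15, 22, 48]  # hypothetical data
counts = frequency_distribution(profits, lower=10, width=10, classes=4)
total = sum(counts)

relative = [c / total for c in counts]
less_than = [sum(counts[:i + 1]) for i in range(len(counts))]  # below each upper limit
more_than = [sum(counts[i:]) for i in range(len(counts))]      # at or above each lower limit

print(counts)     # [3, 3, 2, 2] for classes 10-20, 20-30, 30-40, 40-50
print(less_than)  # [3, 6, 8, 10]
print(more_than)  # [10, 7, 4, 2]
```

Plotting `less_than` against the upper class limits and `more_than` against the lower class limits gives the points of the two ogives.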

KEY WORDS

Bar Chart is a thick line where the length of the bars should be proportional to the magnitude of the variable they present.

Class Interval represents the width of a class.

Class Limits denote the lowest and highest value that can be included in the class.

Continuous Data can take all values of the variable.

Discrete Data refers to quantitative data that are limited to certain numerical values of a variable.

Frequency Distribution is a tabular presentation in which observations with similar or closely related values are put in groups.

Qualitative Data is characterised by exhaustive and distinct categories that do not possess magnitude.

Quantitative Data possess the characteristic of numerical magnitude.

SELF-ASSESSMENT EXERCISES

1. Explain the purpose and methods of classification of data, giving suitable examples.

2. What are the general guidelines for forming a frequency distribution, with particular reference to the choice of class intervals and number of classes?

3. Explain the various diagrams and graphs that can be used for charting a frequency distribution.

4. What are ogives? Point out their role. Discuss the method of constructing ogives with the help of an example.


5. The following data relate to the number of family members in 30 families of a village.

Classify the above data in the form of a discrete frequency distribution.

6. The profits (Rs.Lakh) of 50 companies are given below:

Classify the above data taking first class as 10-20 and form a frequency distribution.

7. The income (Rs.) of 24 employees of a company are given below.

Form a continuous frequency distribution after selecting a suitable class interval.

8. Draw a histogram and a frequency polygon from the following data.

9. Go through the following data carefully and then construct a histogram.

10. The following data relate to the sales of 100 companies:

Draw less than and more than ogives. Determine the number of companies whose sales are (i) less than Rs. 13 lakh, (ii) more than Rs. 36 lakh, and (iii) between Rs. 13 lakh and Rs. 36 lakh.

FURTHER READINGS

Clark, T.C. and E.W. Jordan, 1985. Introduction to Business and Economic Statistics, South-Western Publishing Co.: Ohio, USA.

Enns, P.G., 1985. Business Statistics, Richard D. Irwin Inc.: Homewood.

Gupta, S.P. and M.P. Gupta, 1988. Business Statistics, Sultan Chand & Sons: New Delhi.

Levin, R.I., 1979. Statistics for Management, Prentice Hall of India: New Delhi.

Moskowitz, H. and G.P. Wright, 1985. Statistics for Management and Economics, Charles E. Merrill Publishing Company: Ohio, USA.

  

 


  Chapter 8 : MEASURES OF CENTRAL TENDENCY

   STRUCTURE

Introduction
Significance of Measures of Central Tendency
Properties of a Good Measure of Central Tendency
Arithmetic Mean
Mathematical Properties of Arithmetic Mean
Weighted Arithmetic Mean
Median
Mathematical Property of Median
Quantiles
Locating the Quantiles Graphically
Mode
Locating the Mode Graphically
Relationship among Mean, Median and Mode
Geometric Mean
Harmonic Mean

INTRODUCTION

With this unit, we begin our formal discussion of the statistical methods for summarising and describing numerical data. The objective here is to find one representative value which can be used to locate and summarise the entire set of varying values. This one value can be used to make many decisions concerning the entire set. Measures of central tendency (or location) identify some central value around which the data tend to cluster.

SIGNIFICANCE OF MEASURES OF CENTRAL TENDENCY

Measures of central tendency condense a mass of data into one single value and so enable us to get an idea of the entire data. For example, it is impossible to remember the individual incomes of millions of earning people of India. But if the average income is obtained, we get one single value that represents the entire population. Measures of central tendency also enable us to compare two or more sets of data. For example, the average sales figure for April may be compared with the average sales figures of previous months.

PROPERTIES OF A GOOD MEASURE OF CENTRAL TENDENCY

A good measure of central tendency should possess, as far as possible, the following properties:

i. It should be easy to understand.
ii. It should be simple to compute.
iii. It should be based on all observations.
iv. It should be uniquely defined.
v. It should be capable of further algebraic treatment.
vi. It should not be unduly affected by extreme values.

Following are some of the important measures of central tendency which are commonly used in business and industry.

Arithmetic Mean
Weighted Arithmetic Mean
Median
Quantiles
Mode
Geometric Mean
Harmonic Mean

ARITHMETIC MEAN

The arithmetic mean (or mean or average) is the most commonly used and readily understood measure of central tendency. In statistics, the term average refers to any of the measures of central tendency. The arithmetic mean is defined as being equal to the sum of the numerical values of each and every observation divided by the total number of observations. Symbolically, it can be represented as:

$$\bar{X} = \frac{\sum X}{N}$$

where $\sum X$ indicates the sum of the values of all the observations, and N is the total number of observations. For example, let us consider the monthly salary (Rs) of 10 employees of a firm:

2500, 2700, 2400, 2300, 2550, 2650, 2750, 2450, 2600, 2400

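If we compute the arithmetic mean, then

$$\bar{X} = \frac{2500 + 2700 + 2400 + 2300 + 2550 + 2650 + 2750 + 2450 + 2600 + 2400}{10} = \frac{25300}{10} = 2530$$

Therefore, the average monthly salary is Rs. 2530.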
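The calculation for ungrouped data can be checked with a few lines of Python. This is only a minimal sketch; the function name `arithmetic_mean` is chosen here for illustration and does not come from the text.

```python
# Monthly salaries (Rs) of the 10 employees listed above.
salaries = [2500, 2700, 2400, 2300, 2550, 2650, 2750, 2450, 2600, 2400]


def arithmetic_mean(values):
    """Sum of the observations divided by the number of observations."""
    return sum(values) / len(values)


mean_salary = arithmetic_mean(salaries)
print(mean_salary)  # 2530.0
```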

We have seen how to compute the arithmetic mean for ungrouped data. Now let us consider what modifications are necessary for grouped data. When the observations are classified into a frequency distribution, the midpoint of the class interval should be treated as the representative average value of that class. Therefore, for grouped data, the arithmetic mean is defined as

$$\bar{X} = \frac{\sum fX}{N}$$

where X is the midpoint of the various classes, f is the frequency of the corresponding class and N is the total frequency, i.e. $N = \sum f$.

This method is illustrated for the following data which relate to the monthly sales of 200 firms.

For computation of arithmetic mean, we need the following table.

Hence the average monthly sales are Rs.510

To simplify calculations, the following formula for the arithmetic mean may be more convenient to use.

This formula makes the computations very simple and takes less time. To apply this formula, let us consider the same example discussed earlier and shown again in the following table.


It may be observed that this formula is much faster than the previous one and the value of arithmetic mean remains the same.
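The agreement between the two methods can be illustrated with a short Python sketch. It assumes that the simplified formula is the usual assumed-mean (step-deviation) form, A + i(∑fd)/N with d = (X - A)/i, where A is an assumed mean and i the class width. The class limits and frequencies below are hypothetical and are not the table from the text.

```python
# Hypothetical grouped data: (lower limit, upper limit, frequency).
# These figures are illustrative only; they are not the table from the text.
classes = [(300, 400, 20), (400, 500, 50), (500, 600, 70),
           (600, 700, 40), (700, 800, 20)]


def grouped_mean_direct(classes):
    """Direct method: sum(f * X) / N, with X the class midpoint."""
    n = sum(f for _, _, f in classes)
    total = sum(f * (lo + hi) / 2 for lo, hi, f in classes)
    return total / n


def grouped_mean_step_deviation(classes, assumed_mean, width):
    """Shortcut method: A + i * sum(f * d) / N, with d = (X - A) / i."""
    n = sum(f for _, _, f in classes)
    sum_fd = sum(f * ((lo + hi) / 2 - assumed_mean) / width
                 for lo, hi, f in classes)
    return assumed_mean + width * sum_fd / n


direct = grouped_mean_direct(classes)
shortcut = grouped_mean_step_deviation(classes, assumed_mean=550, width=100)
print(direct, shortcut)  # both methods give 545.0
```

The shortcut works with small whole-number deviations (here -2, -1, 0, 1, 2), which is what makes hand calculation quicker.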

MATHEMATICAL PROPERTIES OF ARITHMETIC MEAN

Because the arithmetic mean is defined operationally, it has several useful mathematical properties. Some of these are:

1. The sum of the deviations of the observations from the arithmetic mean is always zero. Symbolically,

$$\sum (X - \bar{X}) = 0$$

It is because of this property that the mean is characterised as a point of balance, i.e., the sum of the positive deviations from the mean is equal to the sum of the negative deviations from the mean.

2. The sum of the squared deviations of the observations from the mean is a minimum, i.e., the total of the squares of the deviations from any value other than the mean will be greater than the total of the squares of the deviations from the mean. Symbolically,

$$\sum (X - \bar{X})^2 \text{ is a minimum}$$

3. The arithmetic means of several sets of data may be combined into a single arithmetic mean for the combined sets of data. For two sets of data, the combined arithmetic mean may be defined as

$$\bar{X}_{12} = \frac{N_1 \bar{X}_1 + N_2 \bar{X}_2}{N_1 + N_2}$$

where $N_1$ and $N_2$ are the numbers of observations and $\bar{X}_1$ and $\bar{X}_2$ the arithmetic means of the two sets.

If we have to combine three or more sets of data, then the same formula can be generalised as

$$\bar{X}_{12 \ldots k} = \frac{N_1 \bar{X}_1 + N_2 \bar{X}_2 + \cdots + N_k \bar{X}_k}{N_1 + N_2 + \cdots + N_k}$$
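These properties can be verified numerically. The sketch below reuses the ten salaries from the earlier example; the second group (15 employees with a mean of Rs. 2600) and the comparison value (mean + 50) are hypothetical and introduced only for illustration.

```python
def mean(values):
    return sum(values) / len(values)


def sum_sq_dev(values, about):
    """Sum of squared deviations of the values about a chosen point."""
    return sum((x - about) ** 2 for x in values)


def combined_mean(sizes, means):
    """Combined mean of several groups: sum(N_k * mean_k) / sum(N_k)."""
    return sum(n * m for n, m in zip(sizes, means)) / sum(sizes)


salaries = [2500, 2700, 2400, 2300, 2550, 2650, 2750, 2450, 2600, 2400]
m = mean(salaries)

# Property 1: deviations from the mean add up to zero.
sum_dev = sum(x - m for x in salaries)

# Property 2: squared deviations are smallest about the mean.
ss_about_mean = sum_sq_dev(salaries, m)
ss_about_other = sum_sq_dev(salaries, m + 50)   # any other value is worse

# Property 3: combine with a hypothetical second group of 15 employees.
combined = combined_mean([10, 15], [m, 2600])

print(sum_dev, ss_about_mean < ss_about_other, combined)
```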

The arithmetic mean has the great advantage of being easily computed and readily understood. This is because it possesses almost all the properties of a good measure of central tendency. No other measure of central tendency possesses so many of these properties. However, the arithmetic mean has some disadvantages. The major disadvantage is that its value may be distorted by the presence of extreme values in a given set of data. A minor disadvantage arises with open-end distributions, since it is difficult to assign a midpoint value to an open-end class.

Activity A

The following data relate to the monthly earnings of 428 skilled employees in a big organisation.

WEIGHTED ARITHMETIC MEAN

The arithmetic mean, as discussed earlier, gives equal importance (or weight) to each observation. In some cases, all observations do not have the same importance. When this is so, we compute the weighted arithmetic mean. The weighted arithmetic mean can be defined as

$$\bar{X}_w = \frac{\sum WX}{\sum W}$$

where W are the weights assigned to the values of the variable X.

You are familiar with the use of weighted averages to combine several grades that are not equally important. For example, assume that the grade consists of one final examination and two mid-term assignments. If the three grades are given different weights, then the procedure is to multiply each grade (X) by its appropriate weight (W). If the final examination is 50 per cent of the grade and each mid-term assignment is 25 per cent, then the weighted arithmetic mean is given as follows:

$$\bar{X}_w = \frac{\sum WX}{\sum W} = \frac{50X_1 + 25X_2 + 25X_3}{50 + 25 + 25}$$

where $X_1$ is the final examination grade and $X_2$, $X_3$ are the two mid-term assignment grades.

Suppose you got 80 in the final examination, 95 in the first mid-term assignment, and 85 in the second mid-term assignment; then

$$\bar{X}_w = \frac{50(80) + 25(95) + 25(85)}{100} = \frac{8500}{100} = 85$$

The following table shows this computation in a tabular form which is easy to employ for the calculation of the weighted arithmetic mean.

Component                     Grade (X)   Weight (W)   WX
Final examination             80          50           4000
First mid-term assignment     95          25           2375
Second mid-term assignment    85          25           2125
Total                                     100          8500

The concept of weighted arithmetic mean is important because the computation is the same as used for averaging ratios and determining the mean of grouped data.
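The grades example above can be reproduced with a short Python sketch. The function name `weighted_mean` is an illustrative choice; the grades and percentage weights are those given in the text.

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean: sum(W * X) / sum(W)."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


grades = [80, 95, 85]     # final examination, first and second mid-term
weights = [50, 25, 25]    # percentage weight of each component

final_grade = weighted_mean(grades, weights)
print(final_grade)  # 85.0

# With equal weights the result reduces to the simple arithmetic mean.
simple_grade = weighted_mean(grades, [1, 1, 1])
```

The last line also shows why the computation matches that of the mean of grouped data: the frequencies simply play the role of the weights.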

Weighted mean is specially useful in problem relating to the construction of index numbers.

Activity B

A contractor employs three types of workers: male, female and children. He pays Rs. 40, Rs. 30, and Rs. 25 per day to a male, female and child worker respectively. Suppose he employs 20 males, 15 females, and 10 children. What is the average wage per day paid by the contractor? Would it make any difference to the answer if the numbers of males, females, and children employed were equal? Illustrate.

MEDIAN

A second measure of central tendency is the median. The median is that value which divides the distribution into two equal parts. Fifty per cent of the observations in the distribution are above the value of the median and the other fifty per cent of the observations are below it. The median is the value of the middle observation when the series is arranged in order of size or magnitude. If the number of observations is odd, then the median is equal to one of the original observations. If the number of observations is even, then the median is the arithmetic mean of the two middle observations. For example, if the income of seven persons in rupees is 1100, 1200, 1350, 1500, 1600, 1800, then the median income would be Rs. 1500. Suppose one more person joins and his income is Rs. 1850; then, since the number of observations is now even, the median income of the eight persons would be the arithmetic mean of the incomes of the 4th and the 5th persons in the ordered series.
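The odd and even cases can be sketched in Python as follows. The income figures used here are hypothetical and chosen only to show the two branches of the rule.

```python
def median(values):
    """Middle value of the ordered series; mean of the two middle
    values when the number of observations is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


# Hypothetical incomes (Rs), illustrative only.
odd_case = [1700, 1100, 1400, 1250, 1550]    # 5 observations
even_case = odd_case + [1900]                # 6 observations

print(median(odd_case))   # 1400, the 3rd value in order
print(median(even_case))  # 1475.0, the mean of 1400 and 1550
```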

For grouped data, the following formula may be used to locate the value of the median.

$$\text{Median} = L + \frac{\frac{N}{2} - pcf}{f} \times i$$

where L is the lower limit of the median class, pcf is the cumulative frequency preceding the median class, f is the frequency of the median class and i is the size of the median class.

As an illustration, consider the following data which relate to the age distribution of 1000 workers in an industrial establishment.

Determine the median age

The location of the median value is facilitated by the use of a cumulative frequency distribution, as shown in the table below.

Median = size of the N/2th observation = 1000/2 = 500th observation, which lies in the class 35-40.

Hence the median age is approximately 37 Years. This value of median suggests that half of the workers are below the age of 37 years and other half of the workers are above the age of 37 years.

MATHEMATICAL PROPERTY OF MEDIAN

The important mathematical property of the median is that the sum of the absolute deviations about the median is a minimum. In symbols, $\sum |X - \text{Med}|$ is a minimum.
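A quick numerical check of this property, using a small hypothetical data set with one extreme value:

```python
import statistics


def sum_abs_dev(values, about):
    """Sum of absolute deviations of the values about a chosen point."""
    return sum(abs(x - about) for x in values)


values = [2, 3, 5, 8, 30]            # hypothetical data, illustrative only
med = statistics.median(values)      # 5
avg = statistics.mean(values)        # 9.6

about_median = sum_abs_dev(values, med)   # 3 + 2 + 0 + 3 + 25 = 33
about_mean = sum_abs_dev(values, avg)

print(about_median, about_mean)  # the total about the median is the smaller
```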

Although the median is not as popular as the arithmetic mean, it does have the advantage of being both easy to determine and easy to explain.

As illustrated earlier, the median is affected by the number of observations rather than the values of the observations; hence it will be less distorted as a representative value than the arithmetic mean.

An additional advantage of the median is that it may be computed for an open end distribution.

The major disadvantage of the median is that it is a less familiar measure than the arithmetic mean. Moreover, since the median is a positional average, its value is not determined by each and every observation. Also, the median is not capable of algebraic treatment.

Activity C

For the following data, compute the median and interpret this value.

Monthly Rent (Rs.)    No. of persons paying the rent
Below 1000            6
1000-1200             9
1200-1400             11
1400-1600             14
1600-1800             20
1800-2000             15
2000-2200             10
2200-2400             8
2400 and above        7

QUANTILES

Quantiles are positional measures related to the median. They are useful and frequently employed measures of non-central location. The most familiar quantiles are the quartiles, deciles, and percentiles.

Quartiles: Quartiles are those values which divide the total data into four equal parts. Since three points divide the distribution into four equal parts, we shall have three quartiles. Let us call them Q1, Q2 and Q3. The first quartile, Q1, is the value such that 25% of the observations are smaller and 75% of the observations are larger. The second quartile, Q2, is the median, i.e. 50% of the observations are smaller and 50% are larger. The third quartile, Q3, is the value such that 75% of the observations are smaller and 25% of the observations are larger.

For grouped data, the following formulas are used for quartiles.

$$Q_1 = L + \frac{\frac{N}{4} - pcf}{f} \times i, \qquad Q_3 = L + \frac{\frac{3N}{4} - pcf}{f} \times i$$

where L is the lower limit of the quartile class, pcf is the cumulative frequency preceding the quartile class, f is the frequency of the quartile class, and i is the size of the quartile class.

Deciles: Deciles are those values which divide the total data into ten equal parts. Since nine points divide the distribution into ten equal parts, we shall have nine deciles denoted by D1, D2, ..., D9.

For grouped data, the following formula is used for deciles:

$$D_k = L + \frac{\frac{kN}{10} - pcf}{f} \times i, \qquad k = 1, 2, \ldots, 9$$

where the symbols have the usual meanings and interpretation.

Percentiles: Percentiles are those values which divide the total data into hundred equal parts. Since ninety-nine points divide the distribution into hundred equal parts, we shall have ninety-nine percentiles denoted by P1, P2, P3, ..., P99.

For grouped data, the following formula is used for percentiles.

$$P_k = L + \frac{\frac{kN}{100} - pcf}{f} \times i, \qquad k = 1, 2, \ldots, 99$$

To illustrate the computations of quartiles, deciles and percentiles, consider the following grouped data which relate to the profits of 100 companies during the year 1987-88.

To calculate Q1, Q2 (median), D6 and P90, we need the following table.

Profit (Rs. lakh)    No. of Companies    c.f.
20-30                4                   4
30-40                8                   12
40-50                18                  30
50-60                30                  60
60-70                15                  75
70-80                10                  85
80-90                8                   93
90-100               7                   100

Q1 = size of the N/4th observation = 100/4 = 25th observation, which lies in the class 40-50.

$$Q_1 = 40 + \frac{25 - 12}{18} \times 10 = 40 + 7.22 = 47.22$$

This value of Q1 suggests that 25% of the companies earn an annual profit of Rs. 47.22 lakh or less.

Q2 = size of the 2N/4th observation = 50th observation, which lies in the class 50-60.

$$Q_2 = 50 + \frac{50 - 30}{30} \times 10 = 50 + 6.67 = 56.67$$

This value of Q2 (or median) suggests that 50% of the companies earn an annual profit of Rs. 56.67 lakh or less and the remaining 50% of the companies earn an annual profit of Rs. 56.67 lakh or more.

D6 = size of the 6N/10th observation = 60th observation, which lies in the class 50-60.

$$D_6 = 50 + \frac{60 - 30}{30} \times 10 = 60$$

Thus 60% of the companies earn an annual profit of Rs. 60 lakh or less and 40% of the companies earn Rs. 60 lakh or more.

P90 = size of the 90N/100th observation = 90th observation, which lies in the class 80-90.

$$P_{90} = 80 + \frac{90 - 85}{8} \times 10 = 80 + 6.25 = 86.25$$

This value of the 90th percentile suggests that 90% of the companies earn an annual profit of Rs. 86.25 lakh or less and 10% of the companies earn Rs. 86.25 lakh or more.
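The interpolation formula L + ((kN/parts - pcf)/f) × i, applied to the profit table above, can be sketched as one Python function that serves for the median, quartiles, deciles and percentiles. The function name `grouped_quantile` and its `k`/`parts` parameters are illustrative choices, not notation from the text.

```python
# Profit table from the illustration: (lower limit, upper limit, frequency).
classes = [(20, 30, 4), (30, 40, 8), (40, 50, 18), (50, 60, 30),
           (60, 70, 15), (70, 80, 10), (80, 90, 8), (90, 100, 7)]


def grouped_quantile(classes, k, parts):
    """k-th quantile when the data are divided into `parts` equal parts
    (parts=4 for quartiles, 10 for deciles, 100 for percentiles)."""
    n = sum(f for _, _, f in classes)
    target = k * n / parts          # rank of the required observation
    pcf = 0                         # cumulative frequency before the class
    for lower, upper, freq in classes:
        if pcf + freq >= target:    # first class whose c.f. reaches the rank
            return lower + (target - pcf) / freq * (upper - lower)
        pcf += freq
    raise ValueError("k must not exceed parts")


q1 = grouped_quantile(classes, 1, 4)
q2 = grouped_quantile(classes, 2, 4)      # the median
d6 = grouped_quantile(classes, 6, 10)
p90 = grouped_quantile(classes, 90, 100)
print(round(q1, 2), round(q2, 2), d6, p90)  # 47.22 56.67 60.0 86.25
```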

LOCATING THE QUANTILES GRAPHICALLY

To locate the median graphically, draw a less than cumulative frequency curve (less than ogive). Take the variable on the X-axis and the frequency on the Y-axis. Locate the N/2th observation on the Y-axis, draw a horizontal line from it to the curve, and drop a perpendicular from that point to the X-axis. The point where the perpendicular meets the X-axis is the value of the median.

Similarly, we can locate graphically the other quantiles such as quartiles, deciles and percentiles.

For the data of the previous illustration, let us locate graphically the values of Q1, Q2, D6 and P90. The first step is to make a less than cumulative frequency curve, as shown in Figure 1.

To determine different quantiles graphically, horizontal lines are drawn from the cumulative relative frequency values. For example, if we want to determine the value of the median (or Q2), a horizontal line is drawn from the cumulative relative frequency value of 0.50 to the less than curve, and a vertical line is then dropped from that point to the horizontal axis. In a similar way, other values can be determined as shown in the graph. From the graph, we observe:

Q1 = 47.22, Q2 = 56.67, D6 = 60.0, P90 = 86.25

It may be noted that these graphical values of the quantiles are the same as those obtained by the formulas.
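Reading a value off the less than ogive amounts to linear interpolation between the plotted points. The following sketch, which assumes the ogive is drawn with straight segments between the upper class limits of the profit table, mimics the graphical procedure; the function name `read_from_ogive` is hypothetical.

```python
# Points of the less than ogive for the profit data:
# (upper class limit, cumulative frequency), starting at the lowest limit.
ogive = [(20, 0), (30, 4), (40, 12), (50, 30), (60, 60),
         (70, 75), (80, 85), (90, 93), (100, 100)]


def read_from_ogive(points, cum_freq):
    """Move horizontally from cum_freq on the Y-axis to the curve,
    then drop to the X-axis (linear interpolation on one segment)."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 <= cum_freq <= y1:
            return x0 + (cum_freq - y0) / (y1 - y0) * (x1 - x0)
    raise ValueError("cumulative frequency outside the range of the ogive")


median_from_graph = read_from_ogive(ogive, 50)   # N/2 = 50
q1_from_graph = read_from_ogive(ogive, 25)       # N/4 = 25
print(round(q1_from_graph, 2), round(median_from_graph, 2))
```

Because each segment of the ogive is a straight line, the value read from the graph coincides with the value given by the interpolation formula.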

Activity D

Given below is the wage distribution of 100 workers in a factory :

Wages (Rs.)       No. of workers
Below 1000        3
1000-1200         5
1200-1400         12
1400-1600         23
1600-1800         31
1800-2000         10
2000-2200         8
2200-2400         5
2400 and above    3

Draw a less than cumulative frequency curve (ogive) and use it to determine graphically the values of Q2, Q3, D60 and P80. Also verify your result by the corresponding mathematical formula.

MODE

The mode is the typical or commonly observed value in a set of data. It is defined as the value which occurs most often or with the greatest frequency. The dictionary meaning of the term mode is 'most usual'. For example, in the series of numbers 3, 4, 5, 5, 6, 7, 8, 8, 8, 9 the mode is 8 because it occurs the maximum number of times.

The calculations are different for grouped data, where the modal class is defined as the class with the maximum frequency. The following formula is used for calculating the mode:

$$\text{Mode} = L + \frac{d_1}{d_1 + d_2} \times i$$

where L is the lower limit of the modal class, d1 is the difference between the frequency of the modal class and the frequency of the preceding class, d2 is the difference between the frequency of the modal class and the frequency of the succeeding class, and i is the size of the modal class. To illustrate the computation of the mode, let us consider the following data.

Since the maximum frequency 35 is in the class 60 -70, therefore 60-70 is the modal class. Applying the formula, we get,

Hence the modal daily sales are Rs. 66.
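The mode formula L + (d1/(d1 + d2)) × i can be sketched in Python as below. The frequencies are hypothetical: they are chosen only so that the modal class is 60-70 with frequency 35, as in the text, and they should not be read as the original table.

```python
# Hypothetical daily sales distribution: (lower limit, upper limit, frequency).
classes = [(40, 50, 12), (50, 60, 20), (60, 70, 35), (70, 80, 25), (80, 90, 8)]


def grouped_mode(classes):
    """Mode = L + d1 / (d1 + d2) * i for the class of maximum frequency."""
    freqs = [f for _, _, f in classes]
    m = freqs.index(max(freqs))                 # position of the modal class
    lower, upper, f_modal = classes[m]
    f_prev = freqs[m - 1] if m > 0 else 0       # no preceding class: use 0
    f_next = freqs[m + 1] if m < len(freqs) - 1 else 0
    d1 = f_modal - f_prev
    d2 = f_modal - f_next
    return lower + d1 * (upper - lower) / (d1 + d2)


mode = grouped_mode(classes)
print(mode)  # 66.0, since d1 = 15 and d2 = 10
```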

LOCATING THE MODE GRAPHICALLY

In grouped data, the value of the mode can also be determined graphically. In the graphical method, the first step is to construct a histogram for the given data. The next step is to draw two straight lines diagonally on the inside of the modal class bar, starting from each upper corner of the bar to the upper corner of the adjacent bar. The last step is to draw a perpendicular line from the intersection of the two diagonal lines to the X-axis, which gives us the modal value.


Consider the following data to locate the value of mode graphically.

First draw the histogram as shown in Figure II.

[Figure II: Histogram of Monthly Salaries; X-axis: monthly salary in rupees]

The two straight lines are drawn diagonally on the inside of the modal class bar, and then a vertical line is drawn from the intersection of the two diagonal lines to the X-axis. Thus the modal value is approximately Rs. 2353. It may be noted that the value of the mode would be approximately the same if we used the algebraic method.

The chief advantage of the mode is that it is, by definition, the most representative value of the distribution. For example, when we talk of the modal size of shoe or garment, we have this average in mind. Like the median, the value of the mode is not affected by extreme values and its value can be determined in open-end distributions.

The main disadvantage of the mode is its indeterminate value, i.e., we cannot calculate its value precisely in a grouped data, but merely estimate it. When a given set of data have two or more than two values as maximum frequency, it is a case of bimodal or multimodal distribution and the value of mode cannot be determined. The mode has no useful mathematical properties. Hence, in actual practice the mode is more important as a conceptual idea than as a working average.

Activity E

Compute the value of the mode from the grouped data given below. Also check this value of the mode graphically.

Monthly Stipend (Rs.)    No. of Management Trainees
2500-2700                25
2700-2900                35
2900-3100                60
3100-3300                40
3300-3500                20
3500-3700                15
3700-3900                5

RELATIONSHIP AMONG MEAN, MEDIAN AND MODE

A distribution in which the mean, median and mode coincide is known as a symmetrical (bell shaped) distribution. If a distribution is skewed (that is, not symmetrical), then the mean, median, and mode are not equal. In a moderately skewed distribution, a very interesting relationship exists among the mean, median and mode. In such types of distributions, it can be shown that the distance between the mean and the median is approximately one third of the distance between the mean and the mode. This is shown below for two types of such distributions.

This relationship can be expressed as follows:

Mean - Median = 1/3 (Mean - Mode)

or Mode = 3 Median - 2 Mean

Similarly, we can express the approximate relationship for median in terms of mean and mode. Also this can be expressed for mean in terms of median and mode. Thus, if we know any of the two values of the averages, the third value of the average can be determined from this approximate relationship.

For example, consider a moderately skewed distribution in which the mean and median are 35.4 and 34.3 respectively. Calculate the value of the mode.

To compute the value of the mode, we use the approximate relationship

Mode = 3 Median - 2 Mean

= 3(34.3) - 2(35.4)

= 102.9 - 70.8 = 32.1

Therefore the value of the mode is 32.1.
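The approximate relationship can be written as two small Python helpers, one solving for the mode and one for the median. The function names are illustrative, and the results are only as good as the assumption of moderate skewness.

```python
def mode_from_mean_median(mean, median):
    """Empirical relation for moderately skewed data: Mode = 3 Median - 2 Mean."""
    return 3 * median - 2 * mean


def median_from_mean_mode(mean, mode):
    """The same relation rearranged: Median = (Mode + 2 Mean) / 3."""
    return (mode + 2 * mean) / 3


mode = mode_from_mean_median(mean=35.4, median=34.3)
print(round(mode, 1))  # 32.1

# Rearranging recovers the median we started from.
median_back = median_from_mean_mode(mean=35.4, mode=mode)
```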

GEOMETRIC MEAN

The geometric mean, like the arithmetic mean, is a calculated average. The geometric mean, GM, of a series of numbers X1, X2, ..., XN is defined as

$$GM = \sqrt[N]{X_1 \cdot X_2 \cdot X_3 \cdots X_N}$$

or the Nth root of the product of N observations.

When the number of observations is three or more, the task of computation becomes quite tedious. Therefore a transformation into logarithms is useful to simplify calculations. If we take logarithms of both sides, then the formula for GM becomes

$$\log GM = \frac{\log X_1 + \log X_2 + \cdots + \log X_N}{N} = \frac{\sum \log X}{N}, \qquad GM = \text{antilog}\left(\frac{\sum \log X}{N}\right)$$

For grouped data, the geometric mean is calculated with the following formula

$$GM = \text{antilog}\left(\frac{\sum f \log X}{N}\right)$$

where the notation has the usual meaning.

The geometric mean is specially useful in the construction of index numbers. It is the most suitable average when large weights have to be given to small values of observations and small weights to large values of observations. This average is also useful in measuring the growth of population.

The following example illustrates the use of the geometric mean and the computations involved.

A machine was purchased for Rs. 50,000 in 1984. Depreciation on the diminishing balance was charged @ 40% in the first year, 25% in the second year and 15% per annum during the next three years. What is the average depreciation charged during the whole period?

Since we are interested in finding the average rate of depreciation, the geometric mean will be the most appropriate average. Taking the value at the beginning of each year as Rs. 100, the diminishing values at the end of the five years are 60, 75, 85, 85 and 85, so that

$$GM = \sqrt[5]{60 \times 75 \times 85 \times 85 \times 85} \approx 77.32$$

The average diminishing value being Rs. 77.32 per Rs. 100, the average depreciation will be 100 - 77.32 = 22.68%.
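The depreciation example can be reproduced with the logarithmic form of the geometric mean. The sketch below uses natural logarithms from Python's `math` module instead of log tables; the list of diminishing values (60, 75, 85, 85, 85 per Rs. 100) follows from the depreciation rates stated in the example.

```python
import math


def geometric_mean(values):
    """Antilog of the mean of the logarithms (all values must be positive)."""
    return math.exp(sum(math.log(x) for x in values) / len(values))


# Value remaining at the end of each year per Rs. 100 at its start:
# 40% depreciation leaves 60, 25% leaves 75, 15% leaves 85 (three years).
remaining = [60, 75, 85, 85, 85]

gm = geometric_mean(remaining)
average_depreciation = 100 - gm
print(round(gm, 2), round(average_depreciation, 2))  # 77.32 22.68
```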

The geometric mean is very useful in averaging ratios and percentages. It also helps in determining the rates of increase and decrease. It is also capable of further algebraic treatment, so that a combined geometric mean can easily be computed.

However, compared to arithmetic mean, the geometric mean is more difficult to compute and interpret. Further, geometric mean cannot be computed if any observation has either a value zero or negative.


Activity F

Find the geometric mean for the following data:

HARMONIC MEAN

The harmonic mean is a measure of central tendency for data expressed as rates, such as kilometers per hour, tonnes per day, kilometers per litre, etc. The harmonic mean is defined as the reciprocal of the arithmetic mean of the reciprocals of the individual observations. If X1, X2, ..., XN are N observations, then the harmonic mean can be represented by the following formula.

$$HM = \frac{N}{\frac{1}{X_1} + \frac{1}{X_2} + \cdots + \frac{1}{X_N}} = \frac{N}{\sum \frac{1}{X}}$$

The harmonic mean is useful for computing the average rate of increase of profits, or the average speed at which a journey has been performed, or the average price at which an article has been sold. Otherwise its field of application is really restricted.

To explain the computational procedure, let us consider the following example.

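In a factory, a unit of work is completed by A in 4 minutes, by B in 5 minutes, by C in 6 minutes, by D in 10 minutes, and by E in 12 minutes. Find the average number of units of work completed per minute.

The calculations for computing the harmonic mean are given below:

$$HM = \frac{5}{\frac{1}{4} + \frac{1}{5} + \frac{1}{6} + \frac{1}{10} + \frac{1}{12}} = \frac{5}{0.8} = 6.25$$

Hence the average time taken per unit of work is 6.25 minutes, i.e., on average 1/6.25 = 0.16 units of work are completed per minute.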
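A short Python sketch of the harmonic mean of the five completion times in this example. The variable names are illustrative; the reciprocal of the result converts minutes per unit into units per minute.

```python
def harmonic_mean(values):
    """N divided by the sum of the reciprocals (all values must be non-zero)."""
    return len(values) / sum(1 / x for x in values)


minutes_per_unit = [4, 5, 6, 10, 12]    # workers A, B, C, D and E

hm = harmonic_mean(minutes_per_unit)    # average minutes per unit of work
units_per_minute = 1 / hm               # the same average expressed as a rate
print(round(hm, 2), round(units_per_minute, 2))  # 6.25 0.16
```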

The harmonic mean like arithmetic mean and geometric mean is computed from each and every observation. It is specially useful for averaging rates.

However, the harmonic mean cannot be computed when one or more observations have zero value or when there are both positive and negative observations. In dealing with business problems, the harmonic mean is rarely used.

Activity G

In a factory, four workers are assigned to complete an order received for dispatching 1400 boxes of a particular commodity. Worker A takes 4 minutes per box, B takes 6 minutes per box, C takes 10 minutes per box, and D takes 15 minutes per box. Find the average minutes taken per box by the group of workers.

KEY WORDS

ARITHMETIC MEAN is equal to the sum of the values divided by the number of values.

GEOMETRIC MEAN of N observations is the Nth root of the product of the given values.

HARMONIC MEAN of N observations is the reciprocal of the arithmetic mean of the reciprocals of the given values of N observations.

MEDIAN is that value of the variable which divides the distribution into two equal parts.

MODE is that value of the variable which occurs the maximum number of times.

QUANTILES are those values which divide the distribution into a fixed number of equal parts, e.g., quartiles divide the distribution into four equal parts.

SUMMARY

Measures of central tendency give one of the very important characteristics of data. Any one of the various measures of central tendency may be chosen as the most representative or typical measure. The arithmetic mean is widely used and understood as a measure of central tendency. The concepts of weighted arithmetic mean, geometric mean, and harmonic mean are useful for specific types of applications. The median is generally a more representative measure for open-end distributions and highly skewed distributions. The mode should be used when the most demanded or customary value is needed.

SELF - ASSESSMENT EXERCISES

1. List the various measures of central tendency studied in this unit and explain the difference between them.

2. Discuss the mathematical properties of the arithmetic mean and the median.

3. Review the advantages and disadvantages of each of the measures of central tendency.

4. Explain how you will decide which average to use in a particular problem.

5. What are quantiles? Explain and illustrate the concepts of quartiles, deciles and percentiles.

6. Following is the cumulative frequency distribution of preferred length of study table obtained from the preference study of 50 students.

Length              No. of Students
More than 50 cm     50
More than 60 cm     46
More than 70 cm     40
More than 80 cm     12
More than 90 cm     25
More than 100 cm    18
More than 110 cm    7

A manufacturer has to take a decision on the length of study table to manufacture. What length would you recommend and why?

7. A three-month study of the phone calls received by a small company yielded the following information.

No. of Calls Per Day    No. of Days
100-200                 3
200-300                 7
300-400                 11
400-500                 13
500-600                 27
600-700                 10
700-800                 9
800-900                 8
900-1000                4

Compute the arithmetic mean, median and mode.

8. From the following distribution of the travel time to work of a firm's employee over 213 days, find the modal travel time.

Travel Time (in minutes)    No. of Days
Less than 90                213
Less than 70                210
Less than 60                195
Less than 50                156
Less than 40                85
Less than 30                50
Less than 20                18
Less than 10                2

9. The mean monthly salary paid to all employees in a company is Rs. 1600. The mean monthly salaries paid to technical and non-technical employees are Rs. 1800 and Rs. 1200 respectively. Determine the percentage of technical and non-technical employees of the company.


10. The following distribution is with regard to the weight (in grams) of apples of a given variety. If an apple of less than 122 grams is to be considered unsuitable for export, what is the percentage of total apples suitable for export?

Weight (in grams)    No. of apples
100-110              10
110-120              20
120-130              40
130-140
140-150              35
150-160              15
160-170              5

Draw an ogive of the more than type and deduce how many apples will be more than 122 grams.

11. The Geometric mean of 10 observations on a certain variable was calculated to be 16.2. It was later discovered that one of the observations was wrongly recorded as 10.9 when in fact it was 21.9. Apply appropriate correction and calculate the correct geometric mean.

12. An incomplete distribution of daily sales (Rs. thousand) is given below. The data relate to 299 days.

Daily Sales (Rs. thousand)    No. of Days
10-20                         12
20-30                         30
30-40                         ?
40-50
50-60                         ?
60-70                         25
70-80                         18

You are told that the median value is 46. Using the median formula, fill in the missing frequencies and calculate the arithmetic mean of the completed data.

13. The following table shows the income distribution of a company.

Income (Rs)    No. of Employees
1200-1400      8
1400-1600      12
1600-1800      20
1800-2000      30
2000-2200      40
2200-2400      35
2400-2600      18
2600-2800      7
2800-3000      6
3000-3200      4

Determine (i) the mean income, (ii) the median income, (iii) the mode, (iv) the income limits for the middle 50% of the employees, (v) D7, the seventh decile, and (vi) P80, the eightieth percentile.

FURTHER READINGS

Clark, T.C. and E.W. Jordan, 1985. Introduction to Business and Economic Statistics, South-Western Publishing Co.

Enns, P.G., 1985. Business Statistics, Richard D. Irwin: Homewood.

Gupta, S.P. and M.P. Gupta, 1988. Business Statistics, Sultan Chand & Sons: New Delhi.

Moskowitz, H. and G.P. Wright, 1985. Statistics for Management and Economics, Charles E. Merrill Publishing Company.

 


  Chapter 9 : MEASURES OF VARIATION AND SKEWNESS

   

  STRUCTURE

Introduction
Significance of measuring variation
Properties of a good measure of variation
Absolute and relative measures of variation
Range
Quartile deviation
Average deviation
Standard deviation
Coefficient of variation
Skewness
Relative skewness
Summary
Key words
Self-assessment exercises
Further readings

INTRODUCTION

In the previous unit, we were concerned with various measures that are used to provide a single representative value of a given set of data. This single value alone cannot adequately describe a set of data. Therefore, in this unit, we shall study two more important characteristics of a distribution. First we shall discuss the concept of variation and later the concept of skewness.

A measure of variation (or dispersion) describes the spread or scattering of the individual values around the central value. To illustrate the concept of variation, let us consider the data given below.

Since the average sales for firms A, B and C are the same, we might be tempted to conclude that the distribution pattern of the sales is similar. It may be observed, however, that in firm A daily sales are the same irrespective of the day, whereas there is a small amount of variation in the daily sales for firm B and a much greater amount of variation in the daily sales for firm C.

Therefore, different sets of data may have the same measure of central tendency but differ greatly in terms of variation.

SIGNIFICANCE OF MEASURING VARIATION

Measuring variation is significant for some of the following purposes.

i. Measuring variability determines the reliability of an average by pointing out how far an average is representative of the entire data.

ii. Another purpose of measuring variability is to determine the nature and cause of variation in order to control the variation itself.

iii. Measures of variation enable comparisons of two or more distributions with regard to their variability.

iv. Measuring variability is of great importance to advanced statistical analysis. For example, sampling or statistical inference is essentially a problem in measuring variability.

PROPERTIES OF A GOOD MEASURE OF VARIATION

A good measure of variation should possess, as far as possible, the same properties as those of a good measure of central tendency.

Following are some of the well known measures of variation which provide a numerical index of the variability of the given data.

i. Range

ii. Average or Mean Deviation

iii. Quartile Deviation or Semi-Interquartile Range

iv. Standard Deviation

ABSOLUTE AND RELATIVE MEASURES OF VARIATION

Measures of variation may be either absolute or relative. Measures of absolute variation are expressed in the units of the original data. If two sets of data are expressed in different units of measurement, their absolute measures of variation are not comparable. In such cases, measures of relative variation should be used. Measures of relative variation are also used to compare two sets of data that have the same unit of measurement but different means. We shall now consider in turn each of the four measures of variation.

RANGE

The range is defined as the difference between the highest (numerically largest) value and the lowest (numerically smallest) value in a set of data. In symbols, this may be indicated as:

R = H - L

Where R = Range; H = Highest Value; L = Lowest Value

As an illustration consider the daily sales data for the three firms as given earlier.

For firm A, R = H - L = 5000 - 5000 = 0

For firm B, R = H - L = 5140 - 4835 = 305

For firm C, R = H - L = 13000 - 1800 = 11200

The interpretation of the value of the range is very simple. In this example, the variation is nil for the daily sales of firm A, small for firm B and very large for firm C.

The range is very easy to calculate and it gives us some idea about the variability of the data. However, the range is a crude measure of variation, since it uses only two extreme values.

The concept of range is extensively used in statistical quality control. Range is helpful in studying the variations in the prices of shares and debentures and other commodities that are very sensitive to price changes from one period to another. For meteorological departments, the range is a good indicator for weather forecast.

For grouped data, the range may be approximated as the difference between the upper limit of the largest class and the lower limit of the smallest class.

The relative measure corresponding to range, called the coefficient of range, is obtained by applying the following formula.

$$\text{Coefficient of range} = \frac{H - L}{H + L}$$
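As a minimal sketch of the range and its relative counterpart, the snippet below uses the highest and lowest daily sales quoted above for the three firms. It assumes the usual definition of the coefficient of range, (H - L) / (H + L); the function names are illustrative only.

```python
def value_range(highest, lowest):
    """Range R = H - L."""
    return highest - lowest


def coefficient_of_range(highest, lowest):
    """Relative range, assuming the usual definition (H - L) / (H + L)."""
    return (highest - lowest) / (highest + lowest)


# Highest and lowest daily sales quoted in the text for firms A, B and C.
extremes = {"A": (5000, 5000), "B": (5140, 4835), "C": (13000, 1800)}

for firm, (high, low) in extremes.items():
    print(firm, value_range(high, low), round(coefficient_of_range(high, low), 4))
```

Because the coefficient is a pure ratio, it can be compared across data sets measured in different units, which the range itself cannot.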

Activity A

Following are the prices of shares of a company from Monday to Friday

Day : Monday Tuesday Wednesday Thursday Friday

Price : 670 678 750 705 720

Compute the value of the range and interpret it.

Activity B

Calculate the coefficient of range from the following data :

QUARTILE DEVIATION

The quartile deviation, also known as the semi-interquartile range, is computed as half the difference between the third quartile and the first quartile. In symbols, this can be written as:

$$Q.D. = \frac{Q_3 - Q_1}{2}$$

where $Q_1$ = first quartile and $Q_3$ = third quartile.

The following illustration would clarify the procedure involved. For the data given below, compute the quartile deviation.

To compute the quartile deviation, we need the values of the first quartile and the third quartile, which can be obtained from the following table.

The relative measure corresponding to quartile deviation, called the coefficient of quartile deviation, is calculated as given below.

$$\text{Coefficient of Q.D.} = \frac{Q_3 - Q_1}{Q_3 + Q_1}$$

The quartile deviation is superior to the range as it is not based on two extreme values but rather on the middle 50% of the observations. Another advantage of the quartile deviation is that it is the only measure of variability which can be used for open-end distributions.

The disadvantage of the quartile deviation is that it ignores the first and the last 25% of the observations.
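The following sketch illustrates the quartile deviation and its coefficient for ungrouped data. The observations are hypothetical, and the quartiles come from Python's `statistics.quantiles` with the "inclusive" method, which is only one of several quartile conventions and differs from the grouped-data interpolation used in this unit.

```python
import statistics


def quartile_deviation(values):
    """Return (Q.D., coefficient of Q.D.) for ungrouped data."""
    # statistics.quantiles with n=4 returns the three quartiles Q1, Q2, Q3.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    qd = (q3 - q1) / 2
    coefficient = (q3 - q1) / (q3 + q1)
    return qd, coefficient


# Hypothetical observations, not taken from the text.
sample = [10, 20, 30, 40, 50, 60, 70]
qd, coeff = quartile_deviation(sample)
print(qd, coeff)

# Replacing the largest value by an extreme one leaves Q1 and Q3 unchanged here.
qd_extreme, _ = quartile_deviation([10, 20, 30, 40, 50, 60, 700])
print(qd_extreme)
```

The second call shows the property discussed above: an extreme observation in the tail does not affect a measure built on the middle 50% of the data.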

Activity C

A survey of domestic consumption of electricity gave the following distribution of the units consumed. Compute the quartile deviation and its coefficient.

AVERAGE DEVIATION

The measure of average (or mean) deviation is an improvement over the previous two measures in that it considers all observations in the given set of data. This measure is computed as the mean of the deviations from the mean or the median, with all deviations treated as positive regardless of sign. In symbols, this can be represented by:

$$A.D. = \frac{\sum |X - \bar{X}|}{N} \quad \text{or} \quad A.D. = \frac{\sum |X - \text{Median}|}{N}$$

Theoretically speaking, there is an advantage in taking the deviations from the median because the sum of the absolute deviations (i.e. ignoring ± signs) from the median is a minimum. In actual practice, however, the arithmetic mean is more popularly used in the computation of the average deviation.

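For grouped data, the formula to be used is given as:

$$A.D. = \frac{\sum f\,|X - \bar{X}|}{N}, \qquad N = \sum f$$

where f is the class frequency and X is the class mid-point.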

As an illustration, consider the following grouped data which relate to the sales of 100 companies.

To compute average deviation, we construct the following table :

The relative measure corresponding to the average deviation, called the coefficient of average deviation, is obtained by dividing the average deviation by the particular average used in computing it. Thus, if the average deviation has been computed from the median, the coefficient of average deviation is obtained by dividing the average deviation by the median.

Although the average deviation is a good measure of variability, its use is limited. If one desires only to measure and compare variability among several sets of data, the average deviation may be used.

The major disadvantage of the average deviation is its lack of mathematical properties. In particular, ignoring the signs of the deviations in its calculation makes it unsuitable for further algebraic treatment.
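The average deviation can be sketched as follows for both ungrouped and grouped data. The data and function names are hypothetical; the grouped version assumes each class is represented by its mid-point.

```python
import statistics


def average_deviation(values, about="mean"):
    """Mean of absolute deviations from the mean or the median."""
    centre = statistics.mean(values) if about == "mean" else statistics.median(values)
    return sum(abs(x - centre) for x in values) / len(values)


def grouped_average_deviation(midpoints, frequencies):
    """Average deviation about the mean for grouped data (class mid-points)."""
    n = sum(frequencies)
    mean = sum(f * x for x, f in zip(midpoints, frequencies)) / n
    return sum(f * abs(x - mean) for x, f in zip(midpoints, frequencies)) / n


# Hypothetical data, not taken from the text.
data = [2, 4, 6, 8, 30]
ad_mean = average_deviation(data, about="mean")
ad_median = average_deviation(data, about="median")
print(ad_mean, ad_median)
print(grouped_average_deviation([10, 20, 30], [1, 2, 1]))
```

In this example the deviation about the median is the smaller of the two, in line with the minimum property of the median mentioned above.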

ACTIVITY D

STANDARD DEVIATION

The standard deviation is the most widely used and important measure of variation. In computing the average deviation, the signs are ignored. The standard deviation overcomes this problem by squaring the deviations, which makes them all positive. The standard deviation, also known as the root mean square deviation, is generally denoted by the lower case Greek letter σ (read as sigma). In symbols, this can be expressed as

$$\sigma = \sqrt{\frac{\sum (X - \bar{X})^2}{N}}$$

The square of the standard deviation is called the variance. Therefore

Variance = $\sigma^2$

The standard deviation and variance become larger as the variability, or spread, within the data becomes greater. More importantly, a standard deviation is readily comparable with other standard deviations: the greater the standard deviation, the greater the variability.

For grouped data, the formula is

$$\sigma = \sqrt{\frac{\sum f\,(X - \bar{X})^2}{N}}, \qquad N = \sum f$$


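The following formulas for the standard deviation are mathematically equivalent to the above formula and are often more convenient to use in calculations.

$$\sigma = \sqrt{\frac{\sum f X^2}{N} - \left(\frac{\sum f X}{N}\right)^2}$$

$$\sigma = i \sqrt{\frac{\sum f d^2}{N} - \left(\frac{\sum f d}{N}\right)^2}, \qquad d = \frac{X - A}{i}$$

where A is an assumed mean and i is the class width.

Remarks: If the data represent a sample of size N from a population, it can be shown that a better estimate of the population standard deviation is obtained when the sum of the squared deviations is divided by (N - 1) instead of by N. However, for large sample sizes, there is very little difference between using (N - 1) and N in computing the standard deviation.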

To understand the formula for grouped data, consider the following data which relate to the profits of 100 companies.

To compute standard deviation we construct the following table.

The standard deviation is commonly used to measure variability, while all other measures have rather special uses. In addition, it is the only measure possessing the necessary mathematical properties to make it useful for advanced statistical work.
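As an illustrative sketch with hypothetical data, the snippet below computes the standard deviation from its definition, checks it against the "mean of squares minus square of the mean" form, and compares the divisor N with the divisor (N - 1) using Python's `statistics` module.

```python
import math
import statistics

# Hypothetical observations, not taken from the text.
data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)
mean = sum(data) / n

# Definitional form: root of the mean squared deviation (divisor N).
sigma = math.sqrt(sum((x - mean) ** 2 for x in data) / n)

# Equivalent computational form: mean of squares minus square of the mean.
sigma_alt = math.sqrt(sum(x * x for x in data) / n - mean ** 2)

# Sample version with divisor (N - 1), as mentioned in the remark.
sample_sd = statistics.stdev(data)

print(sigma, sigma_alt, statistics.pstdev(data), sample_sd)
```

`statistics.pstdev` uses the divisor N and `statistics.stdev` the divisor (N - 1), so the sample figure is always slightly larger for the same data.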

ACTIVITY E

The following data show the daily sales at a petrol station. Calculate the mean and standard deviation.

COEFFICIENT OF VARIATION

A frequently used relative measure of variation is the coefficient of variation, denoted by C.V. This measure is simply the ratio of the standard deviation to the mean, expressed as a percentage.

$$C.V. = \frac{\sigma}{\bar{X}} \times 100$$

The smaller the coefficient of variation of a set of data, the less variable or more consistent the data are said to be.

Consider the following data which relate to the mean daily sales and standard deviation for four regions.

To determine which region is most consistent in terms of daily sales, we shall compute the coefficients of variation. You may notice that the mean daily sales are not equal for each region.

Since the coefficient of variation is smallest for Region 1, Region 1 is the most consistent region.
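The comparison can be sketched in a few lines. The region names and figures below are hypothetical and are not the values from the table referred to in the text; they only show how a region with the largest standard deviation can still be the most consistent once the mean is taken into account.

```python
def coefficient_of_variation(mean, std_dev):
    """C.V. = (standard deviation / mean) * 100."""
    return std_dev / mean * 100


# Hypothetical (mean daily sales, standard deviation) pairs, not from the text.
regions = {"North": (80, 12), "South": (120, 15), "East": (50, 10)}

cv = {name: coefficient_of_variation(m, s) for name, (m, s) in regions.items()}
most_consistent = min(cv, key=cv.get)
print(cv, most_consistent)
```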

Activity F

A factory produces two types of electric lamps A and B. In an experiment relating to their life, the following results were obtained.

Compare the variability of the life of the two types of electric lamps using the coefficient of variation.

SKEWNESS

The measures of central tendency and variation do not reveal all the characteristics of a given set of data. For example, two distributions may have the same mean and standard deviation but may differ widely in the shape of their distribution. Either the distribution of data is symmetrical or it is not. If the distribution of data is not symmetrical, it is called asymmetrical or skewed. Thus skewness refers to the lack of symmetry in a distribution.

A simple method of detecting the direction of skewness is to consider the tails of the distribution (figure 1). The rules are:

Data are symmetrical when there are no extreme values in a particular direction, so that low and high values balance each other. In this case, mean = median = mode (see Fig. 1(a)).

If the longer tail is towards the lower values or left-hand side, the skewness is negative. Negative skewness arises when the mean is decreased by some extremely low values, thus making mean < median < mode (see Fig. 1(b)).

If the longer tail of the distribution is towards the higher values or right-hand side, the skewness is positive. Positive skewness occurs when the mean is increased by some unusually high values, thereby making mean > median > mode (see Fig. 1(c)).

Figure 1: (a) Symmetrical distribution; (b) Negatively skewed distribution; (c) Positively skewed distribution

RELATIVE SKEWNESS

In order to make comparisons between the skewness in two or more distributions, the coefficient of skewness (given by Karl Pearson) can be defined as

$$SK = \frac{\text{Mean} - \text{Mode}}{\sigma}$$

If the mode cannot be determined, then using the approximate relationship Mode = 3 Median - 2 Mean, the above formula reduces to

$$SK = \frac{3\,(\text{Mean} - \text{Median})}{\sigma}$$

If the value of this coefficient is zero, the distribution is symmetrical; if the value is positive, the distribution is positively skewed; and if the value is negative, the distribution is negatively skewed. In practice, the value of this coefficient usually lies between -1 and +1.

When we are given open-end distributions, data with extreme values, or only positional measures such as the median and quartiles, the following formula for the coefficient of skewness (given by Bowley) is more appropriate.

$$SK = \frac{Q_3 + Q_1 - 2\,\text{Median}}{Q_3 - Q_1}$$

Again, if the value of this coefficient is zero, the distribution is symmetrical. A positive value indicates a positively skewed distribution and a negative value a negatively skewed distribution.
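The two coefficients of skewness can be sketched as below. The sketch assumes the standard forms (Mean - Mode) / σ, 3 (Mean - Median) / σ and (Q3 + Q1 - 2 Median) / (Q3 - Q1); the summary statistics passed in are hypothetical.

```python
def pearson_skewness(mean, mode, std_dev):
    """Karl Pearson's coefficient: (mean - mode) / standard deviation."""
    return (mean - mode) / std_dev


def pearson_skewness_median(mean, median, std_dev):
    """Variant used when the mode is ill-defined: 3 (mean - median) / sd."""
    return 3 * (mean - median) / std_dev


def bowley_skewness(q1, median, q3):
    """Bowley's coefficient: (Q3 + Q1 - 2 median) / (Q3 - Q1)."""
    return (q3 + q1 - 2 * median) / (q3 - q1)


# Hypothetical summary statistics, not taken from the text.
print(pearson_skewness(mean=50, mode=53, std_dev=10))           # negative: skewed left
print(pearson_skewness_median(mean=50, median=48, std_dev=10))  # positive: skewed right
print(bowley_skewness(q1=20, median=32, q3=40))                 # negative: skewed left
```

Bowley's version needs only the quartiles and the median, which is why it remains usable for open-end distributions where the mean and standard deviation cannot be computed reliably.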

To explain the concept of coefficient of skewness, let us consider the following data.

Since the given distribution is not open-ended and the mode can be determined, it is appropriate to apply Karl Pearson's formula as given below.

This value of the coefficient of skewness indicates that the distribution is negatively skewed and hence there is a greater concentration towards the higher profits.

The application of Bowley's method would be clear by considering the following data.

This value of coefficient of skewness indicates that the distribution is slightly skewed to the left and therefore there is a greater concentration of the sales at the higher values than the lower values of the distribution.

SUMMARY

In this unit, we have shown how the concepts of measures of variation and skewness are important. The measures of variation considered were the range, average deviation, quartile deviation and standard deviation. The concept of the coefficient of variation was used to compare the relative variation of different data. Skewness was used in relation to lack of symmetry.

KEY WORDS

AVERAGE DEVIATION is the arithmetic mean of the absolute deviations from the mean or the median.

COEFFICIENT OF VARIATION is a ratio of standard deviation to mean expressed as percentage.


INTERQUARTILE RANGE considers the spread in the middle 50% (Q3 - Q1) of the data.

QUARTILE DEVIATION is one half the distance between first and third quartiles.

RANGE is the difference between the largest and the smallest value in a set of data.

RELATIVE VARIATION is used to compare two or more distributions by relating the variation of one distribution to the variation of the other.

SKEWNESS refers to the lack of symmetry.

STANDARD DEVIATION is the root mean square deviation of a given set of data.

VARIANCE is the square of standard deviation and is defined as the arithmetic mean of the squared deviations from the mean.

SELF-ASSESSMENT EXERCISES

1. Discuss the importance of measuring variability for managerial decision making.

2. Review the advantages and disadvantages of each of the measures of variation.

3. What is the concept of relative variation? What problem situations call for the use of relative variation in their solution?

4. Distinguish between Karl Pearson's and Bowley's coefficients of skewness. Which one of these would you prefer and why?

5. Compute the range and the quartile deviation of the following data.

6. Compute the average deviation for the following data.

7. Calculate the mean, standard deviation and variance for the following data.

8. Records were kept on three employees who wrapped packages on sweet boxes during the Diwali holidays in a big sweet house. The study yielded the following data.

i. Which package wrapper was most productive?

ii. Which employee was the most consistent?

iii. What measure did you choose to answer part (ii) and why?

9. The following data relate to the mileage of two types of tyre.

i. Which of the two types gives a higher average life?

ii. If prices are the same for both the types, which would you prefer and why?

10. The following table gives the distribution of daily travelling allowance to salesmen in a company:

Compute Karl Pearson's coefficient of skewness and comment on its value.

11. Calculate Bowley's coefficient of skewness from the following data:

12. You are given the following information before and after the settlement of a workers' strike.

Assuming that the increase in wage is a loss to the management, comment on the gains and losses from the view of workers and that of management.

FURTHER READINGS

1. Clark, T.C. and E.W. Jordan, 1985. Introduction to Business and Economic Statistics, South-Western Publishing Co.

2. Enns, P.G., 1985. Business Statistics, Richard D. Irwin Inc.: Homewood.

3. Gupta, S.P. and M.P. Gupta, 1988. Business Statistics, Sultan Chand & Sons: New Delhi.

4. Moskowitz, H. and G.P. Wright, 1985. Statistics for Management and Economics, Charles E. Merrill Publishing Company.

 


  Chapter 10 : CORRELATION THEORY

   

  GENERAL NOTES

There are various methods for measuring the relationships existing between economic variables. The simplest are correlation analysis and regression analysis. We shall start with correlation analysis because, although it has serious limitations and throws little light on the nature of the relationship existing between variables, it will make the student familiar with the correlation coefficient, which is an essential statistic of regression analysis. Correlation may be defined as the degree of relationship existing between two or more variables. The degree of relationship existing between two variables is called simple correlation. The degree of relationship connecting three or more variables is called multiple correlation. In this chapter we shall examine only simple correlation, postponing the discussion of multiple correlation until a later chapter, after the examination of regression analysis. (Actually the multiple correlation coefficient cannot be interpreted without reference to the multiple regression analysis.)

Correlation may be linear, when all points (X, Y) on a scatter diagram seem to cluster near a straight line, or nonlinear, when all points seem to lie near a curve.

Two variables may have a positive correlation, a negative correlation, or they may be uncorrelated. This holds both for linear and nonlinear correlation.

Positive correlation. Two variables are said to be positively correlated if they tend to change together in the same direction, that is, if they tend to increase or decrease together. Such positive correlation is postulated by economic theory for the quantity of a commodity supplied and its price. When the price increases the quantity supplied increases, and conversely, when the price falls the quantity supplied decreases. The scatter diagram of two positively correlated variables appears in figure 3.1. All points in the scatter diagram seem to lie near a line or a curve with a positive slope. If all points lie on the line (or curve) the correlation is said to be perfect positive.

Negative correlation. Two variables are said to be negatively correlated if they tend to change in opposite directions: when X increases Y decreases, and vice versa. For example, the quantity of a commodity demanded and its price are negatively correlated. When the price increases, demand for the commodity decreases, and when the price falls demand increases. The scatter diagram appears in figure 3.2: the points cluster around a line (or curve) with a negative slope. If all points lie on the line (or curve) the correlation is said to be perfect negative.

No correlation, or zero correlation. Two variables are uncorrelated when they tend to change with no connection to each other. The scatter diagram will appear as in figure 3.3: the points are dispersed all over the surface of the XY plane. For example, one should expect zero correlation between the height of the inhabitants of a country and the production of steel, or between the weight of students and the colour of their hair.

Figure 3.3: Zero correlation

MEASURE OF LINEAR CORRELATION: THE POPULATION CORRELATION COEFFICIENT ρ AND ITS SAMPLE ESTIMATE r

In the light of the above discussion it appears that we can determine the kind of correlation between two variables by direct observation of the scatter diagram. In addition, the scatter diagram indicates the strength of the relationship between the two variables. If the points lie close to the line, the correlation is strong. On the other hand, a greater dispersion of points about the line implies weaker correlation. Yet inspection of a scatter diagram gives only a rough idea of the relationship between variables X and Y. For a precise quantitative measurement of the degree of correlation between Y and X we use a parameter which is called the correlation coefficient and is usually designated by the Greek letter ρ, having as subscripts the variables whose correlation it measures. ρ refers to the correlation of all the values of the population of X and Y. Its estimate from any particular sample (the sample statistic for correlation) is denoted by r with the relevant subscripts. For example, if we measure the correlation between X and Y, the population correlation coefficient is represented by $\rho_{xy}$ and its sample estimate by $r_{xy}$. We will establish that the sample correlation coefficient is defined by the formula

$$r_{xy} = \frac{\sum x_i y_i}{\sqrt{\sum x_i^2}\,\sqrt{\sum y_i^2}} \qquad (3.1)$$

where $x_i = X_i - \bar{X}$ and $y_i = Y_i - \bar{Y}$ (throughout this book lowercase letters will denote deviations from the mean of the variables and capital letters the observed values, unless otherwise stated).

We will use a simple example from the theory of supply. Economic theory suggests that the quantity of a commodity supplied in the market depends on its price, ceteris paribus. When the price increases the quantity supplied increases and, vice versa, when the market price falls producers offer smaller quantities of their commodity for sale. In other words, economic theory postulates that price (X) and quantity supplied (Y) are positively correlated.

Table 3.1.

Our problem is to define a measure with which we will determine the correlation between price X and quantity supplied Y. Our first task is to gather observations of prices and quantities supplied during a given time period. A set of hypothetical observations appears in table 3.1.

By plotting the above observations on a rectangular co-ordinate system, we get the scatter diagram of figure 3.4.

Figure 3.4

Each point of the scatter diagram represents a price-quantity pair in a given period. For example, point z represents the pair (X5, Y5), that is, the price, which is 10 shillings, and the quantity supplied, which is 50 tons, during the 5th period. Looking at the diagram we see that the points tend to cluster around a line with a positive slope. This suggests that there exists a positive linear correlation between price and quantity. In order to find the exact measure of correlation we work as follows.

1. We compute the mean value of the variables

$$\bar{X} = \frac{\sum X_i}{n}, \qquad \bar{Y} = \frac{\sum Y_i}{n}$$

2. We draw the perpendiculars X'X' and Y'Y' from the means $\bar{X}$ and $\bar{Y}$, thus dividing the area of the rectangular co-ordinate system into four quadrants: I, II, III and IV (figure 3.5).

3. We next take the deviation of each value of X and Y from their mean and denote this difference by lower case letters

$$x_i = X_i - \bar{X}, \qquad y_i = Y_i - \bar{Y}$$

Examining the deviations of the values of the variables X and Y from their means, we observe that their products can provide a measure of the correlation between the variables X and Y.

It is necessary to start correlation analysis by plotting the sample observations on a scatter diagram in order to see whether the relationship is linear or nonlinear, because if a relationship between X and Y exists but is nonlinear, the formulae which will be developed in the present chapter break down (see figure 3.5 below).

Figure 3.5

a. In quadrants II and IV the product $(X_i - \bar{X})(Y_i - \bar{Y}) = x_i y_i$ is positive, because both deviations $x_i$ and $y_i$ have the same sign, both being either positive or negative.

b. In quadrants I and III the product $(X_i - \bar{X})(Y_i - \bar{Y}) = x_i y_i$ is negative, because the deviations $x_i$ have the opposite sign to the deviations $y_i$ lying in the same quadrant.

Thus if most observations fall in quadrants II and IV, the correlation between Y and X is positive. If on the other hand most of the quantity-price pairs fall in quadrants I and III, the correlation between Y and X will be negative. If the observations are scattered at random all over the four quadrants, the positive and negative products $x_i y_i$ will tend to cancel each other out, and the sum of products will tend to approach zero. If the sum of all products of the deviations of variables X and Y from their means is positive, the correlation between X and Y will be positive, while if the sum of the products of deviations is negative, the correlation between X and Y will be negative. Symbolically,

if $\sum x_i y_i > 0$, the correlation between X and Y is positive;

if $\sum x_i y_i < 0$, the correlation between X and Y is negative.

Thus the sum of the products of the deviations $\sum x_i y_i$ provides a measure of the association between X and Y. However, this measure has two basic defects. Firstly, it is affected by the number of observations. The greater the number of observations, the greater the number of products will be, and therefore the value of the sum $\sum x_i y_i$ will be different. Thus if X and Y are positively related, an increase in the number of observations would make the correlation appear stronger, without this being necessarily true. Secondly, the sum $\sum x_i y_i$ is affected by the units of measurement of the variables X and Y. For example, the correlation in the above case would appear higher if the supply was measured in kilograms and the price in pence, although the observations would be exactly the same.

To correct the first defect we divide the sum $\sum x_i y_i$ by the number of observations n:

$$S_{xy} = \frac{\sum x_i y_i}{n}$$

This expression is the covariance of X and Y and is obviously a better measure of correlation than the simple sum $\sum x_i y_i$, because it will not change directly with the number of observations in the sample. However, it still has the defect of being influenced by the units in which the variables X and Y are measured. To correct this defect we divide the covariance by the standard deviations of the variables, which are measured in the same units as the variables themselves, so that the ratio becomes a pure number, independent of any change in the units of measurement of X and Y. The resulting ratio is the sample correlation coefficient r, which is an estimate of the population correlation coefficient ρ.

$$r_{xy} = \frac{S_{xy}}{S_x S_y} \qquad (3.2)$$

Substituting the values of $S_{xy} = \sum x_i y_i / n$, $S_x = \sqrt{\sum x_i^2 / n}$ and $S_y = \sqrt{\sum y_i^2 / n}$ in expression (3.2) we find

$$r_{xy} = \frac{\sum x_i y_i}{\sqrt{\sum x_i^2}\,\sqrt{\sum y_i^2}} \qquad (3.3)$$

This formula is expressed in deviations of the variables from their means. If we want to use the actual values of the observations we use the following form:

$$r_{xy} = \frac{n \sum X_i Y_i - \sum X_i \sum Y_i}{\sqrt{n \sum X_i^2 - \left(\sum X_i\right)^2}\,\sqrt{n \sum Y_i^2 - \left(\sum Y_i\right)^2}} \qquad (3.4)$$
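The effect of the units of measurement can be illustrated with a small sketch. The price and quantity figures are hypothetical (they are not the observations of table 3.1), and the helper names are illustrative. Rescaling one variable changes the covariance by the same factor but leaves r unchanged.

```python
import math


def sum_of_products(xs, ys):
    """Sum of products of deviations from the means."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys))


def correlation(xs, ys):
    """r = sum(xy) / (sqrt(sum(x^2)) * sqrt(sum(y^2))), in deviation form."""
    return sum_of_products(xs, ys) / math.sqrt(
        sum_of_products(xs, xs) * sum_of_products(ys, ys)
    )


# Hypothetical price (X) and quantity (Y) observations, not Table 3.1.
price = [2, 4, 6, 8, 10]
quantity = [10, 25, 30, 35, 50]

cov = sum_of_products(price, quantity) / len(price)
# Same observations with quantity expressed in units 1000 times smaller.
quantity_rescaled = [q * 1000 for q in quantity]
cov_rescaled = sum_of_products(price, quantity_rescaled) / len(price)

print(cov, cov_rescaled)
print(correlation(price, quantity), correlation(price, quantity_rescaled))
```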

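This formula is derived from (3.3) through the following transformations:

Given $x_i = X_i - \bar{X}$ and $y_i = Y_i - \bar{Y}$:

1. The numerator can be expanded as follows

$$\sum x_i y_i = \sum (X_i - \bar{X})(Y_i - \bar{Y}) = \sum X_i Y_i - \bar{X} \sum Y_i - \bar{Y} \sum X_i + n \bar{X} \bar{Y} = \frac{1}{n}\left[ n \sum X_i Y_i - \sum X_i \sum Y_i \right] \qquad (3.5)$$

2. From the denominator of expression (3.3) we get

$$\sum x_i^2 = \sum (X_i - \bar{X})^2 = \sum X_i^2 - n \bar{X}^2 = \frac{1}{n}\left[ n \sum X_i^2 - \left(\sum X_i\right)^2 \right]$$

Similarly

$$\sum y_i^2 = \frac{1}{n}\left[ n \sum Y_i^2 - \left(\sum Y_i\right)^2 \right]$$

so that

$$\sqrt{\sum x_i^2}\,\sqrt{\sum y_i^2} = \frac{1}{n} \sqrt{n \sum X_i^2 - \left(\sum X_i\right)^2}\,\sqrt{n \sum Y_i^2 - \left(\sum Y_i\right)^2} \qquad (3.8)$$

3. Substituting (3.5) and (3.8) in (3.3) we get

$$r_{xy} = \frac{n \sum X_i Y_i - \sum X_i \sum Y_i}{\sqrt{n \sum X_i^2 - \left(\sum X_i\right)^2}\,\sqrt{n \sum Y_i^2 - \left(\sum Y_i\right)^2}}$$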

The above formulae of the correlation coefficient have two points of interest:

1. The formulae are symmetric with respect to X and Y, that is, rYX = rXY.

2. The formulae are applicable only for linear relationships. Expressions for nonlinear relationships will be developed in Chapter 7 in connection with regression analysis.
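As a numerical check on the two forms of the coefficient, the sketch below computes r once from deviations, as in (3.3), and once from the observed values, as in (3.4), assuming the standard product-moment expressions. The observations and function names are hypothetical. The last value printed swaps X and Y to illustrate the symmetry noted in point 1.

```python
import math


def r_from_deviations(xs, ys):
    """Formula (3.3): deviations from the means."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (math.sqrt(sxx) * math.sqrt(syy))


def r_from_raw_values(xs, ys):
    """Formula (3.4): observed values only."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(n * sum_x2 - sum_x ** 2) * math.sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator


# Hypothetical observations, not taken from Table 3.1.
X = [3, 5, 6, 9, 12]
Y = [20, 28, 27, 40, 45]

print(r_from_deviations(X, Y), r_from_raw_values(X, Y), r_from_raw_values(Y, X))
```

The raw-value form avoids computing the means first, which is convenient for hand calculation, but it subtracts large sums and so can lose precision on a computer when the values are large relative to their spread.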

3.3 NUMERICAL VALUES OF THE CORRELATION COEFFICIENT

The correlation coefficient is a measure of the degree of covariability of the variables X and Y. The values that the correlation coefficient may assume vary from -1 to +1. When r is positive, X and Y increase or decrease together. r = +1 implies that there is perfect positive correlation between X and Y. Diagrammatically, all the observations on Y and X lie on a straight line with a positive slope (figure 3.6).

When r is negative, X and Y move in opposite directions. If r = -1, there exists a perfect negative correlation between X and Y. Diagrammatically, all the observations of Y and X lie on a line with a negative slope (figure 3.7).

When r is zero, the two variables are uncorrelated.

Figure 3.6: Perfect positive correlation. Figure 3.7: Perfect negative correlation.

We shall prove that r will assume the value of unity when the two variables are perfectly linearly correlated. In this case all the observations will lie on a line with a positive or negative slope according to whether the correlation is positive or negative.

Figure 3.8

In figure 3.8 we picture the case of perfect positive correlation between X and Y. The line depicting the relation forms an angle θ with the line Y'Y', which is parallel to the horizontal axis. From elementary trigonometry it is known that tan θ = y/x. Therefore y = x tan θ.

Substituting this result in the formula of the correlation coefficient we find

$$r = \frac{\sum x_i y_i}{\sqrt{\sum x_i^2}\,\sqrt{\sum y_i^2}} = \frac{\tan\theta \sum x_i^2}{\sqrt{\sum x_i^2}\,\sqrt{\tan^2\theta \sum x_i^2}} = 1$$

since tan θ is positive for a line with a positive slope.

In practice, we almost never observe either perfect correlation or zero correlation. Usually r assumes some value between zero and one in absolute value. The closer that value is to one, the greater is the degree of covariability, that is, the closer the scatter of points approaches a straight line. On the other hand, the greater the scatter of points in the diagram, the closer r is to zero.

We said that r is the sample estimate of the population correlation coefficient ρ. As a statistical estimate r is inevitably subject to some error and should be tested for its reliability. Tests of significance for r are explained in Chapters 5 and 8.

Example. Suppose we want to compute the correlation coefficient between the variables Y (quantity supplied) and X (price) with the observations included in table 3.1.

Table 3.2. Data for the estimation of the simple correlation coefficient ryx

Computation of r using deviations from the means:

We need to compute the terms $\sum x_i y_i$, $\sum x_i^2$ and $\sum y_i^2$, which appear in the formula. The computations are given in table 3.2.

Computation of r using actual observations:

From the formula we see that we need to compute the terms $\sum X_i Y_i$, $\sum X_i$, $\sum Y_i$, $\sum X_i^2$ and $\sum Y_i^2$.

The computations are shown in table 3.2. Substituting we find

 

3.4. THE RANK CORRELATION COEFFICIENT

The formulae of the linear correlation coefficient developed in the previous section are based on the assumption that the variables involved are quantitative and that we have accurate data for their measurement. However, in many cases the variables may be qualitative (or binary variables) and hence cannot be measured numerically. For example, profession, education, preferences for particular brands, are such categorical variables. Furthermore, in many cases precise values of the variables may not be available, so that it is impossible to calculate the value of the correlation coefficient with the formulai developed in the preceding section. For such cases it is possible to use another statistic, the rank correlation coefficient (or Spearman's correlation coefficient). We rank the observations in a specific sequence, for example in order of size, importance etc. using the number 1,2,....... n. In other words we assign ranks to the data and measure the relarionship berween their ranks instead of their actual numerical values. Hence the name of the statistic as rank correlation coefficient. If two variables X and Y are ranked in such way the rank correlation coefficient may be computed by the formula.

$$r' = 1 - \frac{6 \sum D^2}{n(n^2 - 1)} \qquad (3.9)$$

where D = difference between ranks of corresponding pairs of X and Y

            n = number of observations.


The values that the rank correlation coefficient r' may assume range from +1 to -1. (For a test of significance of r' see Chapter 6.)

Two points are of interest when applying the rank correlation coefficient. Firstly, it does not matter whether we rank the observations in ascending or descending order. However, we must use the same rule of ranking for both variables. Secondly, if two (or more) observations have the same value we assign to them the mean rank. Some examples will illustrate the application of the rank correlation coefficient.

Example 1 - The following table shows how ten students were ranked according to their performance in their class work and their final examinations. We want to find out whether there is a relationship between the accomplishments of the students during the whole year and their performance in their exams.

The differences between the two rankings are given in the following table.

The rank correlation coefficient is

The high value of the correlation coefficient indicates that there is a close relationship between class work and exam performance: students with a good record throughout the year do well in their examinations and vice versa.

Example 2. A market researcher asks two smokers to express their preferences for twelve different brands of cigarettes. Their replies are shown in the following table.

The differences of preferences of the two smokers are shown below:

The resulting rank correlation coefficient shows a marked similarity of preferences of the two consumers for the various brands of cigarettes.
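The ranking rules and the rank correlation formula of this section can be sketched in a few lines of Python. This is an illustration under stated assumptions: the two rankings of ten students are hypothetical (the tables of the examples above are not reproduced here), and the helper names `mean_ranks` and `rank_correlation` are our own. Note also that when many observations are tied, the formula based on ΣD² is only an approximation.

```python
def mean_ranks(values):
    """Rank values in ascending order (1 = smallest); tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next value equals the current one (a tie).
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = ((pos + 1) + (end + 1)) / 2  # mean of the 1-based positions in the block
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def rank_correlation(rank_x, rank_y):
    """r' = 1 - 6 * sum(D^2) / (n * (n^2 - 1)), D being the difference of paired ranks."""
    n = len(rank_x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Hypothetical rankings of ten students (class work vs. final examination).
class_rank = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
exam_rank = [2, 1, 3, 5, 4, 6, 8, 7, 10, 9]
r_rank = rank_correlation(class_rank, exam_rank)
print(round(r_rank, 3))

# Raw scores with a tie: the two scores of 20 share the mean of ranks 2 and 3.
print(mean_ranks([10, 20, 20, 30]))
```

Ranking both variables with the same rule (here ascending) is what matters; reversing the rule for both leaves every D, and hence r', unchanged.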

3.5 PARTIAL CORRELATION COEFFICIENTS

A partial correlation coefficient measures the relationship between any two variables when all other variables connected with those two are kept constant. For example, let us assume that we want to measure the correlation between the number of hot drinks (X1) consumed in a summer resort and the number of tourists (X2) coming to that resort. It is obvious that both these variables are strongly influenced by weather conditions, which we may designate by X3. On a priori grounds we expect X1 and X2 to be positively correlated: when a large number of tourists arrive in the summer resort one should expect a high consumption of hot drinks and vice versa. The computation of the simple correlation coefficient between X1 and X2 may not reveal the true relationship connecting these two variables, however, because of the influence of the third variable, weather conditions (X3). In other words the above positive relationship between number of tourists and number of hot drinks consumed is expected to hold if weather conditions can be assumed constant. If weather changes, the relationship between X1 and X2 may be distorted to such an extent as to appear even negative. Thus if the weather is hot, the number of tourists will be large, but because of the heat they will prefer to consume more cold drinks and ice-cream rather than hot drinks. If we overlook weather and look only at X1 and X2 we will observe a negative correlation between these two variables, which is explained by the fact that hot drinks as well as number of visitors are affected by heat. In order to measure the true correlation between X1 and X2 we must find some way of accounting for changes in X3. This is achieved with the partial correlation coefficient between X1 and X2 when X3 is kept constant. The partial correlation coefficient is determined in terms of the simple correlation coefficients among the various variables involved in a multiple relationship. In our example there are three simple correlation coefficients: r12, r13 and r23.

There are two partial correlation coefficients:

r12.3 = partial correlation coefficient between X1 and X2 when X3 is kept constant

$$r_{12.3} = \frac{r_{12} - r_{13} r_{23}}{\sqrt{(1 - r_{13}^2)(1 - r_{23}^2)}}$$

and

r13.2 = partial correlation coefficient between X1 and X3 when X2 is kept constant

$$r_{13.2} = \frac{r_{13} - r_{12} r_{23}}{\sqrt{(1 - r_{12}^2)(1 - r_{23}^2)}}$$

(The example is taken from J.E. Freund and F.J. Williams, Modern Business Statistics, Pitman, London 1959.)

The proof and rationalisation of these formulae will be given in Chapter 7, by which time the reader will have become familiar with regression analysis. The formula for the partial correlation coefficient can be directly extended to relationships involving any number of explanatory variables (see Chapter 7).
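The hot drinks example can be made concrete with a small Python sketch. The three simple correlation coefficients below are hypothetical values chosen only to reproduce the situation described in the text (a negative simple correlation between hot drinks and tourists); the function name `partial_corr` is our own, and it uses the standard first-order partial correlation formula.

```python
import math


def partial_corr(r_ab, r_ac, r_bc):
    """Correlation between a and b when the third variable c is kept constant."""
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))


# Hypothetical simple correlations: X1 = hot drinks, X2 = tourists, X3 = temperature.
r12 = -0.30  # hot drinks and tourists look negatively related
r13 = -0.80  # heat reduces hot drink consumption
r23 = 0.70   # heat attracts tourists

r12_3 = partial_corr(r12, r13, r23)  # X1 and X2 with X3 held constant
r13_2 = partial_corr(r13, r12, r23)  # X1 and X3 with X2 held constant
print(round(r12_3, 3), round(r13_2, 3))
```

With these assumed values the partial coefficient r12.3 is positive although the simple coefficient r12 is negative, which is exactly the distortion the weather variable was said to cause.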

3.6 LIMITATIONS OF THE THEORY OF LINEAR CORRELATION

Correlation analysis has serious limitations as a technique for the study of economic relationships.

Firstly. The above formulae for r apply only when the relationship between the variables is linear. However, two variables may be strongly connected with a nonlinear relationship.

It should be clear that zero correlation and statistical independence of two variables (X and Y) are not the same thing. Zero correlation implies zero covariance of X and Y so that

Statistical independence of X and Y implies that the probability of Xi and Yi occurring simultaneously is the simple product of the individual probabilities

P(X and Y) = P(X) · P(Y)

(For a discussion of this result see Appendix I.)

Independent variables do have zero covariance and are uncorrelated:

The linear correlation coefficient between two independent variables is equal to zero (figure 3.9). However, zero linear correlation does not necessarily imply independence. In other words, uncorrelated variables may be statistically dependent. For example, if X and Y are related so that the observations fall on a circle or on a symmetrical parabola (as in figures 3.10 and 3.11), the relationship is perfect but not linear. The variables are statistically dependent, although their covariance and the linear correlation coefficient are zero. Absence of linear correlation does not imply absence of any dependence. (See A. Goldberger, Econometric Theory, Wiley, New York 1964, pp. 61-2. Also C. Christ, Econometric Models and Methods, p. 146.)

(Figure 3.9)

Figure 3.10. Y and X uncorrelated but dependent. Their function is $(X - X_1)^2 + (Y - Y_1)^2 = a^2$; r = 0.

Figure 3.11. Y and X uncorrelated but dependent. Their function is $(X - X_1)^2 = 4a(Y - Y_1)$; r = 0.
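That perfectly dependent variables can have zero linear correlation is easy to verify numerically. The Python sketch below is an illustration with hypothetical values (centre, radius and range of X are our own choices): it places points on a circle and on a symmetrical parabola and computes r for each.

```python
import math


def pearson_r(xs, ys):
    """Linear correlation coefficient computed from deviations from the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sum_xx = sum((a - mean_x) ** 2 for a in xs)
    sum_yy = sum((b - mean_y) ** 2 for b in ys)
    return sum_xy / math.sqrt(sum_xx * sum_yy)


# Points equally spaced on a circle with hypothetical centre (3, 2) and radius 5.
n_points = 360
radius = 5.0
circle_x = [3.0 + radius * math.cos(2 * math.pi * k / n_points) for k in range(n_points)]
circle_y = [2.0 + radius * math.sin(2 * math.pi * k / n_points) for k in range(n_points)]
r_circle = pearson_r(circle_x, circle_y)

# Points on the symmetrical parabola Y = X^2, with X symmetric about zero.
parab_x = [float(v) for v in range(-10, 11)]
parab_y = [v ** 2 for v in parab_x]
r_parabola = pearson_r(parab_x, parab_y)

# Both coefficients are zero up to floating-point rounding.
print(abs(r_circle) < 1e-9, abs(r_parabola) < 1e-9)
```

In both cases Y is tied to X by an exact equation, yet r is zero, because r measures only the linear part of a relationship.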

Secondly. The second limitation of the theory is that although the correlation coefficient is a measure of the covariability of variables it does not necessarily imply any functional relationship between the variables concerned. Correlation theory does not establish and/or prove any causal relationship between the variables. It seeks to discover if a covariation exists, but it does not suggest that variations in, say, Y are 'caused' by variations in X, or vice versa. Knowledge of the value of r alone will not enable us to predict the value of Y from X. A high correlation between variables Y and X may describe any one of the following situations:

1. Variation in X is the cause of variation in Y.

2. Variation in Y is the cause of variation in X.

3. Y and X are jointly dependent, or there is a two-way causation, that is to say Y is determined by X, but X is also determined by Y. For example, in any market Q = f(P), but also P = f(Q); therefore there is a two-way causation between Q and P, or in other words P and Q are simultaneously determined.

4. There is another common factor (Z) that affects X and Y in such a way as to show a close relation between them. This often occurs in time series, when two variables have strong time trends (i.e. grow over time); in this case we find a high correlation between Y and X, even though they happen to be causally independent.

5. The correlation between X and Y may be due to chance.

To illustrate the above discussion, we cite the following examples.

Example 1. Suppose we look at the marks a student gains in his examinations (Y) and his hours of work in a shop (X). By gathering information on the performance of the student in various exams and the hours he worked in the shop during term time, we compute the correlation coefficient, which assumes, say, the value of -0.9. This value of r is not enough evidence to prove that there actually exists an inverse causal relationship between grades (Y) and hours of outside work (X). More information is required before we can establish such a relationship, because any of the following may be true. (1) A functional (causal) relationship may exist (Y = f(X)): due to long hours of working at the shop, the grades scored at exams are low. (2) The opposite may be true (X = f(Y)): because of low grades the student cannot get a scholarship and as a result he has to do outside work. (3) There may be a third factor affecting both X and Y in such a way that they show a close relation; for instance, the student may have to support his ailing parents, which causes him both to get low grades and to engage in extra outside work for money. (4) The correlation of X and Y may be due to chance: the student who works in a shop may be a bad examinee in general, scoring low marks in examinations.

Example 2. Correlations are sometimes observed between quantities that could not conceivably be causally related. For example, if a high correlation is found between the number of births and the number of murders in a country, this obviously does not prove that the births of babies are determined by the number of murders! Also, the fact that statisticians have observed a high correlation between births of babies and arrivals of storks does not mean that births are determined by stork movements! It simply shows that storks and babies show time trends. These are examples of what is called spurious correlation (or chance correlation), in other words correlation which does not show any causal relationship between the variables involved.
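The role of time trends in producing spurious correlation can be illustrated by simulation. In the Python sketch below all numbers are hypothetical: two series are generated independently of each other, each as a linear trend plus random noise, and the labels "births" and "storks" are used only to echo the example. Comparing period-to-period changes instead of levels is one simple way, chosen here for illustration, of removing the common trend.

```python
import math
import random


def pearson_r(xs, ys):
    """Linear correlation coefficient computed from deviations from the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sum_xx = sum((a - mean_x) ** 2 for a in xs)
    sum_yy = sum((b - mean_y) ** 2 for b in ys)
    return sum_xy / math.sqrt(sum_xx * sum_yy)


random.seed(0)  # fixed seed so that the illustration is reproducible
periods = 500

# Two causally unrelated series: each is a time trend plus its own independent noise.
births = [1000 + 2.0 * t + random.gauss(0, 20) for t in range(periods)]
storks = [300 + 0.5 * t + random.gauss(0, 5) for t in range(periods)]

r_levels = pearson_r(births, storks)

# Period-to-period changes no longer contain the trend in levels.
d_births = [births[t] - births[t - 1] for t in range(1, periods)]
d_storks = [storks[t] - storks[t - 1] for t in range(1, periods)]
r_changes = pearson_r(d_births, d_storks)

print(round(r_levels, 3), round(r_changes, 3))
```

The levels are very highly correlated only because both grow over time; once the trend is removed the correlation is close to zero.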

Example 3. Consumption (C) and income (Y) are jointly dependent variables, since C = f(Y) according to the simple Keynesian theory, but also Y = f(C). Similarly, price (P) and quantity demanded (D) are two jointly dependent variables, since D = f(P) but also P = f(D).

It follows from the above discussion that correlation theory does not establish a functional relationship, that is to say it does not prove which is the dependent and which is the explanatory variable. Only through a more thorough investigation using economic theory can we come to some conclusion as to whether or not X is the cause of Y. Furthermore, correlation analysis does not give numerical values for the coefficients of the relationship, that is, it does not give estimates for the slope and the constant intercept of the function. A given value of the correlation coefficient is consistent with an infinite number of straight lines. In figures 3.12 and 3.13 the correlation between X and Y is positive and perfect (all observations lie on the lines), but the lines are completely different, having different slopes and intercepts.
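The point that one value of r is consistent with many different lines can be checked directly. In the Python sketch below the two lines are hypothetical (their slopes and intercepts are our own choices, not those of figures 3.12 and 3.13); every observation lies exactly on its line, so r is +1 in both cases.

```python
import math


def pearson_r(xs, ys):
    """Linear correlation coefficient computed from deviations from the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sum_xx = sum((a - mean_x) ** 2 for a in xs)
    sum_yy = sum((b - mean_y) ** 2 for b in ys)
    return sum_xy / math.sqrt(sum_xx * sum_yy)


X = [1.0, 2.0, 3.0, 4.0, 5.0]

# Two hypothetical lines with very different intercepts and slopes.
Y_steep = [2.0 + 3.0 * x for x in X]   # intercept 2, slope 3
Y_flat = [50.0 + 0.1 * x for x in X]   # intercept 50, slope 0.1

r_steep = pearson_r(X, Y_steep)
r_flat = pearson_r(X, Y_flat)
print(round(r_steep, 6), round(r_flat, 6))
```

Knowing that r = 1 therefore tells us nothing about the slope or the intercept; these have to be estimated by a regression method.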

In summary the linear correlation coefficient measures the degree to which the points cluster around a straight line, but it does not give the equation for the line, that is it does not assign numerical values to the parameters of the function which is represented by this line. These parameters are elasticities (or components of elasticities), or propensities and multipliers as far as economic theory is concerned, and the knowledge of their numerical value is of particular interest both to entrepreneurs and to policy makers.

To estimate the parameters of the relationship we may apply various methods. We will start with the development of the method of Ordinary Least Squares regression, because it is the simplest of all and, furthermore, it forms the basis of most of the other, more elaborate, econometric techniques.

EXERCISES


1. Calculate the correlation coefficient between the following series:

Interpret your results.

2. The following table includes the rankings of the preferences of two housewives of ten different brands of soap.

Are the preferences of the two housewives similar?

3. Calculate the correlation coefficient between the following series

What can you infer from the value of r?

4. Suppose that a firm had the following profits and investment expenditures in each year from 1961-1970.

a. Estimate the correlation coefficient between profits and investment expenditures.

b. On the basis of the observed correlation coefficient, researcher A claims that profits determine the level of investment, while researcher B claims that investment determines the profitability of the firm. Which of the two researchers is right?

5. It is often suggested that the research expenditure of a firm is related to the level of its profits. Do the following data substantiate this hypothesis?

6. Show algebraically that the correlation coefficient between n observations (where a, b, c, d are constants) is equal to the correlation coefficient between the simple observations Yi and Xi.

7. Repeat exercise 6 with.

8. Generalise the results of exercises 6 and 7 to show that the value of the correlation coefficient is independent of the scale and origin of measurement of Yi and Xi, and hence that the burden of calculation of r is much reduced if a, b, c, d are chosen to make Y* and X* as simple as possible.

 

 


  Chapter 11 : THE SIMPLE LINEAR REGRESSION MODEL THE ORDINARY LEAST SQUARES METHOD

   

  There are various econometric methods that can be used to derive estimates of the parameters of economic relationships from statistical observations. In this chapter we shall examine the method of ordinary least squares (OLS) or classical least squares (CLS). The reasons for starting with this method are many. Firstly, the parameter estimates obtained by ordinary least squares have some optimal properties which will be discussed in Chapter 6. Secondly, the computational procedure of OLS is fairly simple as compared with other econometric techniques and the data requirements are not excessive. Thirdly, the least squares method has been used in a wide range of economic relationships with fairly satisfactory results (see Chapter 21), and, despite the improvement of computational equipment and of statistical information which facilitated the use of other more elaborate econometric techniques, OLS is still one of the most commonly employed methods in estimating relationships in econometric models. Fourthly, the mechanics of least squares are simple to understand. Fifthly, OLS is an essential component of most other econometric techniques. In fact, as we will see later, with the exception of the Full Information Maximum Likelihood method, all other techniques involve the application of the least squares method, modified in some respects.

We shall start with the simple linear regression model, that is, with a relationship between two variables, one dependent and one explanatory, related by a linear function. Subsequently we will examine multiple regression analysis, which refers to the relationship between more than two variables.

4.1 THE SIMPLE LINEAR REGRESSION MODEL

An example

We will illustrate the meaning of the method of least squares by referring to our earlier example from the theory of supply. The theory of supply in its simplest form postulates that there exists a positive relationship between the quantity supplied of a commodity and its price: when the price rises the quantity supplied increases, and vice versa. Following the econometric procedure outlined in Chapter 2, our first task is the specification of the supply model, that is, the determination of the dependent variable (regressand) and the explanatory variables (regressors), the number of equations of the model and their precise mathematical form, and finally the a priori expectations concerning the sign and the magnitude of the coefficients. Economic theory provides the following information with respect to the supply function.


1. The dependent variable is the quantity supplied and the explanatory variable is the price:

Y = f(X)

where Y = quantity supplied
            X = price of the commodity

2. Economic theory does not specify whether the supply should be studied with a single - equation model or with a more elaborate system of simultaneous equations. In view of this indeterminacy we choose to start our investigation with a single -equation model. In later stages we may study more elaborate models.

3. Economic theory is not clear about the mathematical form (linear or nonlinear) of the supply function. In textbooks the supply is sometimes depicted by a straight upward-sloping line, or by an upward-sloping curve. The latter implies a nonlinear relationship between quantity and price. Again the econometrician has to decide the form of the supply function. We start by assuming that the variables are related with the simplest possible mathematical form, that is, the relationship between quantity and price is linear of the form

Yi = b0 + b1 Xi

This form implies that there is a one-way causation between the variables Y and X: price is the cause of changes in the quantity supplied, but not the other way around.

The parameters of the supply function are b0 and b1, and our aim is to obtain estimates of their numerical values, $\hat{b}_0$ and $\hat{b}_1$.

As regards the sign and size of the constant intercept b0, we note that it should be either zero (in which case its meaning is that the quantity is zero when price is zero) or positive (in which case its meaning is that some quantity is supplied even when the price drops to zero). Normally b0 should not be negative in the case of a supply function: if b0 turns out negative, the implied negative quantity at a zero price does not make sense in economics. However, the sign of b0 is crucial in determining the price elasticity of supply, as we will presently see.

Regarding the value of b1, we note that in the particular case of a supply function we expect the sign of b1 to be positive (b1 > 0), since a supply curve is normally upward-sloping.

It is important to examine the relationship between the price elasticity of supply and the coefficients b0 and b1. Recall that the elasticity is defined by the expression

$$\eta_p = \frac{dY}{dX} \cdot \frac{X}{Y}$$

From the supply function it is obvious that

$$\frac{dY}{dX} = b_1$$

In computing the elasticity from a regression line, we use the estimate $\hat{b}_1$ and the mean values of price ($\bar{X}$) and quantity ($\bar{Y}$) in the sample. Thus

$$\hat{\eta}_p = \hat{b}_1 \cdot \frac{\bar{X}}{\bar{Y}}$$

Since the estimated line passes through the point of the sample means, $\bar{Y} = \hat{b}_0 + \hat{b}_1 \bar{X}$. Thus, substituting for $\bar{Y}$ in the expression of the elasticity, we obtain

$$\hat{\eta}_p = \frac{\hat{b}_1 \bar{X}}{\hat{b}_0 + \hat{b}_1 \bar{X}}$$

Given that b1 > 0, it follows that

i. the supply will be elastic (η_p > 1) if b0 is negative (b0 < 0)
ii. the supply will be inelastic (η_p < 1) if b0 is positive (b0 > 0)
iii. the supply will have unitary elasticity (η_p = 1) if b0 = 0

Thus the elasticity of a supply curve (with positive slope) depends on the sign of the constant intercept, b0.

4. The above form of the supply function implies that the relationship between quantity and price is exact, that all the variation in Y is due solely to changes in X, and that there are no other factors affecting the dependent variable. If this were true, all the price-quantity points would fall on a straight line. However, if we gather observations on the quantity actually supplied in the market at various prices and plot them on a diagram, we see that they do not fall on a straight line (or any other smooth curve for that matter). Suppose that we have the ten pairs of observations on X and Y shown in table 4.1. The scatter diagram of these observations shows that the relationship between price and quantity supplied has a form roughly similar to a straight line (figure 4.1).

Fig 4.1


The deviations of the observations from the line may be attributed to several factors.

1. Omission of variables from the function

In economic reality each variable is influenced by a very large number of factors. For instance, the consumption pattern of a family is determined by family income, prices, the composition by age and sex of the family, the past levels of the family income, tastes, religion, social and educational status, wealth, and so on. One could compile an almost endless list of such factors. However, not all the factors influencing a certain variable can be included in the function, for various reasons.

(a) Some of the factors may not be known even to the person most acquainted with the relationship being studied. This lack of knowledge is to a great extent due to incomplete theory about the variation of economic variables in general.

(b) Even when known to be relevant, some factors cannot be measured statistically. These are mainly psychological factors, or, in general, qualitative factors (tastes, expectations, religion) which cannot even be approximated satisfactorily with dummy variables.

(c) Some factors are random, appearing in an unpredictable way and time, so that their influence cannot be taken satisfactorily into account (e.g. epidemics, earthquakes, wars).

(d) Some factors may have, each individually, a very small influence on the dependent variable. Thus their parameter is so small that it cannot be measured in a reliable way (due to rounding errors of the computations). All these factors together, however, may account for a considerable part of the variation of the dependent variable.

(e) Even if all factors are known, the available data most often are not adequate for the measurement of all factors influencing a relationship. This is particularly so when we use time series, which are usually short. Thus in most cases only the most important three or four variables are explicitly included in the function. The lack of adequate numbers of observations creates a problem of 'degrees of freedom', which impairs the application of the traditional tests of significance (see Chapter 5 and Appendix I).

                                                                                    

Table 4.1

       

2. Random behaviour of human beings

The scatter of points around the line may be attributed to an erratic element which is inherent in human behaviour. Human reactions are to a certain extent unpredictable and may cause deviations from the 'normal' behavioural pattern depicted by the line. For example, in a moment's whim a consumer may change his expenditure pattern, although income and prices did not change.

3. Imperfect specification of the mathematical form of the model

We may have linearised a possibly nonlinear relationship. Or we may have left out of the model some equations. The economic phenomena are much more complex than a single equation may reveal, no matter how many explanatory variables it contains. In most cases many variables are simultaneously determined by a system containing many equations. For example, price determines and is determined by the quantity supplied. Under such circumstances, if we attempt to study the phenomenon with a single-equation model, we are bound to commit an error, which is due to the imperfect specification of the form of the model, that is, of the number of its equations.

4. Errors of aggregation

We often use aggregate data (aggregate consumption, aggregate income), in which we add magnitudes referring to individuals whose behaviour is dissimilar. In this case we say that variables expressing individual peculiarities are missing. For example, in a production function for an industry we add together the factor inputs and outputs of dissimilar entrepreneurs. Changes in the distribution of total output among firms are important in the determination of total output. However, such distribution variables are often missing from the function. There are other types of aggregation which introduce error in the relationship, for example aggregation over time, spatial aggregation, cross-section aggregation, and so on.

5. Errors of measurement

The deviations of the points from the line may be due to errors of measurement of the variables, which are inevitable due to the methods of collecting and processing statistical information.

The first four sources of error render the form of the equation wrong, and they are usually referred to as error in the equation or error of omission. The fifth source of error is called error of measurement or error of observation. It is usual of course to have both these types of error simultaneously in the function.

In order to take into account the above sources of error we introduce in econometric functions a random variable which is usually denoted by the letter u and is called the error term, random disturbance term or stochastic term of the function, so called because u is supposed to 'disturb' the exact linear relationship which is assumed to exist between X and Y. By introducing this random variable in the function the model is rendered stochastic, of the form

Yi = (b0 + b1 Xi) + (ui)

The true relationship which connects the variables involved is split into two parts: a part represented by a line and a part represented by the random term u. The meaning of these two parts may be explained by looking at fig 4.2.

          Fig 4.2

The scatter of observations represents the true relationship between Y and X. The line represents the exact part of the relationship and the deviations of the observations from the line represent the random component of the relationship. Were it not for the errors in the model, we would observe the points on the line Y1', Y2', ..., Yn', corresponding to X1, X2, ..., Xn. However, because of the random disturbances, we observe Y1, Y2, ..., Yn, corresponding to X1, X2, ..., Xn. These points diverge from the regression line by quantities u1, u2, u3, ..., un, where ui is the random error associated with Yi. In other words the values of Y corresponding to a value of X will on the average fall on a line, but each individual Yi will deviate from the line depending on the value of ui. Hence each Yi (i = 1, 2, ..., n) can be expressed in terms of two components, one component due to Xi and a second component due to the influences included in the random term ui.

Yi = (b0 + b1 Xi) + (ui)

Variation in Yi = Systematic Variation + Random Variation

or

Variation in Yi = Explained Variation + Unexplained Variation

The first component in brackets is the part of the variation in Y explained by the changes in X and the second is the part of the variation not explained by any specific factor, that is to say the variation in Y is due to the random influence of u.
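The split of each observation into a systematic and a random part can be made tangible by generating data from a stochastic line. In the Python sketch below everything is hypothetical: the values of b0, b1 and of the dispersion of u are our own choices, the prices are the integers 1 to 10, and the disturbances are drawn from a normal distribution only for convenience.

```python
import random

random.seed(1)  # fixed seed so that the illustration is reproducible

# Hypothetical 'true' parameters of the supply line and dispersion of the disturbance.
b0, b1 = 20.0, 3.0
sigma_u = 4.0

prices = [float(p) for p in range(1, 11)]             # X1, ..., X10
u = [random.gauss(0.0, sigma_u) for _ in prices]      # random disturbances u1, ..., u10
systematic = [b0 + b1 * x for x in prices]            # points on the line
quantities = [s + e for s, e in zip(systematic, u)]   # observed Yi = (b0 + b1*Xi) + ui

for x, s, e, y in zip(prices, systematic, u, quantities):
    print(f"X={x:4.1f}  on the line={s:6.2f}  u={e:6.2f}  observed Y={y:6.2f}")
```

In practice only the prices and the observed quantities are available; the points on the line and the values of u are unobservable, which is why assumptions about u are needed.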

Seen in this light the random term u seems to have a meaning related to the ceteris paribus clause of economic theory. Economic theory assumes that the functional relationships between variables are exact under the ceteris paribus clause. For example, the demand function D = b0 + b1P postulated by economic theory implies that the quantity of a particular commodity is a linear function of its price alone, 'other things remaining equal', that is, the price-quantity relationship holds provided that all other factors not appearing explicitly in the function (for example tastes, income, other prices) remain constant. In the real world, however, these other factors keep changing, so that the ceteris paribus clause is very seldom fulfilled. When we collect data on the quantities of a commodity purchased at various prices we do not observe the quantity that would be bought if all 'other things' were constant, but rather the quantities purchased while the prices of other goods, incomes, tastes and other factors have all been changing.

In econometrics we may read the true relationship as holding ceteris paribus: if factors other than X remained unchanged, then changes in Y would be fully explained by changes in X. However, other factors do not remain equal; hence we introduce u into the function to account for the changes in other variables not included in it explicitly.

We may now look at the final form of our equation Yi = b0 + b1Xi + ui in another way. For a given value of X, Y may assume various values depending on the particular (positive or negative) value that u happens to assume. To each value of X corresponds a distribution of various values of u, and therefore of Y. This situation is pictured in fig 4.3. For example, if the price of the commodity is equal to X1, the quantity which will be supplied at this price may assume any value between Y1' and Y1'', depending on the value of u in this period. If, for instance, there is a strike of lorry drivers, or a power cut, which delays the delivery of the commodity (these situations being examples of chance events), the quantity will not be Y1, as the linear equation suggests, but a smaller quantity Y1*, due to the above factors which give a value u1* to the random term. If, however, there is a rumour of a fall in prices of substitutes or of a new product being developed, the supplier may offer all the stock which he otherwise would offer in future periods, so that at the price X1 the quantity supplied would be Y1**, because the change in expectations caused u to assume the value u1**.

Fig 4.3


To estimate the coefficients b0 and b1, we need observations on X, Y and u. Yet u is never observed like the other explanatory variables, and therefore in order to estimate the function Yi = b0 + b1Xi + ui, we should guess the values of u, that is, we should make some reasonable (plausible) assumptions about the shape of the distribution of each ui (its mean, variance and covariance with other u's). These assumptions are guesses about the true, but unobservable, values of u.

4.2 ASSUMPTIONS OF THE LINEAR STOCHASTIC REGRESSION MODEL

The linear regression model is based on certain assumptions, some of which refer to the distribution of the random variable u, some to the relationship between u and the explanatory variables, and some to the relationship between the explanatory variables themselves. We will group the assumptions in two categories: (a) stochastic assumptions, (b) other assumptions.

4.2.1 STOCHASTIC ASSUMPTIONS OF ORDINARY LEAST SQUARES

These are assumptions about the distribution of the values of u. They are crucial for the estimates of the parameters and will be explained in detail in subsequent chapters (see Chapters 9-12). It is these assumptions about the random term u that adapt the least squares method, which is a statistical method, to the stochastic nature of economic phenomena. At this stage we will state these assumptions without attempting to explain their implications for the parameter estimates.

Assumption 1:  ui is a random real variable.

The value which ui may assume in any one period depends on chance; it may be positive, negative or zero. Each value has a certain probability of being assumed by u in any particular instance.

Assumption 2:  The mean value of u in any particular period is zero.

This means that for each value of X, u may assume various values, some greater than zero and some smaller than zero, but if we considered all the possible values of u for any given value of X, they would have an average value equal to zero. With this assumption we may say that Yi = b0 + b1 Xi gives the relationship between X and Y on the average, that is, when X assumes the value Xi the dependent variable will on the average assume the value Yi (on the line), although the actual value of Y observed on any particular occasion may display some variation: sometimes the value of the dependent variable (corresponding to the given value of X) will be bigger than Yi, and at other times it will be smaller than the Yi (on the line). Yet on the average the value of Y will be equal to Yi when X assumes the value Xi. That is, on the average u is equal to zero.

____________________
1. As we shall see readily, we can get an estimate of the u's after the estimation of the regression line and the computation of the residual deviations of the observations from this line.
2. The covariance of the u's measures the way in which the u's of different periods tend to covary. The covariance of u and X measures the way in which the values of u in different periods tend to vary with the values of X in these periods (see Appendix 1).

Assumption 3:  The variance of ui is constant in each period.

The variance of ui about its mean is constant at all values of X. In other words, for all values of X, the u's will show the same dispersion around their mean. In Figure 4.3 this assumption is denoted by the fact that the values that u may assume lie within the same limits, irrespective of the values of X: for X1, u can assume any value within the range AB; for X2, u can assume any value within the range CD, which is equal to AB, and so on.

Assumption 4:  The variable ui has a normal distribution.

The values of u (for each Xi) have a bell-shaped symmetrical distribution about their zero mean.

The above four assumptions about the behaviour (distribution) of the values of u may be summarised by the expression

$$u_i \sim N(0, \sigma^2)$$

and are pictured in Fig 4.4.


Fig 4.4

Assumption 5:  The random terms of different observations (ui, uj) are independent.

This means that all the covariances of any ui with any other uj are equal to zero. The value which the random term assumed in one period does not depend on the value which it assumed in any other period.


Assumption 6:  u is independent of the explanatory variable(s).

The disturbance term is not correlated with the explanatory variable(s). The u's and the X's do not tend to vary together; their covariance is zero. Symbolically

$$\operatorname{cov}(X_i, u_i) = E\{[X_i - E(X_i)][u_i - E(u_i)]\} = 0$$

It is, however, conceptually easier and computationally more convenient to make an alternative assumption which ensures zero covariance of the u's and X's.

Assumption 6 A:  The Xi's are a set of fixed values in the hypothetical process of repeated sampling which underlies the linear regression model.

This means that, in taking a large number of samples of Y and X, the Xi values are the same in all samples, but the ui values do differ from sample to sample, and so of course do the values of Yi. For example, assume that every day in a market we choose the same prices X1, X2, ..., Xn and we record the quantities Yi sold each day at these prices. The X's do not vary: they are a set of fixed values, while the Yi's vary from day to day due to different random influences. Clearly, under these conditions the covariance of the (fixed) X's and the u's is zero, because

$$\operatorname{cov}(X_i, u_i) = E(X_i u_i) - E(X_i)\,E(u_i) = X_i E(u_i) - X_i E(u_i) = 0$$

In the remainder of this book we will mostly use Assumption 6A, that the explanatory variables are fixed.

Assumption 7:  The explanatory variable(s) are measured without error.

u absorbs the influence of omitted variables and possibly errors of measurement in the Y's. That is, we will assume that the regressors are error-free, while the Y values may or may not include errors of measurement.

4.2.2 OTHER ASSUMPTIONS OF ORDINARY LEAST SQUARES

Assumption 8:  The explanatory variables are not perfectly linearly correlated.

If there is more than one explanatory variable in the relationship, it is assumed that they are not perfectly correlated with each other. Indeed, the regressors should not even be strongly correlated; that is, they should not be highly multicollinear.

Assumption 9:  The macro variables should be correctly aggregated.

Usually, the variables X and Y are aggregative variables, representing the sum of individual items. For example, in a consumption function C = b0 + b1Y + u, C is the sum of the expenditures of all consumers and Y is the sum of all individual incomes. It is assumed that the appropriate aggregation procedure has been adopted in compiling the aggregate variables.

Assumption 10:  The relationship being estimated is identified.

It is assumed that the relationship whose coefficients we want to estimate has a unique mathematical form, that is, it does not contain the same variables as any other equation related to the one being investigated. Only if this assumption is fulfilled can we be certain that the coefficients which result from our computations are the true parameters of the relationship which we study.

Assumption 11:  The relationship is correctly specified.

It is assumed that we have not committed any specification error in determining the explanatory variables, that we have included all the important regressors explicitly in the model, and that its mathematical form (number of equations and their linear or non-linear nature) is correct.

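4.3 THE DISTRIBUTION OF THE DEPENDENT VARIABLE Y

In this section we will establish that the dependent variable Y has a normal distribution with mean $E(Y_i) = b_0 + b_1 X_i$ and variance $\sigma^2$.

Proof 1. The mean of Yi is $E(Y_i) = b_0 + b_1 X_i$. By definition the mean of Yi is its expected value.

Given

$$Y_i = b_0 + b_1 X_i + u_i$$

taking expected values we find

$$E(Y_i) = E\{b_0 + b_1 X_i + u_i\} = E\{b_0 + b_1 X_i\} + E(u_i)$$

Given that b0 and b1 are parameters and that, by Assumption 6A, the values of the Xi's are a set of fixed numbers (in the process of hypothetical repeated sampling),

$$E\{b_0 + b_1 X_i\} = b_0 + b_1 X_i$$

Furthermore, by Assumption 2,

$$E(u_i) = 0$$

Therefore,

$$E(Y_i) = b_0 + b_1 X_i$$

Proof 2. The variance of Yi is $E[Y_i - E(Y_i)]^2 = \sigma^2$. Substitute $Y_i = b_0 + b_1 X_i + u_i$ and $E(Y_i) = b_0 + b_1 X_i$ in the definition of the variance:

$$E[Y_i - E(Y_i)]^2 = E\{b_0 + b_1 X_i + u_i - b_0 - b_1 X_i\}^2 = E(u_i^2) = \sigma^2$$

The last equality follows from Assumptions 2 and 3: since $E(u_i) = 0$, $E(u_i^2)$ is the (constant) variance of $u_i$. Finally, since Yi is a linear function of the normally distributed ui (Assumption 4), Yi is itself normally distributed.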
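The two results above can be checked numerically. The following is a minimal sketch, not part of the original derivation: it holds X fixed (Assumption 6A), draws u from a normal distribution with zero mean and constant variance, and compares the sample mean and variance of Y with b0 + b1 Xi and σ². The parameter values and the function name `simulate_y` are hypothetical, chosen only for illustration.

```python
import random
import statistics


def simulate_y(b0, b1, x, sigma, n_samples, seed=0):
    """Draw n_samples values of Y = b0 + b1*x + u at one fixed x, with u ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    return [b0 + b1 * x + rng.gauss(0.0, sigma) for _ in range(n_samples)]


# Hypothetical values, chosen only for illustration (not taken from the text).
b0, b1, x, sigma = 10.0, 2.0, 5.0, 3.0
ys = simulate_y(b0, b1, x, sigma, n_samples=20000)

mean_y = statistics.fmean(ys)      # should be close to b0 + b1*x = 20
var_y = statistics.pvariance(ys)   # should be close to sigma**2 = 9
print(round(mean_y, 2), round(var_y, 2))
```

With a large number of draws, the sample mean and variance settle near the theoretical values, which is what the hypothetical process of repeated sampling is meant to convey.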


Clearly the deviations of the observations from the lines depend on their constant intercept (b0) and their slope (b1). The choice among all possible lines is made on the basis of what is called the least squares criterion. The rationale of the criterion is easy to understand. It is intuitively obvious that the smaller the deviations from the line, the better the fit of the line to the scatter of observations. Consequently, from all possible lines we choose the one for which the deviations of the points are the smallest possible. The least squares criterion requires that the regression line be drawn (i.e. its parameters be chosen) in such a way as to minimise the sum of the squares of the deviations of the observations from it.

The first step is to draw the line so that the sum of the simple deviations of the observations is zero: some observations will lie above the line and will have a positive deviation, some will lie below the line, in which case they will have a negative deviation, and finally the points lying on the line will have a zero deviation. In summing these deviations the positive values will offset the negative values, so that the final algebraic sum of these residuals will equal zero ($\sum e_i = 0$). This of course does not mean that the deviations disappear when we fit the least squares line, but that their algebraic sum is by construction equal to zero. The best solution is to square the deviations and minimise the sum of the squares ($\sum e_i^2$). The reason for calling this method the least squares method should now be clear: the method seeks the minimisation of the sum of the squares of the deviations of the actual observations from the line. Our next task is to express the residual deviations (e's) in terms of the observed values of Y and X in our sample. In Figure 4.7 the estimated line is $\hat{Y} = \hat{b}_0 + \hat{b}_1 X$. As already mentioned, the sign (^) on top of the dependent variable indicates the estimated (predicted) value of the dependent variable, as distinguished from the observed value of this variable, which is represented by the simple letter $Y_i$.

 Fig. 4.7

If $\hat{b}_0$ and $\hat{b}_1$ are numerically known, from the estimated line we can obtain a prediction of Y, that is, an estimated value of the dependent variable ($\hat{Y}_i$) which corresponds to a given value of the explanatory variable ($X_i$). That is, for each given X, the corresponding $\hat{Y}$ lies on the line. For example, when X assumes the value $X_1$, the equation predicts that the dependent variable will assume the (estimated) value $\hat{Y}_1$. However, the actually observed value of the dependent variable which corresponds to $X_1$ is $Y_1$, which does not lie on the estimated line. It is apparent that the equation does not predict the values of the dependent variable with perfect accuracy. We have denoted by $e_i$ the difference between the observed value $Y_i$ and its estimated value $\hat{Y}_i$, that is

$$e_i = Y_i - \hat{Y}_i$$

Substituting $\hat{Y}_i = \hat{b}_0 + \hat{b}_1 X_i$ we find

$$e_i = Y_i - \hat{b}_0 - \hat{b}_1 X_i$$

Squaring these deviations and taking their sum we obtain

$$\sum e_i^2 = \sum (Y_i - \hat{b}_0 - \hat{b}_1 X_i)^2$$

The sum of squared residual deviations is to be minimised with respect to $\hat{b}_0$ and $\hat{b}_1$. Following the minimisation procedure we get the normal equations

$$\sum Y_i = n\hat{b}_0 + \hat{b}_1 \sum X_i$$

$$\sum X_i Y_i = \hat{b}_0 \sum X_i + \hat{b}_1 \sum X_i^2$$

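Formal derivation of the normal equations

We have to minimise the function

$$\sum e_i^2 = \sum (Y_i - \hat{b}_0 - \hat{b}_1 X_i)^2$$

with respect to $\hat{b}_0$ and $\hat{b}_1$. The necessary condition for a minimum is that the first derivatives of the function be equal to zero:

$$\frac{\partial \sum e_i^2}{\partial \hat{b}_0} = 0 \qquad \text{and} \qquad \frac{\partial \sum e_i^2}{\partial \hat{b}_1} = 0$$

To obtain the above derivatives we apply the 'function of a function' rule of differentiation. According to this rule, if y = f(w) and w = g(x), then

$$\frac{dy}{dx} = \frac{dy}{dw} \cdot \frac{dw}{dx}$$

In the case of the above function we let $(Y_i - \hat{b}_0 - \hat{b}_1 X_i) = w$; thus we have

Partial derivative with respect to $\hat{b}_0$:

$$\frac{\partial \sum e_i^2}{\partial \hat{b}_0} = \sum 2(Y_i - \hat{b}_0 - \hat{b}_1 X_i)(-1) = -2\sum (Y_i - \hat{b}_0 - \hat{b}_1 X_i) = 0 \qquad (4.5)$$

Partial derivative with respect to $\hat{b}_1$:

$$\frac{\partial \sum e_i^2}{\partial \hat{b}_1} = \sum 2(Y_i - \hat{b}_0 - \hat{b}_1 X_i)(-X_i) = -2\sum X_i (Y_i - \hat{b}_0 - \hat{b}_1 X_i) = 0 \qquad (4.6)$$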


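Combining equations (4.5) and (4.6) and performing the summations we get

$$\sum Y_i - \sum \hat{b}_0 - \sum \hat{b}_1 X_i = 0$$

$$\sum X_i Y_i - \sum \hat{b}_0 X_i - \sum \hat{b}_1 X_i^2 = 0$$

Applying the usual summation rules (see Appendix 1) we obtain the normal equations of OLS

$$\sum Y_i = n\hat{b}_0 + \hat{b}_1 \sum X_i$$

$$\sum X_i Y_i = \hat{b}_0 \sum X_i + \hat{b}_1 \sum X_i^2$$

Solving the normal equations for $\hat{b}_0$ and $\hat{b}_1$ we obtain the least squares estimates

$$\hat{b}_0 = \frac{\sum X^2 \sum Y - \sum X \sum XY}{n \sum X^2 - (\sum X)^2}$$

$$\hat{b}_1 = \frac{n \sum XY - \sum X \sum Y}{n \sum X^2 - (\sum X)^2}$$

It is clear that $\hat{b}_0$ and $\hat{b}_1$ can be estimated by substituting the terms n, ΣX, ΣY, ΣXY and ΣX², whose values can be obtained from the sample observations.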
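As a rough illustration of how the raw-sum formulae are applied, the sketch below computes the five terms n, ΣX, ΣY, ΣXY and ΣX² and substitutes them into the expressions for the intercept and the slope. The function name `ols_from_sums` and the data are hypothetical; the observations are constructed to lie exactly on Y = 2 + 3X so that the result is easy to verify by hand.

```python
def ols_from_sums(xs, ys):
    """Return (b0_hat, b1_hat) computed from the raw sums n, sum X, sum Y, sum XY, sum X^2."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    denom = n * sum_x2 - sum_x ** 2
    if denom == 0:
        raise ValueError("all X values are equal; the slope is not defined")
    b1_hat = (n * sum_xy - sum_x * sum_y) / denom
    b0_hat = (sum_x2 * sum_y - sum_x * sum_xy) / denom
    return b0_hat, b1_hat


# Hypothetical observations lying exactly on Y = 2 + 3X (illustration only).
xs = [1, 2, 3, 4, 5]
ys = [5, 8, 11, 14, 17]
b0_hat, b1_hat = ols_from_sums(xs, ys)

# Residuals e_i = Y_i - (b0_hat + b1_hat * X_i); their algebraic sum is zero.
residuals = [y - (b0_hat + b1_hat * x) for x, y in zip(xs, ys)]
print(b0_hat, b1_hat, sum(residuals))
```

Here the fit recovers an intercept of 2 and a slope of 3. With real data the residuals would not all be zero, but their algebraic sum would still vanish (up to rounding) by construction.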

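The above formulae are expressed in terms of the original sample observations on X and Y. It can be shown that the estimates $\hat{b}_0$ and $\hat{b}_1$ may be obtained by the following formulae, which are expressed in deviations of the variables from their means:

$$\hat{b}_0 = \bar{Y} - \hat{b}_1 \bar{X} \qquad \hat{b}_1 = \frac{\sum x_i y_i}{\sum x_i^2}$$

where $x_i = X_i - \bar{X}$ and $y_i = Y_i - \bar{Y}$.

Proof

(1) In Chapter 3 we established that $\sum x_i y_i = (n \sum XY - \sum X \sum Y) / n$. (This is expression 3.5 of Chapter 3.)

(2) Similarly we have proved (expression 3.6 of Chapter 3) that

$$\sum x_i^2 = \frac{n \sum X^2 - (\sum X)^2}{n}$$

(3) Substituting in the expression for $\hat{b}_1$ we find

$$\hat{b}_1 = \frac{n \sum XY - \sum X \sum Y}{n \sum X^2 - (\sum X)^2} = \frac{(n \sum XY - \sum X \sum Y)/n}{(n \sum X^2 - (\sum X)^2)/n} = \frac{\sum x_i y_i}{\sum x_i^2}$$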
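The equivalence of the deviation form and the raw-sum form of the slope can also be checked numerically. The sketch below is only an illustration under assumed data: the function name `ols_from_deviations` and the five observations are hypothetical.

```python
def ols_from_deviations(xs, ys):
    """Return (b0_hat, b1_hat) using deviations of X and Y from their means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sum_xy_dev = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sum_x2_dev = sum((x - x_bar) ** 2 for x in xs)
    b1_hat = sum_xy_dev / sum_x2_dev
    b0_hat = y_bar - b1_hat * x_bar
    return b0_hat, b1_hat


# Hypothetical sample (illustration only).
xs = [2, 4, 6, 8, 10]
ys = [3, 7, 8, 12, 15]
b0_hat, b1_hat = ols_from_deviations(xs, ys)

# Slope from the raw-sum formula, for comparison with the deviation form.
n = len(xs)
sum_x, sum_y = sum(xs), sum(ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))
sum_x2 = sum(x * x for x in xs)
b1_raw = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)

print(b0_hat, b1_hat, b1_raw)
```

Both routes give the same slope (1.45 for this sample), and the intercept follows from the means of the two variables.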

The solution of a system of equations may be obtained by the use of various methods.

Table 4.2. Worksheet for the estimation of the supply function of commodity Z.

Dividing the first normal equation through by n we obtain

$$\bar{Y} = \hat{b}_0 + \hat{b}_1 \bar{X}$$

that is, the regression line passes through the point defined by the means of the variables. This is a very useful result which we will use often in subsequent chapters.

Example: To illustrate the use of the above formulae we will estimate the supply function of commodity Z using the data in Table 4.2.

We substitute the computed values from Table 4.2 into the formulae for $\hat{b}_0$ and $\hat{b}_1$.


(1) Using the original sample observations.

(2) Using the deviations of the variables from their means

Thus the estimated supply function is

4.5 ESTIMATION OF A FUNCTION WHOSE INTERCEPT IS ZERO

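In some cases economic theory postulates relationships which have a zero constant intercept, that is, they pass through the origin of the XY plane. For example, linear production functions of manufactured products should normally have zero intercept, since output is zero when the factor inputs are zero. In this event we should estimate the function

$$Y_i = b_0 + b_1 X_i + u_i$$

imposing the restriction $b_0 = 0$. The formula for the estimation of $\hat{b}_1$ then becomes

$$\hat{b}_1 = \frac{\sum X_i Y_i}{\sum X_i^2}$$

which involves the actual values of the variables and not their deviations, as in the case of an unrestricted value of $\hat{b}_0$.

Proof: We want to fit the line $Y_i = b_0 + b_1 X_i + u_i$, subject to the restriction $b_0 = 0$. This is a restricted minimisation problem: we minimise

$$\sum e_i^2 = \sum (Y_i - \hat{b}_0 - \hat{b}_1 X_i)^2$$

subject to $\hat{b}_0 = 0$. Following the minimisation procedure for a constrained function, we form the composite function

$$\phi = \sum (Y_i - \hat{b}_0 - \hat{b}_1 X_i)^2 - \lambda \hat{b}_0$$

where λ is a Lagrange multiplier, and we minimise it with respect to $\hat{b}_0$, $\hat{b}_1$ and λ:

$$\frac{\partial \phi}{\partial \hat{b}_0} = -2\sum (Y_i - \hat{b}_0 - \hat{b}_1 X_i) - \lambda = 0 \qquad (1)$$

$$\frac{\partial \phi}{\partial \hat{b}_1} = -2\sum X_i (Y_i - \hat{b}_0 - \hat{b}_1 X_i) = 0 \qquad (2)$$

$$\frac{\partial \phi}{\partial \lambda} = -\hat{b}_0 = 0 \qquad (3)$$

Substituting (3) into (2) and re-arranging we obtain

$$\hat{b}_1 = \frac{\sum X_i Y_i}{\sum X_i^2}$$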
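A minimal sketch of the restricted estimator follows. It computes the slope of a line forced through the origin as ΣXY / ΣX², using the actual values of the variables rather than their deviations. The function name `slope_through_origin` and the input/output figures are hypothetical, standing in for a production function with zero output at zero input.

```python
def slope_through_origin(xs, ys):
    """Least squares slope of Y = b1*X, i.e. with the intercept restricted to zero."""
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    if sum_x2 == 0:
        raise ValueError("all X values are zero; the slope is not defined")
    return sum_xy / sum_x2


# Hypothetical factor inputs and outputs (illustration only).
inputs = [1, 2, 3, 4]
outputs = [2.1, 3.9, 6.2, 7.8]
b1_hat = slope_through_origin(inputs, outputs)
print(round(b1_hat, 2))
```

Note that the residuals of a line through the origin need not sum to zero, since the first normal equation no longer holds once the intercept is restricted.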

4.6 ESTIMATION OF ELASTICITIES FROM AN ESTIMATED REGRESSION LINE

We said that the estimated function

$$\hat{Y}_i = \hat{b}_0 + \hat{b}_1 X_i$$

is the equation of a line whose intercept is $\hat{b}_0$ and whose slope is $\hat{b}_1$. The coefficient $\hat{b}_1$ is the derivative of $\hat{Y}$ with respect to X

$$\hat{b}_1 = \frac{d\hat{Y}}{dX}$$

and shows the rate of change in $\hat{Y}$ as X changes by a very small amount. It should be clear that if the estimated function is a linear demand or supply function, the coefficient $\hat{b}_1$ is not the price elasticity, which is defined by the formula

$$\eta_p = \frac{dY}{dX} \cdot \frac{X}{Y}$$

where $\eta_p$ = price elasticity

             Y = quantity (demanded or supplied)

             X = price

Clearly $\hat{b}_1$ is the component dY/dX. From an estimated function we obtain an average elasticity

$$\eta_p = \hat{b}_1 \cdot \frac{\bar{X}}{\bar{\hat{Y}}} = \hat{b}_1 \cdot \frac{\bar{X}}{\bar{Y}}$$

where $\bar{X}$ = the average price in the sample

$\bar{\hat{Y}}$ = average regressed value of the quantity, i.e. the mean of the values $\hat{Y}_i$ estimated from the regression

$\bar{Y}$ = average value of the quantity in the sample.

Note that $\bar{\hat{Y}} = \bar{Y}$, that is, the mean of the estimated values of Y is equal to the mean of the actual (sample) values of Y, because

$$\sum \hat{Y}_i = \sum (Y_i - e_i) = \sum Y_i - \sum e_i = \sum Y_i$$

since $\sum e_i = 0$.
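The average elasticity can be computed in two short steps: estimate the slope, then scale it by the ratio of the sample means. The sketch below assumes a hypothetical supply schedule; the prices, quantities and the function name `average_elasticity` are illustrative and are not the data of Table 4.2.

```python
def average_elasticity(b1_hat, xs, ys):
    """Average elasticity: slope times (mean of X) / (mean of Y)."""
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    return b1_hat * x_bar / y_bar


# Hypothetical prices and quantities supplied (illustration only).
prices = [10, 12, 14, 16, 18]
quantities = [40, 44, 50, 54, 62]

# Slope from the deviation formula: sum(x*y) / sum(x^2), deviations from the means.
p_bar = sum(prices) / len(prices)
q_bar = sum(quantities) / len(quantities)
b1_hat = (sum((p - p_bar) * (q - q_bar) for p, q in zip(prices, quantities))
          / sum((p - p_bar) ** 2 for p in prices))

eta = average_elasticity(b1_hat, prices, quantities)
print(round(b1_hat, 3), round(eta, 3))
```

For these figures the slope is 2.7 while the average elasticity is about 0.756, which illustrates why the slope of a linear function should not be read as an elasticity.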

In our earlier example of the supply function the price elasticity of supply of

Exercises

1. Using the time series in the following table, estimate the consumption function and the savings function (in linear form) of the U.K. What is the marginal propensity to consume and the marginal propensity to save of the country? Interpret the intercepts of the two functions.

Year    Income in £ m    Consumption in £ m
1964    26,934           21,439
1965    28,729           22,833
1966    30,171           24,205
1967    31,781           25,307
1968    33,450           27,020

2. Assume that over the 1964-8 period direct taxation yielded the following revenues (£ m):

Year       1964    1965    1966    1967    1968
Revenue    4245    5029    5518    5960    6726

Using the income figures of exercise 1, obtain the tax (linear) equation and interpret its coefficients. Plot the tax regression line.


3. A random sample of ten families had the following income and food expenditure (in £ per week).

Families              A    B    C    D    E    F    G    H    I    J
Family income         20   30   33   40   15   13   26   38   35   43
Family expenditure    7    9    8    11   5    4    8    10   9    10

Estimate the regression line of food expenditure on income and interpret your results.

4. The following results have been obtained from a sample of 11 observations on the value of sales (Y) of a firm and the corresponding prices (X).

$\bar{X} = 519.18 \qquad \bar{Y} = 217.82$

$\sum X_i^2 = 3{,}134{,}543$

$\sum X_i Y_i = 1{,}296{,}836$

$\sum Y_i^2 = 539{,}512$

(i) Estimate the regression line of sales on price and interpret the results.

(ii) What is the part of the variation in sales which is not explained by the regression line?

(iii) Estimate the price elasticity of sales.

5. The following table gives the quantities of commodity z bought in each year from 1961 to 1970 and the corresponding prices.

Year                  1961   1962   1963   1964   1965   1966   1967   1968   1969   1970
Quantity (in tons)    770    785    790    795    800    805    810    820    840    850
Price (in £)          18     16     15     15     12     10     10     7      9      6

(i) Estimate the linear demand function for commodity z.

(ii) Calculate the price elasticity of demand.

(iii) Forecast the demand at the mean price of the sample.

(iv) Forecast the demand at P = 20.

Note: Additional exercises are included in Appendix III.  

 


  Quantitative Techniques in Management

  Chapter 12 : TIME SERIES

   

  OBJECTIVES

To learn why forecasting changes that take place over time is an important part of decision making.

To understand the four components of a time series.

To use regression-based techniques to estimate and forecast the trend in a time series.

To learn how to measure the cyclical components of a time series.

To compute seasonal indices and use them to deseasonalize a time series.

To be able to recognize irregular variation in a time series.

To deal simultaneously with all four components of a time series and to use time-series analysis for forecasting.

INTRODUCTION

Forecasting, or predicting, is an essential tool in any decision-making process. Its uses vary from determining inventory requirements for a local shoe store to estimating the annual sales of video games. The quality of the forecasts management can make is strongly related to the information that can be extracted and used from past data. Time-series analysis is one quantitative method we use to determine patterns in data collected over time. Table 1 is an example of time-series data.

Time-series analysis is used to detect patterns of change in statistical information over regular intervals of time. We project these patterns to arrive at an estimate for the future. Thus, time-series analysis helps us cope with uncertainty about the future.

Table - 1 Time Series for the number of Ships loaded at Morehead City, N.C.

Year 1985 1986 1987 1988 1989 1990 1991 1992

Number 98 105 116 119 135 156 177 208

EXERCISES 1

1. Of what value are forecasts in the decision-making process?

2. For what purpose do we apply time-series analysis to data collected over a period of time?

3. How can one benefit from determining past patterns?

4. How would errors in forecasts affect a city government?


VARIATION IN TIME SERIES

Four kinds of variation in time-series

We use the term time series to refer to any group of statistical information accumulated at regular intervals. There are four kinds of change, or variation, involved in time-series analysis. They are

1. Secular trend
2. Cyclical fluctuation
3. Seasonal variation
4. Irregular variation

SECULAR TREND With the first type of change, secular trend, the value of the variable tends to increase or decrease over a long period of time. The steady increase in the cost of living recorded by the Consumer Price Index is an example of secular trend. From year to individual year, the cost of living varies a great deal, but if we examine a long- term period, we see that the trend is toward a steady increase. Figure 1 (a) shows a secular trend in an increasing but fluctuating time series.

CYCLICAL FLUCTUATION The second type of variation seen in a time series is cyclical fluctuation. The most common example of cyclical fluctuation is the business cycle. Over time, there are years when the business cycle hits a peak above the trend line. At other times, business activity is likely to slump, hitting a low point below the trend line. The time between hitting peaks or falling to low points is at least 1 year, and it can be as many as 15 or 20 years. Figure 1(b) illustrates a typical pattern of cyclical fluctuation above and below a secular trend line. Note that the cyclical movements do not follow any regular pattern but move in a somewhat unpredictable manner.

SEASONAL VARIATION The third kind of change in time-series data is seasonal variation. As we might expect from the name, seasonal variation involves patterns of change within a year that tend to be repeated from year to year. For example, a physician can expect a substantial increase in the number of flu cases every winter and of poison ivy every summer. Since these are regular patterns, they are useful in forecasting the future. In Figure 1(c), we see a seasonal variation. Notice how it peaks in the fourth quarter of each year.

IRREGULAR VARIATION Irregular variation is the fourth type of change in time-series analysis. In many situations, the value of a variable may be completely unpredictable, changing in a random manner. Irregular variations describe such movements. The effects of the Middle East conflict in 1973 and the Iraqi situation in 1990 on gasoline prices in the United States are examples of irregular variation. Figure 1(d) illustrates irregular variation.

Figure 1: Time-series variations: (a) secular trend; (b) cyclical fluctuation; (c) seasonal variation; (d) irregular variation

Thus far, we have referred to a time series as exhibiting one or another of these four types of variation. In most instances, however, a time series will contain several of these components. Thus, we can describe the overall variation in a single time series in terms of these four different kinds of variation. In the following sections, we will examine the four components and the ways in which we measure each.

Careful analysis of time-series data helps managers analyze past trends and plan for future customer demand. Knowing about the three analyzable components (the secular trend, cyclical fluctuation, and seasonal variation) enables managers to have enough resources, such as staff and inventory, to meet customers' needs. Once these three components are well understood, it's much easier to interpret a seemingly erratic pattern of sales. Then the fourth component, irregular variation, can possibly be linked with an identifiable cause. You can see that the link between measured performance and causes fits well with a Total Quality Management approach to business.

EXERCISE 2

1. Identify the four principal components of a time series and explain the kind of change, over time, to which each applies.

2. Which of the four components of a time series would we use to describe the effect of Christmas sales upon a retail department store?

3. What is the advantage of decomposing a time series into its four components?

4. Which of the four components of a time series might the U.S. Department of Agriculture use to describe a 7-year weather pattern?

5. How would a war be accounted for in a time series?

6. What component of a time series explains the general growth and decline of the steel industry over the last two centuries?

TREND ANALYSIS

Two methods of fitting a trend line: Of the four components of a time series, secular trend represents the long-term direction of the series. One way to describe the trend component is to fit a line visually to a set of points on a graph. Any given graph, however, is subject to slightly different interpretations by different individuals. We can also fit a trend line by the method of least squares. In our discussion, we will concentrate on the method of least squares, since visually fitting a line to a time series is not a completely dependable process.

Reasons for Studying Trends

There are three reasons why it is useful to study secular trends:

1. The study of secular trends allows us to describe a historical pattern: There are many instances when we can use a past trend to evaluate the success of a previous policy. For example, a university may evaluate the effectiveness of a recruiting program by examining its past enrollment trends.

2. Studying secular trends permits us to project past patterns, or trends, into the future: Knowledge of the past can tell us a great deal about the future. Examining the growth rate of the world's population, for example, can help us estimate the population for some future time.

3. In many situations, studying the secular trend of a time series allows us to eliminate the trend component from the series: This makes it easier for us to study the other three components of the time series. If we want to determine the seasonal variation in ski sales, for example, eliminating the trend component gives us a more accurate idea of the seasonal component.

Trend lines take different forms.

Trends can be linear, or they can be curvilinear. Before we examine the linear, or straight-line, method of describing trends, we should remember that some relationships do not take that form. The increase of pollutants in the environment follows an upward-sloping curve similar to that in Figure 2(a). Another common example of a curvilinear relationship is the life cycle of a new business product, illustrated in Figure 2(b). When a new product is introduced, its sales volume is low (I). As the product gains recognition and success, unit sales grow at an increasingly rapid rate (II). After the product is firmly established, its unit sales grow at a stable rate (III). Finally, as the product reaches the end of its life cycle, unit sales begin to decrease (IV).

Figure - 2 Curvilinear trend relationships

Fitting the Linear Trend by the Least-Squares Method

Besides those trends that can be described by a curved line, there are others that are described by a straight line. These are called linear trends. Before developing the equation for a linear trend, we need to review the general equation for estimating a straight line, Equation 1.1.

Equation for estimating a Straight Line

$$\hat{Y} = a + bX \qquad (1.1)$$

We can describe the general trend of many time series using a straight line. But we are faced with the problem of finding the best-fitting line. We can use the least-squares method to calculate the best-fitting line, or equation. As we saw when fitting regression lines, the best-fitting line is determined by Equations 2.1 and 2.2:

$$b = \frac{\sum XY - n\bar{X}\bar{Y}}{\sum X^2 - n\bar{X}^2} \qquad (2.1)$$

$$a = \bar{Y} - b\bar{X} \qquad (2.2)$$

where

Y = values of the dependent variable
X = values of the independent variable
$\bar{Y}$ = mean of the values of the dependent variable
$\bar{X}$ = mean of the values of the independent variable
n = number of data points in the time series
a = Y-intercept
b = slope

With Equations 2.1 and 2.2, we can establish the best-fitting line to describe time-series data. However, the regularity of time-series data allows us to simplify the calculations in Equations 2.1 and 2.2 through the process we shall now describe.

Translating, or Coding, Time

Coding the time variable to simplify computation.

Normally, we measure the independent variable time in terms such as weeks, months, and years. Fortunately, we can convert these traditional measures of time to a form that simplifies the computation. We call this process coding. To use coding here, we find the mean time and then subtract that value from each of the sample times. Suppose our time series consists of only three points, 1989, 1990, and 1991. If we had to place these numbers in Equations 2.1 and 2.2, we would find the resultant calculations tedious. Instead, we can transform the values 1989, 1990 and 1991 into corresponding values of -1, 0, and 1, where 0 represents the mean (1990), -1 represents the first year (1989 - 1990 = -1) and 1 represents the last year (1991 - 1990 = 1).

Treating odd and even numbers of elements

We need to consider two cases when we are coding time values. The first is a time series with an odd number of elements, as in the previous example. The second is a series with an even number of elements. Consider Table 2. In part a, we have an odd number of years. Thus, the process is the same as the one we just described, using the years 1989, 1990, and 1991. In part b, we have an even number of elements. In cases like this, when we find the mean and subtract it from each element, the fraction 1/2 becomes part of the answer. To simplify the coding process and to remove the 1/2, we multiply each time element by 2. We will denote the "coded," or translated, time with a lowercase x.

Why use coding?

We have two reasons for this translation of time. First, it eliminates the need to square numbers as large as 1989, 1990, 1991 and so on. Second, this method sets the mean of the coded time variable, $\bar{x}$, equal to zero and allows us to simplify Equations 2.1 and 2.2.
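The coding rule for odd and even numbers of elements can be sketched in a few lines. This is an illustrative helper, not part of the original text: the function name `code_time` is hypothetical, and it assumes equally spaced, consecutive periods.

```python
def code_time(periods):
    """Code equally spaced time values so that the coded values have mean zero.

    Odd number of elements:  x = period - mean            (1-period steps)
    Even number of elements: x = 2 * (period - mean)      (half-period steps)
    """
    periods = list(periods)
    n = len(periods)
    mean = sum(periods) / n
    scale = 1 if n % 2 == 1 else 2
    return [round(scale * (p - mean)) for p in periods]


odd_case = code_time([1989, 1990, 1991])   # three years, mean 1990
even_case = code_time(range(1985, 1993))   # eight years, mean 1988 1/2
print(odd_case)
print(even_case)
```

In both cases the coded values sum to zero, which is what later allows the slope and intercept formulae to be simplified.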

Simplifying the calculation of a and b

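Now we can return to our calculations of the slope (Equation 2.1) and the Y-intercept (Equation 2.2) to determine the best-fitting line. Since we are using the coded variable x, we replace X and $\bar{X}$ by x and $\bar{x}$ in Equations 2.1 and 2.2. Then, since the mean of our coded time variable is zero, we can substitute 0 for $\bar{x}$ in Equations 2.1 and 2.2, as follows:

Equation 2.1 changes as follows:

$$b = \frac{\sum xY - n\bar{x}\bar{Y}}{\sum x^2 - n\bar{x}^2} \qquad (2.1)$$

$$b = \frac{\sum xY}{\sum x^2} \qquad (2.3)$$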


TABLE 2 Translating, or Coding, Time Value

Equation 2.2 changes as follows:

$$a = \bar{Y} - b\bar{x} \qquad (2.2)$$

$$a = \bar{Y} \qquad (2.4)$$

Equations 2.3 and 2.4 represent a substantial improvement over Equations 2.1 and 2.2.

A Problem Using the Least-Squares Method in a Time Series (Even Number of Elements)

Using the least-squares method

Consider the data in table 1, illustrating the number of ships loaded at Morehead City between 1985 and 1992. In this problem, we want to find the equation that will describe the secular trend of loadings. To calculate the necessary values for Equations 2.3 and 2.4, let us look at table 3.

Finding the slope and Y -intercept

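With these values, we can now substitute into Equations 2.3 and 2.4 to find the slope and the Y-intercept for the line describing the trend in ship loadings:

$$b = \frac{\sum xY}{\sum x^2} = \frac{1{,}266}{168} = 7.536 \qquad (2.3)$$

$$a = \bar{Y} = \frac{1{,}114}{8} = 139.25 \qquad (2.4)$$

Thus, the general linear equation describing the secular trend in ship loadings is

$$\hat{Y} = 139.25 + 7.536x \qquad (1.1)$$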

TABLE 3 Intermediate Calculations for computing the Trend

where

$\hat{Y}$ = estimated annual number of ships loaded
x = coded time value representing the number of half-year intervals (a minus sign indicates half-year intervals before 1988 1/2; a plus sign indicates half-year intervals after 1988 1/2).
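The intermediate sums behind the trend line can be reproduced directly from the ship-loading figures in Table 1. The sketch below is one way to lay out the calculation; the variable names are our own and the code simply applies Equations 2.3 and 2.4 to the coded time values.

```python
# Data from Table 1: ships loaded at Morehead City, 1985-1992.
years = list(range(1985, 1993))
ships = [98, 105, 116, 119, 135, 156, 177, 208]

# Even number of elements: code time in half-year intervals around the mean year.
mean_year = sum(years) / len(years)                   # 1988.5
x = [round(2 * (yr - mean_year)) for yr in years]     # -7, -5, ..., 5, 7

sum_xy = sum(xi * yi for xi, yi in zip(x, ships))     # sum of x*Y
sum_x2 = sum(xi ** 2 for xi in x)                     # sum of x squared

b = sum_xy / sum_x2              # slope, Equation 2.3
a = sum(ships) / len(ships)      # Y-intercept = mean of Y, Equation 2.4

print(sum_xy, sum_x2, round(b, 3), a)
```

Running this gives ΣxY = 1266 and Σx² = 168, so the slope is about 7.536 ships per half-year interval and the intercept is the mean of the series, 139.25.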

Projecting with the Trend Equation

Once we have developed the trend equation, we can project it to forecast the variable in question. In the problem of finding the secular trend in ship loadings, for instance, we determined that the appropriate secular trend equation was

$$\hat{Y} = 139.25 + 7.536x$$

Using our trend line to predict

Now, suppose we want to estimate ship loadings for 1993. First, we must convert 1993 to the value of the coded time (in half-year intervals):

$$x = 2\left(1993 - 1988\tfrac{1}{2}\right) = 9$$

Substituting this value into the equation for the secular trend, we get

$$\hat{Y} = 139.25 + 7.536(9) = 139.25 + 67.82 = 207.07$$

Therefore, we have estimated that 207 ships will be loaded in 1993. If the number of elements in our time series had been odd, not even, our procedure would have been the same except that we would have dealt with 1-year intervals, not half-year intervals.
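The projection step can be written as a small function. This is a sketch under the assumption that the trend line has intercept 139.25 (the mean of the Table 1 figures) and slope 7.536 per half-year interval; the function name `forecast_ships` and its default arguments are our own.

```python
def forecast_ships(year, a=139.25, b=7.536, mean_year=1988.5):
    """Project the linear trend to a given year, coding time in half-year intervals."""
    x = 2 * (year - mean_year)   # coded time for an even-length series
    return a + b * x


estimate = forecast_ships(1993)
print(round(estimate, 2), round(estimate))
```

For 1993 the coded time is 9 half-year intervals, and the projected value rounds to 207 ships. Projections far beyond the sample period should be treated with caution, since they assume the past trend continues unchanged.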

Use of a Second-degree Equation in a Time Series

Handling time series that are described by curves

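So far, we have described the method of fitting a straight line to a time series. But many time series are best described by curves, not straight lines. One such curve is the parabola, which is described mathematically by a second-degree equation:

$$\hat{Y} = a + bx + cx^2 \qquad (2.5)$$

where

$\hat{Y}$ = estimate of the dependent variable
a, b, and c = numerical constants
x = coded values of the time variable

Finding the values for a, b and c

Again we use the least-squares method to determine the second-degree equation to describe the best fit. The derivation of the second-degree equation is beyond the scope of this text. However, we can determine the value of the numerical constants (a, b and c) from the following three equations:

$$\sum Y = an + c\sum x^2 \qquad (2.6)$$

$$\sum x^2 Y = a\sum x^2 + c\sum x^4 \qquad (2.7)$$

$$b = \frac{\sum xY}{\sum x^2} \qquad (2.3)$$

When we find the values of a, b, and c by solving Equations 2.6, 2.7, and 2.3 simultaneously, we substitute these values into the second-degree equation, Equation 2.5.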

As in describing a linear relationship, we transform the dependent variable, time(X), into a coded form (x) to simplify the calculation. We'll now work through a problem in which we fit a parabola to a time series.

FIGURE 3 Form and equation for a parabolic curve

TABLE 4 Annual Sales of Electric Quartz Watches

A Problem Involving a Parabolic Trend (Odd Number of Elements in the Time Series)

In recent years, the sale of electric quartz watches has increased at a significant rate. Table 4 contains sales information that will help us determine the parabolic trend describing watch sales.

Coding the time variable

We organize the necessary calculations in Table 5. The first step in this process is to translate the independent variable X into a coded time variable x. Note that the coded variable x is listed in 1-year intervals because there is an odd number of elements in our time series. Thus, it is not necessary to multiply the variable by 2.

Calculating a, b, and c by substitution

Substituting the values from Table 5 into Equations 2.6, 2.7 and 2.3, we get

Now we must find a and c by solving equations (1) and (2).

1. Multiply equation (1) by 2 and subtract equation (2) from the doubled equation (1):

From equation (4), we readily find c:

2. Substitute the value for c into equation (1):

This gives us the appropriate values of a, b and c to describe the time series presented in Table 4 by the following equation:

TABLE 5 : Intermediate Calculations for Computing the Trend
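The substitution procedure can be sketched in Python as follows. The figures of Tables 4 and 5 are not reproduced in this text, so the sales values below (in millions of units, for 1988 to 1992) are an assumption, chosen only because they lead to the forecast of about 446.6 million quoted later in the section. The function name `fit_parabola` is also hypothetical. The sketch relies on the coded x values summing to zero, which is what lets the three normal equations separate.

```python
def fit_parabola(y):
    """Fit Y-hat = a + b*x + c*x**2 by least squares to an odd-length series.

    Time is coded as x = ..., -2, -1, 0, 1, 2, ... so that sum(x) and
    sum(x**3) are zero and the normal equations reduce to:
        sum(Y)      = a*n       + c*sum(x^2)
        sum(x^2 Y)  = a*sum(x^2) + c*sum(x^4)
        b           = sum(xY) / sum(x^2)
    """
    n = len(y)
    if n % 2 == 0:
        raise ValueError("this sketch assumes an odd number of observations")
    x = [i - n // 2 for i in range(n)]

    sum_y = sum(y)
    sum_x2 = sum(xi ** 2 for xi in x)
    sum_x4 = sum(xi ** 4 for xi in x)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2y = sum(xi ** 2 * yi for xi, yi in zip(x, y))

    b = sum_xy / sum_x2
    # Eliminate a from the two remaining equations to get c, then back-substitute.
    c = (n * sum_x2y - sum_x2 * sum_y) / (n * sum_x4 - sum_x2 ** 2)
    a = (sum_y - c * sum_x2) / n
    return a, b, c


# ASSUMED data: annual watch sales in millions of units, 1988-1992.
sales = [13, 24, 39, 65, 106]
a, b, c = fit_parabola(sales)
print(round(a, 1), round(b, 1), round(c, 2))
```

Under these assumed figures the fitted constants come out near a = 39.3, b = 22.7 and c = 5.07.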

Does our curve fit the data?

Let's graph the watch data to see how well the parabola we just derived fits the time series. We've done this in Figure 4.

Forecasts Based on a Second-degree Equation

Making the forecast: Suppose we want to forecast watch sales for 1997. To make a prediction, we must first translate 1997 into a coded variable x by subtracting the mean year, 1990:

\[ x = 1997 - 1990 = 7 \]

FIGURE 4 : parabolic trend fitted to data in Table-4

This coded value (x = 7) is then substituted into the second-degree equation describing watch sales:

We conclude, based on the past secular trend, that watch sales should be approximately 446,600,000 units by 1997. This extraordinarily large forecast suggests, however, that we must be more careful in forecasting with a parabolic curve than we are when using a linear trend. The slope of the second-degree equation in Figure 4 is continually increasing. Therefore, the parabolic curve may become a poor estimator as we attempt to predict further into the future. In using the second-degree equation method, we must also take into consideration factors that may be slowing or reversing the growth rate of the variable.
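A short Python sketch makes the warning concrete. The fitted equation itself is not reproduced in this text, so the constants below (a = 39.3, b = 22.7, c = 5.07, in millions of units) are assumptions; they were picked because they reproduce the 446.6 million forecast for x = 7. The helper names are hypothetical.

```python
# ASSUMED constants of the fitted parabola (millions of units).
A, B, C = 39.3, 22.7, 5.07


def parabolic_trend(x):
    """Evaluate Y-hat = a + b*x + c*x**2 at coded time x."""
    return A + B * x + C * x ** 2


def trend_slope(x):
    """Slope of the parabola at x: d(Y-hat)/dx = b + 2*c*x."""
    return B + 2 * C * x


x_1997 = 1997 - 1990          # coded time, mean year 1990
forecast_1997 = parabolic_trend(x_1997)

print(round(forecast_1997, 1))            # forecast in millions of units
print(trend_slope(2), trend_slope(x_1997))  # slope at the last data year vs. 1997
```

The slope at x = 7 is more than double the slope at x = 2 (the last observed year under the coding used here), which is why a forecast this far outside the data deserves extra caution.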


Being careful in interpreting the forecast:

In our watch example, we can assume that during the time period under consideration, the product is at a very rapid growth stage in its life cycle. But we must realize that as the cycle approaches a mature stage, sales will probably decelerate and no longer be predicted accurately by our parabolic curve. When we calculate predictions for the future, we need to consider the possibility that the trend line may change. Such a situation could cause considerable error. It is therefore necessary to exercise particular care when using a second-degree equation as a forecasting tool.

Recall that when we first studied regression we stated that an estimating equation is valid only over the same range as the one from which the sample was taken initially. Clearly, any time-series analysis asks us to look one or two periods beyond the range of the data we have collected. This can make good business sense. But many analysts run into trouble by extrapolating the time series of revenue of a rapidly growing company for many years into the future. A Wall Street proverb, "No tree grows to the sky," reminds us that trend lines that accurately describe the rapid growth of start-up "sapling" companies aren't necessarily good predictors of performance as companies mature.

CYCLICAL VARIATION

Cyclical Variation Defined

Cyclical variation is the component of a time series that tends to oscillate above and below the secular trend line for periods longer than 1 year. The procedure used to identify cyclical variation is the residual method.

Residual Method

When we look at a time series consisting of annual data, only the secular-trend, cyclical, and irregular components are considered. (This is true because seasonal variation makes a complete, regular cycle within each year and thus does not affect one year any more than another.) Since we can describe secular trend using a trend line, we can isolate the remaining cyclical and irregular components from the trend. We will assume that the cyclical component explains most of the variation left unexplained by the trend component. (Many real-life time series do not satisfy this assumption. Methods such as Fourier analysis and spectral analysis can analyze the cyclical component for such time series. These, however, are beyond the scope of this book.)

Expressing cyclical variation as a percent of trend:

If we use a time series composed of annual data, we can find the fraction of the trend by dividing the actual value (Y) by the corresponding trend value (\(\hat{Y}\)) for each value in the time series. We then multiply the result of this calculation by 100. This gives us the measure of cyclical variation as a percent of trend. We express this process in Equation 2.8:

\[ \text{Percent of trend} = \frac{Y}{\hat{Y}} \times 100 \qquad (2.8) \]

where

Y = actual time-series value

\(\hat{Y}\) = estimated trend value from the same point in the time series

Now let's apply this procedure.

Measuring variations

A farmers' marketing cooperative wants to measure the variations in its members' wheat harvest over an 8-year period. Table 6 shows the volume harvested in each of the 8 years. Column \(\hat{Y}\) contains the values of the linear trend for each time period. The trend line has been generated using the method illustrated earlier in this chapter. Note that when we graph the actual (Y) and the trend (\(\hat{Y}\)) values for the 8 years in Figure 5, the actual values move above and below the trend line.

TABLE 6: Grain Received by Farmers' Cooperative over 8 Years


Interpreting cyclical variations:

Now we can determine the percent of trend for each of the years in the sample (column 4 in Table 7). From this column, we can see the variation in actual harvests around the estimated trend (98.7 to 102.5). We can attribute these cyclical variations to factors such as rainfall and temperature. However, because these factors are relatively unpredictable, we cannot forecast any specific patterns of variation using the method of residuals.
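The percent-of-trend calculation can be sketched in Python. Tables 6 and 7 are not reproduced in this text, so the harvest and trend figures below are assumptions, chosen because they give the 98.7 to 102.5 range quoted above; the years are taken to run from 1985 to 1992. The function name `percent_of_trend` is hypothetical.

```python
def percent_of_trend(actual, trend):
    """Return (Y / Y-hat) * 100 for one observation."""
    return actual / trend * 100


# ASSUMED data for 1985-1992: actual harvest Y and linear trend value Y-hat.
years = list(range(1985, 1993))
actual = [7.5, 7.8, 8.2, 8.2, 8.4, 8.5, 8.7, 9.1]
trend = [7.6, 7.8, 8.0, 8.2, 8.4, 8.6, 8.8, 9.0]

percents = [round(percent_of_trend(y, t), 1) for y, t in zip(actual, trend)]

for year, pct in zip(years, percents):
    print(year, pct)
print("range:", min(percents), "to", max(percents))
```

Values above 100 mark years in which the harvest ran ahead of the trend, and values below 100 mark years in which it fell short.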

Expressing cyclical variations in terms of relative cyclical residual:

The relative cyclical residual is another measure of cyclical variation. In this method, the percentage deviation from the trend is found for each value. Equation 2.9 presents the mathematical formula for determining the relative cyclical residual. As with percent of trend, this measure is also a percentage.

\[ \text{Relative cyclical residual} = \frac{Y - \hat{Y}}{\hat{Y}} \times 100 \qquad (2.9) \]

where

Y = actual time-series value

\(\hat{Y}\) = estimated trend value from the same point in the time series

FIGURE 5 Cyclical fluctuations around the trend line

TABLE 7 Calculation of Percent of trend

Table 8 shows the calculation for the relative cyclical residual for the farmers' cooperative problem. Note that the easy way to compute the relative cyclical residual (column 5) is to subtract 100 from the percent of trend (column 4).

Comparing the two measures of cyclical variation

These two measures of cyclical variation, percent of trend and relative cyclical residual, are percentages of the trend. For example, in 1990, the percent of trend indicated that the actual harvest was 98.8 percent of the expected harvest for that year. For the same year, the relative cyclical residual indicated that the actual harvest was 1.2 percent short of the expected harvest (a relative cyclical residual of -1.2).
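The relationship between the two measures can be illustrated with a small Python sketch. The 1990 harvest and trend values used here (8.5 and 8.6) are assumptions, picked because they give the 98.8 and -1.2 figures quoted above; the function names are hypothetical.

```python
def percent_of_trend(actual, trend):
    """Return (Y / Y-hat) * 100."""
    return actual / trend * 100


def relative_cyclical_residual(actual, trend):
    """Return ((Y - Y-hat) / Y-hat) * 100."""
    return (actual - trend) / trend * 100


# ASSUMED 1990 values: actual harvest and trend value.
y_1990, y_hat_1990 = 8.5, 8.6

pct = percent_of_trend(y_1990, y_hat_1990)
rcr = relative_cyclical_residual(y_1990, y_hat_1990)

print(round(pct, 1))   # percent of trend
print(round(rcr, 1))   # relative cyclical residual
# The shortcut from the text: residual = percent of trend - 100.
print(round(pct - 100, 1))
```

Both measures carry the same information; the residual simply re-centres the percent of trend on zero so that the sign shows the direction of the deviation.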

Graphing cyclical variation

Frequently, we graph cyclical variation as the percent of trend. Figure 6 illustrates how this process eliminates the trend line and isolates the cyclical component of the time series. It must be emphasized that the procedures discussed in this section can be used only for describing past cyclical variations and not for predicting future cyclical variations. Predicting cyclical variation requires the use of techniques beyond the scope of this book.

Although seasonal variation is also a recurring cycle in past data, remember the rule that seasonal variation describes patterns that occur within one year, whereas cyclical variation means a pattern in a time series that is longer than one year in scope.

TABLE 8 : Calculations of relative Cyclical residuals

FIGURE 6 : Graph of percent of trend around the trend line for the data in Table 7


SEASONAL VARIATION

Seasonal variation defined

Besides secular trend and cyclical variation, a time series also includes seasonal variation. Seasonal variation is defined as repetitive and predictable movement around the trend line in one year or less. In order to detect seasonal variation, time intervals need to be measured in small units, such as days, weeks, months, or quarters. There are three main reasons for studying seasonal variation:

1. We can establish the pattern of past changes. This gives us a way to compare two time intervals that would otherwise be too dissimilar. If a flight training school wants to know if a slump in business during December is normal, it can examine the seasonal pattern in previous years and find the information it needs.

2. It is useful to project past patterns into the future. In the case of long-range decisions, secular-trend analysis may be adequate. But for short-run decisions, the ability to predict seasonal fluctuations is often essential. Consider a wholesale food chain that wants to maintain a minimum adequate stock of all items. The ability to predict short-range patterns, such as the demand for turkeys at Thanksgiving, candy at Christmas, or peaches in the summer, is useful to the management of the chain.

3. Once we have established the seasonal pattern that exists, we can eliminate its effects from the time series. This adjustment allows us to calculate the cyclical variation that takes place each year. When we eliminate the effect of seasonal variation from a time series, we have deseasonalized the time series.

Ratio-to-Moving-Average Method

Using the ratio-to-moving-average method of measuring seasonal variation

In order to measure seasonal variation, we typically use the ratio-to-moving-average method. This technique provides an index that describes the degree of seasonal variation. The index is based on a mean of 100, with the degree of seasonality measured by variations away from the base. For example, if we examine the seasonality of canoe rentals at a summer resort, we might find that the spring-quarter index is 142. The value 142 indicates that 142 percent of the average quarterly rentals occur in the spring. If management recorded 2,000 canoe rentals for all of last year, then the average quarterly rental would be 2,000/4 = 500. Since the spring-quarter index is 142, we estimate the number of spring rentals as follows:

\[ 500 \times \frac{142}{100} = 710 \text{ spring rentals} \]

An example of the ratio-to-moving-average method

Our chapter-opening example can illustrate the ratio-to-moving-average method. The resort hotel wanted to establish the seasonal pattern of room demand by its clientele. Hotel management wants to improve customer service and is considering several plans to employ personnel during peak periods to achieve this goal. Table 9 contains the quarterly occupancy, that is, the average number of guests during each quarter of the last 5 years.

We will refer to Table 9 to demonstrate the six steps required to compute a seasonal index.

TABLE 9: Time Series for Hotel Occupancy

Step 1: Calculate the 4-quarter moving total.

1. The first step in computing a seasonal index is to calculate the 4-quarter moving total for the time series. To do this, we total the values for the quarters during the first year, 1988, in Table 9: 1,861 + 2,203 + 2,415 + 1,908 = 8,387. A moving total is associated with the middle data point in the set of values from which it was calculated. Since our first total of 8,387 was calculated from four data points, we place it opposite the midpoint of those quarters, so it falls in column 4 of Table 10, between the rows for the 1988-II and 1988-III quarters.

We find the next moving total by dropping the 1988-I value, 1,861, and adding the 1989-I value, 1,921. By dropping the first value and adding the fifth, we keep four quarters in the total. The four values added now are 2,203 + 2,415 + 1,908 + 1,921 = 8,447. This total is entered in Table 10 directly below the first quarterly total of 8,387. We continue the process of "sliding" the 4-quarter total over the time series until we have included the last value in the series. In this example, it is the 1,967 rooms in the fourth quarter of 1992, the last number in column 3 of Table 10. The last entry in the moving total column is 8,793. It is between the rows for the 1992-II and 1992-III quarters, since it was calculated from the data for the 4 quarters of 1992.

Step 2: Compute the 4-quarter moving average.

2. In the second step, we compute the 4-quarter moving average by dividing each of the 4-quarter totals by 4. In Table 10, we divided the values in column 4 by 4 to arrive at the values for column 5.

Step 3: Center the 4-quarter moving average.

3. In the third step, we center the 4-quarter moving average. The moving averages in column 5 all fall halfway between the quarters. We would like to have moving averages associated with each quarter. In order to center our moving averages, we associate with each quarter the average of the two 4-quarter moving averages falling just above and just below it. For the 1988-III quarter, the resulting 4-quarter centered moving average is 2,104.25, that is, (2,096.75 + 2,111.75) / 2. The other entries in column 6 are calculated the same way. Figure 7 illustrates how the moving average has smoothed the peaks and troughs of the original time series. The seasonal and irregular components have been smoothed, and the resulting dotted colored line represents the cyclical and trend components of the series.
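Steps 1 to 3 can be sketched in Python as follows. Only the five occupancy values quoted in the text (the four quarters of 1988 and the first quarter of 1989) are used, since the rest of Table 9 is not reproduced here; the function names are illustrative assumptions.

```python
def moving_totals(values, window=4):
    """Step 1: sliding totals over `window` consecutive values."""
    return [sum(values[i:i + window]) for i in range(len(values) - window + 1)]


def moving_averages(values, window=4):
    """Step 2: divide each moving total by the window length."""
    return [total / window for total in moving_totals(values, window)]


def centered_moving_averages(values, window=4):
    """Step 3: average each pair of adjacent moving averages.

    For an even window the first centered value lines up with the
    observation at index window // 2 (here, the third quarter).
    """
    averages = moving_averages(values, window)
    return [(averages[i] + averages[i + 1]) / 2 for i in range(len(averages) - 1)]


# Occupancy for 1988-I to 1988-IV and 1989-I, as quoted in the text.
occupancy = [1861, 2203, 2415, 1908, 1921]

print(moving_totals(occupancy))             # 4-quarter moving totals
print(moving_averages(occupancy))           # 4-quarter moving averages
print(centered_moving_averages(occupancy))  # centered value for 1988-III
```

With an odd window (for example 7 days) the moving averages are already centered, so `moving_averages` alone would be enough and step 3 could be skipped.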

Sometimes step 3 can be skipped.

Suppose we were working with the admissions data for a hospital emergency room, and we wanted to compute daily indices. In steps 1 and 2, we would compute 7-day moving totals and moving averages, and the moving averages would already be centered (because the middle of a 7-day period is the fourth of those 7 days). In this case, step 3 is unnecessary. Whenever the number of periods for which we want indices is odd (7 days in a week, three shifts in a day), we can skip step 3. However, when the number of periods is even (4 quarters, 12 months, 24 hours), then we must use step 3 to center the moving averages we get with step 2.

Step 4: Calculate the percentage of actual value to moving average value.

4. Next, we calculate the percentage of the actual value to the moving-average value for each quarter in the time series having a 4-quarter moving-average entry. This step allows us to recover the seasonal component for the quarters. We determine this percentage by dividing each of the actual quarter values in column 3 of Table 10 by the corresponding 4-quarter centered moving-average values in column 6 and then multiplying the result by 100. For example, we find the percentage for 1988-III as follows:

\[ \frac{2{,}415}{2{,}104.25} \times 100 = 114.8 \]

TABLE 10 : Calculating the 4 - Quarter Centered Moving Average

Step 5: Collect answers from step 4 and calculate the modified mean.

5. Collect all the percentages of actual to moving-average values in column 7 of Table 10 and arrange them by quarter. Then calculate the "modified mean" for each quarter. The modified mean is calculated by discarding the highest and lowest values for each quarter and averaging the remaining values. In Table 11, we present the fifth step and show the process for finding the modified mean.

Reducing extreme cyclical and irregular variations

The seasonal values that we recovered for the quarters in column 7 of Table 10 still contain the cyclical and irregular components of variation in the time series. By eliminating the highest and lowest values from each quarter, we reduce the extreme cyclical and irregular variations. When we average the remaining values, we further smooth the cyclical and irregular components. Cyclical and irregular variations tend to be removed by this process, so the modified mean is an index of the seasonality component. (Some statisticians prefer to use the median value instead of computing the modified mean to achieve the same outcome.)
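The modified mean of step 5 can be sketched in Python. Table 11 is not reproduced in this text, so the four third-quarter percentages below are hypothetical values used only to show the mechanics; the function name `modified_mean` is likewise an assumption.

```python
def modified_mean(values):
    """Discard the single highest and lowest values, then average the rest."""
    if len(values) < 3:
        raise ValueError("need at least three values to trim both extremes")
    trimmed = sorted(values)[1:-1]
    return sum(trimmed) / len(trimmed)


# HYPOTHETICAL percentages of actual to moving average for one quarter
# across four years.
third_quarter = [114.8, 113.9, 115.6, 112.1]

print(modified_mean(third_quarter))
```

With only four values per quarter, the modified mean and the median coincide, because both reduce to the average of the two middle values.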

Step 6: Adjust the modified mean.

6. The final step, demonstrated in Table 12, adjusts the modified mean slightly. Notice that the four indices in Table 11 total 404.1. However, the base for an index is 100. Thus, the four quarterly indices should total 400, and their mean should be 100. To correct for this error, we multiply each of the quarterly indices in Table 11 by an adjusting constant. This number is found by dividing the desired sum of the indices (400) by the actual sum (404.1). In this case, the result is 0.9899. Table 12 shows that multiplying the indices by the adjusting constant brings the quarterly indices to a total of 400. (Sometimes even after this adjustment, the mean of the seasonal indices is not exactly 100 because of accumulated rounding error. In this case, however, it is exactly 100.)
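Step 6 can be sketched in Python. The individual modified means of Table 11 are not reproduced in this text, so the four values below are hypothetical; they were chosen only so that they total 404.1, the sum quoted above. The function name `adjust_indices` is an assumption.

```python
def adjust_indices(modified_means, desired_total=400.0):
    """Scale the modified means so that they sum to `desired_total`.

    Returns the adjusting constant and the adjusted seasonal indices.
    """
    constant = desired_total / sum(modified_means)
    return constant, [m * constant for m in modified_means]


# HYPOTHETICAL modified means for quarters I-IV (they total 404.1).
modified_means = [94.0, 104.3, 113.9, 91.9]

constant, seasonal_indices = adjust_indices(modified_means)

print(round(constant, 4))                        # adjusting constant
print([round(s, 1) for s in seasonal_indices])   # adjusted indices
print(round(sum(seasonal_indices), 6))           # should total 400
```

Rounding the adjusted indices to one decimal place can leave their sum slightly off 400, which is the accumulated rounding error the text mentions.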

TABLE 11 : Demonstration of Step 5 in Computing a Seasonal Index

TABLE 12 : Demonstration of Step 6

Use of the Seasonal index

Deseasonalizing a time series

The ratio-to-moving-average method just explained allows us to identify seasonal variation in a time series. The seasonal indices are used to remove the effects of seasonality from a time series. This is called deseasonalizing a time series. Before we can identify either the trend or cyclical components of a time series, we must eliminate seasonal variation. To deseasonalize a time series, we divide each of the actual values in the series by the appropriate seasonal index (expressed as a fraction of 100). To demonstrate, we shall deseasonalize the values of the first four quarters in Table 9. In Table 13, we show the deseasonalizing process using the values for the seasonal indices from Table 12. Once the seasonal effect has been eliminated, the deseasonalized values that remain reflect only the trend, cyclical, and irregular components of the time series.
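Deseasonalizing can be sketched in Python. The four 1988 occupancy values are the ones quoted in the text, but the seasonal indices of Table 12 are not reproduced here, so the indices below are hypothetical placeholders that merely sum to 400. The function name `deseasonalize` is an assumption.

```python
def deseasonalize(actual, seasonal_index):
    """Divide an actual value by its seasonal index expressed as a fraction of 100."""
    return actual / (seasonal_index / 100)


# Occupancy for 1988-I to 1988-IV, as quoted in the text.
occupancy_1988 = [1861, 2203, 2415, 1908]
# HYPOTHETICAL seasonal indices for quarters I-IV (sum to 400).
indices = [93.0, 103.3, 112.7, 91.0]

deseasonalized = [
    deseasonalize(value, index) for value, index in zip(occupancy_1988, indices)
]

for quarter, value in zip(["I", "II", "III", "IV"], deseasonalized):
    print(quarter, round(value))
```

Quarters with an index above 100 are scaled down and quarters with an index below 100 are scaled up, so the deseasonalized values vary far less than the raw ones.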

Using seasonality in forecasts

Once we have removed the seasonal variation, we can compute a deseasonalized trend line, which we can then project into the future. Suppose the hotel management in our example estimates from a deseasonalized trend line that the deseasonalized average occupancy for the fourth quarter of the next year will be 2,121. When this prediction has been obtained, management must then take the seasonality into account. To do this, it multiplies the deseasonalized predicted average occupancy of 2,121 by the fourth-quarter seasonal index (expressed as a fraction of 100) to obtain a seasonalized estimate of 1,930 rooms for the fourth-quarter average occupancy. Here are the calculations:
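The seasonalizing step can be sketched in Python. The fourth-quarter index itself is not shown in this text; the value 91.0 used below is an assumption, inferred only from the fact that it turns 2,121 into roughly 1,930. The function name `seasonalize` is hypothetical.

```python
def seasonalize(deseasonalized_value, seasonal_index):
    """Multiply a deseasonalized estimate by its seasonal index as a fraction of 100."""
    return deseasonalized_value * seasonal_index / 100


# ASSUMED fourth-quarter seasonal index (not given in the text).
fourth_quarter_index = 91.0

estimate = seasonalize(2121, fourth_quarter_index)
print(round(estimate))   # seasonalized fourth-quarter occupancy
```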

Adjusting quarterly and monthly data helps us detect an underlying secular trend. Unfortunately, most reported figures don't tell us how much adjustment was used, and for many business figures the unadjusted figure is important, too. For example, if a state motor vehicle department reports that last month's new vehicle registrations were at a seasonally adjusted rate of 24,000, how many new customers should the owner of a quick-lube service station anticipate three months from now? So, for internal company planning purposes it may be helpful to report both adjusted and unadjusted figures for decision makers to use.

TABLE 13 : Demonstration of Deseasonalizing

IRREGULAR VARIATION

The final component of a time series is irregular variation. After we have eliminated trend, cyclical, and seasonal variations from a time series, we still have an unpredictable factor left. Typically, irregular variation occurs over short intervals and follows a random pattern.

Difficulty of dealing with irregular variation

Because of the unpredictability of irregular variation, we do not attempt to explain it mathematically. However, we can often isolate its causes. New York City's financial crisis of 1975, for example, was an irregular factor that severely depressed the municipal bond market. In 1984, the unusually cold temperatures in late December in the southern states were an irregular factor that significantly increased electricity and fuel oil consumption. The Persian Gulf War in 1991 was another irregular factor; it significantly increased airline and ship travel for a number of months as troops and supplies were moved. Not all causes of irregular variation can be identified so easily, however. One factor that allows managers to cope with irregular variation is that over time, these random movements tend to counteract each other.

In presenting time-series data, managers do not attempt to "fit" a line to account for irregular variation. However, it is often noted with a comment on a graph, such as "Market closed for holiday," or as a footnote in a table (for example: "Spring break fell in March last year and April this year.").

PROBLEM INVOLVING ALL FOUR COMPONENTS OF A TIME SERIES

For a problem that involves all four components of a time series, we turn to a firm that specializes in producing recreational equipment. To forecast future sales based on an analysis of its past pattern of sales, the firm has collected the information in Table 14. Our procedure for describing this time series will consist of three stages:

1. Deseasonalizing the time series

2. Developing the trend line

3. Finding the cyclical variation around the trend line

TABLE 14 : Quarterly Sales

Step 1 : Computing seasonal indices

Since the data are available on a quarterly basis, we must first deseasonalize the time series. The steps to do this are shown in Tables 15 and 16. These steps are the same as those originally introduced in the section on seasonal variation.

TABLE 16: Step 5 and 6 in Computing the Seasonal Index

Finding the deseasonalized values

Once we have computed the quarterly seasonal indices, we can find the deseasonalized values of the time series by dividing the actual sales (in Table 14) by the seasonal indices. Table 17 shows the calculation of the deseasonalized time-series values.

Step 2: Developing the trend line using the least-squares method.

The second step in describing the components of the time series is to develop the trend line. We accomplish this by applying the least-squares method to the deseasonalized time series (after we have translated the time variable). Table 18 presents the calculations to identify the trend component.

With the values from Table 18, we can now find the equation for the trend. From Equations 2.3 and 2.4, we find the slope and Y-intercept for the trend line as follows:

The appropriate trend line is described using the straight-line equation (Equation 1.3), with X replaced by x:

\[ \hat{Y} = 18 + 0.16x \]
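The least-squares step can be sketched in Python. Table 18 is not reproduced in this text, so the sketch is exercised on synthetic values generated from a known line; they are not the firm's deseasonalized sales. The function names `coded_time` and `fit_trend` are assumptions. The sketch uses the usual coded-time shortcut in which the slope is the sum of xY over the sum of x squared and the intercept is the mean of Y.

```python
def coded_time(n):
    """Return coded time values that sum to zero.

    Odd n:  ..., -2, -1, 0, 1, 2, ...      (whole-period units)
    Even n: ..., -3, -1, 1, 3, ...         (half-period units)
    """
    if n % 2 == 1:
        return [i - n // 2 for i in range(n)]
    return [2 * i - (n - 1) for i in range(n)]


def fit_trend(y):
    """Least-squares line Y-hat = a + b*x on coded time x."""
    x = coded_time(len(y))
    b = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi ** 2 for xi in x)
    a = sum(y) / len(y)
    return a, b


# Twenty quarters (1988-I to 1992-IV) are coded -19, -17, ..., 17, 19.
x = coded_time(20)
print(x[0], x[-1])

# SYNTHETIC series lying exactly on Y-hat = 18 + 0.16x, for illustration only.
synthetic = [18 + 0.16 * xi for xi in x]
a, b = fit_trend(synthetic)
print(round(a, 2), round(b, 2))
```

Because the coded x values sum to zero, the slope and intercept can be computed separately, which is what makes the hand calculation in Table 18 manageable.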


Step 3: Finding the cyclical variation

We have now identified the seasonal and trend components of the time series. Next, we find the cyclical variation around the trend line. This component is identified by measuring deseasonalized variation around the trend line. In this problem, we will calculate cyclical variation in Table 19, using the residual method.

Assumptions about irregular variation

If we assume that irregular variation is generally short-term and relatively insignificant, we have completely described the time series in this problem using the trend, cyclical, and seasonal components. Figure 8 illustrates the original time series, its moving average (containing both the trend and cyclical components), and the trend line.

Predictions using a time series

Now, suppose that the management of the recreation company we have been using as an example wants to estimate the sales volume for the third quarter of 1993. What should management do?

TABLE 17: Calculation of Deseasonalized Time - Series values

TABLE 18 : Identifying the Trend Component

TABLE 19: Identifying the Cyclical variation

1. It has to determine the deseasonalized value for sales in the third quarter of 1993 by using the trend equation \(\hat{Y} = 18 + 0.16x\). This requires it to code the time, 1993-III. That quarter (1993-III) is three quarters past 1992-IV, which, as we see in Table 18, has a coded time value of 19. Adding 2 for each quarter, management finds x = 19 + 2(3) = 25. Substituting this value (x = 25) into the trend equation produces the following result:

\[ \hat{Y} = 18 + 0.16(25) = 22 \]

Thus, the deseasonalized sales estimate for 1993-III is $220,000 (the trend values are evidently expressed in units of $10,000). This point is shown on the trend line in Figure 8.

Step 2: Seasonalizing the initial estimate.

2. Now management must seasonalize this estimate by multiplying it by the third-quarter seasonal index, expressed as a fraction of 100:

FIGURE 8: Time series, trend line, and 4-quarter centered moving average for quarterly sales data in Table 14.


Caution in using the forecast

On the basis of this analysis, the firm estimates that sales for 1993-III will be $135,000. We must stress, however, that this value is only an estimate and does not take into account the cyclical and irregular components. As we noted earlier, the irregular variation cannot be predicted mathematically. Also, remember that our earlier treatment of cyclical variation was descriptive of past behavior and not predictive of future behavior.
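The two forecasting steps can be sketched together in Python. The trend constants (18 and 0.16) and the coded time of 19 for 1992-IV come from the text. Two things are assumptions: the third-quarter seasonal index of 61.4, which is not shown in this text and is inferred only from the ratio of the $135,000 and $220,000 figures, and the reading that sales are recorded in units of $10,000. The function names are hypothetical.

```python
def code_future_quarter(last_coded_x, quarters_ahead):
    """With an even number of quarters, coded time advances 2 per quarter."""
    return last_coded_x + 2 * quarters_ahead


def trend_value(x, a=18.0, b=0.16):
    """Deseasonalized trend estimate Y-hat = a + b*x."""
    return a + b * x


def seasonalize(value, seasonal_index):
    """Multiply by the seasonal index expressed as a fraction of 100."""
    return value * seasonal_index / 100


# 1993-III is three quarters past 1992-IV, whose coded time is 19.
x_1993_q3 = code_future_quarter(19, 3)
deseasonalized = trend_value(x_1993_q3)

# ASSUMED third-quarter seasonal index (not given in the text).
third_quarter_index = 61.4
seasonalized = seasonalize(deseasonalized, third_quarter_index)

# ASSUMED unit of the sales figures: $10,000.
print(x_1993_q3)
print(round(deseasonalized * 10_000))   # deseasonalized estimate in dollars
print(round(seasonalized * 10_000))     # seasonalized estimate in dollars
```

The estimate reflects only trend and seasonality; as the text notes, cyclical and irregular effects are left out.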

Remember that the complete analysis of a time series attempts to account for three factors: (1) seasonal variation, (2) the secular trend, and (3) cyclical variation. What's left is the fourth factor, irregular variation. Such an analysis is descriptive of past behaviour and not necessarily predictive of future behaviour.

EXERCISES 1

1. A state commission designed to monitor energy consumption assembled the following seasonal data regarding natural gas consumption, in millions of cubic feet:

(a) Determine the seasonal indices and deseasonalize these data (using a 4-quarter centered moving average).

(b) Calculate the least-squares line that best describes these data.

(c) Identify the cyclical variation in these data by the relative cyclical residual method.

(d) Plot the original data, the deseasonalized data, and the trend.

2. The following data describe the marketing performance of a regional beer producer:

(a) Calculate the seasonal indices for these data. (Use a 4-quarter centered moving average). (b) Deseasonalize these data using the indices from part (a).

3. For Exercise 2: (a) Find the least-squares line that best describes the trend in deseasonalized beer sales. (b) Identify the cyclical component in this time series by computing the percent of trend.

Time Series Analysis in Forecasting

In this chapter, we have examined all four components of a time series. We have described the process of projecting past trend and seasonal variation into the future while taking into consideration the inherent inaccuracies of this analysis. In addition, we noted that although the irregular and cyclical components do affect the future, they are erratic and difficult to use in forecasting.

Recognizing limitations of time-series analysis

We must realize that the mechanical approach of time-series analysis is subject to considerable error and change. It is necessary for management to combine these simple procedures with knowledge of other factors in order to develop workable forecasts. Analysts are constantly revising, updating, and discarding their forecasts. If we wish to cope successfully with the future, we must do the same.

When using the procedures described in this chapter, we should pay particular attention to two problems:

1. In forecasting, we project past trend and seasonal variation into the future. We must ask, "How regular and lasting were the past trends? What are the chances that these patterns are changing?"

2. How accurate are the historical data we use in time-series analysis? If a company has changed from a FIFO (first-in, first-out) to a LIFO (last-in, first-out) inventory accounting system during the period under consideration, the data (such as quarterly profits) before and after the change are not comparable and not very useful for forecasting.

The use of statistical techniques to fit a time-series model to data sometimes produces a compelling match, with much of the historical data accounted for by seasonal variation, trend, and cyclicality. But prudent managers don't necessarily assume that this "pretty picture" will continue. They combine the estimates generated by time-series analysis with the answers to broad "what-if?" questions in strategic planning. One such question is whether the business environment in future time periods will be similar to the periods for which data are available.

CASE 01

STATISTICS AT WORK

Lee Azko was resting on well-earned laurels. The complicated regression analysis for the results of advertising expenditures had given Sherrel Wright new confidence in making the argument for better planning. Even Walter Azko began to accept that some of the firm's success wasn't hit or miss: there really were some rules to this game.

"I never could see the value of running five- or six-page spreads," Uncle Walter said as he rounded the corner of Lee's "office," a cubicle that was furnished with little except one of the largest and fastest of Loveland's latest personal computers. "Thanks for showing I wa pensive newspaper advertising, too."

"Did Margot say anything about those focus groups?" Lee fished for another compliment.

"We're going to deal with that next week. It's too early to say. But don't get too comfortable. I have a whole new project for you. Go and see Gratia."

Gratia Delaguardia was clearly sharing a joke. The laughter was audible down the corridor. Gratia rated a "real" office, with a door. Lee found Gratia looking at a graph with yet another player on Loveland's team.

"Lee, come on in and meet Roberto Palomar. Bert runs the phone bank, you know, our order department. We were just talking about you."

"Hence the laughter?" Lee was nervous.

"No, no. Take a look at this. Bert's been trying to estimate the number of phone reps we need to have available to take orders. We need to plan for hiring."

"And to install enough incoming 800 lines," added Roberto, whom everyone called Bert.

"We plotted out the quarterly data," continued Gratia, "and, as an engineer, let me tell you I can recognize a nonlinear trend when I see one." Gratia pointed to a curve that looked like the path of the space shuttle going into orbit. "Of course, we aren't complaining about our growth. It's good to be on a winning team."

"But if we continue this trend," said Bert, sliding a ruler into place on the graph, "within 10 years, we'll have to employ the whole population of Loveland, just to staff our phone bank." With that, Gratia and Bert again dissolved with laughter. "Lee, look at these numbers and say it isn't so."

"Well, there's no doubt there's a very strong under lying trend," Lee observed, noting the obvious. "Is there nay seasonality -you know, differences from month to month?"

Good question, "Bert replied. "These quarterly totals tend to mask some of the monthly ups and downs. For example, August is always a bust because people are away on vacation. But December is a very heavy month. We're not really in the Christmas gift business, although some home users apparently do ask Santa for a new Loveland Computer. The main effect comes from small businesses that want to book equipment expenditures before the end of the year for tax purposes."

"And I don't suppose the call volume is evenly spaced over the week," Lee ventured.

"Ah, rainy days and Mondays!" Bert answered. "We have a rule of thumb that we do twice as much business on Mondays as on Tuesdays. So we try to avoid training sessions or staff meetings on Mondays. Sometimes the supervisory staff will pitch in, whatever it takes. If we miss a call, a potential customer may buy from one of our competitors.

"But now we're at the point where I really should plan a little better for the number of staff to have available. If I schedule too many people, it's a waste of money and the reps get bored. They'd rather be at home."

"Well, I think I can help," Lee offered. "Let me tell you what I'll need."

Study Questions: What data will Lee want to examine? What analysis will be performed? How will Bert make use of the information that Lee develops?

Review and Real-world Exercise

01. The owner of an air-conditioning and heating company is examining data regarding quarterly revenue (in thousands of dollars). He wants to determine the trend in his business.

(a) Calculate the seasonal indices for these data (use a 4-quarter centered moving average). (b) Deseasonalize these data using the indices from part (a). (c) Find the least-squares line that best describes these data.
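The procedure asked for in parts (a) and (b) can be sketched in a few lines of Python. The quarterly figures used here are hypothetical and are not the exercise's data; the function names are illustrative; and a plain mean of the ratios is assumed for each quarter, although some texts use a modified mean that discards the highest and lowest ratios before averaging.

```python
from statistics import mean

def centered_moving_average(values, period=4):
    """Map each position t to its centered moving average (even period)."""
    half = period // 2
    cma = {}
    for t in range(half, len(values) - half):
        first = mean(values[t - half:t + half])           # window centered at t - 0.5
        second = mean(values[t - half + 1:t + half + 1])  # window centered at t + 0.5
        cma[t] = (first + second) / 2                     # centered exactly at t
    return cma

def seasonal_indices(values, period=4):
    """Ratio-to-moving-average indices, scaled so they sum to 100 * period."""
    ratios = {q: [] for q in range(period)}
    for t, avg in centered_moving_average(values, period).items():
        ratios[t % period].append(100 * values[t] / avg)  # percent of actual to CMA
    raw = [mean(ratios[q]) for q in range(period)]
    scale = 100 * period / sum(raw)
    return [r * scale for r in raw]

def deseasonalize(values, indices):
    """Divide each actual value by its seasonal index expressed as a fraction."""
    period = len(indices)
    return [v / (indices[t % period] / 100) for t, v in enumerate(values)]

# Hypothetical quarterly revenue (thousands of dollars); the first value is a Q1.
revenue = [10, 14, 18, 12, 12, 16, 21, 13, 13, 18, 23, 15]
indices = seasonal_indices(revenue)
deseasonalized = deseasonalize(revenue, indices)
print([round(i, 1) for i in indices])
print([round(v, 2) for v in deseasonalized])
```

The deseasonalized series is the one to which the least-squares line of part (c) would then be fitted.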

02. Wheeler Airline, a regional carrier, has estimated the number of passengers to be 595,000 (deseasonalized) for the month of December. How many passengers should the company anticipate if the December seasonal index is 128?
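A question of this kind reverses the deseasonalizing step: the deseasonalized estimate is multiplied by the seasonal index expressed as a fraction of 100. The short sketch below applies that arithmetic to the figures stated in the exercise; the function name is illustrative.

```python
def seasonalized_forecast(deseasonalized_value, seasonal_index):
    """Apply a seasonal index (base 100) to a deseasonalized estimate."""
    return deseasonalized_value * seasonal_index / 100

# Figures from the exercise: 595,000 passengers, December index of 128.
december_passengers = seasonalized_forecast(595_000, 128)
print(december_passengers)  # 761600.0
```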

03. An EPA research group has measured the level of mercury contamination in the ocean at a certain point off the East Coast. The following percentages of mercury were found in the water:

Construct a 4-month centered moving average, and plot it on a graph along with the original data.

04. A production manager for a Canadian paper mill has accumulated the following information describing the millions of pounds processed quarterly:

(a) Calculate the seasonal indices for these data (percentage of actual to centered moving average).

(b) Deseasonalize these data, using the seasonal indices from part (a).

(c) Find the least-squares line that best describes these data.

(d) Estimate the number of pounds that will be processed during the spring of 1993.
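Parts (c) and (d) rely on a least-squares trend line fitted to the deseasonalized values. A minimal sketch follows, assuming time is coded so that the coded values sum to zero, which reduces the intercept to the mean of Y. The data are hypothetical and the function names are illustrative.

```python
from statistics import mean

def fit_trend(values):
    """Least-squares line Y-hat = a + b*x, with x coded so that sum(x) == 0."""
    n = len(values)
    center = (n - 1) / 2                      # midpoint of the time positions
    x = [t - center for t in range(n)]        # coded time
    a = mean(values)                          # intercept at the midpoint
    b = sum(xi * yi for xi, yi in zip(x, values)) / sum(xi * xi for xi in x)
    return a, b, center

def trend_value(a, b, center, t):
    """Trend estimate for time position t (0 = first observation)."""
    return a + b * (t - center)

# Hypothetical deseasonalized quarterly values (not the exercise's data).
deseasonalized = [21.0, 22.5, 23.1, 25.2, 26.0, 27.4, 28.1, 30.3]
a, b, center = fit_trend(deseasonalized)
next_quarter = trend_value(a, b, center, len(deseasonalized))
print(round(a, 3), round(b, 3), round(next_quarter, 3))
```

A trend estimate for a future quarter is still deseasonalized; it would be multiplied by that quarter's seasonal index (divided by 100) to obtain the seasonalized forecast.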

05. Describe some of the difficulties in using a linear estimating equation to describe these data:

(a) Gasoline mileage achieved by U.S. automobiles. (b) Fatalities in commercial aviation. (c) The grain exports of a single country. (d) The price of gasoline.

06. Magna International is a large Canadian manufacturer of automotive components such as molded door panels. Magna's 1992 annual report listed the company's revenues for the previous ten years (in millions of Canadian dollars):

(a) Find the least-squares trend line for these data. (b) Plot the annual data and the trend line on the same graph. Do the variations from the trend appear random or cyclical? (c) Use a computer-based regression package to find the best-fitting parabolic trend for these data. Is c, the coefficient of X², significantly different from zero? Which of the two trend models would you recommend using to forecast Magna's 1993 revenues? Explain. (d) Forecast Magna's 1993 revenues.

07. Comment on the difficulties you would have using a second-degree estimating equation to predict the future behavior of the process that generated these data:

(a) Sales of personal computers in the United States. (b) Use of video games in the United States. (c) Premiums for medical malpractice insurance. (d) The number of MBAs graduated from U.S. universities.

08. John Barry, a hospital administrator planning for a new emergency-room facility, has examined the number of patients who have visited the present facility during each of the last 6 years.

(a) Find the linear equation that describes the trend in the number of patients visiting the emergency room. (b) Estimate for him the number of patients the hospital's emergency room should be prepared to accommodate in 1994.

09. An assistant undersecretary in the U.S. Commerce Department has the following data describing the value of grain exported during the last 16 quarters (in billions of dollars).

(a) Determine the seasonal indices and deseasonalize these data (using a 4-quarter centered moving average). (b) Calculate the least-squares line that best describes these data. (c) Identify the cyclical variation in these data by the relative cyclical residual method. (d) Plot the original data, the deseasonalized data, and the trend.

10. Richie Bell's College Bicycle Shop has determined from a previous trend analysis that spring sales would be 165 bicycles (deseasonalized). If the spring seasonal index is 143, how many bicycles should the shop sell this spring?

11. With the U.S. Interstate Highway program nearly finished, of what use are old data to the manufacturers of heavy earth-moving equipment as they attempt to forecast sales? What new data would you suggest they utilize in their forecasting?

12. Automobile manufacturing is often cited as an example of a cyclical industry (one subject to changes in demand according to an underlying business cycle). Consider automobile production world-wide (in millions of units) and in the former U.S.S.R. (in hundreds of thousands of units) from 1970 through 1990:

(a) Find the least-squares trend line for the world-wide data. (b) Plot the world-wide data and the trend line on the same graph. Do the variations from the trend appear random or cyclical? (c) Plot the residuals as a percent of trend. Approximately how long is the business cycle shown by these data? (d) Consider the output of automobiles in the former U.S.S.R. Discuss its similarities and differences with the patterns you found in parts (a), (b), and (c).

13. R.B. Ritch Builders has completed these numbers of homes in the 8 years it has been in business:

(a) Develop a linear estimating equation to describe the trend of completions. (b) How many completions should R. B. plan on for 1996? (c) Along with the answer to part (b), what advice would you give R. B. about using this forecasting technique?

14. As part of an investigation being done by a federal agency into the psychology of criminal activity, a survey of the number of homicides and assaults over the course of a year produced the following results:

(a) If the corresponding seasonal indices are 84, 134, 103, and 79, respectively, what are the deseasonalized values for each season? (b) What is the meaning of the seasonal index of 79 for the winter season?

15. A state's quarterly deseasonalized unemployment percentage figures for years 1988-1992 are as follows:

(a) Find the linear equation that describes this unemployment trend. (b) Calculate the percent of trend for these data. (c) Plot the cyclical variation in the unemployment rates from the percent of trend.
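Parts (b) and (c) measure cyclical variation against the trend line. The sketch below shows the two closely related measures, percent of trend and the relative cyclical residual, assuming the trend values have already been computed from the fitted line. Both the actual and the trend figures here are hypothetical, and the function names are illustrative.

```python
def percent_of_trend(actual, trend):
    """Actual value as a percentage of the trend value: 100 * Y / Y-hat."""
    return [100 * y / y_hat for y, y_hat in zip(actual, trend)]

def relative_cyclical_residual(actual, trend):
    """Deviation from trend as a percentage of trend: 100 * (Y - Y-hat) / Y-hat."""
    return [100 * (y - y_hat) / y_hat for y, y_hat in zip(actual, trend)]

# Hypothetical deseasonalized unemployment rates and fitted trend values (percent).
actual = [7.5, 7.8, 7.2, 6.9, 7.4, 8.1]
trend = [7.4, 7.5, 7.6, 7.7, 7.8, 7.9]

pot = percent_of_trend(actual, trend)
rcr = relative_cyclical_residual(actual, trend)
for y, p, r in zip(actual, pot, rcr):
    print(f"{y:5.1f} {p:8.2f} {r:8.2f}")
```

The two measures carry the same information: each relative cyclical residual equals the corresponding percent of trend minus 100, so values above zero (or above 100) mark periods running above the trend.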

16. The numbers of confirmed AIDS cases reported at a local health clinic during the 5 years from 1988 to 1992 were 2, 4, 7, 13, and 21, respectively.

(a) Develop the linear regression line for these data. (b) Find the least-squares second-degree curve for these data. (c) Construct a table of each year's actual cases, the linear estimates from the regression in part (a), and the second-degree values from the curve in part (b). (d) Which regression appears to be the better estimator?
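Because this exercise states its data in full, the two fits can be sketched directly. The code below assumes time is coded as x = -2, -1, 0, 1, 2 for 1988 through 1992, which makes the sums of the odd powers of x vanish and simplifies the normal equations; the variable names are illustrative, and the shortcut for coding x applies only to an odd number of periods.

```python
cases = [2, 4, 7, 13, 21]                      # 1988..1992, from the exercise
years = list(range(1988, 1993))
n = len(cases)
x = [t - (n - 1) // 2 for t in range(n)]       # coded time: -2..2 (odd n only)

sum_y = sum(cases)
sum_x2 = sum(v ** 2 for v in x)
sum_x4 = sum(v ** 4 for v in x)
sum_xy = sum(v * y for v, y in zip(x, cases))
sum_x2y = sum(v * v * y for v, y in zip(x, cases))

# Linear fit: Y-hat = a + b*x
a_lin = sum_y / n                              # 9.4
b_lin = sum_xy / sum_x2                        # 4.7

# Second-degree fit: Y-hat = a + b*x + c*x^2 (normal equations with sum(x) = 0)
c_quad = (n * sum_x2y - sum_x2 * sum_y) / (n * sum_x4 - sum_x2 ** 2)   # 15/14
a_quad = (sum_y - c_quad * sum_x2) / n
b_quad = sum_xy / sum_x2

linear = [a_lin + b_lin * v for v in x]
quadratic = [a_quad + b_quad * v + c_quad * v * v for v in x]

# Table for part (c): year, actual, linear estimate, second-degree estimate
for year, y, lin, quad in zip(years, cases, linear, quadratic):
    print(f"{year} {y:6d} {lin:8.2f} {quad:8.2f}")

sse_linear = sum((y - e) ** 2 for y, e in zip(cases, linear))
sse_quadratic = sum((y - e) ** 2 for y, e in zip(cases, quadratic))
print(round(sse_linear, 3), round(sse_quadratic, 3))
```

A second-degree curve can never fit the sample worse than a straight line, so a smaller sum of squared errors is only part of the answer to (d); the size of the improvement and the plausibility of extrapolating the curve also matter.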

17. The manager of Pizza Parlor wishes to estimate the future sales of a new menu item based on the first 7 weeks of sales data. The weekly sales volumes are as follows:

(a) Find the linear regression line that best fits these data. (b) Estimate the expected number of sales for week 8. (c) Based on the estimate in part (b) and the available data, does the regression accurately describe the sales trend for this item?