
MB0050 Complete

Apr 07, 2018


Bela Jadeja
  • 8/4/2019 MB0050 Complete

    1/25

    Name : J. H. PATEL

    Roll No. : 511034080

    COURSE : MBA

    SEMESTER : 3 (THIRD)

    SUBJECT : RESEARCH METHODOLOGY

    SUB. CODE : MB0050

    ASSIGNMENT SET : 1


    Q 1. Why should a manager know about research when the job entails

    managing people, products, events, environments, and the like? [10 Marks]

    The manager, while managing people, products, events, and environments, will invariably face problems, big and small, and will have to seek ways to find long-lasting, effective solutions. This

    can be achieved only through knowledge of research even if consultants are engaged to solve

    problems. Managers are responsible for the final outcome by making the right decisions at work.

    This is greatly facilitated by research knowledge. Knowledge of research heightens the

    sensitivity of managers to the innumerable internal and external factors of a varied nature

    operating in their work and organizational environment. It also helps to facilitate effective

    interactions with consultants and comprehension of the nuances of the research process.

    Sophisticated technology such as simulation and model building is now available and may lend

    itself to profitable application in certain business areas. The recommendations of the external

    consultant who is proficient in this technology and urges its application in a particular situation

    may make no sense to, and might create some misgivings in, the manager not acquainted with

    research. Even a superficial knowledge of these techniques helps the manager to deal with the

    researcher in a mature and confident manner, so that dealing with experts does not result in

    discomfort. As the manager, you will be the one to make the final decision on the

    implementation of the recommendations made by the research team. Remaining objective,

    focusing on problem solutions, fully understanding the recommendations made, and why and

    how they are arrived at, make for good managerial decision making. Although company

    traditions are to be respected, there may be occasions where today's rapidly changing, turbulent

    environment would demand the substitution or re-adaptation of some of these traditions, based

    on research findings. Thus, knowledge of research greatly enhances the decision making skills of

    the manager.

    Managers with knowledge of research have an advantage over those without. Though you

    yourself may not be doing any major research as manager, you will have to understand, predict

    and control events that are dysfunctional to the organization. For example: a new product

    developed may not be taking off, or a financial investment may not be paying off as

    anticipated. Such disturbing phenomena have to be understood and explained. Unless this is


    done, it will not be possible to predict the future of that product or the prospects of that

    investment, and how future catastrophic outcomes can be controlled. A grasp of research

    methods will enable managers to understand, predict and control their environment. With the

    ever increasing complexity of modern organizations, and the uncertainty of environment they

    face, the management of organizational systems has become one of constant trouble shooting in

    the workplace. It would help if managers could sense, spot and deal with problems before they

    get out of hand. Knowledge of research and problem solving processes helps managers to identify

    problem situations before they get out of control. Although minor problems can be fixed by the

    manager, major problems would warrant the hiring of outside researchers or consultants. The

    manager who is knowledgeable about research can interact effectively with them. Knowledge

    about research processes, design and interpretation of data also helps managers to become

    discriminating recipients of the research findings presented, and to determine whether or not the

    recommended solutions are appropriate for implementation.

    Another reason why professional managers today need to know about research methods is that

    they will become more discriminating while sifting through the information disseminated in

    business journals. Some journal articles are more scientific and objective than others. Unless the

    manager is able to grasp fully what the published empirical research really conveys, she or he is

    likely to err in incorporating some of the suggestions such publications offer.

    There are several other reasons why professional managers should be knowledgeable about

    research and research methods in business. First, such knowledge sharpens the sensitivity of

    managers to the myriad variables operating in a situation and reminds them frequently of the

    multicausality and multifinality of phenomena, thus avoiding inappropriate, simplistic notions of

    one variable causing another. Second, when managers understand the research reports about their

    organizations handed to them by professionals, they will be equipped to take intelligent,

    educated, calculated risks with known probabilities attached to the success or failure of their

    decisions. Research then becomes a useful decision-making tool rather than a mass of

    incomprehensible statistical information. Third, because managers become knowledgeable

    about scientific investigations, vested interests inside or outside the organization will not prevail.

    For instance, an internal research group within the organization will not be able to distort


    information or manipulate the findings to their advantage if managers are aware of the biases that

    could creep into research and know how data are analyzed and interpreted.

    In summary, being knowledgeable about research and research methods helps professional

    managers to:

    1) Identify and effectively solve minor problems in the work setting.

    2) Know how to discriminate good from bad research.

    3) Appreciate and be constantly aware of the multiple influences and multiple effects of factors impinging on a situation.

    4) Take calculated risks in decision making, knowing full well the probabilities associated with the different possible outcomes.

    5) Prevent possible vested interests from exercising their influence in a situation.

    6) Relate to hired researchers and consultants more effectively.

    7) Combine experience with scientific knowledge while making decisions.

    Q 2. a. How do you evolve research design for exploratory research? Briefly

    analyze

    Exploratory research is a type of research conducted for a problem that has not been clearly

    defined. Exploratory research helps determine the best research design, data collection method

    and selection of subjects. It should draw definitive conclusions only with extreme caution. Given

    its fundamental nature, exploratory research often concludes that a perceived problem does not

    actually exist. Exploratory research often relies on secondary research such as reviewing

    available literature and/or data, or qualitative approaches such as informal discussions with

    consumers, employees, management or competitors, and more formal approaches through

    in-depth interviews, focus groups, projective methods, case studies or pilot studies. The Internet allows for

    research methods that are more interactive in nature. For example, RSS feeds efficiently supply

    researchers with up-to-date information; major search engine search results may be sent by email

    to researchers by services such as Google Alerts; comprehensive search results are tracked over

    lengthy periods of time by services such as Google; and websites may be created to attract

    worldwide feedback on any subject. The results of exploratory research are not usually useful for


    decision-making by themselves, but they can provide significant insight into a given situation.

    Although the results of qualitative research can give some indication as to the "why", "how" and

    "when" something occurs, it cannot tell us "how often" or "how many". Exploratory research is

    not typically generalizable to the population at large. Exploratory research is conducted when the

    researcher does not know how and why a certain phenomenon occurs, for example, how does the

    customer evaluate the quality of a bank, hotel or airline? While in the case of a manufactured

    product, quality is assessed on the basis of tangible features, replacement policy, warranty and so

    on, in the case of services there are no tangibles. To understand this phenomenon, several

    researchers have conducted focus group discussions to identify these quality parameters. For

    example, Zeithaml, Parasuraman and Berry identified variables which they clubbed under five

    groups. In doing so, they used focus groups. Since the prime goal of exploratory research is to

    know the unknown, this research is unstructured. Focus groups, interviews with key customer groups

    and experts, and even searches of printed or published information are some common techniques.

    Objective: To provide insights and understanding.

    Characteristics: Information needed is defined only loosely. Research process is flexible and

    unstructured. Sample is small and non-representative. Analysis of primary data is qualitative.

    Methods: Expert surveys, pilot surveys, secondary data (qualitative analysis).

    b. Briefly explain independent, dependent and extraneous variables in a

    research design. [5 Marks]

    A variable is something that can be changed, such as a characteristic or value. Variables are generally used in psychology experiments to determine if changes to one thing result in changes to another.

    Independent Variable:

    The independent variable is the variable that is controlled and manipulated by the experimenter.

    For example, in an experiment on the impact of sleep deprivation on test performance, sleep

    deprivation would be the independent variable. It is the factor that is measured, manipulated, or

    selected by the experimenter to determine its relationship to an observed phenomenon. "In a


    research study, independent variables are antecedent conditions that are presumed to affect a

    dependent variable. They are either manipulated by the researcher or are observed by the

    researcher so that their values can be related to that of the dependent variable. For example, in a

    research study on the relationship between mosquitoes and mosquito bites, the number of

    mosquitoes per acre of ground would be an independent variable." While the independent

    variable is often manipulated by the researcher, it can also be a classification where subjects are

    assigned to groups. In a study where one variable causes the other, the independent variable is

    the cause. In a study where groups are being compared, the independent variable is the group

    classification.

    Dependent Variable:

    The dependent variable is the variable that is measured by the experimenter. In our previous

    example, the scores on the test performance measure would be the dependent variable. It is the

    factor which is observed and measured to determine the effect of the independent variable, i.e.,

    that factor that appears, disappears, or varies as the experimenter introduces, removes, or varies

    the independent variable. "In a research study, the dependent variable defines a principal focus

    of research interest. It is the consequent variable that is presumably affected by one or more

    independent variables that are either manipulated by the researcher or observed by the researcher

    and regarded as antecedent conditions that determine the value of the dependent variable. For

    example, in a study of the relationship between mosquitoes and mosquito bites, the number of

    mosquito bites per hour would be the dependent variable" (Jaeger, 1990, p. 370). The dependent

    variable is the participant's response.

    The dependent variable is the outcome. In an experiment, it may be what was caused or what

    changed as a result of the study. In a comparison of groups, it is what they differ on.

    For example, we might change the type of information (e.g. organised or random) given to

    participants to see what effect this might have on the amount of information remembered. In this

    particular example the type of information is the independent variable (because it changes) and

    the amount of information remembered is the dependent variable (because this is being

    measured).


    and temperament to ensure that these factors do not interfere with the results. If, however, a

    variable cannot be controlled for, it becomes what is known as a confounding variable. This type

    of variable can have an impact on the dependent variable, which can make it difficult to

    determine if the results are due to the influence of the independent variable, the confounding

    variable or an interaction of the two.

    Suppose I wanted to measure the effects of alcohol (IV) on driving ability (DV). I would have to

    try to ensure that extraneous variables did not affect the results. These variables could include:

    Familiarity with the car: Some people may drive better because they have driven this make of car before.

    Familiarity with the test: Some people may do better than others because they know what to

    expect in the test.

    Used to drinking: The effects of alcohol on some people may be less than on others because they are used to drinking.

    Full stomach: The effect of alcohol on some subjects may be less than on others because they

    have just had a big meal.

    If these extraneous variables are not controlled they may become confounding variables, because they could go on to affect the results of the experiment.



    Q 3. A. Differentiate between Census survey and Sample Survey [5 Marks]

    Practically every country in the world conducts censuses and sampling surveys on a regular basis

    in order to get valuable data from and about their populations. This data is used by the federal

    and state governments in making numerous decisions with regard to various health care, housing,

    and educational issues, among others. While these two data-gathering methods essentially

    serve the same purpose, they have a number of differences with regard to approach and

    methodology, as well as scope. These two methods may also differ in terms of the variance in the

    data gathered, as you will see later.

    Scope

    A census involves the gathering of information from every person in a certain group. This may

    include information on age, sex and language among others. A sample survey on the other hand

    commonly involves gathering data from only a certain section of a particular group.

    Sampling Variance

    The main advantage of a census is a virtually zero sampling variance, mainly because the data

    used is drawn from the whole population. In addition, more precise detail can generally be

    gathered about smaller groups of the population.

    As for sampling, there is a possibility of sampling variance, since the data used is drawn from

    only a small section of the population. This makes sampling a much less accurate form of data

    collection than a census. In addition, the sample may be too small to provide an accurate picture

    of the population.
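
    The contrast in sampling variance can be illustrated with a small simulation. This is only a sketch: the population of ages, the sample size of 100 and the function names are made-up assumptions, not figures from any real census.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: ages of 10,000 people (made-up values).
population = [random.randint(18, 90) for _ in range(10_000)]


def census_mean(pop):
    """A census uses every member, so repeating it gives the same answer."""
    return statistics.mean(pop)


def sample_mean(pop, n):
    """A sample survey uses only n randomly chosen members."""
    return statistics.mean(random.sample(pop, n))


# Repeat each method five times and compare how much the results vary.
census_results = [census_mean(population) for _ in range(5)]
sample_results = [sample_mean(population, 100) for _ in range(5)]

census_variance = statistics.pvariance(census_results)  # no sampling variance
sample_variance = statistics.pvariance(sample_results)  # varies between draws

print("variance across censuses:", census_variance)
print("variance across samples: ", sample_variance)
```

    Repeating the census always returns the same mean, while each sample of 100 gives a slightly different estimate.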

    Cost and Timetable

    A census can be quite expensive to conduct, particularly for large populations. In most cases,

    they are also a lot more time-consuming than sample surveys. Adding considerably

    to the timetable is the necessity of gathering data from every single member of the population.

    The huge scope of a census also makes it harder to maintain control of the quality of the data.


    For instance, anyone who does not complete a census form will be visited by a government

    representative whose only job is to gather census data.

    A sample survey for its part costs quite a bit less than a census, since data is gathered from a

    much smaller group of people. In addition, sample surveys generally take a much shorter time to

    conduct, again given the smaller scope. This also means reduced requirements for respondents,

    which in turn leads to better data monitoring and quality control.

    Survey cycle and costs

    The development of most surveys follows the same cycle. Unless the survey has been conducted

    before, the survey cycle starts with the identification of the need for information by one or more

    clients or data users. This step is the most important since, without clear identification of user

    need, the purposes of the survey will be unclear and the development process will be flawed

    from the beginning.

    This starts the ongoing development process for data collection. User needs guide the

    planning process that leads into development and design of the survey. When data is collected

    and processed, estimation of prevalence and other analyses begin. These are disseminated and

    evaluated by, among others, the data user. So the process comes full circle. The diagram below

    illustrates the survey cycle.

    The Survey Cycle


    Several of these phases, and in particular, those that raise issues of special concern to disability

    data collection, are discussed in this and the following chapters.

    Considerations of cost are always relevant to the development of surveys. National statistical

    offices must always be aware of whether the potential benefits of the survey compensate for the

    costs of developing a survey, putting it into the field, and collecting and analyzing the data.

    National statistical agencies regularly record, not only total costs of data collection, but the costs

    of each phase. Cost information is essential for budgeting each phase of survey development

    within the organization, and may also be used to compare costs with those of other national

    statistical organizations.

    The most significant costs in any interviewer-administered data collection are the actual field

    costs to administer the survey and collect the data. The planning and development costs are

    significant and of course necessary to ensure high quality disability data. Costs of analysis and

    data dissemination are also major expenses. Usually, total development costs are roughly equal

    to total output related costs.

    Census cycle

    The cycle of phases of censuses is similar to that of surveys, as the diagram below illustrates. In

    particular, the census cycle begins and ends with evaluation of previously collected data and

    user consultation. The major difference between the two is that far more time is required for all

    phases of census collection, especially for additional consultation, design and testing

    procedures, including topic selection, government endorsement of final design, and quality

    assurance.


    The Census Cycle

    b. Analyze multi-stage and sequential sampling. [5 Marks]

    Multi-Stage Sampling

    Multi-stage sampling is a kind of complex sample design in which two or more levels of units

    are imbedded one in the other. For example: geographic areas (primary units), factories

    (secondary units), employees (tertiary units). At each stage, a sample of the corresponding units

    is selected. At first, a sample of primary units is selected, then, in each of those selected, a

    sample of secondary units is selected, and so on. All ultimate units (individuals, for instance)

    selected at the last step of this procedure are then surveyed.

    The reasons for adopting such a design may be reducing costs, for example, when interviewers

    are assigned to persons located in a restricted area, or reducing the sample error. Multi-stage sampling is sometimes used when no general sample frame exists. In this case, a first step is to

    select, at random, a sample of areas, collective units, or villages from a list where they are all

    registered (primary units). Then, for each selected primary unit, a comprehensive enumeration of

    all units of lower rank is made, thus obtaining a local sample frame among which a sample of

    secondary units will be selected.


    For example, for each village of the primary sample, a list of all housing units is established,

    allowing for a selection of a sample of households. Different probabilities can be used at each

    stage, as well as within one particular stage, for the different units to be selected. Probabilities at

    the successive stages multiply, so that the resulting probability for selecting one final unit is the

    product of the probabilities used at each step. The corresponding answers need to be weighted by

    the inverse of that final probability in order to obtain unbiased estimates. A cluster sample can be

    seen as a two-stage sample where the secondary probability is 100 percent.
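
    The multiplication of stage probabilities and the inverse-probability weighting can be worked through with a short sketch. All counts below (200 villages, 400 households per village, the three household sizes) are hypothetical numbers chosen for illustration.

```python
from fractions import Fraction

# Hypothetical two-stage design: villages are primary units,
# households within a selected village are secondary units.
p_village = Fraction(10, 200)    # 10 of 200 villages selected at stage one
p_household = Fraction(20, 400)  # 20 of 400 households selected per village

# Probabilities at the successive stages multiply.
p_final = p_village * p_household

# Each answer is weighted by the inverse of its final selection probability.
weight = 1 / p_final

# Made-up household sizes reported by three sampled households.
household_sizes = [4, 6, 5]
# Each sampled household stands in for `weight` households in the population.
weighted_total = sum(size * weight for size in household_sizes)

# A cluster sample is the special case where the secondary probability is 1:
# every household in a selected village is surveyed.
p_cluster = p_village * 1

print("final selection probability:", p_final)
print("weight per response:", weight)
```

    Exact fractions are used here only to keep the arithmetic free of rounding; ordinary floats would serve equally well in practice.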

    Sequential Sampling:

    Sequential sampling is a non-probability sampling technique wherein the researcher picks a single subject or a

    single or a group of subjects in a given time interval, conducts his study, analyzes the results then

    picks another group of subjects if needed and so on.

    In sequential sampling technique, there exists another step, a third option. The researcher can

    accept the null hypothesis, accept his alternative hypothesis, or select another pool of subjects

    and conduct the experiment once again. This entails that the researcher can obtain a limitless

    number of subjects before finally making a decision whether to accept his null or alternative

    hypothesis.
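
    The three-way choice described above (accept the null hypothesis, accept the alternative, or draw another group) can be sketched as a loop. The stopping rule here is a toy assumption: it compares a running mean with two hypothetical thresholds and is not a formal sequential test.

```python
import statistics


def sequential_decision(batches, lower, upper):
    """Toy three-way rule (illustrative only, not a formal sequential test).

    After each new batch, pool all observations so far and compare the
    running mean with two hypothetical thresholds:
      mean <= lower -> stop and accept the null hypothesis
      mean >= upper -> stop and accept the alternative hypothesis
      otherwise     -> draw another batch of subjects
    """
    pooled = []
    for batch_number, batch in enumerate(batches, start=1):
        pooled.extend(batch)
        running_mean = statistics.mean(pooled)
        if running_mean <= lower:
            return "accept null", batch_number
        if running_mean >= upper:
            return "accept alternative", batch_number
    # All available batches were used without reaching either threshold.
    return "undecided", len(batches)


# Made-up measurements for three successive groups of subjects.
batches = [[5, 6, 5], [7, 8, 7], [9, 9, 8]]
decision, batches_used = sequential_decision(batches, lower=4.0, upper=7.0)
print(decision, "after", batches_used, "batches")
```

    With these made-up numbers the first two groups leave the question open, so a third group is drawn before a decision is reached.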

    ADVANTAGES OF SEQUENTIAL SAMPLING

    The researcher has a limitless option when it comes to sample size and sampling schedule. The sample size can be relatively small or excessively large depending on the

    decision making of the researcher. Sampling schedule is also completely dependent on the

    researcher since a second group of samples can only be obtained after conducting the

    experiment on the initial group of samples.

    As mentioned above, this sampling technique enables the researcher to fine-tune his research methods and results analysis. Due to the repetitive nature of this sampling

    method, minor changes and adjustments can be done during the initial parts of the study

    to correct and hone the research method.

    There is very little effort on the part of the researcher when performing this sampling technique. It is not expensive, not time consuming and not workforce intensive.


    DISADVANTAGES OF SEQUENTIAL SAMPLING

    This sampling method is hardly representative of the entire population. Its only hope of approaching representativeness is when the researcher chooses to use a very large sample

    size significant enough to represent a big fraction of the entire population.

    The sampling technique is also hardly randomized. This contributes to the very low degree of representativeness of the sampling technique.

    Due to the aforementioned disadvantages, results from this sampling technique cannot be used to create conclusions and interpretations pertaining to the entire population.

    Q 4. List down various measures of central tendency and explain the

    difference between them? [10 marks].

    Statisticians use summary measures to describe patterns of data.

    Measures of central tendency refer to the summary measures used to describe the most

    "typical" value in a set of values.

    The Mean and the Median

    The two most common measures of central tendency are the median and the mean, which can be

    illustrated with an example. Suppose we draw a sample of five women and measure their weights. They weigh 100 pounds, 100 pounds, 130 pounds, 140 pounds, and 150 pounds.

    To find the median, we arrange the observations in order from smallest to largest value. If there

    is an odd number of observations, the median is the middle value. If there is an even number of

    observations, the median is the average of the two middle values. Thus, in the sample of five

    women, the median value would be 130 pounds, since 130 pounds is the middle weight.

    The mean of a sample or a population is computed by adding all of the observations and dividing by the number of observations. Returning to the example of the five women, the mean

    weight would equal (100 + 100 + 130 + 140 + 150)/5 = 620/5 = 124 pounds. In the general case,

    the mean can be calculated, using one of the following equations:

    Population mean = $\mu = \frac{\sum X}{N}$ OR Sample mean = $\bar{x} = \frac{\sum x}{n}$


    where $\sum X$ is the sum of all the population observations, N is the number of population

    observations, $\sum x$ is the sum of all the sample observations, and n is the number of sample

    observations.

    When statisticians talk about the mean of a population, they use the Greek letter $\mu$ to refer to the

    mean score. When they talk about the mean of a sample, statisticians use the symbol $\bar{x}$ to refer to

    the mean score.
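
    The median and mean of the five weights from the example above can be checked with a few lines of Python. The four-value sample is a hypothetical addition, included only to show the even-count rule for the median.

```python
import statistics

weights = [100, 100, 130, 140, 150]  # pounds, the five women in the example

# Odd number of observations: the median is the middle value once sorted.
median_weight = statistics.median(weights)

# The mean is the sum of the observations divided by their count.
mean_weight = sum(weights) / len(weights)

# Hypothetical even-sized sample: the median is the average of the
# two middle values (130 and 140).
even_sample = [100, 130, 140, 150]
even_median = statistics.median(even_sample)

print("median:", median_weight)
print("mean:", mean_weight)
print("median of even-sized sample:", even_median)
```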

    The Mean vs. the Median

    As measures of central tendency, the mean and the median each have advantages and

    disadvantages. Some pros and cons of each measure are summarized below.

    The median may be a better indicator of the most typical value if a set of scores has an outlier.

    An outlier is an extreme value that differs greatly from other values.

    However, when the sample size is large and does not include outliers, the mean score usually

    provides a better measure of central tendency.

    To illustrate these points, consider the following example. Suppose we examine a sample of 10

    households to estimate the typical family income. Nine of the households have incomes between $20,000 and $100,000; but the tenth household has an annual income of $1,000,000,000. That

    tenth household is an outlier. If we choose a measure to estimate the income of a typical

    household, the mean will greatly over-estimate the income of a typical family (because of the

    outlier); while the median will not.
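
    The household example can be reproduced numerically. The source gives only the range for the nine ordinary households, so the nine individual incomes below are assumed values within that range.

```python
import statistics

# Nine assumed incomes between $20,000 and $100,000 (illustrative values).
incomes = [20_000, 30_000, 40_000, 50_000, 60_000,
           70_000, 80_000, 90_000, 100_000]

# The tenth household is the outlier from the example.
incomes.append(1_000_000_000)

mean_income = statistics.mean(incomes)      # pulled far upward by the outlier
median_income = statistics.median(incomes)  # stays among the typical values

print("mean income:  ", mean_income)
print("median income:", median_income)
```

    The mean lands far above what any of the nine ordinary households earns, while the median remains inside their range.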

    Effect of Changing Units

    Sometimes, researchers change units (minutes to hours, feet to meters, etc.). Here is how measures of central tendency are affected when we change units.

    If you add a constant to every value, the mean and median increase by the same constant. For

    example, suppose you have a set of scores with a mean equal to 5 and a median equal to 6. If you


    add 10 to every score, the new mean will be 5 + 10 = 15; and the new median will be 6 + 10 =

    16.

    Suppose you multiply every value by a constant. Then, the mean and the median will also be

    multiplied by that constant. For example, assume that a set of scores has a mean of 5 and a

    median of 6. If you multiply each of these scores by 10, the new mean will be 5 * 10 = 50; and

    the new median will be 6 * 10 = 60.
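
    Both rules can be verified on a small data set. The five scores below are a hypothetical set chosen so that the mean is 5 and the median is 6, matching the numbers used in the text.

```python
import statistics

scores = [1, 2, 6, 7, 9]  # hypothetical scores: mean 5, median 6

# Adding a constant shifts both measures by that constant.
shifted = [s + 10 for s in scores]

# Multiplying by a constant scales both measures by that constant.
scaled = [s * 10 for s in scores]

for label, data in [("original", scores), ("plus 10", shifted), ("times 10", scaled)]:
    print(label, "mean:", statistics.mean(data), "median:", statistics.median(data))
```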

    OR

    A measure of central tendency is a single value that attempts to describe a set of data by

    identifying the central position within that set of data. As such, measures of central tendency are

    sometimes called measures of central location. They are also classed as summary statistics. The

    mean (often called the average) is most likely the measure of central tendency that you are most

    familiar with, but there are others, such as, the median and the mode.

    The mean, median and mode are all valid measures of central tendency but, under different

    conditions, some measures of central tendency become more appropriate to use than others. In

    the following sections we will look at the mean, mode and median and learn how to calculate

    them and under what conditions they are most appropriate to be used.

    Mean (Arithmetic)

    The mean (or average) is the most popular and well known measure of central tendency. It can

    be used with both discrete and continuous data, although its use is most often with continuous

    data (see our Types of Variable guide for data types). The mean is equal to the sum of all the

    values in the data set divided by the number of values in the data set. So, if we have n values in a

    data set and they have values x1, x2, ..., xn, then the sample mean, usually denoted by $\bar{x}$

    (pronounced "x bar"), is:

    $\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n}$

    This formula is usually written in a slightly different manner using the Greek capital letter $\Sigma$,

    pronounced "sigma", which means "sum of...":

    $\bar{x} = \frac{\sum x}{n}$

    You may have noticed that the above formula refers to the sample mean. So, why have we

    called it a sample mean? This is because, in statistics, samples and populations have very

    different meanings and these differences are very important, even if, in the case of the mean,

    they are calculated in the same way. To acknowledge that we are calculating the population

    mean and not the sample mean, we use the Greek lower case letter "mu", denoted as $\mu$:

    $$\mu = \frac{\sum x}{n}$$

    The mean is essentially a model of your data set: a single value that summarises the whole set. You will

    notice, however, that the mean is not often one of the actual values that you have observed in

    your data set. However, one of its important properties is that it minimises error in the prediction of any one value in your data set. That is, it is the value that produces the lowest amount of error

    (measured as the sum of squared differences) from all other values in the data set.

    An important property of the mean is that it includes every value in your data set as part of the

    calculation. In addition, the mean is the only measure of central tendency where the sum of the

    deviations of each value from the mean is always zero.
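
    The two properties described above can be checked numerically. The sketch below uses a small hypothetical data set (not from the text) and assumes that "error" is measured as the sum of squared differences, which is the sense in which the mean is the minimising value.

```python
from statistics import mean, median

# Hypothetical data set, used only for illustration.
data = [2, 4, 4, 5, 10]
centre = mean(data)

# Property 1: deviations from the mean always sum to zero.
deviation_sum = sum(x - centre for x in data)


def sum_squared_error(values, guess):
    """Total squared difference between each value and a single guess."""
    return sum((x - guess) ** 2 for x in values)


# Property 2: no other single guess gives a smaller squared error than the mean.
error_at_mean = sum_squared_error(data, centre)
error_at_median = sum_squared_error(data, median(data))

print(deviation_sum, error_at_mean, error_at_median)
```

    Here the squared error at the mean is smaller than the squared error at the median, which is what the minimising property predicts.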

    When not to use the mean

    The mean has one main disadvantage: it is particularly susceptible to the influence of outliers. These are values that are unusual compared to the rest of the data set by being especially small or

    large in numerical value. For example, consider the wages of staff at a factory below:

    Staff    1    2    3    4    5    6    7    8    9    10

    Salary   15k  18k  16k  14k  15k  15k  12k  17k  90k  95k

    The mean salary for these ten staff is $30.7k. However, inspecting the raw data suggests that this

    mean value might not be the best way to accurately reflect the typical salary of a worker, as most

    workers have salaries in the $12k to $18k range. The mean is being skewed by the two large

    salaries. Therefore, in this situation we would like to have a better measure of central tendency.

    As we will find out later, taking the median would be a better measure of central tendency in this

    situation.
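
    The factory example can be reproduced in a few lines of Python. This is only a sketch using the ten salaries listed above (in thousands of dollars); the variable names are illustrative.

```python
from statistics import mean, median

# Salaries of the ten staff from the table above, in thousands of dollars.
salaries_k = [15, 18, 16, 14, 15, 15, 12, 17, 90, 95]

mean_salary = mean(salaries_k)      # pulled upward by the two large salaries
median_salary = median(salaries_k)  # average of the 5th and 6th ordered values

print(f"mean   = ${mean_salary}k")
print(f"median = ${median_salary}k")
```

    The median falls inside the $12k to $18k range where most workers sit, whereas the mean does not.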

    Another time when we usually prefer the median over the mean (or mode) is when our data is

    skewed (i.e. the frequency distribution for our data is skewed). If we consider the normal

    distribution - as this is the most frequently assessed in statistics - when the data is perfectly

    normal then the mean, median and mode are identical. Moreover, they all represent the most

    typical value in the data set. However, as the data becomes skewed the mean loses its ability to

    provide the best central location for the data as the skewed data is dragging it away from the

    typical value. However, the median best retains this position and is not as strongly influenced by

    the skewed values. This is explained in more detail in the skewed distribution section later in this

    guide.

    Median

    The median is the middle score for a set of data that has been arranged in order of magnitude.

    The median is less affected by outliers and skewed data. In order to calculate the median,

    suppose we have the data below:

    65 55 89 56 35 14 56 55 87 45 92

    We first need to rearrange that data into order of magnitude (smallest first):

    14 35 45 55 55 56 56 65 87 89 92

    Our median mark is the middle mark - in this case 56, the sixth value in the ordered list. It is the middle mark

    because there are 5 scores before it and 5 scores after it. This works fine when you have an odd

    number of scores but what happens when you have an even number of scores? What if you had

    only 10 scores? Well, you simply have to take the middle two scores and average the result. So,

    if we look at the example below:

    65 55 89 56 35 14 56 55 87 45

    We again rearrange that data into order of magnitude (smallest first):

    14 35 45 55 55 56 56 65 87 89

    Only now we have to take the 5th and 6th score in our data set and average them to get a median

    of 55.5.
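
    The procedure described above (sort, then take the middle score, or average the two middle scores) can be sketched as a short Python function. The function name is illustrative; the two score lists are the ones used in the text.

```python
def median_by_hand(values):
    """Sort the values, then return the middle one (odd count)
    or the average of the two middle ones (even count)."""
    if not values:
        raise ValueError("median of an empty data set is undefined")
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


# The 11 scores from the text, and the same list without the final score (10 scores).
odd_scores = [65, 55, 89, 56, 35, 14, 56, 55, 87, 45, 92]
even_scores = odd_scores[:-1]

print(median_by_hand(odd_scores))   # middle (6th) of 11 ordered scores
print(median_by_hand(even_scores))  # average of the 5th and 6th ordered scores
```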

    Mode

    The mode is the most frequent score in our data set. It corresponds to the highest bar

    in a bar chart or histogram. You can, therefore, sometimes consider the mode as being the most

    popular option. An example of a mode is presented below:

    Normally, the mode is used for categorical data where we wish to know which is the most

    common category as illustrated below:

    We can see above that the most common form of transport, in this particular data set, is the bus.

    However, one of the problems with the mode is that it is not unique, so it leaves us with problems when we have two or more values that share the highest frequency, such as below:

    We are now stuck as to which mode best describes the central tendency of the data. This is

    particularly problematic when we have continuous data, as we are more likely not to have any

    one value that is more frequent than the other. For example, consider measuring the weights of 30

    people (to the nearest 0.1 kg). How likely is it that we will find two or more people

    with exactly the same weight, e.g. 67.4 kg? The answer is that it is very unlikely: many people

    might be close but with such a small sample (30 people) and a large range of possible weights

    you are unlikely to find two people with exactly the same weight, that is, to the nearest 0.1 kg.

    This is why the mode is very rarely used with continuous data.
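
    The three situations described above (a single mode, tied modes, and continuous data with no repeated value) can be illustrated with `statistics.multimode` from the Python standard library (available from Python 3.8). All data values below are hypothetical, since the original charts are not reproduced here.

```python
from statistics import multimode

# Hypothetical categorical data: one clear mode.
transport = ["bus", "car", "bus", "train", "car", "bus", "walk"]

# Hypothetical categorical data: two categories share the highest frequency.
tied_transport = ["bus", "car", "bus", "car", "train"]

# Hypothetical weights in kg (to the nearest 0.1 kg): every value is distinct.
weights_kg = [67.4, 70.1, 65.3, 80.2, 72.8]

single_mode = multimode(transport)
tied_modes = multimode(tied_transport)
weight_modes = multimode(weights_kg)

print(single_mode)   # one most common category
print(tied_modes)    # more than one mode: the mode is not unique
print(weight_modes)  # every value is returned, so the mode tells us nothing
```

    When every value occurs once, `multimode` returns the whole data set, which shows why the mode says little about continuous measurements.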

    Another problem with the mode is that it will not provide us with a very good measure of central

    tendency when the most common mark is far away from the rest of the data in the data set, as

    depicted in the diagram below:

    In the above diagram the mode has a value of 2. We can clearly see, however, that the mode is

    not representative of the data, which is mostly concentrated around the 20 to 30 value range. To

    use the mode to describe the central tendency of this data set would be misleading.

    Skewed Distributions and the Mean and Median

    We often test whether our data is normally distributed as this is a common assumption

    underlying many statistical tests. An example of a normally distributed set of data is presented

    below:

    When you have a normally distributed sample you can legitimately use either the mean or the

    median as your measure of central tendency. In fact, in any symmetrical distribution with a single peak the mean,

    median and mode are equal. However, in this situation, the mean is widely preferred as the best measure of central tendency as it is the measure that includes all the values in the data set for its

    calculation, and any change in any of the scores will affect the value of the mean. This is not the

    case with the median or mode.

    However, when our data is skewed, for example, as with the right-skewed data set below:

    we find that the mean is being dragged in the direction of the skew. In these situations, the median is

    generally considered to be the best representative of the central location of the data. The more

    skewed the distribution the greater the difference between the median and mean, and the greater

    emphasis should be placed on using the median as opposed to the mean. A classic example of the

    above right-skewed distribution is income (salary), where higher-earners provide a false

    representation of the typical income if expressed as a mean and not a median.
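
    The link between skew and the gap between mean and median can be sketched in Python. The income figures are hypothetical, and the skewness measure used here, Pearson's second skewness coefficient, 3 * (mean - median) / standard deviation, is an assumption added for illustration; it is not mentioned in the original text.

```python
from statistics import mean, median, stdev


def pearson_second_skewness(values):
    """3 * (mean - median) / sample standard deviation.
    Positive for right-skewed data, zero when mean and median coincide."""
    return 3 * (mean(values) - median(values)) / stdev(values)


# Hypothetical incomes in thousands: a few high earners create a right skew.
incomes_k = [20, 22, 25, 27, 30, 32, 35, 40, 90, 150]

# Hypothetical symmetric data for comparison.
symmetric = [10, 20, 30, 40, 50]

income_gap = mean(incomes_k) - median(incomes_k)

print("income gap (mean - median):", income_gap)
print("income skewness:", pearson_second_skewness(incomes_k))
print("symmetric skewness:", pearson_second_skewness(symmetric))
```

    In the hypothetical income data the mean sits well above the median, so the median is the more typical figure; in the symmetric data the two coincide.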

    If you expect a normal distribution, but tests of normality show that the data is non-normal,

    then it is customary to use the median instead of the mean. This is more a rule of thumb than a

    strict guideline, however. Sometimes, researchers wish to report the mean of a skewed

    distribution if the median and mean are not appreciably different (a subjective assessment) and if

    it allows easier comparisons to previous research to be made.

    Summary of when to use the mean, median and mode

    Please use the following summary table to know what the best measure of central tendency is

    with respect to the different types of variable.

    Type of Variable               Best measure of central tendency
    -----------------------------  --------------------------------
    Nominal                        Mode
    Ordinal                        Median
    Interval/Ratio (not skewed)    Mean
    Interval/Ratio (skewed)        Median
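
    The summary table can also be expressed as a small lookup function. This is only a sketch: the function name and the `skewed` flag are illustrative, and deciding whether data counts as skewed remains a judgment the researcher makes separately.

```python
def best_measure(variable_type, skewed=False):
    """Return the measure of central tendency suggested by the summary table."""
    kind = variable_type.strip().lower()
    if kind == "nominal":
        return "mode"
    if kind == "ordinal":
        return "median"
    if kind in ("interval", "ratio"):
        # Skewed interval/ratio data is better summarised by the median.
        return "median" if skewed else "mean"
    raise ValueError(f"unknown variable type: {variable_type!r}")


print(best_measure("Nominal"))
print(best_measure("Ordinal"))
print(best_measure("Ratio"))
print(best_measure("Interval", skewed=True))
```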