8/3/2019 6345 Descrptive Stat Excel
Lab2
Descriptive Statistics in Excel and plotting mean and standard deviation
Excel is a spreadsheet program that offers a simple, effective way to organize data. While it is not as powerful as the major statistical software packages, it can be used to compute descriptive statistics. You will use an Excel spreadsheet to calculate some descriptive statistics (mean, variance, etc.) on the masses of male and female crickets. I have entered the measured masses of male and female crickets into columns of the spreadsheet Cricket Mass Data.xls.
You can start answering some questions about the data by looking directly at the frequency distribution, but for other questions you will have to do some calculations by obtaining a set of descriptive statistics. These statistics summarize the measurements of our variable of interest by estimating two properties: central tendency and variability (dispersion). Central tendency tells you where the center of the data is (mean, median, mode). Dispersion, or variation, refers to the spread of the data around the center value (variance, standard deviation).
Central Tendency     Variation or Dispersion
Mean                 Variance
Median               Standard deviation
Begin by familiarizing yourself with the descriptive statistics functions.
There are two ways to compute descriptive statistics. One is to type a function manually. The other is to have Excel insert the function for you.
Under the Formulas tab, select Insert Function from the menu at the top of the spreadsheet.
This will give you a list of all the functions that Excel computes. We do not need all of them; only a few are pertinent to this exercise.
Getting Started
1. Once you have downloaded the file, open the Excel spreadsheet.
2. Spreadsheets are arranged in rows (numbered 1, 2, 3, ...) and columns (lettered A, B, C, ...).
The spreadsheet is arranged so that the masses of females are in column B and those of males are in column C. Labels are located at the top of each column. Move around the spreadsheet to become familiar with the dataset.
Calculate Descriptive Functions for Dataset
On the Excel worksheet, select the Formulas tab. On the far left side of the worksheet, select Insert Function as shown below:
In the popup menu, you are given two options for finding a function. The first option allows you to Search for a function. The second option provides a scroll-down list of all the functions that Excel performs. Appendix 1 provides a list and description of all the functions that you will need for this lab: average, sum, count, min, max, median, mode, var, stdev, skew, and kurt.
Position your cursor in cell F2. First, in the Search for a function option, type: average. Then type in the other functions listed above to make sure that you can find all the functions that you will need for the lab.
1. Begin by calculating the average (= mean) cricket mass for both males and females.
A. To calculate the mean cricket mass for females, position your cursor in cell F2, as shown below by the red box:
Under Search for a function, type average. Once you pull up the average function, select OK.
B. Now, you need to input the range of cells over which you wish to compute the average. First, delete any range already in the Number box by hitting the backspace key. Use your cursor to select the cells that contain all the data for the female cricket mass (B2:B60) as shown below. Once the cells are selected, click OK.
What is the result? You should see a number appear in F2: 413 is the mean cricket mass of females.
C. To calculate the mean cricket mass for males, position your cursor in cell G2. Insert the average function, first deleting any range already in the Number box by hitting the backspace key. Use your cursor to select the cells that contain all the data for the male cricket mass (C2:C35). Once the cells are selected, click OK.
2. Calculate the sum. Position your cursor in F3. Under the Formulas tab, select Insert Function and Search for a function. This time, search for sum. Select the cells containing all the data for the female cricket mass (B2:B60). Click OK. Position your cursor in G3, insert the same function, and select the cells for male cricket mass (C2:C35). Click OK.
3. Calculate the count. Position your cursor in F4. Under the Formulas tab, select Insert Function and Search for a function. This time, search for count, which will return your sample size. Select the cells containing all the data for the female cricket mass (B2:B60). Click OK. Position your cursor in G4, insert the same function, and select the cells for male cricket mass (C2:C35). Click OK.
4. Calculate the min. Position your cursor in F5. Under the Formulas tab, select Insert Function and Search for a function. This time, search for min. Select the cells containing all the data for the female cricket mass (B2:B60). Click OK. Position your cursor in G5, insert the same function, and select the cells for male cricket mass (C2:C35). Click OK.
5. Calculate the max. Position your cursor in F6. Under the Formulas tab, select Insert Function and Search for a function. This time, search for max. Select the cells containing all the data for the female cricket mass (B2:B60). Click OK. Position your cursor in G6, insert the same function, and select the cells for male cricket mass (C2:C35). Click OK.
6. Calculate the median. Position your cursor in F7. Under the Formulas tab, select Insert Function and Search for a function. This time, search for median. Select the cells containing all the data for the female cricket mass (B2:B60). Click OK. Position your cursor in G7, insert the same function, and select the cells for male cricket mass (C2:C35). Click OK.
7. Calculate the mode. Position your cursor in F8. Under the Formulas tab, select Insert Function and Search for a function. This time, search for mode. Select the cells containing all the data for the female cricket mass (B2:B60). Click OK. Position your cursor in G8, insert the same function, and select the cells for male cricket mass (C2:C35). Click OK.
8. Calculate the variance. Position your cursor in F9. Under the Formulas tab, select Insert Function and Search for a function. This time, search for var. Select the cells containing all the data for the female cricket mass (B2:B60). Click OK. Position your cursor in G9, insert the same function, and select the cells for male cricket mass (C2:C35). Click OK.
9. Calculate the standard deviation. Position your cursor in F10. Under the Formulas tab, select Insert Function and Search for a function. This time, search for stdev. Select the cells containing all the data for the female cricket mass (B2:B60). Click OK. Position your cursor in G10, insert the same function, and select the cells for male cricket mass (C2:C35). Click OK.
10. Calculate a skewness index. Position your cursor in F11. Under the Formulas tab, select Insert Function and Search for a function. This time, search for skew. Select the cells containing all the data for the female cricket mass (B2:B60). Click OK. Position your cursor in G11, insert the same function, and select the cells for male cricket mass (C2:C35). Click OK.
11. Calculate a kurtosis index. Position your cursor in F12. Under the Formulas tab, select Insert Function and Search for a function. This time, search for kurt. Select the cells containing all the data for the female cricket mass (B2:B60). Click OK. Position your cursor in G12, insert the same function, and select the cells for male cricket mass (C2:C35). Click OK.
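If you want to cross-check the spreadsheet results outside Excel, the statistics from steps 1 to 9 can be sketched in Python using only the standard library. This is a minimal illustration, not part of the lab: the masses below are made-up placeholder values (not the cricket data), and the variable names are assumptions.

```python
import statistics

# Hypothetical masses in mg, invented for illustration (NOT the lab's cricket data).
female_mass = [410, 395, 430, 410, 455, 388, 420]

# Each entry notes the Excel function from the lab that it mirrors.
summary = {
    "mean": statistics.mean(female_mass),          # =AVERAGE(range)
    "sum": sum(female_mass),                       # =SUM(range)
    "count": len(female_mass),                     # =COUNT(range); every entry here is a number
    "min": min(female_mass),                       # =MIN(range)
    "max": max(female_mass),                       # =MAX(range)
    "median": statistics.median(female_mass),      # =MEDIAN(range)
    "mode": statistics.mode(female_mass),          # =MODE(range)
    "variance": statistics.variance(female_mass),  # =VAR(range), sample variance (n - 1)
    "stdev": statistics.stdev(female_mass),        # =STDEV(range), sample standard deviation
}

for name, value in summary.items():
    print(f"{name:>8}: {value}")
```

Replacing the placeholder list with the values from column B or C should give numbers that agree with cells F2:F10 or G2:G10, up to rounding.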
Part 2. Graphic Display of Means and Standard Deviation
Let's now compare the means of males and females and their standard deviations by generating a graph in Excel.
You have already calculated the mean and standard deviation for the mass of female and male crickets. We want to display these graphically to judge whether the means of males and females differ and how much the sample observations overlap.
To do this, use your cursor to select the cells that contain the titles (females mass and males mass mg) and the cells with the mean estimates of male and female cricket mass. Under the Insert tab, select Two Dimensional Column, Clustered Column.
It should return a graph that looks like this:
There are several problems with the default graph in Excel. For one, it gives no estimate of the variation among observations (standard deviation or variance). In addition, Excel treats Females Mass and Males Mass as the same series. We want to add a standard deviation bar to each series (Females Mass and Males Mass), which requires that we create separate series.
First, we need to assign each variable (male and female mass) to a separate series. To do this, select your chart by positioning your cursor in the top right or left corner of the chart. Right click with your mouse until the following menu appears:
Choose Select Data.
Under Legend Entries (Series), select Series 1 and select Remove. Then click Add and click in the Series Name box. Position your cursor in the cell that contains the title for females (cell F1) and click on this title. Now we want to add the value of the mean for females. Under Series Values, click on the cell containing the average for females (cell F2). Click OK.
Now we need to create a separate series for the average male cricket mass. Click Add under Legend Entries (Series). Position your cursor in the cell that contains the title for males (cell G1). Now we want to add the value of the mean mass of males. Under Series Values, click on the cell containing the average mass for males (cell G2). Click OK.
Now, we want to edit our Horizontal Axis. In the same window, click on Edit. In the Axis Label Range box, type: Sex. Click OK.
Last, we want to add standard deviation bars to show the dispersion around each of the means.
Select the female cricket bar on the chart until it is highlighted:
Under the Layout tab, choose the Error Bars tab on the far right of the spreadsheet.
Select More Error Bars Options under the Error Bars tab.
Under Error Bars Display, choose the option Both.
Under Error Amount, choose the option Custom and click on Specify Value. Under Positive Error Value, choose the cell that contains the stdev for females (F10). Under Negative Error Value, choose the cell that contains the stdev for females (F10). Click OK.
Now, repeat the process for the second series.
Select the male cricket bar on the chart until it is highlighted:
Under the Layout tab, choose the Error Bars tab on the far right of the spreadsheet.
Select More Error Bars Options under the Error Bars tab.
Under Error Bars Display, choose the option Both.
Under Error Amount, choose the option Custom and click on Specify Value. Under Positive Error Value, choose the cell that contains the stdev for males (G10). Under Negative Error Value, choose the cell that contains the stdev for males (G10). Click OK.
You should now have a chart that looks something like this:
Let's use our knowledge from the first lab to format the chart.
Now, let's add a title to the y-axis.
Under the Layout tab, choose Axis Titles, then select Primary Vertical Axis Title, and choose Rotated Title. Change the text within the title by highlighting it with the mouse and typing the new title. The title on the y-axis should be: mean mass of crickets (mg). To change the font size, select the text, right click and choose Font. Change the font size to 12. Click OK.
Now, we want to format the bars.
Right click and select Format Data Series.
To change the color of the bars, select the series by positioning your cursor on the bar for the mean of either male or female cricket mass. Select Solid fill and choose an appropriate color. For presenting research data, it is common to use bold, solid colors, such as black or blue, or white with a black border.
Try changing the color of the outline of the bars on your chart. Right click and select Format Data Series. Under Border Color, select Solid Line and change the color to Blue or Black.
Almost there
Removing Grid Lines:
To remove the grid lines, position your cursor on one of the grid lines and click. Once all of them appear to be selected, right click with the mouse. Under Line Color, select Solid and then, from the drop-down color menu, select White. Like magic, the black lines disappear.
Make the graph bigger by selecting the corner of the chart and expanding it.
Last Step: Presenting the graph.
You can print the graph as you would any chart, or you can copy and paste it into a Word document. If you do this, you may wish to remove the outside border. Right click in the outside area around the chart. Select Format Chart Area and the following dialogue box will appear. Select None for Border and then click OK.
You can also change the scale of the y-axis.
Are you done?
Based on the data, you should be able to begin to answer the following questions:
What is the magnitude of the difference (effect size) between the mean cricket mass of females and males?
Is there considerable overlap among observations?
Which has the larger standard deviation: the mass of males or of females?
Remember, these are observations and do not, as of yet, represent STATISTICALLY significant differences. On to that next, after we transform data.
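As a rough numerical companion to the chart, the first two questions can be sketched in Python: the difference between the two means, and whether the ranges spanned by the error bars (mean minus 1 SD to mean plus 1 SD) overlap. The means and standard deviations below are placeholder values chosen for illustration, not results from the cricket data, and the helper names are hypothetical. Like the chart, this is descriptive only and says nothing about statistical significance.

```python
def sd_interval(mean, sd):
    """Return the (low, high) range from mean - 1 SD to mean + 1 SD."""
    return (mean - sd, mean + sd)


def intervals_overlap(a, b):
    """Return True when the two (low, high) ranges share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Placeholder summary values in mg (assumptions, NOT the lab's results).
female_mean, female_sd = 400.0, 50.0
male_mean, male_sd = 330.0, 40.0

difference = female_mean - male_mean
female_range = sd_interval(female_mean, female_sd)
male_range = sd_interval(male_mean, male_sd)

print("difference in means (mg):", difference)
print("female mean +/- 1 SD:", female_range)
print("male mean +/- 1 SD:", male_range)
print("error bars overlap:", intervals_overlap(female_range, male_range))
```

Substituting the values from cells F2, F10, G2, and G10 lets you compare the result with what you see on your own chart.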
You are now ready for SigmaStat!
Appendix 1. Summary of some of the useful commands in Excel
=AVERAGE(range of cells with the values of interest)
In the cell where you want the average, use the above function to calculate the sample mean.
The average, or sample mean, is the sum of the observations divided by the total number of observations. It is the most common indicator of the central tendency of a variable.
Sum refers to the sum of all the values in a range of values. For weight, for example, it is the sum of the weights of all the alligators in a sample. The Excel formula for sum is:
=SUM(range of cells with the values of interest)
Count refers to the number of cells that contain values (numbers). For our purposes, it will give us the sample size. The formula is:
=COUNT(range of cells with the values of interest)
Min is the lowest value in an array of values. The formula is:
=MIN(range of cells with the values of interest)
Max is the largest value in an array of values. The formula is:
=MAX(range of cells with the values of interest)
The median is another measure of central tendency. To get the median you have to order the data from lowest to highest. The median is the number in the middle. If the number of cases is odd, the median is the single middle value; for an even number of cases, the median is the average of the two numbers in the middle. The Excel formula is:
=MEDIAN(range of cells with the values of interest)
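The odd and even cases described above can be sketched in a few lines of Python. This is an illustrative implementation of the textbook definition, with made-up input values; it is not how Excel computes MEDIAN internally.

```python
def median(values):
    """Median by the textbook rule: sort, then take the middle value(s)."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd number of cases: the single middle value.
        return ordered[middle]
    # Even number of cases: the average of the two middle values.
    return (ordered[middle - 1] + ordered[middle]) / 2


print(median([5, 1, 3]))     # odd count: middle of 1, 3, 5 is 3
print(median([5, 1, 3, 7]))  # even count: (3 + 5) / 2 = 4.0
```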
The mode refers to the most frequent, repeated, or common number in the data. For example, if there are more alligators that are 86 cm in length in the sample than alligators of any other length, the mode is 86 cm. If you see the term #N/A, this means that all values in the sample are unique. The Excel formula is:
=MODE(range of cells with the values of interest)
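A small Python sketch of the same idea follows. The function name and the sample lengths are hypothetical, and returning None is simply a stand-in for the #N/A that Excel displays when no value repeats; ties between equally frequent values are not treated specially here.

```python
from collections import Counter


def most_frequent_value(values):
    """Return the most frequent value, or None when every value is unique.

    Assumes a non-empty list of values.
    """
    counts = Counter(values)
    value, frequency = counts.most_common(1)[0]
    return value if frequency > 1 else None


# Hypothetical lengths in cm, invented for illustration.
print(most_frequent_value([86, 90, 86, 75, 86, 90]))  # 86 occurs most often
print(most_frequent_value([70, 80, 90]))              # all unique, so None
```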
Range is a measure of dispersion. It is simply the difference between the largest and smallest values, max − min.
The sample variance measures the dispersion of the data around the mean. It is approximately the mean of the squared distances from the mean: the sum of the squared deviations is divided by n − 1 rather than n. It is calculated by:
$$s^2 = \frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n - 1}$$
A higher variance means more dispersion around the mean. The Excel formula is:
=VAR(range of cells with the values of interest)
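To make the n − 1 denominator concrete, here is a minimal Python sketch that computes the sample variance step by step and compares it with the standard library's `statistics.variance`. The four data values are invented for illustration.

```python
import statistics


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = [(y - mean) ** 2 for y in values]
    return sum(squared_deviations) / (n - 1)


data = [2, 4, 4, 6]  # made-up values: mean 4, squared deviations 4, 0, 0, 4

print(sample_variance(data))      # 8 / 3
print(statistics.variance(data))  # same quantity from the standard library
```

Dividing the same sum of squares by n instead of n − 1 would give 2.0 here, which is the population variance rather than the sample variance used in this lab.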
The standard deviation is the square root of the variance. It indicates how closely the data cluster around the mean. The standard deviation will be important when we study the properties of a normal distribution. Assuming a normal distribution, about 68% of the values are within 1 s of the mean, about 95% within 2 s, and about 99.7% within 3 s.
$$s = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n - 1}}$$
The Excel formula for the sample standard deviation is:
=STDEV(range of cells with the values of interest)
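The percentages quoted for the normal distribution can be checked with Python's standard library. This sketch uses `statistics.NormalDist` (available in Python 3.8 and later) for a standard normal curve; the helper name is an assumption made for illustration.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)


def fraction_within(k):
    """Fraction of a normal distribution lying within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} s of the mean: {fraction_within(k):.4f}")
```

The three values come out close to 0.68, 0.95, and 0.997, matching the rule of thumb above.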
Skewness measures the asymmetry of the data, that is, whether one of the tails of an otherwise normal curve is longer than the other. It provides a rough test for normality in the data (by dividing it by its SE). If it is positive, most of the data lie on the left side of the curve and the right tail is longer (right skewed; the median and the mode are lower than the mean). A negative value indicates that the mass of the data is concentrated on the right of the curve (the left tail is longer, left skewed; the median and the mode are higher than the mean). As a rule of thumb, a skew between −2 and +2 is taken as consistent with a normal curve. The Excel formula for skewness is:
=SKEW(range of cells with the values of interest)
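For reference, the sign convention can be illustrated with a short Python sketch. It assumes the sample skewness formula that Microsoft documents for SKEW, n / ((n − 1)(n − 2)) times the sum of the cubed standardized deviations; the function name and data are made up for illustration.

```python
import statistics


def sample_skewness(values):
    """Sample skewness: n / ((n - 1)(n - 2)) * sum(((y - mean) / s) ** 3).

    Needs at least three values and a non-zero standard deviation.
    """
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1)
    cubed = sum(((y - mean) / s) ** 3 for y in values)
    return n / ((n - 1) * (n - 2)) * cubed


symmetric = [1, 2, 3, 4, 5]       # no tail is longer: skewness 0
right_skewed = [1, 1, 1, 2, 10]   # long right tail: positive skewness
left_skewed = [1, 9, 10, 10, 10]  # long left tail: negative skewness

for label, data in [("symmetric", symmetric),
                    ("right skewed", right_skewed),
                    ("left skewed", left_skewed)]:
    print(f"{label:>12}: {sample_skewness(data):.3f}")
```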
Kurtosis measures the peakedness of the distribution and the weight of its tails. It is also an indicator of normality. Excel reports excess kurtosis, which is 0 for a normal distribution. If the excess kurtosis is larger than 0, the distribution is more sharply peaked than a normal distribution, with heavier tails (leptokurtic); if it is smaller than 0, the distribution is flatter than a normal distribution, with lighter tails (platykurtic). The Excel formula for kurtosis is:
=KURT(range of cells with the values of interest)
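The sign of the kurtosis index can be illustrated with a Python sketch as well. It assumes the sample excess kurtosis formula that Microsoft documents for KURT; the function name and both data sets are invented for illustration.

```python
import statistics


def sample_excess_kurtosis(values):
    """Sample excess kurtosis (0 for a normal distribution).

    Needs at least four values and a non-zero standard deviation.
    """
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1)
    fourth = sum(((y - mean) / s) ** 4 for y in values)
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * fourth - correction


flat = [1, 2, 3, 4, 5]                     # evenly spread: negative value
peaked = [5, 5, 5, 5, 5, 5, 5, 5, 1, 9]    # tall center, two far-out values: positive value

print(sample_excess_kurtosis(flat))
print(sample_excess_kurtosis(peaked))
```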