SAP COMMUNITY NETWORK SDN - sdn.sap.com | BPX - bpx.sap.com |
BOC - boc.sap.com 2009 SAP AG 1
Basic Principles of Statistics and Forecasts in your Daily and
Business Life
Applies to: Basic statistic measures and statistical forecasting
of time series. A forecast sample was taken from SAP Forecasting
and Replenishment 5.1, but apart from that sample, the article is
not linked to SAP applications.
For more information, visit the Retail homepage.
Summary
Statistics and forecasts are part of our daily business and
private life. A basic knowledge of the common statistical key
figures and forecast methods is therefore required. The paper
illustrates the statistical key figures of mean value, variance
and standard deviation, as well as the Normal and Poisson
distributions. It explains basic forecast methods such as moving
average, exponential smoothing and linear regression.
Author: Dr. Barbara Wessela
Company: SAP AG
Created on: 09 February 2009
Author Bio
Barbara Wessela from SAP AG works in the Solution Management
Supply Chain in the Industry Sector Trading Industries, Industry
Business Unit Retail.
Barbara joined SAP in 1999 and has specialized for the past 5
years in SAP Forecasting and Replenishment (releases 4.1, 5.0 and
5.1). She gained a lot of practical experience with the application
by testing and building up a demo system. She has developed various
training and documentation materials for SAP F&R and has taught
numerous customer and partner workshops in that area.
Table of Contents
1 Running into Statistics and Forecasts in your daily and business life
2 Refresh your knowledge: Basic statistics
2.1 Qualitative and quantitative characteristics - How to describe objects?
2.2 What are histograms for?
2.3 Mean values: one for all
2.4 How can we measure the variance?
2.5 The Normal Distribution
2.6 The Poisson Distribution for rare events
3 Basic Forecasting
3.1 What is Forecasting?
3.1.1 Applications of forecasting
3.1.2 Forecast Approaches
3.2 What are Time Series?
3.3 Basic Forecasting Methods
3.3.1 Moving Average
3.3.2 Weighted Moving Average
3.3.3 First Order Exponential Smoothing
3.3.4 Seasonal adjustment of time series as a general statistical method
3.3.5 Exponential Smoothing with Trend and Seasonality
3.3.6 Linear Regression
3.3.7 More sophisticated Regression Methods
3.4 Causal based forecasting
3.5 Forecasting Performance Measures
References
Copyright
1 Running into Statistics and Forecasts in your daily and
business life
Flip the newspaper open and you will find reams of different
statistics and graphics about various sociopolitical, economic or
scientific topics (such as employment rates, economic indices,
inflation, diseases, the age pyramid and many others). You will
also run into numerous predictions of the future, such as economic
growth, your horoscope, the world's population and the weather
forecast.
Wherever you may live or work, you will be confronted with
statistics and forecasts, whether you are aware of it or not. Sharp
tongues might quote Benjamin Disraeli, who said: "There are three
kinds of lies: lies, damned lies, and statistics", to argue that
statistics are often used to support one's own opinion.
Nevertheless, the better you understand the basics of statistics and
forecasting, the better position you will be in to judge the
quality of the statistic or forecast you're facing.
Of course, there is a lot of science and popular science
literature on the market about understanding statistics and
statistical reporting (see for example [1], [2]). Most of you had
to pass exams about statistics during your education. This paper
aims to remind you of the very basics of statistics as well as
explain basic forecast methods for time series forecasts. Of
course, it is not a scientific paper covering all kinds of forecast
approaches. Rather, it illustrates statistical and forecasting
principles at a high level, in order to lay the foundations and to
give you a better feeling for statistical forecasting.
2 Refresh your knowledge: Basic statistics
2.1 Qualitative and quantitative characteristics - How to describe
objects?
You run into an old schoolmate and talk to him or her about
another person whose name you forgot. You will certainly describe
the person: hair style and color, skin color, height, voice,
special characteristics.
This works perfectly for only one person or a few people.
However, for a larger number of people or objects, you have to
organize the data better in order to keep track of the essential
information.
Figure 1: Qualitative and Quantitative Characteristics
In Figure 1, there are some possible characteristics for people
such as this group of children. Such characteristics can be divided
into:
o Qualitative characteristics that describe properties such as
sex, hair color or religion. Values of these characteristics can
be: male or female; blond, black or brown hair; Christian, Jewish
or Muslim.
o Quantitative characteristics that have metric values which can
be added, subtracted etc. Examples: body height, body mass, age.
For larger numbers of children or people, it becomes impractical
to describe the individuals. We have to sort the information
somehow. The histogram is one of the oldest methods to preprocess
metric data.
Let's use the body height as an example, which is a quantitative
characteristic with metric values.
2.2 What are histograms for?
Figure 2: Histogram
First, we create intervals within the value range of body
heights. Then we count the number of children per interval, that
is, how many children have a body height between 0.80 m and 1.00 m,
between 1.01 m and 1.20 m, and so on (see Fig. 2).
The histogram already helps to reduce the amount of information
and gives a better overview of the data. It is an abstraction from
the real world, so some detailed information gets lost. The
challenge is therefore to find meaningful intervals: not too many
and not too few, depending on the number of objects.
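The counting step can be sketched in a few lines of Python. This is only an illustration: the body heights are made-up values in centimetres, and the function name and the 20 cm interval width are assumptions chosen for the example, not taken from the figure.

```python
def histogram(values, start, width, n_bins):
    """Count how many values fall into each interval of equal width.

    Bin i covers start + i*width up to (but excluding) start + (i+1)*width.
    """
    counts = [0] * n_bins
    for value in values:
        index = (value - start) // width
        if 0 <= index < n_bins:
            counts[index] += 1
    return counts


# Hypothetical body heights in centimetres (integers avoid rounding issues).
heights_cm = [92, 105, 110, 118, 121, 125, 133, 138, 142, 155]

# Four intervals of 20 cm each: 80-99, 100-119, 120-139, 140-159.
counts = histogram(heights_cm, start=80, width=20, n_bins=4)
for i, count in enumerate(counts):
    low = 80 + i * 20
    print(f"{low}-{low + 19} cm: {'#' * count} ({count})")
```

Choosing a smaller width gives more intervals with fewer children each; a larger width hides more detail.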
Figure 3: Histogram for Large Groups - Normal Distribution or
Normal Probability Curve
The bigger the group becomes, the closer the histogram will
probably come to the shape of a bell. The bars of the histogram
will be symmetrical around a mean value. Such a distribution is
called a Normal distribution.
Normal distributions can be found in many examples in nature,
such as the mass of chicken eggs or elephants, the body height of
mice or giraffes etc. You can also find them in economics, for
example for the daily deviations of shares of a stock index.
When values are influenced by many random factors, you can
expect a normal distribution of these values, because a normal
distribution is characterized by random deviations of actual values
from an expected value.
We will come back to the normal distribution later using another
approach.
2.3 Mean values: one for all
If you don't want to create a histogram, or if your data set is
too small for one, you can use simple formulas to calculate mean
values. There are different mean values. The most common is the
arithmetic mean value.
Figure 4: Mean values: Arithmetic Mean
The arithmetic mean is the sum of all data values divided by
their number. In our example, it is the sum of all body heights
divided by the number of children (see Fig. 4).
It is a big advantage that the arithmetic mean can also be
calculated if the single values are unknown; it is sufficient to
know the sum and the number of values. Example: the average number
of beers a German drinks in a year is simply determined by the
total beer consumption divided by the number of German citizens.
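Both ways of obtaining the arithmetic mean can be written down directly. The following is a minimal sketch; the function names and the five body heights are illustrative assumptions.

```python
def arithmetic_mean(values):
    """Sum of all values divided by their number."""
    return sum(values) / len(values)


def mean_from_totals(total, count):
    """Same mean when only the sum and the number of values are known."""
    return total / count


# Hypothetical body heights in centimetres.
heights_cm = [110, 120, 125, 130, 140]

print(arithmetic_mean(heights_cm))           # 125.0
print(mean_from_totals(total=625, count=5))  # 125.0
```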
Figure 5: Mean Values: Median
The Median is the value of the object in the middle if you sort
the objects in ascending (or descending) order. In our example, it
is the body height of the child standing in the middle of the
sorted queue (see Fig. 5).
It is easy to find, because no calculation is necessary. It has
the advantages that:
o it is not sensitive to extremely high or low values
o it doesn't lead to unnatural values like the average of 1.75
children per family.
In most statistics, the arithmetic mean is used instead of the
median, because it allows drawing conclusions from a random sample
about the total population. The median doesn't contain this
information, but in return it can also be used for non-metric,
qualitative characteristics that can be put in order, for example
the average educational certification of people in a company: you
simply sort all people by their certification and take the one in
the middle as the median.
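A small sketch shows how the median ignores an extreme value that pulls the arithmetic mean upwards. The heights are made up, and averaging the two middle values for an even number of objects is a common convention assumed here; the text above only describes the odd case.

```python
def median(values):
    """Value in the middle of the sorted values."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    # Assumption: for an even count, average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


# Hypothetical body heights in centimetres; the second group
# contains one extremely tall child.
heights_cm = [110, 120, 125, 130, 140]
with_outlier = [110, 120, 125, 130, 200]

print(median(heights_cm), sum(heights_cm) / len(heights_cm))        # 125 125.0
print(median(with_outlier), sum(with_outlier) / len(with_outlier))  # 125 137.0
```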
Further mean values are:
o Geometric mean value
o Harmonic mean value
o Weighted arithmetic mean
Figure 6: Different Groups but Same Arithmetic Mean
In our example of a group of children, the mean value doesn't
tell everything about the body height of this group. There could be
another group of children with the same average body height that
still looks different (see Fig. 6). For instance, the children of
the second group could all be about the same height. That means
that the variance would be much bigger in the first group.
2.4 How can we measure the variance?
Figure 7: Variance of the first Group of Children (1)
Let's first calculate the deviation of each child's height from
the average. You can easily see that the deviations can be positive
or negative. If we just added them up, they would balance out.
Therefore, we calculate the squared deviations, which are never
negative. Taking the square also means that values with a bigger
deviation have much more impact than values with smaller
deviations.
Figure 8: Variance of the first Group of Children (2)
In order to get a normalized value, we divide the sum of the
squared deviations by the number of children. This gives the
variance.
The variance of this group of children is the sum of squared
deviations divided by the number of children.
(Sometimes you will also find a formula where the sum of squared
deviations is divided by the number minus 1. This correction is
used when the values are only a sample of a larger population and
the mean is estimated from that same sample; dividing by the number
itself would then slightly underestimate the variance. However, for
large numbers, the difference between both formulas becomes
negligible.)
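The calculation steps (deviation from the mean, squaring, summing, dividing) can be sketched as follows. The two groups of body heights are invented so that both have the same arithmetic mean; the `sample` switch for the "number minus 1" variant is an assumption of this sketch.

```python
def variance(values, sample=False):
    """Sum of squared deviations from the mean, divided by n (or n - 1)."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = [(v - mean) ** 2 for v in values]
    divisor = n - 1 if sample else n
    return sum(squared_deviations) / divisor


# Two hypothetical groups (heights in cm), both with a mean of 125 cm.
group_1 = [100, 110, 125, 140, 150]   # heights spread widely
group_2 = [120, 123, 125, 127, 130]   # heights close together

print(variance(group_1))               # 340.0
print(variance(group_1, sample=True))  # 425.0
print(variance(group_2))               # 11.6
```

The unit of these results is square centimetres, not centimetres.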
Figure 9: Standard Deviation of this Group of Children
Because of the squaring in the formula of the variance, the
variance doesn't have the same unit of measure as the original
values; it cannot be plotted in the same graphic and is not very
intuitive. Therefore, one takes the square root of the variance to
get the standard deviation.
The standard deviation is the square root of the variance.
You can easily see that the standard deviation has the same unit
of measure as the original values. Therefore, you can plot the
standard deviation into the data graphics by plotting a line for
the arithmetic mean value plus the standard deviation and a line
for the mean value minus the standard deviation.
These two lines give a range. If the number of values is big
enough and the values are approximately normally distributed, then
about 68% of the values will be in this range.
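The 68% statement can be checked empirically. The sketch below draws a large number of artificial, normally distributed "body heights" (mean 125 cm and standard deviation 10 cm are arbitrary assumptions) and counts how many fall between the two lines.

```python
import math
import random


def mean_and_std(values):
    """Arithmetic mean and standard deviation (square root of the variance)."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    return mean, math.sqrt(variance)


def share_within_one_std(values):
    """Fraction of values between mean - std and mean + std."""
    mean, std = mean_and_std(values)
    inside = sum(1 for v in values if mean - std <= v <= mean + std)
    return inside / len(values)


random.seed(42)  # fixed seed so that the run is repeatable
heights = [random.gauss(125.0, 10.0) for _ in range(100_000)]

# Expected to be close to 0.68 for normally distributed values.
print(round(share_within_one_std(heights), 3))
```

For small groups, or for values that are not normally distributed, the share can differ clearly from 68%.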
Figure 10: Same Arithmetic Mean but Different Standard
Deviations
If you calculate the arithmetic mean value, the variance and the
standard deviation for these two groups of children, you can see
that although they have the same mean value, the variance and thus
the standard deviation are much bigger for the first group than for
the second (see Fig. 10). That means that the variance and the
standard deviation help to describe the body height distributions
of the groups of children much better than the mean value alone.
2.5 The Normal Distribution
Figure 11: Normal Distribution
Taking many values into account that vary because of random
factors, you can often find a normal distribution for these values
when you plot the number of values with a certain deviation around
the mean value (see Fig. 11 and compare also to the histogram in
Fig. 3).
Normal distribution, definition from [3]: The normal
distribution, also called the Gaussian distribution, is an
important family of continuous probability distributions,
applicable in many fields. Each member of the family may be defined
by two parameters, the mean ("average", μ) and variance (standard
deviation squared, σ²) respectively. The standard normal
distribution is the normal distribution with a mean of zero and a
variance of one. Carl Friedrich Gauss became associated with this
set of distributions when he analyzed astronomical data using them
and defined the equation of its probability density function. It is
often called the bell curve because the graph of its probability
density resembles a bell.
Figure 12: Standard Normal Distribution
The standard normal distribution is the normal distribution with
a mean of zero and a variance of one.
o 68.27 % of all values deviate not more than σ from the mean value
o 95.45 % of all values deviate not more than 2σ from the mean value
o 99.73 % of all values deviate not more than 3σ from the mean value
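These three percentages follow from the normal distribution itself. For a normally distributed value, the share lying within k standard deviations of the mean equals erf(k / √2), where erf is the error function available in Python's standard `math` module. The function name below is an assumption for this sketch.

```python
import math


def share_within(k):
    """Share of normally distributed values within k standard deviations
    of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k) * 100:.2f} %")
```

The loop prints 68.27 %, 95.45 % and 99.73 %, the values listed above.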
Figure 13: Normal Distribution with Different Parameters
If the mean value μ deviates from zero, the function is shifted
horizontally.
If the variance σ² is bigger than one, the function becomes
broader and flatter than the standard normal distribution. If the
standard deviation σ is smaller than one, the function becomes
tighter and higher.
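The effect of both parameters can be seen by evaluating the probability density function of the normal distribution, f(x) = exp(-(x - μ)² / (2σ²)) / (σ √(2π)). The sketch below is illustrative; the parameter values are arbitrary.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Probability density of the normal distribution at x."""
    exponent = -((x - mu) ** 2) / (2 * sigma ** 2)
    return math.exp(exponent) / (sigma * math.sqrt(2 * math.pi))


# Height of the peak, which always lies at x = mu:
print(normal_pdf(0.0))                      # standard normal distribution
print(normal_pdf(3.0, mu=3.0))              # shifted, same peak height
print(normal_pdf(0.0, sigma=2.0))           # broader and flatter
print(normal_pdf(0.0, sigma=0.5))           # tighter and higher
```

Doubling σ halves the peak height, because the area under the curve always stays one.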
2.6 The Poisson Distribution for rare events
Figure 14: Poisson distribution
Although many natural and business events are distributed
normally, there is another very important distribution: the Poisson
distribution. It is especially important for events that happen
rarely but have many opportunities to happen. Examples from nature
are the nuclear decay of atoms or chromosome mutations in DNA: the
events have a low probability of happening for each atom or
chromosome, but the overall number can be high regardless. A
business example is the intermittent demand of slow-moving
products: the more products and product variants are in the
assortment, the smaller the individual sales become. A product
might sell only once every two weeks, but it is hard to predict
when the next sales transaction will happen and how many items
will be sold in this transaction.
Poisson distribution, definition from [3]: The Poisson
distribution is a discrete probability distribution that expresses
the probability of a number of events occurring in a fixed period
of time if these events occur with a known average rate and
independently of the time since the last event. The Poisson
distribution can also be used for the number of events in other
specified intervals such as distance, area or volume.
The distribution was discovered by Siméon-Denis Poisson
(1781-1840) and published in 1838. The work focused on certain
random variables N that count, among other things, a number of
discrete occurrences that take place during a time-interval of
given length. If the expected number of occurrences in this
interval is λ, then the probability that there are exactly k
occurrences (k being a non-negative integer, k = 0, 1, 2, ...) is
equal to the formula shown in Figure 14.
The Poisson distribution can be applied to systems with a large
number of possible events, each of which is rare. A classic example
is the nuclear decay of atoms.
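The standard form of the Poisson probability is P(N = k) = λ^k · e^(-λ) / k!. As a hedged illustration of the slow-moving product mentioned above, the sketch assumes an expected number of λ = 0.5 sales transactions per week (one every two weeks on average) and that the transactions occur independently.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k occurrences when lam are expected."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


lam = 0.5  # assumed: one sales transaction every two weeks on average
for k in range(4):
    print(f"P({k} transactions in a week) = {poisson_pmf(k, lam):.4f}")
```

With these assumptions, a week without any transaction is the most likely outcome, which is typical for intermittent demand.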
3 Basic Forecasting
3.1 What is Forecasting?
Forecasting is a mixture of science, art and luck. [4]
Forecasting - Two Definitions from the Internet:
Forecasting is the process of estimation in unknown situations.
Prediction is a similar, but more general term. [...] Usage can differ
between areas of application: for example in hydrology, the terms
"forecast" and "forecasting" are sometimes reserved for estimates
of values at certain specific future times, while the term
"prediction" is used for more general estimates, such as the number
of times floods will occur over a long period of time. Risk and
uncertainty are central to forecasting and prediction. Forecasting
is used in the practice of Customer Demand Planning in every day
business forecasting for manufacturing companies. The discipline of
demand planning, also sometimes referred to as supply chain
forecasting, embraces both statistical forecasting and a consensus
process. Forecasting is commonly used in discussion of time-series
data. [3]
Forecasting is the prediction of outcomes, trends, or expected
future behavior of a business, industry sector, or the economy
through the use of statistics. Forecasting is an operational
research technique used as a basis for management planning and
decision making. Common types of forecasting include trend
analysis, regression analysis, Delphi technique, time series
analysis, correlation, exponential smoothing, and input-output
analysis. [5]
Figure 15: Every day forecasts
The following list is taken from [5]
3.1.1 Applications of forecasting
Forecasting has applications in many situations:
o Supply chain management
o Weather forecasting, Flood forecasting and Meteorology
o Transport planning and Transportation forecasting
o Economic forecasting
o Technology forecasting
o Earthquake prediction
o Land use forecasting
o Product forecasting
o Player and team performance in sports
o Telecommunications forecasting
o Political Forecasting
Figure 16: Forecast Approaches
3.1.2 Forecast Approaches
The following classification is taken from [5], see also Fig.
16.
Time series methods: Time series methods use historical data as
the basis of estimating future outcomes.
o Moving average
o Exponential smoothing
o Extrapolation
o Linear prediction
o Trend estimation
o Growth curve
Causal / econometric methods: Some forecasting methods use the
assumption that it is possible to identify the underlying factors
that might influence the variable that is being forecast. For
example, sales of umbrellas might be associated with weather
conditions. If the causes are understood, projections of the
influencing variables can be made and used in the forecast.
o Regression analysis using linear regression or non-linear
regression
o Autoregressive moving average
o Autoregressive integrated moving average
o Econometrics
Judgmental methods: Judgmental forecasting methods incorporate
intuitive judgments, opinions and probability estimates.
o Surveys
o Delphi method
o Scenario building
o Technology forecasting
Other methods:
o Simulation
o Prediction market
o Probabilistic forecasting and Ensemble forecasting
o Reference class forecasting
A model in science is a physical, mathematical, or logical
representation of a system of entities, phenomena, or processes.
Basically a model is a simplified abstract view of the complex
reality. It may focus on particular views, enforcing the "divide
and conquer" principle for a compound problem. Formally a model is
a formalized interpretation which deals with empirical entities,
phenomena, and physical processes in a mathematical or logical way.
A simulation is the implementation of a model over time. A
simulation brings a model to life and shows how a particular object
or phenomenon will behave. It is useful for testing, analysis or
training where real-world systems or concepts can be represented by
a model.
For more information regarding the above mentioned, see [3].
Forecast Approaches addressed in this paper:
o Time series methods: use historical data as the basis of
estimating future outcomes. Examples: Moving Average or Exponential
Smoothing.
o Causal methods: like time series methods, but underlying factors
that may influence the forecast are taken into account. Example:
Regression analysis.
These forecast methods perform an extrapolation of time series.
Time series are based on historical data. Example: Demand Forecast
based on historic sales.
3.2 What are Time Series?
Figure 17: Time Series Definition and Examples
Definitions from the Internet
Time Series [5]: Values taken by a variable over time (such as
daily sales revenue, weekly orders, monthly overheads, yearly
income) and tabulated or plotted as chronologically ordered numbers
or data points. To yield valid statistical inferences, these values
must be repeatedly measured, often over a four to five year period.
Time series consist of four components:
(1) Seasonal variations that repeat over a specific period such
as a day, week, month, season, etc.
(2) Trend variations that move up or down in a reasonably
predictable pattern
(3) Cyclical variations that correspond with business or
economic 'boom-bust' cycles or follow their own peculiar cycles,
and
(4) Random variations that do not fall under any of the above
three classifications.
Time Series [3]: In statistics, signal processing, and many
other fields, a time series is a sequence of data points, measured
typically at successive times, spaced at (often uniform) time
intervals. Time series analysis comprises methods that attempt to
understand such time series, often either to understand the
underlying context of the data points (where did they come from?
what generated them?), or to make forecasts (predictions). Time
series forecasting is the use of a model to forecast future events
based on known past events: to forecast future data points before
they are measured. A standard example in econometrics is the
opening price of a share of stock based on its past
performance.
3.3 Basic Forecasting Methods
3.3.1 Moving Average
Moving Average for Forecasting - Principle
Moving average Mt (number of values N = 3)
Time t     1    2    3    4      5      6    7    8    9
Data Dt    7   10    8   10     12     10    8   11    9
Mt                        8.33   9.33
Starting point: M4 = (7 + 10 + 8) / 3 = 8.33
Moving forward: M5 = (10 + 8 + 10) / 3 = 9.33
Figure 18: Moving Average Principle
Fig. 18 shows a sample time series consisting of data Dt at
subsequent points in time t. Suppose the number of values to be
considered for the calculation of the mean value is N=3. In order
to calculate the first moving average value M4, you calculate the
arithmetic mean value of the first three values. You move forward
by always calculating the mean value of the three preceding
values.
The Moving Average moves from one data point to the next and
thereby performs a smoothing of the values.
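The principle can be sketched in Python as follows. The data values are those of the sample time series in Fig. 18 as far as they can be read from the figure; the function name and the dictionary result are assumptions of this sketch.

```python
def moving_averages(data, n):
    """Return {t: Mt}, where Mt is the mean of the n values before period t.

    Periods are numbered from 1, as in the text, so data[t - 1] is Dt.
    """
    result = {}
    for t in range(n + 1, len(data) + 2):
        window = data[t - 1 - n : t - 1]   # the n values preceding period t
        result[t] = sum(window) / n
    return result


data = [7, 10, 8, 10, 12, 10, 8, 11, 9]   # D1 ... D9
averages = moving_averages(data, n=3)

print(round(averages[4], 2))   # 8.33 = (7 + 10 + 8) / 3
print(round(averages[5], 2))   # 9.33 = (10 + 8 + 10) / 3
```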
Figure 19: Moving Average for Forecasting
At the end of the original data values, the last average value
serves as the first forecast value (see Fig. 19). From then on, you
also consider forecast values for the calculation of the moving
average. That means in this example, that after three periods the
forecast is purely based on previous forecast values.
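A minimal sketch of this forecasting step is shown below. It follows the description in the text (each forecast is the mean of the N most recent values, and forecast values are appended to the history); the data are the sample values read from Fig. 18 and the function name is an assumption. The numbers it produces are computed from that description and are not copied from the figure.

```python
def moving_average_forecast(data, n, horizon):
    """Forecast `horizon` periods ahead with a moving average over n values."""
    history = list(data)
    forecasts = []
    for _ in range(horizon):
        value = sum(history[-n:]) / n
        forecasts.append(value)
        history.append(value)   # forecast values feed the following forecasts
    return forecasts


data = [7, 10, 8, 10, 12, 10, 8, 11, 9]
forecasts = moving_average_forecast(data, n=3, horizon=4)
print([round(f, 2) for f in forecasts])
```

The fourth forecast value is the mean of the first three forecasts only, so from that point on no original data value enters the calculation any more.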
Moving Average for Forecasting, Example
[Plot: data Dt, moving average (N=3) and forecast against time t
from 1 to 12]
Issues with moving average for forecasting:
o If the constant level changes, it takes N periods until the
forecasted value adapts
o All N values used to calculate the moving average have the same
impact, although recent values better represent the recent
development of values
$$F_{t+1} = \frac{1}{N}\,(D_t + D_{t-1} + \dots + D_{t-N+1}) = \frac{1}{N}\sum_{i=t-N+1}^{t} D_i$$
Figure 20: Moving Average for Forecasting, Example
Plotting the original data values, the moving average and the
forecast against time shows how the moving average smooths the
original time series and at the same time lags behind it (see Fig.
20). If a new original data point arrives at the next point in
time (because the original time series is collected sequentially),
the forecast adapts to peaks, constant level changes or trends in
the original time series, but only with a delay of up to N
periods.
The moving average is a simple method, but it considers all N
values with the same weight, although recent values might better
represent the recent development. For these reasons, it can only be
used for a short-term forecast.
3.3.2 Weighted Moving Average
Figure 21: Weighted Moving Average Principle
The Weighted Moving Average method overcomes the issue of the
Moving Average method that all N values have the same impact.
Although the principle of how to start, to move forward and to
calculate the forecast is the same, the values are weighted with
weighting factors that need to be specified. In this example, the
weighting factors for the N=3 values were chosen as 0.167, 0.333
and 0.5, which gives a total weighting of 1 (see Fig. 21).
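As a sketch, the weighted mean of one window of N=3 values can be computed like this. It is assumed here that the weights are listed from the oldest to the most recent value, so that the most recent value gets the weight 0.5; the three data values are the first three values of the sample series used in the moving average example.

```python
def weighted_moving_average(window, weights):
    """Weighted mean of the window; weights are ordered oldest to newest."""
    total_weight = sum(weights)
    weighted_sum = sum(w * v for w, v in zip(weights, window))
    return weighted_sum / total_weight


weights = [0.167, 0.333, 0.5]   # assumed order: oldest ... most recent
window = [7, 10, 8]             # D1, D2, D3

print(round(weighted_moving_average(window, weights), 3))   # 8.499
print(round(sum(window) / len(window), 3))                  # 8.333 (unweighted)
```

Dividing by the sum of the weights keeps the result correct even if the chosen factors do not add up to exactly 1.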
Figure 22: Weighted Moving Average Example
As a result of weighting the values considered for the
calculation of the weighted mean value, this method reacts better
to constant level changes, trends or other fluctuations of the
original time series, because it gives the recent values more
impact than the distant ones (see Fig. 22).
3.3.3 First Order Exponential Smoothing
Figure 23: First Order Exponential Smoothing, Principle
First Order Exponential Smoothing is a further refinement of how
the values that enter the mean value are weighted. Moreover, each
new mean value can easily be calculated from the previous mean
value and the next data value (or forecast value, respectively).
Fig. 23 shows an example: start with the first two data values,
each with a weighting of 0.5, to get the first average value. Then
take this average and the next data value D3, again with a
weighting of 0.5 each. Proceed like that until you reach the first
forecast value shown in the figure. (Forecast values are taken into
account for further extrapolation.)
If you work out the weighting factor that each past data value has
effectively received, you will see that the factors follow an
exponential curve.
This means that all past values have an impact on the forecast,
but this impact decreases exponentially with their age.
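The procedure can be sketched in Python as follows. The nine data values are those that can be read from the slide of Fig. 23; the function name `exponential_smoothing` and the choice to start with the first data value are assumptions of this sketch (for a weighting of 0.5 this start is equivalent to averaging the first two data values).

```python
def exponential_smoothing(data, alpha):
    """Return the smoothed values of first order exponential smoothing.

    Assumption: the first smoothed value is set to the first data value.
    """
    smoothed = [data[0]]
    for value in data[1:]:
        # alpha weights the new data value, (1 - alpha) the previous mean.
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed


data = [7, 10, 8, 10, 12, 10, 8, 11, 9]  # D1..D9 as read from Fig. 23
smoothed = exponential_smoothing(data, alpha=0.5)
forecast = smoothed[-1]  # first order smoothing extrapolates a constant

print(smoothed[1:])
print(forecast)
```

The smoothed values start with 8.5 and 8.25, and the forecast comes out at approximately 9.535, which the figure shows as 9.53.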
[Slide for Fig. 23: exponential smoothing with smoothing factor α = 0.5.
Data values Dt for t = 1 to 9: 7, 10, 8, 10, 12, 10, 8, 11, 9.
Smoothed values for t = 2 to 9: 8.5, 8.25, 9.13, 10.56, 10.28, 9.14, 10.07, 9.53; forecast Ft for t = 10: 9.53.
Rule: always take e.g. 50% of what you calculated so far plus 50% of the next data value (e.g. 0.5 x 10.07 + 0.5 x 9).
Weighting factors by time period, most recent first: 0.5, 0.25, 0.125, 0.0625, 0.0313, 0.0156, 0.0078. The six largest weights account for 98.44% of the input, older periods for 1.56%.]
Figure 24: Smoothing Factor
The weighting factor applied to the most recent data value is the
Smoothing Factor α, whereas the last average calculated with
exponential smoothing is weighted with 1-α. α determines two
characteristics of the exponential smoothing at the same time:
Responsiveness, that is, how quickly the exponentially smoothed
values (and also the forecast) react to level shifts
Stability, that is, how strongly the smoothed values and the
forecast react to short pulses
Obviously, these two characteristics run contrary to each other: a
large α makes the forecast responsive but less stable, a small α
does the opposite. Reasonable results can be found for α = 0.2 or
0.3 (see Fig. 24).
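The effect of the smoothing factor on the weights can be made explicit. In first order exponential smoothing, the data value that lies k periods back receives the weight α(1-α)^k. The following sketch (function names are hypothetical) prints the first weights and the share of the forecast that comes from the ten most recent periods for several values of α.

```python
def weight_of_past_value(alpha, age):
    """Weight of the data value `age` periods back (age 0 = most recent)."""
    return alpha * (1 - alpha) ** age


def share_of_recent_periods(alpha, periods):
    """Total weight carried by the `periods` most recent data values."""
    return sum(weight_of_past_value(alpha, age) for age in range(periods))


for alpha in (0.1, 0.2, 0.3, 0.5):
    first_weights = [round(weight_of_past_value(alpha, age), 4) for age in range(4)]
    share = share_of_recent_periods(alpha, 10)
    print(f"alpha={alpha}: first weights {first_weights}, "
          f"last 10 periods carry {share:.1%}")
```

For α = 0.1 the ten most recent periods carry about 65.1% of the weight, which is the value quoted in Fig. 24; for α = 0.5 almost the entire weight sits on a handful of periods.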
[Slide for Fig. 24: Smoothing Factor.
α = 0.5: weighting factors 0.5, 0.25, 0.125, ...; reacts quickly to a level shift and strongly to a short pulse; the six largest weights account for 98.4% of the input, older periods for 1.6%.
α = 0.1: weighting factors 0.1, 0.09, 0.08, 0.07, 0.07, 0.06, ...; reacts slowly to a level shift and little to a short pulse; 65.1% of the input comes from the last 10 periods, 34.9% from older periods.
The smoothing factor determines both the responsiveness and the stability of the forecast. Common values: α = 0.2 or α = 0.3.]
Figure 25: First Order Exponential Smoothing, Example and
Formula
Comparing the plot of the exponential smoothing with the moving
average of Fig. 20 or the weighted moving average of Fig. 22, you
can see that exponential smoothing follows the fluctuations of the
original time series more closely (see Fig. 25). A further
advantage is that you can calculate each exponentially smoothed
value from two values only: the latest smoothed value and the next
data value.
However, like the moving average methods, first order exponential
smoothing is not able to predict trends or seasonal patterns in the
forecast. All these methods can only follow such fluctuations when
smoothing the original data values, but they cannot project them
into the future. Therefore, further enhancements are needed.
3.3.4 Seasonal adjustment of time series as a general
statistical method
[Slide for Fig. 26: Seasonal Adjustment - Example. Hypothetical
unemployment numbers for 3.5 years; the value axis runs from 90 to 120.
Year/Quarter   I     II    III   IV
1              116   100    92   100
2              108   100    92   100
3              116   108   100   100
4              112   108
Question: is there a positive trend in the last quarter if
seasonal effects are neglected?]
Figure 26: Seasonal Adjustment Example
The following example, which shows a method to adjust for seasonal
patterns in a time series, was taken from [1].
Fig. 26 shows hypothetical unemployment numbers per quarter over
3.5 years. In the last quarter, you can observe a drop from 112 to
108 (relative numbers). The question is whether this drop is real
or only due to the season, which usually leads to a decrease of the
unemployment rate at that time of the year.
You can easily see that there is indeed a seasonal pattern: every
year, there is a maximum of unemployment in the first quarter and a
minimum in the third. But how big is this seasonal effect?
Figure 27: Seasonal Adjustment Example
The first step in answering this question (How big is the seasonal
effect?) is to calculate the moving average of the original time
series with N=4 (see Fig. 27). The moving average cannot start
before quarter III of the first year. To keep the average balanced
around the current quarter, use the following formula:
moving average = (½ of the quarter before the last + last quarter
+ current quarter + next quarter + ½ of the quarter after next) / 4
For quarter III of the first year, for example, this gives
(58 + 100 + 92 + 100 + 54) / 4 = 101.
The moving average time series has to end at quarter IV of the
third year.
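The centered moving average can be sketched in Python as follows. The function name `centered_moving_average_4` is hypothetical; the quarterly values are those listed for Fig. 26, and the half weights on the two outer quarters are the ones implied by the deviations given for Fig. 28.

```python
def centered_moving_average_4(values):
    """Centered 4-period moving average.

    Returns a list as long as `values`; the first two and the last two
    positions, where the average cannot be formed, hold None.
    """
    result = [None] * len(values)
    for i in range(2, len(values) - 2):
        result[i] = (0.5 * values[i - 2] + values[i - 1] + values[i]
                     + values[i + 1] + 0.5 * values[i + 2]) / 4
    return result


# Quarterly values 1/I to 4/II as listed for Fig. 26.
values = [116, 100, 92, 100, 108, 100, 92, 100, 116, 108, 100, 100, 112, 108]
moving_average = centered_moving_average_4(values)
print(moving_average)
```

The first defined value belongs to quarter III of the first year (101) and the last one to quarter IV of the third year (105), in line with the text.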
Figure 28: Seasonal Adjustment Example
The next step is to calculate a seasonal figure for each quarter,
which is the mean deviation of the original data from the moving
average (see Fig. 28). The seasonal figures are:
Quarter I: (8+11)/2 = 9.5
Quarter II: (0+2)/2 = 1
Quarter III: (-9-9-5.5)/3 = -7.83
Quarter IV: (0-3-5)/3 = -2.67
Figure 29: Seasonal Adjustment Example
This seasonal figure can be applied to each year: subtracting it
from the original values gives a further approximation of the
season-independent trend component (see Fig. 29). Unlike the moving
average, this seasonally adjusted time series also contains values
at the beginning and the end of the time series.
As a result, you find that the seasonally adjusted unemployment
(unlike the non-adjusted one) increased from 102.5 to 107 in the
last quarter (112 - 9.5 = 102.5 and 108 - 1 = 107).
Statistical seasonal adjustments usually work in similar ways.
Once the seasonal figures have been isolated from the original
data, they can also be used for forecasting future seasons.
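The three steps of the example (centered moving average, seasonal figures, adjusted series) can be reproduced with a short Python sketch. It uses the quarterly values listed for Fig. 26; the variable names are hypothetical, and the additive adjustment (subtracting the seasonal figure) follows the numbers of the example.

```python
from statistics import mean

# Quarterly values 1/I to 4/II as listed for Fig. 26.
values = [116, 100, 92, 100, 108, 100, 92, 100, 116, 108, 100, 100, 112, 108]
quarter_of = [i % 4 for i in range(len(values))]  # 0 = quarter I ... 3 = quarter IV

# Step 1: centered moving average, defined from the 3rd to the 3rd-last value.
moving_avg = {
    i: (0.5 * values[i - 2] + values[i - 1] + values[i]
        + values[i + 1] + 0.5 * values[i + 2]) / 4
    for i in range(2, len(values) - 2)
}

# Step 2: seasonal figure = mean deviation from the moving average per quarter.
seasonal = [
    mean(values[i] - ma for i, ma in moving_avg.items() if quarter_of[i] == q)
    for q in range(4)
]

# Step 3: subtract the seasonal figure of its quarter from every original value.
adjusted = [v - seasonal[quarter_of[i]] for i, v in enumerate(values)]

print(seasonal)
print(adjusted[-2], adjusted[-1])
```

The seasonal figures come out as 9.5, 1, about -7.83 and about -2.67, and the last two adjusted values are 102.5 and 107, as stated in the text.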
3.3.5 Exponential Smoothing with Trend and Seasonality
Figure 30: Exponential Smoothing with Trend and Seasonality
Remember that simple exponential smoothing can follow a time
series, but it can extrapolate only constant values.
Seasonal adjustment separates the seasonal effect from the base.
In addition, you can determine a trend component, e.g. by
performing a second order exponential smoothing.
In the example of Fig. 30, which was adapted from [6], the trend
and seasonal portions were isolated with the help of the following
formulas:
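Base: $B_t = \alpha \, \frac{D_t}{S_{t-m}} + (1-\alpha)\,(B_{t-1} + T_{t-1})$
Trend: $T_t = \beta \,(B_t - B_{t-1}) + (1-\beta)\, T_{t-1}$
Season factors: $S_t = \gamma \, \frac{D_t}{B_t} + (1-\gamma)\, S_{t-m}$
Forecast: $F_{t+k} = (B_t + k\,T_t)\, S_{t+k-m}$
Here $D_t$ is the data value of period t, m is the number of
periods per season, k is the number of periods to forecast ahead,
and $\alpha$, $\beta$, $\gamma$ are the smoothing factors for base,
trend and season factors.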
Note that in these formulas the seasonality is assumed to be
multiplicative, that is, the seasonal amplitude is proportional to
the level of the series and therefore grows from season to season
when there is an upward trend. There are other methods that treat
the seasonal pattern as additive. Find more information in
forecasting literature such as [4].
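The formulas for base, trend, season factors and forecast can be turned into a compact Python sketch. The recursion is the standard multiplicative form of exponential smoothing with trend and seasonality; the function name, the parameter names and in particular the initialisation from the first two seasons are assumptions of this sketch and are not taken from [6]. The demand values are hypothetical.

```python
def holt_winters_multiplicative(data, m, alpha, beta, gamma, horizon):
    """Exponential smoothing with trend and multiplicative seasonality.

    data    : observed values, at least two full seasons
    m       : number of periods per season
    horizon : number of future periods to forecast
    """
    if len(data) < 2 * m:
        raise ValueError("need at least two full seasons of data")
    # Assumed initialisation: level and factors from season 1,
    # trend from the change between the means of seasons 1 and 2.
    first_mean = sum(data[:m]) / m
    second_mean = sum(data[m:2 * m]) / m
    base = first_mean
    trend = (second_mean - first_mean) / m
    season = [value / first_mean for value in data[:m]]
    for t in range(m, len(data)):
        previous_base = base
        base = alpha * data[t] / season[t - m] + (1 - alpha) * (previous_base + trend)
        trend = beta * (base - previous_base) + (1 - beta) * trend
        season.append(gamma * data[t] / base + (1 - gamma) * season[t - m])
    n = len(data)
    # Forecast k periods ahead; beyond one season the latest factor
    # of the matching season position is reused.
    return [(base + k * trend) * season[n - m + (k - 1) % m]
            for k in range(1, horizon + 1)]


# Hypothetical quarterly demand with an upward trend and a seasonal pattern.
demand = [80, 120, 100, 100, 88, 132, 110, 110, 96, 144, 120, 120]
print(holt_winters_multiplicative(demand, m=4, alpha=0.3, beta=0.2,
                                  gamma=0.3, horizon=4))
```

Unlike first order exponential smoothing, the forecast is no longer a constant: it continues both the trend and the seasonal pattern.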
3.3.6 Linear Regression
The following example for linear regression is based on [6].
Figure 31: Regression Example
Fig. 31 shows an example taken from [7]: A champagne producer
wants to launch a new champagne product and is looking for the
right retail price. Before making any decision, the producer wants
to find out how the sales depend on the price. Therefore, a selling
test is performed in 6 stores with prices between 10 and 20 Euros.
The sales per day are plotted against the retail prices. There
seems to be a linear dependency. This can be analyzed with linear
regression.
Linear Regression, definition from [3]: In statistics, linear
regression is a form of regression analysis in which the
relationship between one or more independent variables and another
variable, called the dependent variable, is modeled by a least
squares function, called the linear regression equation. This
function is a linear combination of one or more model parameters,
called regression coefficients. A linear regression equation with
one independent variable represents a straight line. The results
are subject to statistical analysis.
Figure 32: Regression Example
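Least squares analysis can be used to find the regression line
that best fits the data set of the two variables, that is, the line
for which the sum of the squared vertical distances between the
data points and the line is minimal (for formulas see Fig. 32).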
Figure 33: Regression Example
As a result, you obtain the regression line shown in Fig. 33.
You can interpret the line in the following way: for each Euro by
which the price of the champagne is increased, the store sells
about one bottle less per day.
Interpolation and extrapolation:
If the prediction is made within the range of values of the x
variable used to construct the model, this is known as
interpolation. In the champagne example, this would mean: at a
price of 12 Euro, the store could sell 8 bottles a day. Prediction
outside the range of the data used to construct the model is known
as extrapolation, and it is more risky. In the champagne example,
this could mean: at a price of 8 Euro (which was not tested), the
store could sell 12 bottles a day.
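A least squares fit for one independent variable needs only the two means and two sums: slope = Σ(x - x̄)(y - ȳ) / Σ(x - x̄)² and intercept = ȳ - slope · x̄. The sketch below applies this to six price/sales pairs. The data points are illustrative assumptions chosen to match the description of the selling test (six stores, prices between 10 and 20 Euro); they are not read from Fig. 31, and the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Assumed test data: retail price in Euro and bottles sold per day.
prices = [20, 16, 15, 16, 13, 10]
sales = [0, 3, 7, 4, 6, 10]

intercept, slope = fit_line(prices, sales)


def predict(price):
    """Predicted bottles per day at the given price."""
    return intercept + slope * price


print(intercept, slope)
print(predict(12))  # interpolation: inside the tested price range
print(predict(8))   # extrapolation: outside the tested range, more risky
```

With these assumed data the slope is close to -1 and the predictions are close to 8 bottles at 12 Euro and 12 bottles at 8 Euro, which agrees with the interpretation given in the text.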
3.3.7 More sophisticated Regression Methods:
Non-linear Regression:
The response Y depends on a non-linear function of the variable
x, such as an exponential function, a logarithm etc.
Solution approach: the variable x is plotted on a suitable scale
(e.g. a logarithmic scale) so that the relationship becomes linear.
Multiple Regression:
The response Y depends linearly on several independent variables
x1, x2, etc.
Solution approach: linear least squares method, which leads to a
set of normal equations that can be written in matrix form.
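The solution approach for non-linear regression can be illustrated with a logarithmic relationship: if Y = a + b · ln(x), then plotting Y against ln(x) gives a straight line that ordinary linear regression can fit. The sketch below uses hypothetical, noise-free data generated with a = 2 and b = 3; the function name `fit_line` is an assumption as well.

```python
import math


def fit_line(xs, ys):
    """Least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical data that follow y = 2 + 3 * ln(x) exactly.
xs = [1, 2, 4, 8, 16]
ys = [2 + 3 * math.log(x) for x in xs]

# Transform x to a logarithmic scale, then fit a straight line.
log_xs = [math.log(x) for x in xs]
a, b = fit_line(log_xs, ys)
print(a, b)
```

On the transformed scale the fit recovers the two parameters used to generate the data; with real, noisy data the recovered values would only be estimates.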
3.4 Causal based forecasting
[Slide for Fig. 34: Causal Factors in Demand Forecasting, Examples
and Principle. Examples of causal factors: local events, sales
promotions, calendar events, sales price changes. The time axis
shows past and future occurrences of a causal factor together with
the sales data (sales/demand of a product in a location), the
forecast data and deterministic demands.]
Figure 34: Causal Factors in Demand Forecasting, Examples and
Principle
A causal factor is an external factor with a significant
influence on the sales or demand of a product.
By assigning concrete occurrences of causal factors to either
locations or location products, the forecast can use the
information about the effects of such occurrences in the past in
order to predict their influence on future sales or demand.
Fig. 34 shows examples of causal factors together with
hypothetical sales and forecast curves that illustrate the
following principle: The correlation of the sales peaks with the
causal factor occurrences in the past is applied to future
occurrences.
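The principle can be illustrated with a deliberately simple sketch: estimate from the past how much higher sales were in periods with an occurrence of the causal factor, and apply that uplift to future periods in which an occurrence is planned. This is a strong simplification under assumed data and hypothetical names; it is not the method used by SAP Forecasting and Replenishment, which the text describes as regression-based.

```python
from statistics import mean


def estimate_uplift(sales, occurred):
    """Ratio of mean sales with the causal factor to mean sales without it."""
    with_factor = [s for s, flag in zip(sales, occurred) if flag]
    without_factor = [s for s, flag in zip(sales, occurred) if not flag]
    return mean(with_factor) / mean(without_factor)


# Hypothetical past sales and past occurrences of a sales promotion.
past_sales = [100, 104, 150, 96, 102, 160, 98]
past_promo = [False, False, True, False, False, True, False]

uplift = estimate_uplift(past_sales, past_promo)
baseline = mean(s for s, flag in zip(past_sales, past_promo) if not flag)

# Planned future occurrences: a promotion in the second future period.
future_promo = [False, True, False]
forecast = [baseline * (uplift if flag else 1.0) for flag in future_promo]
print(uplift, forecast)
```

The forecast shows a peak only in the period with the planned promotion, mirroring the curves sketched in Fig. 34.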
Figure 35: Impact of Seasonality and Causal Factors on Forecast,
Example
Forecast methods such as exponential smoothing and regression
together with causal factor analysis are used for example in
automatic replenishment software in the retail industry.
Fig. 35 shows a graphic of a forecast calculated and displayed
in SAP Forecasting and Replenishment 5.1. The consumption time
series represents a hypothetical sales curve that is characterized
by a yearly seasonal pattern, positive slopes around Christmas and
additional peaks during promotions. Promotions and Christmas
seasons were indicated as causal factors (Demand Influencing
Factors in SAP F&R) in the system. The forecast method was a
regression method taking into account both the seasonality and the
effect of causal factors. The forecast was able to reproduce all
effects.
3.5 Forecasting Performance Measures
Figure 36: Forecasting Performance Measures
Fig. 36 shows an example of a linear curve representing a
supposed forecast together with some supposed actual values,
observed after the forecast had been calculated. The question now
is: how good is the forecast? To measure the forecast quality,
there are some common measures:
Mean Forecast Error (MFE or Bias)
Mean Absolute Deviation (MAD)
Mean Absolute Percentage Error (MAPE)
Mean Squared Error (MSE)
Figure 37: Mean Forecast Error
Fig. 37 shows the mean forecast error: it is the sum of all
deviations between actual and forecast values, divided by the
number of values. Because positive and negative deviations can
cancel out, the mean forecast error does not measure the size of
the individual errors; it can only detect a systematic under- or
overshooting of the forecast (a bias).
[Slide for Fig. 38: Mean Absolute Deviation (MAD); actuals and
forecast plotted against time.]
Mean Absolute Deviation (MAD):
Measures absolute error
Positive and negative errors thus do not cancel out (as with MFE)
Want MAD to be as small as possible
No way to know if MAD error is large or small in relation to the
actual data
Period t             1   2   3   4   5   6
Forecast Ft          3   5   7   9  11  13
Actuals Dt           2   6   5  10  13  11
Absolute deviation   1   1   2   1   2   2
$MAD = \frac{1}{n} \sum_{t=1}^{n} |D_t - F_t|$
MAD = 1.5
Figure 38: Mean Absolute Deviation (MAD)
Fig. 38 shows the mean absolute deviation, which uses the absolute
deviations instead of the signed ones. As a result, positive and
negative deviations do not cancel out. However, the key figure is
hard to interpret since it depends on the magnitude and the units
of the values.
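The difference between the mean forecast error and the mean absolute deviation can be checked with the six forecast and actual values of Fig. 38. The sketch assumes that a deviation is defined as actual minus forecast; the variable names are hypothetical.

```python
forecast = [3, 5, 7, 9, 11, 13]   # F_t from Fig. 38
actuals = [2, 6, 5, 10, 13, 11]   # D_t from Fig. 38

# Assumed sign convention: deviation = actual - forecast.
deviations = [d - f for d, f in zip(actuals, forecast)]
n = len(deviations)

mfe = sum(deviations) / n                   # signed errors partly cancel
mad = sum(abs(e) for e in deviations) / n   # absolute errors do not cancel

print(deviations)
print(mfe, mad)
```

The signed deviations largely cancel, so the MFE is close to zero (about -0.17), while the MAD of 1.5 shows that the typical error is one to two units.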
Figure 39: Mean Absolute Percentage Error
Fig. 39 shows the mean absolute percentage error which gives the
mean absolute deviation as a percentage of the actual data. This is
a very common key-figure.
[Slide for Fig. 40: Mean Squared Error (MSE); actuals and forecast
plotted against time.]
Mean Squared Error (MSE): Measures variance of forecast error
Measures squared forecast error - error variance
Recognizes that large errors are disproportionately more expensive
than small errors
But is not as easily interpreted as MAD, MAPE - not as intuitive
Period t             1   2   3   4   5   6
Forecast Ft          3   5   7   9  11  13
Actuals Dt           2   6   5  10  13  11
Squared deviation    1   1   4   1   4   4
$MSE = \frac{1}{n} \sum_{t=1}^{n} (D_t - F_t)^2$
MSE = 2.5
Figure 40: Mean Squared Error
Fig. 40 shows the mean squared error, which is analogous to the
statistical variance explained earlier.
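The two remaining measures can be computed for the same six periods. The sketch uses the forecast and actual values shown in the slide of Fig. 40 and assumes that the percentage error is taken relative to the actual value; the variable names are hypothetical.

```python
forecast = [3, 5, 7, 9, 11, 13]   # F_t from Fig. 40
actuals = [2, 6, 5, 10, 13, 11]   # D_t from Fig. 40
n = len(actuals)

# MAPE: absolute deviation as a fraction of the actual value, averaged.
mape = sum(abs(d - f) / d for d, f in zip(actuals, forecast)) / n

# MSE: squared deviations weight large errors more strongly.
mse = sum((d - f) ** 2 for d, f in zip(actuals, forecast)) / n

print(f"MAPE = {mape:.1%}, MSE = {mse}")
```

The MSE of 2.5 matches the value in the figure. The MAPE of about 25% is dominated by the first period, where an error of one unit meets an actual value of only two units.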
References
[1] Walter Krämer: Statistik verstehen, Piper Verlag GmbH, München, 6th edition, 2007
[2] Walter Krämer: So lügt man mit Statistik, Piper Verlag GmbH, München, 9th edition, 2007
[3] Wikipedia, the free encyclopedia, http://en.wikipedia.org/wiki/Main_Page, search for the keywords normal distribution, Poisson distribution, Forecasting, Time Series, Linear regression
[4] Peter Mertens, Susanne Rässler (eds.): Prognoserechnung, Physica-Verlag Heidelberg, 6th edition, 2005
[5] BNET Business Dictionary (http://dictionary.bnet.com), search for the keyword Forecasting
[6] Talk given by Stephan R. Lawrence: Demand Forecasting: Time Series Models, College of Business and Administration, University of Colorado, Boulder
[7] Wikipedia, die freie Enzyklopädie, http://de.wikipedia.org/wiki/Regressionsanalyse
Copyright 2008 SAP AG. All rights reserved.
No part of this publication may be reproduced or transmitted in
any form or for any purpose without the express permission of SAP
AG. The information contained herein may be changed without prior
notice.