
International Journal of Forecasting 9 (1993) 147-161

0169-2070/93/$06.00 © 1993 Elsevier Science Publishers B.V. All rights reserved


Improving judgmental time series forecasting: A review of the guidance provided by research

Paul Goodwin*

Department of Mathematical Sciences, University of the West of England, Frenchay, Bristol BS16 1QY, UK

* Corresponding author.

George Wright

Strathclyde Graduate Business School, 130 Rottenrow, Glasgow G4 0GE, UK

Abstract

This study reviews the research literature on judgmental time series forecasting in order to assess: (i) the quality of inferences about judgmental forecasting in practice which can be drawn from this research; (ii) what is currently known about the processes employed by people when producing judgmental forecasts; (iii) the current evidence that certain strategies can lead to more accurate judgmental forecasts. A key focus of the paper is the identification of areas where further research is needed.

Keywords: Judgmental forecasting; Time series; Decomposition; Quality of judgement

1. Introduction

In recent years judgmental forecasting has received much attention in the literature. This attention has been justified by the evidence of several surveys [see, for example, Dalrymple (1975, 1987); Mentzer and Cox (1984); Sparkes and McHugh (1984)], which suggest that judgmental forecasts are widely used in business and industry. One major area of research has focused on judgmental time series forecasting, where data on the past values of the variable to be forecast are used to produce an assessment of future values. In practice, however, such forecasts will usually be based on both time series and contextual information (e.g. the knowledge that a rival is launching a competing product next month). The intention of this paper is to review the research literature in judgmental time series forecasting in order to answer four questions which will be of interest both to practitioners of judgmental forecasting and to researchers.

(1) To what extent does the literature provide a sound basis for drawing inferences about judgmental time series forecasting in practice?

(2) What does this body of research tell us about judgmental forecasting processes and their associated biases and strengths?

(3) According to the literature, how can more accurate judgmental forecasts be obtained?


(4) In which direction should future research efforts in this area be concentrated?

To examine these issues we have identified three roles for judgement that have been reported in the literature. First, there are tasks where judgmental forecasters, who have access only to time series information, are required to assess the nature of the past data pattern and to then extrapolate this into the future. A second role involves the use of judgement to adjust a statistical time series forecast, usually in the light of contextual information. Thirdly, judgmental forecasters, who have access to both time series data and contextual information, are required to produce a forecast by integrating the two types of information. We will refer to the latter as holistic forecasts if no use is made of statistical or analytical techniques, either to make the judgements or to integrate them. Of course, some judgmental forecasts are made without having time series information available (e.g. sales forecasts for a new product), but these will not be considered in this paper.

Bunn and Wright (1991) discussed the relative merits of statistical and judgmental forecasting and proposed structured interactions between the two types of forecast. This paper accepts that judgmental forecasts are in any case the most widely used. Our focus is therefore on the practicalities and details of how the accuracy of judgmental forecasts can be improved, either with or without reference to statistical models, and how future research might be designed to facilitate further improvements.

2. Research and practice

It seems reasonable to expect a practical judgmental time series forecasting task to possess many, or all, of the following characteristics.

(i) Both time series and contextual information may be available.

(ii) There may be no basis for assuming constancy in the time series pattern (i.e. no basis for assuming the past pattern will continue into the future).

(iii) There may be organizational and political influences on the forecast (Bromiley, 1987).

(iv) The forecaster may have some control [see Brown (1988); Einhorn and Hogarth (1982)] over the variable to be forecast (e.g. he may decide to launch a promotion campaign to boost flagging sales) so that the forecast may be either self-fulfilling or self-negating (where action is taken to avert the future which has been forecast).

(v) The forecast made may affect the behaviour of the environment. This may also lead to self-fulfilling or self-negating forecasts.

(vi) The forecaster may have expertise in relation to the variable to be forecast.

(vii) The forecasting task itself (as distinct from any expert knowledge) may be familiar to the judgmental forecaster.

(viii) The forecaster may have a direct interest in the outcome and hence prefer some outcomes to others [this may, for example, lead to optimism biases; see Mathews and Diamantopoulos (1990)].

(ix) There may be incentives for accurate forecasting. These may be direct (e.g. monetary losses linked to forecast error) or indirect. For example, Murphy and Brown (1985) have argued that the presence of actual and potential users provides strong motivation for effective forecasting. Moreover, a recognised expert with a reputation to preserve may also be anxious to produce accurate forecasts (Beach et al., 1986).

(x) Regular feedback on past performance may be available.

(xi) The forecast may be obtained from an interacting group of individuals [see Edmundson et al. (1988)].

Many of these characteristics have been absent from research studies, particularly those based in the laboratory. We discuss below some of the conditions which have applied in these studies and consider the extent to which they allow us to draw conclusions about forecasting in practical contexts.


2.1. Information available to subjects

Three main types of information have been made available to forecasters in studies: (i) time series information; (ii) labels (e.g. information that the series represents costs or sales); (iii) contextual information (e.g. financial information about a company, details of its market, etc.). As we discuss below, the use of labels is likely to give a context to a forecasting task. However, to simplify our discussion, the phrase ‘time series information’ will be used whether labels are absent or present. Any information in addition to time series and labels will be referred to as ‘contextual information’.

2.1.1. Time series information

When time series observations are provided, the judge usually has a set of autocorrelated cues available to produce a forecast. This means that results of other judgmental studies involving non-autocorrelated cues may not generalise to these tasks. For studies involving only time series information to have implications in a given practical situation, one or more of the following conditions must be met.

(1) Making forecasts on the basis of time series information alone is the only practical option. It is, however, unclear how many real situations exist where no contextual information is available to the forecaster.

(2) Even if contextual information is available, the most desirable option is to make forecasts purely on the basis of time series information. However, most studies suggest that the use of contextual information leads to improved forecasts [see, for example, Edmundson et al. (1988); Sanders and Ritzman (1992)].

(3) The studies must describe time series judgements accurately in the real situation where contextual information is also available and where other practical considerations apply. For this to hold good we must assume that, for the unaided forecaster, time series judgements are a separable part of the judgmental process, so that any heuristics or biases identified in these studies will still apply when contextual information is available. Yet it has been found in studies of judgement in decision making [see Einhorn and Hogarth (1981); Payne (1982)] that decision behaviour is sensitive to even minor changes in task and context. Moreover, Beach et al. (1986) have suggested that if information provision is perceived to be poor, subjects may expend little effort in attempting to produce accurate forecasts. Thus, a poor performance in extrapolating a time series may not result when this is part of a wider task for which contextual information is also available.

(4) It must be desirable to prescribe separability, that is, decomposition of the task into time series judgements and judgements based on contextual information. We will discuss later how such a decomposition might be achieved, but we believe that it is in this area that studies of judgement based only on time series information may have their greatest potential value.

2.1.2. The effect of labels

In many tasks reported, the subject received a time series and a label indicating that the variable to be forecast represented, for example, monthly costs or sales. No other information was given. Sniezek (1986) found that the use of labels in cue probability learning tasks created expectations regarding function form and nature. One implication of this is that the subject may interpret the nature of a task in a different manner to that of the experimenter. For example, in a linear trended artificial data series labelled ‘sales’, the experimenter might seek to evaluate subjects’ ability to extrapolate the past linear trend. Yet Lawrence and Makridakis (1989), for example, found that when subjects were asked to make forecasts for an artificial linear trended series labelled ‘unit sales’ and ‘Year 1’, ‘Year 2’ etc., they tended to damp upward trends when extrapolating them, reflecting “a common sense view... consistent with empirical findings... since a growth of seven years (in sales) might well precede some lean years”. It seems that labels should not be used casually by the experimenter.

2.1.3. Time series and contextual information

One way of simulating operational forecasting problems is to give subjects both time series and contextual information about the variable to be forecast [see, for example, Wolfe and Flores (1990)]. However, an information package given to subjects may not be the same in content as the informal knowledge base which is present in an operational setting. The cues are pre-selected by the experimenter, which may imply that they should be used, and the information given may be perfectly reliable. In practice, relevant information, which is often ambiguous and contradictory, has to be selected from an unbounded range of sources, many of which may be unreliable. In this way, reliability, in addition to relevance, has to be considered in the cue selection and weighting task of the judgmental forecaster. There is some evidence, albeit from non-forecasting tasks [see Gaeth and Shanteau (1984)], that irrelevant information will adversely affect judgements.

2.2. Type of series used

Many studies have involved subjects extrapolating artificially generated time series. This has the advantage that it allows subjects’ performances to be assessed under a wide range of controlled conditions. The assumptions made by the experimenter in these studies have generally been: (i) constancy, i.e. that the pattern observed in the past will continue into the future; (ii) that the function which was used to generate the series is the optimal forecasting model.

As we stated earlier, the subject may, with good reason, fail to make the assumption of constancy, especially if labels have been used. Hence, biases reported by these studies may not apply in practical forecasting environments. Nevertheless, studies of the ability of forecasters to extrapolate series under the assumption of constancy may still have a valuable role in practical contexts: a possible procedure will be discussed later for formally integrating time series and contextual judgements, involving, as a first stage, extrapolation under the assumption of constancy.

With regard to the second assumption, it may be possible to find other functions which fit the sample of the time series provided to the subject just as well as does the globally optimal function. For example, Wagenaar and Sagaria (1975) suggested from their experiments that people grossly underestimate exponential growth. However, the subjects in the Wagenaar and Sagaria study were not told that growth was exponential and had no reason to expect that this functional form applied. Jones (1979) has pointed out that the five data points which were presented to Wagenaar’s subjects could be fitted by polynomials at least as well as by exponential functions. If this is done the bias of the subjective forecast is much less. What the Wagenaar and Sagaria study might tell us is not that people are poor at extrapolating exponential growth but that, in the absence of other information, they are unlikely to select an exponential model when a set of other plausible models is available.
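Jones’s point is easy to reproduce numerically. The Python sketch below uses hypothetical data points and an assumed growth rate: a quadratic fitted to five observations drawn from an exponential curve is nearly indistinguishable within the sample, but the two functions diverge sharply when extrapolated.

```python
# Illustrative sketch of Jones's (1979) point: a handful of points from an
# exponential process can be fitted almost as well by a low-order polynomial,
# so subjects shown only five points have little reason to choose the
# exponential model. The data below are hypothetical.
import numpy as np

t = np.arange(1, 6)              # five observation times
y = 10 * np.exp(0.5 * t)         # exponential growth (assumed generating process)

quad = np.polyfit(t, y, 2)       # quadratic fitted to the same five points
y_quad = np.polyval(quad, t)

print("exponential values:", np.round(y, 1))
print("quadratic fit:     ", np.round(y_quad, 1))

# Within the sample the two are close; extrapolating both to t = 10 shows
# how far they diverge outside the observed range.
print("exp at t=10: ", round(10 * np.exp(0.5 * 10), 1))
print("quad at t=10:", round(float(np.polyval(quad, 10)), 1))
```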

2.3. Forecasting environment

The obvious advantage of laboratory based studies, besides being easier to implement than real-world studies, is the ability to control and replicate sets of conditions which are relevant to the study. The extent to which laboratory studies can simulate the real world has been discussed extensively over the years, particularly in relation to probability judgements [see, for example, Winkler and Murphy (1973); Goodwin and Wright (1991)]. We therefore confine ourselves here to a few comments specific to judgmental time series forecasting. Clearly, in a laboratory study it is difficult, if not impossible, to assess the impact of the organisational and political factors that may impinge on judgmental forecasting. Nor is the forecaster likely to have any preference for a particular outcome before making the forecast. Also, because the forecast will not be used for a real-world decision, some of the motivation for producing an accurate forecast will probably be lost [see Murphy and Brown (1985)], though the use of other rewards for accuracy, such as prizes, may compensate for this. In the absence of such incentives, little effort may be expended on the forecasting task by subjects [see Beach et al. (1986)].

Armstrong (1985) has distinguished between ‘intentions’, where people make statements about the future behaviour of things they can control, and ‘opinions’, where the judge has little or no control over the variable to be forecast. Situations where the judge has some control or influence over the forecasting environment have not, in general, been a feature of studies in judgmental time series forecasting, though this may be a common characteristic of practical forecasting tasks [see, for example, Edmundson et al. (1988)].

2.4. Expertise of subjects

The fact that subjects participating in many studies are undergraduate students, who may lack both expertise in the variable to be forecast and experience of the forecasting task, has also been discussed extensively elsewhere and the debate will not be replicated here [see Beach et al. (1986), for a fuller discussion]. Clearly, such subjects may have problems in making full use of any contextual information which is supplied to them. Moreover, Beach et al. point out that it is likely that the effort which subjects are prepared to invest in the forecasting task will be reduced if the task is not perceived to be within their own particular area of expertise.

3. Studies describing the judgmental forecasting process

Bearing in mind the reservations we have expressed about the quality of inferences which can be drawn from the research literature, we next consider what it tells us about the processes adopted by people to produce judgmental forecasts and about their ability to make accurate forecasts under different conditions.

3.1. Studies investigating judgmental time series extrapolation

Eggleton (1982) argues that judgmental time series extrapolation involves three stages: (i) assessing the nature of the underlying data generation process; (ii) representing the characteristics of the process cognitively; (iii) attempting to generate bias-free forecasts from this cognitive representation. However, we need a greater understanding of the cognitive processes used by forecasters in this task, under different conditions. For example, Harvey (1988) has suggested that, while individuals are able to acquire internal representations of the process used to generate data points (though this was only tested on series generated by a first order autoregressive process and may not be the case in general), this internal representation is not used in the forecasting task and simplifying heuristics are used instead. Some studies [see, for example, Andreassen and Kraus (1990); Lawrence and O’Connor (1992)] have suggested that these heuristics can be modelled as exponential smoothing. However, while Lawrence and O’Connor obtained a reasonable fit for exponential smoothing (R² = 0.75), they did not examine a wide range of alternative models. Moreover, a study by Harvey et al. (1991) did not produce results consistent with exponential smoothing. Unlike the Lawrence and O’Connor series, theirs was cyclical, containing very low levels of noise, and subjects were presented with tabular, rather than graphical, information.
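For readers unfamiliar with the model being fitted in these studies, the following is a minimal Python sketch of simple exponential smoothing; the series and smoothing constant are illustrative assumptions, not data from any of the studies cited.

```python
# Minimal sketch: simple exponential smoothing as a candidate model of
# judgmental extrapolation (cf. Lawrence and O'Connor, 1992). The series
# and alpha value below are illustrative assumptions, not data from the study.

def exponential_smoothing(series, alpha):
    """Return one-step-ahead forecasts: f[t+1] = alpha*y[t] + (1-alpha)*f[t]."""
    forecasts = [series[0]]  # initialise the first forecast at the first observation
    for y in series[:-1]:
        forecasts.append(alpha * y + (1 - alpha) * forecasts[-1])
    return forecasts

sales = [100, 104, 101, 107, 110, 108, 113]   # hypothetical time series
print(exponential_smoothing(sales, alpha=0.4))
```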

A number of studies have assessed the performance of judgmental forecasters in extrapolating different types of series. Time series can be characterised by:

(i) the complexity of the underlying signal (e.g. levels of complexity might vary from stationary, through linear trend and non-linear trend, to trend and seasonal);

(ii) the level of noise around the underlying signal;

(iii) the stability of the underlying signal. For example, there may be sudden changes to a new underlying mean level (steps), gradual changes to new levels (ramps), or a trended series might exhibit changes or reversals in the trend, etc.

It is not yet possible to draw firm practical conclusions from research in this area. Typically, studies have assessed the performance of forecasters in laboratory conditions when only time series information (sometimes with labels) was available. Moreover, some studies have yielded contradictory results, suggesting that a contingency model of forecasting may be appropriate [see Payne (1982)], with forecasters’ judgements being highly sensitive to small differences between tasks. A further problem is that few studies have distinguished between (ii) and (iii) above. Thus, series described as ‘volatile’ or ‘not well behaved’ could manifest either high noise around a stable signal, an unstable underlying signal, or both. In particular, measures like the coefficient of variation of a series do not distinguish between these two characteristics, yet one might expect the relative performance of statistical and judgmental methods to depend closely upon them. The conclusions which follow are therefore only tentative.
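The point that a single variability measure confounds noise and signal instability can be illustrated with two artificial series. The Python sketch below uses made-up numbers: one series is a stable mean buried in heavy noise, the other a near-noiseless step change, yet both yield a similar coefficient of variation.

```python
# A minimal sketch (with made-up numbers) of the distinction drawn above:
# a summary measure such as the coefficient of variation cannot separate
# high noise around a stable signal from a stable low-noise signal that
# shifts to a new level (a 'step').
import numpy as np

rng = np.random.default_rng(0)

stable_noisy = 100 + rng.normal(0, 10, size=40)        # stable mean, high noise
step_low_noise = np.r_[np.full(20, 90.0),
                       np.full(20, 110.0)] + rng.normal(0, 1, size=40)  # step, low noise

for name, s in [("stable/noisy", stable_noisy), ("step/low-noise", step_low_noise)]:
    cv = s.std() / s.mean()
    print(f"{name}: coefficient of variation = {cv:.3f}")

# Both series show a similar coefficient of variation, yet one should favour
# statistical smoothing and the other rapid judgmental reaction.
```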

In stable, but noisy, series the balance of evidence appears to be that statistical time series methods tend to outperform judgmental extrapolation. For example, Lawrence and O’Connor (1992) found that judges performed poorly, relative to a statistical method, on stable, untrended series. It seems that noise in a series will mask the true signal and also lead the judge to identify spurious signals. People have difficulty in handling the concept of randomness [see Slovic et al. (1974); Tversky and Kahneman (1974)]. In particular, they expect a randomly generated series to have too many short runs, so that long runs are perceived as non-random and alternating sequences are perceived as random [see Wagenaar (1972); Kahneman and Tversky (1972); Ayton et al. (1989)]. This occurred in studies by Eggleton (1976, 1982), who also found that subjects were more likely to read a systematic pattern into a random series when the variability of the series increased. Similarly, in Sanders’ (1992) study, judgmental extrapolation performed consistently worse than a statistical method for stable series having high noise levels, with the gap widening with increases in noise level.

However, studies which have tested subjects’ abilities to detect a known trend from noisy data have yielded apparently conflicting results. Thus, Andreassen and Kraus (1990) found that the tendency to discern a trend was greater when the signal was strong relative to the noise level, while Mosteller et al. (1981) and Lawrence and Makridakis (1989) suggest that the level of noise has no effect on the propensity to identify a trend. However, subjects in the Andreassen and Kraus study were given only the latest value in the series and the change from the previous value, while subjects in the other two studies received graphs of the series.

Studies of the extrapolation, rather than the detection, of trends have generally suggested that judges exhibit biases. For linear trended series, a consistent tendency to under-forecast growth has been reported [see Eggleton (1982); Lawrence and Makridakis (1989); Sanders (1992)]. However, all of these studies were based on artificial data, so that judges may have been damping the trend on the assumption that long term linear growth is unrealistic. Experiments suggesting serious biases in the extrapolation of exponential growth [see Wagenaar and Sagaria (1975); Timmers and Wagenaar (1977)] may also reflect different perceptions of the experiment by subjects (who were not told that growth was exponential) and experimenters, as we discussed earlier. However, even when subjects were told that an exponential generating function applied, Keren (1983) found that growth was still grossly underestimated. Jones (1984) has suggested that this is because subjects perceive that such growth is based on a simple quadratic function.

It thus appears that an unaided judgmental forecaster will have greater difficulty, relative to a statistical method, in extrapolating more complex stable underlying signals from noisy data, and there appears to be a reluctance by the judge to take the complexity fully into account, so that simplifying heuristics are employed. Thus, Sanders (1992) found that the accuracy of judgmental extrapolation deteriorated, both absolutely and relative to a statistical method, as the underlying signal became more complex.

When the underlying signal in a series is unstable it is even more difficult to draw firm conclusions about the performance of judgement because of the difficulty of apportioning the variability in the data between noise and signal instability. However, unstable signals are difficult to model statistically and Lawrence (1983), Edmundson et al. (1988) and Sanders and Ritzman (1992) all found that judgmental extrapolation outperformed or rivalled statistical time series methods for ‘more unstable micro-economic’, ‘more-volatile non-key product’ and ‘higher variability’ series, respectively. Similarly, Sanders (1992) found that judgement outperformed statistics for a series exhibiting a step function and low noise (this was not the case for a high noise step function). Thus, it may be that judgmental forecasters’ greater propensity to read system into randomness, which is arguably a disadvantage for stable series, leads them to react more quickly than statistical methods to fundamental changes in the underlying data pattern. In a real situation, if contextual information was also available then the judge might also be able to anticipate or explain these fundamental changes and would therefore have an even greater advantage over statistical time series methods.

3.2. Studies investigating judgmental adjustments to statistical forecasts

The main focus of research in this area has been to assess whether judgmental adjustments can improve statistical time series forecasts. In general, it appears that such adjustments will lead to greater accuracy. Willemain (1989) evaluated judgmental adjustments to statistical forecasts when only time series information was available and found that improvements in general were moderate, though they were most marked for difficult-to-forecast series. In practice, it seems reasonable to assume that adjustments are more likely to be based on contextual information rather than a belief that the statistical model has incorrectly projected the time series pattern. Studies carried out when such information was available, and in a practical setting, suggest, first, that judges are able to recognise statistical forecasts which need adjusting [see Mathews and Diamantopoulos (1990)] and, second, that these adjustments lead to greater accuracy [see Mathews and Diamantopoulos (1986, 1989)]. Similar results from a laboratory study were obtained by Wolfe and Flores (1990). Carbone et al. (1983) found that judgmental adjustment did not improve forecasts for series from the ‘M Competition’ [see Makridakis et al. (1984)], but it is unlikely that the student subjects were in possession of useful contextual information.

As yet we know little about the processes adopted by forecasters in the adjustment task. For example, do they formulate a judgmental forecast independently of the statistical forecast and then arrive at a compromise between the two (indeed they may simply ignore the statistical forecast so that the resulting forecast is purely based on judgement), or do they use the statistical forecast as an anchor? To what extent are adjustments carried out to improve time series extrapolations (as in Willemain’s study), rather than to take into account contextual information? In particular, it would be useful to know if conditions exist when a holistic forecast is more likely to be accurate than an adjusted statistical forecast, or to be aware of the circumstances where simple averages of judgmental and statistical forecasts are more accurate than judgmental adjustments.

3.3. Studies investigating holistic forecasts

Most studies in this area relate to real-world forecasting. A large number have focused on corporate earnings forecasts, the object being to determine whether statistical time series methods outperform the judgmental forecasts of managers and analysts [see, for example, Brown and Rozeff (1978); Kodde and Schreuder (1984)]. Because of the real world context of these studies and the consequent difficulties in controlling the forecasting environment, we know little about the processes by which judgmental forecasters arrive at their forecasts and, in particular, how they integrate time series with contextual information. Hogarth and Makridakis (1981) have tabulated an extensive list of biases which might apply in holistic judgmental forecasting, but many of these have been identified in domains other than forecasting and in laboratory, rather than real-world, contexts. Indeed, it appears that heuristics used in holistic forecasting may vary. For example, Bromiley (1987) found that the anchoring and adjustment heuristic of Tversky and Kahneman (1974), where the anchor is the current value of the variable to be forecast, was not universally adopted by forecasters in different organisations. Currently, we have little or no knowledge about the effect of organisational and political factors on judgmental forecasts.

4. Strategies for improving judgmental forecasts

4.1. Using decomposition

Given that the capacity of the human mind to process information is limited, judgements may be improved by using decomposition, which Armstrong (1985) defines as “the strategy of breaking a problem into subproblems, solving them, and then combining the solutions to the subproblems to get an overall solution”. The central idea is that the subproblems require judgements which are simpler than a holistic judgement of the original problem.

There are a number of different approaches to decomposition. In decision analysis, for example, recomposition of constituent assessments, such as probabilities, is founded on axiom-based theory [see Goodwin and Wright (1991)]. Axiom-based decompositions have the advantage that they allow coherence checks on the elicited judgements. Other decompositions appeal more through their ‘intuitive reasonableness’. One such approach is the algorithmic approach presented by MacGregor and Lichtenstein (1991). An algorithm is a series of steps or operations which, when sequentially applied, produce a solution to a problem. Such information combination is mechanical. A further distinction can be made between ‘mechanical decompositions’, which do not engage the judge in any form of interaction or iteration in order to try and remove judgmental biases [see, for example, MacGregor and Lichtenstein (1991); Wright et al. (1993)], and ‘structure modifying’ techniques, where the user is encouraged or forced to understand the internal logic of the decomposition, rather than following the procedure blindly [see Keren (1992)].

Surprisingly, few studies have evaluated decomposition as a strategy for producing more accurate judgmental time series forecasts. One exception is Edmundson’s (1990) study, which found that the decomposition of judgements about complex time series into trend, seasonal and random components, using a computerised graphical procedure, produced forecasts which were significantly more accurate than holistic forecasts produced from hardcopy graphs. Several other studies have employed decomposition strategies. Wolfe and Flores (1990) used Saaty’s Analytic Hierarchy Process (AHP) [see Saaty (1990)] to adjust statistical forecasts of company earnings, while Flores et al. (1992) applied the centroid method to these forecasts. Saaty and Vargas (1991) describe the use of the AHP in oil price and foreign exchange rate forecasting, and Abrahamson and Finizza (1991) have used belief networks to forecast oil prices [see also Gibson (1991)]. However, none of these studies compared the accuracy of forecasts based on decomposition with holistic forecasts or adjustments. Nor have there been any direct comparisons of the effectiveness of different decomposition methods, with the arguable exception of the Flores et al. study.
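As a concrete illustration of the kind of judgement Edmundson’s procedure structures, the Python sketch below performs a generic trend-plus-seasonal decomposition and recomposes the components into a forecast. It is a simplified stand-in using assumed quarterly data, not a reconstruction of Edmundson’s actual software.

```python
# Generic sketch of decomposing a series into trend and seasonal components
# and recomposing them into a forecast, in the spirit of (but not identical
# to) Edmundson's (1990) graphical procedure. Data are hypothetical.
import numpy as np

period = 4                                        # assumed quarterly seasonality
y = np.array([102, 120, 95, 88, 112, 130, 104, 97,
              121, 140, 113, 106], dtype=float)   # three hypothetical years

t = np.arange(len(y))
slope, intercept = np.polyfit(t, y, 1)            # fitted (or judged) linear trend
trend = intercept + slope * t

detrended = y - trend
seasonal = np.array([detrended[i::period].mean() for i in range(period)])

# Forecast the next year by extrapolating the trend and adding back the
# seasonal component for each quarter.
t_future = np.arange(len(y), len(y) + period)
forecast = intercept + slope * t_future + seasonal
print(np.round(forecast, 1))
```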

Although it seems reasonable to speculate that decomposition will, in general, lead to more accurate judgmental forecasts in real situations when contextual information is available, the case as yet remains unproven. Moreover, it is difficult to draw inferences from other areas where decomposition is applied to judgement. In decision analysis, for example, decompositions cannot be evaluated for validity: we can never be certain that a decision model is a faithful representation of a decision maker’s preferences. Of course, decomposition is not guaranteed to improve judgements [see, for example, Lyness and Cornelius (1982)]. Ravinder et al. (1988), Ravinder and Kleinmuntz (1991) and Ravinder (1992) have shown, in the context of probability estimation and utility elicitation, that decomposition can increase consistency in judgements but cannot reduce bias. Moreover, determining the optimal structure of the decomposition and, in particular, determining the optimal level of specificity for judgmental inputs into a forecast is itself likely to be an intuitive and ad hoc judgement. Furthermore, MacGregor and Lichtenstein’s results indicate that algorithms should be applied with caution. Though people can be trained to produce them, erroneous knowledge accessed as the result of a self-produced algorithm may lead judgements astray, without a warning that it is doing so, because of the surface coherence achieved by algorithmic decomposition.

Clearly, future research should recognise that decomposition needs to be applied flexibly and sensitively. The approach is unlikely to be effective where: (i) the decomposition is mechanical and/or the judge is sceptical about the decomposition technique which is being employed; (ii) there is unfamiliarity with the technique used for decomposition and the type of judgements required by it; (iii) the judgements required by decomposition are actually more complex psychologically than holistic judgements [see Slovic et al. (1977)]; (iv) the judge experiences boredom or fatigue because the duration of the task is lengthened and the number of judgements increased.

In addition, the relative value of both axiom-based and ‘intuitively reasonable’ decomposition methods merits close examination. For example, Edmundson’s (1990) approach, although intuitively sensible, does not contain the axiom base of approaches such as influence diagrams. Future studies should also explore the interaction between holistic forecasts and those generated by decomposition. In decision analysis, Phillips’ requisite decision modelling process [see Phillips (1984)] involves the exploration of conflicts between judgements derived from intuition and those derived from decomposition. Phillips argues that such conflicts can be exploited to help judges to resolve inconsistencies in their thinking, and to achieve a deeper understanding of problems. In this way, the decomposition-based model may be revised and improved. Such an approach may prove to be effective in judgmental forecasting.

A particular focus of decomposition research should be its use in analysing and integrating judgements based on time series information with those based on contextual information. A key problem which will need to be addressed is the extent to which the two types of judgements are separable. Is it possible to make time series extrapolations without taking into account contextual factors? There is the obvious danger to consider that double counting will occur. It may be possible to separate the judgements if contextual judgements are used to explain the residual error of the time series extrapolation (e.g. to explain outliers) or if, in making forecasts for the components of the time series (e.g. seasonality and trend), factors suggesting that the past pattern will continue can be separated from those which might lead to change. For example, we might obtain a model which contains judgmental estimates both of time series components and of the effect of explanatory variables (e.g. a short run advertising campaign) which have not been accounted for in the time series forecasts.

One possible approach is a two stage procedure, where the time series is first extrapolated under the assumption of constancy (the extrapolation could be carried out either statistically or judgmentally: hence the potential value of time series extrapolation studies which assume constancy). The time series forecast is then adjusted on the basis of contextual information. Both the judgmental time series extrapolation and the subsequent adjustments can themselves be decomposed [see, for example, Edmundson (1990); Wolfe and Flores (1990)]. While a number of papers have considered the adjustment of statistical forecasts, we know of none which have studied the adjustment of a judgmental extrapolation where the judge was in possession of contextual information.
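A minimal Python sketch of this two stage procedure follows. All figures are hypothetical: a linear fit stands in for the stage-one extrapolation under constancy, and a judged percentage uplift for a planned promotion stands in for the stage-two contextual adjustment.

```python
# Sketch of the two-stage procedure described above, under simplifying
# assumptions. Stage 1 extrapolates the series assuming constancy; stage 2
# applies an adjustment based on contextual information. All numbers are
# hypothetical.
import numpy as np

sales = np.array([240, 252, 247, 260, 268, 265], dtype=float)

# Stage 1: extrapolation under the assumption of constancy
# (a statistical linear fit stands in for a statistical or judgmental extrapolation).
t = np.arange(len(sales))
slope, intercept = np.polyfit(t, sales, 1)
baseline_forecast = intercept + slope * len(sales)

# Stage 2: judgmental adjustment for contextual information that the
# time series itself cannot reflect (e.g. a planned promotion next period).
judged_promotion_uplift = 0.08          # forecaster's judgement: +8%
adjusted_forecast = baseline_forecast * (1 + judged_promotion_uplift)

print(f"baseline: {baseline_forecast:.1f}, adjusted: {adjusted_forecast:.1f}")
```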

A concern which emerges if procedures like these are used on their own is that one must forego the potential benefits of using time series data for ‘backward inference’ [see Einhorn and Hogarth (1982)], by which is meant the use of past observations as evidence to infer the process that produced them (e.g. “our low sales last quarter were caused by adverse weather conditions”). As Einhorn and Hogarth argue: “success in predicting the future depends to a considerable degree on making sense of the past”. An obvious use of backward inference is as a learning tool: a judgmental forecaster can formulate a model by testing and honing it on past time series observations. Nevertheless, the model would relate to the past. Clearly, we need to know more about how backward inference can be implemented and the role it can play as part of formal forecasting procedures.

4.2. Improving the forecaster’s technical knowledge

Sanders and Ritzman’s (1992) definition of technical knowledge embraces not only knowledge of the logic and capability of data analysis and statistical forecasting methods, but also information on the judgmental analysis of data (e.g. visual checks for trends and runs) and the biases inherent in human judgement. Their study suggested that the possession of technical knowledge does not lead to improvements in the accuracy of judgmental forecasts. Studies by Wagenaar and Sagaria (1975) and Edmundson (1990) produced similar findings, while a study by Lawrence et al. (1985) found that subjects with technical knowledge produced better forecasts when data was presented in a tabular, rather than graphical, form. However, all these studies evaluated technical knowledge when only time series information was available. As far as we know, no researchers have yet examined whether technical knowledge will improve judgmental forecasts made by people who already have contextual knowledge.

4.3. Improving the presentation of the time series

Studies in this area have attempted to establish (i) the relative merits of graphical and tabular displays of time series and (ii) the most effective methods of presenting series graphically. There is some tentative evidence that graphical displays lead to more accurate short term extrapolations, while tabular displays may be superior for longer forecast lead times [see Lawrence et al. (1985); Angus-Leppan and Fatseas (1986); Lawrence et al. (1986)]. Lawrence et al. (1985) also found that graphical methods performed better with non-seasonal and macro-economic data (which tends to exhibit less randomness), while tabular presentations were superior for micro-economic series. The effect of varying graphical presentations has been investigated by Lawrence and Makridakis (1989) and Lawrence and O’Connor (1992). The former study found that the greater the space on the graph above the plot of a linear trended time series, the higher the forecast tended to be (relative to a fitted regression line). The second study found that varying the scale of a graph had no effect on the accuracy of forecasts for artificial untrended series.

We need to know the extent to which the above results are robust to minor changes in tasks and the extent to which they generalise to different data patterns (e.g. from untrended to trended) and to situations where contextual information is also available. This was not the case in any of the above studies. For example, it may be that graphical displays increase the propensity of forecasters to see systematic patterns in randomness and also that they focus the subject’s attention onto the local data pattern which immediately precedes the forecast period. With tabular displays, it is possible that subjects only examine the initial digits of each data value, causing them to mentally round each number (e.g. down to the nearest thousand), so that long-term trends are identified and short-term, and relatively minor, variations are ignored; but all of this is speculation. Other suggestions have also been put forward. For example, Andreassen and Kraus (1990) have argued that for some series involving substantial change (e.g. exponential growth) there may be a need for presentation modes to focus subjects’ attention onto that change, rather than the original data.

4.4. Mathematically correcting biases

If judgmental forecasts do exhibit biases, then an obvious method for improving accuracy is the application of an arithmetic or mathematical correction. Simple arithmetic corrections can improve forecasts. For example, Ashton (1984) found that the systematic underestimation of judgmental forecasts produced by three magazine executives could be corrected by adding back the mean forecast error. The resulting forecasts outperformed a multiple regression model. Moriarty (1985) proposes a more sophisticated approach based on an optimal linear correction [see Theil (1971)], which can be obtained by regressing actual observations for each period on to the judgmental forecast for that period.
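Both corrections are straightforward to apply once a history of forecasts and outcomes is available. The Python sketch below uses hypothetical figures: the first correction adds the mean past error back onto a new forecast, in the spirit of Ashton (1984); the second passes the new forecast through a regression of actuals on forecasts, in the spirit of the Theil-type optimal linear correction discussed by Moriarty (1985).

```python
# Sketches of the two corrections mentioned above, on hypothetical data.
# (1) Additive correction: add the mean of past forecast errors back
#     onto a new judgmental forecast.
# (2) Optimal linear correction: regress past actuals on past forecasts
#     and pass new forecasts through the fitted line.
import numpy as np

forecasts = np.array([95, 100, 108, 112, 118], dtype=float)  # past judgmental forecasts
actuals   = np.array([101, 104, 115, 118, 126], dtype=float) # corresponding outcomes

new_forecast = 122.0

# (1) Additive correction for systematic under-forecasting.
mean_error = (actuals - forecasts).mean()
print("additive correction:", new_forecast + mean_error)

# (2) Optimal linear correction: actual ~ a + b * forecast.
b, a = np.polyfit(forecasts, actuals, 1)
print("linear correction:  ", a + b * new_forecast)
```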

There are, however, a number of problems associated with these methods. As Bromiley (1987) shows, biases are not ubiquitous and it would be dangerous to make corrections on the assumption that they are bound to be there. Moreover, the direction in which the adjustment should be made is not always clear. In order to be useful, corrections require the same forecaster to be in place for a reasonably long period, and the same kind of bias must apply over this period. If the forecaster’s accuracy improves over time, an adjustment based on average bias in the past would be inappropriate. Also, forecasters might deliberately exaggerate forecasts in the anticipation that they will be subsequently corrected.

4.5. Providing feedback

Another obvious strategy is the provision of feedback to the forecaster in the hope that this will facilitate learning and hence lead to more accurate forecasts. Psychologists have distinguished between outcome feedback, which simply indicates how accurate a prediction was, and task properties feedback, which yields information to the forecaster on the statistical properties of the task (e.g. it may reveal correlations between the cues available to the forecaster and the variable to be forecast).
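The distinction is easy to make concrete. In the hypothetical Python sketch below, outcome feedback amounts to reporting each period’s forecast error, while task properties feedback reports structural facts about the task, such as which cues correlate with the outcome.

```python
# A toy illustration (hypothetical numbers) of the two kinds of feedback
# distinguished above: outcome feedback reports only the error of each
# forecast, while task properties feedback reports statistical features of
# the task, such as cue-criterion correlations.
import numpy as np

cue_advertising = np.array([5, 7, 6, 9, 8], dtype=float)      # hypothetical cue
cue_temperature = np.array([18, 12, 25, 14, 20], dtype=float) # hypothetical cue
outcome_sales   = np.array([52, 68, 59, 85, 77], dtype=float)
forecast_sales  = np.array([55, 60, 62, 80, 70], dtype=float)

# Outcome feedback: how wrong was each forecast?
print("errors:", outcome_sales - forecast_sales)

# Task properties feedback: which cues actually relate to the criterion?
for name, cue in [("advertising", cue_advertising),
                  ("temperature", cue_temperature)]:
    r = np.corrcoef(cue, outcome_sales)[0, 1]
    print(f"correlation of {name} with sales: {r:.2f}")
```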

Most studies of feedback have been based on tasks which have uncorrelated cues and a static environment, that is, the underlying rules of the system remain constant during the learning process, conditions which are unlikely to apply in most practical forecasting situations. These studies have focused on the ability of subjects to identify functional relationships between cues and criterion and on their ability to assign weights to, and aggregate information from, multiple cues [see Klayman (1988) for a review]. Research suggests that outcome feedback has little value, while task properties feedback can be effective. It is thought that the random element of outcome feedback may confuse subjects [see, for example, Hammond et al. (1973); Kessler and Ashton (1981)].

Few studies have directly investigated the role which feedback might play in judgmental time series forecasting. One exception is a study by Mackinnon and Wearing (1991) which found that the gross underestimation of exponential growth reported by Wagenaar and Sagaria (1975) and others did not occur when subjects received immediate outcome feedback. Indeed, subjects’ average errors were extremely low. However, this task involved only deterministic time series information, conditions which might have been favourable to outcome feedback.

A large number of important questions remain open. For example, to what extent are the results of ‘general’ research into feedback applicable to judgmental time series forecasting? Moreover, little research has been carried out into the role of feedback in cue discovery, that is, the process of determining the appropriate cues for a forecasting task. Clearly, this process is particularly important when the forecaster has to make judgements involving contextual information. Indeed, Klayman (1988) has argued that cue discovery may be a far more important aspect of learning than the identification of function forms and the aggregation of cue information. He reports an experiment which showed that people could use outcome feedback to discover valid cues, but learning was far too slow (involving 700 trials) to be effective in most practical forecasting tasks.

We also need to know more about feedback in dynamic forecasting environments where, for example, the relative importance of cues may change over time [see, for example, Sniezek (1986)] and where the forecast itself affects the behaviour of the environment. It appears that in such environments learning is particularly inhibited by delays in feedback. For example, Mackinnon and Wearing’s (1991) study of forecasts of exponential growth found that delayed outcome feedback appeared to reduce the ‘control’ that subjects had over their forecasts, so that forecasts became much more widely dispersed around the true values of the series. Such reductions in control could clearly produce instability in systems where there is interaction between forecasts and outcomes.

The importance of using backward inference to enable the forecaster to learn about the processes underlying time series observations was discussed earlier. Few studies have investigated the use of feedback to learn how to infer causes from outcomes, yet the learning process here may differ from that involved in predictive tasks. For example, prediction requires aggregation of information, while a diagnostic task involves categorical discrimination [see Klayman (1988)]. Indeed, we need to learn much more about how both of these processes operate in order to determine what information to feed back to the forecaster and how to supply it.

4.6. Combining forecasts or using groups of forecasters

A further strategy for improving judgmental time series forecasts involves combining them, either with other judgmental forecasts or with statistical forecasts. The combination can be made either mathematically (e.g. by taking a simple average of the constituent forecasts) or judgmentally, by an individual or a group of people. Ferrell (1985), in discussing judgmental estimates in general, refers to the commonly held assumption that a judgement obtained from a group will be of a higher quality than that obtained from an individual. Similarly, Blattberg and Hoch (1990) give reasons why improved forecasts are likely to emanate from the combination of judgmental and statistical forecasts. Their argument centres on the complementary strengths of the two methods, such as the consistency, but rigidity, of statistical models and the inconsistency, but greater adaptability to changing conditions, of human judges. However, there are likely to be limits to the improvement which can be achieved by combination. Ferrell suggests that improvements are likely to be less when there is a high intercorrelation between the constituent forecasts (so that each additional forecast will bring little new information to the combination) and little will be gained by using more than a small number of constituent forecasts [see also Ashton and Ashton (1985)]. Moreover, the mean of a set of biased judgements will still be inaccurate [see Einhorn et al. (1977)].

In the context of judgmental forecasting based only on time series information, several combination approaches have been evaluated. Lawrence et al. (1986) found that combination did improve judgmental extrapolations and that a simple average of judgmental and statistical forecasts was more effective than combinations of separate judgmental forecasts. In contrast, Carbone and Gorr (1985) and Angus-Leppan and Fatseas (1986) found that combined forecasts were less accurate, but both these studies used judgmental combination rather than averages. This was found to be the least effective method in the Lawrence et al. study, and it is possible that judgmental combination may suffer from a tendency to anchor onto, and under-adjust from, the first forecast in the combination. Alternatively, undue weight may be attached to apparently rigorous statistical forecasts. One study where contextual information was available, and which was based on the real holistic forecasts of buyers and managers, also found that a simple average of forecasts based on a statistical (multiple regression) model and judgmental forecasts led to increased accuracy [see Blattberg and Hoch (1990)].
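A toy Python example (hypothetical numbers) shows why mechanical averaging can help, and why gains shrink when the constituent forecasts are highly correlated:

```python
# A minimal sketch (hypothetical numbers) of mechanical combination by
# simple averaging, and of why highly intercorrelated constituent forecasts
# add little: the average can only cancel errors that differ in sign or size.

actual = 100.0
judgmental  = 108.0   # over-forecasts
statistical = 94.0    # under-forecasts

combined = (judgmental + statistical) / 2
for name, f in [("judgmental", judgmental),
                ("statistical", statistical),
                ("simple average", combined)]:
    print(f"{name}: forecast {f:.1f}, absolute error {abs(f - actual):.1f}")

# If the two forecasts were highly correlated (e.g. 107 and 109), the
# average (108) would inherit essentially the same error.
```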

Other studies have evaluated the role which groups of forecasters can play in extrapolating time series. Sniezek (1989) compared the mean of group members’ forecasts with those based on Delphi [see Parente and Anderson-Parente (1987)], Best Member (as judged by the group before the forecast) and consensus (based on face to face discussion). She found no difference between the techniques and concluded that this was because all the group members shared the same information (namely the time series data). This may not generalise, she suggested, to situations where group members have both unique and shared information. However, Ang and O’Connor (1991) found that, for difficult-to-forecast series, a modified consensus procedure (where one individual’s forecast formed the basis for the group’s discussion) outperformed forecasts based on the group mean, consensus and nominal group techniques [see van de Ven and Delbecq (1971)]. The authors suggested that having a single estimate as the basis for discussion avoided the information overload which might occur when every group member tables a forecast. Apart from Blattberg and Hoch, all the studies considered here were based only on time series information, so their results may not be generalisable to operational forecasting. For example, in the Ang and O’Connor study, the group discussion could only focus on the pattern in the time series (no labels were supplied). In a practical forecasting situation, one would expect the group to have a much richer body of information to consider and debate.

5. Conclusions

To date, much valuable research work has been carried out into judgmental time series forecasting. However, we believe that future research should concentrate on two general areas. First, there is a need for a much greater understanding of the cognitive processes adopted in judgmental forecasting tasks, in order to design improved forecasting support systems and strategies. Much work has been carried out so that we might develop an understanding of the cognitive processes involved in decision making [see, for example, Payne (1982)] and it may be useful to explore the extent to which this has implications for forecasting. Another possible method of enquiry involves the use of verbal protocol data (i.e. the requirement that subjects should describe how they are making the forecast). This has also been suggested by others, including Lawrence and O’Connor (1992).

Second, there is clearly a need for more studies to be based in the field rather than the laboratory, so that we may gain greater insights into judgmental forecasting processes in operational settings and also be able to evaluate improvement strategies in these contexts. With regard to the latter, while it is a relatively straightforward matter to evaluate improvement strategies which operate after the judgement has been made (e.g. combination with statistical forecasts, bias correction, etc.), the evaluation of strategies intended to improve judgement in the first place will present greater challenges to researchers, particularly where forecasters are reluctant to risk using untried techniques.

In addition to these general issues, we believe that there are a number of specific questions which future research should address. We need to know under which conditions decomposition is likely to outperform holistic judgement and how decomposition is best implemented in different operational contexts. Moreover, strategies need to be explored which will enable decomposition to be used to integrate judgements based on time series and contextual information. In particular, the role of judgement in relation to statistical methods needs further exploration. For example, under what conditions is it preferable to take an average of independent judgmental and statistical forecasts, rather than applying a judgmental adjustment to the statistical forecast? In addition, we need to evaluate methods for providing feedback to judgmental forecasters so that effective learning can take place, both in prediction and backward inference, and we need to establish whether technical forecasting knowledge can improve the forecasts of those who already have expert contextual knowledge. There is also much scope for the evaluation of group techniques when contextual information is available to forecasters.

Acknowledgements

The authors would like to thank Nigel Harvey, two anonymous referees and an associate editor for their helpful comments on an earlier version of this paper. George Wright's contribution to this paper was part-funded by the EEC Eurostat DOSES Project.

References

Abrahamson, B. and A. Finizza, 1991, "Using belief networks to forecast oil prices", International Journal of Forecasting, 7, 299-315.
Andreassen, P.B. and S.J. Kraus, 1990, "Judgmental extrapolation and the salience of change", Journal of Forecasting, 9, 347-372.
Ang, S. and M. O'Connor, 1991, "The effect of group interaction processes on performance in time series extrapolation", International Journal of Forecasting, 7, 141-149.
Angus-Leppan, P. and V. Fatseas, 1986, "The forecasting accuracy of trainee accountants using judgemental and statistical techniques", Accounting and Business Research, Summer, 179-188.
Armstrong, J.S., 1985, Long Range Forecasting: From Crystal Ball to Computer (Wiley, New York).
Ashton, A.H., 1984, "A field test of the implications of laboratory studies of decision making", Accounting Review, 59, 361-389.
Ashton, A.H. and R.H. Ashton, 1985, "Aggregating subjective forecasts: Some empirical results", Management Science, 31, 1499-1508.
Ayton, P., A.J. Hunt and G. Wright, 1989, "Psychological conceptions of randomness", Journal of Behavioral Decision Making, 2, 221-238.
Beach, L.R., V.E. Barnes and J.J.J. Christensen-Szalanski, 1986, "Beyond heuristics and biases: A contingency model of judgemental forecasting", Journal of Forecasting, 5, 143-157.
Blattberg, R.C. and S.J. Hoch, 1990, "Database models and managerial intuition: 50% model + 50% manager", Management Science, 36, 887-899.
Bromiley, P., 1987, "Do forecasts produced by organizations reflect anchoring and adjustment?", Journal of Forecasting, 6, 201-210.
Brown, L.D., 1988, "Editorial: Comparing judgmental to extrapolative forecasts: It's time to ask why and when", International Journal of Forecasting, 4, 171-173.
Brown, L.D. and M.S. Rozeff, 1978, "The superiority of analyst forecasts as measures of expectations: Evidence from earnings", The Journal of Finance, 33, 1-16.
Bunn, D. and G. Wright, 1991, "Interaction of judgmental and statistical forecasting: Issues and analysis", Management Science, 37, 501-518.
Carbone, R., A. Andersen, Y. Corriveau and P.P. Corson, 1983, "Comparing for different time series methods the value of technical expertise, individualised analysis, and judgmental adjustment", Management Science, 29, 559-566.
Carbone, R. and W.L. Gorr, 1985, "Accuracy of judgmental forecasting of time series", Decision Sciences, 16, 153-160.
Dalrymple, D.J., 1975, "Sales forecasting methods and accuracy", Business Horizons, 18, 69-73.
Dalrymple, D.J., 1987, "Sales forecasting practices, results from a United States survey", International Journal of Forecasting, 3, 379-391.
Edmundson, R., 1990, "Decomposition: A strategy for judgemental forecasting", Journal of Forecasting, 9, 305-314.
Edmundson, R., M. Lawrence and M. O'Connor, 1988, "The use of non-time series information in sales forecasting: A case study", Journal of Forecasting, 7, 201-211.
Eggleton, I.R.C., 1976, "Patterns, prototypes and predictions: An exploratory study", Selected Studies of Human Information Processing in Accounting, Supplement to Journal of Accounting Research, 14, 68-131.
Eggleton, I.R.C., 1982, "Intuitive time-series extrapolation", Journal of Accounting Research, 20, 68-102.
Einhorn, H.J. and R.M. Hogarth, 1981, "Behavioral decision theory: Processes of judgment and choice", Annual Review of Psychology, 32, 53-88.
Einhorn, H.J., R.M. Hogarth and E. Klempner, 1977, "Quality of group judgment", Psychological Bulletin, 84, 158-172.
Ferrell, W.R., 1985, "Combining individual judgments", in: G. Wright, ed., Behavioral Decision Making (Plenum Press, New York), 111-145.
Flores, B.E., D.L. Olson and C. Wolfe, 1992, "Judgmental adjustment of forecasts: A comparison of methods", International Journal of Forecasting, 7, 421-433.
Gaeth, G.J. and J. Shanteau, 1984, "Reducing the influence of irrelevant information on experienced decision makers", Organizational Behaviour and Human Performance, 33, 263-282.
Gibson, G.J., 1991, "A general framework for the integration of forecasting methods", LIKELY report 2.3a, Scottish Agricultural Statistics Service, University of Edinburgh, UK, July, 53-74.
Goodwin, P. and G. Wright, 1991, Decision Analysis for Management Judgment (Wiley, Chichester).
Hammond, K.R., D.A. Summers and D.H. Deane, 1973, "Negative effects of outcome feedback in multiple cue probability learning", Organizational Behaviour and Human Performance, February, 30-34.
Harvey, N., 1988, "Judgmental forecasting of univariate time series", Journal of Behavioral Decision Making, 1, 95-110.
Harvey, N., F. Bolger and A. McClelland, 1991, "Judgmental forecasting within and across correlated time series", Working paper, Department of Psychology, University College London.
Hogarth, R.M. and S. Makridakis, 1981, "Forecasting and planning: An evaluation", Management Science, 27, 115-138.
Jones, G.V., 1979, "A generalized polynomial model for perception of exponential series", Perception and Psychophysics, 25, 232-234.
Jones, G.V., 1984, "Perception of inflation: Polynomial not exponential", Perception and Psychophysics, 36, 485-487.
Kahneman, D. and A. Tversky, 1972, "Subjective probability: A judgment of representativeness", Cognitive Psychology, 3, 430-454.
Keren, G., 1983, "Cultural differences in the misperception of exponential growth", Perception and Psychophysics, 34, 289-293.
Keren, G., 1992, "Improving decisions and judgments: The desirable versus the feasible", in: G. Wright and F. Bolger, eds., Expertise and Decision Support (Plenum Press, New York), 25-46.
Kessler, L. and R.H. Ashton, 1981, "Feedback and prediction achievement in financial analysis", Journal of Accounting Research, 19, 146-162.
Klayman, J., 1988, "Learning from experience", in: B. Brehmer and C.R.B. Joyce, eds., Human Judgment: The SJT View (North-Holland, Amsterdam).
Kodde, D.A. and H. Schreuder, 1984, "Forecasting corporate revenue and profit: Time series models versus management and analysts", Journal of Business, Finance and Accounting, 11, 381-395.
Lawrence, M.J., 1983, "An exploration of some practical issues in the use of quantitative forecasting models", Journal of Forecasting, 2, 169-179.
Lawrence, M.J., R.H. Edmundson and M.J. O'Connor, 1985, "An examination of the accuracy of judgmental extrapolation of time series", International Journal of Forecasting, 1, 25-35.
Lawrence, M.J., R.H. Edmundson and M.J. O'Connor, 1986, "The accuracy of combining judgemental and statistical forecasts", Management Science, 32, 1521-1532.
Lawrence, M.J. and S. Makridakis, 1989, "Factors affecting judgmental forecasts and confidence intervals", Organizational Behaviour and Human Decision Processes, 42, 172-187.
Lawrence, M.J. and M.J. O'Connor, 1992, "Exploring judgemental forecasting", International Journal of Forecasting, 8, 15-26.
Lyness, K.S. and E.T. Cornelius, 1982, "A comparison of holistic and decomposed strategies in a performance rating simulation", Organizational Behaviour and Human Performance, 29, 21-38.
MacGregor, D. and S. Lichtenstein, 1991, "Problem structuring aids for quantitative estimation", Journal of Behavioral Decision Making, 4, 101-116.
Mackinnon, A.J. and A.J. Wearing, 1991, "Feedback and the forecasting of exponential change", Acta Psychologica, 76, 177-191.
Makridakis, S., A. Andersen, R. Carbone, R. Fildes, M. Hibon, R. Lewandowski, J. Newton, E. Parzen and R. Winkler, 1984, The Forecasting Accuracy of Major Time Series Methods (Wiley, Chichester).
Mathews, B.P. and A. Diamantopoulos, 1986, "Managerial intervention in forecasting: An empirical investigation of forecast manipulation", International Journal of Research in Marketing, 3, 3-10.
Mathews, B.P. and A. Diamantopoulos, 1989, "Judgemental revision of sales forecasts: A longitudinal extension", Journal of Forecasting, 8, 129-140.
Mathews, B.P. and A. Diamantopoulos, 1990, "Judgemental revision of sales forecasts: Effectiveness of forecast selection", Journal of Forecasting, 9, 407-415.
Mentzer, J.T. and J.E. Cox, 1984, "Familiarity, application and performance of sales forecasting techniques", Journal of Forecasting, 3, 27-36.
Moriarty, M.M., 1985, "Design features of forecasting systems involving management judgments", Journal of Marketing Research, 22, 353-364.
Mosteller, F., A.F. Siegel, E. Trapido and C. Youtz, 1981, "Eye fitting straight lines", The American Statistician, 35, 150-152.
Murphy, A.H. and B.G. Brown, 1985, "A comparative evaluation of objective and subjective weather forecasts in the United States", in: G. Wright, ed., Behavioral Decision Making (Plenum Press, New York).
Parente, F.J. and J.K. Anderson-Parente, 1987, "Delphi inquiry systems", in: G. Wright and P. Ayton, eds., Judgmental Forecasting (Wiley, Chichester).
Payne, J.W., 1982, "Contingent decision behavior", Psychological Bulletin, 92, 382-402.
Phillips, L.D., 1984, "A theory of requisite decision models", Acta Psychologica, 56, 29-48.
Ravinder, H.V., 1992, "Random error in holistic evaluations and additive decompositions of multiattribute utility", Journal of Behavioral Decision Making, 5, 155-167.
Ravinder, H.V. and D.N. Kleinmuntz, 1991, "Random error in decompositions of multiattribute utility", Journal of Behavioral Decision Making, 4, 83-97.
Ravinder, H.V., D.N. Kleinmuntz and J.S. Dyer, 1988, "The reliability of subjective probabilities obtained through decomposition", Management Science, 34, 186-199.
Saaty, T.L., 1990, The Analytic Hierarchy Process (RWS Publications, Pittsburgh).
Saaty, T.L. and L.G. Vargas, 1991, Prediction, Projection and Forecasting (Kluwer, Norwell, MA).
Sanders, N.R., 1992, "Accuracy of judgmental forecasts: A comparison", Omega, 20, 353-364.
Sanders, N.R. and L.P. Ritzman, 1992, "The need for contextual and technical knowledge in judgmental forecasting", Journal of Behavioral Decision Making, 5, 39-52.
Slovic, P., B. Fischhoff and S. Lichtenstein, 1977, "Behavioral decision theory", Annual Review of Psychology, 28, 1-39.
Slovic, P., H. Kunreuther and G.F. White, 1974, "Decision processes, rationality and adjustment to natural hazards", in: G.F. White, ed., Natural Hazards: Local, National, Global (Oxford University Press, New York), 187-205.
Sniezek, J.A., 1986, "The role of labels in cue probability learning tasks", Organizational Behaviour and Human Decision Processes, 38, 141-161.
Sniezek, J.A., 1989, "An examination of group processes in judgmental forecasting", International Journal of Forecasting, 5, 171-178.
Sparkes, J.R. and A.K. McHugh, 1984, "Awareness and use of forecasting techniques in British industry", Journal of Forecasting, 3, 37-42.
Theil, H., 1971, Applied Economic Forecasting (North-Holland, Amsterdam).
Timmers, H. and W.A. Wagenaar, 1977, "Inverse statistics and misperception of exponential growth", Perception and Psychophysics, 21, 558-562.
Tversky, A. and D. Kahneman, 1974, "Judgment under uncertainty: Heuristics and biases", Science, 185, 1124-1131.
Van de Ven, A. and A.L. Delbecq, 1971, "Nominal versus interacting group processes for committee decision-making effectiveness", Academy of Management Journal, 14, 203-212.
Wagenaar, W.A., 1972, "Generation of random sequences by human subjects: A critical survey of literature", Psychological Bulletin, 77, 65-72.
Wagenaar, W.A. and S.D. Sagaria, 1975, "Misperception of exponential growth", Perception and Psychophysics, 18, 416-422.
Willemain, T.R., 1989, "Graphical adjustment of statistical forecasts", International Journal of Forecasting, 5, 179-185.
Winkler, R.L. and A.H. Murphy, 1973, "Experiments in the laboratory and the real world", Organizational Behaviour and Human Performance, 10, 252-270.
Wolfe, C. and B. Flores, 1990, "Judgmental adjustment of earnings forecasts", Journal of Forecasting, 9, 389-405.
Wright, G., G. Rowe, F. Bolger and J. Gammack, 1993, "Coherence, calibration and expertise in judgmental probability forecasting", Organizational Behaviour and Human Decision Processes (in press).

Biographies: Paul GOODWIN is Principal Lecturer in Operational Research at the University of the West of England, Bristol. He received his B.A. in Economics from the University of Liverpool and his M.Sc. in Management Science and Operational Research from the University of Warwick. Currently, he is carrying out research for a Ph.D. He has written books on Quantitative Methods and Decision Analysis and published articles in a variety of journals. His research interests focus on the role of judgement in forecasting and decision making.

George WRIGHT is Professor of Business Administration at the Strathclyde Graduate Business School. His research interests are the human aspects of decision making and forecasting. He has published on forecasting in International Journal of Forecasting, Journal of Forecasting, Technological Forecasting and Social Change and Management Science. He currently edits the Journal of Behavioral Decision Making and is an associate editor of Decision Support Systems and Journal of Forecasting.