
Met Office Hadley Centre observations datasets - …hadobs.metoffice.com/hadcrut3/HadCRUT3_accepted.pdf


Uncertainty estimates in regional and global observed temperature changes: a new dataset from 1850

P. Brohan, J. J. Kennedy, I. Harris, S. F. B. Tett & P. D. Jones

Accepted version: December 19th 2005

Abstract

The historical surface temperature dataset HadCRUT provides a record of surface temperature trends and variability since 1850. A new version of this dataset, HadCRUT3, has been produced; benefiting from recent improvements to the sea-surface temperature dataset which forms its marine component, and from improvements to the station records which provide the land data. A comprehensive set of uncertainty estimates has been derived to accompany the data: estimates of measurement and sampling error, temperature bias effects, and the effect of limited observational coverage on large-scale averages have all been made. Since the mid-20th century the uncertainties in global and hemispheric mean temperatures are small and the temperature increase greatly exceeds its uncertainty. In earlier periods the uncertainties are larger, but the temperature increase over the 20th century is still significantly larger than its uncertainty.

1 Introduction

The historical surface temperature dataset HadCRUT [Jones, 1994, Jones & Moberg, 2003] has been extensively used as a source of information on surface temperature trends and variability [Houghton et al., 2001]. Since the last update, which produced HadCRUT2 [Jones & Moberg, 2003], important improvements have been made in the marine component of the dataset [Rayner et al., 2006]. These include the use of additional observations, the development of comprehensive uncertainty estimates, and technical improvements that enable, for instance, the production of gridded fields at arbitrary resolution.

This paper describes work to produce a new dataset version, HadCRUT3, which will extend the advances made to the marine data to the global dataset. These new developments include improvements to: the land station data, the process for blending land data with marine data to give global coverage, and the statistical process of adjusting the variance of the gridded values to allow for varying numbers of contributing observations. Results and uncertainties for the new blended, global dataset, called HadCRUT3, are presented.

2 Land-surface data

2.1 Station data

The land-surface component of HadCRUT is derived from a collection of homogenised, quality-controlled, monthly-average temperatures for 4349 stations. This collection has been expanded and improved for use in the new dataset.


2.1.1 Additional stations and data

New stations and data were added for Mali, the Democratic Republic of Congo, Switzerland [Begert et al., 2005] and Austria. Data for 16 Austrian stations were completely replaced with revised values. A total of 29 Mali series were affected: 5 had partial new data, 8 had completely new data, and 16 were new stations. Five Swiss stations were updated for the period 1864–2001 [Begert et al., 2005]. 33 Congolese stations were affected: 13 were new stations, and 20 were updates to existing stations.

As well as the new stations discussed above, additional monthly data have been obtained for stations in Antarctica [Turner et al., 2005], while additional data for many stations have been added from the National Climatic Data Centre publication Monthly Climatic Data for the World.

2.1.2 Quality control

Much additional quality control has also been undertaken. A comparison [Simmons et al., 2004] of the Climatic Research Unit (CRU) land temperature data with the ERA-40 reanalysis found a few areas where the station data were doubtful, and this was augmented by visual examination of individual station records looking for outliers. Some bad values were identified and either corrected or removed. Only a small fraction of the data needed correction, however; of the more than 3.7 million monthly station values, the ERA-40 comparison found about 10 doubtful grid boxes and the visual inspection about 270 monthly outliers.

Checking the station data for identical sequences in all possible station pairs turned up 53 stations which were duplicates of others. These duplicates have arisen where the same station data are assimilated into the archive from two different sources, and the two sources give the same station but with different names and WMO identifiers. The duplicate stations were merged and duplicate temperature data were deleted.

Also the station normals and standard deviations were improved. The station normals (monthly averages over the normal period 1961–90) are generated from station data for this period where possible. Where there are insufficient station data to achieve this for the period, normals were derived from WMO values [WMO, 1996] or inferred from surrounding station values [Jones et al., 1985]. For 617 stations, it was possible to replace the additional WMO normals (used in [Jones & Moberg, 2003]) with normals derived from the station data. This was made possible by relaxing the requirement to have data for four years in each of the three decades in 1961–90 (the requirement now is simply to have at least 15 years of data in this period), so reducing the number of stations using the seemingly less reliable WMO normals. As well as making the normals less uncertain (see the discussion of normal error below), these improved normals mean that the gridded fields of temperature anomalies are much closer to zero over the normal period than was the case for previous versions of the dataset.

Figure 1 shows the locations of the stations used, and indicates those where changes have been made.

2.2 Gridding

To interpolate the station data to a regular grid the methods of [Jones & Moberg, 2003] are followed. Each grid-box value is the mean of all available station anomaly values, except that station outliers in excess of five standard deviations are omitted.
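The gridding rule above can be sketched in a few lines of Python. This is an illustration only, not the code of [Jones & Moberg, 2003]: the function name, the `(lat, lon, anomaly, sd)` tuple layout, the square boxes, and the reading of "outlier" as an anomaly whose magnitude exceeds five times that station's standard deviation are all assumptions made here.

```python
import math
from collections import defaultdict

def grid_box_means(stations, box_size=5.0, max_sigma=5.0):
    """Average station anomalies into square lat/lon boxes, skipping outliers.

    `stations` is an iterable of (lat, lon, anomaly, sd) tuples, where `sd` is
    the station's standard deviation for that month (hypothetical layout).
    Returns {(row, col): mean anomaly}; boxes with no stations are absent.
    """
    n_rows = int(round(180.0 / box_size))
    n_cols = int(round(360.0 / box_size))
    sums = defaultdict(float)
    counts = defaultdict(int)
    for lat, lon, anomaly, sd in stations:
        # Assumed outlier test: more than `max_sigma` standard deviations.
        if abs(anomaly) > max_sigma * sd:
            continue
        # Box indices counted from 90S and 180W; clamp the 90N / 180E edge.
        row = min(int(math.floor((lat + 90.0) / box_size)), n_rows - 1)
        col = min(int(math.floor((lon + 180.0) / box_size)), n_cols - 1)
        sums[(row, col)] += anomaly
        counts[(row, col)] += 1
    # No infilling: a box gets a value only if it contains stations.
    return {key: sums[key] / counts[key] for key in sums}
```

Passing a smaller `box_size` gives a finer grid; a grid with different latitude and longitude spacing would need separate box sizes for the two axes.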

Two changes have been made in the gridding process. The station anomalies can now be gridded to any spatial resolution, instead of being limited to a 5° × 5° resolution; this simplifies comparison of the gridded data with General Circulation Model (GCM) results. Also previous versions of the dataset did some infilling of missing grid-box values using data from surrounding grid boxes [Jones et al., 2001]. This is no longer done, allowing the attribution of an uncertainty to each grid-box value. The resulting gridded land-only dataset has been given the name CRUTEM3. The previous version of this dataset, CRUTEM2, started in 1851: in CRUTEM3 the start date has been extended back to 1850 to match the marine data (section 3).

Figure 1: Land station coverage. Small black circles mark all stations, green circles mark deleted stations, blue circles mark stations added, and red circles mark stations edited. Many station edits are minor changes: involving, for instance, the correction of a single outlier.

Figure 2 shows a gridded field for an example month, at the standard 5° × 5° resolution. For comparison with GCM results, or for regional studies of areas where observations are plentiful, it can be useful to perform the gridding at higher resolution. Figure 3 shows a gridded field for the same example month, at the resolution of the HadGEM1 model [Johns et al., 2004], but only for North America.

2.3 Uncertainties

To use the data for quantitative, statistical analysis, for instance a detailed comparison with GCM results, the uncertainties of the gridded anomalies are a useful additional field. A definitive assessment of uncertainties is impossible, because it is always possible that some unknown error has contaminated the data, and no quantitative allowance can be made for such unknowns. There are, however, several known limitations in the data, and estimates of the likely effects of these limitations can be made [Rumsfeld, 2004]. This means that uncertainty estimates need to be accompanied by an error model: a precise description of what uncertainties are being estimated.

Uncertainties in the land data can be divided into three groups:

Station Error: the uncertainty of individual station anomalies,

Sampling Error: the uncertainty in a grid-box mean caused by estimating the mean from a small number of point values,

Bias Error: the uncertainty in large-scale temperatures caused by systematic changes in measurement methods.

2.3.1 Station Errors

The uncertainties in the reported station monthly mean temperatures can be further subdivided. Suppose

$$T_{actual} = T_{ob} + \epsilon_{ob} + C_H + \epsilon_H + \epsilon_{RC} \qquad (1)$$

where $T_{actual}$ is the actual station mean monthly temperature, $T_{ob}$ is the reported temperature, $\epsilon_{ob}$ is the measurement error, $C_H$ is any homogenisation adjustment that may have been applied to the reported temperature and $\epsilon_H$ is the uncertainty in that adjustment, and $\epsilon_{RC}$ is the uncertainty due to inaccurate calculation or misreporting of the station mean temperature.

The values being gridded are anomalies, calculated by subtracting the station normal from the observed temperature, so errors in the station normals must also be considered.

$$A_{actual} = T_{ob} - T_N + \epsilon_N + \epsilon_{ob} + C_H + \epsilon_H + \epsilon_{RC} \qquad (2)$$

where $A_{actual}$ is the actual temperature anomaly, $T_N$ is the estimated station normal, and $\epsilon_N$ is the error in $T_N$.

The basic station data include normals and may have had homogenisation adjustments applied, so they provide $T_{ob} + C_H$ and $T_N$; also needed are estimates for $\epsilon_{ob}$, $\epsilon_H$, $\epsilon_N$, and $\epsilon_{RC}$.

Figure 2: CRUTEM3 anomalies (°C) for January 1969 (global, 5° × 5°)

Figure 3: CRUTEM3 anomalies (°C) for January 1969 (North America, HadGEM1 model grid (1.875° × 1.25°))

Measurement error ($\epsilon_{ob}$) The random error in a single thermometer reading is about 0.2°C (1σ) [Folland et al., 2001]; the monthly average will be based on at least two readings a day throughout the month, giving 60 or more values contributing to the mean. So the error in the monthly average will be at most $0.2/\sqrt{60} = 0.03$°C and this will be uncorrelated with the value for any other station or the value for any other month.
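The arithmetic behind the 0.03°C figure is the standard error of a mean of independent readings. A minimal sketch, with a function name chosen here for illustration:

```python
import math

def monthly_measurement_error(reading_sd=0.2, n_readings=60):
    """Standard error (deg C) of a monthly mean of independent readings."""
    return reading_sd / math.sqrt(n_readings)

# Two readings a day for a 30-day month gives 60 readings.
print(round(monthly_measurement_error(), 3))  # 0.026, quoted as 0.03 in the text
```

More readings per month only make this smaller, which is why the text treats 0.03°C as an upper bound.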

There will be a difference between the true mean monthly temperature (i.e. from 1 minute averages) and the average calculated by each station from measurements made less often; but this difference will also be present in the station normal and will cancel in the anomaly. So this doesn’t contribute to the measurement error. If a station changes the way mean monthly temperature is calculated it will produce an inhomogeneity in the station temperature series, and uncertainties due to such changes will form part of the homogenisation adjustment error.

Homogenisation adjustment error ($\epsilon_H$) Inhomogeneities are introduced into the station temperature series by such things as changes in the station site, changes in measurement time, or changes in instrumentation. The station data that are used to make HadCRUT have been adjusted to remove these inhomogeneities, but such adjustments are not exact: there are uncertainties associated with them.

For some stations both the adjusted and unadjusted time-series are archived at CRU and so the adjustments that have been made are known [Jones et al., 1985, Jones et al., 1986, Vincent & Gullet, 1999], but for most stations only a single series is archived, so any adjustments that might have been made (e.g. by National Met. services or individual scientists) are unknown.

Making a histogram of the adjustments applied (where these are known) gives the solid line in figure 4. Inhomogeneities will come in all sizes, but large inhomogeneities are more likely to be found and adjusted than small ones. So the distribution of adjustments is bimodal, and can be interpreted as a bell-shaped distribution with most of the central, small, values missing.

Hypothesising that the distribution of adjustments required is Gaussian, with a standard deviation of 0.75°C, gives the dashed line in figure 4, which matches the number of adjustments made where the adjustments are large, but suggests a large number of missing small adjustments. The homogenisation uncertainty is then given by this missing component (dotted line in figure 4), which has a standard deviation of 0.4°C. This uncertainty applies to both adjusted and unadjusted data: the former have an uncertainty on the adjustments made, the latter may require undetected adjustments.

The distribution of known adjustments is not symmetric: adjustments are more likely to be negative than positive. The most common reason for a station needing adjustment is a site move in the 1940–60 period. The earlier site tends to have been warmer than the later one, as the move is often to an out-of-town airport. So the adjustments are mainly negative, because the earlier record (in the town/city) needs to be reduced [Jones et al., 1985, Jones et al., 1986]. Although a real effect, this asymmetry is small compared with the typical adjustment, and is difficult to quantify; so the homogenisation adjustment uncertainties are treated as being symmetric about zero.

The homogenisation adjustment applied to a station is usually constant over long periods: the mean time over which an adjustment is applied is nearly 40 years [Jones et al., 1985, Jones et al., 1986, Vincent & Gullet, 1999]. The error in each adjustment will therefore be constant over the same period. This means that the adjustment uncertainty is highly correlated in time: the adjustment uncertainty on a station value will be the same for a decadal average as for an individual monthly value.

Figure 4: Distribution of station homogeneity adjustments (°C), plotted as probability density against adjustment. The solid line is the distribution of the adjustments known to have been made (763 adjustments, from [Jones et al., 1985, Jones et al., 1986, Vincent & Gullet, 1999]), the dashed line is a hypothesised distribution of the adjustments required (normal, sd = 0.75°C), and the dotted line is the difference (sd = 0.4°C), and so the distribution of homogeneity adjustment error.

So the homogenisation adjustment uncertainty for any station is a random value taken from a normal distribution with a standard deviation of 0.4°C. Each station uncertainty is constant in time, but uncertainties for different stations are not correlated with one another (correlated inhomogeneities are treated as biases, see below). As an inhomogeneity is a change from the conditions over the climatology period (1961–90), station anomalies will have no inhomogeneities during that period unless there is a change sometime during those 30 years. Consequently these adjustment uncertainty estimates are pessimistic for that period.

Figure 4 also demonstrates the value of making homogenisation adjustments. The dashed line is an estimate of the uncertainties in the unadjusted data, and the dotted line an estimate of the uncertainties remaining after adjustment. The adjustments made have reduced the uncertainties considerably.

Normal error ($\epsilon_N$) For most stations, the station normal is calculated from the monthly temperatures for that station over the normal period (1961–90). So the uncertainty in the normal consists of measurement and sampling error for those data. The measurement error will be a small fraction of the monthly measurement error and can be neglected, so only the sampling error is important.

The station temperature in each month during the normal period can be considered as the sum of two components: a constant station normal value ($C$) and a random weather value ($w$, with standard deviation $\sigma_i$). If data for a station are available for $N$ of the 30 possible months during the period from which the normals are taken, and the $w$s are uncorrelated; then for stations where $C$ is estimated as the mean of the available monthly data, the uncertainty on $C$ is $\sigma_i/\sqrt{N}$. Testing this model by selecting stations where complete data are available for the climatology period and looking at the effect on the normals of using only a subset of the data confirmed that the autocorrelation is small and the model is appropriate.
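The $\sigma_i/\sqrt{N}$ behaviour can be illustrated numerically. The sketch below is a synthetic stand-in for the station-subset test described above, not the authors' procedure: it assumes uncorrelated Gaussian weather values and a normal value of zero, and simply measures the spread of the mean of $N$ such values over many trials. The function name and parameters are illustrative.

```python
import random
import statistics

def subsample_normal_spread(sigma_i, n, trials=20000, seed=0):
    """Spread (standard deviation) of a normal estimated from n monthly values.

    Each trial draws n uncorrelated weather values w ~ N(0, sigma_i) around a
    normal C = 0 (synthetic data, an assumption) and records their mean.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(trials):
        sample = [rng.gauss(0.0, sigma_i) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return statistics.pstdev(means)

# With sigma_i = 2.0 and n = 16 the model predicts 2.0 / sqrt(16) = 0.5;
# the simulated spread should be close to that value.
spread = subsample_normal_spread(2.0, 16)
```

If the weather values were strongly autocorrelated, the simulated spread would exceed $\sigma_i/\sqrt{N}$, which is the effect the subset test in the text was checking for.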

The station normals used fall into three groups [Jones & Moberg, 2003]. The first group are those where data are available for all months in 1961–90; these normals are given an uncertainty of $\sigma_i/\sqrt{30}$. The second group are those where data are available for at least 15 years in 1961–90 (enough data to estimate a normal); these normals are given an uncertainty of $\sigma_i/\sqrt{N}$ where $N$ is the number of years for which there are data. The third group are those where too few data are available in 1961–90 to estimate a normal. For some of these stations WMO normals have been used [WMO, 1996] and experience has shown that these normals are likely to have problems [Jones & Moberg, 2003]. The process of data improvement discussed in section 2.1.2 also allowed the generation of new normals for 617 such stations. Comparison of the old and new normals for these stations suggested that the uncertainty in the WMO normals was about $0.3\sigma_i$.
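The three cases can be collected into one small function. This is a sketch under stated assumptions: the function and argument names are invented here, and normals inferred from surrounding stations are not covered because the text gives no uncertainty value for them.

```python
import math

def normal_uncertainty(sigma_i, n_years=None, wmo_normal=False):
    """1-sigma uncertainty (deg C) of a station normal for one calendar month.

    sigma_i    : station standard deviation for that calendar month
    n_years    : years with data in 1961-90 (ignored when wmo_normal is True)
    wmo_normal : True when a WMO normal is used instead of station data
    """
    if wmo_normal:
        # Third group: WMO normals, about 0.3 * sigma_i according to the text.
        return 0.3 * sigma_i
    if n_years is None or not 15 <= n_years <= 30:
        raise ValueError("a station-derived normal needs 15 to 30 years of data")
    # First group (n_years == 30) and second group (15 <= n_years < 30).
    return sigma_i / math.sqrt(n_years)
```

For comparison, $1/\sqrt{30} \approx 0.18$ and $1/\sqrt{15} \approx 0.26$, so under these figures a WMO normal is somewhat more uncertain than even the shortest station-derived one.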

Calculation and reporting error ($\epsilon_{RC}$) The station data used in this analysis may have been extensively processed before being added to the CRU archive. The monthly mean temperature values will have been calculated by averaging 60 or more sub-daily measurements. Where this calculation is done manually it can introduce an error. The transmission of the station data to the CRU archive requires at least one cycle of encoding, transcribing and decoding the data, and this process may also introduce an error.

Where such errors are persistent they will introduce an inhomogeneity into the data for a station, and so are included in the homogenisation adjustment error $\epsilon_H$. So the calculation and reporting error ($\epsilon_{RC}$) comprises only the random and uncorrelated cases.

Calculation and reporting errors can be large (changing the sign of a number and scaling it by a factor of 10 are both typical transcription errors; as are reporting errors of 10°C (e.g. putting 29.1 for 19.1)) but almost all such errors will be found during quality control of the data. Those errors that remain after quality control will be small, and because they are also uncorrelated both in time and in space their effect on any large scale average will be negligible. For these reasons $\epsilon_{RC}$ is not considered further.

Combining station error components For each station, the observational, homogeneity adjustment, and normal uncertainties are independent; so estimates of each can be combined in quadrature to give an estimate of the total uncertainty for each station.

The grid-box anomaly is the mean of the $n$ station anomalies in that grid box, so the grid-box station uncertainty is the root mean square of the station errors, multiplied by $1/\sqrt{n}$. The spatial patterns visible in the station error field (figure 5) are dominated by the distribution of the mean station standard deviation. This is larger in the high-latitudes and in the winter, and smaller in the tropics and in the summer; so for the month shown (January) the station error is largest for the northern high latitudes. A secondary effect is a reduction in areas with a large number of observations. In North America, Europe, and south-eastern Australia, observations are plentiful and so the station error is reduced.
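The two combination steps described above (quadrature per station, then root mean square over $1/\sqrt{n}$ per grid box) can be sketched as follows. The function names are illustrative, and the default homogenisation value is the 0.4°C quoted earlier in the text.

```python
import math

HOMOGENISATION_SD = 0.4  # deg C, the value given in the text

def station_uncertainty(measurement, normal, homogenisation=HOMOGENISATION_SD):
    """Combine independent station error components in quadrature (deg C)."""
    return math.sqrt(measurement**2 + homogenisation**2 + normal**2)

def grid_box_station_uncertainty(station_uncertainties):
    """Root mean square of the station errors, multiplied by 1/sqrt(n)."""
    n = len(station_uncertainties)
    if n == 0:
        raise ValueError("grid box contains no stations")
    rms = math.sqrt(sum(u**2 for u in station_uncertainties) / n)
    return rms / math.sqrt(n)
```

With four stations each carrying a 0.5°C uncertainty the grid-box value is 0.25°C, which shows the reduction in well-observed areas noted in the text.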

2.3.2 Sampling error

Even if the station temperature anomalies had no error, the mean of the station anomalies in a grid box would not necessarily be equal to the true spatial average temperature anomaly for that grid box. This difference is the sampling error; and it will depend on the number of stations in the grid box, on the positions of those stations, and on the actual variability of the climate in the grid box. A method for calculating sampling error is described in [Jones et al., 1997], who recommend the equation

$$SE^2 = \frac{\sigma_i^2 \, r(1 - r)}{1 + (n - 1)r} \qquad (3)$$

where $\sigma_i$ is the mean station standard deviation, $n$ is the number of stations, and $r$ is the average inter-site correlation (itself estimated from the data according to the methods of [Jones et al., 1997]). The method of [Jones et al., 1997] has been used in this analysis.
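Equation (3) translates directly into code. The sketch below takes $\sigma_i$ as the mean station standard deviation (so its square appears in the numerator) and treats $r$ as already known; estimating $r$ from the data, as [Jones et al., 1997] do, is outside its scope. The function name is an assumption made here.

```python
import math

def sampling_error(sigma_i, r, n):
    """Grid-box sampling error SE (deg C) from equation (3).

    sigma_i : mean station standard deviation in the box (deg C)
    r       : average inter-site correlation, between 0 and 1
    n       : number of stations in the box (at least 1)
    """
    if n < 1:
        raise ValueError("need at least one station")
    if not 0.0 <= r <= 1.0:
        raise ValueError("r must lie between 0 and 1")
    se2 = sigma_i**2 * r * (1.0 - r) / (1.0 + (n - 1) * r)
    return math.sqrt(se2)
```

The formula shrinks as stations are added and vanishes when $r = 1$, where a single station already represents the whole box.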

The spatial distribution of sampling error (see figure 6), like the station error, is dominated by the station standard deviations and the number of observations. The distribution is very similar to that for the station error.

2.3.3 Bias error

Bias correction uncertainties are estimated following [Folland et al., 2001] who considered two biases in the land data: urbanisation effects [Jones et al., 1990] and thermometer exposure changes [Parker, 1994].

Urbanisation effects The previous analysis of urbanisation effects in the HadCRUT dataset [Folland et al., 2001] recommended a 1σ uncertainty which increased from 0 in 1900 to 0.05°C in 1990 (linearly extrapolated after 1990) [Jones et al., 1990]. Since then, research has been published suggesting both that the urbanisation effect is too small to detect [Parker, 2004, Peterson, 2004], and that the effect is as large as ≈ 0.3°C/century [Kalnay & Cai, 2003, Zhou et al., 2004].

Figure 5: CRUTEM3 station errors (°C) for January 1969

Figure 6: CRUTEM3 sampling errors (°C) for January 1969

The studies finding a large urbanisation effect [Kalnay & Cai, 2003, Zhou et al., 2004] are based on comparison of observations with reanalyses, and assume that any difference is entirely due to biases in the observations. A comparison of HadCRUT data with the ERA-40 reanalysis [Simmons et al., 2004] demonstrated that there were sizable biases in the reanalysis, so this assumption cannot be made, and the most reliable way to investigate possible urbanisation biases is to compare rural and urban station series.

A recent study of rural/urban station comparisons [Peterson & Owen, 2005] supported the previously used recommendation [Jones et al., 1990], and also demonstrated that assessments of urbanisation were very dependent on the choice of meta-data used to make the rural/urban classification. To make an urbanisation assessment for all the stations used in the HadCRUT dataset would require suitable meta-data for each station for the whole period since 1850. No such complete meta-data are available, so in this analysis the same value for urbanisation uncertainty is used as in the previous analysis [Folland et al., 2001]; that is, a 1σ value of 0.0055°C/decade, starting in 1900. Recent research suggests that this value is reasonable, or possibly a little conservative [Parker, 2004, Peterson, 2004, Peterson & Owen, 2005]. The same value is used over the whole land surface, and it is one-sided: recent temperatures may be too high due to urbanisation, but they will not be too low.
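The urbanisation uncertainty is a simple linear ramp, which a short function makes concrete. The function name and the choice to return zero for years up to 1900 are assumptions of this sketch; the rate and start year are those quoted above.

```python
def urbanisation_uncertainty(year, rate_per_decade=0.0055, start_year=1900):
    """One-sided 1-sigma urbanisation uncertainty (deg C) for a given year.

    The value applies only in the warm direction: recent temperatures may be
    too high by this amount, but not too low.
    """
    if year <= start_year:
        return 0.0
    return rate_per_decade * (year - start_year) / 10.0
```

Evaluated at 1990 this gives 0.0495°C, consistent with the "0 in 1900 to 0.05°C in 1990" description of the earlier recommendation.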

Thermometer exposure changes Over the period since 1850 there have been changes in the design and siting of thermometer enclosures; many early shelters can differ substantially from the modern Stevenson-type screen. It is sometimes possible to determine the time of change by the homogeneity assessments discussed in section 2.3.1, but this is only possible if changes at neighbouring stations are implemented at different times. The bias errors in this section, therefore, allow for the possible simultaneous replacement across entire countries with Stevenson-type shelters. The possible effect of such changes was investigated in [Parker, 1994], who concluded that there was a possible difference between 1900 and the present day of about 0.2°C because of such exposure changes. This was later expanded into an error model in [Folland et al., 2001]: in the tropics (20°S–20°N) the 1σ uncertainty range is 0.2°C before 1930, and then decreases linearly to zero in 1950. Outside the tropics the 1σ uncertainty range is 0.1°C before 1900 and then decreases linearly to zero by 1930. This uncertainty model is used here.
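The exposure error model is piecewise linear in time and depends only on whether a location is in the tropics. A minimal sketch, assuming the 20° boundaries are inclusive and using a function name invented here:

```python
def exposure_uncertainty(year, lat):
    """1-sigma thermometer-exposure uncertainty (deg C) for a year and latitude."""
    if abs(lat) <= 20.0:
        # Tropics (20S-20N): 0.2 deg C before 1930, zero from 1950.
        level, start, end = 0.2, 1930, 1950
    else:
        # Outside the tropics: 0.1 deg C before 1900, zero from 1930.
        level, start, end = 0.1, 1900, 1930
    if year <= start:
        return level
    if year >= end:
        return 0.0
    # Linear decrease from `level` at `start` to zero at `end`.
    return level * (end - year) / (end - start)
```

For example, a tropical grid box in 1940 sits halfway down the ramp at 0.1°C, while an extratropical one has already reached zero.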

It is likely that further changes in thermometer exposure have been taking place in recent years, as Stevenson-type screens are replaced with aspirated shelters. These changes are, however, too recent to allow a quantitative assessment of their effects and they are not included in the CRUTEM3 error analysis.

2.3.4 Combining the uncertainties

The total uncertainty value for any grid box can be obtained by adding the station error, sampling error, and bias error estimates for that grid box in quadrature. This gives the total uncertainty for each grid box for each month.
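Written out, with $\sigma_{station}$, $\sigma_{sampling}$ and $\sigma_{bias}$ as shorthand introduced here (not notation from the source) for the three grid-box estimates, the quadrature sum is

$$\sigma_{total} = \sqrt{\sigma_{station}^2 + \sigma_{sampling}^2 + \sigma_{bias}^2}$$

As a purely illustrative case with made-up values, components of 0.3°C, 0.4°C and 0.1°C would combine to $\sqrt{0.09 + 0.16 + 0.01} = \sqrt{0.26} \approx 0.51$°C, so the largest component dominates the total.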

In practice, however, this combined uncertainty is less useful than the individual components. Most uses of the data set require not just an individual monthly grid-box value but some spatial or temporal average of many of them. When combining uncertainties onto these larger scales it is necessary to allow for correlations between the grid-box uncertainties, and the three error components have different spatial and temporal correlation structures.

The sampling errors have little spatial or temporal correlation. The station errors have little spatial correlation, but because the two main components (homogeneity adjustment and normal uncertainties) stay the same for each station for many consecutive months they have almost complete temporal autocorrelation. The bias errors are the same for each grid box and each month: they have complete temporal and spatial correlations.

The errors shown in figures 5 and 6 are for 5° × 5° grid boxes. Changing the gridding resolution will change the uncertainties. Larger grid-boxes will have a larger sampling error if they contain the same number of observations, but typically increasing the grid-box size will mean that each contains more stations and the box-average uncertainties will be reduced. Similarly, reducing the grid-box size would reduce the sampling error, except that smaller grid boxes will often contain fewer stations, which will increase the errors.

The combined effect of grid-box sampling errors will be small for any continental-scale or hemispheric-scale average (though the lack of global coverage introduces an additional source of sampling error; this is discussed in section 6.1). Combined station errors will be small for large-scale spatial averages, but remain important for averages over long periods of the same small grid box. Bias errors are equally large on any space or time scale.
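The different behaviour of the three components under averaging follows from the two limiting cases of error correlation. The sketch below is a simplification made for illustration: it uses an unweighted mean (real spatial averages would be area-weighted) and handles only the uncorrelated and fully correlated extremes. The function name is hypothetical.

```python
import math

def mean_uncertainty(uncertainties, correlated):
    """1-sigma uncertainty of the unweighted mean of several values.

    correlated=False : independent errors, so variances add
                       (sampling errors across boxes and months).
    correlated=True  : fully correlated errors, so they add linearly
                       (bias errors; station errors through time).
    """
    n = len(uncertainties)
    if n == 0:
        raise ValueError("need at least one value")
    if correlated:
        return sum(uncertainties) / n
    return math.sqrt(sum(u**2 for u in uncertainties)) / n
```

Averaging 100 values that each carry a 0.4°C uncertainty gives 0.04°C when the errors are independent but stays at 0.4°C when they are fully correlated, which is why bias errors do not shrink on any scale.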

3 Marine data

The marine data used are from the sea-surface temperature dataset HadSST2 [Rayner et al., 2006]. This is a gridded dataset made from in-situ ship and buoy observations from the new International Comprehensive Ocean-Atmosphere data set [Diaz et al., 2002, Manabe, 2003, Woodruff et al., 2003]. This dataset provides the same information for the oceans as described above for the land. For each grid box: mean temperature anomalies, measurement and sampling error estimates, and bias error estimates are available. The datasets can be produced on a grid of any desired resolution.

Previous versions of HadCRUT used the SST dataset MOHSST6 [Parker et al., 1995]. The new HadSST2 dataset is an improvement on MOHSST6 for many reasons: it is based on an enlarged and improved set of ship and buoy observations, it includes a new climatology, and the bias corrections needed for data before 1941 have been revisited. Also, HadSST2 starts in January 1850 (as does HadCRUT3), whereas MOHSST6 and HadCRUT2 started in January 1856. Full details of all the improvements can be found in [Rayner et al., 2006].

Blending a sea-surface temperature (SST) dataset with land air temperature makes an implicit assumption that SST anomalies are a good surrogate for marine air temperature anomalies. It has been shown, for example by [Parker et al., 1994], that this is the case, and that marine SST measurements provide more useful data and smaller sampling errors than marine air temperature measurements would. So blending SST anomalies with land air temperature anomalies is a sensible choice.

3.1 Uncertainties in the marine data

Like the land data, the marine dataset has known errors: estimates have been made of the measurement and sampling error, and the uncertainty in the bias corrections. The marine data are point measurements from moving ships, moored buoys, and drifting buoys, so the anomalies for any one grid box come in general from a different set of sources each month. This means that marine data have no equivalent of station errors or homogenisation adjustments. The marine equivalent of the station errors forms part of the measurement and sampling error, and adjustments for inhomogeneities are done by large scale bias corrections.

The measurement and sampling error estimates are based, like the land sampling error (section 2.3.2), on the number of observations in a grid box, on the variability of a single observation, and on the correlation between observations. The latter two parameters are estimated from the gridded data for each grid box. Details are given in [Rayner et al., 2006].

Only one bias correction is applied: over the period 1850–1940, the predominant SST measurement process changed from taking samples in wooden buckets, to taking samples in canvas buckets, to using engine room cooling water inlet temperatures [Folland and Parker, 1995]. A bias correction is applied to remove the effect of these changes on the SSTs. This correction depends on estimates of the mix of measuring methods in use at any one time, and of parameters such as the speed of the ships making the measurements. An uncertainty has been estimated for the correction; again, details are in [Rayner et al., 2006].

As with the land data, the uncertainty estimates cannot be definitive: where there are known sources of uncertainty, estimates of the size of those uncertainties have been made. There may be additional sources of uncertainty as yet unquantified (see section 6.3).

4 Blending land and marine data

To make a dataset with global coverage the land and marine data must be combined. For land-only grid boxes the land value is taken, and for sea-only grid boxes the marine value; but for coastal and island grid boxes the land and marine data must be blended into a combined average.

Previous versions of HadCRUT [Jones, 1994, Jones & Moberg, 2003] blended land and sea data in coastal and island grid boxes by weighting the land and sea values by the area fraction of land and sea respectively, with a constraint that the land fraction cannot be greater than 75% or less than 25%, to prevent either data source being swamped by the other. The aim of weighting by area was to place more weight on the more reliable data source where possible. The constraints are necessary because there are some grid boxes which are almost all sea but contain one reliable land station on a small island; and some grid boxes which are almost all land but also include a small sea area which has many marine observations. Unconstrained weighting by area would essentially discard one of the measurements, which is undesirable.
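The constrained area weighting used in the earlier versions can be sketched as follows. This is an illustration of the rule as described in the text, not the original implementation; the function and parameter names are assumptions.

```python
def blend_by_area(t_land, t_sea, land_fraction, lo=0.25, hi=0.75):
    """Blend land and sea anomalies by land area fraction.

    The land weight is the land fraction of the grid box, limited to
    the range [lo, hi] so that neither source is swamped by the other.
    """
    w_land = min(max(land_fraction, lo), hi)
    return w_land * t_land + (1.0 - w_land) * t_sea

# A grid box that is 2% land (a small island): the land station still
# receives a weight of 0.25 rather than 0.02.
island_box = blend_by_area(t_land=1.0, t_sea=0.0, land_fraction=0.02)
```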

The new developments described in this paper provide measurement and sampling uncertainty estimates for each grid box in both the land and marine data sets. This means that the land and marine data can be blended in the way that minimises the uncertainty of the blended mean: the two values are weighted according to their uncertainties, so that the more reliable value has a higher weighting than the less reliable.

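$$T_{\mathrm{blended}} = \frac{\varepsilon_{\mathrm{sea}}^{2}\, T_{\mathrm{land}} + \varepsilon_{\mathrm{land}}^{2}\, T_{\mathrm{sea}}}{\varepsilon_{\mathrm{land}}^{2} + \varepsilon_{\mathrm{sea}}^{2}} \qquad (4)$$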

where $T_{\mathrm{blended}}$ is the blended average temperature anomaly, $T_{\mathrm{land}}$ and $T_{\mathrm{sea}}$ are the land and marine anomalies, $\varepsilon_{\mathrm{land}}$ is the measurement and sampling error of the land data, and $\varepsilon_{\mathrm{sea}}$ is the measurement and sampling error of the marine data.
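A minimal sketch of the uncertainty-weighted blend of equation 4 for a single coastal grid box is given below. The function name and the example numbers are illustrative assumptions, not values from the dataset.

```python
def blend_by_uncertainty(t_land, t_sea, err_land, err_sea):
    """Blend land and sea anomalies as in equation 4.

    Returns the blended anomaly and the weight given to the land value.
    Each value is weighted by the squared error of the *other* source,
    so the more reliable value receives the higher weight.
    """
    var_land = err_land ** 2
    var_sea = err_sea ** 2
    w_land = var_sea / (var_land + var_sea)
    blended = w_land * t_land + (1.0 - w_land) * t_sea
    return blended, w_land

# Hypothetical coastal box: land error three times the marine error.
blended, w_land = blend_by_uncertainty(t_land=1.0, t_sea=0.0,
                                       err_land=0.6, err_sea=0.2)
```

With these example errors the land value receives a weight of about 0.1, so the blended anomaly lies close to the marine value.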

The resulting blended dataset for a sample month (figure 7) shows the coherency between the land and sea data: large scale regions of positive or negative temperature anomalies that cross land-sea boundaries show up clearly. The land data weighting for all coastal and island grid boxes with both land and sea data for the same month (figure 8) shows that weighting by uncertainties generally weights the marine data more highly where the marine data are expected to be good (North Atlantic and North Pacific coasts where there are many marine observations); and similarly weights the land data more highly where they are the more reliable (in the Southern Hemisphere, notably in Indonesia and the South Pacific where marine observations are sparse). Note that the weighting is continually varying with time as the data availability changes.

As the land and marine errors are independent, this choice of weighting gives the lowest measurement and sampling error for the blended mean, giving an error in the blended mean of:

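$$\varepsilon_{\mathrm{blended}} = \sqrt{\frac{\varepsilon_{\mathrm{sea}}^{2}\, \varepsilon_{\mathrm{land}}^{2}}{\varepsilon_{\mathrm{land}}^{2} + \varepsilon_{\mathrm{sea}}^{2}}}. \qquad (5)$$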
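The claim that this weighting gives the lowest error can be checked numerically. The sketch below evaluates equation 5 and compares it with a brute-force scan over land weights, assuming independent land and marine errors; the helper names and the example errors are illustrative assumptions.

```python
import math

def blended_error(err_land, err_sea):
    """Error of the uncertainty-weighted blend (equation 5)."""
    return math.sqrt(err_land ** 2 * err_sea ** 2
                     / (err_land ** 2 + err_sea ** 2))

def error_for_weight(w_land, err_land, err_sea):
    """Error of w*T_land + (1-w)*T_sea for independent errors."""
    return math.sqrt((w_land * err_land) ** 2
                     + ((1.0 - w_land) * err_sea) ** 2)

err_land, err_sea = 0.6, 0.2
optimal = blended_error(err_land, err_sea)
# Scan land weights 0.000, 0.001, ..., 1.000 and keep the smallest error.
scanned = min(error_for_weight(w / 1000.0, err_land, err_sea)
              for w in range(1001))
```

The scanned minimum agrees with equation 5, and the blended error is smaller than either input error.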

The measurement and sampling error for the blended mean (figure 9) is the combined station and sampling error over land (figures 5 and 6). Over the oceans the error distribution is dominated by variations in the number of observations: where marine observations are plentiful (North Atlantic, North Pacific and the shipping lanes) the measurement and sampling error is very small; in poorly observed areas like the Southern Ocean, the error is much larger. The errors for marine grid boxes are much smaller than those for land grid boxes because SST is less variable in both space and time than land air temperature. This difference is discussed in more detail in section 6.2. The smaller SST errors mean that the blended temperatures for coastal and island grid boxes are dominated by the SSTs. This is reasonable if it is assumed that, in any grid box, the land temperature and SST values for that box are each estimates of the same blended temperature. In reality this may not be true (see section 6.4) and an area-weighted average might in some cases give a more physically consistent average temperature. However, the choice of blending weight makes very little difference to large scale averages, so the extra complexity of a blending algorithm which accounts for possible land-sea temperature anomaly differences is not justified.

5 Variance adjustment

Assigning a grid-box anomaly simply as the mean of the observational anomalies in that grid box produces a good estimate of the actual temperature anomaly. But it has the disadvantage that the variance of the grid box average is not constant in time or space; grid boxes containing many observations will have a low variance, and those with few observations a larger one. For some applications this fluctuation in variance is undesirable. Heterogeneities in the variance affect estimates of the covariance matrices which are used in EOF techniques such as Optimal Averaging. They also affect analyses of extreme monthly temperatures and of changes in temperature variability through time.

For these reasons, previous versions of HadCRUT have included variance adjustments [Jones et al., 2001]: alternative versions of the gridded datasets with the grid-box anomalies adjusted to remove the effects of changing numbers of observations. In producing a variance adjusted version of HadCRUT3 two refinements have been made: the error estimates for the gridded data have been used to devise a simpler adjustment method applicable to both land and marine data, and the adjustment process has been tested on synthetic data to ensure that it does not introduce biases into the data. Details of the adjustment method and the tests applied are given in appendix A. Variance adjusted versions have been produced for HadCRUT3 and the marine and land datasets from which it is formed; the adjusted datasets are named HadCRUT3v, CRUTEM3v and HadSST2v. One advantage of the new adjustment method is that it can be applied to the entire dataset, so the variance adjusted datasets now also start in 1850. The previous version of the variance adjusted dataset, HadCRUT2v, started in 1870.

Figure 7: HadCRUT3 anomalies (°C) for January 1969.

Figure 8: Land data blending weight for January 1969. (Greater emphasis on the land would give numbers closer to one.)

Figure 9: HadCRUT3 measurement and sampling error (°C) for January 1969.

Variance adjustment is successful at the individual grid-box scale: comparison with synthetic data shows that the inflation of the grid box variance caused by the limited number of observations can be removed without introducing biases into the grid-box series. At larger space scales, however, variance adjustment does introduce a small bias into the data. Whether variance adjusted or unadjusted data should be used in an analysis depends on what is to be calculated. If it is necessary that grid-box anomalies have a spatially and temporally consistent variance, then variance adjusted data should be used. Otherwise, better results may be obtained using unadjusted data. In particular, global and regional time-series should be calculated using unadjusted data.

6 Analyses of the gridded dataset

From the 5° × 5° gridded dataset and its comprehensive set of uncertainty estimates it is possible to calculate a large variety of climatologically interesting summary statistics and their uncertainty ranges. Of this variety, global and regional temperature time series probably have the widest appeal, so some illustrative examples of these are presented here.


6.1 Hemispheric and global time-series

If the gridded data had complete coverage of the globe or the region to be averaged, then making a time series would be a simple process of averaging the gridded data and making allowances for the relative sizes of the grid boxes and the known uncertainties in the data. However, global coverage is not complete even in the years with the most observations, and it is very incomplete early in the record. In general, global and regional area-averages will have an additional source of uncertainty caused by missing data.
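A simple area average over the available grid boxes might look like the sketch below. The cosine-of-latitude weight is the usual approximation to the relative area of boxes on a regular latitude-longitude grid and is an assumption here, since the text does not specify the weighting; the function name and the use of `None` for missing boxes are also illustrative.

```python
import math

def area_weighted_mean(values, lats_deg):
    """Mean of grid-box anomalies weighted by cos(latitude).

    values:   anomalies, with None marking a grid box with no data
    lats_deg: latitude of each grid-box centre in degrees
    Returns None if every grid box is missing.
    """
    num = 0.0
    den = 0.0
    for value, lat in zip(values, lats_deg):
        if value is None:
            continue  # missing boxes contribute nothing to the average
        weight = math.cos(math.radians(lat))
        num += weight * value
        den += weight
    return num / den if den > 0.0 else None

# One tropical box, one missing mid-latitude box, one polar box.
mean = area_weighted_mean([0.0, None, 1.0], [2.5, 47.5, 87.5])
```

Because the polar box covers far less area than the tropical box, the result is much closer to the tropical value than a plain mean would be.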

To estimate the uncertainty of a large-scale average owing to missing data the effect of sub-sampling on a known, complete dataset is used. The NCEP/NCAR reanalysis dataset [Kalnay et al., 1996] provides complete monthly gridded surface air temperature values for more than 50 years. To estimate the missing data uncertainty of the HadCRUT3 mean for a particular month, the reanalysis data for that calendar month in each of the 50+ years is sub-sampled to have the same coverage as HadCRUT3, and the difference between the complete average and the sub-sampled average anomaly is calculated in each of the 50+ cases. The 2.5% and 97.5% values forming the error range of the HadCRUT3 mean for that month in the record are then estimated from the standard deviation of the 50+ differences, assuming that the differences are normally distributed. This procedure has the advantage that it works for any region, so hemispheric and regional time-series and their uncertainties can be calculated as easily as global series. Unlike sophisticated optimal methods such as that used by [Folland et al., 2001], this process makes no attempt to minimise coverage uncertainties by using estimates of data covariances. This means that the precision of large-scale averages is less than that which could be achieved with a more sophisticated method. But the simple method has the advantage that the estimated uncertainty on the large scale average due to limited coverage is independent of all the other sources of uncertainty. So it remains straightforward to calculate both the total uncertainty on any large-scale average and all of its components (figure 10).
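The sub-sampling procedure can be sketched with synthetic data standing in for the reanalysis fields. Everything named below is hypothetical: the six-box grid, the weights and the random fields are placeholders, and centring the range on zero at ±1.96 standard deviations is one reading of the normality assumption described above.

```python
import random
import statistics

def weighted_mean(field, weights, mask):
    """Area-weighted mean over the grid boxes where mask is True."""
    num = sum(w * v for v, w, m in zip(field, weights, mask) if m)
    den = sum(w for w, m in zip(weights, mask) if m)
    return num / den

def coverage_uncertainty(complete_fields, weights, mask):
    """95% range of (sub-sampled mean - complete mean).

    complete_fields: one complete field per year, same calendar month
    mask: True where the observational dataset has data that month
    """
    full = [True] * len(weights)
    diffs = [weighted_mean(f, weights, mask) - weighted_mean(f, weights, full)
             for f in complete_fields]
    sd = statistics.stdev(diffs)
    # Assumption: normal differences, range centred on zero.
    return -1.96 * sd, 1.96 * sd

# Synthetic stand-in for 50 years of complete fields on a 6-box grid.
rng = random.Random(0)
weights = [1.0, 1.0, 0.8, 0.8, 0.5, 0.5]
fields = [[rng.gauss(0.0, 1.0) for _ in weights] for _ in range(50)]
sparse = coverage_uncertainty(fields, weights,
                              [True, False, True, False, False, False])
dense = coverage_uncertainty(fields, weights,
                             [True, True, True, True, True, False])
```

In this toy setting the sparse coverage mask gives a wider range than the dense one, and complete coverage gives a range of zero width.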

This approach can also be used to give coverage uncertainties on longer timescales. Annual coverage uncertainties can be made by converting both the HadCRUT3 data and the reanalysis data to annual averages and then subsampling the annual reanalysis data with the coverage of the annual HadCRUT3 data. Similarly, estimates can be made of coverage uncertainties for smoothed annual or decadal averages.

The grid-box sampling and measurement errors are greatly reduced when the gridded data are averaged into large-scale means, so the only other important uncertainty component of global and regional time-series is that due to the biases in the data. This is dealt with by making datasets with allowances for bias uncertainties incorporated. Generating averages from datasets with bias allowances set at the 2.5% and 97.5% levels provides a 95% error range from bias uncertainties in the resulting averages.

6.1.1 Global averages

Figure 10: HadCRUT3 global temperature anomaly time-series (°C) at monthly (top), annual (centre), and smoothed annual (bottom) resolutions. The solid black line is the best estimate value, the red band gives the 95% uncertainty range caused by station, sampling and measurement errors; the green band adds the 95% error range due to limited coverage; and the blue band adds the 95% error range due to bias errors.

The global temperature is calculated as the mean of the northern and southern hemisphere series (to stop the better-sampled northern hemisphere from dominating the average). Figure 10 shows the global temperature anomaly time series calculated from HadCRUT3 with these error components. The monthly averages are dominated by short-term fluctuations in the anomalies; combining the data into annual averages produces a clearer picture, and smoothing the annual averages with a 21-term binomial filter highlights the low-frequency components and shows the importance of the bias uncertainties. The bias uncertainties are zero over the normal period by definition. The dominant bias uncertainties are those due to bucket correction [Rayner et al., 2006] and thermometer exposure changes [Parker, 1994], both of which are large before the 1940s.
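The two operations described here, averaging the hemispheric series and smoothing with a 21-term binomial filter, can be sketched as follows. The treatment of the series ends is not described in the text; this sketch simply returns the interior points where the full filter fits, which is an assumption, as are the function names.

```python
import math

def binomial_weights(n_terms=21):
    """Normalised binomial filter weights (row n_terms-1 of Pascal's triangle)."""
    coeffs = [math.comb(n_terms - 1, k) for k in range(n_terms)]
    total = sum(coeffs)
    return [c / total for c in coeffs]

def smooth(series, n_terms=21):
    """Apply the binomial filter, keeping only fully covered points."""
    weights = binomial_weights(n_terms)
    half = n_terms // 2
    return [sum(w * series[i - half + k] for k, w in enumerate(weights))
            for i in range(half, len(series) - half)]

def global_mean(nh_series, sh_series):
    """Global series as the mean of the two hemispheric series."""
    return [(nh + sh) / 2.0 for nh, sh in zip(nh_series, sh_series)]

# Hypothetical annual hemispheric anomalies for 30 years.
nh = [0.02 * year for year in range(30)]
sh = [0.01 * year for year in range(30)]
smoothed = smooth(global_mean(nh, sh))
```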

A notable feature of the global time series is that the uncertainties are not always larger for earlier periods than later periods. The uncertainties are smaller in the 1850s than in the 1920s, at least for the smoothed series, despite the much larger number of observations in the 1920s. The station, sampling and measurement, and coverage errors (red and green bands in figure 10) depend on the number and distribution of the observations, and these components of the error decrease steadily with time as the number of observations increases. These components also decrease with averaging to larger space and time scales, so they are smaller in the annual than the monthly series, and smaller again in the smoothed annual series. The bias uncertainties, however, do not reduce with spatial or temporal averaging, and they are largest in the early 20th century; so the smoothed annual series, where the uncertainty is dominated by the bias uncertainties, also has its largest uncertainty in this period.

The bias uncertainties are largest in the early 20th century for two reasons. First, the bias uncertainties in the marine data are largest then, because the uninsulated canvas buckets used in that period produced larger temperature biases than the wooden buckets used earlier (see [Rayner et al., 2006] for details). Second, the land temperature bias uncertainties (present before 1950) are larger in the tropics than the extra-tropics, so for these simple global averages the bias uncertainty depends on the ratio of station coverage in the tropics to that in the extra-tropics, and this ratio is smaller in the 1850s than in the 1920s.

6.1.2 Hemispheric averages

Comparing the smoothed mean temperature time-series for the Northern Hemisphere and Southern Hemisphere (figure 11) shows the difference in uncertainties between the two hemispheres. The difference in the uncertainty ranges for the two series stems from the very different land/sea ratio of the two hemispheres. The Northern Hemisphere has more land, and so a larger station, sampling and measurement error (figure 9 and section 6.2), but it has more observations and so a smaller coverage uncertainty. The bias uncertainties are also larger in the Northern Hemisphere both because it has more land (especially in the tropics where the land biases are large), and because the SST bias uncertainties are largest in the Northern Hemisphere western boundary current regions where the SST can be very different from the air temperature [Rayner et al., 2006].

The difference between the two hemisphere series has a smaller uncertainty than either hemispheric value over much of the period shown, because the bias errors, though unknown, will be much the same in the two hemispheres and so mostly cancel in the difference. So the previously observed increase in the inter-hemispheric difference in the mid 20th century (see, for example, [Folland et al., 1986, Kerr, 2005]) is shown to be significantly outside the uncertainties.

6.2 Differences between land and marine data

Comparison of global average time series for land-only and marine-only data (figure 12) demonstrates both a marked agreement in the temperature trends, and a large difference in the uncertainties. There are much larger uncertainties in the land data because the surface air temperature over land is much more variable than the SST. SSTs change slowly and are highly correlated in space; but the land air temperature at a given station has a lower correlation with regional and global temperatures than a point SST measurement, because land air temperature (LAT) anomalies can change rapidly in both time and space. This means that one SST measurement is more informative about large scale temperature averages than one LAT measurement. This difference also shows in the hemispheric differences (figure 11): the Southern Hemisphere (SH) series has a similar uncertainty to the Northern Hemisphere (NH) series despite there being many more observations in the NH. This is because a larger fraction of the SH is sea, so fewer observations are needed.

Figure 11: HadCRUT3 hemisphere temperature anomaly time-series (°C); Northern (top), Southern (middle) and difference (NH-SH, bottom). The solid black line is the best estimate value, the red band gives the 95% uncertainty range caused by station, sampling and measurement errors; the green band adds the 95% error range due to limited coverage; and the blue band adds the 95% error range due to bias errors.

Figure 12: Global average of land and marine components of HadCRUT3 (°C); Land (top), Sea (middle) and difference (Land-Sea, bottom). The solid black line is the best estimate value, the red band gives the 95% uncertainty range caused by station, sampling and measurement errors; the green band adds the 95% error range due to limited coverage; and the blue band adds the 95% error range due to bias errors.

The difference between the land and sea temperatures (figure 12, bottom) is not distinguishable from zero until about 1980. There are several possible causes for the recent increase. It could be a real effect, with the land warming faster than the ocean; this is an expected response to increasing greenhouse gas concentrations in the atmosphere [Barnett et al., 2000], but it could also indicate a change in the atmospheric circulation [Parker et al., 1994]. It could indicate an uncorrected bias in one or both data sources (see section 6 of [Rayner et al., 2006]). Or it could be a combination of these effects. These issues have not been pursued further here, but such studies will form part of future work on land and marine temperatures and their uncertainties.

6.3 Comparison of global time series with previous versions

Figure 13 shows time-series of the global average of the land data, the marine data, and the blended dataset with their uncertainty ranges, and compares them to the previous versions of each dataset. The additions and improvements made to the land data do not make any large differences to the global land average, except very early in the record where the uncertainties are large. The new marine data, however, do produce some sizable changes: refinements to the climatology have produced an offset, and new data have produced some other secular changes in the series.

The differences between the old and new marine data series are sometimes outside the error range of the new series. Most of the difference is a constant offset due to changes to the climatology, and uncertainties in the climatology are not part of the error model for the marine data. (In the land data climatologies are estimated for each station, and as the mix of stations in any one grid box changes with time so does the climatology. So uncertainties in the station climatology are a component of the uncertainty in changes of gridded land temperature anomalies. But for the marine data, climatologies are specified for each grid box, and they are constant in time, so uncertainties in the marine climatology do not contribute directly to uncertainties in changes in marine temperature anomalies.) But even after removing the constant offset produced by the climatology change, there are still differences between the old and new SST series that are larger than the assessed random and sampling errors. These differences suggest the presence of additional error components in the marine data. At the moment, the nature of these error components is not known for certain, but the main difference between the old and new datasets is the use of different sets of observations [Rayner et al., 2006]. It seems likely that different groups of observations may be measuring SST in different ways even in recent decades, and therefore there may be unresolved bias uncertainties in the modern data. Quantifying such effects will be a priority in future work on marine data.

Figure 13: New dataset versions and their 95% uncertainty ranges (in blue), compared with the previous version of each dataset (in red). The top panel shows the land data (CRUTEM3 and CRUTEM2), the middle panel the marine data (HadSST2 and MOHSST6), and the bottom panel the combined data (HadCRUT3 and HadCRUT2).

6.4 Comparison with Central England Temperature

The Central England Temperature series (CET) is the longest instrumental temperature record in the world [Parker et al., 1992]. It records the temperature of a triangular portion of England bounded by London, Herefordshire and Lancashire, and provides mean daily temperature estimates back to 1772. The HadCRUT3 and CET series do use some of the same stations, but of the 13 sites that make some contribution to CRUTEM3 in the CET region, no more than 2 also contribute to CET, and there are always also stations contributing to CET but not to CRUTEM3. So if CET corresponds closely to the HadCRUT3 value for the central England grid box, it suggests that both series are correctly describing the local temperature changes, and is not simply a consequence of shared inputs. Recently, uncertainty estimates have been derived for CET since 1878 [Parker & Horton, 2005].

The area covered by CET is less than 1 grid box in the 5° × 5° gridded CRUTEM3 dataset. Comparing the CET data with the corresponding grid box in CRUTEM3 (figure 14) shows encouraging agreement: despite being based on largely different observations, the two series agree within their uncertainties.

Doing the same comparison using the full HadCRUT3 data (blended land and sea) gives a different picture (figure 15). The 5° × 5° grid box covering the CET region also contains much of the Irish Sea and the English Channel, both regions where there are many SST observations. Many SST observations mean that the uncertainty on the SST monthly means is small, so the blended value is biased towards SST and has a small uncertainty. Adding the SST data has reduced the agreement with CET; and the uncertainty in the HadCRUT3 value is much smaller than the CRUTEM3 uncertainty because there are a lot of SST observations around the British coast. The uncertainty varies in time because, unlike the land data, the number of SST observations changes with time: the increases in uncertainty in the early part of the series and during the two world wars are quite noticeable. This figure demonstrates that the land and sea temperature anomalies in one 5° × 5° grid box can have sizable differences in their annual values, although the longer-term changes are very similar.

Because of these land-sea differences it will sometimes be better to use the land-specific or sea-specific data rather than the blended dataset. For example, when looking at paleo data from tree-rings near coasts it is probably better to use the land dataset CRUTEM3 than the blended dataset HadCRUT3. Similarly, for paleo data from coastal corals the SST dataset should be used.

7 Conclusions

A new version of the gridded historical surface temperature dataset HadCRUT3 has been produced. This dataset is a collaborative product of scientists at the Met Office Hadley Centre (who provide the marine data), and at the Climatic Research Unit at the University of East Anglia (who provide the land-surface data). The new dataset benefits from the improvements to the marine data described in [Rayner et al., 2006] as well as the improvements to the land data described in this paper. But the principal advance over previous versions of the dataset [Jones et al., 2001, Jones & Moberg, 2003] is in the provision of a comprehensive set of uncertainties to accompany the gridded temperature anomalies.

As well as variance adjustments (adjustments to the data to allow for the changing numbers of observations), fields of measurement and sampling uncertainty, and of bias uncertainty, have been produced. All the gridded datasets, and some time-series derived from them, are available from the websites http://www.hadobs.org and http://www.cru.uea.ac.uk.

Figure 14: CRUTEM3 (for 50–55°N, 0–5°W) comparison with CET (error ranges are 95%). Series shown: station, sampling and bias uncertainty; station and sampling uncertainty; best estimate; CET.

Figure 15: HadCRUT3 (for 50–55°N, 0–5°W) comparison with CET (error ranges are 95%). Series shown: station, sampling and bias uncertainty; station and sampling uncertainty; best estimate; CET.

The gridded datasets start in 1850 because there are too few observations available from before this date to make a useful gridded field. Many marine observations from the first half of the 19th century are known to exist in log books kept in the British Museum and the U.K. National Archive, but these observations have never been digitised. If these observations were available, it is likely that the gridded datasets, and so information on surface climate change and variability, could be extended by several decades.

Acknowledgements

This work was funded by the Public Met. Service Research and Development Contract; and by the Department for Environment, Food and Rural Affairs, under Contract PECD/7/12/37. The development of the basic land station dataset has been supported over the last 27 years by the Office of Science (BER), U.S. Department of Energy; most recently by grant DE-FG02-98ER62601. The authors are grateful to Tom Peterson, of the U.S. National Climatic Data Centre, for many valuable suggestions.

A Variance adjustment method

The relationship between the variance in a grid box and the variance of individual station observations is given by [Jones et al., 1997]

$$\sigma_n^2 = \frac{\sigma_i^2\,(1 + (n-1)\,r)}{n}, \qquad (6)$$

where $\sigma_n^2$ is the variance of the grid-box average, $\sigma_i^2$ is the mean variance of the individual station time series that contribute to that grid-box average and $r$ is the average correlation of stations within the grid box. Two interesting variables can be derived from this. The first is the true grid-box variance, $\sigma_{n=\infty}^2$. That is the variance the grid-box average would have if it contained an infinite number of observations

$$\sigma_{n=\infty}^2 = \sigma_i^2\, r. \qquad (7)$$

The second is the sampling error, $m_s^2$, equal to the difference between equations 6 and 7

$$m_s^2 = \frac{\sigma_i^2\,(1 - r)}{n}. \qquad (8)$$
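As a quick numerical check of equations 6 to 8, the following Python sketch evaluates the three quantities and shows that the grid-box variance splits into the true variance plus the sampling error. The function names and the input values are illustrative assumptions and are not taken from the dataset.

```python
def gridbox_variance(sigma_i2: float, r: float, n: int) -> float:
    """Equation 6: variance of a grid-box average built from n stations."""
    return sigma_i2 * (1.0 + (n - 1) * r) / n


def true_variance(sigma_i2: float, r: float) -> float:
    """Equation 7: the limit of equation 6 as n tends to infinity."""
    return sigma_i2 * r


def sampling_error(sigma_i2: float, r: float, n: int) -> float:
    """Equation 8: sampling error variance (equation 6 minus equation 7)."""
    return sigma_i2 * (1.0 - r) / n


# Illustrative values only (assumed, not from the paper).
sigma_i2, r = 1.2, 0.8
for n in (1, 5, 50):
    total = gridbox_variance(sigma_i2, r, n)
    split = true_variance(sigma_i2, r) + sampling_error(sigma_i2, r, n)
    print(n, round(total, 4), round(split, 4))
```

With a single station the grid-box variance equals the station variance, and as n grows the sampling error shrinks towards zero, leaving only the true variance.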

Equation 6 assumes that the time-series of the grid-box anomaly is stationary. In fact, the average temperature in an area defined by a grid box exhibits natural variability on a variety of time scales: a long-term trend (perhaps due to global warming), inter-decadal variability (perhaps due to modes like ENSO) and higher-frequency natural variability. To ensure that the series is stationary, the anomalies in individual grid boxes were detrended using a six-year running average centred on the month of interest.
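The detrending step might be sketched as follows. The text specifies only a six-year running average centred on the month of interest; the 73-month window (36 months either side), the truncation of the window at the ends of the series, and the function names are assumptions made for illustration, and missing months are not handled.

```python
def running_mean(series, half_width=36):
    """Centred running mean; the window is truncated at the series ends."""
    smoothed = []
    for i in range(len(series)):
        lo = max(0, i - half_width)
        hi = min(len(series), i + half_width + 1)
        window = series[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


def detrend(series, half_width=36):
    """Return (smoothed, residual) with smoothed[i] + residual[i] == series[i]."""
    smoothed = running_mean(series, half_width)
    residual = [x - s for x, s in zip(series, smoothed)]
    return smoothed, residual


# A purely linear monthly trend (assumed toy input): away from the ends,
# the centred mean reproduces the trend, so the residual is close to zero.
monthly = [0.01 * i for i in range(200)]
smoothed, residual = detrend(monthly)
print(abs(residual[100]) < 1e-9)
```

Keeping the smoothed series alongside the residual matters because the smoothed part is added back once the residual has been adjusted.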

The detrended anomalies were then multiplied by an adjustment factor,

$$k = \sqrt{\frac{\sigma_{n=\infty}^2}{m_{\mathrm{total}}^2(n, t) + \sigma_{n=\infty}^2}} \qquad (9)$$

where $m_{\mathrm{total}}^2$ is the estimated random error (a combination of sampling, measurement and other errors) expressed as a function of the number of observations, $n$, and time, $t$. For marine data $m_{\mathrm{total}}^2$ and $\sigma_{n=\infty}^2$ were calculated as in [Rayner et al., 2006]. For land data, values for $m_{\mathrm{total}}^2$ were calculated as in section 2.3 and the values of $\sigma_{n=\infty}^2$ were calculated from equation 7 using the individual station variances and the average correlations between them. After the adjustment factor was applied, the smoothed series was added back to recover the variance-adjusted time series.
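The adjustment itself can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the smoothed and detrended series are taken as given inputs, the true variance is treated as constant in time for one grid box, and the function and argument names are hypothetical.

```python
import math


def adjustment_factor(sigma_inf2: float, m_total2: float) -> float:
    """Equation 9: k = sqrt(sigma_inf2 / (m_total2 + sigma_inf2))."""
    return math.sqrt(sigma_inf2 / (m_total2 + sigma_inf2))


def variance_adjust(smoothed, residual, sigma_inf2, m_total2_by_month):
    """Scale each detrended anomaly by its own k, then restore the smoothed part."""
    adjusted = []
    for s, d, m2 in zip(smoothed, residual, m_total2_by_month):
        adjusted.append(s + adjustment_factor(sigma_inf2, m2) * d)
    return adjusted


# Toy numbers (assumed): the second month has a large random error,
# so its detrended anomaly is shrunk more strongly than the first.
print(variance_adjust([1.0, 1.0], [2.0, -2.0], 1.0, [0.0, 3.0]))
```

Because k depends on the random error for each month, months with few observations are pulled more strongly towards the smoothed series, while well-observed months (small random error, k close to 1) are left almost unchanged.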

A.1 Test of the Method

If it is working well, variance adjustment should reduce the random noise in the temperature values introduced by having only a limited number of observations, but leave the real underlying temperature variations unchanged. This can't be tested using the actual HadCRUT3 data, as the distinction between real variations and noise is unknown. To test the method, the pseudo-proxy method of [von Storch et al., 2004] has been adapted to instrumental data. A synthetic version of HadCRUT3 has been made by adding noise to subsampled GCM temperature data; the test is then to see how well variance adjustment recovers the original GCM data from the synthetic HadCRUT3.

A.1.1 Making the synthetic dataset

A synthetic dataset was constructed using an all-forcings run [Tett et al., 2006] of the HadCM3 [Gordon et al., 2000] GCM. Values of $r$ for land data grid boxes were calculated for the detrended model data following the procedure in [Jones et al., 1997]. In marine grid boxes, values of $r$ were calculated in both time and space to take into account the fact that marine observations are point measurements rather than monthly averages as in the land data. The time component was calculated by fitting an exponential to the lagged correlations of monthly anomalies in a given grid box and using the fitted correlation decay time to estimate the average correlation across the grid box. These were used to calculate estimated station variances by assuming that the variance of the model temperature anomalies in a grid box represented the variance in that grid box for an infinite number of stations, $\sigma_{n=\infty}^2$. In this instance the value of $\sigma_i^2$ can be easily extracted from equation 7. These average station variances were then used to create a synthetic time series for each grid box that showed variance fluctuations of a kind seen in the observational data. The variance of the time series was inflated by adding random noise of variance, $v^2$, calculated using

$$v^2 = \frac{\sigma_i^2\,(1 - r) + m_m^2}{n} \qquad (10)$$

where the $n$ were a realistic distribution of numbers of observations as obtained from the historical records of monthly average temperatures. $m_m^2$ was an estimate of the measurement error, which was assumed to be negligible over land. Three realisations of the synthetic data were created. They differed only in the random numbers used to generate the random noise which was added to the time series.
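The construction of one synthetic grid-box series can be sketched as below. The text says only that random noise of variance $v^2$ was added; drawing that noise from a Gaussian distribution, the function names, and the toy inputs are assumptions made for illustration. Months with no observations (n = 0) would have to be masked before calling this code.

```python
import random


def station_variance(sigma_inf2, r):
    """Invert equation 7: sigma_i^2 = sigma_inf^2 / r (requires r > 0)."""
    return sigma_inf2 / r


def noise_variance(sigma_i2, r, n, m_m2=0.0):
    """Equation 10: v^2 = (sigma_i^2 (1 - r) + m_m^2) / n."""
    return (sigma_i2 * (1.0 - r) + m_m2) / n


def make_synthetic(model_series, n_obs, sigma_inf2, r, m_m2=0.0, seed=0):
    """Add noise of variance v^2 to each month of the model series."""
    rng = random.Random(seed)
    sigma_i2 = station_variance(sigma_inf2, r)
    synthetic = []
    for value, n in zip(model_series, n_obs):
        v2 = noise_variance(sigma_i2, r, n, m_m2)
        synthetic.append(value + rng.gauss(0.0, v2 ** 0.5))  # Gaussian noise is an assumption
    return synthetic


# Three realisations that differ only in the random numbers (toy inputs).
model = [0.1, -0.2, 0.3, 0.0]
n_obs = [2, 2, 10, 40]
realisations = [make_synthetic(model, n_obs, 1.0, 0.6, seed=s) for s in (1, 2, 3)]
print(len(realisations), len(realisations[0]))
```

Setting `m_m2` to zero corresponds to the land case described above, where the measurement error was assumed to be negligible.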

A.1.2 Comparing adjusted and true data

The synthetic data were then run through the variance adjustment algorithms and the variance of the output was compared to that of the original model data (see figure 16). Before variance adjustment the variance of an average land data grid box was overestimated by around 11% and the variance of an average marine grid box by 180%. After variance adjustment the variance of an average land data grid box was found to be underestimated by less than 2% and the variance of an average marine data grid box was underestimated by 5%. In the marine case, discrepancies from the true variance can be larger than this in individual grid boxes, although in all cases the adjusted variance is closer to the true value than the unadjusted variance.
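The kind of comparison described here can be mimicked with a toy experiment. The sketch below is not the GCM-based test: it assumes a stationary, trend-free Gaussian "truth" with known variance, adds Gaussian noise of known variance, applies the factor from equation 9, and reports the percentage error in variance before and after. All numerical values and names are illustrative assumptions.

```python
import math
import random
from statistics import pvariance

rng = random.Random(42)
sigma_inf2 = 1.0   # true variance (assumed)
m_total2 = 1.0     # random-error variance (assumed)
months = 20000     # length of the toy series (assumed)

truth = [rng.gauss(0.0, math.sqrt(sigma_inf2)) for _ in range(months)]
noisy = [t + rng.gauss(0.0, math.sqrt(m_total2)) for t in truth]

k = math.sqrt(sigma_inf2 / (m_total2 + sigma_inf2))  # equation 9
adjusted = [k * x for x in noisy]


def percent_error(series, reference):
    """Positive values mean the variance is overestimated."""
    return 100.0 * (pvariance(series) / pvariance(reference) - 1.0)


before = percent_error(noisy, truth)
after = percent_error(adjusted, truth)
print(f"before: {before:+.1f}%  after: {after:+.1f}%")
```

In this idealised setting the unadjusted variance is roughly double the true variance, and scaling by k brings it back close to the true value. The real test is harder because the random error varies with the number of observations and the series must first be detrended.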

Figure 16: Running 10 year standard deviations (°C, left axis scale) are shown for four grid boxes. The blue line shows the standard deviation of the perfect model data masked to have the same coverage as the data. The red line shows the standard deviation of the synthetic data before variance adjustment and the black line shows the standard deviation of the synthetic data after variance adjustment. The number of observations is also shown (right hand scale). The top two panels are marine grid boxes, the lower two are land grid boxes. [Plot omitted: panels at 15°E, 40°S; 180°W, 45°S; 105°W, 40°N; and 15°E, 25°S; horizontal axis is year, 1860–2000.]

Figure 17: (a) Annual average sea surface temperatures from a grid box in the Tropical Atlantic for the original model data (magenta) and three realisations of the synthetic data before (cyan) and after (black) variance adjustment. (b) Shows the difference between the unadjusted synthetic data and model data (cyan) and the difference between the variance adjusted synthetic data and model data (black). [Plot omitted: panels "Average sea surface temperature, Tropical Atlantic gridbox" (anomaly in °C wrt 1961−90) and "T−T(perfect), Tropical Atlantic gridbox" (difference in °C), both against year, 1875–2000. Legend: perfect model data; uncorrected synthetic (error=0.09°C); variance corrected synthetic (error=0.06°C).]

Figure 18: Cumulative frequency distributions of monthly anomalies in four marine grid boxes. The magenta lines show the original model data, the cyan lines show the three realisations of the unadjusted synthetic data and the black lines show the three realisations of the variance adjusted synthetic data. [Plot omitted: panels at 15°E, 40°S; 180°W, 45°S; 90°E, 30°S; and 15°W, 50°N; number of months against anomaly (°C). Legend: perfect model data; uncorrected synthetic; variance corrected synthetic.]

Figure 19: (a) Annual average sea surface temperatures from the whole globe for the original model data (magenta) and three realisations of the synthetic data before (cyan) and after (black) variance adjustment. (b) Shows the difference between the unadjusted synthetic data and model data (cyan) and the difference between the variance adjusted synthetic data and model data (black). [Plot omitted: panels "Average sea surface temperature, Globe" (anomaly in °C wrt 1961−90) and "T−T(perfect), Globe" (difference in °C), both against year, 1875–2000. Legend: perfect model data; uncorrected synthetic (error=0.01°C); variance corrected synthetic (error=0.02°C).]

In individual grid boxes variance adjustment typically brings the synthetic data closer to the true value (see for example Figure 17), especially at times when such adjustments are large. This is notable, for example, during the Second World War or early in the record. The frequencies of individual grid-box monthly averages are also typically improved (see for example Figure 18) with extreme outliers due to noise being effectively adjusted. This means that it is possible to make more meaningful analyses of the occurrences of true extremes using the variance adjusted data. However, when these individual variance-adjusted grid-box values are averaged over large regions (Figure 19), the opposite is true. Whereas the random errors of individual grid boxes tend to cancel out when averaged, the cumulative effect of the hundreds of slight, but correlated, variance adjustments is to reduce the variance of the regional average.
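One way to see why the regional average loses variance is an idealised case, offered as a sketch rather than as the argument used in the analysis. Assume $N$ grid boxes share a common detrended signal $S$ of variance $\sigma^2$, each box carries independent noise $e_j$ of variance $m^2$, and every box is scaled by the same factor $k$ from equation 9. The symbols $N$, $S$, $e_j$, $\sigma^2$ and $m^2$ are introduced for this illustration only.

```latex
\begin{align*}
T_j &= S + e_j, \qquad k^2 = \frac{\sigma^2}{\sigma^2 + m^2}, \\
\operatorname{Var}\Big(\frac{1}{N}\sum_{j=1}^{N} T_j\Big)
  &= \sigma^2 + \frac{m^2}{N}, \\
\operatorname{Var}\Big(\frac{1}{N}\sum_{j=1}^{N} k\,T_j\Big)
  &= k^2\Big(\sigma^2 + \frac{m^2}{N}\Big)
  \;\xrightarrow[N \to \infty]{}\; \frac{\sigma^4}{\sigma^2 + m^2} \;<\; \sigma^2
  \quad (m^2 > 0).
\end{align*}
```

In this idealisation the unadjusted average approaches the true variance as $N$ grows because the independent noise cancels, whereas the adjusted average keeps the factor $k^2$ applied to the shared signal, so its variance stays below the true value.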

Some degradation of the true temperature signal is inevitable, as no filter can perfectly separate out the measurement and sampling error component of the temperature signal, and the reduction applied to the noise component will then be applied to some of the signal as well. Despite this, the variance adjustment process is very successful at the grid-box scale.

References

[Barnett et al., 2000] Barnett, Tim P., Gabriele Hegerl, Tom Knutson and Simon Tett (2000), Uncertainty levels in predicted patterns of anthropogenic climate change, J. G. R., 105, pp 15525–42, (2000JD00162).

[Begert et al., 2005] Begert, M., T. Schlegel and W. Kirchhofer (2005), Homogeneous temperature and precipitation series of Switzerland for 1864 to 2000, Int. J. Climatol., 25, pp 65–80.

[Diaz et al., 2002] Diaz, H., C. Folland, T. Manabe, D. Parker, R. Reynolds and S. Woodruff (2002), Workshop on Advances in the Use of Historical Marine Climate Data, WMO Bull., 51, pp 377–380.

[Folland and Parker, 1995] Folland, C.K. and D.E. Parker (1995), Correction of instrumental biases in historical sea surface temperature data, Quart. J. Roy. Meteor. Soc., 121, pp 319–67.

[Folland et al., 1986] Folland, C.K., D.E. Parker and T.N. Palmer (1986), Sahel rainfall and worldwide sea temperatures 1901–85, Nature, 320, pp 602–607.

[Folland et al., 2001] Folland, C. K., N. A. Rayner, S. J. Brown, T. M. Smith, S. S. P. Shen, D. E. Parker, I. Macadam, P. D. Jones, R. N. Jones, N. Nichols and D. M. H. Sexton (2001), Global temperature change and its uncertainties since 1861, G. R. L., 28, 13, pp 2621–24, (2001GL012877).

[Gordon et al., 2000] Gordon, C., Claire Cooper, Catherine A Senior, Helene Banks, Jonathan M Gregory, Timothy C Johns, John FB Mitchell and Richard A Wood (2000), The Simulation of SST, Sea Ice Extents and Ocean Heat Transports in a version of the Hadley Centre Coupled Model without Flux Adjustments, Climate Dynamics, 16, pp 147–68.

[Houghton et al., 2001] Houghton, J.T., Y. Ding, D.J. Griggs, M. Noguer, P.J. van der Linden, X. Dai, K. Maskell and C.A. Johnson (eds.) (2001), IPCC, 2001: Climate Change 2001: The Scientific Basis. Contribution of Working Group I to the Third Assessment Report of the Intergovernmental Panel on Climate Change, Cambridge University Press, 881 pp.


[Johns et al., 2004] Johns, T., Chris Durman, Helene Banks, Malcolm Roberts, Alison McLaren, Jeff Ridley, Catherine Senior, Keith Williams, Andy Jones, Ann Keen, Graham Rickard, Stephen Cusack, Manoj Joshi, Mark Ringer, Buwen Dong, Hilary Spencer, Richard Hill, Jonathan Gregory, Anne Pardaens, Jason Lowe, Alejandro Bodas-Salcedo, Sheila Stark and Yvonne Searl (2004), HadGEM1 - Model description and analysis of preliminary experiments for the IPCC Fourth Assessment Report, Hadley Centre Technical Note, 55.

[Jones et al., 1985] Jones, P.D., S.C.B. Raper, B.D. Santer, B.S.G. Cherry, C.M. Goodess, P.M. Kelly, T.M.L. Wigley, R.S. Bradley and H.F. Diaz (1985), A Grid Point Surface Air Temperature Data Set for the Northern Hemisphere, U.S. Dept. of Energy, Carbon Dioxide Research Division, Technical Report TR022, 251 pp.

[Jones et al., 1986] Jones, P.D., S.C.B. Raper, B.S.G. Cherry, C.M. Goodess and T.M.L. Wigley (1986), A Grid Point Surface Air Temperature Data Set for the Southern Hemisphere, 1851–1984, U.S. Dept. of Energy, Carbon Dioxide Research Division, Technical Report TR027, 73 pp.

[Jones et al., 1990] Jones, P. D., P.Ya. Groisman, M. Coughlan, N. Plummer, W.C. Wang and T. R. Karl (1990), Assessment of urbanization effects in time series of surface air temperature over land, Nature, 347, pp 169–172.

[Jones, 1994] Jones, P. D. (1994), Hemispheric surface air temperature variations: A re-analysis and an update to 1993, J. Clim., 7, pp 1794–802.

[Jones et al., 1997] Jones, P. D., T. J. Osborn and K. R. Briffa (1997), Estimating Sampling Errors in Large-Scale Temperature Averages, J. Clim., 10, pp 2548–68.

[Jones et al., 2001] Jones, P. D., T. J. Osborn, K. R. Briffa, C. K. Folland, E. B. Horton, L. V. Alexander, D. E. Parker and N. A. Rayner (2001), Adjusting for sampling density in grid box land and ocean surface temperature time series, J. G. R., 106, D4, pp 3371–80, (2000JD900564).

[Jones & Moberg, 2003] Jones, P. D., and A. Moberg (2003), Hemispheric and Large-Scale Surface Air Temperature Variations: An Extensive Revision and an Update to 2001, J. Clim., 16, pp 206–23.

[Kalnay et al., 1996] Kalnay, E., M. Kanamitsu, R. Kistler, W. Collins, D. Deaven, L. Gandin, M. Iredell, S. Saha, G. White, J. Woollen, Y. Zhu, A. Leetmaa and B. Reynolds (1996), The NCEP/NCAR 40-year reanalysis project, Bull. Am. Meteorol. Soc., 77, pp 437–471.

[Kalnay & Cai, 2003] Kalnay, Eugenia, and Ming Cai (2003), Impact of urbanization and land-use change on climate, Nature, 423, pp 528–31.

[Kerr, 2005] Kerr, R.A. (2005), Atlantic Climate Pacemaker for Millennia Past, Decades Hence?, Science, 309, 5731, pp 41–3.

[Manabe, 2003] Manabe, T. (2003), The Kobe collection: newly digitized Japanese historical surface marine meteorological observations, in Advances in the Applications of Marine Climatology - The Dynamic Part of the WMO Guide to the Applications of Marine Meteorology, WMO/TD-No. 1081 (JCOMM Technical Report No. 13), World Meteorological Organization, Geneva, 246 pp.

[Parker, 1994] Parker, D. E. (1994), Effects of changing exposure of thermometers at land stations, Int. J. Climatol., 14, pp 1–31.

[Parker, 2004] Parker, David E. (2004), Large-scale warming is not urban, Nature, 432, pp 290.

[Parker et al., 1992] Parker, D.E., T.P. Legg and C.K. Folland (1992), A new daily Central England Temperature series, Int. J. Climatol., 12, pp 317–42.

[Parker et al., 1994] Parker, D.E., P.D. Jones, A. Bevan and C.K. Folland (1994), Interdecadal Changes of surface temperature since the 19th century, J. G. R., 99, pp 14373–99, (94JD00548).

[Parker et al., 1995] Parker, D.E., C.K. Folland and M. Jackson (1995), Marine Surface Temperature: Observed variations and data requirements, Climatic Change, 31, pp 559–600.

[Parker & Horton, 2005] Parker, David E., and Briony Horton (2005), Uncertainties in Central England Temperature 1873–2003 and some improvements to the maximum and minimum series, Accepted by Int. J. Climatol.

[Peterson, 2004] Peterson, Thomas C. (2004), Assessment of Urban Versus Rural In Situ Surface Temperatures in the Contiguous United States: No Difference Found, J. Clim., 16, 18, pp 2941–59.

[Peterson & Owen, 2005] Peterson, Thomas C., and Timothy W. Owen (2005), Urban Heat Island Assessment: Metadata are important, J. Clim., 18, 18, pp 2637–46.

[Rayner et al., 2006] Rayner, N. A., P. Brohan, D. E. Parker, C. K. Folland, J. Kennedy, M. Vanicek, T. Ansell and S. F. B. Tett (2006), Improved analyses of changes and uncertainties in sea-surface temperature measured in situ since the mid-nineteenth century, Accepted by J. Clim.

[Rumsfeld, 2004] The Acronym Institute. Disarmament documentation. Back to disarmament documentation, June 2002. Defense secretary Rumsfeld press conference, June 6. "Secretary of Defense Donald H. Rumsfeld, press conference at NATO headquarters, Brussels, Belgium, June 6, 2002," US Department of Defense transcript. www.acronym.org.uk/docs/0206/doc04.htm

[Simmons et al., 2004] Simmons, A. J., P. D. Jones, V. da Costa Bechtold, A. C. M. Beljaars, P. W. Kallberg, S. Saarinen, S. M. Uppala, P. Viterbo and N. Wedi (2004), Comparison of trends and low-frequency variability in CRU, ERA-40 and NCEP/NCAR analyses of surface air temperature, J. G. R., 109, D24115, (2004JD005306).

[Tett et al., 2006] Tett, Simon F.B., Richard Betts, Thomas J. Crowley, Tim Johns, Andy Jones, Jonathan Gregory, Tim Osborn, Elisabeth Ostrom, David L. Roberts and Margaret J. Woodage (2006), The impact of Natural and Anthropogenic Forcings on Climate since 1550, Submitted to Climate Dynamics.

[Turner et al., 2005] Turner, J., Steve R. Colwell, Gareth J. Marshall, Tom A. Lachlan-Cope, Andrew M. Carleton, Phil D. Jones, Victor Lagun, Phil A. Reid and Svetlana Iagovkina (2005), Antarctic Climate Change During the Last 50 Years, Int. J. Climatol., 25, pp 279–94.

[Vincent & Gullet, 1999] Vincent, L.A., and D.W. Gullet (1999), Canadian Historical and Homogeneous temperature datasets for climate change analyses, Int. J. Climatol., 19, pp 1375–88.

[von Storch et al., 2004] von Storch, H., Eduardo Zorita, Julie Jones, Yegor Dimitriev, Fidel Gonzalez-Rouco and Simon Tett (2004), Reconstructing Past Climate from Noisy Data, Science, 306, 5696, pp 679–82.

[WMO, 1996] WMO (1996), Climatological Normals (CLINO) for the period 1961–1990, World Meteorological Organization Doc, WMO/OMM No. 847, Geneva, Switzerland, 768 pp.

[Woodruff et al., 2003] Woodruff, S.D., S.J. Worley, J.A. Arnott, H.F. Diaz, J.D. Elms, M. Jackson, S.J. Lubker and D.E. Parker (2003), COADS updates and the blend with the U.K. Met Office Marine Data Bank, in Advances in the Applications of Marine Climatology - The Dynamic Part of the WMO Guide to the Applications of Marine Meteorology, WMO/TD-No. 1081 (JCOMM Technical Report No. 13), World Meteorological Organization, Geneva, 246 pp.

[Zhou et al., 2004] Zhou, L., Robert E Dickinson, Yuhong Tian, Jingyun Fang, Qingxiang Li, Robert K Kaufmann, Compton J Tucker and Ranga B Myneni (2004), Evidence for a significant urbanization effect on climate in China, PNAS, 101, 26, pp 9540–9544.
