Long-term Corrections for Wind Resource Assessment

DTU Wind Energy-Master-Series-M-0047(EN)


Alfonso Perez-Andujar
Supervised by: Alfredo Pena and Andrea N. Hahmann

DTU Wind Energy, Risø Campus, Technical University of Denmark, Roskilde, Denmark

December 2013

Author: Alfonso Perez-Andujar
Supervised by: Alfredo Pena and Andrea N. Hahmann
Title: Long-term Corrections for Wind Resource Assessment
Department: DTU Wind Energy

Abstract

This document is an MSc thesis developed for DTU Wind Energy at Risø Campus. It is mainly a study of different long-term correction methodologies, which estimate what the observed wind climate might have looked like had measurements started long before. Long-term corrections are commonly assumed to represent the future long-term wind climatology, so this assumption was also investigated.

Long-term corrections are derived from the relationship between the reference and the observed wind speed time series, in the time window where both are concurrent. The time window, or concurrent subset, can be varied in length and position along the total concurrent set, especially if observations are long, as in this thesis. Thus, for different concurrent subset lengths and positions, the long-term corrected Weibull parameters A and k, as well as the long-term corrected power density P, were compared to those actually observed at the site. This was done by means of bias ratios of long-term corrected to observed parameters. For each subset length, the mean and standard deviation of each bias ratio were calculated over all possible positions of that subset within the total concurrent set; a concurrent period of 12 months was found to be long enough for the three bias ratios to stabilise. Furthermore, the Weibull method was the best of all non-regression methods at yielding bias ratios closest to 1, while among the regression methods the Variance Ratio method performed best.

The past's representativeness of the future long-term wind climatology was explored as well: how representative the concurrent subset is of the full concurrent set clearly determined how well the LTC (derived from the concurrent subset) represents the future. Also, there is only a subtle difference between the case where the past is just long-term reference wind speed and the case where it is long-term corrected wind speed.

DTU Wind Energy-Master-Series-00XX(EN)
December 9, 2013


Pages: 100
Tables: 6
Figures: 79
References: 0

Technical University of Denmark
Frederiksborgvej 399
4000 Roskilde
Denmark

Contents

1 Introduction

2 Wind Power Meteorology

3 Theory: long-term correction methods
  3.1 Regression methods
  3.2 Non-regression methods

4 Site description

5 WRF

6 General pre-processing and data treatment
  6.1 General pre-processing
  6.2 Data treatment
  6.3 The effect of fixing invalid data on the correlation

7 Sensitivity analysis on the correlation
  7.1 Sensitivity to time-shifting the concurrent time series
  7.2 Sensitivity to rotating the reference wind direction
  7.3 Sensitivity to widening the averaging time-range around minute 00 in the observed 10-min average dataset

8 The wind climate at Høvsøre
  8.1 The local wind speed and direction
  8.2 The effect of averaging on the WRF-observations correlation
  8.3 A description of each observed year
  8.4 Similarity of concurrent WRF-derived and observed parameters

9 Results I - Which is the best LTC method?
  9.1 How many months are enough to long-term correct?
  9.2 The 12-month concurrent subset
  9.3 Optimisation
  9.4 u and v: an alternative approach

10 Results II - Can LTCs estimate the future?
  10.1 Description of scheme
  10.2 Choice of concurrent year
  10.3 LTCs representing the future for different methods

11 Discussion

12 Conclusions

13 References

A Appendix

I would like to thank my supervisors Alfredo and Andrea for the time spent together during the development of this thesis and especially for their help during the final corrections. Thanks also to Sonia Lileo, Knut Harstveit and Rickard Klinkert from Kjeller Vindteknikk for their kind emails and constant help; Anthony Rogers for his advice; Alan Mortimer for his time and help by phone; Colin Ritter for his help and suggestions; Niels G. Mortensen for the papers he printed for me; and finally to Wolfgang Schlez and the guys from Garrad Hassan for the access they gave us to WindFarmer.

Thanks to my friends Matteo and Philippe, to this beautiful country where I met the one and only Magic Mike; to Sandra and, of course, to my family, who are always a refuge.


1 Introduction

Planned wind farms are becoming ever larger in terms of turbine size and investment. A wind farm developer needs to minimise the financial risk by calculating the best possible estimate of the future long-term wind climatology at the site of interest, i.e. by estimating the future power production. This, however, can only be done using measurements from the past, which implies assuming that the past is a reasonable basis for predicting the future climatology. Moreover, there may not be more than a year of wind speed and direction observations at the target site, whereas around 8-10 years are needed to obtain a trustworthy long-term average that accounts for the local interannual variations. Such long observations are of course very hard to find at target sites, because on-site measuring campaigns generally last not much longer than a year.

To circumvent the shortcoming of having only short-term on-site observations, methodologies known as long-term corrections (LTCs) are commonly used in wind resource assessment to give an estimation of the long-term past wind climatology that could have been measured at a target site. LTC methods work by exploring relationships between the short-term observations at the site and the short-term slice of a longer reference time series which is concurrent with it. The long-term reference time series can be a long-term observation from a nearby site, a dataset from analysis or reanalysis data, or results from numerical weather prediction models. From the concurrent short-term observed and reference datasets, some correction factors are established, by means of which the long-term time series can be transferred onto the target site.

Figure 1: LTC general scheme for two imaginary concurrent time series.


Figure 1 shows imaginary long-term reference and short-term on-site observed time series. The longest possible concurrent period, marked in green, comprises the total observed set but only a slice of the long-term dataset. The resulting LTC could go, in this case, as far back as year 10 and thus constitute the wind climatology that could have been measured at the target site, had measurements started earlier. This is why LTCs are not in essence, as often termed, predictions of the future wind climatology, but rather "could-have-been" hypotheses regarding an already past time. The energy yield of a long-term corrected (LTC) climatology is often assumed to give a trustworthy idea of the future energy yield. This is the same as assuming that the LTC climatology is representative of the future, which is a reasonable assumption only if the climatology of the area is known to vary mildly with time.

Two main questions thus arise. First of all, how accurately do the long-term reference data describe the long-term wind climatology at the site? Of course, since the whole purpose of using long-term reference data is precisely to make up for the lack of long-term observations at the site, it may seem preposterous to try to compare long-term reference data with what is actually being looked for. However, if there are long-term data at both the reference and the target sites for the same period (as in this thesis), it is interesting to see how similar reference long-term data are to actual long-term site observations. This consideration has as yet nothing to do with LTCs as such, but of course, if the reference wind climatology is not in the least representative of the site's actual wind climatology, most LTC methods will probably give biased results, since long-term reference data are the key ingredient of an LTC.

Regarding the issue of similarity between reference data and observations, Lileo et al. (2013) conducted an investigation on what they termed the "representativeness" of the reference wind speed, i.e. how well the reference wind speed represents the concurrent site wind speed. They did so for 8 different reanalysis models and 42 measurement sites in terrain of low complexity, and obtained the best results for the reanalysis reference data coming from the Weather Research and Forecasting (WRF) model. In this respect, several different methods (introduced in section 3) will be used in order to generate LTCs which can later be compared to actual concurrent observations. These results will also be compared to those obtained by Lileo et al. (2013) and Rogers et al. (2005). In Lileo et al. (2013), the Knut & Harstveit (KH) method shows the best agreement with observations in terms of mean wind speed and Weibull parameters A and k. Rogers et al. (2005) show, on the other hand, that the Variance Ratio (VAR) and the Mortimer (MOR) methods are the closest. Note that in Lileo et al. (2013), long-term reference wind speeds and directions come from reanalysis, whereas Rogers et al. (2005) used long-term observations from a secondary mast. Moreover, the methods investigated in one paper are not investigated in the other.


This takes us to the main motivation of this thesis, which is to find out which LTC methods give the best results. Indeed, even though LTC methods are a common step in the process of predicting the future wind speed (as well as a relatively simple tool in terms of implementation, at least when compared to flow and wake modelling), they account for an average 2.5% of the total variability of the entire predictive process. This is more than the flow and wake variation put together, as seen in figure 2, taken from a study carried out in 2011.

Figure 2: Coefficient of variation [%] added by each of the common steps in a prediction process of the future wind speed. Taken with permission of Niels G. Mortensen, Comparison of Resource and Energy Yield Assessment Procedures, 2011.

Therefore, a consensus should be reached as to which LTC method to use in a wind resource assessment, and why.

Secondly, and also an important motivation for this thesis: can LTCs predict the future wind climatology? If so, it may seem reasonable to hypothesise that the more data gathered from the past, the more accurate the description of the future will be. However, does this hypothesis remain reasonable as the future period to be estimated grows longer? The assumption of the past being representative of the future has been the object of study in recent years. Lileo et al. (2013) investigated, for an already past period of reanalysis data, how well different "past" windows (i.e. prior to some date inside the chosen period) of wind speed represent a fixed "future" window of subsequent years. They did this for each grid point over a certain focus region, using wind speeds obtained from the Twentieth Century Global Reanalysis Version II (20CRv2). In order to get an idea of the past's representativeness of the future, they defined an error by taking the percentage difference in mean wind speed between the "past" and the "future" periods. They concluded that the mean wind speed of the near past is not necessarily the best predictor of the future mean wind speed, and that each grid point has an optimum length of "past" window, i.e. the number of past years needed to get the best prediction (the minimum percentage error) is specific to each grid point.

In this thesis, a similar investigation is conducted. However, LTCs from the past are compared to future observations. Of course, if the past's representativeness of the future is to be studied, it would always be safer to compare past years of observations with future observations, rather than past years of LTCs with future observations. However, as mentioned earlier, there are usually no more than 12 months of observations at a target site, so comparing past LTCs to future observations can solve the problem of the lack of long-term observations, and tell us which method yields the best result.

The LTC methods used are explained in section 3. They are classified as regression and non-regression methods and are easily found in the literature (Riedel et al. (2001), Nielsen et al. (2001), Woods and Watson (1997), Mortimer (1994), and also summarised in Lileo et al. (2013) and Rogers et al. (2005)).

Section 4 describes the site of interest, the area of Høvsøre, which is located in Western Denmark; since WRF-derived wind speeds are used as long-term references, the basic principle underlying the model is explained in section 5. The filtering process applied to invalid values of wind speed and direction found in the observed dataset is explained in section 6. Section 7 is a short investigation of how the correlation for the concurrent wind speed components varies under certain changing conditions. The climate at the site is described in section 8. Section 9 explores the ability of the different LTC methods to long-term correct different parameters describing the wind climatology. Finally, section 10 tries to answer the question of whether we can predict the future using information from the past, at least for the specific case of Høvsøre and the choice of inputs for this work.


2 Wind Power Meteorology

It is common practice to describe the frequency of 10-min, 30-min or 1-hr average wind speeds U at some site, over a long-enough period (e.g. 1 year), by means of the Weibull probability density function (p.d.f.),

$$f(U) = \frac{k\,U^{k-1}}{A^{k}} \exp\left(-\left(\frac{U}{A}\right)^{k}\right), \tag{1}$$

where A and k are the Weibull parameters. Equation 1 shows that the frequency of occurrence of the wind, f(U), is driven just by A and k. This section briefly describes the different methods by which these two parameters can be calculated from a wind speed time series. A and k, together with the wind direction, are enough information to characterise the site, at least for a study of this kind.

Before investigating different methods of calculating A and k, it is worth looking at certain definitions, for example the mean wind speed $\mu$, which is the particular case $n = 1$ of the non-central moment

$$\mu'_n = \int_0^{\infty} U^{n} f(U)\,dU. \tag{2}$$

The variance $\sigma^2$ of the wind speed is the particular case $n = 2$ of the central moment

$$\mu_n = \int_0^{\infty} (U - \mu'_1)^{n} f(U)\,dU. \tag{3}$$

A very useful relationship for this study, involving the non-central moments, shall also be considered:

$$\mu'_n = A^{n}\,\Gamma\left(1 + \frac{n}{k}\right), \tag{4}$$

where the gamma function is defined as

$$\Gamma(t) = \int_0^{\infty} e^{-x} x^{t-1}\,dx, \tag{5}$$

and where t is a constant such that t > 0 (in equation 4, t = 1 + n/k > 1).
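As a short worked step, using only equations 2 to 4, the first two non-central moments and the variance of a Weibull-distributed wind speed can be written out explicitly:

```latex
\begin{align*}
\mu'_1 &= A\,\Gamma\!\left(1 + \tfrac{1}{k}\right), \qquad
\mu'_2 = A^{2}\,\Gamma\!\left(1 + \tfrac{2}{k}\right), \\
\sigma^{2} &= \mu'_2 - \mu_1'^{\,2}
  = A^{2}\left[\Gamma\!\left(1 + \tfrac{2}{k}\right)
  - \Gamma^{2}\!\left(1 + \tfrac{1}{k}\right)\right].
\end{align*}
```

Since $\mu'_n$ scales as $A^n$, any quotient of the form $\mu_1'^{\,n}/\mu'_n$ is independent of A and depends on k alone.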

Square of the mean wind speed
Dividing the square of the first non-central moment (the square of the mean) by the second non-central moment (the mean of the square) gives an equation whose right-hand side is a function of k only. Since the left-hand side is a quotient of known values, the equation can be solved iteratively for k,

$$\frac{\mu_1'^{\,2}}{\mu'_2} = \frac{\Gamma^{2}\left(1 + \frac{1}{k}\right)}{\Gamma\left(1 + \frac{2}{k}\right)}. \tag{6}$$

This method will be referred to as 2NCM.

Cube of the mean wind speed
Dividing the cube of the first non-central moment (the cube of the mean) by the third non-central moment (the mean of the cube) gives a result which is also just a function of k,

$$\frac{\mu_1'^{\,3}}{\mu'_3} = \frac{\Gamma^{3}\left(1 + \frac{1}{k}\right)}{\Gamma\left(1 + \frac{3}{k}\right)}. \tag{7}$$

This method will be referred to as 3NCM.
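The two moment-ratio methods can be sketched in a few lines of Python. This is a minimal illustration and not the implementation used in the thesis: the function names, the bisection solver and the search bracket for k are assumptions, and A is recovered from the mean wind speed through the relation mean = A Γ(1 + 1/k).

```python
import math


def moment_ratio(k, n):
    """Gamma^n(1 + 1/k) / Gamma(1 + n/k): right-hand side of the 2NCM/3NCM equations."""
    return math.gamma(1.0 + 1.0 / k) ** n / math.gamma(1.0 + n / k)


def fit_weibull_ncm(speeds, n, k_lo=0.5, k_hi=10.0, tol=1e-10):
    """Estimate (A, k) from mean(U)^n / mean(U^n), with n = 2 (2NCM) or n = 3 (3NCM).

    The bracket [k_lo, k_hi] and the bisection solver are illustrative choices.
    """
    mean_u = sum(speeds) / len(speeds)
    mean_un = sum(u ** n for u in speeds) / len(speeds)
    target = mean_u ** n / mean_un  # the known quotient on the left-hand side

    # moment_ratio grows monotonically with k, so plain bisection is enough.
    lo, hi = k_lo, k_hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if moment_ratio(mid, n) < target:
            lo = mid
        else:
            hi = mid
    k = 0.5 * (lo + hi)

    # The scale parameter follows from the first moment: mean = A * Gamma(1 + 1/k).
    a = mean_u / math.gamma(1.0 + 1.0 / k)
    return a, k
```

Calling `fit_weibull_ncm(speeds, 2)` gives the 2NCM estimate and `fit_weibull_ncm(speeds, 3)` the 3NCM one; any other root finder could replace the bisection loop.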

Maximum Likelihood Estimator
This method was developed by Harter and Moore (1965). Let $U_1, U_2, ..., U_N$ be a sample of N random and independently distributed wind speeds drawn from a p.d.f. that depends only on the wind speed U and on the parameter to be estimated, $\theta$. The likelihood function of the random sample $U_i$, $i = 1, ..., N$, is denoted L and is the joint density of all $U_i$ in the drawn sample,

$$L = \prod_{i=1}^{N} f(U_i, \theta). \tag{8}$$

When the p.d.f. is the Weibull probability density function, the expression for L is

$$L(U, A, k) = \prod_{i=1}^{N} \frac{k\,U_i^{k-1}}{A^{k}} \exp\left(-\left(\frac{U_i}{A}\right)^{k}\right). \tag{9}$$

Setting the partial derivatives of ln L to zero gives two equations, which are enough to solve iteratively for A and k,

$$\frac{\partial \ln L}{\partial A} = 0, \tag{10}$$

$$\frac{\partial \ln L}{\partial k} = 0, \tag{11}$$

and thus to find the pair (A, k) that maximises the likelihood function. This method will be referred to as MLE.
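A possible iterative solution can be sketched as follows. For the Weibull p.d.f., setting the derivative of ln L with respect to A to zero gives A^k = mean(U^k), and substituting this into the derivative with respect to k leaves a single equation in k. The function name, the bisection solver, the search bracket and the exclusion of zero wind speeds are assumptions of this sketch, not details taken from the thesis.

```python
import math


def fit_weibull_mle(speeds, k_lo=0.5, k_hi=10.0, tol=1e-10):
    """Maximum-likelihood Weibull fit by bisection on the k equation."""
    u = [x for x in speeds if x > 0.0]  # ln(U) requires strictly positive speeds
    n = len(u)
    mean_log = sum(math.log(x) for x in u) / n

    def score(k):
        # d(ln L)/dk = 0 after eliminating A:
        # sum(U^k ln U) / sum(U^k) - 1/k - mean(ln U) = 0
        s0 = sum(x ** k for x in u)
        s1 = sum(x ** k * math.log(x) for x in u)
        return s1 / s0 - 1.0 / k - mean_log

    # score(k) increases monotonically with k, so bisection converges.
    lo, hi = k_lo, k_hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    k = 0.5 * (lo + hi)

    # d(ln L)/dA = 0 gives A^k = mean(U^k).
    a = (sum(x ** k for x in u) / n) ** (1.0 / k)
    return a, k
```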

Least Square Method
The Weibull cumulative distribution function (c.d.f.), F(U), is obtained by integrating its p.d.f.,

$$F(U) = \int_0^{U} f(U')\,dU' = 1 - \exp\left(-\left(\frac{U}{A}\right)^{k}\right). \tag{12}$$

Rearranging equation 12 and taking natural logarithms twice leads to

$$\ln\left(-\ln\left(1 - F(U)\right)\right) = k \ln U - k \ln A, \tag{13}$$

which is linear in ln U, so that k and A can be obtained from a straight-line fit, e.g. via least squares. This method will be referred to as LSM.
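The straight-line fit can be sketched as below: the slope of ln(-ln(1 - F)) against ln U is k, and the intercept equals -k ln A. How the empirical c.d.f. was built in the thesis is not stated; the plotting position (rank - 0.5)/N used here, like the function name, is an assumption.

```python
import math


def fit_weibull_lsm(speeds):
    """Least-squares Weibull fit on the linearised c.d.f."""
    u = sorted(x for x in speeds if x > 0.0)  # ln(U) requires positive speeds
    n = len(u)
    xs, ys = [], []
    for rank, value in enumerate(u, start=1):
        f = (rank - 0.5) / n  # empirical c.d.f. (assumed plotting position)
        xs.append(math.log(value))
        ys.append(math.log(-math.log(1.0 - f)))

    # Ordinary least-squares line y = slope * x + intercept.
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    k = slope                     # slope of the line is k
    a = math.exp(-intercept / k)  # intercept = -k ln A
    return a, k
```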

The four techniques were applied to a wind speed time series, giving four sets of Weibull parameters and hence four distributions. These were plotted alongside the histogram of the dataset in order to see the differences between them. Figure 3 shows the entire wind speed distribution, whereas figure 4 shows a magnified view for better visualisation, since the f(U) curves from the different methods lie close together. The p.d.f. representing the LSM method (blue curve) gives noticeably higher frequencies of occurrence for the speed range 5-12 m/s.

Figure 3: Histogram of the wind speed (bars), and Weibull distribution p.d.f.s based on different methods: 2NCM (square of the mean wind speed), 3NCM (cube of the mean wind speed), MLE (maximum likelihood estimator) and LSM (least square method). Axes: wind speed [m/s] (horizontal) and p.d.f. (vertical).


Figure 4: Amplification of the histogram of wind speed (bars), and Weibull distribution p.d.f.s based on different methods: 2NCM (square of the mean wind speed), 3NCM (cube of the mean wind speed), MLE (maximum likelihood estimator) and LSM (least square method). Axes: wind speed [m/s] (horizontal) and p.d.f. (vertical).

For a wind farm investor, besides A and k it is also very important to estimate the future wind power density at the site of interest, P. This third parameter is directly derived from A and k,

$$P_{A,k} = \frac{1}{2}\,\rho\,A^{3}\,\Gamma\left(1 + \frac{3}{k}\right), \tag{14}$$

where $\rho$ is the air density.

However, the wind power density can also be calculated directly from the time series, by averaging the cubed wind speed values,

$$P_{U^3} = \frac{1}{2}\,\rho\,\overline{U^{3}}. \tag{15}$$

Having two approaches is advantageous because it allows for a direct comparison between the single-valued $P_{U^3}$ (which is fixed for any given time series) and the $P_{A,k}$ coming from each of the four methods explained above. This comparison is shown in table 1 by means of

$$\Delta P = \frac{P_{U^3} - P_{A,k}}{P_{U^3}}. \tag{16}$$


| Parameters | 2NCM | 3NCM | MLE | LSM |
|---|---|---|---|---|
| A [m/s] | 10.47 | 10.47 | 10.46 | 10.36 |
| k | 2.19 | 2.17 | 2.18 | 2.26 |
| $\Delta P$ [%] | 0.67 | 0.00 | 0.41 | 6.02 |

Table 1: Percentage error between the power density calculated as a function of the average cubed wind speed and the power density calculated as a function of A and k obtained through different methods. The expression used is $\Delta P = (P_{U^3} - P_{A,k})/P_{U^3}$.

The p.d.f. curves derived from the four methods (figures 3 and 4) do not show which of them gives the best description of the wind power density. However, table 1 does show that a small change in the k parameter (2.17 in 3NCM against 2.19 in 2NCM, i.e. 0.9%), keeping A constant, means a difference of 0.7% in power density (847 W/m² against 853 W/m²). Furthermore, the LSM method is by far the worst in terms of $\Delta P$, with just a 1.1% difference in A with respect to the three other methods. It can be concluded that, while 2NCM, 3NCM and MLE yield very similar Weibull parameters, $P_{A,k}$ is so sensitive to them that the 3NCM method is the best approach to estimating an accurate value of the power density.
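The sensitivity discussed above can be checked with a short sketch that evaluates the Weibull power density for the rounded (A, k) pairs of table 1 and compares each method with 3NCM, whose error in the table is zero. The air density value is an assumption (the relative differences do not depend on it), the function names are illustrative, and exact agreement with the table is not expected because A and k are rounded to two decimals there.

```python
import math

RHO = 1.225  # air density [kg/m^3]; assumed standard value, not taken from the thesis


def power_density_weibull(a, k, rho=RHO):
    """P = 0.5 * rho * A^3 * Gamma(1 + 3/k), in W/m^2."""
    return 0.5 * rho * a ** 3 * math.gamma(1.0 + 3.0 / k)


def power_density_series(speeds, rho=RHO):
    """P = 0.5 * rho * mean(U^3), in W/m^2."""
    return 0.5 * rho * sum(u ** 3 for u in speeds) / len(speeds)


def relative_difference(p_ref, p):
    """(P_ref - P) / P_ref, the form of the error used in table 1."""
    return (p_ref - p) / p_ref


# Rounded (A, k) pairs from table 1.
table_1 = {
    "2NCM": (10.47, 2.19),
    "3NCM": (10.47, 2.17),
    "MLE": (10.46, 2.18),
    "LSM": (10.36, 2.26),
}

# 3NCM is used as the reference here because its error in table 1 is 0.00%.
p_ref = power_density_weibull(*table_1["3NCM"])
for name, (a, k) in table_1.items():
    diff = relative_difference(p_ref, power_density_weibull(a, k))
    print(f"{name}: {100.0 * diff:.2f} %")
```

The printed values are of the same order as the $\Delta P$ row: a change of about 1% in k already moves the power density by a little under 1%, while the LSM parameters give a difference of several percent.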

Using the Weibull parameters is useful in that it describes the local wind climatology through just two parameters. For example, whenever sector-wise observed wind speeds are generalised, it is a much better choice, in terms of computational cost, to handle just two parameters per sector instead of generalising speed value after speed value.


3 Theory: long-term correction methods

Long-term correction methodologies need a short-term observed wind speed or direction dataset at the site of interest, and a long-term reference time series. It is moreover necessary for both time series to be concurrent during a certain period of time. A wind farm developer would usually use the entire short-term time series and slice out the piece of the long-term reference time series which is concurrent with it. These two concurrent time series of equal length can then be used to calculate the LTC factors, which, applied to the entire long-term reference dataset, give the long-term correction and thus an estimate of the site's long-term climatology.
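The slicing step can be illustrated with a minimal sketch. The data layout (dictionaries keyed by timestamp), the function name and the sample values are hypothetical and only serve to show how the concurrent subset is extracted from the two series.

```python
def concurrent_subset(site, reference):
    """Extract the concurrent part of a short site series and a long reference series.

    Both arguments are dicts mapping a timestamp to a wind speed [m/s].
    Returns the shared timestamps and the two concurrent value lists.
    """
    stamps = sorted(set(site) & set(reference))
    return stamps, [site[t] for t in stamps], [reference[t] for t in stamps]


# Hypothetical hourly data: 6 h of reference, 3 h of site observations.
reference = {0: 8.0, 1: 9.0, 2: 7.5, 3: 6.0, 4: 10.0, 5: 11.0}
site = {3: 5.4, 4: 9.1, 5: 9.8}
stamps, site_st, ref_st = concurrent_subset(site, reference)
```

The LTC factors would then be derived from `site_st` and `ref_st` and applied to all values in `reference`.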

Expressions such as "reference concurrent" and "short-term reference" datasets are equivalent and will refer to the part of the long-term reference time series that is concurrent with the short-term site-observed dataset, which will in turn be denoted "short-term site", "site concurrent" or simply "short-term observed" time series.

3.1 Regression methods

A plot of the concurrent site dataset vs. the concurrent reference dataset is needed, from which to obtain a best fit that most accurately describes the relationship between both datasets. This can be done in an all-sector fashion, but it is recommended to correct sector-wise and ultimately recombine the sector-wise corrections into an all-sector LTC. When sectorising both concurrent short-term datasets, it is customary to use the direction of the short-term reference, i.e. to act as if the short-term site's direction were the same as the concurrent reference one. This is done for practical reasons, since for most methods direction is not long-term corrected and thus the only available long-term direction is the reference one.

Ordinary Least Square Method (OLS)
This method assumes that there is a linear relationship between both concurrent time series. The aim is to calculate the intercept and slope coefficients that minimise the sum of the squared residuals, $\sum_{i=1}^{n} \epsilon_i^2$, in the y-axis direction, where $\epsilon_i = \hat{y}_i - y_i$, i.e. the value predicted from the reference minus the measured value. The regression line can be forced to go through the origin.
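A minimal sketch of the OLS fit, with and without forcing the line through the origin, is given below. The function name and the flag are illustrative; x stands for the concurrent reference wind speeds and y for the concurrent site wind speeds.

```python
def fit_ols(x, y, through_origin=False):
    """Return (slope, intercept) minimising the squared residuals in the y direction."""
    n = len(x)
    if through_origin:
        # With the intercept fixed at zero, the slope is sum(x*y) / sum(x^2).
        slope = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
        return slope, 0.0
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x
```

The long-term correction is then `slope * v + intercept` for every long-term reference wind speed `v`.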

Total Least Square Method (TLS)
This is equivalent to the previous method, but the residuals are calculated as the difference between the predicted and the measured values in the direction perpendicular to the regression line, instead of in the vertical one. The equations were taken from the manual of the commercial software package WindFarmer®, for its PCA Method (WindPRO 2.6 Manual, 2008). Again, the intercept can be forced to be zero.

nth-degree Polynomial Regression Method (PRn)
In practice, an nth-degree polynomial can be chosen to fit a data cloud. For a scatter plot with a non-linear shape, it might be a reasonable approach to fit a higher-degree polynomial and try to cover the data cloud more accurately. In this work, the PRn method was applied throughout by means of a third-order polynomial, henceforth referred to as PR3.

Better results could be expected from this methodology but, as pointed out by Rebbeck (1996), none of the non-linear models he investigated (higher-order polynomials, cubic splines and complex surface fitting) performed much better than a linear regression.

Variance Ratio Method (VAR)
This method was proposed by Rogers et al. (2005) as a way to force the overall variance of the LTC time series to be equal to the overall variance of the observed time series, i.e. $\sigma(\hat{y}) = \sigma(y)$. This is done by forcing the slope parameter to be $\sigma(y)/\sigma(x)$. It thereby avoids the problem of the variance of the predicted wind speed about the mean being smaller than the variance of the observed wind speeds by a factor equal to the square of the correlation coefficient of the regression fit (Rogers et al., 2005).
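A minimal sketch of the variance ratio fit follows. The slope is the ratio of standard deviations, as described above; the intercept is chosen here so that the predicted mean equals the observed mean, which is an assumption of this sketch since the intercept is not spelled out in the text. The function name is illustrative.

```python
import math


def fit_variance_ratio(x, y):
    """Return (slope, intercept) with slope = sigma(y) / sigma(x)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    slope = sd_y / sd_x
    # Assumed intercept: makes the mean of the prediction equal to mean(y).
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

By construction the predicted series `slope * x + intercept` has the same standard deviation as y, whatever the correlation between x and y.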


Figure 5: Different regression trend lines for concurrent short-term site and short-term reference wind speeds. Each trend line corresponds to a different fitting method. Axes: reference wind speed [m/s] (horizontal) and site wind speed [m/s] (vertical); legend: w.s., OLS, OLS f.t.o, TLS, PR3, VAR.

3.2 Non-regression methods

These methods first sectorise both the concurrent short-term site and short-term reference time series. Parameters such as A, k and wind power density P are then calculated for each sector, so that the correction factors can be applied sector-wise to these parameters. The resulting LTC is therefore not a time series, but a collection of sector-wise LTC parameters, from here on denoted A, k and P.

Mortimer Method (MOR)
This method was created by Alan A. Mortimer, see Mortimer (1994). Both the concurrent site and the reference time series are first binned with respect to the reference speed and direction: 1 m/s and 15°, for example. Secondly, a matrix $r_{ij}$ is created, where each element ij contains the mean of the quotient of concurrent site and reference wind speeds, i.e. the mean of the vector $v_{sst}/v_{rlt}$. An analogous matrix $s_{ij}$ must also be built, to contain the standard deviation of the vector $v_{sst}/v_{rlt}$. (The subscripts stand for, respectively: site long-term (slt), site short-term (sst), reference short-term (rst) and reference long-term (rlt).)

$s_{ij}$ is used to create a triangularly distributed pseudorandom number $e_{ij}$ at each speed/direction bin ij, so that the final governing equation can be applied:

$$y_{ij} = (r_{ij} + e_{ij})\,x_{ij}, \tag{17}$$

where y and x are the binned long-term corrected wind speed and the binned long-term reference input, respectively.
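The binning and the governing equation y = (r + e) x can be sketched as follows, with the ratios computed on the concurrent period as the prose describes. How the standard deviation in each bin is turned into a triangular random number is not detailed in the text; this sketch assumes a symmetric triangular distribution centred on zero whose standard deviation equals the one stored for the bin (half-width = sqrt(6) times that value). Function names and the handling of empty bins are also assumptions.

```python
import math
import random
from collections import defaultdict


def bin_index(speed, direction, speed_width=1.0, dir_width=15.0):
    """Bin by reference speed (1 m/s) and reference direction (15 degrees)."""
    return int(speed // speed_width), int((direction % 360.0) // dir_width)


def fit_mortimer(site_speed, ref_speed, ref_dir):
    """Mean (r) and standard deviation (s) of site/reference ratios per bin."""
    ratios = defaultdict(list)
    for vs, vr, d in zip(site_speed, ref_speed, ref_dir):
        if vr > 0.0:  # skip calms to avoid dividing by zero
            ratios[bin_index(vr, d)].append(vs / vr)
    r, s = {}, {}
    for key, vals in ratios.items():
        mean = sum(vals) / len(vals)
        r[key] = mean
        s[key] = math.sqrt(sum((v - mean) ** 2 for v in vals) / len(vals))
    return r, s


def apply_mortimer(ref_speed, ref_dir, r, s, rng=random):
    """Apply y = (r + e) * x to a long-term reference series."""
    out = []
    for vr, d in zip(ref_speed, ref_dir):
        key = bin_index(vr, d)
        if key not in r:
            out.append(None)  # bin never populated in the concurrent period
            continue
        half_width = math.sqrt(6.0) * s[key]  # assumed link between s and e
        e = rng.triangular(-half_width, half_width, 0.0) if half_width > 0.0 else 0.0
        out.append((r[key] + e) * vr)
    return out
```

Passing `rng=random.Random(0)` makes the random perturbation reproducible between runs.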

Knut & Harstveit Method (KH)
The KH method was developed by Knut Harstveit and is used at the Norwegian wind assessment company Kjeller Vindteknikk, see Klinkert (2012). A matrix of short-term site observed wind speeds, $O_{ij}$, is constructed, where i and j are direction sectors in the concurrent reference and site datasets, respectively. The element ij of the matrix contains all short-term wind speed values at the site that fall into the bin ij, i.e. those wind speeds that belong to site direction bin j but occur when the concurrent short-term reference direction belongs to bin i. From this matrix, a population matrix $N_{ij}$ is derived; each element is simply the number of wind speeds found in element ij of $O_{ij}$.

A third matrix is also derived from $O_{ij}$, containing the mean of the observed short-term site wind speeds in each bin ij. This matrix is expressed as $\overline{O}_{ij}$.

A fourth matrix is computed as a probability matrix $P_{ij}$ derived from $N_{ij}$. $P_{ij}$ is obtained simply by dividing each element ij of $N_{ij}$ by the sum of column j, i.e. it is the probability of directions observed at the site occurring at the same time as reference directions. Finally, a vector $Q_i$ is calculated, with as many elements as direction bins have been chosen. Each element contains the quotient of long-term reference and short-term (concurrent) reference wind speeds, each sectorised with its own direction. The governing equation is

$$v^{j}_{slt} = \sum_{i=1}^{12} \overline{O}_{ij}\,P_{ij}\,Q_i, \tag{18}$$

where $v^{j}_{slt}$ is the LTC average wind speed calculated for bin j.
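The matrix bookkeeping of the KH method can be sketched as below. The sector convention (12 equal sectors starting at 0°), the function names and the choice of letting the caller supply the vector Q (interpreted here as a ratio of sector-wise mean reference speeds) are assumptions made for illustration.

```python
def sector(direction, n_sectors=12):
    """Index of the direction sector (equal sectors starting at 0 degrees)."""
    width = 360.0 / n_sectors
    return int((direction % 360.0) // width)


def kh_long_term_means(site_speed, site_dir, ref_dir, q, n_sectors=12):
    """LTC mean wind speed per site sector j: sum over i of mean(O_ij) * P_ij * Q_i."""
    sums = [[0.0] * n_sectors for _ in range(n_sectors)]
    counts = [[0] * n_sectors for _ in range(n_sectors)]  # population matrix N_ij
    for v, d_site, d_ref in zip(site_speed, site_dir, ref_dir):
        i = sector(d_ref, n_sectors)   # reference sector
        j = sector(d_site, n_sectors)  # site sector
        sums[i][j] += v
        counts[i][j] += 1

    result = []
    for j in range(n_sectors):
        column_total = sum(counts[i][j] for i in range(n_sectors))
        if column_total == 0:
            result.append(None)  # site sector j never observed
            continue
        total = 0.0
        for i in range(n_sectors):
            if counts[i][j]:
                mean_o = sums[i][j] / counts[i][j]  # mean site speed in bin ij
                p = counts[i][j] / column_total     # P_ij: N_ij over sum of column j
                total += mean_o * p * q[i]
        result.append(total)
    return result
```

With all elements of Q equal to 1, the result reduces to the observed short-term mean wind speed in each site sector, which is a convenient sanity check.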

Tallhaug and Nygaard Method (TN)
This method is explained in Tallhaug and Nygaard (1993). It follows the relation

$$v^{i}_{slt} = v^{i}_{sst} + R^{i}\,\frac{\sigma^{i}_{sst}}{\sigma^{i}_{rst}}\,\left(v^{i}_{rlt} - v^{i}_{rst}\right),$$

which gives the site long-term mean wind speed, sectorised with respect to the reference wind direction. For each sector, the Pearson coefficient R must be calculated, as well as the standard deviation of both concurrent, sector-wise datasets. Finally, this predicted long-term mean wind speed must be translated to the site wind direction by means of

$$v^{j}_{slt} = \sum_{i=1}^{n} v^{i}_{slt}\,\frac{p_{ji}\,p_i}{p_j}, \tag{19}$$

where $p_{ji}$ is a matrix containing the probability of site sector j occurring at the same time as reference sector i, while $p_j$ and $p_i$ are the individual probabilities of sector j occurring at the site and of sector i occurring at the reference, respectively.
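The two steps of the TN method can be sketched as follows, under stated assumptions: the sector-wise relation is taken in the form v_slt = v_sst + R (σ_sst/σ_rst)(v_rlt - v_rst), with both standard deviations computed on the concurrent data, and p_ji is read as the conditional probability of site sector j given reference sector i. Function names are illustrative and the concurrent series are assumed not to be constant.

```python
import math


def tn_sector_mean(site_st, ref_st, ref_lt_mean):
    """Long-term site mean for one reference sector (paired concurrent lists)."""
    n = len(site_st)
    mean_s = sum(site_st) / n
    mean_r = sum(ref_st) / n
    sd_s = math.sqrt(sum((v - mean_s) ** 2 for v in site_st) / n)
    sd_r = math.sqrt(sum((v - mean_r) ** 2 for v in ref_st) / n)
    cov = sum((s - mean_s) * (r - mean_r) for s, r in zip(site_st, ref_st)) / n
    pearson = cov / (sd_s * sd_r)  # Pearson correlation coefficient R
    return mean_s + pearson * (sd_s / sd_r) * (ref_lt_mean - mean_r)


def to_site_sectors(v_ref_sector, p_ji, p_i, p_j):
    """Translate reference-sector means to site sectors.

    p_ji[j][i] is taken as P(site sector j | reference sector i) (assumption),
    p_i[i] is the reference sector probability, p_j[j] the site sector probability.
    """
    n = len(v_ref_sector)
    return [
        sum(v_ref_sector[i] * p_ji[j][i] * p_i[i] / p_j[j] for i in range(n))
        for j in range(len(p_j))
    ]
```

Under this reading, the weight p_ji p_i / p_j is the probability of reference sector i given site sector j, so the weights for each site sector sum to one.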


Woods & Watson (WW)
This method is explained in Woods and Watson (1997). Two matrices $W_{ij}$ and $Z_{ij}$ are created. The first one contains the conditional probability of the wind blowing in sector j of the site given that it blows in reference sector i. The second matrix represents the inverse case. Both are built such that $\sum_{j=1}^{n} W_{ij} = 1$ and $\sum_{i=1}^{n} Z_{ij} = 1$.

To calculate the long-term corrected wind speed at the site, the authors proposed two options. In this thesis only the second option is implemented, since, according to the authors, it is the choice which yields the best results when the correlation between the concurrent data sets is poor (and, as will be seen, the correlation is only moderate for the site):

$$v^{j}_{slt} = m_j \left(\sum_{i=1}^{n} Z_{ij}\,v^{i}_{rlt}\right) + c_j. \tag{20}$$

Weibull Method (WBL)
A very simple method found, among others, in the WindPRO® commercial software package (WindPRO 2.6 Manual, 2008). It needs both concurrent short-term site and short-term reference time series to be sectorised with respect to their own direction values. The LTC site value is defined as

$$\psi^{j}_{slt} = \frac{\psi^{i}_{sst}}{\psi^{j}_{rst}}\,\psi^{j}_{rlt}.$$

The superscript j in $\psi^{j}_{slt}$ indicates that it is already sectorised for the site direction j. $\psi$ represents any parameter calculated for a specific bin, including frequency. This is the only method to yield an LTC frequency f, as implemented in this work.
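The scaling used by the Weibull method can be sketched in a few lines: each sector-wise parameter of the short-term site data is multiplied by the ratio of long-term to short-term reference values. The data layout (dictionaries keyed by sector, then by parameter name), the function name and the use of one common sector label for all three inputs are simplifying assumptions of this sketch.

```python
def weibull_scale_correct(site_st, ref_st, ref_lt):
    """Scale each sector-wise parameter: site_st / ref_st * ref_lt.

    Each argument maps a sector label to a dict of parameters,
    e.g. {"A": 8.0, "k": 2.0, "f": 0.1}.
    """
    corrected = {}
    for sec, params in site_st.items():
        corrected[sec] = {
            name: value / ref_st[sec][name] * ref_lt[sec][name]
            for name, value in params.items()
        }
    return corrected


# Hypothetical values for a single sector.
site_st = {0: {"A": 8.0, "k": 2.0}}
ref_st = {0: {"A": 10.0, "k": 2.5}}
ref_lt = {0: {"A": 11.0, "k": 2.4}}
ltc = weibull_scale_correct(site_st, ref_st, ref_lt)
```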

| Method | Regression | Non-regression | Corrects direction | Developer |
|---|---|---|---|---|
| OLS | X | | Yes if applied to u and v | GL-GH, WindFarmer |
| TLS | X | | Yes if applied to u and v | GL-GH, WindFarmer |
| PR3 | X | | Yes if applied to u and v | |
| VAR | X | | Yes if applied to u and v | Rogers, Rogers & Manwell |
| MOR | | X | No | Alan Mortimer |
| KH | | X | No | Knut Harstveit |
| TN | | X | No | Tallhaug & Nygaard |
| WW | | X | No | Woods & Watson |
| WBL | | X | Yes | EMD, WindPRO |

Table 2: Summary of the different LTC methods used in this work.


4 Site description

The measuring station is located at DTU Wind Energy's test center for large wind turbines at Høvsøre, in Western Denmark.

Figure 6: Bing Maps® image of Høvsøre test facility and its surroundings.

Figure 6 shows the Høvsøre site, marked in red. It is delimited to the South by a U-shaped road and to the North by a creek. It is a very flat area made up of farmland and grassland, and there are two significant bodies of water: the North Sea to the West and the Bøvling Fjord to the South. The farmland is cut mainly by the roads that bound it. Along the coastline to the West, protecting the 181-Road from the sea winds, there is a 5-m-high embankment.

Figure 7 shows a closer view of Høvsøre. The wind turbines lie in a North-South array, each with its corresponding measuring mast lying roughly 250 m to the West. The meteorological mast (the station), from which Høvsøre's observations are recorded, is roughly 200 m South of the southernmost turbine.

Figure 7: Bing Maps® image of Høvsøre test facility and its surroundings.

The mast's data feed can be followed in real time on DTU Wind Energy's website. These measurements are mainly wind speed and direction at different heights, but also temperature and atmospheric pressure. In this thesis, however, only the wind speed and direction measured by the meteorological mast (marked in blue in figure 7) were used. The exact coordinates of the station are 56°26'26"N, 8°9'3"E, and measurements were recorded by a Risø P2546a cup anemometer and a vane placed 100 m above ground, for the period 01/01/2005 to 31/12/2012. Both devices have a sampling frequency of 10 Hz, but the data used in this thesis are 10-min average wind speed and direction. The choice of 100 m height is suitable for large wind turbines.


5 WRF

The Weather Research and Forecasting (WRF) model is a numerical weather prediction (NWP) model widely used in research and industry, with up to 6000 users (Skamarock et al., 2008). It is a code-based tool, accommodated in the so-called WRF Software Framework (WSF), which holds the different modules that feed into the calculations. Thus, modules such as the Physics Package and WRF-Chem serve as input to the Dynamic Solvers (Advanced Research WRF, or ARW, and Nonhydrostatic Mesoscale Model, or NMM) while performing the calculations.

Figure 8: WRF software infrastructure, Skamarock et al. (2008).

WRF is highly user-configurable. As an example, it can be set to use simplified physics equations when calculating microphysics, or to make use of its full capability (sophisticated mixed-phase physics). It can treat atmospheric radiation either as a mix of long and short waves or as a simple shortwave system. Surface physics can be accounted for via a simple thermal model or via a more complete model comprising all possibilities (vegetation, moisture, snow, ice, etc.). However, it is this wide range of possibilities that makes WRF's output highly dependent on the user's choices and model tuning (Hahmann et al., 2013).

WRF output simulations were used in this work as reference data. The simulations were run at DTU Wind Energy, Risø Campus, by nesting the model in a global atmospheric reanalysis, i.e. the initialisation of WRF's mesoscale simulations, as well as the area's boundary conditions, were taken from a global atmospheric reanalysis. For this work, version 3.2.1 of WRF was configured to use the ARW solver on an outer domain with a horizontal grid spacing of 15 km × 15 km and on a nested domain with a grid spacing of 5 km × 5 km.

Figure 9 shows the real boundaries of a part of northwestern Jutland (including Høvsøre), overlapped by WRF's nested grid land mask. This is the configuration used in order to obtain the simulated wind speed and direction, which were used as reference data in the thesis.

Figure 9: Representation of the land mass and the ocean as seen by WRF's nested grid. The pink x marks the location of the meteorological mast. The two red dots located East and West of the mast mark the two closest v-component grid output points. The green points North and South mark the u-component grid output points (it is a staggered grid).

The four output points (red and green dots in figure 9) belong to a horizontal slice of the 3D grid, thus representing only the pressure level roughly equivalent to 100 m in height. The values of the zonal and the meridional wind speed at the four dots were then horizontally interpolated so as to obtain a single value of WRF-derived horizontal wind speed at the middle point (white dot in the figure). This final output point is, as mentioned, at a height of roughly 100 m.

As for the 5 km × 5 km horizontal grid resolution, figure 9 shows that it causes a large difference between the modelled and the real horizontal boundaries. Indeed, WRF's land mask mismatch implies that winds modelled as northeasterly at Høvsøre blow over water before reaching the mast, when in reality northeasterly winds blow over land. This change in roughness length between the real (observed) and the modelled, WRF-derived winds is one of the reasons behind deviations between both at coastal sites such as Høvsøre.

On the other hand, the coarseness of WRF's grid does not present a problem at Høvsøre in terms of unseen obstacles, since, as seen in the previous section, the site is mainly flat terrain. Also, no significant new buildings were erected that might be missing from WRF's topography input. All this makes Høvsøre a unique site in terms of observations and reference data.

The choice of a WRF-derived reference time series should be validated by repeating the experiment with wind speeds derived from another NWP model, or even from long-term observations from a nearby mast (e.g. from the two neighbouring wind farms seen in figure 6).


6 General pre-processing and data treatment

The time series coming from Høvsøre's meteorological station comprises 10-min average wind speed and direction observations at 100 m. As mentioned in section 4, the period used in this thesis goes from 01/01/2005 to 31/12/2012.

The reference time series used comes from the WRF mesoscale model (section 5), which outputs instantaneous hourly values of the horizontal wind velocity components u and v. These were transformed to speed and direction for the period 01/01/1999 to 31/12/2012.
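The transformation between velocity components and speed and direction can be sketched as follows. This is a minimal illustration, not the code used in the thesis: it assumes the usual meteorological convention (direction is where the wind blows from, in degrees clockwise from North), and the function names are hypothetical.

```python
import math


def uv_to_speed_dir(u, v):
    """Convert zonal (u) and meridional (v) components [m/s] into
    (speed, direction). Direction is where the wind blows FROM, in degrees
    clockwise from North (assumed meteorological convention)."""
    speed = math.hypot(u, v)
    direction = math.degrees(math.atan2(-u, -v)) % 360.0
    return speed, direction


def speed_dir_to_uv(speed, direction):
    """Inverse conversion under the same assumed convention."""
    rad = math.radians(direction)
    return -speed * math.sin(rad), -speed * math.cos(rad)
```

With this convention a purely westerly wind (u > 0, v = 0) has a direction of 270°.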

The maximum possible concurrent period for observations and reference data is therefore 01/01/2005 to 31/12/2012 (the duration of the observations).

6.1 General pre-processing

Both raw observed time series (speed and direction) had to be pre-processed before they could be put to use. This was done in 5 steps, of which only the last applies to the WRF dataset:

1. Remove extra time stamps. In the observed time series, extra values were sometimes found in between two output time stamps, e.g. an extra output value at 05 between 00 and 10 min. In this case, if valid values of wind speed were found at both the 00 and 10 time stamps for that hour, the value at 05 was removed. Otherwise, the extra time stamp was shifted in place of the missing one (see next point).

2. Shift time stamps. Values corresponding to time stamps which were not 00, 10, 20, 30, 40 or 50 min were shifted, if needed. As an example, a value at minute 09 was shifted to minute 10 if the wind speed value at minute 10 was missing, or time stamp 04 was shifted to 00 if 00 did not previously exist (in both cases, the time series would show values only at 00 and 10).

3. Reduce the length of the observed time series. Since the WRF time series comprises instantaneous, hourly wind speed and direction values, the observed time series contains 6 times more values for any concurrent period. However, in order to see how they correlate to each other, both time series must have the same number of data points. This means that the 6 observed speed and direction values in each hour must be substituted by just one. The single value chosen was the 10-min average corresponding to the time frame 0–10 min. Averaging over the 6 values in each hour, in order to obtain a single value per hour, would have meant a greater loss of information.

4. Choose a fixing scheme to treat invalid data. Invalid recordings were not seen to come necessarily in pairs, since flagged time stamps were found that contained either (i) a flawed record only of speed, (ii) a flawed record only of direction, or (iii) flawed speed and direction. Two different paths can be taken when any of these cases is encountered:

(a) Time stamps containing invalid data are removed.

(b) Time stamps containing invalid data are filled with some estimated value.

These two possibilities are investigated in subsection 6.2 below.

Note: besides non-numeric wind speed and direction outputs, invalid wind speeds also include (i) unrealistically high readings (usually taken as wind speeds above three times the overall standard deviation), and (ii) time windows with a constant wind speed or direction. However, neither of these two cases was seen to occur in the observed time series.

5. As for the WRF dataset, the only pre-processing it required was the interpolation to a height of 100 m from the three different isobaric surface levels at which the model outputs its computations: roughly 14, 70 and 125 m. This interpolation was carried out at each time stamp (at each hour) in order to obtain hourly u and v simulated velocity components at 100 m.
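Steps 1 to 3 above can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, records are (time stamp, wind speed) pairs, and the order in which neighbouring slots are tried (closest first, earlier slot on a tie) is an assumption, since the text does not specify it.

```python
from datetime import datetime, timedelta


def snap_to_grid(records, step_min=10):
    """records: list of (datetime, speed) pairs.
    Returns {grid_time: speed}. On-grid records are kept as they are. An
    off-grid record is moved to a neighbouring empty grid slot (closest
    first); if both neighbouring slots are occupied it is dropped."""
    grid, off_grid = {}, []
    for t, value in records:
        if t.minute % step_min == 0 and t.second == 0:
            grid[t] = value
        else:
            off_grid.append((t, value))
    step = timedelta(minutes=step_min)
    for t, value in off_grid:
        base = t.replace(minute=t.minute - t.minute % step_min,
                         second=0, microsecond=0)
        # Assumed rule: try the closer slot first, then the other one.
        for slot in sorted((base, base + step), key=lambda s: abs(s - t)):
            if slot not in grid:
                grid[slot] = value
                break
    return grid


def hourly_subset(grid):
    """Keep only the 10-min average stamped at minute 00 of each hour."""
    return {t: v for t, v in grid.items() if t.minute == 0}
```

For instance, a value at minute 05 is dropped when both 00 and 10 are present, whereas a value at 01:04 fills the 01:00 slot if that slot is empty.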

6.2 Data treatment

After shifting and reducing the observed time series, all remaining invalid data had to be treated. In the case of Høvsøre's hourly observed time series, invalid data account for 2% of all the values. As mentioned in point 4 above, two paths were followed when a flagged wind speed or direction value was encountered: the time stamp itself was either removed or filled with some numeric data:

1. Time stamps containing invalid data are removed.

Time stamps containing either an invalid speed or direction value were removed. The resulting wind speed time series will hereafter be denoted the chopped time series.


2. Time stamps containing invalid data are filled with some estimated value.

2.1. Months were treated separately when following this scheme, as in Salmon and Taylor (2013), i.e. missing wind speeds were substituted by the monthly average. However, since missing wind speeds at Høvsøre are usually grouped in chunks of 100 or more consecutive invalid values, the resulting time series, after such substitution, showed unphysical behaviour. This could be seen in a simple WRF-derived vs. observed wind speed scatter plot as odd horizontal alignments of points.

The resulting wind speed time series will hereafter be denoted the monthly-average time series.

2.2. A Matlab® function named inpaint_nans.m was chosen instead. This function interpolates between the values at the beginning and the end of a missing chunk of data in any time series. It also takes into account the general pattern before and after the missing values, in order to best simulate the pattern of the generated data.

In the case of Høvsøre's hourly observed dataset, this function was applied component-wise, i.e. separately to the u and v datasets. The reason for doing so is that the function did not work well when interpolating direction values (especially around 0°), so speed and direction were converted into components before using the function. This conversion into components, however, requires both speed and direction values to be valid. Therefore, it was enough for a time stamp to contain either an invalid speed or an invalid direction to mark it as flagged. The flagged time stamp was then filled by applying the inpaint_nans.m function to the u and v datasets. The fixed time series were ultimately combined back into speed and direction.

The resulting wind speed time series will hereafter be denoted the painted time series.
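The component-wise filling can be illustrated with the sketch below. It is not inpaint_nans.m: plain linear interpolation is swapped in as a simple stand-in, invalid values are represented by None, the meteorological direction convention is assumed, and all names are hypothetical.

```python
import math


def fill_gaps_linear(series):
    """Linearly interpolate over runs of None. Gaps at either end are
    filled with the nearest valid value. Needs at least one valid value."""
    out = list(series)
    valid = [i for i, x in enumerate(out) if x is not None]
    if not valid:
        raise ValueError("series has no valid values")
    for i in range(valid[0]):                      # leading gap
        out[i] = out[valid[0]]
    for i in range(valid[-1] + 1, len(out)):       # trailing gap
        out[i] = out[valid[-1]]
    for a, b in zip(valid, valid[1:]):             # interior gaps
        for i in range(a + 1, b):
            w = (i - a) / (b - a)
            out[i] = (1 - w) * out[a] + w * out[b]
    return out


def paint(speed, direction):
    """Fill gaps on u and v rather than on speed and direction. A time
    stamp is flagged if speed OR direction is invalid (None)."""
    u, v = [], []
    for s, d in zip(speed, direction):
        if s is None or d is None:
            u.append(None)
            v.append(None)
        else:
            rad = math.radians(d)
            u.append(-s * math.sin(rad))
            v.append(-s * math.cos(rad))
    u, v = fill_gaps_linear(u), fill_gaps_linear(v)
    new_speed = [math.hypot(a, b) for a, b in zip(u, v)]
    new_dir = [math.degrees(math.atan2(-a, -b)) % 360.0 for a, b in zip(u, v)]
    return new_speed, new_dir
```

Working on components avoids the wrap-around problem: a gap between 350° and 10° is filled with a direction close to 0° rather than 180°.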

6.3 The effect of fixing invalid data on the correlation

This subsection investigates the effect that fixing invalid data in Høvsøre's observed dataset has on how it correlates to the concurrent reference dataset. In order to do so, the observed time series was subjected to an increasing number of artificially injected invalid data. Since Høvsøre's hourly observed time series already contained invalid data, the starting dataset, which had to be free of invalid data, was in reality the painted time series. This dataset was then iteratively corrupted.

At each iteration, each of the fixing schemes described above was applied to the corrupted dataset, after which the fixed and the concurrent reference datasets were correlated, and the correlation coefficient r² calculated.

In the case where invalid data were substituted by either a monthly average or an interpolation, the reference dataset remained untouched and both kept as many data points. On the other hand, in the case of time stamp removal, the corrupted dataset and the reference dataset lost the same (concurrent) values in order to make correlation possible.

The artificial injection of invalid data into Høvsøre's hourly observed time series was done in two ways:

1. Invalid values (100 individual, randomly scattered) were added to the time series at each iteration. See figure 10.

2. Invalid chunks (each comprising 100 consecutive values) were randomly added to the time series at each iteration. See figure 11.

As mentioned, for both cases, after injecting invalid data at each iteration, the different fixing schemes were applied.
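One iteration of this experiment, for the time stamp removal scheme, might look like the sketch below. The function names are hypothetical, invalid values are represented by None, and, as a simplifying assumption, injected runs are allowed to overlap, so the number of invalid values can be slightly lower than requested.

```python
import random


def pearson_r2(x, y):
    """Squared Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def inject_invalid(series, n_values, chunk=1, rng=random):
    """Return a copy with up to n_values entries set to None, in runs of
    `chunk` consecutive values at random positions (runs may overlap)."""
    out = list(series)
    for _ in range(n_values // chunk):
        start = rng.randrange(0, len(out) - chunk + 1)
        for i in range(start, start + chunk):
            out[i] = None
    return out


def r2_chopped(obs, ref):
    """Drop the time stamps where obs is invalid from BOTH series, then
    correlate what is left."""
    pairs = [(o, r) for o, r in zip(obs, ref) if o is not None]
    return pearson_r2([p[0] for p in pairs], [p[1] for p in pairs])
```

Calling inject_invalid with chunk=1 corresponds to the first way of injecting invalid data and chunk=100 to the second.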

Figure 10: Correlation coefficient between Høvsøre's observed time series, after fixing its invalid wind speed values, and the concurrent reference time series, as a function of the number of invalid data values. The invalid values were randomly injected, 100 values each time. (Plot of r² against invalid data, in units of 100 individual values, for the chopped, painted and monthly-average time series.)


Figure 11: Correlation coefficient between Høvsøre's observed time series, after fixing its invalid wind speed values, and the concurrent reference time series, as a function of the number of invalid data values. The invalid values were randomly injected, blocks of 100 consecutive values each time. (Plot of r² against invalid data, in packs of 100 consecutive values, for the chopped, painted and monthly-average time series.)

Note that the chopped time series loses, at each iteration, as many data points as invalid values were added (and at exactly the same positions). Thus, the fact that this is at the same time the least representative dataset of Høvsøre and the most stable in terms of correlation, as seen from figures 10 and 11, means that r² is not a reliable parameter for quantifying the amount of information lost to invalid values. Other approaches, such as comparing parameters (A, k and P) calculated from the corrupted (and subsequently fixed) observed dataset with parameters from the reference dataset, may be more accurate.

Another distinctive feature of figures 10 and 11 is the large difference, for the painted time series, between the case where 100 individual invalid values are randomly added each time (figure 10) and the case where randomly scattered packs of 100 consecutive invalid values are added (figure 11). The former case allows the interpolating function to keep both time series similar, whereas in the latter case the wider gaps make it more difficult for the fixing scheme to be successful. This is backed up by figure 11: from value 520 onwards (along the x-axis), there is no more space to assign whole packs of 100 invalid values, and so these are, for an increasing number of invalid data, injected individually (as in figure 10). Indeed, from this point on, the decay of r² is much less acute.

The correlation coefficient r² is therefore seen to be ineffective at determining how much representativeness has been lost to invalid data. In the case where the number of data points decreases in the two concurrent time series (red curve, figures 10 and 11), both time series are still representative of each other, so r² does not decrease. It also does not decrease when invalid data are replaced by similar (interpolated) data, as seen from the blue curve in figure 10. However, r² decreases considerably when the gaps or holes are replaced by surrogate data which are very different from the local pattern around the gap (blue curve, figures 10 and 11).

For the remaining calculations in this thesis, the observed time series used will be the one resulting from fixing Høvsøre's real invalid data with the Matlab® function; this scheme keeps the right number of data points and does not show the unphysical patterns seen in the monthly-average time series.


7 Sensitivity analysis on the correlation

It is interesting to investigate the correlation between reference and observations as a function of three different situations: (i) time-shifting the two concurrent time series, (ii) rotating the reference wind direction and (iii) widening the averaging time-range around minute 00 in the observed 10-min average dataset. For this section, the correlation coefficient r² was calculated separately for u_ref and u_obs (in blue in the figures below), and separately for v_ref and v_obs (in red). This was done for the period 01/01/2005 to 31/12/2012 using WRF simulations as reference data.

7.1 Sensitivity to time-shifting the concurrent time series

The maximum correlation is obtained at a 1-hour shift between WRF and measurements, as seen in figure 12. This was expected, since the reference data time stamps were not initially time-shifted (to account for the 1-hour difference between the reference and observation time zones).
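A lag scan of this kind can be sketched as follows, assuming gap-free hourly series of equal length. The function names and the sign convention of the lag (obs[t] is paired with ref[t + lag]) are assumptions made for the illustration.

```python
def pearson_r2(x, y):
    """Squared Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def r2_vs_lag(ref, obs, max_lag):
    """Return {lag: r2}, pairing obs[t] with ref[t + lag] for every lag
    from -max_lag to +max_lag (in time steps, i.e. hours here)."""
    n = len(obs)
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            x, y = ref[lag:], obs[:n - lag]
        else:
            x, y = ref[:n + lag], obs[-lag:]
        result[lag] = pearson_r2(x, y)
    return result
```

The lag with the highest r² is then max(result, key=result.get); the scan would be run once for the u components and once for the v components.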

Figure 12: Effect of a time shift between the concurrent observed and reference wind speed time series on the correlation between them. (Plot of r² against time shift in hours, from −20 to 20, for the u-component and the v-component.)

7.2 Sensitivity to rotating the reference wind direction

A rotation of the reference wind direction was carried out in order to detect any possible misalignment between reference and observed wind direction. The procedure was to add or subtract some degrees to the reference direction time series, and then calculate new misaligned u_ref and v_ref velocity components to correlate with u_obs and v_obs, which stayed the same. The result is shown in figure 13.
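The rotation procedure can be sketched as follows. The function names are hypothetical and the meteorological direction convention is an assumption; the observed components are left untouched, as in the text.

```python
import math


def pearson_r2(x, y):
    """Squared Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def rotate_reference(speed, direction, offset_deg):
    """Add offset_deg to the reference direction and return the resulting
    (u, v) lists (direction = where the wind blows from, assumed)."""
    u, v = [], []
    for s, d in zip(speed, direction):
        rad = math.radians((d + offset_deg) % 360.0)
        u.append(-s * math.sin(rad))
        v.append(-s * math.cos(rad))
    return u, v


def r2_vs_rotation(ref_speed, ref_dir, u_obs, v_obs, offsets):
    """Return {offset: (r2 for u, r2 for v)} for each trial offset."""
    out = {}
    for off in offsets:
        u_ref, v_ref = rotate_reference(ref_speed, ref_dir, off)
        out[off] = (pearson_r2(u_ref, u_obs), pearson_r2(v_ref, v_obs))
    return out
```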

Figure 13: Effect of rotating the reference wind direction on the correlation between the observed and the reference wind speeds. (Plot of r² against the angle of rotation of the reference direction in degrees, from −20 to 20, for the u-component and the v-component.)

There is an offset in direction, but this shows only in the correlation coefficient between the varying u_wrf and the fixed u_obs. Indeed, while the correlation is symmetric for the v component and has a maximum value at 0°, the u component's maximum is displaced by 5°. It was assumed that the wind vane was correctly calibrated throughout the measuring period, so the offset can be attributed exclusively to a systematic error in WRF.

7.3 Sensitivity to widening the averaging time-range around minute 00 in the observed 10-min average dataset

It was explained in section 6 that the number of time stamps in Høvsøre's measured time series had to be reduced from 6 per hour to 1 per hour, in order to correlate it to the concurrent reference dataset. As seen, the procedure consisted of picking out only the 10-min average value corresponding to the 00 min time stamp. It is interesting, however, to see what happens if a broader range (always around 00 min) is used to average and obtain a single hourly value of wind speed, i.e. ±10, ±20, ±30 min, and so on, instead of just the raw value at 00 min. The results of such a procedure are shown in figure 14.
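A centred average of this kind can be sketched as follows. The function name is hypothetical, and the sketch assumes a gap-free 10-min series whose first value is stamped at minute 00; hours whose window would run past either end of the series are skipped.

```python
def hourly_centered_mean(ten_min, half_width):
    """ten_min: gap-free 10-min averages, index 0 at minute 00 of the
    first hour. Returns one value per hour: the mean of all samples within
    +/- half_width steps (of 10 min each) of the minute-00 stamp.
    half_width = 0 returns the raw minute-00 values."""
    out = []
    for centre in range(0, len(ten_min), 6):
        lo, hi = centre - half_width, centre + half_width
        if lo < 0 or hi >= len(ten_min):
            continue  # window would run off the series
        window = ten_min[lo:hi + 1]
        out.append(sum(window) / len(window))
    return out
```

Because the first or last hours may be skipped for wide windows, the reference series has to be trimmed to the same hours before the two are correlated.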


Figure 14: Effect on the correlation of a changing breadth of the averaging range around 00 min in the observed time series. (Plot of r² against the averaging range around minute 00: 00 min, ±10 min, ±20 min, ±30 min, ±40 min and ±50 min, for the u-component and the v-component.)

As seen in figure 14, the correlation increases (although very slightly) with increasing width of the averaging range. Indeed, it is easier for two concurrent averages (over a certain time window) to correlate well than for just a single point from u_obs or v_obs (i.e. at 00 minutes) to correlate well with the concurrent hourly u_wrf or v_wrf. Averaging smooths both WRF and observed time series, as will be seen in the next section, causing the correlation coefficient to increase. In this case, the difference is so small because the difference in width of the averaging range is very small as well.

After these sensitivity analyses, the behaviour of the correlation with respect to a time shift, a rotation and an averaging is known. The version of the WRF-derived wind speed that is used henceforth is the one to which a 1-hour time shift has been applied, to which no rotation has been applied, and to which no extra averaging is applied (i.e. the instantaneous value of the 10-min average wind speed directly output by the model at minute 00).


8 The wind climate at Høvsøre

From on-site hourly observations spanning the period 2005–2012, there is a clear pattern at Høvsøre: the wind comes mainly from the North Sea as a northwesterly wind, with a mean speed of 9.3 m/s at a height of 100 m. Figure 15 shows the wind speed distribution for observations and the concurrent WRF output at Høvsøre for the 8-year period.

In this section, similarities between observed and WRF time series will be investigated for different averaging periods.

8.1 The local wind speed and direction

Figure 15: All-sector histogram and Weibull distribution function at Høvsøre, 2005–2012. Observations in blue and WRF-derived wind speeds in red. (Plot of p.d.f. against wind speed [m/s]; the fitted Weibull parameters annotated in the plot are A = 10.5 m/s, k = 2.2 and A = 10.47 m/s, k = 2.17.)

Figure 15 shows that the observed and WRF-derived all-sector wind speed distributions are in good agreement. Both p.d.f. curves overlap for all wind speeds. For sector-wise and yearly representations, see figures 53 and 54 in Appendix A, which also show accordance between WRF-derived and observed wind speed distributions (except for sector 1 in figure 53, which shows the wake effect of the test center facility).

Figure 16 shows the observed and WRF-derived wind rose for the entire period 2005–2012, again for the hourly time series. There is also good agreement between site and WRF directions, with three exceptions. Firstly, northerly winds are smaller in magnitude in the observed wind rose than in the reference wind rose, most probably due to the wake of the wind turbines North of the mast (WRF does not take the effect of the turbine test center into account).

Secondly, there is a slight mismatch in the northwesterly winds, probably due to the fact that Høvsøre is a coastal site and, as mentioned in section 5, small direction misalignments between WRF and the observations (in direction sectors with a sea-land boundary) may cause large and abrupt changes in the roughness which is fed to the model, thus affecting the modelled wind speed.

Lastly, observations have slightly higher maxima in wind speed values than those predicted by the model (this is also shown in figure 15 but it is not as clear). This is due to the fact that the horizontal resolution of the mesoscale model is not fine enough to correctly predict extreme events, e.g. storms, which contribute to these wind speed maxima.

Figure 16: All-year (2005–2012) reference and observed wind rose at Høvsøre site. (Two wind rose panels at 100 m: (a) REF 2005–2012 and (b) OBS 2005–2012; frequency rings from 1% to 5%; wind speed bins of 5 m/s from 0–5 to 40–45 m/s.)


8.2 The effect of averaging on the WRF-observations correlation

For an overall impression of the observed wind speed it is also interesting to take a look at the time series itself for the entire period 2005–2012, as expressed through different averaging periods, i.e. hourly, daily, monthly and yearly average wind speed. (Note that the hourly average version of the observed time series comes from merely picking the 00 values; the reference dataset, on the other hand, already comes as hourly values, as explained in section 6.)

Figure 17: Observed wind speed at Høvsøre, 2005–2012, as expressed by different averaging periods. (Plot of wind speed [m/s] against time; curves: hourly, daily, monthly and yearly averages, and the overall mean.)

As seen from figure 17, a time series is smoothed down to different levels by successively averaging over longer periods of time; added to this, the longer the averaging period, the fewer the values comprising the time series.

More importantly, averaging both the observed and reference time series (as in figure 17) has a direct impact on the mutual correlation. Indeed, hourly-averaged WRF and observations (figure 18) correlate poorly in comparison to yearly-averaged versions of the same time series (figure 21).


Figure 18: Hourly-averaged observed and WRF-derived wind speeds at Høvsøre (first 360 hours of January 2005 depicted). (Plot of wind speed [m/s] against time [hours]; curves: WRF and observations.)

Figure 19: Daily-averaged observed and WRF-derived wind speeds at Høvsøre (first 15 days of January 2005 depicted). (Plot of wind speed [m/s] against time [days]; curves: WRF and observations.)

Figures 18 and 19 show the same time window in hours and in days, respectively. The observed 40 m/s wind speed storm spike occurring at hour 192, or day 8 (January 2005), stands out in both plots, and the effect of averaging is most noticeable in figure 19. Figure 20 below shows the same spike for month 1 (at the very left of the x-axis), where it barely reaches 15 m/s.

Figure 20: Monthly-averaged observed and WRF-derived wind speeds at Høvsøre, 2005–2012. (Plot of wind speed [m/s] against time [months]; curves: WRF and observations.)

Figure 21: Yearly-averaged observed and WRF-derived wind speeds at Høvsøre, 2005–2012. (Plot of wind speed [m/s] against time [years]; curves: WRF and observations.)

Table 3 summarises the effect that averaging the two time series has on the number of data points. The table also quantifies how well observed wind speeds are matched by WRF simulations, by calculating the mean of the absolute value of the percentage difference between all observed and simulated values (on a yearly, monthly, daily and hourly basis). This mean percentage difference, or relative error, between reference and observations is highest on an hourly basis (see figure 18), whereas yearly averaging (figure 21) shows the apparent greatest similarity between both time series.

Averaging   Mean absolute percentage      Correlation coefficient r²   Number of points
period      difference, Obs.-WRF [%]      Obs.-WRF                     in both time series
Yearly      1.07                          0.93                         8
Monthly     4.47                          0.92                         96
Daily       18.18                         0.78                         2922
Hourly      35.17                         0.64                         70128

Table 3: Mean absolute percentage difference, calculated as the mean of $100\,|(U_{obs}/U_{ref}) - 1|$. The r² coefficients between reference and observed time series and the number of data points are also displayed, as a function of the averaging period. Data taken from datasets spanning 2005–2012.
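The two quantities behind table 3 can be sketched as follows. The function names are hypothetical, and fixed-length blocks are a simplification: they suit daily averaging (24 hourly values), whereas calendar months and years have varying lengths.

```python
def block_mean(series, block):
    """Average consecutive, non-overlapping blocks of `block` values.
    A trailing incomplete block is dropped."""
    n = len(series) // block
    return [sum(series[i * block:(i + 1) * block]) / block for i in range(n)]


def mean_abs_pct_diff(obs, ref):
    """Mean of 100 * |U_obs / U_ref - 1| over all concurrent pairs
    (reference values must be non-zero)."""
    return sum(100.0 * abs(o / r - 1.0) for o, r in zip(obs, ref)) / len(obs)
```

As a toy example, observations alternating between 12 and 8 m/s against a constant reference of 10 m/s differ by 20% value by value, but by 0% once both series are averaged in pairs.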

This is, however, misleading, since it is really the yearly-averaged values of WRF that are closest to the yearly-averaged values of observations. It is therefore important to state explicitly which averaging period is being used in an investigation of this kind, especially when dealing with correlation coefficients between observations and WRF simulations.

Moreover, to describe a long-term wind climatology through its wind speed distribution, it is not necessary to capture the hour-to-hour wind speed behaviour. Of course, anyone would want the reference time series to be identical to the observed one for the concurrent period, which would imply r² = 1, but a reference time series with a lower correlation need not be worse at estimating average parameters such as A, k or P. As seen in table 3, if the concurrent WRF time series is yearly-averaged, the correlation is high, but the LTC time series comprises just 8 points and thus suffers from the biggest loss of information. This was seen already in section 6.

8.3 A description of each observed year

In section 9, where LTC methods will finally be applied, it will be important to know how similar single years of observations are to the entire observed period at Høvsøre.

A simple investigation of the similarity of years is conducted in this section. This is important because a poor LTC whose correction factors were calculated from one particular year could be attributed to that observed year's dissimilarity to the entire period 2005–2012. Therefore, single estimators were first calculated for each year of the hourly observed time series: A, k and wind power density P, as seen in figure 22.

Figure 22: All-sector yearly reference (WRF-derived) and observed A, k and P parameters, 2005–2012. (Three panels against year, 05 to 12: A [m/s], k [-] and P [W/m²]; curves: WRF and OBS.)

Figure 22 depicts the three all-sector observed parameters. The variability around the mean is shown in table 4. It was calculated as the relative difference between each year's parameter and the all-year (2005–2012) parameter, e.g. for year i, the error in the A parameter is $\Delta A_i = 100\,(A_i / A_{tot} - 1)$.

        Percentage difference [%]
Year    A       k       P
2005    1.02    0.73    2.68
2006    5.31    1.38    16.55
2007    5.09    5.40    17.72
2008    1.46    5.12    7.67
2009    4.43    2.44    17.44
2010    7.09    4.80    29.19
2011    4.72    0.29    12.86
2012    2.99    4.61    4.12

Table 4: All-sector percentage difference of yearly observed parameters with respect to the all-year (2005–2012) parameters.

Year 2010 is clearly the outlier in the case of the three estimators. P shows the biggest difference because it was calculated as in equation 14, i.e. A is cubed.
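The amplification caused by the cube can be illustrated with the sketch below. Equation 14 is not reproduced here, so the standard expression for the mean power density of a Weibull distribution, P = 0.5 ρ A³ Γ(1 + 3/k), is assumed, together with an air density of 1.225 kg/m³; the function names are hypothetical.

```python
import math


def weibull_power_density(A, k, rho=1.225):
    """Mean wind power density [W/m^2] of a Weibull wind speed
    distribution: P = 0.5 * rho * A^3 * Gamma(1 + 3/k).
    rho = 1.225 kg/m^3 is an assumed standard air density."""
    return 0.5 * rho * A ** 3 * math.gamma(1.0 + 3.0 / k)


def pct_diff(yearly, all_years):
    """Percentage difference of a yearly value with respect to the
    all-year value: 100 * (yearly / all_years - 1)."""
    return 100.0 * (yearly / all_years - 1.0)
```

Under this assumed expression and for a fixed k, a 5% higher A gives a P that is 1.05³ − 1 ≈ 15.8% higher.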

Years 2006, 2007 and 2009 also present large deviations in mean power density. Note that although the respective errors in A for 2006 and 2007 are roughly equal in magnitude but opposite in sign, the fact that they are consecutive creates a steep 2-year change in P. This can also be seen, even more acutely, in the case of years 2009 and 2010.

The number of yearly counts above certain wind speeds helps to explain the difference in year-to-year P. It can be seen, for example, why year 2010 has such a low P. See table 5.

        Observed counts above wind speed:
Year    10 m/s   15 m/s   20 m/s   25 m/s   30 m/s
2005    3528     991      170      24       5
2006    3118     765      110      6        0
2007    3792     1285     273      41       3
2008    3537     1156     234      18       0
2009    3197     677      80       5        2
2010    2978     595      65       0        0
2011    3678     1157     248      32       4
2012    3825     985      146      11       1

Table 5: All-sector observed wind speed counts above certain values for each year.
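Counts of this kind can be obtained with a few lines, sketched below for one year of hourly wind speeds. The function name is hypothetical, and treating "above" as strictly greater than the threshold is an assumption.

```python
def counts_above(speeds, thresholds=(10, 15, 20, 25, 30)):
    """Number of values strictly above each threshold [m/s]."""
    return {t: sum(1 for s in speeds if s > t) for t in thresholds}
```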

As for wind direction, one way to see which of the observed years is anomalous is by visual inspection of Høvsøre's yearly observed wind roses. From figures 23 and 24 it is clear that, regarding direction, year 2010 is also anomalous: its wind does not come mainly from the North-West, but is evenly distributed between North-West and North-East directions. Høvsøre's 8-year observed wind rose is shown in figure 16.


Figure 23: Yearly observed wind roses at Høvsøre, for height 100 m, and hourly direction time series. Years 2005–2008 displayed. (Four wind rose panels: (a) OBS 2005, (b) OBS 2006, (c) OBS 2007 and (d) OBS 2008; frequency rings from 1% to 5%; wind speed bins of 5 m/s from 0–5 to 40–45 m/s.)


Figure 24: Yearly observed wind roses at Høvsøre, for height 100 m, and hourly direction time series. Years 2009–2012 displayed. (Four wind rose panels: (a) OBS 2009, (b) OBS 2010, (c) OBS 2011 and (d) OBS 2012; frequency rings from 1% to 5%; wind speed bins of 5 m/s from 0–5 to 40–45 m/s.)

8.4 Similarity of concurrent WRF-derived and observed parameters

Regarding the LTCs that will be calculated in section 9, it is also important to determine how similar reference and observational estimators are on a yearly basis. This subsection is therefore a check of WRF's ability to describe the observed wind climate on a year-to-year basis. This is important because, if an LTC is biased, it might happen that the concurrent period its correction factors arose from shows a low similarity between the observed and reference datasets.

Firstly, from a correlation point of view, figures 25 and 26 show the value-to-value relationship of reference and observed speed and direction, separately for each year.

Figure 25: All-sector hourly observed vs. reference wind speed, on a yearly basis. (Eight scatter panels of U_obs [m/s] against U_ref [m/s], one per year, with r² = 0.66 (2005), 0.60 (2006), 0.66 (2007), 0.70 (2008), 0.57 (2009), 0.54 (2010), 0.69 (2011) and 0.62 (2012).)

[Scatter plots, one panel per year: observed wind direction $d_{obs}$ [°] against reference wind direction $d_{ref}$ [°]. Yearly $r^2$: 2005: 0.84; 2006: 0.81; 2007: 0.87; 2008: 0.85; 2009: 0.85; 2010: 0.85; 2011: 0.82; 2012: 0.82.]

Figure 26: All-sector hourly observed vs. reference wind direction, on a yearly basis.


Note that $r^2$, in the case of direction (figure 26), has been calculated taking only the blue values into account, i.e. those for which $|d_{obs} - d_{ref}| \leq 200^\circ$. This filters out, on average, 4% of all values each year, and has been applied in order to avoid the two corner clouds (pairs that straddle the 0°/360° wrap-around), which would otherwise unfairly bias the correlation. As seen in both figures, the yearly direction correlation is, when calculated this way, higher than that of the wind speed (shown in figure 25).
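The filtering described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the use of the squared Pearson coefficient for $r^2$, and the sample directions are hypothetical and are not taken from the thesis code.

```python
def r_squared(x, y):
    """Squared Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


def filtered_direction_r2(d_ref, d_obs, max_diff=200.0):
    """r^2 of direction pairs whose absolute difference is at most max_diff degrees.

    Returns (r2, discarded_fraction). The 200-degree threshold follows the text;
    taking the absolute value of the difference is an assumption.
    """
    kept = [(r, o) for r, o in zip(d_ref, d_obs) if abs(o - r) <= max_diff]
    ref, obs = zip(*kept)
    discarded = 1.0 - len(kept) / len(d_ref)
    return r_squared(ref, obs), discarded


# Hypothetical hourly directions [deg]; the last two pairs straddle north.
d_ref = [10.0, 90.0, 180.0, 270.0, 350.0, 5.0]
d_obs = [20.0, 100.0, 170.0, 260.0, 10.0, 355.0]

r2_all = r_squared(d_ref, d_obs)
r2_filtered, discarded = filtered_direction_r2(d_ref, d_obs)
```

In this toy sample the two pairs that straddle north are physically close (20° apart) but numerically far apart, so they drag the unfiltered $r^2$ down; removing them restores a correlation that reflects the actual agreement.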

As for wind speed, it is easy to verify visually from figure 25 that WRF-derived wind speeds match observations with moderate to high success on a year-to-year basis. Does this mean, however, that the reference data are representative of the observed data? Table 6 below shows each year's hourly $r^2$ (between WRF simulations and observed wind speeds), but also the percentage difference between yearly WRF and observed parameters (A, k and P). Numerically, the biggest difference between values of $r^2$ occurs between years 2008 and 2010, with a difference of 20%. At first glance this could explain 2008's percentage difference in P, which is double that of 2010. However, two other years with equal $r^2$, namely 2005 and 2007, show the second biggest contrast between years in that same quantity, namely 570%, showing that the effect of $r^2$ on yearly parameter similarity is not so clear. Furthermore, the year with the largest correlation coefficient (2008, with $r^2 = 0.70$) has at the same time the second largest difference between yearly simulated and observed power density: 2.54%.

The hourly correlation between the observed and reference time series is therefore seen to have no connection to the yearly differences between the two datasets' parameters. However, this does not mean that the difference between yearly WRF and observed parameters should be trusted over $r^2$ in terms of representativeness.

All in all, when analysing LTCs in sections 9 and 10, a certain year's odd result will be attributed to one of the following:

1. That year's large difference in observed P with respect to the average observed P in 2005–2012.

2. That year's large difference between WRF and observations.

3. That year's low $r^2$.

Points 2 and 3 both measure WRF's ability to match observations but are, as seen, unrelated. Which of the two has the more decisive effect on the LTCs will be seen in sections 9 and 10. As seen at the beginning of this section and in section 6, $r^2$ may be more a measure of the mutual synchronisation between two concurrent time series than a measure of representativeness (Lileo et al. (2013)).


        Percentage difference,            Hourly correlation
        Observations - WRF [%]            coefficient $r^2$
Year    A       k       P
2005    0.48    2.67    0.82              0.66
2006    0.11    0.31    0.62              0.60
2007    1.45    1.34    5.49              0.66
2008    0.48    1.09    2.54              0.70
2009    2.36    7.53    1.48              0.57
2010    1.26    3.05    1.30              0.54
2011    1.23    2.32    1.71              0.69
2012    0.97    1.43    1.81              0.62

Table 6: All-sector percentage difference of yearly observed parameters with respect to yearly WRF-derived parameters.

The relative difference between yearly observed and reference parameters was also represented graphically in figure 22; for further detail, it is worthwhile looking at its sector-wise version in Appendix A (figures 55 through 57).

As for the yearly difference between WRF and observed wind directions, a visual inspection is carried out in Appendix A (figures 58 and 59), where the wind roses of both are displayed for each year. Such a study is most important when wind direction is long-term corrected, which is done in subsection 9.4.


9 Results I - Which is the best LTC method?

There is a wide variety of empirical methods with which to estimate the past long-term wind climatology of a target site. As already mentioned, LTCs describe the climatology that could have been recorded at the target site, had a mast started recording there much earlier.

The connection between the long-term reference time series and the short-term observations at the site is the time period where both datasets are concurrent. The concurrent period is therefore as long as the shorter of the two time series, i.e. the observations. The concurrent period used to calculate the correction factors, however, can be chosen so that it comprises the entire observed time series, or just a subset of it.

Part of the uniqueness of Høvsøre's data resides in the fact that the measured set spans a long time: 2005–2012. Therefore, in this case, the short-term site observations are in reality a long-term set (8 years), which allows for the creation of a wide variety of subsets of different lengths and positions within the total 2005–2012 set.

Thus, following the approach explained in Rogers et al. (2005), different subset lengths were defined, from 3 up to 27 months in steps of 3 months, i.e. 9 different subset lengths. For each subset length, the subset was placed in successive non-overlapping positions along the whole period 2005–2012; for each position, correction factors were computed, with which to calculate a LTC spanning the period 2005–2012. Finally, each LTC was validated against what had actually been observed at Høvsøre for 2005–2012. The optimum position for each subset length, i.e. the position which yielded the LTC closest to actual observations, is determined in subsection 9.3.

To better explain the above scheme, it is helpful to take as an example the concurrent subset of length 6 months: in this case, as depicted in figure 27, there are 16 different possible positions in the period 2005–2012 (16 possible 6-month-long non-overlapping subsets); for each of these positions, each LTC method was applied sector-wise; from each LTC obtained for each subset position, sector-wise and ultimately all-sector parameters A, k and P were calculated.


Figure 27: Possible positions of the concurrent reference and observed subsets, for the case of 6-month-long non-overlapping subsets, within the total (2005–2012) concurrent set.
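The subset scheme can be illustrated with a short sketch. The function name and the month-offset representation are hypothetical, and it is assumed here that subsets are laid out from the start of the period and that any remainder shorter than one subset is left unused; the thesis does not state how such remainders were handled.

```python
TOTAL_MONTHS = 8 * 12               # 2005-2012, i.e. 96 months
SUBSET_LENGTHS = range(3, 28, 3)    # 3, 6, ..., 27 months


def subset_positions(total_months, subset_length):
    """Successive non-overlapping windows as (start, end) month offsets, end exclusive.

    Assumption: windows start at month 0; a trailing remainder shorter than
    subset_length is discarded.
    """
    n_positions = total_months // subset_length
    return [(i * subset_length, (i + 1) * subset_length) for i in range(n_positions)]


# Number of positions available for each subset length.
positions_per_length = {
    length: len(subset_positions(TOTAL_MONTHS, length)) for length in SUBSET_LENGTHS
}
six_month_positions = subset_positions(TOTAL_MONTHS, 6)
```

For the 6-month case this reproduces the 16 positions mentioned in the text. Note that longer subsets leave fewer positions to average over, so the mean and standard deviation for those lengths rest on smaller samples.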

A set of bias ratios was defined, as in Rogers et al. (2005):

$$ b_A = \frac{\hat{A}}{A}, \tag{21} $$

$$ b_k = \frac{\hat{k}}{k}, \tag{22} $$

and

$$ b_P = \frac{\hat{P}}{P}, \tag{23} $$

which express the quotient of LTC (hatted) to observed parameters A, k and P for the period 2005–2012. The next step was to calculate a mean value and a standard deviation for each subset length (over all positions), i.e. $\overline{b_A}$ and $\sigma(b_A)$ in the case of the A parameter. The mean and the standard deviation simplified the presentation of the data and proved sufficient to determine how many months of concurrent time each LTC method needs in order to produce a successful LTC at Høvsøre for the period of observations.
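As a minimal sketch of this step, the mean and standard deviation of a bias ratio over all subset positions could be computed as follows. The function name and the sample values are hypothetical, and the use of the sample (n - 1) standard deviation is an assumption, since the text does not say which estimator was used.

```python
import statistics


def bias_ratio_stats(ltc_values, observed_value):
    """Mean and standard deviation of the bias ratio b = LTC / observed.

    ltc_values: one long-term corrected parameter per subset position.
    observed_value: the parameter actually observed over the full period.
    Assumption: sample standard deviation (n - 1 in the denominator).
    """
    ratios = [value / observed_value for value in ltc_values]
    mean = statistics.mean(ratios)
    std = statistics.stdev(ratios) if len(ratios) > 1 else 0.0
    return mean, std


# Hypothetical LTC values of A [m/s] from four subset positions.
ltc_A = [9.9, 10.1, 10.0, 10.2]
observed_A = 10.0
mean_bA, std_bA = bias_ratio_stats(ltc_A, observed_A)
```

A mean close to 1 with a small standard deviation indicates that the method is both unbiased and insensitive to where the concurrent subset is placed.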

9.1 How many months are enough to long-term correct?

In this work in particular, the available observations span the period 2005–2012, thus providing, along with the reference data, 8 years from which to choose the concurrent time. Real-life projects, on the other hand, usually have no more than a couple of years of observations; therefore, unless the observations at hand are long (3 years or more), chopping them into subsets and evaluating the effect of changing the subset position on the LTC may be overkill. However, when observations are long, as at Høvsøre, this methodology does reveal how different LTC methods work under different conditions.

The scheme explained in the introduction to this section was carried out sector-wise, with a sector distribution of 12 sectors of 30° each and sector 1 facing North. This configuration is a common choice in wind power meteorology.

Figure 28: Sector distribution chosen for sector-wise calculations.
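A sector assignment of this kind can be sketched as below. It is assumed here that "sector 1 facing North" means that sector 1 is centred on 0° (covering 345° to 15°) and that sectors are numbered clockwise; the function names are hypothetical.

```python
def sector_index(direction_deg, n_sectors=12):
    """Sector number (1-based) of a wind direction in degrees.

    Assumption: sector 1 is centred on north, so for 12 sectors it spans
    [345, 15) degrees, sector 2 spans [15, 45), and so on clockwise.
    """
    width = 360.0 / n_sectors
    return int(((direction_deg + width / 2.0) % 360.0) // width) + 1


def sectorise(speeds, directions, n_sectors=12):
    """Group wind speeds by the sector of their accompanying direction."""
    sectors = {s: [] for s in range(1, n_sectors + 1)}
    for speed, direction in zip(speeds, directions):
        sectors[sector_index(direction, n_sectors)].append(speed)
    return sectors


# Hypothetical hourly samples: speeds [m/s] and directions [deg].
grouped = sectorise([7.2, 9.5, 4.1], [350.0, 270.0, 10.0])
```

With this convention a westerly wind (270°) falls into sector 10.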

For each subset length and position, each observed subset was sectorised with respect to its concurrent WRF subset; sector-wise correction factors were thus obtained, which were applied to each corresponding sector of the entire 2005–2012 reference wind speed (the long-term reference data had previously been sectorised with respect to the reference direction). To convert sector-wise values of A, k and P to single all-sector values, the procedure explained in Troen and Petersen (1989) was followed. This all-sector procedure is as follows:

1. Each sector's mean wind speed is calculated and multiplied by its sector frequency (i.e. the number of data points). This is repeated for all sectors and the results are added. The sum is divided by the total frequency (the sum of all sector-wise frequencies) in order to find the all-sector mean wind speed $\mu$. (This is a weighted average.)

2. The sector-wise quadratic mean wind speed, $\overline{u^2} = A^2\,\Gamma(1 + 2/k)$, is calculated, and weighted over all sectors, as was done with the mean wind speed.

3. The all-sector parameter $\mu^2/\overline{u^2}$ is calculated, with which to solve the equation $\mu^2/\overline{u^2} = \Gamma^2(1 + 1/k)\,/\,\Gamma(1 + 2/k)$ and obtain the value of the all-sector k parameter.

4. The all-sector A parameter is computed as $A = \mu\,/\,\Gamma(1 + 1/k)$.

5. The all-sector mean wind power density P is calculated with equation 14.
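The five steps above can be sketched as follows. Several points are assumptions made for illustration only: each sector's mean wind speed is taken from its Weibull parameters as $A\,\Gamma(1 + 1/k)$, the equation for k is solved by bisection within a bracket of 0.5 to 20, and, since equation 14 is not reproduced in this section, the standard Weibull power density $\frac{1}{2}\rho A^3\,\Gamma(1 + 3/k)$ with $\rho$ = 1.225 kg m$^{-3}$ is used in its place. All names are hypothetical.

```python
import math


def weibull_ratio(k):
    """Gamma^2(1 + 1/k) / Gamma(1 + 2/k): squared mean over mean square."""
    return math.gamma(1 + 1 / k) ** 2 / math.gamma(1 + 2 / k)


def solve_k(target, lo=0.5, hi=20.0, tol=1e-10):
    """Bisection for k; weibull_ratio increases monotonically with k.

    Assumption: the solution lies within the bracket [lo, hi].
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if weibull_ratio(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def all_sector_weibull(sectors, rho=1.225):
    """Combine sector-wise (A, k, frequency) tuples into all-sector (A, k, P)."""
    total = sum(f for _, _, f in sectors)
    # Step 1: frequency-weighted mean wind speed.
    mu = sum(f * A * math.gamma(1 + 1 / k) for A, k, f in sectors) / total
    # Step 2: frequency-weighted quadratic mean wind speed.
    u2 = sum(f * A ** 2 * math.gamma(1 + 2 / k) for A, k, f in sectors) / total
    # Step 3: all-sector k from mu^2 / u2.
    k_all = solve_k(mu ** 2 / u2)
    # Step 4: all-sector A.
    A_all = mu / math.gamma(1 + 1 / k_all)
    # Step 5: power density (assumed standard Weibull expression).
    P_all = 0.5 * rho * A_all ** 3 * math.gamma(1 + 3 / k_all)
    return A_all, k_all, P_all


# Hypothetical sectors: (A [m/s], k [-], frequency [number of data points]).
same = all_sector_weibull([(8.0, 2.0, 100), (8.0, 2.0, 300)])
mixed = all_sector_weibull([(6.0, 2.0, 100), (10.0, 2.0, 100)])
```

Two identical sectors return their own A and k, which is a convenient sanity check. Mixing sectors with different A broadens the combined distribution, so the all-sector k comes out lower than the sector-wise values; this is one reason why curves that coincide sector-wise need not coincide in the all-sector representation.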

Note: the frequency mentioned in points 1 and 2 refers to the LTC frequency. The only LTC method (as implemented in this thesis) which yields a LTC frequency is the WBL method. No other non-regression method in this work yields such a result, but a specific procedure will be explained in subsection 9.4 by which to use regression methods to obtain a LTC direction $\hat{d}$. Until then, however, the LTC frequency $\hat{f}$ is assumed to be equal to the long-term reference frequency.

Figures 29 through 31 show the all-sector mean bias ratios $\overline{b_A}$, $\overline{b_k}$ and $\overline{b_P}$ for both regression (solid lines) and non-regression methods (dashed lines).

[Plot: $\overline{b_A}$ [-] (vertical axis, ticks from 0.98 to 1.04) against subset length [months] (3 to 27 in steps of 3). Legend: OLS, TLS, PR3, VAR, MOR, KH, TN, WW, WBL.]

Figure 29: Mean bias ratio $\overline{b_A}$ as a function of the concurrent subset length (months). For each subset length, the mean value was obtained by averaging over the bias ratios found at all possible non-overlapping positions of the concurrent subset within the total (2005–2012) concurrent set.

Figure 29 shows all $\overline{b_A}$ curves within just 1% of an exact match with observations from 9 months subset length onwards. Only the non-regression methods MOR and WW show a larger difference, of +3% and −3% on average, respectively, for all subset lengths. Significant initial drops and jumps are seen for three of the four regression methods (TLS, PR3 and VAR), as well as for the non-regression KH method. This suggests that LTCs coming from non-regression methods (except MOR and WW) may depict concurrent observations of A more accurately than regression methods for short subsets (3–6 months). Moreover, two methods clearly stand out from the rest: OLS and WBL. These are the simplest linear and the simplest sector-wise transformation, respectively, yet they show the best corrections of the A parameter.

It is worthwhile looking at figures 60 through 65 in Appendix A, which are a sector-wise representation of $\overline{b_A}$ (for regression and non-regression methods, respectively). Sector 2 in figure 60 shows the only anomalous value of $\overline{b_A}$ for the regression methods, specifically for PR3 at subset length 6 months. The remaining regression methods do not show this behaviour for the same concurrent subset length, so the cause of this spike is most probably the inability of the cubic polynomial to correctly describe the relationship between WRF and observed wind speeds. The cubic fit was seen to have either explosive or curly shapes for high wind speeds, but these odd fits become less frequent as the subset length grows, and indeed no anomalous spike can be seen for the PR3 method (for any of the three bias ratios) for subsets longer than 6 months.

In their study regarding the length of reference period to be taken, Lileo et al. (2013) obtained a very similar shape for their curve of mean absolute prediction error of the mean wind speed, even though they took the reference period in years.

Regarding the correction of the k parameter, figure 30 depicts all the methods' behaviour as a function of concurrent subset length.

[Plot: $\overline{b_k}$ [-] (vertical axis, ticks from 0.8 to 1.2) against subset length [months] (3 to 27 in steps of 3).]

Figure 30: Mean bias ratio $\overline{b_k}$ as a function of the concurrent subset length (months).

The mean bias ratio $\overline{b_k}$ has, on the other hand, a wider spread, its value being constrained roughly as $|\overline{b_k} - 1| < 7\%$ for subset lengths larger than 9 months, for all methods except MOR, OLS and PR3. Note that for this thesis, when applying the non-regression methods KH, TN and WW, the sector-wise k was assumed to be equal to the concurrent observed subset's k. One would therefore expect to see the red, green and pink dashed curves in figure 30 overlapping each other; this, however, happens only in the sector-wise depiction of the bias ratios, whereas figure 30 shows all-sector values.

While in Rogers et al. (2005) the VAR method showed $|b_k - 1| < 5\%$ from 6 months subset length onwards, figure 30 shows a smaller difference of roughly 1% from 12 months onwards. Figure 30 also shows a perfect match between the $\overline{b_k}$ of WBL and that of VAR, and a steady bias of roughly +1% for the KH method, from 12 months on. Rogers et al. (2005) also showed very unbiased results for $b_k$ for the MOR method, which contrasts with the constant $\overline{b_k} = 0.8$ seen in the figure. However, the results depicted here for the regression TLS method are better than those seen in Rogers et al. (2005) for their linear regression method (which showed a roughly constant difference of around +40%).

As for the OLS and TN methods, which worked very well in the correction of A, they present on the other hand large deviations for the k parameter. PR3, which worked well for A (in the all-sector case), yields, together with OLS, the worst result, with around +20% bias for all subset lengths. It also shows the same bias when represented sector-wise in figure 61.

What matters from a power production point of view, however, is how well P is corrected. This is indeed the crucial parameter in wind farm assessment. As seen from looking at figures 29, 30 and 31 together, and as could be suspected from equation 14, a good correction of P requires good corrections of both A and k. Indeed, all methods which are biased in the estimation of either A or k are also biased in P; only those which show small biases in both A and k are truly unbiased in the correction of P, i.e. the TN, VAR, KH and WBL methods.

All in all, looking at the all-sector figures above, it can be seen that $\overline{b_A}$ and $\overline{b_P}$ stabilise to a constant value for all sectors after 9–12 months. The initial jumps or drops can be associated with small subset lengths. After this, however, all methods look quite insensitive to increasing length of the concurrent subset, since the three bias ratios do not vary wildly along the way up to the maximum length of 27 months (neither in sector-wise nor in all-sector representations).

Also, certain sectors seem to have systematically worse results, as seen in figures 60 through 65; in sectors 1 and 2, the WW method yields especially biased results for $\overline{b_A}$ and $\overline{b_k}$, while PR3 fails for $\overline{b_P}$ and $\overline{b_k}$. Both these sectors happen to comprise the fewest data points for each subset length: figure 32 shows the sector-wise frequency of occurrence, plotted as the mean frequency for each subset length, over all the possible positions. Sectors 1 and 2 have the smallest values of mean frequency $\overline{f}$ for all subset lengths. Moreover, this may explain the low sector-wise mean correlation coefficient $r^2$ for these two sectors, depicted in figure 33. In the case of sector 1, there is also a wake disrupting the observations, which could in turn explain the inverted pattern of $r^2$.

[Plot: $\overline{b_P}$ [-] (vertical axis, ticks from 0.9 to 1.4) against subset length [months] (3 to 27 in steps of 3).]

Figure 31: Mean bias ratio $\overline{b_P}$ as a function of the concurrent subset length (months).

[Twelve panels, Sector 1 to Sector 12: mean frequency $\overline{f}$ [-] against subset length [months] (3 to 27 in steps of 3).]

Figure 32: Mean frequency of occurrence $\overline{f}$ [-] as a function of the concurrent subset length (months). For each subset length, the mean value was obtained by averaging over the f found at all possible positions of the concurrent subset.



[Plot panels (Sector 1, Sector 2): mean correlation coefficient $r^2$ [-] against subset length [months] (3 to 27 in steps of 3).]
