
On the Automatic Prediction of PM10 with In-Situ Measurements, Satellite AOT Retrievals and Ancillary Data

Piero Campalani 1,2, Thi Nhat Thanh Nguyen 1,2, Simone Mantovani 2,3, Gianluca Mazzini 4

1 UNIFE, Via Saragat 1, 44122 Ferrara, Italy. Email: [email protected]
2 MEEO Srl, Via Saragat 9, 44122 Ferrara, Italy. Email: [email protected]
3 SISTEMA GmbH, Wäringerstraße 61, 1090 Wien, Austria. Email: [email protected]
4 LepidaSpA, Via Aldo Moro 64, 40127 Bologna, Italy. Email: [email protected]

Abstract: Daily monitoring of unhealthy particles suspended in the low troposphere is of major concern around the world, and ground-based measuring stations are a reliable but still inadequate means of assessing full spatial coverage. Advances in satellite sensors have provided new datasets which, though less precise than in-situ observations, can be combined with them to enhance the prediction of particulate matter. In this article we evaluate a methodology for automatic multivariate estimation of PM10 dry mass concentrations, along with a comparison of three different cokriging estimators which integrate ground measurements of PM10, MODIS-derived satellite retrievals of aerosol optical thickness and further auxiliary data. Results highlight the need for further improvements and studies. The analysis employs the data available for 2007 over the Emilia Romagna region (Padana Plain, Northern Italy), where stagnant meteorological conditions make comprehensive air quality monitoring all the more urgent. Qualitative full maps of PM10 over Emilia Romagna are then automatically published on-line in a dynamic GIS environment for multi-temporal analysis of air quality.

Keywords: Particulate Matter; Prediction; Satellite; AOT; MODIS; Cokriging.

I. INTRODUCTION

The correct evaluation of air quality in the low troposphere is clearly of great concern since it has direct effects on human health. Tens of thousands of premature deaths have been associated with increased exposure to Particulate Matter (PM) [1]. These fine particles are suspended in the air and can have either anthropogenic (power plants, burning of fossil fuels in vehicles, spray cans) or natural (dust storms, volcanoes, fires) sources. Prolonged exposure to substantial concentrations of particulate matter causes several heart and lung diseases.

Particles are usually classified by size into fractions: PM10 denotes the particles with aerodynamic diameter smaller than 10 µm and is known as the thoracic fraction. Traditionally, particulate matter is measured by ground stations on either an hourly or a daily basis. Owing to their spatial proximity to the particles, these sensors usually provide highly trustworthy measurements; nevertheless, they are not a suitable means of monitoring an area with full spatial coverage. Satellite imagery has proven to be tremendously important in achieving an exhaustive description of PM: scientific interest in this application has steadily increased because of the ability of satellites to monitor the columnar Aerosol Optical Thickness (AOT or τ)¹ [2, 3, 4, 5, 6, 7, 8]. AOT is a measure of the light absorbed or scattered by aerosols along the atmospheric path, and can be exploited to infer the dry mass concentration of particulate near the surface. The relationship between PM and AOT is however very challenging to model, since many variables affect it: aerosol type, hygroscopicity, atmospheric mixing height, relative humidity, temperature and cloud contamination are amongst them. One should also note the difference in temporal support between an instantaneous satellite footprint and a time-averaged measurement from a ground sensor.

The purpose of this study is to describe and evaluate a method for the automatic prediction of PM10 concentrations, along with a comparison of three different multivariate cokriging solutions which make use of the information carried by satellite AOT retrievals and further auxiliary images. The analyses are performed with all the data available for 2007 over the Emilia Romagna region (Northern Italy) in the Padana Plain, which is sadly known for its dangerous combination of high pollutant concentrations and stable air masses. The prediction methodology is then applied to yield daily continuous qualitative maps of PM10 concentrations over Emilia Romagna, which are shared on-line via the web platform Multi-sensor Evolution Analysis PM (MEA-PM) in an interactive GIS environment [9].

In the next section the datasets used are described in detail. In Sec. III we shed light on the methodology, whereas in Sec. IV the results of the three different cokriging estimators are shown. Conclusions and future work are reported in Sec. V.

¹ In the literature the concept of τ is also widely known as optical depth, causing the acronym AOT to turn into AOD. However, the former is preferred in this article.

978-1-4673-0753-6/11/$26.00 ©2011 IEEE


Fig. 1. Example of AOT satellite retrieval; the location of ARPA ground stations is also plotted (black points).

II. DATASETS

Our whole ensemble of data has been cropped a priori so as to match the area of interest, the Emilia Romagna region in Italy (bounding box [9°E, 43.5°N] to [13°E, 45.5°N]). This area is partly covered by the Padana Plain, where important industrial activities take place, and by the Northern Apennines on the southern side. In the following subsections we present the data sources used: the ground stations of PM10 (the target variable) are described first, followed by an overview of the satellite AOT retrievals and the remaining covariates.

A. PM10 Ground Measurements

Daily averages of PM10 mass concentrations were taken from the ARPA monitoring network (http://www.arpa.emr.it/), which includes tens of stations measuring particle concentrations in Emilia Romagna. The number of available PM10 measurements each day falls between 29 and 33; PM measurements are expressed in µg/m³. The stations mainly cover highly populated areas (urban and industrial scenes), which results in significantly clustered sampling locations: the model can therefore be biased by the incomplete view offered by the samples, either because the area is not regularly covered or because the feature space is not well represented.

B. AOT Satellite Imagery

Many satellites can currently monitor the presence of aerosols in the atmosphere. Examples are the polar-orbiting sensors MODIS (MODerate resolution Imaging Spectroradiometer), MISR (Multi-angle Imaging Spectroradiometer), POLDER (POLarization and Directionality of the Earth's Reflectance) and OMI (Ozone Monitoring Instrument), as well as geostationary satellites like GOES (Geostationary Operational Environmental Satellite) and Meteosat, which can provide aerosol information at a high temporal resolution [10]. The MODIS sensors aboard the Terra and Aqua satellites have turned out to be the most widely exploited for aerosol applications, mainly because of their near-daily global coverage, with a spatial resolution of 10×10 km² and an excellent cloud screening procedure. For details on the MODIS AOT retrieval algorithm see [11]. In this study we also chose MODIS imagery as satellite information, which allowed for a daily coverage of Emilia Romagna over the whole of 2007. We did not use the original MODIS product but rather the AOT retrievals of the PM MAPPER system, a software package which takes MODIS Level 1B data as input and yields a set of air quality information at increased spatial resolution [12, 13], now at 1×1 km². The PM MAPPER AOT maps have been validated by comparison with several ground-based radiometers of the AErosol RObotic NETwork (AERONET) over Europe [14]. An example of PM MAPPER AOT retrieval can be seen in Fig. 1.

C. Ancillary Data

A Digital Elevation Model (DEM) and Night Lights maps (NL) were considered as further auxiliary data (see Fig. 2). Presumably high altitudes prevent the settlement of industrial activities, which are in fact far more present on the plain. Moreover, night lights should be a very good indicator of urban scenes, roads and other places with persistent anthropic activities. The SRTM DEM at 90 m of spatial resolution was used (see http://srtm.csi.cgiar.org/) by joining the four tiles needed to cover Emilia Romagna. The yearly averages of night lights, with 30 arcseconds of spatial resolution, are freely available from NOAA's National Geophysical Data Center (http://www.ngdc.noaa.gov/dmsp/downloadV4composites.html); they represent lights from cities, towns, and other sites with persistent lighting (including gas flares), with values ranging from 0 to 63.

Fig. 2. Digital Elevation Model (above) and satellite-based night lights averages (below) used for prediction.


III. PREDICTION METHODOLOGY

The prediction of a variable at unsampled locations is not a simple, straightforward task: it requires several preliminary operations. Furthermore, choosing the best method requires tuning over several iterations, and failed attempts can offer hints on how to improve.

Traditionally, the methods to estimate values of a variable involve weighted linear combinations:

$$
\text{estimate} = \hat{v} = \sum_{i=1}^{n} w_i \cdot v_i \tag{1}
$$

where v_1, ..., v_n are the available samples and w_i are the weights assigned to the samples v_i, which usually sum to 1. Different approaches assign different weights to the samples: some are based on little more than common sense, others on statistical theory. Every method relies on its own estimation criteria, e.g. distribution of estimates, distribution of errors, precision, bias, etc. Kriging is known as the Best Linear Unbiased Estimator (BLUE) since, besides constraining the model to unbiased zero-mean residuals, it has the distinguishing feature of minimizing the variance of the errors.
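As a minimal illustration of Eq. (1), the sketch below builds the weights of an inverse distance weighting (IDW) interpolation, the simple baseline used later in Sec. IV, and normalises them so that they sum to 1. The function names, the power of 2 and the toy coordinates are assumptions made for this example; they are not taken from the original processing chain.

```python
import math


def idw_weights(sample_xy, target_xy, power=2.0):
    """Inverse-distance weights w_i, normalised so that they sum to 1.

    The power parameter (2 here) is an assumption of this sketch.
    """
    dists = [math.dist(p, target_xy) for p in sample_xy]
    # A target that coincides with a sample simply takes that sample's value.
    for i, d in enumerate(dists):
        if d == 0.0:
            return [1.0 if j == i else 0.0 for j in range(len(dists))]
    raw = [1.0 / d ** power for d in dists]
    total = sum(raw)
    return [r / total for r in raw]


def weighted_estimate(values, weights):
    """Eq. (1): the estimate is the weighted linear combination of the samples."""
    return sum(w * v for w, v in zip(weights, values))


# Three hypothetical stations (coordinates in km) and their PM10 values.
stations = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
pm10 = [30.0, 50.0, 40.0]
weights = idw_weights(stations, (5.0, 0.0))
estimate = weighted_estimate(pm10, weights)
```

Kriging uses the same linear form, but derives the weights from a variogram model instead of from distance alone.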

The availability of auxiliary data, the predictors, prompted the construction of multivariate estimators. Ordinary kriging assumes stationarity of the single target variable; alternatively, an external drift can be assumed on the target to define a non-constant spatial mean, i.e. a trend. Spatially exhaustive predictors would be needed to achieve this, as is the case for our maps of DEM and NL; unfortunately the maps of AOT have empty pixels caused by e.g. clouds, ice and snow, which makes them unsuitable for this kind of estimation. To integrate the AOT retrievals, a cokriging system can be adopted: in its simplest form, with two variables of interest, the cokriging estimate is a linear combination of both primary and secondary data values, given by:

$$
\hat{u}_0 = \sum_{i=1}^{n} a_i \cdot u_i + \sum_{j=1}^{n} b_j \cdot v_j \tag{2}
$$

where u_i and v_j are the samples of the primary and secondary variables, associated with the weights a_i and b_j respectively, and û_0 is the estimate of u at the unknown location (for further reading on spatial modeling and prediction see [15]). It has been shown that the usefulness of cokriging is often enhanced when the primary variable is undersampled with respect to the secondary variable [16], as in our case. Three different cokriging estimators were tested in this study:

− 2-VARIATE ORDINARY COKRIGING (OCK). ARPA PM10 ground samples as primary target variable, and AOT satellite retrievals as unique covariate. Stationarity, i.e. constant mean, is assumed for both PM10 and AOT.

− 2-VARIATE UNIVERSAL COKRIGING (UCK). Constant mean assumed on ARPA PM10 measurements, whereas a spatial trend is assumed on AOT, defined by multiple linear regression (with intercept) with DEM and NL as predictors.

− 4-VARIATE ORDINARY COKRIGING. PM10, AOT, DEM and NL are treated as four independent stationary variables, with their four different sets of cokriging weights.

Even though the stationarity assumptions may seem unreasonable, all these methods have been localized by constraining the search for the nearby samples that build up the prediction to a distance of 50 km. Samples further away are not taken into account and hence do not influence the prediction: this usually makes the stationarity assumption more acceptable, and larger neighbourhood search areas would introduce conceptual errors due to heterogeneity in topography or aerosol type [17]. The search for nearby locations is also limited to the nearest 200 samples, when present, so as to reduce the matrix dimensions in the cokriging systems and achieve computationally feasible estimations.
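The localized search described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `local_neighbours`, the `(x, y, value)` tuple layout and the brute-force distance scan are hypothetical, while the 50 km radius and the cap of 200 samples come from the text. Coordinates are assumed to be in projected metres.

```python
import math


def local_neighbours(samples, target_xy, max_dist_m=50_000.0, max_count=200):
    """Select the samples that take part in a localized prediction.

    samples   : iterable of (x, y, value) tuples in projected metres
    target_xy : (x, y) of the prediction location
    Returns (distance, x, y, value) tuples, nearest first, keeping only the
    samples within max_dist_m and at most the max_count closest ones.
    """
    with_dist = []
    for x, y, value in samples:
        d = math.dist((x, y), target_xy)
        if d <= max_dist_m:
            with_dist.append((d, x, y, value))
    with_dist.sort(key=lambda item: item[0])
    return with_dist[:max_count]


# Hypothetical samples at 10 km, 60 km and 30 km from the target.
demo = [(10_000.0, 0.0, 35.0), (60_000.0, 0.0, 80.0), (0.0, 30_000.0, 42.0)]
nearby = local_neighbours(demo, (0.0, 0.0))
```

With dense AOT grids a spatial index would be preferable to a linear scan; the scan is kept here for brevity.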

A. Exploratory Analysis

The whole process begins with a look into the distribution of both PM10 samples and AOT pixels: smoothed histograms, Quantile-Quantile plots and Box plots are visualized and stored to allow for further a posteriori investigations. Cokriging should perform better when AOT is more densely sampled than PM10, thus only the AOT maps which provide a minimum of 20% of pixels in at least 3 out of their 4 quadrants were considered. After that, data transformations were evaluated to ensure Normality of the variables (preferable for both linear regression models and kriging interpolations). Box-Cox power transforms are a good tool to achieve this:

$$
x^{(\lambda)} =
\begin{cases}
\dfrac{x^{\lambda} - 1}{\lambda}, & \lambda \neq 0 \\
\log x, & \lambda = 0
\end{cases}
\tag{3}
$$

where λ is chosen so as to maximize a likelihood function. An a priori Box-Cox analysis suggested log-transforming the elevation values of the DEM. The optimal λ was instead computed on the fly for each AOT dataset, which was then transformed accordingly. Since the DEM and NL maps were also involved in a linear regression, their Principal Components (PC) were extracted to reduce their evident multicollinearity. All the data (PM10 ground samples, AOT satellite pixels and PCs) were finally scaled by their standard deviation for more stable computations. The residual plots from the linear regression of PM10, which is de facto employed in the Universal Kriging estimator, were stored so as to catch model misspecifications (easily recognizable by e.g. non-linearity or heteroskedasticity). Finally, all data are reprojected to the same spatial reference system, which should preferably be grid-based to speed up distance calculations: in our case we chose the EPSG:32632 reference system, i.e. the UTM projection of zone 32 North over the WGS84 datum.

Fig. 3. An example of variogram estimation and modeling for both PM10 ground samples and AOT satellite pixels (panels: scaled PM10 and scaled AOT, semivariance against distance). Both the classic (continuous line) and robust (dashed) estimations are plotted, along with the final fitted model (fine dashed).
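The Box-Cox step of Eq. (3) can be sketched in a few lines. The code below is a hedged illustration, not the routine actually used for the study (which was carried out in R): it evaluates the standard Box-Cox profile log-likelihood on a grid of candidate λ values and keeps the best one. The grid from -2 to 2 in steps of 0.05 and the function names are assumptions of this example; the input must be strictly positive and not constant.

```python
import math


def boxcox(x, lam):
    """Eq. (3): power transform for lam != 0, natural logarithm for lam == 0."""
    if lam == 0:
        return [math.log(v) for v in x]
    return [(v ** lam - 1.0) / lam for v in x]


def boxcox_loglik(x, lam):
    """Profile log-likelihood of lam under a Normal model for the transformed data."""
    n = len(x)
    y = boxcox(x, lam)
    mean = sum(y) / n
    var = sum((v - mean) ** 2 for v in y) / n  # maximum-likelihood variance
    return -0.5 * n * math.log(var) + (lam - 1.0) * sum(math.log(v) for v in x)


def best_lambda(x, grid=None):
    """Pick the lam with the highest log-likelihood on a grid (assumed range)."""
    if grid is None:
        grid = [i / 20 for i in range(-40, 41)]  # -2.0 to 2.0 in steps of 0.05
    return max(grid, key=lambda lam: boxcox_loglik(x, lam))


# Hypothetical right-skewed AOT-like values.
aot = [0.05, 0.08, 0.10, 0.12, 0.15, 0.21, 0.30, 0.55, 0.90]
lam_hat = best_lambda(aot)
aot_transformed = boxcox(aot, lam_hat)
```

A grid search is used only to keep the sketch dependency-free; a numerical optimiser would give a finer estimate of λ.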

B. Statistical Model

Kriging estimators rely on an underlying statistical model, represented by the so-called variogram: the semivariances between pairs of samples are computed first (variogram cloud) and then averaged over intervals of distance (lags) to obtain the experimental variogram. Finally a suitable variogram model, which ensures positive-definiteness and hence a unique and stable prediction, is fitted to it. Our decisions for automatic (no user interaction) variogram modeling are listed below:

− Variogram Estimate. Cressie's robust variogram estimation is adopted, which yields experimental variograms that are more stable with respect to outliers and bias [18].

− Model. Various models were evaluated and the Matérn model (M. Stein's parameterization) proved to be the best fitting one.

− Sill/Range/Nugget. The fitting procedure requires starting values for these key parameters. A sill (the limit of the variogram as the distance tends to infinity) of 1 was chosen, since the data are scaled; a suitable range (the distance at which the variogram first reaches 95% of the sill) was set to 15 km; and a nugget component (the jump at the origin) was allowed to be fitted in the model.

− Cutoff Distance. This is the maximum distance considered for variogram estimation, and it was set to 50 km in accordance with our localized kriging strategy.

− Lag. Distance intervals of 10 km proved to be the minimum for a sufficiently robust estimation of variograms: ARPA ground stations are sparse and clustered, thus narrower lags would yield highly unstable variances and poor statistics.

− Anisotropy. Building an anisotropic model would have further reduced the samples used for each variogram; given the scarcity of the ARPA stations, it was decided to assume isotropy.

− Coregionalization. As a simple way to achieve coregionalization under overall positive definiteness of any cokriging system, the Linear Model of Coregionalization (LMC) was adopted.
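To make the first steps of the list concrete, the sketch below computes an isotropic experimental variogram with both the classic (Matheron) estimator and the robust Cressie-Hawkins estimator [18], using the 10 km lag and 50 km cutoff quoted above. It is a simplified stand-in for the geostatistical routines used in the study: the function name, the `(x, y, z)` tuple layout and the returned dictionary keys are assumptions of this example, and model fitting (Matérn, nugget) is not covered.

```python
import math


def experimental_variogram(points, lag_m=10_000.0, cutoff_m=50_000.0):
    """Classic and robust semivariances per distance lag.

    points : list of (x, y, z) with x, y in projected metres
    Returns one dict per non-empty lag with the mean pair distance, the
    number of pairs, and the two semivariance estimates.
    """
    n_lags = int(math.ceil(cutoff_m / lag_m))
    sum_sq = [0.0] * n_lags      # sum of squared differences (classic)
    sum_root = [0.0] * n_lags    # sum of sqrt(|difference|) (robust)
    sum_dist = [0.0] * n_lags
    count = [0] * n_lags
    for i in range(len(points)):
        xi, yi, zi = points[i]
        for j in range(i + 1, len(points)):
            xj, yj, zj = points[j]
            d = math.hypot(xi - xj, yi - yj)
            if d >= cutoff_m:
                continue  # pairs beyond the cutoff are ignored
            k = int(d // lag_m)
            diff = abs(zi - zj)
            sum_sq[k] += diff ** 2
            sum_root[k] += math.sqrt(diff)
            sum_dist[k] += d
            count[k] += 1
    result = []
    for k in range(n_lags):
        n = count[k]
        if n == 0:
            continue
        classic = sum_sq[k] / (2.0 * n)
        # Cressie-Hawkins: fourth power of the mean square-rooted difference,
        # corrected by (0.457 + 0.494 / n), then halved to a semivariance.
        robust = 0.5 * (sum_root[k] / n) ** 4 / (0.457 + 0.494 / n)
        result.append({"dist": sum_dist[k] / n, "pairs": n,
                       "classic": classic, "robust": robust})
    return result


# Three hypothetical collinear samples (scaled values).
vario = experimental_variogram([(0.0, 0.0, 1.0), (5_000.0, 0.0, 3.0),
                                (25_000.0, 0.0, 1.0)])
```

With only about 30 stations per day, each 10 km lag holds few pairs, which is why the text above warns that narrower lags would give unstable variances.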

Whereas the high density of AOT pixels permits a stable definition of the variograms, the PM10 ground measurements do not always successfully show a spatial pattern. On some days the PM10 variogram is well defined and the Matérn model faithfully fits its semivariances; on others the model is not adequate, has too strong a nugget or simply shows no clear spatial structure (pure nugget model). Besides, the linear model of coregionalization often needs to vertically translate the models of direct and cross variograms to ensure positive definiteness. As an example, Fig. 3 shows the variograms for the 13th of January and Fig. 4 the corresponding coregionalization models: even though the independently fitted models were consistent with the experimental variograms, the coregionalization constraints pushed the models far from the sampled variances in both direct and cross variograms.
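The positive-definiteness constraint mentioned above can be illustrated for the 2-variate case. In a linear model of coregionalization every basic structure (e.g. nugget, Matérn) carries a 2×2 matrix of partial sills, and each of these matrices must be positive semi-definite, which for two variables reduces to non-negative direct sills and a squared cross sill not exceeding their product. The sketch below checks this condition and shows one simple, hypothetical way of enforcing it by shrinking the cross sill; it is not claimed to be the adjustment applied by the software used in the study.

```python
import math


def is_valid_lmc_2var(structures, tol=1e-12):
    """Check the 2-variate LMC admissibility condition.

    structures : list of (b11, b22, b12) partial sills, one tuple per basic
                 structure, where b11 and b22 are the direct sills (e.g. PM10
                 and AOT) and b12 is the cross sill.
    """
    return all(
        b11 >= -tol and b22 >= -tol and b12 * b12 <= b11 * b22 + tol
        for b11, b22, b12 in structures
    )


def clip_cross_sill(b11, b22, b12):
    """Shrink b12 to the largest admissible magnitude (illustrative fix only)."""
    limit = math.sqrt(max(b11, 0.0) * max(b22, 0.0))
    return max(-limit, min(limit, b12))


# Hypothetical partial sills: (nugget structure, Matérn structure).
admissible = is_valid_lmc_2var([(0.2, 0.3, 0.1), (0.8, 0.7, 0.5)])
inadmissible = is_valid_lmc_2var([(0.2, 0.3, 0.3)])
```

Forcing admissibility in this way changes the fitted sills, which is consistent with the observation that coregionalized models can drift away from the experimental variograms.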

C. Output

Once the variograms are modeled, the cokriging predictions can be performed. Our purpose was to evaluate the potential of several variables in predicting PM10 concentrations, by means of slightly different estimators. 5-fold cross-validation was adopted as the evaluation method: Root Mean Square Error (RMSE), Mean Error (ME) and R-squared (R² = 1 − σ²_res/σ²_values) were chosen as output statistics, so as to consider prediction precision, bias and the magnitude of the residuals with respect to the actual variance of the PM10 samples. Because the selection of the folds introduces randomness, the cross-validation was repeated 10 times and the statistics were averaged to yield more stable results.
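The evaluation loop described above can be sketched as follows. This is an illustration under explicit assumptions: `predict` is a hypothetical callable standing in for any of the estimators, the error is taken as predicted minus observed (so a negative ME means underestimation), and population variances are used in R². None of these details is stated in the original text.

```python
import math
import random
import statistics


def cv_statistics(observed, predicted):
    """Return (RMSE, ME, R-squared) with R2 = 1 - var(residuals) / var(observed)."""
    residuals = [p - o for p, o in zip(predicted, observed)]
    rmse = math.sqrt(sum(r * r for r in residuals) / len(residuals))
    me = sum(residuals) / len(residuals)
    r2 = 1.0 - statistics.pvariance(residuals) / statistics.pvariance(observed)
    return rmse, me, r2


def k_fold_indices(n, k, rng):
    """Randomly partition the indices 0..n-1 into k folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def repeated_cv(samples, predict, k=5, repeats=10, seed=0):
    """Average the k-fold statistics over several random fold selections.

    samples : list of (x, y, value)
    predict : callable(train_samples, (x, y)) -> estimate (hypothetical API)
    """
    rng = random.Random(seed)
    runs = []
    for _ in range(repeats):
        observed, predicted = [], []
        for fold in k_fold_indices(len(samples), k, rng):
            held_out = set(fold)
            train = [s for i, s in enumerate(samples) if i not in held_out]
            for i in fold:
                x, y, value = samples[i]
                observed.append(value)
                predicted.append(predict(train, (x, y)))
        runs.append(cv_statistics(observed, predicted))
    return tuple(sum(run[j] for run in runs) / repeats for j in range(3))


def mean_predictor(train, xy):
    """Trivial placeholder estimator: the mean of the training values."""
    return sum(v for _, _, v in train) / len(train)


toy = [(float(i), 0.0, float(i)) for i in range(10)]
rmse, me, r2 = repeated_cv(toy, mean_predictor)
```

Note that, with R² defined from the variance of the residuals, a constant bias does not lower the score, which is why ME is reported alongside it.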

IV. RESULTS

In this section we present the results of our evaluation. More precisely, we report the cross-validation statistics described in Subsec. III-C that were obtained for the three different cokriging estimators (described in Sec. III) over the year 2007. All the results and plots in this article were produced using R [19].


[Figure 4: two grids of direct and cross variograms, semivariance against distance. Above, "2007013.1025 − 2-OCK LM Coregionalisation", panels: pm10, aot, pm10.aot. Below, "2007013.1025 − 4-OCK LM Coregionalisation", panels: pm10, aot, pc1, pc2, pm10.aot, pm10.pc1, pm10.pc2, aot.pc1, aot.pc2, pc1.pc2.]

Fig. 4. Examples of Linear Models of Coregionalization for both 2-variate (above) and 4-variate (below) ordinary cokriging systems (the same day as in Fig. 3 is shown).

More than 400 possible different cases were available in 2007. The filter on the available number of AOT pixels, described in Subsec. III-A, reduced our analysis to 123 cases.
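The coverage filter of Subsec. III-A (at least 20% of valid pixels in at least 3 of the 4 quadrants) can be sketched as below. The way the quadrants are cut (at the centre row and column of the cropped grid) and the use of `None` for empty pixels are assumptions of this example, since the original text does not specify them.

```python
def quadrant_coverage(grid):
    """Fraction of valid (non-None) pixels in each of the four quadrants.

    grid : list of equally long rows, at least 2x2; None marks an empty pixel.
    Quadrants are split at the centre row and column (an assumption).
    """
    n_rows, n_cols = len(grid), len(grid[0])
    mid_r, mid_c = n_rows // 2, n_cols // 2
    blocks = [
        (0, mid_r, 0, mid_c), (0, mid_r, mid_c, n_cols),
        (mid_r, n_rows, 0, mid_c), (mid_r, n_rows, mid_c, n_cols),
    ]
    fractions = []
    for r0, r1, c0, c1 in blocks:
        cells = [grid[r][c] for r in range(r0, r1) for c in range(c0, c1)]
        valid = sum(1 for v in cells if v is not None)
        fractions.append(valid / len(cells))
    return fractions


def has_enough_aot(grid, min_fraction=0.20, min_quadrants=3):
    """True when enough quadrants reach the minimum fraction of valid pixels."""
    covered = sum(1 for f in quadrant_coverage(grid) if f >= min_fraction)
    return covered >= min_quadrants


# Hypothetical 4x4 AOT map: two full quadrants, one at 25%, one empty.
aot_map = [
    [0.1, 0.1, 0.2, 0.2],
    [0.1, 0.1, 0.2, 0.2],
    [0.3, None, None, None],
    [None, None, None, None],
]
keep = has_enough_aot(aot_map)
```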

Fig. 5 shows the overall cross-validation results. RMSE, ME and R-squared statistics are plotted for each cokriging system; the RMSE of a simple Inverse Distance Weighting (IDW) interpolation is also plotted (red line). The charts make clear that the three cokriging systems are hardly distinguishable from each other, and that they bring no significant improvement over a much simpler interpolation method such as IDW. The 2-variate OCK system yielded the lowest average error, 10.511 µg/m³, whereas IDW performed slightly worse with an average error of 10.961 µg/m³. Even though RMSE scores from cross-validation should not be considered alone for an exhaustive evaluation of a spatial estimator, this underlines the need for further investigation of the models, which probably have a higher predictive potential. Mean errors, also in Fig. 5, reveal a general underestimation, except for the 2-variate UCK, which shows a near-zero average ME (-0.192 µg/m³), suggesting that a regression-based method might be more suitable. The performance of every estimator is clearly better and more stable in Summer, but this is not related to better models nor to a better PM10/AOT agreement, since IDW follows this trend as well. The distributions stored during the exploratory analysis show that it is related to the variance of the ARPA samples: the Interquartile Range (IQR) in Summer is always low (less than 10 µg/m³), whereas in Winter IQRs of 30 to 40 µg/m³ are also found. This trend might be due to the Mixing Layer Height (MLH) of the atmosphere over our area, which is usually lower in Winter: at low MLH the particles are kept near the surface and highly polluted areas generate peaks of PM, whereas these peaks are diluted away from the surface when the MLH is high. Despite this, the R-squared (see again Fig. 5) does not show any trend, and hence the prediction performance seems independent of the season.

V. CONCLUSIONS AND FUTURE WORK

In this article we have presented an evaluation of three different cokriging estimators in terms of their ability to predict PM10 mass concentrations using aerosol optical thickness retrievals from satellite imagery, a digital elevation model, a map of yearly averaged night lights and PM10 in-situ measurements. A practical and detailed description of the methodology for automatic prediction has been given, focusing on exploratory analysis, data transformations, variogram modeling and coregionalization.

The results of cross-validation have decisively highlighted the need for further investigation of the estimators: the addition of auxiliary data from satellite seemed to bring no benefit to the prediction, and no relevant differences emerged amongst the estimators. Despite the cross-validation results, a cokriging estimator with auxiliary data probably ensures a closer fit to reality in the prediction of PM10 over the whole interpolation grid, since it takes environmental predictors into account.

Columnar aerosol information can surely help improve the prediction of particulate matter at the surface, especially in an area like the Padana Plain, where the different temporal support of ground measurements and satellite footprints is smoothed by the stagnant meteorological conditions. Nevertheless, the complex relationship that binds PM10 and AOT should prompt the use of additional explanatory data, such as the height of the mixing layer, temperature, relative humidity, wind, or vertical profiles of aerosol optical thickness. A clustered point pattern like the one offered by the ARPA measuring stations can mislead the models, and thus declustered statistics or subsetting might be evaluated.

The linear model of coregionalization is hardly adequate in the cokriging systems, even in a 2-variate system; therefore more complex non-linear coregionalizations might be evaluated, or different kriging techniques explored (e.g. kriging with external drift). The temporal dimension might be considered to account for lasting events and the mismatch of supports. Finally, more elaborate statistical models (e.g. hierarchical Bayesian models) could be better suited to this challenging application.


[Figure 5: three time-series panels over 2007, with the seasons Winter, Spring, Summer and Fall marked on the time axis. R-squared panel: 2-variate OCK, 2-variate UCK, 4-variate OCK. Mean Error panel ("5-fold Cross-Validation Bias"): 2-variate OCK, 2-variate UCK, 4-variate OCK. RMSE panel: IDW, 2-variate OCK, 2-variate UCK, 4-variate OCK.]

Fig. 5. 5-fold averaged cross-validation statistics (RMSE, ME, R-squared) for each of the three tested cokriging solutions. RMSE for inverse distance weighting (IDW) interpolation is also highlighted in red. Vertical lines are plotted around equinoxes and solstices to identify the four seasons.

REFERENCES

[1] F. Rolaf van Leeuwen, "A european perspective on hazardous air pollutants," Toxicology, vol. 181, pp. 355–359, 2002.

[2] J. Wang and S. Christopher, "Intercomparison between satellite-derived aerosol optical thickness and PM2.5 mass: Implications for air quality studies," Geophys. Res. Lett., vol. 30, no. 21, p. 2095, 2003.

[3] D. Chu, Y. Kaufman, G. Zibordi, J. Chern, J. Mao, C. Li, and B. Holben, "Global monitoring of air pollution over land from the earth observing system-terra moderate resolution imaging spectroradiometer (MODIS)," J. Geophys. Res, vol. 108, no. D21, 2003.

[4] J. Engel-Cox, C. Holloman, B. Coutant, and R. Hoff, "Qualitative and quantitative evaluation of MODIS satellite sensor data for regional and urban scale air quality," Atmospheric Environment, vol. 38, no. 16, pp. 2495–2509, 2004.

[5] P. Gupta, S. Christopher, J. Wang, R. Gehrig, Y. Lee, and N. Kumar, "Satellite remote sensing of particulate matter and air quality assessment over global cities," Atmospheric Environment, vol. 40, no. 30, pp. 5880–5892, 2006.

[6] Y. Liu, M. Franklin, R. Kahn, and P. Koutrakis, "Using aerosol optical thickness to predict ground-level PM2.5 concentrations in the St. Louis area: a comparison between MISR and MODIS," Remote sensing of environment, vol. 107, no. 1-2, pp. 33–44, 2007.

[7] P. Gupta and S. Christopher, "Particulate matter air quality assessment using integrated surface, satellite, and meteorological products: 2. a neural network approach," J. Geophys. Res, vol. 114, 2009.

[8] M. Schaap, A. Apituley, R. Timmermans, R. Koelemeijer, and G. De Leeuw, "Exploring the relation between aerosol optical depth and PM2.5 at Cabauw, the Netherlands," Atmos. Chem. Phys, vol. 9, pp. 909–925, 2009.

[9] S. Natali, A. Beccati, S. D'Elia, M. Veratelli, P. Campalani, M. Folegani, and S. Mantovani, "Multitemporal data management and exploitation infrastructure," in MultiTemp, (Trento), July 2011.

[10] P. Gupta and S. Christopher, "Seven year particulate matter air quality assessment from surface and satellite measurements," Atmospheric Chemistry and Physics Discussions, vol. 8, no. 1, pp. 327–365, 2008.

[11] L. Remer, D. Tanré, and Y. Kaufman, "Algorithm for remote sensing of tropospheric aerosol from MODIS: Collection 005," tech. rep., National Aeronautics and Space Administration; Goddard Space Flight Center: Greenbelt, MD, 2009.

[12] "PM MAPPER system description, issue 1.1," 2009. Internal report, unpublished. If requested, can be delivered upon agreement from the sponsor of the project.

[13] T. Nguyen, M. Bottoni, and S. Mantovani, "PM MAPPER: an air quality monitoring system with fine spatial resolution product and integrated surface information," in Hyperspectral Workshop, (Frascati, Italy), 2010.

[14] P. Campalani, T. N. T. Nguyen, S. Mantovani, M. Bottoni, and G. Mazzini, "Validation of PM MAPPER aerosol optical thickness retrievals at 1×1 km² of spatial resolution," in Software, Telecommunications and Computer Networks (SoftCOM), 2011 19th International Conference on, pp. 1–5, Sept. 2011.

[15] E. Isaaks and R. Srivastava, An introduction to applied geostatistics, vol. 46. Oxford University Press, USA, 1989.

[16] A. Stein and L. Corsten, "Universal kriging and cokriging as a regression procedure," Biometrics, vol. 47, no. 2, pp. 575–587, 1991.

[17] C. Ichoku, A. Chu, S. Mattoo, Y. Kaufman, L. Remer, D. Tanré, I. Slutsker, and B. Holben, "A spatio-temporal approach for global validation and analysis of MODIS aerosol products," Geophys. Res. Lett, vol. 29, no. 12, p. 8006, 2002.

[18] N. Cressie and D. Hawkins, "Robust estimation of the variogram: I," Mathematical Geology, vol. 12, no. 2, pp. 115–125, 1980.

[19] R Development Core Team, R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, 2010. ISBN 3-900051-07-0.
