RESEARCH Open Access

Geographical patterns and predictors of malaria risk in Zambia: Bayesian geostatistical modelling of the 2006 Zambia national malaria indicator survey (ZMIS)

Nadine Riedel 1, Penelope Vounatsou 1*, John M Miller 2, Laura Gosoniu 1, Elizabeth Chizema-Kawesha 3, Victor Mukonka 3, Rick W Steketee 2

Abstract

Background: The Zambia Malaria Indicator Survey (ZMIS) of 2006 was the first nation-wide malaria survey, which combined parasitological data with other malaria indicators such as net use, indoor residual spraying and household related aspects. The survey was carried out by the Zambian Ministry of Health and partners with the objective of estimating the coverage of interventions and malaria related burden in children less than five years. In this study, the ZMIS data were analysed in order (i) to estimate an empirical high-resolution parasitological risk map in the country and (ii) to assess the relation between malaria interventions and parasitaemia risk after adjusting for environmental and socio-economic confounders.

Methods: The parasitological risk was predicted from Bayesian geostatistical and spatially independent models relating parasitaemia risk and environmental/climatic predictors of malaria. A number of models were fitted to capture the (potential) non-linearity in the malaria-environment relation and to identify the elapsing time between environmental effects and parasitaemia risk. These models included covariates (a) in categorical scales and (b) in penalized and basis splines terms. Different model validation methods were used to identify the best fitting model. Model-based risk predictions at unobserved locations were obtained via Bayesian predictive distributions for the best fitting model.

Results: Model validation indicated that linear environmental predictors were able to fit the data as well as or even better than more complex non-linear terms and that the data do not support spatial dependence. Overall, the averaged population-adjusted parasitaemia risk was 20.0% in children less than five years, with the highest risk predicted in Northern province (38.3%). The odds of parasitaemia in children living in a household with at least one bed net decrease by 40% (CI: 12%, 61%) compared to those without bed nets.

Conclusions: The map of parasitaemia risk, together with the prediction error and the population at risk, gives an important overview of the malaria situation in Zambia. These maps can help to achieve better resource allocation and health management and to target additional interventions to reduce the burden of malaria in Zambia significantly. Repeated surveys will enable the evaluation of the effectiveness of on-going interventions.

* Correspondence: penelope.vounatsou@unibas.ch
1 Department of Public Health and Epidemiology, Swiss Tropical Institute, PO Box, 4002 Basel, Switzerland

Riedel et al. Malaria Journal 2010, 9:37
http://www.malariajournal.com/content/9/1/37

© 2010 Riedel et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Background

Malaria is an endemic disease in Zambia, with a national incidence of 412 per 1,000 inhabitants in 2006. Despite a drop in reported cases over recent years, it is still the leading cause of morbidity and mortality, accounting for 45% of hospitalizations and outpatient department visits, with 6,000-8,000 reported deaths [1]. Through the National Malaria Strategic Plan (NMSP) 2005-2010, the Ministry of Health and a network of partners are working toward scaling up effective malaria control interventions, with the goal of substantially reducing malaria-related burden, especially among vulnerable populations such as children under five years of age [2]. Led by the Ministry of Health, numerous partners, including the Global Fund, the President's Malaria Initiative (PMI), the World Bank and the Malaria Control and Evaluation Partnership in Africa (MACEPA), provide support for scaling up malaria control, prevention and treatment services throughout Zambia. The national malaria control programme advocates malaria control through widespread distribution of insecticide-treated mosquito nets, application of insecticides in homes, preventive treatment for pregnant women and effective treatment of infected persons [3]. As part of the programme, 5.3 million insecticide-treated nets were distributed all over the country in 2006 and 2007, and 85% of households in 15 target districts were sprayed [4]. In addition, in 2007, 60% of all pregnant women received malaria prevention drugs, and every pregnant woman who visited a public clinic received one insecticide-treated net for herself and for every under-five child in the same household [5].

The efforts of malaria reduction require comprehensive baseline maps of malaria risk over the whole country. These maps can guide malaria control to the areas of highest need, help limited resources to be distributed more efficiently and assist in evaluating the progress of all intervention programmes. Earlier maps of malaria risk in Zambia are based on climatic suitability for malaria [6,7]; however, to date there is no empirical malaria map for the country. Although historical survey data have been compiled by the Mapping Malaria Risk in Africa (MARA) project, malaria risk estimates based on these data will not reflect the current situation, which is changing due to ongoing interventions.

In 2006, the Ministry of Health, the Central Statistics Office (CSO), MACEPA and partners conducted the first national Zambia Malaria Indicator Survey (ZMIS). This is a nationally representative household survey in children under five to assess the coverage of key malaria interventions and to measure malaria-related burden [5]. The survey contains geo-referenced parasitological data for each child that can be used to estimate the malaria risk and draw accurate maps of the current malaria situation in Zambia. In addition, the ZMIS collected information at household level on previous interventions, such as bed nets or indoor residual spraying (IRS), and on socio-economic aspects.

In this paper, the ZMIS data of 2006 were analysed and the first contemporary empirical parasitaemia risk map for the country was produced. The MIS data are expected to be correlated in space due to common environmental exposures, which influence transmission similarly in neighbouring areas. Standard statistical methods assume independence of the observations. To take spatial correlation into account, Bayesian geostatistical models [8] were developed to establish the relation between the parasitaemia data and environmental/climatic predictors of the disease. In addition, the corresponding non-spatial models were fitted for comparison purposes. Environmental data were obtained via remote sensing (RS). Potential non-linearity in the environment-malaria relation and elapsing time in the effects of environmental predictors on parasite risk were modelled using predictors in categorical scales and fitted by penalized and basis spline (P- and B-spline) functions. Due to the large number of model parameters, Bayesian Markov chain Monte Carlo (MCMC) simulation was used for model fit. Model-based predictions estimating the risk at unobserved locations were obtained via Bayesian kriging. Parasitaemia risk estimates were linked to population data, and the number of children at risk at province level was calculated.

Methods

The study area and the ZMIS

Zambia is a republic in Southern Africa. Most parts of the country are high plateau areas covered with savannas and some rivers, valleys and mountains. The country has a tropical climate, with the rainy season occurring between December and April.

The ZMIS was carried out from May to June 2006, shortly after the rainy season. The data were obtained from a nationally representative two-stage cluster sample [5]. At the first stage, 120 standard enumeration areas (SEAs) were randomly selected from the approximately 17,000 SEAs into which the country is divided. They are located within 58 of the 72 districts, from all 9 provinces in Zambia. Within each SEA, a random sample of 25 households was chosen, resulting in a total of 3,000 households. A household and a women's questionnaire were administered using Personal Digital Assistants (PDAs). In addition, blood samples from children under five were collected and analysed for anaemia using Hemocue Hb201 and for malaria parasites using Paracheck Pf and thick and thin blood smears. Households were geo-located using the Global Positioning System (GPS). All data were entered in an ACCESS database.

Socio-economic data

Socio-economic data were obtained from a household survey carried out during the ZMIS. An asset index was created as a weighted sum of 59 different household assets, extracted from 17 relevant questions that were included in the household survey. The weights were calculated by principal component analysis (PCA) on the asset indicators [9]. The household asset index was then divided into wealth quintiles to create a socio-economic status analysis variable.
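The asset index construction described above can be illustrated with a minimal sketch. It assumes that the weights are the loadings of the first principal component of the standardised asset indicators, which is a common reading of a PCA-based asset index but is not spelled out in the text. The function names and the ten example households are hypothetical, and the power iteration stands in for whatever PCA routine was actually used.

```python
import math

def standardise(rows):
    """Centre and scale each asset indicator (each must vary across households)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(p)]
    return [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]

def first_component_loadings(z, iterations=500):
    """Leading eigenvector of the correlation matrix, found by power iteration."""
    n, p = len(z), len(z[0])
    corr = [[sum(row[a] * row[b] for row in z) / (n - 1) for b in range(p)]
            for a in range(p)]
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(corr[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # The sign of an eigenvector is arbitrary; orient it so assets add wealth.
    return v if sum(v) >= 0 else [-x for x in v]

def wealth_quintiles(scores):
    """Rank households by asset index and cut the ranking into five equal groups."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    quintile = [0] * len(scores)
    for rank, i in enumerate(order):
        quintile[i] = rank * 5 // len(scores) + 1
    return quintile

# Hypothetical data: 10 households, 3 binary asset indicators (1 = owned).
assets = [
    [0, 0, 0], [0, 0, 0], [1, 0, 0], [1, 0, 0], [1, 1, 0],
    [1, 1, 0], [1, 1, 0], [1, 1, 1], [1, 1, 1], [1, 1, 1],
]
z = standardise(assets)
loadings = first_component_loadings(z)
index = [sum(w * x for w, x in zip(loadings, row)) for row in z]
quintiles = wealth_quintiles(index)
```

Ties in the index are broken by household order here; a real analysis with 59 indicators would rarely produce exact ties, and might use survey weights when cutting the quintiles.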

Environmental and population density data

Environmental predictors were extracted from remote sensing (RS) sources at the spatial and temporal resolutions shown in Table 1. These data are freely available at high spatial and temporal resolution. To take into account the elapsing time between climatic suitability for malaria transmission and parasitaemia, the climatic data were gathered for different periods (up to one year) prior to the survey, starting from May 2005. Day and night land surface temperature (LST), normalized difference vegetation index (NDVI) and land cover types were downloaded from the Moderate Resolution Imaging Spectroradiometer (MODIS) archive of the US Geological Survey (USGS) Land Processes Distributed Active Archive Center (LP DAAC) [10]. LST data were extracted as averages over 8-day periods at 1 km spatial resolution. NDVI was obtained as a 16-day average at 0.25 km spatial resolution. Land cover data were available from MODIS for the year 2004 and contained 17 different land cover categories, as defined by the International Geosphere-Biosphere Programme (IGBP). They were grouped into five categories, namely wetlands, forests, urban areas, shrublands and others. At each cluster location, the land cover covariate was summarized by the proportion of each land type within a radius of 3 km. During the model fit, the category "others" was dropped from the analysis to avoid effects of collinearity. Daily rainfall estimates (RFE) were taken from Meteosat 7 satellite images and downloaded from the USGS Famine Early Warning Systems Network (FEWS NET) African Data Dissemination Service (ADDS) [11] at 8 km spatial resolution.

Altitude data were extracted from an interpolated USGS digital elevation model (DEM) [12], available at a spatial resolution of 1 km. The digital maps for three different kinds of water bodies in Zambia (lakes, rivers and wetlands) and for urban/rural regions were acquired from the HealthMapper database [13]. The distance from each location to the nearest water body was calculated in IDRISI 32 (Clark Labs). Estimates of the number of persons living in an area of 500 m by 500 m were downloaded from the LandScan™ Global Population Database [14] for the year 2006. The percentage of under-five children in the total Zambian population (17.3%) was obtained from the 2006 data of the US Census Bureau International Database [15].

The coordinates of the SEAs were calculated as the average of latitude and longitude over all household locations within the SEA. These coordinates were used to link the environmental and malaria data. For the purpose of predicting parasitaemia risk at the unobserved locations, a grid with a cell size of 3 km by 3 km was overlaid on the Zambia map (resulting in around 100,000 grid cells), and the remote sensing data were also extracted for the centroids of the grid cells.

The MODIS Reprojection Tool (USGS) was used to convert the RS data to geo-referenced maps. Further processing of the environmental data and the distance calculation for the water bodies were carried out in IDRISI 32. ArcMap v 9.1 (ESRI) was used as a mapping tool. Additional data processing was performed in Stata/SE 9.2 (StataCorp LP).
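Two of the geoprocessing steps in this section are simple enough to sketch: averaging household coordinates into one SEA location, and finding the distance from that location to the nearest water body. The sketch below is only an illustration. The distance calculation in the study was done in IDRISI 32; the haversine formula, the function names and all coordinates used here are assumptions introduced for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def sea_centroid(households):
    """Average latitude and longitude over the household locations of one SEA."""
    lats = [lat for lat, _ in households]
    lons = [lon for _, lon in households]
    return sum(lats) / len(lats), sum(lons) / len(lons)

def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def distance_to_nearest(point, features):
    """Distance in km from `point` to the closest of the given feature points."""
    return min(haversine_km(point, f) for f in features)

# Hypothetical household GPS fixes for one SEA and two hypothetical water points.
households = [(-15.40, 28.30), (-15.42, 28.28), (-15.38, 28.32)]
water_points = [(-15.40, 28.40), (-16.00, 28.30)]

centroid = sea_centroid(households)
nearest_water_km = distance_to_nearest(centroid, water_points)
```

Averaging raw latitude and longitude is adequate for households that lie within a few kilometres of each other, as they do inside one enumeration area; it would not be appropriate for points spread over large distances.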

Table 1 Source, spatial and temporal resolution of remote sensing (RS) data

Predictor | Spatial resolution | Temporal resolution | Source
Day land surface temperature (day LST) | 1 × 1 km² | 8 days | MODIS
Night land surface temperature (night LST) | 1 × 1 km² | 8 days | MODIS
Normalized difference vegetation index (NDVI) | 0.25 × 0.25 km² | 16 days | MODIS
Land cover | 1 × 1 km² | - | MODIS
Rainfall estimate (RFE) | 8 × 8 km² | daily | ADDS
Elevation | 1 × 1 km² | - | USGS
Region (urban/rural) | 1 × 1 km² | - | HealthMapper
Water bodies (rivers, lakes & wetlands) | 1 × 1 km² | - | HealthMapper
Population counts | 0.5 × 0.5 km² | - | LandScan 2006

Statistical models

Most of the climatic RS data are available continuously in time. Depending on the malaria endemicity, the duration of the malaria transmission season and environmental factors, there is an elapsing (lag) time between climatic suitability for malaria transmission and the occurrence of the disease. To determine this period, which may differ among environmental factors, a lag time analysis was carried out. A lag time is defined as a period prior to the survey over which an average value of the climatic factor was calculated. However, the first 16 days preceding the survey were excluded, because parasite development in the mosquito takes around two weeks before the mosquito becomes infectious. For each climatic predictor, a number of analysis variables were created. These variables represent short- and long-term averages of the climatic factor at different lag times. For factors extracted at 1- or 8-day temporal resolution, the lag times were multiples of 8 days (that is, 8, 16, ..., 360). For NDVI, which was extracted at 16-day temporal resolution, the lag times were multiples of 16 days up to one year. In the end, 45 lag time variables were created for each of rainfall, day LST and night LST, and 22 lag time variables for NDVI.

Bivariate logistic regression models were fitted to assess the relation between the parasitaemia risk outcome and the environmental lag time variables. For each climatic factor, the lag time variable considered further in the analysis was the one giving the model with the smallest Akaike's Information Criterion (AIC). All covariates that were significant in the bivariate analysis at the 15% significance level, as determined by likelihood ratio tests, were included in a multiple geostatistical logistic regression analysis.

Several geostatistical multiple logistic regression models were fitted to assess and capture potential non-linearity in the malaria-environment relation. These models included covariates (i) in continuous scales, (ii) in categorical scales with categories based on quartiles and (iii) fitted by penalized and basis spline (P- and B-spline) curves (see Additional file 1). The model with the best predictive ability was chosen via a model validation procedure. In the geostatistical model specification, spatial correlation was taken into account by including household location-specific random effects and assuming that they derive from a multivariate Gaussian spatial process with zero mean [8]. The covariance between any pair of locations was assumed to be an exponential function of the distance between the locations. Covariates and random effects were modelled on the logit scale of the parasitaemia risk parameters. The above geostatistical models have at least as many parameters as the number of locations, but model fit is possible via MCMC simulation methods. Exploratory analyses suggested weak spatial correlation; therefore, non-spatial models (having smaller numbers of parameters) were also fitted. The model with the best predictive ability was employed to predict the risk at the unsampled locations via Bayesian kriging. Predictions were made over a grid of around 100,000 pixels to obtain a parasitaemia risk map for Zambia.
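The lag time variables described in this section can be sketched as follows. The sketch rests on one interpretation that the text does not state explicitly: a lag of L days is taken to be the average over the L days that end 16 days before the survey. The function name, its parameters and the example series are hypothetical.

```python
def lag_means(series, resolution_days, step_days, max_days=360, skip_days=16):
    """Average a climatic series over windows that end `skip_days` before the survey.

    `series` is ordered oldest to newest and its last value is the period that
    ends at the survey date. Returns a dict {lag_in_days: mean over that lag}.
    """
    skip = skip_days // resolution_days            # periods dropped before the survey
    usable = series[:len(series) - skip]
    result = {}
    for lag in range(step_days, max_days + 1, step_days):
        n = lag // resolution_days                 # periods inside this lag window
        if n > len(usable):
            break                                  # series too short for this lag
        window = usable[-n:]
        result[lag] = sum(window) / n
    return result

# Hypothetical series: daily rainfall, 8-day LST composites, 16-day NDVI composites.
rain = [float(day) for day in range(376)]          # 360 usable days + 16 skipped
lst = [20.0] * 47                                  # 45 usable periods + 2 skipped
ndvi = [0.5] * 23                                  # 22 usable periods + 1 skipped

rain_lags = lag_means(rain, resolution_days=1, step_days=8)
lst_lags = lag_means(lst, resolution_days=8, step_days=8)
ndvi_lags = lag_means(ndvi, resolution_days=16, step_days=16)
```

Under this reading the counts match those reported above: 45 variables for the series with lags in multiples of 8 days and 22 for NDVI. Each candidate variable would then enter its own bivariate logistic regression, and the lag with the smallest AIC would be kept.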

Model fit and validation

A random sample of 89 (training) locations was selected for model fit, and the predictive ability of the models was assessed on the remaining 20 (test) locations. The distance from each test location to the nearest training location ranged from 900 m to 76 km, with a median of about 26 km, suggesting that the test locations are representative of the underlying spatial process. Model predictions were compared using the following three approaches [16,17]: (i) the proportion of test locations falling within the 95% Bayesian confidence (credible) interval (CI), and distance measures between observed and predicted parasitaemia prevalence calculated by (ii) the Kullback-Leibler (KL) divergence and (iii) an analogue to the χ²-test. In particular, for each test location the posterior predictive distribution (PPD) was computed using each fitted model. Based on the PPD, Bayesian credible intervals were calculated with a probability coverage of 95%. Each test location was then checked to see whether its observed prevalence fell within the corresponding interval. The model predicting the largest number of test locations within the 95% CI of smallest width was considered the best one. The KL measure calculates the mean divergence between the observed and predicted parasitaemia prevalence on the logit scale, weighted by the observed value. The analogue to the χ²-test is based on the squared distance between the median of the PPD and the observed parasitaemia prevalence, divided by the observed prevalence. The model giving the smallest divergence or the smallest distance is considered the best model.

A mathematical description of the models used is given in Additional file 1. The statistical analysis was carried out in Stata/SE 9.2 (StataCorp LP), WinBUGS (Imperial College and Medical Research Council, UK) and in specialized software written by the authors in the Fortran 95 (Digital Equipment Corporation) programming language using standard numerical libraries (Numerical Algorithms Group Ltd).
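Two of the validation measures above, the credible-interval coverage with its width and the χ²-test analogue, follow directly from the verbal description and can be sketched in a few lines. The KL measure is left out because its exact form is only given in Additional file 1. The sketch assumes that the χ² analogue is summed over test locations and that intervals are equal-tailed; the function names and the three example locations are hypothetical.

```python
import statistics

def quantile(sorted_values, q):
    """Empirical quantile by linear interpolation between order statistics."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])

def validate(observed, ppd_samples, level=0.95):
    """Compare observed prevalences with posterior predictive draws.

    observed[i]    : observed prevalence at test location i (must be > 0)
    ppd_samples[i] : draws from the posterior predictive distribution there
    """
    tail = (1 - level) / 2
    covered, widths, chi2 = 0, [], 0.0
    for obs, draws in zip(observed, ppd_samples):
        s = sorted(draws)
        lower, upper = quantile(s, tail), quantile(s, 1 - tail)
        covered += lower <= obs <= upper           # True counts as 1
        widths.append(upper - lower)
        chi2 += (statistics.median(s) - obs) ** 2 / obs
    return {"coverage": covered / len(observed),
            "mean_width": sum(widths) / len(widths),
            "chi2": chi2}

# Hypothetical test set: three locations with evenly spaced "draws".
observed = [0.10, 0.30, 0.60]
ppd_samples = [
    [i / 100 for i in range(0, 41)],    # median 0.20, covers 0.10
    [i / 100 for i in range(20, 41)],   # median 0.30, covers 0.30
    [i / 100 for i in range(10, 51)],   # median 0.30, misses 0.60
]
summary = validate(observed, ppd_samples)
```

A model would be preferred when it combines high coverage with a small mean width and a small χ² value, which mirrors the trade-off described in the text. Locations with an observed prevalence of zero would need separate handling, since the χ² analogue divides by the observed value.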

Results

The MIS included 2,364 children under five years of age, randomly sampled over 120 cluster locations. However, only 1,324 children at 109 cluster locations had complete parasitological data linked to a geo-located household and could be used to estimate the distribution of parasitaemia risk. The study profile is given in Figure 1. The sample locations with the observed parasitaemia prevalence are shown in Figure 2.

Figure 1 Study profile of the ZMIS for predicting parasitaemia risk.

Figure 2 Observed parasitaemia prevalence (left) and province names (right). Observed parasitaemia prevalence within district boundaries at 109 cluster locations used in estimating the distribution of parasitaemia risk in Zambia (left-hand side). The grey dots indicate the 11 clusters that were excluded from the analysis. Province names are given on the right-hand map.

The lag time analysis suggested that the following periods preceding the survey are best (in terms of model fit) for summarizing the climatic factors: 2.7 months for rainfall, 1 month for NDVI and 1.5 months for day and night LST. The geographical distributions of the environmental factors summarized at the above lag times are displayed in Figure 3.

Figure 3 Spatial distribution of remotely sensed covariates in Zambia. The climatic factors were summarized over a period preceding the survey indicated by the lag time analysis (day LST, night LST, NDVI, rainfall). The land use map presents the most frequent land use category in a buffer of 3 km around every pixel.

Non-spatial bivariate logistic regression analyses and the likelihood ratio test indicated that all RS factors were significant at the 15% significance level (results not presented). All these variables were further included in the geostatistical analysis. Exploratory analysis indicated non-linearity in the relation between the parasitaemia risk and the following environmental predictors: NDVI, rainfall, day and night LST. Various Bayesian multiple logistic regression models (spatial as well as non-spatial) were fitted, modelling the non-linearity of the above factors via spline curves or categorical covariates. In addition, all models included land cover types, region type (urban/rural), altitude and distance to the nearest water bodies as categorical covariates. Results on model validation are presented in Table 2. The B-spline models were able to predict correctly most of the test locations (70%) within the 95% CI; however, the B-spline models always produced the 95% CIs with the largest absolute widths. In comparison, the spatial P-spline model predicted nearly the same proportion of test locations (65%) correctly within a 95% CI of considerably smaller width. Among the models predicting 60% of test locations correctly within a 95% CI, the non-spatial model with linear terms had the smallest absolute width. This model additionally shows the smallest KL divergence, followed by the spatial P-spline model, and the second smallest χ²-value. The spatial model with linear terms has the smallest χ²-value. Based on the model validation results, the non-spatial model with linear terms was chosen as the final model for prediction, due to its very good KL divergence and χ²-test results and the smallest width of the CIs. This model also has the advantage of a small number of model covariates (compared to the second best model, based on spatial P-splines), avoiding over-parameterization problems. The model was employed to predict the parasitaemia risk at unsampled locations and included the following predictors: the proportion of each land cover type (excluding the "other" category) within a 3 km radius around the location, categorical covariates for the environmental predictors (altitude, region type and distance to the nearest water bodies) and linear climatic predictors (NDVI, rainfall, day and night LST).

Table 2 Model validation summary for the spatial (s) and non-spatial (ns) models

Model | CI (width) | KL | χ²
Linear (ns) | 60% (0.58) | 2149 | 4867
Linear (s) | 50% (0.59) | 2258 | 4459
Categorical (ns) | 60% (0.69) | 3081 | 6943
Categorical (s) | 60% (0.68) | 3033 | 6143
P-spline (ns) | 60% (0.61) | 2311 | 8194
P-spline (s) | 65% (0.61) | 2229 | 7417
B-spline (ns) | 70% (0.72) | 2759 | 26698
B-spline (s) | 70% (0.72) | 2885 | 26846

Comparison of Bayesian credible intervals (CI) of 95% probability coverage with their corresponding width, Kullback-Leibler divergences (KL) and the χ²-test analogue on 20 test locations.

Predictions obtained at around 100,000 pixels via Bayesian kriging are shown in Figure 4. The predicted parasitaemia prevalence ranges between 0.8% and 80.9%, based on the posterior predictive distribution, while the observed data vary between 0.2% and 59.8%. The overall prevalence (mean of the predictions over all pixels) is approximately 26.4% with a standard deviation of 15.2% (observed locations: 22.8% with a standard deviation of 16.4%). Relatively low risk areas (<10%) were frequently predicted for North-Western, Western and Southern provinces, in comparison to the high risk areas (>50%) mainly found in Eastern province and adjacent regions. Estimates of the corresponding prediction error are depicted in Figure 5. The map shows that regions with high prediction errors have high parasitaemia risk. These are mainly areas with sparse survey locations.

Figure 4 Predicted parasitaemia risk map for children <5 years in Zambia. The map is based on a Bayesian logistic regression model with linear terms for day LST, night LST, NDVI and rainfall. The estimates correspond to the median of the posterior predictive distributions computed over 100,000 pixels.

Figure 5 Prediction error of the parasitaemia risk estimates given in Figure 4.

The parasitaemia risk estimates were combined with the estimated number of children below five years living in the areas of the corresponding pixels to calculate the number of children with parasitaemia. These estimates are presented in Figure 6 in map form and in Table 3 as total counts at province level. Taking into account the population distribution, the average prevalence of parasitaemia is approximately 20.0%. Lusaka province has the lowest population-adjusted prevalence, at 7.3%, and is the only province with a predicted mean prevalence of <10%. Northern province has the highest population-adjusted prevalence, at 38.3%, followed by Luapula (30.3%) and Eastern province (27.7%).

To assess the effects of malaria interventions in Zambia after adjusting for climatic and environmental effects, the above model was fitted again with three additional covariates: socio-economic status of the household, indoor residual spraying within the last 12 months and presence of at least one bed net in the household. These covariates were not used for prediction, as accurate estimates of their distribution in Zambia do not exist for 2006. The regression coefficients of this model (multivariate, non-spatial) are given in Table 4, together with the bivariate non-spatial logistic regression models. In addition, results of the non-spatial model with linear terms and the second best model (spatial P-spline) are presented. The bivariate regression models show significant negative relations with day LST, the proportion of urban areas within a 3 km buffer, region type (urban areas having lower parasitaemia risk), distance to the nearest water body, altitude above 1.4 km, socio-economic status (4th and 5th quintile), spraying and presence of bed nets. Significant positive relations were detected with night LST, NDVI, rainfall within the last 2.7 months and the proportion of wetlands in the surrounding area. In the multivariate non-spatial model, most covariates lost their significance. The only remaining significant parameter was the presence of at least one bed net in a household, which reduces the odds of parasitaemia in children by 40% (CI: 12%, 61%). The final prediction model (non-spatial model with linear terms), without the additional household-specific covariates, showed no significant relation between the predictors and the parasitaemia risk. The non-spatial variance was around 17. The P-spline spatial model estimates a ratio of spatial to total variation of almost 12. The minimum distance at which the spatial correlation is lower than 5%, as estimated by the spatial model, is 380 m (95% confidence interval: 210 m, 3390 m), which is even lower than the resolution of the grid used for prediction (3 km). This suggests a very weak spatial correlation and supports the choice of the non-spatial model as the one with the best predictive ability.
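The range estimate above can be made concrete with a short calculation. The sketch assumes the exponential correlation function takes the form exp(-ρd) for two locations a distance d apart; the exact parameterisation used by the authors is given in Additional file 1 and may differ. The function names are hypothetical; the distances are the ones reported in the text.

```python
import math

def decay_from_range(range_m, threshold=0.05):
    """Decay rate rho such that exp(-rho * range_m) equals `threshold`."""
    return -math.log(threshold) / range_m

def correlation(distance_m, rho):
    """Exponential correlation between two locations `distance_m` apart."""
    return math.exp(-rho * distance_m)

GRID_M = 3000.0                                   # prediction grid resolution

rho = decay_from_range(380.0)                     # point estimate of the range
corr_at_range = correlation(380.0, rho)           # 0.05 by construction
corr_at_grid = correlation(GRID_M, rho)           # neighbouring grid cells

rho_upper = decay_from_range(3390.0)              # upper interval limit of the range
corr_at_grid_upper = correlation(GRID_M, rho_upper)
```

Because -ln(0.05) is close to 3, the range is often approximated as 3/ρ. Under the assumed form, the correlation between neighbouring 3 km grid cells is negligible at the point estimate and only about 7% at the upper interval limit, which is consistent with the conclusion that a non-spatial model is adequate here.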

DiscussionThe ZMIS in 2006 was the first nation-wide malaria sur-vey which combined parasitological data with other

malaria indicators such as bed net use indoor residualspraying and household related aspects The aim of thesurvey was to estimate the coverage of interventions andthe malaria related burden in children less than fiveyears However the MIS data are also a very importantsource of information for estimating parasitaemia risk atlocal scales and thus for identifying the high-risk areasthat require high intervention coverage and continuousmonitoring Combining parasitaemia risk estimates withpopulation data the number of infected children can beestimated which can help for better resource allocationhealth management and targeted additional interven-tions to achieve the highest risk reduction for the mostpopulated areas Repeated surveys will enable the

Table 3 Predicted number of children <5 years with malaria parasites in the blood (per province)

Province        Prev 1 (in %)   Children <5 years   Infected children   95% CI             Prev 2 (in %)
Central         26.0            182847              34572               21589, 50252       18.9
Copperbelt      23.3            311317              37763               18572, 70719       12.1
Eastern         37.4            240137              66614               46297, 87219       27.7
Luapula         32.0            125049              37943               29039, 47638       30.3
Lusaka          31.8            275120              20134               8121, 46849        7.3
North-Western   21.0            128935              29011               16200, 51616       22.5
Northern        39.1            277764              106322              79379, 135701      38.3
Southern        18.8            243743              33430               19862, 53854       13.7
Western         14.4            147229              20321               12730, 30232       13.8
Total           26.4            1932141             386110              251789, 574080     20.0

Estimates are based on the mean and the 95% confidence intervals (CI) of the posterior predictive distribution of the non-spatial model with linear terms.
Prev 1: Model-based risk estimates.
Prev 2: Model-based population-adjusted prevalence.

Figure 6 Estimated number of infected children <5 years per 100 km2.


evaluation of the effectiveness of on-going interventions. Already, the Zambian Ministry of Health and partners have completed the second ZMIS in the country in 2008 and are planning another follow-up survey in 2010.

Prior to the MIS, compiled historical survey data have been used to obtain estimates of parasitaemia risk at high resolution. The Mapping Malaria Risk in Africa (MARA) project was initiated in 1998 with the aim of compiling published and unpublished malaria survey data in Africa. MARA is the most comprehensive malariometric database, compiling data from 1900 up to date. The MARA data have been analysed using rigorous spatial statistical modelling [17-20] to obtain high-resolution malaria risk estimates at regional and country level in Africa. However, risk estimates from historical data do not reflect the current malaria situation, which is influenced by on-going interventions. The surveys are not representative, as high-risk areas tend to be over-represented. In addition, historical surveys have been conducted at various locations using different methodologies, including different age groups, and carried out in different seasons. In contrast, MISs do not suffer from these drawbacks. The MIS locations are randomly chosen and the data are available at individual level, allowing for estimation of age-specific risk. Another advantage of the MIS data is the household-level information available, which gives the possibility of differentiating the contribution of climate, socio-

Table 4 Parasitaemia risk predictors of different models

Covariates                               Bivariate non-spatial   Multivariate non-spatial   Prediction model      Spatial P-spline model
                                         OR (95% CI)             OR (95% CI)                OR (95% CI)           OR (95% CI)
Day LST                                  0.49 (0.43, 0.57)       0.65 (0.37, 1.15)          0.61 (0.32, 1.17)
Night LST                                1.23 (1.09, 1.40)       1.21 (0.77, 1.88)          1.18 (0.79, 1.77)
NDVI                                     2.25 (1.90, 2.66)       1.28 (0.67, 2.73)          1.29 (0.66, 2.77)
Rainfall                                 1.56 (1.37, 1.76)       1.21 (0.85, 1.68)          1.18 (0.80, 1.73)
Land cover covariates
  Wetland                                1.20 (1.07, 1.34)       0.97 (0.62, 1.55)          0.98 (0.67, 1.48)     0.72 (0.40, 1.37)
  Forest                                 0.96 (0.84, 1.10)       0.72 (0.43, 1.08)          0.72 (0.43, 1.10)     0.64 (0.38, 0.99)
  Urban                                  0.35 (0.25, 0.50)       0.70 (0.38, 1.21)          0.71 (0.37, 1.29)     0.75 (0.33, 1.48)
  Shrubland                              1.06 (0.94, 1.20)       1.07 (0.76, 1.53)          1.05 (0.71, 1.47)     1.07 (0.72, 1.53)
Region (reference: rural)
  Urban                                  0.17 (0.11, 0.25)       0.53 (0.14, 2.03)          0.37 (0.11, 1.13)     0.43 (0.12, 1.50)
Distance to water bodies (reference: <1000 m)
  1000-2499                              0.71 (0.53, 0.96)       0.73 (0.29, 1.86)          0.71 (0.29, 1.72)     0.55 (0.19, 1.50)
  2500-4999                              0.54 (0.38, 0.77)       0.61 (0.20, 1.55)          0.60 (0.19, 1.72)     0.49 (0.17, 1.42)
  ≥ 5000                                 0.11 (0.04, 0.30)       0.22 (0.03, 1.40)          0.21 (0.03, 1.39)     0.20 (0.02, 1.93)
Altitude (reference: <850 m)
  850-1199                               0.72 (0.49, 1.06)       0.21 (0.03, 1.70)          0.22 (0.03, 1.80)     0.20 (0.02, 1.92)
  1200-1399                              0.73 (0.49, 1.09)       0.32 (0.03, 3.27)          0.30 (0.03, 3.29)     0.27 (0.02, 4.81)
  ≥ 1400                                 0.48 (0.25, 0.92)       0.74 (0.04, 10.2)          0.57 (0.03, 7.68)     0.32 (0.02, 7.86)
Socio-economic index (reference: 1st quintile)
  2nd quintile                           1.06 (0.75, 1.50)       1.21 (0.78, 1.92)          -                     -
  3rd quintile                           0.85 (0.60, 1.22)       1.31 (0.75, 2.23)          -                     -
  4th quintile                           0.28 (0.17, 0.46)       0.75 (0.33, 1.75)          -                     -
  5th quintile                           0.09 (0.04, 0.19)       0.40 (0.09, 1.56)          -                     -
Interventions
  IRS                                    0.16 (0.07, 0.36)       1.73 (0.42, 6.90)          -                     -
  Bed nets                               0.59 (0.46, 0.77)       0.60 (0.39, 0.88)          -                     -

                                                                 Mean (95% CI)              Mean (95% CI)         Mean (95% CI)
Range (in km)                            -                       -                          -                     0.38 (0.21, 3.39)
σ2 (spatial error)                       -                       -                          -                     0.98 (0.01, 2.77)
τ2 (measurement error)                   -                       1.77 (0.90, 3.23)          1.71 (0.93, 2.84)     0.82 (0.01, 2.69)

Associations between parasitaemia risk and predictors of the non-spatial model with linear terms and the Bayesian spatial logistic regression P-spline model, presented as odds ratios (OR) with their respective 95% confidence intervals (CI). Regression coefficients are based on P-spline curves.


economic characteristics and control interventions to the overall parasitaemia risk. If, in addition, these data are known at high spatial resolution, they can be included in the geostatistical model to obtain more accurate predictions of the malaria risk.

This study created the first contemporary empirical parasitaemia risk map for Zambia. Many existing maps of malaria transmission rely only on rough geographical and climatic iso-lines and expert opinions. Until 1998, none of the maps had a numerical definition; hence the malaria risk maps were neither comparable nor trustworthy. Then Hay et al [21] produced a climatic map for malaria transmission in Kenya, followed by Craig et al [22], who developed a climatic suitability malaria risk map for the whole of Africa. Up to now there are only a few maps that also contain empirical data; the first map of this kind was published in 2000 for Mali by Kleinschmidt et al [18]. For Zambia, the first empirical malaria risk map was produced by Hay et al [23] and is part of a global risk map based on historical data.

Malaria survey data are expected to be correlated in space. Spatial correlation at short distances is introduced by the transmission process, driven by the flight range of the mosquito vector, while at wider ranges spatial correlation reflects common exposures to environmental conditions, which influence mosquito survival and longevity. High spatial resolution risk estimation requires prediction at locations where malaria survey data are not available. High-resolution environmental data can be obtained via remote sensing (RS). GIS software has excellent mapping capabilities and is a very useful tool for processing RS data.

Statistical techniques model the relation between parasitaemia risk and risk factors (environmental, possible interventions, socio-economic factors) via a logistic regression model, which is further used for prediction. Standard statistical methods assume independence of the survey locations. Violating this assumption when modelling spatially correlated malaria survey data may lead to imprecise estimates of the risk, of the significance of the risk factors and of the prediction error. Similarly, modelling spatial correlation in weakly correlated data increases the number of model parameters and decreases the precision of the estimates. Geostatistical models take into account spatial correlation by introducing an additional parameter (random effect) at each survey location and assume that geographical dependence is a function of distance between locations. Depending on the number of survey locations, these models can be highly parameterized and they can only be estimated using Bayesian inference and MCMC simulation. Bayesian geostatistical models have been employed in malaria risk estimation by, e.g., Diggle et al [19], Gemperli et al [20,24] and Gosoniu et al [16,17].

However, in this study the Bayesian geostatistical model has estimated very low spatial correlation, dropping to less than 5% at distances larger than 380 m, which is lower than the spatial resolution of the pixel size used for prediction (3 km by 3 km). The non-spatial model was superior to the fitted spatial models because spatial correlation is only present at a very local scale, indicated by the flight range of the mosquito rather than environmental covariates. Possible reasons might be on-going interventions, which determine mosquito densities and parasitaemia and therefore reduce the influence of environmental predictors on the mosquito.

Previous models of malaria transmission addressed non-linearity between the risk of the malaria-related outcome (in the logit scale) and its predictor solely by categorizing the non-linear covariates. In this study, non-linearity is modelled additionally by using different types of spline curves. The resulting risk estimates suggested that predictions are sensitive to the model fitted. However, model validation indicated that models based on linear terms are superior to non-linear models in Zambia. Even though the P-spline model had the higher predictive ability at 95% CIs, the linear model was considered as the one with the best predictive ability due to the smaller range of these intervals and the results of the KL divergence and χ2-measures between observed and predicted prevalence data. This model also has the advantage that its regression coefficients are easier for non-statisticians to interpret than those of the spline curves.

None of the regression coefficients of the final prediction model were significant. However, they are needed to determine the mean risk estimate for each location, and excluding them heavily reduces the predictive ability of the model (results not shown). The lack of significance of the environmental factors is partially explained by the effects of malaria interventions, which can have stronger influence on the parasitaemia risk than the environmental factors. In fact, the spatial model which adjusted for different types of interventions indicated a significant effect of the presence of at least one bed net in a household in reducing parasitaemia risk. Therefore, interventions are a major driver of parasitaemia risk in Zambia, and including these data in the prediction model would increase the accuracy of model-based risk predictions. Unfortunately, in this study the parasitaemia risk could not be predicted conditional on bed net coverage because the geographical distribution of bed net coverage was not known for the time of the ZMIS 2006. Intervention data are needed over the entire study area in order to be used for prediction purposes. Since, e.g., MACEPA is putting a lot of effort into scaling up malaria interventions in Zambia, the role of interventions is likely to increase even further within


the next couple of years. Therefore, compilation of intervention coverage data at high spatial resolution is becoming essential to create reliable risk maps.

The prediction map indicates high variation of parasitaemia risk over the country. In particular, high risk is predicted in a stripe from the south to the east of Zambia, which is characterized mainly by low altitude, high NDVI, and high day and night LST. The relatively small risk in the south-west of the country might be influenced by low vegetation and rainfall, while the small risk in the north-west regions could be due to low day LST and a high proportion of forests. It is also possible that malaria interventions have been focused on selected areas like the Zambezi river, reducing the parasite prevalence in those areas. It is interesting to note that the maps have been shown to local experts, who confirmed the depicted risk patterns. The map of the prediction error of the Bayesian model could assist in improving the precision of the parasitaemia risk prediction by identifying the areas of high uncertainty where additional survey locations could be randomly distributed in subsequent MISs, and hence reduce the error and raise the precision of subsequent studies.

The reduced smoothness of the map might be explained by the weak spatial correlation, which most likely indicates that interventions counteract the environmental effects. The sampling framework of the MIS may also tend to underestimate spatial correlation because the survey clusters are partially driven by population density, where transmission is largely influenced by interventions. However, the drastic changes within small distances highlight the importance of the high-resolution maps. If prediction had been done at a lower resolution, some of the high-risk areas would disappear and the burden would have been underestimated in these regions. For other purposes which do not need such high precision, reducing the resolution can easily be done by combining neighbouring pixels.

The average predicted parasitaemia risk over the whole country was 26.4%; however, after adjusting for the population size, the risk dropped to 20.0%. It is striking that although the average risk in Lusaka province is the fourth highest of all provinces (31.8%), after taking into account the population density the risk reduces to 7.3%, indicating that the highest risk areas are the less populated. In fact, Lusaka is the province with the lowest population-adjusted risk. The provinces of Copperbelt, Southern and Western have low population-adjusted risks (less than 15%). On the other hand, Northern province has the highest population-adjusted parasitaemia risk, followed by Eastern province and Luapula. In addition, Northern province has the highest number of infected children less than five years old. Therefore, interventions should be concentrated in the Northern province, especially at the north-eastern border and in the middle of the province. Further interventions in Copperbelt and Southern might also have a strong impact in reducing the overall burden, since the total number of infected children is as high as in Luapula even though their prevalence is lower than 15%.
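The population-adjusted figures quoted above can be reproduced from the province totals in Table 3. The following is a minimal sketch that assumes the adjusted prevalence is simply the predicted number of infected children divided by the number of children <5 years; in the study this adjustment would be made at pixel level before aggregation, and the function name is a hypothetical one chosen for illustration.

```python
# (province, children <5 years, predicted infected children), from Table 3
TABLE3 = [
    ("Central", 182847, 34572),
    ("Copperbelt", 311317, 37763),
    ("Eastern", 240137, 66614),
    ("Luapula", 125049, 37943),
    ("Lusaka", 275120, 20134),
    ("North-Western", 128935, 29011),
    ("Northern", 277764, 106322),
    ("Southern", 243743, 33430),
    ("Western", 147229, 20321),
]


def population_adjusted_prevalence(children: int, infected: int) -> float:
    """Infected children as a percentage of all children <5 years."""
    return 100.0 * infected / children


per_province = {
    name: round(population_adjusted_prevalence(children, infected), 1)
    for name, children, infected in TABLE3
}

total_children = sum(children for _, children, _ in TABLE3)
total_infected = sum(infected for _, _, infected in TABLE3)
national = round(population_adjusted_prevalence(total_children, total_infected), 1)

for name, prev in sorted(per_province.items(), key=lambda kv: kv[1]):
    print(f"{name:14s} {prev:5.1f}%")
print(f"{'Zambia':14s} {national:5.1f}%")
```

Sorting by adjusted prevalence places Lusaka lowest and Northern highest, in line with the "Prev 2" column of Table 3.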

Conclusions
The map of parasitaemia risk, together with the prediction error and the population at risk, gives an important overview of the malaria situation in Zambia. The maps can be used by decision-makers to allocate resources and interventions so as to reach the most people in the regions of highest risk and to reduce the burden of malaria significantly for Zambia. The ZMIS of 2008 and 2010 will provide very important information on the changes of the parasitaemia risk over space and time and help in the evaluation of the progress of new and established intervention programmes, adjusted for environmental drivers of the transmission risk.

Additional file 1: Additional information regarding model formulation and spline curves. [http://www.biomedcentral.com/content/supplementary/1475-2875-9-37-S1.PDF]

Acknowledgements
This investigation received financial support from the Swiss National Science Foundation (SNF) (Project No. 325200-118379) and the Bill and Melinda Gates Foundation (Project No. 39777). Special thanks to Prof. Thomas Smith for his valuable inputs on data analysis aspects.

Author details
1 Department of Public Health and Epidemiology, Swiss Tropical Institute, PO Box, 4002 Basel, Switzerland. 2 Malaria Control and Evaluation Partnership in Africa, PATH, Ferney, France. 3 Department of Public Health and Research, Ministry of Health, Zambia.

Authors' contributions
NR analysed the data and drafted the manuscript. PV was responsible for conception and design of the analysis, supervised the implementation and revised the manuscript. JMM participated in designing and coordinating the fieldwork and provided important intellectual content to this study. LG contributed to the analysis and helped to draft the manuscript. ECK gave intellectual content and revised the draft critically. VM enabled the MIS 2006 to occur and has been instrumental in championing the use of these data for guiding decisions. RWS coordinated the fieldwork and reviewed the manuscript. All authors read and approved the final manuscript.

Competing interests
The authors declare that they have no competing interests.

Received: 10 November 2009 Accepted: 1 February 2010 Published: 1 February 2010

References
1. Ministry of Health: 2008 National malaria control action plan: actions for scale-up for impact on malaria in Zambia. Zambia; 2008 [http://www.nmcc.org.zm/publications.htm].

2. Ministry of Health: A road map for impact on malaria in Zambia 2006-2010: a 5-year strategic plan. Zambia; 2006 [http://www.nmcc.org.zm/publications.htm].

3. Campbell K, Terry D, Wood S: Scaling up for impact: a model for malaria control. Zambia; 2007 [http://www.path.org/files/MACEPA_Bro_2007-08-06.pdf].

4. Ministry of Health: 2009 National malaria control action plan: actions for scale-up for impact on malaria in Zambia. Zambia; 2009.

5. Ministry of Health Zambia, Central Statistical Office, PATH MACEPA, CDC, University of Zambia, WHO: Zambia National Malaria Indicator Survey 2006. Zambia; 2006 [http://www.nmcc.org.zm/publications.htm].

6. Craig MH, Snow RW, Le Sueur D: A climate-based distribution model of malaria transmission in sub-Saharan Africa. Parasitol Today 1999, 15:105-111.

7. Guerra CA, Gikandi PW, Tatem AJ, Noor AM, Smith DL, Hay SI, Snow RW: The limits and intensity of Plasmodium falciparum transmission: implications for malaria control and elimination worldwide. PLoS Med 2008, 5:e38.

8. Diggle PJ, Tawn JA, Moyeed RA: Model-based geostatistics. Appl Stat 1998, 47:299-350.

9. Filmer D, Pritchett LH: Estimating wealth effects without expenditure data - or tears: an application to educational enrolments in states of India. Demography 2001, 38:115-32.

10. US Geological Survey (USGS), Land Processes Distributed Active Archive Center (LP DAAC): MODIS land products. [https://lpdaac.usgs.gov/lpdaac/products/modis_products_table].

11. US Geological Survey (USGS), Famine Early Warning Systems Network (FEWS NET): African Data Dissemination Service (ADDS). [http://earlywarning.usgs.gov/adds].

12. US Geological Survey (USGS): Digital Elevation Models (DEM). [http://edc2.usgs.gov/geodata/index.php].

13. World Health Organization (WHO): The HealthMapper Database. [http://www.who.int/health_mapping/tools/healthmapper/en/index.html].

14. Oak Ridge National Laboratory (Oak Ridge, TN): LandScan™ Global Population Database. [http://www.ornl.gov/landscan].

15. US Census Bureau: International Data Base (IDB), Population Division. Zambia; 2006 [http://www.census.gov/ipc/www/idb/country.php].

16. Gosoniu L, Vounatsou P, Sogoba N, Smith T: Bayesian modelling of geostatistical malaria risk data. Geospat Health 2006, 1:127-139.

17. Gosoniu L, Vounatsou P, Sogoba N, Maire N, Smith T: Mapping malaria risk in West Africa using a Bayesian nonparametric non-stationary model. Comput Stat Data Anal 2009, 53:3358-3371.

18. Kleinschmidt I, Bagayoko M, Clarke GPY, Craig M, le Sueur D: A spatial statistical approach to malaria mapping. Int J Epidemiol 2000, 29:355-361.

19. Diggle P, Moyeed R, Rowlingson B, Thomson M: Childhood malaria in the Gambia: a case-study in model-based geostatistics. Appl Stat 2002, 51:493-506.

20. Gemperli A, Sogoba N, Fondjo E, Mabaso M, Bagayako M, Briët OJT, Anderegg D, Liebe J, Smith T: Mapping malaria transmission in West and Central Africa. Trop Med Int Health 2006, 11:1032-46.

21. Hay SI, Snow RW, Rogers DJ: Predicting malaria seasons in Kenya using multitemporal meteorological satellite sensor data. Trans R Soc Trop Med Hyg 1998, 92:12-20.

22. Craig MH, Snow RW, le Sueur D: A climate-based distribution model of malaria transmission in sub-Saharan Africa. Parasitol Today 1999, 15:105-111.

23. Hay SI, Guerra CA, Gething PW, Patil AP, Tatem AJ, Noor AM, Kabaria CW, Manh BH, Elyazar IRF, Brooker S, Smith DL, Moyeed RA, Snow RW: A world malaria map: Plasmodium falciparum endemicity in 2007. PLoS Med 2009, 6:e1000048.

24. Gemperli A, Vounatsou P, Kleinschmidt I, Bagayoko M, Lengeler C, Smith T: Spatial patterns of infant mortality in Mali: the effect of malaria endemicity. Am J Epidemiol 2004, 159:64-72.

doi:10.1186/1475-2875-9-37
Cite this article as: Riedel et al.: Geographical patterns and predictors of malaria risk in Zambia: Bayesian geostatistical modelling of the 2006 Zambia national malaria indicator survey (ZMIS). Malaria Journal 2010, 9:37.


Background
Malaria is an endemic disease in Zambia, with a national incidence of 412 per 1,000 inhabitants in 2006. Despite a drop in the reported cases over the last years, it is still the leading cause of morbidity and mortality, accounting for 45% of hospitalizations and outpatient department visits, with 6,000-8,000 reported deaths [1]. Through the National Malaria Strategic Plan (NMSP) 2005-2010, the Ministry of Health and a network of partners are working toward scaling up effective malaria control interventions with the goal of substantially reducing malaria-related burden, especially among vulnerable populations such as children under five years of age [2]. Led by the Ministry of Health, numerous partners, including the Global Fund, the President's Malaria Initiative (PMI), the World Bank and the Malaria Control and Evaluation Partnership in Africa (MACEPA), provide support for scaling up malaria control, prevention and treatment services throughout Zambia. The national malaria control programme advocates malaria control through widespread distribution of insecticide-treated mosquito nets, application of insecticides in homes, preventive treatment for pregnant women and effective treatment of infected persons [3]. As part of the programme, 5.3 million insecticide-treated nets were distributed all over the country in the years 2006 and 2007, and 85% of households in 15 target districts have been sprayed [4]. In addition, in 2007, 60% of all pregnant women received malaria prevention drugs, and every pregnant woman who visited a public clinic received one insecticide-treated net for herself and for every under-five child in the same household [5].

The efforts of malaria reduction require comprehensive baseline maps of malaria risk over the whole country. These maps can guide malaria control to areas of highest need, help limited resources to be distributed more efficiently and assist in the evaluation of the progress of all intervention programmes. Earlier maps of malaria risk in Zambia are based on malaria climatic suitability conditions [6,7]; however, to date there is no empirical malaria map for the country. Although historical survey data have been compiled by the Mapping Malaria Risk in Africa (MARA) project, malaria risk estimates based on these data will not reflect the current situation, which is changing due to ongoing interventions.

In 2006, the Ministry of Health, the Central Statistics Office (CSO), MACEPA and partners conducted the first national Zambia Malaria Indicator Survey (ZMIS). This is a nationally representative household survey in children under five to assess the coverage of key malaria interventions and to measure malaria-related burden [5]. The survey contains geo-referenced parasitological data for each child that can be used to estimate the malaria risk and draw accurate maps of the current malaria situation in Zambia. In addition, the ZMIS collected information on previous interventions at household level, like bed nets or indoor residual spraying (IRS), and on socio-economic aspects.

In this paper, the ZMIS data of 2006 were analysed and the first contemporary empirical parasitaemia risk map for the country was produced. The MIS data are expected to be correlated in space due to common environmental exposures, which influence transmission similarly in neighbouring areas. The standard statistical methods assume independence of the observations. To take into account spatial correlation, Bayesian geostatistical models [8] were developed to establish the relation between the parasitaemia data and environmental/climatic predictors of the disease. In addition, the corresponding non-spatial models were fitted for comparison purposes. Environmental data were obtained via remote sensing (RS). Potential non-linearity in the environment-malaria relation and elapsing time in the effects of environmental predictors on parasite risk were modelled using predictors in categorical scales and fitted by penalized and basis spline (P- and B-spline) functions. Due to the large number of model parameters, Bayesian Markov chain Monte Carlo (MCMC) simulation was used for model fit. Model-based predictions estimating the risk at unobserved locations were obtained via Bayesian kriging. Parasitaemia risk estimates were linked to population data, and the number of children at risk at province level was calculated.

Methods
The study area and the ZMIS
Zambia is a republic in Southern Africa. Most parts of the country are high plateau areas covered with savannas and some rivers, valleys and mountains. The country has a tropical climate, with the rainy season occurring between December and April.

The ZMIS was carried out from May to June 2006, shortly after the rainy season. The data were obtained from a nationally representative two-stage cluster sample [5]. At the first stage, 120 standard enumeration areas (SEAs) were randomly selected from the approximately 17,000 SEAs into which the country is divided. They are located within 58 out of 72 districts from all 9 provinces in Zambia. Within each SEA, a random sample of 25 households was chosen, resulting in a total of 3,000 households. A household and a women's questionnaire were administered with Personal Digital Assistants (PDA). In addition, blood samples from children under five were collected and analysed for anaemia using Hemocue Hb 201 and for malaria parasites using Paracheck Pf and thick and thin blood smears. Households were geo-located


using the Global Positioning System (GPS). All data were entered in an ACCESS database.

Socio-economic data
Socio-economic data were obtained from a household survey carried out during the ZMIS. An asset index was created as a weighted sum of 59 different household assets extracted from 17 relevant questions which were included in the household survey. The weights were calculated by principal component analysis (PCA) on the asset indicators [9]. Then the household asset index was divided into wealth quintiles to create a socio-economic status analysis variable.
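The construction of an asset index of this kind can be sketched in a few lines. The example below is an illustration only: the data are simulated (100 hypothetical households and 4 binary assets, not the 59 ZMIS assets), the indicators are standardised before PCA, the first principal component is found by power iteration, and all function names are assumptions made for this sketch rather than the authors' code.

```python
import random


def standardise(rows):
    """Centre each asset indicator and scale it to unit standard deviation."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1)) ** 0.5
           for j in range(p)]
    return [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]


def first_pc_weights(z, iters=200):
    """Leading eigenvector of the correlation matrix, by power iteration."""
    n, p = len(z), len(z[0])
    corr = [[sum(r[j] * r[k] for r in z) / (n - 1) for k in range(p)]
            for j in range(p)]
    w = [1.0 / p ** 0.5] * p
    for _ in range(iters):
        w = [sum(corr[j][k] * w[k] for k in range(p)) for j in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        w = [x / norm for x in w]
    return w


def quintiles(scores):
    """Assign 1..5 by rank; tied scores are split by input order."""
    order = sorted(range(len(scores)), key=scores.__getitem__)
    q = [0] * len(scores)
    for rank, idx in enumerate(order):
        q[idx] = rank * 5 // len(scores) + 1
    return q


# Simulated data: asset ownership becomes more likely with latent wealth.
random.seed(1)
households = []
for _ in range(100):
    wealth = random.random()
    households.append([1.0 if random.random() < 0.2 + 0.6 * wealth else 0.0
                       for _ in range(4)])

z = standardise(households)
weights = first_pc_weights(z)
asset_index = [sum(w * x for w, x in zip(weights, row)) for row in z]
wealth_quintile = quintiles(asset_index)
```

With only a handful of binary assets many households share the same score, so the quintile boundaries are partly arbitrary; with 59 assets, as in the survey, ties are much less of a concern.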

Environmental and population density data
Environmental predictors were extracted from remote sensing (RS) sources at the spatial and temporal resolutions shown in Table 1. These data are freely available at high spatial and temporal resolution. To take into account the elapsing time between the climatic suitability for malaria transmission and parasitaemia, the climatic data were gathered for different periods (up to one year) prior to the survey, starting from May 2005. Day and night land surface temperature (LST), normalized difference vegetation index (NDVI) and land cover types were downloaded from the Moderate Resolution Imaging Spectroradiometer (MODIS) products of the US Geological Survey (USGS) Land Processes Distributed Active Archive Center (LP DAAC) [10]. LST data were extracted as averages over 8-day periods at 1 km spatial resolution. NDVI was obtained as a 16-day average at 0.25 km spatial resolution. Land cover data were available from MODIS for the year 2004 and contained 17 different land cover categories as defined by the International Geosphere-Biosphere Programme (IGBP). They were grouped into five categories, namely wetlands, forests, urban areas, shrublands and others. At each cluster location, the land cover covariate was summarized by the proportion of each land type within a radius of 3 km. During the model fit, the category "others" was dropped from the analysis to avoid effects of collinearity. Daily rainfall estimates (RFE) were taken from Meteosat 7 satellite images and downloaded from the USGS Famine Early Warning Systems Network (FEWS NET) African Data Dissemination Service (ADDS) [11] at 8 km spatial resolution.

Altitude data were extracted from an interpolated USGS digital elevation model (DEM) [12] available at a spatial resolution of 1 km. The digital maps for three different kinds of water bodies in Zambia (lakes, rivers and wetlands) and urban/rural regions were acquired from the HealthMapper database [13]. The distance from each location to the nearest water body was calculated in IDRISI 32 (Clark Labs). Estimates of the number of persons living in an area of 500 m by 500 m were downloaded from the LandScan™ Global Population Database [14] for the year 2006. The percentage of under-five children out of the total Zambian population (17.3%) was obtained from the 2006 data of the US Census Bureau International Database [15].

The coordinates of the SEAs were calculated as the average of latitude and longitude over all household locations within the SEA. These coordinates were used to link the environmental and malaria data. For the purpose of predicting parasitaemia risk at the unobserved locations, a grid with a cell size of 3 km by 3 km was overlaid on the Zambia map (resulting in around 100,000 grid cells), and the remote sensing data were also extracted for the centroids of the grid cells.

The MODIS Reprojection Tool (USGS) was used to convert the RS data to geo-referenced maps. Further processing of the environmental data and distance calculation for the water bodies was carried out in IDRISI 32. ArcMap v 9.1 (ESRI) was used as a mapping tool. Additional data processing was performed in Stata/SE 9.2 (StataCorp LP).

Statistical models
Most of the climatic RS data are available continuously in time. Depending on the malaria endemicity, the

Table 1 Source, spatial and temporal resolution of remote sensing (RS) data

Predictor                                        Spatial resolution   Temporal resolution   Source
Day land surface temperature (day LST)           1 × 1 km2            8 days                MODIS
Night land surface temperature (night LST)       1 × 1 km2            8 days                MODIS
Normalized difference vegetation index (NDVI)    0.25 × 0.25 km2      16 days               MODIS
Land cover                                       1 × 1 km2            -                     MODIS
Rainfall estimate (RFE)                          8 × 8 km2            daily                 ADDS
Elevation                                        1 × 1 km2            -                     USGS
Region (urban/rural)                             1 × 1 km2            -                     HealthMapper
Water bodies (rivers, lakes & wetlands)          1 × 1 km2            -                     HealthMapper
Population counts                                0.5 × 0.5 km2        -                     LandScan 2006


duration of the malaria transmission season and environmental factors, there is an elapsing (lag) time between the climatic suitability for malaria transmission and the occurrence of the disease. To determine this period, which may differ among environmental factors, a lag time analysis was carried out. A lag time is defined as a period prior to the survey during which an average value of the climatic factor was calculated. However, the first 16 days preceding the survey were excluded, because parasite development in the mosquito takes around two weeks before the mosquito becomes infectious. For each climatic predictor, a number of analysis variables were created. These variables represent short- and long-term averages of the values of the climatic factor at different lag times. For factors extracted at 1- or 8-day temporal resolution, the lag times were multiples of 8 days (that is, 8, 16, ..., 360). For NDVI, which was extracted at 16-day temporal resolution, the lag times were multiples of 16 days up to one year. In the end, 45 lag time variables were created for each of rainfall, day LST and night LST, and 22 lag time variables for NDVI.

Bivariate logistic regression models were fitted to assess the relation between the parasitaemia risk outcome and the environmental lag time variables. For each climatic factor, the lag time variable that was further considered in the analysis was the one giving a model with the smallest Akaike's Information Criterion (AIC). All covariates that were significant in the bivariate analysis at the 15% significance level, as determined by likelihood ratio tests, were included in a multiple geostatistical logistic regression analysis.

Several geostatistical multiple logistic regression models were fitted to assess and capture potential non-linearity in the malaria-environment relation. These models included covariates (i) in continuous scales, (ii) in categorical scales with categories based on quartiles, and (iii) fitted by penalized and basis spline (P- and B-spline) curves (see Additional file 1). The model with the best predictive ability was chosen via a model validation procedure. In the geostatistical model specification, spatial correlation was taken into account by including household location-specific random effects and assuming that they derive from a multivariate Gaussian spatial process with zero mean [8]. The covariance between any pair of locations was assumed to be an exponential function of the distance between the locations. Covariates and random effects were modelled on the logit scale of the parasitaemia risk parameters. The above geostatistical models have at least as many parameters as the number of locations, but model fit is possible via MCMC simulation methods. Exploratory analyses suggested weak spatial correlation; therefore, non-spatial models (having smaller numbers of parameters) were also fitted. The model with the best predictive ability was employed to predict the risk at the unsampled locations via Bayesian kriging. Predictions were made over a grid of around 100,000 pixels to obtain a parasitaemia risk map for Zambia.
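As an illustration only, the construction of the lag time variables described in this subsection can be sketched in Python. The sketch assumes that each lag window ends where the excluded 16-day period begins, which is one possible reading of the definition; the function names and the example series are hypothetical and are not taken from the software used in the study.

```python
import numpy as np

EXCLUDED_DAYS = 16  # days immediately before the survey that are skipped


def lag_lengths(step_days, max_days):
    """Candidate lag lengths in days: multiples of step_days up to max_days."""
    return list(range(step_days, max_days + 1, step_days))


def lag_mean(daily_values, lag_days, excluded_days=EXCLUDED_DAYS):
    """Mean of a daily climatic series over one lag window.

    daily_values[-1] is the day before the survey. Assumption: the window
    covers the lag_days days that end where the excluded period begins.
    """
    series = np.asarray(daily_values, dtype=float)
    end = len(series) - excluded_days
    start = end - lag_days
    if start < 0:
        raise ValueError("series is too short for this lag")
    return float(series[start:end].mean())


rainfall_lags = lag_lengths(8, 360)  # also used for day and night LST
ndvi_lags = lag_lengths(16, 365)     # NDVI, multiples of 16 days up to one year

# Hypothetical series whose value is the number of days before the survey.
days_before_survey = np.arange(400, 0, -1)
example = lag_mean(days_before_survey, lag_days=8)  # averages days 17 to 24
```

Under these assumptions the sketch produces 45 candidate lags for the factors with 8-day steps and 22 for NDVI, in line with the counts reported above. Choosing the lag with the smallest AIC would additionally require fitting one bivariate logistic regression per candidate variable.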

Model fit and validation
A random sample of 89 (training) locations was selected for model fit, and the predictive ability of the models was assessed on the remaining 20 (test) locations. The distances from the selected test locations to the nearest training location range from 900 m to 76 km, with a median of about 26 km, suggesting that the locations are representative of the underlying spatial process. Model predictions were compared using the following three approaches [16,17]: (i) the proportion of test locations falling within the 95% Bayesian confidence (credible) interval (CI), and (ii) distance measures between observed and predicted parasitaemia prevalence, calculated by the Kullback-Leibler (KL) divergence and an analogue to the χ²-test. In particular, for each test location the posterior predictive distribution (PPD) was computed using each fitted model. Based on the PPD, Bayesian confidence intervals were calculated with probability coverage of 95%. For each credible interval, it was examined whether the test location fell within that interval. The model predicting the largest number of test locations within the 95% CI of smallest width was considered the best one. The KL measure calculates the mean divergence between the observed and predicted parasitaemia prevalence on the logit scale, weighted by the observed value. The analogue to the χ²-test is based on the squared distance between the median of the PPD and the observed parasitaemia prevalence, divided by the observed prevalence. The model giving the smallest divergence or the smallest distance is considered the best model.

A mathematical description of the models used is given in Additional file 1. The statistical analysis was carried out in Stata/SE 9.2 (StataCorp LP), WinBUGS (Imperial College and Medical Research Council, UK) and in specialized software written by the authors in the Fortran 95 (Digital Equipment Corporation) programming language using standard numerical libraries (Numerical Algorithms Group Ltd).
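The interval-based and χ²-type criteria described in this subsection can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes prevalence expressed in percent, equal-tailed 95% intervals taken from posterior predictive draws, and a χ² analogue summed over the test locations. The KL measure is left out because its exact weighting is not spelled out in the text. The function name `validate` is hypothetical.

```python
import numpy as np


def validate(ppd_samples, observed):
    """Summarise predictive performance on held-out test locations.

    ppd_samples: array (n_draws, n_locations) of predicted prevalence (%).
    observed:    array (n_locations,) of observed prevalence (%), all > 0.
    """
    ppd = np.asarray(ppd_samples, dtype=float)
    obs = np.asarray(observed, dtype=float)
    # Equal-tailed 95% interval and median of each location's PPD.
    lower, median, upper = np.percentile(ppd, [2.5, 50.0, 97.5], axis=0)
    inside = (obs >= lower) & (obs <= upper)
    return {
        # Share of test locations whose observed value lies in the interval.
        "coverage": float(inside.mean()),
        # Average absolute width of the intervals.
        "mean_width": float((upper - lower).mean()),
        # Squared distance between PPD median and observation, scaled by
        # the observation and summed over locations (assumed aggregation).
        "chi2": float((((median - obs) ** 2) / obs).sum()),
    }
```

A model would be preferred when it combines high coverage with narrow intervals and a small χ² value, which mirrors the comparison reported in Table 2.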

Results
The MIS included 2,364 children under five years of age randomly sampled over 120 cluster locations. However, a sample of only 1,324 children at 109 cluster locations had complete parasitological data linked to a geo-located household and could be used to estimate the distribution of parasitaemia risk. The study profile is given in Figure 1. The sample locations with the observed parasitaemia prevalence are shown in Figure 2.


The lag time analysis suggested that the following periods preceding the survey are best (in terms of model fit) for summarizing the climatic factors: 2.7 months for rainfall, 1 month for NDVI, and 1.5 months for day and night LST. The geographical distributions of the environmental factors summarized at the above lag times are displayed in Figure 3.

Non-spatial bivariate logistic regression analyses and the likelihood ratio test indicated that all RS factors were significant at the 15% significance level (results not presented). All these variables were further included in the geostatistical analysis. Exploratory analysis indicated non-linearity in the relation between the parasitaemia risk and the following environmental predictors: NDVI, rainfall, day and night LST. Various Bayesian multiple logistic regression models (spatial as well as non-spatial) were fitted, modelling the non-linearity of the above factors via spline curves or categorical covariates. In

Figure 1 Study profile of the ZMIS for predicting parasitaemia risk.

Figure 2 Observed parasitaemia prevalence (left) and province names (right). Observed parasitaemia prevalence within district boundaries at 109 cluster locations used in estimating the distribution of parasitaemia risk in Zambia (left-hand side). The grey dots indicate the 11 clusters that were excluded from the analysis. Province names are given on the right-hand map.


Figure 3 Spatial distribution of remotely sensed covariates in Zambia. The climatic factors (day LST, night LST, NDVI, rainfall) were summarized over a period preceding the survey indicated by the lag time analysis. The land use map presents the most frequent land use category in a buffer of 3 km around every pixel.


addition, all models included land cover types, region type (urban/rural), altitude and distance to the nearest water bodies as categorical covariates. Results on model validation are presented in Table 2. The B-spline models predicted the highest proportion of test locations (70%) correctly within the 95% CI; however, the B-spline models always produced the 95% CIs with the largest absolute widths. In comparison, the spatial P-spline model predicted nearly the same proportion of test locations (65%) correctly within a 95% CI of considerably smaller width. Among the models predicting 60% of test locations correctly within a 95% CI, the non-spatial model with linear terms had the smallest absolute width. This model additionally shows the smallest KL divergence, followed by the spatial P-spline model, and the second smallest χ²-value. The spatial model with linear terms has the smallest χ²-value. Based on the model validation results, the non-spatial model with linear terms was chosen as the final model used for prediction, owing to the very good KL divergence and χ²-test results and the smallest width of the CIs. This model also has the advantage of a small number of model covariates (compared to the second best model, based on spatial P-splines), avoiding over-parameterization problems. The model was employed to predict the parasitaemia risk at unsampled locations and included the following predictors: proportion of each land cover type (excluding the "other" category) within a 3 km radius around the location, categorical covariates for the environmental predictors (altitude, region type and distance to the nearest water bodies), and linear climatic predictors (NDVI, rainfall, day and night LST).

Predictions obtained at around 100,000 pixels via Bayesian kriging are shown in Figure 4. The predicted parasitaemia prevalence ranges between 0.8% and 80.9% based on the posterior predictive distribution, while the observed data vary between 0.2% and 59.8%. The overall prevalence (mean of the predictions for every pixel) is approximately 26.4% with a standard deviation of 15.2% (observed locations: 22.8% with 16.4% standard deviation). Relatively low-risk areas (<10%) were frequently predicted for North-Western, Western and Southern provinces, in comparison to the high-risk areas (>50%) mainly found in Eastern province and adjacent regions. Estimates of the corresponding prediction error are depicted in Figure 5. The map shows that regions with high prediction errors have high parasitaemia risk. These are mainly areas with sparse survey locations.

The parasitaemia risk estimates were combined with the estimated number of children below five years living in the areas of the corresponding pixels to calculate the number of children with parasitaemia. These estimates are presented in Figure 6 in map form and in Table 3 as total counts at province level. Taking into account the population distribution, the average prevalence of parasitaemia is approximately 20.0%. Lusaka province has the lowest population-adjusted prevalence level of 7.3% and is the only province with a predicted mean prevalence of <10%. Northern province has the highest population-adjusted prevalence of 38.3%, followed by Luapula (30.3%) and Eastern province (27.7%).

To assess the effects of malaria interventions in Zambia after adjusting for climatic and environmental effects, the above model was fitted again with three additional covariates: socio-economic status of the household, indoor residual spraying within the last 12 months, and presence of at least one bed net in the household. These covariates were not used for prediction, as accurate estimates of their distribution in Zambia do not exist for 2006. The regression coefficients of this model (multivariate non-spatial) are given in Table 4 together with those of the bivariate non-spatial logistic regression models. In addition, results of the non-spatial model with linear terms and the second best model (spatial P-spline) are presented. The results of the bivariate regression models reflect significant negative relations with day LST, proportion of urban areas within a 3 km buffer, the region type (with urban areas having lower parasitaemia risk), distance to the nearest water body, altitude levels above 1.4 km, socio-economic status (4th and 5th quintiles), spraying and presence of bed nets. Positive significant relations were detected with night LST, NDVI, rainfall within the last 2.7 months and proportion of wetlands in the surrounding area. In the multivariate non-spatial model, most covariates lost significance. The only remaining significant parameter was the presence of at least one bed net in a household, which reduces the odds of parasitaemia in children by 40% (CI: 12%, 61%). The final prediction model (non-spatial model with linear terms), without the additional household-specific covariates, showed no significant relation between the predictors and the parasitaemia risk. The non-spatial

Table 2 Model validation summary for the spatial (s) and non-spatial (ns) models

Model              CI (width)    KL        χ²
Linear (ns)        60% (0.58)    21.49      48.67
Linear (s)         50% (0.59)    22.58      44.59
Categorical (ns)   60% (0.69)    30.81      69.43
Categorical (s)    60% (0.68)    30.33      61.43
P-spline (ns)      60% (0.61)    23.11      81.94
P-spline (s)       65% (0.61)    22.29      74.17
B-spline (ns)      70% (0.72)    27.59     266.98
B-spline (s)       70% (0.72)    28.85     268.46

Comparison of Bayesian credible intervals (CI) of 95% probability coverage with their corresponding width, Kullback-Leibler divergences (KL) and the χ²-test analogue on 20 test locations.


Figure 4 Predicted parasitaemia risk map for children <5 years in Zambia. The map is based on a Bayesian logistic regression model with linear terms for day LST, night LST, NDVI and rainfall. The estimates correspond to the median of the posterior predictive distributions computed over 100,000 pixels.

Figure 5 Prediction error of the parasitaemia risk estimates given in Figure 4.


variance was around 1.7. The P-spline spatial model estimates a ratio of spatial to total variation of almost 1/2. The minimum distance at which the spatial correlation is lower than 5%, as estimated by the spatial model, is 380 meters (95% confidence interval: 210 m, 3,390 m), which is even lower than the resolution of the grid used for prediction (3 km). This suggests a very weak spatial correlation and supports the choice of the non-spatial model as the one with the best predictive ability.
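The two quantities discussed in this paragraph can be reproduced approximately with a short calculation. The sketch below assumes the exponential correlation is parameterized as exp(-rho * d), which is one common form and may differ from the parameterization in Additional file 1, and it plugs in the posterior means from Table 4 (range 0.38 km, spatial variance 0.98, non-spatial variance 0.82). A ratio of posterior means is only a rough stand-in for the posterior of the ratio. Function names are hypothetical.

```python
import math


def decay_from_range(range_km, threshold=0.05):
    """Decay rate rho for which exp(-rho * d) equals threshold at d = range_km."""
    return -math.log(threshold) / range_km


def correlation(distance_km, rho):
    """Exponential spatial correlation at a given distance (assumed form)."""
    return math.exp(-rho * distance_km)


rho = decay_from_range(0.38)           # range estimate from Table 4, in km
corr_at_pixel = correlation(3.0, rho)  # correlation across one 3 km pixel

# Share of the total variation attributed to the spatial process.
spatial_share = 0.98 / (0.98 + 0.82)
```

With these inputs the correlation across a single 3 km pixel is effectively zero and the spatial share is a little above one half, which is consistent with the conclusion that a spatial random effect adds little at the prediction resolution.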

Discussion
The ZMIS in 2006 was the first nation-wide malaria survey which combined parasitological data with other malaria indicators such as bed net use, indoor residual spraying and household-related aspects. The aim of the survey was to estimate the coverage of interventions and the malaria-related burden in children less than five years. However, the MIS data are also a very important source of information for estimating parasitaemia risk at local scales and thus for identifying the high-risk areas that require high intervention coverage and continuous monitoring. By combining parasitaemia risk estimates with population data, the number of infected children can be estimated, which can support better resource allocation, health management and targeted additional interventions to achieve the highest risk reduction for the most populated areas. Repeated surveys will enable the

Table 3 Predicted number of children <5 years with malaria parasites in the blood (per province)

Province        Prev 1 (in %)   Children <5 years   Infected children   95% CI               Prev 2 (in %)
Central         26.0            182,847             34,572              (21,589; 50,252)     18.9
Copperbelt      23.3            311,317             37,763              (18,572; 70,719)     12.1
Eastern         37.4            240,137             66,614              (46,297; 87,219)     27.7
Luapula         32.0            125,049             37,943              (29,039; 47,638)     30.3
Lusaka          31.8            275,120             20,134              (8,121; 46,849)      7.3
North-Western   21.0            128,935             29,011              (16,200; 51,616)     22.5
Northern        39.1            277,764             106,322             (79,379; 135,701)    38.3
Southern        18.8            243,743             33,430              (19,862; 53,854)     13.7
Western         14.4            147,229             20,321              (12,730; 30,232)     13.8
Total           26.4            1,932,141           386,110             (251,789; 574,080)   20.0

Estimates are based on the mean and the 95% confidence intervals (CI) of the posterior predictive distribution of the non-spatial model with linear terms.
Prev 1: Model-based risk estimates.
Prev 2: Model-based population-adjusted prevalence.
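The population-adjusted prevalence (Prev 2) in Table 3 appears to equal the predicted number of infected children divided by the number of children under five in each province. The following sketch checks that reading using the counts from the table; the variable and function names are illustrative only.

```python
# (children < 5 years, predicted infected children), copied from Table 3
TABLE_3 = {
    "Central": (182847, 34572),
    "Copperbelt": (311317, 37763),
    "Eastern": (240137, 66614),
    "Luapula": (125049, 37943),
    "Lusaka": (275120, 20134),
    "North-Western": (128935, 29011),
    "Northern": (277764, 106322),
    "Southern": (243743, 33430),
    "Western": (147229, 20321),
}


def adjusted_prevalence(children, infected):
    """Population-adjusted prevalence in percent."""
    return 100.0 * infected / children


by_province = {
    name: round(adjusted_prevalence(children, infected), 1)
    for name, (children, infected) in TABLE_3.items()
}

total_children = sum(children for children, _ in TABLE_3.values())
total_infected = sum(infected for _, infected in TABLE_3.values())
national = round(adjusted_prevalence(total_children, total_infected), 1)
```

Because the national figure weights each province by its child population, it is lower than the unweighted pixel average (Prev 1), which gives sparsely populated high-risk pixels the same weight as densely populated ones.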

Figure 6 Estimated number of infected children <5 years per 100 km².


evaluation of the effectiveness of on-going interventions. The Zambian Ministry of Health and partners have already completed the second ZMIS in the country in 2008 and are planning another follow-up survey in 2010.

Prior to the MIS, compiled historical survey data have been used to obtain estimates of parasitaemia risk at high resolution. The Mapping Malaria Risk in Africa (MARA) project was initiated in 1998 with the aim of compiling published and unpublished malaria survey data in Africa. MARA is the most comprehensive malariometric database, compiling data from 1900 up to date. The MARA data have been analysed using rigorous spatial statistical modelling [17-20] to obtain high-resolution malaria risk estimates at regional and country level in Africa. However, risk estimates from historical data do not reflect the current malaria situation, which is influenced by on-going interventions. The surveys are not representative, as high-risk areas tend to be over-represented. In addition, historical surveys have been conducted at various locations using different methodologies, including different age groups, and carried out in different seasons. In contrast, MISs do not suffer from these drawbacks. The MIS locations are randomly chosen and the data are available at individual level, allowing for estimation of age-specific risk. Another advantage of the MIS data is the household-level information available, which gives the possibility of differentiating the contribution of climate, socio-

Table 4 Parasitaemia risk predictors of different models

Covariates                Bivariate            Multivariate         Prediction           Spatial P-spline
                          non-spatial          non-spatial          model                model
                          OR (95% CI)          OR (95% CI)          OR (95% CI)          OR (95% CI)
Day LST                   0.49 (0.43, 0.57)    0.65 (0.37, 1.15)    0.61 (0.32, 1.17)
Night LST                 1.23 (1.09, 1.40)    1.21 (0.77, 1.88)    1.18 (0.79, 1.77)
NDVI                      2.25 (1.90, 2.66)    1.28 (0.67, 2.73)    1.29 (0.66, 2.77)
Rainfall                  1.56 (1.37, 1.76)    1.21 (0.85, 1.68)    1.18 (0.80, 1.73)
Land cover covariates
  Wetland                 1.20 (1.07, 1.34)    0.97 (0.62, 1.55)    0.98 (0.67, 1.48)    0.72 (0.40, 1.37)
  Forest                  0.96 (0.84, 1.10)    0.72 (0.43, 1.08)    0.72 (0.43, 1.10)    0.64 (0.38, 0.99)
  Urban                   0.35 (0.25, 0.50)    0.70 (0.38, 1.21)    0.71 (0.37, 1.29)    0.75 (0.33, 1.48)
  Shrubland               1.06 (0.94, 1.20)    1.07 (0.76, 1.53)    1.05 (0.71, 1.47)    1.07 (0.72, 1.53)
Region (rural)
  urban                   0.17 (0.11, 0.25)    0.53 (0.14, 2.03)    0.37 (0.11, 1.13)    0.43 (0.12, 1.50)
Distance to water bodies (<1000 m)
  1000-2499               0.71 (0.53, 0.96)    0.73 (0.29, 1.86)    0.71 (0.29, 1.72)    0.55 (0.19, 1.50)
  2500-4999               0.54 (0.38, 0.77)    0.61 (0.20, 1.55)    0.60 (0.19, 1.72)    0.49 (0.17, 1.42)
  ≥ 5000                  0.11 (0.04, 0.30)    0.22 (0.03, 1.40)    0.21 (0.03, 1.39)    0.20 (0.02, 1.93)
Altitude (<850 m)
  850-1199                0.72 (0.49, 1.06)    0.21 (0.03, 1.70)    0.22 (0.03, 1.80)    0.20 (0.02, 1.92)
  1200-1399               0.73 (0.49, 1.09)    0.32 (0.03, 3.27)    0.30 (0.03, 3.29)    0.27 (0.02, 4.81)
  ≥ 1400                  0.48 (0.25, 0.92)    0.74 (0.04, 10.2)    0.57 (0.03, 7.68)    0.32 (0.02, 7.86)
Socio-economic index (1st quintile)
  2nd quintile            1.06 (0.75, 1.50)    1.21 (0.78, 1.92)    -                    -
  3rd quintile            0.85 (0.60, 1.22)    1.31 (0.75, 2.23)    -                    -
  4th quintile            0.28 (0.17, 0.46)    0.75 (0.33, 1.75)    -                    -
  5th quintile            0.09 (0.04, 0.19)    0.40 (0.09, 1.56)    -                    -
Interventions
  IRS                     0.16 (0.07, 0.36)    1.73 (0.42, 6.90)    -                    -
  Bed nets                0.59 (0.46, 0.77)    0.60 (0.39, 0.88)    -                    -

                                               Mean (95% CI)        Mean (95% CI)        Mean (95% CI)
Range (in km)             -                    -                    -                    0.38 (0.21, 3.39)
σ² (spatial error)        -                    -                    -                    0.98 (0.01, 2.77)
τ² (measurement error)    -                    1.77 (0.90, 3.23)    1.71 (0.93, 2.84)    0.82 (0.01, 2.69)

Associations between parasitaemia risk and predictors of the non-spatial model with linear terms and the Bayesian spatial logistic regression P-spline model, presented as odds ratios (OR) with their respective 95% confidence intervals (CI). For the climatic predictors of the spatial P-spline model, regression coefficients are based on P-spline curves.
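The 40% (CI: 12%, 61%) reduction in the odds of parasitaemia quoted in the text follows directly from the bed net odds ratio of 0.60 (0.39, 0.88) in the multivariate non-spatial column. A minimal sketch of the conversion, with a hypothetical helper name, is given below.

```python
def percent_reduction(odds_ratio, ci_low, ci_high):
    """Convert an odds ratio below 1 and its interval into a % reduction in odds.

    Returns (point estimate, smallest reduction, largest reduction).
    """
    point = round((1.0 - odds_ratio) * 100.0)
    # The upper OR bound corresponds to the smallest reduction and vice versa.
    smallest = round((1.0 - ci_high) * 100.0)
    largest = round((1.0 - ci_low) * 100.0)
    return point, smallest, largest


bed_nets = percent_reduction(0.60, 0.39, 0.88)
```

Note that this is a reduction in odds, not in risk; the two are close only when the outcome is rare, which is not the case for parasitaemia in much of Zambia.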


economic characteristics and control interventions to the overall parasitaemia risk. If, in addition, these data are known at high spatial resolution, they can be included in the geostatistical model to obtain more accurate predictions of the malaria risk.

This study created the first contemporary empirical parasitaemia risk map for Zambia. Many existing maps of malaria transmission rely only on rough geographical and climatic iso-lines and expert opinions. Until 1998, none of the maps had a numerical definition; hence the malaria risk maps were neither comparable nor trustworthy. Then Hay et al. [21] produced a climatic map for malaria transmission in Kenya, followed by Craig et al. [22], who developed a climatic suitability malaria risk map for the whole of Africa. Up to now, there are only a few maps that also contain empirical data; the first map of this kind was published in 2000 for Mali by Kleinschmidt et al. [18]. For Zambia, the first empirical malaria risk map was produced by Hay et al. [23] and is part of a global risk map based on historical data.

Malaria survey data are expected to be correlated in space. Spatial correlation at short distances is introduced by the transmission process, driven by the flight range of the mosquito vector, while at wider ranges spatial correlation reflects common exposures to environmental conditions which influence mosquito survival and longevity. High spatial resolution risk estimation requires prediction at locations where malaria survey data are not available. High-resolution environmental data can be obtained via remote sensing. GIS software has excellent mapping capabilities and is a very useful tool for processing RS data.

Statistical techniques model the relation between parasitaemia risk and risk factors (environmental factors, possible interventions, socio-economic factors) via a logistic regression model, which is further used for prediction. Standard statistical methods assume independence of the survey locations. Violating this assumption when modelling spatially correlated malaria survey data may lead to imprecise estimates of the risk, of the significance of the risk factors and of the prediction error. Similarly, modelling spatial correlation in weakly correlated data increases the number of model parameters and decreases the precision of the estimates. Geostatistical models take into account spatial correlation by introducing an additional parameter (random effect) at each survey location and assume that geographical dependence is a function of distance between locations. Depending on the number of survey locations, these models can be highly parameterized, and they can only be estimated using Bayesian inference and MCMC simulation. Bayesian geostatistical models have been employed in malaria risk estimation by, e.g., Diggle et al. [19], Gemperli et al. [20,24] and Gosoniu et al. [16,17]. However, in this study the Bayesian geostatistical model estimated very low spatial correlation, dropping to less than 5% at distances larger than 380 m, which is lower than the spatial resolution of the pixel size used for prediction (3 km by 3 km). The non-spatial model was superior to the fitted spatial models because spatial correlation is only present at a very local scale, indicated by the flight range of the mosquito rather than environmental covariates. Possible reasons might be on-going interventions, which determine mosquito densities and parasitaemia and therefore reduce the influence of environmental predictors on the mosquito.

Previous models of malaria transmission addressed non-linearity between the risk of the malaria-related outcome (on the logit scale) and its predictors solely by categorizing the non-linear covariates. In this study, non-linearity is modelled additionally by using different types of spline curves. The resulting risk estimates suggested that predictions are sensitive to the model fitted. However, model validation indicated that models based on linear terms are superior to non-linear models in Zambia. Even though the P-spline model had the higher predictive ability at 95% CIs, the linear model was considered the one with the best predictive ability, owing to the smaller width of these intervals and the results of the KL divergence and χ²-measures between observed and predicted prevalence data. This model also has the advantage of easy interpretability of the regression coefficients for non-statisticians, in comparison to the spline curve ones.

None of the regression coefficients of the final prediction model were significant. However, they are needed to determine the mean risk estimate for each location, and excluding them heavily reduces the model's predictive ability (results not shown). The lack of significance of the environmental factors is partially explained by the effects of malaria interventions, which can have a stronger influence on the parasitaemia risk than the environmental factors. In fact, the multivariate non-spatial model, which adjusted for different types of interventions, indicated a significant effect of the presence of at least one bed net in a household in reducing parasitaemia risk. Therefore, interventions are a major driver of parasitaemia risk in Zambia, and including these data in the prediction model would increase the accuracy of model-based risk predictions. Unfortunately, in this study the parasitaemia risk could not be predicted conditional on bed net coverage, because the geographical distribution of bed net coverage was not known for the time of the ZMIS 2006. Intervention data are needed over the entire study area in order to be used for prediction purposes. Since, e.g., MACEPA is putting a lot of effort into scaling up malaria interventions in Zambia, the role of interventions is likely to increase even further within


the next couple of years. Therefore, compilation of intervention coverage data at high spatial resolution is becoming essential to create reliable risk maps.

The prediction map indicates high variation of parasitaemia risk over the country. In particular, high risk is predicted in a stripe from the south to the east of Zambia, which is characterized mainly by low altitude, high NDVI, and high day and night LST. The relatively small risk in the south-west of the country might be influenced by low vegetation and rainfall, while the small risk in the north-west regions could be due to low day LST and a high proportion of forests. It is also possible that malaria interventions have been focused on selected areas like the Zambezi river, reducing the parasite prevalence in those areas. It is interesting to note that the maps have been shown to local experts, who confirmed the depicted risk patterns. The map of the prediction error of the Bayesian model could assist in improving the precision of the parasitaemia risk prediction by identifying the areas of high uncertainty, where additional survey locations could be randomly distributed in subsequent MISs, thereby reducing the error and raising the precision of subsequent studies.

The reduced smoothness of the map might be explained by the weak spatial correlation, which most likely indicates that interventions counteract the environmental effects. The sampling framework of the MIS may also tend to underestimate spatial correlation, because the survey clusters are partially driven by population density, where transmission is largely influenced by interventions. However, the drastic changes within small distances highlight the importance of the high-resolution maps. If prediction had been done at a coarser resolution, some of the high-risk areas would disappear and the burden would have been underestimated in these regions. For other purposes which do not need such high precision, the resolution can easily be reduced by combining neighbouring pixels.

The average predicted parasitaemia risk over the whole country was 26.4%; however, after adjusting for the population size, the risk dropped to 20.0%. It is striking that although the average risk in Lusaka province is the fourth highest of all provinces (31.8%), after taking into account the population density the risk reduces to 7.3%, indicating that the highest risk areas are the less populated ones. In fact, Lusaka is the province with the lowest population-adjusted risk. The provinces of Copperbelt, Southern and Western have low population-adjusted risks (less than 15%). On the other hand, Northern province has the highest population-adjusted parasitaemia risk, followed by Luapula and Eastern province. In addition, Northern province has the highest number of infected children less than five years old. Therefore, interventions should be concentrated in the Northern province, especially at the north-eastern border and in the middle of the province. Further interventions in Copperbelt and Southern might also have a strong impact in reducing the overall burden, since the total number of infected children is as high as in Luapula, even though their prevalence is lower than 15%.

Conclusions
The map of parasitaemia risk, together with the prediction error and the population at risk, gives an important overview of the malaria situation in Zambia. The maps can be used by decision-makers to allocate resources and interventions so as to reach the most people in the regions of highest risk and to reduce the burden of malaria in Zambia significantly. The ZMIS of 2008 and 2010 will provide very important information on the changes of the parasitaemia risk over space and time and help in the evaluation of the progress of new and established intervention programmes, adjusted for environmental drivers of the transmission risk.

Additional file 1: Additional information regarding model formulation and spline curves. [http://www.biomedcentral.com/content/supplementary/1475-2875-9-37-S1.PDF]

Acknowledgements
This investigation received financial support from the Swiss National Science Foundation (SNF) (Project No. 325200-118379) and the Bill and Melinda Gates Foundation (Project No. 39777). Special thanks to Prof. Thomas Smith for his valuable inputs on data analysis aspects.

Author details
1 Department of Public Health and Epidemiology, Swiss Tropical Institute, PO Box, 4002 Basel, Switzerland. 2 Malaria Control and Evaluation Partnership in Africa, PATH, Ferney, France. 3 Department of Public Health and Research, Ministry of Health, Zambia.

Authors' contributions
NR analysed the data and drafted the manuscript. PV was responsible for conception and design of the analysis, supervised the implementation and revised the manuscript. JMM participated in designing and coordinating the fieldwork and provided important intellectual content to this study. LG contributed to the analysis and helped to draft the manuscript. ECK gave intellectual content and revised the draft critically. VM enabled the MIS 2006 to occur and has been instrumental in championing the use of these data for guiding decisions. RWS coordinated the fieldwork and reviewed the manuscript. All authors read and approved the final manuscript.

Competing interests
The authors declare that they have no competing interests.

Received: 10 November 2009 Accepted: 1 February 2010 Published: 1 February 2010

References
1. Ministry of Health: 2008 National malaria control action plan: actions for scale-up for impact on malaria in Zambia. Zambia; 2008 [http://www.nmcc.org.zm/publications.htm].
2. Ministry of Health: A road map for impact on malaria in Zambia 2006-2010: a 5-year strategic plan. Zambia; 2006 [http://www.nmcc.org.zm/publications.htm].


3. Campbell K, Terry D, Wood S: Scaling up for impact: a model for malaria control. Zambia; 2007 [http://www.path.org/files/MACEPA_Bro_2007-08-06.pdf].
4. Ministry of Health: 2009 National malaria control action plan: actions for scale-up for impact on malaria in Zambia. Zambia; 2009.
5. Ministry of Health Zambia, Central Statistical Office, PATH MACEPA, CDC, University of Zambia, WHO: Zambia National Malaria Indicator Survey 2006. Zambia; 2006 [http://www.nmcc.org.zm/publications.htm].
6. Craig MH, Snow RW, Le Sueur D: A climate-based distribution model of malaria transmission in sub-Saharan Africa. Parasitol Today 1999, 15:105-111.
7. Guerra CA, Gikandi PW, Tatem AJ, Noor AM, Smith DL, Hay SI, Snow RW: The limits and intensity of Plasmodium falciparum transmission: implications for malaria control and elimination worldwide. PLoS Med 2008, 5:e38.
8. Diggle PJ, Tawn JA, Moyeed RA: Model-based geostatistics. Appl Stat 1998, 47:299-350.
9. Filmer D, Pritchett LH: Estimating wealth effects without expenditure data - or tears: an application to educational enrolments in states of India. Demography 2001, 38:115-32.
10. US Geological Survey (USGS), Land Processes Distributed Active Archive Center (LP DAAC): MODIS land products [https://lpdaac.usgs.gov/lpdaac/products/modis_products_table].
11. US Geological Survey (USGS), Famine Early Warning Systems Network (FEWS NET): African Data Dissemination Service (ADDS) [http://earlywarning.usgs.gov/adds].
12. US Geological Survey (USGS): Digital Elevation Models (DEM) [http://edc2.usgs.gov/geodata/index.php].
13. World Health Organization (WHO): The HealthMapper Database [http://www.who.int/health_mapping/tools/healthmapper/en/index.html].
14. Oak Ridge National Laboratory (Oak Ridge, TN): LandScan™ Global Population Database [http://www.ornl.gov/landscan].
15. US Census Bureau, International Data Base (IDB), Population Division: Zambia; 2006 [http://www.census.gov/ipc/www/idb/country.php].
16. Gosoniu L, Vounatsou P, Sogoba N, Smith T: Bayesian modelling of geostatistical malaria risk data. Geospat Health 2006, 1:127-139.
17. Gosoniu L, Vounatsou P, Sogoba N, Maire N, Smith T: Mapping malaria risk in West Africa using a Bayesian nonparametric non-stationary model. Comput Stat Data Anal 2009, 53:3358-3371.
18. Kleinschmidt I, Bagayoko M, Clarke GPY, Craig M, le Sueur D: A spatial statistical approach to malaria mapping. Int J Epidemiol 2000, 29:355-361.
19. Diggle P, Moyeed R, Rowlingson B, Thomson M: Childhood malaria in the Gambia: a case-study in model-based geostatistics. Appl Stat 2002, 51:493-506.
20. Gemperli A, Sogoba N, Fondjo E, Mabaso M, Bagayako M, Briët OJT, Anderegg D, Liebe J, Smith T: Mapping malaria transmission in West and Central Africa. Trop Med Int Health 2006, 11:1032-46.
21. Hay SI, Snow RW, Rogers DJ: Predicting malaria seasons in Kenya using multitemporal meteorological satellite sensor data. Trans R Soc Trop Med Hyg 1998, 92:12-20.
22. Craig MH, Snow RW, le Sueur D: A climate-based distribution model of malaria transmission in sub-Saharan Africa. Parasitol Today 1999, 15:105-111.
23. Hay SI, Guerra CA, Gething PW, Patil AP, Tatem AJ, Noor AM, Kabaria CW, Manh BH, Elyazar IRF, Brooker S, Smith DL, Moyeed RA, Snow RW: A world malaria map: Plasmodium falciparum endemicity in 2007. PLoS Med 2009, 6:e1000048.
24. Gemperli A, Vounatsou P, Kleinschmidt I, Bagayoko M, Lengeler C, Smith T: Spatial patterns of infant mortality in Mali: the effect of malaria endemicity. Am J Epidemiol 2004, 159:64-72.

doi:10.1186/1475-2875-9-37
Cite this article as: Riedel et al.: Geographical patterns and predictors of malaria risk in Zambia: Bayesian geostatistical modelling of the 2006 Zambia national malaria indicator survey (ZMIS). Malaria Journal 2010, 9:37.


Riedel et al. Malaria Journal 2010, 9:37
http://www.malariajournal.com/content/9/1/37


using the Global Positioning System (GPS). All data were entered in an ACCESS database.

Socio-economic data
Socio-economic data were obtained from a household survey carried out during the ZMIS. An asset index was created as a weighted sum of 59 different household assets extracted from 17 relevant questions included in the household survey. The weights were calculated by principal component analysis (PCA) on the asset indicators [9]. The household asset index was then divided into wealth quintiles to create a socio-economic status variable for the analysis.
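The construction of the asset index can be illustrated with a minimal sketch. It assumes that the asset indicators are coded 0/1, that each indicator is standardised before the PCA and that the loadings of the first principal component serve as weights; these details, and the function names `asset_index` and `wealth_quintile`, are illustrative assumptions rather than the procedure documented for the ZMIS.

```python
import numpy as np

def asset_index(assets):
    """Return (index, weights) for a households x assets indicator matrix."""
    x = np.asarray(assets, dtype=float)
    # Standardise each asset indicator (assumes every column varies).
    z = (x - x.mean(axis=0)) / x.std(axis=0)
    # Rows of vt are the principal axes; the first one supplies the weights.
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    weights = vt[0]
    # The sign of a principal axis is arbitrary; orient it so that the
    # weights sum to a positive value.
    if weights.sum() < 0:
        weights = -weights
    return z @ weights, weights

def wealth_quintile(index):
    """Map index values to quintiles 1 (lowest index) to 5 (highest index)."""
    cuts = np.quantile(index, [0.2, 0.4, 0.6, 0.8])
    return np.searchsorted(cuts, index, side="right") + 1
```

With 59 indicator columns, `asset_index` would return one score per household, and `wealth_quintile` would turn these scores into the five-level socio-economic status variable.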

Environmental and population density data
Environmental predictors were extracted from remote sensing (RS) sources at the spatial and temporal resolutions shown in Table 1. These data are freely available at high spatial and temporal resolution. To take into account the time elapsing between climatic suitability for malaria transmission and parasitaemia, the climatic data were gathered for different periods (up to one year) prior to the survey, starting from May 2005. Day and night land surface temperature (LST), normalized difference vegetation index (NDVI) and land cover types were downloaded from the Moderate Resolution Imaging Spectroradiometer (MODIS) products of the US Geological Survey (USGS) Land Processes Distributed Active Archive Center (LP DAAC) [10]. LST data were extracted as averages over 8-day periods at 1 km spatial resolution. NDVI was obtained as a 16-day average at 0.25 km spatial resolution. Land cover data were available from MODIS for the year 2004 and contained 17 different land cover categories as defined by the International Geosphere-Biosphere Programme (IGBP). They were grouped into five categories, namely wetlands, forests, urban areas, shrublands and others. At each cluster location, the land cover covariate was summarized by the proportion of each land type within a radius of 3 km. During model fitting, the category "others" was dropped from the analysis to avoid collinearity. Daily rainfall estimates (RFE) were derived from Meteosat 7 satellite images and downloaded from the USGS Famine Early Warning Systems Network (FEWS NET) African Data Dissemination Service (ADDS) [11] at 8 km spatial resolution.

Altitude data were extracted from an interpolated USGS digital elevation model (DEM) [12], available at a spatial resolution of 1 km. The digital maps for three different kinds of water bodies in Zambia (lakes, rivers and wetlands) and for urban/rural regions were acquired from the HealthMapper database [13]. The distance from each location to the nearest water body was calculated in IDRISI 32 (Clark Labs). Estimates of the number of persons living in an area of 500 m by 500 m were downloaded from the LandScan™ Global Population Database [14] for the year 2006. The percentage of under-five children in the total Zambian population (17.3%) was obtained from the 2006 data of the US Census Bureau International Data Base [15].

The coordinates of the SEAs were calculated as the average latitude and longitude over all household locations within the SEA. These coordinates were used to link the environmental and malaria data. For the purpose of predicting parasitaemia risk at unobserved locations, a grid with a cell size of 3 km by 3 km was overlaid on the Zambia map (resulting in around 100,000 grid cells), and the remote sensing data were also extracted for the centroids of the grid cells.

The MODIS Reprojection Tool (USGS) was used to convert the RS data to geo-referenced maps. Further processing of the environmental data and the distance calculation for the water bodies were carried out in IDRISI 32. ArcMap v. 9.1 (ESRI) was used as a mapping tool. Additional data processing was performed in Stata/SE 9.2 (StataCorp LP).
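The derivation of the SEA coordinates amounts to a per-cluster average of household positions. A minimal sketch is given below; the record layout `(sea_id, latitude, longitude)` and the function name `sea_centroids` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def sea_centroids(households):
    """Average latitude and longitude of all households within each SEA.

    `households` is an iterable of (sea_id, latitude, longitude) tuples,
    a hypothetical layout used only for this illustration.
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for sea_id, lat, lon in households:
        entry = sums[sea_id]
        entry[0] += lat
        entry[1] += lon
        entry[2] += 1
    # Divide the accumulated coordinates by the household count per SEA.
    return {sea: (s[0] / s[2], s[1] / s[2]) for sea, s in sums.items()}
```

A plain arithmetic mean of coordinates is adequate here because the households of one SEA lie close together; over larger extents a projected coordinate system would be preferable.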

Table 1 Source, spatial and temporal resolution of remote sensing (RS) data

| Predictor | Spatial resolution | Temporal resolution | Source |
|---|---|---|---|
| Day land surface temperature (day LST) | 1 × 1 km² | 8 days | MODIS |
| Night land surface temperature (night LST) | 1 × 1 km² | 8 days | MODIS |
| Normalized difference vegetation index (NDVI) | 0.25 × 0.25 km² | 16 days | MODIS |
| Land cover | 1 × 1 km² | - | MODIS |
| Rainfall estimate (RFE) | 8 × 8 km² | daily | ADDS |
| Elevation | 1 × 1 km² | - | USGS |
| Region (urban/rural) | 1 × 1 km² | - | HealthMapper |
| Water bodies (rivers, lakes & wetlands) | 1 × 1 km² | - | HealthMapper |
| Population counts | 0.5 × 0.5 km² | - | LandScan 2006 |

Statistical models
Most of the climatic RS data are available continuously in time. Depending on the malaria endemicity, the duration of the malaria transmission season and environmental factors, there is an elapsing (lag) time between the climatic suitability for malaria transmission and the occurrence of the disease. To determine this period, which may differ among environmental factors, a lag time analysis was carried out. A lag time is defined as a period prior to the survey over which an average value of the climatic factor was calculated. However, the first 16 days preceding the survey were excluded, because parasite development in the mosquito takes around two weeks before the mosquito becomes infectious. For each climatic predictor, a number of analysis variables were created. These variables represent short- and long-term averages of the values of the climatic factor at different lag times. For factors extracted at 1- or 8-day temporal resolution, the lag times were multiples of 8 days (that is, 8, 16, ..., 360). For NDVI, which was extracted at 16-day temporal resolution, the lag times were multiples of 16 days up to one year. In the end, 45 lag time variables were created for each of rainfall, day LST and night LST, and 22 lag time variables for NDVI.

Bivariate logistic regression models were fitted to assess the relation between the parasitaemia risk outcome and the environmental lag time variables. For each climatic factor, the lag time variable further considered in the analysis was the one giving the model with the smallest Akaike's Information Criterion (AIC). All covariates that were significant in the bivariate analysis at the 15% significance level, as determined by likelihood ratio tests, were included in a multiple geostatistical logistic regression analysis.

Several geostatistical multiple logistic regression models were fitted to assess and capture potential non-linearity in the malaria-environment relation. These models included covariates (i) on continuous scales, (ii) on categorical scales with categories based on quartiles and (iii) fitted by penalized and basis spline (P- and B-spline) curves (see Additional file 1). The model with the best predictive ability was chosen via a model validation procedure. In the geostatistical model specification, spatial correlation was taken into account by including household location-specific random effects and assuming that they derive from a multivariate Gaussian spatial process with zero mean [8]. The covariance between any pair of locations was assumed to be an exponential function of the distance between the locations. Covariates and random effects were modelled on the logit scale of the parasitaemia risk parameters. The above geostatistical models have at least as many parameters as the number of locations, but model fit is possible via MCMC simulation methods. Exploratory analyses suggested weak spatial correlation; therefore, non-spatial models (having smaller numbers of parameters) were also fitted. The model with the best predictive ability was employed to predict the risk at the unsampled locations via Bayesian kriging. Predictions were made over a grid of around 100,000 pixels to obtain a parasitaemia risk map for Zambia.
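The lag time variables described above can be sketched as follows. The sketch assumes a daily series ordered backwards in time from the survey date and assumes that each lag window starts immediately after the excluded 16 days; both the data layout and the function name `lag_time_means` are hypothetical, and the exact window definition used in the study may differ.

```python
import numpy as np

def lag_time_means(series, step, max_lag, skip=16):
    """Average a climatic series over lag windows preceding the survey.

    series[0] is the value one day before the survey, series[1] two days
    before, and so on (a hypothetical daily layout).
    """
    values = np.asarray(series, dtype=float)
    if values.size < skip + max_lag:
        raise ValueError("series too short for the longest lag window")
    means = {}
    for lag in range(step, max_lag + 1, step):
        # Skip the 16 days closest to the survey, then average `lag` days.
        means[lag] = float(values[skip:skip + lag].mean())
    return means

# Placeholder series; real inputs would be rainfall, LST or NDVI values.
rainfall_like = lag_time_means(np.arange(400.0), step=8, max_lag=360)
ndvi_like = lag_time_means(np.arange(400.0), step=16, max_lag=365)
print(len(rainfall_like), len(ndvi_like))  # prints: 45 22
```

The counts agree with the text: multiples of 8 days up to 360 give 45 variables, and multiples of 16 days up to one year give 22. The variable retained for each climatic factor would then be the one whose bivariate logistic model has the smallest AIC.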

Model fit and validation
A random sample of 89 (training) locations was selected for model fit, and the predictive ability of the models was assessed on the remaining 20 (test) locations. The distances from the selected test locations to the nearest training location range from 900 m to 76 km with a median of about 26 km, suggesting that the locations are representative of the underlying spatial process. Model predictions were compared using the following three measures [16,17]: (i) the proportion of test locations falling within the 95% Bayesian confidence (credible) interval (CI), (ii) the Kullback-Leibler (KL) divergence and (iii) an analogue of the χ²-test, the latter two being distance measures between observed and predicted parasitaemia prevalence. In particular, for each test location the posterior predictive distribution (PPD) was computed using each fitted model. Based on the PPD, Bayesian confidence intervals were calculated with a probability coverage of 95%. For each credible interval, it was examined whether the observed value at the test location fell within that interval. The model predicting the largest number of test locations within the 95% CI of smallest width was considered the best one. The KL measure calculates the mean divergence between the observed and predicted parasitaemia prevalence on the logit scale, weighted by the observed value. The analogue of the χ²-test is based on the squared distance between the median of the PPD and the observed parasitaemia prevalence, divided by the observed prevalence. The model giving the smallest divergence or the smallest distance is considered the best model.

A mathematical description of the models used is given in Additional file 1. The statistical analysis was carried out in Stata/SE 9.2 (StataCorp LP), WinBUGS (Imperial College and Medical Research Council, UK) and in specialized software written by the authors in the Fortran 95 (Digital Equipment Corporation) programming language using standard numerical libraries (Numerical Algorithms Group Ltd).
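Two of the validation measures can be illustrated with a short sketch: the proportion of test locations covered by the 95% credible interval (together with the mean interval width) and the χ²-test analogue. The sketch assumes that posterior predictive draws of prevalence are available as an array with one column per test location and that all observed prevalences are greater than zero; the function name `validation_summary` is hypothetical. The KL divergence is omitted because its exact form is specified only in Additional file 1.

```python
import numpy as np

def validation_summary(ppd_draws, observed):
    """Summarise predictive performance at the test locations.

    ppd_draws: array (n_draws, n_locations) of predicted prevalences.
    observed:  array (n_locations,) of observed prevalences, all > 0.
    """
    draws = np.asarray(ppd_draws, dtype=float)
    obs = np.asarray(observed, dtype=float)
    lower, median, upper = np.percentile(draws, [2.5, 50.0, 97.5], axis=0)
    inside = (obs >= lower) & (obs <= upper)
    return {
        # Share of test locations whose observed value lies in the 95% CI.
        "coverage": float(inside.mean()),
        # Mean width of the 95% credible intervals.
        "mean_width": float((upper - lower).mean()),
        # Squared distance between PPD median and observation, scaled by
        # the observation and summed over test locations.
        "chi2_analogue": float(np.sum((median - obs) ** 2 / obs)),
    }
```

Applied to each candidate model in turn, the three numbers would give the kind of comparison reported for the 20 test locations, with high coverage, narrow intervals and a small χ² analogue all counting in a model's favour.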

Results
The MIS included 2364 children under five years of age, randomly sampled over 120 cluster locations. However, only 1324 children at 109 cluster locations had complete parasitological data linked to a geo-located household and could be used to estimate the distribution of parasitaemia risk. The study profile is given in Figure 1. The sample locations with the observed parasitaemia prevalence are shown in Figure 2.


Figure 1 Study profile of the ZMIS for predicting parasitaemia risk.

Figure 2 Observed parasitaemia prevalence (left) and province names (right). Observed parasitaemia prevalence within district boundaries at 109 cluster locations used in estimating the distribution of parasitaemia risk in Zambia (left-hand side). The grey dots indicate the 11 clusters that were excluded from the analysis. Province names are given on the right-hand map.

The lag time analysis suggested that the following periods preceding the survey are best (in terms of model fit) for summarizing the climatic factors: 2.7 months for rainfall, 1 month for NDVI and 1.5 months for day and night LST. The geographical distributions of the environmental factors summarized at the above lag times are displayed in Figure 3.

Figure 3 Spatial distribution of remotely sensed covariates in Zambia. The climatic factors were summarized over a period preceding the survey indicated by the lag time analysis (day LST, night LST, NDVI, rainfall). The land use map presents the most frequent land use category in a buffer of 3 km around every pixel.

Non-spatial bivariate logistic regression analyses and the likelihood ratio test indicated that all RS factors were significant at the 15% significance level (results not presented). All these variables were further included in the geostatistical analysis. Exploratory analysis indicated non-linearity in the relation between the parasitaemia risk and the following environmental predictors: NDVI, rainfall, day and night LST. Various Bayesian multiple logistic regression models (spatial as well as non-spatial) were fitted, modelling the non-linearity of the above factors via spline curves or categorical covariates. In addition, all models included land cover types, region type (urban/rural), altitude and distance to the nearest water bodies as categorical covariates. Results on model validation are presented in Table 2. The B-spline models correctly predicted the largest proportion of test locations (70%) within the 95% CI; however, they always produced the 95% CIs with the largest absolute widths. In comparison, the spatial P-spline model correctly predicted nearly the same proportion of test locations (65%) within a 95% CI of considerably smaller width. Among the models predicting 60% of test locations correctly within a 95% CI, the non-spatial model with linear terms had the smallest absolute width. This model additionally shows the smallest KL divergence, followed by the spatial P-spline model, and the second smallest χ²-value. The spatial model with linear terms has the smallest χ²-value. Based on the model validation results, the non-spatial model with linear terms was chosen as the final model for prediction, owing to its very good KL divergence and χ²-test results and the smallest width of the CIs. This model also has the advantage of a small number of model covariates (compared to the second best model, based on spatial P-splines), avoiding over-parameterization problems. The model was employed to predict the parasitaemia risk at unsampled locations and included the following predictors: the proportion of each land cover type (excluding the "other" category) within a 3 km radius around the location, categorical covariates for the environmental predictors (altitude, region type and distance to the nearest water bodies) and linear climatic predictors (NDVI, rainfall, day and night LST).

Table 2 Model validation summary for the spatial (s) and non-spatial (ns) models

| Model | CI (width) | KL | χ² |
|---|---|---|---|
| Linear (ns) | 60% (0.58) | 2149 | 4867 |
| Linear (s) | 50% (0.59) | 2258 | 4459 |
| Categorical (ns) | 60% (0.69) | 3081 | 6943 |
| Categorical (s) | 60% (0.68) | 3033 | 6143 |
| P-spline (ns) | 60% (0.61) | 2311 | 8194 |
| P-spline (s) | 65% (0.61) | 2229 | 7417 |
| B-spline (ns) | 70% (0.72) | 2759 | 26698 |
| B-spline (s) | 70% (0.72) | 2885 | 26846 |

Comparison of Bayesian credible intervals (CI) of 95% probability coverage with their corresponding width, Kullback-Leibler divergences (KL) and the χ²-test analogue on 20 test locations.

Predictions obtained at around 100,000 pixels via Bayesian kriging are shown in Figure 4. The predicted parasitaemia prevalence ranges between 0.8% and 80.9% based on the posterior predictive distribution, while the observed data vary between 0.2% and 59.8%. The overall prevalence (mean of the predictions for every pixel) is approximately 26.4% with a standard deviation of 15.2% (observed locations: 22.8% with a standard deviation of 16.4%). Relatively low risk areas (<10%) were frequently predicted for North-Western, Western and Southern province, in comparison to the high risk areas (>50%) mainly found in Eastern province and adjacent regions. Estimates of the corresponding prediction error are depicted in Figure 5. The map shows that regions with high prediction errors have high parasitaemia risk. These are mainly areas with sparse survey locations.

Figure 4 Predicted parasitaemia risk map for children <5 years in Zambia. The map is based on a Bayesian logistic regression model with linear terms for day LST, night LST, NDVI and rainfall. The estimates correspond to the median of the posterior predictive distributions computed over 100,000 pixels.

Figure 5 Prediction error of the parasitaemia risk estimates given in Figure 4.

The parasitaemia risk estimates were combined with the estimated number of children below five years living in the areas of the corresponding pixels to calculate the number of children with parasitaemia. These estimates are presented in Figure 6 in map form and in Table 3 as total counts at province level. Taking into account the population distribution, the average prevalence of parasitaemia is approximately 20.0%. Lusaka province has the lowest population-adjusted prevalence (7.3%) and is the only province with a population-adjusted mean prevalence of <10%. Northern province has the highest population-adjusted prevalence (38.3%), followed by Luapula (30.3%) and Eastern province (27.7%).

To assess the effects of malaria interventions in Zambia after adjusting for climatic and environmental effects, the above model was fitted again with three additional covariates: socio-economic status of the household, indoor residual spraying within the last 12 months and presence of at least one bed net in the household. These covariates were not used for prediction, as accurate estimates of their distribution in Zambia do not exist for 2006. The regression coefficients of this model (multivariate non-spatial) are given in Table 4 together with those of the bivariate non-spatial logistic regression models. In addition, results of the non-spatial model with linear terms and of the second best model (spatial P-spline) are presented. The results of the bivariate regression models show significant negative relations with day LST, proportion of urban areas within a 3 km buffer, region type (with urban areas having lower parasitaemia risk), distance to the nearest water body, altitude levels above 1.4 km, socio-economic status (4th and 5th quintile), spraying and presence of bed nets. Significant positive relations were detected with night LST, NDVI, rainfall within the last 2.7 months and proportion of wetlands in the surrounding area. In the multivariate non-spatial model, most covariates lost their significance. The only remaining significant parameter was the presence of at least one bed net in a household, which reduces the odds of parasitaemia in children by 40% (CI: 12%, 61%). The final prediction model (non-spatial model with linear terms), without the additional household-specific covariates, showed no significant association between the predictors and the parasitaemia risk. The non-spatial variance was around 1.7. The P-spline spatial model estimates a ratio of spatial to total variation of almost 1/2. The minimum distance at which the spatial correlation is lower than 5%, as estimated by the spatial model, is 380 meters (95% confidence interval: 210 m, 3390 m), which is even lower than the resolution of the grid used for prediction (3 km). This suggests a very weak spatial correlation and supports the choice of the non-spatial model as the one with the best predictive ability.
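Two of the reported quantities can be checked with simple arithmetic. The sketch below assumes the exponential correlation function is parameterised as exp(-ρd), so that the distance at which correlation falls to 5% equals -ln(0.05)/ρ (roughly 3/ρ); this parameterisation and the function names are assumptions made for illustration, since the exact form is given only in Additional file 1. It also converts the bed net odds ratio of Table 4 into the percentage reduction quoted in the text.

```python
import math

def exp_correlation(distance_km, rho):
    """Exponential correlation under the assumed form exp(-rho * d)."""
    return math.exp(-rho * distance_km)

def distance_for_correlation(rho, threshold=0.05):
    """Distance (km) at which the correlation drops to `threshold`."""
    return -math.log(threshold) / rho

def percent_odds_reduction(odds_ratio):
    """Percentage reduction in odds implied by an odds ratio below 1."""
    return 100.0 * (1.0 - odds_ratio)

# Decay parameter implied by the reported range of 0.38 km.
rho = -math.log(0.05) / 0.38

# Correlation remaining at the 3 km grid resolution used for prediction.
corr_at_grid = exp_correlation(3.0, rho)

# Bed net odds ratio 0.60 (0.39, 0.88) from the multivariate model.
reduction = [round(percent_odds_reduction(v)) for v in (0.60, 0.88, 0.39)]
print(reduction)  # prints: [40, 12, 61]
```

Under this parameterisation the correlation remaining at a distance of 3 km is vanishingly small, which is in line with the conclusion that neighbouring grid cells can be treated as spatially independent.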

Table 3 Predicted number of children <5 years with malaria parasites in the blood (per province)

| Province | Prev 1 (in %) | Children <5 years | Infected children | 95% CI | Prev 2 (in %) |
|---|---|---|---|---|---|
| Central | 26.0 | 182847 | 34572 | 21589, 50252 | 18.9 |
| Copperbelt | 23.3 | 311317 | 37763 | 18572, 70719 | 12.1 |
| Eastern | 37.4 | 240137 | 66614 | 46297, 87219 | 27.7 |
| Luapula | 32.0 | 125049 | 37943 | 29039, 47638 | 30.3 |
| Lusaka | 31.8 | 275120 | 20134 | 8121, 46849 | 7.3 |
| North-Western | 21.0 | 128935 | 29011 | 16200, 51616 | 22.5 |
| Northern | 39.1 | 277764 | 106322 | 79379, 135701 | 38.3 |
| Southern | 18.8 | 243743 | 33430 | 19862, 53854 | 13.7 |
| Western | 14.4 | 147229 | 20321 | 12730, 30232 | 13.8 |
| Total | 26.4 | 1932141 | 386110 | 251789, 574080 | 20.0 |

Estimates are based on the mean and the 95% confidence intervals (CI) of the posterior predictive distribution of the non-spatial model with linear terms.
Prev 1: Model-based risk estimates.
Prev 2: Model-based population-adjusted prevalence.

Figure 6 Estimated number of infected children <5 years per 100 km².

Table 4 Parasitaemia risk predictors of different models

| Covariates | Bivariate non-spatial, OR (95% CI) | Multivariate non-spatial, OR (95% CI) | Prediction model, OR (95% CI) | Spatial P-spline model, OR (95% CI) |
|---|---|---|---|---|
| Day LST | 0.49 (0.43, 0.57) | 0.65 (0.37, 1.15) | 0.61 (0.32, 1.17) | |
| Night LST | 1.23 (1.09, 1.40) | 1.21 (0.77, 1.88) | 1.18 (0.79, 1.77) | |
| NDVI | 2.25 (1.90, 2.66) | 1.28 (0.67, 2.73) | 1.29 (0.66, 2.77) | |
| Rainfall | 1.56 (1.37, 1.76) | 1.21 (0.85, 1.68) | 1.18 (0.80, 1.73) | |
| Land cover covariates | | | | |
| Wetland | 1.20 (1.07, 1.34) | 0.97 (0.62, 1.55) | 0.98 (0.67, 1.48) | 0.72 (0.40, 1.37) |
| Forest | 0.96 (0.84, 1.10) | 0.72 (0.43, 1.08) | 0.72 (0.43, 1.10) | 0.64 (0.38, 0.99) |
| Urban | 0.35 (0.25, 0.50) | 0.70 (0.38, 1.21) | 0.71 (0.37, 1.29) | 0.75 (0.33, 1.48) |
| Shrubland | 1.06 (0.94, 1.20) | 1.07 (0.76, 1.53) | 1.05 (0.71, 1.47) | 1.07 (0.72, 1.53) |
| Region (rural) | | | | |
| urban | 0.17 (0.11, 0.25) | 0.53 (0.14, 2.03) | 0.37 (0.11, 1.13) | 0.43 (0.12, 1.50) |
| Distance to water bodies (<1000 m) | | | | |
| 1000-2499 | 0.71 (0.53, 0.96) | 0.73 (0.29, 1.86) | 0.71 (0.29, 1.72) | 0.55 (0.19, 1.50) |
| 2500-4999 | 0.54 (0.38, 0.77) | 0.61 (0.20, 1.55) | 0.60 (0.19, 1.72) | 0.49 (0.17, 1.42) |
| ≥ 5000 | 0.11 (0.04, 0.30) | 0.22 (0.03, 1.40) | 0.21 (0.03, 1.39) | 0.20 (0.02, 1.93) |
| Altitude (<850 m) | | | | |
| 850-1199 | 0.72 (0.49, 1.06) | 0.21 (0.03, 1.70) | 0.22 (0.03, 1.80) | 0.20 (0.02, 1.92) |
| 1200-1399 | 0.73 (0.49, 1.09) | 0.32 (0.03, 3.27) | 0.30 (0.03, 3.29) | 0.27 (0.02, 4.81) |
| ≥ 1400 | 0.48 (0.25, 0.92) | 0.74 (0.04, 10.2) | 0.57 (0.03, 7.68) | 0.32 (0.02, 7.86) |
| Socio-economic index (1st quintile) | | | | |
| 2nd quintile | 1.06 (0.75, 1.50) | 1.21 (0.78, 1.92) | - | - |
| 3rd quintile | 0.85 (0.60, 1.22) | 1.31 (0.75, 2.23) | - | - |
| 4th quintile | 0.28 (0.17, 0.46) | 0.75 (0.33, 1.75) | - | - |
| 5th quintile | 0.09 (0.04, 0.19) | 0.40 (0.09, 1.56) | - | - |
| Interventions | | | | |
| IRS | 0.16 (0.07, 0.36) | 1.73 (0.42, 6.90) | - | - |
| Bed nets | 0.59 (0.46, 0.77) | 0.60 (0.39, 0.88) | - | - |
| | | Mean (95% CI) | Mean (95% CI) | Mean (95% CI) |
| Range (in km) | - | - | - | 0.38 (0.21, 3.39) |
| σ² (spatial error) | - | - | - | 0.98 (0.01, 2.77) |
| τ² (measurement error) | - | 1.77 (0.90, 3.23) | 1.71 (0.93, 2.84) | 0.82 (0.01, 2.69) |

Associations between parasitaemia risk and predictors of the non-spatial model with linear terms and the Bayesian spatial logistic regression P-spline model, presented as odds ratios (OR) with their respective 95% confidence intervals (CI). Regression coefficients are based on P-spline curves.

Discussion
The ZMIS in 2006 was the first nation-wide malaria survey that combined parasitological data with other malaria indicators such as bed net use, indoor residual spraying and household-related aspects. The aim of the survey was to estimate the coverage of interventions and the malaria-related burden in children less than five years. However, the MIS data are also a very important source of information for estimating parasitaemia risk at local scales and thus for identifying the high-risk areas that require high intervention coverage and continuous monitoring. By combining parasitaemia risk estimates with population data, the number of infected children can be estimated, which can support better resource allocation, health management and targeted additional interventions to achieve the highest risk reduction in the most populated areas. Repeated surveys will enable the evaluation of the effectiveness of on-going interventions. The Zambian Ministry of Health and partners have already completed the second ZMIS in the country in 2008 and are planning another follow-up survey in 2010.

Prior to the MIS, compiled historical survey data have been used to obtain estimates of parasitaemia risk at high resolution. The Mapping Malaria Risk in Africa (MARA) project was initiated in 1998 with the aim of compiling published and unpublished malaria survey data in Africa. MARA is the most comprehensive malariometric database, compiling data from 1900 up to date. The MARA data have been analysed using rigorous spatial statistical modelling [17-20] to obtain high-resolution malaria risk estimates at regional and country level in Africa. However, risk estimates from historical data do not reflect the current malaria situation, which is influenced by on-going interventions. The surveys are not representative, as high-risk areas tend to be over-represented. In addition, historical surveys have been conducted at various locations using different methodologies, including different age groups, and were carried out in different seasons. In contrast, MISs do not suffer from these drawbacks. The MIS locations are randomly chosen and the data are available at individual level, allowing for estimation of age-specific risk. Another advantage of the MIS data is the household-level information available, which makes it possible to differentiate the contributions of climate, socio-economic characteristics and control interventions to the overall parasitaemia risk. If, in addition, these data are known at high spatial resolution, they can be included in the geostatistical model to obtain more accurate predictions of the malaria risk.

This study created the first contemporary empirical parasitaemia risk map for Zambia. Many existing maps of malaria transmission rely only on rough geographical and climatic iso-lines and expert opinions. Until 1998, none of the maps had a numerical definition; hence the malaria risk maps were neither comparable nor trustworthy. Then Hay et al. [21] produced a climatic map of malaria transmission in Kenya, followed by Craig et al. [22], who developed a climatic suitability malaria risk map for the whole of Africa. Up to now, there are only a few maps that also contain empirical data; the first map of this kind was published in 2000 for Mali by Kleinschmidt et al. [18]. For Zambia, the first empirical malaria risk map was produced by Hay et al. [23] and is part of a global risk map based on historical data.

Malaria survey data are expected to be correlated in space. Spatial correlation at short distances is introduced by the transmission process, driven by the flight range of the mosquito vector, while at wider ranges spatial correlation reflects common exposures to environmental conditions that influence mosquito survival and longevity. High spatial resolution risk estimation requires prediction at locations where malaria survey data are not available. High-resolution environmental data can be obtained via remote sensing. GIS software has excellent mapping capabilities and is a very useful tool for processing RS data.

Statistical techniques model the relation between parasitaemia risk and risk factors (environmental factors, possible interventions, socio-economic factors) via a logistic regression model, which is further used for prediction. Standard statistical methods assume independence of the survey locations. Violating this assumption when modelling spatially correlated malaria survey data may lead to imprecise estimates of the risk, of the significance of the risk factors and of the prediction error. Similarly, modelling spatial correlation in weakly correlated data increases the number of model parameters and decreases the precision of the estimates. Geostatistical models take into account spatial correlation by introducing an additional parameter (random effect) at each survey location and assume that geographical dependence is a function of the distance between locations. Depending on the number of survey locations, these models can be highly parameterized, and they can only be estimated using Bayesian inference and MCMC simulation. Bayesian geostatistical models have been employed in malaria risk estimation by, for example, Diggle et al. [19], Gemperli et al. [20,24] and Gosoniu et al. [16,17]. However, in this study the Bayesian geostatistical model estimated very low spatial correlation, dropping to less than 5% at distances larger than 380 m, which is lower than the pixel size used for prediction (3 km by 3 km). The non-spatial model was superior to the fitted spatial models because spatial correlation is only present at a very local scale, governed by the flight range of the mosquito rather than by environmental covariates. A possible reason is that on-going interventions determine mosquito densities and parasitaemia and therefore reduce the influence of environmental predictors on the mosquito.

Previous models of malaria transmission addressed non-linearity between the risk of the malaria-related outcome (on the logit scale) and its predictors solely by categorizing the non-linear covariates. In this study, non-linearity is additionally modelled using different types of spline curves. The resulting risk estimates suggested that predictions are sensitive to the model fitted. However, model validation indicated that models based on linear terms are superior to non-linear models in Zambia. Even though the P-spline model had the higher predictive ability in terms of 95% CIs, the linear model was considered the one with the best predictive ability, owing to the smaller width of these intervals and the results of the KL divergence and χ²-measures between observed and predicted prevalence data. This model also has the advantage that its regression coefficients are easier for non-statisticians to interpret than those of the spline curves.

None of the regression coefficients of the final prediction model were significant. However, they are needed to determine the mean risk estimate for each location, and excluding them heavily reduces the predictive ability of the model (results not shown). The lack of significance of the environmental factors is partially explained by the effects of malaria interventions, which can have a stronger influence on the parasitaemia risk than the environmental factors. In fact, the multivariate non-spatial model, which adjusted for different types of interventions, indicated a significant effect of the presence of at least one bed net in a household in reducing parasitaemia risk. Therefore, interventions are a major driver of parasitaemia risk in Zambia, and including these data in the prediction model would increase the accuracy of model-based risk predictions. Unfortunately, in this study the parasitaemia risk could not be predicted conditional on bed net coverage, because the geographical distribution of bed net coverage was not known for the time of the ZMIS 2006. Intervention data are needed over the entire study area in order to be used for prediction purposes. Since, for example, MACEPA is putting a lot of effort into scaling up malaria interventions in Zambia, the role of interventions is likely to increase even further within the next couple of years. Therefore, compilation of intervention coverage data at high spatial resolution is becoming essential for creating reliable risk maps.

The prediction map indicates high variation of parasitaemia risk over the country. In particular, high risk is predicted in a stripe from the south to the east of Zambia, which is characterized mainly by low altitude, high NDVI and high day and night LST. The relatively small risk in the south-west of the country might be influenced by low vegetation and rainfall, while the small risk in the north-west regions could be due to low day LST and a high proportion of forests. It is also possible that malaria interventions have been focused on selected areas, such as the Zambezi river, reducing the parasite prevalence in those areas. It is interesting to note that the maps have been shown to local experts, who confirmed the depicted risk patterns. The map of the prediction error of the Bayesian model could assist in improving the precision of the parasitaemia risk prediction by identifying the areas of high uncertainty, where additional survey locations could be randomly distributed in subsequent MISs, thereby reducing the error and raising the precision of subsequent studies.

The reduced smoothness of the map might be explained by the weak spatial correlation, which most likely indicates that interventions counteract the environmental effects. The sampling framework of the MIS may also tend to underestimate spatial correlation, because the selection of survey clusters is partially driven by population density, and transmission in densely populated areas is largely influenced by interventions. However, the drastic changes within small distances highlight the importance of high-resolution maps. Had prediction been done at a lower resolution, some of the high-risk areas would have disappeared and the burden would have been underestimated in these regions. For other purposes that do not need such high precision, the resolution can easily be reduced by combining neighbouring pixels.

The average predicted parasitaemia risk over the whole country was 26.4%; however, after adjusting for the population size, the risk dropped to 20.0%. It is striking that, although the average risk in Lusaka province is the fourth highest of all provinces (31.8%), after taking into account the population density the risk reduces to 7.3%, indicating that the highest risk areas are the less populated ones. In fact, Lusaka is the province with the lowest population-adjusted risk. The provinces of Copperbelt, Southern and Western have low population-adjusted risks (less than 15%). On the other hand, Northern province has the highest population-adjusted parasitaemia risk, followed by Luapula and Eastern province. In addition, Northern province has the highest number of infected children less than five years old. Therefore, interventions should be concentrated in Northern province, especially at the north-eastern border and in the middle of the province. Further interventions in Copperbelt and Southern might also have a strong impact in reducing the overall burden, since the total number of infected children there is about as high as in Luapula even though their prevalence is lower than 15%.

Conclusions
The map of parasitaemia risk, together with the prediction error and the population at risk, gives an important overview of the malaria situation in Zambia. The maps can be used by decision-makers to allocate resources and interventions to reach the most persons in the regions of highest risk, and so to reduce the burden of malaria significantly for Zambia. The ZMIS of 2008 and 2010 will provide very important information on the changes of the parasitaemia risk over space and time and help in the evaluation of the progress of new and established intervention programmes, adjusted for environmental drivers of the transmission risk.

Additional file 1: Additional information regarding model formulation and spline curves. [http://www.biomedcentral.com/content/supplementary/1475-2875-9-37-S1.PDF]

Acknowledgements
This investigation received financial support from the Swiss National Science Foundation (SNF) (Project No. 325200-118379) and the Bill and Melinda Gates Foundation (Project No. 39777). Special thanks to Prof. Thomas Smith for his valuable inputs on data analysis aspects.

Author details
1 Department of Public Health and Epidemiology, Swiss Tropical Institute, PO Box, 4002 Basel, Switzerland. 2 Malaria Control and Evaluation Partnership in Africa, PATH, Ferney, France. 3 Department of Public Health and Research, Ministry of Health, Zambia.

Authors' contributions
NR analysed the data and drafted the manuscript. PV was responsible for conception and design of the analysis, supervised the implementation and revised the manuscript. JMM participated in designing and coordinating the fieldwork and provided important intellectual content to this study. LG contributed to the analysis and helped to draft the manuscript. ECK gave intellectual content and revised the draft critically. VM enabled the MIS 2006 to occur and has been instrumental in championing the use of these data for guiding decisions. RWS coordinated the fieldwork and reviewed the manuscript. All authors read and approved the final manuscript.

Competing interests
The authors declare that they have no competing interests.

Received: 10 November 2009 Accepted: 1 February 2010 Published: 1 February 2010

References
1. Ministry of Health: 2008 National malaria control action plan: actions for scale-up for impact on malaria in Zambia. Zambia 2008 [http://www.nmcc.org.zm/publications.htm].

2. Ministry of Health: A road map for impact on malaria in Zambia 2006-2010: a 5-year strategic plan. Zambia 2006 [http://www.nmcc.org.zm/publications.htm].

3. Campbell K, Terry D, Wood S: Scaling up for impact: a model for malaria control. Zambia 2007 [http://www.path.org/files/MACEPA_Bro_2007-08-06.pdf].

4. Ministry of Health: 2009 National malaria control action plan: actions for scale-up for impact on malaria in Zambia. Zambia 2009.

5. Ministry of Health Zambia, Central Statistical Office, PATH MACEPA, CDC, University of Zambia, WHO: Zambia National Malaria Indicator Survey 2006. Zambia 2006 [http://www.nmcc.org.zm/publications.htm].

6. Craig MH, Snow RW, Le Sueur D: A climate-based distribution model of malaria transmission in sub-Saharan Africa. Parasitol Today 1999, 15:105-111.

7. Guerra CA, Gikandi PW, Tatem AJ, Noor AM, Smith DL, Hay SI, Snow RW: The limits and intensity of Plasmodium falciparum transmission: implications for malaria control and elimination worldwide. PLoS Med 2008, 5:e38.

8. Diggle PJ, Tawn JA, Moyeed RA: Model-based geostatistics. Appl Stat 1998, 47:299-350.

9. Filmer D, Pritchett LH: Estimating wealth effects without expenditure data - or tears: an application to educational enrolments in states of India. Demography 2001, 38:115-32.

10. US Geological Survey (USGS), Land Processes Distributed Active Archive Center (LP DAAC): MODIS land products. [https://lpdaac.usgs.gov/lpdaac/products/modis_products_table].

11. US Geological Survey (USGS), Famine Early Warning Systems Network (FEWS NET): African Data Dissemination Service (ADDS). [http://earlywarning.usgs.gov/adds].

12. US Geological Survey (USGS): Digital Elevation Models (DEM). [http://edc2.usgs.gov/geodata/index.php].

13. World Health Organization (WHO): The HealthMapper Database. [http://www.who.int/health_mapping/tools/healthmapper/en/index.html].

14. Oak Ridge National Laboratory (Oak Ridge, TN): LandScan™ Global Population Database. [http://www.ornl.gov/landscan].

15. US Census Bureau, International Data Base (IDB), Population Division: Zambia 2006. [http://www.census.gov/ipc/www/idb/country.php].

16. Gosoniu L, Vounatsou P, Sogoba N, Smith T: Bayesian modelling of geostatistical malaria risk data. Geospat Health 2006, 1:127-139.

17. Gosoniu L, Vounatsou P, Sogoba N, Maire N, Smith T: Mapping malaria risk in West Africa using a Bayesian nonparametric non-stationary model. Comput Stat Data Anal 2009, 53:3358-3371.

18. Kleinschmidt I, Bagayoko M, Clarke GPY, Craig M, le Sueur D: A spatial statistical approach to malaria mapping. Int J Epidemiol 2000, 29:355-361.

19. Diggle P, Moyeed R, Rowlingson B, Thomson M: Childhood malaria in the Gambia: a case-study in model-based geostatistics. Appl Stat 2002, 51:493-506.

20. Gemperli A, Sogoba N, Fondjo E, Mabaso M, Bagayako M, Briët OJT, Anderegg D, Liebe J, Smith T: Mapping malaria transmission in West and Central Africa. Trop Med Int Health 2006, 11:1032-46.

21. Hay SI, Snow RW, Rogers DJ: Predicting malaria seasons in Kenya using multitemporal meteorological satellite sensor data. Trans R Soc Trop Med Hyg 1998, 92:12-20.

22. Craig MH, Snow RW, le Sueur D: A climate-based distribution model of malaria transmission in sub-Saharan Africa. Parasitol Today 1999, 15:105-111.

23. Hay SI, Guerra CA, Gething PW, Patil AP, Tatem AJ, Noor AM, Kabaria CW, Manh BH, Elyazar IRF, Brooker S, Smith DL, Moyeed RA, Snow RW: A world malaria map: Plasmodium falciparum endemicity in 2007. PLoS Med 2009, 6:e1000048.

24. Gemperli A, Vounatsou P, Kleinschmidt I, Bagayoko M, Lengeler C, Smith T: Spatial patterns of infant mortality in Mali: the effect of malaria endemicity. Am J Epidemiol 2004, 159:64-72.

doi:10.1186/1475-2875-9-37
Cite this article as: Riedel et al.: Geographical patterns and predictors of malaria risk in Zambia: Bayesian geostatistical modelling of the 2006 Zambia national malaria indicator survey (ZMIS). Malaria Journal 2010, 9:37.



duration of the malaria transmission season and environmental factors, there is an elapsing (lag) time between the climatic suitability for malaria transmission and the occurrence of the disease. To determine this period, which may differ among environmental factors, a lag time analysis was carried out. A lag time is defined as a period prior to the survey during which an average value of the climatic factor was calculated. However, the first 16 days preceding the survey were excluded because parasite development in the mosquito takes around two weeks before the mosquito becomes infectious. For each climatic predictor, a number of analysis variables were created. These variables represent short- and long-term averages of the values of the climatic factor at different lag times. For factors extracted at 1- or 8-day temporal resolution, the lag times were multiples of 8 days (that is, 8, 16, ..., 360). For NDVI, which was extracted at 16-day temporal resolution, the lag times were multiples of 16 days up to one year. In the end, 45 lag time variables were created for each of rainfall, day LST and night LST, and 22 lag time variables for NDVI.

Bivariate logistic regression models were fitted to assess the relation between the parasitaemia risk outcome and the environmental lag time variables. For each climatic factor, the lag time variable retained for further analysis was the one giving the model with the smallest Akaike's Information Criterion (AIC). All covariates which were significant in the bivariate analysis at the 15% significance level, as determined by likelihood ratio tests, were included in a multiple geostatistical logistic regression analysis.

Several geostatistical multiple logistic regression models were fitted to assess and capture potential non-linearity in the malaria-environment relation. These models included covariates (i) in continuous scales, (ii) in categorical scales with categories based on quartiles and (iii) fitted by penalized and basis spline (P- and B-spline) curves (see Additional file 1). The model with the best predictive ability was chosen via a model validation procedure. In the geostatistical model specification, spatial correlation was taken into account by including household location-specific random effects and assuming that they derive from a multivariate Gaussian spatial process with zero mean [8]. The covariance between any pair of locations was assumed to be an exponential function of the distance between the locations. Covariates and random effects were modelled on the logit scale of the parasitaemia risk parameters. The above geostatistical models have at least as many parameters as the number of locations, but model fit is possible via MCMC simulation methods. Exploratory analyses suggested weak spatial correlation; therefore, non-spatial models (having smaller numbers of parameters) were also fitted. The model with the best predictive ability was employed to predict the risk at the unsampled locations via Bayesian kriging. Predictions were made over a grid of around 100,000 pixels to obtain a parasitaemia risk map for Zambia.
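As a rough illustration of how the lag time variables described in this section could be enumerated, the following Python sketch lists the candidate lag times and averages a daily series over each of them. The function names, the toy rainfall series and the convention that each averaging window starts right after the excluded 16 days are assumptions made for illustration only; the text does not describe the authors' implementation at this level of detail.

```python
from statistics import mean

EXCLUDED_DAYS = 16  # days immediately before the survey that are skipped


def lag_times(step_days, max_days):
    """Candidate lag times: multiples of step_days up to max_days (inclusive)."""
    return list(range(step_days, max_days + 1, step_days))


def lag_mean(series, lag_days, excluded=EXCLUDED_DAYS):
    """Mean of a daily series over one lag window.

    series[0] is the day before the survey, series[1] two days before, etc.
    Assumption: the window covers the lag_days days after the excluded period.
    """
    window = series[excluded:excluded + lag_days]
    if len(window) < lag_days:
        raise ValueError("series too short for this lag time")
    return mean(window)


rain_lst_lags = lag_times(8, 360)  # rainfall, day LST, night LST
ndvi_lags = lag_times(16, 365)     # NDVI, multiples of 16 days up to one year

# Hypothetical daily rainfall (mm), most recent day first.
toy_rain = [0.0] * 16 + [2.0] * 8 + [4.0] * 8 + [1.0] * 400
lag_variables = {lag: lag_mean(toy_rain, lag) for lag in rain_lst_lags}

print(len(rain_lst_lags), len(ndvi_lags), lag_variables[8], lag_variables[16])
```

With these settings the sketch yields 45 lag times for the 8-day factors and 22 for NDVI, which agrees with the counts reported above.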

Model fit and validation
A random sample of 89 (training) locations was selected for model fit, and the predictive ability of the models was assessed on the remaining 20 (test) locations. The distance from a test location to the nearest training location ranges from 900 m to 76 km, with a median of about 26 km, suggesting that the locations are representative of the underlying spatial process. Model predictions were compared using the following three approaches [16,17]: (i) the proportion of test locations falling within the 95% Bayesian confidence (credible) interval (CI), (ii) the Kullback-Leibler (KL) divergence and (iii) an analogue to the χ²-test, the latter two being distance measures between observed and predicted parasitaemia prevalence. In particular, for each test location the posterior predictive distribution (PPD) was computed using each fitted model. Based on the PPD, Bayesian confidence intervals were calculated with probability coverage of 95%. For each credible interval, it was examined whether the observed value at the test location fell within that interval. The model predicting the largest number of test locations within the 95% CI of smallest width was considered the best one. The KL measure calculates the mean divergence between the observed and predicted parasitaemia prevalence on the logit scale, weighted by the observed value. The analogue to the χ²-test is based on the squared distance between the median of the PPD and the observed parasitaemia prevalence, divided by the observed prevalence. The model giving the smallest divergence or the smallest distance is considered the best model.

A mathematical description of the models used is given in Additional file 1. The statistical analysis was carried out in Stata/SE 9.2 (StataCorp LP), WinBUGS (Imperial College and Medical Research Council, UK) and in specialized software written by the authors in the Fortran 95 (Digital Equipment Corporation) programming language using standard numerical libraries (Numerical Algorithms Group Ltd).
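Two of the validation measures described above, the credible-interval coverage (with interval width) and the χ²-test analogue, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the quantile convention and the toy posterior draws are hypothetical, and the KL divergence is left out because its exact form is given only in Additional file 1.

```python
from statistics import median


def quantile(values, q):
    """Empirical quantile with linear interpolation between order statistics."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])


def validate(ppd_draws, observed, level=0.95):
    """Summarize predictive performance on test locations.

    ppd_draws: one list of posterior predictive prevalence draws per test location.
    observed:  observed prevalence per test location (must be > 0).
    """
    tail = (1.0 - level) / 2.0
    hits, widths, chi2_like = 0, [], 0.0
    for draws, obs in zip(ppd_draws, observed):
        lower, upper = quantile(draws, tail), quantile(draws, 1.0 - tail)
        hits += lower <= obs <= upper          # is the observation inside the CI?
        widths.append(upper - lower)
        # Squared distance between PPD median and observation, scaled by observation.
        chi2_like += (median(draws) - obs) ** 2 / obs
    n = len(observed)
    return {"coverage": hits / n,
            "mean_width": sum(widths) / n,
            "chi2_like": chi2_like}


# Toy example: two test locations sharing the same (hypothetical) draws.
draws = [i / 100 for i in range(101)]
result = validate([draws, draws], [0.5, 0.99])
print(result)
```

Under this reading, a model is preferred when its coverage is high, its mean interval width is small and its χ²-like distance is small, which mirrors the trade-off discussed for Table 2.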

Results
The MIS included 2,364 children under five years of age, randomly sampled over 120 cluster locations. However, only a sample of 1,324 children at 109 cluster locations had complete parasitological data linked to a geo-located household and could be used to estimate the distribution of parasitaemia risk. The study profile is given in Figure 1. The sample locations with the observed parasitaemia prevalence are shown in Figure 2.


The lag time analysis suggested that the following periods preceding the survey are best (in terms of model fit) for summarizing the climatic factors: 2.7 months for rainfall, 1 month for NDVI and 1.5 months for day and night LST. The geographical distributions of the environmental factors summarized at the above lag times are displayed in Figure 3.

Non-spatial bivariate logistic regression analyses and the likelihood ratio test indicated that all RS factors were significant at the 15% significance level (results not presented). All these variables were further included in the geostatistical analysis. Exploratory analysis indicated non-linearity in the relation between the parasitaemia risk and the following environmental predictors: NDVI, rainfall, and day and night LST. Various Bayesian multiple logistic regression models (spatial as well as non-spatial) were fitted, modelling the non-linearity of the above factors via spline curves or categorical covariates. In

Figure 1 Study profile of the ZMIS for predicting parasitaemia risk.

Figure 2 Observed parasitaemia prevalence (left) and province names (right). Observed parasitaemia prevalence within district boundaries at 109 cluster locations used in estimating the distribution of parasitaemia risk in Zambia (left-hand side). The grey dots indicate the 11 clusters that were excluded from the analysis. Province names are given on the right-hand map.


Figure 3 Spatial distribution of remotely sensed covariates in Zambia. The climatic factors were summarized over a period preceding the survey indicated by the lag time analysis (day LST, night LST, NDVI, rainfall). The land use map presents the most frequent land use category in a buffer of 3 km around every pixel.


addition, all models included land cover types, region type (urban/rural), altitude and distance to the nearest water bodies as categorical covariates. Results of the model validation are presented in Table 2. The B-spline models were able to predict correctly most of the test locations (70%) within the 95% CI; however, the B-spline models always produced 95% CIs with the largest absolute widths. In comparison, the spatial P-spline model is able to predict correctly nearly the same proportion of test locations (65%) within a 95% CI of considerably smaller width. Among those models predicting 60% of test locations correctly within a 95% CI, the non-spatial model with the linear terms had the smallest absolute width. This model additionally shows the smallest KL divergence, followed by the spatial P-spline model, and the second smallest χ²-value. The spatial model with linear terms has the smallest χ²-value. Based on the model validation results, the non-spatial model with the linear terms was chosen as the final model used for prediction due to the very good KL divergence and χ²-test results and the smallest width of the CIs. This model also has the advantage of a small number of model covariates (compared to the second best model, based on spatial P-splines), avoiding over-parameterization problems. The model was employed to predict the parasitaemia risk at unsampled locations and included the following predictors: proportion of each land cover type (excluding the "other" category) within a 3 km radius around the location, categorical covariates for the environmental predictors (altitude, region type and distance to the nearest water bodies) and linear climatic predictors (NDVI, rainfall, day and night LST).

Predictions obtained at around 100,000 pixels via Bayesian kriging are shown in Figure 4. The predicted parasitaemia prevalence ranges between 0.8% and 80.9% based on the posterior predictive distribution, while the observed data vary between 0.2% and 59.8%. The overall prevalence (mean of the prediction for every pixel) is approximately 26.4% with a standard deviation of 15.2% (observed locations: 22.8% with 16.4% standard deviation). Relatively low-risk areas (<10%) were frequently predicted for North-Western, Western and Southern province, in comparison to the high-risk areas (>50%) mainly found in Eastern province and adjacent regions. Estimates of the corresponding prediction error are depicted in Figure 5. The map shows that regions with high prediction errors have high parasitaemia risk. These are mainly areas with sparse survey locations.

The parasitaemia risk estimates were combined with the estimated number of children below five years living in the areas of the corresponding pixels to calculate the number of children with parasitaemia. These estimates are presented in Figure 6 in map form and in Table 3 as total counts at province level. Taking into account the population distribution, the average prevalence of parasitaemia is approximately 20.0%. Lusaka province has the lowest population-adjusted prevalence level, 7.3%, and is the only province with a predicted mean prevalence of <10%. Northern province has the highest population-adjusted prevalence, 38.3%, followed by Luapula (30.3%) and Eastern province (27.7%).

To assess the effects of malaria interventions in Zambia after adjusting for climatic and environmental effects, the above model was fitted again with three additional covariates: socio-economic status of the household, indoor residual spraying within the last 12 months and presence of at least one bed net in the household. These covariates were not used for prediction, as accurate estimates of their distribution in Zambia do not exist for 2006. The regression coefficients of this model (multivariate non-spatial) are given in Table 4, together with those of the bivariate non-spatial logistic regression models. In addition, results of the non-spatial model with linear terms and the second best model (spatial P-spline) are presented. The results of the bivariate regression models reflect significant negative relations with day LST, proportion of urban areas within a 3 km buffer, region type (with urban areas having lower parasitaemia risk), distance to the nearest water body, altitude levels above 1.4 km, socio-economic status (4th and 5th quintile), spraying and presence of bed nets. Positive significant relations were detected with night LST, NDVI, rainfall within the last 2.7 months and proportion of wetlands in the surrounding area. In the multivariate non-spatial model, most covariates lost their significance. The only remaining significant parameter was the presence of at least one bed net in a household, which reduces the odds of parasitaemia in children by 40% (CI: 12%, 61%). The final prediction model (non-spatial model with the linear terms), without the additional household-specific covariates, showed no significant relation between the predictors and the parasitaemia risk. The non-spatial

Table 2 Model validation summary for the spatial (s) and non-spatial (ns) models

Model              CI (width)    KL       χ²
Linear (ns)        60% (0.58)    21.49    48.67
Linear (s)         50% (0.59)    22.58    44.59
Categorical (ns)   60% (0.69)    30.81    69.43
Categorical (s)    60% (0.68)    30.33    61.43
P-spline (ns)      60% (0.61)    23.11    81.94
P-spline (s)       65% (0.61)    22.29    74.17
B-spline (ns)      70% (0.72)    27.59    266.98
B-spline (s)       70% (0.72)    28.85    268.46

Comparison of Bayesian credible intervals (CI) of 95% probability coverage with their corresponding width, Kullback-Leibler divergences (KL) and the χ²-test analogue on 20 test locations.


Figure 4 Predicted parasitaemia risk map for children <5 years in Zambia. The map is based on a Bayesian logistic regression model with linear terms for day LST, night LST, NDVI and rainfall. The estimates correspond to the median of the posterior predictive distributions computed over 100,000 pixels.

Figure 5 Prediction error of the parasitaemia risk estimates given in Figure 4.


variance was around 1.7. The P-spline spatial model estimates a ratio of spatial to total variation of almost 1/2. The minimum distance at which the spatial correlation is lower than 5%, as estimated by the spatial model, is 380 meters (95% confidence interval: 210 m, 3,390 m), which is even lower than the resolution of the grid used for prediction (3 km). This suggests a very weak spatial correlation and supports the choice of the non-spatial model as the one with the best predictive ability.
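To make the link between the exponential correlation function and the reported 380 m range concrete, the short Python sketch below works backwards from that range. It assumes the common parameterization corr(d) = exp(-rho * d), which is an assumption here (the authors' exact parameterization is given in Additional file 1); the function names are hypothetical, and the variance values are the posterior means reported in Table 4.

```python
import math


def decay_from_effective_range(range_km, threshold=0.05):
    """Decay rate rho such that exp(-rho * range_km) equals the threshold."""
    return -math.log(threshold) / range_km


def exp_correlation(distance_km, rho):
    """Exponential correlation between two locations distance_km apart."""
    return math.exp(-rho * distance_km)


rho = decay_from_effective_range(0.38)        # reported range: 0.38 km
corr_at_range = exp_correlation(0.38, rho)    # equals the 5% threshold by construction
corr_at_pixel = exp_correlation(3.0, rho)     # correlation at the 3 km pixel size

# Share of spatial variance in total variance, from the Table 4 posterior means.
sigma2, tau2 = 0.98, 0.82
spatial_share = sigma2 / (sigma2 + tau2)

print(rho, corr_at_range, corr_at_pixel, spatial_share)
```

Under this assumed parameterization the correlation between neighbouring 3 km pixels is negligible, which is in line with the argument that a non-spatial model loses little at the prediction resolution.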

Discussion
The ZMIS in 2006 was the first nation-wide malaria survey which combined parasitological data with other malaria indicators such as bed net use, indoor residual spraying and household-related aspects. The aim of the survey was to estimate the coverage of interventions and the malaria-related burden in children less than five years. However, the MIS data are also a very important source of information for estimating parasitaemia risk at local scales and thus for identifying the high-risk areas that require high intervention coverage and continuous monitoring. By combining parasitaemia risk estimates with population data, the number of infected children can be estimated, which can help to improve resource allocation and health management and to target additional interventions so as to achieve the highest risk reduction for the most populated areas. Repeated surveys will enable the

Table 3 Predicted number of children <5 years with malaria parasites in the blood (per province)

Province        Prev 1 (%)   Children <5 years   Infected children   95% CI                Prev 2 (%)
Central         26.0         182,847             34,572              (21,589, 50,252)      18.9
Copperbelt      23.3         311,317             37,763              (18,572, 70,719)      12.1
Eastern         37.4         240,137             66,614              (46,297, 87,219)      27.7
Luapula         32.0         125,049             37,943              (29,039, 47,638)      30.3
Lusaka          31.8         275,120             20,134              (8,121, 46,849)       7.3
North-Western   21.0         128,935             29,011              (16,200, 51,616)      22.5
Northern        39.1         277,764             106,322             (79,379, 135,701)     38.3
Southern        18.8         243,743             33,430              (19,862, 53,854)      13.7
Western         14.4         147,229             20,321              (12,730, 30,232)      13.8
Total           26.4         1,932,141           386,110             (251,789, 574,080)    20.0

Estimates are based on the mean and the 95% confidence intervals (CI) of the posterior predictive distribution of the non-spatial model with linear terms.
Prev 1: Model-based risk estimates.
Prev 2: Model-based population-adjusted prevalence.
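The population-adjusted prevalence (Prev 2) in Table 3 appears to be the predicted number of infected children divided by the number of children under five in each province. The Python sketch below recomputes it from the table's own counts as a consistency check; the dictionary layout and function name are illustrative choices, and the interpretation of Prev 2 as this simple ratio is an assumption inferred from the numbers rather than a formula stated in the text.

```python
# Province: (children <5 years, predicted infected children, reported Prev 2 in %)
table3 = {
    "Central":       (182_847, 34_572, 18.9),
    "Copperbelt":    (311_317, 37_763, 12.1),
    "Eastern":       (240_137, 66_614, 27.7),
    "Luapula":       (125_049, 37_943, 30.3),
    "Lusaka":        (275_120, 20_134, 7.3),
    "North-Western": (128_935, 29_011, 22.5),
    "Northern":      (277_764, 106_322, 38.3),
    "Southern":      (243_743, 33_430, 13.7),
    "Western":       (147_229, 20_321, 13.8),
}


def population_adjusted_prevalence(children, infected):
    """Infected children as a percentage of all children under five."""
    return 100.0 * infected / children


recomputed = {
    province: round(population_adjusted_prevalence(children, infected), 1)
    for province, (children, infected, _) in table3.items()
}

total_children = sum(children for children, _, _ in table3.values())
total_infected = sum(infected for _, infected, _ in table3.values())
national = round(population_adjusted_prevalence(total_children, total_infected), 1)

for province, (_, _, reported) in table3.items():
    print(province, recomputed[province], reported)
print("Total", total_children, total_infected, national)
```

Because densely populated, low-risk areas carry more weight in this ratio than in the unweighted pixel mean, Prev 2 can differ markedly from Prev 1, as it does for Lusaka.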

Figure 6 Estimated number of infected children <5 years per 100 km².


evaluation of the effectiveness of on-going interventions. The Zambian Ministry of Health and partners have already completed the second ZMIS in the country in 2008 and are planning another follow-up survey in 2010.

Prior to the MIS, compiled historical survey data have been used to obtain estimates of parasitaemia risk at high resolution. The Mapping Malaria Risk in Africa (MARA) project was initiated in 1998 with the aim of compiling published and unpublished malaria survey data in Africa. MARA is the most comprehensive malariometric database, compiling data from 1900 to date. The MARA data have been analysed using rigorous spatial statistical modelling [17-20] to obtain high-resolution malaria risk estimates at regional and country level in Africa. However, risk estimates from historical data do not reflect the current malaria situation, which is influenced by on-going interventions. The surveys are not representative, as high-risk areas tend to be over-represented. In addition, historical surveys have been conducted at various locations using different methodologies, including different age groups, and carried out in different seasons. In contrast, MISs do not suffer from these drawbacks. The MIS locations are randomly chosen and the data are available at individual level, allowing for estimation of age-specific risk. Another advantage of the MIS data is the household-level information available, which gives the possibility of differentiating the contribution of climate, socio-

Table 4 Parasitaemia risk predictors of different models

Covariates                    Bivariate non-spatial   Multivariate non-spatial   Prediction model     Spatial P-spline model
                              OR (95% CI)             OR (95% CI)                OR (95% CI)          OR (95% CI)
Day LST                       0.49 (0.43, 0.57)       0.65 (0.37, 1.15)          0.61 (0.32, 1.17)    *
Night LST                     1.23 (1.09, 1.40)       1.21 (0.77, 1.88)          1.18 (0.79, 1.77)    *
NDVI                          2.25 (1.90, 2.66)       1.28 (0.67, 2.73)          1.29 (0.66, 2.77)    *
Rainfall                      1.56 (1.37, 1.76)       1.21 (0.85, 1.68)          1.18 (0.80, 1.73)    *
Land cover covariates
  Wetland                     1.20 (1.07, 1.34)       0.97 (0.62, 1.55)          0.98 (0.67, 1.48)    0.72 (0.40, 1.37)
  Forest                      0.96 (0.84, 1.10)       0.72 (0.43, 1.08)          0.72 (0.43, 1.10)    0.64 (0.38, 0.99)
  Urban                       0.35 (0.25, 0.50)       0.70 (0.38, 1.21)          0.71 (0.37, 1.29)    0.75 (0.33, 1.48)
  Shrubland                   1.06 (0.94, 1.20)       1.07 (0.76, 1.53)          1.05 (0.71, 1.47)    1.07 (0.72, 1.53)
Region (rural)
  urban                       0.17 (0.11, 0.25)       0.53 (0.14, 2.03)          0.37 (0.11, 1.13)    0.43 (0.12, 1.50)
Distance to water bodies (<1000 m)
  1000-2499                   0.71 (0.53, 0.96)       0.73 (0.29, 1.86)          0.71 (0.29, 1.72)    0.55 (0.19, 1.50)
  2500-4999                   0.54 (0.38, 0.77)       0.61 (0.20, 1.55)          0.60 (0.19, 1.72)    0.49 (0.17, 1.42)
  ≥ 5000                      0.11 (0.04, 0.30)       0.22 (0.03, 1.40)          0.21 (0.03, 1.39)    0.20 (0.02, 1.93)
Altitude (<850 m)
  850-1199                    0.72 (0.49, 1.06)       0.21 (0.03, 1.70)          0.22 (0.03, 1.80)    0.20 (0.02, 1.92)
  1200-1399                   0.73 (0.49, 1.09)       0.32 (0.03, 3.27)          0.30 (0.03, 3.29)    0.27 (0.02, 4.81)
  ≥ 1400                      0.48 (0.25, 0.92)       0.74 (0.04, 10.2)          0.57 (0.03, 7.68)    0.32 (0.02, 7.86)
Socio-economic index (1st quintile)
  2nd quintile                1.06 (0.75, 1.50)       1.21 (0.78, 1.92)          -                    -
  3rd quintile                0.85 (0.60, 1.22)       1.31 (0.75, 2.23)          -                    -
  4th quintile                0.28 (0.17, 0.46)       0.75 (0.33, 1.75)          -                    -
  5th quintile                0.09 (0.04, 0.19)       0.40 (0.09, 1.56)          -                    -
Interventions
  IRS                         0.16 (0.07, 0.36)       1.73 (0.42, 6.90)          -                    -
  Bed nets                    0.59 (0.46, 0.77)       0.60 (0.39, 0.88)          -                    -
                                                      Mean (95% CI)              Mean (95% CI)        Mean (95% CI)
Range (in km)                 -                       -                          -                    0.38 (0.21, 3.39)
σ² (spatial error)            -                       -                          -                    0.98 (0.01, 2.77)
τ² (measurement error)        -                       1.77 (0.90, 3.23)          1.71 (0.93, 2.84)    0.82 (0.01, 2.69)

Associations between parasitaemia risk and predictors of the non-spatial model with linear terms and the Bayesian spatial logistic regression P-spline model, presented as odds ratios (OR) with their respective 95% confidence intervals (CI).
* Regression coefficients are based on P-spline curves.
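The statement that bed nets reduce the odds of parasitaemia by 40% (CI: 12%, 61%) follows directly from the multivariate odds ratio for bed nets in Table 4. The Python sketch below shows the conversion; the function name is a hypothetical helper, and note that the interval bounds swap places because a smaller odds ratio means a larger reduction.

```python
def percent_reduction(odds_ratio):
    """Percentage reduction in odds implied by an odds ratio below 1."""
    return round((1.0 - odds_ratio) * 100.0)


# Multivariate non-spatial model, bed nets: OR 0.60 (95% CI 0.39, 0.88).
or_bednet, or_lower, or_upper = 0.60, 0.39, 0.88

reduction = percent_reduction(or_bednet)
# The upper OR bound gives the smallest reduction, the lower bound the largest.
reduction_ci = (percent_reduction(or_upper), percent_reduction(or_lower))

print(reduction, reduction_ci)
```

The same conversion applies to any protective odds ratio in the table, for example the bivariate estimate for indoor residual spraying.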


economic characteristics and control interventions to the overall parasitaemia risk. If, in addition, these data are known at high spatial resolution, they can be included in the geostatistical model to obtain more accurate predictions of the malaria risk.

This study created the first contemporary empirical parasitaemia risk map for Zambia. Many existing maps of malaria transmission rely only on rough geographical and climatic iso-lines and expert opinions. Until 1998, none of the maps had a numerical definition; hence, the malaria risk maps were neither comparable nor trustworthy. Then Hay et al [21] produced a climatic map for malaria transmission in Kenya, followed by Craig et al [22], who developed a climatic suitability malaria risk map for the whole of Africa. Up to now there are only a few maps that also contain empirical data; the first map of this kind was published in 2000 for Mali by Kleinschmidt et al [18]. For Zambia, the first empirical malaria risk map was produced by Hay et al [23] and is part of a global risk map based on historical data.

Malaria survey data are expected to be correlated in space. Spatial correlation at short distances is introduced by the transmission process, driven by the flight range of the mosquito vector, while at wider ranges spatial correlation reflects common exposures to environmental conditions which influence mosquito survival and longevity. High spatial resolution risk estimation requires prediction at locations where malaria survey data are not available. High-resolution environmental data can be obtained via remote sensing. GIS software has excellent mapping capabilities and is a very useful tool for processing RS data.

Statistical techniques model the relation between parasitaemia risk and risk factors (environmental factors, possible interventions, socio-economic factors) via a logistic regression model, which is further used for prediction. Standard statistical methods assume independence of the survey locations. Violating this assumption when modelling spatially correlated malaria survey data may lead to imprecise estimates of the risk, of the significance of the risk factors and of the prediction error. Similarly, modelling spatial correlation in weakly correlated data increases the number of model parameters and decreases the precision of the estimates. Geostatistical models take into account spatial correlation by introducing an additional parameter (random effect) at each survey location and assume that geographical dependence is a function of the distance between locations. Depending on the number of survey locations, these models can be highly parameterized, and they can only be estimated using Bayesian inference and MCMC simulation. Bayesian geostatistical models have been employed in malaria risk estimation by, e.g., Diggle et al [19], Gemperli et al [20,24] and Gosoniu et al [16,17]. However, in this study the Bayesian geostatistical model estimated very low spatial correlation, dropping to less than 5% at distances larger than 380 m, which is lower than the spatial resolution of the pixel size used for prediction (3 km by 3 km). The non-spatial model was superior to the fitted spatial models because spatial correlation is only present at a very local scale, driven by the flight range of the mosquito rather than by environmental covariates. Possible reasons might be on-going interventions, which determine mosquito densities and parasitaemia and therefore reduce the influence of environmental predictors on the mosquito.

Previous models of malaria transmission addressed non-linearity between the risk of the malaria-related outcome (on the logit scale) and its predictors solely by categorizing the non-linear covariates. In this study, non-linearity is additionally modelled by using different types of spline curves. The resulting risk estimates suggested that predictions are sensitive to the model fitted. However, model validation indicated that models based on linear terms are superior to non-linear models in Zambia. Even though the P-spline model had the higher predictive ability in terms of 95% CIs, the linear model was considered the one with the best predictive ability due to the smaller width of these intervals and the results of the KL divergence and χ²-measures between observed and predicted prevalence data. This model also has the advantage that its regression coefficients are easier for non-statisticians to interpret than spline curves.

None of the regression coefficients of the final prediction model were significant. However, they are needed to determine the mean risk estimate for each location, and excluding them heavily reduces the predictive ability of the model (results not shown). The lack of significance of the environmental factors is partially explained by the effects of malaria interventions, which can have a stronger influence on the parasitaemia risk than the environmental factors. In fact, the model which adjusted for different types of interventions indicated a significant effect of the presence of at least one bed net in a household in reducing parasitaemia risk. Therefore, interventions are a major driver of parasitaemia risk in Zambia, and including these data in the prediction model would increase the accuracy of model-based risk predictions. Unfortunately, in this study the parasitaemia risk could not be predicted conditional on bed net coverage because the geographical distribution of bed net coverage was not known for the time of the ZMIS 2006. Intervention data are needed over the entire study area in order to be used for prediction purposes. Since, e.g., MACEPA is putting a lot of effort into scaling up malaria interventions in Zambia, the role of interventions is likely to increase even further within

Riedel et al Malaria Journal 2010 937httpwwwmalariajournalcomcontent9137

Page 11 of 13

the next couple of years Therefore compilation ofintervention coverage data at high spatial resolution isbecoming essential to create reliable risk mapsThe prediction map indicates high variation of parasi-

taemia risk over the country In particular high risk ispredicted a stripe from south to east of Zambia which ischaracterized mainly by low altitude high NDVI highday and night LST The relatively small risk in the south-west of the country might be influenced by low vegeta-tion and rainfall while the small risk in the north-westregions could be due to low day LST and high proportionof forests It is also possible that malaria interventionshave been focused on selected areas like the Zambeziriver reducing the parasite prevalence in those areas It isinteresting to note that the maps have been shown tolocal experts who confirmed the depicted risk patternsThe map of the prediction error of the Bayesian modelcould assist in improving precision of the parasitaemiarisk prediction by identifying the areas of high uncer-tainty where additional survey locations could be ran-domly distributed in following MISrsquos and hence reducethe error and raise the precision of following studiesThe reduced smoothness of the map might be

explained by the weak spatial correlation which mostlikely indicate that interventions counteract the environ-mental effects The sampling framework of the MIS mayalso tend to underestimate spatial correlation becausethe survey clusters are partially driven by populationdensity where transmission is largely influenced byinterventions However the drastic changes within smalldistances are highlighting the importance of the high-resolution maps If prediction would have been done atsmaller resolution some of the high risk areas woulddisappear and the burden would have been underesti-mated in these regions For other purposes which donot need such high precision reducing the resolutioncan be easily done by combining neighbouring pixelsThe average predicted parasitaemia risk over the

whole country was 264 however after adjusting forthe population size the risk dropped down to 200 Itis striking that although the average risk in Lusaka pro-vince is the forth highest of all provinces (318) aftertaking into account the population density the riskreduces to 73 indicating that the highest risk areas arethe less populated In fact Lusaka is the province of thelowest population-adjusted risk The provinces of Cop-perbelt Southern and Western have low population-adjusted risks (less than 15) On the other handNorthern province has the highest population-adjustedparasitaemia risk followed by Eastern province and Lua-pula In addition Northern province has the highestnumber of infected children less than five years oldTherefore interventions should be concentrated in theNorthern province especially at the north-eastern border

and in the middle of the province Further interventionsin Copperbelt and Southern might also have a strongimpact in reducing the overall burden since the totalnumber of infected children is as high as in Luapulaeven though their prevalence is lower than 15
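The gap between the unweighted mean risk and the population-adjusted risk arises because the adjusted figure weights each pixel's predicted risk by the number of children living in it. The sketch below illustrates that calculation on three made-up pixels; the function names and all numbers are hypothetical and are not taken from the ZMIS data.

```python
def mean_risk(risks):
    """Unweighted average of the pixel-level risk predictions."""
    return sum(risks) / len(risks)


def population_adjusted_risk(risks, children):
    """Expected infected children divided by all children (population-weighted mean)."""
    infected = sum(r * n for r, n in zip(risks, children))
    return infected / sum(children)


# Hypothetical example: two sparsely populated high-risk pixels
# and one densely populated low-risk pixel.
pixel_risk = [0.50, 0.40, 0.05]
pixel_children = [100, 200, 1700]

print(f"unweighted mean risk:     {mean_risk(pixel_risk):.3f}")
print(f"population-adjusted risk: {population_adjusted_risk(pixel_risk, pixel_children):.3f}")
```

When the high-risk pixels are the sparsely populated ones, as in this toy example, the population-adjusted risk is well below the unweighted mean, which is the pattern described for Lusaka province.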

Conclusions
The map of parasitaemia risk, together with the prediction error and the population at risk, gives an important overview of the malaria situation in Zambia. The maps can be used by decision-makers to allocate resources and interventions to reach the most persons in the regions of highest risk and so reduce the burden of malaria significantly for Zambia. The ZMIS of 2008 and 2010 will provide very important information on the changes of the parasitaemia risk over space and time and help in the evaluation of the progress of new and established intervention programmes, adjusted for environmental drivers of the transmission risk.

Additional file 1: Additional information regarding model formulation and spline curves. [http://www.biomedcentral.com/content/supplementary/1475-2875-9-37-S1.PDF]

Acknowledgements
This investigation received financial support from the Swiss National Science Foundation (SNF) (Project No. 325200-118379) and the Bill and Melinda Gates Foundation (Project No. 39777). Special thanks to Prof. Thomas Smith for his valuable inputs on data analysis aspects.

Author details
1 Department of Public Health and Epidemiology, Swiss Tropical Institute, PO Box, 4002 Basel, Switzerland. 2 Malaria Control and Evaluation Partnership in Africa, PATH, Ferney, France. 3 Department of Public Health and Research, Ministry of Health, Zambia.

Authors' contributions
NR analysed the data and drafted the manuscript. PV was responsible for conception and design of the analysis, supervised the implementation and revised the manuscript. JMM participated in designing and coordinating the fieldwork and provided important intellectual content to this study. LG contributed to the analysis and helped to draft the manuscript. ECK gave intellectual content and revised the draft critically. VM enabled the MIS 2006 to occur and has been instrumental in championing the use of these data for guiding decisions. RWS coordinated the fieldwork and reviewed the manuscript. All authors read and approved the final manuscript.

Competing interests
The authors declare that they have no competing interests.

Received: 10 November 2009 Accepted: 1 February 2010 Published: 1 February 2010

References
1. Ministry of Health: 2008 National malaria control action plan: actions for scale-up for impact on malaria in Zambia. Zambia; 2008 [http://www.nmcc.org.zm/publications.htm].
2. Ministry of Health: A road map for impact on malaria in Zambia 2006-2010: a 5-year strategic plan. Zambia; 2006 [http://www.nmcc.org.zm/publications.htm].


3. Campbell K, Terry D, Wood S: Scaling up for impact: a model for malaria control. Zambia; 2007 [http://www.path.org/files/MACEPA_Bro_2007-08-06.pdf].
4. Ministry of Health: 2009 National malaria control action plan: actions for scale-up for impact on malaria in Zambia. Zambia; 2009.
5. Ministry of Health Zambia, Central Statistical Office, PATH MACEPA, CDC, University of Zambia, WHO: Zambia National Malaria Indicator Survey 2006. Zambia; 2006 [http://www.nmcc.org.zm/publications.htm].
6. Craig MH, Snow RW, Le Sueur D: A climate-based distribution model of malaria transmission in sub-Saharan Africa. Parasitol Today 1999, 15:105-111.
7. Guerra CA, Gikandi PW, Tatem AJ, Noor AM, Smith DL, Hay SI, Snow RW: The limits and intensity of Plasmodium falciparum transmission: implications for malaria control and elimination worldwide. PLoS Med 2008, 5:e38.
8. Diggle PJ, Tawn JA, Moyeed RA: Model-based geostatistics. Appl Stat 1998, 47:299-350.
9. Filmer D, Pritchett LH: Estimating wealth effects without expenditure data - or tears: an application to educational enrolments in states of India. Demography 2001, 38:115-32.
10. US Geological Survey (USGS), Land Processes Distributed Active Archive Center (LP DAAC): MODIS land products. [https://lpdaac.usgs.gov/lpdaac/products/modis_products_table].
11. US Geological Survey (USGS), Famine Early Warning Systems Network (FEWS NET): African Data Dissemination Service (ADDS). [http://earlywarning.usgs.gov/adds].
12. US Geological Survey (USGS): Digital Elevation Models (DEM). [http://edc2.usgs.gov/geodata/index.php].
13. World Health Organization (WHO): The HealthMapper Database. [http://www.who.int/health_mapping/tools/healthmapper/en/index.html].
14. Oak Ridge National Laboratory (Oak Ridge, TN): LandScan™ Global Population Database. [http://www.ornl.gov/landscan].
15. US Census Bureau, International Data Base (IDB), Population Division: Zambia 2006. [http://www.census.gov/ipc/www/idb/country.php].
16. Gosoniu L, Vounatsou P, Sogoba N, Smith T: Bayesian modelling of geostatistical malaria risk data. Geospat Health 2006, 1:127-139.
17. Gosoniu L, Vounatsou P, Sogoba N, Maire N, Smith T: Mapping malaria risk in West Africa using a Bayesian nonparametric non-stationary model. Comput Stat Data Anal 2009, 53:3358-3371.
18. Kleinschmidt I, Bagayoko M, Clarke GPY, Craig M, le Sueur D: A spatial statistical approach to malaria mapping. Int J Epidemiol 2000, 29:355-361.
19. Diggle P, Moyeed R, Rowlingson B, Thomson M: Childhood malaria in the Gambia: a case-study in model-based geostatistics. Appl Stat 2002, 51:493-506.
20. Gemperli A, Sogoba N, Fondjo E, Mabaso M, Bagayako M, Briët OJT, Anderegg D, Liebe J, Smith T: Mapping malaria transmission in West and Central Africa. Trop Med Int Health 2006, 11:1032-46.
21. Hay SI, Snow RW, Rogers DJ: Predicting malaria seasons in Kenya using multitemporal meteorological satellite sensor data. Trans R Soc Trop Med Hyg 1998, 92:12-20.
22. Craig MH, Snow RW, le Sueur D: A climate-based distribution model of malaria transmission in sub-Saharan Africa. Parasitol Today 1999, 15:105-111.
23. Hay SI, Guerra CA, Gething PW, Patil AP, Tatem AJ, Noor AM, Kabaria CW, Manh BH, Elyazar IRF, Brooker S, Smith DL, Moyeed RA, Snow RW: A world malaria map: Plasmodium falciparum endemicity in 2007. PLoS Med 2009, 6:e1000048.
24. Gemperli A, Vounatsou P, Kleinschmidt I, Bagayoko M, Lengeler C, Smith T: Spatial patterns of infant mortality in Mali: the effect of malaria endemicity. Am J Epidemiol 2004, 159:64-72.

doi:10.1186/1475-2875-9-37
Cite this article as: Riedel et al.: Geographical patterns and predictors of malaria risk in Zambia: Bayesian geostatistical modelling of the 2006 Zambia national malaria indicator survey (ZMIS). Malaria Journal 2010, 9:37.


  • Abstract
    • Background
    • Methods
    • Results
    • Conclusions
      • Background
      • Methods
        • The study area and the ZMIS
        • Socio-economic data
        • Environmental and population density data
        • Statistical models
        • Model fit and validation
          • Results
          • Discussion
          • Conclusions
          • Acknowledgements
          • Author details
          • Authors contributions
          • Competing interests
          • References
Page 5: RESEARCH Open Access Geographical patterns and predictors ... · National Malaria Strategic Plan (NMSP) 2005-2010, the Ministry of Health and a network of partners are work-ing toward

The lag time analysis suggested that the following per-iods preceding the survey are best (in terms of modelfit) for summarizing the climatic factors 27 months forrainfall 1 month for NDVI 15 months for day andnight LST The geographical distributions of the envir-onmental factors summarized at the above lag times aredisplayed in Figure 3Non-spatial bivariate logistic regression analyses and

the likelihood ratio test indicated that all RS factors

were significant at 15 significance level (results notpresented) All these variables were further included inthe geostatistical analysis Exploratory analysis indicatednon-linearity in the relation between the parasitaemiarisk and the following environmental predictors NDVIrainfall day and night LST Various Bayesian multiplelogistical regression models (spatial as well as non-spa-tial) were fitted modelling the non-linearity of the abovefactors via spline curves or categorical covariates In

Figure 1 Study profile of the ZMIS for predicting parasitaemia risk

Figure 2 Observed parasitaemia prevalence (left) and province names (right) Observed parasitaemia prevalence within district boundariesat 109 cluster locations used in estimating the distribution of parasitaemia risk in Zambia (left-hand side) The grey dots indicate the 11 clustersthat were excluded from the analysis Province names are given on the right-hand map

Riedel et al Malaria Journal 2010 937httpwwwmalariajournalcomcontent9137

Page 5 of 13

Figure 3 Spatial distribution of remotely sensed covariates in Zambia The climatic factors were summarized over a period preceding thesurvey indicated by the lag time analysis (day LST night LST NDVI rainfall) The land use map presents the most frequent land use category ina buffer of 3 km around every pixel

Riedel et al Malaria Journal 2010 937httpwwwmalariajournalcomcontent9137

Page 6 of 13

addition all models included land cover types regiontype (urbanrural) altitude and distance to the nearestwater bodies as categorical covariates Results on modelvalidation are presented in Table 2 The B-spline modelswere able to predict correctly most of the test locations(70) within the 95 CI however the B-spline modelsproduced always 95 CIs with largest absolute widthsIn comparison the spatial P-spline model is able to pre-dict correctly nearly the same proportion of test loca-tions (65) within a 95 CI with considerably smallerwidth Among those models predicting 60 of test loca-tions correctly within a 95 CI the non-spatial modelwith the linear terms had the smallest absolute widthThis model shows additionally the smallest KL diver-gence followed by the spatial P-spline model and thesecond smallest c2-value The spatial model with linearterms has the smallest c2-value Based on model valida-tion results the non-spatial model with the linear termswas chosen as the final model used for prediction dueto the very good KL divergence and c2-test results andthe smallest width of the CIs This model has also theadvantage of a small number of model covariates (com-pared to the second best model based on spatial P-splines) avoiding over-parameterization problems Themodel was employed to predict the parasitaemia risk atunsampled locations and included the following predic-tors proportion of each land cover type (excluding theldquootherrdquo category) within a 3 km radius around the loca-tion categorical covariates for the environmental predic-tors (altitude region type and distance to the nearestwater bodies) and linear climatic predictors (NDVI rain-fall day and night LST)Predictions obtained at around 100000 pixels via

Bayesian kriging are shown in Figure 4 The predictedparasitaemia prevalence is ranging between 08 and809 based on the posterior predictive distributionwhile the observed data vary between 02 and 598The overall prevalence (mean of the prediction for everypixel) is approximately 264 with a standard deviation

of 152 (observed locations 228 with 164 standarddeviation) Relatively low risks areas (lt10) were fre-quently predicted for North-Western Western andSouthern province in comparison to the high risk areas(gt50) mainly found in Eastern province and adjacentregions Estimates of the corresponding prediction errorare depicted in Figure 5 The map shows that regionswith high prediction errors have high parasitaemia riskThese are mainly areas with sparse survey locationsThe parasitaemia risk estimates were combined with

the estimated number of children below five years livingin the areas of the corresponding pixels to calculate thenumber of children with parasitaemia These estimatesare presented in Figure 6 in map form and in Table 3 astotal counts at province level Taking into account thepopulation distribution the average prevalence of para-sitaemia risk is approximately 200 Lusaka provincehas the lowest population-adjusted prevalence level of73 and is the only province with a predicted meanprevalence of lt10 Northern province has the highestpopulation-adjusted prevalence of 383 followed byLuapula (303) and Eastern province (277)To assess the effects of malaria interventions in Zam-

bia after adjusting for climatic and environmentaleffects the above model was fitted again with threeadditional covariates socio-economic status of thehousehold indoor residual spraying within the last 12months and presence of at least one bed net in thehousehold These covariates were not used for predic-tion as accurate estimates of their distribution in Zam-bia do not exist for 2006 The regression coefficients ofthis model (multivariate non-spatial) are given in Table4 together with the bivariate non-spatial logistic regres-sion models In addition results of the non-spatialmodel with linear terms and the second best model(spatial P-spline) are presented The results of thebivariate regression models reflect significant negativerelations with day LST proportion of urban areas withina 3 km buffer the region type with urban areas havinglower parasitaemia risk distance to the nearest waterbody altitude levels above 14 km socio-economic sta-tus (4th and 5th quintile) spraying and presence of bednets Positive significant relations were detected withnight LST NDVI rainfall within the last 27 monthsand proportion of wetlands in the surrounding areaThe implementation of the multivariate non-spatialmodel indicated a loss of significant covariates The onlyremaining significant parameter was the presence of atleast one bed net in a household which reduces theodds of parasitaemia in children by 40 (CI 12 61)The final prediction model (non-spatial model with thelinear terms) without the additional household specificcovariates showed no significant correlation between thepredictors and the parasitaemia risk The non-spatial

Table 2 Model validation summary for the spatial (s) andnon-spatial (ns) models

Model CI (width) KL c2

Linear (ns) 60 (058) 2149 4867

Linear (s) 50 (059) 2258 4459

Categorical (ns) 60 (069) 3081 6943

Categorical (s) 60 (068) 3033 6143

P-spline (ns) 60 (061) 2311 8194

P-spline (s) 65 (061) 2229 7417

B-spline (ns) 70 (072) 2759 26698

B-spline (s) 70 (072) 2885 26846

Comparison of Bayesian credible intervals (CI) of 95 probability coveragewith their corresponding width Kullback-Leibler divergences (KL) and the c2-test analogue on 20 test locations

Riedel et al Malaria Journal 2010 937httpwwwmalariajournalcomcontent9137

Page 7 of 13

Figure 4 Predicted parasitaemia risk map for children lt5 years in Zambia The map is based on a Bayesian logistic regression model withlinear terms for day LST night LST NDVI and rainfall The estimates correspond to the median of the posterior predictive distributions computedover 100000 pixels

Figure 5 Prediction error of the parasitaemia risk estimates given in Figure 4

Riedel et al Malaria Journal 2010 937httpwwwmalariajournalcomcontent9137

Page 8 of 13

variance was around 17 The P-spline spatial modelestimates a ratio of spatial to the total variation ofalmost 12 The minimum distance at which the spatialcorrelation is lower than 5 estimated by the spatialmodel is 380 meters (95 confidence interval 210 m3390 m) which is even lower than the resolution of thegrid used for prediction (3 km) This suggests a veryweak spatial correlation and supports the choice of thenon-spatial model as the one with the best predictiveability

DiscussionThe ZMIS in 2006 was the first nation-wide malaria sur-vey which combined parasitological data with other

malaria indicators such as bed net use indoor residualspraying and household related aspects The aim of thesurvey was to estimate the coverage of interventions andthe malaria related burden in children less than fiveyears However the MIS data are also a very importantsource of information for estimating parasitaemia risk atlocal scales and thus for identifying the high-risk areasthat require high intervention coverage and continuousmonitoring Combining parasitaemia risk estimates withpopulation data the number of infected children can beestimated which can help for better resource allocationhealth management and targeted additional interven-tions to achieve the highest risk reduction for the mostpopulated areas Repeated surveys will enable the

Table 3 Predicted number of children lt5 years with malaria parasites in the blood (per province)

Province Prev 1(in )

Childrenlt5 years

InfectedChildren

95CI Prev 2(in )

Central 260 182847 34572 21589 50252 189

Copperbelt 233 311317 37763 18572 70719 121

Eastern 374 240137 66614 46297 87219 277

Luapula 320 125049 37943 29039 47638 303

Lusaka 318 275120 20134 8121 46849 73

North-Western 210 128935 29011 16200 51616 225

Northern 391 277764 106322 79379 135701 383

Southern 188 243743 33430 19862 53854 137

Western 144 147229 20321 12730 30232 138

Total 264 1932141 386110 251789 574080 200

Estimates are based on the mean and the 95 confidence intervals (CI) of the posterior predictive distribution of the non-spatial model with linear termsPrev 1 Model based risk estimatesPrev 2 Model-based population-adjusted prevalence

Figure 6 Estimated number of infected children lt5 years per 100 km2

Riedel et al Malaria Journal 2010 937httpwwwmalariajournalcomcontent9137

Page 9 of 13

evaluation of the effectiveness of on-going interventionsAlready the Zambian Ministry of Health and partnershave completed the second ZMIS in the country in2008 and planning another follow-up survey in 2010Prior to the MIS compiled historical survey data has

been used to obtain estimates of parasitaemia risk athigh resolution The mapping malaria risk in Africa(MARA) project was initiated in 1998 with the aim tocompile published and unpublished malaria survey datain Africa MARA is the most comprehensive malario-metric database compiling data from 1900 up to dateThe MARA data has been analysed using rigorous spa-tial statistical modelling [17-20] to obtain high-resolu-tion malaria risk estimates at regional and country-level

in Africa However risk estimates of historical data donot reflect the current malaria situation which is influ-enced by on-going interventions The surveys are notrepresentative as high risk areas tend to be over-repre-sented In addition historical surveys have been con-ducted between various locations using differentmethodologies including different age groups and car-ried out at different seasons On the contrary MISs donot suffer from these drawbacks The MIS locations arerandomly chosen and the data are available at individuallevel allowing for estimation of age-specific riskAnother advantage of the MIS data is the householdlevel information available which gives the possibility ofdifferentiating the contribution of climate socio-

Table 4 Parasitaemia risk predictors of different models

Covariates Bivariatenon-spatial

Multivariatenon-spatial

Predictionmodel

Spatial P-spline model

OR (95CI) OR (95CI) OR (95CI) OR (95CI)

Day LST 049 (043 057) 065 (037 115) 061 (032 117)

Night LST 123 (109 140) 121 (077 188) 118 (079 177)

NDVI 225 (190 266) 128 (067 273) 129 (066 277)

Rainfall 156 (137 176) 121 (085 168) 118 (080 173)

Land cover covariates

Wetland 120 (107 134) 097 (062 155) 098 (067 148) 072 (040 137)

Forest 096 (084 110) 072 (043 108) 072 (043 110) 064 (038 099)

Urban 035 (025 050) 070 (038 121) 071 (037 129) 075 (033 148)

Shrubland 106 (094 120) 107 (076 153) 105 (071 147) 107 (072 153)

Region (rural)

urban 017 (011 025) 053 (014 203) 037 (011 113) 043 (012 150)

Distance to water bodies (lt1000 m)

1000-2499 071 (053 096) 073 (029 186) 071 (029 172) 055 (019 150)

2500-4999 054 (038 077) 061 (020 155) 060 (019 172) 049 (017 142)

ge 5000 011 (004 030) 022 (003 140) 021 (003 139) 020 (002 193)

Altitude (lt850 m)

850-1199 072 (049 106) 021 (003 170) 022 (003 180) 020 (002 192)

1200-1399 073 (049 109) 032 (003 327) 030 (003 329) 027 (002 481)

ge 1400 048 (025 092) 074 (004 102) 057 (003 768) 032 (002 786)

Socio-economic index(1st quintile)

2nd quintile 106 (075 150) 121 (078 192) - -

3rd quintile 085 (060 122) 131 (075 223) - -

4th quintile 028 (017 046) 075 (033 175) - -

5th quintile 009 (004 019) 040 (009 156) - -

Interventions

IRS 016 (007 036) 173 (042 690) - -

Bed nets 059 (046 077) 060 (039 088) - -

Mean (95CI) Mean (95CI) Mean (95CI)

Range (in km) - - - 038 (021 339)

s2 (spatial error) - - - 098 (001 277)

τ2 (measurement error) - 177 (090 323) 171 (093 284) 082 (001 269)

Associations between parasitaemia risk and predictors of the non-spatial model with linear terms and the Bayesian spatial logistic regression P-spline modelpresented as odds ratios (OR) with their respective 95 confidence intervals (CI) regression coefficients are based on P-spline curves

Riedel et al Malaria Journal 2010 937httpwwwmalariajournalcomcontent9137

Page 10 of 13

economic characteristics and control interventions tothe overall parasitaemia risk If in addition these dataare known at high spatial resolution they can beincluded in the geostatistical model to obtain moreaccurate predictions of the malaria riskThis study created the first contemporary empirical

parasitaemia risk map for Zambia Many existing mapson malaria transmission rely only on rough geographicaland climatic iso-lines and expert opinions Until 1998none of the maps had a numerical definition hence themalaria risk maps were not comparable and trustworthyThen Hay et al [21] produced a climatic map formalaria transmission in Kenya followed by Craig et al[22] who developed a climatic suitability malaria riskmap for the whole Africa Up to now there are only afew maps containing also empirical data the first mapof this kind was published in 2000 for Mali by Kleinsch-midt et al [18] For Zambia the first empirical malariarisk map was produced by Hay et al [23] and is part ofa global risk map based on historical dataMalaria survey data are expected to be correlated in

space Spatial correlation at short distances is introducedby the transmission process driven by the flight range ofthe mosquito vector while at wider ranges spatial corre-lation reflects common exposures to environmental con-ditions which influence mosquito survival and longevityHigh spatial resolution risk estimation requires predic-tion at locations where malaria survey data are not avail-able High-resolution environmental data can beobtained via remote sensing GIS software has excellentmapping capabilities and it is a very useful tool for pro-cessing RS dataStatistical techniques model the relation between para-

sitaemia risk and risk factors (environmental possibleinterventions socio-economic factors) via a logisticregression model which is further used for predictionStandard statistical methods assume independence ofthe survey locations Violating this assumption whenmodelling spatially-correlated malaria survey data maylead to imprecise estimates of the risk the significanceof the risk factors and of the prediction error Similarlymodelling spatial correlation in weakly correlated dataincreases the number of model parameters anddecreases the precision of the estimates Geostatisticalmodels take into account spatial correlation by introdu-cing an additional parameter (random effect) at eachsurvey location and assume that geographical depen-dence is a function of distance between locationsDepending on the number of survey locations thesemodels can be highly parameterized and they can onlybe estimated using Bayesian inference and MCMCsimulation Bayesian geostatistical models have beenemployed in malaria risk estimation by eg Diggle et al[19] Gemperli et al [2024] and Gosoniu et al [1617]

However in this study the Bayesian geostatistical modelhas estimated very low spatial correlation dropping toless than 5 at distances larger than 380 m which islower than the spatial resolution of the pixel size usedfor prediction (3 km by 3 km) The non-spatial modelwas superior to the fitted spatial models because spatialcorrelation is only present at a very local scale indicatedby the flight range of the mosquito rather than environ-mental covariates Possible reasons might be on-goinginterventions which determine mosquito densities andparasitaemia and therefore reduce the influence of envir-onmental predictors on the mosquitoPrevious models of malaria transmission addressed

non-linearity between the risk of the malaria-relatedoutcome (in the logit scale) and its predictor solely bycategorizing the non-linear covariates In this studynon-linearity is modelled additionally by using differenttypes of spline curves The resulting risk estimates sug-gested that predictions are sensitive to the model fittedHowever model validation indicated that models basedon linear terms are superior to non-linear models inZambia Even though the P-spline model had the higherpredictive ability at 95 CIs the linear model was con-sidered as the one with the best predictive ability due tothe smaller range of these intervals and the results ofthe KL divergence and c2-measures between observed-predicted prevalence data This model has also theadvantage of an easy interpretability of the regressioncoefficients for non-statisticians in comparison to thespline curve onesNone of the regression coefficients of the final predic-

tion model were significant. However, they are needed to determine the mean risk estimate for each location, and excluding them heavily reduces the predictive ability of the model (results not shown). The lack of significance of the environmental factors is partially explained by the effects of malaria interventions, which can have a stronger influence on the parasitaemia risk than the environmental factors. In fact, the multivariate model which adjusted for different types of interventions indicated a significant effect of the presence of at least one bed net in a household in reducing parasitaemia risk. Therefore, interventions are a major driver of parasitaemia risk in Zambia, and including these data in the prediction model would increase the accuracy of model-based risk predictions. Unfortunately, in this study the parasitaemia risk could not be predicted conditional on bed net coverage because the geographical distribution of bed net coverage was not known at the time of the ZMIS 2006. Intervention data are needed over the entire study area in order to be used for prediction purposes. Since, for example, MACEPA is putting considerable effort into scaling up malaria interventions in Zambia, the role of interventions is likely to increase even further within the next couple of years. Therefore, compilation of intervention coverage data at high spatial resolution is becoming essential for creating reliable risk maps.

The prediction map indicates high variation of parasitaemia risk over the country. In particular, high risk is predicted in a stripe from the south to the east of Zambia, which is characterized mainly by low altitude, high NDVI, and high day and night LST. The relatively low risk in the south-west of the country might be influenced by low vegetation and rainfall, while the low risk in the north-western regions could be due to low day LST and a high proportion of forests. It is also possible that malaria interventions have been focused on selected areas such as the Zambezi river, reducing the parasite prevalence in those areas. The maps were shown to local experts, who confirmed the depicted risk patterns. The map of the prediction error of the Bayesian model could help to improve the precision of the parasitaemia risk prediction by identifying the areas of high uncertainty, where additional survey locations could be randomly distributed in subsequent MISs, thereby reducing the error and raising the precision of future studies.

The reduced smoothness of the map might be explained by the weak spatial correlation, which most likely indicates that interventions counteract the environmental effects. The sampling framework of the MIS may also tend to underestimate spatial correlation because the survey clusters are partially driven by population density, where transmission is largely influenced by interventions. However, the drastic changes within small distances highlight the importance of high-resolution maps. If prediction had been done at a coarser resolution, some of the high-risk areas would have disappeared and the burden would have been underestimated in these regions. For other purposes which do not need such high precision, the resolution can easily be reduced by combining neighbouring pixels.

The average predicted parasitaemia risk over the whole country was 26.4%; however, after adjusting for the population size, the risk dropped to 20.0%. It is striking that although the average risk in Lusaka province is the fourth highest of all provinces (31.8%), after taking into account the population density the risk reduces to 7.3%, indicating that the highest risk areas are the less populated ones. In fact, Lusaka is the province with the lowest population-adjusted risk. The provinces of Copperbelt, Southern and Western also have low population-adjusted risks (less than 15%). On the other hand, Northern province has the highest population-adjusted parasitaemia risk, followed by Luapula and Eastern province (Table 3). In addition, Northern province has the highest number of infected children less than five years old. Therefore, interventions should be concentrated in the Northern province, especially at the north-eastern border and in the middle of the province. Further interventions in Copperbelt and Southern might also have a strong impact in reducing the overall burden, since the total number of infected children in each is about as high as in Luapula even though their prevalence is lower than 15%.
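The population adjustment discussed above can be illustrated with a short calculation. This is a minimal sketch, assuming that the population-adjusted prevalence (Prev 2 in Table 3) is the predicted number of infected children divided by the number of children under five; the figures are copied from Table 3, and the names `TABLE_3`, `population_adjusted_prevalence` and `national_summary` are hypothetical.

```python
# Province-level figures copied from Table 3 of the paper:
# (children under five, predicted number of infected children).
TABLE_3 = {
    "Central": (182_847, 34_572),
    "Copperbelt": (311_317, 37_763),
    "Eastern": (240_137, 66_614),
    "Luapula": (125_049, 37_943),
    "Lusaka": (275_120, 20_134),
    "North-Western": (128_935, 29_011),
    "Northern": (277_764, 106_322),
    "Southern": (243_743, 33_430),
    "Western": (147_229, 20_321),
}


def population_adjusted_prevalence(children, infected):
    """Infected children as a percentage of all children under five.

    Assumption: this ratio is how the Prev 2 column is obtained.
    """
    return 100.0 * infected / children


def national_summary(table):
    """Return (total children, total infected, adjusted prevalence in %)."""
    total_children = sum(children for children, _ in table.values())
    total_infected = sum(infected for _, infected in table.values())
    return (
        total_children,
        total_infected,
        population_adjusted_prevalence(total_children, total_infected),
    )


if __name__ == "__main__":
    # List provinces from highest to lowest adjusted prevalence.
    ranked = sorted(
        TABLE_3.items(),
        key=lambda item: population_adjusted_prevalence(*item[1]),
        reverse=True,
    )
    for province, (children, infected) in ranked:
        print(f"{province}: {population_adjusted_prevalence(children, infected):.1f}%")
    print("National:", national_summary(TABLE_3))
```

Under this assumption the ratios reproduce the Prev 2 column to one decimal place, which suggests (but does not prove) that the adjustment is a simple population-weighted average of the pixel-level risk.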

Conclusions
The map of parasitaemia risk, together with the prediction error and the population at risk, gives an important overview of the malaria situation in Zambia. The maps can be used by decision-makers to allocate resources and interventions so as to reach the most people in the regions of highest risk and significantly reduce the burden of malaria in Zambia. The ZMIS of 2008 and 2010 will provide very important information on the changes of the parasitaemia risk over space and time and help in the evaluation of the progress of new and established intervention programmes, adjusted for environmental drivers of the transmission risk.

Additional file 1: Additional information regarding model formulation and spline curves. [http://www.biomedcentral.com/content/supplementary/1475-2875-9-37-S1.PDF]

Acknowledgements
This investigation received financial support from the Swiss National Science Foundation (SNF) (Project No. 325200-118379) and the Bill and Melinda Gates Foundation (Project No. 39777). Special thanks to Prof. Thomas Smith for his valuable inputs on data analysis aspects.

Author details
1 Department of Public Health and Epidemiology, Swiss Tropical Institute, PO Box, 4002 Basel, Switzerland. 2 Malaria Control and Evaluation Partnership in Africa, PATH, Ferney, France. 3 Department of Public Health and Research, Ministry of Health, Zambia.

Authors' contributions
NR analysed the data and drafted the manuscript. PV was responsible for conception and design of the analysis, supervised the implementation and revised the manuscript. JMM participated in designing and coordinating the fieldwork and provided important intellectual content to this study. LG contributed to the analysis and helped to draft the manuscript. ECK gave intellectual content and revised the draft critically. VM enabled the MIS 2006 to occur and has been instrumental in championing the use of these data for guiding decisions. RWS coordinated the fieldwork and reviewed the manuscript. All authors read and approved the final manuscript.

Competing interests
The authors declare that they have no competing interests.

Received: 10 November 2009. Accepted: 1 February 2010. Published: 1 February 2010.

References
1. Ministry of Health: 2008 National malaria control action plan: actions for scale-up for impact on malaria in Zambia. Zambia 2008 [http://www.nmcc.org.zm/publications.htm].
2. Ministry of Health: A road map for impact on malaria in Zambia 2006-2010: a 5-year strategic plan. Zambia 2006 [http://www.nmcc.org.zm/publications.htm].
3. Campbell K, Terry D, Wood S: Scaling up for impact: a model for malaria control. Zambia 2007 [http://www.path.org/files/MACEPA_Bro_2007-08-06.pdf].
4. Ministry of Health: 2009 National malaria control action plan: actions for scale-up for impact on malaria in Zambia. Zambia 2009.
5. Ministry of Health Zambia, Central Statistical Office, PATH MACEPA, CDC, University of Zambia, WHO: Zambia National Malaria Indicator Survey 2006. Zambia 2006 [http://www.nmcc.org.zm/publications.htm].
6. Craig MH, Snow RW, Le Sueur D: A climate-based distribution model of malaria transmission in sub-Saharan Africa. Parasitol Today 1999, 15:105-111.
7. Guerra CA, Gikandi PW, Tatem AJ, Noor AM, Smith DL, Hay SI, Snow RW: The limits and intensity of Plasmodium falciparum transmission: implications for malaria control and elimination worldwide. PLoS Med 2008, 5:e38.
8. Diggle PJ, Tawn JA, Moyeed RA: Model-based geostatistics. Appl Stat 1998, 47:299-350.
9. Filmer D, Pritchett LH: Estimating wealth effects without expenditure data - or tears: an application to educational enrolments in states of India. Demography 2001, 38:115-32.
10. US Geological Survey (USGS), Land Processes Distributed Active Archive Center (LP DAAC): MODIS land products [https://lpdaac.usgs.gov/lpdaac/products/modis_products_table].
11. US Geological Survey (USGS), Famine Early Warning Systems Network (FEWS NET): African Data Dissemination Service (ADDS) [http://earlywarning.usgs.gov/adds].
12. US Geological Survey (USGS): Digital Elevation Models (DEM) [http://edc2.usgs.gov/geodata/index.php].
13. World Health Organization (WHO): The HealthMapper Database [http://www.who.int/health_mapping/tools/healthmapper/en/index.html].
14. Oak Ridge National Laboratory (Oak Ridge, TN): LandScan™ Global Population Database [http://www.ornl.gov/landscan].
15. US Census Bureau, International Data Base (IDB), Population Division: Zambia 2006 [http://www.census.gov/ipc/www/idb/country.php].
16. Gosoniu L, Vounatsou P, Sogoba N, Smith T: Bayesian modelling of geostatistical malaria risk data. Geospat Health 2006, 1:127-139.
17. Gosoniu L, Vounatsou P, Sogoba N, Maire N, Smith T: Mapping malaria risk in West Africa using a Bayesian nonparametric non-stationary model. Comput Stat Data Anal 2009, 53:3358-3371.
18. Kleinschmidt I, Bagayoko M, Clarke GPY, Craig M, le Sueur D: A spatial statistical approach to malaria mapping. Int J Epidemiol 2000, 29:355-361.
19. Diggle P, Moyeed R, Rowlingson B, Thomson M: Childhood malaria in the Gambia: a case-study in model-based geostatistics. Appl Stat 2002, 51:493-506.
20. Gemperli A, Sogoba N, Fondjo E, Mabaso M, Bagayako M, Briët OJT, Anderegg D, Liebe J, Smith T: Mapping malaria transmission in West and Central Africa. Trop Med Int Health 2006, 11:1032-46.
21. Hay SI, Snow RW, Rogers DJ: Predicting malaria seasons in Kenya using multitemporal meteorological satellite sensor data. Trans R Soc Trop Med Hyg 1998, 92:12-20.
22. Craig MH, Snow RW, le Sueur D: A climate-based distribution model of malaria transmission in sub-Saharan Africa. Parasitol Today 1999, 15:105-111.
23. Hay SI, Guerra CA, Gething PW, Patil AP, Tatem AJ, Noor AM, Kabaria CW, Manh BH, Elyazar IRF, Brooker S, Smith DL, Moyeed RA, Snow RW: A world malaria map: Plasmodium falciparum endemicity in 2007. PLoS Med 2009, 6:e1000048.
24. Gemperli A, Vounatsou P, Kleinschmidt I, Bagayoko M, Lengeler C, Smith T: Spatial patterns of infant mortality in Mali: the effect of malaria endemicity. Am J Epidemiol 2004, 159:64-72.

doi:10.1186/1475-2875-9-37
Cite this article as: Riedel et al.: Geographical patterns and predictors of malaria risk in Zambia: Bayesian geostatistical modelling of the 2006 Zambia national malaria indicator survey (ZMIS). Malaria Journal 2010, 9:37.



Figure 3 Spatial distribution of remotely sensed covariates in Zambia. The climatic factors were summarized over a period preceding the survey indicated by the lag time analysis (day LST, night LST, NDVI, rainfall). The land use map presents the most frequent land use category in a buffer of 3 km around every pixel.


In addition, all models included land cover types, region type (urban/rural), altitude and distance to the nearest water bodies as categorical covariates. Results on model validation are presented in Table 2. The B-spline models correctly predicted most of the test locations (70%) within the 95% CI; however, they always produced the 95% CIs with the largest absolute widths. In comparison, the spatial P-spline model correctly predicted nearly the same proportion of test locations (65%) within a 95% CI of considerably smaller width. Among the models predicting 60% of test locations correctly within a 95% CI, the non-spatial model with linear terms had the smallest absolute width. This model also shows the smallest KL divergence, followed by the spatial P-spline model, and the second smallest χ²-value. The spatial model with linear terms has the smallest χ²-value. Based on the model validation results, the non-spatial model with linear terms was chosen as the final model for prediction because of its very good KL divergence and χ²-test results and the smallest width of the CIs. This model also has the advantage of a small number of covariates (compared to the second best model, based on spatial P-splines), which avoids over-parameterization problems. The model was employed to predict the parasitaemia risk at unsampled locations and included the following predictors: the proportion of each land cover type (excluding the "other" category) within a 3 km radius around the location, categorical covariates for the environmental predictors (altitude, region type and distance to the nearest water bodies) and linear climatic predictors (NDVI, rainfall, day and night LST).

Predictions obtained at around 100,000 pixels via Bayesian kriging are shown in Figure 4. The predicted parasitaemia prevalence ranges between 0.8% and 80.9% based on the posterior predictive distribution, while the observed data vary between 0.2% and 59.8%. The overall prevalence (mean of the predictions over all pixels) is approximately 26.4% with a standard deviation of 15.2% (observed locations: 22.8% with a standard deviation of 16.4%). Areas of relatively low risk (<10%) were frequently predicted for North-Western, Western and Southern province, in comparison to the high-risk areas (>50%), which were mainly found in Eastern province and adjacent regions. Estimates of the corresponding prediction error are depicted in Figure 5. The map shows that regions with high prediction errors have high parasitaemia risk. These are mainly areas with sparse survey locations.

The parasitaemia risk estimates were combined with the estimated number of children below five years living in the areas of the corresponding pixels to calculate the number of children with parasitaemia. These estimates are presented in map form in Figure 6 and as total counts at province level in Table 3. Taking into account the population distribution, the average prevalence of parasitaemia is approximately 20.0%. Lusaka province has the lowest population-adjusted prevalence (7.3%) and is the only province with a predicted mean prevalence of <10%. Northern province has the highest population-adjusted prevalence of 38.3%, followed by Luapula (30.3%) and Eastern province (27.7%).

To assess the effects of malaria interventions in Zambia after adjusting for climatic and environmental effects, the above model was fitted again with three additional covariates: socio-economic status of the household, indoor residual spraying within the last 12 months, and presence of at least one bed net in the household. These covariates were not used for prediction, as accurate estimates of their distribution in Zambia do not exist for 2006. The regression coefficients of this model (multivariate non-spatial) are given in Table 4, together with those of the bivariate non-spatial logistic regression models. In addition, results of the non-spatial model with linear terms and the second best model (spatial P-spline) are presented. The results of the bivariate regression models show significant negative relations with day LST, proportion of urban areas within a 3 km buffer, region type (urban areas having lower parasitaemia risk), distance to the nearest water body, altitude levels above 1.4 km, socio-economic status (4th and 5th quintile), spraying and presence of bed nets. Significant positive relations were detected with night LST, NDVI, rainfall within the last 27 months and proportion of wetlands in the surrounding area. The multivariate non-spatial model showed a loss of significant covariates. The only remaining significant parameter was the presence of at least one bed net in a household, which reduces the odds of parasitaemia in children by 40% (CI: 12%, 61%). The final prediction model (non-spatial model with linear terms), without the additional household-specific covariates, showed no significant association between the predictors and the parasitaemia risk. The non-spatial variance was around 1.7. The spatial P-spline model estimates a ratio of spatial to total variation of almost 1/2. The minimum distance at which the spatial correlation falls below 5%, as estimated by the spatial model, is 380 meters (95% confidence interval: 210 m, 3,390 m), which is even lower than the resolution of the grid used for prediction (3 km). This suggests a very weak spatial correlation and supports the choice of the non-spatial model as the one with the best predictive ability.

Table 2 Model validation summary for the spatial (s) and non-spatial (ns) models

| Model | CI (width) | KL | χ² |
|---|---|---|---|
| Linear (ns) | 60% (0.58) | 2149 | 4867 |
| Linear (s) | 50% (0.59) | 2258 | 4459 |
| Categorical (ns) | 60% (0.69) | 3081 | 6943 |
| Categorical (s) | 60% (0.68) | 3033 | 6143 |
| P-spline (ns) | 60% (0.61) | 2311 | 8194 |
| P-spline (s) | 65% (0.61) | 2229 | 7417 |
| B-spline (ns) | 70% (0.72) | 2759 | 26698 |
| B-spline (s) | 70% (0.72) | 2885 | 26846 |

Comparison of Bayesian credible intervals (CI) of 95% probability coverage with their corresponding width, Kullback-Leibler divergences (KL) and the χ²-test analogue on 20 test locations.

Figure 4 Predicted parasitaemia risk map for children <5 years in Zambia. The map is based on a Bayesian logistic regression model with linear terms for day LST, night LST, NDVI and rainfall. The estimates correspond to the median of the posterior predictive distributions computed over 100,000 pixels.

Figure 5 Prediction error of the parasitaemia risk estimates given in Figure 4.
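The two spatial quantities reported in the results (the distance at which correlation drops below 5%, and the share of spatial variation) can be related with a few lines of arithmetic. This is a sketch under a stated assumption: an exponential correlation function exp(-rho * d) is assumed here for illustration, since the correlation function actually used is specified in Additional file 1 and not in this section. The inputs are posterior means from Table 4 (range 0.38 km, spatial variance 0.98, measurement error variance 0.82); the function names are hypothetical.

```python
import math


def decay_from_range(range_km, threshold=0.05):
    """Decay rate rho such that exp(-rho * range_km) equals the threshold.

    Assumes an exponential correlation function (an illustrative choice).
    """
    return -math.log(threshold) / range_km


def correlation(distance_km, rho):
    """Correlation between two locations a given distance apart."""
    return math.exp(-rho * distance_km)


def spatial_fraction(sigma2, tau2):
    """Share of the total residual variation attributed to the spatial term."""
    return sigma2 / (sigma2 + tau2)


# Posterior means reported in Table 4 for the spatial P-spline model.
rho = decay_from_range(0.38)                 # range of 0.38 km
corr_at_pixel = correlation(3.0, rho)        # 3 km prediction grid
fraction = spatial_fraction(0.98, 0.82)      # sigma^2 and tau^2

if __name__ == "__main__":
    print(f"decay rate per km: {rho:.2f}")
    print(f"correlation at 3 km: {corr_at_pixel:.2e}")
    print(f"spatial share of variation: {fraction:.2f}")
```

Plugging the posterior means into the ratio gives a value close to one half, in line with the reported "almost 1/2", although a ratio of posterior means is not identical to the posterior mean of the ratio. Under the assumed exponential form, the correlation between neighbouring 3 km pixels is effectively zero, which is the practical reason a non-spatial model loses little at this grid resolution.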

Table 3 Predicted number of children <5 years with malaria parasites in the blood (per province)

| Province | Prev 1 (in %) | Children <5 years | Infected children | 95% CI | Prev 2 (in %) |
|---|---|---|---|---|---|
| Central | 26.0 | 182,847 | 34,572 | 21,589; 50,252 | 18.9 |
| Copperbelt | 23.3 | 311,317 | 37,763 | 18,572; 70,719 | 12.1 |
| Eastern | 37.4 | 240,137 | 66,614 | 46,297; 87,219 | 27.7 |
| Luapula | 32.0 | 125,049 | 37,943 | 29,039; 47,638 | 30.3 |
| Lusaka | 31.8 | 275,120 | 20,134 | 8,121; 46,849 | 7.3 |
| North-Western | 21.0 | 128,935 | 29,011 | 16,200; 51,616 | 22.5 |
| Northern | 39.1 | 277,764 | 106,322 | 79,379; 135,701 | 38.3 |
| Southern | 18.8 | 243,743 | 33,430 | 19,862; 53,854 | 13.7 |
| Western | 14.4 | 147,229 | 20,321 | 12,730; 30,232 | 13.8 |
| Total | 26.4 | 1,932,141 | 386,110 | 251,789; 574,080 | 20.0 |

Estimates are based on the mean and the 95% confidence intervals (CI) of the posterior predictive distribution of the non-spatial model with linear terms. Prev 1: model-based risk estimates. Prev 2: model-based population-adjusted prevalence.

Figure 6 Estimated number of infected children <5 years per 100 km².

Table 4 Parasitaemia risk predictors of different models

| Covariates | Bivariate non-spatial, OR (95% CI) | Multivariate non-spatial, OR (95% CI) | Prediction model, OR (95% CI) | Spatial P-spline model, OR (95% CI) |
|---|---|---|---|---|
| Day LST | 0.49 (0.43, 0.57) | 0.65 (0.37, 1.15) | 0.61 (0.32, 1.17) | |
| Night LST | 1.23 (1.09, 1.40) | 1.21 (0.77, 1.88) | 1.18 (0.79, 1.77) | |
| NDVI | 2.25 (1.90, 2.66) | 1.28 (0.67, 2.73) | 1.29 (0.66, 2.77) | |
| Rainfall | 1.56 (1.37, 1.76) | 1.21 (0.85, 1.68) | 1.18 (0.80, 1.73) | |
| Land cover covariates | | | | |
| Wetland | 1.20 (1.07, 1.34) | 0.97 (0.62, 1.55) | 0.98 (0.67, 1.48) | 0.72 (0.40, 1.37) |
| Forest | 0.96 (0.84, 1.10) | 0.72 (0.43, 1.08) | 0.72 (0.43, 1.10) | 0.64 (0.38, 0.99) |
| Urban | 0.35 (0.25, 0.50) | 0.70 (0.38, 1.21) | 0.71 (0.37, 1.29) | 0.75 (0.33, 1.48) |
| Shrubland | 1.06 (0.94, 1.20) | 1.07 (0.76, 1.53) | 1.05 (0.71, 1.47) | 1.07 (0.72, 1.53) |
| Region (reference: rural) | | | | |
| Urban | 0.17 (0.11, 0.25) | 0.53 (0.14, 2.03) | 0.37 (0.11, 1.13) | 0.43 (0.12, 1.50) |
| Distance to water bodies (reference: <1000 m) | | | | |
| 1000-2499 | 0.71 (0.53, 0.96) | 0.73 (0.29, 1.86) | 0.71 (0.29, 1.72) | 0.55 (0.19, 1.50) |
| 2500-4999 | 0.54 (0.38, 0.77) | 0.61 (0.20, 1.55) | 0.60 (0.19, 1.72) | 0.49 (0.17, 1.42) |
| ≥ 5000 | 0.11 (0.04, 0.30) | 0.22 (0.03, 1.40) | 0.21 (0.03, 1.39) | 0.20 (0.02, 1.93) |
| Altitude (reference: <850 m) | | | | |
| 850-1199 | 0.72 (0.49, 1.06) | 0.21 (0.03, 1.70) | 0.22 (0.03, 1.80) | 0.20 (0.02, 1.92) |
| 1200-1399 | 0.73 (0.49, 1.09) | 0.32 (0.03, 3.27) | 0.30 (0.03, 3.29) | 0.27 (0.02, 4.81) |
| ≥ 1400 | 0.48 (0.25, 0.92) | 0.74 (0.04, 10.2) | 0.57 (0.03, 7.68) | 0.32 (0.02, 7.86) |
| Socio-economic index (reference: 1st quintile) | | | | |
| 2nd quintile | 1.06 (0.75, 1.50) | 1.21 (0.78, 1.92) | - | - |
| 3rd quintile | 0.85 (0.60, 1.22) | 1.31 (0.75, 2.23) | - | - |
| 4th quintile | 0.28 (0.17, 0.46) | 0.75 (0.33, 1.75) | - | - |
| 5th quintile | 0.09 (0.04, 0.19) | 0.40 (0.09, 1.56) | - | - |
| Interventions | | | | |
| IRS | 0.16 (0.07, 0.36) | 1.73 (0.42, 6.90) | - | - |
| Bed nets | 0.59 (0.46, 0.77) | 0.60 (0.39, 0.88) | - | - |
| Variance parameters | | Mean (95% CI) | Mean (95% CI) | Mean (95% CI) |
| Range (in km) | - | - | - | 0.38 (0.21, 3.39) |
| σ² (spatial error) | - | - | - | 0.98 (0.01, 2.77) |
| τ² (measurement error) | - | 1.77 (0.90, 3.23) | 1.71 (0.93, 2.84) | 0.82 (0.01, 2.69) |

Associations between parasitaemia risk and predictors of the non-spatial model with linear terms and the Bayesian spatial logistic regression P-spline model, presented as odds ratios (OR) with their respective 95% confidence intervals (CI). Regression coefficients are based on P-spline curves.

Discussion
The ZMIS in 2006 was the first nation-wide malaria survey which combined parasitological data with other malaria indicators such as bed net use, indoor residual spraying and household-related aspects. The aim of the survey was to estimate the coverage of interventions and the malaria-related burden in children less than five years. However, the MIS data are also a very important source of information for estimating parasitaemia risk at local scales and thus for identifying the high-risk areas that require high intervention coverage and continuous monitoring. By combining parasitaemia risk estimates with population data, the number of infected children can be estimated, which can support better resource allocation, health management and targeted additional interventions to achieve the highest risk reduction for the most populated areas. Repeated surveys will enable the evaluation of the effectiveness of on-going interventions. The Zambian Ministry of Health and partners have already completed the second ZMIS in the country in 2008 and are planning another follow-up survey in 2010.

Prior to the MIS, compiled historical survey data were used to obtain estimates of parasitaemia risk at high resolution. The Mapping Malaria Risk in Africa (MARA) project was initiated in 1998 with the aim of compiling published and unpublished malaria survey data in Africa. MARA is the most comprehensive malariometric database, compiling data from 1900 to date. The MARA data have been analysed using rigorous spatial statistical modelling [17-20] to obtain high-resolution malaria risk estimates at regional and country level in Africa. However, risk estimates from historical data do not reflect the current malaria situation, which is influenced by on-going interventions. The surveys are not representative, as high-risk areas tend to be over-represented. In addition, historical surveys have been conducted at various locations using different methodologies, including different age groups, and carried out in different seasons. In contrast, MISs do not suffer from these drawbacks. The MIS locations are randomly chosen and the data are available at individual level, allowing for estimation of age-specific risk. Another advantage of the MIS data is the household-level information, which makes it possible to differentiate the contributions of climate, socio-economic characteristics and control interventions to the overall parasitaemia risk. If, in addition, these data are known at high spatial resolution, they can be included in the geostatistical model to obtain more accurate predictions of the malaria risk.

This study created the first contemporary empirical parasitaemia risk map for Zambia. Many existing maps of malaria transmission rely only on rough geographical and climatic iso-lines and expert opinions. Until 1998, none of the maps had a numerical definition; hence the malaria risk maps were neither comparable nor trustworthy. Then Hay et al. [21] produced a climatic map for malaria transmission in Kenya, followed by Craig et al. [22], who developed a climatic suitability malaria risk map for the whole of Africa. To date there are only a few maps that also contain empirical data; the first map of this kind was published in 2000 for Mali by Kleinschmidt et al. [18]. For Zambia, the first empirical malaria risk map was produced by Hay et al. [23] and is part of a global risk map based on historical data.

Malaria survey data are expected to be correlated in space. Spatial correlation at short distances is introduced by the transmission process, driven by the flight range of the mosquito vector, while at wider ranges spatial correlation reflects common exposures to environmental conditions which influence mosquito survival and longevity. High spatial resolution risk estimation requires prediction at locations where malaria survey data are not available. High-resolution environmental data can be obtained via remote sensing (RS). GIS software has excellent mapping capabilities and is a very useful tool for processing RS data.

Statistical techniques model the relation between parasitaemia risk and risk factors (environmental, possible interventions, socio-economic factors) via a logistic regression model, which is further used for prediction. Standard statistical methods assume independence of the survey locations. Violating this assumption when modelling spatially correlated malaria survey data may lead to imprecise estimates of the risk, of the significance of the risk factors and of the prediction error. Similarly, modelling spatial correlation in weakly correlated data increases the number of model parameters and decreases the precision of the estimates. Geostatistical models take into account spatial correlation by introducing an additional parameter (random effect) at each survey location and assume that geographical dependence is a function of distance between locations. Depending on the number of survey locations, these models can be highly parameterized and they can only be estimated using Bayesian inference and MCMC simulation. Bayesian geostatistical models have been employed in malaria risk estimation by, e.g., Diggle et al. [19], Gemperli et al. [20,24] and Gosoniu et al. [16,17].
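The geostatistical formulation described above can be written compactly. The following is a sketch under stated assumptions, not the authors' exact specification (which is given in Additional file 1): the exponential correlation function and the symbols $Y_i$, $N_i$, $\mathbf{x}_i$, $\phi_i$, $\varepsilon_i$, $d_{ij}$ and $\rho$ are illustrative choices, while $\sigma^2$ and $\tau^2$ correspond to the spatial and measurement error variances reported in Table 4.

```latex
\begin{align}
Y_i \mid p_i &\sim \mathrm{Binomial}(N_i,\, p_i), \\
\operatorname{logit}(p_i) &= \mathbf{x}_i^{\top}\boldsymbol{\beta} + \phi_i + \varepsilon_i, \\
\boldsymbol{\phi} &\sim \mathrm{MVN}(\mathbf{0},\, \Sigma), \qquad
\Sigma_{ij} = \sigma^{2} \exp(-\rho\, d_{ij}), \\
\varepsilon_i &\sim \mathrm{N}(0,\, \tau^{2}).
\end{align}
```

Here $Y_i$ is the number of parasitaemia-positive children among the $N_i$ tested at location $i$, and $d_{ij}$ is the distance between locations $i$ and $j$. Under this assumed correlation form, the correlation falls below 5% at a distance of $-\ln(0.05)/\rho \approx 3/\rho$. Dropping the spatially structured term $\phi_i$ leaves a non-spatial model that retains only the unstructured location-level variance $\tau^2$.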

However, in this study the Bayesian geostatistical model estimated very low spatial correlation, dropping to less than 5% at distances larger than 380 m, which is lower than the pixel size used for prediction (3 km by 3 km). The non-spatial model was superior to the fitted spatial models because spatial correlation is only present at a very local scale, determined by the flight range of the mosquito rather than by environmental covariates. A possible reason is that on-going interventions determine mosquito densities and parasitaemia and therefore reduce the influence of environmental predictors on the mosquito.

Previous models of malaria transmission addressed non-linearity between the risk of the malaria-related outcome (on the logit scale) and its predictors solely by categorizing the non-linear covariates. In this study, non-linearity is additionally modelled using different types of spline curves. The resulting risk estimates suggested that predictions are sensitive to the model fitted. However, model validation indicated that models based on linear terms are superior to non-linear models in Zambia. Even though the P-spline model had the higher predictive ability at 95% CIs, the linear model was considered the one with the best predictive ability because of the smaller width of these intervals and the results of the KL divergence and χ²-measures between observed and predicted prevalence data. This model also has the advantage that its regression coefficients are easier for non-statisticians to interpret than spline curves.
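The interval-based validation summarized in Table 2 (the share of test locations whose observed prevalence falls inside the 95% credible interval, together with the interval width) can be sketched as follows. This is an illustrative implementation under assumptions, not the authors' code: it takes a list of posterior predictive draws per test location, uses linearly interpolated percentiles, and the names `percentile` and `interval_summary` are hypothetical.

```python
def percentile(sorted_values, q):
    """Linearly interpolated percentile (q in [0, 100]) of an ascending list."""
    if not sorted_values:
        raise ValueError("at least one value is required")
    position = (len(sorted_values) - 1) * q / 100.0
    lower_index = int(position)
    upper_index = min(lower_index + 1, len(sorted_values) - 1)
    weight = position - lower_index
    return (sorted_values[lower_index] * (1.0 - weight)
            + sorted_values[upper_index] * weight)


def interval_summary(predictive_draws, observed, level=95.0):
    """Coverage and mean width of equal-tailed credible intervals.

    predictive_draws: one list of posterior predictive prevalence draws
        per test location.
    observed: observed prevalence at the same test locations.
    Returns (fraction of locations covered, mean interval width).
    """
    tail = (100.0 - level) / 2.0
    covered = 0
    widths = []
    for draws, value in zip(predictive_draws, observed):
        ordered = sorted(draws)
        lower = percentile(ordered, tail)
        upper = percentile(ordered, 100.0 - tail)
        if lower <= value <= upper:
            covered += 1
        widths.append(upper - lower)
    return covered / len(widths), sum(widths) / len(widths)


if __name__ == "__main__":
    # Toy data: two test locations with identical, evenly spread draws.
    draws = [[i / 100 for i in range(101)] for _ in range(2)]
    print(interval_summary(draws, observed=[0.50, 0.99]))
```

A model can reach high coverage simply by producing very wide intervals, which is why the comparison above weighs coverage against interval width rather than using either criterion alone.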

ConclusionsThe map of parasitaemia risk together with the predic-tion error and the population at risk give an importantoverview of the malaria situation in Zambia The mapscan be used by decision-makers to allocate resourcesand interventions to reach the most persons in theregions of highest risk to reduce the burden of malariasignificantly for Zambia The ZMIS of 2008 and 2010will provide very important information on the changesof the parasitaemia risk over space and time and help inthe evaluation of the progress of new and establishedintervention programmes adjusted for environmentaldrivers of the transmission risk

Additional file 1 Additional information regarding modelformulation and spline curvesClick here for file[ httpwwwbiomedcentralcomcontentsupplementary1475-2875-9-37-S1PDF ]

Acknowledgements
This investigation received financial support from the Swiss National Science Foundation (SNF) (Project No. 325200-118379) and the Bill and Melinda Gates Foundation (Project No. 39777). Special thanks to Prof. Thomas Smith for his valuable inputs on data analysis aspects.

Author details
1 Department of Public Health and Epidemiology, Swiss Tropical Institute, PO Box, 4002 Basel, Switzerland. 2 Malaria Control and Evaluation Partnership in Africa, PATH, Ferney, France. 3 Department of Public Health and Research, Ministry of Health, Zambia.

Authors' contributions
NR analysed the data and drafted the manuscript. PV was responsible for conception and design of the analysis, supervised the implementation and revised the manuscript. JMM participated in designing and coordinating the fieldwork and provided important intellectual content to this study. LG contributed to the analysis and helped to draft the manuscript. ECK gave intellectual content and revised the draft critically. VM enabled the MIS 2006 to occur and has been instrumental in championing the use of these data for guiding decisions. RWS coordinated the fieldwork and reviewed the manuscript. All authors read and approved the final manuscript.

Competing interests
The authors declare that they have no competing interests.

Received: 10 November 2009 Accepted: 1 February 2010 Published: 1 February 2010

References
1. Ministry of Health: 2008 National malaria control action plan: actions for scale-up for impact on malaria in Zambia. Zambia; 2008 [http://www.nmcc.org.zm/publications.htm].
2. Ministry of Health: A road map for impact on malaria in Zambia 2006-2010: a 5-year strategic plan. Zambia; 2006 [http://www.nmcc.org.zm/publications.htm].


3. Campbell K, Terry D, Wood S: Scaling up for impact: a model for malaria control. Zambia; 2007 [http://www.path.org/files/MACEPA_Bro_2007-08-06.pdf].
4. Ministry of Health: 2009 National malaria control action plan: actions for scale-up for impact on malaria in Zambia. Zambia; 2009.
5. Ministry of Health Zambia, Central Statistical Office, PATH MACEPA, CDC, University of Zambia, WHO: Zambia National Malaria Indicator Survey 2006. Zambia; 2006 [http://www.nmcc.org.zm/publications.htm].
6. Craig MH, Snow RW, Le Sueur D: A climate-based distribution model of malaria transmission in sub-Saharan Africa. Parasitol Today 1999, 15:105-111.
7. Guerra CA, Gikandi PW, Tatem AJ, Noor AM, Smith DL, Hay SI, Snow RW: The limits and intensity of Plasmodium falciparum transmission: implications for malaria control and elimination worldwide. PLoS Med 2008, 5:e38.
8. Diggle PJ, Tawn JA, Moyeed RA: Model-based geostatistics. Appl Stat 1998, 47:299-350.
9. Filmer D, Pritchett LH: Estimating wealth effects without expenditure data - or tears: an application to educational enrolments in states of India. Demography 2001, 38:115-32.
10. US Geological Survey (USGS): Land Processes Distributed Active Archive Center (LP DAAC), MODIS land products. [https://lpdaac.usgs.gov/lpdaac/products/modis_products_table].
11. US Geological Survey (USGS): Famine Early Warning Systems Network (FEWS NET), African Data Dissemination Service (ADDS). [http://earlywarning.usgs.gov/adds/].
12. US Geological Survey (USGS): Digital Elevation Models (DEM). [http://edc2.usgs.gov/geodata/index.php].
13. World Health Organization (WHO): The HealthMapper Database. [http://www.who.int/health_mapping/tools/healthmapper/en/index.html].
14. Oak Ridge National Laboratory (Oak Ridge, TN): LandScan™ Global Population Database. [http://www.ornl.gov/landscan/].
15. US Census Bureau, International Data Base (IDB), Population Division: Zambia. 2006 [http://www.census.gov/ipc/www/idb/country.php].
16. Gosoniu L, Vounatsou P, Sogoba N, Smith T: Bayesian modelling of geostatistical malaria risk data. Geospat Health 2006, 1:127-139.
17. Gosoniu L, Vounatsou P, Sogoba N, Maire N, Smith T: Mapping malaria risk in West Africa using a Bayesian nonparametric non-stationary model. Comput Stat Data Anal 2009, 53:3358-3371.
18. Kleinschmidt I, Bagayoko M, Clarke GPY, Craig M, le Sueur D: A spatial statistical approach to malaria mapping. Int J Epidemiol 2000, 29:355-361.
19. Diggle P, Moyeed R, Rowlingson B, Thomson M: Childhood malaria in the Gambia: a case-study in model-based geostatistics. Appl Stat 2002, 51:493-506.
20. Gemperli A, Sogoba N, Fondjo E, Mabaso M, Bagayako M, Briët OJT, Anderegg D, Liebe J, Smith T: Mapping malaria transmission in West and Central Africa. Trop Med Int Health 2006, 11:1032-46.
21. Hay SI, Snow RW, Rogers DJ: Predicting malaria seasons in Kenya using multitemporal meteorological satellite sensor data. Trans R Soc Trop Med Hyg 1998, 92:12-20.
22. Craig MH, Snow RW, le Sueur D: A climate-based distribution model of malaria transmission in sub-Saharan Africa. Parasitol Today 1999, 15:105-111.
23. Hay SI, Guerra CA, Gething PW, Patil AP, Tatem AJ, Noor AM, Kabaria CW, Manh BH, Elyazar IRF, Brooker S, Smith DL, Moyeed RA, Snow RW: A world malaria map: Plasmodium falciparum endemicity in 2007. PLoS Med 2009, 6:e1000048.
24. Gemperli A, Vounatsou P, Kleinschmidt I, Bagayoko M, Lengeler C, Smith T: Spatial patterns of infant mortality in Mali: the effect of malaria endemicity. Am J Epidemiol 2004, 159:64-72.

doi:10.1186/1475-2875-9-37
Cite this article as: Riedel et al.: Geographical patterns and predictors of malaria risk in Zambia: Bayesian geostatistical modelling of the 2006 Zambia national malaria indicator survey (ZMIS). Malaria Journal 2010, 9:37.



addition, all models included land cover types, region type (urban/rural), altitude and distance to the nearest water bodies as categorical covariates. Results of the model validation are presented in Table 2. The B-spline models correctly predicted the largest share of test locations (70%) within the 95% CI; however, they always produced the 95% CIs with the largest absolute widths. In comparison, the spatial P-spline model correctly predicted nearly the same proportion of test locations (65%) within a 95% CI of considerably smaller width. Among the models that correctly predicted 60% of test locations within a 95% CI, the non-spatial model with linear terms had the smallest absolute width. This model also showed the smallest KL divergence, followed by the spatial P-spline model, and the second smallest χ²-value. The spatial model with linear terms had the smallest χ²-value. Based on these validation results, the non-spatial model with linear terms was chosen as the final model for prediction, owing to its very good KL divergence and χ²-test results and the smallest CI width. This model also has the advantage of a small number of covariates compared to the second best model (based on spatial P-splines), which avoids over-parameterization problems. The model was employed to predict the parasitaemia risk at unsampled locations and included the following predictors: the proportion of each land cover type (excluding the "other" category) within a 3 km radius around the location, categorical covariates for the environmental predictors (altitude, region type and distance to the nearest water bodies) and linear climatic predictors (NDVI, rainfall, day and night LST).

Table 2 Model validation summary for the spatial (s) and non-spatial (ns) models

Model              CI (width)    KL     χ²
Linear (ns)        60% (0.58)    2149   4867
Linear (s)         50% (0.59)    2258   4459
Categorical (ns)   60% (0.69)    3081   6943
Categorical (s)    60% (0.68)    3033   6143
P-spline (ns)      60% (0.61)    2311   8194
P-spline (s)       65% (0.61)    2229   7417
B-spline (ns)      70% (0.72)    2759   26698
B-spline (s)       70% (0.72)    2885   26846

Comparison of Bayesian credible intervals (CI) of 95% probability coverage with their corresponding width, Kullback-Leibler divergences (KL) and the χ²-test analogue on 20 test locations.

Predictions obtained at around 100,000 pixels via Bayesian kriging are shown in Figure 4. The predicted parasitaemia prevalence ranges between 0.8% and 80.9% based on the posterior predictive distribution, while the observed data vary between 0.2% and 59.8%. The overall prevalence (mean of the predictions over all pixels) is approximately 26.4% with a standard deviation of 15.2% (observed locations: 22.8% with a standard deviation of 16.4%). Relatively low-risk areas (<10%) were frequently predicted for North-Western, Western and Southern provinces, in contrast to the high-risk areas (>50%), which were mainly found in Eastern province and adjacent regions. Estimates of the corresponding prediction error are depicted in Figure 5. The map shows that regions with high prediction errors have high parasitaemia risk. These are mainly areas with sparse survey locations.

Figure 4 Predicted parasitaemia risk map for children <5 years in Zambia. The map is based on a Bayesian logistic regression model with linear terms for day LST, night LST, NDVI and rainfall. The estimates correspond to the median of the posterior predictive distributions computed over 100,000 pixels.

Figure 5 Prediction error of the parasitaemia risk estimates given in Figure 4.

The parasitaemia risk estimates were combined with the estimated number of children below five years living in the area of the corresponding pixel to calculate the number of children with parasitaemia. These estimates are presented in map form in Figure 6 and as total counts at province level in Table 3. Taking into account the population distribution, the average prevalence of parasitaemia is approximately 20.0%. Lusaka province has the lowest population-adjusted prevalence (7.3%) and is the only province with a predicted mean prevalence of <10%. Northern province has the highest population-adjusted prevalence (38.3%), followed by Luapula (30.3%) and Eastern province (27.7%).

To assess the effects of malaria interventions in Zambia after adjusting for climatic and environmental effects, the above model was fitted again with three additional covariates: socio-economic status of the household, indoor residual spraying within the last 12 months, and presence of at least one bed net in the household. These covariates were not used for prediction because accurate estimates of their distribution in Zambia do not exist for 2006. The regression coefficients of this model (multivariate non-spatial) are given in Table 4, together with those of the bivariate non-spatial logistic regression models. In addition, results of the non-spatial model with linear terms and the second best model (spatial P-spline) are presented. The bivariate regression models show significant negative relations with day LST, the proportion of urban areas within a 3 km buffer, the region type (with urban areas having lower parasitaemia risk), distance to the nearest water body, altitude levels above 1.4 km, socio-economic status (4th and 5th quintiles), spraying, and presence of bed nets. Significant positive relations were detected with night LST, NDVI, rainfall within the last 27 months, and the proportion of wetlands in the surrounding area. In the multivariate non-spatial model most covariates lost their significance. The only remaining significant parameter was the presence of at least one bed net in a household, which reduces the odds of parasitaemia in children by 40% (95% CI: 12%, 61%). The final prediction model (non-spatial model with linear terms), without the additional household-specific covariates, showed no significant association between the predictors and the parasitaemia risk. The non-spatial variance was around 1.7. The spatial P-spline model estimates a ratio of spatial to total variation of almost 1/2. The minimum distance at which the spatial correlation falls below 5%, as estimated by the spatial model, is 380 m (95% confidence interval: 210 m, 3,390 m), which is even smaller than the resolution of the grid used for prediction (3 km). This suggests a very weak spatial correlation and supports the choice of the non-spatial model as the one with the best predictive ability.
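The comparison between the estimated range and the grid resolution can be illustrated with a minimal sketch. It assumes an exponential correlation function, which is a common choice in geostatistical models but is not confirmed by this section (the model formulation is given in Additional file 1); the function names and the decay parameter are hypothetical.

```python
import math


def exponential_correlation(distance_km: float, decay_per_km: float) -> float:
    """Correlation between two locations under an assumed exponential decay."""
    return math.exp(-decay_per_km * distance_km)


def decay_from_range(range_km: float, threshold: float = 0.05) -> float:
    """Decay parameter for which the correlation equals `threshold` at `range_km`."""
    return -math.log(threshold) / range_km


# Range reported in the text: correlation below 5% beyond 0.38 km.
decay = decay_from_range(0.38)
pixel_km = 3.0  # resolution of the prediction grid

corr_at_range = exponential_correlation(0.38, decay)
corr_at_pixel = exponential_correlation(pixel_km, decay)

print(f"correlation at 0.38 km: {corr_at_range:.3f}")
print(f"correlation at {pixel_km} km: {corr_at_pixel:.2e}")
```

Under this assumption the correlation between neighbouring 3 km pixels is negligible, which is in line with the argument for the non-spatial model.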

Discussion
The ZMIS in 2006 was the first nation-wide malaria survey which combined parasitological data with other malaria indicators such as bed net use, indoor residual spraying and household-related aspects. The aim of the survey was to estimate the coverage of interventions and the malaria-related burden in children less than five years. However, the MIS data are also a very important source of information for estimating parasitaemia risk at local scales, and thus for identifying the high-risk areas that require high intervention coverage and continuous monitoring. By combining parasitaemia risk estimates with population data, the number of infected children can be estimated, which can support better resource allocation, health management and targeted additional interventions to achieve the highest risk reduction in the most populated areas. Repeated surveys will enable the evaluation of the effectiveness of on-going interventions. The Zambian Ministry of Health and partners have already completed the second ZMIS in the country in 2008 and are planning another follow-up survey in 2010.

Table 3 Predicted number of children <5 years with malaria parasites in the blood (per province)

Province        Prev 1 (%)   Children <5 years   Infected children   95% CI               Prev 2 (%)
Central         26.0         182,847             34,572              21,589 - 50,252      18.9
Copperbelt      23.3         311,317             37,763              18,572 - 70,719      12.1
Eastern         37.4         240,137             66,614              46,297 - 87,219      27.7
Luapula         32.0         125,049             37,943              29,039 - 47,638      30.3
Lusaka          31.8         275,120             20,134              8,121 - 46,849       7.3
North-Western   21.0         128,935             29,011              16,200 - 51,616      22.5
Northern        39.1         277,764             106,322             79,379 - 135,701     38.3
Southern        18.8         243,743             33,430              19,862 - 53,854      13.7
Western         14.4         147,229             20,321              12,730 - 30,232      13.8
Total           26.4         1,932,141           386,110             251,789 - 574,080    20.0

Estimates are based on the mean and the 95% confidence intervals (CI) of the posterior predictive distribution of the non-spatial model with linear terms. Prev 1: Model-based risk estimates. Prev 2: Model-based population-adjusted prevalence.

Figure 6 Estimated number of infected children <5 years per 100 km².

Prior to the MIS, compiled historical survey data had been used to obtain estimates of parasitaemia risk at high resolution. The Mapping Malaria Risk in Africa (MARA) project was initiated in 1998 with the aim of compiling published and unpublished malaria survey data in Africa. MARA is the most comprehensive malariometric database, compiling data from 1900 up to date. The MARA data have been analysed using rigorous spatial statistical modelling [17-20] to obtain high-resolution malaria risk estimates at regional and country level in Africa. However, risk estimates from historical data do not reflect the current malaria situation, which is influenced by on-going interventions. The historical surveys are not representative, as high-risk areas tend to be over-represented. In addition, historical surveys have been conducted at various locations using different methodologies, including different age groups, and were carried out in different seasons. In contrast, MISs do not suffer from these drawbacks. The MIS locations are randomly chosen and the data are available at individual level, allowing for estimation of age-specific risk. Another advantage of the MIS data is the household-level information, which makes it possible to differentiate the contributions of climate, socio-economic characteristics and control interventions to the overall parasitaemia risk. If, in addition, these data are known at high spatial resolution, they can be included in the geostatistical model to obtain more accurate predictions of the malaria risk.

Table 4 Parasitaemia risk predictors of different models

Covariates                    Bivariate            Multivariate         Prediction           Spatial P-spline
                              non-spatial          non-spatial          model                model
                              OR (95% CI)          OR (95% CI)          OR (95% CI)          OR (95% CI)
Day LST                       0.49 (0.43, 0.57)    0.65 (0.37, 1.15)    0.61 (0.32, 1.17)
Night LST                     1.23 (1.09, 1.40)    1.21 (0.77, 1.88)    1.18 (0.79, 1.77)
NDVI                          2.25 (1.90, 2.66)    1.28 (0.67, 2.73)    1.29 (0.66, 2.77)
Rainfall                      1.56 (1.37, 1.76)    1.21 (0.85, 1.68)    1.18 (0.80, 1.73)
Land cover covariates
  Wetland                     1.20 (1.07, 1.34)    0.97 (0.62, 1.55)    0.98 (0.67, 1.48)    0.72 (0.40, 1.37)
  Forest                      0.96 (0.84, 1.10)    0.72 (0.43, 1.08)    0.72 (0.43, 1.10)    0.64 (0.38, 0.99)
  Urban                       0.35 (0.25, 0.50)    0.70 (0.38, 1.21)    0.71 (0.37, 1.29)    0.75 (0.33, 1.48)
  Shrubland                   1.06 (0.94, 1.20)    1.07 (0.76, 1.53)    1.05 (0.71, 1.47)    1.07 (0.72, 1.53)
Region (rural)
  urban                       0.17 (0.11, 0.25)    0.53 (0.14, 2.03)    0.37 (0.11, 1.13)    0.43 (0.12, 1.50)
Distance to water bodies (<1000 m)
  1000-2499                   0.71 (0.53, 0.96)    0.73 (0.29, 1.86)    0.71 (0.29, 1.72)    0.55 (0.19, 1.50)
  2500-4999                   0.54 (0.38, 0.77)    0.61 (0.20, 1.55)    0.60 (0.19, 1.72)    0.49 (0.17, 1.42)
  ≥ 5000                      0.11 (0.04, 0.30)    0.22 (0.03, 1.40)    0.21 (0.03, 1.39)    0.20 (0.02, 1.93)
Altitude (<850 m)
  850-1199                    0.72 (0.49, 1.06)    0.21 (0.03, 1.70)    0.22 (0.03, 1.80)    0.20 (0.02, 1.92)
  1200-1399                   0.73 (0.49, 1.09)    0.32 (0.03, 3.27)    0.30 (0.03, 3.29)    0.27 (0.02, 4.81)
  ≥ 1400                      0.48 (0.25, 0.92)    0.74 (0.04, 10.2)    0.57 (0.03, 7.68)    0.32 (0.02, 7.86)
Socio-economic index (1st quintile)
  2nd quintile                1.06 (0.75, 1.50)    1.21 (0.78, 1.92)    -                    -
  3rd quintile                0.85 (0.60, 1.22)    1.31 (0.75, 2.23)    -                    -
  4th quintile                0.28 (0.17, 0.46)    0.75 (0.33, 1.75)    -                    -
  5th quintile                0.09 (0.04, 0.19)    0.40 (0.09, 1.56)    -                    -
Interventions
  IRS                         0.16 (0.07, 0.36)    1.73 (0.42, 6.90)    -                    -
  Bed nets                    0.59 (0.46, 0.77)    0.60 (0.39, 0.88)    -                    -
                                                   Mean (95% CI)        Mean (95% CI)        Mean (95% CI)
Range (in km)                 -                    -                    -                    0.38 (0.21, 3.39)
σ² (spatial error)            -                    -                    -                    0.98 (0.01, 2.77)
τ² (measurement error)        -                    1.77 (0.90, 3.23)    1.71 (0.93, 2.84)    0.82 (0.01, 2.69)

Associations between parasitaemia risk and predictors of the non-spatial model with linear terms and the Bayesian spatial logistic regression P-spline model, presented as odds ratios (OR) with their respective 95% confidence intervals (CI). Regression coefficients are based on P-spline curves.

This study created the first contemporary empirical parasitaemia risk map for Zambia. Many existing maps of malaria transmission rely only on rough geographical and climatic iso-lines and expert opinions. Until 1998, none of the maps had a numerical definition; hence, the malaria risk maps were neither comparable nor trustworthy. Then Hay et al. [21] produced a climatic map for malaria transmission in Kenya, followed by Craig et al. [22], who developed a climatic suitability malaria risk map for the whole of Africa. Up to now, there are only a few maps that also contain empirical data; the first map of this kind was published in 2000 for Mali by Kleinschmidt et al. [18]. For Zambia, the first empirical malaria risk map was produced by Hay et al. [23] and is part of a global risk map based on historical data.

Malaria survey data are expected to be correlated in space. Spatial correlation at short distances is introduced by the transmission process, driven by the flight range of the mosquito vector, while at wider ranges spatial correlation reflects common exposure to environmental conditions which influence mosquito survival and longevity. High spatial resolution risk estimation requires prediction at locations where malaria survey data are not available. High-resolution environmental data can be obtained via remote sensing (RS). GIS software has excellent mapping capabilities and is a very useful tool for processing RS data.

Statistical techniques model the relation between parasitaemia risk and risk factors (environmental, possible interventions, socio-economic factors) via a logistic regression model, which is then used for prediction. Standard statistical methods assume independence of the survey locations. Violating this assumption when modelling spatially correlated malaria survey data may lead to imprecise estimates of the risk, of the significance of the risk factors and of the prediction error. Similarly, modelling spatial correlation in weakly correlated data increases the number of model parameters and decreases the precision of the estimates. Geostatistical models take spatial correlation into account by introducing an additional parameter (random effect) at each survey location and assume that geographical dependence is a function of the distance between locations. Depending on the number of survey locations, these models can be highly parameterized, and they can only be estimated using Bayesian inference and MCMC simulation. Bayesian geostatistical models have been employed in malaria risk estimation by, e.g., Diggle et al. [19], Gemperli et al. [20,24] and Gosoniu et al. [16,17].
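A geostatistical logistic regression of the kind described above is often written in the following generic form. This is only a sketch and not necessarily the formulation used in Additional file 1: the binomial likelihood, the exponential correlation function and the symbols Y_i, N_i, p_i, x_i, φ_i, ε_i, ρ and d_ij are assumptions introduced here for illustration, while σ² and τ² are read as the spatial and non-spatial variances listed in Table 4.

```latex
\begin{align*}
Y_i &\sim \mathrm{Binomial}(N_i,\, p_i), \\
\operatorname{logit}(p_i) &= \mathbf{x}_i^{\top}\boldsymbol{\beta} + \phi_i + \varepsilon_i, \\
\boldsymbol{\phi} &\sim \mathrm{MVN}(\mathbf{0},\, \Sigma), \qquad
  \Sigma_{ij} = \sigma^{2} \exp(-\rho\, d_{ij}), \\
\varepsilon_i &\sim \mathrm{N}(0,\, \tau^{2}).
\end{align*}
```

Here Y_i is the number of parasitaemic children among the N_i tested at location i, x_i holds the covariates, and d_ij is the distance between locations i and j. Under this assumed form, the non-spatial model corresponds to dropping the location-specific spatial effects φ_i, and the distance at which the spatial correlation falls below 5% is -ln(0.05)/ρ, roughly 3/ρ.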

However, in this study the Bayesian geostatistical model estimated very low spatial correlation, dropping to less than 5% at distances larger than 380 m, which is smaller than the pixel size used for prediction (3 km by 3 km). The non-spatial model was superior to the fitted spatial models because spatial correlation is only present at a very local scale, determined by the flight range of the mosquito rather than by environmental covariates. A possible reason is that on-going interventions determine mosquito densities and parasitaemia, and therefore reduce the influence of environmental predictors on the mosquito.

Previous models of malaria transmission addressed non-linearity between the risk of the malaria-related outcome (on the logit scale) and its predictors solely by categorizing the non-linear covariates. In this study, non-linearity was additionally modelled using different types of spline curves. The resulting risk estimates suggested that predictions are sensitive to the model fitted. However, model validation indicated that models based on linear terms are superior to non-linear models in Zambia. Even though the P-spline model had the higher predictive ability at 95% CIs, the linear model was considered the one with the best predictive ability owing to the smaller width of these intervals and the results of the KL divergence and χ²-measures between observed and predicted prevalence data. This model also has the advantage that its regression coefficients are easier for non-statisticians to interpret than spline curves.

None of the regression coefficients of the final prediction model were significant. However, they are needed to determine the mean risk estimate for each location, and excluding them heavily reduces the predictive ability of the model (results not shown). The lack of significance of the environmental factors is partially explained by the effects of malaria interventions, which can have a stronger influence on the parasitaemia risk than the environmental factors. In fact, the multivariate non-spatial model, which adjusted for different types of interventions (Table 4), indicated a significant effect of the presence of at least one bed net in a household in reducing parasitaemia risk. Therefore, interventions are a major driver of parasitaemia risk in Zambia, and including these data in the prediction model would increase the accuracy of model-based risk predictions. Unfortunately, in this study the parasitaemia risk could not be predicted conditional on bed net coverage because the geographical distribution of bed net coverage was not known for the time of the ZMIS 2006. Intervention data are needed over the entire study area in order to be used for prediction purposes. Since programmes such as MACEPA are putting a lot of effort into scaling up malaria interventions in Zambia, the role of interventions is likely to increase further within the next couple of years. Therefore, compilation of intervention coverage data at high spatial resolution is becoming essential for creating reliable risk maps.

The prediction map indicates high variation of parasitaemia risk over the country. In particular, high risk is predicted in a stripe from the south to the east of Zambia, which is characterized mainly by low altitude, high NDVI and high day and night LST. The relatively small risk in the south-west of the country might be influenced by low vegetation and rainfall, while the small risk in the north-west regions could be due to low day LST and a high proportion of forests. It is also possible that malaria interventions have been focused on selected areas such as the Zambezi river, reducing the parasite prevalence in those areas. It is interesting to note that the maps were shown to local experts, who confirmed the depicted risk patterns. The map of the prediction error of the Bayesian model could help to improve the precision of the parasitaemia risk prediction by identifying areas of high uncertainty where additional survey locations could be randomly distributed in subsequent MISs, thereby reducing the error and raising the precision of subsequent studies.

The reduced smoothness of the map might be explained by the weak spatial correlation, which most likely indicates that interventions counteract the environmental effects. The sampling framework of the MIS may also tend to underestimate spatial correlation because the survey clusters are partially driven by population density, where transmission is largely influenced by interventions. However, the drastic changes within small distances highlight the importance of high-resolution maps. If prediction had been done at a coarser resolution, some of the high-risk areas would disappear and the burden would be underestimated in these regions. For other purposes which do not need such high precision, the resolution can easily be reduced by combining neighbouring pixels.

The average predicted parasitaemia risk over the whole country was 26.4%; however, after adjusting for population size, the risk dropped to 20.0%. It is striking that although the average risk in Lusaka province is the fourth highest of all provinces (31.8%), after taking into account the population density the risk reduces to 7.3%, indicating that the highest-risk areas are the less populated ones. In fact, Lusaka is the province with the lowest population-adjusted risk. The provinces of Copperbelt, Southern and Western have low population-adjusted risks (less than 15%). On the other hand, Northern province has the highest population-adjusted parasitaemia risk, followed by Luapula and Eastern province. In addition, Northern province has the highest number of infected children less than five years old. Therefore, interventions should be concentrated in Northern province, especially at the north-eastern border and in the middle of the province. Further interventions in Copperbelt and Southern might also have a strong impact in reducing the overall burden, since the total number of infected children is as high as in Luapula even though their prevalence is lower than 15%.
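The population adjustment discussed above can be reproduced from the province totals in Table 3. The short sketch below recomputes the population-adjusted prevalence (Prev 2) as predicted infected children divided by children under five; the dictionary layout and the function name are illustrative choices, not taken from the study's own code.

```python
# (children under five, predicted infected children) per province, from Table 3
table3 = {
    "Central": (182_847, 34_572),
    "Copperbelt": (311_317, 37_763),
    "Eastern": (240_137, 66_614),
    "Luapula": (125_049, 37_943),
    "Lusaka": (275_120, 20_134),
    "North-Western": (128_935, 29_011),
    "Northern": (277_764, 106_322),
    "Southern": (243_743, 33_430),
    "Western": (147_229, 20_321),
}


def adjusted_prevalence(children: int, infected: int) -> float:
    """Population-adjusted prevalence in percent."""
    return 100.0 * infected / children


by_province = {
    name: adjusted_prevalence(children, infected)
    for name, (children, infected) in table3.items()
}

total_children = sum(children for children, _ in table3.values())
total_infected = sum(infected for _, infected in table3.values())
national = adjusted_prevalence(total_children, total_infected)

# Provinces ordered from highest to lowest population-adjusted prevalence
for name, value in sorted(by_province.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<14}{value:5.1f}%")
print(f"{'Zambia':<14}{national:5.1f}%")
```

Because each pixel contributes in proportion to the number of children living in it, sparsely populated high-risk pixels carry little weight, which is why the adjusted figure for Lusaka falls so far below its unweighted mean risk.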

ConclusionsThe map of parasitaemia risk together with the predic-tion error and the population at risk give an importantoverview of the malaria situation in Zambia The mapscan be used by decision-makers to allocate resourcesand interventions to reach the most persons in theregions of highest risk to reduce the burden of malariasignificantly for Zambia The ZMIS of 2008 and 2010will provide very important information on the changesof the parasitaemia risk over space and time and help inthe evaluation of the progress of new and establishedintervention programmes adjusted for environmentaldrivers of the transmission risk

Additional file 1 Additional information regarding modelformulation and spline curvesClick here for file[ httpwwwbiomedcentralcomcontentsupplementary1475-2875-9-37-S1PDF ]

AcknowledgementsThis investigation received financial support from the Swiss National ScienceFoundation (SNF) (Project No 325200-118379) and the Bill and MelindaGates Foundation (Project No 39777) Special thanks to Prof Thomas Smithfor his valuable inputs on data analysis aspects

Author details1Department of Public Health and Epidemiology Swiss Tropical Institute POBox 4002 Basel Switzerland 2Malaria Control and Evaluation Partnership inAfrica PATH Ferney France 3Departement of Public Health and ResearchMinistry of Health Zambia

Authorsrsquo contributionsNR analysed the data and drafted the manuscript PV was responsible forconception and design of the analysis supervised the implementation andrevised the manuscript JMM participated in designing and coordinating thefieldwork and provided important intellectual content to this study LGcontributed in the analysis and helped to draft the manuscript ECK gaveintellectual content and revised the draft critically VM enabled the MIS 2006to occur and has been instrumental in championing the use of this data forguiding decisions RWS coordinated the fieldwork and reviewed themanuscript All authors read and approved the final manuscript

Competing interestsThe authors declare that they have no competing interests

Received 10 November 2009Accepted 1 February 2010 Published 1 February 2010

References1 Ministry of Health 2008 National malaria control action plan actions for

scale-up for impact on malaria in Zambia Zambia 2008 [httpwwwnmccorgzmpublicationshtm]

2 Ministry of Health A road map for impact on malaria in Zambia 2006-2010a 5-year strategic plan Zambia 2006 [httpwwwnmccorgzmpublicationshtm]

Riedel et al Malaria Journal 2010 937httpwwwmalariajournalcomcontent9137

Page 12 of 13

3 Campbell K Terry D Wood S Scaling up for impact a model for malariacontrol Zambia 2007 [httpwwwpathorgfilesMACEPA_Bro_2007-08-06pdf]

4 Ministry of Health 2009 National malaria control action plan actions forscale-up for impact on malaria in Zambia Zambia 2009

5 Ministry of Health Zambia Central Statistical Office PATH MACEPA CDCUniversity of Zambia WHO Zambia National Malaria Indicator Survey 2006Zambia 2006 [httpwwwnmccorgzmpublicationshtm]

6 Craig MH Snow RW Le Sueur D A climate-based distribution model ofmalaria transmission in sub-Saharan Africa Parasitol Today 199915105-111

7 Guerra CA Gikandi PW Tatem AJ Noor AM Smith DL Hay SI Snow RWThe limits and intensity of Plasmodium falciparum transmissionimplications for malaria control and elimination worldwide PLoS Med2008 5e38

8 Diggle PJ Tawn JA Moyeed RA Model-based geostatistics Appl Stat 199847299-350

9 Filmer D Pritchett LH Estimating wealth effects without expendituredata - or tears an application to educational enrolments in states ofIndia Demography 2001 38115-32

10 US Geological Survey (USGS) Land Processes Distributed Active ArchiveCenter (LP DAAC) MODIS land products [httpslpdaacusgsgovlpdaacproductsmodis_products_table]

11 US Geological Survey (USGS) Famine Early Warning Systems Network(FEWS NET) African Data Dissemination Service (ADDS) [httpearlywarningusgsgovadds]

12 US Geological Survey (USGS) Digital Elevation Models (DEM) [httpedc2usgsgovgeodataindexphp]

Figure 4 Predicted parasitaemia risk map for children <5 years in Zambia. The map is based on a Bayesian logistic regression model with linear terms for day LST, night LST, NDVI and rainfall. The estimates correspond to the median of the posterior predictive distributions computed over 100,000 pixels.

Figure 5 Prediction error of the parasitaemia risk estimates given in Figure 4.

variance was around 1.7. The P-spline spatial model estimates a ratio of spatial to total variation of almost 1/2. The minimum distance at which the spatial correlation is lower than 5%, as estimated by the spatial model, is 380 meters (95% confidence interval: 210 m, 3,390 m), which is even lower than the resolution of the grid used for prediction (3 km). This suggests a very weak spatial correlation and supports the choice of the non-spatial model as the one with the best predictive ability.
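To make the 5% threshold concrete, the following minimal sketch assumes an exponential correlation function, corr(d) = exp(-d / phi). That functional form, the parameter name `phi_km` and both helper functions are assumptions made for illustration only; they are not taken from the model specification of the paper. Under this assumption the distance at which correlation falls to 5% equals phi * ln(20), roughly 3 * phi, so the decay parameter can be backed out from the 380 m reported above.

```python
import math


def exponential_correlation(distance_km: float, phi_km: float) -> float:
    """Correlation between two locations under an ASSUMED exponential decay."""
    return math.exp(-distance_km / phi_km)


def distance_below_threshold(phi_km: float, threshold: float = 0.05) -> float:
    """Distance at which the assumed correlation has dropped to `threshold`."""
    return -phi_km * math.log(threshold)


# 5% range reported in the text: 380 m = 0.38 km.
reported_range_km = 0.38

# Decay parameter implied by that range under the exponential assumption.
phi_km = reported_range_km / -math.log(0.05)

# Correlation left between neighbouring cells of the 3 km prediction grid.
grid_resolution_km = 3.0
corr_at_grid = exponential_correlation(grid_resolution_km, phi_km)

print(f"implied decay parameter: {phi_km:.3f} km")
print(f"correlation at {grid_resolution_km} km: {corr_at_grid:.2e}")
```

Under these assumptions the correlation between adjacent 3 km pixels is numerically negligible, which is consistent with the argument for preferring the non-spatial model for prediction.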

Table 3 Predicted number of children <5 years with malaria parasites in the blood (per province)

Province       Prev 1 (in %)  Children <5 years  Infected children  95% CI           Prev 2 (in %)
Central        26.0           182,847            34,572             21,589-50,252    18.9
Copperbelt     23.3           311,317            37,763             18,572-70,719    12.1
Eastern        37.4           240,137            66,614             46,297-87,219    27.7
Luapula        32.0           125,049            37,943             29,039-47,638    30.3
Lusaka         31.8           275,120            20,134             8,121-46,849     7.3
North-Western  21.0           128,935            29,011             16,200-51,616    22.5
Northern       39.1           277,764            106,322            79,379-135,701   38.3
Southern       18.8           243,743            33,430             19,862-53,854    13.7
Western        14.4           147,229            20,321             12,730-30,232    13.8
Total          26.4           1,932,141          386,110            251,789-574,080  20.0

Estimates are based on the mean and the 95% confidence intervals (CI) of the posterior predictive distribution of the non-spatial model with linear terms.
Prev 1: Model-based risk estimates.
Prev 2: Model-based population-adjusted prevalence.

Figure 6 Estimated number of infected children <5 years per 100 km².

Discussion
The ZMIS in 2006 was the first nation-wide malaria survey which combined parasitological data with other malaria indicators such as bed net use, indoor residual spraying and household related aspects. The aim of the survey was to estimate the coverage of interventions and the malaria related burden in children less than five years. However, the MIS data are also a very important source of information for estimating parasitaemia risk at local scales and thus for identifying the high-risk areas that require high intervention coverage and continuous monitoring. By combining parasitaemia risk estimates with population data, the number of infected children can be estimated, which can help to improve resource allocation and health management and to target additional interventions so as to achieve the highest risk reduction in the most populated areas. Repeated surveys will enable the
evaluation of the effectiveness of on-going interventions. The Zambian Ministry of Health and partners have already completed the second ZMIS in the country in 2008 and are planning another follow-up survey in 2010.

Prior to the MIS, compiled historical survey data have been used to obtain estimates of parasitaemia risk at high resolution. The Mapping Malaria Risk in Africa (MARA) project was initiated in 1998 with the aim of compiling published and unpublished malaria survey data in Africa. MARA is the most comprehensive malariometric database, compiling data from 1900 up to date. The MARA data have been analysed using rigorous spatial statistical modelling [17-20] to obtain high-resolution malaria risk estimates at regional and country level in Africa. However, risk estimates based on historical data do not reflect the current malaria situation, which is influenced by on-going interventions. The surveys are not representative, as high risk areas tend to be over-represented. In addition, historical surveys have been conducted at various locations using different methodologies, including different age groups, and carried out in different seasons. In contrast, MISs do not suffer from these drawbacks. The MIS locations are randomly chosen and the data are available at individual level, allowing for estimation of age-specific risk. Another advantage of the MIS data is the household level information available, which gives the possibility of differentiating the contribution of climate, socio-economic characteristics and control interventions to the overall parasitaemia risk. If, in addition, these data are known at high spatial resolution, they can be included in the geostatistical model to obtain more accurate predictions of the malaria risk.

This study created the first contemporary empirical parasitaemia risk map for Zambia. Many existing maps of malaria transmission rely only on rough geographical and climatic iso-lines and expert opinions. Until 1998, none of the maps had a numerical definition; hence the malaria risk maps were neither comparable nor trustworthy. Then Hay et al. [21] produced a climatic map for malaria transmission in Kenya, followed by Craig et al. [22], who developed a climatic suitability malaria risk map for the whole of Africa. Up to now there are only a few maps that also contain empirical data; the first map of this kind was published in 2000 for Mali by Kleinschmidt et al. [18]. For Zambia, the first empirical malaria risk map was produced by Hay et al. [23] and is part of a global risk map based on historical data.

Malaria survey data are expected to be correlated in space. Spatial correlation at short distances is introduced by the transmission process, driven by the flight range of the mosquito vector, while at wider ranges spatial correlation reflects common exposures to environmental conditions which influence mosquito survival and longevity. High spatial resolution risk estimation requires prediction at locations where malaria survey data are not available. High-resolution environmental data can be obtained via remote sensing (RS). GIS software has excellent mapping capabilities and is a very useful tool for processing RS data.

Statistical techniques model the relation between parasitaemia risk and risk factors (environmental factors, possible interventions, socio-economic factors) via a logistic regression model, which is further used for prediction. Standard statistical methods assume independence of the survey locations. Violating this assumption when modelling spatially-correlated malaria survey data may lead to imprecise estimates of the risk, of the significance of the risk factors and of the prediction error. Similarly, modelling spatial correlation in weakly correlated data increases the number of model parameters and decreases the precision of the estimates. Geostatistical models take into account spatial correlation by introducing an additional parameter (random effect) at each survey location and assume that geographical dependence is a function of distance between locations. Depending on the number of survey locations, these models can be highly parameterized and they can only be estimated using Bayesian inference and MCMC simulation. Bayesian geostatistical models have been employed in malaria risk estimation by e.g. Diggle et al. [19], Gemperli et al. [20,24] and Gosoniu et al. [16,17].

Table 4 Parasitaemia risk predictors of different models

Covariates                           Bivariate          Multivariate       Prediction         Spatial
                                     non-spatial        non-spatial        model              P-spline model
                                     OR (95% CI)        OR (95% CI)        OR (95% CI)        OR (95% CI)
Day LST                              0.49 (0.43, 0.57)  0.65 (0.37, 1.15)  0.61 (0.32, 1.17)
Night LST                            1.23 (1.09, 1.40)  1.21 (0.77, 1.88)  1.18 (0.79, 1.77)
NDVI                                 2.25 (1.90, 2.66)  1.28 (0.67, 2.73)  1.29 (0.66, 2.77)
Rainfall                             1.56 (1.37, 1.76)  1.21 (0.85, 1.68)  1.18 (0.80, 1.73)
Land cover covariates
  Wetland                            1.20 (1.07, 1.34)  0.97 (0.62, 1.55)  0.98 (0.67, 1.48)  0.72 (0.40, 1.37)
  Forest                             0.96 (0.84, 1.10)  0.72 (0.43, 1.08)  0.72 (0.43, 1.10)  0.64 (0.38, 0.99)
  Urban                              0.35 (0.25, 0.50)  0.70 (0.38, 1.21)  0.71 (0.37, 1.29)  0.75 (0.33, 1.48)
  Shrubland                          1.06 (0.94, 1.20)  1.07 (0.76, 1.53)  1.05 (0.71, 1.47)  1.07 (0.72, 1.53)
Region (rural)
  urban                              0.17 (0.11, 0.25)  0.53 (0.14, 2.03)  0.37 (0.11, 1.13)  0.43 (0.12, 1.50)
Distance to water bodies (<1000 m)
  1000-2499                          0.71 (0.53, 0.96)  0.73 (0.29, 1.86)  0.71 (0.29, 1.72)  0.55 (0.19, 1.50)
  2500-4999                          0.54 (0.38, 0.77)  0.61 (0.20, 1.55)  0.60 (0.19, 1.72)  0.49 (0.17, 1.42)
  ≥ 5000                             0.11 (0.04, 0.30)  0.22 (0.03, 1.40)  0.21 (0.03, 1.39)  0.20 (0.02, 1.93)
Altitude (<850 m)
  850-1199                           0.72 (0.49, 1.06)  0.21 (0.03, 1.70)  0.22 (0.03, 1.80)  0.20 (0.02, 1.92)
  1200-1399                          0.73 (0.49, 1.09)  0.32 (0.03, 3.27)  0.30 (0.03, 3.29)  0.27 (0.02, 4.81)
  ≥ 1400                             0.48 (0.25, 0.92)  0.74 (0.04, 10.2)  0.57 (0.03, 7.68)  0.32 (0.02, 7.86)
Socio-economic index (1st quintile)
  2nd quintile                       1.06 (0.75, 1.50)  1.21 (0.78, 1.92)  -                  -
  3rd quintile                       0.85 (0.60, 1.22)  1.31 (0.75, 2.23)  -                  -
  4th quintile                       0.28 (0.17, 0.46)  0.75 (0.33, 1.75)  -                  -
  5th quintile                       0.09 (0.04, 0.19)  0.40 (0.09, 1.56)  -                  -
Interventions
  IRS                                0.16 (0.07, 0.36)  1.73 (0.42, 6.90)  -                  -
  Bed nets                           0.59 (0.46, 0.77)  0.60 (0.39, 0.88)  -                  -
                                                        Mean (95% CI)      Mean (95% CI)      Mean (95% CI)
Range (in km)                        -                  -                  -                  0.38 (0.21, 3.39)
σ² (spatial error)                   -                  -                  -                  0.98 (0.01, 2.77)
τ² (measurement error)               -                  1.77 (0.90, 3.23)  1.71 (0.93, 2.84)  0.82 (0.01, 2.69)

Associations between parasitaemia risk and predictors of the non-spatial model with linear terms and the Bayesian spatial logistic regression P-spline model, presented as odds ratios (OR) with their respective 95% confidence intervals (CI). Regression coefficients are based on P-spline curves.
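The odds ratios in Table 4 can be read as percentage changes in the odds of parasitaemia. The short sketch below illustrates that arithmetic for the bed net and IRS rows of the multivariate non-spatial model; the function names are illustrative and not from the paper, and treating an interval that excludes 1 as "significant" is the usual convention rather than a rule stated by the authors.

```python
def percent_change_in_odds(odds_ratio: float) -> float:
    """Percentage change in the odds implied by an odds ratio (negative = reduction)."""
    return (odds_ratio - 1.0) * 100.0


def interval_excludes_one(lower: float, upper: float) -> bool:
    """True when the interval for the odds ratio does not contain 1 (no effect)."""
    return not (lower <= 1.0 <= upper)


# Multivariate non-spatial model, Table 4: OR with 95% interval bounds.
bed_nets = (0.60, 0.39, 0.88)
irs = (1.73, 0.42, 6.90)

odds_ratio, lower, upper = bed_nets
reduction = -percent_change_in_odds(odds_ratio)
# The upper OR bound gives the smallest reduction, the lower bound the largest.
reduction_smallest = -percent_change_in_odds(upper)
reduction_largest = -percent_change_in_odds(lower)

print(f"bed nets: odds reduced by {reduction:.0f}% "
      f"({reduction_smallest:.0f}%, {reduction_largest:.0f}%)")
print("bed nets interval excludes 1:", interval_excludes_one(lower, upper))
print("IRS interval excludes 1:", interval_excludes_one(irs[1], irs[2]))
```

For bed nets this reproduces the 40% (12%, 61%) reduction quoted in the abstract, while the IRS interval in the same model includes 1.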

However, in this study the Bayesian geostatistical model estimated very low spatial correlation, dropping to less than 5% at distances larger than 380 m, which is lower than the spatial resolution of the pixels used for prediction (3 km by 3 km). The non-spatial model was superior to the fitted spatial models because spatial correlation is only present at a very local scale, indicated by the flight range of the mosquito rather than by environmental covariates. Possible reasons might be on-going interventions, which determine mosquito densities and parasitaemia and therefore reduce the influence of environmental predictors on the mosquito.

Previous models of malaria transmission addressed non-linearity between the risk of the malaria-related outcome (on the logit scale) and its predictors solely by categorizing the non-linear covariates. In this study, non-linearity is additionally modelled using different types of spline curves. The resulting risk estimates suggested that predictions are sensitive to the model fitted. However, model validation indicated that models based on linear terms are superior to non-linear models in Zambia. Even though the P-spline model had the higher predictive ability at 95% CIs, the linear model was considered the one with the best predictive ability because of the smaller range of these intervals and the results of the KL divergence and χ²-measures between observed and predicted prevalence data. This model also has the advantage that its regression coefficients are easier for non-statisticians to interpret than those of the spline curves.

None of the regression coefficients of the final prediction model were significant. However, they are needed to determine the mean risk estimate for each location, and excluding them heavily reduces the predictive ability of the model (results not shown). The lack of significance of the environmental factors is partially explained by the effects of malaria interventions, which can have a stronger influence on the parasitaemia risk than the environmental factors. In fact, the spatial model which adjusted for different types of interventions indicated a significant effect of the presence of at least one bed net in a household in reducing parasitaemia risk. Therefore, interventions are a major driver of parasitaemia risk in Zambia, and including these data in the prediction model would increase the accuracy of model-based risk predictions. Unfortunately, in this study the parasitaemia risk could not be predicted conditional on bed net coverage because the geographical distribution of bed net coverage was not known for the time of the ZMIS 2006. Intervention data are needed over the entire study area in order to be used for prediction purposes. Since e.g. MACEPA is putting a lot of effort into scaling up malaria interventions in Zambia, the role of interventions is likely to increase even further within the next couple of years. Therefore, compilation of intervention coverage data at high spatial resolution is becoming essential to create reliable risk maps.

The prediction map indicates high variation of parasitaemia risk over the country. In particular, high risk is predicted in a stripe from the south to the east of Zambia, which is characterized mainly by low altitude, high NDVI and high day and night LST. The relatively low risk in the south-west of the country might be influenced by low vegetation and rainfall, while the low risk in the north-west regions could be due to low day LST and a high proportion of forests. It is also possible that malaria interventions have been focused on selected areas like the Zambezi river, reducing the parasite prevalence in those areas. It is interesting to note that the maps have been shown to local experts, who confirmed the depicted risk patterns. The map of the prediction error of the Bayesian model could assist in improving the precision of the parasitaemia risk prediction by identifying the areas of high uncertainty, where additional survey locations could be randomly distributed in following MISs and hence reduce the error and raise the precision of following studies.

The reduced smoothness of the map might be explained by the weak spatial correlation, which most likely indicates that interventions counteract the environmental effects. The sampling framework of the MIS may also tend to underestimate spatial correlation because the survey clusters are partially driven by population density, where transmission is largely influenced by interventions. However, the drastic changes within small distances highlight the importance of high-resolution maps. If prediction had been done at a lower resolution, some of the high risk areas would disappear and the burden would have been underestimated in these regions. For other purposes which do not need such high precision, reducing the resolution can easily be done by combining neighbouring pixels.

The average predicted parasitaemia risk over the whole country was 26.4%; however, after adjusting for the population size the risk dropped to 20.0%. It is striking that although the average risk in Lusaka province is the fourth highest of all provinces (31.8%), after taking into account the population density the risk reduces to 7.3%, indicating that the highest risk areas are the less populated ones. In fact, Lusaka is the province with the lowest population-adjusted risk. The provinces of Copperbelt, Southern and Western have low population-adjusted risks (less than 15%). On the other hand, Northern province has the highest population-adjusted parasitaemia risk, followed by Eastern province and Luapula. In addition, Northern province has the highest number of infected children less than five years old. Therefore, interventions should be concentrated in the Northern province, especially at the north-eastern border and in the middle of the province. Further interventions in Copperbelt and Southern might also have a strong impact in reducing the overall burden, since the total number of infected children is as high as in Luapula even though their prevalence is lower than 15%.
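The population-adjusted figures discussed above can be reproduced from the counts in Table 3. The sketch below assumes that the population-adjusted prevalence is simply the predicted number of infected children divided by the number of children under five; this reading is inferred from the table values rather than stated explicitly by the authors, and all variable and function names are illustrative.

```python
# (province, children <5 years, predicted infected children), copied from Table 3
TABLE_3 = [
    ("Central", 182_847, 34_572),
    ("Copperbelt", 311_317, 37_763),
    ("Eastern", 240_137, 66_614),
    ("Luapula", 125_049, 37_943),
    ("Lusaka", 275_120, 20_134),
    ("North-Western", 128_935, 29_011),
    ("Northern", 277_764, 106_322),
    ("Southern", 243_743, 33_430),
    ("Western", 147_229, 20_321),
]


def population_adjusted_prevalence(infected: int, children: int) -> float:
    """Prevalence in % as infected children over all children under five (assumed definition)."""
    return 100.0 * infected / children


# Province-level adjusted prevalence.
adjusted = {
    province: population_adjusted_prevalence(infected, children)
    for province, children, infected in TABLE_3
}

# National figure from the summed counts.
total_children = sum(children for _, children, _ in TABLE_3)
total_infected = sum(infected for _, _, infected in TABLE_3)
national = population_adjusted_prevalence(total_infected, total_children)

for province, value in sorted(adjusted.items(), key=lambda item: item[1]):
    print(f"{province:<14}{value:5.1f}%")
print(f"{'Zambia':<14}{national:5.1f}%")
```

Sorting by the adjusted value puts Lusaka at the bottom and Northern province at the top, matching the ranking described in the text.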

Conclusions
The map of parasitaemia risk, together with the prediction error and the population at risk, gives an important overview of the malaria situation in Zambia. The maps can be used by decision-makers to allocate resources and interventions so as to reach the most people in the regions of highest risk and to reduce the burden of malaria in Zambia significantly. The ZMIS of 2008 and 2010 will provide very important information on the changes of the parasitaemia risk over space and time and help in the evaluation of the progress of new and established intervention programmes, adjusted for environmental drivers of the transmission risk.

Additional file 1: Additional information regarding model formulation and spline curves. [http://www.biomedcentral.com/content/supplementary/1475-2875-9-37-S1.PDF]

Acknowledgements
This investigation received financial support from the Swiss National Science Foundation (SNF) (Project No. 325200-118379) and the Bill and Melinda Gates Foundation (Project No. 39777). Special thanks to Prof. Thomas Smith for his valuable inputs on data analysis aspects.

Author details
1 Department of Public Health and Epidemiology, Swiss Tropical Institute, PO Box, 4002 Basel, Switzerland. 2 Malaria Control and Evaluation Partnership in Africa, PATH, Ferney, France. 3 Department of Public Health and Research, Ministry of Health, Zambia.

Authors' contributions
NR analysed the data and drafted the manuscript. PV was responsible for conception and design of the analysis, supervised the implementation and revised the manuscript. JMM participated in designing and coordinating the fieldwork and provided important intellectual content to this study. LG contributed to the analysis and helped to draft the manuscript. ECK gave intellectual content and revised the draft critically. VM enabled the MIS 2006 to occur and has been instrumental in championing the use of these data for guiding decisions. RWS coordinated the fieldwork and reviewed the manuscript. All authors read and approved the final manuscript.

Competing interests
The authors declare that they have no competing interests.

Received: 10 November 2009 Accepted: 1 February 2010 Published: 1 February 2010

References
1. Ministry of Health: 2008 National malaria control action plan: actions for scale-up for impact on malaria in Zambia. Zambia 2008 [http://www.nmcc.org.zm/publications.htm].
2. Ministry of Health: A road map for impact on malaria in Zambia 2006-2010: a 5-year strategic plan. Zambia 2006 [http://www.nmcc.org.zm/publications.htm].
3. Campbell K, Terry D, Wood S: Scaling up for impact: a model for malaria control. Zambia 2007 [http://www.path.org/files/MACEPA_Bro_2007-08-06.pdf].
4. Ministry of Health: 2009 National malaria control action plan: actions for scale-up for impact on malaria in Zambia. Zambia 2009.
5. Ministry of Health Zambia, Central Statistical Office, PATH MACEPA, CDC, University of Zambia, WHO: Zambia National Malaria Indicator Survey 2006. Zambia 2006 [http://www.nmcc.org.zm/publications.htm].
6. Craig MH, Snow RW, Le Sueur D: A climate-based distribution model of malaria transmission in sub-Saharan Africa. Parasitol Today 1999, 15:105-111.
7. Guerra CA, Gikandi PW, Tatem AJ, Noor AM, Smith DL, Hay SI, Snow RW: The limits and intensity of Plasmodium falciparum transmission: implications for malaria control and elimination worldwide. PLoS Med 2008, 5:e38.
8. Diggle PJ, Tawn JA, Moyeed RA: Model-based geostatistics. Appl Stat 1998, 47:299-350.
9. Filmer D, Pritchett LH: Estimating wealth effects without expenditure data - or tears: an application to educational enrolments in states of India. Demography 2001, 38:115-32.
10. US Geological Survey (USGS), Land Processes Distributed Active Archive Center (LP DAAC): MODIS land products. [https://lpdaac.usgs.gov/lpdaac/products/modis_products_table].
11. US Geological Survey (USGS), Famine Early Warning Systems Network (FEWS NET): African Data Dissemination Service (ADDS). [http://earlywarning.usgs.gov/adds].
12. US Geological Survey (USGS): Digital Elevation Models (DEM). [http://edc2.usgs.gov/geodata/index.php].
13. World Health Organization (WHO): The HealthMapper Database. [http://www.who.int/health_mapping/tools/healthmapper/en/index.html].
14. Oak Ridge National Laboratory (Oak Ridge, TN): LandScan™ Global Population Database. [http://www.ornl.gov/landscan].
15. US Census Bureau: International Data Base (IDB), Population Division, Zambia 2006. [http://www.census.gov/ipc/www/idb/country.php].
16. Gosoniu L, Vounatsou P, Sogoba N, Smith T: Bayesian modelling of geostatistical malaria risk data. Geospat Health 2006, 1:127-139.
17. Gosoniu L, Vounatsou P, Sogoba N, Maire N, Smith T: Mapping malaria risk in West Africa using a Bayesian nonparametric non-stationary model. Comput Stat Data Anal 2009, 53:3358-3371.
18. Kleinschmidt I, Bagayoko M, Clarke GPY, Craig M, le Sueur D: A spatial statistical approach to malaria mapping. Int J Epidemiol 2000, 29:355-361.
19. Diggle P, Moyeed R, Rowlingson B, Thomson M: Childhood malaria in the Gambia: a case-study in model-based geostatistics. Appl Stat 2002, 51:493-506.
20. Gemperli A, Sogoba N, Fondjo E, Mabaso M, Bagayako M, Briët OJT, Anderegg D, Liebe J, Smith T: Mapping malaria transmission in West and Central Africa. Trop Med Int Health 2006, 11:1032-46.
21. Hay SI, Snow RW, Rogers DJ: Predicting malaria seasons in Kenya using multitemporal meteorological satellite sensor data. Trans R Soc Trop Med Hyg 1998, 92:12-20.
22. Craig MH, Snow RW, le Sueur D: A climate-based distribution model of malaria transmission in sub-Saharan Africa. Parasitol Today 1999, 15:105-111.
23. Hay SI, Guerra CA, Gething PW, Patil AP, Tatem AJ, Noor AM, Kabaria CW, Manh BH, Elyazar IRF, Brooker S, Smith DL, Moyeed RA, Snow RW: A world malaria map: Plasmodium falciparum endemicity in 2007. PLoS Med 2009, 6:e1000048.
24. Gemperli A, Vounatsou P, Kleinschmidt I, Bagayoko M, Lengeler C, Smith T: Spatial patterns of infant mortality in Mali: the effect of malaria endemicity. Am J Epidemiol 2004, 159:64-72.

doi:10.1186/1475-2875-9-37
Cite this article as: Riedel et al.: Geographical patterns and predictors of malaria risk in Zambia: Bayesian geostatistical modelling of the 2006 Zambia national malaria indicator survey (ZMIS). Malaria Journal 2010, 9:37.

variance was around 17 The P-spline spatial modelestimates a ratio of spatial to the total variation ofalmost 12 The minimum distance at which the spatialcorrelation is lower than 5 estimated by the spatialmodel is 380 meters (95 confidence interval 210 m3390 m) which is even lower than the resolution of thegrid used for prediction (3 km) This suggests a veryweak spatial correlation and supports the choice of thenon-spatial model as the one with the best predictiveability

DiscussionThe ZMIS in 2006 was the first nation-wide malaria sur-vey which combined parasitological data with other

malaria indicators such as bed net use indoor residualspraying and household related aspects The aim of thesurvey was to estimate the coverage of interventions andthe malaria related burden in children less than fiveyears However the MIS data are also a very importantsource of information for estimating parasitaemia risk atlocal scales and thus for identifying the high-risk areasthat require high intervention coverage and continuousmonitoring Combining parasitaemia risk estimates withpopulation data the number of infected children can beestimated which can help for better resource allocationhealth management and targeted additional interven-tions to achieve the highest risk reduction for the mostpopulated areas Repeated surveys will enable the

Table 3 Predicted number of children lt5 years with malaria parasites in the blood (per province)

Province Prev 1(in )

Childrenlt5 years

InfectedChildren

95CI Prev 2(in )

Central 260 182847 34572 21589 50252 189

Copperbelt 233 311317 37763 18572 70719 121

Eastern 374 240137 66614 46297 87219 277

Luapula 320 125049 37943 29039 47638 303

Lusaka 318 275120 20134 8121 46849 73

North-Western 210 128935 29011 16200 51616 225

Northern 391 277764 106322 79379 135701 383

Southern 188 243743 33430 19862 53854 137

Western 144 147229 20321 12730 30232 138

Total 264 1932141 386110 251789 574080 200

Estimates are based on the mean and the 95 confidence intervals (CI) of the posterior predictive distribution of the non-spatial model with linear termsPrev 1 Model based risk estimatesPrev 2 Model-based population-adjusted prevalence

Figure 6 Estimated number of infected children lt5 years per 100 km2

Riedel et al Malaria Journal 2010 937httpwwwmalariajournalcomcontent9137

Page 9 of 13

evaluation of the effectiveness of on-going interventionsAlready the Zambian Ministry of Health and partnershave completed the second ZMIS in the country in2008 and planning another follow-up survey in 2010Prior to the MIS compiled historical survey data has

been used to obtain estimates of parasitaemia risk athigh resolution The mapping malaria risk in Africa(MARA) project was initiated in 1998 with the aim tocompile published and unpublished malaria survey datain Africa MARA is the most comprehensive malario-metric database compiling data from 1900 up to dateThe MARA data has been analysed using rigorous spa-tial statistical modelling [17-20] to obtain high-resolu-tion malaria risk estimates at regional and country-level

in Africa However risk estimates of historical data donot reflect the current malaria situation which is influ-enced by on-going interventions The surveys are notrepresentative as high risk areas tend to be over-repre-sented In addition historical surveys have been con-ducted between various locations using differentmethodologies including different age groups and car-ried out at different seasons On the contrary MISs donot suffer from these drawbacks The MIS locations arerandomly chosen and the data are available at individuallevel allowing for estimation of age-specific riskAnother advantage of the MIS data is the householdlevel information available which gives the possibility ofdifferentiating the contribution of climate socio-

Table 4 Parasitaemia risk predictors of different models

Covariates Bivariatenon-spatial

Multivariatenon-spatial

Predictionmodel

Spatial P-spline model

OR (95CI) OR (95CI) OR (95CI) OR (95CI)

Day LST 049 (043 057) 065 (037 115) 061 (032 117)

Night LST 123 (109 140) 121 (077 188) 118 (079 177)

NDVI 225 (190 266) 128 (067 273) 129 (066 277)

Rainfall 156 (137 176) 121 (085 168) 118 (080 173)

Land cover covariates

Wetland 120 (107 134) 097 (062 155) 098 (067 148) 072 (040 137)

Forest 096 (084 110) 072 (043 108) 072 (043 110) 064 (038 099)

Urban 035 (025 050) 070 (038 121) 071 (037 129) 075 (033 148)

Shrubland 106 (094 120) 107 (076 153) 105 (071 147) 107 (072 153)

Region (rural)

urban 017 (011 025) 053 (014 203) 037 (011 113) 043 (012 150)

Distance to water bodies (lt1000 m)

1000-2499 071 (053 096) 073 (029 186) 071 (029 172) 055 (019 150)

2500-4999 054 (038 077) 061 (020 155) 060 (019 172) 049 (017 142)

ge 5000 011 (004 030) 022 (003 140) 021 (003 139) 020 (002 193)

Altitude (lt850 m)

850-1199 072 (049 106) 021 (003 170) 022 (003 180) 020 (002 192)

1200-1399 073 (049 109) 032 (003 327) 030 (003 329) 027 (002 481)

ge 1400 048 (025 092) 074 (004 102) 057 (003 768) 032 (002 786)

Socio-economic index(1st quintile)

2nd quintile 106 (075 150) 121 (078 192) - -

3rd quintile 085 (060 122) 131 (075 223) - -

4th quintile 028 (017 046) 075 (033 175) - -

5th quintile 009 (004 019) 040 (009 156) - -

Interventions

IRS 016 (007 036) 173 (042 690) - -

Bed nets 059 (046 077) 060 (039 088) - -

Mean (95CI) Mean (95CI) Mean (95CI)

Range (in km) - - - 038 (021 339)

s2 (spatial error) - - - 098 (001 277)

τ2 (measurement error) - 177 (090 323) 171 (093 284) 082 (001 269)

Associations between parasitaemia risk and predictors of the non-spatial model with linear terms and the Bayesian spatial logistic regression P-spline modelpresented as odds ratios (OR) with their respective 95 confidence intervals (CI) regression coefficients are based on P-spline curves

Riedel et al Malaria Journal 2010 937httpwwwmalariajournalcomcontent9137

Page 10 of 13

economic characteristics and control interventions tothe overall parasitaemia risk If in addition these dataare known at high spatial resolution they can beincluded in the geostatistical model to obtain moreaccurate predictions of the malaria riskThis study created the first contemporary empirical

parasitaemia risk map for Zambia Many existing mapson malaria transmission rely only on rough geographicaland climatic iso-lines and expert opinions Until 1998none of the maps had a numerical definition hence themalaria risk maps were not comparable and trustworthyThen Hay et al [21] produced a climatic map formalaria transmission in Kenya followed by Craig et al[22] who developed a climatic suitability malaria riskmap for the whole Africa Up to now there are only afew maps containing also empirical data the first mapof this kind was published in 2000 for Mali by Kleinsch-midt et al [18] For Zambia the first empirical malariarisk map was produced by Hay et al [23] and is part ofa global risk map based on historical dataMalaria survey data are expected to be correlated in

space Spatial correlation at short distances is introducedby the transmission process driven by the flight range ofthe mosquito vector while at wider ranges spatial corre-lation reflects common exposures to environmental con-ditions which influence mosquito survival and longevityHigh spatial resolution risk estimation requires predic-tion at locations where malaria survey data are not avail-able High-resolution environmental data can beobtained via remote sensing GIS software has excellentmapping capabilities and it is a very useful tool for pro-cessing RS dataStatistical techniques model the relation between para-

sitaemia risk and risk factors (environmental possibleinterventions socio-economic factors) via a logisticregression model which is further used for predictionStandard statistical methods assume independence ofthe survey locations Violating this assumption whenmodelling spatially-correlated malaria survey data maylead to imprecise estimates of the risk the significanceof the risk factors and of the prediction error Similarlymodelling spatial correlation in weakly correlated dataincreases the number of model parameters anddecreases the precision of the estimates Geostatisticalmodels take into account spatial correlation by introdu-cing an additional parameter (random effect) at eachsurvey location and assume that geographical depen-dence is a function of distance between locationsDepending on the number of survey locations thesemodels can be highly parameterized and they can onlybe estimated using Bayesian inference and MCMCsimulation Bayesian geostatistical models have beenemployed in malaria risk estimation by eg Diggle et al[19] Gemperli et al [2024] and Gosoniu et al [1617]

However, in this study the Bayesian geostatistical model estimated very low spatial correlation, dropping to less than 5% at distances larger than 380 m, which is lower than the spatial resolution of the pixels used for prediction (3 km by 3 km). The non-spatial model was superior to the fitted spatial models because spatial correlation is only present at a very local scale, indicated by the flight range of the mosquito rather than by environmental covariates. Possible reasons might be on-going interventions, which determine mosquito densities and parasitaemia and therefore reduce the influence of environmental predictors on the mosquito.

Previous models of malaria transmission addressed non-linearity between the risk of the malaria-related outcome (on the logit scale) and its predictors solely by categorizing the non-linear covariates. In this study, non-linearity is additionally modelled using different types of spline curves. The resulting risk estimates suggested that predictions are sensitive to the model fitted. However, model validation indicated that models based on linear terms are superior to non-linear models in Zambia. Even though the P-spline model had the higher predictive ability at 95% CIs, the linear model was considered the one with the best predictive ability because of the smaller range of these intervals and the results of the KL divergence and χ² measures between observed and predicted prevalence data. This model also has the advantage that its regression coefficients are easier for non-statisticians to interpret than those of the spline curves.

None of the regression coefficients of the final prediction model were significant. However, they are needed to determine the mean risk estimate for each location, and excluding them heavily reduces the predictive ability of the model (results not shown). The lack of significance of the environmental factors is partially explained by the effects of malaria interventions, which can have a stronger influence on parasitaemia risk than the environmental factors. In fact, the spatial model which adjusted for different types of interventions indicated a significant effect of the presence of at least one bed net in a household in reducing parasitaemia risk. Therefore, interventions are a major driver of parasitaemia risk in Zambia, and including these data in the prediction model would increase the accuracy of model-based risk predictions. Unfortunately, in this study the parasitaemia risk could not be predicted conditional on bed net coverage because the geographical distribution of bed net coverage was not known for the time of the ZMIS 2006. Intervention data are needed over the entire study area in order to be used for prediction purposes. Since e.g. MACEPA is putting a lot of effort into scaling up malaria interventions in Zambia, the role of interventions is likely to increase even further within the next couple of years. Therefore, compilation of intervention coverage data at high spatial resolution is becoming essential to create reliable risk maps.

The prediction map indicates high variation of parasitaemia risk over the country. In particular, high risk is predicted in a stripe from the south to the east of Zambia, which is characterized mainly by low altitude, high NDVI and high day and night LST. The relatively small risk in the south-west of the country might be influenced by low vegetation and rainfall, while the small risk in the north-west regions could be due to low day LST and a high proportion of forests. It is also possible that malaria interventions have been focused on selected areas like the Zambezi river, reducing the parasite prevalence in those areas. It is interesting to note that the maps have been shown to local experts, who confirmed the depicted risk patterns. The map of the prediction error of the Bayesian model could assist in improving the precision of the parasitaemia risk prediction by identifying the areas of high uncertainty, where additional survey locations could be randomly distributed in following MISs, hence reducing the error and raising the precision of following studies.

The reduced smoothness of the map might be explained by the weak spatial correlation, which most likely indicates that interventions counteract the environmental effects. The sampling framework of the MIS may also tend to underestimate spatial correlation because the survey clusters are partially driven by population density, where transmission is largely influenced by interventions. However, the drastic changes within small distances highlight the importance of high-resolution maps. If prediction had been done at a coarser resolution, some of the high-risk areas would disappear and the burden would have been underestimated in these regions. For other purposes which do not need such high precision, the resolution can easily be reduced by combining neighbouring pixels.

The average predicted parasitaemia risk over the whole country was 26.4%; however, after adjusting for the population size, the risk dropped to 20.0%. It is striking that although the average risk in Lusaka province is the fourth highest of all provinces (31.8%), after taking into account the population density the risk reduces to 7.3%, indicating that the highest-risk areas are the less populated ones. In fact, Lusaka is the province with the lowest population-adjusted risk. The provinces of Copperbelt, Southern and Western have low population-adjusted risks (less than 15%). On the other hand, Northern province has the highest population-adjusted parasitaemia risk, followed by Eastern province and Luapula. In addition, Northern province has the highest number of infected children less than five years old. Therefore, interventions should be concentrated in the Northern province, especially at the north-eastern border and in the middle of the province. Further interventions in Copperbelt and Southern might also have a strong impact in reducing the overall burden, since the total number of infected children is as high as in Luapula even though their prevalence is lower than 15%.
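To make the difference between the plain average risk and the population-adjusted risk concrete, the sketch below computes both for a handful of pixels. The pixel values and the function names `mean_risk` and `population_adjusted_risk` are hypothetical and serve only as an illustration; the adjustment used in the study may differ in detail. The point is that a few sparsely populated high-risk pixels raise the plain average far more than the population-weighted one.

```python
def mean_risk(risks):
    """Unweighted average of pixel-level predicted risks."""
    return sum(risks) / len(risks)

def population_adjusted_risk(risks, populations):
    """Average risk weighted by the number of children living in each pixel."""
    total_population = sum(populations)
    infected = sum(r * p for r, p in zip(risks, populations))
    return infected / total_population

# Hypothetical pixels: two sparsely populated high-risk pixels
# and one densely populated low-risk pixel.
risks = [0.40, 0.35, 0.05]
populations = [100, 200, 2700]

print(f"plain average:       {mean_risk(risks):.3f}")
print(f"population-adjusted: {population_adjusted_risk(risks, populations):.3f}")
```

Applying the same weighted mean to a block of neighbouring pixels is one possible way of producing a coarser-resolution value without losing the population weighting.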

Conclusions
The map of parasitaemia risk, together with the prediction error and the population at risk, gives an important overview of the malaria situation in Zambia. The maps can be used by decision-makers to allocate resources and interventions so as to reach the most people in the regions of highest risk and to reduce the burden of malaria in Zambia significantly. The ZMIS of 2008 and 2010 will provide very important information on the changes in parasitaemia risk over space and time and will help in evaluating the progress of new and established intervention programmes, adjusted for environmental drivers of the transmission risk.

Additional file 1: Additional information regarding model formulation and spline curves. [http://www.biomedcentral.com/content/supplementary/1475-2875-9-37-S1.PDF]

Acknowledgements
This investigation received financial support from the Swiss National Science Foundation (SNF) (Project No. 325200-118379) and the Bill and Melinda Gates Foundation (Project No. 39777). Special thanks to Prof. Thomas Smith for his valuable input on data analysis aspects.

Author details
1 Department of Public Health and Epidemiology, Swiss Tropical Institute, PO Box, 4002 Basel, Switzerland. 2 Malaria Control and Evaluation Partnership in Africa, PATH, Ferney, France. 3 Department of Public Health and Research, Ministry of Health, Zambia.

Authors' contributions
NR analysed the data and drafted the manuscript. PV was responsible for conception and design of the analysis, supervised the implementation and revised the manuscript. JMM participated in designing and coordinating the fieldwork and provided important intellectual content to this study. LG contributed to the analysis and helped to draft the manuscript. ECK gave intellectual content and revised the draft critically. VM enabled the MIS 2006 to occur and has been instrumental in championing the use of these data for guiding decisions. RWS coordinated the fieldwork and reviewed the manuscript. All authors read and approved the final manuscript.

Competing interests
The authors declare that they have no competing interests.

Received: 10 November 2009 Accepted: 1 February 2010 Published: 1 February 2010

References
1. Ministry of Health: 2008 National malaria control action plan: actions for scale-up for impact on malaria in Zambia. Zambia; 2008 [http://www.nmcc.org.zm/publications.htm].
2. Ministry of Health: A road map for impact on malaria in Zambia 2006-2010: a 5-year strategic plan. Zambia; 2006 [http://www.nmcc.org.zm/publications.htm].
3. Campbell K, Terry D, Wood S: Scaling up for impact: a model for malaria control. Zambia; 2007 [http://www.path.org/files/MACEPA_Bro_2007-08-06.pdf].
4. Ministry of Health: 2009 National malaria control action plan: actions for scale-up for impact on malaria in Zambia. Zambia; 2009.
5. Ministry of Health Zambia, Central Statistical Office, PATH MACEPA, CDC, University of Zambia, WHO: Zambia National Malaria Indicator Survey 2006. Zambia; 2006 [http://www.nmcc.org.zm/publications.htm].
6. Craig MH, Snow RW, Le Sueur D: A climate-based distribution model of malaria transmission in sub-Saharan Africa. Parasitol Today 1999, 15:105-111.
7. Guerra CA, Gikandi PW, Tatem AJ, Noor AM, Smith DL, Hay SI, Snow RW: The limits and intensity of Plasmodium falciparum transmission: implications for malaria control and elimination worldwide. PLoS Med 2008, 5:e38.
8. Diggle PJ, Tawn JA, Moyeed RA: Model-based geostatistics. Appl Stat 1998, 47:299-350.
9. Filmer D, Pritchett LH: Estimating wealth effects without expenditure data - or tears: an application to educational enrolments in states of India. Demography 2001, 38:115-32.
10. US Geological Survey (USGS), Land Processes Distributed Active Archive Center (LP DAAC): MODIS land products. [https://lpdaac.usgs.gov/lpdaac/products/modis_products_table].
11. US Geological Survey (USGS), Famine Early Warning Systems Network (FEWS NET): African Data Dissemination Service (ADDS). [http://earlywarning.usgs.gov/adds].
12. US Geological Survey (USGS): Digital Elevation Models (DEM). [http://edc2.usgs.gov/geodata/index.php].
13. World Health Organization (WHO): The HealthMapper Database. [http://www.who.int/health_mapping/tools/healthmapper/en/index.html].
14. Oak Ridge National Laboratory (Oak Ridge, TN): LandScan™ Global Population Database. [http://www.ornl.gov/landscan].
15. US Census Bureau, International Data Base (IDB), Population Division: Zambia 2006. [http://www.census.gov/ipc/www/idb/country.php].
16. Gosoniu L, Vounatsou P, Sogoba N, Smith T: Bayesian modelling of geostatistical malaria risk data. Geospat Health 2006, 1:127-139.
17. Gosoniu L, Vounatsou P, Sogoba N, Maire N, Smith T: Mapping malaria risk in West Africa using a Bayesian nonparametric non-stationary model. Comput Stat Data Anal 2009, 53:3358-3371.
18. Kleinschmidt I, Bagayoko M, Clarke GPY, Craig M, le Sueur D: A spatial statistical approach to malaria mapping. Int J Epidemiol 2000, 29:355-361.
19. Diggle P, Moyeed R, Rowlingson B, Thomson M: Childhood malaria in the Gambia: a case-study in model-based geostatistics. Appl Stat 2002, 51:493-506.
20. Gemperli A, Sogoba N, Fondjo E, Mabaso M, Bagayako M, Briët OJT, Anderegg D, Liebe J, Smith T: Mapping malaria transmission in West and Central Africa. Trop Med Int Health 2006, 11:1032-46.
21. Hay SI, Snow RW, Rogers DJ: Predicting malaria seasons in Kenya using multitemporal meteorological satellite sensor data. Trans R Soc Trop Med Hyg 1998, 92:12-20.
22. Craig MH, Snow RW, le Sueur D: A climate-based distribution model of malaria transmission in sub-Saharan Africa. Parasitol Today 1999, 15:105-111.
23. Hay SI, Guerra CA, Gething PW, Patil AP, Tatem AJ, Noor AM, Kabaria CW, Manh BH, Elyazar IRF, Brooker S, Smith DL, Moyeed RA, Snow RW: A world malaria map: Plasmodium falciparum endemicity in 2007. PLoS Med 2009, 6:e1000048.
24. Gemperli A, Vounatsou P, Kleinschmidt I, Bagayoko M, Lengeler C, Smith T: Spatial patterns of infant mortality in Mali: the effect of malaria endemicity. Am J Epidemiol 2004, 159:64-72.

doi:10.1186/1475-2875-9-37
Cite this article as: Riedel et al.: Geographical patterns and predictors of malaria risk in Zambia: Bayesian geostatistical modelling of the 2006 Zambia national malaria indicator survey (ZMIS). Malaria Journal 2010, 9:37.


evaluation of the effectiveness of on-going interventions. Already the Zambian Ministry of Health and partners have completed the second ZMIS in the country in 2008 and are planning another follow-up survey in 2010.

Prior to the MIS, compiled historical survey data were used to obtain estimates of parasitaemia risk at high resolution. The Mapping Malaria Risk in Africa (MARA) project was initiated in 1998 with the aim of compiling published and unpublished malaria survey data in Africa. MARA is the most comprehensive malariometric database, compiling data from 1900 up to date. The MARA data have been analysed using rigorous spatial statistical modelling [17-20] to obtain high-resolution malaria risk estimates at regional and country level in Africa. However, risk estimates from historical data do not reflect the current malaria situation, which is influenced by on-going interventions. The surveys are not representative, as high-risk areas tend to be over-represented. In addition, historical surveys have been conducted at various locations using different methodologies, including different age groups, and were carried out in different seasons. In contrast, MISs do not suffer from these drawbacks. The MIS locations are randomly chosen and the data are available at individual level, allowing for estimation of age-specific risk. Another advantage of the MIS data is the household-level information available, which makes it possible to differentiate the contributions of climate, socio-economic characteristics and control interventions to the overall parasitaemia risk. If, in addition, these data are known at high spatial resolution, they can be included in the geostatistical model to obtain more accurate predictions of the malaria risk.

This study created the first contemporary empirical parasitaemia risk map for Zambia.

Table 4 Parasitaemia risk predictors of different models

| Covariates | Bivariate non-spatial, OR (95% CI) | Multivariate non-spatial, OR (95% CI) | Prediction model, OR (95% CI) | Spatial P-spline model, OR (95% CI) |
|---|---|---|---|---|
| Day LST | 0.49 (0.43, 0.57) | 0.65 (0.37, 1.15) | 0.61 (0.32, 1.17) | |
| Night LST | 1.23 (1.09, 1.40) | 1.21 (0.77, 1.88) | 1.18 (0.79, 1.77) | |
| NDVI | 2.25 (1.90, 2.66) | 1.28 (0.67, 2.73) | 1.29 (0.66, 2.77) | |
| Rainfall | 1.56 (1.37, 1.76) | 1.21 (0.85, 1.68) | 1.18 (0.80, 1.73) | |
| Land cover covariates | | | | |
| Wetland | 1.20 (1.07, 1.34) | 0.97 (0.62, 1.55) | 0.98 (0.67, 1.48) | 0.72 (0.40, 1.37) |
| Forest | 0.96 (0.84, 1.10) | 0.72 (0.43, 1.08) | 0.72 (0.43, 1.10) | 0.64 (0.38, 0.99) |
| Urban | 0.35 (0.25, 0.50) | 0.70 (0.38, 1.21) | 0.71 (0.37, 1.29) | 0.75 (0.33, 1.48) |
| Shrubland | 1.06 (0.94, 1.20) | 1.07 (0.76, 1.53) | 1.05 (0.71, 1.47) | 1.07 (0.72, 1.53) |
| Region (rural) | | | | |
| urban | 0.17 (0.11, 0.25) | 0.53 (0.14, 2.03) | 0.37 (0.11, 1.13) | 0.43 (0.12, 1.50) |
| Distance to water bodies (<1000 m) | | | | |
| 1000-2499 | 0.71 (0.53, 0.96) | 0.73 (0.29, 1.86) | 0.71 (0.29, 1.72) | 0.55 (0.19, 1.50) |
| 2500-4999 | 0.54 (0.38, 0.77) | 0.61 (0.20, 1.55) | 0.60 (0.19, 1.72) | 0.49 (0.17, 1.42) |
| ≥ 5000 | 0.11 (0.04, 0.30) | 0.22 (0.03, 1.40) | 0.21 (0.03, 1.39) | 0.20 (0.02, 1.93) |
| Altitude (<850 m) | | | | |
| 850-1199 | 0.72 (0.49, 1.06) | 0.21 (0.03, 1.70) | 0.22 (0.03, 1.80) | 0.20 (0.02, 1.92) |
| 1200-1399 | 0.73 (0.49, 1.09) | 0.32 (0.03, 3.27) | 0.30 (0.03, 3.29) | 0.27 (0.02, 4.81) |
| ≥ 1400 | 0.48 (0.25, 0.92) | 0.74 (0.04, 10.2) | 0.57 (0.03, 7.68) | 0.32 (0.02, 7.86) |
| Socio-economic index (1st quintile) | | | | |
| 2nd quintile | 1.06 (0.75, 1.50) | 1.21 (0.78, 1.92) | - | - |
| 3rd quintile | 0.85 (0.60, 1.22) | 1.31 (0.75, 2.23) | - | - |
| 4th quintile | 0.28 (0.17, 0.46) | 0.75 (0.33, 1.75) | - | - |
| 5th quintile | 0.09 (0.04, 0.19) | 0.40 (0.09, 1.56) | - | - |
| Interventions | | | | |
| IRS | 0.16 (0.07, 0.36) | 1.73 (0.42, 6.90) | - | - |
| Bed nets | 0.59 (0.46, 0.77) | 0.60 (0.39, 0.88) | - | - |
| | | Mean (95% CI) | Mean (95% CI) | Mean (95% CI) |
| Range (in km) | - | - | - | 0.38 (0.21, 3.39) |
| σ² (spatial error) | - | - | - | 0.98 (0.01, 2.77) |
| τ² (measurement error) | - | 1.77 (0.90, 3.23) | 1.71 (0.93, 2.84) | 0.82 (0.01, 2.69) |

Associations between parasitaemia risk and predictors of the non-spatial model with linear terms and the Bayesian spatial logistic regression P-spline model, presented as odds ratios (OR) with their respective 95% confidence intervals (CI); regression coefficients are based on P-spline curves. Reference categories are given in parentheses.

  • Abstract
    • Background
    • Methods
    • Results
    • Conclusions
      • Background
      • Methods
        • The study area and the ZMIS
        • Socio-economic data
        • Environmental and population density data
        • Statistical models
        • Model fit and validation
          • Results
          • Discussion
          • Conclusions
          • Acknowledgements
          • Author details
          • Authors contributions
          • Competing interests
          • References
Page 11: RESEARCH Open Access Geographical patterns and predictors ... · National Malaria Strategic Plan (NMSP) 2005-2010, the Ministry of Health and a network of partners are work-ing toward

economic characteristics and control interventions to the overall parasitaemia risk. If, in addition, these data are known at high spatial resolution, they can be included in the geostatistical model to obtain more accurate predictions of the malaria risk.

This study created the first contemporary empirical
parasitaemia risk map for Zambia. Many existing maps of malaria transmission rely only on rough geographical and climatic iso-lines and expert opinions. Until 1998, none of the maps had a numerical definition; hence the malaria risk maps were neither comparable nor trustworthy. Then Hay et al. [21] produced a climatic map for malaria transmission in Kenya, followed by Craig et al. [22], who developed a climatic suitability malaria risk map for the whole of Africa. Up to now, only a few maps also contain empirical data; the first map of this kind was published in 2000 for Mali by Kleinschmidt et al. [18]. For Zambia, the first empirical malaria risk map was produced by Hay et al. [23] and is part of a global risk map based on historical data.

Malaria survey data are expected to be correlated in
space. Spatial correlation at short distances is introduced by the transmission process, driven by the flight range of the mosquito vector, while at wider ranges spatial correlation reflects common exposures to environmental conditions, which influence mosquito survival and longevity. High spatial resolution risk estimation requires prediction at locations where malaria survey data are not available. High-resolution environmental data can be obtained via remote sensing (RS). GIS software has excellent mapping capabilities and is a very useful tool for processing RS data.

Statistical techniques model the relation between parasitaemia risk and risk factors (environmental, possible interventions, socio-economic factors) via a logistic regression model, which is further used for prediction. Standard statistical methods assume independence of the survey locations. Violating this assumption when modelling spatially-correlated malaria survey data may lead to imprecise estimates of the risk, of the significance of the risk factors and of the prediction error. Similarly, modelling spatial correlation in weakly correlated data increases the number of model parameters and decreases the precision of the estimates. Geostatistical models take into account spatial correlation by introducing an additional parameter (random effect) at each survey location and assume that geographical dependence is a function of distance between locations. Depending on the number of survey locations, these models can be highly parameterized, and they can only be estimated using Bayesian inference and MCMC simulation. Bayesian geostatistical models have been employed in malaria risk estimation by, e.g., Diggle et al. [19], Gemperli et al. [20,24] and Gosoniu et al. [16,17].
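As an illustration of how geographical dependence can be written as a function of distance, the following Python sketch uses an exponential correlation function, corr(d) = exp(-decay * d). The exponential form, the function names and the example decay value are assumptions made for illustration only; they are not taken from the fitted ZMIS model. The sketch computes the correlation between pairs of survey locations and the distance at which the correlation falls to a chosen threshold such as 5%.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # projected coordinates in metres (assumption)


def exponential_correlation(distance: float, decay: float) -> float:
    """Correlation between two locations `distance` apart: exp(-decay * distance)."""
    return math.exp(-decay * distance)


def distance_at_correlation(decay: float, threshold: float = 0.05) -> float:
    """Distance at which the exponential correlation drops to `threshold`."""
    return -math.log(threshold) / decay


def correlation_matrix(coords: Sequence[Point], decay: float) -> List[List[float]]:
    """Pairwise correlation matrix of the location-specific random effects."""
    return [
        [exponential_correlation(math.dist(a, b), decay) for b in coords]
        for a in coords
    ]


if __name__ == "__main__":
    decay = 0.008  # hypothetical decay parameter, per metre
    locations = [(0.0, 0.0), (300.0, 400.0), (3000.0, 0.0)]  # made-up survey sites
    print("distance where correlation reaches 5%:", distance_at_correlation(decay))
    for row in correlation_matrix(locations, decay):
        print([round(value, 4) for value in row])
```

With a large decay parameter the off-diagonal entries are close to zero, so the random effects behave almost like independent noise and a non-spatial model loses little.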

However, in this study the Bayesian geostatistical model estimated very low spatial correlation, dropping to less than 5% at distances larger than 380 m, which is lower than the spatial resolution of the pixel size used for prediction (3 km by 3 km). The non-spatial model was superior to the fitted spatial models because spatial correlation is only present at a very local scale, indicated by the flight range of the mosquito rather than by environmental covariates. Possible reasons might be on-going interventions, which determine mosquito densities and parasitaemia and therefore reduce the influence of environmental predictors on the mosquito.

Previous models of malaria transmission addressed
non-linearity between the risk of the malaria-related outcome (in the logit scale) and its predictors solely by categorizing the non-linear covariates. In this study, non-linearity is additionally modelled using different types of spline curves. The resulting risk estimates suggested that predictions are sensitive to the model fitted. However, model validation indicated that models based on linear terms are superior to non-linear models in Zambia. Even though the P-spline model had the higher predictive ability at 95% CIs, the linear model was considered the one with the best predictive ability, due to the smaller range of these intervals and the results of the KL divergence and χ²-measures between observed and predicted prevalence data. This model also has the advantage that its regression coefficients are easier for non-statisticians to interpret than those of the spline curves.

None of the regression coefficients of the final prediction model were significant. However, they are needed to determine the mean risk estimate for each location, and excluding them heavily reduces the model's predictive ability (results not shown). The lack of significance of the environmental factors is partially explained by the effects of malaria interventions, which can have a stronger influence on the parasitaemia risk than the environmental factors. In fact, the spatial model which adjusted for different types of interventions indicated a significant effect of the presence of at least one bed net in a household in reducing parasitaemia risk. Therefore, interventions are a major driver of parasitaemia risk in Zambia, and including these data in the prediction model would increase the accuracy of model-based risk predictions. Unfortunately, in this study the parasitaemia risk could not be predicted conditional on bed net coverage because the geographical distribution of bed net coverage was not known for the time of the ZMIS 2006. Intervention data are needed over the entire study area in order to be used for prediction purposes. Since, e.g., MACEPA is putting a lot of effort into scaling up malaria interventions in Zambia, the role of interventions is likely to increase even further within
the next couple of years. Therefore, compilation of intervention coverage data at high spatial resolution is becoming essential to create reliable risk maps.

The prediction map indicates high variation of parasitaemia risk over the country. In particular, high risk is predicted in a stripe from the south to the east of Zambia, which is characterized mainly by low altitude, high NDVI and high day and night LST. The relatively small risk in the south-west of the country might be influenced by low vegetation and rainfall, while the small risk in the north-west regions could be due to low day LST and a high proportion of forests. It is also possible that malaria interventions have been focused on selected areas like the Zambezi river, reducing the parasite prevalence in those areas. It is interesting to note that the maps have been shown to local experts, who confirmed the depicted risk patterns. The map of the prediction error of the Bayesian model could assist in improving the precision of the parasitaemia risk prediction by identifying the areas of high uncertainty, where additional survey locations could be randomly distributed in subsequent MISs, thereby reducing the error and raising the precision of subsequent studies.

The reduced smoothness of the map might be
explained by the weak spatial correlation, which most likely indicates that interventions counteract the environmental effects. The sampling framework of the MIS may also tend to underestimate spatial correlation because the survey clusters are partially driven by population density, where transmission is largely influenced by interventions. However, the drastic changes within small distances highlight the importance of high-resolution maps. If prediction had been done at a coarser resolution, some of the high-risk areas would disappear and the burden would have been underestimated in these regions. For other purposes which do not need such high precision, the resolution can easily be reduced by combining neighbouring pixels.

The average predicted parasitaemia risk over the
whole country was 26.4%; however, after adjusting for the population size, the risk dropped to 20.0%. It is striking that although the average risk in Lusaka province is the fourth highest of all provinces (31.8%), after taking into account the population density the risk reduces to 7.3%, indicating that the highest risk areas are the less populated ones. In fact, Lusaka is the province with the lowest population-adjusted risk. The provinces of Copperbelt, Southern and Western have low population-adjusted risks (less than 15%). On the other hand, Northern province has the highest population-adjusted parasitaemia risk, followed by Eastern province and Luapula. In addition, Northern province has the highest number of infected children less than five years old. Therefore, interventions should be concentrated in the Northern province, especially at the north-eastern border and in the middle of the province. Further interventions in Copperbelt and Southern might also have a strong impact in reducing the overall burden, since the total number of infected children is as high as in Luapula even though their prevalence is lower than 15%.
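The gap between an average risk and a population-adjusted risk can be illustrated with a small Python sketch. It assumes that the population-adjusted risk is a population-weighted mean of pixel-level risks; that definition, the function names and the three-pixel example values are illustrative assumptions, not formulas or figures from the ZMIS analysis.

```python
from typing import Sequence


def mean_risk(risks: Sequence[float]) -> float:
    """Simple (area-based) average of pixel-level risks."""
    return sum(risks) / len(risks)


def population_adjusted_risk(risks: Sequence[float], populations: Sequence[float]) -> float:
    """Population-weighted average of pixel-level risks."""
    if len(risks) != len(populations):
        raise ValueError("risks and populations must have the same length")
    total_population = sum(populations)
    if total_population == 0:
        raise ValueError("total population must be positive")
    return sum(r * p for r, p in zip(risks, populations)) / total_population


def expected_infected(risks: Sequence[float], populations: Sequence[float]) -> float:
    """Expected number of infected children, summed over pixels."""
    return sum(r * p for r, p in zip(risks, populations))


if __name__ == "__main__":
    # Hypothetical pixels: two sparsely populated high-risk pixels, one dense low-risk pixel.
    risks = [0.40, 0.30, 0.05]
    children = [100, 200, 700]
    print("average risk:", mean_risk(risks))
    print("population-adjusted risk:", population_adjusted_risk(risks, children))
    print("expected infected children:", expected_infected(risks, children))
```

When the highest-risk pixels hold few people, the weighted mean falls below the simple mean, which is the pattern the text describes for Lusaka province. The expected count of infected children shows why a province with a low prevalence but a large population can still carry a large share of the burden.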

Conclusions
The map of parasitaemia risk, together with the prediction error and the population at risk, gives an important overview of the malaria situation in Zambia. The maps can be used by decision-makers to allocate resources and interventions to reach the most people in the regions of highest risk and so reduce the burden of malaria significantly in Zambia. The ZMIS of 2008 and 2010 will provide very important information on the changes of the parasitaemia risk over space and time and help in the evaluation of the progress of new and established intervention programmes, adjusted for environmental drivers of the transmission risk.

Additional file 1: Additional information regarding model formulation and spline curves. [http://www.biomedcentral.com/content/supplementary/1475-2875-9-37-S1.PDF]

Acknowledgements
This investigation received financial support from the Swiss National Science Foundation (SNF) (Project No. 325200-118379) and the Bill and Melinda Gates Foundation (Project No. 39777). Special thanks to Prof. Thomas Smith for his valuable inputs on data analysis aspects.

Author details
1 Department of Public Health and Epidemiology, Swiss Tropical Institute, PO Box, 4002 Basel, Switzerland. 2 Malaria Control and Evaluation Partnership in Africa, PATH, Ferney, France. 3 Department of Public Health and Research, Ministry of Health, Zambia.

Authors' contributions
NR analysed the data and drafted the manuscript. PV was responsible for conception and design of the analysis, supervised the implementation and revised the manuscript. JMM participated in designing and coordinating the fieldwork and provided important intellectual content to this study. LG contributed to the analysis and helped to draft the manuscript. ECK gave intellectual content and revised the draft critically. VM enabled the MIS 2006 to occur and has been instrumental in championing the use of these data for guiding decisions. RWS coordinated the fieldwork and reviewed the manuscript. All authors read and approved the final manuscript.

Competing interests
The authors declare that they have no competing interests.

Received: 10 November 2009. Accepted: 1 February 2010. Published: 1 February 2010.

References
1. Ministry of Health: 2008 National malaria control action plan: actions for scale-up for impact on malaria in Zambia. Zambia 2008 [http://www.nmcc.org.zm/publications.htm].

2. Ministry of Health: A road map for impact on malaria in Zambia 2006-2010: a 5-year strategic plan. Zambia 2006 [http://www.nmcc.org.zm/publications.htm].

3. Campbell K, Terry D, Wood S: Scaling up for impact: a model for malaria control. Zambia 2007 [http://www.path.org/files/MACEPA_Bro_2007-08-06.pdf].

4. Ministry of Health: 2009 National malaria control action plan: actions for scale-up for impact on malaria in Zambia. Zambia 2009.

5. Ministry of Health Zambia, Central Statistical Office, PATH MACEPA, CDC, University of Zambia, WHO: Zambia National Malaria Indicator Survey 2006. Zambia 2006 [http://www.nmcc.org.zm/publications.htm].

6. Craig MH, Snow RW, Le Sueur D: A climate-based distribution model of malaria transmission in sub-Saharan Africa. Parasitol Today 1999, 15:105-111.

7. Guerra CA, Gikandi PW, Tatem AJ, Noor AM, Smith DL, Hay SI, Snow RW: The limits and intensity of Plasmodium falciparum transmission: implications for malaria control and elimination worldwide. PLoS Med 2008, 5:e38.

8. Diggle PJ, Tawn JA, Moyeed RA: Model-based geostatistics. Appl Stat 1998, 47:299-350.

9. Filmer D, Pritchett LH: Estimating wealth effects without expenditure data - or tears: an application to educational enrolments in states of India. Demography 2001, 38:115-32.

10. US Geological Survey (USGS), Land Processes Distributed Active Archive Center (LP DAAC): MODIS land products. [https://lpdaac.usgs.gov/lpdaac/products/modis_products_table]

11. US Geological Survey (USGS), Famine Early Warning Systems Network (FEWS NET): African Data Dissemination Service (ADDS). [http://earlywarning.usgs.gov/adds]

12. US Geological Survey (USGS): Digital Elevation Models (DEM). [http://edc2.usgs.gov/geodata/index.php]

13. World Health Organization (WHO): The HealthMapper Database. [http://www.who.int/health_mapping/tools/healthmapper/en/index.html]

14. Oak Ridge National Laboratory (Oak Ridge, TN): LandScan™ Global Population Database. [http://www.ornl.gov/landscan]

15. US Census Bureau, International Data Base (IDB), Population Division: Zambia 2006. [http://www.census.gov/ipc/www/idb/country.php]

16. Gosoniu L, Vounatsou P, Sogoba N, Smith T: Bayesian modelling of geostatistical malaria risk data. Geospat Health 2006, 1:127-139.

17. Gosoniu L, Vounatsou P, Sogoba N, Maire N, Smith T: Mapping malaria risk in West Africa using a Bayesian nonparametric non-stationary model. Comput Stat Data Anal 2009, 53:3358-3371.

18. Kleinschmidt I, Bagayoko M, Clarke GPY, Craig M, le Sueur D: A spatial statistical approach to malaria mapping. Int J Epidemiol 2000, 29:355-361.

19. Diggle P, Moyeed R, Rowlingson B, Thomson M: Childhood malaria in the Gambia: a case-study in model-based geostatistics. Appl Stat 2002, 51:493-506.

20. Gemperli A, Sogoba N, Fondjo E, Mabaso M, Bagayako M, Briët OJT, Anderegg D, Liebe J, Smith T: Mapping malaria transmission in West and Central Africa. Trop Med Int Health 2006, 11:1032-46.

21. Hay SI, Snow RW, Rogers DJ: Predicting malaria seasons in Kenya using multitemporal meteorological satellite sensor data. Trans R Soc Trop Med Hyg 1998, 92:12-20.

22. Craig MH, Snow RW, le Sueur D: A climate-based distribution model of malaria transmission in sub-Saharan Africa. Parasitol Today 1999, 15:105-111.

23. Hay SI, Guerra CA, Gething PW, Patil AP, Tatem AJ, Noor AM, Kabaria CW, Manh BH, Elyazar IRF, Brooker S, Smith DL, Moyeed RA, Snow RW: A world malaria map: Plasmodium falciparum endemicity in 2007. PLoS Med 2009, 6:e1000048.

24. Gemperli A, Vounatsou P, Kleinschmidt I, Bagayoko M, Lengeler C, Smith T: Spatial patterns of infant mortality in Mali: the effect of malaria endemicity. Am J Epidemiol 2004, 159:64-72.

doi:10.1186/1475-2875-9-37
Cite this article as: Riedel et al.: Geographical patterns and predictors of malaria risk in Zambia: Bayesian geostatistical modelling of the 2006 Zambia national malaria indicator survey (ZMIS). Malaria Journal 2010, 9:37.
