Exploring Spatial and Temporal Heterogeneity of Environmental Noise in Toronto

Camila Casquilho-Resende, Nick Fishbane, Seong-Hwan Jun, and Yunlong Nie
Department of Statistics, The University of British Columbia

Introduction

Objective

The objective is to analyze how environmental noise varies across time and space. In the process, we identify site characteristics that contribute to the noise level in Toronto.

Data

Noise was measured at 10 sites throughout Toronto for a week or more at 30-minute intervals. The time series were recorded over different periods between mid-June and early September, start on different days of the week, and differ in amplitude and overall noise level, as the time series from three of the ten sites show. The map shows the site locations of this case study, collected in three phases: Cycle1 (n = 241), Cycle1.rep (n = 100), and Cycle2 (n = 213).

[Figure: LEQ time series for Sites 6, 3, and 9. x-axis: time (labels at midnight); y-axis: LEQ.]

[Figure: Map of site locations (longitude vs. latitude), with legends for LEQ and data phase (cycle1, cycle1.rep, cycle2).]

Methods Overview

- Temporal variation was modeled using a linear mixed-effects (LME) model with harmonic and polynomial components.
- Importance of site characteristics was assessed with random forest and Dirichlet process mixture models.
- Spatial variation of adjusted noise (LEQ) was explored using residual kriging, assuming a Gaussian process with a Matérn covariance function.
- The aforementioned methods were reapplied to test data.

Temporal Model

We treat the sites as independent, since we observe no relationship between the distance separating two sites and the correlation of their noise levels. Also, for each site we consider only the "time point" of each observation, which we define as the time of week (of which there are 48 × 7 = 336).
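As a minimal sketch of the time-point definition above, the following Python function maps a timestamp to one of the 336 half-hour slots of the week. The function name `time_point` and the choice of slot 0 at Sunday midnight are assumptions made for illustration; the poster does not give its own implementation.

```python
from datetime import datetime

SLOTS_PER_DAY = 48  # 30-minute intervals per day


def time_point(ts: datetime) -> int:
    """Return the half-hour slot of the week, 0..335 (hypothetical helper).

    Assumes slot 0 is Sunday 0:00. Python's weekday() uses Monday = 0,
    so the day index is shifted to make Sunday = 0.
    """
    days_since_sunday = (ts.weekday() + 1) % 7
    return days_since_sunday * SLOTS_PER_DAY + ts.hour * 2 + ts.minute // 30
```

Observations from different weeks that fall in the same slot share a time point, which is what allows a single weekly trend to be fitted across sites recorded in different periods.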
The LME is as follows:

\[
\begin{aligned}
\mathrm{LEQ}_{it} &= b_i + d_1(t) + d_2(t)\,\mathbf{1}_{[t \in W]} + w(t) + \varepsilon_{it}, \qquad b_i \sim N(0, \sigma^2), \\
d_j(t) &= \beta_{0j} + \beta_{1j}\sin\frac{2\pi t}{24} + \beta_{2j}\cos\frac{2\pi t}{24} + \beta_{3j}\sin\frac{2\pi t}{12} + \beta_{4j}\sin\frac{2\pi t}{8} + \beta_{5j}\cos\frac{2\pi t}{8}, \qquad j = 1, 2, \\
w(t) &= \alpha_1 t + \alpha_2 t^2 + \alpha_3 t^3,
\end{aligned}
\]

where $i$ indexes the site, $t$ is the time point, $W$ is the set of weekend time points, and $\varepsilon_{it}$ is the error term.

The components (periods) of $d_j(t)$ were chosen based on a spectral analysis of each site's time series individually. The combination of components and the order of the polynomial were then determined by significance and AIC. Finally, the baseline reference for the time point ($t = 0$ at Sunday midnight) was chosen by AIC. The resulting trend comes from the fixed effects: $d_1(t) + w(t)$ for weekdays and $d_1(t) + d_2(t) + w(t)$ for weekends.

Describing temporal variations

The isolated harmonic functions $d_1(t)$ and $d_1(t) + d_2(t)$ are shown below, along with the overall weekly trend. On weekdays, noise is fairly constant throughout work hours, rising and falling at both rush hours, with a peak at 3:30 PM; on weekends, noise levels rise later in the day and reach a higher level, peaking at 6:00 PM.

[Figure: Fitted LEQ by time of day (0:00 to 22:00), in two panels: Weekend and Weekday.]

[Figure: Boxplots of residual LEQ by site ID, in increasing order of median residual magnitude (6, 8, 5, 3, 7, 4, 1, 10, 2, 9), with a legend for mean LEQ.]

The week-long trend shown below illustrates that Wednesday is the loudest weekday, while Saturday is the loudest day of the week. The model was able to approximate the observed levels, as seen in the residual boxplots.
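The fixed-effect part of the temporal model, d_1(t) + d_2(t) 1[t in W] + w(t), can be sketched as a design matrix in Python with NumPy. This is an illustration under stated assumptions, not the authors' code: t is taken to be in hours since Sunday midnight, the weekend set W is taken to be Saturday and Sunday, and the function names are hypothetical.

```python
import numpy as np


def harmonic_basis(t_hours):
    """Columns of d_j(t): intercept plus harmonics with periods 24, 12 and 8 hours."""
    t = np.asarray(t_hours, dtype=float)
    return np.column_stack([
        np.ones_like(t),
        np.sin(2 * np.pi * t / 24), np.cos(2 * np.pi * t / 24),
        np.sin(2 * np.pi * t / 12),
        np.sin(2 * np.pi * t / 8), np.cos(2 * np.pi * t / 8),
    ])


def fixed_effect_design(t_hours, is_weekend):
    """Design matrix for d_1(t) + d_2(t) * 1[t in W] + w(t).

    Columns 0-5: d_1 terms; columns 6-11: d_2 terms, zeroed on weekdays;
    columns 12-14: cubic polynomial w(t) without intercept.
    """
    t = np.asarray(t_hours, dtype=float)
    weekend = np.asarray(is_weekend, dtype=float)[:, None]
    basis = harmonic_basis(t)
    poly = np.column_stack([t, t**2, t**3])
    return np.hstack([basis, basis * weekend, poly])


# One week of half-hour time points, with t = 0 at Sunday midnight (assumed).
t = np.arange(336) * 0.5
weekend = (t < 24) | (t >= 144)  # Sunday and Saturday (assumed definition of W)
X = fixed_effect_design(t, weekend)
```

A matrix like `X` would serve as the fixed-effects input to a mixed-model routine, with a random intercept b_i per site supplied through that routine's grouping mechanism.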
[Figure: Fitted LEQ by time of week, from Sun 0:00 to Sat 23:30, with a weekday legend (Sun to Sat).]

Site Characteristics

Site characteristics can obscure the spatial pattern in the LEQ noise process, so the important ones need to be identified.

Random Forest Regression

Random forest regression implicitly models interaction effects and can identify important variables. The variable importance plot shows that traffic count is the most important variable, followed by industrial land use (IND100M) and population density (POP500M).

[Figure: Variable importance (improvement in split criterion) for REC100M, OPEN100M, COM100M, DIST_EXP, POP500M, IND100M, and Total.Traffic, for Cycle1 and Cycle2.]

[Figure: Total.Traffic, IND100M, COM100M, and REC100M for the HIGH, MEDIUM, and LOW clusters.]

Dirichlet Process Mixture Models

We wish to cluster the sites by noise level and identify the characteristics of each cluster, but the number of clusters K is unknown. A Dirichlet process mixture model is an appropriate method in this setting. We obtained three clusters, which we label HIGH, MEDIUM, and LOW noise. Traffic count and IND100M are high for sites in the HIGH and MEDIUM noise clusters. Recreational land use (REC100M) is high for LOW noise sites.

Describing spatial variations

Question: Describe the spatial variations; are there locations in Toronto experiencing higher noise levels?

It is assumed that the data $Z(s) = (Z(s_1), \ldots, Z(s_n))$ are a realization of a spatial process $\{Z(s) : s \in \mathbb{R}^2\}$.
For every site location $s_i$, the residuals of the following regression model, denoted $R(s_i)$, retain the spatial variability of the process, although some of it has been removed by the external information used in the modelling.

\[
Z(s_i) = \mu(s_i) + \varepsilon(s_i), \qquad \varepsilon(s_i) \sim N(0, \psi^2), \qquad \mu(s_i) = X(s_i)^\top \beta
\]

These residuals are then kriged using ordinary kriging, assuming a Gaussian process with a Matérn covariance model with $\kappa = 1.5$, as described below.

\[
R(s) \sim N\left(\mu_0,\; \tau^2 I + \sigma^2 \Sigma(\phi)\right)
\]

The objective of the kriging procedure is to obtain predictions at unsampled locations using information available elsewhere in the study domain.

Describing spatial variations

Question: Are the spatial variations of Cycle1.rep similar to Cycle1?

- If Cycle1 and Cycle1.rep follow the same spatial pattern in terms of "unexplained noise", then the kriging prediction for Cycle1.rep obtained from residual kriging on Cycle1 should be a reasonably fair prediction of the noise after adjusting for the predictors.
- We observe good agreement between the two sets of residuals, so we augment Cycle1 with Cycle1.rep to provide calibrated predictions over Toronto. These are mapped in the first contour plot to show areas of high residual noise, such as downtown and near Pearson airport, as expected.

[Figure: Contour maps of residual noise (LEQ) over longitude and latitude, in two panels: Cycle 1 and Cycle 2.]

Describing spatial variations

Question: Cycles 1 and 2 similarity

- The same methods described above for Cycle1 (a linear model with chosen covariates, followed by residual kriging) are implemented on Cycle2 to obtain a contour plot of residual noise in Toronto.
- A large proportion of the LEQ variation between sites in Cycle 2 can be explained by site characteristics, similarly to Cycle 1.
Random forest shows nearly identical results for the importance of site characteristics, so the causes of spatial noise variation are comparable.
- When employing residual kriging, we obtain the contour map shown above, which is dissimilar to that obtained with Cycle 1. This is likely due to the complementary placement of sites in the two cycles, coupled with the limited long-distance prediction capability of this particular spatial model.
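The residual kriging step can be sketched in a few lines of Python with NumPy. This is an illustration under stated assumptions, not the authors' implementation. It assumes the Matérn parameterization in which smoothness 1.5 gives the correlation (1 + h/phi) exp(-h/phi), uses plain Euclidean distance between coordinates, and takes the covariance parameters as given instead of estimating them. The function names are hypothetical.

```python
import numpy as np


def matern15(h, phi):
    """Matérn correlation with smoothness 1.5 (assumed parameterization)."""
    u = np.asarray(h, dtype=float) / phi
    return (1.0 + u) * np.exp(-u)


def ordinary_kriging(coords, resid, new_coords, sigma2, phi, tau2=0.0):
    """Ordinary kriging of residuals with covariance tau2 * I + sigma2 * Sigma(phi).

    Returns predictions at new_coords and the kriging weights (one column per
    new location). The weights are constrained to sum to one.
    """
    coords = np.asarray(coords, dtype=float)
    new_coords = np.asarray(new_coords, dtype=float)
    resid = np.asarray(resid, dtype=float)
    n, m = len(coords), len(new_coords)

    # Covariance among observed sites, including the nugget tau2.
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    K = sigma2 * matern15(d, phi) + tau2 * np.eye(n)

    # Kriging system augmented with a Lagrange multiplier for the unknown mean.
    A = np.zeros((n + 1, n + 1))
    A[:n, :n] = K
    A[:n, n] = 1.0
    A[n, :n] = 1.0

    # Covariance between observed sites and prediction locations.
    d0 = np.linalg.norm(coords[:, None, :] - new_coords[None, :, :], axis=-1)
    rhs = np.vstack([sigma2 * matern15(d0, phi), np.ones((1, m))])

    weights = np.linalg.solve(A, rhs)[:n]
    return weights.T @ resid, weights
```

Under this correlation function, the covariance between a prediction location and the sampled sites decays toward zero as distance grows, so far from any site the predictor reverts to the estimated mean of the residuals. That behaviour is consistent with the limited long-distance prediction capability noted above.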