TEL AVIV UNIVERSITY
The Iby and Aladar Fleischman Faculty of Engineering
The Zandman-Slaner School of Graduate Studies
MODELING AND ESTIMATION OF CELLULAR PATH
LOSS DISTRIBUTIONS
A thesis submitted toward the degree of
Master of Science in Electrical and Electronic Engineering
by
Yehonatan Broyde
August 2009
This research was carried out in the Department of Electrical Engineering - Systems under the supervision of Prof. Hagit Messer-Yaron
August 2009
Acknowledgment
Warm thanks go to my supervisor, Prof. Hagit Messer-Yaron, for her wise advice,
support and encouragement.
This research was initiated in cooperation with Schema Ltd. (http://schema.com), who
presented the issue of indoor/outdoor ratio estimation, and shared its parsing and
analysis tools, as well as raw cellular data. Fruitful discussions with the Schema team
led to the possibility of using a mixed-Gaussian model for describing the path loss
distribution, in addition to some other useful ideas.
I therefore deeply thank the Schema team, and especially Dr. Michael Lifvschitz, Mr.
Shmuel Nowik and Mr. Roy Bass, for their ideas, knowledge and support.
Special thanks to my family and friends, who supported me throughout the long
journey of the thesis research and writing.
Abstract
Cellular network path losses are key factors in the behaviour and performance of the
network. Many models have been studied in the last few decades, for describing path
loss behaviour as a function of the distance between the transmitter and receiver, and
of other parameters such as antenna height, buildings and vegetation characteristics in
the signal path, and more.
In this thesis, "Sector-to-Users" path losses were studied, and a novel model
describing this path loss distribution is presented. First, the distribution is computed
theoretically, based on a few simple assumptions such as a power law signal decay,
uniform geographical distribution of cellular calls, etc. Next, a simple Gaussian model
is suggested, based on the theoretical computation with real-life parameters and on
analysis of empirical data. This thesis shows, both theoretically and empirically, that
the simple Gaussian model is valid for many real-life sectors, where the following
criteria are met: standard deviation of shadowing effect is above ~5dB, and mixing of
indoor and outdoor users is low. The complete model incorporates Indoor/Outdoor
mixing, and offers a Gaussian Mixture (GM) model for the path loss distribution.
Throughout the work, different algorithms are analyzed and used to estimate the path
loss distribution parameters: Gaussian Mixture estimation techniques for distributions
with known means and variances, and Expectation-Maximization (EM) algorithm for
distributions with unknown means and variances. The estimators were thoroughly
tested and analyzed based on empirical path loss data from a real UMTS cellular
network in Israel. The data was retrieved from various sectors, including sectors with
high and low Indoor/Outdoor mixing, sectors which use repeaters in their coverage
area and more.
Based on the model and estimators presented, two applications will be proposed in
this thesis, utilizing the fact that path loss data is already measured and logged
automatically in most cellular networks. The first application is a "coverage map" of
the cellular network, composed of estimates of each sector's actual coverage area.
This can help locate coverage "blind spots" and overlaps in the network. The second
application is an estimation of the Indoor/Outdoor ratio, using the GM model and the
EM algorithm. Both applications can lead to significant improvements in network
design and optimization.
To conclude, a few directions for future research will be suggested. The most
prominent of these directions is calibration of the model studied in this thesis, which
can be done using standard "drive-test" measurements.
Contents
1. Introduction
1.1. Background
1.2. Previous Work
1.2.1 General path loss models
1.2.2 Sector-to-users path loss models, and indoor/outdoor ratio
estimation
1.3. This thesis contribution
2. Cellular Path Loss Distribution – a Theoretical Computation
3. Gaussian Mixture Model
3.1. Why choose a GM model?
3.2. GM background
3.3. Estimators for distributions with known means and variances
3.4. EM Algorithm
4. The Monitored System
5. Results Analysis
5.1. Highly bi-modal sectors
5.2. Highly uni-modal sectors
5.3. Influence of repeaters
5.4. Dependence of initial values
5.5. EM results for pure Gaussian and theoretical model distributions
5.6. Sector coverage area estimation
6. Conclusions and Suggestions for Future Research
6.1. Conclusions
6.2. Suggestions for future research
6.2.1 Calibrating the models presented in this thesis
6.2.2 Studying the convergence rate of the EM algorithm and developing
techniques for its acceleration
6.2.3 Using the models as a new "measuring tool"
6.2.4 Combining location and path loss data
6.2.5 Mobility estimation, in-car path loss behavior and discrimination
6.2.6 Integrating additional information to the Gaussian mixture
estimation
Appendices
A. CRLB Computation
Bibliography
List of Acronyms
BSC – Base Station Controller
BTS – Base Transceiver Station
CDMA – Code Division Multiple Access
CPICH – Common Pilot Channel
CRLB – Cramer-Rao Lower Bound
EM – Expectation Maximization
GM – Gaussian Mixture
GPEH – General Performance Event Handler
GSM – Global System for Mobile communication
ITU – International Telecommunication Union
ML – Maximum Likelihood
MLC – Maximum Likelihood Classifier
MS – Mobile Station
MSC – Mobile Switching Center
MSE – Mean Square Error
OSS – Operations Support System
PDF – Probability Density Function
PPE – Prior Probability Estimator
RF – Radio Frequency
RNC – Radio Network Controller
RRM – Radio Resource Management
RSL – Received Signal Level
RSS – Received Signal Strength
UE – User Equipment
UMTS – Universal Mobile Telecommunication System
List of Figures
Figure 1.1 – Signal propagation in a wireless network
Figure 2.1 – Cellular sector-to-users theoretical distribution - I
Figure 2.2 – Cellular sector-to-users theoretical distribution - II
Figure 2.3 – Theoretical Vs. Measured Path Loss Distribution
Figure 2.4 – Path loss mean Vs. Sector's maximal radius, for model values of A=10^3.7 and σ=9[dB]
Figure 2.5 – Kullback-Leibler divergence between (2.9) and its Gaussian fit, Vs. the standard deviation σ used in (2.9)
Figure 3.1 – Measured Path Loss Distribution (EKG141952) Vs. Unimodal Gaussian fit
Figure 3.2 – Measured Path Loss Distribution (EKG142181) Vs. Unimodal Gaussian fit
Figure 3.3 – MLCunbiased Vs. Measured Path Loss Data, Sector EKG 141952
Figure 3.4 – MLCunbiased Vs. Measured Path Loss Data, Sector EKG 141952
Figure 3.5 – Estimators' MSE Vs. Distance between the Gaussians (100 samples)
Figure 3.6 – Estimators' MSE Vs. Distance between the Gaussians (10,000 samples)
Figure 4.1 – Sector locations in the Beer-Sheva area
Figure 4.2 – Typical trisector Node B units
Figure 5.1 – Google Earth photo of the central bus station, sectors EKG153363 and EKG142181
Figure 5.2 – Measured Data Vs. EM Estimation of EKG153363
Figure 5.3 – Measured Data Vs. EM Estimation of EKG142181
Figure 5.4 – Google Earth photo of the "BIG" shopping center, sector EKG141952
Figure 5.5 – Measured Data Vs. EM Estimation of EKG141952
Figure 5.6 – Google Earth photo of an urban area, sector EKG143613
Figure 5.7 – Measured Data Vs. EM Estimation of EKG143613
Figure 5.8 – Google Earth photo of urban area, sector EKG159853
Figure 5.9 – Measured Data Vs. EM Estimation of EKG159853
Figure 5.10 – Google Earth photo of the university area sectors and repeaters
Figure 5.11 – Measured Data Vs. EM Estimation of EKG130711
Figure 5.12 – Measured Data Vs. EM Estimation of EKG131731
Figure 5.13 – Google Earth photo of the area around EKG148911/2/3
Figure 5.14 – Measured Data Vs. EM Estimation of EKG148911
Figure 5.15 – Measured Data Vs. EM Estimation of EKG148912
Figure 5.16 – Measured Data Vs. EM Estimation of EKG148913
Figure 5.17 – EM estimation with different initial parameters values, EKG141952
Figure 5.18 – EM Algorithm steps of two different runs (right and left), EKG141952
Figure 5.19 – EM estimation with different initial parameters values, EKG148912
Figure 5.20 – EM estimation with different initial parameters values, pure Gaussian data
Figure 5.21 – EM estimation with different initial parameters values, equation (2.9) theoretical computation
Figure 5.22 – Beer-Sheva city center coverage map estimation
Figure 5.23 – Coverage map estimation outside of Beer-Sheva
Chapter 1
Introduction
1.1 Background
Cellular network path losses are one of the most significant factors influencing the
behavior of the network and its performance, and therefore have a great effect on
network design and optimization. Path loss models have been widely studied in
recent decades, though most of these studies addressed "loss vs. distance" behavior and the
different factors influencing it. In this thesis we study "sector-to-users" path loss
distributions, present a model to describe these distributions, develop estimation
techniques and analyze empirical data from real cellular networks.
Path loss is usually described as the cumulative effect of three different phenomena
[1, 2, 3, 4, 5]–
1. Propagation loss¹ –
This is the median attenuation of the signal power, usually described as a
power law relation between the signal power and the distance between the
transmitter and the receiver. Propagation loss is due to free space wavefront
spreading, and the median effect of the propagation environment, such as
buildings, vegetation, water sources etc.
2. Shadowing –
This is the slow variation of the signal strength around the average curve
(Figure 1.1). The radio signal undergoes additional attenuation caused by local
obstacles (or masks) between the transmitter and the receiver. It is generally
modeled by a Gaussian distribution with zero mean and standard deviation σ
(when referring to the signal power in dB). A typical standard deviation for
rural environment is 6 dB and can be as high as 10-12 dB for urban
environments [6, 7].
3. Fast Fading –
This is the short-term fading, which is present in the signal in the form of rapid
variations (Figure 1.1). It is due to multipath effects, and can cause variations
of up to 20 dB in the signal power, over distances on the order of
the signal wavelength [8]. Since the variations happen over such a small scale
of distances, fast fading is usually averaged out and not taken into account in
large-scale optimization and design of cellular networks.
¹ In the literature, "path loss" sometimes refers only to the median large-scale loss. For the sake of clarity, I refer to the total signal loss from the transmitter to the receiver as the "path loss", and to the median large-scale loss as the "propagation loss".
Figure 1.1 - Signal propagation in a wireless network (Drawing taken from [6])
1.2 Previous work
1.2.1 General path loss models
Path loss models have been widely studied in the last few decades. Many models have
been suggested for various environments, signal frequencies and distances, and these
models are widely used by network designers, operators and others in the cellular
industry.
The methods for modeling and predicting the path loss in a specific network can be
roughly divided into two main groups: statistical and computational.
I. Statistical methods.
These methods use measured path loss data, plus RF propagation models, to formulate
the path loss model. They are usually based on path loss measurements in specific
areas and cities, and give a mathematical description of the path loss behavior, usually
just for the median behavior (thus actually modeling the propagation loss). Besides
the radio frequency and the distance between the transmitter and receiver, methods
differ in the parameters they take into consideration. Such parameters can be the
height of the transmitter and receiver, the buildings' height, the building density etc.
The most widely used method for cellular path loss modeling is the Okumura-Hata
model, which is based on numerous measurements taken in and around Tokyo.
Okumura [9] suggested a method based on a series of curves describing the average
attenuation A(frequency, distance) relative to the free-space loss in an urban
environment. Hata [10] developed a mathematical formula to describe Okumura's
curves, which takes into account the frequency of the signal, the distance between the
base station antenna and the mobile antenna, the height of the base station antenna
and of the mobile antenna, and the character of the area (urban – large and medium
size cities, suburban and rural).
For example, the median path loss for a medium-size city is modeled by:
$$PL_{[dB]} = 69.55 + 26.16\log(f_0) - 13.82\log(h_T) - a(h_R) + \left(44.9 - 6.55\log(h_T)\right)\log(r) \qquad (1.1)$$
Where $f_0$ is the signal center frequency, $r$ is the distance between the base station and
the mobile antenna, $h_T$ is the base station antenna height, $h_R$ is the mobile antenna
height, and
$$a(h_R)_{[dB]} = \left(1.1\log(f_0) - 0.7\right)h_R - \left(1.56\log(f_0) - 0.8\right) \qquad (1.2)$$
$$150 \le f_0 \le 1500\ \text{MHz}, \quad 30 \le h_T \le 200\ \text{m}, \quad 1 \le h_R \le 10\ \text{m}, \quad 1 \le r \le 20\ \text{km} \qquad (1.3)$$
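As an illustration, formulas (1.1)-(1.3) can be evaluated with a few lines of Python. This is a minimal sketch and not part of the original work: the function names are hypothetical, and we assume base-10 logarithms with the frequency in MHz, heights in meters and distance in km, as in the usual statement of the Hata formula.

```python
import math

def hata_mobile_correction_db(f0_mhz, h_r_m):
    """Mobile antenna correction a(h_R) of eq. (1.2), medium-size city."""
    log_f = math.log10(f0_mhz)
    return (1.1 * log_f - 0.7) * h_r_m - (1.56 * log_f - 0.8)

def hata_median_loss_db(f0_mhz, h_t_m, h_r_m, r_km):
    """Median path loss of eq. (1.1); inputs must lie in the ranges of (1.3)."""
    in_range = (150 <= f0_mhz <= 1500 and 30 <= h_t_m <= 200
                and 1 <= h_r_m <= 10 and 1 <= r_km <= 20)
    if not in_range:
        raise ValueError("parameters outside the validity range (1.3)")
    return (69.55
            + 26.16 * math.log10(f0_mhz)
            - 13.82 * math.log10(h_t_m)
            - hata_mobile_correction_db(f0_mhz, h_r_m)
            + (44.9 - 6.55 * math.log10(h_t_m)) * math.log10(r_km))

if __name__ == "__main__":
    # Illustrative values only: 900 MHz, 30 m base station, 1.5 m mobile.
    for r in (1, 2, 5, 10):
        print(r, "km:", round(hata_median_loss_db(900, 30, 1.5, r), 1), "dB")
```

The range check mirrors (1.3); outside these ranges the formula is not claimed to hold, so the sketch raises an error rather than extrapolating.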
Other statistical methods exist for path loss prediction, including COST-231,
Walfisch-Ikegami model, and Ibrahim-Parsons method [1]-[5].
II. Computational methods.
These methods are sometimes called "deterministic methods", and use wave
propagation models or ray-tracing techniques to compute the path loss in specific
radio links. They are usually more accurate than the statistical methods, but require
detailed and accurate description of the objects in the propagation space, plus
expensive computational resources.
This thesis aims to produce a general and easily applicable model for the path loss
distribution, without the need to rely on such detailed data (or expensive
computational resources). Therefore we focused on statistical methods for modeling
the path loss and did not go deeply into computational methods.
A good review of computational methods can be found in [2, Section III].
1.2.2 Sector-to-users path loss models, and indoor/outdoor ratio estimation.
While most of the statistical methods provide formulas similar to (1.1), which relate
the path loss to the frequency, distance and environmental characteristics, we focus in
this thesis on the path loss distribution between the sector's antenna and the mobile
users. This sector-to-users path loss distribution is important for three main
reasons:
i. It is more relevant for planning and optimizing cellular networks than the classic
path loss formulas.
ii. It can be extracted from actual measurements, which are usually performed
automatically by the cellular network.
iii. As we show later in this thesis, a general model can be built to describe the
sector-to-users path loss distribution. Using this model, we can extract
interesting network features from empirical path loss data.
One major difficulty which arises when we try to model the sector-to-users path loss
distribution is the difference between indoor and outdoor path losses. It is accepted that for indoor
calls, penetration and in-building losses of 7-20dB [1, 7, 8] should be added to the
total path loss. Many of the path loss models mentioned in section 1.2.1 solve this
difficulty by dealing with the indoor and outdoor cases separately, and by adding a
penetration loss factor for the indoor case. In the "real world", the measurements
contain a mixture of indoor and outdoor calls, and we must take this into account.
Some previous work has been done on the subject of sector-to-users path loss models,
and on the indoor/outdoor issue:
Barucha and Haas (2008) [11] derived an analytic formula for the sector-to-users path
loss distribution in the case of a spatially uniform distribution of calls, without considering
the indoor/outdoor issue. We independently made an almost identical computation,
which is presented in section 2 and compared to their results.
Zhu and Durgin (2005) [12] studied Received Signal Strength (RSS) methods for
locating indoor and outdoor cellular calls. In their work, they calculated the RSSA_N,
the average of the N strongest RSS measurements of a specific call. They assumed
that the RSSA_N is log-normally distributed with different means and variances for
indoor and outdoor calls. In their measurement set at the Georgia Tech campus, they
measured an indoor mean RSSA_6 of -97.8dBm with a standard deviation of 14.1dB, and
an outdoor mean RSSA_6 of -85.5dBm with a standard deviation of 9.7dB. These
differences between the indoor and outdoor log-normal distributions allowed them to
achieve ~90% correct indoor/outdoor estimates with a simple threshold estimator.
Villebrun et al. (2006) [8] studied a few methods for discrimination of
indoor/outdoor/in-car calls in a cellular network. They state the basic differences
between the three situations – penetration loss for indoor calls, and large RSS
fluctuations and cell change rates for in-car calls. They then present a mobility
detection algorithm, based on fluctuations of the RSS. Finally an indoor/outdoor
discrimination algorithm is presented, based on a Neyman-Pearson criterion, where
the different indoor and outdoor a-priori distributions are calculated using a network
planning software simulation. Results are given only for a small number of
measurements, using special mobile equipment that records the power control
measurements it takes and sends over the network. Even though their methods are
designed to use data from standard wireless networks, no such examples are
presented.
Alaya-Feki and Le Cornec (2007) [6] suggest methods for intelligent real-time
analysis of RRM (Radio Resource Management) measurements, for improving RRM
processes. They too rely on RSS measurements taken by the mobile user, and use
smoothing and regression techniques to estimate the different attenuation components
– propagation loss, shadowing and fast fading. They use these estimates to
discriminate between stationary, pedestrian and in-car mobile users, according to differences in
the fluctuations of the shadowing and propagation loss components. Indoor situations
are ignored, and the authors do not state what kind of fluctuations are expected in
indoor measurements. Another application they present is an improvement of the
handover process, by better choosing the next serving cell, relying on the slope of the
neighboring cells' RSS vs. time measurements, in addition to the RSS values
themselves.
1.3 This thesis contribution
In this thesis, we introduce new methods for measuring and modeling the path-loss
statistics in cellular networks.
In section 2 we develop a theoretical model for the cellular "sector-to-user" path loss
distribution. We show that under real-life conditions, this model is close to Gaussian,
and use empirical data to show that the model corresponds well to sectors which have
high rates of either indoor or outdoor calls.
In section 3 we deal with the issue of indoor/outdoor mixing, and suggest a Gaussian
mixture (GM) model for the path loss distribution. We study different estimators for
Gaussian mixture distributions. Three different estimators are compared, for
distributions with known means and variances, and we introduce an Expectation-
Maximization (EM) algorithm for distributions with unknown means, variances and
mixing probability.
Empirical path loss distributions from a real UMTS cellular network in Israel are
presented in section 5. Results are analyzed and examples are given for two different
applications – estimation of the sectors' physical coverage area, and estimation of the
sectors' indoor/outdoor ratio.
Conclusions and suggestions for future research are presented in section 6.
The uniqueness and innovation of this thesis consist of three main aspects:
1. Development of a theoretical model for "sector-to-user" path loss distributions
plus a quantitative range of parameters (which are met in real-life conditions), in
which the theoretical model is equivalent to a simple unimodal Gaussian model.
2. Introduction of new applications – monitoring each sector's path loss
performance, physical coverage area and indoor/outdoor ratio. The monitoring
can be done almost in real-time, and can further improve cellular network design and
optimization tools.
3. Development of the model and applications based on readily available
measurements from the cellular network, which are measured and logged
automatically by network operators.
Chapter 2
Cellular Path Loss Distribution – a theoretical computation
In this section we present a theoretical computation of the expected sector-to-users
path loss distribution, based on a few simple assumptions. The aim of the computation
is to give a "first order" analysis of the nature of the path loss distribution, and to
better understand the factors which dominate it.
The main assumptions of the computation are:
I. The geographic distribution of the cellular calls is uniform –
As we have no prior information about the geographical distribution of the calls,
the most reasonable assumption is that the distribution is uniform. We
believe that in urban areas this assumption is usually valid, on average,
although in each sector the geographic distribution of the calls might be
clustered in specific areas, such as office buildings, malls, main junctions etc.
II. The signal power median decays according to a power law –
Since we want a general model, we turned to the group of statistical methods,
like COST-231 and the Okumura-Hata model. Both models (plus some other
statistical models) give a power-law decay for signal power medians in cellular
frequency bands. Each model provides different constants and factors for
different parameters (such as the height of the antenna, the height of the
buildings in the signal space etc.). But for examining the general nature of the
distribution, as we shall see ahead, it was sufficient to consider the path loss
model as a power law, i.e.
$$\frac{P_{sent}}{P_{received}} = X A r^{\alpha} \qquad (2.1)$$
where X represents a slow-fading zero-mean log-normal distribution.
III. The geographic area in which mobile phones communicate with the sector
antenna is a slice with an opening angle θ and radius R –
This is, in effect, also an assumption of a distinct division of the cellular
network into separate cells. In reality this division is neither distinct nor clearly
bounded, and the spatial form of the cells does not have a well-defined shape.
Nonetheless, it is a good enough assumption for the geographical area of a
general cell.
IV. The measurements are statistically independent –
The validity of this assumption depends on the way the measurements are taken. We
expect the measurements to be statistically independent if, for example, each
measurement is taken for a different call, and the call positions are independent
too. In this thesis the measurements are taken multiple times during a call (at
each handover), and so the independence is weaker.
V. The shadowing effect adds an independent log-normal factor, with constant mean and
variance in the sector area –
Since we develop a statistical model, the independence is reasonable. The
constant mean and variance of the shadowing effect is a common assumption in
the literature, though non-constant values can be added to the model, requiring
numerical computation.
Under these assumptions, we compute the distribution of the path loss:
We first compute the distribution of the path loss using the power law formula, and
then consider the log-normal factor X. Relying on the statistical independence of
the two phenomena, taking the log-normal factor into account is done by convolving
the two distribution functions.
We define PL to be a random variable representing the path loss.
According to assumption II –
$$PL_{[ratio]} = X A r^{\alpha} \;\Rightarrow\; PL_{[dB]} = 10\log(A) + 10\alpha\log(r) + N(0,\sigma) \qquad (2.2)$$
According to assumptions I and III, we can compute the distribution of r, the random
variable representing the distance between the sector's antenna and the mobile user –
$$f_r(r) = \frac{2r}{R^2}, \quad 0 \le r \le R \qquad (2.3)$$
(R represents the maximal radius of the sector's slice)
Ignoring the log-normal factor (for now), we can compute the distribution of PL as a
function of r –
$$g(r) = 10\log(A) + 10\alpha\log(r) \;\Rightarrow\; g'(r) = \frac{10\alpha}{r\ln(10)} \qquad (2.4)$$
$$f_{PL_{[dB]}}(PL_{[dB]}) = \frac{f_r(r)}{g'(r)} = \frac{2r}{R^2}\cdot\frac{r\ln(10)}{10\alpha} = \frac{\ln(10)\,r^2}{5\alpha R^2} \qquad (2.5)$$
According to (2.2), ignoring the independent log-normal factor –
$$PL_{[dB]} = 10\log(A) + 10\alpha\log(r) \;\Rightarrow\; r = A^{-\frac{1}{\alpha}}\, 10^{\frac{PL_{[dB]}}{10\alpha}} \qquad (2.6)$$
Therefore
$$f_{PL_{[dB]}}(PL_{[dB]}) = \frac{\ln(10)}{5\alpha R^2}\, A^{-\frac{2}{\alpha}}\, 10^{\frac{PL_{[dB]}}{5\alpha}}, \quad -\infty < PL_{[dB]} \le 10\log(A) + 10\alpha\log(R) \qquad (2.7)$$
And we get an exponential distribution for PL. As a sanity check we validate that
$$\int_{-\infty}^{+\infty} f_{PL_{[dB]}}(PL_{[dB]})\, dPL_{[dB]} = 1 \qquad (2.8)$$
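The sanity check (2.8) can also be reproduced numerically. The following is a minimal sketch, not the computation used in this thesis: it integrates (2.7) with a simple midpoint rule, truncating the lower limit at a hypothetical 400 dB below the maximal propagation loss (the function names and the truncation are assumptions made for illustration).

```python
import math

def propagation_loss_pdf(pl_db, A, alpha, R):
    """Eq. (2.7): density of the propagation loss in dB, without shadowing."""
    pl_max = 10 * math.log10(A) + 10 * alpha * math.log10(R)
    if pl_db > pl_max:
        return 0.0  # no user is farther than R, so no loss exceeds pl_max
    return (math.log(10) / (5 * alpha * R ** 2)
            * A ** (-2 / alpha) * 10 ** (pl_db / (5 * alpha)))

def total_probability(A, alpha, R, span_db=400.0, steps=40000):
    """Midpoint-rule approximation of the integral in (2.8)."""
    pl_max = 10 * math.log10(A) + 10 * alpha * math.log10(R)
    h = span_db / steps
    total = 0.0
    for k in range(steps):
        total += propagation_loss_pdf(pl_max - (k + 0.5) * h, A, alpha, R)
    return total * h

if __name__ == "__main__":
    # Model values used later in this chapter: alpha = 3, A = 10^3.7, R = 500 m.
    print(total_probability(10 ** 3.7, 3, 500.0))
```

The truncation is harmless here because the density decays exponentially below the maximal loss, so the neglected tail is far below floating-point resolution.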
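Taking the shadowing log-normal factor into account, we convolve (2.7) with the
log-normal distribution to get:
$$f_{PL_{[dB]}}(PL_{[dB]}) = \left(\frac{\ln(10)}{5\alpha R^2}\, A^{-\frac{2}{\alpha}}\, 10^{\frac{PL_{[dB]}}{5\alpha}}\right) \ast \left(\frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{PL_{[dB]}^2}{2\sigma^2}}\right) \qquad (2.9)$$
where $\ast$ denotes convolution over $PL_{[dB]}$, and the first factor is taken as zero above the maximal propagation loss given in (2.7).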
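The distribution (2.9) can be explored by direct simulation of assumptions I-V. The sketch below is illustrative only and is not the procedure used to produce the figures in this thesis: it draws user distances from (2.3), applies the power law of (2.2) and adds Gaussian shadowing in dB. The function names, the seed and the histogram range are assumptions.

```python
import math
import random

def sample_path_loss_db(n, A, alpha, R, sigma, seed=0):
    """Draw n path loss values in dB according to eq. (2.2) and (2.3)."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        # Uniform users over the slice area: r = R * sqrt(U), U in (0, 1].
        r = R * math.sqrt(1.0 - rng.random())
        shadowing_db = rng.gauss(0.0, sigma)
        samples.append(10 * math.log10(A) + 10 * alpha * math.log10(r)
                       + shadowing_db)
    return samples

def histogram_density(samples, lo, hi, bins):
    """Normalized histogram: an empirical estimate of the PDF on [lo, hi)."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for value in samples:
        k = math.floor((value - lo) / width)
        if 0 <= k < bins:
            counts[k] += 1
    return [c / (len(samples) * width) for c in counts]

if __name__ == "__main__":
    data = sample_path_loss_db(20000, 10 ** 3.7, 3, 500.0, 6.0)
    density = histogram_density(data, 60.0, 160.0, 50)
    peak_bin = max(range(50), key=lambda k: density[k])
    print("histogram peak near", 60.0 + (peak_bin + 0.5) * 2.0, "dB")
```

Sampling `r` as `R * sqrt(U)` is the inverse-CDF method for the density (2.3), since the corresponding cumulative distribution is (r/R)^2.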
Figure 2.1 shows the path loss distribution according to equation (2.9), for values of
α=3, A=10^3.7 (parameters specified for UMTS path loss model by ITU, [7]),
R=500[meters] and σ = 6[dB] and 12[dB]. The red line represents the propagation loss
without the log-normal shadowing factor, and the blue lines represent the total path
loss.
Figure 2.2 shows the path loss distribution according to equation (2.9) for the same
parameters, but with two different R values of 500 and 1000 meters.
Figure 2.3 shows a real sector path loss distribution (black), compared to equation
(2.9) (blue) and a unimodal Gaussian estimation (dashed purple). In this figure α and
A were specified as in figures 2.1 and 2.2, following ITU [7], while R and σ were set manually
to create a good fit.
Figure 2.1 – Cellular sector-to-users theoretical distribution - I
Red line – Propagation loss without shadowing
Blue narrow distribution – Total path loss with shadowing, σ = 6[dB]
Blue wide distribution – Total path loss with shadowing, σ = 12[dB]
Figure 2.2 – Cellular sector-to-users theoretical distribution - II
Blue narrow distribution – R = 500[meters], σ = 6[dB]
Blue wide distribution – R = 500[meters], σ = 12[dB]
Purple narrow distribution – R = 1000[meters], σ = 6[dB]
Purple wide distribution – R = 1000[meters], σ = 12[dB]
Figure 2.3 – Theoretical Vs. Measured Path Loss Distribution
Black line – Measured path loss distribution from sector EKG142183
Blue line – Equation (2.9), with values manually chosen for a good fit
Purple dashed line – A unimodal Gaussian fit
Two important conclusions arise from the above computation and from equation (2.9):
1. As might be expected, the maximal propagation loss (without the shadowing
effect) depends logarithmically on R (eq. (2.7)). Taking the shadowing factor into
account, we still get a clear dependence of the maximum likelihood path loss (the
maximum of the path loss PDF), and of the path loss mean on the sector's radius.
Figure 2.4 shows this dependency for α=3, A=10^3.7 [7] and σ=9[dB] (the value of σ
is less relevant, since a 6-12[dB] range of values changes the maximum likelihood
and mean path loss by only ~1dB). As we present in section 5.6, this dependency
can be used in reverse, to estimate the coverage area of cellular sectors from their
path loss distribution, and thus help cellular operators to better monitor and
optimize their network.
2. Looking at figures 2.1-2.3, a question arises as to how different the suggested
model is from a simple Gaussian model. Though we get a clear exponential
distribution for the propagation loss, we see that with real propagation loss and
shadowing parameters, the path loss distribution is quite similar to Gaussian.
Figure 2.5 shows the Kullback-Leibler divergence between equation (2.9) and a
Gaussian fit, for fixed α, A, and R values and different standard deviations. We see
that for real-life cellular path loss parameters (i.e. standard deviations above
~5[dB]), a simple Gaussian model describes the path loss distribution well, and in
most cases the complicated expression (2.9) is not needed.
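The dependence of the mean path loss on the sector radius, noted in the first conclusion, can be written in closed form under the idealized assumptions I-V. For the distance density f_r(r) = 2r/R^2, the expectation of ln(r) is ln(R) - 1/2, and zero-mean shadowing in dB does not shift the mean, so the mean path loss is 10log(A) + 10α(log(R) - 1/(2ln(10))). The sketch below evaluates this expression and inverts it. It is an illustrative simplification rather than the estimation procedure of section 5.6, and the function names are hypothetical.

```python
import math

def mean_path_loss_db(R, A, alpha):
    """Mean of the model path loss in dB for a sector of maximal radius R."""
    return (10 * math.log10(A)
            + 10 * alpha * (math.log10(R) - 1 / (2 * math.log(10))))

def radius_from_mean_db(mean_db, A, alpha):
    """Invert mean_path_loss_db: estimate R from a measured mean path loss."""
    exponent = ((mean_db - 10 * math.log10(A)) / (10 * alpha)
                + 1 / (2 * math.log(10)))
    return 10 ** exponent

if __name__ == "__main__":
    A, alpha = 10 ** 3.7, 3  # model values quoted from [7] in the text
    for R in (250.0, 500.0, 1000.0):
        print(R, "m ->", round(mean_path_loss_db(R, A, alpha), 2), "dB")
    print(radius_from_mean_db(mean_path_loss_db(500.0, A, alpha), A, alpha))
```

In this idealized form, doubling the radius raises the mean by 10α·log(2), about 9 dB for α=3, which illustrates why the mean path loss is informative about the coverage radius.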
Figure 2.4 – Path loss mean Vs. Sector's maximal radius, for model values of A=10^3.7
and σ=9[dB]
Figure 2.5 – Kullback-Leibler divergence between (2.9) and its Gaussian fit, Vs.
the standard deviation σ used in (2.9)
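A comparison of this kind can be sketched numerically as follows. The code is an illustration under stated assumptions and not the computation behind Figure 2.5: the Gaussian fit is taken here to be the moment-matched Gaussian, the divergence is computed in the direction from the model (2.9) to the Gaussian, and the convolution in (2.9) is evaluated by a midpoint rule. The density (2.7) is rewritten as c·exp(c(y - PL_max)) with c = ln(10)/(5α), which is algebraically identical.

```python
import math

def model_pdf(x, c, pl_max, sigma, steps=200):
    """Eq. (2.9) at x: numerical convolution of (2.7) with N(0, sigma)."""
    lo = x - 8.0 * sigma
    hi = min(x + 8.0 * sigma, pl_max)  # (2.7) vanishes above pl_max
    if hi <= lo:
        return 0.0
    h = (hi - lo) / steps
    norm = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    total = 0.0
    for k in range(steps):
        y = lo + (k + 0.5) * h
        propagation = c * math.exp(c * (y - pl_max))
        shadowing = norm * math.exp(-0.5 * ((x - y) / sigma) ** 2)
        total += propagation * shadowing
    return total * h

def kl_to_gaussian_fit(A, alpha, R, sigma, n_grid=400):
    """KL divergence (nats) from the model to its moment-matched Gaussian."""
    c = math.log(10) / (5 * alpha)
    pl_max = 10 * math.log10(A) + 10 * alpha * math.log10(R)
    lo = pl_max - 18.0 / c - 7.0 * sigma
    hi = pl_max + 7.0 * sigma
    dx = (hi - lo) / n_grid
    xs = [lo + (k + 0.5) * dx for k in range(n_grid)]
    ps = [model_pdf(x, c, pl_max, sigma) for x in xs]
    mass = sum(ps) * dx
    ps = [p / mass for p in ps]  # remove small quadrature error
    mean = sum(p * x for p, x in zip(ps, xs)) * dx
    var = sum(p * (x - mean) ** 2 for p, x in zip(ps, xs)) * dx
    kl = 0.0
    for p, x in zip(ps, xs):
        if p > 0.0:
            log_q = (-0.5 * math.log(2.0 * math.pi * var)
                     - (x - mean) ** 2 / (2.0 * var))
            kl += p * (math.log(p) - log_q) * dx
    return kl

if __name__ == "__main__":
    for sigma in (2.0, 6.0, 10.0):
        print(sigma, "dB:", kl_to_gaussian_fit(10 ** 3.7, 3, 500.0, sigma))
```

The divergence shrinks as σ grows, in line with the trend described above; the exact values depend on the fitting method and grid, so they should not be read as reproducing Figure 2.5.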
As mentioned in section 1.2.2, Barucha and Haas [11] made an almost identical
computation (though they did not address the conclusions presented above). Their goal
was to develop an analytic formula for the distribution of path losses for uniformly
distributed nodes in a circle. This is the same case as ours, though they did not target
their computation specifically at the cellular case, and indeed it is relevant for any
uniformly scattered wireless network with a distance power law path loss behavior.
They used different notation but came to the same result as (2.9). They then
continued the analytical computation of the convolution, to get:
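$$f_{PL_{[dB]}}(PL_{[dB]}) = \frac{\ln(10)}{bR^2}\; e^{\frac{2b\ln(10)\left(PL_{[dB]} - a\right) + 2\sigma^2(\ln(10))^2}{b^2}} \left(1 - \mathrm{erf}(D)\right) \qquad (2.10)$$
Where:
$$a = 10\log(A), \quad b = 10\alpha, \quad D = \frac{PL_{[dB]}\,b - ab - b^2\log(R) + 2\sigma^2\ln(10)}{\sqrt{2}\,\sigma b} \qquad (2.11)$$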
Barucha and Haas also compared the computation to Monte-Carlo simulations of the
path loss distribution, and as expected the computation and simulations were in good
agreement. Good agreement was also found between the computation and simulations
done for hexagonal sectors, which are slightly more similar to the real shape of
cellular sectors.
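For reference, the closed form (2.10)-(2.11) is straightforward to evaluate. The following sketch is our own illustration, not code from [11]: the function name is hypothetical, and `math.erfc(D)` is used in place of 1 - erf(D), which is mathematically identical but numerically safer in the upper tail.

```python
import math

def path_loss_pdf_closed_form(pl_db, A, alpha, R, sigma):
    """Eq. (2.10) with the definitions of (2.11)."""
    a = 10 * math.log10(A)
    b = 10 * alpha
    ln10 = math.log(10)
    d = ((pl_db * b - a * b - b * b * math.log10(R) + 2 * sigma ** 2 * ln10)
         / (math.sqrt(2) * sigma * b))
    exponent = (2 * b * ln10 * (pl_db - a) + 2 * sigma ** 2 * ln10 ** 2) / b ** 2
    return ln10 / (b * R ** 2) * math.exp(exponent) * math.erfc(d)

if __name__ == "__main__":
    # Riemann-sum check that the density integrates to about 1 on 0..220 dB.
    step = 0.05
    grid = [step * (k + 0.5) for k in range(4400)]
    total = sum(path_loss_pdf_closed_form(x, 10 ** 3.7, 3, 500.0, 9.0)
                for x in grid) * step
    print("integral over grid:", total)
```

The sketch is intended for path loss values in the practical range of a few hundred dB at most; for extreme arguments the exponential factor can overflow before the `erfc` factor suppresses it.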
Chapter 3
Gaussian Mixture Model
3.1 Why choose a GM model?
In section 2 we derived a theoretical expression for the sector-to-user path loss
distribution. As was demonstrated in figure 2.3, there are some real cellular sectors
whose path loss distribution is in agreement with that expression. Another simple
model which might be suggested (as was shown in figures 2.3, 2.5) is a unimodal
Gaussian distribution. However, it turns out there are some sectors with completely
different path loss behavior. Figures 3.1 and 3.2 show two such sectors. A unimodal
Gaussian fit is presented (the purple dashed line), but it is clear that neither
the unimodal Gaussian model nor the expression derived in section 2 can describe
these kinds of distributions.
The main factor which was not referred to in section 2 is the indoor penetration loss,
usually taken to be 7-20dB [1, 7, 8]. Taking the indoor penetration loss into account,
we propose a bimodal Gaussian mixture (GM) model for the path loss distribution:
$$f_{PL}(pl) = p \cdot N(\mu_1, \sigma_1^2) + (1 - p) \cdot N(\mu_2, \sigma_2^2) \qquad (3.1)$$
Where:
$N(\mu_1, \sigma_1^2)$ – Represents a Gaussian model for outdoor calls path loss distribution.
$N(\mu_2, \sigma_2^2)$ – Represents a Gaussian model for indoor calls path loss distribution.
$p$ – Represents the mixing probability between indoor and outdoor calls in
the sector coverage area.
We note that in Israel the use of hands-free cell phones in cars is mandatory, and
therefore the effect of in-car penetration loss is expected to be small. Since in-car calls
are transmitted through an antenna outside the vehicles, these calls are treated as
outdoor calls.
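The model (3.1) can be written down in a few lines. The following is a minimal sketch, not code from the thesis: the function names are hypothetical and the example parameters (outdoor mean 105 dB, indoor mean 123 dB, 8 dB standard deviations, p = 0.4) are assumed values chosen only for illustration.

```python
import math
import random

def normal_pdf(x, mu, sigma):
    """Gaussian density N(mu, sigma^2) evaluated at x."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma**2)) / (sigma * math.sqrt(2.0 * math.pi))

def gm_pdf(pl, p, mu1, sigma1, mu2, sigma2):
    """Bimodal mixture (3.1): weight p on the first (outdoor) Gaussian, 1 - p on the second."""
    return p * normal_pdf(pl, mu1, sigma1) + (1.0 - p) * normal_pdf(pl, mu2, sigma2)

def gm_sample(n, p, mu1, sigma1, mu2, sigma2, rng):
    """Draw n path losses: each one comes from Gaussian 1 with probability p."""
    return [rng.gauss(mu1, sigma1) if rng.random() < p else rng.gauss(mu2, sigma2)
            for _ in range(n)]

# Assumed example values, for illustration only.
rng = random.Random(0)
example = gm_sample(20000, 0.4, 105.0, 8.0, 123.0, 8.0, rng)
sample_mean = sum(example) / len(example)
print(sample_mean)  # should be close to 0.4 * 105 + 0.6 * 123
```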
In section 3.2 we give a short background about the Gaussian Mixture estimation
problem. In section 3.3, we present some estimation methods for cases in which the
means and variances of the two Gaussians are known or can be estimated separately.
In addition a comparison is presented between the different estimators and the
Cramer-Rao Lower Bound, which is computed in appendix A. In section 3.4, an
Expectation-Maximization (EM) algorithm is presented, for estimating all five
parameters ($\mu_1, \sigma_1, \mu_2, \sigma_2, p$) with no prior knowledge other than the model of
equation 3.1.
Page 21
Figure 3.1 – Measured Path Loss Distribution (EKG141952) Vs. Unimodal Gaussian fit
Black line – Measured path loss distribution from sector EKG141952
Purple dashed line – A unimodal Gaussian fit
Figure 3.2 – Measured Path Loss Distribution (EKG142181) Vs. Unimodal Gaussian fit
Black line – Measured path loss distribution from sector EKG142181
Purple dashed line – A unimodal Gaussian fit
Page 22
3.2 GM background
Parameter estimation of Gaussian mixtures has been widely studied in the past, with
studies on many different aspects of the problem – univariate and multivariate
densities, estimating parameters when some a-priori information is given, lower
bounds for different cases etc. (general background on Gaussian distributions and
estimations can be found in [13, 14]). The problem of estimating the five parameters
($\mu_1, \sigma_1, \mu_2, \sigma_2, p$) in a mixture of two univariate normal distributions was considered
as long ago as 1894 (by Karl Pearson, [15]).
Studies which are more relevant to this thesis include those made by –
B.S. Everitt (1984) [16], who presents a comparison between different Maximum
Likelihood (ML) algorithms for estimating the five parameters in a mixture of two
univariate normal distributions (Expectation-Maximization algorithm, Newton's
method, Fletcher-Reeves algorithm and Nelder-Mead Simplex algorithm).
Dick and Bowden (1973) [17], who studied the estimation of the five parameters,
when independent sample information is available from one of the populations.
G.R. Dattatreya, who studied a few aspects of the problem –
• Estimating parameters of a mixture of M Gaussian densities, with known and
distinct means and unknown (possibly different) variances (Dattatreya 2002,
[18]).
• Estimating the prior probabilities in a mixture of M classes with known class
conditional distribution (Dattatreya & Kanal 1990, [19]). This study deals with
general distributions, though it's applicable to Gaussian distributions as well,
and even presents an example regarding the estimation of the prior probability
of a Gaussian mixture.
• Estimating prior probabilities in a mixture of M multivariate Gaussians with
known means and unknown common covariance matrix (Dattatreya & Fang
2003, [20]).
The most suitable estimation algorithm for our problem is probably the Expectation-
Maximization algorithm, which is presented in section 3.4.
3.3 Estimators for distributions with known means and variances
We begin our study of Gaussian mixture estimators by looking at a simple Maximum-
Likelihood estimator, for distributions with known means and variances.
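Given a bi-modal Gaussian mixture model –
$$f_Y(y) = p \cdot N(\mu_1, \sigma_1^2) + (1 - p) \cdot N(\mu_2, \sigma_2^2) \qquad (3.2)$$
and a vector of N statistically independent measurements $\mathbf{y}$, the total distribution
function of $\mathbf{y}$ is:
Page 23
$$f_{\mathbf{Y}}(\mathbf{y}) = \prod_{n=1}^{N} f_Y(y_n) \qquad (3.3)$$
The total likelihood and log-likelihood functions are, respectively:
$$L(p) = f_{\mathbf{Y}|p}(\mathbf{y}|p) = \prod_{n=1}^{N} f_{Y|p}(y_n|p) \qquad (3.4)$$
$$l(p) = \ln(L(p)) = \sum_{n=1}^{N} \ln\left(f_{Y|p}(y_n|p)\right) \qquad (3.5)$$
Maximizing the total likelihood function is a difficult task, which will be dealt with in
section 3.4.
A simpler task is to maximize the likelihood function for each measurement
separately. This estimator thus classifies each measurement to one of the two
Gaussians, and after classifying the N measurements, we define a "Maximum-
Likelihood Classifier" (MLC) estimator by:
$$\hat{p}_{MLC} = \frac{\#\, N(\mu_1, \sigma_1^2)\ \text{classifications}}{\#\, \text{Total Samples}} \qquad (3.6)$$
We denote
$$A = \int_{x_0}^{\infty} \frac{1}{\sqrt{2\pi\sigma_1^2}} e^{-\frac{(x - \mu_1)^2}{2\sigma_1^2}} dx = \frac{1}{2}\mathrm{erfc}\left(\frac{x_0 - \mu_1}{\sqrt{2\sigma_1^2}}\right) \qquad (3.7)$$
And
$$B = \int_{-\infty}^{x_0} \frac{1}{\sqrt{2\pi\sigma_2^2}} e^{-\frac{(x - \mu_2)^2}{2\sigma_2^2}} dx = \frac{1}{2}\left[1 + \mathrm{erf}\left(\frac{x_0 - \mu_2}{\sqrt{2\sigma_2^2}}\right)\right] \qquad (3.8)$$
For $x_0$ such that
$$\frac{1}{\sqrt{2\pi\sigma_1^2}} e^{-\frac{(x_0 - \mu_1)^2}{2\sigma_1^2}} = \frac{1}{\sqrt{2\pi\sigma_2^2}} e^{-\frac{(x_0 - \mu_2)^2}{2\sigma_2^2}} \qquad (3.9)$$
With $\mu_1 < \mu_2$, A is the probability that a sample of the first Gaussian is classified to the
second one, and B is the probability that a sample of the second Gaussian is classified to
the first one.
Notice that when $p = 0$, $E[\hat{p}_{MLC}] = B$, when $p = 1$, $E[\hat{p}_{MLC}] = 1 - A$, and in
general:
$$E[\hat{p}_{MLC}] = p(1 - A) + (1 - p)B = B + p(1 - A - B) = C_1 + p \cdot C_2 \qquad (3.10)$$
$$C_1 = B, \quad C_2 = 1 - A - B \qquad (3.11)$$
We see that $\hat{p}_{MLC}$ is a biased estimator, though an unbiased version of it can be easily
defined:
$$\hat{p}_{MLC\_unbiased} = \frac{\hat{p}_{MLC} - C_1}{C_2} \qquad (3.12)$$
($C_1$ and $C_2$ are computed for each set of $\mu_1, \sigma_1, \mu_2, \sigma_2$.)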
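The classifier and its bias correction can be sketched as follows. This is an illustrative implementation under stated assumptions, not the thesis code: it assumes $\mu_1 < \mu_2$ and a single crossing point $x_0$ of the two densities between the means (found here by bisection), and it removes the bias using the misclassification probabilities A and B of (3.7) and (3.8), i.e. $E[\hat{p}_{MLC}] = p(1 - A) + (1 - p)B$. All function names and the synthetic parameters are hypothetical.

```python
import math
import random

def log_normal_pdf(x, mu, sigma):
    return -0.5 * math.log(2.0 * math.pi * sigma**2) - (x - mu) ** 2 / (2.0 * sigma**2)

def crossing_point(mu1, sigma1, mu2, sigma2, iters=100):
    """Bisection for x0 of (3.9), searched between the two means."""
    diff = lambda x: log_normal_pdf(x, mu1, sigma1) - log_normal_pdf(x, mu2, sigma2)
    lo, hi = mu1, mu2
    if diff(lo) <= 0.0 or diff(hi) >= 0.0:
        raise ValueError("no single density crossing between the means")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if diff(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def mlc_estimates(samples, mu1, sigma1, mu2, sigma2):
    """Return (biased MLC estimate, bias-corrected estimate) of p."""
    x0 = crossing_point(mu1, sigma1, mu2, sigma2)
    a = 0.5 * math.erfc((x0 - mu1) / (math.sqrt(2.0) * sigma1))        # (3.7)
    b = 0.5 * (1.0 + math.erf((x0 - mu2) / (math.sqrt(2.0) * sigma2)))  # (3.8)
    p_mlc = sum(1 for y in samples if y < x0) / len(samples)            # (3.6)
    p_unbiased = (p_mlc - b) / (1.0 - a - b)
    return p_mlc, p_unbiased

# Synthetic check with assumed parameters (true p = 0.3).
rng = random.Random(0)
true_p = 0.3
data = [rng.gauss(100.0, 8.0) if rng.random() < true_p else rng.gauss(118.0, 8.0)
        for _ in range(20000)]
p_mlc, p_unbiased = mlc_estimates(data, 100.0, 8.0, 118.0, 8.0)
print(p_mlc, p_unbiased)
```

With unequal variances the two densities cross twice, so thresholding at a single $x_0$ is only an approximation of the per-sample maximum-likelihood decision.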
Page 24
Dattatreya and Kanal [19] developed a different estimator, for the general case of
estimating mixing probabilities in multiclass finite mixtures, when the class
conditional density functions are known:
Consider a random variable X with a multiclass distribution function:
$$f_X(x) = \sum_{i=1}^{M} p_i f_X(x|\omega_i) \qquad (3.13)$$
Where
M is the number of classes.
$p_i,\ i = 1, \ldots, M$ are the unknown prior probabilities
$f_X(x|\omega_i),\ i = 1, \ldots, M$ are the known class conditional density functions of X.
Now consider $h(X)$, a vector of M functions of the mixture random variable X:
$$E[h_i(X)] = \int_{-\infty}^{\infty} h_i(x) f_X(x) dx = \int_{-\infty}^{\infty} h_i(x) \sum_{j=1}^{M} p_j f_X(x|\omega_j) dx = \sum_{j=1}^{M} p_j E[h_i(x)|\omega_j] = \sum_{j=1}^{M} p_j h_{ij}, \quad \text{i.e.} \quad E[h(X)] = \mathbf{H}\mathbf{p} \qquad (3.14)$$
Since $f_X(x|\omega_i),\ i = 1, \ldots, M$ are known, $E[h_i(x)|\omega_j] \triangleq h_{ij}$ are known too.
$E[h_i(X)]$ can be estimated by averaging $\hat{h}_i(n) = \frac{1}{n}\sum_{k=1}^{n} h_i(x_k)$, and the resulting
estimator is:
$$\hat{p}_{PPE}(n) = \mathbf{H}^{-1}\hat{h}(n) \qquad (3.15)$$
The design of a Prior Probability Estimator (PPE) then boils down to finding M
functions $h_i(X)$ such that the matrix $\mathbf{H}$ is invertible. Such functions exist if and only
if $p_i$ are identifiable [19, p. 152].
The $h_i(X)$ functions presented by Dattatreya and Kanal for the Gaussian mixture
problem (and for some other problems) are:
$$h_i(x) = f_X(x|\omega_i) \qquad (3.16)$$
And the estimator is:
$$\hat{p}_{PPE}(n) = \mathbf{E}^{-1} \frac{1}{n} \sum_{k=1}^{n} \begin{bmatrix} f_X(x_k|\omega_1) \\ \vdots \\ f_X(x_k|\omega_M) \end{bmatrix} \qquad (3.17)$$
Where $\mathbf{E}$ is a matrix whose elements $e_{ij}$ are defined by:
$$e_{ij} \triangleq E[f_X(x|\omega_i)\,|\,\omega_j] \qquad (3.18)$$
And for a bi-modal Gaussian mixture:
$$e_{ij} = \frac{1}{\sqrt{2\pi(\sigma_i^2 + \sigma_j^2)}} e^{-\frac{(\mu_i - \mu_j)^2}{2(\sigma_i^2 + \sigma_j^2)}} \qquad (3.19)$$
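For two Gaussian classes, the PPE of (3.17) to (3.19) reduces to solving a 2x2 linear system. The sketch below is a hedged illustration with hypothetical function names and assumed synthetic parameters. Note that it returns both prior estimates without forcing them to sum to one.

```python
import math
import random

def normal_pdf(x, mu, sigma):
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma**2)) / (sigma * math.sqrt(2.0 * math.pi))

def gaussian_overlap(mu_i, sigma_i, mu_j, sigma_j):
    """e_ij of (3.19): expectation of the class-i density under class j."""
    var_sum = sigma_i**2 + sigma_j**2
    return math.exp(-((mu_i - mu_j) ** 2) / (2.0 * var_sum)) / math.sqrt(2.0 * math.pi * var_sum)

def ppe_two_classes(samples, mu1, sigma1, mu2, sigma2):
    """Solve E * [p1, p2]^T = h_hat for the two prior probabilities."""
    n = len(samples)
    h1 = sum(normal_pdf(x, mu1, sigma1) for x in samples) / n
    h2 = sum(normal_pdf(x, mu2, sigma2) for x in samples) / n
    e11 = gaussian_overlap(mu1, sigma1, mu1, sigma1)
    e12 = gaussian_overlap(mu1, sigma1, mu2, sigma2)  # equals e21 by symmetry
    e22 = gaussian_overlap(mu2, sigma2, mu2, sigma2)
    det = e11 * e22 - e12 * e12
    p1 = (e22 * h1 - e12 * h2) / det
    p2 = (e11 * h2 - e12 * h1) / det
    return p1, p2

# Synthetic check with assumed parameters (true p = 0.3).
rng = random.Random(0)
data = [rng.gauss(100.0, 8.0) if rng.random() < 0.3 else rng.gauss(118.0, 8.0)
        for _ in range(20000)]
p1_hat, p2_hat = ppe_two_classes(data, 100.0, 8.0, 118.0, 8.0)
print(p1_hat, p2_hat)
```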
Figure 3.3 shows the estimation of p for the data of sector EKG141952. We estimate
the means of the two Gaussians as the local maxima of the distribution, and estimate
the variances of the Gaussians by taking into account only the data samples that fall
Page 25
in the 'outer' side of the two Gaussians. Figure 3.4 shows the estimation of p by the
same algorithm, but with a slight shift outwards (2%) in the Gaussians mean values.
As expected, figure 3.4's estimation is better than figure 3.3's, and simple estimation
algorithms can be developed by trying a few Gaussian mean values or by iterative
methods.
Figure 3.3 – MLCunbiased Vs. Measured Path Loss Data, Sector EKG 141952
Black line – Measured Path Loss Distribution
Blue line – MLCunbiased estimation
Red line – The two separate Gaussians
Page 26
Figure 3.4 – MLCunbiased Vs. Measured Path Loss Data, Sector EKG 141952
Black line – Measured Path Loss Distribution
Blue line – MLCunbiased estimation
Red line – The two separate Gaussians
Figures 3.5, 3.6 show the mean square error (MSE) of the two unbiased estimators,
compared with the Cramer-Rao lower bound (for CRLB computation, see appendix
A). The MSE was computed versus the distance between the Gaussians' means, with
standard deviation of 1, and taking the mean value of the MSE over p values between
0 and 1. Sample size of 100 samples was used in figure 3.5, and 10,000 samples were
used in figure 3.6.
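For orientation, a bound of this kind can be evaluated numerically. The sketch below is an assumption and not the derivation of appendix A: it uses the standard Fisher information of the mixing probability when the two component densities $f_1$, $f_2$ are fully known, $I(p) = \int (f_1 - f_2)^2 / (p f_1 + (1 - p) f_2)\, dx$ per sample, so that the bound for N samples is $1 / (N \cdot I(p))$. Function names are hypothetical.

```python
import math

def normal_pdf(x, mu, sigma):
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma**2)) / (sigma * math.sqrt(2.0 * math.pi))

def crlb_mixing_probability(p, mu1, sigma1, mu2, sigma2, n_samples, steps=20000):
    """Numerical Cramer-Rao lower bound for p with known means and variances."""
    lo = min(mu1 - 10.0 * sigma1, mu2 - 10.0 * sigma2)
    hi = max(mu1 + 10.0 * sigma1, mu2 + 10.0 * sigma2)
    h = (hi - lo) / steps
    info = 0.0
    for k in range(steps + 1):
        x = lo + k * h
        f1 = normal_pdf(x, mu1, sigma1)
        f2 = normal_pdf(x, mu2, sigma2)
        mix = p * f1 + (1.0 - p) * f2
        if mix > 0.0:
            weight = 0.5 if k in (0, steps) else 1.0  # trapezoidal end points
            info += weight * (f1 - f2) ** 2 / mix * h
    return 1.0 / (n_samples * info)

# Unit standard deviations, as in the comparison above; distances are examples.
far = crlb_mixing_probability(0.5, 0.0, 1.0, 10.0, 1.0, 100)
near = crlb_mixing_probability(0.5, 0.0, 1.0, 1.0, 1.0, 100)
print(far, near)
```

For well separated Gaussians the bound approaches p(1 − p)/N, the variance of a sample proportion, and it grows as the Gaussians move closer together.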
Page 27
Figure 3.5 – Estimators' MSE Vs. Distance between the Gaussians (100 samples)
Black dotted line – $\hat{p}_{MLC\_unbiased}$ MSE
Purple line – $\hat{p}_{PPE}$ MSE
Blue line – Cramer-Rao Lower Bound
Figure 3.6 – Estimators' MSE Vs. Distance between the Gaussians (10,000 samples)
Black dotted line – $\hat{p}_{MLC\_unbiased}$ MSE
Purple line – $\hat{p}_{PPE}$ MSE
Blue line – Cramer-Rao Lower Bound
Page 28
3.4 The EM Algorithm
In section 3.3 we developed GM estimators for cases in which the means and
variances of the Gaussians are known. Other estimators exist for other types of a-priori
information (some of them are mentioned in section 3.2). Theoretically, even
without known means and variances, we could have used separate estimators to
estimate the means and variances of the two Gaussians (as was illustrated in figures 3.3
and 3.4). These estimators could use additional information, such as geographical
information (distances to the mobile users, distances from other sectors, building
heights in the sector's area etc.), information about the expected indoor attenuation,
information from drive tests and more.
Since we searched for good estimation without relying on additional information, we
chose to study estimators which estimate all five parameters together. In section 6 we
address the issue of integrating additional information and/or constraints into our
estimator and present some ideas for future research in this area.
Among the different groups of estimators for the GM case (most of them are reviewed in
[16]), the EM estimator is the most widely used and is well suited to our problem.
The algorithm is thoroughly described in the literature ([21] – [26]) and we will give
here only a short overview of it.
The EM (Expectation-Maximization) algorithm was introduced in 1977, in a classic
paper by Dempster, Laird and Rubin [21]. Though the authors themselves state that
the method had been "proposed many times in special circumstances" by other
authors, the 1977 paper generalized the method and developed the theory behind it.
The algorithm is a general method of performing ML parameter estimation of an
underlying distribution from a given data set, when the data is incomplete or has
missing values. It is also used in many cases when optimizing the likelihood function
is analytically intractable, but the likelihood function can be simplified by assuming
the existence of missing or hidden parameters. The latter is the case with our problem
of Gaussian mixtures estimation.
The EM algorithm is an iterative algorithm, composed of two steps (as hinted in its
name) – The Expectation step and the Maximization step.
Let's assume we have a data set X, and we want to estimate the parameter set Θ. The
classic ML estimator will try to maximize the likelihood function:
$$L(\Theta) = L(\Theta|X) = f_X(X|\Theta) \qquad (3.20)$$
As mentioned earlier, there are many cases in which maximizing L(Θ) is analytically
intractable. In some of these cases, we can add another "hidden" data set Y and define
a new "complete" data set Z={X, Y}.
The new "complete-data" likelihood function will be:
$$L_{complete}(\Theta) = L_{complete}(\Theta|Z) = f_Z(Z|\Theta) = f_{X,Y}(X, Y|\Theta) \qquad (3.21)$$
And we denote the likelihood function of equation (3.20) by $L_{incomplete}(\Theta)$.
The new "hidden" data set Y is unknown, and it is actually a random variable (or
vector). Thus the complete data likelihood function is not deterministic, and can be
regarded as a function $L_{complete}(\Theta|X, Y) = h_{X,\Theta}(Y)$ (3.22) of a random variable Y
where X and Θ are constants. Now we can assume initial parameters $\Theta^0$ and
Page 29
compute the expectation of the complete-data likelihood function with respect to
random variable Y. We define
$$Q(\Theta^i, \Theta^{i-1}) = E_Y\left[L_{complete}(\Theta^i|X, Y)\,|\,X, \Theta^{i-1}\right] = E_Y\left[f_{X,Y}(X, Y|\Theta^i)\,|\,X, \Theta^{i-1}\right] \qquad (3.23)$$
And respectively the log-likelihood function:
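$$q(\Theta^i, \Theta^{i-1}) = E_Y\left[\log\left(L_{complete}(\Theta^i|X, Y)\right)|\,X, \Theta^{i-1}\right] = E_Y\left[\log\left(f_{X,Y}(X, Y|\Theta^i)\right)|\,X, \Theta^{i-1}\right] = \int_{y \in \Upsilon} \log\left(f_{X,Y}(x, y|\Theta^i)\right) f_{Y|X}(y|X, \Theta^{i-1})\, dy \qquad (3.24)$$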
In many cases, as in the Gaussian mixture case, although it is impossible to find an
analytical solution for maximizing the incomplete likelihood (or log-likelihood)
function, computing $q(\Theta^i, \Theta^{i-1})$ and maximizing it with respect to $\Theta^i$ is analytically
possible.
Fixing $\Theta^i$ as $\arg\max_{\Theta^i} q(\Theta^i, \Theta^{i-1})$, it can be proven that:
$$L_{incomplete}(\Theta^i) \ge L_{incomplete}(\Theta^{i-1}) \qquad (3.25)$$
And the process converges to a local maximum of the likelihood function.
We wish to state two important remarks on the issue of local maximum:
1. Notice that in the Gaussian mixture case (and others), the likelihood function is
not bounded and has several singularity points. One such singularity point, for
example, is when the mean of one of the Gaussians is equal to one of the samples,
and the Gaussian's variance tends to zero.
2. When the two Gaussians are too close to each other (the Rayleigh criterion can serve as a
rule of thumb), there might be many close local maxima, and the estimation result
might be very dependent on the initial values of the process. We encountered this
problem in some of the sectors' estimates, and we discuss this issue more
thoroughly in section 5.4.
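For the bimodal mixture, the two steps take the familiar closed form sketched below: the E-step computes, for every sample, the posterior probability of belonging to the first Gaussian, and the M-step re-estimates the five parameters as weighted averages. This is a minimal illustration rather than the implementation used in the thesis. The function names, the stopping rule, the variance floor `min_var` (a simple guard against the singularities of remark 1) and the synthetic data are all assumptions.

```python
import math
import random

def normal_pdf(x, mu, var):
    return math.exp(-((x - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def em_two_gaussians(data, p, mu1, var1, mu2, var2,
                     max_iter=300, tol=1e-6, min_var=1e-3):
    """EM for a two-Gaussian mixture; returns (p, mu1, var1, mu2, var2, log_likelihood)."""
    n = len(data)
    prev_ll = -math.inf
    log_lik = prev_ll
    for _ in range(max_iter):
        # E-step: posterior probability that each sample belongs to Gaussian 1.
        resp = []
        log_lik = 0.0
        for x in data:
            f1 = p * normal_pdf(x, mu1, var1)
            f2 = (1.0 - p) * normal_pdf(x, mu2, var2)
            resp.append(f1 / (f1 + f2))
            log_lik += math.log(f1 + f2)
        # Stop when the incomplete-data log-likelihood no longer improves.
        if log_lik - prev_ll < tol:
            break
        prev_ll = log_lik
        # M-step: weighted re-estimation of all five parameters.
        n1 = sum(resp)
        n2 = n - n1
        p = n1 / n
        mu1 = sum(r * x for r, x in zip(resp, data)) / n1
        mu2 = sum((1.0 - r) * x for r, x in zip(resp, data)) / n2
        var1 = max(sum(r * (x - mu1) ** 2 for r, x in zip(resp, data)) / n1, min_var)
        var2 = max(sum((1.0 - r) * (x - mu2) ** 2 for r, x in zip(resp, data)) / n2, min_var)
    return p, mu1, var1, mu2, var2, log_lik

# Synthetic data with assumed parameters: p = 0.5, means 105 and 127 dB, 7 dB std.
rng = random.Random(1)
data = [rng.gauss(105.0, 7.0) if rng.random() < 0.5 else rng.gauss(127.0, 7.0)
        for _ in range(3000)]
p_hat, mu1_hat, var1_hat, mu2_hat, var2_hat, ll = em_two_gaussians(
    data, 0.5, 100.0, 50.0, 130.0, 50.0)
print(p_hat, mu1_hat, var1_hat, mu2_hat, var2_hat)
```

The sketch has no guard against a component whose weight collapses to zero or against numerical underflow of both densities, so it is suitable only for reasonably initialized runs.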
Page 30
Chapter 4
The monitored system
The work presented in this thesis has been implemented on data records from an
actual UMTS cellular network in Israel. The network is comprised of 256 sectors, in
and around Beer-Sheva, a medium-size city in the south of Israel (population of
~185,000). Almost all of the sectors lie in the city area of about 100 km², a few lie
near main rural roads, and others in some small towns or villages. Among the sectors
which lie inside Beer-Sheva, some are located densely near the city center, and the
density becomes smaller towards the outskirts of the city (see Figure 4.1).
Figure 4.1 – Sector locations in the Beer-Sheva area¹
In general, cellular networks are quite complex, and contain many levels,
relationships, hierarchies, and interconnections between the different network
modules. To make things worse, different terminology is used to describe different
types of cellular networks, e.g. GSM, UMTS, CDMA etc.
Since the focus of this thesis is on path loss models, which should be similar and
easily applicable to all kinds of cellular network protocols, we define a simplified
model of a cellular network, which is suitable to our needs and can be easily
applied to any other cellular network protocol.
1 This figure is based on data from the cellular operator, which was analyzed using Schema Ltd. Ultima
Mentor software and presented using Google Earth software.
Page 31
The network model consists of three main levels:
1. The User Equipment (UE) –
This is the mobile user end equipment, usually a hand-held cellular phone.
While the appropriate term in 3rd generation systems (like UMTS) is UE, GSM
systems use the term Mobile Station (MS) with similar meaning.
2. The cellular Sector –
We use this term to describe the basic communication module which
communicates with the UEs. Usually in the literature, this is referred to as a cell
or a sector. Though there are sectors which cover 360°, in most cases 2-3
sectors are located at the same place, each covering a 180° or 120° slice
respectively, thus creating a full 360° cover. A network device called Node B
facilitates the wireless communication in the sectors. Usually each Node B is
responsible for a group of 1-3 sectors, all positioned in the same place. The
common term for Node B in GSM networks is the BTS (Base Transceiver
Station).
3. The radio network controller (RNC) –
This is the governing element in the network. UMTS networks use the term
RNC, and it is responsible for control of the Node Bs that are connected to it. It
also carries out radio resource management, some of the mobility / handover²
management functions and it's the point where encryption is done before user
data is sent to and from the mobile. GSM networks use a slightly different
network governing mechanism, and use the term BSC (Base Station Controller).
Usually the UEs' and sectors' radio measurements are logged in the RNCs.
Figure 4.2 – Typical trisector Node B units [27, 28]
Page 32
As this thesis was meant to be easily applicable, no special measurements were taken,
and only data which is regularly collected by the network was used.
We used GPEH (General Performance Event Handler) data, which is the format used
by Ericsson's UMTS equipment to monitor the network parameters and performance.
GPEH records are in fact messages which are sent over the network, and contain
numerous types of measurements. These messages are sent between different modules
in the network (UEs, Node Bs, RNCs etc.), and the cellular operator can decide
which messages will be saved for later analysis. Log files are created every 15
minutes, containing all the relevant messages from that time period.
As the name suggests, GPEH messages are event-triggered, meaning that
a message will be sent (and saved, if so defined) only when a specific event happens.
For example, messages containing radio signal power measurements are sent
whenever a handover process happens. We used these messages to measure the path
loss, by looking at the power of a pilot signal which is transmitted by the sectors at
known power levels (The power levels change from sector to sector, but these
changes are accounted for in the analysis process).
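The path loss computation described above amounts to subtracting the received pilot power from the known transmitted pilot power, and then binning the results into an empirical distribution. The sketch below illustrates this with hypothetical function and variable names and made-up power values. It does not parse GPEH records, and it ignores antenna gains and cable losses, whose treatment is not described here.

```python
import math
from collections import Counter

def path_loss_db(pilot_tx_power_dbm, pilot_rx_power_dbm):
    """Path loss in dB as transmitted minus received pilot power."""
    return pilot_tx_power_dbm - pilot_rx_power_dbm

def empirical_pdf(path_losses_db, bin_width_db=1.0):
    """Histogram normalized to a density: {bin start in dB: density per dB}."""
    counts = Counter(math.floor(pl / bin_width_db) for pl in path_losses_db)
    n = len(path_losses_db)
    return {k * bin_width_db: c / (n * bin_width_db) for k, c in sorted(counts.items())}

# Made-up (tx, rx) pilot power pairs in dBm, for illustration only.
reports = [(33.0, -85.0), (33.0, -72.5), (30.0, -91.2), (30.0, -90.6)]
losses = [path_loss_db(tx, rx) for tx, rx in reports]
density = empirical_pdf(losses)
print(losses, density)
```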
² A UMTS handover event occurs when a UE gains or loses connection to one of the sectors in its
vicinity. A hard handover is the event of changing the UE's "best-server" sector, while in a soft
handover event the "best-server" sector stays the same, but the list of the UE's available sectors
changes.
Page 33
Chapter 5
Result Analysis
5.1 Highly bi-modal sectors
We begin our result analysis by looking at sectors which show a highly bi-modal
behavior. Unsurprisingly, these sectors are located in places with large outdoor and
indoor areas. Such places were found near the city's central bus station, in large
shopping complexes with large outdoor parking areas and in the city's university area.
In the following figures we present empirical path-loss distributions (black), EM
estimation results (dashed blue), estimated parameters and the Kullback-Leibler
divergence between the modeled and empirical probability density functions.
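The Kullback-Leibler divergence quoted with each figure can be computed on a binned version of the two densities. The sketch below is one possible convention and not necessarily the one used for the reported values: it assumes 1 dB bins, the direction D(empirical ‖ model), and natural logarithms. The function names and the example parameters are hypothetical.

```python
import math

def normal_pdf(x, mu, sigma):
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma**2)) / (sigma * math.sqrt(2.0 * math.pi))

def gm_bin_probabilities(centers, p, mu1, sigma1, mu2, sigma2):
    """Mixture density sampled at the bin centers and normalized to sum to one."""
    raw = [p * normal_pdf(c, mu1, sigma1) + (1.0 - p) * normal_pdf(c, mu2, sigma2)
           for c in centers]
    total = sum(raw)
    return [r / total for r in raw]

def kl_divergence(p_bins, q_bins):
    """D(P || Q) = sum of p * ln(p / q) over bins with p > 0 (q must be positive there)."""
    return sum(p * math.log(p / q) for p, q in zip(p_bins, q_bins) if p > 0.0)

centers = [60.5 + k for k in range(100)]  # 1 dB bins from 60 to 160 dB
reference = gm_bin_probabilities(centers, 0.5, 107.0, 8.0, 126.0, 8.0)
shifted = gm_bin_probabilities(centers, 0.5, 108.0, 8.0, 126.0, 8.0)
kl_same = kl_divergence(reference, reference)
kl_shifted = kl_divergence(reference, shifted)
print(kl_same, kl_shifted)
```

In practice `reference` would be replaced by the normalized histogram of the measured path losses, and bins where the model probability is zero would need separate handling.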
The red lines represent the relevant sectors. Each "V" is located where the sector
antenna is located, directed at the same direction, and has a 60° opening (similar to
the 55°~70° horizontal widths of most of the urban sectors' antennas). The length of
the red lines does not represent the actual size of the coverage area (see section 5.6 for
more on sector coverage area estimation).
Figure 5.1 – Google Earth photo of the central bus station, sectors EKG153363 and
EKG142181
Page 34
Figure 5.2 – Measured Data Vs. EM Estimation of EKG153363
Black line – Measured Path Loss Distribution
Blue dashed line – Mixed Gaussian path loss distribution with parameters $(\mu_1, \sigma_1, \mu_2, \sigma_2, p)$
estimated by the EM algorithm
Figure 5.3 – Measured Data Vs. EM Estimation of EKG142181
Black line – Measured Path Loss Distribution
Blue dashed line – Mixed Gaussian path loss distribution with parameters $(\mu_1, \sigma_1, \mu_2, \sigma_2, p)$
estimated by the EM algorithm
estimated parameters (figure 5.2): $\hat{\mu}_1 = 101.2$, $\hat{\sigma}_1^2 = 103.7$, $\hat{\mu}_2 = 120.1$, $\hat{\sigma}_2^2 = 102.3$, $\hat{p} = 0.556$
Kullback divergence = 0.00073
estimated parameters (figure 5.3): $\hat{\mu}_1 = 96.8$, $\hat{\sigma}_1^2 = 75.8$, $\hat{\mu}_2 = 118.0$, $\hat{\sigma}_2^2 = 100.7$, $\hat{p} = 0.391$
Kullback divergence = 0.00046
Page 35
Figure 5.4 – Google Earth photo of the "BIG" shopping center, sector EKG141952
Figure 5.5 – Measured Data Vs. EM Estimation of EKG141952
Black line – Measured Path Loss Distribution
Blue dashed line – Mixed Gaussian path loss distribution with parameters $(\mu_1, \sigma_1, \mu_2, \sigma_2, p)$
estimated by the EM algorithm
estimated parameters (figure 5.5): $\hat{\mu}_1 = 107.7$, $\hat{\sigma}_1^2 = 61.9$, $\hat{\mu}_2 = 126.1$, $\hat{\sigma}_2^2 = 61.4$, $\hat{p} = 0.516$
Kullback divergence = 0.00026
Page 36
We wish to highlight a few interesting remarks:
1. The difference between the two estimated indoor Vs. outdoor Gaussian means
is ~18-20dB, as expected from penetration loss.
2. The variances of the Gaussians are similar to those mentioned in the literature
[6, 7], expressing a standard deviation of ~7-11dB.
3. The path loss distribution of figures 5.2 and 5.3, near the central bus station,
looks a bit like a tri-modal Gaussian mixture. This might be caused by an
indoor/outdoor/in-car discrepancy (despite the mandatory use of hands-free cell
phones in Israel, cellular calls from within buses can be considered 'regular' in-car
calls). Notice that near the shopping center (figure 5.5) the distribution
shows a much more explicit bi-modal behavior.
5.2 Highly uni-modal sectors
Our analysis found highly uni-modal sectors in most of the city's dense urban area.
Figures 5.6-5.9 show the location, distribution and estimation of some of these
sectors. Notice that due to a relatively high shadowing effect in dense urban areas, the
measured standard deviation of these distributions is around ~9-10dB. As was
presented in section 2, with these values of standard deviations, we expect to see
distributions very similar to Gaussian, and indeed our EM estimator produces very
high p̂ values.
Figure 5.6 – Google Earth photo of an urban area, sector EKG143613
Page 37
Figure 5.7 – Measured Data Vs. EM Estimation of EKG143613
Black line – Measured Path Loss Distribution
Blue dashed line – Mixed Gaussian path loss distribution with parameters $(\mu_1, \sigma_1, \mu_2, \sigma_2, p)$
estimated by the EM algorithm
(The small peak around 147dB (marked with a red circle in figure 5.7) is due to the
measurement equipment minimal power threshold, leading to a maximum possible
path loss measurement)
Figure 5.8 – Google Earth photo of urban area, sector EKG159853
estimated parameters (figure 5.7): $\hat{\mu}_1 = 106.2$, $\hat{\sigma}_1^2 = 64.6$, $\hat{\mu}_2 = 123.8$, $\hat{\sigma}_2^2 = 71.6$, $\hat{p} = 0.055$
Kullback divergence = 0.00015
Page 38
Figure 5.9 – Measured Data Vs. EM Estimation of EKG159853
Black line – Measured Path Loss Distribution
Blue dashed line – Mixed Gaussian path loss distribution with parameters $(\mu_1, \sigma_1, \mu_2, \sigma_2, p)$
estimated by the EM algorithm
5.3 Influence of repeaters
In areas which contain cellular repeaters, we should expect to see similar bi-modality
(or multi-modality of higher order) in the path loss distribution.
In UMTS, the cellular repeaters do not communicate with a specific sector, but rather
act as amplifiers for the UEs, and connect to the 'strongest' sectors in the area, much
like the UEs themselves. Therefore we can't examine a specific sector, and we need to
examine sectors which look towards the repeater location.
Figures 5.10-5.16 show path loss distributions in two areas which contain repeaters.
One area (figure 5.10) is in the city university campus. Since it contains large outdoor
and indoor areas as well, we should expect to see a more complex path loss behavior.
estimated parameters (figure 5.9): $\hat{\mu}_1 = 98.1$, $\hat{\sigma}_1^2 = 9.4$, $\hat{\mu}_2 = 121.5$, $\hat{\sigma}_2^2 = 81.9$, $\hat{p} = 0.002$
Kullback divergence = 0.00026
Page 39
Figure 5.10 – Google Earth photo of the university area sectors and repeaters
Figure 5.11 – Measured Data Vs. EM Estimation of EKG130711
Black line – Measured Path Loss Distribution
Blue dashed line – Mixed Gaussian path loss distribution with parameters $(\mu_1, \sigma_1, \mu_2, \sigma_2, p)$
estimated by the EM algorithm
estimated parameters (figure 5.11): $\hat{\mu}_1 = 100.9$, $\hat{\sigma}_1^2 = 64.6$, $\hat{\mu}_2 = 119.1$, $\hat{\sigma}_2^2 = 100.7$, $\hat{p} = 0.304$
Kullback divergence = 0.00055
Page 40
Figure 5.12 – Measured Data Vs. EM Estimation of EKG131731
Black line – Measured Path Loss Distribution
Blue dashed line – Mixed Gaussian path loss distribution with parameters $(\mu_1, \sigma_1, \mu_2, \sigma_2, p)$
estimated by the EM algorithm
Figures 5.13-5.16 show an industrial area in the outskirts of the city. We look at the
three sectors of EKG148911/2/3, and see a clear difference in the distributions of the
sectors looking toward the repeater. Regarding these sectors (EKG148912 and
EKG148913), the EM estimation results were ambiguous and dependent on the
estimation initial values. All EM runs estimated a Gaussian with ~90% mixing
probability and a mean around ~114-117dB (EKG148912) and ~117-120dB
(EKG148913). However, for some initial values the second Gaussian was estimated
to be around 110dB (EKG148912) and 112dB (EKG148913), and for others the
second Gaussian was around 134dB (EKG148912) and 136dB (EKG148913). This
might be caused by a multi-modality of a higher order in the distributions (a tri-modal
mixture of indoor/outdoor/repeater might be expected in such sectors), and can be solved by
integrating additional information (such as the expected indoor/outdoor path loss means),
by calibrating the model, by taking the Kullback-Leibler divergence into account or by other
estimation techniques for distributions with multi-modality of a higher order (see
section 6.2).
estimated parameters (figure 5.12): $\hat{\mu}_1 = 106.9$, $\hat{\sigma}_1^2 = 62.0$, $\hat{\mu}_2 = 123.6$, $\hat{\sigma}_2^2 = 57.5$, $\hat{p} = 0.502$
Kullback divergence = 0.00151
Page 41
Figure 5.13 – Google Earth photo of the area around EKG148911/2/3
Figure 5.14 – Measured Data Vs. EM Estimation of EKG148911
Black line – Measured Path Loss Distribution
Blue dashed line – Mixed Gaussian path loss distribution with parameters $(\mu_1, \sigma_1, \mu_2, \sigma_2, p)$
estimated by the EM algorithm
estimated parameters (figure 5.14): $\hat{\mu}_1 = 110.8$, $\hat{\sigma}_1^2 = 111.7$, $\hat{\mu}_2 = 123.2$, $\hat{\sigma}_2^2 = 90.2$, $\hat{p} = 0.039$
Kullback divergence = 0.00073
Page 42
Figure 5.15 – Measured Data Vs. EM Estimation of EKG148912
Black line – Measured Path Loss Distribution
Blue dashed line – Mixed Gaussian path loss distribution with parameters $(\mu_1, \sigma_1, \mu_2, \sigma_2, p)$
estimated by the EM algorithm (first option)
Red dashed line – Mixed Gaussian path loss distribution with parameters $(\mu_1, \sigma_1, \mu_2, \sigma_2, p)$
estimated by the EM algorithm (second option)
estimated parameters (blue line): $\hat{\mu}_1 = 114.4$, $\hat{\sigma}_1^2 = 110.0$, $\hat{\mu}_2 = 134.3$, $\hat{\sigma}_2^2 = 37.3$, $\hat{p} = 0.905$
Kullback divergence = 0.0022
estimated parameters (red line): $\hat{\mu}_1 = 110.0$, $\hat{\sigma}_1^2 = 22.0$, $\hat{\mu}_2 = 117.1$, $\hat{\sigma}_2^2 = 145.7$, $\hat{p} = 0.11$
Kullback divergence = 0.0013
Page 43
Figure 5.16 – Measured Data Vs. EM Estimation of EKG148913
Black line – Measured Path Loss Distribution
Blue dashed line – Mixed Gaussian path loss distribution with parameters $(\mu_1, \sigma_1, \mu_2, \sigma_2, p)$
estimated by the EM algorithm (first option)
Red dashed line – Mixed Gaussian path loss distribution with parameters $(\mu_1, \sigma_1, \mu_2, \sigma_2, p)$
estimated by the EM algorithm (second option)
5.4 Dependence of initial values
As mentioned in section 3.4, the EM algorithm gets initial values as an input for the
iterative estimation process. Therefore we must validate that our estimation is
independent of these initial values. In fact we want to validate that the likelihood
function has a clear global maximum, and in cases with different local maxima we
want to be able to choose the right local maximum, describing the actual path loss
distribution.
Figures 5.17 and 5.19 show different estimation results from 100 different initial
means and variances (chosen randomly), for the path loss distributions of sector
EKG141952 (see figure 5.5) and EKG148912 (see figure 5.15). The blue and red dots
represent the log-likelihood value and Kullback-Leibler divergence, respectively,
achieved in the last step of the EM algorithm. In the background we plot the
histogram of p̂ , the estimated indoor/outdoor mixing probability.
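The experiment described above, in which EM is restarted from many random initial means and variances and the final log-likelihoods are compared, can be sketched as follows. This is an illustrative reconstruction under assumptions, not the thesis code: the compact `run_em` routine, the ranges of the random initial values, the number of restarts and the synthetic data are all hypothetical, and the initial mixing probability is fixed at 0.5.

```python
import math
import random

def normal_pdf(x, mu, var):
    return math.exp(-((x - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def run_em(data, p, mu1, var1, mu2, var2, max_iter=150, tol=1e-6, min_var=1.0):
    """Compact two-Gaussian EM; returns (log_likelihood, p, mu1, var1, mu2, var2)."""
    n = len(data)
    prev_ll = -math.inf
    log_lik = prev_ll
    for _ in range(max_iter):
        resp, log_lik = [], 0.0
        for x in data:
            f1 = p * normal_pdf(x, mu1, var1)
            f2 = (1.0 - p) * normal_pdf(x, mu2, var2)
            total = max(f1 + f2, 1e-300)  # guard against numerical underflow
            resp.append(f1 / total)
            log_lik += math.log(total)
        if log_lik - prev_ll < tol:
            break
        prev_ll = log_lik
        n1 = min(max(sum(resp), 1e-9), n - 1e-9)  # keep both components alive
        n2 = n - n1
        p = n1 / n
        mu1 = sum(r * x for r, x in zip(resp, data)) / n1
        mu2 = sum((1.0 - r) * x for r, x in zip(resp, data)) / n2
        var1 = max(sum(r * (x - mu1) ** 2 for r, x in zip(resp, data)) / n1, min_var)
        var2 = max(sum((1.0 - r) * (x - mu2) ** 2 for r, x in zip(resp, data)) / n2, min_var)
    if mu1 > mu2:  # report the lower-mean Gaussian as component 1
        p, mu1, var1, mu2, var2 = 1.0 - p, mu2, var2, mu1, var1
    return log_lik, p, mu1, var1, mu2, var2

def multi_start_em(data, n_starts, rng):
    """Run EM from random initial means and variances; best log-likelihood first."""
    lo, hi = min(data), max(data)
    runs = []
    for _ in range(n_starts):
        init = (0.5, rng.uniform(lo, hi), rng.uniform(10.0, 150.0),
                rng.uniform(lo, hi), rng.uniform(10.0, 150.0))
        runs.append(run_em(data, *init))
    return sorted(runs, reverse=True)

# Synthetic data with assumed parameters: p = 0.5, means 105 and 127 dB, 7 dB std.
rng = random.Random(2)
data = [rng.gauss(105.0, 7.0) if rng.random() < 0.5 else rng.gauss(127.0, 7.0)
        for _ in range(600)]
runs = multi_start_em(data, 8, rng)
best = runs[0]
print(best)
```

Comparing the spread of `p` across `runs` gives the kind of histogram plotted in the background of the figures. When several runs end at clearly different parameter sets with similar log-likelihoods, additional information is needed to choose between them.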
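estimated parameters (figure 5.16, blue line): $\hat{\mu}_1 = 117.3$, $\hat{\sigma}_1^2 = 100.4$, $\hat{\mu}_2 = 135.6$, $\hat{\sigma}_2^2 = 34.4$, $\hat{p} = 0.90$
Kullback divergence = 0.0045
estimated parameters (figure 5.16, red line): $\hat{\mu}_1 = 111.7$, $\hat{\sigma}_1^2 = 18.4$, $\hat{\mu}_2 = 119.9$, $\hat{\sigma}_2^2 = 128.1$, $\hat{p} = 0.10$
Kullback divergence = 0.0023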
Page 44
Figure 5.17 – EM estimation with different initial parameters values, EKG141952
Blue dots – Log Likelihood final value of each algorithm run
Red dots – Kullback-Leibler Divergence of algorithm run
Background blue histogram – p̂ estimation histogram
The two 'groups' of estimated values marked by the black arrows represent the same
global maximum of the likelihood function. They arise because the algorithm runs stop
while approaching this maximum from different 'directions', as shown in figure 5.18.
Figure 5.18 – EM Algorithm steps of two different runs (right and left), EKG141952
Blue dots – Log Likelihood value of each step
Red dots – Kullback-Leibler Divergence of each step
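left side estimated parameters: $\hat{\mu}_1 = 107.5$, $\hat{\sigma}_1^2 = 60.3$, $\hat{\mu}_2 = 125.8$, $\hat{\sigma}_2^2 = 63.0$, $\hat{p} = 0.50304$
right side estimated parameters: $\hat{\mu}_1 = 107.9$, $\hat{\sigma}_1^2 = 63.3$, $\hat{\mu}_2 = 126.3$, $\hat{\sigma}_2^2 = 59.9$, $\hat{p} = 0.52712$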
Page 45
Figure 5.19 – EM estimation with different initial parameters values, EKG148912
Blue dots – Log Likelihood function of each algorithm run
Red dots – Kullback-Leibler Divergence of algorithm run
Background blue histogram – p̂ estimation histogram
In figure 5.19, we see two relevant estimations (i.e. two relevant local maxima of the
likelihood function): one estimates $\hat{p} \approx 0.1$, $\hat{\mu}_1 \approx 110$ dB, $\hat{\mu}_2 \approx 117$ dB, and the
other estimates $\hat{p} \approx 0.9$, $\hat{\mu}_1 \approx 114$ dB, $\hat{\mu}_2 \approx 134$ dB (see estimated PDFs in figure
5.15). While the first estimation yields a smaller Kullback-Leibler divergence, it
doesn't have a higher log-likelihood value. In such cases we should combine other
information in order to decide which estimation to choose. This information can be
the expected outdoor path loss mean, the expected difference between the indoor and
outdoor Gaussian means etc.
Although some of the sectors exhibited behaviors similar to that presented in figure
5.19, most analyzed cases had a clear global maximum and yielded a clear estimation,
independent of the EM algorithm initial parameters values.
5.5 EM results for pure Gaussian and theoretical model distributions
As a sanity check, we want to test our EM estimator with a pure Gaussian distribution
and with distributions given by the theoretical computation of section 2 (equation
(2.9)).
Figure 5.20 shows results of the EM estimation for a pure Gaussian distribution. We
see that the EM algorithm apparently does not converge to the "right" solution (i.e.
p̂ close to zero or unity). However, further analysis of the results shows that the EM
Page 46
algorithm would eventually have converged to zero or unity if we had let it
continue running. Because the tested distribution is pure Gaussian, the estimation fits
the data well for many different p̂ values, and the algorithm stopping conditions are
met at a relatively early stage. In addition, we see some abnormal results,
marked with black arrows, which we believe are caused by the finiteness of the data
and the excellent fit that can be achieved for a unimodal Gaussian, by a combination
of two other close Gaussians.
Figure 5.20 – EM estimation with different initial parameters values, pure Gaussian
data
Blue dots – Log Likelihood function of each algorithm run
Red dots – Kullback-Leibler Divergence of algorithm run
Figure 5.21 shows results of the EM estimation for the theoretical distribution of
equation (2.9), with a standard deviation of 9dB. We see that although the distribution is
similar to pure Gaussian, we don't see the phenomenon of figure 5.20, and the EM
algorithm estimates similar p values for different initial values. The fact that the
algorithm estimates a ~80% indoor ratio even for the "purely indoor" (or outdoor)
theoretical distribution of equation (2.9), places a limit on the validity of the
indoor/outdoor estimation to clear bi-modal distributions. Calibration of the model
(see section 6.2.1) might lead to better estimation of the indoor/outdoor ratio, in
general and specifically for distributions with low bi-modality.
Page 47
Figure 5.21 – EM estimation with different initial parameters values, equation (2.9)
theoretical computation
Blue dots – Log Likelihood function of each algorithm run
Red dots – Kullback-Leibler Divergence of each algorithm run
5.6 Sector coverage area estimation
Using the theoretical computation of the path loss distribution, which was presented
in section 2 (figures 2.2, 2.4, and equation (2.9)), we can estimate the sector's
maximal radius. Combined with the antenna's aperture and location, we can
empirically estimate the actual geographical area covered by the sector. As mentioned
in section 2, real-life sectors usually do not have clear and "nice" shapes or borders.
However, the coverage area estimation presented here enables a preliminary
estimation, which is useful when no other data (such as "drive-tests") is available, or
when easily accessible or frequent coverage area estimation is needed.
Figure 5.22 shows the estimated coverage map of central Beer-Sheva, overlaid on a
Google Earth photo. The semi-transparent slices represent the sectors' coverage areas.
Each slice is positioned according to the position and aperture of the sector's antenna.
The mean and variance of the "outdoor" Gaussian, estimated by the EM algorithm, are
used to determine the sector's maximal radius.
Figure 5.22 – Beer-Sheva city center coverage map estimation
Figure 5.23 shows a Google Earth photo and coverage map estimation of Omer, a
small village outside of Beer-Sheva, and of the roads near it. As expected, we see
larger sectors in rural areas, with fewer overlaps between sectors (notice that both
figures are identically scaled).
Figure 5.23 – Coverage map estimation outside of Beer-Sheva
The estimation results presented here are not calibrated, and the model constants were
taken from the UMTS model of ITU [7]. We address the calibration issue in section
6.2.1, and believe that calibrating the model parameters will lead to much more
accurate results.
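As a rough illustration of the geometry used in this section, the sketch below inverts a power-law path loss model to obtain a maximal radius and computes the area of the resulting "pie slice". The default constants (128.1dB intercept at 1 km, 37.6dB per decade) are the values commonly quoted for the UMTS 30.03 vehicular test environment at 2 GHz; that these are the constants used for the figures above is an assumption. The input `edge_path_loss_db` is a hypothetical parameter: the way it is derived from the mean and variance of the "outdoor" Gaussian follows section 2 and is not reproduced here.

```python
import math

def max_radius_km(edge_path_loss_db, intercept_db=128.1, slope_db_per_decade=37.6):
    """Invert L = intercept + slope * log10(R_km) to get the radius R in km.

    The default constants are assumed values, not calibrated ones.
    """
    return 10.0 ** ((edge_path_loss_db - intercept_db) / slope_db_per_decade)

def sector_area_km2(radius_km, aperture_deg):
    """Area of a "pie slice" sector with the given radius and angular aperture."""
    return 0.5 * math.radians(aperture_deg) * radius_km ** 2

# Hypothetical example: a 120-degree sector whose edge path loss is 140 dB.
radius = max_radius_km(140.0)
print(f"radius = {radius:.2f} km, area = {sector_area_km2(radius, 120.0):.2f} km^2")
```

Because the radius depends exponentially on the path loss, a few dB of error in the constants changes the estimated area considerably, which is one reason calibration matters for this application.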
Chapter 6
Conclusions and suggestions for future research
6.1 Conclusions
In this thesis we presented a new model for "sector-to-users" path loss distributions,
based on a theoretical computation and empirical data.
The computation assumed uniform spatial distribution of the users, a power law decay
of the radio signal and a "pie slice" shaped sector.
We then compared the computation to a Gaussian model, and showed both
theoretically and empirically that the Gaussian model is valid in most cases, where the
shadowing effect standard deviation is above ~5dB and the indoor/outdoor mixing is
low.
The complete model incorporates the indoor/outdoor differences, and the effect of
indoor penetration losses on the path loss distribution. A Gaussian mixture (GM)
model was suggested, and the problem of estimating GM parameters was studied. We
developed two simple estimators for estimating the GM mixing probability with
known means and variances, and compared them with a third estimator developed by
Dattatreya and Kanal [19].
Since the means and variances of the two Gaussians are usually unknown, we studied
the Expectation-Maximization (EM) algorithm for estimating the distribution's
parameters. The EM estimator was used with empirical data from an actual UMTS
cellular network in Israel, and results were analyzed for different sectors. We
presented examples of sectors with highly bi-modal behavior, located in areas with
high indoor/outdoor mixing, and sectors with highly uni-modal behavior, located in
dense urban areas with a high indoor rate.
Two main applications have resulted from the research:
1. Estimation of the sector's physical coverage area. Since inter-sector
interference is one of the key factors influencing network capacity, knowing
the sectors' real physical area and boundaries is a highly important goal. Our
novel method enables the estimation of the physical boundaries without
relying on positioning data (based on time of arrival, received signal strength
etc.), which is sometimes unavailable or hard to obtain in cellular
networks.
2. Estimation of a sector's indoor/outdoor call ratio from path loss data which is
automatically measured and logged by cellular operators. The indoor/outdoor
ratio is important both in design of cellular networks (e.g. installation of
indoor sectors or repeaters in certain places), and in optimization of network
parameters.
In addition, this thesis presented a theoretical background with quantitative validity
boundaries (shadowing effect > ~5dB), for describing "sector-to-users" path loss
distributions as Gaussian.
6.2 Suggestions for future research
6.2.1 Calibrating the models presented in this thesis
Calibration of the models is important mainly for estimating the sectors' boundaries
and the indoor/outdoor ratio. It can be done in several ways, though two ways seem
more practical than others:
I. "Drive-tests" – "Drive-test" data can be used to estimate the outdoor path loss
distribution in a given sector, with a uniform or other spatial distribution of the
calls. "Drive-test" data can also be used to measure the physical boundaries of
sectors, at least for outdoor calls, and thus calibrate the sectors' boundary
estimation.
II. Geographic location data – Geographic location data of the calls can be used to
calibrate the models for both the indoor/outdoor and sectors' boundary
applications, in addition to other uses which are discussed in section 6.2.4.
6.2.2 Studying the convergence rate of the EM algorithm and developing techniques
for its acceleration
The convergence rate of the EM algorithm was not studied in depth in this
thesis, though it is widely studied in the literature (for example, see [26]). We did
notice that the EM convergence rate was quite slow for many of the sectors,
especially those with low bi-modality. Practical applications which use
indoor/outdoor ratio estimation will need to be based on a faster version of the EM
estimator. This might be achieved by using standard EM acceleration techniques, or
by using methods specifically "tailored" for the indoor/outdoor case (such as choosing
good initial values, relying on known penetration losses etc.).
6.2.3 Using the models as a new "measuring tool"
The models presented in this thesis constitute a basis for future analysis of the
"sector-to-users" path loss behavior. An interesting research direction might be
analyzing the path loss distribution, the sector's size or the indoor/outdoor ratio at
different hours of the day, on different days of the week, in different seasons etc.
The path loss distribution will also be affected by non-routine events, such as large
gatherings ("hot spots") or even environmental phenomena such as heavy rainfall, snow
etc. (for more on environmental monitoring by cellular networks, see [29]).
6.2.4 Combining location and path loss data
As cellular localization has become widespread in recent years, as has the use of GPS
receivers in cellular mobile phones, combining the geographic location of the call
with path loss and other power control data can lead to promising results. Model
calibration, more accurate indoor/outdoor ratio estimation and further improvements
in the power management processes in cellular networks can be achieved. In our
opinion, the lack of easily accessible data is the current bottleneck for this kind of
research. However, we believe that this bottleneck will be removed in the coming
few years and a whole field of research opportunities will become available.
6.2.5 Mobility estimation, in-car path loss behavior and discrimination
Another future research direction might be expanding the current indoor/outdoor ratio
estimation to indoor/outdoor/in-car discrimination. Using localization data, path loss
or other power control data (see Villebrun et al., [8]), a better indoor/outdoor/in-car
discrimination might be achieved, statistically or per single call. This in turn can lead
to further improvements in the efficiency of the network.
6.2.6 Integrating additional information into the Gaussian mixture estimation
In section 3.3 we discussed estimators for Gaussian mixture distributions with known
means and variances. In section 3.4 we presented the EM algorithm for distributions
with unknown means, variances and mixing probability. As was mentioned in section
3.4, additional information can be used to improve these estimations. Some
constraints can be suggested, like a certain difference between the means of the indoor
and outdoor Gaussians, limits or even specific values for the variances of the
Gaussians etc. Information about the physical size of the sectors' area (from separate
data, such as geographic location of the calls or the network sectors), can help
estimate the Gaussian means separately, and thus help the EM algorithm or even
estimate the indoor/outdoor mixing probability using simpler estimators.
Appendix A
CRLB Computation
In section 3.3, we examined three estimators for estimating the prior probability p for
a mixture of two Gaussians with known means and variances ($\hat{p}_{MLC}$ – equation 3.6,
$\hat{p}_{MLC\_unbiased}$ – equation 3.11 and $\hat{p}_{PPE}$ – equation 3.16).
We now compute the Cramer-Rao lower bound of the problem.
For the CRLB computation we make the following definitions:
$y$ – A random variable which represents the path loss in a given measurement.
$\mathbf{y}$ – A random vector of N path loss measurements.
$f_1$ – The path loss PDF for outdoor measurements.
$f_2$ – The path loss PDF for indoor measurements.
$f_Y$ – The total path loss PDF.
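Thus
$$f_1(y) = f_Y(y \mid \text{outdoor}) = \frac{1}{\sqrt{2\pi\sigma_1^2}}\, e^{-\frac{(y-\mu_1)^2}{2\sigma_1^2}} \tag{A.1}$$
$$f_2(y) = f_Y(y \mid \text{indoor}) = \frac{1}{\sqrt{2\pi\sigma_2^2}}\, e^{-\frac{(y-\mu_2)^2}{2\sigma_2^2}} \tag{A.2}$$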
And
$$f_Y(y) = p f_1(y) + (1-p) f_2(y) = \frac{p}{\sqrt{2\pi\sigma_1^2}}\, e^{-\frac{(y-\mu_1)^2}{2\sigma_1^2}} + \frac{1-p}{\sqrt{2\pi\sigma_2^2}}\, e^{-\frac{(y-\mu_2)^2}{2\sigma_2^2}} \tag{A.3}$$
We assume the measurements are statistically independent, and therefore:
$$f_{\mathbf{Y}}(\mathbf{y}) = \prod_{n=1}^{N} f_Y(y_n) \tag{A.4}$$
The likelihood and log-likelihood functions are, respectively:
$$L(p) = f_{\mathbf{Y}|p}(\mathbf{y} \mid p) = \prod_{n=1}^{N} f_{Y|p}(y_n \mid p) \tag{A.5}$$
$$l(p) = \ln\left(f_{\mathbf{Y}|p}(\mathbf{y} \mid p)\right) = \sum_{n=1}^{N} \ln\left(f_{Y|p}(y_n \mid p)\right) \tag{A.6}$$
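The first derivative:
$$
\begin{aligned}
\frac{\partial l(p)}{\partial p}
&= \frac{\partial}{\partial p} \sum_{n=1}^{N} \ln\left(f_{Y|p}(y_n \mid p)\right)
 = \sum_{n=1}^{N} \frac{\partial f_{Y|p}(y_n \mid p) / \partial p}{f_{Y|p}(y_n \mid p)} \\
&= \sum_{n=1}^{N} \frac{\frac{1}{\sqrt{2\pi\sigma_1^2}}\, e^{-\frac{(y_n-\mu_1)^2}{2\sigma_1^2}} - \frac{1}{\sqrt{2\pi\sigma_2^2}}\, e^{-\frac{(y_n-\mu_2)^2}{2\sigma_2^2}}}
 {\frac{p}{\sqrt{2\pi\sigma_1^2}}\, e^{-\frac{(y_n-\mu_1)^2}{2\sigma_1^2}} + \frac{1-p}{\sqrt{2\pi\sigma_2^2}}\, e^{-\frac{(y_n-\mu_2)^2}{2\sigma_2^2}}} \\
&= \sum_{n=1}^{N} \frac{\frac{1}{\sqrt{2\pi\sigma_1^2}}\, e^{-\frac{(y_n-\mu_1)^2}{2\sigma_1^2}} - \frac{1}{\sqrt{2\pi\sigma_2^2}}\, e^{-\frac{(y_n-\mu_2)^2}{2\sigma_2^2}}}
 {p \left( \frac{1}{\sqrt{2\pi\sigma_1^2}}\, e^{-\frac{(y_n-\mu_1)^2}{2\sigma_1^2}} - \frac{1}{\sqrt{2\pi\sigma_2^2}}\, e^{-\frac{(y_n-\mu_2)^2}{2\sigma_2^2}} \right) + \frac{1}{\sqrt{2\pi\sigma_2^2}}\, e^{-\frac{(y_n-\mu_2)^2}{2\sigma_2^2}}} \\
&= \sum_{n=1}^{N} \frac{A_n}{p A_n + B_n}
\end{aligned}
\tag{A.7}
$$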
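where $A_n$ and $B_n$ are defined as:
$$A_n = \frac{1}{\sqrt{2\pi\sigma_1^2}}\, e^{-\frac{(y_n-\mu_1)^2}{2\sigma_1^2}} - \frac{1}{\sqrt{2\pi\sigma_2^2}}\, e^{-\frac{(y_n-\mu_2)^2}{2\sigma_2^2}}, \qquad B_n = \frac{1}{\sqrt{2\pi\sigma_2^2}}\, e^{-\frac{(y_n-\mu_2)^2}{2\sigma_2^2}} \tag{A.8}$$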
Thus the second derivative will be:
$$\frac{\partial^2 l(p)}{\partial p^2} = -\sum_{n=1}^{N} \frac{A_n^2}{\left(p A_n + B_n\right)^2} \tag{A.9}$$
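And now we can compute Fisher's information:
$$
\begin{aligned}
J(p) &= -E\left[\frac{\partial^2 l(p)}{\partial p^2}\right]
 = E\left[\sum_{n=1}^{N} \frac{A_n^2}{\left(p A_n + B_n\right)^2}\right] \\
&= \sum_{n=1}^{N} E\left[\frac{A_n^2}{\left(p A_n + B_n\right)^2}\right]
 = N \cdot E\left[\frac{A^2}{\left(p A + B\right)^2}\right]
\end{aligned}
\tag{A.10}
$$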
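For
$$A = \frac{1}{\sqrt{2\pi\sigma_1^2}}\, e^{-\frac{(y-\mu_1)^2}{2\sigma_1^2}} - \frac{1}{\sqrt{2\pi\sigma_2^2}}\, e^{-\frac{(y-\mu_2)^2}{2\sigma_2^2}}, \qquad B = \frac{1}{\sqrt{2\pi\sigma_2^2}}\, e^{-\frac{(y-\mu_2)^2}{2\sigma_2^2}} \tag{A.11}$$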
And the Cramer-Rao lower bound:
$$CRLB(p) = \frac{1}{J(p)} \tag{A.12}$$
Although we did not obtain a closed-form analytic expression for the CRLB, it can
easily be computed numerically for any set of parameters.
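A minimal sketch of such a numerical computation is given below. It evaluates the expectation in (A.10) as the integral of A²/(pA + B) over y (since the density of y is pA + B), using a trapezoidal rule on a fixed grid. The function names, the grid limits (ten standard deviations around each mean), the grid size and the example parameter values are assumptions chosen for illustration.

```python
import numpy as np

def gaussian_pdf(y, mu, sigma):
    """Normal density N(mu, sigma^2) evaluated at y."""
    return np.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def crlb_mixing_probability(p, mu1, sigma1, mu2, sigma2, n_samples, grid_points=200_001):
    """CRLB for p in f_Y = p*f1 + (1-p)*f2 with known means and variances.

    Per-sample Fisher information: E[A^2 / (pA + B)^2] = integral of A^2 / f_Y dy,
    with A = f1 - f2 and B = f2, as in (A.10) and (A.11).
    """
    lo = min(mu1 - 10.0 * sigma1, mu2 - 10.0 * sigma2)
    hi = max(mu1 + 10.0 * sigma1, mu2 + 10.0 * sigma2)
    y = np.linspace(lo, hi, grid_points)
    f1 = gaussian_pdf(y, mu1, sigma1)
    f2 = gaussian_pdf(y, mu2, sigma2)
    a = f1 - f2
    f_y = p * f1 + (1.0 - p) * f2
    # Skip grid points where the density underflows to zero (avoids 0/0).
    integrand = np.zeros_like(y)
    mask = f_y > 0.0
    integrand[mask] = a[mask] ** 2 / f_y[mask]
    # Trapezoidal rule on the uniform grid.
    dy = y[1] - y[0]
    fisher_per_sample = np.sum(0.5 * (integrand[1:] + integrand[:-1])) * dy
    return 1.0 / (n_samples * fisher_per_sample)

# Hypothetical example: outdoor N(120, 9^2), indoor N(135, 9^2), p = 0.5, N = 1000.
bound = crlb_mixing_probability(0.5, 120.0, 9.0, 135.0, 9.0, 1000)
print(f"CRLB(p) = {bound:.3e}, standard deviation bound = {np.sqrt(bound):.3e}")
```

As a check on the sketch, when the two Gaussians are far apart the bound approaches p(1-p)/N, the variance of the sample proportion when every measurement can be classified without error; overlapping Gaussians give a larger bound.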
Bibliography
[1] Doble, J. Introduction to radio propagation for fixed and mobile
communication, Artech House Publishers, 1996.
[2] Blaunstein, N. Radio propagation in cellular networks, Artech House
Publishers, 2000.
[3] Steele, R. Mobile Radio Communications, Pentech Press Publishers, 1992.
[4] Stüber, G.L. Principles of Mobile Communication, 2nd Ed., Kluwer Academic
Publishers, 2001.
[5] Rappaport, T.S. Wireless Communications Principles and Practice, IEEE
Press, 1996.
[6] Alaya-Feki, A.B.H. et al. Optimization of radio measurements exploitation in
wireless mobile networks, Journal of Communications, Vol. 2, No. 7,
December 2007.
[7] TR 101 112 (UMTS 30.03 version 3.2.0): "Universal Mobile
Telecommunications System (UMTS); Selection procedures for the choice of
radio transmission technologies of the UMTS", 1998-04.
(http://www.itu.int/itudoc/itu-r/archives/rsg/1998-00/rtg8-1/42763.html)
[8] Villebrun, E. et al. Indoor outdoor user discrimination in mobile wireless
networks, IEEE Vehicular Technology Conference VTC 2006 Fall, pages 1-5,
2006.
[9] Okumura, Y. et al. Field Strength and Its Variability in VHF and UHF Land-
Mobile Radio Service, Review of the Electrical Communication Laboratory.
(Japan), Sept./Oct. 1968, pp. 825-873.
[10] Hata, M. Empirical Formula for Propagation Loss in Land Mobile Radio
Services, IEEE Transactions on Vehicular Technology, VT-29, pp. 317 - 325,
1980.
[11] Bharucha, Z. and H. Haas, The distribution of path losses for uniformly
distributed nodes in a circle, Research Letters in Communications, Vol. 2008,
Article ID 376895.
[12] Zhu, J. and G.D. Durgin, Indoor/outdoor location of cellular handsets based
on received signal strength, Electronics Letters, Vol. 41, No. 1, 6th January
2005.
[13] Papoulis, A. Probability, Random Variables and Stochastic Processes,
McGraw-Hill, 2006.
[14] Van Trees, H.L. Detection, Estimation and Modulation Theory, Part I, Wiley-
Interscience Publication, 2001.
[15] Pearson, K. Contribution to the mathematical theory of evolution,
Philosophical Transactions (1894), A 185, 71-110.
[16] Everitt, B.S. Maximum Likelihood Estimation of the Parameters in a Mixture
of Two Univariate Normal Distributions; A Comparison of Different
Algorithms, The Statistician, Vol. 33, No. 2, (Jun., 1984), pp. 205-215.
[17] Dick, N.P. and D.C. Bowden, Maximum Likelihood Estimation for Mixtures
of Two Normal Distributions, Biometrics, Vol. 29, No. 4, (Dec., 1973), pp.
781-790.
[18] Dattatreya, G.R. Gaussian mixture parameter estimation with known means
and unknown class-dependent variances, Pattern Recognition 35 (2002) 1611–
1616.
[19] Dattatreya, G.R. and L.N. Kanal, Estimation of Mixing Probabilities in
Multiclass Finite Mixtures, IEEE Transactions on Systems, Man and
Cybernetics, VOL. 20, NO. 1, JANUARY/FEBRUARY 1990.
[20] Dattatreya, G.R. and Xiaori (Frank) Fang, Parameter estimation: known vector
signals in unknown Gaussian noise, Pattern Recognition 36 (2003) 2317 –
2332.
[21] Dempster, A.P. et al. Maximum likelihood from incomplete data via the EM
algorithm, Journal of the Royal Statistical Society. Series B, Vol. 39, 1977.
[22] Bilmes, J. A. A gentle tutorial of the EM algorithm and its application to
parameter estimation for Gaussian mixture and hidden Markov models,
University of Berkeley, Tech. Rep., 1998.
[23] Prescher, D. A tutorial on the expectation-maximization algorithm including
maximum-likelihood estimation and EM training of probabilistic context-free
grammars, March 2005. [Online]. Available: http://arxiv.org/abs/cs/0412015
[24] Redner, R.A. and H.F. Walker, Mixture densities, maximum likelihood and
the EM algorithm, SIAM Review, Vol. 26, No. 2, pp. 195-239, 1984. [Online].
Available: http://dx.doi.org/10.2307/2030064.
[25] McCulloch, C.E. Maximum Likelihood Algorithms for Generalized Linear
Mixed Models, Journal of the American Statistical Association, Vol. 92, No.
437 (Mar., 1997), pp. 162-170.
[26] Jamshidian, M. and R. I. Jennrich, Acceleration of the EM algorithm by using
quasi-Newton methods, Journal of the Royal Statistical Society. Series B
(Methodological), Vol. 59, no. 3, pp. 569-587, 1997. [Online]. Available:
http://www.jstor.org/stable/2346010
[27] Trisector antenna photo #1 –
http://en.wikipedia.org/wiki/File:CellPhoneTower_OR.jpg
[28] Trisector antenna photo #2 – http://www.mbs.ie/images/antenna3.jpg
[29] Messer, H. et al. Environmental Monitoring by Cellular Networks, Science,
Vol. 312, p. 713, 2006.
Abstract (translated from Hebrew)
Path losses in cellular networks are a key factor in the behavior and performance of the
network, and many models have been built in recent decades to describe the phenomenon.
The vast majority of the models address the dependence of the path loss on the distance
between the transmitter and the receiver, and on other parameters such as antenna height,
characteristics of buildings and vegetation in the signal path, and more. In this work,
the distribution of path losses between the cellular sector antenna and the end users
was studied, and a novel model for describing this distribution is presented.
First, we carry out an analytical computation of the distribution. This computation is
based on several basic assumptions, such as describing the signal decay as a function of
distance by a power law, a uniform spatial distribution of the cellular calls, and more.
We then propose a simple Gaussian model for describing the distribution, based on the
theoretical computation combined with standard data from the literature. We show, both
theoretically and empirically, that the Gaussian model fits many cellular sectors, which
satisfy the following two conditions:
a. The standard deviation of the shadowing effect in the sector is greater than ~5dB.
b. A large majority of the calls in the sector are made from inside buildings (indoor)
or, alternatively, from outside buildings (outdoor), without large mixing between them.
The complete model presented in this work incorporates a mix of indoor and outdoor calls
in the same sector, and proposes using a Gaussian mixture to describe the path loss
distribution.
In the course of the work, several algorithms for estimating the parameters of the path
loss distribution were studied and developed: various methods for parameter estimation
for a Gaussian mixture distribution with means and standard deviations known in advance,
and the EM (Expectation-Maximization) algorithm for a Gaussian mixture distribution with
means and standard deviations that are not known in advance. These estimation algorithms
were analyzed and tested extensively on empirical path loss data from the UMTS network of
one of the cellular operators in Israel. The data were measured from a large number of
sectors, among them sectors with high and low mixing of indoor and outdoor calls, sectors
that use repeaters to improve their coverage, and more.
Based on the model and the estimators presented in this work, and on the fact that path
loss data are measured and stored automatically in the great majority of cellular
networks, we propose two main applications for improving the planning and optimization of
cellular networks. The first application is an empirical estimation of the "coverage
maps" of the sectors in the cellular network. These "coverage maps" will enable the
identification of coverage holes, as well as overlaps that impair coverage efficiency.
The second application is the estimation of the ratio between the number of calls made
from inside and from outside buildings. This ratio is important to cellular network
planners, and also contributes to optimizing the operation of existing networks.
Finally, several directions for future research are suggested. The main one is
calibration of the model presented in this work, which can be carried out using
"drive-test" measurements (measurements taken within the sectors' coverage area using
dedicated measurement equipment). Further research directions may include integrating
additional information into the model, such as information on call locations,
integrating the model into existing algorithms for planning and optimizing cellular
networks, and monitoring environmental parameters based on the model proposed in this
work, in combination with other studies in the field of environmental monitoring.
TEL AVIV UNIVERSITY
The Iby and Aladar Fleischman Faculty of Engineering
The Zandman-Slaner School of Graduate Studies
MODELING AND ESTIMATION OF CELLULAR PATH
LOSS DISTRIBUTIONS
A thesis submitted toward the degree of
Master of Science in Electrical and Electronic Engineering
by
Yehonatan Broyde
Av 5769
TEL AVIV UNIVERSITY
The Iby and Aladar Fleischman Faculty of Engineering
The Zandman-Slaner School of Graduate Studies
MODELING AND ESTIMATION OF CELLULAR PATH
LOSS DISTRIBUTIONS
A thesis submitted toward the degree of
Master of Science in Electrical and Electronic Engineering
by
Yehonatan Broyde
This research was carried out in the Department of Electrical Engineering - Systems
under the supervision of Prof. Hagit Messer-Yaron
Av 5769