Hindawi Publishing Corporation
The Scientific World Journal
Volume 2013, Article ID 704504, 19 pages
http://dx.doi.org/10.1155/2013/704504

Review Article
A Review of Data Fusion Techniques
Federico Castanedo
Deusto Institute of Technology, DeustoTech, University of
Deusto, Avenida de las Universidades 24, 48007 Bilbao, Spain
Correspondence should be addressed to Federico Castanedo;
[email protected]
Received 9 August 2013; Accepted 11 September 2013
Academic Editors: Y. Takama and D. Ursino
Copyright 2013 Federico Castanedo. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
The integration of data and knowledge from several sources is known as data fusion. This paper summarizes the state of the data fusion field and describes the most relevant studies. We first enumerate and explain different classification schemes for data fusion. Then, the most common algorithms are reviewed. These methods and algorithms are presented using three different categories: (i) data association, (ii) state estimation, and (iii) decision fusion.
1. Introduction
In general, all tasks that demand any type of parameter estimation from multiple sources can benefit from the use of data/information fusion methods. The terms information fusion and data fusion are typically employed as synonyms; but in some scenarios, the term data fusion is used for raw data (obtained directly from the sensors) and the term information fusion is employed to define already processed data. In this sense, the term information fusion implies a higher semantic level than data fusion. Other terms associated with data fusion that typically appear in the literature include decision fusion, data combination, data aggregation, multisensor data fusion, and sensor fusion.
Researchers in this field agree that the most accepted definition of data fusion was provided by the Joint Directors of Laboratories (JDL) workshop [1]: "A multilevel process dealing with the association, correlation, and combination of data and information from single and multiple sources to achieve refined position and identity estimates, and complete and timely assessments of situations, threats, and their significance."
Hall and Llinas [2] provided the following well-known definition of data fusion: "data fusion techniques combine data from multiple sensors and related information from associated databases to achieve improved accuracy and more specific inferences than could be achieved by the use of a single sensor alone."
Briefly, we can define data fusion as a combination of multiple sources to obtain improved information; in this context, improved information means less expensive, higher quality, or more relevant information.
Data fusion techniques have been extensively employed in multisensor environments with the aim of fusing and aggregating data from different sensors; however, these techniques can also be applied to other domains, such as text processing. The goal of using data fusion in multisensor environments is to obtain a lower detection error probability and a higher reliability by using data from multiple distributed sources.
The available data fusion techniques can be classified into three nonexclusive categories: (i) data association, (ii) state estimation, and (iii) decision fusion. Because of the large number of published papers on data fusion, this paper does not aim to provide an exhaustive review of all of the studies; instead, the objective is to highlight the main steps that are involved in the data fusion framework and to review the most common techniques for each step.
The remainder of this paper continues as follows. The next section provides various classification categories for data fusion techniques. Then, Section 3 describes the most common methods for data association tasks. Section 4 provides a review of techniques under the state estimation category. Next, the most common techniques for decision fusion are enumerated in Section 5. Finally, the conclusions obtained from reviewing the different methods are highlighted in Section 6.
2. Classification of Data Fusion Techniques
Data fusion is a multidisciplinary area that involves several fields, and it is difficult to establish a clear and strict classification. The employed methods and techniques can be divided according to the following criteria:
(1) attending to the relations between the input data sources, as proposed by Durrant-Whyte [3]. These relations can be defined as (a) complementary, (b) redundant, or (c) cooperative data;
(2) according to the input/output data types and their nature, as proposed by Dasarathy [4];
(3) following an abstraction level of the employed data: (a) raw measurements, (b) signals, and (c) characteristics or decisions;
(4) based on the different data fusion levels defined by the JDL;
(5) depending on the architecture type: (a) centralized, (b) decentralized, or (c) distributed.
2.1. Classification Based on the Relations between the Data Sources. Based on the relations of the sources (see Figure 1), Durrant-Whyte [3] proposed the following classification criteria:
(1) complementary: when the information provided by the input sources represents different parts of the scene and could thus be used to obtain more complete global information. For example, in the case of visual sensor networks, the information on the same target provided by two cameras with different fields of view is considered complementary;
(2) redundant: when two or more input sources provide information about the same target and could thus be fused to increment the confidence. For example, the data coming from overlapped areas in visual sensor networks are considered redundant;
(3) cooperative: when the provided information is combined into new information that is typically more complex than the original information. For example, multimodal (audio and video) data fusion is considered cooperative.
2.2. Dasarathy's Classification. One of the most well-known data fusion classification systems was provided by Dasarathy [4] and is composed of the following five categories (see Figure 2):

(1) data in-data out (DAI-DAO): this type is the most basic or elementary data fusion method that is considered in classification. This type of data fusion process inputs and outputs raw data; the results are typically more reliable or accurate. Data fusion at this level is conducted immediately after the data are gathered from the sensors. The algorithms employed at this level are based on signal and image processing algorithms;
(2) data in-feature out (DAI-FEO): at this level, the data fusion process employs raw data from the sources to extract features or characteristics that describe an entity in the environment;
(3) feature in-feature out (FEI-FEO): at this level, both the input and output of the data fusion process are features. Thus, the data fusion process addresses a set of features to improve, refine, or obtain new features. This process is also known as feature fusion, symbolic fusion, information fusion, or intermediate-level fusion;
(4) feature in-decision out (FEI-DEO): this level obtains a set of features as input and provides a set of decisions as output. Most of the classification systems that perform a decision based on a sensor's inputs fall into this category of classification;
(5) decision in-decision out (DEI-DEO): this type of classification is also known as decision fusion. It fuses input decisions to obtain better or new decisions.

The main contribution of Dasarathy's classification is the specification of the abstraction level either as an input or an output, providing a framework to classify different methods or techniques.
2.3. Classification Based on the Abstraction Levels. Luo et al. [5] provided the following four abstraction levels:

(1) signal level: directly addresses the signals that are acquired from the sensors;
(2) pixel level: operates at the image level and could be used to improve image processing tasks;
(3) characteristic level: employs features that are extracted from the images or signals (i.e., shape or velocity);
(4) symbol level: at this level, information is represented as symbols; this level is also known as the decision level.

Information fusion typically addresses three levels of abstraction: (1) measurements, (2) characteristics, and (3) decisions. Other possible classifications of data fusion based on the abstraction levels are as follows:

(1) low level fusion: the raw data are directly provided as an input to the data fusion process, which provides more accurate data (a lower signal-to-noise ratio) than the individual sources;
(2) medium level fusion: characteristics or features (shape, texture, and position) are fused to obtain features that could be employed for other tasks. This level is also known as the feature or characteristic level;
Figure 1: Durrant-Whyte's classification based on the relations between the data sources.
Figure 2: Dasarathy's classification.
(3) high level fusion: this level, which is also known as decision fusion, takes symbolic representations as sources and combines them to obtain a more accurate decision. Bayesian methods are typically employed at this level;
(4) multiple level fusion: this level addresses data provided from different levels of abstraction (i.e., when a measurement is combined with a feature to obtain a decision).
2.4. JDL Data Fusion Classification. This classification is the most popular conceptual model in the data fusion community. It was originally proposed by JDL and the American Department of Defense (DoD) [1]. These organizations classified the data fusion process into five processing levels, an associated database, and an information bus that connects the five components (see Figure 3). The five levels can be grouped into two groups, low-level fusion and high-level fusion, which comprise the following components:
(i) sources: the sources are in charge of providing the input data. Different types of sources can be employed, such as sensors, a priori information (references or geographic data), databases, and human inputs;
(ii) human-computer interaction (HCI): HCI is an interface that allows inputs to the system from the operators and produces outputs to the operators. HCI includes queries, commands, and information on the obtained results and alarms;
(iii) database management system: the database management system stores the provided information and the fused results. This system is a critical component because of the large amount of highly diverse information that is stored.
In contrast, the five levels of data processing are defined as follows:

(1) level 0 (source preprocessing): source preprocessing is the lowest level of the data fusion process, and it includes fusion at the signal and pixel levels. In the case of text sources, this level also includes the information extraction process. This level reduces the amount of data and maintains useful information for the high-level processes;
(2) level 1 (object refinement): object refinement employs the processed data from the previous level. Common procedures of this level include spatiotemporal alignment, association, correlation, clustering or grouping techniques, state estimation, the removal of false positives, identity fusion, and the combining of features that were extracted from images. The output
Figure 3: The JDL data fusion framework.
results of this stage are the object discrimination (classification and identification) and object tracking (state of the object and orientation). This stage transforms the input information into consistent data structures;
(3) level 2 (situation assessment): this level focuses on a higher level of inference than level 1. Situation assessment aims to identify the likely situations given the observed events and obtained data. It establishes relationships between the objects. Relations (i.e., proximity, communication) are valued to determine the significance of the entities or objects in a specific environment. The aim of this level includes performing high-level inferences and identifying significant activities and events (patterns in general). The output is a set of high-level inferences;
(4) level 3 (impact assessment): this level evaluates the impact of the detected activities in level 2 to obtain a proper perspective. The current situation is evaluated, and a future projection is performed to identify possible risks, vulnerabilities, and operational opportunities. This level includes (1) an evaluation of the risk or threat and (2) a prediction of the logical outcome;
(5) level 4 (process refinement): this level improves the process from level 0 to level 3 and provides resource and sensor management. The aim is to achieve efficient resource management while accounting for task priorities, scheduling, and the control of available resources.
High-level fusion typically starts at level 2 because the type, localization, movement, and quantity of the objects are known at that level. One of the limitations of the JDL method is how the uncertainty about previous or subsequent results could be employed to enhance the fusion process (feedback loop). Llinas et al. [6] proposed several refinements and extensions to the JDL model. Blasch and Plano [7] proposed to add a new level (user refinement) to support a human user in the data fusion loop. The JDL model represents the first effort to provide a detailed model and a common terminology for the data fusion domain. However, because its roots originate in the military domain, the employed terms are oriented to the risks that commonly occur in these scenarios. The Dasarathy model differs from the JDL model with regard to the adopted terminology and employed approach. The former is oriented toward the differences among the input and output results, independent of the employed fusion method. In summary, the Dasarathy model provides a method for understanding the relations between the fusion tasks and employed data, whereas the JDL model presents an appropriate fusion perspective to design data fusion systems.
2.5. Classification Based on the Type of Architecture. One of the main questions that arise when designing a data fusion system is where the data fusion process will be performed. Based on this criterion, the following types of architectures could be identified:

(1) centralized architecture: in a centralized architecture, the fusion node resides in the central processor that receives the information from all of the input sources. Therefore, all of the fusion processes are executed in a central processor that uses the provided raw measurements from the sources. In this schema, the sources obtain only the observations as measurements and transmit them to a central processor, where the data fusion process is performed. If we assume that data alignment and data association are performed correctly and that the required time to transfer the data is not significant, then the centralized scheme is theoretically optimal. However, the previous assumptions typically do not hold for real systems. Moreover, the large amount of bandwidth that is required to send raw data through the network is another disadvantage for the centralized approach. This issue becomes a bottleneck when this type of architecture is employed for fusing data in visual sensor networks. Finally, the time delays when transferring the information between the different sources are variable and affect
the results in the centralized scheme to a greater degree than in other schemes;
(2) decentralized architecture: a decentralized architecture is composed of a network of nodes in which each node has its own processing capabilities and there is no single point of data fusion. Therefore, each node fuses its local information with the information that is received from its peers. Data fusion is performed autonomously, with each node accounting for its local information and the information received from its peers. Decentralized data fusion algorithms typically communicate information using the Fisher and Shannon measurements instead of the object's state [8]. The main disadvantage of this architecture is the communication cost, which is O(n^2) at each communication step, where n is the number of nodes; additionally, the extreme case is considered, in which each node communicates with all of its peers. Thus, this type of architecture could suffer from scalability problems when the number of nodes is increased;
(3) distributed architecture: in a distributed architecture, measurements from each source node are processed independently before the information is sent to the fusion node; the fusion node accounts for the information that is received from the other nodes. In other words, the data association and state estimation are performed in the source node before the information is communicated to the fusion node. Therefore, each node provides an estimation of the object state based only on its local views, and this information is the input to the fusion process, which provides a fused global view. This type of architecture provides different options and variations that range from only one fusion node to several intermediate fusion nodes;
(4) hierarchical architecture: other architectures comprise a combination of decentralized and distributed nodes, generating hierarchical schemes in which the data fusion process is performed at different levels in the hierarchy.
In principle, a decentralized data fusion system is more difficult to implement because of the computation and communication requirements. However, in practice, there is no single best architecture, and the selection of the most appropriate architecture should be made depending on the requirements, demand, existing networks, data availability, node processing capabilities, and organization of the data fusion system.
The reader might think that the decentralized and distributed architectures are similar; however, they have meaningful differences (see Figure 4). First, in a distributed architecture, a preprocessing of the obtained measurements is performed, which provides a vector of features as a result (the features are fused thereafter). In contrast, in the decentralized architecture, the complete data fusion process is conducted in each node, and each of the nodes provides a globally fused result. Second, the decentralized fusion algorithms typically communicate information, employing the Fisher and Shannon measurements. In contrast, distributed algorithms typically share a common notion of state (position, velocity, and identity) with their associated probabilities, which are used to perform the fusion process [9]. Third, because the decentralized data fusion algorithms exchange information instead of states and probabilities, they have the advantage of easily separating old knowledge from new knowledge. Thus, the process is additive, and the associative meaning is not relevant when the information is received and fused. However, in the distributed data fusion algorithms (i.e., the distributed Kalman filter), the state that is going to be fused is not associative, and when and how the fused estimates are computed is relevant. Nevertheless, in contrast to the centralized architectures, the distributed algorithms reduce the necessary communication and computational costs because some tasks are computed in the distributed nodes before data fusion is performed in the fusion node.
3. Data Association Techniques
The data association problem must determine the set of measurements that correspond to each target (see Figure 5). Let us suppose that there are n targets that are being tracked by only one sensor in a cluttered environment (by a cluttered environment, we refer to an environment that has several targets that are too close to each other). Then, the data association problem can be defined as follows:
(i) each sensor's observation is received in the fusion node at discrete time intervals;
(ii) the sensor might not provide observations at a specific interval;
(iii) some observations are noise, and other observations originate from the detected target;
(iv) for any specific target and in every time interval, we do not know (a priori) the observations that will be generated by that target.
Therefore, the goal of data association is to establish the set of observations or measurements that are generated by the same target over time. Hall and Llinas [2] provided the following definition of data association: the process of assigning and computing the weights that relate the observations or tracks (a track can be defined as an ordered set of points that follow a path and are generated by the same target) from one set to the observations or tracks of another set.
As an example of the complexity of the data association problem, if we take a frame-to-frame association and assume that n possible points could be detected in all frames, then the number of possible sets is (n!)^(n-1). Note that from all of these possible solutions, only one set establishes the true movement of the points.
Data association is often performed before the state estimation of the detected targets. Moreover, it is a key step because the estimation or classification will behave incorrectly if the data association phase does not work coherently. The data association process could also appear in all of the fusion levels, but the granularity varies depending on the objective of each level.
Figure 4: Classification based on the type of architecture.
In general, an exhaustive search of all possible combinations grows exponentially with the number of targets; thus, the data association problem becomes NP-complete. The most common techniques that are employed to solve the data association problem are presented in the following sections (from Sections 3.1 to 3.7).
3.1. Nearest Neighbors and K-Means. Nearest neighbor (NN) is the simplest data association technique. NN is a well-known clustering algorithm that selects or groups the most similar values. How close one measurement is to another depends on the employed distance metric and typically depends on the threshold that is established by the designer. In general, the employed criteria could be based on (1) an absolute distance, (2) the Euclidean distance, or (3) a statistical function of the distance.

NN is a simple algorithm that can find a feasible (approximate) solution in a small amount of time. However, in a cluttered environment, it could provide many pairs that have the same probability and could thus produce undesirable
Figure 5: Conceptual overview of the data association process from multiple sensors and multiple targets. It is necessary to establish the set of observations over time from the same object that forms a track.
error propagation [10]. Moreover, this algorithm has poor performance in environments in which false measurements are frequent, that is, in highly noisy environments.
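A minimal greedy NN association step can be sketched as follows. This is a hypothetical illustration: the function name, the choice of the Euclidean metric, and the gate threshold are our own, not from the text.

```python
import numpy as np

def nearest_neighbor_association(tracks, observations, gate=5.0):
    """Greedy nearest-neighbor association (illustrative sketch).

    tracks       : (T, d) array of predicted track positions
    observations : (O, d) array of sensor measurements
    gate         : maximum Euclidean distance accepted by the designer

    Returns a dict mapping each track index to an observation index
    (or None when no observation falls inside the gate).
    """
    assignments, used = {}, set()
    for t, pred in enumerate(tracks):
        # Euclidean distance from this track's prediction to every observation.
        dists = np.linalg.norm(observations - pred, axis=1)
        assignments[t] = None
        for o in np.argsort(dists):
            if dists[o] <= gate and int(o) not in used:
                assignments[t] = int(o)  # closest unused observation in the gate
                used.add(int(o))
                break
    return assignments

tracks = np.array([[0.0, 0.0], [10.0, 10.0]])
obs = np.array([[9.5, 10.2], [0.3, -0.1], [30.0, 30.0]])  # third point is clutter
result = nearest_neighbor_association(tracks, obs)
print(result)  # {0: 1, 1: 0}
```

Note how the clutter point, which lies outside the gate, is left unassigned; in a denser scene several observations would tie for the same track, which is exactly the ambiguity discussed above.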
The all-neighbors approach uses a similar technique, in which all of the measurements inside a region are included in the tracks. The K-Means [11] method is a well-known modification of the NN algorithm. K-Means divides the dataset values into K different clusters. The K-Means algorithm finds the best localization of the cluster centroids, where best means a centroid that is in the center of the data cluster. K-Means is an iterative algorithm that can be divided into the following steps:

(1) obtain the input data and the number of desired clusters (K);
(2) randomly assign the centroid of each cluster;
(3) match each data point with the centroid of each cluster;
(4) move the cluster centers to the centroid of the cluster;
(5) if the algorithm does not converge, return to step (3).
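The five steps above map directly onto Lloyd's iteration. The sketch below is one minimal realisation; the seeding of centroids from the data points, the empty-cluster guard, and the convergence test are implementation choices of ours:

```python
import numpy as np

def k_means(data, k, iters=100, seed=0):
    """Minimal K-Means (Lloyd's algorithm) following the five steps above."""
    rng = np.random.default_rng(seed)
    # Step (2): randomly pick k input points as the initial centroids.
    centroids = data[rng.choice(len(data), size=k, replace=False)]
    labels = np.zeros(len(data), dtype=int)
    for _ in range(iters):
        # Step (3): match each data point with its nearest centroid.
        dists = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step (4): move each cluster center to the centroid of its cluster
        # (keeping the old center when a cluster ends up empty).
        new = np.array([data[labels == j].mean(axis=0) if np.any(labels == j)
                        else centroids[j] for j in range(k)])
        # Step (5): stop on convergence, otherwise iterate again.
        if np.allclose(new, centroids):
            break
        centroids = new
    return centroids, labels

pts = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9]])
centroids, labels = k_means(pts, k=2)
```

On this toy dataset the two tight groups of points end up in separate clusters regardless of which points seed the centroids.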
K-Means is a popular algorithm that has been widely employed; however, it has the following disadvantages:

(i) the algorithm does not always find the optimal solution for the cluster centers;
(ii) the number of clusters must be known a priori and one must assume that this number is the optimum;
(iii) the algorithm assumes that the covariance of the dataset is irrelevant or that it has been normalized already.
There are several options for overcoming these limitations. For the first one, it is possible to execute the algorithm several times and keep the solution that has the least variance. For the second one, it is possible to start with a low value of K and increment the value of K until an adequate result is obtained. The third limitation can be easily overcome by multiplying the data with the inverse of the covariance matrix.
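The third fix amounts to whitening the data before clustering. One common realisation, sketched below on illustrative synthetic data, multiplies the points by a square root of the inverse covariance matrix so that Euclidean distance on the whitened points behaves like the Mahalanobis distance on the originals:

```python
import numpy as np

rng = np.random.default_rng(1)
# Correlated synthetic data (the mixing matrix below is arbitrary).
data = rng.normal(size=(500, 2)) @ np.array([[3.0, 0.0], [1.0, 0.5]])

cov = np.cov(data, rowvar=False)
# Multiply the data by a square root of the inverse covariance matrix:
# if L @ L.T = inv(cov), then cov(data @ L) is the identity.
L = np.linalg.cholesky(np.linalg.inv(cov))
whitened = data @ L

print(np.cov(whitened, rowvar=False).round(2))  # identity matrix
```

After this normalization the covariance of the dataset no longer distorts the distance computations inside K-Means.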
Many variations have been proposed to Lloyd's basic K-Means algorithm [11], which has a computational upper-bound cost of O(Kn), where n is the number of input points and K is the number of desired clusters. Some algorithms modify the initial cluster assignments to improve the separations and reduce the number of iterations. Others introduce soft or multinomial clustering assignments using fuzzy logic, probabilistic, or Bayesian techniques. However, most of the previous variations still must perform several iterations through the data space to converge to a reasonable solution. This issue becomes a major disadvantage in several real-time applications. A new approach that is based on having a large (but still affordable) number of cluster candidates compared to the desired K clusters is currently gaining attention. The idea behind this computational model is that the algorithm builds a good sketch of the original data while reducing the dimensionality of the input space significantly. In this manner, a weighted K-Means can be applied to the large candidate clusters to derive a good clustering of the original data. Using this idea, [12] presented an efficient and scalable K-Means algorithm that is based on random projections. This algorithm requires only one pass through the input data to build the clusters. More specifically, if the input data distribution holds some separability requirements, then the number of required candidate clusters grows only according to O(K log n), where n is the number of observations in the original data. This salient feature makes the algorithm scalable in terms of both the memory and computational requirements.
3.2. Probabilistic Data Association. The probabilistic data association (PDA) algorithm was proposed by Bar-Shalom and Tse [13] and is also known as the modified filter of all neighbors. This algorithm assigns an association probability to each hypothesis from a valid measurement of a target. A valid measurement refers to an observation that falls in the validation gate of the target at that time instant. The validation gate, γ, which is centered around the predicted measurement of the target, is used to select the set of valid measurements and is defined as

(z_i(k) − ẑ(k|k−1))' S(k)^{-1} (z_i(k) − ẑ(k|k−1)) ≤ γ, (1)

where k is the temporal index, S(k) is the innovation covariance, and γ determines the gating or window size. The set of valid measurements at time instant k is defined as

Z(k) = {z_i(k), i = 1, ..., m(k)}, (2)
where z_i(k) is the i-th measurement in the validation region at time instant k. We give the standard equations of the PDA algorithm next. For the state prediction, consider

x̂(k|k−1) = F(k−1) x̂(k−1|k−1), (3)

where F(k−1) is the transition matrix at time instant k−1. To calculate the measurement prediction, consider

ẑ(k|k−1) = H(k) x̂(k|k−1), (4)

where H(k) is the linearization measurement matrix. To compute the gain or the innovation of the i-th measurement, consider

v_i(k) = z_i(k) − ẑ(k|k−1). (5)

To calculate the covariance prediction, consider

P(k|k−1) = F(k−1) P(k−1|k−1) F(k−1)' + Q(k), (6)

where Q(k) is the process noise covariance matrix. To compute the innovation covariance S(k) and the Kalman gain K(k), consider

S(k) = H(k) P(k|k−1) H(k)' + R,
K(k) = P(k|k−1) H(k)' S(k)^{-1}. (7)

To obtain the covariance update in the case in which the measurement originated by the target is known, consider

P^0(k|k) = P(k|k−1) − K(k) S(k) K(k)'. (8)

The total update of the covariance is computed as

v(k) = Σ_{i=1}^{m(k)} β_i(k) v_i(k),
P(k) = β_0(k) P(k|k−1) + (1 − β_0(k)) P^0(k|k) + K(k) [Σ_{i=1}^{m(k)} β_i(k) v_i(k) v_i(k)' − v(k) v(k)'] K(k)', (9)

where m(k) is the number of valid measurements at the instant k. The equation to update the estimated state, which is formed by the position and velocity, is given by

x̂(k|k) = x̂(k|k−1) + K(k) v(k). (10)

Finally, the association probabilities of PDA are as follows:

β_i(k) = e_i(k) / Σ_{j=0}^{m(k)} e_j(k), (11)

where

e_i(k) = λ (2π)^{M/2} |S(k)|^{1/2} (1 − P_D P_G) / P_D,  if i = 0,
e_i(k) = exp[−(1/2) v_i(k)' S(k)^{-1} v_i(k)],  if i ≠ 0,
e_i(k) = 0,  in other cases, (12)

where M is the dimension of the measurement vector, λ is the density of the clutter environment, P_D is the detection probability of the correct measurement, and P_G is the validation probability of a detected value.

In the PDA algorithm, the state estimation of the target is computed as a weighted sum of the estimated states under all of the hypotheses. The algorithm can associate different measurements to one specific target. Thus, the association of the different measurements to a specific target helps PDA to estimate the target state, and the association probabilities are used as weights. The main disadvantages of the PDA algorithm are the following:
(i) loss of tracks: because PDA ignores the interference with other targets, it sometimes could wrongly classify the closest tracks. Therefore, it provides a poor performance when the targets are close to each other or crossed;
(ii) suboptimal Bayesian approximation: when the source of information is uncertain, PDA is a suboptimal Bayesian approximation to the association problem;
(iii) one target: PDA was initially designed for the association of one target in a low-cluttered environment. The number of false alarms is typically modeled with the Poisson distribution, and they are assumed to be distributed uniformly in space. PDA behaves incorrectly when there are multiple targets because the false alarm model does not work well;
(iv) track management: because PDA assumes that the track is already established, algorithms must be provided for track initialization and track deletion.

PDA is mainly good for tracking targets that do not make abrupt changes in their movement patterns. PDA will most likely lose the target if it makes abrupt changes in its movement patterns.
3.3. Joint Probabilistic Data Association. Joint probabilistic data association (JPDA) is a suboptimal approach for tracking multiple targets in cluttered environments [14]. JPDA is similar to PDA, with the difference that the association probabilities are computed using all of the observations and all of the targets. Thus, in contrast to PDA, JPDA considers various hypotheses together and combines them. JPDA determines the probability $\beta_{jt}(k)$ that measurement $j$ originated from target $t$, accounting for the fact that, under this hypothesis, the measurement cannot be generated by other targets. Therefore, for a known number of targets, it evaluates the different options of the measurement-target association (for the most recent set of measurements) and combines them into the corresponding state estimation. If the association probabilities are known, then the Kalman filter updating equation of the track can be written as
$$\hat{x}_t(k \mid k) = \hat{x}_t(k \mid k-1) + K_t(k)\, v_t(k), \quad (13)$$

where $\hat{x}_t(k \mid k)$ and $\hat{x}_t(k \mid k-1)$ are the estimation and prediction of target $t$, and $K_t(k)$ is the filter gain. The weighted sum of the residuals associated with the $m(k)$ validated observations of target $t$ is

$$v_t(k) = \sum_{j=1}^{m(k)} \beta_{jt}(k)\, v_{jt}(k), \quad (14)$$

where $v_{jt}(k) = z_j(k) - H \hat{x}_t(k \mid k-1)$. Therefore, this method incorporates all of the observations (inside the neighborhood of the target's predicted position) to update the estimated position by using a posterior probability that is a weighted sum of residuals.
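As a minimal sketch, the weighted-residual update of Eqs. (13) and (14) can be written in a few lines of Python with NumPy; the scalar dimensions and association weights below are hypothetical, and the covariance update is omitted for brevity:

```python
import numpy as np

def pda_update(x_pred, P_pred, H, R, measurements, betas):
    # betas[j]: association probability that measurement j originated from
    # the target; the weighted residual of Eq. (14) feeds the Kalman-style
    # update of Eq. (13). (The covariance update is omitted for brevity.)
    S = H @ P_pred @ H.T + R              # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)   # filter gain K(k)
    v = sum(b * (z - H @ x_pred) for b, z in zip(betas, measurements))
    return x_pred + K @ v
```

For a 1-D target with two validated measurements at +1 and -1 and weights 0.6 and 0.2, the weighted residual is 0.4 and the updated state moves only part of the way toward the dominant measurement.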
The main restrictions of JPDA are the following:
(i) a measurement cannot come from more than one target;
(ii) two measurements cannot originate from the same target (at one time instant);
(iii) the sum of the probabilities of all of the measurements assigned to one target must be 1: $\sum_{j=0}^{m(k)} \beta_{jt}(k) = 1$.
The main disadvantages of JPDA are the following:
(i) it requires an explicit mechanism for track initialization. Similar to PDA, JPDA cannot initialize new tracks or remove tracks that leave the observation area;
(ii) JPDA is computationally expensive in environments that have multiple targets because the number of hypotheses grows exponentially with the number of targets.
In general, JPDA is more appropriate than MHT in situations in which the density of false measurements is high (e.g., sonar applications).
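The exponential growth in (ii) is easy to see by enumerating the feasible joint association events directly. The following Python sketch (with invented problem sizes) applies the two JPDA restrictions that a measurement serves at most one target and a target generates at most one measurement:

```python
def joint_events(n_targets, n_measurements):
    # Enumerate feasible joint association events: each target is assigned
    # either 0 (no detection) or the index of an unused measurement, so a
    # measurement serves at most one target and a target generates at most
    # one measurement per time instant.
    events = []
    def assign(t, used, current):
        if t == n_targets:
            events.append(tuple(current))
            return
        assign(t + 1, used, current + [0])  # target t undetected
        for m in range(1, n_measurements + 1):
            if m not in used:
                assign(t + 1, used | {m}, current + [m])
    assign(0, frozenset(), [])
    return events
```

Two targets and two measurements already yield 7 feasible events, and three of each yield 34; the count grows combinatorially with the number of targets and measurements.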
3.4. Multiple Hypothesis Test. The underlying idea of the multiple hypothesis test (MHT) is to use more than two consecutive observations to make an association, with better results. Algorithms that use only two consecutive observations have a higher probability of generating an error. In contrast to PDA and JPDA, MHT evaluates all of the possible hypotheses and maintains new hypotheses in each iteration.
MHT was developed to track multiple targets in cluttered environments; as a result, it combines the data association problem and tracking into a unified framework, becoming an estimation technique as well. The Bayes rule or Bayesian networks are commonly employed to calculate the MHT hypotheses. In general, researchers have claimed that MHT outperforms JPDA for lower densities of false positives. However, the main disadvantage of MHT is its computational cost when the number of tracks or false positives increases. Pruning the hypothesis tree using a window can mitigate this limitation.
The Reid [15] tracking algorithm is considered the standard MHT algorithm, but the initial integer programming formulation of the problem is due to Morefield [16]. MHT is an iterative algorithm in which each iteration starts with a set of correspondence hypotheses. Each hypothesis is a collection of disjoint tracks, and the prediction of each target in the next time instant is computed for each hypothesis. Next, the predictions are compared with the new observations by using a distance metric. The set of associations established in each hypothesis (based on the distance) introduces new hypotheses in the next iteration. Each new hypothesis represents a new set of tracks that is based on the current observations.
Note that each new measurement could come from (i) a new target in the field of view, (ii) a target being tracked, or (iii) noise in the measurement process. It is also possible that a measurement is not assigned to a target, because the target disappears or because it is not possible to obtain a target measurement at that time instant.
MHT maintains several correspondence hypotheses for each target in each frame. If the set of hypotheses at time instant $k$ is represented by $\Theta(k) = \{\Theta_j(k),\ j = 1, \ldots, n\}$, then the probability of hypothesis $\Theta_j(k)$ can be expressed recursively using the Bayes rule as follows:

$$P(\Theta_j(k) \mid Z(k)) = P(\Theta_j(k-1), \theta_j(k) \mid Z(k)) = \frac{1}{c}\, P(Z(k) \mid \Theta_j(k-1), \theta_j(k))\, P(\theta_j(k) \mid \Theta_j(k-1))\, P(\Theta_j(k-1)), \quad (15)$$

where $\Theta_j(k-1)$ is the hypothesis of the complete set until the time instant $k-1$; $\theta_j(k)$ is the $j$th possible association of the tracks to the objects; $Z(k)$ is the set of detections of the current frame, and $c$ is a normalizing constant.
The first term on the right side of the previous equation is the likelihood function of the measurement set $Z(k)$ given the joint likelihood and the current hypothesis. The second term is the probability of the association hypothesis of the current data given the previous hypothesis $\Theta_j(k-1)$. The third term is the probability of the previous hypothesis from which the current hypothesis is calculated.
The MHT algorithm has the ability to detect a new track while maintaining the hypothesis tree structure. The probability of a true track is given by the Bayes decision model as
$$P(T \mid D) = \frac{P(D \mid T)\, P(T)}{P(D)}, \quad (16)$$

where $P(D \mid T)$ is the probability of obtaining the set of measurements $D$ given the track $T$, $P(T)$ is the a priori probability of the source signal, and $P(D)$ is the probability of obtaining the set of detections $D$.
MHT considers all of the possibilities, including both track maintenance and the initialization and removal of tracks, in an integrated framework. MHT calculates the possibility of having an object after the generation of a set of measurements using an exhaustive approach, and the algorithm does not assume a fixed number of targets. The key challenge of MHT is effective hypothesis management. The baseline MHT algorithm can be extended as follows: (i) use hypothesis aggregation for missed target births, cardinality tracking, and closely spaced objects; (ii) apply a multistage MHT for improving the performance and robustness in challenging settings; and (iii) use a feature-aided MHT for extended object surveillance.
The main disadvantage of this algorithm is the computational cost, which grows exponentially with the number of tracks and measurements. Therefore, the practical implementation of this algorithm is limited because it is exponential in both time and memory.
With the aim of reducing the computational cost, [17] presented a probabilistic MHT algorithm, known as PMHT, in which the associations are considered to be statistically independent random variables and in which an exhaustive search enumeration is avoided. The PMHT algorithm assumes that the number of targets and measurements is known. With the same goal of reducing the computational cost, [18] presented an efficient implementation of the MHT algorithm. This implementation was the first version to be applied to perform tracking in visual environments. They employed the Murty [19] algorithm to determine the best set of hypotheses in polynomial time, with the goal of tracking points of interest.
MHT typically performs the tracking process by employing only one characteristic, commonly the position. A Bayesian combination to use multiple characteristics was proposed by Liggins II et al. [20].
A linear-programming-based relaxation approach to the optimization problem in MHT tracking was proposed independently by Coraluppi et al. [21] and Storms and Spieksma [22]. Joo and Chellappa [23] proposed an association algorithm for tracking multiple targets in visual environments. Their algorithm is based on an MHT modification in which a measurement can be associated with more than one target, and several targets can be associated with one measurement. They also proposed a combinatorial optimization algorithm to generate the best set of association hypotheses. Their algorithm always finds the best hypothesis, in contrast to other models, which are approximate. Coraluppi and Carthel [24] presented a generalization of the MHT algorithm using a recursion over hypothesis classes rather than over a single hypothesis. This work has been applied to a special case of the multi-target tracking problem, called cardinality tracking, in which the number of sensor measurements is observed instead of the target states.
3.5. Distributed Joint Probabilistic Data Association. The distributed version of the joint probabilistic data association (JPDA-D) was presented by Chang et al. [25]. In this technique, the estimated state of the target (using two sensors) after being associated is given by

$$E\{x \mid Z^1, Z^2\} = \sum_{j_1=0}^{m_1} \sum_{j_2=0}^{m_2} E\{x \mid \chi^1_{j_1}, \chi^2_{j_2}, Z^1, Z^2\}\, P\{\chi^1_{j_1}, \chi^2_{j_2} \mid Z^1, Z^2\}, \quad (17)$$
where $z^i$, $i = 1, 2$, is the last set of measurements of sensors 1 and 2; $Z^i$, $i = 1, 2$, is the set of accumulated data; and $\chi^i_{j_i}$ is the association hypothesis. The first term on the right side of the equation is calculated from the associations that were made earlier. The second term is computed from the individual association probabilities as follows:

$$P(\chi^1_{j_1}, \chi^2_{j_2} \mid Z^1, Z^2) = \sum_{X^1} \sum_{X^2} P(X^1, X^2 \mid Z^1, Z^2)\, \omega^1_{j_1 t}(X^1)\, \omega^2_{j_2 t}(X^2),$$
$$P(X^1, X^2 \mid Z^1, Z^2) = \frac{1}{c}\, P(X^1 \mid Z^1)\, P(X^2 \mid Z^2)\, \varepsilon(X^1, X^2), \quad (18)$$

where $X^i$ are the joint hypotheses involving all of the measurements and all of the targets, and $\omega^i_{jt}(X^i)$ are the binary indicators of the measurement-target association. The additional term $\varepsilon(X^1, X^2)$ depends on the correlation of the individual hypotheses and reflects the influence of the localization of the current measurements on the joint hypotheses.
These equations are obtained assuming that communication exists after every observation; there are only approximations for the case in which communication is sporadic and a substantial amount of noise occurs. Therefore, this algorithm is a theoretical model that has some limitations in practical applications.
3.6. Distributed Multiple Hypothesis Test. The distributed version of the MHT algorithm (MHT-D) [26, 27] follows a similar structure as the JPDA-D algorithm. Let us assume the case in which one node must fuse two sets of hypotheses and tracks. If the hypothesis and track sets are represented by $\Theta^i(k)$ and $T^i(k)$ with $i = 1, 2$; the hypothesis probabilities are represented by $P^i$; and the state distributions of the tracks $T^i(k)$ are represented by $P(T^i)$ and $p(x \mid T^i, Z^i)$; then the maximum available information in the fusion node is $Z = Z^1 \cup Z^2$. The data fusion objective of the MHT-D is to obtain the set of hypotheses $\Theta(k)$, the set of tracks $T(k)$, the hypothesis probabilities $P(\Theta \mid Z)$, and the state distributions $p(x \mid T, Z)$ for the observed data.
The MHT-D algorithm is composed of the following steps:
(1) hypothesis formation: for each hypothesis pair $\Theta^1_j$ and $\Theta^2_l$ that could be fused, a track is formed by associating the pairs of tracks $T^1$ and $T^2$, where each pair comes from one node and could originate from the same target. The final result of this stage is a set of hypotheses denoted by $\Theta(k)$ and the fused tracks $T(k)$;
(2) hypothesis evaluation: in this stage, the association probability of each hypothesis and the estimated state of each fused track are obtained. A distributed estimation algorithm is employed to calculate the likelihood of the possible associations and the estimations obtained for each specific association.
Using the information model, the probability of each fused hypothesis is given by

$$P(\Theta \mid Z) = \frac{1}{c}\, L(\Theta^1, \Theta^2)\, P(\Theta^1 \mid Z^1)\, P(\Theta^2 \mid Z^2), \quad (19)$$

where $c$ is a normalizing constant and $L(\Theta^1, \Theta^2)$ is the likelihood of each hypothesis pair.
The main disadvantage of the MHT-D is its high computational cost, which is on the order of $O(M^N)$, where $M$ is the number of possible associations and $N$ is the number of variables to be estimated.
3.7. Graphical Models. Graphical models are a formalism for representing and reasoning with probabilities and independence. A graphical model represents a conditional decomposition of the joint probability. It can be depicted as a graph in which the nodes denote random variables; the edges denote possible dependences between the random variables, and plates denote the replication of a substructure, with the appropriate indexing of the relevant variables. The graph captures the joint distribution over the random variables, which can be decomposed into a product of factors that each depend on only a subset of the variables. There are two major classes of graphical models: (i) Bayesian networks [28], also known as directed graphical models, and (ii) Markov random fields, also known as undirected graphical models. Directed graphical models are useful for expressing causal relationships between random variables, whereas undirected models are better suited for expressing soft constraints between random variables. We refer the reader to the book of Koller and Friedman [29] for more information on graphical models.
A framework based on graphical models can solve the problem of distributed data association in synchronized sensor networks with overlapping areas in which each sensor receives noisy measurements; this solution was proposed by Chen et al. [30, 31]. Their work is based on graphical models that are used to represent the statistical dependence between random variables. The data association problem is treated as an inference problem and solved by using the max-product algorithm [32]. Graphical models represent statistical dependencies between variables as graphs, and the max-product algorithm converges when the graph is a tree. Moreover, the employed algorithm can be implemented in a distributed manner by exchanging messages between the source nodes in parallel. With this algorithm, if each sensor has $M$ possible combinations of associations and there are $N$ variables to be estimated, the complexity is $O(NM^2)$, which is reasonable and less than the $O(M^N)$ complexity of the MHT-D algorithm. However, special attention must be given to correlated variables when building the graphical model.
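To illustrate why tree-structured graphs matter, the following toy Python implementation runs max-product on a chain of binary variables, where the algorithm is exact; the node and pairwise potentials are invented for the example:

```python
import numpy as np

def max_product_chain(phi, psi):
    # phi: list of node potentials (length-2 arrays); psi: 2x2 pairwise
    # potential shared by all edges of the chain x_0 - x_1 - ... - x_{n-1}.
    n = len(phi)
    msg = [np.ones(2) for _ in range(n)]  # msg[i]: message into node i from the right
    for i in range(n - 2, -1, -1):
        msg[i] = np.array([max(psi[xi, xj] * phi[i + 1][xj] * msg[i + 1][xj]
                               for xj in range(2)) for xi in range(2)])
    # Decode the MAP assignment from left to right.
    assignment = [int(np.argmax(phi[0] * msg[0]))]
    for i in range(1, n):
        prev = assignment[-1]
        assignment.append(int(np.argmax([psi[prev, xj] * phi[i][xj] * msg[i][xj]
                                         for xj in range(2)])))
    return assignment
```

With node potentials favoring state 0 at the first node and state 1 at the last, and a pairwise potential favoring agreement, the returned assignment is the exact MAP configuration of the chain; on graphs with cycles the same message passing is only approximate.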
4. State Estimation Methods
State estimation techniques aim to determine the state of the target under movement (typically the position) given the observations or measurements. State estimation techniques are also known as tracking techniques. In their general form, it is not guaranteed that the target observations are relevant, which means that some of the observations could actually come from the target and others could be only noise. The state estimation phase is a common stage in data fusion algorithms because the target observations could come from different sensors or sources, and the final goal is to obtain a global target state from the observations.
The estimation problem involves finding the values of the vector state (e.g., position, velocity, and size) that fit as much as possible with the observed data. From a mathematical perspective, we have a set of redundant observations, and the goal is to find the set of parameters that provides the best fit to the observed data. In general, these observations are corrupted by errors and by the propagation of noise in the measurement process. State estimation methods fall under level 1 of the JDL classification and can be divided into two broader groups:
(1) linear dynamics and measurements: here, the estimation problem has a standard solution. Specifically, when the equations of the object state and the measurements are linear, the noise follows the Gaussian distribution, and the environment is not cluttered, the optimal theoretical solution is based on the Kalman filter;
(2) nonlinear dynamics: the state estimation problem becomes difficult, and there is no analytical solution that solves the problem in a general manner. In principle, there are no practical algorithms available to solve this problem satisfactorily in the general case.
Most of the state estimation methods are based on control theory and employ the laws of probability to compute a vector state from a vector measurement or a stream of vector measurements. Next, the most common estimation methods are presented: maximum likelihood and maximum posterior (Section 4.1), the Kalman filter (Section 4.2), the particle filter (Section 4.3), the distributed Kalman filter (Section 4.4), the distributed particle filter (Section 4.5), and covariance consistency methods (Section 4.6).
4.1. Maximum Likelihood and Maximum Posterior. The maximum likelihood (ML) technique is an estimation method based on probabilistic theory. Probabilistic estimation methods are appropriate when the state variable follows an unknown probability distribution [33]. In the context of data fusion, $x$ is the state that is being estimated, and $z = \{z(1), \ldots, z(k)\}$ is a sequence of $k$ previous observations of $x$. The likelihood function $\lambda(x)$ is defined as the probability density function of the sequence of observations given the true value of the state $x$:

$$\lambda(x) = p(z \mid x). \quad (20)$$

The ML estimator finds the value of $x$ that maximizes the likelihood function:

$$\hat{x}(k) = \arg\max_{x}\ p(z \mid x), \quad (21)$$
which can be obtained from the analytical or empirical models of the sensors. This function expresses the probability of the observed data. The main disadvantage of this method in practice is that it requires the analytical or empirical model of the sensor to be known in order to provide the prior distribution and compute the likelihood function. This method can also systematically underestimate the variance of the distribution, which leads to a bias problem. However, the bias of the ML solution becomes less significant as the number of data points increases, and in the limit $k \to \infty$ the estimated variance equals the true variance of the distribution that generated the data.
The maximum posterior (MAP) method is based on Bayesian theory. It is employed when the parameter $x$ to be estimated is the output of a random variable that has a known probability density function $p(x)$. In the context of data fusion, $x$ is the state that is being estimated, and $z = \{z(1), \ldots, z(k)\}$ is a sequence of $k$ previous observations of $x$. The MAP estimator finds the value of $x$ that maximizes the posterior probability distribution:

$$\hat{x}(k) = \arg\max_{x}\ p(x \mid z). \quad (22)$$
Both methods (ML and MAP) aim to find the most likely value for the state $x$. However, ML assumes that $x$ is a fixed but unknown point of the parameter space, whereas MAP considers $x$ to be the output of a random variable with a known a priori probability density function. The two methods are equivalent when there is no a priori information about $x$, that is, when there are only observations.
4.2. The Kalman Filter. The Kalman filter is the most popular estimation technique. It was originally proposed by Kalman [34] and has been widely studied and applied since then. The Kalman filter estimates the state $x$ of a discrete-time process governed by the following space-time model:

$$x(k+1) = \Phi(k)\, x(k) + G(k)\, u(k) + w(k) \quad (23)$$

with the observations or measurements at time $k$ of the state represented by

$$z(k) = H(k)\, x(k) + v(k), \quad (24)$$

where $\Phi(k)$ is the state transition matrix, $G(k)$ is the input transition matrix, $u(k)$ is the input vector, $H(k)$ is the measurement matrix, and $w$ and $v$ are random Gaussian variables with zero mean and covariance matrices $Q(k)$ and $R(k)$, respectively. Based on the measurements and on the system parameters, the estimation of $x(k)$, which is represented by $\hat{x}(k \mid k)$, and the prediction of $x(k+1)$, which is represented by $\hat{x}(k+1 \mid k)$, are given by

$$\hat{x}(k \mid k) = \hat{x}(k \mid k-1) + K(k)\, [z(k) - H(k)\, \hat{x}(k \mid k-1)],$$
$$\hat{x}(k+1 \mid k) = \Phi(k)\, \hat{x}(k \mid k) + G(k)\, u(k), \quad (25)$$

respectively, where $K$ is the filter gain determined by

$$K(k) = P(k \mid k-1)\, H^{T}(k)\, [H(k)\, P(k \mid k-1)\, H^{T}(k) + R(k)]^{-1}, \quad (26)$$

where $P(k \mid k-1)$ is the prediction covariance matrix, which can be determined by

$$P(k+1 \mid k) = \Phi(k)\, P(k \mid k)\, \Phi^{T}(k) + Q(k) \quad (27)$$

with

$$P(k \mid k) = P(k \mid k-1) - K(k)\, H(k)\, P(k \mid k-1). \quad (28)$$
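Equations (23)-(28) translate almost line by line into code. Below is a minimal Python (NumPy) sketch of one predict-update cycle; the constant-velocity model and the noise values in the usage note are illustrative assumptions, and the control input $u$ is omitted:

```python
import numpy as np

def kalman_step(x, P, z, Phi, H, Q, R):
    # Prediction, Eqs. (25) and (27), with no control input u.
    x_pred = Phi @ x
    P_pred = Phi @ P @ Phi.T + Q
    # Filter gain, Eq. (26).
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    # Update, Eqs. (25) and (28).
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = P_pred - K @ H @ P_pred
    return x_new, P_new
```

With $\Phi = [[1, 1], [0, 1]]$ (1-D constant velocity, unit time step) and $H = [[1, 0]]$, feeding noise-free positions 0, 1, ..., 9 drives the estimate toward position 9 and velocity 1.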
The Kalman filter is mainly employed to fuse low-level data. If the system can be described by a linear model and the error can be modeled as Gaussian noise, then the recursive Kalman filter obtains statistically optimal estimations [35]. However, other methods are required to address nonlinear dynamic models and nonlinear measurements. The modified Kalman filter known as the extended Kalman filter (EKF) is a standard approach for implementing nonlinear recursive filters [36]. The EKF is one of the most often employed methods for fusing data in robotic applications. However, it has some disadvantages, because the computations of the Jacobians are extremely expensive. Some attempts have been made to reduce the computational cost, such as linearization, but these attempts introduce errors in the filter and can make it unstable.
The unscented Kalman filter (UKF) [37] has gained popularity because it avoids the linearization step and the associated errors of the EKF [38]. The UKF employs a deterministic sampling strategy to establish a minimal set of points around the mean. This set of points captures the true mean and covariance completely. These points are then propagated through the nonlinear functions, and the covariance of the estimations can be recovered. Another advantage of the UKF is its suitability for parallel implementations.
4.3. Particle Filter. Particle filters are recursive implementations of sequential Monte Carlo methods [39]. This method builds the posterior density function using several random samples called particles. Particles are propagated over time with a combination of sampling and resampling steps. At each iteration, the resampling step discards some particles, increasing the relevance of the regions with a higher posterior probability. In the filtering process, several particles of the same state variable are employed, and each particle has an associated weight that indicates its quality. Therefore, the estimation is the result of a weighted sum of all of the particles. The standard particle filter algorithm has two phases: (1) the prediction phase and (2) the updating phase. In the prediction phase, each particle is modified according to the existing model, adding random noise to simulate the noise effect. Then, in the updating phase, the weight of each particle is reevaluated using the latest available sensor observation, and particles with low weights are removed. Specifically, a generic particle filter comprises the following steps.
(1) Initialization of the particles:
(i) let $N$ be the number of particles;
(ii) $x^{(i)}(1) = [x(1), y(1), 0, 0]$ for $i = 1, \ldots, N$.
(2) Prediction step:
(i) for each particle $i = 1, \ldots, N$, evaluate the state $x^{(i)}(k+1 \mid k)$ of the system using the state at time instant $k$ and the noise of the system at time $k$:

$$x^{(i)}(k+1 \mid k) = \Phi(k)\, x^{(i)}(k) + (\text{cauchy-distribution-noise})^{(i)}(k), \quad (29)$$

where $\Phi(k)$ is the transition matrix of the system.
(3) Evaluate the particle weights. For each particle $i = 1, \ldots, N$:
(i) compute the predicted observation of the system using the current predicted state and the noise at instant $k$:

$$z^{(i)}(k+1 \mid k) = H(k+1)\, x^{(i)}(k+1 \mid k) + (\text{gaussian-measurement-noise})(k+1); \quad (30)$$

(ii) compute the likelihoods (weights) according to the given distribution:

$$\text{likelihood}^{(i)} = N(z^{(i)}(k+1 \mid k);\ z(k+1), \text{var}); \quad (31)$$

(iii) normalize the weights as follows:

$$w^{(i)} = \frac{\text{likelihood}^{(i)}}{\sum_{j=1}^{N} \text{likelihood}^{(j)}}. \quad (32)$$

(4) Resampling/selection: multiply particles with higher weights and remove those with lower weights. The current state must be adjusted using the computed weights of the new particles.
(i) Compute the cumulative weights:

$$\text{CumWt}^{(i)} = \sum_{j=1}^{i} w^{(j)}. \quad (33)$$

(ii) Generate uniformly distributed random variables $u^{(i)} \sim U(0, 1)$, with the number of draws equal to the number of particles.
(iii) Determine which particles should be multiplied and which ones removed.
(5) Propagation phase:
(i) incorporate the new values of the state after the resampling at instant $k$ to calculate the value at instant $k+1$:

$$x^{(1:N)}(k+1 \mid k+1) = x(k+1 \mid k); \quad (34)$$

(ii) compute the posterior mean:

$$\hat{x}(k+1) = \text{mean}\,[x^{(i)}(k+1 \mid k+1)], \quad i = 1, \ldots, N; \quad (35)$$

(iii) repeat steps 2 to 5 for each time instant.
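The steps above condense into a short bootstrap particle filter. The Python sketch below assumes a 1-D random-walk state and Gaussian noise for both the dynamics and the measurement; the generic algorithm admits other distributions (e.g., the Cauchy process noise of step 2), and the seed and noise levels are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def particle_filter(measurements, n_particles=500, proc_std=1.0, meas_std=1.0):
    # Initialization (step 1): spread particles around the first measurement.
    particles = rng.normal(measurements[0], meas_std, n_particles)
    estimates = []
    for z in measurements:
        # Prediction (step 2): propagate each particle through the dynamics
        # (identity here) plus process noise.
        particles = particles + rng.normal(0.0, proc_std, n_particles)
        # Weighting (step 3): Gaussian likelihood of each particle, normalized.
        w = np.exp(-0.5 * ((z - particles) / meas_std) ** 2)
        w /= w.sum()
        # Resampling/selection (step 4): draw particles proportionally to weight.
        particles = rng.choice(particles, size=n_particles, p=w)
        # Posterior mean (step 5).
        estimates.append(particles.mean())
    return estimates
```

Feeding a constant measurement sequence concentrates the particle cloud around the measured value; with fewer particles, the variance of the estimate visibly increases, illustrating the particle-count trade-off discussed below.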
Particle filters are more flexible than Kalman filters and can cope with nonlinear dependencies and non-Gaussian densities in the dynamic model and in the noise error. However, they have some disadvantages. A large number of particles is required to obtain a small variance in the estimator. It is also difficult to establish the optimal number of particles in advance, and the number of particles affects the computational cost significantly. Earlier versions of particle filters employed a fixed number of particles, but recent studies have started to use a dynamic number of particles [40].
4.4. The Distributed Kalman Filter. The distributed Kalman filter requires correct clock synchronization between the sources, as demonstrated in [41]. In other words, to correctly use the distributed Kalman filter, the clocks of all of the sources must be synchronized. This synchronization is typically achieved by using protocols that employ a shared global clock, such as the network time protocol (NTP). Synchronization problems between clocks have been shown to have an effect on the accuracy of the Kalman filter, producing inaccurate estimations [42].
If the estimations are consistent and the cross-covariance is known (or the estimations are uncorrelated), then it is possible to use distributed Kalman filters [43]. However, the cross-covariance must be determined exactly, or the observations must be consistent.
We refer the reader to Liggins II et al. [20] for more details about the Kalman filter in distributed and hierarchical architectures.
4.5. Distributed Particle Filter. Distributed particle filters have gained attention recently [44–46]. Coates [45] used a distributed particle filter to monitor an environment that could be captured by a Markovian state-space model involving nonlinear dynamics and observations and non-Gaussian noise.
In contrast, earlier attempts to handle out-of-sequence measurements with particle filters are based on regenerating the probability density function at the time instant of the out-of-sequence measurement [47]. In a particle filter, this step requires a large computational cost, in addition to the space needed to store the previous particles. To avoid this problem, Orton and Marrs [48] proposed to store the information on the particles at each time instant, saving the cost of recalculating this information. This technique is close to optimal, and when the delay increases, the result is only slightly affected [49]. However, it requires a very large amount of space to store the state of the particles at each time instant.
4.6. Covariance Consistency Methods: Covariance Intersection/Union. Covariance consistency methods (intersection and union) were proposed by Uhlmann [43] and are general and fault-tolerant frameworks for maintaining covariance means and estimations in a distributed network. These methods are not estimation techniques per se; rather, they are techniques for fusing estimations. The distributed Kalman filter requirement of independent measurements or known cross-covariances is not a constraint with this method.
4.6.1. Covariance Intersection. If the Kalman filter is employed to combine two estimations, $(x_1, P_1)$ and $(x_2, P_2)$, then it is assumed that the joint covariance has the following form:

$$\begin{bmatrix} P_1 & P_{12} \\ P_{12}^{T} & P_2 \end{bmatrix}, \quad (36)$$

where the cross-covariance $P_{12}$ must be known exactly so that the Kalman filter can be applied without difficulty. Because the computation of the cross-covariances is computationally intensive, Uhlmann [43] proposed the covariance intersection (CI) algorithm.
Let us assume that a joint covariance can be defined with diagonal blocks $\tilde{P}_1 \geq P_1$ and $\tilde{P}_2 \geq P_2$:

$$\begin{bmatrix} \tilde{P}_1 & 0 \\ 0 & \tilde{P}_2 \end{bmatrix} \quad (37)$$

for every possible instance of the unknown cross-covariance; then, the components of this matrix can be employed in the Kalman filter equations to provide a fused estimation $(x, P)$ that is considered consistent. The key point of this method relies on generating a joint covariance matrix that can represent a useful fused estimation (in this context, useful refers to an estimation with a lower associated uncertainty). In summary, the CI algorithm computes the joint covariance matrix for which the Kalman filter provides the best fused estimation $(x, P)$ with respect to a fixed measure of the covariance matrix (e.g., the minimum determinant).
A specific covariance criterion must be established because there is no unique minimum joint covariance in the order of the positive semidefinite matrices. Moreover, although the joint covariance is the basis of the formal analysis of the CI algorithm, the actual result is a nonlinear mixture of the information stored in the estimations being fused, following the equations

$$P = \left(\omega_1 H_1^{T} P_1^{-1} H_1 + \omega_2 H_2^{T} P_2^{-1} H_2 + \cdots + \omega_n H_n^{T} P_n^{-1} H_n\right)^{-1},$$
$$x = P \left(\omega_1 H_1^{T} P_1^{-1} x_1 + \omega_2 H_2^{T} P_2^{-1} x_2 + \cdots + \omega_n H_n^{T} P_n^{-1} x_n\right), \quad (38)$$

where $H_i$ is the transformation of the fused state-space estimation to the space of the estimated state $i$. The values of $\omega_i$ can be calculated to minimize the covariance determinant using convex optimization packages and semidefinite programming. The result of the CI algorithm has different characteristics compared to the Kalman filter. For example, if two estimations $(a, P)$ and $(b, P)$ with equal covariances are provided, then, since the Kalman filter is based on the statistical independence assumption, it produces a fused estimation with covariance $P/2$. In contrast, the CI method does not assume independence and, thus, must be consistent even in the case in which the estimations are completely correlated, giving the estimated fused covariance $P$. In the case of estimations in which $P_a < P_b$, the CI algorithm does not gain information from the estimation $(b, P_b)$; thus, the fused result is $(a, P_a)$.
Every consistent joint covariance is sufficient to produce a fused estimation that guarantees consistency. However, it is also necessary to guarantee a lack of divergence. Divergence is avoided in the CI algorithm by choosing a specific measure (e.g., the determinant), which is minimized in each fusion operation. This measure represents a nondivergence criterion, because the size of the estimated covariance according to this criterion will not increase.
The application of the CI method guarantees consistency and nondivergence for every sequence of mean- and covariance-consistent estimations. However, this method does not work well when the measurements to be fused are inconsistent.
4.6.2. Covariance Union. CI solves the problem of correlated inputs but not the problem of inconsistent inputs (inconsistent inputs refer to different estimations, each of which has a high accuracy (small variance) but also a large difference from the states of the others); thus, the covariance union (CU) algorithm was proposed to solve the latter [43]. CU addresses the following problem: two estimations, $(x_1, P_1)$ and $(x_2, P_2)$, relate to the state of an object and are mutually inconsistent. This issue arises when the difference between the estimated means is larger than the provided covariances allow. Inconsistent inputs can be detected using the Mahalanobis distance [50] between them, which is defined as

$$d = (x_1 - x_2)^{T} (P_1 + P_2)^{-1} (x_1 - x_2), \quad (39)$$

and checking whether this distance is larger than a given threshold.
The Mahalanobis distance accounts for the covariance information when computing the distance. If the difference between the estimations is high but their covariances are also high, the Mahalanobis distance yields a small value. In contrast, if the difference between the estimations is small but the covariances are also small, it can produce a larger distance value. A high Mahalanobis distance could indicate that the estimations are inconsistent; however, a specific threshold must be established by the user or learned automatically.
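The inconsistency test of Eq. (39) is straightforward to implement; in the Python sketch below, the threshold value is an assumed user choice (e.g., a chi-square quantile), as the text notes it must be set by the user or learned:

```python
import numpy as np

def mahalanobis_sq(mu1, P1, mu2, P2):
    # Squared Mahalanobis distance between two estimates, Eq. (39).
    d = mu1 - mu2
    return float(d.T @ np.linalg.inv(P1 + P2) @ d)

def inconsistent(mu1, P1, mu2, P2, threshold=9.0):
    # threshold is an assumed user-chosen value, not prescribed by the text.
    return mahalanobis_sq(mu1, P1, mu2, P2) > threshold
```

Two unit-variance scalar estimates at 0 and 10 give a squared distance of 50 and are flagged as inconsistent, while estimates at 0 and 1 give 0.5 and are not.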
The CU algorithm aims to solve the following problem: suppose that a filtering algorithm provides two observations with means and covariances $(x_1, P_1)$ and $(x_2, P_2)$, respectively. It is known that one of the observations is correct and the other is erroneous; however, the identity of the correct estimation is unknown and cannot be determined. In this situation, if both estimations are employed as input to the Kalman filter, there will be a problem, because the Kalman filter only guarantees a consistent output if the track is updated with a measurement that is consistent with it. In the specific case in which the measurements correspond to the same object but are acquired from two different sensors, the Kalman filter can only guarantee that the output is consistent if it is consistent with both measurements separately. Because it is not possible to know which estimation is correct, the only way to combine the two estimations rigorously is to provide an estimation $(u, U)$ that is consistent with both estimations and obeys the following properties:
1+ (
1) (
1)
,
2+ (
2) (
2)
,
(40)
where some measure of the matrix size (i.e., the determinant) is
minimized.

In other words, the previous equations indicate that if the
estimation (\mu_1, P_1) is consistent, then translating the vector
\mu_1 to u requires increasing the covariance by adding a matrix at
least as large as the product (u - \mu_1)(u - \mu_1)^T in order to
remain consistent. The same situation applies to the measurement
(\mu_2, P_2) in order for it to be consistent.
A simple strategy is to choose the mean of one of the measurements as
the fused value (u = \mu_1). In this case, the value of U must be
chosen such that the estimation is consistent with the worst case
(the correct measurement is \mu_2). However, it is possible to assign
u an intermediate value between \mu_1 and \mu_2 to decrease the value
of U. Therefore, the CU algorithm establishes the mean fused value u
that has the least covariance U while remaining sufficiently large
for both measurements (\mu_1 and \mu_2) to be consistent.
Because the matrix inequalities presented in the previous equations
are convex, convex optimization algorithms must be employed to solve
them. The value of U can be computed with the iterative method
described by Julier et al. [51]. The obtained covariance could be
significantly larger than any of the initial covariances and is an
indicator of the existing uncertainty between the initial
estimations. One of the advantages of the CU method is that the same
process can easily be extended to more than two inputs.
5. Decision Fusion Methods
A decision is typically taken based on the knowledge of the perceived
situation, which is provided by many sources in the data fusion
domain. These techniques aim to make a high-level inference about the
events and activities that are produced from the detected targets.
They often use symbolic information, and the fusion process requires
reasoning while accounting for uncertainties and constraints. These
methods fall under level 2 (situation assessment) and level 4 (impact
assessment) of the JDL data fusion model.
5.1. The Bayesian Methods. Information fusion based on the Bayesian
inference provides a formalism for combining evidence according to
the rules of probability theory. Uncertainty is represented using
conditional probability terms that describe beliefs and take on
values in the interval [0, 1], where zero indicates a complete lack
of belief and one indicates an absolute belief. The Bayesian
inference is based on the Bayes rule as follows:

P(H | E) = \frac{P(E | H) P(H)}{P(E)}, (41)

where the posterior probability, P(H | E), represents the belief in
the hypothesis H given the information E. This probability is
obtained by multiplying the a priori probability of the hypothesis
P(H) by the probability of having E given that H is true, P(E | H).
The value P(E) is used as a normalizing constant. The main
disadvantage of the Bayesian inference is that the probabilities P(H)
and P(E | H) must be known. To estimate the conditional
probabilities, Pan et al. [52] proposed the use of NNs, whereas Coue
et al. [53] proposed Bayesian programming.
Hall and Llinas [54] described the following problems associated with
Bayesian inference.
(i) Difficulty in establishing the value of a priori probabilities.
(ii) Complexity when there are multiple potential hypotheses and a
substantial number of events that depend on the conditions.
(iii) The hypotheses should be mutually exclusive.
(iv) Difficulty in describing the uncertainty of the decisions.
5.2. The Dempster-Shafer Inference. The Dempster-Shafer inference is
based on the mathematical theory introduced by Dempster [55] and
Shafer [56], which generalizes the Bayesian theory. The
Dempster-Shafer theory provides a formalism that can be used to
represent incomplete knowledge, update beliefs, and combine evidence,
and it allows us to represent uncertainty explicitly [57].

A fundamental concept in the Dempster-Shafer reasoning is the frame
of discernment, which is defined as follows. Let \Theta = \{\theta_1,
\theta_2, \ldots, \theta_N\} be the set of all possible states that
define the system, and let \Theta be exhaustive and mutually
exclusive, because the system can be in only one state \theta_i,
where 1 \le i \le N. The set \Theta is called a frame of discernment,
because its elements are employed to discern the current state of the
system.
The elements of the power set 2^\Theta are called hypotheses. In the
Dempster-Shafer theory, based on the evidence E, a probability is
assigned to each hypothesis A \in 2^\Theta according to the basic
probability assignment, or mass function, m : 2^\Theta \to [0, 1],
which satisfies

m(\emptyset) = 0. (42)
Thus, the mass function of the empty set is zero. Furthermore, the
mass function of a hypothesis is larger than or equal to zero for all
of the hypotheses. Consider

m(A) \ge 0, \forall A \in 2^\Theta. (43)

The sum of the mass function over all the hypotheses is one. Consider

\sum_{A \in 2^\Theta} m(A) = 1. (44)

To express incomplete beliefs in a hypothesis A, the Dempster-Shafer
theory defines the belief function bel : 2^\Theta \to [0, 1] over
\Theta as

bel(A) = \sum_{B \subseteq A} m(B), (45)

where bel(\emptyset) = 0 and bel(\Theta) = 1. The doubt level in A
can be expressed in terms of the belief function by

dou(A) = bel(\neg A) = \sum_{B \subseteq \neg A} m(B). (46)

To express the plausibility of each hypothesis, the function pl :
2^\Theta \to [0, 1] over \Theta is defined as

pl(A) = 1 - dou(A) = \sum_{B \cap A \ne \emptyset} m(B). (47)

Intuitively, plausibility indicates that there is less uncertainty in
hypothesis A if it is more plausible. The confidence interval
[bel(A), pl(A)] defines the true belief in hypothesis A. To combine
the effects of two mass functions m_1 and m_2, the Dempster-Shafer
theory defines the combination rule m_1 \oplus m_2 as

m_1 \oplus m_2(\emptyset) = 0,

m_1 \oplus m_2(A) = \frac{\sum_{B \cap C = A} m_1(B) m_2(C)}
{1 - \sum_{B \cap C = \emptyset} m_1(B) m_2(C)}. (48)
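A minimal sketch of the combination rule (48) over a small frame of discernment, with hypotheses represented as frozensets and illustrative mass values:

```python
from itertools import product

def dempster_combine(m1, m2):
    """Combine two mass functions over the same frame, Eq. (48).

    m1, m2: dicts mapping frozenset hypotheses to mass values."""
    combined, conflict = {}, 0.0
    for (b, mb), (c, mc) in product(m1.items(), m2.items()):
        a = b & c
        if a:
            combined[a] = combined.get(a, 0.0) + mb * mc
        else:
            conflict += mb * mc  # mass that would go to the empty set
    if conflict >= 1.0:
        raise ValueError("totally conflicting evidence")
    # Normalize by 1 - K, where K is the total conflict.
    return {a: v / (1.0 - conflict) for a, v in combined.items()}

# Frame of discernment {ship, plane}; two sources of evidence.
ship, plane = frozenset({"ship"}), frozenset({"plane"})
both = ship | plane
m1 = {ship: 0.6, both: 0.4}          # source 1 leans toward ship
m2 = {ship: 0.5, plane: 0.3, both: 0.2}
fused = dempster_combine(m1, m2)
```

Note how mass on the full set `both` encodes ignorance rather than being forced onto a specific hypothesis, which is the key difference from assigning Bayesian priors.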
In contrast to the Bayesian inference, a priori probabilities are not
required in the Dempster-Shafer inference, because they are assigned
at the instant at which the information is provided. Several studies
in the literature have compared the use of the Bayesian inference and
the Dempster-Shafer inference, such as [58–60]. Wu et al. [61] used
the Dempster-Shafer theory to fuse information in context-aware
environments. This work was extended in [62] to dynamically modify
the weights associated with the sensor measurements. Therefore, the
fusion mechanism is calibrated according to the recent measurements
of the sensors (in cases in which the ground-truth is available). In
the military domain [63], the Dempster-Shafer reasoning is used with
the a priori information stored in a database for classifying
military ships. Morbee et al. [64] described the use of the
Dempster-Shafer theory to build 2D occupancy maps from several
cameras and to evaluate the contribution of subsets of cameras to a
specific task. Each task is the observation of an event of interest,
and the goal is to assess the validity of a set of hypotheses that
are fused using the Dempster-Shafer theory.
5.3. Abductive Reasoning. Abductive reasoning, or inferring the best
explanation, is a reasoning method in which a hypothesis is chosen
under the assumption that, if it is true, it explains the observed
event most accurately [65]. In other words, when an event is
observed, the abduction method attempts to find the best explanation.

In the context of probabilistic reasoning, abductive inference finds
the maximum likelihood (ML) posterior configuration of the system
variables given some observed variables. Abductive reasoning is more
a reasoning pattern than a data fusion technique. Therefore,
different inference methods, such as NNs [66] or fuzzy logic [67],
can be employed.
5.4. Semantic Methods. Decision fusion techniques that employ
semantic data from different sources as an input could provide more
accurate results than those that rely on single sources alone. There
is a growing interest in techniques that automatically determine the
presence of semantic features in videos to bridge the semantic gap
[68].
Semantic information fusion is essentially a scheme in which raw
sensor data are processed such that the nodes exchange only the
resultant semantic information. Semantic information fusion typically
covers two phases: (i) building the knowledge and (ii) pattern
matching (inference). The first phase (typically offline)
incorporates the most appropriate knowledge into semantic
information. Then, the second phase (typically online or in
real-time) fuses relevant attributes and provides a semantic
interpretation of the sensor data [69–71].
Semantic fusion can be viewed as a way of integrating and translating
sensor data into formal languages. The language obtained from the
observations of the environment is then compared with similar
languages that are stored in the database. The key to this strategy
is that similar behaviors represented by formal languages are also
semantically similar. This type of method provides savings in the
cost of transmission, because the nodes need only transmit the formal
language structure instead of the raw data. However, a known set of
behaviors must be stored in a database in advance, which might be
difficult in some scenarios.
6. Conclusions
This paper reviews the most popular methods and techniques for
performing data/information fusion. To determine whether the
application of data/information fusion methods is feasible, we must
evaluate the computational cost of the process and the delay
introduced in the communication. A centralized data fusion approach
is theoretically optimal when there is no cost of transmission and
there are sufficient computational resources. However, this situation
typically does not hold in practical applications.

The selection of the most appropriate technique depends on the type
of problem and the established assumptions of each technique.
Statistical data fusion methods (e.g., PDA, JPDA, MHT, and Kalman)
are optimal under specific conditions [72]. First, the assumption
that the targets are moving
independently and that the measurements are normally distributed
around the predicted position typically does not hold. Second,
because the statistical techniques model all of the events as
probabilities, they typically have several parameters and a priori
probabilities for false measurements and detection errors that are
often difficult to obtain (at least in an optimal sense). For
example, in the case of the MHT algorithm, specific parameters must
be established that are nontrivial to determine and are very
sensitive [73]. In contrast, statistical methods that optimize over
several frames are computationally intensive, and their complexity
typically grows exponentially with the number of targets. For
example, in the case of particle filters, tracking several targets
can be accomplished jointly as a group or individually. If several
targets are tracked jointly, the necessary number of particles grows
exponentially. Therefore, in practice, it is better to track them
individually, under the assumption that the targets do not interact
with one another.
In contrast to centralized systems, the distributed data fusion
methods introduce some challenges into the data fusion process, such
as (i) spatial and temporal alignment of the information, (ii)
out-of-sequence measurements, and (iii) data correlation, as reported
by Castanedo et al. [74, 75]. The inherent redundancy of distributed
systems could be exploited with distributed reasoning techniques and
cooperative algorithms to improve the individual node estimations, as
reported by Castanedo et al. [76]. In addition to the previous
studies, a new trend based on the geometric notion of a
low-dimensional manifold is gaining attention in the data fusion
community. An example is the work of Davenport et al. [77], which
proposes a simple model that captures the correlation between the
sensor observations by matching the parameter values for the
different obtained manifolds.
Acknowledgments
The author would like to thank Jesús García, Miguel A. Patricio, and
James Llinas for their interesting and related discussions on several
of the topics presented in this paper.
References
[1] JDL, Data Fusion Lexicon, Technical Panel For C3, F. E. White,
San Diego, Calif, USA, Code 420, 1991.
[2] D. L. Hall and J. Llinas, "An introduction to multisensor data
fusion," Proceedings of the IEEE, vol. 85, no. 1, pp. 6–23, 1997.
[3] H. F. Durrant-Whyte, "Sensor models and multisensor integration,"
International Journal of Robotics Research, vol. 7, no. 6, pp.
97–113, 1988.
[4] B. V. Dasarathy, "Sensor fusion potential exploitation-innovative
architectures and illustrative applications," Proceedings of the
IEEE, vol. 85, no. 1, pp. 24–38, 1997.
[5] R. C. Luo, C.-C. Yih, and K. L. Su, "Multisensor fusion and
integration: approaches, applications, and future research
directions," IEEE Sensors Journal, vol. 2, no. 2, pp. 107–119, 2002.
[6] J. Llinas, C. Bowman, G. Rogova, A. Steinberg, E. Waltz, and
F. White, "Revisiting the JDL data fusion model II," Technical
Report, DTIC Document, 2004.
[7] E. P. Blasch and S. Plano, "JDL level 5 fusion model: user
refinement issues and applications in group tracking," in Proceedings
of the Signal Processing, Sensor Fusion, and Target Recognition XI,
pp. 270–279, April 2002.
[8] H. F. Durrant-Whyte and M. Stevens, "Data fusion in decentralized
sensing networks," in Proceedings of the 4th International Conference
on Information Fusion, pp. 302–307, Montreal, Canada, 2001.
[9] J. Manyika and H. Durrant-Whyte, Data Fusion and Sensor
Management: A Decentralized Information-Theoretic Approach, Prentice
Hall, Upper Saddle River, NJ, USA, 1995.
[10] S. S. Blackman, "Association and fusion of multiple sensor
data," in Multitarget-Multisensor Tracking: Advanced Applications,
pp. 187–217, Artech House, 1990.
[11] S. Lloyd, "Least squares quantization in PCM," IEEE Transactions
on Information Theory, vol. 28, no. 2, pp. 129–137, 1982.
[12] M. Shindler, A. Wong, and A. Meyerson, "Fast and accurate
k-means for large datasets," in Proceedings of the 25th Annual
Conference on Neural Information Processing Systems (NIPS '11), pp.
2375–2383, December 2011.
[13] Y. Bar-Shalom and E. Tse, "Tracking in a cluttered environment
with probabilistic data association," Automatica, vol. 11, no. 5, pp.
451–460, 1975.
[14] T. E. Fortmann, Y. Bar-Shalom, and M. Scheffe, "Multi-target
tracking using joint probabilistic data association," in Proceedings
of the 19th IEEE Conference on Decision and Control including the
Symposium on Adaptive Processes, vol. 19, pp. 807–812, December 1980.
[15] D. B. Reid, "An algorithm for tracking multiple targets," IEEE
Transactions on Automatic Control, vol. 24, no. 6, pp. 843–854, 1979.
[16] C. L. Morefield, "Application of 0-1 integer programming to
multitarget tracking problems," IEEE Transactions on Automatic
Control, vol. 22, no. 3, pp. 302–312, 1977.
[17] R. L. Streit and T. E. Luginbuhl, "Maximum likelihood method for
probabilistic multihypothesis tracking," in Proceedings of the Signal
and Data Processing of Small Targets, vol. 2235 of Proceedings of
SPIE, p. 394, 1994.
[18] I. J. Cox and S. L. Hingorani, "Efficient implementation of
Reid's multiple hypothesis tracking algorithm and its evaluation for
the purpose of visual tracking," IEEE Transactions on Pattern
Analysis and Machine Intelligence, vol. 18, no. 2, pp. 138–150, 1996.
[19] K. G. Murty, "An algorithm for ranking all the assignments in
order of increasing cost," Operations Research, vol. 16, no. 3, pp.
682–687, 1968.
[20] M. E. Liggins II, C.-Y. Chong, I. Kadar et al., "Distributed
fusion architectures and algorithms for target tracking," Proceedings
of the IEEE, vol. 85, no. 1, pp. 95–106, 1997.
[21] S. Coraluppi, C. Carthel, M. Luettgen, and S. Lynch,
"All-source track and identity fusion," in Proceedings of the
National Symposium on Sensor and Data Fusion, 2000.
[22] P. Storms and F. Spieksma, "An LP-based algorithm for the data
association problem in multitarget tracking," in Proceedings of the
3rd IEEE International Conference on Information Fusion, vol. 1,
2000.
[23] S.-W. Joo and R. Chellappa, "A multiple-hypothesis approach for
multiobject visual tracking," IEEE Transactions on Image Processing,
vol. 16, no. 11, pp. 2849–2854, 2007.
[24] S. Coraluppi and C. Carthel, "Aggregate surveillance: a
cardinality tracking approach," in Proceedings of the 14th
International Conference on Information Fusion (FUSION '11), July
2011.
[25] K. C. Chang, C. Y. Chong, and Y. Bar-Shalom, "Joint
probabilistic data association in distributed sensor networks," IEEE
Transactions on Automatic Control, vol. 31, no. 10, pp. 889–897,
1986.
[26] C. Y. Chong, S. Mori, and K. C. Chang, "Information fusion in
distributed sensor networks," in Proceedings of the 4th American
Control Conference, Boston, Mass, USA, June 1985.
[27] C. Y. Chong, S. Mori, and K. C. Chang, "Distributed multitarget
multisensor tracking," in Multitarget-Multisensor Tracking: Advanced
Applications, vol. 1, pp. 247–295, 1990.
[28] J. Pearl, Probabilistic Reasoning in Intelligent Systems:
Networks of Plausible Inference, Morgan Kaufmann, San Mateo, Calif,
USA, 1988.
[29] D. Koller and N. Friedman, Probabilistic Graphical Models:
Principles and Techniques, MIT Press, 2009.
[30] L. Chen, M. Cetin, and A. S. Willsky, "Distributed data
association for multi-target tracking in sensor networks," in
Proceedings of the 7th International Conference on Information Fusion
(FUSION '05), pp. 9–16, July 2005.
[31] L. Chen, M. J. Wainwright, M. Cetin, and A. S. Willsky, "Data
association based on optimization in graphical models with
application to sensor networks," Mathematical and Computer Modelling,
vol. 43, no. 9-10, pp. 1114–1135, 2006.
[32] Y. Weiss and W. T. Freeman, "On the optimality of solutions of
the max-product belief-propagation algorithm in arbitrary graphs,"
IEEE Transactions on Information Theory, vol. 47, no. 2, pp. 736–744,
2001.
[33] C. Brown, H. Durrant-Whyte, J. Leonard, B. Rao, and B. Steer,
"Distributed data fusion using Kalman filtering: a robotics
application," in Data Fusion in Robotics and Machine Intelligence,
M. A. Abidi and R. C. Gonzalez, Eds., pp. 267–309, 1992.
[34] R. E. Kalman, "A new approach to linear filtering and prediction
problems," Journal of Basic Engineering, vol. 82, no. 1, pp. 35–45,
1960.
[35] R. C. Luo and M. G. Kay, "Data fusion and sensor integration:
state-of-the-art 1990s," in Data Fusion in Robotics and Machine
Intelligence, pp. 7–135, 1992.
[36] G. Welch and G. Bishop, An Introduction to the Kalman Filter,
ACM SIGGRAPH 2001 Course Notes, 2001.
[37] S. J. Julier and J. K. Uhlmann, "A new extension of the Kalman
filter to nonlinear systems," in Proceedings of the International
Symposium on Aerospace/Defense Sensing, Simulation and Controls, vol.
3, 1997.
[38] E. A. Wan and R. Van Der Merwe, "The unscented Kalman filter for
nonlinear estimation," in Proceedings of the Adaptive Systems for
Signal Processing, Communications, and Control Symposium (AS-SPCC
'00), pp. 153–158, 2000.
[39] D. Crisan and A. Doucet, "A survey of convergence results on
particle filtering methods for practitioners," IEEE Transactions on
Signal Processing, vol. 50, no. 3, pp. 736–746, 2002.
[40] J. Martinez-del Rincon, C. Orrite-Urunuela, and J. E.
Herrero-Jaraba, "An efficient particle filter for color-based
tracking in complex scenes," in Proceedings of the IEEE Conference on
Advanced Video and Signal Based Surveillance, pp. 176–181, 2007.
[41] S. Ganeriwal, R. Kumar, and M. B. Srivastava, "Timing-sync
protocol for sensor networks," in Proceedings of the 1st
International Conference on Embedded Networked Sensor Systems
(SenSys '03), pp. 138–149, November 2003.
[42] M. Manzo, T. Roosta, and S. Sastry, "Time synchronization in
networks," in Proceedings of the 3rd ACM Workshop on Security of Ad
Hoc and Sensor Networks (SASN '05), pp. 107–116, November 2005.
[43] J. K. Uhlmann, "Covariance consistency methods for
fault-tolerant distributed data fusion," Information Fusion, vol. 4,
no. 3, pp. 201–215, 2003.
[44] S. Bashi, V. P. Jilkov, X. R. Li, and H. Chen, "Distributed
implementations of particle filters," in Proceedings of the 6th
International Conference of Information Fusion, pp. 1164–1171, 2003.
[45] M. Coates, "Distributed particle filters for sensor networks,"
in Proceedings of the 3rd International Symposium on Information
Processing in Sensor Networks (ACM '04), pp. 99–107, New York, NY,
USA, 2004.
[46] D. Gu, "Distributed particle filter for target tracking," in
Proceedings of the IEEE International Conference on Robotics and
Automation (ICRA '07), pp. 3856–3861, April 2007.
[47] Y. Bar-Shalom, "Update with out-of-sequence measurements in
tracking: exact solution," IEEE Transactions on Aerospace and
Electronic Systems, vol. 38, no. 3, pp. 769–778, 2002.
[48] M. Orton and A. Marrs, "A Bayesian approach to multi-target
tracking and data fusion with Out-of-Sequence Measurements," IEE
Colloquium, no. 174, pp. 15/1–15/5, 2001.
[49] M. L. Hernandez, A. D. Marrs, S. Maskell, and M. R. Orton,
"Tracking and fusion for wireless sensor networks," in Proceedings of
the 5th International Conference on Information Fusion, 2002.
[50] P. C. Mahalanobis, "On the generalized distance in statistics,"
Proceedings of the National Institute of Sciences of India, vol. 2,
no. 1, pp. 49–55, 1936.
[51] S. J. Julier, J. K. Uhlmann, and D. Nicholson, "A method for
dealing with assignment ambiguity," in Proceedings of the American
Control Conference (ACC '04), vol. 5, pp. 4102–4107, July 2004.
[52] H. Pan, Z.-P. Liang, T. J. Anastasio, and T. S. Huang, "Hybrid
NN-Bayesian architecture for information fusion," in Proceedings of
the International Conference on Image Processing (ICIP '98), pp.
368–371, October 1998.
[53] C. Coue, T. Fraichard, P. Bessière, and E. Mazer, "Multi-sensor
data fusion using Bayesian programming: an automotive application,"
in P