Pattern Recognition 41 (2008) 3481 -- 3492
Contents lists available at ScienceDirect
Pattern Recognition
journal homepage: www.elsevier.com/locate/pr
Recognising online spatial activities using a bioinformatics inspired sequence alignment approach
Daniel E. Riedel, Svetha Venkatesh, Wanquan Liu
Department of Computing, Curtin University of Technology, GPO
Box U1987 Perth, Western Australia 6845, Australia
ARTICLE INFO
Article history: Received 25 May 2007; received in revised form 16 April 2008; accepted 17 April 2008
Keywords: Activity recognition; Bioinformatics; Sequence alignment; Dynamic time warping
ABSTRACT
In this paper we address the problem of recognising embedded activities within continuous spatial sequences obtained from an online video tracking system. Traditionally, continuous data streams such as video tracking data are buffered with a sliding window applied to the buffered data stream for activity detection. We introduce an algorithm based on Smith-Waterman (SW) local alignment from the field of bioinformatics that can locate and accurately quantify embedded activities within a windowed sequence. The modified SW approach utilises dynamic programming with two dimensional spatial data to quantify sequence similarity and is capable of recognising sequences containing gaps and significant amounts of noise. A more efficient SW formulation for online recognition, called Online SW (OSW), is also developed. Through experimentation we show that the OSW algorithm can accurately and robustly recognise manually segmented activity sequences as well as embedded sequences from an online tracking system. To benchmark the classification performance of OSW we compare the approach to dynamic time warping (DTW) and the discrete hidden Markov model (HMM). Results demonstrate that OSW produces higher precision and recall than both DTW and the HMM in an online recognition context. With accurately segmented sequences the SW approach produces results comparable to DTW and superior to the HMM. Finally, we confirm the robust property of the SW approach by evaluating it with sequences containing artificially incorporated noise.
Crown Copyright 2008 Published by Elsevier Ltd. All rights
reserved.
1. Introduction
Video surveillance and automatic recognition of human activities are becoming increasingly important in modern society. Such recognition systems can be used to detect suspicious activities in complex environments such as airports and railway stations, to control interfaces in human computer interaction (HCI) applications or to provide a means for monitoring elderly individuals in a caring capacity. In this paper we recognise human activities, represented by spatial sequences, in a simulated smart home environment and further restrict the problem domain to the elderly. In doing so, we assume that the elderly carry out activities in a habitual manner; an assumption consistent with the findings of Monk et al. [1] and Suzuki et al. [2]. Providing smart houses for the elderly is important in maintaining the quality of life and independence of the aging population and reducing the on-going costs of care associated with that maintenance.
Corresponding author. Tel.: +61892662746. E-mail addresses: [email protected] (D.E. Riedel), [email protected] (S. Venkatesh), [email protected] (W. Liu).
0031-3203/$30.00 Crown Copyright 2008 Published by Elsevier Ltd. All rights reserved. doi:10.1016/j.patcog.2008.04.019
Activity recognition is the task of identifying an action or series of actions taken in pursuit of an objective. This view of activity recognition differs from existing works where researchers view the detection of walking, running or similar primitive actions as activity recognition. In a smart home setting, objectives could include but not be limited to having breakfast, reading a newspaper, watching television, cooking, having a shower or going to sleep. The series of actions that one would need to recognise in order to determine whether an objective has been accomplished may involve walking to a table, opening a cupboard, sitting on a chair or even lying on a bed. We focus our attention on the spatial component of activities as they can be obtained non-invasively from video tracking systems and for the majority of activities spatial signatures are unique.
Much work has been done in the area of human activity recognition over the last decade. Existing methodologies can be classified according to the manner by which activities are modelled, producing two distinct methodologies: state-space models and template matching techniques [3]. State-space models attempt to capture the variation in spatial sequences. These approaches include neural networks [4-6], hidden Markov models (HMMs) [7] and extensions to the HMM [8-12]. The HMM and its variants have been used
successfully in dealing with uncertainty but suffer from high training complexity, in particular the multi-layer and hierarchical models. Template matching approaches such as [13-17] compare extracted features to pre-stored patterns or templates, but have issues with high runtime complexity, noise intolerance, spatial activity variation, and/or viewpoint specificity.
The activity sequences used by these activity recognition approaches are typically captured using online video tracking systems. These systems produce continuous and uniform streams of tracking data that contain known activities and non-activity subsequences, corresponding to movement between activities and deviations from known activity paths. In order to isolate individual spatial activity sequences for quantification with models or templates one can either use a sliding window with width w on the buffered stream (Fig. 1(a)-(c)) or segmentation of the data stream. Sliding window approaches compare the fixed length window sequence to pre-stored templates for similarity quantification prior to classification, as shown in Fig. 1(d). Segmentation techniques recognise and extract embedded activities through location of activity boundaries, prior to similarity quantification and classification. Segmentation of continuous data is not a new problem and has been addressed previously in domains such as speech and gait recognition. Unfortunately, in the spatial activity domain segmentation is more difficult as activity boundaries are not so obvious.
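As an illustration only, the sliding window buffering described above can be sketched in Python. The function name `sliding_windows` and the toy track are assumptions made for this sketch and are not part of the tracking system used in this work.

```python
from collections import deque
from typing import Iterable, Iterator, List, Tuple

Point = Tuple[float, float]  # one (x, y) tracking sample


def sliding_windows(stream: Iterable[Point], w: int) -> Iterator[List[Point]]:
    """Yield every full window holding the w most recent tracking points."""
    buffer = deque(maxlen=w)  # the oldest point is dropped automatically
    for point in stream:
        buffer.append(point)
        if len(buffer) == w:
            yield list(buffer)


# Hypothetical track of five points moving along the diagonal.
track = [(float(t), float(t)) for t in range(5)]
windows = list(sliding_windows(track, w=3))
```

Each yielded window would then be passed to a similarity measure and compared against the stored class templates.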
Few methods have been proposed to specifically deal with online activity segmentation. In the methods of Bobick and Ivanov [18] and Ivanov and Bobick [19] a sliding window is applied to an observed sequence to allow for inferencing with low level HMMs. The observed sequence is then labelled accordingly. Like other sliding window approaches, the performance of this technique is sensitive to the specified window size. Another segmentation approach is given by Peursum et al. [20], where observed sequences are segmented and classified using HMMs trained with manually labelled activity sequences. During classification, the probability of a sequence having a particular label is determined and through calculation of the probability at each time instance, the boundaries of the activities can also be found.
To apply any of the above activity segmentation methodologies in isolation, for segmentation of a continuous data stream, is problematic. This is because the segmentation components of the approaches are intertwined with the recognition capabilities. Therefore, one must still adopt a sliding window approach to identify embedded activities, particularly if one wishes to use unrelated sequence matching techniques for similarity quantification. Given this constraint, two issues relating to sliding windows need to be addressed. The first of these is the window size w. If one assumes that activities are conducted over a similar duration, as are habitual activities with the elderly, and in ideal tracking conditions (such as in controlled indoor environments) then it is appropriate to use a window size corresponding to the length of the longest activity. Realistically, the window size must be set to some value larger than the longest activity length, taking into account a feasible increase in possible activity duration. With an appropriately sized sliding window (normally set to the length of the longest activity or the average length plus 2.5 standard deviations), the second issue relates to locating an activity within that window sequence as shown in Fig. 1(d). Quantifying window sequences poses a problem for classification as the corresponding sequences can contain additional subsequence elements, which are not part of an embedded activity. These superfluous elements can in turn reduce the probability of an activity occurring in relation to a learnt model or increase the aligned distance to a class template. Even if duration is constant across activities, variation in captured sequence length occurs, due to tracking systems failing to consistently track objects. Possible reasons for the failure include occlusions, lighting variation, deficiencies in background subtraction techniques and geometric modelling limitations.
Techniques like the HMM [21] take the whole window sequence into account when calculating the probability of an observed sequence belonging to a given model. As a result, the superfluous elements decrease the resulting sequence probability, particularly if they have estimated symbol probabilities close to zero. Dynamic time warping (DTW) [22] and similar global sequence alignment approaches such as edit distance with real penalty (ERP) [23] and edit distance on real sequence (EDR) [24] are also susceptible to superfluous sequence elements. This occurs as the techniques attempt to minimise the distance across the entirety of both the known and observed sequences, taking into account the additional distance from the superfluous elements. Similarity algorithms based on the longest common subsequence (LCSS) [25,26] address this global limitation by ignoring superfluous elements in the observed sequences. Unfortunately, the techniques also allow significant deviations in a pattern, which can lead to incorrect classification. For instance, if one measures the LCSS between a known 1D sequence a = [13] and two observed sequences b = [123] and c = [1224443], both b and c have the same LCSS length (of three), indicating that both are equally similar to a. However, by visual inspection, one would say that b is more similar to a.
In order to recognise known spatial activity patterns and to address the above-mentioned deficiencies, we apply the Smith-Waterman (SW) local alignment approach from bioinformatics and modify the algorithm for two dimensional online spatial activity recognition. In previous work [27], we proposed the use of SW for spatial activity recognition and successfully evaluated the approach using accurately and inaccurately segmented activity sequences. In this paper, we provide OSW (online SW), a more efficient SW formulation for online recognition. Unlike DTW and HMMs, OSW does not require accurate sequence segmentation to correctly recognise embedded sequences, such as those found in sliding windows of online recognition systems. To prove the superiority of the OSW approach over DTW and the discrete HMM, we evaluate it in an online context with a sliding window and using a 12 activity data set. We also demonstrate the effectiveness of SW with accurately segmented activity sequences. The robustness claim of SW over DTW and the HMM is further validated by evaluating the approach with accurately segmented spatial sequences containing noise.
The layout of the paper is as follows. In Sections 2-4 we introduce sequence alignment, discuss the SW algorithm and the modifications for spatial sequence recognition, and then the proposed OSW algorithm. In order to perform a benchmark comparison of SW, we also provide a discussion of DTW (Section 5) and the discrete HMM (Section 6) and factors involved with their application in spatial activity recognition. Following from this, we present our data collection and experimental methodology in Section 7. Results from evaluation with a 12 activity data set are shown in Section 8 and a conclusion is presented in Section 9.
2. Sequence alignment
Sequence alignment methods are concerned with finding the best matching alignments of two query sequences according to specified optimisation criteria; typically maximising similarity or minimising distance. In order to derive the optimal alignments, each symbol is compared sequentially with the symbols of the other sequence. During this stage the local similarity or distance is calculated between the opposing symbols and using techniques such as dynamic programming (DP), optimal subalignments and finally an optimal alignment are produced. With the maximising similarity criteria a positive score is associated with matching symbols, while negative scores are given to non-matching symbols and insertions/deletions
Fig. 1. Online spatial activity recognition using a sliding
window. (a) Sliding window at t=1, (b) sliding window at t=2, (c)
sliding window at t=3, (d) classification process.
Fig. 2. An example sequence alignment containing gaps of size
one and two.
(referred to as indels). Indels, denoted by the symbol (-) in an alignment diagram, are used to represent insertions or deletions in either sequence and are incorporated into alignments to fill gaps caused by differences in either of the sequences. For non-matching symbols, the choice whether to include an indel in the alignment or not depends on which option is better, that is, which has a higher total similarity or smaller overall distance.
An alignment assumes that two sequences a and b satisfy the following constraints [28]:
1. All symbols in the sequences a and b must also be in the alignment. For example, if a = [123] and b = [124] the resulting alignment must be of the form:
   1 2 3
   | |
   1 2 4
   where zero or more indels may be placed between adjacent symbols.
2. All symbols in the alignment must appear in the same order as defined in the sequences a and b, except that zero or more indels or gaps may be present between the symbols, as in the above example.
3. Symbols of one sequence can be aligned with an indel in the other sequence. For example, if a = [123] and b = [13], the 2 in a can be aligned with an indel as shown below:
   1 2 3
   |   |
   1 - 3
4. Indels of different sequences cannot be aligned together.
An example pairwise alignment of one dimensional sequences a = [12223421144] and b = [111341121114] is shown in Fig. 2. The given alignment contains seven matching symbols, three symbol mismatches and three indels.
Fig. 3. Schematic global and local alignments between two sequences. (a) Global alignment, (b) local alignment.
Sequence alignment was pioneered in bioinformatics with a DP approach in Ref. [29]. This technique aligned two amino acid sequences across their entirety, maximising the similarity score of the matching individual elements. A DP basis is used as it is possible for optimal alignments to be calculated from incrementally derived subalignments as each subalignment is itself optimal. Local alignment approaches such as SW [30] differ from global techniques as they find and quantify related regions of similarity within sequences. The local alignments are typically found via an optimisation search process, running from the beginning of the sequences to their ends. An illustrative example of global and local alignment using sequences a and b can be found in Fig. 3.
Throughout the rest of the paper we use the following notation. A symbol $a_i$ or $b_j$ represents a trajectory tuple $(x, y)$, where $x$ and $y$ denote the position within a two dimensional tracking space. The sequences $a$ and $b$ with lengths $|a|$ and $|b|$ are composed of symbols organised in time sequential order, where $i$ and $j$ ($1 \le i \le |a|$ and $1 \le j \le |b|$) determine the position within the corresponding sequence. Thus, a sequence of symbols $a = [a_1, a_2, \ldots, a_{|a|}]$ can also be represented as a sequence of tuples $a = [(x_1, y_1), (x_2, y_2), \ldots, (x_{|a|}, y_{|a|})]$ and in combination as $a = [(a^x_1, a^y_1), (a^x_2, a^y_2), \ldots, (a^x_{|a|}, a^y_{|a|})]$. To simplify the notation we will predominantly be using the symbol form.
3. SW local alignment
SW local alignment was originally developed in Ref. [30] to locate biological sequence patterns within known sequence databases and was first applied to spatial recognition in Ref. [27]. SW is similar to the Needleman-Wunsch global alignment approach [29], except that SW includes an extra zero term in its recurrence relation. Inclusion of the zero allows
termination of subsequence alignments that perform poorly, as non-matching subsequences produce negative similarity, which reduces the similarity between the subsequences. When the similarity decreases to less than zero, the zero of the SW relation terminates any further decrease in similarity and allows new optimal subsequences, referred to as local alignments, to be found.
Similarity-based sequence alignment techniques like Needleman-Wunsch global alignment typically have distance counterparts as the negative penalties assigned to non-matching sequence elements can be replaced by a positive penalty in the distance form. SW is distinct in that it has no distance counterpart [31]. This is because the algorithm uses negative similarity in conjunction with the extra zero to terminate poorly matching subsequence alignments, which cannot be mimicked in a distance based approach.
Applying SW to two dimensional spatial sequences required several modifications to the original algorithm. Firstly, a Euclidean matching function $d(a_i, b_j)$ and a corresponding matching threshold $\epsilon$ were proposed. The matching threshold allows one to specify the maximum allowable Euclidean distance between points for matching, which adds flexibility to the matching process of the original specification. Formally, if the Euclidean distance between the symbols $a_i$ and $b_j$ is less than the matching threshold $\epsilon$ then a match occurs and a positive score, the match cost $\alpha$, is attributed, that is $s(a_i, b_j) = \alpha$. If the Euclidean distance is equal to or larger than the matching threshold $\epsilon$, reflecting a non-matching state, a real penalty is assigned. The real penalty $d(a_i, b_j)$ is based on the Euclidean distance (1) and was proposed for use with the stepwise function $s(a_i, b_j)$ in the modified SW algorithm.

$$d(a_i, b_j) = \sqrt{(a^x_i - b^x_j)^2 + (a^y_i - b^y_j)^2} \qquad (1)$$

The use of a real penalty function, rather than a constant, was shown to improve the discrimination capability of the SW algorithm, as demonstrated in Ref. [27].
Some consideration is required when choosing an appropriate matching threshold $\epsilon$. If the specified $\epsilon$ of an x, y coordinate space is too large, SW over generalises and matches dissimilar sequences as depicted in Fig. 4(a), whilst if $\epsilon$ is too small, then matching becomes highly specific (Fig. 4(b)), preventing recognition of similar sequences and thus reducing the recall statistics.
For gap scores a linear gap model with a gap penalty $\delta$ associated with each indel is adopted in the modified SW algorithm. Gap scores are typically a function of the gap length $l$, denoted by $g(l)$. A linear gap model is where $g(l) = \delta l$, that is each indel is equally weighted.
Resulting alignments are dependent on the values of the gap penalty $\delta$, match cost $\alpha$ and the matching threshold $\epsilon$. If $\delta$ is larger in relation to the average mismatch penalty, which is dependent on $\epsilon$, mismatches are favoured over gaps, producing shorter, more compact alignments. The opposite occurs when $\delta$ is smaller than the average mismatch penalty. In relation to the match cost $\alpha$, one does not want $\alpha \gg \delta$ or $\alpha \gg$ the mismatch penalty, otherwise SW ignores mismatches and gaps and therefore behaves similarly to LCSS.
To calculate the similarity of two spatial sequences $a$ and $b$ using the proposed SW based approach one simply applies (3)-(5) to the DP matrix $C$ (2) for $i = 0, 1, \ldots, |a|$ and $j = 0, 1, \ldots, |b|$ and finds the maximum value in $C$.

$$C = \begin{bmatrix} C(0, 0) & \cdots & C(0, |b|) \\ \vdots & \ddots & \vdots \\ C(|a|, 0) & \cdots & C(|a|, |b|) \end{bmatrix} \qquad (2)$$
At each $C(i, j)$ where $i, j \ne 0$, four choices (match or mismatch, gap in $a$, gap in $b$ or start a new subsequence) are evaluated with the choice corresponding to the maximum similarity value being selected for each $C(i, j)$. The match or mismatch score at each $C(i, j)$ is derived
Fig. 4. The effect of the matching threshold $\epsilon$ on spatial activity recognition. (a) $\epsilon$ too large, activity B is recognised as A, (b) $\epsilon$ too small, alternate activity A sequence not recognised.
using $s(a_i, b_j)$ as previously described, while the gap scores for the sequences are derived using the linear gap model. If the three candidates $C(i-1, j-1) + s(a_i, b_j)$, $C(i-1, j) - \delta$ and $C(i, j-1) - \delta$ (with $\delta$ the gap penalty) are all negative, due to poor subsequence correspondence, then the fourth option of starting a new subsequence, represented by zero, is selected as the maximum.
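$$C(i, 0) = 0, \quad 0 \le i \le |a| \qquad (3)$$

$$C(0, j) = 0, \quad 0 \le j \le |b| \qquad (4)$$

$$C(i, j) = \max\{C(i-1, j-1) + s(a_i, b_j),\; C(i-1, j) - \delta,\; C(i, j-1) - \delta,\; 0\} \qquad (5)$$

where

$$s(a_i, b_j) = \begin{cases} \alpha, & d(a_i, b_j) < \epsilon \\ -d(a_i, b_j), & d(a_i, b_j) \ge \epsilon \end{cases}$$

with match cost $\alpha$, gap penalty $\delta$ and matching threshold $\epsilon$.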
Fig. 5. Example sequences a and b.
Table 1
SW C matrix using example sequences a and b

                    (1.0,1.0)  (2.0,2.0)  (3.0,3.0)
             0.00   0.00       0.00       0.00
(0.0,0.0)    0.00   0.00       0.00       0.00
(1.0,1.0)    0.00   1.00       0.00       0.00
(2.0,2.0)    0.00   0.00       2.00       1.00
(4.0,4.0)    0.00   0.00       1.00       0.59
(5.0,5.0)    0.00   0.00       0.00       0.00
(6.0,6.0)    0.00   0.00       0.00       0.00
(1.0,1.0)    0.00   1.00       0.00       0.00
(2.0,2.0)    0.00   0.00       2.00       1.00
(3.0,3.0)    0.00   0.00       1.00       3.00
Fig. 6. An optimal SW alignment using sequences a and b.
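The recurrence (3)-(5) can be sketched in a few lines of Python. This is an illustrative reading rather than the implementation used in the experiments: the function name `sw_matrix`, the parameter names `eps`, `alpha` and `delta`, and the setting of all three to 1 are assumptions, as is the choice of which example sequence is placed along the rows. Under these assumptions the sketch yields the values listed in Table 1.

```python
import math


def sw_matrix(a, b, eps=1.0, alpha=1.0, delta=1.0):
    """Fill the (|a|+1) x (|b|+1) DP matrix of the modified SW recurrence."""
    rows, cols = len(a) + 1, len(b) + 1
    C = [[0.0] * cols for _ in range(rows)]  # Eqs. (3) and (4): zero borders
    for i in range(1, rows):
        for j in range(1, cols):
            d = math.hypot(a[i - 1][0] - b[j - 1][0], a[i - 1][1] - b[j - 1][1])
            s = alpha if d < eps else -d      # stepwise match/mismatch score
            C[i][j] = max(C[i - 1][j - 1] + s,  # match or mismatch
                          C[i - 1][j] - delta,  # gap
                          C[i][j - 1] - delta,  # gap
                          0.0)                  # start a new local alignment
    return C


# Row and column sequences as they appear in Table 1 (orientation assumed).
a = [(0, 0), (1, 1), (2, 2), (4, 4), (5, 5), (6, 6), (1, 1), (2, 2), (3, 3)]
b = [(1, 1), (2, 2), (3, 3)]
C = sw_matrix(a, b)
similarity = max(max(row) for row in C)  # SW score: maximum value in C
```

The maximum lies in the final row, which is where a traceback for the best local alignment would begin.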
4. OSW local alignment
The naive approach for determining SW similarity in an online recognition context is to calculate the full DP matrix for each of the sliding windows of width $w$, resulting in an $O((w+1)(|C_i|+1))$ complexity for each window, where $|C_i|$ is the length of the class template sequence $i$. The window size $w$ is typically set to the length of the longest training sequence. Instead, a more computationally efficient method is proposed. In this formulation, the online system retains a DP matrix of size $(w+1) \times (|C_i|+1)$ in memory for each of the $i = 1, 2, \ldots, n$ class templates. Each of the $n$ DP matrices is initialised by calculating the SW score using (3)-(5) with the first buffered window sequence at time $t_0$ to $t_{w-1}$ and the corresponding class template $C_i$, according to Fig. 7. The $n$ SW scores generated by the initialisation process are stored in memory for comparison to subsequent SW scores of sliding windows, where $t > 0$, and for comparison to the predetermined classification threshold of each class $C_i$. The row of the maximum SW score is also retained for each DP matrix to ensure the current maximum remains in the matrix as the window slides during incremental updating. The rationale for this is explained later.
To quantify further window sequences, we take advantage of the fact that DP optimal alignments are derived incrementally from subalignments that are also optimal. In order to illustrate the incremental derivation of optimal alignments we use the sequences $a = [(1, 1), (2, 2)]$ and $b = [(1, 1), (2, 2), (3, 3)]$, with the matching threshold, match cost and gap penalty each set to 1. The first row of the resulting DP matrix is initialised using Eq. (3), as shown in Table 2(a). The second row of the DP matrix (Table 2(b)), which corresponds to the optimal alignment between $a$ and $b_1 = [(1, 1)]$, is then calculated using Eq. (5), with the first column of the row initialised to zero according to Eq. (4). The third row of the DP matrix, seen in Table 2(c), is calculated by comparing $a$ with $b_2 = (2, 2)$ according to Eq. (5). This extends the optimal alignment to between $a$ and $b_{1:2} = [(1, 1), (2, 2)]$, as the optimal alignment between $a$ and $b_{1:2}$ comprises the optimal alignment between $a$ and $b_1 = [(1, 1)]$, obtained through calculation of the second row of values. To calculate the final row in the DP matrix (Table 2(d)) we apply a similar methodology.
With incremental derivation we can add new rows to the end of each $C_i$ DP matrix and along the window sequence axis (the vertical axis in the example), and calculate only the values of the row according to the new online element, the class template $C_i$ and Eq. (5). This requires $O(|C_i|+1)$ time for each window, in comparison to the $O((w+1)(|C_i|+1))$ time of the naive approach. To prevent the DP matrices from growing in size as new rows are added and to allow a traceback procedure to be carried out on the DP matrix (to find segmentation beginning and end points with a sufficiently large window size $w$), we remove the first rows of the DP matrices prior to new rows being added. This also means that the variable that stores the row of the current maximum value (Row Max) for each DP matrix must consequently be decremented. In the event that any of the variables designate rows that have been deleted from the DP matrix, a search of the matrix is made to find the new maximum score and Row Max is consequently updated.
Optimal subsequences, corresponding to an embedded activity within a sliding window (assuming $w$ is large enough), can exist anywhere within the DP matrix, as the beginning and ends of the DP matrices are constantly changing. Therefore, to address this issue we store the current maximum similarity value and its row (Row Max) for each of the $n$ matrices. The row position corresponds to the end of the optimal local alignment and is used as the origin for a traceback procedure, while the maximum similarity values are used in thresholding and classification. For clarity we show the online DP matrix update procedure with 1D sequences in Fig. 8(a) and (b), with a gap penalty, mismatch penalty and match cost of one. In Fig. 8(a) the windowed sequence [9876] at $t_0$ to $t_3$ is compared to the class templates $C_1$ and $C_2$ using Eqs. (3)-(5), thereby initialising the DP matrices. As the sliding window is moved to the next window at $t_1$ to $t_4$ (Fig. 8(b)), the first underlined rows of the DP matrices in Fig. 8(a) are deleted and a new row is added to the end of the matrices. For each of the new rows at $j = 4$, we apply Eq. (5) using the previous row's values at $j = 3$, the class template $C_i$ and the new element in the new sliding window, that is five in the given example, in order to derive the new SW local alignment. The row-by-row update procedure is repeated for further sliding window sequences. If an observed spatial sequence does not fit within the specified window size $w$, possibly due to an activity taking longer than expected (e.g. watching TV), the DP matrices retain the similarity scores of the previous matches and thus a similarity score can still be determined. Unfortunately, full segmentation can no longer occur as the beginning point of the observed sequence would have been deleted to allow calculation of the new buffer elements.
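The update procedure described above can be sketched as a small Python class that keeps at most $w + 1$ rows per template, drops the oldest row before adding a new one, and tracks the maximum score together with its row. The class name `OnlineSW`, the attribute names and the parameter defaults are assumptions; the sketch follows the description in the text and is not the authors' implementation.

```python
import math
from collections import deque


class OnlineSW:
    """Sliding DP matrix for one class template (illustrative sketch)."""

    def __init__(self, template, w, eps=1.0, alpha=1.0, delta=1.0):
        self.template = template
        self.eps, self.alpha, self.delta = eps, alpha, delta
        self.rows = deque([[0.0] * (len(template) + 1)])  # zero boundary row
        self.max_rows = w + 1
        self.best, self.row_max = 0.0, 0  # current maximum and its row

    def _next_row(self, prev, point):
        row = [0.0]
        for j, t in enumerate(self.template, start=1):
            d = math.hypot(point[0] - t[0], point[1] - t[1])
            s = self.alpha if d < self.eps else -d
            row.append(max(prev[j - 1] + s, prev[j] - self.delta,
                           row[j - 1] - self.delta, 0.0))
        return row

    def update(self, point):
        """Slide the window by one element and return (score, Row Max)."""
        if len(self.rows) == self.max_rows:
            self.rows.popleft()   # delete the first row of the matrix
            self.row_max -= 1     # stored row index shifts up by one
        new = self._next_row(self.rows[-1], point)
        self.rows.append(new)
        if self.row_max < 0:
            # The stored maximum was deleted: search the matrix again.
            self.row_max = max(range(len(self.rows)),
                               key=lambda r: max(self.rows[r]))
            self.best = max(self.rows[self.row_max])
        elif max(new) > self.best:
            self.best, self.row_max = max(new), len(self.rows) - 1
        return self.best, self.row_max


# Hypothetical stream containing the template twice, with w = 4.
stream = [(0, 0), (1, 1), (2, 2), (4, 4), (5, 5), (6, 6), (1, 1), (2, 2), (3, 3)]
osw = OnlineSW(template=[(1, 1), (2, 2), (3, 3)], w=4)
history = [osw.update(p) for p in stream]
```

A full search is performed only when the stored maximum leaves the window, so most updates cost a single row computation.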
In specific applications, OSW can be modified to decrease space requirements and improve segmentation performance. For example, if segmentation and recovery of an optimal alignment is not necessary the space requirements of the OSW DP matrices can be significantly reduced by utilising only the last row of the matrices. This is possible as new elements only require the last row for the current SW calculation. Additionally, the Row Max variables are no longer required.
Some spatial activities like watching TV have inconsistent sequence lengths, thus making the selection of a suitable window size $w$ difficult. If $w$ is sufficiently large to deal with the highly variable sequence lengths, the efficiency of the OSW algorithm deteriorates rapidly. To maintain computational efficiency we propose using a window size of $w = 1$ and only a single row of previously calculated SW values with which to perform the current SW calculation.
Fig. 7. Initialisation of the DP matrices using OSW.
Table 2
Incremental derivation of DP matrices

(a) Step 1
               (1,1)  (2,2)
          0    0      0

(b) Step 2
               (1,1)  (2,2)
          0    0      0
(1,1)     0    1      0

(c) Step 3
               (1,1)  (2,2)
          0    0      0
(1,1)     0    1      0
(2,2)     0    0      2

(d) Step 4
               (1,1)  (2,2)
          0    0      0
(1,1)     0    1      0
(2,2)     0    0      2
(3,3)     0    0      1
Additionally, a traceback threshold is required for triggering the traceback procedure. To carry out the traceback another data stream buffer $Buf_{past}$ containing previous window elements is necessary to store past elements. As new buffer elements are retrieved, a new row of the DP vectors is calculated using the previous row and designated template $C_i$. The maximum value in that row $SW_{max}$ is then determined and the value compared to the traceback threshold. If $SW_{max}$
Fig. 8. OSW recognition with a window size of w = 4: (a) DP
initialisation with window sequence t0 to t3 and (b) online
updating with window sequence t1 to t4.
Fig. 9. Elastic matching using DTW.
restricts the allowable steps in the warping path to adjacent cells in the DP matrix. It is stated formally as, given $w_{k+1} = (i, j)$ then $w_k = (i', j')$, where $i - i' \le 1$ and $j - j' \le 1$.
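To efficiently calculate the DTW distance of two time series sequences $a$ and $b$ a DP approach is utilised. A DP matrix $C$ of size $|a| \times |b|$ is initialised according to Eq. (6). To calculate the DTW distance we apply Eq. (7) for values of $i = 1, 2, \ldots, |a|-1$ and $j = 1, 2, \ldots, |b|-1$. The resulting DTW distance is obtained from the DP matrix at $C(|a|-1, |b|-1)$. Because $C$ is indexed from zero while the symbols are indexed from one, cell $C(i, j)$ holds the cost of aligning $a_{i+1}$ with $b_{j+1}$.

$$\begin{aligned} C(0, 0) &= d(a_1, b_1) \\ C(i, 0) &= C(i-1, 0) + d(a_{i+1}, b_1), \quad i = 1, 2, \ldots, |a|-1 \\ C(0, j) &= C(0, j-1) + d(a_1, b_{j+1}), \quad j = 1, 2, \ldots, |b|-1 \end{aligned} \qquad (6)$$

$$C(i, j) = \min\{C(i-1, j-1),\; C(i-1, j),\; C(i, j-1)\} + d(a_{i+1}, b_{j+1}) \qquad (7)$$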
A warping path or alignment can be recovered from $C$ using a traceback procedure, originating at $C(|a|-1, |b|-1)$ and terminating at $C(0, 0)$, or by using a pointers matrix and retaining pointers to the local minima selected at each $i$ and $j$ during the DP calculation.
The DTW algorithm presented previously has no local continuity constraints (does not restrict the slope of the warping), unlike those specified in Ref. [22]. Normally, choosing an optimal local constraint is application and domain specific. In the case of spatial activity recognition, preliminary investigations have shown that no restriction on the slope of the warping is optimal in relation to recognition performance.
6. Hidden Markov model
The HMM is a stochastic state transition model, capable of dealing with time sequential data [21]. It was first applied in the activity recognition domain in Ref. [7], where mesh features were extracted from time sequential images of tennis strokes and used in training and evaluation of the discrete model. Since then the HMM has been utilised extensively in activity recognition research, particularly through the multi-layer and hierarchical forms [8-12]. Adoption of the approach has been motivated by the model's ability to deal with noisy observations and its high discrimination properties.
In this study, we utilise the discrete HMM to recognise spatial activity sequences. A discrete HMM is characterised by a number of hidden states $N$, distinct observation symbols per state $M$, state transition probability matrix $A$ ($A = \{a_{ij}\}$), observation symbol probability distribution matrix $B$ ($B = \{b_j(k)\}$) and the initial state distribution vector $\pi$. A derived HMM is typically represented by the tri-tuple of parameters $\{\pi, A, B\}$, which represent the following:

$$\begin{aligned} \pi_i &= \Pr(q_1 = S_i), & 1 \le i \le N \\ a_{ij} &= \Pr(q_{t+1} = S_j \mid q_t = S_i), & 1 \le i, j \le N \\ b_j(k) &= \Pr(v_k \text{ at } t \mid q_t = S_j), & 1 \le j \le N,\ 1 \le k \le M \end{aligned}$$
where $q_t$ is the state at time $t$, $S$ denotes the individual states such that $S = \{S_1, S_2, \ldots, S_N\}$ and $V$ denotes the individual symbols $V = \{v_1, v_2, \ldots, v_M\}$. The HMM model parameters $\pi$, $A$, $B$ are estimated using the Baum-Welch (Forward-Backward) algorithm; however, scaling [21] is used in both the model estimation and inferencing, due to the use of lengthy observation sequences.
The discrete HMM can produce models capable of accurately representing a class of sequences but some considerations must be taken into account when dealing with spatial sequences, due to their inherent noise and long length. The first is the issue of choosing an appropriate number of hidden states $N$ for each model. The value of $N$ affects both the runtime of the Baum-Welch algorithm and the generalisation capability of the model. If $N$ is too small, the training and inferencing runtime are smaller but the model's ability to adequately represent the data also decreases. In contrast if $N$ is large, training and inferencing complexity is increased along with generalisation capability; however, the model may overfit the data if $N$ is too large, reducing the recall statistics. Also, if the spatial sequence length is long (>1000), the HMM also fails to capture the nature of the training data resulting in lower classification accuracy.
Another significant issue with the discrete HMM is the amount of
data required for training. HMMs require a large number of training
sequences to produce good generative models. If only limited
spatial sequences are available for training, models exhibit lower
recognition accuracy when evaluated with observed sequences. In
addition to the amount of training data required, the discrete HMM
also encounters issues when calculating Pr(O|λ), where O is the
observed spatial sequence and λ is the derived model. If an observed
sequence contains a series of symbols not present in the training
sequences, arising from noise or superfluous sequence elements (that
is, b_j(k) = 0), the derived probability of the overall sequence,
calculated using a scaled forward procedure, will tend to zero, in
turn causing a negative infinity result when calculating the log
likelihood. This characteristic can be minimised by initialising the
symbol probability matrix B to small positive values, but it is still
possible that long runs of low probability symbols not observed in
the training data will dramatically reduce the total sequence
probability.
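The effect described above can be illustrated with a small sketch. The code below floors zero entries of a symbol probability matrix at a small positive value and renormalises each row; this is a post hoc variant of initialising B to small positive values, and the function names, the floor value of 1e-6 and the single-state likelihood used for the demonstration are all assumptions for the example rather than details of this study.

```python
import math

def floor_emissions(B, eps=1e-6):
    """Replace zero emission probabilities with eps and renormalise each row."""
    floored = []
    for row in B:
        bumped = [max(p, eps) for p in row]
        total = sum(bumped)
        floored.append([p / total for p in bumped])
    return floored

def one_state_loglik(b, obs):
    """Log likelihood of obs under a single-state model with emission row b.

    A one-state model is used only to keep the demonstration short: the
    sequence probability is then the product of the symbol probabilities.
    """
    total = 0.0
    for k in obs:
        if b[k] == 0.0:
            return float("-inf")  # one unseen symbol zeroes the whole product
        total += math.log(b[k])
    return total
```

With a row such as [0.5, 0.5, 0.0], any sequence containing symbol 2 has a log likelihood of negative infinity. After flooring, the result is finite, but every additional occurrence of the unseen symbol adds roughly log(1e-6) to the total, which is why long runs of such symbols can still push a sequence below a recognition threshold.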
7. Data collection and methodology
For this evaluation, spatial activity sequences are collected in a
simulated smart house environment (Fig. 10(a)) using the multiple
camera, QOS-based tracking system of Nguyen et al. [11]. The data
set comprises 12 single person activities of 90 s in length with 20
sequences per activity. Activities include making toast, having
breakfast, washing the dishes and watching television. Video
sequences are captured at 10 frames per second and then processed
to obtain two dimensional trajectories for each frame. To obtain
symbolised one dimensional sequences for the discrete HMM, the
smart house environment is discretised into one square metre grids
(Fig. 10(b)), with each activity sequence, composed of x, y
trajectories, being mapped to a sequence of unique integers u, where
u ∈ U and U = {1, 2, 3, ..., 72}.
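The mapping from x, y trajectories to integer symbols can be sketched as follows. The paper states only that the environment is divided into 72 one square metre cells; the 9 by 8 arrangement, the row-major numbering, the origin at one corner of the room and the function names below are assumptions chosen for illustration.

```python
import math

def to_symbol(x, y, cols=9, rows=8, cell=1.0):
    """Map a position in metres to a 1-based grid cell symbol.

    The 9 x 8 layout (72 cells) and row-major numbering are hypothetical.
    Positions outside the grid are clamped to the nearest border cell.
    """
    col = min(max(int(math.floor(x / cell)), 0), cols - 1)
    row = min(max(int(math.floor(y / cell)), 0), rows - 1)
    return row * cols + col + 1

def symbolise(trajectory, **grid):
    """Convert a list of (x, y) positions to a list of grid cell symbols."""
    return [to_symbol(x, y, **grid) for x, y in trajectory]
```

For example, under these assumptions `symbolise([(0.5, 0.5), (0.5, 1.5)])` yields `[1, 10]`: the second position lies one row above the first, so its symbol is offset by the number of columns.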
In the following experiments we divide the 12 activity data set
into training and testing sets. In our case the training set sequences
are used to empirically determine optimal algorithm parameters and
as class templates in testing. To quantify the recognition
performance of the algorithms with the testing sets, a cross-validation
methodology with threshold-based nearest neighbour (NN)
classification is adopted. In this approach each experiment utilises 30
randomly generated training sets for evaluation, and we take the mean
of the 30 test results as the conclusive result. Thresholds are
derived and used per activity to determine whether an activity
occurs. This is necessary as it is unrealistic firstly to learn all
activities that may occur in a given environment and secondly to
assume that learnt activities are always occurring.
8. Experimental results
In this section we give our experimental results that demonstrate
the ability of SW to accurately recognise windowed online spatial
sequences as well as its high accuracy and robust characteristics
with accurately segmented sequences. The algorithms used in the
experiments are developed in Matlab and C according to the
specifications in Sections 3--7. For the DTW benchmark comparison we
use the symmetric algorithm defined in Ref. [22] with no local or
global constraints. Optimal SW and HMM algorithm parameters are
empirically derived from the accurately segmented training data as
shown in Section 8.1. Using the optimally derived parameters we
further evaluate the proposed algorithms in relation to accuracy and
robustness and contrast these results to the existing DTW and HMM
spatial activity recognition approaches.
8.1. Parameter selection
With accurately segmented activity sequences we find the optimal
SW sequence alignment parameters, in relation to accuracy, with
the exception of the distance threshold for a match, which is set
according to the required recognition task. For instance, automatic
activity recognition in a caring capacity requires observed patterns
to be strictly matched to those in the training set, thus requiring a
small threshold. On the other hand, in a retrieval type role where
one wishes to recognise all similar activities, the threshold should
be set to a larger value. Throughout the experimentation, we use a
threshold of 1.0 m to coincide with the size of the symbolic states
used in HMM mapping.
Using a fixed match threshold we first address the issue of selecting
an appropriate value of the gap penalty for the proposed SW
algorithm. As specified in Section 3, this is the linear gap penalty
associated with insertion or deletion of one or more trajectories in
either sequence. To minimise the effect of the match score during
the evaluation we set it to a constant and use gap penalty values
between 1 and 10 with cross-validation and NN classification in
order to find an optimum value. The results with the given data set
are shown in Fig. 11(a). From Fig. 11(a), a gap penalty of 2.0
provided the maximum classification accuracy, with larger values
producing only marginally worse performance. Using a gap penalty of
2.0 we then determined an optimum value for the match score with
values between 1 and 10. Results are shown in Fig. 11(b), with a
match score of 5.0 producing maximum classification accuracy.
Therefore, the following experiments used a match threshold of
1.0 m, a gap penalty of 2.0 and a match score of 5.0. It is
interesting to note that with the current data set recognition
performance does not appear to be sensitive to the values of the
different parameters (excluding the match threshold), as evidenced by
the small changes in recognition performance with changing parameter
values. The uniform recognition behaviour with different parameters
has also been noted in evaluation with other data sets [27]. From
this trend, it is possible for one to choose a set of SW parameters
without having to empirically determine the optimal parameter set,
and expect a near optimal recognition performance.
For the evaluation, the discrete HMMs used a fixed number of
symbols M = 72 and an empirically determined number of hidden
states N. To ensure adequate training of the HMMs the number of
iterations of the Baum--Welch estimation algorithm was limited by
a threshold (
Fig. 10. Layout and mapping of the mock smart house environment:
(a) smart house layout and (b) spatial grid for 1D sequence
mapping.
Fig. 11. SW parameter optimisation: (a) gap penalty evaluation and
(b) match score evaluation.
8.2. Online activity recognition using SW, DTW and the HMM
As previously mentioned, superfluous elements caused by the
transition of one activity to the next can interfere with sequence
quantification methods that utilise whole window sequences for
evaluation. To address this issue we propose the OSW approach,
which is able to quantify embedded activities within the window
sequence, therefore ignoring superfluous elements. Furthermore, we
evaluate the effectiveness of SW with a synthetically generated
online data stream and contrast the results to DTW and the discrete
HMM.
Fig. 12. HMM number of hidden states N versus classification
accuracy.
In the following experiment 10 training sequences are randomly
selected per activity to use as class templates and additionally for
determining class thresholds for threshold-based NN classification.
The remaining 10 testing sequences per activity are used to generate
a synthetic online sequence in conjunction with superfluous
transition sequences between the activities. Class thresholds are
derived by measuring the average intraclass distance between training
sequences and/or models ± two standard deviations (minus for SW and
HMM, and plus for DTW). With the intraclass thresholds and the
optimum parameters derived in Section 8.1, a threshold-based NN
classification experiment is carried out with the online data stream
using different window sizes. Initially, we set the window size w
according to the length of the longest activity in the set and
evaluate the different techniques. We then increase w by 5% and 10%
of the length of the longest sequence to observe the effect of larger
window sizes on the approaches. A true positive (TP) occurs when values
of the correct activity exceed their specified threshold within the
ground truthed online sequence, with no class template from
other activities exceeding its corresponding threshold. A false
positive (FP) occurs if any incorrect activity exceeds its threshold
within the ground truthed online sequence. The precision and recall
results of the online evaluation are shown in Table 3.
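The class threshold rule described above (the mean intraclass value minus two standard deviations for similarity measures such as SW and the HMM log likelihood, and plus two standard deviations for the DTW distance) can be sketched as follows. The function name is an assumption, as is the use of the population standard deviation; the sample standard deviation is an equally plausible reading of the text.

```python
import statistics

def class_threshold(intraclass_values, higher_is_better=True, k=2.0):
    """Derive a per-activity threshold from intraclass scores or distances.

    higher_is_better=True  : similarities (SW, HMM); threshold = mean - k * std
    higher_is_better=False : distances (DTW);        threshold = mean + k * std
    The population standard deviation is an assumption of this sketch.
    """
    mean = statistics.mean(intraclass_values)
    std = statistics.pstdev(intraclass_values)
    return mean - k * std if higher_is_better else mean + k * std
```

The sign convention simply widens the acceptance region in the direction of poorer matches: a lower bound for similarities and an upper bound for distances.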
With the window size equal to the length of the longest sequence
in the class template set (w = 100%), SW was able to achieve
high precision and recall with the given synthetic online sequence,
in contrast to the HMM and DTW. Furthermore, these high values were
consistent with the larger evaluated window sizes of w = 105--110%,
as seen in Table 3. The observed high precision and recall of the
OSW algorithm across the different window sizes can be explained
by the SW technique optimally locating embedded patterns (local
alignments) within the online window sequence and terminating
poorly matching local alignments arising from significant gaps or
mismatches. These poorly matching subsequences generate regions
of local negative similarity which are terminated by the zero
condition in the relation specified in Eq. (5).
Table 3. Threshold-based NN classification with online recognition
        w = 100%                     w = 105%                     w = 110%
        Precision (%)  Recall (%)    Precision (%)  Recall (%)    Precision (%)  Recall (%)
HMM     70.00          5.83          0.00           0.00          0.00           0.00
DTW     58.62          42.50         59.38          31.67         58.33          23.33
SW      83.05          81.67         83.90          82.50         82.75          83.33
Table 4. Threshold-based NN classification with accurate activity
segmentation
        Precision (%)  Recall (%)
HMM     83.9           75.6
DTW     97.3           96.9
SW      98.1           97.6
Fig. 13. Confusion matrix for SW. The legend represents the
percentage of activity sequences classified.
The results in Table 3 demonstrate that DTW is sensitive to
extraneous elements from window sequences and furthermore is
sensitive to the specified window size. This can be seen by the
reduction in recall with the increase in online window size. As DTW
is global and accounts for the additional, non-activity elements of
the window sequence, the calculated distance typically increases with
increasing window size, preventing recognition with the derived
thresholds. Overall, DTW does manage to maintain its precision across
the evaluated window sizes, and it achieves a higher recall than the
HMM at every window size and a higher precision at w = 105% and
w = 110%.
The discrete HMM also was not able to recognise the observed
window sequences across the different window sizes. The HMM's
high sensitivity to the window size, resulting in a low precision and
recall, is due to two reasons. The first is that the log likelihood
of Pr(O|λ), where O is the observed window sequence and λ is the
derived model for an activity, encounters symbols with zero
probability, thus resulting in a log likelihood of negative infinity.
These zero symbol probabilities occur due to the failure to observe
such symbols in the training sequences during HMM parameter
estimation. The second and less significant reason for the HMM's
poor performance is that the forward inferencing algorithm encounters
significant numbers of symbols with low probability in the online
window sequence (due to the global matching nature of HMM
inferencing). As a result the derived log likelihoods are reduced
such that they do not exceed the specified thresholds and the
activities are not recognised.
8.3. SW, DTW and HMM with accurate activity segmentation
We have shown how OSW is capable of accurate recognition in an
online activity recognition context, but we are also interested in
whether the modified SW approach is capable of adequate recognition
with accurately segmented spatial sequences. To evaluate this
aspect we use 10 accurately segmented training sequences from each
of the 12 activity classes, derive intraclass distances as previously
mentioned, then specify thresholds at ± 2 standard deviations. With
the applied thresholds for each activity, we then use threshold-based
NN classification with the accurately segmented observed sequences
and compare the results to DTW and the discrete HMM. The
experimental results are depicted in Table 4.
Table 4 shows that the modified SW technique is capable of
producing high precision (98.1%) and recall (97.6%) in classification
of accurately segmented spatial activity sequences with the given
data set. Furthermore, the recognition performance of SW is
comparable to the global DTW approach. This finding therefore shows
that it is possible to apply the local SW alignment technique
successfully in a global matching role. In contrast, both the SW and
DTW alignment techniques significantly outperformed the HMM, further
providing evidence of the strong discriminatory capability of the
proposed SW alignment approach.
Fig. 14. Spatial patterns for confused activities 4 and 5.
Elaborating further on the SW results, the SW confusion matrix,
presented in Fig. 13, demonstrates near perfect classification with
only minor errors occurring between activities four and five, which
represent two variations of having breakfast. The misclassification
seen here is understandable, as the variants of having breakfast share
90% spatial similarity (see Fig. 14) and the remainder of the
sequences have only a small spatial disparity. With future development
of activity segmentation algorithms for online recognition, it is
possible to apply the SW technique for accurate sequence quantification.
8.4. Robustness evaluation of SW, DTW and the HMM with accurately
segmented activities
Robustness to noise is an important characteristic of any spatial
activity recognition approach, as video tracking systems typically
produce spatial sequences intertwined with noise and gaps. To
evaluate the robustness of the modified SW algorithm we artificially
incorporated random noise with varying magnitudes into each of the
accurately segmented testing sequences during threshold-based NN
classification. A training set size of 10 was used in the evaluation.
Benchmarking of the robustness performance was made in relation
to the global DTW technique and the HMM. The experimental results
with noise magnitudes of between 0 and 3 m are presented in Fig. 15.
The results from Fig. 15 demonstrate that the modified SW
algorithm is more resilient to noise than the HMM, as indicated by the
smaller decrease in recall with the increased magnitudes of noise: 3%
decrease for SW versus 14% for the HMM. Importantly, SW was able
to maintain a precision and recall of >93% with the largest evaluated
magnitude of noise, while the discrete HMM achieved only 60%. The
observed maintenance of high accuracy across the different
magnitudes of artificially introduced noise also reinforces the belief
that the proposed SW approach is more robust to noise than the
discrete HMM.
DTW was also seen to perform similarly to the SW approach in relation
to noise in Fig. 15. This was not expected, as DTW is sensitive
to noise [24] and thus should have exhibited a decrease in recall. It
is implicit that the introduction of artificial noise increases
the average DTW distance to the class templates. As we use NN
classification with thresholding, an increase in distance will prevent
sequences from being recognised, as they are more likely to exceed
the specified thresholds. Taking this into account, it is likely that
the derived thresholds, taken from the average intraclass distances
of the training sets plus 2 standard deviations, were sufficiently
large that the increased distances did not exceed the thresholds and
thus did not affect the recall statistics.
Fig. 15. Noise magnitude versus classification accuracy: (a) DTW;
(b) SW; and (c) HMM.
9. Conclusion
In this paper we proposed an online modification of SW, the OSW
algorithm, for spatial activity recognition. The unique local
alignment property of the OSW algorithm allows efficient recognition
of embedded and partial spatial activities within sliding windows,
removing the need for accurate sequence segmentation. To
demonstrate the effectiveness of OSW, we evaluated the online
classification performance with a 12 class data set and compared the
results to DTW and the discrete HMM. The results showed that the
OSW approach obtained a significantly higher precision and recall
in comparison to both DTW and the HMM. Experimentation with a
modified SW algorithm and accurately segmented spatial activity
sequences also showed that the approach is capable of accurate
discrimination, producing a classification performance similar to DTW
and significantly outperforming the HMM. Further experimentation
also confirmed that DTW and the modified SW approach are robust
to the intrinsic noise commonly found in non-invasively captured
spatial sequences. Our investigation of SW and OSW has shown some
promising results in the spatial activity domain. Future work will
now focus on applying the approach with multi-sensor data,
particularly combining spatial and event information to improve
spatial activity discrimination.
References
[1] T.H. Monk, C.F. Reynolds, M.A. Machen, D.J. Kupfer, Daily
social rhythms in the elderly and their relation to objectively
recorded sleep, Sleep 15 (4) (1992) 322--329.
[2] R. Suzuki, S. Otake, T. Izutsu, M. Yoshida, T. Iwaya,
Monitoring daily living activities of elderly people in a nursing
home using an infrared motion-detection system, Telemed. e-Health 12
(2) (2006) 146--155.
[3] J. Aggarwal, Q. Cai, Human motion analysis: a review,
Comput. Vis. Image Understand. 73 (3) (1999) 428--440.
[4] N. Johnson, D. Hogg, Learning the distribution of object
trajectories for event recognition, in: Sixth British Conference on
Machine Vision, vol. 2, 1995, pp. 583--592.
[5] J. Owens, A. Hunter, Application of the self-organising map
to trajectory classification, in: IEEE International Workshop on
Visual Surveillance, 2000, pp. 77--83.
[6] N. Sumpter, A. Bulpitt, Learning spatio-temporal patterns
for predicting object behaviour, Image Vis. Comput. 18 (9) (2000)
697--704.
[7] J. Yamato, J. Ohya, K. Ishii, Recognizing human action in
time-sequential images using hidden Markov model, in: Computer
Vision and Pattern Recognition, 1992, pp. 379--385.
[8] H. Bui, S. Venkatesh, G. West, Tracking and surveillance in
wide-area spatial environments using the abstract hidden Markov
model, Int. J. Pattern Recognition Artif. Intell. 15 (1) (2001)
177--195.
[9] N. Oliver, E. Horvitz, A. Garg, Layered representations for
human activity recognition, in: IEEE International Conference on
Multimodal Interfaces, 2002, pp. 3--8.
[10] S. Luhr, H. Bui, S. Venkatesh, G. West, Recognition of
human activity through hierarchical stochastic learning, in: IEEE
International Conference on Pervasive Computing and Communications
(PerCom-03), 2003, pp. 416--422.
[11] N. Nguyen, H. Bui, S. Venkatesh, G. West, Recognising and
monitoring high-level behaviours in complex spatial environments,
in: IEEE International Conference on Computer Vision and Pattern
Recognition (CVPR-03), 2003, pp. 620--625.
[12] T. Duong, H. Bui, D. Phung, S. Venkatesh, Activity
recognition and abnormality detection with the switching hidden
semi-Markov model, in: IEEE Computer Society Conference on Computer
Vision and Pattern Recognition, vol. 1, 2005, pp. 838--845.
[13] A. Bobick, J. Davis, Real-time recognition of activity
using temporal templates, in: IEEE Workshop on Applications of
Computer Vision, 1996, pp. 39--42.
[14] A. Bobick, Y. Ivanov, The recognition of human movement
using temporal templates, IEEE Trans. Pattern Anal. Mach. Intell. 23
(3) (2001) 257--267.
[15] J. Ben-Arie, Z. Wang, P. Pandit, S. Rajaram, Human activity
recognition using multidimensional indexing, IEEE Trans. Pattern
Anal. Mach. Intell. 24 (8) (2002) 1091--1104.
[16] F. Bashir, W. Qu, A. Khokhar, D. Schonfeld, HMM-based
motion recognition system using segmented PCA, in: IEEE
International Conference on Image Processing, 2005, pp.
1288--1291.
[17] J. Han, B. Bhanu, Human activity recognition in thermal
infrared imagery, in: IEEE Computer Society Conference on Computer
Vision and Pattern Recognition, 2005.
[18] A. Bobick, Y. Ivanov, Action recognition using
probabilistic parsing, in: IEEE Computer Society Conference on
Computer Vision and Pattern Recognition, 1998, pp. 196--202.
[19] Y. Ivanov, A. Bobick, Recognition of visual activities and
interactions by stochastic parsing, IEEE Trans. Pattern Anal. Mach.
Intell. 22 (8) (2000) 852--872.
[20] P. Peursum, H. Bui, S. Venkatesh, G. West, Human action
segmentation via controlled use of missing data in HMMs, in: IAPR
International Conference on Pattern Recognition, 2004, pp.
440--445.
[21] L. Rabiner, A tutorial on hidden Markov models and selected
applications in speech recognition, Proc. IEEE 77 (2) (1989)
257--286.
[22] H. Sakoe, S. Chiba, Dynamic programming algorithm
optimization for spoken word recognition, IEEE Trans. Acoustics
Speech Signal Process. ASSP-26 (1) (1978) 43--49.
[23] L. Chen, R. Ng, On the marriage of LP-norms and edit
distance, in: 30th VLDB Conference, 2004, pp. 792--803.
[24] L. Chen, M.T. Ozsu, V. Oria, Robust and fast similarity
search for moving object trajectories, in: 24th ACM International
Conference on Management of Data, 2005.
[25] M. Vlachos, D. Gunopulos, G. Kollios, Robust similarity
measures for mobile object trajectories, in: 13th International
Workshop on Database and Expert Systems Applications, 2002, pp.
721--726.
[26] M. Vlachos, G. Kollios, D. Gunopulos, Discovering similar
multidimensional trajectories, in: 18th International Conference on
Data Engineering, 2002, pp. 673--684.
[27] D. Riedel, S. Venkatesh, W. Liu, A Smith--Waterman local
sequence alignment approach to spatial activity recognition, in:
IEEE International Conference on Advanced Video and Signal based
Surveillance, 2006.
[28] I. Eidhammer, I. Jonassen, W. Taylor, Protein
Bioinformatics: An Algorithmic Approach to Sequence and Structure
Analysis, Wiley, New York, 2004, pp. 3--23 (Chapter 1.2).
[29] S.B. Needleman, C.D. Wunsch, A general method applicable to
the search for similarities in the amino acid sequence of two
proteins, J. Mol. Biol. 48 (1970) 443--453.
[30] T. Smith, M. Waterman, Identification of common molecular
subsequences, J. Mol. Biol. 147 (1981) 195--197.
[31] M.S. Waterman, Introduction to Computational Biology, first
ed., Chapman & Hall, London, UK, 1995.
[32] L. Rabiner, A. Rosenberg, S. Levinson, Considerations in
dynamic time warping algorithms for discrete word recognition, IEEE
Trans. Acoustics Speech Signal Process. 26 (6) (1978) 575--582.
[33] S. Das, Some experiments in discrete utterance recognition,
IEEE Trans. Acoustics Speech Signal Process. ASSP-30 (5) (1982)
766--770.
[34] M. Vlachos, D. Gunopulos, G. Das, Rotation invariant
distance measures for trajectories, in: 10th ACM SIGKDD
International Conference on Knowledge Discovery and Data Mining,
2002, pp. 707--712.
[35] J. Aach, G. Church, Aligning gene expression time series
with time warping algorithms, Bioinformatics 17 (6) (2001)
495--508.
[36] T. Rath, R. Manmatha, Word image matching using dynamic
time warping, in: Computer Vision and Pattern Recognition, 2003, pp.
521--527.
About the Author: WANQUAN LIU received the BSc degree in
Applied Mathematics from Qufu Normal University, PR China, in 1985,
the MSc degree in Control Theory and Operation Research from the
Chinese Academy of Science in 1988, and the PhD degree in Electrical
Engineering from Shanghai Jiaotong University in 1993. He has held
the ARC Fellowship and JSPS Fellowship and attracted research
funds from different sources. He is currently a Senior Lecturer
in the Department of Computing at Curtin University of Technology.
His research interests include large scale pattern recognition,
control systems, signal processing, machine learning, and
intelligent systems.