INTERNATIONAL ORGANISATION FOR STANDARDISATION
ORGANISATION INTERNATIONALE DE NORMALISATION
ISO/IEC JTC1/SC29/WG11 CODING OF MOVING PICTURES AND AUDIO

ISO/IEC JTC1/SC29/WG11 MPEG2011/m19188
January 2011, Daegu, Korea

Source: Peking University, Harbin Institute of Technology, China
Status: Input Contribution
Title: Peking University Landmarks: A Context Aware Visual Search Benchmark Database
Authors: Lingyu Duan ([email protected]), Rongrong Ji ([email protected]), Jie Chen ([email protected]), Shuang Yang ([email protected]), Tiejun Huang ([email protected]), Hongxun Yao ([email protected]), Wen Gao ([email protected])

1 Introduction

The 93rd MPEG meeting output the draft requirements documents (w11529, w11530 and w11531) of Compact Descriptors for Visual Search. To advance this work, this contribution presents our work on establishing a context aware visual search benchmark database for mobile landmark search. In the input contribution m18542 at the 94th MPEG Meeting [9], Peking University proposed a compact descriptor for visual search, which combines location cues to learn a discriminative and compact visual descriptor well suited to mobile landmark search. We believe our practice, as well as the benchmark dataset, will enrich the use cases and help identify requirements for Compact Descriptors for Visual Search.

While mobile visual search has attracted growing attention in recent years, a comprehensive benchmark database for fair evaluation among different strategies is still missing. In particular, the rich contextual cues available on mobile devices, such as GPS information and camera parameters, are left unexploited in current visual search benchmarks. This contribution introduces the Peking University Landmarks benchmark for the quantitative evaluation of mobile visual search performance with the support of GPS information. It contains 13,179 images organized into 198 distinct landmark locations within the Peking University campus, collected by 20 volunteers during November and December 2010. Each location is captured with multiple shot sizes and viewing angles, using both digital cameras and phone cameras, and each photo is tagged with the rich contextual information available in mobile scenarios. Moreover, the benchmark covers typical quality degradation scenarios in mobile photographing, including variable resolutions, blurring, lighting changes, occlusions, and varied viewing angles. Together with the benchmark, we provide bag-of-visual-words search baselines that use either spatial or contextual information in ranking the returned images. Finally, distractor images are introduced to evaluate the robustness of visual search methods on the database.
2 Motivation

Coming with the explosive growth of phone cameras, mobile visual search has received increasing interest in the computer vision, multimedia analysis, and information retrieval communities. However, state-of-the-art works are rarely compared with each other over a well-established benchmark database, which should be designed to target real-world mobile visual search scenarios that involve extensive photographing variances with phone cameras. In addition, the rich contextual cues, such as GPS, time stamp, and base station information, are extremely beneficial to refine purely visual ranking, yet the effectiveness and robustness of such cues are left unexploited in the existing visual search benchmarks.

We believe a real-world, context rich benchmark with sufficient coverage of users' photographing variances is important to put forward mobile visual search research and applications. In this contribution, we introduce the Peking University Landmarks benchmark to evaluate GPS context assisted mobile visual search performance. The dataset is collected from 198 landmark locations within the Peking University campus. Our benchmark provides sufficient real-world photographing variances, typically for mobile phone cameras. We put more focus on the availability of contextual cues to improve visual search performance, with a systematic methodology to evaluate the robustness of contextual cues by adding context distractors.

3 Benchmark Database Statistics

Scale and Constitution: The Peking University Landmarks benchmark (PKUBench) contains 13,179 scene photos, organized into 198 landmark locations and captured via both digital and phone cameras. There are in total 6,193 photos captured from digital cameras (SONY DSC-W290, Samsung Techwin <Digimax S830 / Kenox S830>, Canon DIGITAL IXUS 100 IS, NIKON COOLPIX L12 and Canon IXUS 210, with resolution 2592×1944) and 6,986 photos from mobile phone cameras (Nokia E72-1, HTC Desire, Nokia 5235, Apple iPhone, Apple iPhone 3G and LG Electronics KP500, with resolutions 640×480, 1600×1200 and 2048×1536) respectively. We recruited over 20 volunteers for data acquisition; each landmark is captured by a pair of volunteers, one using a digital camera and the other using a mobile phone, with a portable GPS device (HOLUX M-1200E) carried with them. The averaged viewing angle variations between the digital camera and phone camera photographers are within 10 degrees for both volunteers. Note that blurring and shaking happen more frequently in mobile phone capturing. In such cases, the volunteers compensate for a bad photo with additional shots, which thus produces more mobile phone photos than digital camera photos. All the images in the entire database were collected during November and December, 2010.

Fig. 1. Two typical scenarios of capturing landmark photos in different shot sizes and angles.

As illustrated in Figure 1, we capture photos in three different shot sizes, namely long shot, medium shot and close-up. For each shot size, there are at most 8 directions in photographing, which attempt to cover 360 degrees around the frontal view of the landmark, captured every 45 degrees respectively. The capturing of both digital camera and mobile phone photos underwent different weathers (sunny, cloudy, etc.) during November and December. The photo distributions with respect to landmark locations are given in Figure 2, where different colors denote the sample images of different landmarks. The percentage of mobile and camera photos is given in Figure 3.

Fig. 2. The landmark photo distribution, obtained by overlaying the location point of each collected photo on the Google Map of the Peking University campus.

Fig. 3. The percentages of both phone camera and digital camera photos in PKUBench.

Contextual Cues: Compared with generalized visual search, mobile visual search is closely related to the rich contextual information on the mobile phone. For instance, a mobile user's geographical location can be leveraged to pre-filter most unrelated scenes without visual ranking. Over PKUBench, we focus on the use of such contextual cues in facilitating visual search, including: (1) GPS tag (both latitude and longitude); (2) landmark name label; (3) shot size (long, medium, and close-up) and viewpoint (frontal, side, and others) of each photo; (4) camera type (digital camera or mobile phone camera); (5) capture time stamp. We also provide EXIF information such as camera settings (focal length, resolution).
In addition, we show the performance improvement from using contextual information by providing baselines that leverage GPS to refine visual ranking. Furthermore, the effects of less precise contextual information are investigated by adding distractor images, whose tags are generated by imposing random GPS distortion on the original GPS location of an image.
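As an illustration of how such random GPS distortion can be imposed, the sketch below jitters a (latitude, longitude) tag by up to a given radius. The function name, the 100 m default radius, and the flat-earth metre-to-degree conversion are our own assumptions, not the exact procedure used for the benchmark.

```python
import math
import random

def distort_gps(lat, lon, max_offset_m=100.0, rng=random):
    """Return (lat, lon) randomly shifted by up to max_offset_m metres.

    Uses a flat-earth approximation, adequate for offsets of a few
    hundred metres such as those used for GPS distractors."""
    # Draw a uniform random point inside a disc of radius max_offset_m.
    r = max_offset_m * math.sqrt(rng.random())
    theta = 2.0 * math.pi * rng.random()
    d_north = r * math.cos(theta)  # metre offset to the north
    d_east = r * math.sin(theta)   # metre offset to the east
    # Convert metre offsets to degrees (one degree of latitude is ~111,320 m).
    new_lat = lat + d_north / 111320.0
    new_lon = lon + d_east / (111320.0 * math.cos(math.radians(lat)))
    return new_lat, new_lon
```

Applying this to every distractor photo yields GPS tags that are plausibly near, but not at, the original locations, which is exactly the "less precise context" condition being evaluated.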
Scene Diversity: We provide landmark appearances as diverse as possible to simulate the real-world difficulty of visual search. Hence, the volunteers are encouraged to capture both query and ground truth photos (for both digital and phone cameras) without any particular effort to avoid intruding foreground objects, e.g. cars, human faces, and tree occlusions.
4 Comparing with Related Benchmarks
ZuBuD Database [2] is widely adopted to evaluate vision-based geographical location recognition. It contains 1,005 color images of 201 buildings or scenes (5 images per building or scene) in Zurich, Switzerland.
Oxford Buildings Database [3] contains 5,062 images collected from Flickr by searching for particular Oxford landmarks, with manually annotated ground truth for 11 different landmarks, each represented by 5 possible queries.
SCity Database [4] contains 20,000 street-side photos used for mobile visual search validation in the Microsoft Photo2Search system [4]. It was captured automatically along the main streets of Seattle by a car equipped with six surrounding cameras and a GPS device. The location of each captured photo is obtained by aligning the time stamps of the photos with the GPS record.
UKBench Database [5] contains 10,000 images of 2,500 objects, including indoor objects such as CD covers, book sets, etc. There are four images per object, offering sufficient variance in viewpoint, rotation, lighting conditions, scale, occlusion, and affine transforms.
Stanford Mobile Visual Search Data Set [6] contains camera-phone images of products, CDs, books, outdoor landmarks, business cards, text documents, museum paintings and video clips. It provides several unique characteristics, e.g. varying lighting conditions, perspective distortion, and mobile phone queries.
Table 1. Brief comparison of related benchmarking databases.

Database                             PKUBench  ZuBuD  Oxford  SCity   UKBench  Stanford
Data Scale                           13,179    1,005  5,062   20,000  10,000
Images Per Landmark/Object Category            5                      4
Adding Distractors into Database: The rationale of adding distractors into the database is to evaluate the effects of applying less precise contextual information to visual search. We collect a distractor set of 6,630 photos from the Summer Palace (note: the landmark buildings in the Summer Palace are visually similar to those in Peking University) and 2,012 photos from PKU, and randomly assign them the GPS tagging of the original database. We then select 10 locations (30 queries) from PKU to evaluate the resulting mAP degeneration.

5 Mobile Query Set

Our database provides rich query scenarios: (1) contextual information that simulates what we can get from mobile phones; (2) mobile phone queries with corresponding digital camera queries for comparison, revealing the performance degeneration of cellphone queries; (3) occlusions, blurring and shaking in the queries, which are beyond the state-of-the-art benchmarks.

Query Scenarios: Exemplar mobile query scenarios (in total 168 queries) are demonstrated to evaluate real-world visual search performance in challenging situations (see Figure 4):

Occlusive Query Set contains 20 mobile queries and 20 corresponding digital camera queries occluded by foreground cars, people, and buildings.

Background Clutter Query Set contains 20 mobile queries and 20 corresponding digital camera queries, often captured far away from a landmark, where the clutter comes from other nearby buildings.

Night Query Set contains 9 mobile phone queries and 9 digital camera queries, whose recognition heavily depends on the lighting conditions.

Blurring and Shaking Query Set contains 20 mobile queries with blurring or shaking, together with corresponding queries without any blurring or shaking.

Fig. 4. Examples of query scenarios (Digital Camera versus Mobile Phone). From top to bottom: Occlusive, Background Clutter, Blurring/Shaking, and Night.
Landmark Scale: We categorize the landmark scale by measuring the range of the walking distances of the photographers around each assigned landmark location. We come up with three scales: small, medium, and large. The typical distance for the small scale is 0-12 m, the medium scale is 12-30 m, and the large scale is over 30 m. As shown in Figure 5, we have 63 small, 75 medium, and 60 large landmarks.

Table 2. Typical landmark types of the three different landmark scales.

Small Scale (0-12 m):    Sculptures, stones, pavilions, gates and others.
Medium Scale (12-30 m):  Courtyards, and small or medium sized buildings, such as office buildings and historic buildings (smaller floor area).
Large Scale (> 30 m):    Large buildings, such as a library or complex building, or a long shot of a very large object (e.g. BoYa Tower).

Fig. 5. The photo volumes of the three different landmark scales in PKUBench (examples of different scales, from top to bottom: Small, Medium, and Large).
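The scale categorization reduces to a simple thresholding of the measured walking range; a minimal sketch (the function name is hypothetical, the thresholds are those of Table 2):

```python
def landmark_scale(walking_range_m):
    """Map a photographer's walking range (metres) to a landmark scale,
    using the Table 2 thresholds (0-12 m / 12-30 m / > 30 m)."""
    if walking_range_m <= 12.0:
        return "small"   # sculptures, stones, pavilions, gates
    if walking_range_m <= 30.0:
        return "medium"  # courtyards, small or medium sized buildings
    return "large"       # libraries, complex buildings, long shots
```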
Finally, we provide more photographing details in Figure 6 and Figure 7.

Fig. 6. Photo volume distribution by different shot sizes.

Fig. 7. Photo volume distribution by different viewing angles.

6 Mobile Visual Search Baselines

We provide several visual search baselines, including purely visual search as well as context assisted visual search:

(1) BoW: We extract SIFT [7] features from each photo, the ensemble of which is used to build a Scalable Vocabulary Tree [5] to generate the initial vocabulary V. The SVT generates a bag-of-words signature Vi for each database photo Ii. We denote the hierarchical level as H and the branching factor as B. In a typical setting, we have H = 6 and B = 10, producing approximately 100,000 codewords. We use mean Average Precision at N (mAP@N) to evaluate search performance, which reveals the position-sensitive ranking precision over the top N returned results.

(2) GPS + BoW: We further leverage the location context to refine the visual ranking by multiplying the GPS distance with the BoW distance to the query example, based on the weighting function:

Dis(A, Q) = GeoDis(A, Q) × BoWDis(A, Q)    (1)

where Dis(A, Q) is the overall distance between query Q and database image A, and GeoDis(A, Q) and BoWDis(A, Q) stand for the geographical distance (measured by GPS) and the BoW based visual distance between query Q and database image A respectively. Our ranking is based on the similarity measurement in Equation (1).

It is worth mentioning that RANSAC [8] based spatial re-ranking, while it typically gives promising results in traditional visual search experiments, does not produce satisfactory performance on this database. There are two possible reasons: (1) PKUBench contains lots of trees, which are too irregular for spatial re-ranking; (2) there are lots of similar buildings (such as ancient Chinese buildings) that share very similar local features, which cannot be well distinguished by RANSAC.
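A minimal sketch of the GPS + BoW ranking in Equation (1) is given below; haversine distance for GeoDis and cosine distance over sparse BoW histograms for BoWDis are illustrative assumptions, as the exact distance measures are not fixed here.

```python
import math

def geo_dis(a, q):
    """GeoDis: haversine distance in metres between (lat, lon) pairs."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *q))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2.0 * 6371000.0 * math.asin(math.sqrt(h))

def bow_dis(a, q):
    """BoWDis: cosine distance between sparse BoW histograms (dict word -> count)."""
    dot = sum(a[w] * q[w] for w in a.keys() & q.keys())
    na = math.sqrt(sum(v * v for v in a.values()))
    nq = math.sqrt(sum(v * v for v in q.values()))
    return 1.0 - dot / (na * nq) if na and nq else 1.0

def dis(a_gps, a_bow, q_gps, q_bow):
    # Equation (1): Dis(A, Q) = GeoDis(A, Q) * BoWDis(A, Q)
    return geo_dis(a_gps, q_gps) * bow_dis(a_bow, q_bow)

def rank(database, q_gps, q_bow):
    """Sort (image_id, gps, bow) entries by the combined distance, ascending."""
    return sorted(database, key=lambda e: dis(e[1], e[2], q_gps, q_bow))
```

Note that with a purely multiplicative combination, a zero visual (or geographical) distance drives the overall distance to zero regardless of the other term, so a practical implementation would typically add a small floor to each factor.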
h respect tenario respe
cclusive qumera(Y axis:
ive queries m shot of a arks aroundGPS inform
visual sear
base photo Il settlementse mean Avsition-sensit
rage the loce with the
( , )Dis A Q B
between quaphical distaquery Q andin Equationdiscovered
lts in tradit. There are ar for spatiings) that ha
to differenectively as f
ueries with : mAP@N p
come froma large scaled the query mation. Thirch with GP
Ii. We denot, we have
verage Precitive ranking
cation contBoW distan
( ,BoWDis A
uery Q and ance (measud database imn (1).
that the RAtional visuatwo possiblal re-rankinave very sim
nt challengfollows:
respect to performanc
m a large sce landmark.
location, ws may eveS informati
ote the hieraH = 6 an
ision at N (g precision a
text to refinnces to the
)Q (1)
database imured by GPmage A resp
ANSAC baal experimenle reasons: (ng; (2) Themilar local f
ging scena
difference e; X axis: to
cale landmaIn such cas
which woulden degeneraion.
archical levnd B = 10,(mAP@N) at the top N
ine the visuquery exam
)
mage A; GePS distance)pectively. O
ased spatial ents, does n(1) PKUBeere are lotsfeatures, wh
arios: We d
methods uop N return
ark, as occlse, GPS pod lead to pate the vis
el as H and producingto evaluate
N returning.
ual rankingmple, based
eoDis(A,Q)) as well asOur ranking
re-ranking,not producench usually
s of similarhich cannot
discuss the
sing digital
ning results)
usion oftensition tendserformancesual search
d g e
g d
) s g
, e y r t
e
l ).
n s e h
Fig. 9. The performance of background clutter queries with respect to different methods.

In practice, background clutter typically happens when capturing small scale landmarks. In such cases, the purely visual search performs worse, since in most queries the major part of the query photo is actually occupied by backgrounds.

Fig. 10. The performance of Night queries with respect to different methods.

The Night query is an interesting case, where GPS (contextual information) dominates the location recognition performance. Extracting distinguishing local features at night is very difficult, which is quite different from the day time. Hence, we can observe that using solely GPS is almost already sufficient at night. It is worth mentioning that, due to better image capturing quality, a digital camera can achieve better visual search performance than a mobile phone.
Fig. 11. The performance of blurring and shaking queries of phone camera photos.

From Fig. 11, we find that introducing blurring and shaking would definitely degenerate the visual search performance. However, by incorporating GPS into the similarity ranking, the results become much more acceptable compared with the pure visual query results.

Overall Performance Comparison: We further show the overall performance (168 queries of exemplar scenarios) with respect to using either digital cameras or mobile phone cameras in Figure 12, which gives an intuitive view of the mAP difference between digital camera and phone camera. Note that using solely visual search, the performance of camera photos is better than that of mobile phone photos; but with the combination of GPS, the performances of using either camera or mobile phone are almost identical.

Fig. 12. Overall performance comparison between using camera and mobile phones.
Figure 13 further compares the performance over the whole database (one image as the query, the rest as the searched dataset) among different landmark scales. It is worth mentioning that the visual search performance of large scale landmarks is much better than that of the medium and small scales, due to less background clutter and more distinguishing interest points. The GPS based search performance of the small scale is better than the large scale, as the GPS signal may be distorted around a larger scale landmark. Moreover, as GPS plays a relatively important role, the small-scale landmarks yield better results when fusing visual search and GPS information.

Fig. 13. Performance comparison among different scales of landmarks.

Finally, we investigate the overall performance of in total 570 queries (including the above 168 queries), as shown in Figure 14. Undoubtedly, the best results come from fusing GPS and visual search together. Although distractor images typically degenerate the performance of pure visual search, these degeneration effects are alleviated by integrating GPS with visual cues.

Fig. 14. Overall performance of 570 queries in Peking University Landmarks.
mAP Performance with respect to Different Search Baselines: Furthermore, from Figures 15-16, over those 168 queries, it is quite obvious that adding contextual information more or less improves performance, while different mobile query scenarios present diverse performances. Generally speaking, the worst performances originate from the blurring/Night queries and from adding distractor images. The former indicates that the use of visual interest points is challenged by mobile blurring queries. By comparing the results of adding distractors in Figs. 15 and 16, we can see that the use of contextual information should be taken seriously, while the simple combination is not robust enough for dealing with distractors.

Fig. 15. Solely BoW performance comparisons in five typical query scenarios.

Fig. 16. GPS+BoW performance comparisons in five typical query scenarios.
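For reference, the mAP@N measure reported in Figures 8-16 can be computed as follows; normalizing by min(N, number of relevant images) is a common convention that we assume here.

```python
def average_precision_at_n(ranked_ids, relevant_ids, n):
    """AP@N: precision is accumulated at every rank k <= n that hits a
    relevant image, then normalized by min(n, number of relevant images)."""
    hits = 0
    precision_sum = 0.0
    for k, image_id in enumerate(ranked_ids[:n], start=1):
        if image_id in relevant_ids:
            hits += 1
            precision_sum += hits / k
    denom = min(n, len(relevant_ids))
    return precision_sum / denom if denom else 0.0

def map_at_n(results, n):
    """mAP@N over all queries; results is a list of (ranked_ids, relevant_ids)."""
    return sum(average_precision_at_n(r, g, n) for r, g in results) / len(results)
```

Because each precision term is weighted by its rank k, the measure is position-sensitive: a relevant image returned at rank 1 contributes more than the same image returned at rank N.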
7 Application Scenarios
We briefly describe possible application scenarios of our Peking University Landmarks database as follows:
A benchmark dataset for mobile visual search: We hope the Peking University Landmarks can become a useful resource to validate mobile visual search systems. It emphasizes two important factors in mobile visual search: query quality and contextual cues. To the best of our knowledge, both are beyond the existing state-of-the-art benchmark databases. In addition, it offers a dataset to evaluate the effectiveness and robustness of contextual information.
A benchmark dataset for location recognition: This dataset can be used to evaluate traditional location recognition systems, since a GPS location is bound to each image instance.
A training resource for scene modeling: This dataset may facilitate scene analysis and modeling, since our photographing is well designed to cover multi-shot, multi-view appearances of landmarks at multiple scales. To this end, we will provide the camera calibration information in our future work.
A training resource to learn better photographing manners: Our landmark photo collection can be further exploited to learn the (recommended) mobile photographing manners (proper angle and shot size for different types of landmarks) towards better visual search results.
8 References
[1] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and F.-F. Li. ImageNet: A Large-Scale Hierarchical Image Database. CVPR, 2009.
[2] H. Shao, T. Svoboda, and L. Van Gool. ZuBuD - Zurich Buildings Database for Image Based Recognition. Technical Report, Computer Vision Lab, Swiss Federal Institute of Technology, 2006.
[3] J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman. Object Retrieval with Large Vocabularies and Fast Spatial Matching. CVPR, 2007.
[4] R. Ji, X. Xie, H. Yao, and W.-Y. Ma. Hierarchical Optimization of Visual Vocabulary for Effective and Transferable Retrieval. CVPR, 2009.
[5] D. Nister and H. Stewenius. Scalable Recognition with a Vocabulary Tree. CVPR, 2006.
[6] S. Tsai, D. Chen, G. Takacs, V. Chandrasekhar, J. Singh, and B. Girod. Location Coding for Mobile Image Retrieval. Proc. 5th International Mobile Multimedia Communications Conference (MobiMedia), 2009.
[7] D. G. Lowe. Distinctive Image Features from Scale-Invariant Keypoints. IJCV, 2004.
[8] M. Fischler and R. C. Bolles. Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography. Communications of the ACM, 24:381-395, 1981.
[9] R. Ji, L. Duan, T. Huang, H. Yao, and W. Gao. Compact Descriptors for Visual Search - Location Discriminative Mobile Landmark Search. CDVS Ad Hoc Group, Input Contribution m18542, 94th MPEG Meeting, Oct. 2010.