
Towards Personalized and Semantic Retrieval: An End-to-End Solution for E-commerce Search via Embedding Learning

Han Zhang 1†, Songlin Wang 1†, Kang Zhang 1, Zhiling Tang 1, Yunjiang Jiang 2, Yun Xiao 2, Weipeng Yan 1,2, Wen-Yun Yang 2∗

1 JD.com, Beijing, China
2 JD.com Silicon Valley Research Center, Mountain View, CA, United States

{zhanghan33, wangsonglin3, zhangkang1, tangzhiling, yunjiang.jiang, xiaoyun1, paul.yan, wenyun.yang}@jd.com

ABSTRACT

Nowadays e-commerce search has become an integral part of many people’s shopping routines. Two critical challenges remain in today’s e-commerce search: how to retrieve items that are semantically relevant but do not exactly match the query terms, and how to retrieve items that are more personalized to different users for the same search query. In this paper, we present a novel approach called DPSR, which stands for Deep Personalized and Semantic Retrieval, to tackle these problems. Specifically, we share our design decisions on how to architect a retrieval system so as to serve industry-scale traffic efficiently, and how to train a model so as to learn query and item semantics accurately. Based on offline evaluations and an online A/B test with live traffic, we show that the DPSR model outperforms existing models, and that the DPSR system can retrieve more personalized and semantically relevant items, significantly improving users’ search experience by +1.29% in conversion rate, and by +10.03% for long tail queries. As a result, our DPSR system has been deployed in JD.com’s search production since 2019.

CCS CONCEPTS

• Computing methodologies → Neural networks; • Information systems → Information retrieval;

KEYWORDS

Search; Semantic matching; Neural networks

1 INTRODUCTION

Over the recent decades, online shopping platforms (e.g., eBay, Walmart, Amazon, Tmall, Taobao and JD) have become increasingly popular in people’s daily life. E-commerce search, which helps users find what they need from billions of products, is an essential part of those platforms, contributing the largest percentage of transactions among all channels [18, 27, 28]. For instance, the top e-commerce platforms in China, e.g., Tmall, Taobao and JD, serve hundreds of millions of active users with a gross merchandise volume of hundreds of billions of US dollars. In this paper, we focus on the immense impact that deep learning has recently had on the e-commerce search system. At a glance, Figure 1 illustrates the user interface for searching on JD’s mobile app.

† Both authors contributed equally
∗ Corresponding author

Figure 1: Search interface on JD’s e-commerce mobile app.

1.1 Three Components of Search System

Figure 2 illustrates a typical e-commerce search system with three components: query processing, candidate retrieval, and ranking.

Query Processing rewrites a query (e.g., “cellphone for grandpa”) into a term based representation (e.g., [TERM cellphone] AND [TERM grandpa]) that can be processed by downstream components. This stage typically includes tokenization, spelling correction, query expansion and rewriting.

Candidate Retrieval uses offline built inverted indexes to efficiently retrieve candidate items based on term matching. This step greatly reduces the number of items from billions to hundreds of thousands, in order to make the fine ranking feasible.

Ranking orders the retrieved candidates based on factors such as relevance, predicted conversion ratio, etc. A production system may have cascading ranking steps, which sequentially apply simpler to more complex ranking functions from upstream to downstream.

In this paper, we focus solely on the candidate retrieval stage to achieve more personalized and semantic search results, since this stage contributes the most bad cases in our search production. Based on our analysis, around 20% of the dissatisfaction cases in the search traffic of JD.com, one of the largest e-commerce search engines in the world, can be attributed to the failure of this stage. How to deal

arXiv:2006.02282v3 [cs.IR] 5 Jun 2020

Figure 2: Major stages of an e-commerce search system.

with that in the ranking stage is out of scope for this paper, but will be our future work.

1.2 Two Challenges in Candidate Retrieval

How to efficiently retrieve more personalized and semantically relevant items remain two major challenges in modern e-commerce search engines.

Semantic Retrieval Problem refers to the fact that items that are semantically relevant but do not contain the exact terms of a query cannot be retrieved by traditional inverted indexes. As reported in [17], the most critical challenge for search systems is term mismatch between queries and items, especially for e-commerce search, where item titles are often short. Traditional web search often uses query rewriting to tackle this problem, which transforms the original query into another similar query that might better represent the search need. However, it is hard to ensure that the same search intention is preserved through a “middle man”, i.e., rewritten queries, and there is also no guarantee that relevant items containing different terms can be retrieved via a limited set of rewritten queries.

Personalized Retrieval Problem refers to the fact that traditional inverted indexes cannot retrieve different items according to the current user’s characteristics, e.g., gender, purchase power and so on. For example, we would like to retrieve more women’s T-shirts if the user is female, and vice versa. Some rule-based solutions that have been used in our system for years include 1) indexing tags for items, e.g., purchase power, gender and so on, the same way as tokens in the inverted index, and 2) building separate indexes for different groups of users. However, these previous approaches are too hand-crafted. Thus, they can hardly meet more subtle personalization needs.

1.3 Our Contributions

In this paper, we propose DPSR: Deep Personalized and Semantic Retrieval, to tackle the above two challenges in a leading industrial-scale e-commerce search engine. The contributions of our work can be summarized as follows.

In Section 3, we present an overview of our full DPSR embedding retrieval system composed of offline model training, offline indexing and online serving. We share our critical design decisions for productionizing this neural network based candidate retrieval into an industry-level e-commerce search engine.

In Section 4, we develop a novel neural network model with a two tower architecture, a multi-head design of the query tower, an attention based loss function, a negative sampling approach, an efficient training algorithm, and human supervision data, all of which are indispensable to train our best performing models.

In Section 5, we present our efforts on building a large-scale deep retrieval training system where we significantly customize the off-the-shelf TensorFlow API for online/offline consistency, input data storage and scalable distributed training, and on building an industrial-scale online serving system for embedding retrieval.

In Section 6, we conduct extensive embedding visualization, offline evaluation and online A/B tests to show that our retrieval system can help to find semantically related items and significantly improve users’ online search experience, especially for the long tail queries, which are difficult to handle in traditional search systems (i.e., improving conversion rate by around 10%).

2 RELATED WORK

2.1 Traditional Candidate Retrieval

For candidate retrieval, most research focuses on learning query rewrites [2, 10] as an indirect approach to bridge the vocabulary gap between queries and documents. Only a few new approaches, including latent semantic indexing (LSI) [6] with matrix factorization, probabilistic latent semantic indexing (PLSI) [12] with probabilistic models, and semantic hashing [25] with an auto-encoding model, have been proposed. All of these models are learned in an unsupervised manner from word co-occurrence in documents, without any supervised labels. Our approach differs from the previous methods in that we train a supervised model to directly optimize relevance metrics based on a large-scale data set with relevance signals, i.e., clicks.

2.2 Deep Learning Based Relevance Model

With the success of deep learning, a large number of neural network based models have been proposed to advance traditional information retrieval (IR) methods (e.g., BM25) and learning to rank methods [19] by learning semantic relevance between queries and documents. See [17] and [20] for comprehensive surveys of semantic matching and deep neural network based IR. In particular, DSSM [13] and its follow-up work CDSSM [26] pioneered the use of deep neural networks for relevance scoring. Recently, new models including DRMM [9] and Duet [21] have been further developed to include traditional IR lexical matching signals (e.g., query term importance, exact matching) in neural networks. However, as reported by [20], most of the proposed works in this direction focus on the ranking stage, where the optimization objectives and requirements are very different from those of candidate retrieval, which our work in this paper focuses on.

The two tower architecture for deep neural networks has been widely adopted in existing recommendation works [33, 34] to further incorporate item features. This model architecture is also known as a dual encoder in natural language processing [4, 11]. Here we propose a more advanced two tower model which is composed of a multi-head tower for the query and an attention loss based on a soft dot product instead of a simple inner product.

[Figure 3 diagram, three panels. Offline Model Training: a query tower (inputs: user profile, query tokens, user history events; averaged and concatenated, then projection matrices 1 to k, each followed by ReLU layers and normalization to give heads 1 to k) and an item tower (inputs: item title tokens, brand, category, shipping type; ReLU layers and normalization), combined by a soft dot product. Offline Indexing: item corpus, item embedding model, item embeddings, indexing. Online Serving: query, query embedding model serving, query embedding, embedding indexes, top K items.]

Figure 3: Overview of our DPSR retrieval system.

2.3 Embedding Retrieval in Search Engine

Recently, embedding retrieval technologies have been widely adopted in modern recommendation and advertising systems [5, 16, 36], but have not yet been widely used in search engines. We find a few works about retrieval problems in search engines [23, 30], but they have not been applied to industrial production systems. To the best of our knowledge, ours is one of the first practical explorations in this direction of applying embedding retrieval in an industrial search engine system.

3 OVERVIEW OF EMBEDDING RETRIEVAL SYSTEM

Before we present the details, let us first show a full picture of our embedding retrieval system. Figure 3 illustrates our production system with three major modules as follows.

Offline Model Training module trains a two tower model consisting of a query embedding model (i.e., query tower) and an item embedding model (i.e., item tower) for use in online serving and offline indexing, respectively. This two tower model structure is a careful and essential design to enable fast online embedding retrieval, which we will discuss more in Section 4. Moreover, we will also talk about our effort of optimizing the offline training system in Section 5.1.

Offline Indexing module loads the item embedding model (i.e., the item tower) to compute all the item embeddings from the item corpus, and then builds an embedding index offline to support efficient online embedding retrieval. As it is infeasible to exhaustively search over the item corpus of billions of items to find similar item embeddings for a query embedding, we employ one of the state-of-the-art algorithms [15] for efficient nearest neighbor search of dense vectors.

Online Serving module loads the query embedding model (i.e., the query tower) to transform any user input query text into a query embedding, which is then fed to the item embedding index to retrieve K similar items. Note that this online serving system has to be built with a low latency of tens of milliseconds. Also, it must be scalable to hundreds of thousands of queries per second (QPS), and flexible for agile iterations of experiments. We will talk about our efforts of building such an online serving system in Section 5.2.

4 EMBEDDING LEARNING MODEL

In this section, we introduce the embedding learning model in a stepwise fashion, in the order of two tower architecture, multi-head design of the query tower, attentive loss function, hybrid negative sampling, and human supervision data, all of which are indispensable to train our best performing model.

4.1 Two Tower Model Architecture

As shown in the offline model training module in Figure 3, the model is composed of a query tower $Q$ and an item tower $S$. For a given query $q$ and an item $s$, the scoring output of the model is

$$f(q, s) = G(Q(q), S(s)),$$

where $Q(q) \in \mathbb{R}^{d \times m}$ denotes the query tower output of $m$ query embeddings in $d$-dimensional space. Similarly, $S(s) \in \mathbb{R}^{d \times n}$ denotes the item tower output. The scoring function $G(\cdot, \cdot)$ computes the final score between the query and item. Researchers and practitioners often let query tower $Q$ and item tower $S$ both output one single embedding, i.e., $m = 1$ and $n = 1$, and choose $G$ as the inner product, i.e., $G(Q(q), S(s)) = Q(q)^\top S(s)$, where the superscript $\top$ denotes matrix transpose. This simplest setup has proved successful in many applications [5].

The key design principle for such a two tower architecture is to make the query embedding and the item embedding independent of each other after the model is trained, so that we can compute them separately. All item embeddings can be computed offline in order to build an item embedding index for fast nearest neighbor search online, and the query embedding can be computed online to handle all possible user queries. Even though the embeddings are computed separately, due to the simple dot product interaction between query and item towers, the query and item embeddings are still theoretically in the same geometric space. Thus, finding K nearest items for a given query embedding is equivalent to minimizing the loss for K query item pairs where the query is given.
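As a minimal sketch of this separation (not the production towers), the Python below precomputes item embeddings offline and scores a query against them online with a plain inner product. The token vectors, item ids and the `embed` helper are hypothetical toy stand-ins for the trained towers, and the exhaustive scan in `search` stands in for the approximate nearest neighbor index a real system would use.

```python
import heapq
import math

# Hypothetical 2-d token vectors standing in for learned embeddings.
TOKEN_VECS = {
    "phone": [1.0, 0.0],
    "case": [0.6, 0.8],
    "fruit": [0.0, 1.0],
}

def embed(tokens):
    """Toy tower: average the token vectors, then L2-normalize."""
    dim = len(next(iter(TOKEN_VECS.values())))
    avg = [sum(TOKEN_VECS[t][j] for t in tokens) / len(tokens) for j in range(dim)]
    norm = math.sqrt(sum(x * x for x in avg))
    return [x / norm for x in avg]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Offline: every item embedding is computed once and stored.
ITEM_INDEX = {
    "item_phone": embed(["phone"]),
    "item_phone_case": embed(["phone", "case"]),
    "item_fruit": embed(["fruit"]),
}

def search(query_tokens, k):
    """Online: embed the query, then take the k largest inner products."""
    q = embed(query_tokens)
    return heapq.nlargest(k, ITEM_INDEX, key=lambda item_id: dot(q, ITEM_INDEX[item_id]))
```

Because the item side never sees the query, `ITEM_INDEX` can be rebuilt offline on its own schedule while `search` only runs the query side at request time.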

In the sections below, we introduce a novel design of query tower Q and an interaction function G to achieve better-performing and explainable retrieval results. Since item representations are normally straightforward, we keep the item tower S simple. It concatenates all item features as the input layer, then goes through a multi-layer perceptron (MLP) of fully connected Rectified Linear Units (ReLU) to output a single item embedding, which is finally normalized to the same length as the query embedding, as shown in the right side of the offline model training panel in Figure 3. A similar MLP structure can be found in previous work [5].

4.2 Query Tower with Multi-heads

As shown in the left side of the offline model training panel in Figure 3, the query tower differs from the item tower in two places: 1) A projection layer that projects the one input dense representation to K dense representations. Another choice here is to use K independent embedding sets, but this requires a larger model size. In practice, we choose the projection layer to achieve similar results with a much smaller model size. 2) K separate encoding MLPs, each of which independently outputs one query embedding that could potentially capture a different intention of the query. We refer to these K output embeddings as multi-head representations.

These multiple query embeddings provide rich representations of the query’s intentions. Typically, we find in practice that they can capture different semantic meanings for a polysemous query (e.g., “apple”), different popular brands for a product query (e.g., “cellphone”), and different products for a brand query (e.g., “Samsung”).

It is worth mentioning that the encoding layer can use any other more powerful neural network, such as a Recurrent Neural Network (RNN) or other state-of-the-art transformer based models [7, 24, 29]. In a separate offline study, we achieved similar or slightly better results with these advanced models. However, we would like to emphasize that a simple MLP is more applicable to our industrial production modeling system, since it is much more efficient for both offline training and online serving, which means that we are able to feed more data to the model training, and deploy fewer machines to serve the model. These are strong deal breakers in the industrial world.

4.3 Optimization with Attention Loss

Apart from the single embedding and inner product setup, here we develop a more general form for multiple query embeddings. As a shorthand, we denote the outputs of query tower $Q(q)$ as $\{e_1, e_2, \ldots, e_m\}$ where $e_i \in \mathbb{R}^d$, and the single output of item tower $S(s)$ as $g \in \mathbb{R}^d$. Then the soft dot product interaction between query and item can be defined as follows,

$$G(Q(q), S(s)) = \sum_{i=1}^{m} w_i \, e_i^\top g. \qquad (1)$$

This scoring function is basically a weighted sum of all inner products between $m$ query embeddings and one item embedding. The weights $w_i$ are calculated from the softmax of the same set of inner products,

$$w_i = \frac{\exp(e_i^\top g / \beta)}{\sum_{j=1}^{m} \exp(e_j^\top g / \beta)},$$

where $\beta$ is the temperature parameter of the softmax. Note that the higher $\beta$ is, the more uniform the attention weights appear. If $\beta \to 0$, then the soft dot product in Equation (1) becomes equivalent to selecting the largest inner product, i.e., $\max_i e_i^\top g$.
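The soft dot product of Equation (1) can be sketched in a few lines of plain Python. This is an illustrative version only: the function name `soft_dot_product` and the list-of-lists representation are assumptions, and the max-subtraction inside the softmax is a standard numerical-stability step that is not described in the text.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def soft_dot_product(heads, item, beta=1.0):
    """Equation (1): softmax-weighted sum of the per-head inner products.

    heads: list of m query embeddings e_1..e_m
    item:  single item embedding g
    beta:  softmax temperature (must be > 0)
    Returns (score, weights).
    """
    scores = [dot(e, item) for e in heads]
    scaled = [s / beta for s in scores]
    shift = max(scaled)  # stability shift; does not change the softmax value
    exps = [math.exp(x - shift) for x in scaled]
    total = sum(exps)
    weights = [x / total for x in exps]
    return sum(w * s for w, s in zip(weights, scores)), weights
```

With two heads whose inner products with the item are 0.8 and 0.6, a small temperature such as `beta=0.01` gives a score very close to 0.8 (the maximum), while a very large temperature gives a score close to their mean, 0.7, matching the two limiting behaviours described above.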

A typical industrial click log data set usually contains only click pairs of query and item. The pairs are usually relevant, and thus can be treated as positive training examples. Besides that, we also need to collect negative examples by various sampling techniques, which we discuss in Section 4.4. Let us define the set $\mathcal{D}$ of all training examples as follows,

$$\mathcal{D} = \left\{ (q_i, s_i^+, \mathcal{N}_i) \;\middle|\; i,\; r(q_i, s_i^+) = 1,\; r(q_i, s_j^-) = 0 \;\; \forall s_j^- \in \mathcal{N}_i \right\}, \qquad (2)$$

where each training example is a triplet composed of a query $q_i$, a positive item $s_i^+$ that is relevant to the query, denoted as $r(q_i, s_i^+) = 1$, and a negative item set $\mathcal{N}_i$ where every element $s_j^-$ is irrelevant to the query, denoted as $r(q_i, s_j^-) = 0$. Then we can employ a hinge loss with margin $\delta$ over the training data set $\mathcal{D}$ as follows,

$$\mathcal{L}(\mathcal{D}) = \sum_{(q_i, s_i^+, \mathcal{N}_i) \in \mathcal{D}} \; \sum_{s_j^- \in \mathcal{N}_i} \max\left(0,\; \delta - f(q_i, s_i^+) + f(q_i, s_j^-)\right).$$
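To make the hinge loss concrete, here is a small sketch that takes already-computed scores $f(q_i, s_i^+)$ and $f(q_i, s_j^-)$ as plain numbers. The function names are illustrative assumptions; the default margin of 0.1 is the value reported in the experiments section.

```python
def hinge_loss(pos_score, neg_scores, delta=0.1):
    """Sum of max(0, delta - f(q, s+) + f(q, s-)) over one example's negatives."""
    return sum(max(0.0, delta - pos_score + neg) for neg in neg_scores)

def dataset_loss(examples, delta=0.1):
    """examples: iterable of (pos_score, neg_scores) pairs, one per triplet in D."""
    return sum(hinge_loss(pos, negs, delta) for pos, negs in examples)
```

A negative only contributes when its score comes within the margin of the positive score: with a positive score of 0.9, a negative at 0.2 adds nothing, while negatives at 0.85 and 0.95 add 0.05 and 0.15 respectively.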

The above attention loss is only applied in offline training. During online retrieval, each query head retrieves the same number of items. Then all the items are sorted and cut off based on their inner products with one of the heads.
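One possible reading of this online procedure is sketched below. It is an assumption, not the documented implementation: each head retrieves `per_head` items by exhaustive inner product, and an item retrieved by several heads keeps its largest inner product before the merged list is sorted and cut to `final_k`. All names are hypothetical.

```python
import heapq

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def retrieve_multi_head(heads, item_index, per_head, final_k):
    """Merge per-head retrievals into one ranked list of item ids."""
    best = {}
    for e in heads:
        # Every head retrieves the same number of candidates.
        top = heapq.nlargest(per_head, item_index, key=lambda i: dot(e, item_index[i]))
        for item_id in top:
            score = dot(e, item_index[item_id])
            # Assumed tie-break: keep the best score among the retrieving heads.
            best[item_id] = max(score, best.get(item_id, float("-inf")))
    ranked = sorted(best, key=best.get, reverse=True)
    return ranked[:final_k]
```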

4.4 Click Logs with Negative Sampling

Training a deep model requires a huge amount of data. We explore click logs, which represent users’ implicit relevance feedback and consist of a list of queries and their clicked items, to train our embedding retrieval model. Intuitively, we can assume that an item is relevant, at least partially, to the query if it is clicked for that query. Formally, we can consider click logs as a special case of a data set with only positive examples. How to efficiently collect negative examples is then a crucial question. In practice, we employ a hybrid approach that mixes two sources of negative samples: random negatives and batch negatives.

4.4.1 Random Negatives. The random negative set $\mathcal{N}_i^{\mathrm{rand}}$ is uniformly sampled from all candidate items. Formally, given a set of all $N$ available items, we draw a random integer variable from a uniform distribution $i \sim \mathrm{Uniform}(1, N)$, and take the $i$-th element from the item set into the random negative set $\mathcal{N}_i^{\mathrm{rand}}$. However, if we apply this uniform sampling in a straightforward way, it would be very computationally expensive, since each negative sample has to go through the item tower, not to mention the cost of sampling those negative examples and fetching their features. To minimize the computational cost while retaining its effect, we use the same random negative set for all training examples in a batch. In practice, we found the results are similar to those using pure random negatives, but the training speed is much faster.
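The saving from sharing one random negative set per batch can be illustrated as follows. This is a sketch under assumptions: the function names are hypothetical, items are indexed from 0 rather than 1, and sampling is with replacement.

```python
import random

def sample_random_negatives(num_items, num_negatives, rng):
    """Draw item indices uniformly from 0..num_items-1 (with replacement)."""
    return [rng.randrange(num_items) for _ in range(num_negatives)]

def item_tower_passes(batch_size, num_negatives, shared):
    """Item-tower forward passes spent on random negatives in one batch."""
    return num_negatives if shared else batch_size * num_negatives
```

For a batch of 64 examples and 8 random negatives, sampling independently per example would need 512 item-tower passes for the negatives, while a shared set needs only 8.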

4.4.2 Batch Negatives. The batch negative set $\mathcal{N}_i^{\mathrm{batch}}$ is collected by permuting the positive query item pairs in a training batch. In detail, for a training batch

$$\mathcal{B} = \{ (q_i, s_i^+, \mathcal{N}_i) \mid i \},$$

we can collect more negative examples for the $i$-th example as

$$\mathcal{N}_i^{\mathrm{batch}} = \{ s_k^+ \mid k \neq i,\; 1 \leq k \leq |\mathcal{B}| \}.$$

We can see that batch negatives are basically sampled according to item frequency in the dataset. These randomly generated query and item pairs are very unlikely to be relevant by chance. Specifically, the chance is equal to that of two randomly drawn click logs having relevant items for each other. Given a dataset of hundreds of millions of click logs, this chance is basically negligible in terms of training accuracy. Also, the main advantage of the above batch negatives is the reuse of the item embedding computations. Each item embedding in the batch serves once as a positive example, and $|\mathcal{B}| - 1$ times as a negative example for other queries in the batch, but with only one feature fetching and forward pass of the item tower.
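The batch negative construction amounts to reading the off-diagonal entries of a query-by-item score matrix. A minimal sketch, with illustrative function names and single-embedding queries for simplicity:

```python
def batch_negatives(positive_items, i):
    """Positives of the other examples in the batch act as negatives for example i."""
    return [s for k, s in enumerate(positive_items) if k != i]

def score_matrix(query_embs, item_embs):
    """All b x b inner products; entry [i][i] is the positive pair of example i."""
    return [[sum(a * b for a, b in zip(q, s)) for s in item_embs] for q in query_embs]
```

Each item embedding is computed once and reused in one full column of the matrix, which is the reuse of item tower computation described above.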

4.4.3 Mixing Ratio. Eventually, the complete negative item set $\mathcal{N}_i$ in Equation (2) is the union of the above two sets,

$$\mathcal{N}_i = \mathcal{N}_i^{\mathrm{rand}} \cup \mathcal{N}_i^{\mathrm{batch}}.$$

In our practice of e-commerce search retrieval, we find it is typically useful to have a mixing ratio parameter $0 \leq \alpha \leq 1$ for the composition of the negative sample set. Formally, we use proportion $\alpha$ of random negatives, and proportion $(1 - \alpha)$ of batch negatives. We find that the value of $\alpha$ highly correlates with the popularity of items retrieved from the model (see Experiments), and is thus highly influential to online metrics. Intuitively, we can see that the mixing ratio $\alpha$ determines the item distribution in negative examples, from uniform distribution ($\alpha = 1$) to actual item frequency ($\alpha = 0$). In this manner, the model tends to retrieve more popular items for larger $\alpha$, as popular items appear relatively less frequently in negative examples.
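The text does not spell out how the proportions are realized, so the following is only one plausible sketch: for a fixed total number of negatives, take `round(alpha * total)` from the random pool and the rest from the batch pool. The function name and the fixed-total assumption are hypothetical.

```python
import random

def mix_negatives(random_pool, batch_pool, total, alpha, rng):
    """Pick `total` negatives: a fraction alpha from random_pool, the rest from batch_pool."""
    n_rand = round(alpha * total)
    n_batch = total - n_rand
    # rng.sample draws without replacement, so each pool must be large enough.
    return rng.sample(random_pool, n_rand) + rng.sample(batch_pool, n_batch)
```

Setting `alpha=1.0` yields only uniformly sampled items, and `alpha=0.0` yields only other examples' positives, which follow item click frequency.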

4.4.4 Summary. We summarize the full training algorithm with batch negatives and random negatives in Algorithm 1. The computational complexity of each training step is O(b²), i.e., quadratic in the batch size b, since the batch negatives require an inner product between every query and item embedding pair in the batch. In practice, since the batch size is usually small, e.g., 64 or 128, the quadratic effect is actually much smaller than other computational costs, i.e., feature fetching, gradient computation, and so on. In fact, with batch negatives, the overall convergence is actually faster, due to the efficient use of every item tower output.

Algorithm 1 DPSR training algorithm
1: input: Dataset D, batch size b, max number of steps T, mixing ratio α.
2: for t = 1 . . . T do
3:     Sample a batch of b examples B ⊆ D+.
4:     Sample a set of random negatives N^rand for this batch. Note that all examples in the batch share this set.
5:     Compute query head embeddings Q(q) from the query tower.
6:     Compute item embeddings S(s) for all items s_i in the batch, and those in the random negative set N^rand.
7:     Compute the loss function value L(B) for this batch B. The batch negatives N^batch are implicitly computed and included in the loss.
8:     Update towers Q and S by back propagation.
9: end for
10: return query tower Q and item tower S.
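The loss computation inside one training step can be sketched as below. This covers only the forward pass for a batch, with no gradient update, and it simplifies the model to a single query embedding scored by a plain inner product; the function name and argument layout are assumptions.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def batch_loss(query_embs, pos_item_embs, rand_neg_embs, delta=0.1):
    """Hinge loss for one batch with implicit batch negatives and shared random negatives."""
    loss = 0.0
    for i, q in enumerate(query_embs):
        pos = dot(q, pos_item_embs[i])
        # Batch negatives: positives of the other examples (b * (b - 1) pairs overall).
        negs = [s for k, s in enumerate(pos_item_embs) if k != i]
        # Random negatives: the same set is appended for every example in the batch.
        negs = negs + list(rand_neg_embs)
        loss += sum(max(0.0, delta - pos + dot(q, s)) for s in negs)
    return loss
```

The nested iteration over queries and in-batch items is where the quadratic cost in the batch size comes from.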

4.5 Human Supervision

Beyond using click log data, our model is also able to utilize additional human supervision to further correct corner cases, incorporate prior knowledge and improve its performance. The human supervision comes from three sources:

• Most skipped items can be automatically collected from online logs [14]. These items and the associated queries can be used as negative examples.

• Human generated data can be collected based on domain knowledge as artificial negative query item pairs (e.g., cellphone cases are generated as negative items for the query “cellphone”, because they share similar product words literally but differ significantly in semantic meaning) and positive query item pairs (e.g., iPhone 11 items are generated as positive items for the query “newest large screen iphone”).

• Human labels and bad case reports are normally used totrain relevance models [35]. We also include them as bothpositive and negative examples in the training data set.

These human supervision data can be fed into the model as either positive query item pairs or items in the random negative set.

5 EMBEDDING RETRIEVAL SYSTEM

We employ TensorFlow [1] as our training and online serving framework, since it has been widely used in both academia and industry. In particular, it offers high training speed, with a static graph pre-built before training, and seamless integration between training and online serving. We built our system based on the high level TensorFlow API called Estimator [32]. To ensure the best performance and system consistency, we have also made significant efforts to bridge the gap between an off-the-shelf TensorFlow package and an industry level deep learning system.

5.1 Training System Optimizations

5.1.1 Consistency Between Online and Offline. One of the common challenges for building a machine learning system is to guarantee offline and online consistency. A typical inconsistency usually happens at the feature computation stage, especially if two separate programming scripts are used in offline data preprocessing and the online serving system. In our system, the most vulnerable part is the text tokenization, carried out three times: in data preprocessing, model training and online serving. Aware of this, we implement one unique tokenizer in C++, and wrap it with a very thin Python SWIG interface [3] for offline data vocabulary computation, and with a TensorFlow C++ custom operator [1] for offline training as well as online serving. Consequently, it is guaranteed that the same tokenizer code runs through raw data preprocessing, model training and online prediction.

5.1.2 Compressed Input Data Format. A typical data format for an industrial search or recommendation training system is usually composed of three types of features: user features (e.g., query, gender, locale), item features (e.g., popularity), and user-item interaction features (e.g., was it seen by the user). The plain input data repeats user and item features many times since the training data stores all user item interaction pairs, which results in hundreds of terabytes of disk space occupation, more data transfer time and slow training speed. To solve this problem, we customized the TensorFlow Dataset [31] to assemble training examples from three separate files: a user feature file, an item feature file and an interaction file with query, user id and item id. The user and item feature files are first loaded into memory as feature lookup dictionaries, then the interaction file is iterated over during the training steps with the user and item features appended. With this optimization, we successfully reduced the training data size to 10% of the original size.
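The idea of joining interactions against in-memory lookup tables can be shown in plain Python. This is only an illustration of the data layout, not the customized TensorFlow Dataset itself; the function name, the tuple layout of an interaction, and the feature names in the usage example are all hypothetical.

```python
def assemble_examples(user_features, item_features, interactions):
    """Yield full training examples by joining interactions with feature lookups.

    user_features: dict user_id -> dict of user features (loaded once)
    item_features: dict item_id -> dict of item features (loaded once)
    interactions:  iterable of (query, user_id, item_id) rows
    """
    for query, user_id, item_id in interactions:
        example = {"query": query}
        example.update(user_features[user_id])
        example.update(item_features[item_id])
        yield example
```

For instance, with `user_features = {"u1": {"gender": "f"}}`, `item_features = {"i1": {"popularity": 0.9}}` and a single interaction `("phone", "u1", "i1")`, the generator yields `{"query": "phone", "gender": "f", "popularity": 0.9}`. The user and item features are stored once on disk instead of once per interaction row.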

5.1.3 Scalable Distributed Training. In the scenario of distributed training with parameter servers, one of the common bottlenecks is network bandwidth. Most mainframe network bandwidth in industry is 10 Gbit/s, which is far from enough for large deep learning models. We observed that the off-the-shelf TensorFlow Estimator implementation is not optimized enough when handling embedding aggregation (e.g., sum of embeddings), so the network bandwidth quickly becomes a bottleneck when adding a handful of workers. To further scale up the training speed, we improved the embedding aggregation operator in the official TensorFlow implementation by moving the embedding aggregation operation inside the parameter server, instead of in the workers. Thus, only one embedding is transferred between parameter server and worker for each embedding aggregation, instead of tens of them. Therefore, network bandwidth usage is significantly reduced, and the distributed training can be scaled up to five times more machines.
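A back-of-the-envelope calculation shows why moving the aggregation matters. The helper below is a hypothetical sketch; the 4 bytes per value (float32) and the example of 20 embeddings per aggregation are assumptions, while the dimension 64 matches the embedding dimension reported in the experiments.

```python
def bytes_per_aggregation(num_embeddings, dim, bytes_per_value=4, aggregate_on_server=False):
    """Bytes sent from parameter server to worker for one embedding aggregation."""
    # Server-side aggregation ships one already-summed vector instead of all inputs.
    vectors_sent = 1 if aggregate_on_server else num_embeddings
    return vectors_sent * dim * bytes_per_value
```

With 20 embeddings of dimension 64, worker-side aggregation transfers 5120 bytes per aggregation, whereas server-side aggregation transfers 256 bytes, a reduction by a factor equal to the number of aggregated embeddings.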

5.2 Online Serving System

The overview of the DPSR online serving system is shown in Figure 4. The system consists of two novel parts that we would like to elaborate on: one TensorFlow Servable [22] model, and a proxy for model sharding.

5.2.1 One Servable Model. The straightforward implementation of DPSR can be composed of two separate parts: query embedding computation, and nearest neighbor lookup. Without careful design, one can simply build two separate online services for them. However, this is not the optimal system design for two reasons: a) it introduces complexity to manage the mapping between

[Figure 4 diagram: all requests enter a Proxy Module, which forwards each request (e.g., a request for Servable III or a request for Servable V) to one of Model Servers 1 to N. Each model server hosts several Servables, and a Servable contains a query embedding model together with its embedding indexes.]

Figure 4: Online serving system for DPSR.

query embedding model and item embedding indexes, which could cause complete system failure if a mapping mistake happens; b) it needs two network round trips to compute the nearest neighbors for a given query text. To overcome these issues, we take a more optimized approach by utilizing the TensorFlow Servable [22] framework, where we can unify the two parts into one model. As shown in Figure 4, the two parts can be encapsulated into one Servable. The query embedding is sent directly from the query embedding model to the item embedding index via computer memory, instead of via the computer network.

5.2.2 Model Sharding. Further scaling up the system requires supporting hundreds of DPSR models online at the same time, for different retrieval tasks and for various model A/B experiments. However, one servable model consisting of one query embedding model and one item embedding index usually takes tens of gigabytes of memory. Thus, it becomes infeasible to store all the models in one machine’s memory, and we have to build a system to support serving hundreds of DPSR models. We solve this problem with a proxy module, which directs model prediction requests to one of the model servers that hold the corresponding model, as shown in Figure 4. This infrastructure is not only designed for DPSR, but serves as a general system for supporting all deep learning models in our search production.
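The routing role of the proxy can be sketched as a lookup from model name to the servers holding that model. The class below is a hypothetical illustration, not the production proxy: the class name, the string server identifiers, and the "first registered server" choice are all assumptions, and a real proxy would also handle load balancing and failures.

```python
class ModelProxy:
    """Routes a prediction request to a model server that holds the requested model."""

    def __init__(self):
        self._servers_by_model = {}

    def register(self, server, model_names):
        """Record that `server` holds every model in `model_names`."""
        for name in model_names:
            self._servers_by_model.setdefault(name, []).append(server)

    def route(self, model_name):
        """Return a server holding `model_name`, or raise KeyError if none does."""
        servers = self._servers_by_model.get(model_name)
        if not servers:
            raise KeyError(f"no model server holds {model_name!r}")
        # Simple deterministic choice for illustration only.
        return servers[0]
```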

6 EXPERIMENTS

In this section, we first visualize the embedding results using t-SNE in Section 6.1, to get an intuition of how the model works. Then we report offline evaluations by comparing with different methods in Section 6.2. Next, we report online A/B test results in our search production, one of the largest e-commerce search engines in the world, in Section 6.3. Furthermore, we also report the offline indexing and online serving time of our DPSR system in Section 6.4, to demonstrate its efficiency, which is crucial in the industrial world.

Our production DPSR model is trained on a data set of 60 days of user click logs, which contains 5.6 billion sessions. We conducted distributed training in a cluster of five 48-core machines, with a total of 40 workers and 5 parameter servers launched. We used margin parameter δ = 0.1, the AdaGrad [8] optimizer with learning rate 0.01, batch size b = 64, and embedding dimension d = 64. The training converges in about 400 million steps, taking about 55 hours.

6.1 Embedding Visualization and Analysis

6.1.1 Embedding Topology. To have an intuition of how our embedding retrieval model works, we illustrate the 2-D t-SNE coordinates for frequent items chosen from the most popular 33 categories in our platform. As shown in Figure 5, the item embeddings are structured in a very explicit and intuitive way. The electronics related categories, e.g., phones, laptops, tablets, earphones and monitors, are placed on the left side of the figure. The appliance related categories, e.g., refrigerator, flat TV, air conditioner, washer and so on, are placed on the lower left side. The food related categories, e.g., snacks, instant food, cookies and milk powder, are placed on the lower right part. The cleaning and beauty related categories, e.g., face cream and shampoo, are placed on the right part. The clothes related categories, e.g., shoes, running shoes, sweaters and down jackets, are placed on the upper right part. Overall, this reasonable and intuitive embedding topology reflects that the proposed model learns the item semantics well, which in turn enables query embeddings to retrieve relevant items.

6.1.2 Multi-Head Disambiguation. In Figure 6b, we also compute the 2-D t-SNE coordinates of frequent items chosen from 10 commodity categories to illustrate the effect of having multiple heads in the query tower. We use two polysemous queries as examples here, “apple” and “cellphone”, which are also among the top-10 queries on our platform. We can see that the two heads for the query “apple” separately retrieve iPhone/Macbook and apple fruit. In Figure 6c, we can see that the two heads for the query “cellphone” retrieve the two most popular brands, Huawei and Xiaomi, separately. The illustration shows that different heads are able to focus on different possible user intentions. In contrast, the single-head model in Figure 6a does not cluster the cellphone category well: the iPhones form another cluster far away from the other cellphones, potentially due to the ambiguity of the very top query “apple”.
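The behavior described above can be sketched as follows: each query head retrieves its own nearest items by inner product, and the per-head lists are then merged. This is a simplified illustration rather than the production implementation; the 2-D embeddings, item names, and the merge-by-union step are assumptions made for the example.

```python
import heapq


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def retrieve_per_head(query_heads, item_embeddings, k):
    """For each query head, return the ids of its k highest inner-product items."""
    results = []
    for head in query_heads:
        top = heapq.nlargest(
            k, item_embeddings, key=lambda item_id: dot(head, item_embeddings[item_id])
        )
        results.append(top)
    return results


def merge_heads(per_head_results):
    """Union the per-head lists, keeping first-seen order and dropping duplicates."""
    seen, merged = set(), []
    for ids in per_head_results:
        for item_id in ids:
            if item_id not in seen:
                seen.add(item_id)
                merged.append(item_id)
    return merged


# Hypothetical toy embeddings: axis 0 ~ "electronics", axis 1 ~ "fruit".
items = {
    "iphone": [0.9, 0.1],
    "macbook": [0.8, 0.2],
    "apple_fruit": [0.1, 0.9],
    "pear": [0.2, 0.8],
    "washer": [-0.5, -0.5],
}
apple_heads = [[1.0, 0.0], [0.0, 1.0]]  # two heads for the query "apple"

per_head = retrieve_per_head(apple_heads, items, k=2)
merged = merge_heads(per_head)
print(per_head)
print(merged)
```

With these toy vectors, one head returns the electronics items and the other returns the fruit items, mirroring the separation shown in Figure 6b.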

6.1.3 Semantic Matching. To better understand how our proposed model performs, we show a few good cases from our retrieval production in Table 1. We can observe that DPSR is surprisingly capable of bridging queries and relevant items by learning the semantic meaning of some words, such as big kid to 3-6 years old, free-style swimming equipment to hand paddle, and grandpa to senior. DPSR is also able to correct typos in the query, such as v bag to LV bag, and ovivo cellphone to vivo cellphone, partially because we leverage English letter trigrams in the token vocabulary. We also observed similar typo corrections for Chinese characters, which are mainly learned from user clicks and n-gram embeddings.
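A small sketch can illustrate why letter trigrams help with typos such as "ovivo": the misspelled and the correct token share most of their trigrams, so their token-level representations overlap. The '#' boundary padding and the Jaccard overlap measure are assumptions chosen for this example, not details stated in the paper.

```python
def letter_trigrams(token):
    """Return the set of letter trigrams of a token padded with '#' boundaries."""
    padded = f"#{token}#"
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def jaccard(a, b):
    """Jaccard similarity between two sets."""
    return len(a & b) / len(a | b)


typo = letter_trigrams("ovivo")    # {'#ov', 'ovi', 'viv', 'ivo', 'vo#'}
correct = letter_trigrams("vivo")  # {'#vi', 'viv', 'ivo', 'vo#'}
shared = typo & correct

print(sorted(shared))
print(jaccard(typo, correct))
```

Three of the four trigrams of "vivo" also appear in "ovivo", so a model that embeds trigrams sees the two tokens as closely related even though they differ as whole words.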

6.2 Offline Evaluations

6.2.1 Metrics. We use the following offline metrics to evaluate the retrieval methods.

Top-k is defined as the probability that a relevant item is ranked within the top k retrieved results among N (we used 1,024) random items for a given query. This top-k value is empirically estimated by averaging over 200,000 random queries. A higher top-k indicates a better retrieval quality, i.e., hit rate.

AUC is computed on a separate data set with human-labeled relevance for query-item pairs. The labels are categorized as relevant or non-relevant, and the embedding inner products or any other relevancy scores (e.g., BM25) are treated as prediction scores. A higher AUC here indicates better retrieval relevancy.

Time is the total retrieval time on a 48-core CPU machine from a query text to the 1,000 most relevant items out of a set of 15 million items. This metric determines whether a method can be applied in an industry-level retrieval system. Typically, the cutoff is 50 milliseconds, but preferably 20 milliseconds.
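The two quality metrics above can be sketched in a few lines. The code below is an illustrative reading of the definitions, not the evaluation code used for the paper: it assumes the relevant item is ranked against the scores of the sampled random items, and it computes AUC as the fraction of (relevant, non-relevant) pairs ordered correctly, with ties counted as one half. All scores are toy values.

```python
def top_k_hit(relevant_score, random_scores, k):
    """True if fewer than k sampled items outscore the relevant item."""
    higher = sum(1 for s in random_scores if s > relevant_score)
    return higher < k


def top_k_rate(samples, k):
    """Average hit indicator over (relevant_score, random_scores) samples."""
    hits = sum(1 for rel, rnd in samples if top_k_hit(rel, rnd, k))
    return hits / len(samples)


def auc(scores, labels):
    """Fraction of (relevant, non-relevant) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Three toy queries, each with one relevant score and three random-item scores.
samples = [
    (0.9, [0.1, 0.5, 0.3]),
    (0.4, [0.6, 0.2, 0.5]),
    (0.7, [0.8, 0.1, 0.2]),
]
top1 = top_k_rate(samples, k=1)
top2 = top_k_rate(samples, k=2)
auc_value = auc([0.9, 0.55, 0.6, 0.5], [1, 1, 0, 0])
print(top1, top2, auc_value)
```

In the paper's setting each sample would hold N = 1,024 random items and the average would run over 200,000 queries; the toy lists here only keep the example readable.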

6.2.2 Baseline Methods. We compared DPSR with BM25 and DSSM as baselines. BM25 is a classical information retrieval method based on keyword matching over an inverted index; it uses heuristics to score documents based on term frequency and inverse document frequency. We compare with two versions of BM25: one with only unigrams, and one with both unigrams and bigrams (denoted as BM25-u&b). DSSM is a classical deep learning model [13] designed for ranking rather than retrieval. We still include it in the comparison to clarify the difference.
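For readers unfamiliar with the keyword baseline, the following sketch scores documents with the widely used Okapi BM25 formula. It is a generic illustration, not the baseline implementation evaluated here: the parameters k1 = 1.2 and b = 0.75 are conventional defaults, the IDF uses the non-negative log(1 + ...) variant, and the tokenized toy documents are hypothetical.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document against the query with Okapi BM25."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Length normalization: longer documents are penalized via b.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    ["red", "running", "shoes"],
    ["running", "shoes", "for", "kids"],
    ["milk", "powder"],
]
scores = bm25_scores(["running", "shoes"], docs)
print(scores)
```

A document with no query term gets a score of zero, which is the exact-match limitation that motivates semantic retrieval: an item described with different words than the query cannot be scored at all.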

6.2.3 Results. In Table 2, we show the comparison results with the above baseline methods. We can make the following observations from the results.

• BM25, as a classical method, shows good retrieval quality, but it takes more than a minute to retrieve from 15 million items, which makes it unrealistic for online retrieval.

• DSSM, which samples unclicked items as negative examples, performs worst in top-k, MRR and AUC. This is mainly because DSSM is optimized for ranking tasks, which differ substantially from retrieval. Therefore, we can conclude that using only unclicked items as negative examples does not work for training a retrieval model.

• DPSR refers to a vanilla version of our model without any user features. It has the highest AUC score among the baseline methods and the personalized DPSR versions, which indicates that purely semantic DPSR achieves the highest retrieval relevance.

• DPSR-p refers to a basic personalized version of our model, with additional user profile features such as purchase power, gender and so on. The result shows that those profile features help improve the retrieval quality metrics (Top-k) over the vanilla version, with a slight tradeoff in relevancy.

• DPSR-h refers to a fully personalized version of our model, with both user profile and user history events. It has the best retrieval quality metrics (Top-k) among all models, which demonstrates that plenty of signal can be extracted from the user history events. Note that the personalized model improves the retrieval quality metrics at the cost of the relevance metric (AUC). This is reasonable, since retrieval quality depends on more factors than relevancy, such as item popularity, personalization and so on.

Moreover, Figure 7 illustrates that the mixing ratio α of random negatives and batch negatives (see Section 4.4.3) affects the retrieved item popularity. Basically, we can observe that the more random negatives we have in the negative sampling, the more popular items are retrieved. But too many random negatives, e.g., α = 1.0, will

[Figure 5 (scatter plot, not reproduced): each point is an item labeled with its category code. Legend: A Apple; AC Air conditioner; BS Bathroom supplies; C Cellphone; DJ Down jacket; DP Diapers; EH Electric heater; EP Earphone; ES Electronic speaker; FC Face cream; FT Flat TV; G Gift; GL Gaming laptop; I Watch; IF Instant food; K Cookies; L Laptop; LS Lipstick; M Milk; MN Monitor; MP Milk powder; PC Cellphone case; Q Liquor; RC Rice cooker; RF Refrigerator; RS Running shoes; S Shoes; SN Snacks; SP Shampoo; SW Sweater; TB Tablet; TC Thermos cup; W Washer.]

Figure 5: t-SNE visualization of item embeddings from the 33 most popular categories.

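[Figure 6 (three t-SNE scatter plots, not reproduced).
(a) One head for query “apple”. Points carry the category codes A, C, G, I, K, L, M, Q, S, W; the iPhone cluster is annotated.
(b) Two heads for query “apple”. Results of Head-1 and Head-2 are marked; the Macbook, iPhone and apple fruit clusters are annotated. Legend: A Apple; C Cellphone; G Gift; I Watch; K Cookies; L Laptop; M Milk; Q Liquor; S Shoes; W Washer.
(c) Two heads for query “cellphone”. Results of Head-1 and Head-2 are marked. Legend: A Apple; H Huawei; M Xiaomi; O Oppo; P Philips; R Realme; S Samsung; V Vivo; Y Yijia; Z Meizu.]

Figure 6: t-SNE visualizations of retrieval results for polysemous queries.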

Table 1: Good cases from DPSR system in production.

query | retrieved item
奶粉大童 (milk powder big kid) | 美赞臣安儿健A+ 4段 (Enfamil A+ level-4 for 3 to 6 years old)
碧倩套装 (“Clinique typo” set) | 倩碧(CLINIQUE)经典三部曲套装 (Clinique classic trilogy set)
官网v女包 (authentic v women bag) | 路易威登LV女包 (Louis Vuitton LV women bag)
ovivo手机 (ovivo cellphone) | vivo Z1 (vivo Z1 phone)
学习自由泳器材 (learn free-style swimming equipment) | 英发/yingfa划臂 (yingfa hand paddle)

Table 2: Comparison between different methods.

Method      Top-1   Top-10   AUC     Time
BM25        0.718   0.947    0.661   61 s
BM25-u&b    0.721   0.948    0.661   157 s
DSSM        0.002   0.016    0.524   20 ms
DPSR        0.839   0.979    0.696   20 ms
DPSR-p      0.868   0.984    0.692   20 ms
DPSR-h      0.889   0.998    0.685   20 ms

[Figure 7 (line plot, not reproduced): x-axis is the random negative proportion (0.00 to 1.00); y-axis is the metric value (0.4 to 0.7); plotted series are Top-1, AUC and Popularity.]

Figure 7: Effect of different mixing ratios of negatives.

hurt the retrieved item’s relevancy. Thus, we can treat the parameter α as a tradeoff between retrieval popularity and relevancy. In practice, we also found that a proper choice of α = 0.5 or α = 0.75 would help online metrics significantly.
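The mixing ratio can be illustrated with a short sampling sketch. It assumes that, for each query in a batch, a fraction α of the negatives is drawn uniformly from the whole catalog (random negatives) and the remainder from the positives of other queries in the same batch (batch negatives). The function name, the item ids, and the rounding of α times the number of negatives are hypothetical choices for this example; Section 4.4.3 is the authoritative description.

```python
import random


def mix_negatives(batch_items, positive_index, catalog, num_negatives, alpha, rng):
    """Draw negatives for one query.

    A fraction alpha is sampled uniformly from the catalog (random negatives);
    the rest comes from the other positives in the same batch (batch negatives).
    """
    num_random = round(alpha * num_negatives)
    num_batch = num_negatives - num_random
    in_batch_pool = [item for i, item in enumerate(batch_items) if i != positive_index]
    batch_negs = rng.sample(in_batch_pool, num_batch)
    # Uniform draws may occasionally hit the positive itself; a real pipeline
    # might filter such collisions (not done in this sketch).
    random_negs = [rng.choice(catalog) for _ in range(num_random)]
    return random_negs, batch_negs


rng = random.Random(0)
catalog = list(range(100))           # hypothetical item ids
batch_items = [3, 17, 42, 58, 91]    # clicked items of five queries in a batch
random_negs, batch_negs = mix_negatives(
    batch_items, positive_index=2, catalog=catalog, num_negatives=4, alpha=0.5, rng=rng
)
print(random_negs, batch_negs)
```

Because batch negatives are themselves clicked items, they are skewed toward popular products, whereas uniform draws are not; this is one way to see why shifting α changes the popularity of what the model retrieves.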

6.3 Online A/B Tests

DPSR is designed as a key component in our search system to improve the overall user experience. Thus, we focus on the overall improvement of a search system that uses DPSR as an additional retrieval method.

The control setup (baseline) includes all the candidates available in our current production system, which are retrieved by inverted-index based methods with query rewriting enabled. The variation setup (DPSR) retrieves at most 1,000 candidates from our DPSR system in addition to those in the baseline. In both settings, all the candidates go through the same ranking component and business logic. The ranking component applies a state-of-the-art learning-to-rank method similar to the methods mentioned in [18]. We emphasize that our production system is a strong baseline to compare against: it has been tuned by hundreds of engineers and scientists for years and applies state-of-the-art query rewriting and document processing methods to optimize candidate generation.
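The two experiment arms can be sketched as follows. This is only an illustration of the setup described above: the function `merge_candidates` is hypothetical, and dropping duplicates while keeping first-seen order is an assumption, since the text does not say how overlapping candidates are handled.

```python
def merge_candidates(baseline_ids, dpsr_ids, max_dpsr=1000):
    """Combine baseline candidates with at most `max_dpsr` DPSR candidates.

    Assumption: duplicates are dropped and first-seen order is kept; the
    merged list is what would be passed to the shared ranking component.
    """
    merged = list(dict.fromkeys(baseline_ids))  # de-duplicate, keep order
    seen = set(merged)
    for item_id in dpsr_ids[:max_dpsr]:         # cap the DPSR contribution
        if item_id not in seen:
            seen.add(item_id)
            merged.append(item_id)
    return merged


# Control arm: inverted-index candidates only.
control = merge_candidates(["a", "b", "c"], [])
# Variation arm: the same candidates plus those retrieved by DPSR.
variation = merge_candidates(["a", "b", "c"], ["c", "d"])
```

Because both arms feed the same ranking component, any difference in the online metrics can be attributed to the additional candidates.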

We first conducted a human evaluation of the relevance of retrieved items. Specifically, we asked human evaluators to label the relevance of results from the baseline system and DPSR for the same set of 500 long tail queries. The labeled results are categorized into 3 buckets: bad, fair and perfect. Table 3 shows that the proposed method improves search relevancy by reducing around 6% bad cases. This indicates that the deep retrieval system is especially effective in handling “difficult” or “unsatisfied” queries, which often require semantic matching.

Table 3: Relevancy metrics by human labeling of 500 long tail queries. DPSR reduces bad cases significantly.

| | bad | fair | perfect |
|---|---|---|---|
| Baseline | 17.86% | 26.04% | 56.10% |
| DPSR | 13.70% | 33.28% | 53.01% |

We then conducted live experiments over 10% of the entire site traffic during a period of two weeks using a standard A/B testing configuration. To protect confidential business information, only relative improvements are reported. Table 4 shows that the proposed DPSR retrieval improves the production system on all core business metrics in e-commerce search, including user conversion rate (UCVR) and gross merchandise value (GMV), as well as query rewrite rate (QRR), which is believed to be a good indicator of search satisfaction. We also observe that the 2-head version of the query tower and the personalized version (denoted as 1-head-p13n) both improve on the vanilla 1-head query tower without any user features. In particular, the improvements mainly come from long tail queries, which are normally hard for traditional search engines.

Table 4: DPSR Online A/B test improvements.

| | UCVR | GMV | QRR |
|---|---|---|---|
| 1-head | +1.13% | +1.78% | −4.44% |
| 2-head | +1.34% | +2.13% | −4.13% |
| 1-head-p13n | +1.29% | +2.19% | −4.29% |
| 2-head on long tail query | +10.03% | +7.50% | −9.99% |

Table 5: Latency for index building and search.

| | index building (sec.) | search (ms) | QPS |
|---|---|---|---|
| CPU | 3453 | 9.92 | 100 |
| GPU | 499 | 0.74 | 1422 |

Table 6: Overall serving performance.

| | QPS | latency (ms) | CPU usage | GPU usage |
|---|---|---|---|---|
| CPU | 4,000 | 20 | > 50% | 0.0% |
| GPU | 5,800 | 15 | > 50% | 25% |

6.4 Efficiency

In Table 5, we show the efficiency of our offline index building and online nearest neighbor search, excluding the query embedding computation. We report the time consumed for indexing and searching 15 million items with an NVIDIA Tesla P40 GPU and an Intel 64-core CPU. It shows that DPSR can retrieve candidates within 10 ms on CPU, and can benefit from GPUs with an 85% reduction in indexing time, a 92% reduction in search latency and about 14 times the QPS (queries per second) throughput.
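The relative gains quoted above follow directly from the figures in Table 5. The short calculation below reproduces them; the dictionary keys and the helper `reduction` are illustrative names only.

```python
# Figures copied from Table 5 (15 million items).
cpu = {"index_sec": 3453, "search_ms": 9.92, "qps": 100}
gpu = {"index_sec": 499, "search_ms": 0.74, "qps": 1422}


def reduction(before, after):
    """Relative reduction when going from `before` to `after`."""
    return (before - after) / before


index_reduction = reduction(cpu["index_sec"], gpu["index_sec"])
search_reduction = reduction(cpu["search_ms"], gpu["search_ms"])
qps_ratio = gpu["qps"] / cpu["qps"]

print(f"{index_reduction:.1%} {search_reduction:.1%} {qps_ratio:.1f}x")
# prints "85.5% 92.5% 14.2x"
```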

In Table 6, we report the overall model serving performance with the same CPU and GPU machines as above. The overall latency from query text to the 1,000 nearest neighbors is within 15 to 20 milliseconds on GPU or CPU machines, which is comparable to retrieval from a standard inverted index.
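For intuition about what the search step returns, the sketch below performs an exact top-k search over a handful of item embeddings. It assumes inner-product similarity and scans every item, so it is not the index used in production, which has to serve millions of items within the latencies reported above. The function name and the toy two-dimensional embeddings are hypothetical.

```python
import heapq


def top_k_inner_product(query_vec, item_vecs, k=1000):
    """Exact top-k search by brute force (illustrative only).

    Scores every item by its inner product with the query embedding and
    returns the ids of the k highest-scoring items, best first.
    """
    scores = (
        (sum(q * x for q, x in zip(query_vec, vec)), item_id)
        for item_id, vec in item_vecs.items()
    )
    return [item_id for _, item_id in heapq.nlargest(k, scores)]


# Toy 2-dimensional embeddings; real embeddings have many more dimensions.
items = {"phone": [0.9, 0.1], "laptop": [0.7, 0.3], "milk": [0.0, 1.0]}
nearest = top_k_inner_product([1.0, 0.0], items, k=2)
print(nearest)  # ['phone', 'laptop']
```

The cost of this scan grows linearly with the number of items, which is why index building and GPU acceleration matter at the scale shown in Table 5.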

7 CONCLUSION

In this paper, we have discussed how we build a deep personalized and semantic retrieval system in an industry scale e-commerce search engine. Specifically, we 1) shared our design of a deep retrieval system, which takes all the production requirements into consideration, 2) presented a novel deep learning model that is tailored for the personalized and semantic retrieval problems, and 3) demonstrated that the proposed approach can effectively find semantically relevant items, especially for long tail queries, which is an ideal complementary candidate generation approach to the traditional inverted index based approach. We have successfully deployed DPSR into JD.com’s search production since early 2019, and we believe our proposed system can be easily extended from e-commerce search to other search scenarios.

8 ACKNOWLEDGEMENT

We deeply appreciate Chao Sun, Jintao Tang, Wei He, Tengfei Guan, Wenbin Zhu, Dejun Qiu, Qi Zhu, Hongwei Shen, Wei Wei and Youke Li for their engineering support to build key components of the infrastructure, and Chen Zheng, Rui Li and Eric Zhao for their help at the early stage of this project. We thank the anonymous reviewers for their valuable suggestions.

REFERENCES

[1] Martin Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, Manjunath Kudlur, Josh Levenberg, Rajat Monga, Sherry Moore, Derek G. Murray, Benoit Steiner, Paul Tucker, Vijay Vasudevan, Pete Warden, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng. 2016. TensorFlow: A system for large-scale machine learning. In OSDI. 265–283.

[2] Xiao Bai, Erik Ordentlich, Yuanyuan Zhang, Andy Feng, Adwait Ratnaparkhi, Reena Somvanshi, and Aldi Tjahjadi. 2018. Scalable Query N-Gram Embedding for Improving Matching and Relevance in Sponsored Search. In SIGKDD. 52–61.

[3] David M. Beazley. 1996. SWIG: An Easy to Use Tool for Integrating Scripting Languages with C and C++. In Proceedings of the 4th Conference on USENIX Tcl/Tk Workshop (Monterey, California). 15–15.

[4] Daniel Cer, Yinfei Yang, Sheng-yi Kong, Nan Hua, Nicole Limtiaco, Rhomni St John, Noah Constant, Mario Guajardo-Cespedes, Steve Yuan, Chris Tar, et al. 2018. Universal sentence encoder for English. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing: System Demonstrations. 169–174.

[5] Paul Covington, Jay Adams, and Emre Sargin. 2016. Deep Neural Networks for YouTube Recommendations. In RecSys. 191–198.

[6] Scott Deerwester, Susan T. Dumais, George W. Furnas, Thomas K. Landauer, and Richard Harshman. 1990. Indexing by latent semantic analysis. Journal of the American Society for Information Science 41, 6 (1990), 391–407.

[7] Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. CoRR abs/1810.04805 (2018).

[8] John Duchi, Elad Hazan, and Yoram Singer. 2011. Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research 12, Jul (2011), 2121–2159.

[9] Jiafeng Guo, Yixing Fan, Qingyao Ai, and W. Bruce Croft. 2016. A Deep Relevance Matching Model for Ad-hoc Retrieval. In CIKM (Indianapolis, Indiana, USA). 55–64.

[10] Jiafeng Guo, Gu Xu, Hang Li, and Xueqi Cheng. 2008. A Unified and Discriminative Model for Query Refinement. In SIGIR (Singapore, Singapore). 379–386.

[11] Matthew Henderson, Rami Al-Rfou, Brian Strope, Yun-Hsuan Sung, Laszlo Lukacs, Ruiqi Guo, Sanjiv Kumar, Balint Miklos, and Ray Kurzweil. 2017. Efficient natural language response suggestion for smart reply. arXiv preprint arXiv:1705.00652 (2017).

[12] Thomas Hofmann. 1999. Probabilistic Latent Semantic Indexing. In SIGIR (Berkeley, California, USA). 50–57.

[13] Po-Sen Huang, Xiaodong He, Jianfeng Gao, Li Deng, Alex Acero, and Larry Heck. 2013. Learning deep structured semantic models for web search using clickthrough data. In CIKM (San Francisco, California, USA). 2333–2338.

[14] Thorsten Joachims, Laura Granka, Bing Pan, Helene Hembrooke, Filip Radlinski, and Geri Gay. 2007. Evaluating the Accuracy of Implicit Feedback from Clicks and Query Reformulations in Web Search. ACM Transactions on Information Systems 25, 2, Article 7 (April 2007).

[15] Jeff Johnson, Matthijs Douze, and Herve Jegou. 2017. Billion-scale similarity search with GPUs. CoRR abs/1702.08734 (2017).

[16] Chao Li, Zhiyuan Liu, Mengmeng Wu, Yuchi Xu, Huan Zhao, Pipei Huang, Guoliang Kang, Qiwei Chen, Wei Li, and Dik Lun Lee. 2019. Multi-Interest Network with Dynamic Routing for Recommendation at Tmall. In CIKM. 2615–2623.

[17] Hang Li and Jun Xu. 2014. Semantic Matching in Search. Foundations and Trends in Information Retrieval 7, 5 (June 2014), 343–469.

[18] Shichen Liu, Fei Xiao, Wenwu Ou, and Luo Si. 2017. Cascade Ranking for Operational E-commerce Search. In SIGKDD (Halifax, NS, Canada). 1557–1565.

[19] Tie-Yan Liu. 2009. Learning to Rank for Information Retrieval. Foundations and Trends in Information Retrieval 3, 3 (March 2009), 225–331.

[20] Bhaskar Mitra and Nick Craswell. 2018. An Introduction to Neural Information Retrieval. Foundations and Trends in Information Retrieval 13, 1 (December 2018), 1–126.

[21] Bhaskar Mitra, Fernando Diaz, and Nick Craswell. 2017. Learning to Match Using Local and Distributed Representations of Text for Web Search. In WWW (Perth, Australia). 1291–1299.

[22] Christopher Olston, Fangwei Li, Jeremiah Harmsen, Jordan Soyke, Kiril Gorovoy, Li Lao, Noah Fiedel, Sukriti Ramesh, and Vinu Rajashekhar. 2017. TensorFlow-Serving: Flexible, High-Performance ML Serving. In Workshop on ML Systems at NIPS.

[23] Hamid Palangi, Li Deng, Yelong Shen, Jianfeng Gao, Xiaodong He, Jianshu Chen, Xinying Song, and Rabab Ward. 2016. Deep sentence embedding using long short-term memory networks: Analysis and application to information retrieval. IEEE/ACM Transactions on Audio, Speech, and Language Processing 24, 4 (2016), 694–707.

[24] Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever. 2019. Language Models are Unsupervised Multitask Learners. Technical Report.

[25] Ruslan Salakhutdinov and Geoffrey Hinton. 2009. Semantic Hashing. International Journal of Approximate Reasoning 50, 7 (July 2009), 969–978.

[26] Yelong Shen, Xiaodong He, Jianfeng Gao, Li Deng, and Gregoire Mesnil. 2014. Learning Semantic Representations Using Convolutional Neural Networks for Web Search. In WWW (Seoul, Korea). 373–374.

[27] Parikshit Sondhi, Mohit Sharma, Pranam Kolari, and ChengXiang Zhai. 2018. A Taxonomy of Queries for E-commerce Search. In SIGIR. 1245–1248.

[28] Daria Sorokina and Erick Cantu-Paz. 2016. Amazon Search: The Joy of Ranking Products. In SIGIR. 459–460.

[29] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. Attention is All you Need. In NIPS. 5998–6008.

[30] Thanh Vu, Dat Quoc Nguyen, Mark Johnson, Dawei Song, and Alistair Willis. 2017. Search personalization with embeddings. In European Conference on Information Retrieval. Springer, 598–604.

[31] Tensorflow Official Website. 2019. Reading custom file and record formats. https://www.tensorflow.org/guide/extend/formats

[32] Cassandra Xia, Clemens Mewald, D. Sculley, David Soergel, George Roumpos, Heng-Tze Cheng, Illia Polosukhin, Jamie Alexander Smith, Jianwei Xie, Lichan Hong, Martin Wicke, Mustafa Ispir, Philip Daniel Tucker, Yuan Tang, and Zakaria Haque. 2017. TensorFlow Estimators: Managing Simplicity vs. Flexibility in High-Level Machine Learning Frameworks. In SIGKDD. 1763–1771.

[33] Ji Yang, Xinyang Yi, Derek Zhiyuan Cheng, Lichan Hong, Yang Li, Simon Xiaoming Wang, Taibai Xu, and Ed H Chi. 2020. Mixed Negative Sampling for Learning Two-tower Neural Networks in Recommendations. In WWW Companion. 441–447.

[34] Xinyang Yi, Ji Yang, Lichan Hong, Derek Zhiyuan Cheng, Lukasz Heldt, Aditee Kumthekar, Zhe Zhao, Li Wei, and Ed Chi. 2019. Sampling-bias-corrected neural modeling for large corpus item recommendations. In RecSys. 269–277.

[35] Dawei Yin, Yuening Hu, Jiliang Tang, Tim Daly, Mianwei Zhou, Hua Ouyang, Jianhui Chen, Changsung Kang, Hongbo Deng, Chikashi Nobata, Jean-Marc Langlois, and Yi Chang. 2016. Ranking Relevance in Yahoo Search. In SIGKDD (San Francisco, California, USA). 323–332.

[36] Han Zhu, Xiang Li, Pengye Zhang, Guozheng Li, Jie He, Han Li, and Kun Gai. 2018. Learning Tree-Based Deep Model for Recommender Systems. In SIGKDD. 1079–1088.
