• ### Optimized Cost per Click in Taobao Display Advertising(1703.02091)

Jan. 29, 2019 cs.GT, stat.ML
• ### Deep Interest Network for Click-Through Rate Prediction(1706.06978)

Sept. 13, 2018 cs.LG, stat.ML
Click-through rate prediction is an essential task in industrial applications such as online advertising. Recently, deep learning based models have been proposed that follow a similar Embedding&MLP paradigm. In these methods, large-scale sparse input features are first mapped into low-dimensional embedding vectors, then transformed into fixed-length vectors in a group-wise manner, and finally concatenated and fed into a multilayer perceptron (MLP) to learn the nonlinear relations among features. In this way, user features are compressed into a fixed-length representation vector regardless of what the candidate ads are. The fixed-length vector becomes a bottleneck that makes it difficult for Embedding&MLP methods to capture a user's diverse interests effectively from rich historical behaviors. In this paper, we propose a novel model, Deep Interest Network (DIN), which tackles this challenge by designing a local activation unit to adaptively learn the representation of user interests from historical behaviors with respect to a certain ad. This representation vector varies over different ads, greatly improving the expressive ability of the model. In addition, we develop two techniques, mini-batch aware regularization and a data adaptive activation function, which help in training industrial deep networks with hundreds of millions of parameters. Experiments on two public datasets as well as an Alibaba real production dataset with over 2 billion samples demonstrate the effectiveness of the proposed approaches, which achieve superior performance compared with state-of-the-art methods. DIN has now been successfully deployed in the online display advertising system at Alibaba, serving the main traffic.
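
The ad-dependent pooling described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, and the learned local activation unit is replaced by a simple clipped dot product, purely as an assumption to keep the example self-contained.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def activation_weight(behavior_emb, ad_emb):
    # ASSUMPTION: a clipped dot product stands in for the learned
    # activation unit; the real unit is a small trained network.
    return max(0.0, dot(behavior_emb, ad_emb))


def user_interest_vector(behavior_embs, ad_emb):
    """Weighted sum pooling of behavior embeddings.

    The weights depend on the candidate ad, so the pooled vector
    changes from ad to ad instead of being one fixed representation.
    """
    pooled = [0.0] * len(ad_emb)
    for emb in behavior_embs:
        w = activation_weight(emb, ad_emb)
        for i, value in enumerate(emb):
            pooled[i] += w * value
    return pooled


# Toy data: three past behaviors in a 2-dimensional embedding space.
behaviors = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
ad_a = [1.0, 0.0]
ad_b = [0.0, 1.0]
vec_a = user_interest_vector(behaviors, ad_a)
vec_b = user_interest_vector(behaviors, ad_b)
```

With this toy data the same behavior history yields different user vectors for `ad_a` and `ad_b`, which is the property the abstract contrasts with fixed-length pooling.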
• ### Learning Tree-based Deep Model for Recommender Systems(1801.02294)

Feb. 12, 2018 cs.LG, cs.IR, stat.ML
Model-based methods for recommender systems have been studied to provide more precise results. In systems with a large corpus, the amount of computation required for a learnt model to predict the preferences of all user-item pairs is tremendous, which makes the model difficult to employ directly in the candidate generation stage of recommendation. To overcome this computational barrier, models like matrix factorization can resort to an inner product form (i.e., using the inner product of the user's and item's latent factors as the preference) and to indexes such as hashing to perform efficient approximate k-nearest neighbor search. However, other more expressive forms of interaction between user and item features, e.g., interactions through advanced deep neural networks, are still excluded from large-corpus recommendation because of the amount of computation. In this paper, we focus on the problem of how arbitrary advanced models can be introduced to generate recommendations from a large corpus. We propose a novel tree-based method that provides prediction with logarithmic complexity w.r.t. corpus size while using more expressive deep neural networks. The main idea of the tree-based model is to predict user interests from coarse to fine, by traversing tree nodes top-down and deciding whether to pick each node for the user. Furthermore, we show that the tree structure can also be jointly learnt to be more compatible with the distribution of user interests, which facilitates both training and prediction. Experiments on two large-scale real-world datasets indicate that the proposed model significantly outperforms traditional methods. Online A/B test results on the Taobao display advertising platform also demonstrate the effectiveness of the tree-based deep model in production.
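
The coarse-to-fine traversal can be sketched as a level-by-level beam search. This is only an illustration under stated assumptions: the tree is a complete binary tree with heap-style indexing, `beam_search` and `make_score` are hypothetical names, and the toy scorer (maximum preference among a node's leaves) stands in for the deep model that would score each node directly.

```python
def beam_search(num_leaves, score, beam_width):
    """Top-down retrieval over a complete binary tree.

    Nodes are heap-indexed (children of n are 2n+1 and 2n+2) and
    num_leaves must be a power of two. At most 2 * beam_width nodes
    are scored per level, so the work grows with the tree depth,
    i.e. logarithmically in the number of leaves.
    """
    first_leaf = num_leaves - 1
    frontier = [0]  # start at the root
    while frontier[0] < first_leaf:  # all frontier nodes share a level
        children = [c for n in frontier for c in (2 * n + 1, 2 * n + 2)]
        children.sort(key=score, reverse=True)
        frontier = children[:beam_width]
    return [n - first_leaf for n in frontier]  # leaf nodes -> item ids


def make_score(prefs):
    # ASSUMPTION: toy scorer for illustration only. It looks at every
    # leaf under a node; a learned model would score the node directly.
    first_leaf = len(prefs) - 1

    def score(node):
        if node >= first_leaf:
            return prefs[node - first_leaf]
        return max(score(2 * node + 1), score(2 * node + 2))

    return score


# Eight items with made-up preference values for one user.
prefs = [0.1, 0.9, 0.3, 0.2, 0.8, 0.05, 0.4, 0.6]
top_items = beam_search(len(prefs), make_score(prefs), beam_width=2)
```

Because the toy scorer gives each parent the maximum of its children, keeping the top `beam_width` nodes per level recovers the top `beam_width` items here; a learned scorer would only approximate that property.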
• ### Deep Transfer Learning with Joint Adaptation Networks(1605.06636)

Aug. 17, 2017 cs.LG, stat.ML
Deep networks have been successfully applied to learn transferable features for adapting models from a source domain to a different target domain. In this paper, we present joint adaptation networks (JAN), which learn a transfer network by aligning the joint distributions of multiple domain-specific layers across domains based on a joint maximum mean discrepancy (JMMD) criterion. An adversarial training strategy is adopted to maximize JMMD such that the distributions of the source and target domains are made more distinguishable. Learning can be performed by stochastic gradient descent, with the gradients computed by back-propagation in linear time. Experiments show that our model yields state-of-the-art results on standard datasets.
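
As a rough illustration of a joint discrepancy over several layers, the sketch below computes a biased estimate of a squared MMD whose kernel is the product of per-layer Gaussian kernels. The function names, the fixed bandwidth, and the quadratic-time estimator are assumptions made for clarity; the abstract refers to a linear-time computation, which this sketch does not reproduce.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel between two activation vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def joint_kernel(sample_a, sample_b):
    """Product of per-layer kernels.

    Each sample is a tuple holding one activation vector per layer.
    """
    k = 1.0
    for layer_a, layer_b in zip(sample_a, sample_b):
        k *= gaussian_kernel(layer_a, layer_b)
    return k


def jmmd_squared(source, target):
    """Biased (V-statistic) estimate; quadratic in the sample count."""
    def mean_kernel(xs, ys):
        total = sum(joint_kernel(x, y) for x in xs for y in ys)
        return total / (len(xs) * len(ys))

    return (mean_kernel(source, source)
            + mean_kernel(target, target)
            - 2.0 * mean_kernel(source, target))


# Toy activations from two layers for two source and two target samples.
source = [([0.0, 0.0], [1.0]), ([0.1, 0.0], [0.9])]
target = [([2.0, 2.0], [0.0]), ([2.1, 2.0], [0.1])]
same = jmmd_squared(source, source)
shifted = jmmd_squared(source, target)
```

The estimate is zero when both sample sets are identical and grows as the joint activations of the two domains move apart.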
• ### Unsupervised Domain Adaptation with Residual Transfer Networks(1602.04433)

Feb. 16, 2017 cs.LG
The recent success of deep neural networks relies on massive amounts of labeled data. For a target task where labeled data is unavailable, domain adaptation can transfer a learner from a different source domain. In this paper, we propose a new approach to domain adaptation in deep networks that can jointly learn adaptive classifiers and transferable features from labeled data in the source domain and unlabeled data in the target domain. We relax the shared-classifier assumption made by previous methods and assume that the source classifier and target classifier differ by a residual function. We enable classifier adaptation by plugging several layers into the deep network to explicitly learn the residual function with reference to the target classifier. We fuse features of multiple layers with a tensor product and embed them into reproducing kernel Hilbert spaces to match distributions for feature adaptation. The adaptation can be achieved in most feed-forward models by extending them with new residual layers and loss functions, which can be trained efficiently via back-propagation. Empirical evidence shows that the new approach outperforms state-of-the-art methods on standard domain adaptation benchmarks.
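
One way to write the residual assumption stated in this abstract is given below. The symbols $f_S$, $f_T$ and $\Delta f$ are introduced here for illustration, and feeding the target classifier's output into the residual function is a reading of "with reference to the target classifier", not a formula quoted from the listing.

$$
f_S(x) = f_T(x) + \Delta f\big(f_T(x)\big)
$$

Under this reading, setting $\Delta f \equiv 0$ recovers the shared-classifier assumption $f_S = f_T$ that the abstract says earlier methods rely on.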
• ### The Name-Passing Calculus(1508.00093)

Aug. 1, 2015 cs.LO
Name-passing calculi are foundational models for mobile computing. Research into these models has produced a wealth of results ranging from relative expressiveness to programming pragmatics. The diversity of these results calls for clarification and reorganization. This paper applies a model-independent approach to the study of name-passing calculi, leading to a uniform treatment and simplification. The technical tools and results presented in the paper form the foundation for a theory of name-passing calculus.
• ### Unequal Layer Densities in Bilayer Wigner Crystal at High Magnetic Field(1101.2436)

Jan. 12, 2011 cond-mat.mes-hall
We report studies of pinning mode resonances of magnetic-field-induced bilayer Wigner crystals in bilayer hole samples with negligible interlayer tunneling and different interlayer separations d, in states with varying layer densities, including unequal layer densities. In samples with large d relative to the in-plane carrier-carrier spacing a, two pinning resonances are present at unequal layer densities, one for each layer. For small d/a samples, a single resonance is observed even with significant density imbalance. These samples, at balance, were shown to exhibit an enhanced pinning mode frequency [Zhihai Wang et al., Phys. Rev. Lett. 136804 (2007)], which was ascribed to a one-component, pseudospin ferromagnetic Wigner solid. The evolution of the resonance frequency and line width indicates that the quantum interlayer coherence survives at moderate density imbalance but disappears when the imbalance is sufficiently large.

• ### Introducing Small-World Network Effect to Critical Dynamics(cond-mat/0212542)

We analytically investigate the kinetic Gaussian model and the one-dimensional kinetic Ising model on two typical small-world networks (SWN), the adding type and the rewiring type. The general approaches and some basic equations are systematically formulated. The rigorous investigation of the Glauber-type kinetic Gaussian model shows a mean-field-like global influence on the dynamic evolution of the individual spins. Accordingly, a simplified method is presented and tested, and is believed to be a good choice for the mean-field transition widely (in fact, without exception so far) observed on SWN. It yields the evolution equation of the Kawasaki-type Gaussian model. In the one-dimensional Ising model, the p-dependence of the critical point is analytically obtained, and the nonexistence of a threshold p_c for a finite-temperature transition is confirmed. The static critical exponents gamma and beta are in accordance with the results of recent Monte Carlo simulations, and also with the mean-field critical behavior of the system. We also prove that the SWN effect does not change the dynamic critical exponent, z=2, for this model. The observed influence of the long-range randomness on the critical point indicates two clearly different hidden mechanisms.
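
The two network types can be illustrated on a ring of n sites. The constructions below are common textbook variants chosen as assumptions (the abstract does not define them), and all function names are hypothetical: the adding type keeps every ring bond and adds random shortcuts, while the rewiring type moves ring bonds to random partners.

```python
import random


def _edge(i, j):
    """Undirected edge stored with the smaller index first."""
    return (i, j) if i < j else (j, i)


def ring(n):
    """Nearest-neighbor bonds of a one-dimensional periodic chain."""
    return {_edge(i, (i + 1) % n) for i in range(n)}


def adding_type(n, p, rng):
    """Keep all ring bonds; each site gains a shortcut with probability p."""
    edges = ring(n)
    for i in range(n):
        if rng.random() < p:
            j = rng.randrange(n)
            if j != i:
                edges.add(_edge(i, j))
    return edges


def rewiring_type(n, p, rng):
    """Move bond (i, i+1) to (i, j) with probability p, j chosen at random.

    Self-loops and duplicate bonds are skipped, so the number of bonds
    stays equal to n.
    """
    edges = ring(n)
    for i in range(n):
        if rng.random() < p:
            j = rng.randrange(n)
            new = _edge(i, j)
            if j != i and new not in edges:
                edges.discard(_edge(i, (i + 1) % n))
                edges.add(new)
    return edges


rng = random.Random(0)
added = adding_type(20, 0.3, rng)
rewired = rewiring_type(20, 0.3, rng)
```

Here `p` plays the role of the shortcut probability whose effect on the critical point the abstract discusses; `p = 0` gives back the plain ring in both constructions.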
• ### Kawasaki-type Dynamics: Diffusion in the kinetic Gaussian model(cond-mat/0204156)

May 22, 2002 cond-mat.dis-nn
In this article, we retain the basic idea of Kawasaki's dynamics while generalizing its spin-pair exchange mechanism to a spin-pair redistribution mechanism, and present a normalized redistribution probability. This serves to unite various order-parameter-conserving processes at the microscopic level, place them under the control of a universal mechanism, and provide the basis for further treatment. As an example application, we treat the kinetic Gaussian model and obtain the exact diffusion equation. We observe critical slowing down near the critical point and find that the dynamic critical exponent z=1/nu=2 is independent of the space dimensionality and of the assumed mechanism, whether Glauber-type or Kawasaki-type.
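
The link between the quoted exponents and critical slowing down can be spelled out with the standard dynamic scaling relations. The symbols $\tau$ (relaxation time), $\xi$ (correlation length) and $T_c$ (critical temperature) are introduced here for illustration and do not appear in the abstract.

$$
\tau \sim \xi^{z}, \qquad \xi \sim |T - T_c|^{-\nu}
\quad\Longrightarrow\quad
\tau \sim |T - T_c|^{-z\nu}
$$

With $z = 1/\nu = 2$, i.e. $\nu = 1/2$, the product is $z\nu = 1$, so the relaxation time would diverge as $|T - T_c|^{-1}$ on approaching the critical point.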
• ### Generalized Competing Glauber-type Dynamics and Kawasaki-type Dynamics(cond-mat/0204453)

April 21, 2002 cond-mat.dis-nn
In this article, we give a systematic formulation of a new generalized competing mechanism: the Glauber-type single-spin transition mechanism, with probability p, simulates the contact of the system with the heat bath, and the Kawasaki-type spin-pair redistribution mechanism, with probability 1-p, simulates an external energy flux. These two mechanisms are natural generalizations of Glauber's single-spin flipping mechanism and Kawasaki's spin-pair exchange mechanism, respectively. On the one hand, the new mechanism is in principle applicable to arbitrary systems; on the other hand, our formulation can contain a mechanism that directly combines single-spin flipping and spin-pair exchange in their original form. Compared with the conventional mechanism, the new mechanism does not assume the simplified version and leads to a greater influence of temperature. The expected behavior, order at lower temperature and disorder at higher temperature, then holds universally. To exemplify this difference, we apply the mechanism to the 1D Ising model and obtain analytical results. We also apply the mechanism to the kinetic Gaussian model and find that above the critical point there is only a paramagnetic phase, while below the critical point the self-organization resulting from the energy flux leads the system to an interesting heterophase, instead of the initially guessed antiferromagnetic phase. We study this process in detail.
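
The competition between the two mechanisms can be sketched for a 1D Ising chain using the original single-spin flip and spin-pair exchange, the special case the abstract says the formulation contains, not the generalized redistribution mechanism. The acceptance rules are assumptions made for illustration: a heat-bath rate for the flip, and an exchange that is accepted only when it does not lower the energy, as a crude stand-in for an energy flux. The coupling is set to J = 1.

```python
import math
import random


def energy(spins):
    """Nearest-neighbor Ising energy on a periodic chain, J = 1."""
    n = len(spins)
    return -1.0 * sum(spins[i] * spins[(i + 1) % n] for i in range(n))


def glauber_step(spins, temperature, rng):
    """Single-spin flip with a heat-bath acceptance rate."""
    n = len(spins)
    i = rng.randrange(n)
    delta_e = 2.0 * spins[i] * (spins[i - 1] + spins[(i + 1) % n])
    if rng.random() < 1.0 / (1.0 + math.exp(delta_e / temperature)):
        spins[i] = -spins[i]


def kawasaki_step(spins, rng):
    """Neighbor exchange; conserves the total magnetization."""
    n = len(spins)
    i = rng.randrange(n)
    j = (i + 1) % n
    if spins[i] == spins[j]:
        return  # exchanging equal spins changes nothing
    before = energy(spins)
    spins[i], spins[j] = spins[j], spins[i]
    # ASSUMPTION: reject moves that lower the energy, mimicking an
    # external energy input.
    if energy(spins) < before:
        spins[i], spins[j] = spins[j], spins[i]


def competing_step(spins, p, temperature, rng):
    """Glauber-type move with probability p, Kawasaki-type otherwise."""
    if rng.random() < p:
        glauber_step(spins, temperature, rng)
    else:
        kawasaki_step(spins, rng)


rng = random.Random(1)
spins = [1, -1, 1, 1, -1, -1, 1, -1]
for _ in range(200):
    competing_step(spins, 0.5, 1.0, rng)
```

Setting `p = 1` leaves only the heat-bath contact, while `p = 0` leaves only the magnetization-conserving exchange, so `p` tunes the balance between the two processes described in the abstract.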