
Inferring the relations between two images is an important class of tasks in
computer vision. Examples of such tasks include computing optical flow and
stereo disparity. We treat relation inference as a machine learning
problem and tackle it with neural networks. A key to the problem is learning a
representation of relations. We propose a new neural network module, the contrast
association unit (CAU), which explicitly models the relations between two sets
of input variables. Due to the nonnegativity of the weights in CAU, we adopt a
multiplicative update algorithm for learning these weights. Experiments show
that neural networks with CAUs are more effective in learning five fundamental
image transformations than conventional neural networks.

Stochastic Neighbor Embedding (SNE) methods minimize the divergence between
the similarity matrix of a high-dimensional data set and its counterpart from a
low-dimensional embedding, leading to widely applied tools for data
visualization. Despite their popularity, the current SNE methods experience a
crowding problem when the data include highly imbalanced similarities. This
implies that the data points with higher total similarity tend to get crowded
around the display center. To solve this problem, we introduce a fast
normalization method and normalize the similarity matrix to be doubly
stochastic such that all the data points have equal total similarities.
Furthermore, we show empirically and theoretically that the double
stochasticity constraint often leads to embeddings which are approximately
spherical. This suggests replacing a flat space with spheres as the embedding
space. The spherical embedding eliminates the discrepancy between the center
and the periphery in visualization, which efficiently resolves the crowding
problem. We compared the proposed method (DOSNES) with the state-of-the-art SNE
method on three real-world datasets and the results clearly indicate that our
method is more favorable in terms of visualization quality.
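
The normalization step described above can be illustrated with the classical
Sinkhorn-Knopp iteration, which alternately rescales rows and columns until
both sum to one. This is only a minimal sketch: the abstract does not specify
its fast normalization method, so the function name `sinkhorn_knopp`, the
iteration cap, and the tolerance are assumptions made for illustration.

```python
import numpy as np

def sinkhorn_knopp(S, n_iter=1000, tol=1e-9):
    """Scale a strictly positive square matrix toward a doubly stochastic one.

    Illustrative only: classical Sinkhorn-Knopp, not necessarily the
    normalization used by the method described in the text.
    """
    P = np.asarray(S, dtype=float).copy()
    for _ in range(n_iter):
        P /= P.sum(axis=1, keepdims=True)  # make every row sum to 1
        P /= P.sum(axis=0, keepdims=True)  # make every column sum to 1
        # Columns are exact after the last step; stop once rows agree too.
        if np.abs(P.sum(axis=1) - 1.0).max() < tol:
            break
    return P
```

After normalization every data point has the same total similarity, which is
the property the paragraph relies on to remove the imbalance.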

We present a novel end-to-end framework for facial performance capture given
a monocular video of an actor's face. Our framework comprises two parts.
First, to extract the information in the frames, we optimize a triplet loss to
learn an embedding space in which semantically closer facial
expressions lie closer together, so that the model can be transferred
to distinguish expressions that are not present in the training dataset.
Second, the embeddings are fed into an LSTM network to learn the deformation
between frames. In the experiments, we demonstrated that compared to other
methods, our method can distinguish the delicate motion around lips and
significantly reduce jitters between the tracked meshes.
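
For readers unfamiliar with the triplet loss mentioned above, the following is
a minimal sketch of its standard hinge form on precomputed embeddings. The
squared Euclidean distance, the margin value of 0.2, and the function name
`triplet_loss` are assumptions for illustration; the abstract does not state
which variant is used.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Mean hinge triplet loss over a batch of embeddings.

    Each argument has shape (batch, dim). The loss is zero once the
    anchor is closer to the positive than to the negative by `margin`
    (an assumed, hypothetical value).
    """
    d_pos = np.sum((anchor - positive) ** 2, axis=-1)  # anchor-positive distance
    d_neg = np.sum((anchor - negative) ** 2, axis=-1)  # anchor-negative distance
    return np.maximum(d_pos - d_neg + margin, 0.0).mean()
```

Minimizing this quantity pulls semantically similar expressions together and
pushes dissimilar ones apart in the embedding space.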

Advanced optimization algorithms such as Newton's method and AdaGrad benefit
from second-order derivatives or second-order statistics to achieve better
descent directions and faster convergence rates. At their heart, such
algorithms need to compute the inverse or inverse square root of a matrix whose
size is quadratic in the dimensionality of the search space. For
high-dimensional search spaces, computing the matrix inverse or inverse square root
becomes prohibitively expensive, which in turn calls for approximate methods. In this
work, we propose a new matrix approximation method which divides a matrix into
blocks and represents each block by one or two numbers. The method allows
efficient computation of matrix inverse and inverse square root. We apply our
method to AdaGrad in training deep neural networks. Experiments show
encouraging results compared to the diagonal approximation.
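
To make the cost argument concrete, the sketch below contrasts the full-matrix
AdaGrad direction, which needs an inverse square root of the accumulated
second-moment matrix, with the diagonal approximation used as the baseline.
It does not implement the block approximation proposed in the abstract, whose
details are not given here; the function names and the damping term `eps` are
assumptions for illustration.

```python
import numpy as np

def adagrad_direction_full(G, g, eps=1e-8):
    """Full-matrix direction (G + eps*I)^(-1/2) g.

    G is the accumulated sum of gradient outer products (symmetric PSD).
    The eigendecomposition costs O(d^3) for d parameters.
    """
    w, V = np.linalg.eigh(G)
    # Scale column j of V by (w_j + eps)^(-1/2), then rotate back.
    inv_sqrt = (V / np.sqrt(np.maximum(w, 0.0) + eps)) @ V.T
    return inv_sqrt @ g

def adagrad_direction_diag(G, g, eps=1e-8):
    """Diagonal approximation: only per-coordinate statistics, O(d) cost."""
    return g / np.sqrt(np.diag(G) + eps)
```

The two functions agree when `G` is diagonal and differ as soon as the
off-diagonal correlations between parameters are non-negligible.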

In this paper, we propose an interpretable LSTM recurrent neural network,
i.e., multivariable LSTM for time series with exogenous variables. Currently,
the widely used attention mechanisms in recurrent neural networks mostly focus on
the temporal aspect of data and fall short of characterizing variable
importance. To this end, our multivariable LSTM, equipped with tensorized
hidden states, is developed to learn variable-specific representations, which
give rise to both temporal and variable-level attention. Preliminary
experiments demonstrate comparable prediction performance of multivariable
LSTM w.r.t. encoder-decoder based baselines. More interestingly, variable
importance in real datasets characterized by the variable attention is highly
in line with that determined by the statistical Granger causality test, which
exhibits the prospect of multivariable LSTM as a simple and uniform end-to-end
framework for both forecasting and knowledge discovery.

Locating actions in long untrimmed videos has been a challenging problem in
video content analysis. The performance of existing action localization
approaches remains unsatisfactory in precisely determining the beginning and the
end of an action. Imitating the human perception procedure with observations
and refinements, we propose a novel three-phase action localization framework.
Our framework is embedded with an Actionness Network to generate initial
proposals through frame-wise similarity grouping, and then a Refinement Network
to conduct boundary adjustment on these proposals. Finally, the refined
proposals are sent to a Localization Network for further fine-grained location
regression. The whole process can be deemed as multi-stage refinement using a
novel non-local pyramid feature under various temporal granularities. We
evaluate our framework on the THUMOS14 benchmark and obtain a significant
improvement over the state-of-the-art approaches. Specifically, the
performance gain is remarkable under precise localization with high IoU
thresholds. Our proposed framework achieves mAP@IoU=0.5 of 34.2%.
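
The IoU threshold in the reported metric refers to temporal overlap between a
predicted segment and a ground-truth segment. A minimal sketch of that overlap
measure follows; the function name `temporal_iou` and the (start, end) tuple
convention are assumptions for illustration, not part of the benchmark
tooling.

```python
def temporal_iou(pred, truth):
    """Intersection over union of two time segments given as (start, end)."""
    (ps, pe), (ts, te) = pred, truth
    inter = max(0.0, min(pe, te) - max(ps, ts))  # overlap length, 0 if disjoint
    union = (pe - ps) + (te - ts) - inter
    return inter / union if union > 0 else 0.0
```

Under mAP@IoU=0.5, a detection counts as correct only if this value is at
least 0.5 for a ground-truth instance of the same class.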

We realize a $\Lambda$ system in a superconducting circuit, with metastable
states exhibiting lifetimes up to 8\,ms. We exponentially suppress the
tunneling matrix elements involved in spontaneous energy relaxation by creating
a "heavy" fluxonium, realized by adding a capacitive shunt to the original
circuit design. The device allows for both cavity-assisted and direct
fluorescent readout, as well as state preparation schemes akin to optical
pumping. Since direct transitions between the metastable states are strongly
suppressed, we utilize Raman transitions for coherent manipulation of the
states.

We theoretically analyze a scheme for fast stabilization of arbitrary qubit
states with high fidelities, extending a protocol recently demonstrated
experimentally. Our scheme utilizes red and blue sideband transitions in a
system composed of a fluxonium qubit, a low-Q LC oscillator, and a coupler
enabling us to tune the interaction between them. Under parametric modulations
of the coupling strength, the qubit can be steered into any desired pure or
mixed single-qubit state. For realistic circuit parameters, we predict that
stabilization can be achieved within 100 ns. By varying the ratio between the
oscillator's damping rate and the effective qubit-oscillator coupling strength,
we can switch between underdamped, critically damped, and overdamped
stabilization and find optimal working points. We further analyze the effect of
thermal fluctuations and show that the stabilization scheme remains robust for
realistic temperatures.

We propose a lightweight neural network model, Deformable Volume Network
(Devon), for learning optical flow. Devon benefits from a multi-stage framework
to iteratively refine its prediction. Each stage is by itself a neural network
with an identical architecture. The optical flow between two stages is
propagated with a newly proposed module, the deformable cost volume. The
deformable cost volume does not distort the original images or their feature
maps and therefore avoids the artifacts associated with warping, a common
drawback in previous models. Devon has only one million parameters. Experiments
show that Devon achieves comparable results to previous neural network models,
despite its small size.

Few-layer black phosphorus (FLBP), a recently discovered two-dimensional
semiconductor, has attracted substantial attention in the scientific and
technical communities due to its great potential in electronic and
optoelectronic applications. However, the reactivity of FLBP flakes with ambient
species limits its direct applications. Among various methods to passivate FLBP
in an ambient environment, nanocomposites mixing FLBP flakes with a stable matrix
may be one of the most promising approaches for industrial applications. Here, we
report a simple one-step procedure to mass-produce air-stable
FLBP/phospholipids nanocomposite in liquid phase. The resultant nanocomposite
is found to have an ultralow tunneling barrier for charge carriers, which can be
described by an Efros-Shklovskii variable range hopping mechanism. Devices made
from such mass-produced FLBP/phospholipids nanocomposite show highly stable
electrical conductivity and optoelectrical response in ambient conditions,
indicating its promise for both electronic and optoelectronic
applications. This method could also be generalized to the mass production of
nanocomposites consisting of other air-sensitive two-dimensional materials,
such as FeSe, NbSe2, WTe2, etc.

Although nonequilibrium work and fluctuation relations have been studied in
detail within classical statistical physics, extending these results to open
quantum systems has proven to be conceptually difficult. For systems that
undergo decoherence but not dissipation, we argue that it is natural to define
quantum work exactly as for isolated quantum systems, using the two-point
measurement protocol. Complementing previous theoretical analysis using quantum
channels, we show that the nonequilibrium work relation remains valid in this
situation, and we test this assertion experimentally using a system engineered
from an optically trapped ion. Our experimental results reveal the work
relation's validity over a variety of driving speeds, decoherence rates, and
effective temperatures and represent the first confirmation of the work
relation for nonunitary dynamics.

Cities are living systems where urban infrastructures and their functions are
defined by, and evolve with, population behaviors. Profiling cities and their
functional regions has been an important topic in urban design and planning.
This paper studies a unique big data set which includes daily movement data of
tens of millions of city residents, and develops a visual analytics system,
namely UrbanFACET, to discover and visualize the dynamical profiles of multiple
cities and their residents. This big user movement data set, acquired from
mobile users' agnostic check-ins at thousands of phone apps, is well utilized
in an integrative study and visualization together with urban structure (e.g.,
road network) and POI (Point of Interest) distributions. In particular, we
develop a novel set of information-theory-based metrics to characterize the
mobility patterns of city areas and groups of residents. These multifaceted
metrics, comprising Fluidity, vibrAncy, Commutation, divErsity, and densiTy
(FACET), categorize and manifest hidden urban functions and behaviors.
The UrbanFACET system further allows users to visually analyze and compare the
metrics over different areas and cities at metropolitan scales. The system is
evaluated through both case studies on several big and heavily populated
cities, and user studies involving real-world users.

We autonomously stabilize arbitrary states of a qubit through parametric
modulation of the coupling between a fixed-frequency qubit and resonator. The
coupling modulation is achieved with a tunable coupler design, in which the
qubit and the resonator are connected in parallel to a superconducting quantum
interference device. This allows for quasistatic tuning of the qubit-cavity
coupling strength from 12 MHz to more than 300 MHz. Additionally, the coupling
can be dynamically modulated, allowing for single-photon exchange in 6 ns.
Qubit coherence times exceeding 20 $\mu$s are maintained over the majority of
the range of tuning, limited primarily by the Purcell effect. The parametric
stabilization technique realized using the tunable coupler involves engineering
the qubit bath through a combination of photon non-conserving sideband
interactions realized by flux modulation, and direct qubit Rabi driving. We
demonstrate that the qubit can be stabilized to arbitrary states on the Bloch
sphere with a worst-case fidelity exceeding 80%.

We perform fast vehicle detection from traffic surveillance cameras. A novel
deep learning framework, namely Evolving Boxes, is developed that proposes and
refines the object boxes under different feature representations. Specifically,
our framework is embedded with a lightweight proposal network to generate
initial anchor boxes as well as to discard unlikely regions early; a
fine-tuning network produces detailed features for these candidate boxes. We
show intriguingly that by applying different feature fusion techniques, the
initial boxes can be refined for both localization and recognition. We evaluate
our network on the recent DETRAC benchmark and obtain a significant improvement
over the state-of-the-art Faster R-CNN by 9.5% mAP. Further, our network
achieves 9-13 FPS detection speed on a moderate commercial GPU.

Molecules are the most demanding quantum systems to be simulated by quantum
computers because of their complexity and the emergent role of quantum nature.
The recent theoretical proposal of Huh et al. (Nature Photon., 9, 615 (2015))
showed that a multiphoton network with a Gaussian input state can simulate a
molecular spectroscopic process. Here, we report the first experimental
demonstration of molecular vibrational spectroscopy of SO$_{2}$ with a
trapped-ion system. In our realization, the molecular scattering operation is
decomposed into a series of elementary quantum optical operations, which are
implemented through Raman laser beams, resulting in a multimode Gaussian
(Bogoliubov) transformation. The molecular spectroscopic signal is
reconstructed from the collective projection measurements on phonon modes of
the trapped-ion system. Our experimental demonstration paves the way to
large-scale molecular quantum simulations, which are classically intractable.

A standard method to obtain information on a quantum state is to measure
marginal distributions along many different axes in phase space, which forms a
basis of quantum state tomography. We theoretically propose and experimentally
demonstrate a general framework to manifest nonclassicality by observing a
single marginal distribution only, which provides a novel insight into
nonclassicality and a practical applicability to various quantum systems. Our
approach maps the 1-dim marginal distribution into a factorized 2-dim
distribution by multiplying the measured distribution or the vacuum-state
distribution along an orthogonal axis. The resulting fictitious Wigner function
becomes unphysical only for a nonclassical state; thus the negativity of the
corresponding density operator provides evidence of nonclassicality.
Furthermore, the negativity measured this way yields a lower bound for
entanglement potential, a measure of entanglement generated using a
nonclassical state with a beam splitter setting that is a prototypical model to
produce continuous-variable (CV) entangled states. Our approach detects both
Gaussian and non-Gaussian nonclassical states in a reliable and efficient
manner. Remarkably, it works regardless of measurement axis for all
non-Gaussian states in finite-dimensional Fock space of any size, also
extending to infinite-dimensional states of experimental relevance for CV
quantum informatics. We experimentally illustrate the power of our criterion
for motional states of a trapped ion, confirming their nonclassicality in a
measurement-axis-independent manner. We also address an extension of our
approach combined with phase-shift operations, which leads to a stronger test
of nonclassicality, i.e., detection of genuine non-Gaussianity under a CV
measurement.

Spammer detection on social networks is a challenging problem. Rigid
anti-spam rules have resulted in the emergence of "smart" spammers, who resemble
legitimate users and are difficult to identify. In this paper, we present a
novel spammer classification approach based on Latent Dirichlet
Allocation (LDA), a topic model. Our approach extracts both the local and the
global information of topic distribution patterns, which capture the essence of
spamming. Tested on one benchmark dataset and one self-collected dataset, our
proposed method outperforms other state-of-the-art methods in terms of averaged
F1-score.

The outputs of a trained neural network contain much richer information than
just a one-hot classification. For example, a neural network might give an image
of a dog the probability of one in a million of being a cat but it is still
much larger than the probability of being a car. To reveal the hidden structure
in them, we apply two unsupervised learning algorithms, PCA and ICA, to the
outputs of a deep Convolutional Neural Network trained on ImageNet with 1000
classes. The PCA/ICA embedding of the object classes reveals their visual
similarity and the PCA/ICA components can be interpreted as common visual
features shared by similar object classes. As an application, we propose a
new zero-shot learning method, in which the visual features learned by PCA/ICA
are employed. Our zero-shot learning method achieves state-of-the-art
results on ImageNet with over 20000 classes.

In this work, we show that a giant spin current can be injected into a nodal
topological superconductor, using a normal paramagnetic lead, through a large
number of zero-energy Majorana fermions at the superconductor edge. The giant
spin current is caused by the selective equal-spin Andreev reflections (SESAR)
induced by Majorana fermions. In each SESAR event, a pair of electrons with
a certain spin polarization is injected into the nodal topological
superconductor, even though the pairing in the bulk of the nodal superconductor
is spin-singlet s-wave. We further explain the origin of the spin current by
showing that the pairing correlation at the edge of a nodal topological
superconductor is predominantly equal-spin triplet at zero energy. The
experimental consequences of SESAR in nodal topological superconductors are
discussed.

Multiple Instance Learning (MIL) has recently provided an appealing way to
alleviate the drifting problem in visual tracking. Following the
tracking-by-detection framework, an online MILBoost approach is developed that
sequentially chooses weak classifiers by maximizing the bag likelihood. In this
paper, we extend this idea towards incorporating the instance significance
estimation into the online MILBoost framework. First, instead of treating all
instances equally, with each instance we associate a significance-coefficient
that represents its contribution to the bag likelihood. The coefficients are
estimated by a simple Bayesian formula that jointly considers the predictions
from several standard MILBoost classifiers. Next, we follow the online boosting
framework, and propose a new criterion for the selection of weak classifiers.
Experiments with challenging public datasets show that the proposed method
outperforms both existing MIL-based and boosting-based trackers.

Quantum theory is based on a mathematical structure totally different from
conventional arithmetic. Due to the symmetric nature of bosonic particles,
annihilation or creation of single particles translates a quantum state
depending on how many bosons are already in the given quantum system. This
proportionality results in a variety of nonclassical features of quantum
mechanics including the bosonic commutation relation. The annihilation and
creation operations have recently been implemented in photonic systems.
However, this feature of quantum mechanics does not preclude the possibility of
realizing conventional arithmetic in quantum systems. We implement conventional
addition and subtraction of single phonons for a trapped $^{171}$Yb$^{+}$ ion in a harmonic
potential. In order to realize such operations, we apply the transitionless
adiabatic passage scheme on the anti-Jaynes-Cummings coupling between the
internal energy states and external motion states of the ion. By performing the
operations on superpositions of Fock states, we realize the hybrid computation
of classical arithmetic in quantum parallelism, and show that our operations
are useful to engineer quantum states. Our single-phonon operations are nearly
deterministic and robust against parameter changes, enabling convenient repetition
of the operations independently of the initial state of the atomic motion. We
demonstrate the transformation of a classical state to a nonclassical one of highly
sub-Poissonian phonon statistics and a Gaussian state to a non-Gaussian state,
by applying a sequence of the operations. The operations implemented here are
the Susskind-Glogower phase operators, whose noncommutativity is also
demonstrated.
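
For reference, the contrast between the bosonic operations and conventional
arithmetic on Fock states can be written out explicitly. The notation below
($\hat{S}^{\pm}$ for the Susskind-Glogower operators) is an assumed
convention for illustration and is not taken from the abstract.

```latex
% Bosonic annihilation and creation: the amplitude depends on n
\hat{a}\,|n\rangle = \sqrt{n}\,|n-1\rangle, \qquad
\hat{a}^{\dagger}|n\rangle = \sqrt{n+1}\,|n+1\rangle

% Conventional subtraction and addition: no n-dependent factor
\hat{S}^{-} = \sum_{n=0}^{\infty} |n\rangle\langle n+1|, \qquad
\hat{S}^{+} = \sum_{n=0}^{\infty} |n+1\rangle\langle n|

% The two operations fail to commute only on the vacuum
\hat{S}^{-}\hat{S}^{+} = \hat{1}, \qquad
\hat{S}^{+}\hat{S}^{-} = \hat{1} - |0\rangle\langle 0|, \qquad
[\hat{S}^{-}, \hat{S}^{+}] = |0\rangle\langle 0|
```

The last line shows why subtracting after adding differs from adding after
subtracting only when the motion starts in the ground state.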

The search for topological superconductors which support Majorana fermion
excitations has been an important topic in condensed matter physics. In this
work, we propose a new experimental scheme for engineering topological
superconductors. In this scheme, by manipulating the superlattice structure of
organic molecules placed on top of a superconductor with Rashba spin-orbit
coupling, topological superconducting phases can be achieved without
fine-tuning the chemical potential. Moreover, superconductors with different
Chern numbers can be obtained by changing the superlattice structure of the
organic molecules.

Stochastic sampling-based trackers have shown good performance for abrupt
motion tracking and have therefore gained popularity in recent years. However,
conventional methods tend to use a two-stage sampling paradigm, in which the
search space needs to be uniformly explored with an inefficient preliminary
sampling phase. In this paper, we propose a novel sampling-based method in the
Bayesian filtering framework to address the problem. Within the framework,
nearest neighbor field estimation is utilized to compute the importance
proposal probabilities, which guide the Markov chain search towards promising
regions and thus enhance the sampling efficiency; given the motion priors, a
smoothing stochastic sampling Monte Carlo algorithm is proposed to approximate
the posterior distribution through a smoothing weight-updating scheme.
Moreover, to track the abrupt and the smooth motions simultaneously, we develop
an abrupt-motion detection scheme which can discover the presence of abrupt
motions during online tracking. Extensive experiments on challenging image
sequences demonstrate the effectiveness and the robustness of our algorithm in
handling the abrupt motions.

The past two decades witnessed important developments in the field of
nonequilibrium statistical mechanics. Among these developments, the Jarzynski
equality, being a milestone following the landmark work of Clausius and Kelvin,
stands out. The Jarzynski equality relates the free energy difference between
two equilibrium states and the work done on the system through
far-from-equilibrium processes. While experimental tests of the equality have been
performed in the classical regime, the verification of the quantum Jarzynski
equality has not yet been fully demonstrated due to experimental challenges.
Here, we report an experimental test of the quantum Jarzynski equality with a
single $^{171}$Yb$^{+}$ ion trapped in a harmonic potential. We perform projective
measurements to obtain phonon distributions of the initial thermal state.
Following that, we apply a laser-induced force on the projected energy
eigenstate, and find transition probabilities to final energy eigenstates after
the work is done. By varying the speed of applying the force from the equilibrium
to the far-from-equilibrium regime, we verify the quantum Jarzynski equality in
an isolated system.
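
For reference, the equality being tested can be stated in its standard form,
together with the way the two measured quantities described above enter it.
The symbols ($\beta$, $W$, $\Delta F$, $P_n^{\mathrm{th}}$,
$P_{m \leftarrow n}$, $E_n^{i}$, $E_m^{f}$) are an assumed notation for
illustration and do not appear in the abstract.

```latex
% Jarzynski equality, with \beta = 1/(k_B T)
\left\langle e^{-\beta W} \right\rangle = e^{-\beta \Delta F}

% Two-measurement form for an isolated quantum system:
% P_n^{th} is the initial thermal population of level n,
% P_{m <- n} the transition probability to final level m,
% and the work in one realization is W = E_m^{f} - E_n^{i}.
\sum_{n,m} P_n^{\mathrm{th}} \, P_{m \leftarrow n} \,
  e^{-\beta \left( E_m^{f} - E_n^{i} \right)} = e^{-\beta \Delta F}
```

The initial projective measurement supplies $P_n^{\mathrm{th}}$ and the final
one supplies $P_{m \leftarrow n}$, so the left-hand side can be evaluated
directly from the measured distributions.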

We investigate classification and detection of entanglement of multipartite
quantum states in a very general setting, and obtain efficient $k$-separability
criteria for mixed multipartite states in arbitrary dimensional quantum
systems. These criteria can be used to distinguish $n-1$ different classes of
multipartite inseparable states and can detect many important multipartite
entangled states such as GHZ states, W states, anti-W states, and mixtures
thereof. They detect $k$-nonseparable $n$-partite quantum states which have
previously not been identified. Here $k=2,3,\cdots,n$. No optimization or
eigenvalue evaluation is needed, and our criteria can be evaluated by simple
computations involving components of the density matrix. Most importantly, they
can be implemented in today's experiments by using at most $\mathcal{O}(n^2)$
local measurements.