
Layer-structured materials are often considered to be good candidates for
thermoelectric materials, because they tend to exhibit intrinsically low
thermal conductivity as a result of atomic interlayer interactions. The
electrical properties of layer-structured materials can be easily tuned using
various methods, such as band modification and intercalation. We report that
TiNBr, a member of the layer-structured metal nitride halide system MNX (M = Ti,
Zr, Hf; X = Cl, Br, I), exhibits an ultrahigh Seebeck coefficient of
2215 $\mu$V/K at 300 K. The value of the dimensionless figure of merit, ZT,
along the A axis can be as high as 0.661 at 800 K, corresponding to a lattice
thermal conductivity as low as 1.34 W/(m K). The low ${\kappa_l}$ of TiNBr is
associated with a collectively low phonon group velocity ($2.05\times 10^3 $
m/s on average) and large phonon anharmonicity that can be quantified using the
Gr\"uneisen parameter and three-phonon processes. Animation of the highly
anharmonic modes shows that the atomic motion mainly involves the N atoms, and
the charge density difference reveals that the N atoms become polarized with
the emergence of anharmonicity. Moreover, fitting the
energy-displacement curve verifies that, in addition to the three-phonon
processes, the fourth-order anharmonic effect is also important in the integral
anharmonicity of TiNBr. Our work is the first study of the thermoelectric
properties of TiNBr and may help establish a connection between the low lattice
thermal conductivity and the behavior of phonon vibrational modes.
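
As a reading aid, the standard definition of the dimensionless figure of merit, ZT = S^2 * sigma * T / kappa, can be sketched in a few lines of Python. The function name and the sample inputs below are illustrative assumptions and are not values reported for TiNBr; kappa denotes the total (electronic plus lattice) thermal conductivity.

```python
def figure_of_merit(seebeck_v_per_k, sigma_s_per_m, kappa_w_per_mk, temperature_k):
    """Return ZT = S^2 * sigma * T / kappa (dimensionless).

    seebeck_v_per_k : Seebeck coefficient S in V/K
    sigma_s_per_m   : electrical conductivity sigma in S/m
    kappa_w_per_mk  : total thermal conductivity kappa in W/(m K)
    temperature_k   : absolute temperature T in K
    """
    if kappa_w_per_mk <= 0:
        raise ValueError("thermal conductivity must be positive")
    return seebeck_v_per_k ** 2 * sigma_s_per_m * temperature_k / kappa_w_per_mk


# Hypothetical inputs chosen only to illustrate the units involved.
zt_example = figure_of_merit(200e-6, 1e5, 2.0, 500.0)
```

The formula makes the trade-off explicit: at fixed S and sigma, halving kappa doubles ZT, which is why a low lattice thermal conductivity matters.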

We study active object tracking, where a tracker takes as input the visual
observation (i.e., frame sequence) and produces the camera control signal
(e.g., move forward, turn left, etc.). Conventional methods tackle tracking
and camera control separately, which makes the two components challenging to
tune jointly. They also require considerable human effort for labeling and many
expensive trial-and-error runs in the real world. To address these issues, we
propose an end-to-end
solution via deep reinforcement learning, where a ConvNet-LSTM function
approximator is adopted for direct frame-to-action prediction. We further
propose an environment augmentation technique and a customized reward function,
which are crucial for successful training. The tracker trained in simulators
(ViZDoom, Unreal Engine) generalizes well to unseen object
moving paths, unseen object appearances, unseen backgrounds, and distracting
objects. It can restore tracking after occasionally losing the target. With
experiments over the VOT dataset, we also find that the tracking ability,
obtained solely from simulators, can potentially transfer to real-world
scenarios.

In this paper we develop a method to solve evolution equations on Gelfand
triples with time-fractional derivative based on monotonicity techniques.
Applications include deterministic and stochastic quasilinear partial
differential equations with time-fractional derivatives, including
time-fractional (stochastic) porous media equations (including the case where
the Laplace operator is also fractional) and $p$-Laplace equations as special
cases.

This paper studies the Tensor Robust Principal Component (TRPCA) problem
which extends the known Robust PCA (Candes et al. 2011) to the tensor case. Our
model is based on a new tensor Singular Value Decomposition (t-SVD) (Kilmer and
Martin 2011) and its induced tensor tubal rank and tensor nuclear norm.
Consider that we have a 3-way tensor ${\mathcal{X}}\in\mathbb{R}^{n_1\times
n_2\times n_3}$ such that ${\mathcal{X}}={\mathcal{L}}_0+{\mathcal{E}}_0$,
where ${\mathcal{L}}_0$ has low tubal rank and ${\mathcal{E}}_0$ is sparse. Is
it possible to recover both components? In this work, we prove that under
certain suitable assumptions, we can recover both the low-rank and the sparse
components exactly by simply solving a convex program whose objective is a
weighted combination of the tensor nuclear norm and the $\ell_1$-norm, i.e.,
$\min_{{\mathcal{L}},\ {\mathcal{E}}} \
\|{\mathcal{L}}\|_*+\lambda\|{\mathcal{E}}\|_1, \ \text{s.t.} \
{\mathcal{X}}={\mathcal{L}}+{\mathcal{E}}$, where $\lambda=
{1}/{\sqrt{\max(n_1,n_2)n_3}}$. Interestingly, TRPCA involves RPCA as a special
case when $n_3=1$ and thus it is a simple and elegant tensor extension of RPCA.
Numerical experiments verify our theory, and the application to image
denoising demonstrates the effectiveness of our method.
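
The weighting parameter stated above, $\lambda = 1/\sqrt{\max(n_1,n_2)\,n_3}$, is simple to compute, and a short Python sketch makes the reduction to the matrix case visible. The function name `trpca_lambda` is a hypothetical helper introduced here for illustration only.

```python
import math


def trpca_lambda(n1, n2, n3):
    """Weight on the l1 term for an n1 x n2 x n3 tensor, as given in the text:
    lambda = 1 / sqrt(max(n1, n2) * n3)."""
    if min(n1, n2, n3) < 1:
        raise ValueError("all tensor dimensions must be positive integers")
    return 1.0 / math.sqrt(max(n1, n2) * n3)


# With n3 = 1 the tensor is a single matrix and the weight reduces to
# 1 / sqrt(max(n1, n2)), the usual choice in matrix Robust PCA.
lam_matrix = trpca_lambda(100, 50, 1)
lam_tensor = trpca_lambda(100, 50, 4)
```

Increasing the number of frontal slices $n_3$ lowers $\lambda$, so the sparse term is penalized less heavily relative to the tensor nuclear norm.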

In this paper, we propose a deep learning approach to tackle automatic
summarization tasks by incorporating topic information into the convolutional
sequence-to-sequence (ConvS2S) model and using self-critical sequence training
(SCST) for optimization. Through jointly attending to topics and word-level
alignment, our approach can improve coherence, diversity, and informativeness
of generated summaries via a biased probability generation mechanism. On the
other hand, reinforcement training, like SCST, directly optimizes the proposed
model with respect to the non-differentiable metric ROUGE, which also avoids
the exposure bias during inference. We carry out the experimental evaluation
against state-of-the-art methods over the Gigaword, DUC-2004, and LCSTS datasets.
The empirical results demonstrate the superiority of our proposed method in
abstractive summarization.
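
To make the role of SCST concrete, the following is a minimal sketch of the generic self-critical loss for one sequence: the reward of a sampled summary is compared with the reward of the greedy summary, which serves as the baseline. This is the general SCST formulation, not the authors' implementation; the function name and the numeric inputs are assumptions, and the rewards stand in for ROUGE scores computed elsewhere.

```python
def scst_loss(sample_log_probs, sample_reward, greedy_reward):
    """Self-critical policy-gradient loss for a single sampled sequence.

    sample_log_probs : per-token log-probabilities of the sampled sequence
    sample_reward    : reward (e.g. a ROUGE score) of the sampled sequence
    greedy_reward    : reward of the greedy-decoded sequence (the baseline)
    """
    advantage = sample_reward - greedy_reward
    # Minimizing this raises the likelihood of samples that beat the
    # greedy baseline and lowers it for samples that fall short.
    return -advantage * sum(sample_log_probs)


# Hypothetical numbers: the sample scores higher than the greedy output.
loss = scst_loss([-0.5, -1.5], sample_reward=0.6, greedy_reward=0.4)
```

Because the reward enters only as a scalar weight, the metric itself never needs to be differentiated.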

Human motion prediction aims at generating future frames of human motion
based on an observed sequence of skeletons. Recent methods employ the latest
hidden states of a recurrent neural network (RNN) to encode the historical
skeletons, which can only address short-term prediction. In this work, we
propose to model the motion context by summarizing the historical human motion
with respect to the current prediction. A modified highway unit (MHU) is
proposed for efficiently eliminating motionless joints and estimating the next
pose given the motion context. Furthermore, we enhance the motion dynamics by
minimizing the Gram matrix loss for long-term motion prediction. Experimental
results show that the proposed model can forecast future human
movements and yields superior performance over related state-of-the-art
approaches. Moreover, specifying the motion context with the activity labels
enables our model to perform human motion transfer.

In a previous work, we found that an object's gravity can be regarded as its
buoyancy in space when it displaces gravitons. However, this seems to
contradict the point-like elementary particle of the Standard Model. In this
work, by combining Klein's curled fifth dimension and Newman's complex space,
we find that a particle has a micro complex horizon, which can be described by
the complex Kerr-Newman metrics in a 6D complex space, the 3D imaginary subspace
of which is curled in the points of its 3D real subspace (and vice versa). A
particle can appear as a point-like particle or a particle with nonzero volume
in different slices of the 6D complex space. The ring singularity of a
particle is hidden in one slice of its complex horizon. As two phases of a
complex black hole, a particle and a black hole can be transformed into each
other through a phase transformation.

In this letter, by combining the holographic principle with the graviton
Bose-Einstein condensate hypothesis of gravitational backgrounds, we propose a
theory of gravity that provides some kinetic details of how the gravitational
coupling between matter and spacetime works. The effective radial potential
energy of an object in a gravitational field is found to be the sum of the
interfacial energy caused by its micro horizon and the energy required to make
room for it by displacing gravitons. A version of Archimedes' principle for
gravity can be described as "the effective internal energy of the gravitons
that a body displaces is equal to the work obtained by multiplying the gravity
exerted on it by its distance to the centre of gravity."

Color names are often made up of multiple words. As a task in natural
language understanding we investigate in depth the capacity of neural networks
based on sums of word embeddings (SOWE), recurrence (LSTM and GRU based RNNs)
and convolution (CNN), to estimate colors from sequences of terms. We consider
both point and distribution estimates of color. We argue that the latter has
particular value, as there is no clear agreement between people as to what a
particular color describes: different people have different ideas of what it
means to be ``very dark orange'', for example. Surprisingly, despite its
simplicity, the sum of word embeddings generally performs best on almost
all evaluations.

Since their discovery by SDO/AIA in EUV, rapid (phase speeds of 1000 km/s),
quasi-periodic, fast-mode propagating wave trains (QFPs) have been observed
accompanying many solar flares. They typically propagate in funnel-like
structures associated with the expanding magnetic field topology of the active
regions (ARs). The waves provide information on the associated flare pulsations
and the magnetic structure through coronal seismology. The reported waves
usually originate from a single localized source associated with the flare.
Here, we report the first detection of counter-propagating QFPs associated with
two neighboring flares on 2013 May 22, apparently connected by large-scale,
trans-equatorial coronal loops. We present the first results of a 3D MHD model
of counter-propagating QFPs in an idealized bipolar AR. We investigate the
excitation, propagation, nonlinearity, and interaction of the
counter-propagating waves for a range of key model parameters, such as the
properties of the sources and the background magnetic structure. In addition to
QFPs, we also find evidence of trapped fast (kink) and slow mode waves
associated with the event. We apply coronal seismology to determine the
magnetic field strength in an oscillating loop during the event. Our model
results are in qualitative agreement with the AIA-observed counter-propagating
waves and are used to identify the various MHD wave modes associated with the
observed event, providing insights into their linear and nonlinear interactions.
Our observations provide the first direct evidence of counter-propagating fast
magnetosonic waves that can potentially lead to turbulent cascade and carry
significant energy flux for coronal heating in low-corona magnetic structures.

An experiment for
$p(^{14}\rm{C}$,$^{14}\rm{C}^{*}\rightarrow^{10}\rm{Be}+\alpha)\mathit{p}$
inelastic excitation and decay was performed in inverse kinematics at a beam
energy of 25.3 MeV/u. A series of $^{14}\rm{C}$ excited states, including a new
one at 18.3(1) MeV, were observed which decay to various states of the final
nucleus of $^{10}\rm{Be}$. A specially designed telescope system, installed
around zero degrees, played an essential role in detecting the resonant
states near the $\alpha$-separation threshold. A state at 14.1(1) MeV is
clearly identified, being consistent with the predicted bandhead of the
molecular rotational band characterized by the $\pi$-bond
linear-chain configuration. Further clarification of the properties of this
exotic state is suggested by using appropriate reaction tools.

In this paper, the Milstein method is used to approximate invariant measures
of stochastic differential equations with commutative noise. The decay rate of
the transition probability kernel generated by the Milstein method to the
unique invariant measure of the method is observed to be exponential with
respect to the time variable. The convergence rate of the numerical invariant
measure to the underlying one is shown to be one. Numerical simulations are
presented to demonstrate the theoretical results.
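
For orientation, the scalar Milstein update $X_{n+1} = X_n + a(X_n)h + b(X_n)\Delta W + \tfrac{1}{2} b(X_n) b'(X_n)(\Delta W^2 - h)$ can be sketched as follows. This is a textbook one-dimensional illustration in which a single driving noise is trivially commutative; it is not the scheme or the test equations of the paper, and all function and parameter names are assumptions.

```python
import math
import random


def milstein_step(x, drift, diffusion, diffusion_dx, h, dw):
    """One Milstein step for the scalar SDE dX = a(X) dt + b(X) dW.

    drift, diffusion, diffusion_dx are callables for a, b and b' respectively;
    h is the step size and dw the Brownian increment over that step.
    """
    b = diffusion(x)
    return x + drift(x) * h + b * dw + 0.5 * b * diffusion_dx(x) * (dw * dw - h)


def simulate(x0, drift, diffusion, diffusion_dx, h, n_steps, seed=0):
    """Run n_steps Milstein steps from x0 and return the final state."""
    rng = random.Random(seed)
    x = x0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(h))  # increment with variance h
        x = milstein_step(x, drift, diffusion, diffusion_dx, h, dw)
    return x
```

Long runs of `simulate` from many seeds give samples whose empirical distribution approximates the numerical invariant measure, provided the equation is ergodic.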

Recently, the booming fashion sector and its huge potential benefits have
attracted tremendous attention from many research communities. In particular,
increasing research efforts have been dedicated to complementary clothing
matching, as matching clothes to make a suitable outfit has become a daily
headache for many people, especially those who lack a sense of
aesthetics. Thanks to the remarkable success of neural networks in various
applications such as image classification and speech recognition,
researchers are able to adopt data-driven learning methods to analyze
fashion items. Nevertheless, existing studies overlook the rich valuable
knowledge (rules) accumulated in the fashion domain, especially the rules
regarding clothing matching. Towards this end, in this work, we shed light on
complementary clothing matching by integrating advanced deep neural
networks and rich fashion domain knowledge. Considering that the rules can
be fuzzy and different rules may have different confidence levels for different
samples, we present a neural compatibility modeling scheme with attentive
knowledge distillation based on the teacher-student network scheme. Extensive
experiments on a real-world dataset show the superiority of our model over
several state-of-the-art baselines. Based upon the comparisons, we observe
certain fashion insights that add value to the fashion matching study. As a
byproduct, we have released the code and the parameters involved to benefit
other researchers.

Salient object detection, which aims to identify and locate the most salient
pixels or regions in images, has been attracting more and more interest due to
its various real-world applications. However, this vision task is quite
challenging, especially under complex image scenes. Inspired by the intrinsic
reflection of natural images, in this paper we propose a novel feature learning
framework for large-scale salient object detection. Specifically, we design a
symmetrical fully convolutional network (SFCN) to learn complementary saliency
features under the guidance of lossless feature reflection. The location
information, together with contextual and semantic information, of salient
objects is jointly utilized to supervise the proposed network for more
accurate saliency predictions. In addition, to overcome the blurry boundary
problem, we propose a new structural loss function to learn clear object
boundaries and spatially consistent saliency. The coarse prediction results are
effectively refined by this structural information for performance
improvements. Extensive experiments on seven saliency detection datasets
demonstrate that our approach achieves consistently superior performance and
outperforms the very recent state-of-the-art methods.

In this work, we study 3D object detection from RGB-D data in both indoor and
outdoor scenes. While previous methods focus on images or 3D voxels, often
obscuring natural 3D patterns and invariances of 3D data, we directly operate
on raw point clouds by popping up RGB-D scans. However, a key challenge of this
approach is how to efficiently localize objects in point clouds of large-scale
scenes (region proposal). Instead of solely relying on 3D proposals, our method
leverages both mature 2D object detectors and advanced 3D deep learning for
object localization, achieving efficiency as well as high recall for even small
objects. Benefiting from learning directly in raw point clouds, our method is
also able to precisely estimate 3D bounding boxes even under strong occlusion
or with very sparse points. Evaluated on KITTI and SUN RGB-D 3D detection
benchmarks, our method outperforms the state of the art by remarkable margins
while having real-time capability.

We investigate the pair production of right-handed neutrinos via the Standard
Model (SM) Higgs boson in a gauged $B-L$ model. The right-handed neutrinos with
a mass of a few tens of GeV generating viable light neutrino masses via the
seesaw mechanism naturally exhibit displaced vertices and distinctive
signatures at the LHC and proposed lepton colliders. The production rate of the
right-handed neutrinos depends on the mixing between the SM Higgs and the
exotic Higgs associated with the $B-L$ breaking, whereas their decay length
depends on the active-sterile neutrino mixing. We focus on the displaced
leptonic final states arising from such a process, and analyze the sensitivity
reach of the LHC and proposed lepton colliders in probing the active-sterile
neutrino mixing. We show that mixing to muons as small as $V_{\mu N} \approx
10^{-7}$ can be probed at the LHC with 100 fb$^{-1}$ and at proposed lepton
colliders with 5000 fb$^{-1}$. The future high-luminosity run at the LHC and the
proposed MATHUSLA detector may further improve this reach by an order of
magnitude.

The linearly constrained nonconvex nonsmooth program has drawn much attention
over the last few years due to its ubiquitous power of modeling in the area of
machine learning. A variety of important problems, including deep learning,
matrix factorization and phase retrieval, can be reformulated as the problem of
optimizing a highly nonconvex and nonsmooth objective function with some linear
constraints. However, it is challenging to solve a linearly constrained
nonconvex nonsmooth program, which is much more complicated than its
unconstrained counterpart. In fact, the feasible region is a polyhedron, where
a simple projection is intractable in general, and moreover, the per-iteration
cost is extremely expensive in real scenarios, where the dimension of the
decision variable is high. Therefore, it has been recognized as promising to
develop a provable and
practical algorithm for solving linearly constrained nonconvex nonsmooth
programs.
In this paper, we develop an incremental path-following splitting algorithm,
denoted as \textsf{IPFS}, with a theoretical guarantee and a low computational
cost. Specifically, we show that this algorithm converges to an
$\epsilon$-approximate stationary solution within $O(1/\epsilon)$ iterations
with very low per-iteration cost. To the best of our knowledge, this is the
first incremental method to solve linearly constrained nonconvex nonsmooth
programs with a theoretical guarantee. Experiments conducted on the constrained
concave penalized linear regression (CCPLR) and nonconvex support vector
machine (NCSVM) demonstrate that the proposed algorithm is more effective and
stable than other competing methods.

In this paper, we consider the Tensor Robust Principal Component Analysis
(TRPCA) problem, which aims to exactly recover the low-rank and sparse
components from their sum. Our model is based on the recently proposed
tensor-tensor product (or t-product) [13]. Induced by the t-product, we first
rigorously deduce the tensor spectral norm, tensor nuclear norm, and tensor
average rank, and show that the tensor nuclear norm is the convex envelope of
the tensor average rank within the unit ball of the tensor spectral norm. These
definitions, their relationships and properties are consistent with matrix
cases. Equipped with the new tensor nuclear norm, we then solve the TRPCA
problem by solving a convex program and provide the theoretical guarantee for
the exact recovery. Our TRPCA model and recovery guarantee include matrix RPCA
as a special case. Numerical experiments verify our results, and the
applications to image recovery and background modeling problems demonstrate the
effectiveness of our method.

Nowadays, billions of videos are online ready to be viewed and shared. Among
an enormous volume of videos, some popular ones are widely viewed by online
users while the majority attract little attention. Furthermore, within each
video, different segments may attract significantly different numbers of views.
This phenomenon leads to a challenging yet important problem, namely
fine-grained video attractiveness prediction. However, one major obstacle for
such a challenging problem is that no suitable benchmark dataset currently
exists. To this end, we construct the first fine-grained video attractiveness
dataset (FVAD), which is collected from one of the most popular video websites
in the world. In total, the constructed FVAD consists of 1,019 drama episodes
with 780.6 hours covering different categories and a wide variety of video
contents. Apart from the large number of videos, hundreds of millions of user
behaviors during watching videos are also included, such as "view counts",
"fast-forward", "fast-rewind", and so on, where "view counts" reflects the
video attractiveness while other engagements capture the interactions between
the viewers and videos. First, we demonstrate that video attractiveness and
different engagements present different relationships. Second, FVAD provides us
with an opportunity to study the fine-grained video attractiveness prediction
problem. We design different sequential models to perform video attractiveness
prediction by relying solely on video contents. The sequential models exploit
the multimodal relationships between visual and audio components of the video
contents at different levels. Experimental results demonstrate the
effectiveness of our proposed sequential models with different visual and audio
representations, the necessity of incorporating the two modalities, and the
complementary behaviors of the sequential prediction models at different
levels.

Recently, caption generation with an encoder-decoder framework has been
extensively studied and applied in different domains, such as image captioning,
code captioning, and so on. In this paper, we propose a novel architecture,
namely Auto-Reconstructor Network (ARNet), which, coupled with the
conventional encoder-decoder framework, works in an end-to-end fashion to
generate captions. ARNet aims at reconstructing the previous hidden state with
the present one, besides behaving as the input-dependent transition operator.
Therefore, ARNet encourages the current hidden state to embed more information
from the previous one, which can help regularize the transition dynamics of
recurrent neural networks (RNNs). Extensive experimental results show that our
proposed ARNet boosts the performance over the existing encoder-decoder models
on both image captioning and source code captioning tasks. Additionally, ARNet
remarkably reduces the discrepancy between training and inference processes for
caption generation. Furthermore, the performance on permuted sequential MNIST
demonstrates that ARNet can effectively regularize RNNs, especially when
modeling long-term dependencies. Our code is available at:
https://github.com/chenxinpeng/ARNet

In this paper, we establish the large deviation principles, with respect to
the weak convergence topology and the stronger Wasserstein metrics, for the
empirical measure under the mean field Gibbs measure, under the strong
exponential integrability condition for the negative part of the interaction
potential.

We propose an end-to-end deep learning architecture that produces a 3D shape
in triangular mesh from a single color image. Limited by the nature of deep
neural networks, previous methods usually represent a 3D shape in volume or
point cloud, and it is nontrivial to convert them to the more ready-to-use
mesh model. Unlike the existing methods, our network represents 3D mesh in a
graph-based convolutional neural network and produces correct geometry by
progressively deforming an ellipsoid, leveraging perceptual features extracted
from the input image. We adopt a coarse-to-fine strategy to make the whole
deformation procedure stable, and define various mesh-related losses to
capture properties of different levels to guarantee visually appealing and
physically accurate 3D geometry. Extensive experiments show that our method not
only qualitatively produces mesh models with better details, but also achieves
higher 3D shape estimation accuracy compared to the state of the art.

Thanks to the success of deep learning, cross-modal retrieval has made
significant progress recently. However, there still remains a crucial
bottleneck: how to bridge the modality gap to further enhance the retrieval
accuracy. In this paper, we propose a self-supervised adversarial hashing
(\textbf{SSAH}) approach, which lies among the early attempts to incorporate
adversarial learning into cross-modal hashing in a self-supervised fashion. The
primary contribution of this work is that two adversarial networks are
leveraged to maximize the semantic correlation and consistency of the
representations between different modalities. In addition, we harness a
self-supervised semantic network to discover high-level semantic information in
the form of multi-label annotations. Such information guides the feature
learning process and preserves the modality relationships in both the common
semantic space and the Hamming space. Extensive experiments carried out on
three benchmark datasets validate that the proposed SSAH surpasses the
state-of-the-art methods.
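
Retrieval in the Hamming space mentioned above amounts to ranking database items by the number of bit positions in which their hash codes differ from the query code. The sketch below illustrates only that generic ranking step, not the SSAH networks that produce the codes; the item names and the 4-bit codes are made up for illustration.

```python
def hamming_distance(code_a, code_b):
    """Number of positions at which two equal-length hash codes differ."""
    if len(code_a) != len(code_b):
        raise ValueError("hash codes must have the same length")
    return sum(a != b for a, b in zip(code_a, code_b))


def rank_by_hamming(query_code, database):
    """Return database keys ordered from nearest to farthest code."""
    return sorted(database, key=lambda key: hamming_distance(query_code, database[key]))


# Hypothetical codes: a text query code ranked against three image codes.
image_codes = {
    "img_a": [1, -1, 1, 1],
    "img_b": [-1, 1, -1, -1],
    "img_c": [1, -1, -1, 1],
}
ranking = rank_by_hamming([1, -1, 1, 1], image_codes)
```

Because the distance only counts mismatched bits, codes from different modalities can be compared directly once both are mapped into the same Hamming space.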

Leveraging the disparity information from both left and right views is
crucial for stereo disparity estimation. Left-right consistency check is an
effective way to enhance the disparity estimation by referring to the
information from the opposite view. However, the conventional left-right
consistency check is an isolated post-processing step and heavily hand-crafted.
This paper proposes a novel left-right comparative recurrent model to perform
left-right consistency checking jointly with disparity estimation. At each
recurrent step, the model produces disparity results for both views, and then
performs online left-right comparison to identify the mismatched regions which
probably contain erroneously labeled pixels. A soft attention mechanism is
introduced, which employs the learned error maps for better guiding the model
to selectively focus on refining the unreliable regions at the next recurrent
step. In this way, the generated disparity maps are progressively improved by
the proposed recurrent model. Extensive evaluations on KITTI 2015, Scene Flow
and Middlebury benchmarks validate the effectiveness of our model,
demonstrating that state-of-the-art stereo disparity estimation results can be
achieved by this new model.
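
For contrast with the learned approach, the conventional hand-crafted left-right consistency check can be sketched on a single scanline. The sketch assumes rectified images, integer disparities, and the common convention that left pixel x with disparity d corresponds to right pixel x - d; the function name and the tolerance of one pixel are illustrative choices, not taken from the paper.

```python
def left_right_mismatch(disp_left, disp_right, tol=1):
    """Flag left-view pixels whose disparity disagrees with the right view.

    disp_left, disp_right : integer disparities along one rectified scanline
    tol                   : largest disparity difference still accepted
    Returns a list of booleans, True where the pixel is considered unreliable.
    """
    width = len(disp_right)
    mismatched = []
    for x, d in enumerate(disp_left):
        xr = x - d  # matching column in the right view
        if xr < 0 or xr >= width:
            mismatched.append(True)  # match falls outside the image
        else:
            mismatched.append(abs(d - disp_right[xr]) > tol)
    return mismatched


# Hypothetical scanline: only the last pixel fails the check.
flags = left_right_mismatch([0, 1, 1, 3], [1, 1, 0, 0])
```

A post-processing pipeline would typically invalidate or re-estimate the flagged pixels, whereas the model described above learns the comparison and feeds it back into the next recurrent step.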

Recently, much progress has been made in image captioning, and an
encoder-decoder framework has achieved outstanding performance for this task.
In this paper, we propose an extension of the encoder-decoder framework by
adding a component called the guiding network. The guiding network models the
attribute properties of input images, and its output is leveraged to compose
the input of the decoder at each time step. The guiding network can be plugged
into the current encoder-decoder framework and trained in an end-to-end manner.
Hence, the guiding vector can be adaptively learned according to the signal
from the decoder, allowing it to embed information from both image and
language. Additionally, discriminative supervision can be employed to further
improve the quality of guidance. The advantages of our proposed approach are
verified by experiments carried out on the MS COCO dataset.