
This article pertains to the classification of multiple Schramm-Loewner
evolutions (SLE). We construct the pure partition functions of multiple
SLE$(\kappa)$ with $\kappa \in (0,4]$ and relate them to certain extremal
multiple SLE measures, thus verifying a conjecture from [BBK05, KP16]. We prove
that the two approaches to construct multiple SLEs (the global,
configurational construction of [KL07, Law09a] and the local, growth process
construction of [BBK05, Dub07, Gra07, KP16]) agree. The pure partition
functions are closely related to crossing probabilities in critical statistical
mechanics models. With explicit formulas in the special case of $\kappa = 4$,
we show that these functions give the connection probabilities for the level
lines of the Gaussian free field (GFF) with alternating boundary data. We also
show that certain functions, known as conformal blocks, give rise to multiple
SLE(4) that can be naturally coupled with the GFF with appropriate boundary
data.

Often when multiple labels are obtained for a training example it is assumed
that there is an element of noise that must be accounted for. It has been shown
that this disagreement can be considered signal instead of noise. In this work
we investigate using soft labels for training data to improve generalization in
machine learning models. However, using soft labels for training Deep Neural
Networks (DNNs) is not practical due to the costs involved in obtaining
multiple labels for large data sets. We propose soft label
memorization-generalization (SLMG), a fine-tuning approach to using soft labels
for training DNNs. We assume that differences in labels provided by human
annotators represent ambiguity about the true label instead of noise.
Experiments with SLMG demonstrate improved generalization performance on the
Natural Language Inference (NLI) task. Our experiments show that by injecting a
small percentage of soft label training data (0.03% of training set size) we
can improve generalization performance over several baselines.
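
The soft-label idea above can be illustrated with a minimal sketch. It is not
the SLMG procedure itself, whose details are not given here; the annotator
votes, class names, and model predictions below are hypothetical. The sketch
turns several human labels for one example into a probability distribution and
uses it as the target of a cross-entropy loss, so that a prediction matching
the annotators' disagreement is penalized less than an overconfident one.

```python
import math
from collections import Counter

def soft_label(votes, classes):
    """Turn a list of annotator votes into a probability distribution over classes."""
    counts = Counter(votes)
    return [counts[c] / len(votes) for c in classes]

def cross_entropy(target, predicted, eps=1e-12):
    """Cross-entropy between a target distribution (one-hot or soft) and a prediction."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(target, predicted))

classes = ["entailment", "neutral", "contradiction"]
# Hypothetical votes from five annotators for a single NLI example.
votes = ["entailment", "entailment", "neutral", "entailment", "neutral"]
soft = soft_label(votes, classes)

calibrated = [0.58, 0.40, 0.02]  # hypothetical prediction close to the vote shares
confident = [0.98, 0.01, 0.01]   # hypothetical overconfident prediction

loss_calibrated = cross_entropy(soft, calibrated)
loss_confident = cross_entropy(soft, confident)
print(soft, loss_calibrated, loss_confident)
```

With a one-hot target the overconfident prediction would score best; with the
soft target the ordering reverses, which is the sense in which annotator
disagreement acts as signal rather than noise.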

Interpreting the performance of deep learning models beyond test set accuracy
is challenging. Characteristics of individual data points are often not
considered during evaluation, and each data point is treated equally. We
examine the impact of a test set question's difficulty to determine if there is
a relationship between difficulty and performance. We model difficulty using
well-studied psychometric methods on human response patterns. Experiments on
Natural Language Inference (NLI) and Sentiment Analysis (SA) show that the
likelihood of answering a question correctly is impacted by the question's
difficulty. As DNNs are trained with more data, easy examples are learned more
quickly than hard examples.

Recent research has shown that motile cells can adapt their mode of
propulsion to the mechanical properties of the environment in which they find
themselves, crawling in some environments while swimming in others. The latter
can involve movement by blebbing or other cyclic shape changes, and both
highly simplified and more realistic models of these modes have been studied
previously. Herein we study swimming that is driven by membrane tension
gradients that arise from flows in the actin cortex underlying the membrane,
and does not involve imposed cyclic shape changes. Such gradients can lead to a
number of different characteristic cell shapes, and our first objective is to
understand how different distributions of membrane tension influence the shape
of cells in an inviscid quiescent fluid. We then analyze the effects of spatial
variation in other membrane properties, and how they interact with tension
gradients to determine the shape. We also study the effect of fluid-cell
interactions and show how tension leads to cell movement, how the balance
between tension gradients and a variable bending modulus determines the shape
and direction of movement, and how the efficiency of movement depends on the
properties of the fluid and the distribution of tension and bending modulus in
the membrane.

We apply the Rasmussen spectral sequence to prove that the
$\mathbb{Z}^3$-graded vector space structure of the HOMFLYPT homology over
$\mathbb{Z}_2$ detects unlinks. Our proof relies on a theorem of Batson and
Seed stating that the $\mathbb{Z}^2$-graded vector space structure of the
Khovanov homology over $\mathbb{Z}_2$ detects unlinks.

Broken symmetries in topological condensed matter systems have implications
for the spectrum of Fermionic excitations confined on surfaces or topological
defects. The Fermionic spectrum of confined (quasi-2D) $^3$He-A consists of
branches of chiral edge states. The negative energy states are related to the
ground-state angular momentum, $L_z = (N/2) \hbar$, for $N/2$ Cooper pairs. The
power law suppression of the angular momentum, $L_z(T) \simeq (N/2)\,\hbar\,[1
- \frac{2}{3}(\pi T/\Delta)^2 ]$ for $0 \le T \ll T_c$, in the fully gapped 2D
chiral A-phase reflects the thermal excitation of the chiral edge Fermions. We
discuss the effects of wave function overlap, and hybridization between edge
states confined near opposing surfaces on the edge currents, ground-state
angular momentum and ground-state order parameter. Under strong lateral
confinement, the chiral A phase undergoes a sequence of phase transitions,
first to a pair density wave (PDW) phase with broken translational symmetry at
$D_{c2} \approx 16 \xi_0$. The PDW phase is described by a periodic array of
chiral domains with alternating chirality, separated by domain walls. The
period of the PDW phase diverges as the confinement length $D\rightarrow D_{c2}$.
The PDW phase breaks time-reversal symmetry and translation invariance, but is
invariant under the combination of time-reversal and translation by a one-half
period of the PDW. The mass current distribution of the PDW phase reflects this
combined symmetry, and originates from the spectra of edge Fermions and the
chiral branches bound to the domain walls. Under sufficiently strong
confinement a second-order transition occurs to the non-chiral "polar phase" at
$D_{c1} \approx 9\xi_0$, in which a single p-wave orbital state of Cooper pairs
is aligned along the channel.

The verification of linearizability, a key correctness criterion for
concurrent objects, is based on trace refinement, whose checking is
PSPACE-complete. This paper suggests using \emph{branching} bisimulation
instead. Our approach is based on comparing an abstract specification in which
object methods are executed atomically to a real object program. Exploiting
divergence sensitivity, this also applies to progress properties such as
lock-freedom. These results enable the use of \emph{polynomial-time}
divergence-sensitive branching bisimulation checking techniques for verifying
linearizability and progress. We conducted experiments on concurrent
lock-free stacks to validate the efficiency and effectiveness of our methods.

Existing temporal relation (TempRel) annotation schemes often have low
inter-annotator agreement (IAA) even between experts, suggesting that the
current annotation task needs a better definition. This paper proposes a new
multi-axis modeling to better capture the temporal structure of events. In
addition, we identify that event endpoints are a major source of confusion in
annotation, so we also propose to annotate TempRels based on start-points only.
A pilot expert annotation using the proposed scheme shows significant
improvement in IAA from the conventional 60's to 80's (Cohen's Kappa). This
better-defined annotation scheme further enables the use of crowdsourcing to
alleviate the labor intensity for each annotator. We hope that this work can
foster more interesting studies towards event understanding.
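
For readers unfamiliar with the agreement statistic quoted above, Cohen's
Kappa compares the observed agreement $p_o$ between two annotators with the
agreement $p_e$ expected by chance from their label frequencies:
$\kappa = (p_o - p_e)/(1 - p_e)$. The sketch below is a plain implementation of
that textbook formula; the label sequences are hypothetical and are not taken
from the annotation study.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label sequences of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items on which the annotators agree.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the two annotators' marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if p_e == 1.0:
        # Degenerate case (both annotators always use the same single label);
        # kappa is formally undefined, and 1.0 is the convention chosen here.
        return 1.0
    return (p_o - p_e) / (1.0 - p_e)

# Hypothetical TempRel labels from two annotators on four event pairs.
annotator_1 = ["before", "before", "after", "after"]
annotator_2 = ["before", "before", "after", "before"]
kappa = cohens_kappa(annotator_1, annotator_2)
print(kappa)
```

Here the annotators agree on three of four items ($p_o = 0.75$) while chance
agreement is $p_e = 0.5$, so $\kappa = 0.5$.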

Seismic horizons are geologically significant surfaces that can be used for
building geology structure and stratigraphy models. However, horizon tracking
in 3D seismic data is a time-consuming and challenging problem. Relieving human
interpreters of tedious seismic interpretation is an active research topic. We
propose a novel automatic seismic horizon tracking method using a deep
convolutional neural network. We employ a state-of-the-art end-to-end semantic
segmentation method to track the seismic horizons automatically. Experimental
results show that our proposed neural network can automatically track multiple
horizons simultaneously. We validate the effectiveness and robustness of our
proposed method by comparing automatically tracked horizons with manually
picked horizons.

Extracting temporal relations (before, after, overlapping, etc.) is a key
aspect of understanding events described in natural language. We argue that
this task would gain from the availability of a resource that provides prior
knowledge in the form of the temporal order that events usually follow. This
paper develops such a resource (a probabilistic knowledge base acquired in
the news domain) by extracting temporal relations between events from New
York Times (NYT) articles over a 20-year span (1987-2007). We show that
existing temporal extraction systems can be improved via this resource. As a
byproduct, we also show that interesting statistics can be retrieved from this
resource, which can potentially benefit other time-aware tasks. The proposed
system and resource are both publicly available.

We present MorphNet, an approach to automate the design of neural network
structures. MorphNet iteratively shrinks and expands a network, shrinking via a
resource-weighted sparsifying regularizer on activations and expanding via a
uniform multiplicative factor on all layers. In contrast to previous
approaches, our method is scalable to large networks, adaptable to specific
resource constraints (e.g. the number of floating-point operations per
inference), and capable of increasing the network's performance. When applied
to standard network architectures on a wide variety of datasets, our approach
discovers novel structures in each domain, obtaining higher performance while
respecting the resource constraint.
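
The expansion step described above can be sketched as follows. This is an
illustration under stated assumptions, not MorphNet's implementation: the cost
model (multiply-accumulates of a chain of dense layers), the fixed input and
output widths, the search grid, and all function names are hypothetical. Given
layer widths left over from a shrinking phase, the sketch finds the largest
uniform width multiplier whose cost still fits the resource budget.

```python
def dense_flops(widths):
    """Toy cost model: multiply-accumulates of a chain of dense layers."""
    return sum(a * b for a, b in zip(widths, widths[1:]))

def scale_hidden(widths, factor):
    """Scale hidden-layer widths by `factor`; input and output widths stay fixed."""
    hidden = [max(1, int(w * factor)) for w in widths[1:-1]]
    return [widths[0]] + hidden + [widths[-1]]

def expand_to_budget(widths, budget, step=0.01):
    """Largest uniform multiplier (on a grid of `step`) whose cost fits the budget.

    If the unexpanded network already exceeds the budget, a factor of 1 is returned.
    """
    if len(widths) < 3:
        raise ValueError("need at least one hidden layer to expand")
    k = 0
    # Cost grows with the multiplier, so stop at the first grid point over budget.
    while dense_flops(scale_hidden(widths, 1 + (k + 1) * step)) <= budget:
        k += 1
    factor = 1 + k * step
    return factor, scale_hidden(widths, factor)

# Hypothetical widths after a shrinking phase, and a hypothetical budget.
shrunk = [10, 8, 12, 5]
budget = 700
factor, expanded = expand_to_budget(shrunk, budget)
print(factor, expanded, dense_flops(expanded))
```

Because shrinking removes capacity unevenly across layers, uniform
re-expansion yields a network with the same budget as the original but a
different allocation of width between layers.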

Deep latent-variable models learn representations of high-dimensional data in
an unsupervised manner. A number of recent efforts have focused on learning
representations that disentangle statistically independent axes of variation,
often by introducing suitable modifications of the objective function. We
synthesize this growing body of literature by formulating a generalization of
the evidence lower bound that explicitly represents the trade-offs between
sparsity of the latent code, bijectivity of representations, and coverage of
the support of the empirical data distribution. Our objective is also suitable
for learning hierarchical representations that disentangle blocks of variables
whilst allowing for some degree of correlations within blocks. Experiments on a
range of datasets demonstrate that learned representations contain
interpretable features, are able to learn discrete attributes, and generalize
to unseen combinations of factors.

Modern object detectors usually suffer from low accuracy, as
foregrounds are drowned out by large numbers of backgrounds and become hard
examples during training. Compared with proposal-based detectors, real-time
detectors are in far more serious trouble, since to achieve real-time rates
they renounce the region-proposing stage that is used to filter out a majority
of backgrounds. Though foregrounds as hard examples are in urgent need of being
mined from the backgrounds, a considerable number of state-of-the-art real-time
detectors, like the YOLO series, have yet to profit from existing hard example
mining methods, as using these methods requires detectors to satisfy a series
of prerequisites. In this paper, we propose a general hard example mining
method named Loss Rank Mining (LRM) to fill the gap. LRM is a general method
for real-time detectors, as it utilizes the final feature map, which exists in
all real-time detectors, to mine hard examples. By using LRM, some elements
representing easy examples in the final feature map are filtered out and
detectors are forced to concentrate on hard examples during training. Extensive
experiments validate the effectiveness of our method. With our method, the
improvements of the YOLOv2 detector on the auto-driving-related dataset KITTI
and the more general dataset PASCAL VOC are over 5% and 2% mAP, respectively.
In addition, LRM is the first hard example mining strategy that fits YOLOv2
perfectly and makes it better suited to real scenarios where both real-time rates and
accurate detection are strongly demanded.
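
The filtering idea behind LRM can be sketched in a few lines. This is only a
rough illustration of ranking feature-map elements by loss and keeping the
hardest ones; the full method is not specified above, and the function names,
the toy loss values, and the `keep` parameter are hypothetical.

```python
def loss_rank_mask(loss_map, keep):
    """Return a 0/1 mask selecting the `keep` highest-loss elements of a 2D loss map."""
    ranked = sorted(
        ((loss, r, c) for r, row in enumerate(loss_map) for c, loss in enumerate(row)),
        key=lambda item: item[0],
        reverse=True,
    )
    mask = [[0] * len(row) for row in loss_map]
    for _, r, c in ranked[:keep]:
        mask[r][c] = 1
    return mask

def masked_loss(loss_map, mask):
    """Sum only the losses of the selected (hard) elements."""
    return sum(
        loss * m
        for loss_row, mask_row in zip(loss_map, mask)
        for loss, m in zip(loss_row, mask_row)
    )

# Hypothetical per-element losses on a 2x3 final feature map.
loss_map = [
    [0.10, 2.00, 0.05],
    [0.30, 0.02, 1.50],
]
mask = loss_rank_mask(loss_map, keep=2)
total = masked_loss(loss_map, mask)
print(mask, total)
```

Only the masked elements would contribute gradients, so easy background
elements with small loss no longer dominate the update.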

As a universal quantum character of quantum correlation, the freezing
phenomenon is researched by geometry and quantum discord methods, respectively.
In this paper, the properties of R\'enyi discord are studied for two independent
dimer systems coupled to two correlated Fermi-spin environments under the
non-Markovian condition. We further demonstrate that the freezing behaviors
still exist for R\'enyi discord and study the effects of different parameters on
these behaviors.

In arXiv:math/0508510, Rasmussen observed that the Khovanov-Rozansky homology
of a link is a finitely generated module over the polynomial ring generated by
the components of this link. In the current paper, we study the module
structure of the middle HOMFLYPT homology, especially the Betti numbers of this
module. For each link, these Betti numbers are supported on a finite subset of
$\mathbb{Z}^4$. One can easily recover from these Betti numbers the Poincar\'e
polynomial of the middle HOMFLYPT homology. We explain why the Betti numbers
can be viewed as a generalization of the reduced HOMFLYPT homology of knots. As
an application, we prove that the projective dimension of the middle HOMFLYPT
homology is additive under split union of links and provides a new obstruction
to split links.

Caching popular files in small base stations (SBSs) has been proved to be an
effective way to reduce bandwidth pressure on the backhaul links of dense small
cell networks (DSCNs). Many existing studies on cache-enabled DSCNs attempt to
improve user experience by optimizing end-to-end file delivery delay. However,
under practical scenarios where files (e.g., video files) have diverse quality
of service requirements, energy consumption at SBSs should also be considered
from the network perspective. In this paper, we attempt to optimize these two
critical metrics in cache-enabled DSCNs. Firstly, we formulate the energy-delay
optimization problem as a Mixed Integer Programming (MIP) problem, where file
placement, user association and power control are jointly considered. To model
the trade-off relationship between energy consumption and end-to-end file
delivery delay, a utility function linearly combining these two metrics is used
as an objective function of the optimization problem. Then, we solve the
problem in two stages, i.e. caching stage and delivery stage, based on the
observation that caching is performed during off-peak time. At the caching
stage, a local popular file placement policy is proposed by estimating user
preference at each SBS. At the delivery stage, with given caching status at
SBSs, the MIP problem is further decomposed by Benders' decomposition method.
An efficient algorithm is proposed to approach the optimal association and
power solution by iteratively shrinking the gap of the upper and lower bounds.
Finally, extensive simulations are performed to validate our analytical and
algorithmic work. The results demonstrate that the proposed algorithms can
achieve the optimal trade-off between energy consumption and end-to-end file
delivery delay.

Multiphoton entanglement plays a critical role in quantum information
processing, and greatly improves our fundamental understanding of the quantum
world. Despite tremendous efforts in either bulk media or fiber-based devices,
nonlinear interactions in integrated circuits show great promise as an
excellent platform for photon pair generation with its high brightness,
stability and scalability \cite{Caspani2017}. Here, we demonstrate the
generation of bi- and multi-photon polarization entangled qubits in a single
silicon nanowire waveguide, and these qubits are directly compatible with dense
wavelength division multiplexing in telecommunication systems. Multiphoton
interference and quantum state tomography were used to characterize the quality
of the entangled states. Four-photon entangled states among two frequency
channels were ascertained with a fidelity of $0.78\pm0.02$. Our work realizes
the integrated multiphoton source in a relatively simple pattern and paves a
way for the revolution of multiphoton quantum science.

A high-intensity supersonic beam source has been a key component in studies
of molecular collisions, molecule-surface interaction, chemical reactions, and
precision spectroscopy. However, the molecular density available for
experiments in a downstream science chamber is limited by skimmer clogging,
which constrains the separation between a valve and a skimmer to at least
several hundred nozzle diameters. A recent experiment (Science Advances, 2017,
3, e1602258) has introduced a new strategy to address this challenge: when a
skimmer is cooled to a temperature below the freezing point of the carrier gas,
skimmer clogging can be effectively suppressed. We go beyond this
proof-of-principle work in several key ways. Firstly, we apply the skimmer
cooling approach to discharge-produced radical and metastable beams entrained
in a carrier gas. We also identify two different processes for skimmer clogging
mitigation: shockwave suppression at temperatures around the carrier gas
freezing point and diffusive clogging at even lower temperatures. With the
carrier clogging removed, we now fully optimize the production of entrained
species such as hydroxyl radicals, resulting in a gain of 30 in density over
the best commercial devices. The gain arises from both clogging mitigation and
favorable geometry with a much shorter valve-skimmer distance.

This article pertains to the classification of pairs of simple random curves
with conformal Markov property and symmetry. We give the complete
classification of such curves: conformal Markov property and symmetry single
out a two-parameter family of random curves, Hypergeometric SLE, denoted by
hSLE$_{\kappa}(\nu)$ for $\kappa\in (0,4]$ and $\nu<\kappa-6$. The proof relies
crucially on Dub\'edat's commutation relation [Dub07] and a uniqueness result
proved in [MS16b]. The classification indicates that hypergeometric SLE is the
only possible scaling limit of the interfaces in critical lattice models
(conjectured or proved to be conformal invariant) in topological rectangles
with alternating boundary conditions.
We also prove various properties of hSLE: continuity, reversibility,
target-independence, and conditional law characterization. As byproducts, we
give two applications of these properties. The first one is about the critical
Ising interfaces. We prove the convergence of the Ising interface in rectangles
with alternating boundary conditions. This result was first proved by Izyurov
in [Izy15], but our proof is new and is based on the properties of hSLE. The
second application is the existence of the so-called pure partition functions
of multiple SLEs. Such existence was proved for $\kappa\in (0,8)\setminus
\mathbb{Q}$ in [KP16], and it was later proved for $\kappa\in (0,4]$ in [PW17].
We give a new proof of the existence for $\kappa\in (0,6]$ using the properties
of hSLE.

Deep neural networks have enabled progress in a wide variety of applications.
Growing the size of the neural network typically results in improved accuracy.
As model sizes grow, the memory and compute requirements for training these
models also increase. We introduce a technique to train deep neural networks
using half-precision floating point numbers. In our technique, weights,
activations and gradients are stored in IEEE half-precision format.
Half-precision floating point numbers have limited numerical range compared to
single-precision numbers. We propose two techniques to handle this loss of
information. Firstly, we recommend maintaining a single-precision copy of the
weights that accumulates the gradients after each optimizer step. This
single-precision copy is rounded to half-precision format during training.
Secondly, we propose scaling the loss appropriately to handle the loss of
information with half-precision gradients. We demonstrate that this approach
works for a wide variety of models including convolutional neural networks,
recurrent neural networks and generative adversarial networks. This technique
works for large scale models with more than 100 million parameters trained on
large datasets. Using this approach, we can reduce the memory consumption of
deep learning models by nearly 2x. In future processors, we can also expect a
significant computation speedup using half-precision hardware units.
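
The two techniques above can be demonstrated numerically with a small sketch.
It is an illustration only, not a training implementation: half precision is
emulated by round-tripping through Python's `struct` format `e` (IEEE binary16),
Python floats (double precision) stand in for the single-precision master copy,
and the gradient value, loss scale, and update size are hypothetical. Scaling
the loss scales every gradient by the same factor, so the sketch applies the
scale to the gradient directly.

```python
import struct

def to_fp16(x):
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

# Technique 2: loss scaling keeps tiny gradients from underflowing in half precision.
tiny_grad = 1e-8      # hypothetical gradient, below the smallest fp16 subnormal
loss_scale = 1024.0   # hypothetical scale factor
unscaled = to_fp16(tiny_grad)                             # flushes to zero
recovered = to_fp16(tiny_grad * loss_scale) / loss_scale  # unscale in higher precision

# Technique 1: a higher-precision master copy accumulates updates that are
# too small to change a half-precision weight.
update = 1e-4         # hypothetical learning rate times gradient
w_fp16_only = to_fp16(1.0)
w_master = 1.0
for _ in range(10):
    w_fp16_only = to_fp16(w_fp16_only - update)  # each update is rounded away
    w_master -= update                           # master copy keeps the change
w_fp16_from_master = to_fp16(w_master)           # rounded copy used for compute

print(unscaled, recovered)
print(w_fp16_only, w_master, w_fp16_from_master)
```

Near 1.0 the spacing between adjacent half-precision values is about
$4.9 \times 10^{-4}$, so ten updates of $10^{-4}$ are each lost when applied
directly in half precision but survive when accumulated in the master copy.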

We give a simplified and complete proof of the convergence of the chordal
exploration process in critical FK-Ising percolation to chordal
SLE$_\kappa(\kappa-6)$ with $\kappa=16/3$. Our proof follows the classical
excursion-construction of SLE$_\kappa(\kappa-6)$ processes in the continuum and
we are thus led to introduce suitable cut-off stopping times in order to
analyse the behaviour of the driving function of the discrete system when
Dobrushin boundary conditions collapse to a single point. Our proof is very
different from [KS15, KS16] as it only relies on the convergence to the chordal
SLE$_{\kappa}$ process in Dobrushin boundary conditions and does not require
the introduction of a new observable. Still, it relies crucially on several
ingredients:
a) the powerful topological framework developed in [KS17] as well as its
follow-up paper [CDCH$^+$14],
b) the strong RSW Theorem from [CDCH16],
c) Appendix A of [BC16], which inspired the proof.
One important emphasis of this paper is to carefully write down some
properties which are often considered {\em folklore} in the literature but
which are only justified so far by hand-waving arguments. The main examples of
these are:
1) the convergence of natural discrete stopping times to their continuous
analogues. (The usual hand-waving argument destroys the spatial Markov
property).
2) the fact that the discrete spatial Markov property is preserved in the
scaling limit. (The enemy being that $\mathbb{E}[X_n \,|\, Y_n]$ does not
necessarily converge to $\mathbb{E}[X \,|\, Y]$ when $(X_n,Y_n)\to (X,Y)$).
We end the paper with a detailed sketch of the convergence to radial
SLE$_\kappa(\kappa-6)$ when $\kappa=16/3$ as well as the derivation of
Onsager's one-arm exponent $1/8$.

Estimating the travel time of a path is of great importance to smart urban
mobility. Existing approaches are either based on estimating the time cost of
each road segment, and thus unable to capture many complex cross-segment
factors, or designed heuristically in a non-learning-based way, and thus fail to
utilize the abundant existing temporal labels of the data, i.e., the time stamp
of each trajectory point. In this paper, we leverage recent developments in deep
neural networks and propose a novel auxiliary supervision model, namely
DeepTravel, that can automatically and effectively extract different features,
as well as make full use of the temporal labels of the trajectory data. We have
conducted comprehensive experiments on real datasets to demonstrate the
superior performance of DeepTravel over existing approaches.

Magnetic insulators (MIs) attract tremendous interest for spintronic
applications due to low Gilbert damping and absence of Ohmic loss. Magnetic
order of MIs can be manipulated and even switched by spin-orbit torques (SOTs)
generated through the spin Hall effect and Rashba-Edelstein effect in heavy
metal/MI bilayers. SOTs on MIs are more intriguing than those on magnetic metals since
SOTs cannot be transferred to MIs through direct injection of electron spins.
Understanding of SOTs on MIs remains elusive, especially how SOTs scale with
the film thickness. Here, we observe the critical role of dimensionality on the
SOT efficiency by systematically studying the MI layer thickness dependent SOT
efficiency in tungsten/thulium iron garnet (W/TmIG) bilayers. We first show
that the TmIG thin film evolves from two-dimensional to three-dimensional
magnetic phase transitions as the thickness increases, due to the suppression
of long-wavelength thermal fluctuation. Then, we report the significant
enhancement of the measured SOT efficiency as the thickness increases. We
attribute this effect to the increase of the magnetic moment density in concert
with the suppression of thermal fluctuations. Finally, we demonstrate the
current-induced SOT switching in the W/TmIG bilayers with a TmIG thickness up
to 15 nm. The switching current density is comparable with those of heavy
metal/ferromagnetic metal cases. Our findings shed light on the understanding
of SOTs in MIs, which is important for the future development of ultrathin
MI-based low-power spintronics.

This article focuses on the characterization of global multiple
Schramm-Loewner evolutions (SLE). The chordal SLE process describes the scaling
limit of a single interface in various critical lattice models with Dobrushin
boundary conditions, and similarly, global multiple SLEs describe scaling
limits of collections of interfaces in critical lattice models with alternating
boundary conditions. In this article, we give a minimal set of
characterizing properties for the global multiple SLEs: we prove that there
exists a unique probability measure on collections of pairwise disjoint
continuous simple curves with a certain conditional law property. As a
consequence, we obtain the convergence of multiple interfaces in the critical
Ising and FK-Ising models.

Schramm-Loewner Evolution (SLE) is a one-parameter family of random planar
curves introduced by Oded Schramm in 1999 as the candidates for the scaling
limits of the interfaces in the planar critical lattice models. This is the
only possible process with conformal invariance and a certain "domain Markov
property". In 2010, Chelkak and Smirnov proved the conformal invariance of the
scaling limits of the critical planar FK-Ising model which gave the convergence
of the interface to SLE$_{16/3}$. We derive the arm exponents of SLE$_{\kappa}$
for $\kappa\in (4,8)$. Combining with the convergence of the interface, we
derive the arm exponents of the critical FK-Ising model. We obtain six
different patterns of boundary arm exponents and three different patterns of
interior arm exponents of the critical planar FK-Ising model on the square
lattice.