
Internet companies face the need to handle large-scale machine
learning applications on a daily basis, and a distributed system that can handle
extra-large-scale tasks is needed. Deep forest is a recently proposed deep
learning framework which uses tree ensembles as its building blocks and it has
achieved highly competitive results on various domains of tasks. However, it
has not been tested on extremely large scale tasks. In this work, based on our
parameter server system and platform of artificial intelligence, we developed
the distributed version of deep forest with an easy-to-use GUI. To the best of
our knowledge, this is the first implementation of distributed deep forest. To
meet the needs of real-world tasks, many improvements are introduced to the
original deep forest model. We tested the deep forest model on an extra-large-scale
task, i.e., automatic detection of cash-out fraud, with more than 100
million training samples. Experimental results showed that the deep forest
model has the best performance according to the evaluation metrics from
different perspectives even with very little effort for parameter tuning. This
model can block fraudulent transactions involving a large amount of money\footnote{Details
are business confidential.} each day. Even compared with the best deployed model,
the deep forest model brings an additional significant decrease in
economic loss.

This paper presents a new signal processing method based on Complex Bandpass
Filtering (CBF) applied to the Coriolis Mass Flowmeter (CMF). CBF can be
utilized to suppress the negative frequency component of each sensor signal to
produce the corresponding analytic form with reduced tracking delay. Further
processing of the analytic form yields the amplitude, frequency, phase and
phase difference of the sensor signals. In comparison with previously published
methods, CBF offers short delay, high noise suppression, high accuracy and low
computational cost. A reduced delay is useful in CMF signal processing,
especially for maintaining flowtube oscillation in two-/multi-phase flow
conditions. The central frequency and the frequency range of the CBF method are
selectable so that they can be customized for different flowtube designs.
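
The idea of suppressing the negative frequency component can be illustrated with a minimal sketch. It is not the filter design of the paper: the Hann-windowed prototype, the tap count and all function names below are assumptions chosen for illustration. A real lowpass prototype is shifted to the positive centre frequency, so that filtering a real sensor signal yields an approximately analytic signal from which amplitude and phase difference follow.

```python
import cmath
import math


def complex_bandpass_taps(num_taps, center_hz, sample_rate_hz):
    """Hann-windowed lowpass prototype, frequency-shifted to +center_hz (assumed design)."""
    mid = (num_taps - 1) / 2.0
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * n / (num_taps - 1))
              for n in range(num_taps)]
    gain = sum(window)  # normalise so the gain magnitude at center_hz is 1
    return [(w / gain) * cmath.exp(2j * math.pi * center_hz * (n - mid) / sample_rate_hz)
            for n, w in enumerate(window)]


def apply_fir(taps, samples):
    """Direct-form FIR filtering; returns only the fully overlapped outputs."""
    k = len(taps)
    return [sum(taps[i] * samples[n - i] for i in range(k))
            for n in range(k - 1, len(samples))]


def amplitude(analytic_sample):
    # Only the positive-frequency half survives, so |y| is half the real amplitude.
    return 2.0 * abs(analytic_sample)


def phase_difference(analytic_a, analytic_b):
    # Angle of a * conj(b): the common rotation and the filter delay cancel.
    return cmath.phase(analytic_a * analytic_b.conjugate())
```

The centre frequency and the bandwidth (set here by the tap count) are free parameters of the sketch, which mirrors the selectability mentioned in the abstract; frequency could be estimated from the phase increment between consecutive outputs.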

Schwinn et al. (2017) have recently compared the abundance and distribution
of massive substructures identified in a gravitational lensing analysis of
Abell 2744 by Jauzac et al. (2016) with N-body simulations and found no cluster
in the {\Lambda}CDM simulation that is similar to Abell 2744. Schwinn et al. (2017)
identified the measured projected aperture masses with the actual masses
associated with subhaloes in the MXXL N-body simulation. We have used the
high-resolution Phoenix cluster simulations to show that such an identification is
incorrect: the aperture mass is dominated by mass in the body of the cluster
that happens to be projected along the line of sight to the subhalo. This
enhancement varies from factors of a few to factors of more than 100,
particularly for subhaloes projected near the centre of the cluster. We
calculate aperture masses for subhaloes in our simulation and compare them to
the measurements for Abell 2744. We find that the data for Abell 2744 are in
excellent agreement with the matched predictions from {\Lambda}CDM. We provide
further predictions for aperture mass functions of subhaloes in idealized
surveys with varying mass detection thresholds.

Laser-cooled lanthanide atoms are ideal candidates with which to study strong
and unconventional quantum magnetism with exotic phases. Here, we use
state-of-the-art closed-coupling simulations to model quantum magnetism for
pairs of ultracold spin-6 erbium lanthanide atoms placed in a deep optical
lattice. In contrast to the widely used single-channel Hubbard model
description of atoms and molecules in an optical lattice, we focus on the
single-site multichannel spin evolution due to spin-dependent contact,
anisotropic van der Waals, and dipolar forces. This has allowed us to identify
the leading mechanism, orbital anisotropy, that governs molecular spin dynamics
among erbium atoms. The large magnetic moment and combined orbital angular
momentum of the 4f-shell electrons are responsible for these strong anisotropic
interactions and unconventional quantum magnetism. Multichannel simulations of
magnetic Cr atoms under similar trapping conditions show that their
spin evolution is controlled by spin-dependent contact interactions that are
distinct in nature from the orbital anisotropy in Er. The roles of an external
magnetic field and the aspect ratio of the lattice site in spin dynamics are
also investigated.

In this paper, we explore the encoding/pooling layer and loss function in the
end-to-end speaker and language recognition system. First, a unified and
interpretable end-to-end system for both speaker and language recognition is
developed. It accepts variable-length input and produces an utterance-level
result. In the end-to-end system, the encoding layer aggregates
the variable-length input sequence into an utterance-level
representation. Besides the basic temporal average pooling, we introduce a
self-attentive pooling layer and a learnable dictionary encoding layer to get
the utterance-level representation. In terms of the loss function for open-set
speaker verification, to get more discriminative speaker embeddings, center loss
and angular softmax loss are introduced in the end-to-end system. Experimental
results on the VoxCeleb and NIST LRE 07 datasets show that the performance of
the end-to-end learning system can be significantly improved by the proposed
encoding layer and loss function.
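
How an encoding layer turns a variable-length sequence into one fixed-size vector can be sketched as follows. This is a simplified illustration, not the architecture of the paper: the attention scoring uses a single hypothetical vector `attention_vector` instead of a learned network, and frames are plain lists of floats.

```python
import math


def temporal_average_pooling(frames):
    """Mean over time of frame-level features: one vector per utterance."""
    dim = len(frames[0])
    return [sum(frame[d] for frame in frames) / len(frames) for d in range(dim)]


def self_attentive_pooling(frames, attention_vector):
    """Softmax-weighted mean over time; weights come from a dot-product score."""
    scores = [sum(a * x for a, x in zip(attention_vector, frame)) for frame in frames]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(frames[0])
    return [sum(w * frame[d] for w, frame in zip(weights, frames)) for d in range(dim)]
```

Both functions return a vector whose length depends only on the feature dimension, not on the number of frames. With an all-zero attention vector the weights are uniform and the attentive pooling reduces to the temporal average.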

In this paper, localized information privacy (LIP) is proposed, as a new
privacy definition, which allows statistical aggregation while protecting
users' privacy without relying on a trusted third party. The notion of
context-awareness is incorporated in LIP by the introduction of priors, which
enables the design of privacy-preserving data aggregation with knowledge of
priors. We show that LIP relaxes the Localized Differential Privacy (LDP)
notion by explicitly modeling the adversary's knowledge. However, it is
stricter than $2\epsilon$-LDP and $\epsilon$-mutual information privacy. The
incorporation of local priors allows LIP to achieve higher utility compared to
other approaches. We then present an optimization framework for
privacy-preserving data aggregation, with the goal of minimizing the expected
squared error while satisfying the LIP privacy constraints. Utility-privacy
tradeoffs are obtained under several models in closed form. We then validate
our analysis by numerical analysis using both synthetic and real-world data.
Results show that our LIP mechanism provides better utility-privacy tradeoffs
than LDP, and when the prior is not uniformly distributed, the advantage of LIP
is even more significant.

A novel interpretable end-to-end learning scheme for language identification
is proposed. It is in line with the classical GMM i-vector methods both
theoretically and practically. In the end-to-end pipeline, a general encoding
layer is employed on top of the front-end CNN, so that it can encode the
variable-length input sequence into an utterance-level vector automatically.
After comparing with the state-of-the-art GMM i-vector methods, we give
insights into the CNN, and reveal its role and effect in the whole pipeline. We
further introduce a general encoding layer, illustrating why such layers
might be appropriate for language identification. We elaborate on several
typical encoding layers, including a temporal average pooling layer, a
recurrent encoding layer and a novel learnable dictionary encoding layer. We
conducted experiments on the NIST LRE07 closed-set task, and the results show that
our proposed end-to-end systems achieve state-of-the-art performance.

A novel learnable dictionary encoding layer is proposed in this paper for
end-to-end language identification. It is in line with the conventional GMM
i-vector approach both theoretically and practically. We imitate the mechanism
of the traditional GMM training and supervector encoding procedure on top of
the CNN. The proposed layer can accumulate high-order statistics from a
variable-length input sequence and generate an utterance-level
fixed-dimensional vector representation. Unlike the conventional methods, our
new approach provides an end-to-end learning framework, where the inherent
dictionaries are learned directly from the loss function. The dictionaries and
the encoding representation for the classifier are learned jointly. The
representation is orderless and therefore appropriate for language
identification. We conducted a preliminary experiment on the NIST LRE07 closed-set
task, and the results reveal that our proposed dictionary encoding layer
achieves significant error reduction compared with simple average pooling.

In this paper, the $K$-user interference channel with secrecy constraints is
considered with delayed channel state information at transmitters (CSIT). We
propose a novel secure retrospective interference alignment scheme in which the
transmitters carefully mix information symbols with artificial noises to ensure
confidentiality. Achieving positive secure degrees of freedom (SDoF) is
challenging due to the delayed nature of CSIT, and the distributed nature of
the transmitters. Our scheme works over two phases: phase one in which each
transmitter sends information symbols mixed with artificial noises, and repeats
such transmission over multiple rounds. In the next phase, each transmitter
uses delayed CSIT of the previous phase and sends a function of the net
interference and artificial noises (generated in previous phase), which is
simultaneously useful for all receivers. These phases are designed to ensure
the decodability of the desired messages while satisfying the secrecy
constraints. We present our achievable scheme for three models, namely: 1)
$K$-user interference channel with confidential messages (IC-CM), and we show
that $\frac{1}{2} (\sqrt{K} - 6)$ SDoF is achievable, 2) $K$-user interference
channel with an external eavesdropper (IC-EE), and 3) $K$-user IC with
confidential messages and an external eavesdropper (IC-CM-EE). We show that for
the $K$-user IC-EE, $\frac{1}{2} (\sqrt{K} - 3)$ SDoF is achievable, and for
the $K$-user IC-CM-EE, $\frac{1}{2} (\sqrt{K} - 6)$ is achievable. To the best
of our knowledge, this is the first result on the $K$-user interference channel
with secrecy constrained models and delayed CSIT that achieves an SDoF which
scales with $K$, the number of users.

The entanglement detection via local measurements can be experimentally
implemented. Based on mutually unbiased measurements and general symmetric
informationally complete positive-operator-valued measures, we present
separability criteria for bipartite quantum states, which, by theoretical
analysis, are stronger than the related existing criteria via these
measurements. Two detailed examples are supplemented to show the efficiency of
the presented separability criteria.

The ultrafine entanglement witness, introduced in [F. Shahandeh, M.
Ringbauer, J.C. Loredo, and T.C. Ralph, Phys. Rev. Lett. \textbf{118}, 110502
(2017)], can seamlessly and easily improve any standard entanglement witness.
In this paper, by combining the constraint and the test operators, we rotate
the hyperplane determined by the test operator and improve further the original
ultrafine entanglement witness. In particular, we present a series of new
ultrafine entanglement witnesses, which not only can detect entangled states
that the original ultrafine entanglement witnesses cannot detect, but also have
the merits that the original ultrafine entanglement witnesses have.

Sentiment analysis is a key component in various text mining applications.
Numerous sentiment classification techniques, including conventional and deep
learning-based methods, have been proposed in the literature. In most existing
methods, a high-quality training set is assumed to be given. Nevertheless,
constructing a high-quality training set that consists of highly accurate
labels is challenging in real applications. This difficulty stems from the fact
that text samples usually contain complex sentiment representations, and their
annotation is subjective. We address this challenge in this study by leveraging
a new labeling strategy and utilizing a two-level long short-term memory
network to construct a sentiment classifier. Lexical cues are useful for
sentiment analysis, and they have been utilized in conventional studies. For
example, polar and privative words play important roles in sentiment analysis.
A new encoding strategy, that is, $\rho$-hot encoding, is proposed to alleviate
the drawbacks of one-hot encoding and thus effectively incorporate useful
lexical cues. We compile three Chinese data sets on the basis of our labeling
strategy and proposed methodology. Experiments on the three data sets
demonstrate that the proposed method outperforms state-of-the-art algorithms.

This paper develops a randomized approach for incrementally building deep
neural networks, where a supervisory mechanism is proposed to constrain the
random assignment of the weights and biases, and all the hidden layers have
direct links to the output layer. A fundamental result on the universal
approximation property is established for such a class of randomized learner
models, namely deep stochastic configuration networks (DeepSCNs). A learning
algorithm is presented to implement DeepSCNs with either specific architecture
or self-organization. The readout weights attached to all direct links from
each hidden layer to the output layer are evaluated by the least squares
method. Given a set of training examples, DeepSCNs can speedily produce a
learning representation, that is, a collection of random basis functions with
the cascaded inputs together with the readout weights. An empirical study on a
function approximation is carried out to demonstrate some properties of the
proposed deep learner model.

Multiphoton entanglement plays a critical role in quantum information
processing, and greatly improves our fundamental understanding of the quantum
world. Despite tremendous efforts in either bulk media or fiber-based devices,
nonlinear interactions in integrated circuits show great promise as an
excellent platform for photon pair generation with its high brightness,
stability and scalability \cite{Caspani2017}. Here, we demonstrate the
generation of bi- and multiphoton polarization-entangled qubits in a single
silicon nanowire waveguide, and these qubits are directly compatible with the dense
wavelength division multiplexing in telecommunication systems. Multiphoton
interference and quantum state tomography were used to characterize the quality
of the entangled states. Four-photon entangled states among two frequency
channels were ascertained with a fidelity of $0.78\pm0.02$. Our work realizes
the integrated multiphoton source in a relatively simple pattern and paves the
way for the revolution of multiphoton quantum science.

Ergodic quantum systems are often quite alike, whereas nonergodic, fractal
systems are unique and display characteristic properties. We explore one of
these fractal systems, weakly bound dysprosium lanthanide molecules, in an
external magnetic field. As recently shown, colliding ultracold magnetic
dysprosium atoms display a soft chaotic behavior with a small degree of
disorder. We broaden this classification by investigating the generalized
inverse participation ratio and fractal dimensions for large sets of molecular
wave functions. Our exact close-coupling simulations reveal a dynamic phase
transition from partially localized states to totally delocalized states and
universality in its distribution by increasing the magnetic field strength to
only a hundred Gauss (or 10 mT). Finally, we prove the existence of a nonergodic
delocalized phase in the system and explain the violation of ergodicity by
strong coupling between near-threshold molecular states and the nearby
continuum.
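
The generalized inverse participation ratio and a finite-size fractal dimension can be computed from a list of wave-function amplitudes as sketched below. The definitions used are the common ones, $\mathrm{IPR}_q = \sum_i |c_i|^{2q}$ and $D_q = -\ln \mathrm{IPR}_q / ((q-1)\ln N)$; whether the paper uses exactly this normalization is an assumption, and the function names are illustrative.

```python
import math


def generalized_ipr(amplitudes, q=2.0):
    """Sum of |c_i|^(2q) after normalising the state so that sum |c_i|^2 = 1."""
    norm = sum(abs(c) ** 2 for c in amplitudes)
    return sum((abs(c) ** 2 / norm) ** q for c in amplitudes)


def fractal_dimension_estimate(amplitudes, q=2.0):
    """Finite-size estimate D_q = -ln(IPR_q) / ((q - 1) * ln N), for q != 1."""
    n = len(amplitudes)
    return -math.log(generalized_ipr(amplitudes, q)) / ((q - 1.0) * math.log(n))
```

In this convention a state localized on a single basis function gives an estimate near 0, a state spread uniformly over all $N$ basis functions gives an estimate near 1, and intermediate values indicate fractal (nonergodic) structure.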

Computer poetry generation is our first step towards computer writing.
Writing must have a theme. The current approaches of using sequence-to-sequence
models with attention often produce non-thematic poems. We present a novel
conditional variational autoencoder with a hybrid decoder adding the
deconvolutional neural networks to the general recurrent neural networks to
fully learn topic information via latent variables. This approach significantly
improves the relevance of the generated poems by representing each line of the
poem not only in a context-sensitive manner but also in a holistic way that is
highly related to the given keyword and the learned topic. A proposed augmented
word2vec model further improves the rhythm and symmetry. Tests show that the
poems generated by our approach mostly satisfy the regulated rules and have
consistent themes, and 73.42% of them receive an Overall score no less than 3
(the highest score is 5).

This paper addresses the deep face recognition (FR) problem under the open-set
protocol, where ideal face features are expected to have smaller maximal
intra-class distance than minimal inter-class distance under a suitably chosen
metric space. However, few existing algorithms can effectively achieve this
criterion. To this end, we propose the angular softmax (A-Softmax) loss that
enables convolutional neural networks (CNNs) to learn angularly discriminative
features. Geometrically, A-Softmax loss can be viewed as imposing
discriminative constraints on a hypersphere manifold, which intrinsically
matches the prior that faces also lie on a manifold. Moreover, the size of
angular margin can be quantitatively adjusted by a parameter $m$. We further
derive specific $m$ to approximate the ideal feature criterion. Extensive
analysis and experiments on Labeled Faces in the Wild (LFW), YouTube Faces (YTF)
and MegaFace Challenge show the superiority of A-Softmax loss in FR tasks. The
code has also been made publicly available.
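
The role of the margin parameter $m$ can be made concrete with the piecewise function commonly given for A-Softmax, $\psi(\theta) = (-1)^k \cos(m\theta) - 2k$ for $\theta \in [k\pi/m, (k+1)\pi/m]$, which replaces $\cos\theta$ in the target-class logit. The abstract itself does not spell this formula out, so the sketch below should be read as an illustration under that assumption; the function name is hypothetical.

```python
import math


def angular_margin_logit(theta, m):
    """psi(theta) = (-1)^k * cos(m * theta) - 2k on [k*pi/m, (k+1)*pi/m]."""
    if not 0.0 <= theta <= math.pi:
        raise ValueError("theta must lie in [0, pi]")
    # Index of the angular segment containing theta, clamped at theta = pi.
    k = min(int(m * theta / math.pi), m - 1)
    return (-1) ** k * math.cos(m * theta) - 2 * k
```

For $m = 1$ the function is the plain cosine. For larger $m$ it decreases monotonically over $[0, \pi]$ and never exceeds $\cos\theta$, so the target class must reach a smaller angle to obtain the same logit, which is what enforces the angular margin.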

Soliton molecules are the manifestation of attractive and repulsive
interaction between optical pulses mediated by a nonlinear medium. However, the
formation and breakup of soliton molecules are difficult to observe due to the
transient nature of the process. Using the time-stretch technique, we have been
able to track the real-time evolution of the bound state formation in a
femtosecond fiber laser and unveil different evolution paths towards a stable
bound state. A nonstationary Q-switched mode-locking regime consisting of
various transient solitons is observed in this transition period. We also
observe additional dynamics including soliton molecule vibrations, collision
and decay. Our findings uncover a diverse set of soliton dynamics in an optical
nonlinear system and provide valuable data for further theoretical studies.

In this paper, we first establish an equivalence theorem of Minkowski spaces
by using results in centroaffine differential geometry. As an application in
Finsler geometry, we give some new characterizations of Berwald spaces.

We present a lower bound of concurrence for four-partite systems in terms of
the concurrence for $M$-part $(2\leq M\leq 3)$ quantum systems and give an
analytical lower bound for $2\otimes2\otimes2\otimes2$ mixed quantum states. It
is shown that these lower bounds are able to improve the existing bounds and
detect entanglement better. Furthermore, our approach can be generalized to
multipartite quantum systems.

Liquid chromatography with tandem mass spectrometry (LC-MS/MS) based
proteomics is a well-established research field with major applications such as
identification of disease biomarkers, drug discovery, drug design and
development. In proteomics, protein identification and quantification is a
fundamental task, which is done by first enzymatically digesting the protein into
peptides, and then analyzing the peptides with LC-MS/MS instruments. Peptide
feature detection and quantification from an LC-MS map is the first step in
typical analysis workflows. In this paper we propose a novel deep learning
based model, DeepIso, that uses Convolutional Neural Networks (CNNs) to scan an
LC-MS map to detect peptide features and estimate their abundance. Existing
tools are often designed with limited engineered features based on domain
knowledge, and depend on pretrained parameters which are hardly updated despite
the huge amount of newly arriving proteomic data. Our proposed model, on the other
hand, is capable of learning multiple levels of representation of
high-dimensional data through its many layers of neurons and of continuously evolving
with newly acquired data. To evaluate our proposed model, we use an antibody
dataset including a heavy and a light chain, each digested by Asp-N,
Chymotrypsin, and Trypsin, thus giving six LC-MS maps for the experiment. Our model
achieves 93.21% sensitivity with specificity of 99.44% on this dataset. Our
results demonstrate that novel deep learning tools are desirable to advance the
state of the art in protein identification and quantification.

In high-energy heavy-ion collisions, the degrees of freedom at the very early
stage can be effectively represented by strong classical gluonic fields within
the Color Glass Condensate framework. As the system expands, the strong gluonic
fields eventually become weak such that an equivalent description using the
gluonic particle degrees of freedom starts to become valid. We revisit the
spectrum of these gluonic particles by solving the classical Yang-Mills
equations semi-analytically with the solutions having the form of power series
expansions in the proper time. We propose a different formula for the gluon
spectrum which is consistent with energy density during the whole time
evolution. We find that the chromoelectric fields have larger contributions to
the gluon spectrum than the chromomagnetic fields do. Furthermore, the large
momentum modes take less time to reach the weak-field regime while smaller
momentum modes take more time. The resulting functional form of the gluon
spectrum is exponential in nature and the spectrum is close to a thermal
distribution with effective temperatures around $0.6$ to $0.9\, Q_s$ late in the
Glasma evolution. The sensitivity of the gluon spectrum to the infrared and
the ultraviolet cutoffs is discussed.

Millimeter wave (mmWave) communications have been considered as a key
technology for next generation cellular systems and Wi-Fi networks because of
their advantage in providing orders-of-magnitude wider bandwidth than current
wireless networks. Economical and energy-efficient analog/digital hybrid
precoding and combining transceivers have often been proposed for mmWave
massive multiple-input multiple-output (MIMO) systems to overcome the severe
propagation loss of mmWave channels. One major shortcoming of existing
solutions lies in the assumption of infinite or high-resolution phase shifters
(PSs) to realize the analog beamformers. However, low-resolution PSs are
typically adopted in practice to reduce the hardware cost and power
consumption. Motivated by this fact, in this paper, we investigate the
practical design of hybrid precoders and combiners with low-resolution PSs in
mmWave MIMO systems. In particular, we propose an iterative algorithm which
successively designs the low-resolution analog precoder and combiner pair for
each data stream, aiming at conditionally maximizing the spectral efficiency.
Then, the digital precoder and combiner are computed based on the obtained
effective baseband channel to further enhance the spectral efficiency. In an
effort to achieve an even more hardware-efficient large antenna array, we also
investigate the design of hybrid beamformers with one-bit resolution (binary)
PSs, and present a novel binary analog precoder and combiner optimization
algorithm with quadratic complexity in the number of antennas. The proposed
low-resolution hybrid beamforming design is further extended to multiuser MIMO
communication systems. Simulation results demonstrate the performance
advantages of the proposed algorithms compared to existing low-resolution
hybrid beamforming designs, particularly for the one-bit resolution PS
scenario.
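
What a low-resolution phase shifter constraint means for an analog beamformer entry can be shown with a small sketch. It only illustrates the element-wise constraint (unit modulus, phase restricted to $2^B$ levels) by naive nearest-phase rounding; it is not the iterative design or the binary optimization algorithm of the paper, and the function names are hypothetical.

```python
import cmath
import math


def quantize_phase_shifter(coefficient, bits):
    """Snap one beamformer entry to the nearest of 2**bits unit-modulus phases."""
    levels = 2 ** bits
    step = 2.0 * math.pi / levels
    index = round(cmath.phase(coefficient) / step) % levels
    return cmath.exp(1j * step * index)


def quantize_beamformer(weights, bits):
    """Apply the phase quantization independently to every antenna weight."""
    return [quantize_phase_shifter(w, bits) for w in weights]
```

With `bits=1` every entry becomes either +1 or -1 (up to rounding error), which is the binary PS case mentioned above. Rounding each entry independently is generally suboptimal for spectral efficiency, which is the motivation for a dedicated design algorithm.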

In cognitive radio networks (CRNs), dynamic spectrum access has been proposed
to improve the spectrum utilization, but it also generates spectrum misuse
problems. One common solution to these problems is to deploy monitors to detect
misbehaviors on certain channels. However, in multichannel CRNs, it is very
costly to deploy monitors on every channel. With a limited number of monitors,
we have to decide which channels to monitor. In addition, we need to determine
how long to monitor each channel and in which order to monitor, because
switching channels incurs costs. Moreover, the information about the misuse
behavior is not available a priori. To answer those questions, we model the
spectrum monitoring problem as an adversarial multi-armed bandit problem with
switching costs (MAB-SC), propose an effective framework, and design two online
algorithms, SpecWatch-II and SpecWatch-III, based on the same framework. To
evaluate the algorithms, we use weak regret, i.e., the performance difference
between the solution of our algorithm and the optimal (fixed) solution in
hindsight, as the metric. We prove that the expected weak regret of
SpecWatch-II is O(T^{2/3}), where T is the time horizon, whereas the actual
weak regret of SpecWatch-III is O(T^{2/3}) with probability 1 - {\delta}, for
any {\delta} in (0, 1). Both algorithms guarantee upper bounds matching the
lower bound of the general adversarial MAB-SC problem. Therefore, both are
asymptotically optimal.
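
The weak-regret metric can be computed after the fact from a reward table, as in the sketch below. It assumes a single monitor, rewards given as one list per round, and a fixed cost charged to the learner for every channel switch; the exact cost model of the paper is not stated in the abstract, so these are illustrative assumptions, as is the function name.

```python
def weak_regret(rewards, chosen, switching_cost=0.0):
    """Best fixed channel's total reward minus the learner's net reward.

    rewards[t][k] is the reward of channel k in round t.
    chosen[t] is the channel the learner monitored in round t.
    """
    num_channels = len(rewards[0])
    # Total reward of the best single channel in hindsight (it never switches).
    best_fixed = max(sum(round_rewards[k] for round_rewards in rewards)
                     for k in range(num_channels))
    earned = sum(rewards[t][k] for t, k in enumerate(chosen))
    switches = sum(1 for prev, cur in zip(chosen, chosen[1:]) if prev != cur)
    return best_fixed - (earned - switching_cost * switches)
```

An algorithm whose weak regret grows as $O(T^{2/3})$ has average per-round regret that vanishes as the horizon $T$ grows, even though each switch between channels is penalized.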
