
Quadratic discriminant analysis (QDA) is a standard tool for classification
due to its simplicity and flexibility. Because the number of its parameters
scales quadratically with the number of the variables, QDA is not practical,
however, when the dimensionality is relatively large. To address this, we
propose a novel procedure named DAQDA for QDA in analyzing high-dimensional
data. Formulated in a simple and coherent framework, DAQDA aims to directly
estimate the key quantities in the Bayes discriminant function including
quadratic interactions and a linear index of the variables for classification.
Under appropriate sparsity assumptions, we establish consistency results for
estimating the interactions and the linear index, and further demonstrate that
the misclassification rate of our procedure converges to the optimal Bayes
risk, even when the dimensionality is exponentially high with respect to the
sample size. An efficient algorithm based on the alternating direction method
of multipliers (ADMM) is developed for finding interactions, which is much
faster than its competitor in the literature. The promising performance of
DAQDA is illustrated via extensive simulation studies and the analysis of four
real datasets.
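
For orientation, the classical plug-in QDA rule that motivates this work can be sketched as follows. This is a minimal textbook sketch assuming two Gaussian classes with equal priors; it is not the DAQDA estimator, and the function names are illustrative. Each class needs its own p x p covariance matrix, that is p(p+1)/2 parameters, which is the quadratic scaling mentioned above.

```python
import numpy as np

def fit_gaussian(X):
    """Return the sample mean and sample covariance of the rows of X."""
    mu = X.mean(axis=0)
    sigma = np.cov(X, rowvar=False)
    return mu, sigma

def qda_score(x, mu1, sigma1, mu2, sigma2):
    """Log-density ratio log f1(x) - log f2(x) for two Gaussian classes.

    Positive values favour class 1 (equal priors assumed). The shared
    constant -p/2 * log(2*pi) cancels in the difference.
    """
    def log_density(x, mu, sigma):
        diff = x - mu
        _, logdet = np.linalg.slogdet(sigma)
        return -0.5 * (diff @ np.linalg.solve(sigma, diff) + logdet)
    return log_density(x, mu1, sigma1) - log_density(x, mu2, sigma2)

# Toy data (hypothetical): class 1 ~ N(0, I), class 2 ~ N(2, 4I) in 3 dimensions.
rng = np.random.default_rng(0)
X1 = rng.normal(loc=0.0, scale=1.0, size=(200, 3))
X2 = rng.normal(loc=2.0, scale=2.0, size=(200, 3))
mu1, s1 = fit_gaussian(X1)
mu2, s2 = fit_gaussian(X2)

score_near_class1 = qda_score(np.zeros(3), mu1, s1, mu2, s2)
score_near_class2 = qda_score(np.full(3, 4.0), mu1, s1, mu2, s2)
```

The score is quadratic in x because the two covariance matrices differ; with equal covariances it would reduce to a linear rule.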

Privacy amplification is an indispensable step in the postprocessing of
continuous-variable quantum key distribution (CVQKD), which is used to distill
unconditionally secure keys from identical corrected keys between two distant
legal parties. The processing speed of privacy amplification has a significant
effect on the secret key rate of a CVQKD system. We report a high-speed
parallel implementation of a length-compatible privacy amplification algorithm
based on a graphics processing unit. The length-compatible algorithm is used to
satisfy the security requirement of privacy amplification at different
transmission distances when the finite-size effect is considered. We achieve a
privacy amplification speed of over 1 Gbps at arbitrary input length, which is
one to two orders of magnitude faster than previous demonstrations. This
supports high-speed real-time CVQKD systems while ensuring the security of
privacy amplification.
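
The abstract does not say which hash family is used. Privacy amplification is commonly done with Toeplitz-matrix universal hashing, and the sketch below assumes that choice. It is plain single-threaded Python meant to show the arithmetic only, not the GPU implementation reported here; `toeplitz_hash` and its parameters are illustrative names.

```python
import random

def toeplitz_hash(key_bits, seed_bits, out_len):
    """Compress key_bits to out_len bits with a Toeplitz matrix over GF(2).

    The out_len x n matrix is defined by n + out_len - 1 seed bits:
    entry (i, j) equals seed_bits[i - j + n - 1], so it is constant
    along every diagonal.
    """
    n = len(key_bits)
    if len(seed_bits) != n + out_len - 1:
        raise ValueError("seed must have n + out_len - 1 bits")
    out = []
    for i in range(out_len):
        acc = 0
        for j in range(n):
            # AND is multiplication and XOR is addition in GF(2).
            acc ^= seed_bits[i - j + n - 1] & key_bits[j]
        out.append(acc)
    return out

# Toy inputs (hypothetical sizes). A real system would draw the seed from a
# true random source, not from Python's pseudo-random generator.
rng = random.Random(1)
key = [rng.randint(0, 1) for _ in range(64)]
seed = [rng.randint(0, 1) for _ in range(64 + 16 - 1)]
final_key = toeplitz_hash(key, seed, 16)
```

The output length can be changed freely for a fixed input length, which is the kind of flexibility a length-compatible scheme needs; the double loop is the part a parallel implementation would accelerate.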

The information reconciliation protocol has a significant effect on the secret
key rate and maximal transmission distance of continuous-variable quantum key
distribution (CVQKD) systems. We propose an efficient rate-adaptive
reconciliation protocol suitable for practical CVQKD systems with a
time-varying quantum channel. This protocol changes the code rate of
multi-edge-type low-density parity-check codes by puncturing (increasing the
code rate) and shortening (decreasing the code rate), which enlarges the range
of correctable signal-to-noise ratios and thus improves the overall
reconciliation efficiency compared with the original fixed-rate reconciliation
protocol. We verify our rate-adaptive reconciliation protocol with three
typical code rates, 0.1, 0.05 and 0.02. The reconciliation efficiency stays
around 93.5%, 95.4% and 96.4% over different signal-to-noise ratios, which
shows the potential of implementing high-performance CVQKD systems using a
single code rate matrix.
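
The effect of puncturing and shortening on the code rate can be illustrated with a short calculation. This is a sketch under standard coding-theory assumptions: an (n, k) mother code, the usual rate formula (k - s) / (n - p - s) for p punctured and s shortened bits, and reconciliation efficiency defined as the code rate divided by the Gaussian channel capacity 0.5 * log2(1 + SNR). The code size and SNR value are hypothetical and not taken from the paper.

```python
import math

def adapted_rate(n, k, punctured=0, shortened=0):
    """Rate of an (n, k) mother code after puncturing and shortening.

    Puncturing drops transmitted bits (shorter block, same information),
    shortening fixes information bits to known values (less information).
    """
    return (k - shortened) / (n - punctured - shortened)

def reconciliation_efficiency(rate, snr):
    """Ratio of the code rate to the capacity 0.5 * log2(1 + snr)."""
    return rate / (0.5 * math.log2(1.0 + snr))

n, k = 100_000, 10_000  # hypothetical mother code of rate 0.1
base_rate = adapted_rate(n, k)
r_punct = adapted_rate(n, k, punctured=5_000)   # rate goes up
r_short = adapted_rate(n, k, shortened=2_000)   # rate goes down
beta = reconciliation_efficiency(base_rate, snr=0.16)
```

Sweeping the number of punctured or shortened bits lets a single parity-check matrix track the channel capacity as the signal-to-noise ratio drifts, instead of switching between separately designed codes.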

We unify slice sampling and Hamiltonian Monte Carlo (HMC) sampling,
demonstrating their connection via the Hamilton-Jacobi equation from
Hamiltonian mechanics. This insight enables extension of HMC and slice sampling
to a broader family of samplers, called Monomial Gamma Samplers (MGS). We
provide a theoretical analysis of the mixing performance of such samplers,
proving that in the limit of a single parameter, the MGS draws decorrelated
samples from the desired target distribution. We further show that as this
parameter tends toward this limit, performance gains are achieved at a cost of
increasing numerical difficulty and some practical convergence issues. Our
theoretical results are validated with synthetic data and real-world
applications.

The continuous-variable version of quantum key distribution (QKD) offers the
advantages (over discrete-variable systems) of higher secret key rates in
metropolitan areas as well as the use of standard telecom components that can
operate at room temperature. An important step in the real-world adoption of
continuous-variable QKD is the deployment of field tests over commercial
fibers. Here we report two different field tests of a continuous-variable QKD
system through commercial fiber networks in Xi'an and Guangzhou over distances
of 30.02 km (12.48 dB) and 49.85 km (11.62 dB), respectively. We achieve secure
key rates two orders of magnitude higher than previous field test
demonstrations. This is achieved by developing a fully automatic control system
to create stable excess noise and by applying a rate-adaptive reconciliation
protocol to achieve a high reconciliation efficiency with high success
probability. Our results pave the way to achieving continuous-variable QKD in a
metropolitan setting.

The amount of data moved over dedicated and non-dedicated network links
increases much faster than the increase in the network capacity, but the
current solutions fail to guarantee even the promised achievable transfer
throughputs. In this paper, we propose a novel dynamic throughput optimization
model based on mathematical modeling with offline knowledge discovery/analysis
and adaptive online decision making. In offline analysis, we mine historical
transfer logs to perform knowledge discovery about the transfer
characteristics. The online phase uses the knowledge discovered in the offline
analysis along with real-time investigation of the network condition to
optimize the protocol parameters. As real-time investigation is expensive and
provides only partial knowledge about the current network status, our model
uses historical knowledge about the network and data to reduce the real-time
investigation overhead while ensuring near-optimal throughput for each
transfer. Our network- and data-agnostic solution was tested over different
networks and achieved up to 93% accuracy compared with the optimal achievable
throughput on those networks.

We study the impact of the finite-size effect on the continuous-variable
measurement-device-independent quantum key distribution (CV-MDI QKD) protocol,
mainly considering the finite-size effect on the parameter estimation
procedure. The central limit theorem and maximum likelihood estimation are
used to estimate the parameters. We also analyze the relationship between the
number of exchanged signals and the optimal modulation variance in the
protocol. It is proved that when Charlie's position is close to Bob, the CV-MDI
QKD protocol has the farthest transmission distance in the finite-size
scenario. Finally, we discuss the impact of finite-size effects related to
practical detection in the CV-MDI QKD protocol. The overall results indicate
that the finite-size effect has a great influence on the secret key rate of the
CV-MDI QKD protocol and should not be ignored.

Variational inference (VI) provides fast approximations of a Bayesian
posterior in part because it formulates posterior approximation as an
optimization problem: to find the closest distribution to the exact posterior
over some family of distributions. For practical reasons, the family of
distributions in VI is usually constrained so that it does not include the
exact posterior, even as a limit point. Thus, no matter how long VI is run, the
resulting approximation will not approach the exact posterior. We propose to
instead consider a more flexible approximating family consisting of all
possible finite mixtures of a parametric base distribution (e.g., Gaussian).
For efficient inference, we borrow ideas from gradient boosting to develop an
algorithm we call boosting variational inference (BVI). BVI iteratively
improves the current approximation by mixing it with a new component from the
base distribution family and thereby yields progressively more accurate
posterior approximations as more computing time is spent. Unlike a number of
common VI variants including mean-field VI, BVI is able to capture
multimodality, general posterior covariance, and nonstandard posterior shapes.
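
The mixing step described above can be sketched in a few lines. The abstract gives no update formula, so the convex-combination form (1 - w) * q + w * h is an assumption consistent with "mixing it with a new component". The sketch uses one-dimensional Gaussian components and hypothetical values, and it omits how BVI actually chooses the new component and its weight.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a one-dimensional Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def mix_in(mixture, component, weight):
    """Return the mixture (1 - weight) * mixture + weight * component.

    mixture is a list of (weight, mean, std) triples; component is a
    (mean, std) pair.
    """
    if not 0.0 < weight < 1.0:
        raise ValueError("weight must lie strictly between 0 and 1")
    scaled = [(w * (1.0 - weight), m, s) for (w, m, s) in mixture]
    return scaled + [(weight, component[0], component[1])]

def mixture_pdf(x, mixture):
    """Density of the finite mixture at x."""
    return sum(w * gaussian_pdf(x, m, s) for (w, m, s) in mixture)

# Start from a single Gaussian and add a second mode (hypothetical values).
q = [(1.0, -2.0, 1.0)]
q = mix_in(q, (2.0, 0.5), weight=0.4)
```

After one step the approximation already has two modes, something a single Gaussian or a mean-field family cannot represent; repeating the step adds one component per iteration.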

We study an optimal control problem in which both the objective function and
the dynamic constraint contain an uncertain parameter. Since the distribution
of this uncertain parameter is not exactly known, the objective function is
taken as the worst-case expectation over a set of possible distributions of the
uncertain parameter. This ambiguity set of distributions is, in turn, defined
by the first two moments of the random variables involved. The optimal control
is found by minimizing the worst-case expectation over all possible
distributions in this set. If the distributions are discrete, the stochastic
min-max optimal control problem can be converted into a conventional optimal
control problem via duality, which is then approximated as a finite-dimensional
optimization problem via control parametrization. We derive necessary
conditions of optimality and propose an algorithm to solve the approximate
optimization problem. The results for discrete probability distributions are
then extended to the case of a one-dimensional continuous stochastic variable
by applying the control parametrization methodology to the continuous
stochastic variable, and convergence results are derived. A numerical example
is presented to illustrate the potential application of the proposed model and
the effectiveness of the algorithm.

Ordinary least squares (OLS) is the default method for fitting linear models,
but is not applicable for problems with dimensionality larger than the sample
size. For these problems, we advocate the use of a generalized version of OLS
motivated by ridge regression, and propose two novel three-step algorithms
involving least squares fitting and hard thresholding. The algorithms are
methodologically simple to understand intuitively, computationally easy to
implement efficiently, and theoretically appealing for choosing models
consistently. Numerical exercises comparing our methods with penalization-based
approaches in simulations and data analyses illustrate the great potential of
the proposed algorithms.

Recent years have seen the exponential growth of heterogeneous multimedia
data. The need for effective and accurate data retrieval from heterogeneous
data sources has attracted much research interest in cross-media retrieval.
Here, given a query of any media type, cross-media retrieval seeks to find
relevant results of different media types from heterogeneous data sources. To
facilitate large-scale cross-media retrieval, we propose a novel unsupervised
cross-media hashing method. Our method incorporates local affinity and distance
repulsion constraints into a matrix factorization framework. Correspondingly,
the proposed method learns hash functions that generate unified hash codes
from different media types, while ensuring that the intrinsic geometric
structure of the data distribution is preserved. These hash codes allow the
similarity between data of different media types to be evaluated directly.
Experimental results on two large-scale multimedia datasets demonstrate the
effectiveness of the proposed method, which outperforms the state-of-the-art
methods.

Fitting statistical models is computationally challenging when the sample
size or the dimension of the dataset is huge. An attractive approach for
downscaling the problem size is to first partition the dataset into subsets
and then fit using distributed algorithms. The dataset can be partitioned
either horizontally (in the sample space) or vertically (in the feature space).
While the majority of the literature focuses on sample space partitioning,
feature space partitioning is more effective when $p\gg n$. Existing methods
for partitioning features, however, are either vulnerable to high correlations
or inefficient in reducing the model dimension. In this paper, we solve these
problems through a new embarrassingly parallel framework named DECO for
distributed variable selection and parameter estimation. In DECO, variables are
first partitioned and allocated to $m$ distributed workers. The decorrelated
subset data within each worker are then fitted via any algorithm designed for
high-dimensional problems. We show that by incorporating the decorrelation
step, DECO can achieve consistent variable selection and parameter estimation
on each subset with (almost) no assumptions. In addition, the convergence rate
is nearly minimax optimal for both sparse and weakly sparse models and does NOT
depend on the partition number $m$. Extensive numerical experiments are
provided to illustrate the performance of the new framework.

Photon subtraction can enhance the performance of continuous-variable quantum
key distribution (CV QKD). However, the enhancement effect will be reduced by
the imperfections of practical devices, especially the limited efficiency of a
single-photon detector. In this paper, we propose a non-Gaussian postselection
method to emulate the photon subtraction used in coherent-state CV QKD
protocols. The virtual photon subtraction not only avoids the complexity and
imperfections of a practical photon-subtraction operation, thereby extending
the secure transmission distance as the ideal case does, but can also be
adjusted flexibly according to the channel parameters to optimize the
performance. Furthermore, our preliminary tests on the information
reconciliation suggest that in the low signal-to-noise ratio regime, the
performance of reconciling the postselected non-Gaussian data is better than
that of the Gaussian data, which implies the feasibility of implementing this
method practically.

The modern scale of data has brought new challenges to Bayesian inference. In
particular, conventional MCMC algorithms are computationally very expensive for
large data sets. A promising approach to solve this problem is embarrassingly
parallel MCMC (EPMCMC), which first partitions the data into multiple subsets
and runs independent sampling algorithms on each subset. The subset posterior
draws are then aggregated via some combining rules to obtain the final
approximation. Existing EPMCMC algorithms are limited by approximation
accuracy and difficulty in resampling. In this article, we propose a new
EPMCMC algorithm, PART, that solves these problems. The new algorithm applies
random partition trees to combine the subset posterior draws; the procedure is
distribution-free, easy to resample from, and able to adapt to multiple scales. We
provide theoretical justification and extensive experiments illustrating
empirical performance.

Variable screening is a fast dimension reduction technique for assisting
high-dimensional feature selection. As a preselection method, it selects a
moderate-size subset of candidate variables for further refining via feature selection
to produce the final model. The performance of variable screening depends on
both computational efficiency and the ability to dramatically reduce the number
of variables without discarding the important ones. When the data dimension $p$
is substantially larger than the sample size $n$, variable screening becomes
crucial because (1) faster feature selection algorithms are needed and (2) conditions
guaranteeing selection consistency might fail to hold. This article studies a
class of linear screening methods and establishes consistency theory for this
special class. In particular, we prove the restricted diagonally dominant (RDD)
condition is a necessary and sufficient condition for strong screening
consistency. As concrete examples, we show two screening methods $SIS$ and
$HOLP$ are both strong screening consistent (subject to additional constraints)
with large probability if $n > O((\rho s + \sigma/\tau)^2\log p)$ under random
designs. In addition, we relate the RDD condition to the irrepresentable
condition, and highlight limitations of $SIS$.

Variable selection is a challenging issue in statistical applications when
the number of predictors $p$ far exceeds the number of observations $n$. In
this ultrahigh-dimensional setting, the sure independence screening (SIS)
procedure was introduced to significantly reduce the dimensionality by
preserving the true model with overwhelming probability, before a refined
second stage analysis. However, the aforementioned sure screening property
strongly relies on the assumption that the important variables in the model
have large marginal correlations with the response, which rarely holds in
reality. To overcome this, we propose a novel and simple screening technique
called the high-dimensional ordinary least-squares projection (HOLP). We show
that HOLP possesses the sure screening property and gives consistent variable
selection without the strong correlation assumption, and has a low
computational complexity. A ridge-type HOLP procedure is also discussed.
A simulation study shows that HOLP performs competitively compared to many other
marginal-correlation-based methods. An application to mammalian eye disease
data illustrates the attractiveness of HOLP.
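
A projection-based screening step of this kind can be sketched as follows. The abstract does not state the estimator; the sketch assumes the usual form of HOLP, X^T (X X^T)^{-1} y, with an optional ridge term added to X X^T, and ranks variables by the absolute value of the resulting coefficients. The function name, the `keep` and `ridge` parameters, and the simulated data are illustrative.

```python
import numpy as np

def holp_screen(X, y, keep, ridge=0.0):
    """Rank variables by |X^T (X X^T + ridge * I)^{-1} y| and keep the top ones.

    Only an n x n linear system is solved, which is cheap when p >> n.
    """
    n = X.shape[0]
    gram = X @ X.T + ridge * np.eye(n)
    beta = X.T @ np.linalg.solve(gram, y)
    order = np.argsort(-np.abs(beta))
    return np.sort(order[:keep]), beta

# Simulated example (hypothetical): 100 observations, 400 predictors, 3 active.
rng = np.random.default_rng(0)
n, p = 100, 400
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[[3, 70, 200]] = [3.0, -3.0, 3.0]
y = X @ beta_true + 0.1 * rng.standard_normal(n)

selected, beta_hat = holp_screen(X, y, keep=20)
```

The retained indices would then be passed to a second-stage selection method; replacing the solve by `X.T @ y` gives a marginal-correlation ranking in the style of SIS for comparison.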

For massive data sets, efficient computation commonly relies on distributed
algorithms that store and process subsets of the data on different machines,
minimizing communication costs. Our focus is on regression and classification
problems involving many features. A variety of distributed algorithms have been
proposed in this context, but challenges arise in defining an algorithm with
low communication, theoretical guarantees and excellent practical performance
in general settings. We propose a MEdian Selection Subset AGgregation Estimator
(message) algorithm, which attempts to solve these problems. The algorithm
applies feature selection in parallel for each subset using Lasso or another
method, calculates the `median' feature inclusion index, estimates coefficients
for the selected features in parallel for each subset, and then averages these
estimates. The algorithm is simple, involves minimal communication, scales
efficiently in both sample and feature size, and has theoretical guarantees. In
particular, we show model selection consistency and coefficient estimation
efficiency. Extensive experiments show excellent performance in variable
selection, estimation, prediction, and computation time relative to usual
competitors.
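
The four steps listed above can be sketched as follows. The abstract names Lasso "or another method" for the selection step; to keep the example dependent on NumPy only, the sketch swaps in a simple stand-in selector that thresholds ordinary least-squares coefficients, which requires more rows than features in each subset. All function names, the threshold, and the simulated data are hypothetical.

```python
import numpy as np

def select_features(X, y, threshold):
    """Stand-in selector (not Lasso): keep features with a large OLS coefficient."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.abs(coef) > threshold

def message_fit(X, y, n_subsets, threshold=0.5):
    """Median selection followed by subset-wise refitting and averaging."""
    p = X.shape[1]
    row_blocks = np.array_split(np.arange(X.shape[0]), n_subsets)
    # Step 1: run feature selection independently on each subset of rows.
    inclusion = np.array(
        [select_features(X[r], y[r], threshold) for r in row_blocks]
    )
    # Step 2: median inclusion index, i.e. keep a feature if most subsets chose it.
    selected = np.median(inclusion.astype(float), axis=0) > 0.5
    # Step 3: refit the selected features on each subset.
    coefs = []
    for r in row_blocks:
        sub_coef, *_ = np.linalg.lstsq(X[r][:, selected], y[r], rcond=None)
        coefs.append(sub_coef)
    # Step 4: average the subset estimates.
    beta = np.zeros(p)
    beta[selected] = np.mean(coefs, axis=0)
    return selected, beta

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 10))
beta_true = np.zeros(10)
beta_true[[0, 1, 7]] = [2.0, -3.0, 1.5]
y = X @ beta_true + 0.5 * rng.standard_normal(1000)

selected, beta_hat = message_fit(X, y, n_subsets=5)
```

Only the binary inclusion vectors and the short coefficient vectors need to be exchanged between machines, which is where the low communication cost comes from. An odd number of subsets avoids ties in the median.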

With the rapidly growing scale of statistical problems, subset-based
communication-free parallel MCMC methods are a promising direction for large-scale
Bayesian analysis. In this article, we propose a new Weierstrass sampler for
parallel MCMC based on independent subsets. The new sampler approximates the
full data posterior samples via combining the posterior draws from independent
subset MCMC chains, and thus enjoys a higher computational efficiency. We show
that the approximation error for the Weierstrass sampler is bounded by some
tuning parameters and provide suggestions for the choice of their values. A simulation
study shows the Weierstrass sampler is very competitive compared to other
methods for combining MCMC chains generated for subsets, including averaging
and kernel smoothing.