
In a recent paper, Flandrin [2015] has proposed filtering based on the zeros
of a spectrogram, using the shorttime Fourier transform and a Gaussian window.
His results are based on empirical observations on the distribution of the
zeros of the spectrogram of white Gaussian noise. These zeros tend to be
uniformly spread over the timefrequency plane, and not to clutter. Our
contributions are threefold: we rigorously define the zeros of the spectrogram
of continuous white Gaussian noise, we explicitly characterize their
statistical distribution, and we investigate the computational and statistical
underpinnings of the practical implementation of signal detection based on the
statistics of spectrogram zeros. In particular, we stress that the zeros of
spectrograms of white Gaussian noise correspond to zeros of Gaussian analytic
functions, a topic of recent independent mathematical interest [Hough et al.,
2009].

Determinantal point processes (DPPs) are distributions over sets of items
that model diversity using kernels. Their applications in machine learning
include summary extraction and recommendation systems. Yet, the cost of
sampling from a DPP is prohibitive in largescale applications, which has
triggered an effort towards efficient approximate samplers. We build a novel
MCMC sampler that combines ideas from combinatorial geometry, linear
programming, and Monte Carlo methods to sample from DPPs with a fixed sample
cardinality, also called projection DPPs. Our sampler leverages the ability of
the hitandrun MCMC kernel to efficiently move across convex bodies. Previous
theoretical results yield a fast mixing time of our chain when targeting a
distribution that is close to a projection DPP, but not a DPP in general. Our
empirical results demonstrate that this extends to sampling projection DPPs,
i.e., our sampler is more sampleefficient than previous approaches which in
turn translates to faster convergence when dealing with costlytoevaluate
functions, such as summary extraction in our experiments.

We show that repulsive random variables can yield Monte Carlo methods with
faster convergence rates than the typical $N^{1/2}$, where $N$ is the number
of integrand evaluations. More precisely, we propose stochastic numerical
quadratures involving determinantal point processes associated with
multivariate orthogonal polynomials, and we obtain root mean square errors that
decrease as $N^{(1+1/d)/2}$, where $d$ is the dimension of the ambient space.
First, we prove a central limit theorem (CLT) for the linear statistics of a
class of determinantal point processes, when the reference measure is a product
measure supported on a hypercube, which satisfies the Nevaiclass regularity
condition, a result which may be of independent interest. Next, we introduce a
Monte Carlo method based on these determinantal point processes, and prove a
CLT with explicit limiting variance for the quadrature error, when the
reference measure satisfies a stronger regularity condition. As a corollary, by
taking a specific reference measure and using a construction similar to
importance sampling, we obtain a general Monte Carlo method, which applies to
any measure with continuously derivable density. Loosely speaking, our method
can be interpreted as a stochastic counterpart to Gaussian quadrature, which,
at the price of some convergence rate, is easily generalizable to any dimension
and has a more explicit error term.

Concentration inequalities quantify the deviation of a random variable from a
fixed value. In spite of numerous applications, such as opinion surveys or
ecological counting procedures, few concentration results are known for the
setting of sampling without replacement from a finite population. Until now,
the best general concentration inequality has been a Hoeffding inequality due
to Serfling [Ann. Statist. 2 (1974) 3948]. In this paper, we first improve on
the fundamental result of Serfling [Ann. Statist. 2 (1974) 3948], and further
extend it to obtain a Bernstein concentration bound for sampling without
replacement. We then derive an empirical version of our bound that does not
require the variance to be known to the user.

When targeting a distribution that is artificially invariant under some
permutations, Markov chain Monte Carlo (MCMC) algorithms face the
labelswitching problem, rendering marginal inference particularly cumbersome.
Such a situation arises, for example, in the Bayesian analysis of finite
mixture models. Adaptive MCMC algorithms such as adaptive Metropolis (AM),
which selfcalibrates its proposal distribution using an online estimate of the
covariance matrix of the target, are no exception. To address the
labelswitching issue, relabeling algorithms associate a permutation to each
MCMC sample, trying to obtain reasonable marginals. In the case of adaptive
Metropolis (Bernoulli 7 (2001) 223242), an online relabeling strategy is
required. This paper is devoted to the AMOR algorithm, a provably consistent
variant of AM that can cope with the labelswitching problem. The idea is to
nest relabeling steps within the MCMC algorithm based on the estimation of a
single covariance matrix that is used both for adapting the covariance of the
proposal distribution in the Metropolis algorithm step and for online
relabeling. We compare the behavior of AMOR to similar relabeling methods. In
the case of compactly supported target distributions, we prove a strong law of
large numbers for AMOR and its ergodicity. These are the first results on the
consistency of an online relabeling algorithm to our knowledge. The proof
underlines latent relations between relabeling and vector quantization.

Determinantal point processes (DPPs) are point process models that naturally
encode diversity between the points of a given realization, through a positive
definite kernel $K$. DPPs possess desirable properties, such as exact sampling
or analyticity of the moments, but learning the parameters of kernel $K$
through likelihoodbased inference is not straightforward. First, the kernel
that appears in the likelihood is not $K$, but another kernel $L$ related to
$K$ through an often intractable spectral decomposition. This issue is
typically bypassed in machine learning by directly parametrizing the kernel
$L$, at the price of some interpretability of the model parameters. We follow
this approach here. Second, the likelihood has an intractable normalizing
constant, which takes the form of a large determinant in the case of a DPP over
a finite set of objects, and the form of a Fredholm determinant in the case of
a DPP over a continuous domain. Our main contribution is to derive bounds on
the likelihood of a DPP, both for finite and continuous domains. Unlike
previous work, our bounds are cheap to evaluate since they do not rely on
approximating the spectrum of a large matrix or an operator. Through usual
arguments, these bounds thus yield cheap variational inference and moderately
expensive exact Markov chain Monte Carlo inference methods for DPPs.

Markov chain Monte Carlo methods are often deemed too computationally
intensive to be of any practical use for big data applications, and in
particular for inference on datasets containing a large number $n$ of
individual data points, also known as tall datasets. In scenarios where data
are assumed independent, various approaches to scale up the MetropolisHastings
algorithm in a Bayesian inference context have been recently proposed in
machine learning and computational statistics. These approaches can be grouped
into two categories: divideandconquer approaches and, subsamplingbased
algorithms. The aims of this article are as follows. First, we present a
comprehensive review of the existing literature, commenting on the underlying
assumptions and theoretical guarantees of each method. Second, by leveraging
our understanding of these limitations, we propose an original
subsamplingbased approach which samples from a distribution provably close to
the posterior distribution of interest, yet can require less than $O(n)$ data
point likelihood evaluations at each iteration for certain statistical models
in favourable scenarios. Finally, we have only been able so far to propose
subsamplingbased methods which display good performance in scenarios where the
Bernsteinvon Mises approximation of the target posterior distribution is
excellent. It remains an open challenge to develop such methods in scenarios
where the Bernsteinvon Mises approximation is poor.

Contributions of the Pierre Auger Collaboration to the 33rd International
Cosmic Ray Conference, Rio de Janeiro, Brazil, July 2013