-
The inductive size bias coupling technique and Stein's method yield a
Berry-Esseen theorem for the number of urns having occupancy $d \ge 2$ when $n$
balls are uniformly distributed over $m$ urns. In particular, there exists a
constant $C$ depending only on $d$ such that $$ \sup_{z \in
\mathbb{R}}|P(W_{n,m} \le z) -P(Z \le z)| \le C \left(
\frac{1+(\frac{n}{m})^3}{\sigma_{n,m}} \right) \quad \mbox{for all $n \ge d$
and $m \ge 2$,} $$ where $W_{n,m}$ and $\sigma_{n,m}^2$ are the standardized
count and variance, respectively, of the number of urns with $d$ balls, and $Z$
is a standard normal random variable. Asymptotically, the bound is optimal up
to constants if $n$ and $m$ tend to infinity together in a way such that $n/m$
stays bounded.
-
Dense kernel matrices $\Theta \in \mathbb{R}^{N \times N}$ obtained from
point evaluations of a covariance function $G$ at locations $\{ x_{i} \}_{1
\leq i \leq N} \subset \mathbb{R}^{d}$ arise in statistics, machine learning,
and numerical analysis. For covariance functions that are Green's functions of
elliptic boundary value problems and homogeneously-distributed sampling points,
we show how to identify a subset $S \subset \{ 1 , \dots , N \}^2$, with $\# S
= O ( N \log (N) \log^{d} ( N /\epsilon ) )$, such that the zero fill-in
incomplete Cholesky factorisation of the sparse matrix $\Theta_{ij} 1_{( i, j )
\in S}$ is an $\epsilon$-approximation of $\Theta$. This factorisation can
provably be obtained in complexity $O ( N \log( N ) \log^{d}( N /\epsilon) )$
in space and $O ( N \log^{2}( N ) \log^{2d}( N /\epsilon) )$ in time, improving
upon the state of the art for general elliptic operators; we further present
numerical evidence that $d$ can be taken to be the intrinsic dimension of the
data set rather than that of the ambient space. The algorithm only needs to
know the spatial configuration of the $x_{i}$ and does not require an analytic
representation of $G$. Furthermore, this factorization straightforwardly
provides an approximate sparse PCA with optimal rate of convergence in the
operator norm. Hence, by using only subsampling and the incomplete Cholesky
factorization, we obtain, at nearly linear complexity, the compression,
inversion and approximate PCA of a large class of covariance matrices. By
inverting the order of the Cholesky factorization we also obtain a solver for
elliptic PDE with complexity $O ( N \log^{d}( N /\epsilon) )$ in space and $O (
N \log^{2d}( N /\epsilon) )$ in time, improving upon the state of the art for
general elliptic operators.
-
We develop in this work a general version of paracontrolled calculus that
allows to treat analytically within this paradigm some singular partial
differential equations with the same efficiency as regularity structures. This
work deals with the analytic side of the story and offers a toolkit for the
study of such equations, under the form of a number of continuity results for
some operators. We illustrate the efficiency of this elementary approach on the
examples of the 3-dimensional generalised parabolic Anderson model equation and
the generalised KPZ equation driven by a (1+1)-dimensional space/time white
noise.
-
Consider a coin tossing experiment which consists of tossing one of two coins
at a time, according to a renewal process. The first coin is fair and the
second has probability $1/2 + \theta$, $\theta \in [-1/2,1/2]$, $\theta$
unknown but fixed, of head. The biased coin is tossed at the renewal times of
the process, and the fair one at all the other times. The main question about
this experiment is whether or not it is possible to determine $\theta$ almost
surely as the number of tosses increases, given only the probabilities of the
renewal process and the observed sequence of heads and tails. We will construct
a confidence interval for $\theta$ and determine conditions on the process for
its almost sure convergence. It will be shown that recurrence is in fact a
necessary condition for the almost sure convergence of the interval, although
the convergence still holds if the process is null recurrent but the expected
number of renewals up to and including time $N$ is $O(N^{1/2+\alpha}), 0 \leq
\alpha < 1/2$. It solves an open problem presented by Harris and Keane (1997).
We also generalize this experiment for random variables on $L^{2}$ which are
sampled according to a renewal process from either one of two distributions.
-
We deal with a general preferential attachment graph model with multiple type
edges. The types are chosen randomly, in a way that depends on the evolution of
the graph. In the $N$-type case, we define the (generalized) degree of a given
vertex as $\boldsymbol{d}=(d_{1},d_{2},\dots,d_{N})$, where
$d_{k}\in\mathbb{Z}_{0}^{+}$ is the number of type $k$ edges connected to it.
We prove the existence of an a.s.\ asymptotic degree distribution for a general
family of preferential attachment random graph models with multi-type edges.
More precisely, we show that the proportion of vertices with (generalized)
degree $\boldsymbol{d}$ tends to some random variable as the number of steps
goes to infinity. We also provide recurrence equations for the asymptotic
degree distribution. Finally, we generalize the scale-free property of random
graphs to the multi-type case.
-
We study a "div-grad type" sub-Laplacian with respect to a smooth measure and
its associated heat semigroup on a compact equiregular sub-Riemannian manifold.
We prove a short time asymptotic expansion of the heat trace up to any order.
Our main result holds true for any smooth measure on the manifold, but it has a
spectral geometric meaning when Popp's measure is considered. Our proof is
probabilistic. In particular, we use S. Watanabe's distributional Malliavin
calculus.
-
The sine process is a rigid point process on the real line, which means that
for almost all configurations $X$, the number of points in an interval $I =
[-R,R]$ is determined by the points of $X$ outside of $I$. In addition, the
points in $I$ are an orthogonal polynomial ensemble on $I$ with a weight
function that is determined by the points in $X \setminus I$. We prove a
universality result that in particular implies that the correlation kernel of
the orthogonal polynomial ensemble tends to the sine kernel as the length
$|I|=2R$ tends to infinity, thereby answering a question posed by A.I. Bufetov.
-
In this paper we consider the field of local times of a discrete-time Markov
chain on a general state space, and obtain uniform (in time) upper bounds on
the total variation distance between this field and the one of a sequence of
$n$ i.i.d. random variables with law given by the invariant measure of that
Markov chain. The proof of this result uses a refinement of the soft local time
method of [11].
-
The study of the restricted isometry property (RIP) of corrupted random
matrices is particularly important in the field of compressed sensing (CS) with
corruptions. If a matrix still satisfies the RIP after that a certain portion
of rows are erased, then we say that this matrix has the strong restricted
isometry property (SRIP). In the field of compressed sensing, random matrices
which satisfy certain moment conditions are of particular interest. Among these
matrices, those with entries generated from i.i.d. Gaussian or symmetric
Bernoulli random variables are often typically considered. Recent studies have
shown that matrices with entries generated from i.i.d. Gaussian random
variables satisfy the SRIP under arbitrary erasure of rows with high
probability. In this paper, we study the erasure robustness property of
Bernoulli random matrices. Our main result shows that with overwhelming
probability, the SRIP holds for Bernoulli random matrices. Moreover, our
analysis leads to a robust version of the famous Johnson-Lindenstrauss lemma
for Bernoulli random matrices.
-
In this paper, we give an extension of the classic Salem-Zygmund inequality
for locally sub-Gaussian random variables. As an application, the concentration
of the roots of a Kac polynomial is studied. Under the finiteness of the
log-moment on the random coefficients, it is well-known that the roots of a Kac
polynomial are concentrated around the unit circle with high probability. We
assume the existence of the moment generating function for the iid random
coefficients for the Kac polynomial and prove that there exists an annulus of
width $\textrm{O}( n^{-2}(\log n)^{-1/2-\gamma})$, $\gamma>1/2$ around the unit
circle that does not contain roots with high probability. As another
application we show that the smallest singular value of a random circulant
matrix is at least $n^{-\rho}$, $\rho\in(0,1/4)$ with probability
$1-\textrm{O}(n^{-2\rho})$.
-
We investigate a new sampling scheme aimed at improving the performance of
particle filters whenever (a) there is a significant mismatch between the
assumed model dynamics and the actual system, or (b) the posterior probability
tends to concentrate in relatively small regions of the state space. The
proposed scheme pushes some particles towards specific regions where the
likelihood is expected to be high, an operation known as nudging in the
geophysics literature. We re-interpret nudging in a form applicable to any
particle filtering scheme, as it does not involve any changes in the rest of
the algorithm. Since the particles are modified, but the importance weights do
not account for this modification, the use of nudging leads to additional bias
in the resulting estimators. However, we prove analytically that nudged
particle filters can still attain asymptotic convergence with the same error
rates as conventional particle methods. Simple analysis also yields an
alternative interpretation of the nudging operation that explains its
robustness to model errors. Finally, we show numerical results that illustrate
the improvements that can be attained using the proposed scheme. In particular,
we present nonlinear tracking examples with synthetic data and a model
inference example using real-world financial data.
-
A sorting network is a geodesic path from $12 \cdots n$ to $n \cdots 21$ in
the Cayley graph of $S_n$ generated by adjacent transpositions. For a uniformly
random sorting network, we establish the existence of a local limit of the
process of space-time locations of transpositions in a neighbourhood of $an$
for $a\in[0,1]$ as $n\to\infty$. Here time is scaled by a factor of $1/n$ and
space is not scaled. The limit is a swap process $U$ on $\mathbb{Z}$. We show
that $U$ is stationary and mixing with respect to the spatial shift and has
time-stationary increments. Moreover, the only dependence on $a$ is through
time scaling by a factor of $\sqrt{a(1-a)}$. To establish the existence of $U$,
we find a local limit for staircase-shaped Young tableaux. These Young tableaux
are related to sorting networks through a bijection of Edelman and Greene.
-
Let $\Omega$ be a Polish space with Borel $\sigma$-field $\mathcal{F}$ and
countably generated sub $\sigma$-field $\mathcal{G}\subset\mathcal{F}$. Denote
by $\mathcal{L}(\mathcal{F})$ the set of all bounded $\mathcal{F}$-upper
semianalytic functions from $\Omega$ to the reals and by
$\mathcal{L}(\mathcal{G})$ the subset of $\mathcal{G}$-upper semianalytic
functions. Let
$\mathcal{E}(\cdot|\mathcal{G})\colon\mathcal{L}(\mathcal{F})\to\mathcal{L}(\mathcal{G})$
be a sublinear increasing functional which leaves $\mathcal{L}(\mathcal{G})$
invariant. It is shown that there exists a $\mathcal{G}$-analytic set-valued
mapping $\mathcal{P}_{\mathcal{G}}$ from $\Omega$ to the set of probabilities
which are concentrated on atoms of $\mathcal{G}$ with compact convex values
such that $\mathcal{E}(X|\mathcal{G})(\omega)=$
$\sup_{P\in\mathcal{P}_{\mathcal{G}}(\omega)} E_P[X]$ if and only if
$\mathcal{E}(\cdot |\mathcal{G})$ is pointwise continuous from below and
continuous from above on the continuous functions. Further, given another
sublinear increasing functional
$\mathcal{E}(\cdot)\colon\mathcal{L}(\mathcal{F})\to\mathbb{R}$ which leaves
the constants invariant, the tower property
$\mathcal{E}(\cdot)=\mathcal{E}(\mathcal{E}(\cdot|\mathcal{G}))$ is
characterized via a pasting property of the representing sets of probabilities,
and the importance of analytic functions is explained. Finally, it is
characterized when a nonlinear version of Fubini's theorem holds true and when
the product of a set of probabilities and a set of kernels is compact.
-
In this paper, we define an extension of the supersymmetric hyperbolic
nonlinear sigma model introduced by Zirnbauer. We show that it arises as a weak
joint limit of a time-changed version introduced by Sabot and Tarr\`es of the
vertex-reinforced jump process. It describes the asymptotics of rescaled
crossing numbers, rescaled fluctuations of local times, asymptotic local times
on a logarithmic scale, endpoints of paths, and last exit trees.
-
We prove a general normal approximation theorem for local graph statistics in
the configuration model, together with an explicit bound on the error in the
approximation with respect to the Wasserstein metric. Such statistics take the
form $T := \sum_{v \in V} H_v$, where $V$ is the vertex set, and $H_v$ depends
on a neighbourhood in the graph around $v$ of size at most $\ell$. The error
bound is expressed in terms of $\ell$, $|V|$, an almost sure bound on $H_v$,
the maximum vertex degree $d_{\max}$ and the variance of $T$. Under suitable
assumptions on the convergence of the empirical degree distributions to a
limiting distribution, we deduce that the size of the giant component in the
configuration model has asymptotically Gaussian fluctuations.
-
This paper considers how to obtain MCMC quantitative convergence bounds which
can be translated into tight complexity bounds in high-dimensional settings. We
propose a modified drift-and-minorization approach, which establishes a
generalized drift condition defined in a subset of the state space. The subset
is called the ``large set'' and is chosen to rule out some ``bad'' states which
have poor drift property when the dimension gets large. Using the ``large set''
together with a ``centered'' drift function, a quantitative bound can be
obtained which can be translated into a tight complexity bound. As a
demonstration, we analyze a certain realistic Gibbs sampler algorithm and
obtain a complexity upper bound for the mixing time, which shows that the
number of iterations required for the Gibbs sampler to converge is constant
under certain conditions on the observed data and the initial state. It is our
hope that this modified drift-and-minorization approach can be employed in many
other specific examples to obtain complexity bounds for high-dimensional Markov
chains.
-
Wavelet (Besov) priors are a promising way of reconstructing indirectly
measured fields in a regularized manner. We demonstrate how wavelets can be
used as a localized basis for reconstructing permeability fields with sharp
interfaces from noisy pointwise pressure field measurements in the context of
the elliptic inverse problem. For this we derive the adjoint method of
minimizing the Besov-norm-regularized misfit functional (this corresponds to
determining the maximum a posteriori point in the Bayesian point of view) in
the Haar wavelet setting. As it turns out, choosing a wavelet--based prior
allows for accelerated optimization compared to established
trigonometrically--based priors.
-
We study a multiclass M/M/1 queueing control problem with finite buffers
under heavy-traffic where the decision maker is uncertain about the rates of
arrivals and service of the system and by scheduling and admission/rejection
decisions acts to minimize a discounted cost that accounts for the uncertainty.
The main result is the asymptotic optimality of a $c\mu$-type of policy derived
via underlying stochastic differential games studied in [16]. Under this
policy, with high probability, rejections are not performed when the workload
lies below some cut-off that depends on the ambiguity level. When the workload
exceeds this cut-off, rejections are carried out and only from the buffer with
the cheapest rejection cost weighted with the mean service rate in some
reference model. The allocation part of the policy is the same for all the
ambiguity levels. This is the first work to address a heavy-traffic queueing
control problem with model uncertainty.
-
Let $\{\eta(v): v\in V_N\}$ be a discrete Gaussian free field in a
two-dimensional box $V_N$ of side length $N$ with Dirichlet boundary
conditions. We study the Liouville first passage percolation, i.e., the
shortest path metric where each vertex is given a weight of $e^{\gamma
\eta(v)}$ for some $\gamma>0$. We show that for sufficiently small but fixed
$\gamma>0$, with probability tending to $1$ as $N\to \infty$, all geodesics
between vertices of macroscopic Euclidean distances simultaneously have (the
conjecturally unique) length exponent strictly larger than 1.
-
We consider an ordinary differential equation with a unique hyperbolic
attractor at the origin, to which we add a small random perturbation. It is
known that under general conditions, the solution of this stochastic
differential equation converges exponentially fast to an equilibrium
distribution. We show that the convergence occurs abruptly: in a time window of
small size compared to the natural time scale of the process, the distance to
equilibrium drops from its maximal possible value to near zero, and only after
this time window the convergence is exponentially fast. This is what is known
as the cut-off phenomenon in the context of Markov chains of increasing
complexity. In addition, we are able to give general conditions to decide
whether the distance to equilibrium converges in this time window to a
universal function, a fact known as profile cut-off.
-
The aim of the present work is the introduction of a viscosity type solution,
called strong-viscosity solution to distinguish it from the classical one, with
the following peculiarities: it is a purely analytic object; it can be easily
adapted to more general equations than classical partial differential
equations. First, we introduce the notion of strong-viscosity solution for
semilinear parabolic partial differential equations, defining it, in a few
words, as the pointwise limit of classical solutions to perturbed semilinear
parabolic partial differential equations; we compare it with the standard
definition of viscosity solution. Afterwards, we extend the concept of
strong-viscosity solution to the case of semilinear parabolic path-dependent
partial differential equations, providing an existence and uniqueness result.
-
Consider a population of infinitesimally small frogs on the real line.
Initially the frogs on the positive half-line are dormant while those on the
negative half-line are awake and move according to the heat flow. At the
interface, the incoming wake frogs try to wake up the dormant frogs and succeed
with a probability proportional to their amount among the total amount of
involved frogs at the specific site. Otherwise, the incoming frogs also fall
asleep. This frog model is a special case of the infinite rate symbiotic
branching process on the real line with different motion speeds for the two
types.
We construct this frog model as the limit of approximating processes and
compute the structure of jumps. We show that our frog model can be described by
a stochastic partial differential equation on the real line with a jump type
noise.
-
The theory of quasifree quantum stochastic calculus for infinite-dimensional
noise is developed within the framework of Hudson-Parthasarathy quantum
stochastic calculus. The question of uniqueness for the covariance amplitude
with respect to which a given unitary quantum stochastic cocycle is quasifree
is addressed, and related to the minimality of the corresponding stochastic
dilation. The theory is applied to the identification of a wide class of
quantum random walks whose limit processes are driven by quasifree noises.
-
Loewner driving functions encode simple curves in 2-dimensional simply
connected domains by real-valued functions. We prove that the Loewner driving
function of a $C^{1,\beta}$ curve (differentiable parametrization with
$\beta$-H\"older continuous derivative) is in the class $C^{1,\beta-1/2}$ if
$1/2<\beta\leq 1$, and in the class $C^{0,\beta + 1/2}$ if $0 \leq \beta \leq
1/2$. This is the converse of a result of Carto Wong and is optimal. We also
introduce the Loewner energy of a rooted planar loop and use our regularity
result to show the independence of this energy from the basepoint.
-
The central limit theorem of martingales is the fundamental tool for studying
the convergence of stochastic processes, especially stochastic integrals and
differential equations. In this paper, general central limit theorems and
functional central limit theorems are obtained for martingale like random
variables under the sub-linear expectation. As applications, the Lindeberg
central limit theorem and functional central limit theorem are obtained for
independent but not necessarily identically distributed random variables, and a
new proof of the L\'evy characterization of a G-Brownian motion without using
stochastic calculus is given. For proving the results, we have also established
Rosenthal's inequality and the exceptional inequality for the martingale like
random variables.