
The inductive size bias coupling technique and Stein's method yield a
BerryEsseen theorem for the number of urns having occupancy $d \ge 2$ when $n$
balls are uniformly distributed over $m$ urns. In particular, there exists a
constant $C$ depending only on $d$ such that $$ \sup_{z \in
\mathbb{R}}P(W_{n,m} \le z) P(Z \le z) \le C \left(
\frac{1+(\frac{n}{m})^3}{\sigma_{n,m}} \right) \quad \mbox{for all $n \ge d$
and $m \ge 2$,} $$ where $W_{n,m}$ and $\sigma_{n,m}^2$ are the standardized
count and variance, respectively, of the number of urns with $d$ balls, and $Z$
is a standard normal random variable. Asymptotically, the bound is optimal up
to constants if $n$ and $m$ tend to infinity together in a way such that $n/m$
stays bounded.

The generalized Dickman distribution ${\cal D}_\theta$ with parameter
$\theta>0$ is the unique solution to the distributional equality $W=_d W^*$,
where \begin{eqnarray} W^*=_d U^{1/\theta}(W+1) \qquad (1) \end{eqnarray} with
$W$ nonnegative with probability one, $U \sim {\cal U}[0,1]$ independent of
$W$, and $=_d$ denoting equality in distribution. Members of this family appear
in number theory, stochastic geometry, perpetuities and the study of
algorithms. We obtain bounds in Wasserstein type distances between ${\cal
D}_\theta$ and \begin{eqnarray} W_n= \frac{1}{n} \sum_{i=1}^n Y_k B_k \qquad
(2) \end{eqnarray} where $B_1,\ldots,B_n, Y_1, \ldots, Y_n$ are independent
with $B_k \sim {\rm Ber}(1/k), E[Y_k]=k, {\rm Var}(Y_k)=\sigma_k^2$ and provide
an application to the minimal directed spanning tree in $\mathbb{R}^2$, and
also obtain such bounds when the Bernoulli variables in $(2)$ are replaced by
Poissons. We also give simple proofs and provide bounds with optimal rates for
the Dickman convergence of the weighted sums, arising in probabilistic number
theory, of the form \begin{eqnarray} S_n=\frac{1}{\log(p_n)} \sum_{k=1}^n X_k
\log(p_k) \end{eqnarray} where $(p_k)_{k \ge 1}$ is an enumeration of the prime
numbers in increasing order and $X_k$ is Geometric with parameter $(11/p_k)$,
Bernoulli with success probability $1/(1+p_k)$ or Poisson with mean
$\lambda_k$.
In addition, we broaden the class of generalized Dickman distributions by
studying the fixed points of the transformation \begin{eqnarray*} s(W^*)=_d
U^{1/\theta}s(W+1) \end{eqnarray*} generalizing $(1)$, that allows the use of
nonidentity utility functions $s(\cdot)$ in Vervaat perpetuities. We obtain
distributional bounds for recursive methods that can be used to simulate from
this family.

Given a nonnegative random variable $W$ and $\theta>0$, let the generalized
Dickman transformation map the distribution of $W$ to that of $$ W^*=_d
U^{1/\theta}(W+1), $$ where $U \sim {\cal U}[0,1]$, a uniformly distributed
variable on the unit interval, independent of $W$, and where $=_d$ denotes
equality in distribution. It is well known that $W^*$ and $W$ are equal in
distribution if and only if $W$ has the generalized Dickman distribution ${\cal
D}_\theta$. We demonstrate that the Wasserstein distance $d_1$ between $W$, a
nonnegative random variable with finite mean, and $D_\theta$ having
distribution ${\cal D}_\theta$ obeys the inequality $$ d_1(W,D_\theta) \le
(1+\theta)d_1(W,W^*). $$ The specialization of this bound to the case
$\theta=1$ and coupling constructions yield $$ d_1(W_{n,1},D) \le \frac{8\log
(n/2)+10}{n} \quad \mbox{for all $n \ge 1$, where} \quad
W_{n,1}=\frac{1}{n}C_{n,1}1, $$ and $C_{n,m}$ is the number of comparisons
made by the Quickselect algorithm to find the $m^{th}$ smallest element of a
list of $n$ distinct numbers. A similar bound holds for $m \ge 2$, and together
recover the results of [12] that show distributional convergence of $W_n$ to
the standard Dickman distribution in the asymptotic regime $m=o(n)$. By
developing an exact expression for the expected running time $E[C_{n,m}]$,
lower bounds are provided that show the rate is not improvable for all $m \not
= 2$.

Size bias occurs famously in waitingtime paradoxes, undesirably in sampling
schemes, and unexpectedly in connection with Stein's method, tightness,
analysis of the lognormal distribution, Skorohod embedding, infinite
divisibility, and number theory. In this paper we review the basics and survey
some of these unexpected connections.

Given events $A$ and $B$ on a product space $S=\prod_{i=1}^n S_i$, the set $A
\Box B$ consists of all vectors ${\bf x}=(x_1,\ldots,x_n) \in S$ for which
there exist disjoint coordinate subsets $K$ and $L$ of $\{1,\ldots,n\}$ such
that given the coordinates $x_i, i \in K$ one has that ${\bf x} \in A$
regardless of the values of ${\bf x}$ on the remaining coordinates, and
likewise that ${\bf x} \in B$ given the coordinates {$x_j, j \in L$}.
For a finite product of discrete spaces endowed with a product measure, the
BKR inequality
$$
P(A \Box B) \le P(A)P(B) \quad (1)
$$
was conjectured by van den Berg and Kesten [3] and proved by Reimer [13].
In [7] inequality (1) was extended to general product probability spaces,
replacing $A \Box B$ by the set $A \Box_{11} B$ consisting of those outcomes
${\bf x}$ which only assure with probability one that ${\bf x} \in A$ and ${\bf
x} \in B$ based only on the revealed coordinates in $K$ and $L$ as above. A
strengthening of the original BKR inequality (1) results, due to the fact that
$A \Box B \subseteq A \Box_{11} B$. In particular, it may be the case that $A
\Box B$ is empty, while $A \Box_{11} B$ is not.
We propose the further extension $A \Box_{st} B$ depending on probability
thresholds $s$ and $t$, where $A \Box_{11} B$ is the special case where both
$s$ and $t$ take the value one. The outcomes ${\bf x}$ in $A \Box_{st} B$ are
those for which disjoint sets of coordinates $K$ and $L$ exist such that given
the values of $\bf x$ on the revealed set of coordinates $K$, the probability
that $A$ occurs is at least $s$, and given the coordinates of $\bf x$ in $L$,
the probability of $B$ is at least $t$. We provide simple examples that
illustrate the utility of these extensions.

Performance guarantees for compression in nonlinear models under nonGaussian
observations can be achieved through the use of distributional characteristics
that are sensitive to the distance to normality, and which in particular return
the value of zero under Gaussian or linear sensing. The use of these
characteristics, or discrepancies, improves some previous results in this area
by relaxing conditions and tightening performance bounds. In addition, these
characteristics are tractable to compute when Gaussian sensing is corrupted by
either additive errors or mixing.

We provide nonasymptotic $L^1$ bounds to the normal for four wellknown
models in statistical physics and particle systems in $\mathbb{Z}^d$; the
ferromagnetic nearestneighbor Ising model, the supercritical bond percolation
model, the voter model and the contact process. In the Ising model, we obtain
an $L^1$ distance bound between the total magnetization and the normal
distribution at any temperature when the magnetic moment parameter is nonzero,
and when the inverse temperature is below critical and the magnetic moment
parameter is zero. In the percolation model we obtain such a bound for the
total number of points in a finite region belonging to an infinite cluster in
dimensions $d \ge 2$, in the voter model for the occupation time of the origin
in dimensions $d \ge 7$, and for finite time integrals of nonconstant
increasing cylindrical functions evaluated on the one dimensional supercritical
contact process started in its unique invariant distribution.
The tool developed for these purposes is a version of Stein's method adapted
to positively associated random variables. In one dimension, letting
$\boldsymbol{\xi}=(\xi_1,\ldots,\xi_m)$ be a positively associated mean zero
random vector with components that obey the bound $\xi_i \le B,
i=1,\ldots,m$, and whose sum $W = \sum_{i=1}^m \xi_i$ has variance 1, it holds
that $$ d_1 \left(\mathcal{L}(W),\mathcal{L}(Z) \right) \leq 5B +
\sqrt{\frac{8}{\pi}}\sum_{i \neq j} \mathbb{E}[\xi_i \xi_j] $$ where $Z$ has
the standard normal distribution and $d_1(\cdot,\cdot)$ is the $L^1$ metric.
Our methods apply in the multidimensional case with the $L^1$ metric replaced
by a smooth function metric.

Thresholdtype counts based on multivariate occupancy models with log concave
marginals admit bounded size biased couplings under weak conditions, leading to
new concentration of measure results for random graphs, germgrain models in
stochastic geometry, multinomial allocation models and multivariate
hypergeometric sampling. The work generalizes and improves upon previous
results in a number of directions.

In a proximity region graph ${\cal G}$ in $\mathbb{R}^d$, two distinct points
$x,y$ of a point process $\mu$ are connected when the 'forbidden region'
$S(x,y)$ these points determine has empty intersection with $\mu$. The Gabriel
graph, where $S(x,y)$ is the open disc with diameter the line segment
connecting $x$ and $y$, is one canonical example. When $\mu$ is a Poisson or
binomial process, under broad conditions on the regions $S(x,y)$, bounds on the
Kolmogorov and Wasserstein distances to the normal are produced for functionals
of ${\cal G}$, including the total number of edges and the total length.
Variance lower bounds, not requiring strong stabilization, are also proven to
hold for a class of such functionals.

Let $\lambda$ be the second largest eigenvalue in absolute value of a uniform
random $d$regular graph on $n$ vertices. It was famously conjectured by Alon
and proved by Friedman that if $d$ is fixed independent of $n$, then
$\lambda=2\sqrt{d1} +o(1)$ with high probability. In the present work we show
that $\lambda=O(\sqrt{d})$ continues to hold with high probability as long as
$d=O(n^{2/3})$, making progress towards a conjecture of Vu that the bound holds
for all $1\le d\le n/2$. Prior to this work the best result was obtained by
Broder, Frieze, Suen and Upfal (1999) using the configuration model, which hits
a barrier at $d=o(n^{1/2})$. We are able to go beyond this barrier by proving
concentration of measure results directly for the uniform distribution on
$d$regular graphs. These come as consequences of advances we make in the
theory of concentration by size biased couplings. Specifically, we obtain
Bennetttype tail estimates for random variables admitting certain unbounded
size biased couplings.

Strong embeddings, that is, couplings between a partial sum process of a
sequence of random variables and a Brownian motion, have found numerous
applications in probability and statistics. We extend Chatterjee's novel use of
Stein's method for $\{1,+1\}$ valued variables to a general class of discrete
distributions, and provide $\log n$ rates for the coupling of partial sums of
independent variables to a Brownian motion, and results for coupling sums of
suitably standardized exchangeable variables to a Brownian bridge.

We study highdimensional signal recovery from nonlinear measurements with
design vectors having elliptically symmetric distribution. Special attention is
devoted to the situation when the unknown signal belongs to a set of low
statistical complexity, while both the measurements and the design vectors are
heavytailed. We propose and analyze a new estimator that adapts to the
structure of the problem, while being robust both to the possible model
misspecification characterized by arbitrary nonlinearity of the measurements
as well as to data corruption modeled by the heavytailed distributions.
Moreover, this estimator has low computational complexity. Our results are
expressed in the form of exponential concentration inequalities for the error
of the proposed estimator. On the technical side, our proofs rely on the
generic chaining methods, and illustrate the power of this approach for
statistical applications. Theory is supported by numerical experiments
demonstrating that our estimator outperforms existing alternatives when data is
heavytailed.

The BKR inequality conjectured by van den Berg and Kesten in [11], and proved
by Reimer in [8], states that for $A$ and $B$ events on $S$, a finite product
of finite sets $S_i,i=1,\ldots,n$, and $P$ any product measure on $S$, $$ P(A
\Box B) \le P(A)P(B),$$ where the set $A \Box B$ consists of the elementary
events which lie in both $A$ and $B$ for `disjoint reasons.' Precisely, with
${\bf n}:=\{1,\ldots,n\}$ and $K \subset {\bf n}$, for ${\bf x} \in S$ letting
$[{\bf x}]_K=\{{\bf y} \in S: y_i = x_i, i \in K\}$, the set $A \Box B$
consists of all ${\bf x} \in S$ for which there exist disjoint subsets $K$ and
$L$ of ${\bf n}$ for which $[{\bf x}]_K \subset A$ and $[{\bf x}]_L \subset B$.
The BKR inequality is extended to the following functional version on a
general finite product measure space $(S,\mathbb{S})$ with product probability
measure $P$, $$E\left\{ \max_{\stackrel{K \cap L = \emptyset}{K \subset {\bf
n}, L \subset {\bf n}}}
\underline{f}_K({\bf X})\underline{g}_L({\bf
X})\right\} \leq E\left\{f({\bf X})\right\}\,E\left\{g({\bf
X})\right\},$$ where $f$ and $g$ are nonnegative measurable functions,
$\underline{f}_K({\bf x}) = {\rm ess} \inf_{{\bf y} \in [{\bf x}]_K}f({\bf y})$
and $\underline{g}_L({\bf x}) = {\rm ess} \inf_{{\bf y} \in [{\bf x}]_L}g({\bf
y}).$ The original BKR inequality is recovered by taking $f({\bf x})={\bf
1}_A({\bf x})$ and $g({\bf x})={\bf 1}_B({\bf x})$, and applying the fact that
in general ${\bf 1}_{A \Box B} \le \max_{K \cap L = \emptyset}
\underline{f}_K({\bf x}) \underline{g}_L({\bf x})$.
Related formulations, and functional versions of the dual inequality on
events by Kahn, Saks, and Smyth [6], are also considered. Applications include
order statistics, assignment problems, and paths in random graphs.

With ${\mathcal{Q}}_{q,n}$ the distribution of $n$ minus the rank of a matrix
chosen uniformly from the collection of all $n\times(n+m)$ matrices over the
finite field $\mathbb{F}_q$ of size $q\ge2$, and ${\mathcal{Q}}_q$ the
distributional limit of ${\mathcal{Q}}_{q,n}$ as $n\rightarrow\infty$, we apply
Stein's method to prove the total variation bound
$\frac{1}{8q^{n+m+1}}\leq\{\mathcal{Q}}_{q,n}{\mathcal{Q}}_q\_{\mathrm{TV}}\leq\frac{3}{q^{n+m+1}}$.
In addition, we obtain similar sharp results for the rank distributions of
symmetric, symmetric with zero diagonal, skew symmetric, skew centrosymmetric
and Hermitian matrices.

Intrinsic volumes of convex sets are natural geometric quantities that also
play important roles in applications, such as linear inverse problems with
convex constraints, and constrained statistical inference. It is a wellknown
fact that, given a closed convex cone $C\subset \mathbb{R}^d$, its conic
intrinsic volumes determine a probability measure on the finite set
$\{0,1,...d\}$, customarily denoted by $\mathcal{L}(V_C)$. The aim of the
present paper is to provide a BerryEsseen bound for the normal approximation
of ${\cal L}(V_C)$, implying a general quantitative central limit theorem (CLT)
for sequences of (correctly normalised) discrete probability measures of the
type $\mathcal{L}(V_{C_n})$, $n\geq 1$. This bound shows that, in the
highdimensional limit, most conic intrinsic volumes encountered in
applications can be approximated by a suitable Gaussian distribution. Our
approach is based on a variety of techniques, namely: (1) Steiner formulae for
closed convex cones, (2) Stein's method and second order Poincar\'e inequality,
(3) concentration estimates, and (4) Fourier analysis. Our results explicitly
connect the sharp phase transitions, observed in many regularised linear
inverse problems with convex constraints, with the asymptotic Gaussian
fluctuations of the intrinsic volumes of the associated descent cones. In
particular, our findings complete and further illuminate the recent
breakthrough discoveries by Amelunxen, Lotz, McCoy and Tropp (2014) and McCoy
and Tropp (2014) about the concentration of conic intrinsic volumes and its
connection with threshold phenomena. As an additional outgrowth of our work we
develop total variation bounds for normal approximations of the lengths of
projections of Gaussian vectors on closed convex sets.

Consider a uniformly chosen random reduced decomposition of the longest
element in the symmetric group. It is known that the location of the first
transposition in this decomposition converges to the semicircle distribution.
In this note we provide a sharp error term for this result, using the
"comparison of generators" approach to Stein's method.

The tails of the distribution of a mean zero, variance $\sigma^2$ random
variable $Y$ satisfy concentration of measure inequalities of the form
$\mathbb{P}(Y \ge t) \le \exp(B(t))$ for $$ B(t)=\frac{t^2}{2( \sigma^2 + ct)}
\quad \mbox{for $t \ge 0$, and} \quad B(t)=\frac{t}{c}\left( \log t  \log \log
t  \frac{\sigma^2}{c}\right) \quad \mbox{for $t>e$} $$ whenever there exists a
zero biased coupling of $Y$ bounded by $c$, under suitable conditions on the
existence of the moment generating function of $Y$. These inequalities apply in
cases where $Y$ is not a function of independent variables, such as for the
Hoeffding statistic $Y=\sum_{i=1}^n a_{i\pi(i)}$ where $A=(a_{ij})_{1 \le i,j
\le n} \in \mathbb{R}^{n \times n}$ and the permutation $\pi$ has the uniform
distribution over the symmetric group, and when its distribution is constant on
cycle type.

Applying Stein's method, an inductive technique and size bias coupling yields
a BerryEsseen theorem for normal approximation without the usual restriction
that the coupling be bounded. The theorem is applied to counting the number of
vertices in the ErdosRenyi random graph of a given degree.

Using a characterizing equation for the Beta distribution, Stein's method is
applied to obtain bounds of the optimal order for the Wasserstein distance
between the distribution of the scaled number of white balls drawn from a
P\'olyaEggenberger urn and its limiting Beta distribution. The bound is
computed by making a direct comparison between characterizing operators of the
target and the Beta distribution, the former derived by extending Stein's
density approach to discrete distributions.
In addition, refinements are given to D\"obler's result [12] for the Arcsine
approximation for the fraction of time a simple random walk of even length
spends positive, and so also to the distributions of its last return time to
zero and its first visit to its terminal point, by supplying explicit constants
to the present Wasserstein bound and also demonstrating that its rate is of the
optimal order.

In the so called lightbulb process, on days r=1,..,n, out of n lightbulbs,
all initially off, exactly r bulbs selected uniformly and independent of the
past have their status changed from off to on, or vice versa. With W_n the
number of bulbs on at the terminal time n and C_n a suitable clubbed binomial
distribution, d_{TV}(W_n,C_n) \le 2.7314 \sqrt{n} e^{(n+1)/3} for all n \ge 1.
The result is shown using Stein's method.

Let $Y$ be a nonnegative random variable with mean $\mu$ and finite positive
variance $\sigma^2$, and let $Y^s$, defined on the same space as $Y$, have the
$Y$ size biased distribution, that is, the distribution characterized by
E[Yf(Y)]=\mu E f(Y^s) for all functions $f$ for which these expectations exist.
Under a variety of conditions on the coupling of Y and $Y^s$, including
combinations of boundedness and monotonicity, concentration of measure
inequalities hold. Examples include the number of relatively ordered
subsequences of a random permutation, sliding window statistics including the
number of mruns in a sequence of coin tosses, the number of local maximum of a
random function on a lattice, the number of urns containing exactly one ball in
an urn allocation model, the volume covered by the union of $n$ balls placed
uniformly over a volume n subset of d dimensional Euclidean space, the number
of bulbs switched on at the terminal time in the so called lightbulb process,
and the infinitely divisible and compound Poisson distributions that satisfy a
bounded moment generating function condition.

A concentration of measure result is proved for the number of isolated
vertices $Y$ in the Erd\H{o}sR\'{e}nyi random graph model on $n$ edges with
edge probability $p$. When $\mu$ and $\sigma^2$ denote the mean and variance of
$Y$ respectively, $P((Y\mu)/\sigma\ge t)$ admits a bound of the form
$e^{kt^2}$ for some constant positive $k$ under the assumption $p \in (0,1)$
and $np\rightarrow c \in (0,\infty)$ as $n \rightarrow \infty$. The left tail
inequality $$ P(\frac{Y\mu}{\sigma}\le t)&\le&
\exp(\frac{t^2\sigma^2}{4\mu}) $$ holds for all $n \in {2,3,...},p \in (0,1)$
and $t \ge 0$. The results are shown by coupling $Y$ to a random variable $Y^s$
having the $Y$size biased distribution, that is, the distribution
characterized by $E[Yf(Y)]=\mu E[f(Y^s)] $ for all functions $f$ for which
these expectations exist.

The tools of zero biasing are adapted to yield a general result suitable for
analyzing the behavior of certain growth processes. The main theorem is applied
to prove central limit theorems, with explicit error terms in the L^1 metric,
for certain statistics of the Jack measure on partitions and for the number of
balls drawn in a PolyaEggenberger urn process.

We compare estimators of the (essential) supremum and the integral of a
function $f$ defined on a measurable space when $f$ may be observed at a sample
of points in its domain, possibly with error. The estimators compared vary in
their levels of stratification of the domain, with the result that more refined
stratification is better with respect to different criteria. The emphasis is on
criteria related to stochastic orders. For example, rather than compare
estimators of the integral of $f$ by their variances (for unbiased estimators),
or mean square error, we attempt the stronger comparison of convex order when
possible. For the supremum, the criterion is based on the stochastic order of
estimators.

We study a class of optimal allocation problems, including the wellknown
Bomber Problem, with the following common probabilistic structure. An aircraft
equipped with an amount~$x$ of ammunition is intercepted by enemy airplanes
arriving according to a homogenous Poisson process over a fixed time
duration~$t$. Upon encountering an enemy, the aircraft has the choice of
spending any amount~$0\le y\le x$ of its ammunition, resulting in the
aircraft's survival with probability equal to some known increasing function of
$y$. Two different goals have been considered in the literature concerning the
optimal amount~$K(x,t)$ of ammunition spent: (i)~Maximizing the probability of
surviving for time~$t$, which is the socalled Bomber Problem, and (ii)
maximizing the number of enemy airplanes shot down during time~$t$, which we
call the Fighter Problem. Several authors have attempted to settle the
following conjectures about the monotonicity of $K(x,t)$: [A] $K(x,t)$ is
decreasing in $t$, [B] $K(x,t)$ is increasing in $x$, and [C] the
amount~$xK(x,t)$ held back is increasing in $x$. [A] and [C] have been shown
for the Bomber Problem with discrete ammunition, while [B] is still an open
question. In this paper we consider both time and ammunition continuous, and
for the Bomber Problem prove [A] and [C], while for the Fighter we prove [A]
and [C] for one special case and [B] and [C] for another. These proofs involve
showing that the optimal survival probability and optimal number shot down are
totally positive of order 2 ($\mbox{TP}_2$) in the Bomber and Fighter Problems,
respectively. The $\mbox{TP}_2$ property is shown by constructing convergent
sequences of approximating functions through an iterative operation which
preserves $\mbox{TP}_2$ and other properties.