
Size bias occurs famously in waiting-time paradoxes, undesirably in sampling
schemes, and unexpectedly in connection with Stein's method, tightness,
analysis of the lognormal distribution, Skorohod embedding, infinite
divisibility, and number theory. In this paper we review the basics and survey
some of these unexpected connections.
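
As a small illustration of the basics: for a nonnegative discrete random variable $X$ with finite positive mean, the size-biased version $X^*$ has $P(X^*=x) = x\,P(X=x)/E[X]$. The Python sketch below computes this for a finite probability mass function. The function name `size_bias` and the example distribution are illustrative assumptions, not taken from the paper.

```python
def size_bias(pmf):
    """Return the size-biased pmf: P(X* = x) = x * P(X = x) / E[X].

    `pmf` maps nonnegative values x to probabilities P(X = x).
    """
    mean = sum(x * p for x, p in pmf.items())
    if mean <= 0:
        raise ValueError("size bias needs a positive mean")
    # Values with x * p == 0 get zero mass and are dropped.
    return {x: x * p / mean for x, p in pmf.items() if x * p > 0}


# Hypothetical example: gaps between buses are 1 or 3 minutes, equally likely.
gaps = {1: 0.5, 3: 0.5}
biased = size_bias(gaps)
mean_gap_seen = sum(x * p for x, p in biased.items())
```

In this hypothetical example a rider arriving at a uniformly random time falls in a 3-minute gap with probability 0.75, so the mean gap the rider sees is 2.5 minutes rather than the nominal 2. This is the waiting-time paradox in miniature.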

Asymptotics for Dickman's number-theoretic function $\rho(u)$, as $u
\rightarrow \infty$, were given by de Bruijn and Alladi, and later in sharper form
by Hildebrand and Tenenbaum. The perspective in these works is that of analytic
number theory. However, the function $\rho(\cdot)$ also arises as a constant
multiple of a certain probability density connected with a scale invariant
Poisson process, and we observe that Dickman asymptotics can be interpreted as
a Gaussian local limit theorem for the sum of arrivals in a tilted Poisson
process, combined with untilting.
In this paper we exploit and extend this reasoning to obtain analogous
asymptotic formulas for a class of functions including, in addition to
Dickman's function, the densities of random variables having L\'evy measure
with support contained in $[0,1]$, subject to mild regularity assumptions.
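
For orientation, Dickman's function can be tabulated numerically from its standard characterization: $\rho(u)=1$ for $0 \le u \le 1$ and $u\rho'(u) = -\rho(u-1)$ for $u>1$. The Python sketch below integrates this delay equation on a uniform grid with the trapezoid rule. The grid scheme and the name `dickman_rho_table` are assumptions made for illustration. This is not the tilting argument of the paper, only a way to produce values against which asymptotic formulas could be compared.

```python
import math


def dickman_rho_table(u_max, steps_per_unit=1000):
    """Tabulate rho on the grid u_k = k / steps_per_unit, 0 <= u_k <= u_max.

    Uses rho(u) = 1 on [0, 1] and rho'(u) = -rho(u - 1) / u for u > 1,
    integrated with the trapezoid rule (explicit, since the delay is 1).
    """
    n = steps_per_unit
    h = 1.0 / n
    total = int(round(u_max * n))
    rho = [1.0] * (total + 1)
    for k in range(n + 1, total + 1):
        left = rho[k - 1 - n] / ((k - 1) * h)   # rho(u_{k-1} - 1) / u_{k-1}
        right = rho[k - n] / (k * h)            # rho(u_k - 1) / u_k
        rho[k] = rho[k - 1] - 0.5 * h * (left + right)
    return rho


table = dickman_rho_table(3.0)
rho_2 = table[2000]   # value at u = 2
rho_3 = table[3000]   # value at u = 3
```

On $[1,2]$ the exact value is $\rho(u) = 1 - \log u$, which gives a simple check of the scheme at $u=2$.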

Billingsley's theorem (1972) asserts that the Poisson-Dirichlet process is
the limit, as $n \to \infty$, of the process giving the relative log sizes of
the largest prime factor, the second largest, and so on, of a random integer
chosen uniformly from 1 to $n$. In this paper we give a new proof that directly
exploits Dickman's asymptotic formula for the number of such integers with no
prime factor larger than $n^{1/u}$, namely $\Psi(n,n^{1/u}) \sim n \rho(u)$, to
derive the limiting joint density functions of the finite-dimensional
projections of the log prime factor processes. Our main technical tool is a new
criterion for the convergence in distribution of nonlattice discrete random
variables to continuous random variables.
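
To make the quantity $\Psi(n,y)$ concrete, the Python sketch below counts the integers from $1$ to $n$ with no prime factor larger than $y$, using a sieve for the largest prime factor. The function names are illustrative assumptions. The sketch only evaluates $\Psi$ for small $n$ and plays no part in the proof.

```python
def largest_prime_factors(n):
    """Return a list lpf with lpf[m] = largest prime factor of m (lpf[1] = 1)."""
    lpf = [1] * (n + 1)
    for p in range(2, n + 1):
        if lpf[p] == 1:                      # untouched by smaller primes: p is prime
            for m in range(p, n + 1, p):
                lpf[m] = p                   # later (larger) primes overwrite earlier ones
    return lpf


def count_smooth(n, y):
    """Psi(n, y): number of integers in [1, n] with no prime factor exceeding y."""
    lpf = largest_prime_factors(n)
    return sum(1 for m in range(1, n + 1) if lpf[m] <= y)


psi_small = count_smooth(10, 3)     # 1, 2, 3, 4, 6, 8, 9
psi_100 = count_smooth(100, 10)     # 7-smooth numbers up to 100
```

For $u=2$ the limit formula predicts $\Psi(n,\sqrt{n}) \approx n(1-\log 2)$. At $n=100$ the agreement is still rough, as one would expect from an asymptotic statement.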

Let $p_1 \ge p_2 \ge \dots$ be the prime factors of a random integer chosen
uniformly from $1$ to $n$, and let $$ \frac{\log p_1}{\log n}, \frac{\log
p_2}{\log n}, \dots $$ be the sequence of scaled log factors. Billingsley's
Theorem (1972), in its modern formulation, asserts that the limiting process,
as $n \to \infty$, is the Poisson-Dirichlet process with parameter $\theta =1$.
In this paper we give a new proof, inspired by the 1993 proof by Donnelly and
Grimmett, and extend the result to factorizations of elements of normed
arithmetic semigroups satisfying certain growth conditions, for which the
limiting Poisson-Dirichlet process need not have $\theta =1$. We also establish
Poisson-Dirichlet limits, with $\theta \ne 1$, for ordinary integers
conditional on the number of prime factors deviating from the usual value $\log
\log n$.
At the core of our argument is a purely probabilistic lemma giving a new
criterion for convergence in distribution to a Poisson-Dirichlet process, from
which the number-theoretic applications follow as straightforward corollaries.
The lemma uses ingredients similar to those employed by Donnelly and Grimmett,
but reorganized so as to allow subsequent number theory input to be processed
as rapidly as possible.
A byproduct of this work is a new characterization of Poisson-Dirichlet
processes in terms of multi-intensities.
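
The sequence of scaled log factors defined above can be computed directly for a single integer. The Python sketch below does so by trial division. Listing prime factors with multiplicity is an assumption of this sketch, and the names `prime_factors_desc` and `scaled_log_factors` are hypothetical.

```python
import math


def prime_factors_desc(m):
    """Prime factors of m >= 1, with multiplicity, in decreasing order."""
    factors = []
    p = 2
    while p * p <= m:
        while m % p == 0:
            factors.append(p)
            m //= p
        p += 1
    if m > 1:                 # whatever remains is a single prime
        factors.append(m)
    return sorted(factors, reverse=True)


def scaled_log_factors(m, n):
    """Return log(p_i) / log(n) for the prime factors p_1 >= p_2 >= ... of m."""
    if n < 2:
        raise ValueError("n must be at least 2")
    return [math.log(p) / math.log(n) for p in prime_factors_desc(m)]


factors_360 = prime_factors_desc(360)          # 360 = 2^3 * 3^2 * 5
scaled_360 = scaled_log_factors(360, 360)
```

With multiplicity included, the scaled values for $m$ sum to $\log m/\log n$, which is at most $1$ for $m \le n$ and equals $1$ when $m=n$.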

We present new, exceptionally efficient proofs of Poisson-Dirichlet limit
theorems for the scaled sizes of irreducible components of random elements in
the classic combinatorial contexts of arbitrary assemblies, multisets, and
selections, when the component generating functions satisfy certain standard
hypotheses. The proofs exploit a new criterion for Poisson-Dirichlet limits,
originally designed for rapid proofs of Billingsley's theorem on the scaled
sizes of log prime factors of random integers (and some new generalizations).
Unexpectedly, the technique applies in the present combinatorial setting as
well, giving, perhaps, a long sought-after unifying point of view. The proofs
depend also on formulas of Arratia and Tavar{\'e} for the mixed moments of
counts of components of various sizes, as well as formulas of Flajolet and
Soria for the asymptotics of generating function coefficients.

According to a 1975 result of T. Kaijser, if some nonvanishing product of
hidden Markov model (HMM) stepping matrices is subrectangular, and the
underlying chain is aperiodic, the corresponding $\alpha$-chain has a unique
invariant limiting measure $\lambda$. Here the $\alpha$-chain
$\{\alpha_n\}=\{(\alpha_{ni})\}$ is given by \[\alpha_{ni}=P(X_n=i \mid
Y_n,Y_{n-1},\ldots),\] where $\{(X_n,Y_n)\}$ is a finite state HMM with unobserved
Markov chain component $\{X_n\}$ and observed output component $\{Y_n\}$. This
defines $\{\alpha_n\}$ as a stochastic process taking values in the probability
simplex. It is not hard to see that $\{\alpha_n\}$ is itself a Markov chain.
The stepping matrices $M(y)=(M(y)_{ij})$ give the probability that
$(X_n,Y_n)=(j,y)$, conditional on $X_{n-1}=i$. A matrix is said to be
subrectangular if the locations of its nonzero entries form a Cartesian
product of a set of row indices and a set of column indices. Kaijser's result
is based on an application of the Furstenberg-Kesten theory to the random
matrix products $M(Y_1)M(Y_2)\cdots M(Y_n)$. In this paper we prove a slightly
stronger form of Kaijser's theorem with a simpler argument, exploiting the
theory of e-chains.
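
The two objects in this statement are easy to make concrete. The Python sketch below checks the subrectangular property as defined above, and performs one step of the standard forward-filter recursion $\alpha_n \propto \alpha_{n-1} M(Y_n)$. Starting the recursion from a finite prior, rather than conditioning on the infinite past, is an assumption of the sketch. The two-state stepping matrices are hypothetical numbers chosen only so that the rows of $\sum_y M(y)$ sum to one.

```python
def is_subrectangular(matrix):
    """True if the nonzero entries form (row set) x (column set).

    The zero matrix passes trivially (empty product).
    """
    rows = {i for i, row in enumerate(matrix) if any(v != 0 for v in row)}
    cols = {j for row in matrix for j, v in enumerate(row) if v != 0}
    return all(matrix[i][j] != 0 for i in rows for j in cols)


def alpha_step(alpha, step_matrix):
    """One filter update: new alpha is alpha * M(y), renormalized to sum to 1."""
    size = len(alpha)
    unnorm = [sum(alpha[i] * step_matrix[i][j] for i in range(size))
              for j in range(size)]
    total = sum(unnorm)
    if total == 0:
        raise ValueError("observation has probability zero under alpha")
    return [v / total for v in unnorm]


# Hypothetical stepping matrices: M[y][i][j] = P(X_n = j, Y_n = y | X_{n-1} = i).
M = {
    0: [[0.45, 0.05], [0.10, 0.10]],
    1: [[0.05, 0.45], [0.40, 0.40]],
}
first = alpha_step([0.5, 0.5], M[0])
alpha = [0.5, 0.5]
for y in [0, 1, 1, 0]:          # an arbitrary observed output sequence
    alpha = alpha_step(alpha, M[y])
```

Each update keeps $\alpha_n$ in the probability simplex, which is the state space of the $\alpha$-chain. In this toy example every $M(y)$ has all entries positive, so each is trivially subrectangular.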