-
We study full Bayesian procedures for sparse linear regression when errors
have a symmetric but otherwise unknown distribution. The unknown error
distribution is endowed with a symmetrized Dirichlet process mixture of
Gaussians. For the prior on regression coefficients, a mixture of point masses
at zero and continuous distributions is considered. We study the behavior of the
posterior as the number of predictors diverges. Conditions are provided for
consistency in the mean Hellinger distance. The compatibility and restricted
eigenvalue conditions yield the minimax convergence rate of the regression
coefficients in $\ell_1$- and $\ell_2$-norms, respectively. The convergence
rate is adaptive to both the unknown sparsity level and the unknown symmetric
error density under compatibility conditions. In addition, strong model
selection consistency and a semi-parametric Bernstein-von Mises theorem are
proven under slightly stronger conditions.
-
It is becoming increasingly common to see large collections of network data
objects -- that is, data sets in which a network is viewed as a fundamental
unit of observation. As a result, there is a pressing need to develop
network-based analogues of even many of the most basic tools already standard
for scalar and vector data. In this paper, our focus is on averages of
unlabeled, undirected networks with edge weights. Specifically, we (i)
characterize a certain notion of the space of all such networks, (ii) describe
key topological and geometric properties of this space relevant to doing
probability and statistics thereupon, and (iii) use these properties to
establish the asymptotic behavior of a generalized notion of an empirical mean
under sampling from a distribution supported on this space. Our results rely on
a combination of tools from geometry, probability theory, and statistical shape
analysis. In particular, the lack of vertex labeling necessitates working with
a quotient space modding out permutations of labels. This results in a
nontrivial geometry for the space of unlabeled networks, which in turn is found
to have important implications on the types of probabilistic and statistical
results that may be obtained and the techniques needed to obtain them.
-
Assuming a banded structure is a common practice in the estimation
of high-dimensional precision matrices. In this case, estimating the bandwidth of
the precision matrix is a crucial initial step for subsequent analysis.
Although there exist some consistent frequentist tests for the bandwidth
parameter, bandwidth selection consistency for precision matrices has not been
established in a Bayesian framework. In this paper, we propose a prior
distribution tailored to the bandwidth estimation of high-dimensional precision
matrices. The banded structure is imposed via the Cholesky factor from the
modified Cholesky decomposition. We establish the strong model selection
consistency for the bandwidth as well as the consistency of the Bayes factor.
The convergence rates for Bayes factors under both the null and alternative
hypotheses are derived and are of similar order. As a by-product, we
also propose an estimation procedure for the Cholesky factors that attains an
almost optimal order of convergence rates. A two-sample bandwidth test is also
considered, and it turns out that our method is able to consistently detect the
equality of bandwidths between two precision matrices. The simulation study
confirms that our method in general outperforms or is comparable to the
existing frequentist and Bayesian methods.
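
For orientation, the standard form of the modified Cholesky decomposition can be written as follows. The symbols $A$, $D$, $a_{jl}$, $d_j$ and $k$ are notation assumed here for illustration and are not taken from the abstract; the display is a sketch of the usual construction, not necessarily the exact parametrization used in the paper.

```latex
% Assumed notation: X ~ N_p(0, \Omega^{-1}), A strictly lower triangular,
% D = diag(d_1, ..., d_p) with d_j > 0, and k the bandwidth.
\[
  \Omega = (I_p - A)^{\top} D^{-1} (I_p - A),
  \qquad a_{jl} = 0 \ \text{whenever } j - l > k .
\]
% Equivalently, each coordinate is regressed on at most its k predecessors:
\[
  X_j = \sum_{l=\max(1,\, j-k)}^{j-1} a_{jl} X_l + \epsilon_j ,
  \qquad \epsilon_j \sim N(0, d_j) \ \text{independently.}
\]
```

Under this form, selecting the bandwidth amounts to selecting how many preceding coordinates enter each autoregression.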
-
Asymptotic theory of tail index estimation has been studied extensively in
the frequentist literature on extreme values, but rarely in the Bayesian
context. We investigate whether popular Bayesian kernel mixture models are able
to support heavy tailed distributions and consistently estimate the tail index.
We show that posterior inconsistency in tail index is surprisingly common for
both parametric and nonparametric mixture models. We then present a set of
sufficient conditions under which posterior consistency in tail index can be
achieved, and verify these conditions for Pareto mixture models under general
mixing priors.
-
Effective and accurate model selection is an important problem in modern data
analysis. One of the major challenges is the computational burden required to
handle large data sets that cannot be stored or processed on one machine.
Another challenge one may encounter is the presence of outliers and
contaminations that damage the inference quality. The parallel "divide and
conquer" model selection strategy divides the observations of the full data set
into roughly equal subsets and performs inference and model selection
independently on each subset. After local subset inference, this method
aggregates the posterior model probabilities or other model/variable selection
criteria to obtain a final model by using the notion of geometric median. This
approach leads to improved concentration in finding the "correct" model and
model parameters and also is provably robust to outliers and data
contamination.
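
The aggregation step can be illustrated with a minimal sketch, assuming that each subset reports a vector of posterior model probabilities and that the geometric median is taken in Euclidean distance and computed by Weiszfeld's iteration. The function name `geometric_median`, the toy probabilities, and the choice of distance are assumptions made for illustration; the abstract does not specify them.

```python
import math

def geometric_median(points, tol=1e-9, max_iter=1000):
    """Weiszfeld iteration for the geometric median of equal-length vectors."""
    dim = len(points[0])
    # Start from the coordinate-wise mean.
    current = [sum(p[k] for p in points) / len(points) for k in range(dim)]
    for _ in range(max_iter):
        weight_sum = 0.0
        weighted = [0.0] * dim
        for p in points:
            dist = math.dist(p, current)
            # Guard against division by zero if the iterate hits a data point.
            w = 1.0 / max(dist, 1e-12)
            weight_sum += w
            for k in range(dim):
                weighted[k] += w * p[k]
        updated = [v / weight_sum for v in weighted]
        if math.dist(updated, current) < tol:
            return updated
        current = updated
    return current

# Hypothetical posterior model probabilities over three candidate models,
# one vector per data subset; the last subset is contaminated by outliers.
subset_probs = [
    [0.80, 0.15, 0.05],
    [0.75, 0.20, 0.05],
    [0.82, 0.12, 0.06],
    [0.78, 0.17, 0.05],
    [0.05, 0.05, 0.90],
]
median_probs = geometric_median(subset_probs)
mean_probs = [sum(col) / len(subset_probs) for col in zip(*subset_probs)]
selected_model = max(range(3), key=lambda k: median_probs[k])
```

In this toy example the contaminated subset pulls the plain average of the first model's probability down to 0.64, whereas the geometric median stays close to the values reported by the four clean subsets.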
-
This article provides an exposition of recent methodologies for nonparametric
analysis of digital observations on images and other non-Euclidean objects.
Fr\'echet means of distributions on metric spaces, such as manifolds and
stratified spaces, have played an important role in this endeavor. Apart from
theoretical issues of uniqueness of the Fr\'echet minimizer and the asymptotic
distribution of the sample Fr\'echet mean under uniqueness, applications to
image analysis are highlighted. In addition, nonparametric Bayes theory is
brought to bear on the problems of density estimation and classification on
manifolds.
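
For reference, the Fr\'echet mean mentioned above is usually defined as follows. The symbols $M$, $\rho$, $Q$, $F$ and $X_1, \dots, X_n$ are notation assumed here for illustration.

```latex
% Assumed notation: (M, \rho) a metric space, Q a probability measure on M,
% X_1, ..., X_n an i.i.d. sample from Q.
\[
  F(p) = \int_M \rho^2(p, x)\, Q(dx), \qquad
  F_n(p) = \frac{1}{n} \sum_{i=1}^{n} \rho^2(p, X_i), \qquad p \in M .
\]
% The Fr\'echet mean set of Q is the set of minimizers of F; the sample
% Fr\'echet mean is any measurable selection from the minimizers of F_n.
```

When the minimizer of $F$ is unique it is called the Fr\'echet mean of $Q$, which is the uniqueness issue referred to above.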
-
We propose a class of intrinsic Gaussian processes (in-GPs) for
interpolation, regression and classification on manifolds with a primary focus
on complex constrained domains or irregularly shaped spaces arising as subsets or
submanifolds of $\mathbb{R}$, $\mathbb{R}^2$, $\mathbb{R}^3$ and beyond. For example, in-GPs can accommodate
spatial domains arising as complex subsets of Euclidean space. In-GPs respect
the potentially complex boundary or interior conditions as well as the
intrinsic geometry of the spaces. The key novelty of the proposed approach is
to utilise the relationship between heat kernels and the transition density of
Brownian motion on manifolds for constructing and approximating valid and
computationally feasible covariance kernels. This enables in-GPs to be
practically applied in great generality, while existing approaches for
smoothing on constrained domains are limited to simple special cases. The broad
utility of the in-GP approach is illustrated through simulation studies and
data examples.
-
Gaussian processes (GPs) are very widely used for modeling of unknown
functions or surfaces in applications ranging from regression to classification
to spatial processes. Although there is an increasingly vast literature on
applications, methods, theory and algorithms related to GPs, the overwhelming
majority of this literature focuses on the case in which the input domain
corresponds to a Euclidean space. However, particularly in recent years with
the increasing collection of complex data, it is commonly the case that the
input domain does not have such a simple form. For example, it is common for
the inputs to be restricted to a non-Euclidean manifold, a case which forms the
motivation for this article. In particular, we propose a general extrinsic
framework for GP modeling on manifolds, which relies on embedding of the
manifold into a Euclidean space and then constructing extrinsic kernels for GPs
on their images. These extrinsic Gaussian processes (eGPs) are used as prior
distributions for unknown functions in Bayesian inferences. Our approach is
simple and general, and we show that the eGPs inherit fine theoretical
properties from GP models in Euclidean spaces. We consider applications of our
models to regression and classification problems with predictors lying in a
large class of manifolds, including spheres, planar shape spaces, a space of
positive definite matrices, and Grassmannians. Our models can be readily used
by practitioners in biological sciences for various regression and
classification problems, such as disease diagnosis or detection. Our work is
also likely to have impact in spatial statistics when spatial locations are on
the sphere or other geometric spaces.
-
There is growing interest in using the close connection between differential
geometry and statistics to model smooth manifold-valued data. In particular,
much work has been done recently to generalize principal component analysis
(PCA), the method of dimension reduction in linear spaces, to Riemannian
manifolds. One such generalization is known as principal geodesic analysis
(PGA). This paper, in a novel fashion, obtains Taylor expansions in scaling
parameters introduced in the domain of objective functions in PGA. It is shown
that this technique not only leads to better closed-form approximations of PGA but
also reveals the effects that scale, curvature and the distribution of data
have on solutions to PGA and on their differences from first-order tangent space
approximations. This approach should be applicable not only to PGA but
also to other generalizations of PCA and more generally to other intrinsic
statistics on Riemannian manifolds.
-
Community detection, which focuses on clustering nodes or detecting
communities in (mostly) a single network, is a problem of considerable
practical interest and has received a great deal of attention in the research
community. While being able to cluster within a network is important, there are
emerging needs to be able to cluster multiple networks. This is largely
motivated by the routine collection of network data that are generated from
potentially different populations, such as brain networks of subjects from
different disease groups or genders, or biological networks generated under
different experimental conditions. We propose a simple and general
framework for clustering multiple networks based on a mixture model on
graphons. Our clustering method employs graphon estimation as a first step and
performs spectral clustering on the matrix of distances between estimated
graphons. This is illustrated through both simulated and real data sets, and
theoretical justification of the algorithm is given in terms of consistency.
-
We propose a novel approach to Bayesian analysis that is provably robust to
outliers in the data and often has computational advantages over standard
methods. Our technique is based on splitting the data into non-overlapping
subgroups, evaluating the posterior distribution given each independent
subgroup, and then combining the resulting measures. The main novelty of our
approach is the proposed aggregation step, which is based on the evaluation of
a median in the space of probability measures equipped with a suitable
collection of distances that can be quickly and efficiently evaluated in
practice. We present both theoretical and numerical evidence illustrating the
improvements achieved by our method.
-
Two central limit theorems for sample Fr\'echet means are derived, both
significant for nonparametric inference on non-Euclidean spaces. The first one,
Theorem 2.2, encompasses and improves upon most earlier CLTs on Fr\'echet means
and broadens the scope of the methodology beyond manifolds to diverse new
non-Euclidean data including those on certain stratified spaces which are
important in the study of phylogenetic trees. It does not require that the
underlying distribution $Q$ have a density, and applies to both intrinsic and
extrinsic analysis. The second theorem, Theorem 3.3, focuses on intrinsic means
on Riemannian manifolds of dimensions $d>2$ and breaks new ground by providing
a broad CLT without any of the earlier restrictive support assumptions. It
makes the statistically reasonable assumption of a somewhat smooth density of
$Q$. The excluded case of dimension $d=2$ proves to be an enigma, although the
first theorem does provide a CLT in this case as well under a support
restriction. Theorem 3.3 immediately applies to spheres $S^d$, $d>2$, which are
also of considerable importance in applications to axial spaces and to
landmark-based image analysis, as these spaces are quotients of spheres under
a Lie group $\mathcal{G}$ of isometries of $S^d$.
-
We introduce a Bayesian model for inferring mixtures of subspaces of
different dimensions. The key challenge in such a mixture model is
specification of prior distributions over subspaces of different dimensions. We
address this challenge by embedding subspaces or Grassmann manifolds into a
sphere of relatively low dimension and specifying priors on the sphere. We
provide an efficient sampling algorithm for the posterior distribution of the
model parameters. We illustrate that a simple extension of our mixture of
subspaces model can be applied to topic modeling. We also prove posterior
consistency for the mixture of subspaces model. The utility of our approach is
demonstrated with applications to real and simulated data.
-
We propose an extrinsic regression framework for modeling data with manifold
valued responses and Euclidean predictors. Regression with manifold responses
has wide applications in shape analysis, neuroscience, medical imaging and many
other areas. Our approach embeds the manifold where the responses lie into a
higher-dimensional Euclidean space, obtains a local regression estimate in that
space, and then projects this estimate back onto the image of the manifold.
Outside the regression setting, both intrinsic and extrinsic approaches have
been proposed for modeling i.i.d. manifold-valued data. However, to our
knowledge our work is the first to take an extrinsic approach to the regression
problem. The proposed extrinsic regression framework is general,
computationally efficient and theoretically appealing. Asymptotic distributions
and convergence rates of the extrinsic regression estimates are derived, and a
large class of examples is considered, indicating the wide applicability of our
approach.
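
The embed, estimate, project recipe can be sketched for one simple case. The sketch assumes responses on the unit sphere $S^2$ embedded in $\mathbb{R}^3$ by inclusion, a Nadaraya-Watson estimate with a Gaussian kernel as the local regression, and normalization as the projection. The function names, the bandwidth, and the toy data are hypothetical and are not taken from the paper.

```python
import math

def project_to_sphere(v):
    """Project a nonzero vector in R^3 onto the unit sphere by normalizing."""
    norm = math.sqrt(sum(c * c for c in v))
    if norm == 0.0:
        raise ValueError("projection is undefined at the origin")
    return [c / norm for c in v]

def extrinsic_regression(x0, xs, ys, bandwidth):
    """Kernel-weighted average of embedded responses, projected back."""
    # Gaussian kernel weights centered at the query point x0.
    weights = [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in xs]
    total = sum(weights)
    # Local (Nadaraya-Watson) estimate in the ambient Euclidean space.
    ambient = [
        sum(w * y[k] for w, y in zip(weights, ys)) / total for k in range(3)
    ]
    # The ambient estimate generally lies inside the sphere; project it back.
    return project_to_sphere(ambient)

# Toy data: responses move along the equator as the predictor increases.
xs = [i / 20 for i in range(21)]
ys = [[math.cos(x), math.sin(x), 0.0] for x in xs]
estimate = extrinsic_regression(0.5, xs, ys, bandwidth=0.1)
```

The projection step is what keeps the estimate on the manifold: the kernel-weighted average of points on the sphere has norm below one, and normalizing returns the closest point on the sphere.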
-
We present a data augmentation scheme to perform Markov chain Monte Carlo
inference for models where data generation involves a rejection sampling
algorithm. Our idea, which seems to be missing in the literature, is a simple
scheme to instantiate the rejected proposals preceding each data point. The
resulting joint probability over observed and rejected variables can be much
simpler than the marginal distribution over the observed variables, which often
involves intractable integrals. We consider three problems, the first being the
modeling of flow-cytometry measurements subject to truncation. The second is a
Bayesian analysis of the matrix Langevin distribution on the Stiefel manifold,
and the third, Bayesian inference for a nonparametric Gaussian process density
model. The latter two are instances of problems where Markov chain Monte Carlo
inference is doubly-intractable. Our experiments demonstrate superior
performance over state-of-the-art sampling algorithms for such problems.
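
The idea of instantiating rejected proposals can be sketched on a toy truncation problem. The sketch assumes observations are generated by proposing from $N(\mu, 1)$ and accepting only positive values, with a $N(0, 100)$ prior on $\mu$. This toy model, the function names, and all numerical settings are assumptions made for illustration and are far simpler than the three problems treated in the paper.

```python
import random

def sample_rejected(mu, rng):
    """Run the rejection sampler once and return only its rejected proposals."""
    rejected = []
    while True:
        proposal = rng.gauss(mu, 1.0)
        if proposal > 0.0:  # accepted: the sampler stops here
            return rejected
        rejected.append(proposal)

def gibbs_truncated_normal(data, n_iter, prior_var, rng):
    """Gibbs sampler for mu when data are N(mu, 1) proposals kept only if > 0."""
    mu = sum(data) / len(data)
    draws = []
    for _ in range(n_iter):
        # Step 1: impute the rejected proposals preceding each observation.
        augmented = list(data)
        for _obs in data:
            augmented.extend(sample_rejected(mu, rng))
        # Step 2: conjugate normal update. Given mu, observed and rejected
        # values together are plain N(mu, 1) draws; the prior is N(0, prior_var).
        precision = len(augmented) + 1.0 / prior_var
        mu = rng.gauss(sum(augmented) / precision, precision ** -0.5)
        draws.append(mu)
    return draws

# Simulate hypothetical truncated data with a known mean.
rng = random.Random(0)
true_mu = 1.0
data = []
while len(data) < 200:
    x = rng.gauss(true_mu, 1.0)
    if x > 0.0:
        data.append(x)

draws = gibbs_truncated_normal(data, n_iter=600, prior_var=100.0, rng=rng)
kept = draws[100:]  # discard burn-in
posterior_mean = sum(kept) / len(kept)
```

The augmented likelihood is an ordinary Gaussian likelihood, so the update for $\mu$ is conjugate, whereas the marginal likelihood of the observed data alone involves the normalizing constant of the truncated density.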
-
The Stiefel manifold $V_{p,d}$ is the space of all $d \times p$ orthonormal
matrices, with the $d-1$ hypersphere and the space of all orthogonal matrices
constituting special cases. In modeling data lying on the Stiefel manifold,
parametric distributions such as the matrix Langevin distribution are often
used; however, model misspecification is a concern and it is desirable to have
nonparametric alternatives. Current nonparametric methods are Fr\'echet mean
based. We take a fully generative nonparametric approach, which relies on
mixing parametric kernels such as the matrix Langevin. The proposed kernel
mixtures can approximate a large class of distributions on the Stiefel
manifold, and we develop theory showing posterior consistency. While there
exists work developing general posterior consistency results, extending these
results to this particular manifold requires substantial new theory. Posterior
inference is illustrated on a real-world dataset of near-Earth objects.
-
Shape constrained regression analysis has applications in dose-response
modeling, environmental risk assessment, disease screening and many other
areas. Incorporating the shape constraints can improve estimation efficiency
and avoid implausible results. We propose two novel methods focusing on
Bayesian monotone curve and surface estimation using Gaussian process
projections. The first projects samples from an unconstrained prior, while the
second projects samples from the Gaussian process posterior. Theory is
developed on continuity of the projection, posterior consistency and rates of
contraction. The second approach is shown to have an empirical Bayes
justification and to lead to simple computation with good performance in finite
samples. Our projection approach can be applied in other constrained function
estimation problems including in multivariate settings.
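
The projection of a single curve draw can be sketched as follows, assuming the curve is evaluated on a finite grid and the projection is the least-squares projection onto nondecreasing sequences, computed with the pool-adjacent-violators algorithm. The abstract does not state which projection is used, so the choice of projection, the function name `project_monotone`, and the toy values are assumptions.

```python
def project_monotone(values):
    """Least-squares projection onto nondecreasing sequences (PAVA)."""
    blocks = []  # each block stores [sum of values, number of values]
    for v in values:
        blocks.append([v, 1])
        # Merge backwards while adjacent block means violate monotonicity.
        while len(blocks) > 1:
            prev_mean = blocks[-2][0] / blocks[-2][1]
            last_mean = blocks[-1][0] / blocks[-1][1]
            if prev_mean <= last_mean:
                break
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    # Expand each block back to one fitted value per grid point.
    projected = []
    for total, count in blocks:
        projected.extend([total / count] * count)
    return projected

# Hypothetical values of one unconstrained curve draw on a grid.
draw = [0.0, 0.5, 0.25, 0.75, 1.5, 1.0, 2.0]
monotone_draw = project_monotone(draw)
```

Applying such a projection to each draw from a Gaussian process, whether prior or posterior, yields draws that satisfy the monotonicity constraint by construction.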