
We show that prediction performance for global-local shrinkage regression can
overcome two major difficulties of global shrinkage regression: (i) the amount
of relative shrinkage is monotone in the singular values of the design matrix
and (ii) the shrinkage is determined by a single tuning parameter.
Specifically, we show that the horseshoe regression, with heavy-tailed
component-specific local shrinkage parameters, in conjunction with a global
parameter providing shrinkage towards zero, alleviates both these difficulties
and consequently, results in an improved risk for prediction. Numerical
demonstrations of improved prediction over competing approaches in simulations
and in a pharmacogenomics data set confirm our theoretical findings.
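
As a hedged illustration of difficulty (i), the sketch below takes ridge regression as a representative global shrinkage method (an assumption; the abstract does not name a specific method). Ridge fitted values shrink the component of the response along the $j$-th left singular vector of the design matrix by the factor $d_j^2/(d_j^2+\lambda)$, which is monotone in the singular value $d_j$ and governed by the single tuning parameter $\lambda$. The function name `ridge_shrinkage_factors` and the simulated design are hypothetical.

```python
import numpy as np

def ridge_shrinkage_factors(X, lam):
    """Return the singular values d of X and the ridge factors d^2 / (d^2 + lam)."""
    d = np.linalg.svd(X, compute_uv=False)  # singular values, largest first
    return d, d**2 / (d**2 + lam)

# Simulated design matrix (illustrative only, not data from the paper).
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 10))
d, factors = ridge_shrinkage_factors(X, lam=5.0)

# Directions with smaller singular values are always shrunk more.
for d_j, f_j in zip(d, factors):
    print(f"singular value {d_j:6.3f} -> shrinkage factor {f_j:.3f}")
```

Because every factor depends on the same $\lambda$, a direction with a small singular value cannot be shrunk less than one with a large singular value, whatever the signal in that direction.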

The goal of this paper is to contrast and survey the major advances in two of
the most commonly used high-dimensional techniques, namely, the Lasso and
horseshoe regularization. Lasso is a gold standard for predictor selection
while horseshoe is a state-of-the-art Bayesian estimator for sparse signals.
Lasso is fast and scalable and uses convex optimization whilst the horseshoe is
nonconvex. Our novel perspective focuses on three aspects: (i) theoretical
optimality in high-dimensional inference for the Gaussian sparse model and
beyond, (ii) efficiency and scalability of computation and (iii) methodological
development and performance.

We develop a new estimator of the inverse covariance matrix for
high-dimensional multivariate normal data using the horseshoe prior. The
proposed graphical horseshoe estimator has attractive properties compared to
other popular estimators, such as the graphical lasso and graphical Smoothly
Clipped Absolute Deviation (SCAD). The most prominent benefit is that when the
true inverse covariance matrix is sparse, the graphical horseshoe provides
estimates with small information divergence from the true sampling
distribution. The posterior mean under the graphical horseshoe prior can also
be almost unbiased under certain conditions. In addition to these theoretical
results, we also provide a full Gibbs sampler for implementing our estimator.
MATLAB code is available for download from GitHub at
http://github.com/liyf1988/GHS. The graphical horseshoe estimator compares
favorably to existing techniques in simulations and in a human gene network
data analysis.

In Divide & Recombine (D&R), big data are divided into subsets, each analytic
method is applied to subsets, and the outputs are recombined. This enables deep
analysis and practical computational performance. An innovative D&R procedure is
proposed to compute likelihood functions of data-model (DM) parameters for big
data. The likelihood-model (LM) is a parametric probability density function of
the DM parameters. The density parameters are estimated by fitting the density
to MCMC draws from each subset DM likelihood function, and then the fitted
densities are recombined. The procedure is illustrated using normal and
skew-normal LMs for the logistic regression DM.
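
The fit-then-recombine step can be sketched for the normal LM as follows. This is a minimal sketch under stated assumptions, not the authors' procedure: it assumes the fitted subset densities are recombined by multiplying them (natural when subsets are independent, since the full likelihood is the product of subset likelihoods), which for normal densities means adding precisions and precision-weighting the means. The function names are hypothetical, and simulated draws stand in for MCMC draws from each subset likelihood.

```python
import numpy as np

def fit_normal(draws):
    """Fit a normal density to draws of shape (n_draws, n_params)."""
    mean = draws.mean(axis=0)
    cov = np.atleast_2d(np.cov(draws, rowvar=False))
    return mean, cov

def recombine_normals(fits):
    """Multiply fitted normal densities: precisions add, means are precision-weighted."""
    precisions = [np.linalg.inv(cov) for _, cov in fits]
    total_precision = sum(precisions)
    cov = np.linalg.inv(total_precision)
    mean = cov @ sum(P @ m for P, (m, _) in zip(precisions, fits))
    return mean, cov

# Stand-in for MCMC draws from three subset likelihoods (assumption: simulated).
rng = np.random.default_rng(0)
subset_draws = [rng.normal(loc=[1.0, -0.5], scale=0.3, size=(2000, 2)) for _ in range(3)]
fits = [fit_normal(draws) for draws in subset_draws]
mean, cov = recombine_normals(fits)
print("recombined mean:", mean)
print("recombined covariance:\n", cov)
```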

Feature subset selection arises in many high-dimensional applications of
statistics, such as compressed sensing and genomics. The $\ell_0$ penalty is
ideal for this task, the caveat being it requires the NP-hard combinatorial
evaluation of all models. A recent area of considerable interest is to develop
efficient algorithms to fit models with a nonconvex $\ell_\gamma$ penalty for
$\gamma\in (0,1)$, which results in sparser models than the convex $\ell_1$ or
lasso penalty, but is harder to fit. We propose an alternative, termed the
horseshoe regularization penalty for feature subset selection, and demonstrate
its theoretical and computational advantages. The distinguishing feature from
existing nonconvex optimization approaches is a full probabilistic
representation of the penalty as the negative of the logarithm of a suitable
prior, which in turn enables efficient expectation-maximization and local
linear approximation algorithms for optimization and MCMC for uncertainty
quantification. In synthetic and real data, the resulting algorithms provide
better statistical performance, and the computation requires a fraction of the
time of state-of-the-art nonconvex solvers.
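
To make "the negative of the logarithm of a suitable prior" concrete, the following is a hedged worked example. It assumes the prior is the standard horseshoe density $p_{HS}$ with unit global scale (the abstract does not spell this out). That density has no closed form, but it satisfies well-known logarithmic bounds with $K = 1/\sqrt{2\pi^3}$, which translate directly into bounds on the implied penalty:

$$
\frac{K}{2}\log\!\left(1+\frac{4}{\theta^2}\right) \;<\; p_{HS}(\theta) \;<\; K\log\!\left(1+\frac{2}{\theta^2}\right),
$$

$$
-\log K-\log\log\!\left(1+\frac{2}{\theta^2}\right) \;<\; -\log p_{HS}(\theta) \;<\; -\log\frac{K}{2}-\log\log\!\left(1+\frac{4}{\theta^2}\right).
$$

Since $\log(1+c/\theta^2)\approx c/\theta^2$ for large $|\theta|$, a penalty of this form grows only like $2\log|\theta|$ in the tails, in contrast to the linear growth of the $\ell_1$ penalty, while it remains unbounded below near $\theta=0$.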

Global-local mixtures are derived from the Cauchy-Schlömilch and Liouville
integral transformation identities. We characterize well-known normal-scale
mixture distributions including the Laplace or lasso, logit and quantile as
well as new global-local mixtures. We also apply our methodology to
convolutions that commonly arise in Bayesian inference. Finally, we conclude
with a conjecture concerning bridge and uniform correlation mixtures.
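
As a hedged illustration of a normal-scale mixture, the sketch below checks by simulation the classical representation of the Laplace distribution: if the variance $V$ is exponential with mean $2b^2$ and $X \mid V \sim N(0, V)$, then $X$ is Laplace with scale $b$. This is a standard identity rather than the paper's derivation, and the function name is hypothetical.

```python
import numpy as np

def sample_laplace_via_scale_mixture(n, b, rng):
    """Draw n Laplace(0, b) variates as a normal scale mixture."""
    v = rng.exponential(scale=2.0 * b**2, size=n)  # mixing variances, mean 2 b^2
    return rng.normal(0.0, np.sqrt(v))             # X | V ~ N(0, V)

rng = np.random.default_rng(0)
b = 1.5
x_mix = sample_laplace_via_scale_mixture(200_000, b, rng)
x_direct = rng.laplace(0.0, b, size=200_000)

# Compare a few quantiles of the mixture draws with direct Laplace draws.
probs = [0.05, 0.25, 0.5, 0.75, 0.95]
print("mixture :", np.round(np.quantile(x_mix, probs), 3))
print("direct  :", np.round(np.quantile(x_direct, probs), 3))
```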

We provide a framework for assessing the default nature of a prior
distribution using the property of regular variation, which we study for
global-local shrinkage priors. In particular, we demonstrate the horseshoe
priors, originally designed to handle sparsity, also possess regular variation
and thus are appropriate for default Bayesian analysis. To illustrate our
methodology, we solve a problem of noninformative priors due to Efron (1973),
who showed standard flat noninformative priors in the high-dimensional normal
means model can be highly informative for nonlinear parameters of interest. We
consider four such problems and show global-local shrinkage priors such as the
horseshoe and horseshoe+ perform as Efron (1973) requires in each case. We find
the reason for this lies in the ability of the global-local shrinkage priors to
separate a low-dimensional signal embedded in high-dimensional noise, even for
nonlinear functions.

Inferring dependence structure through undirected graphs is crucial for
uncovering the major modes of multivariate interaction among high-dimensional
genomic markers that are potentially associated with cancer. Traditionally,
conditional independence has been studied using sparse Gaussian graphical
models for continuous data and sparse Ising models for discrete data. However,
there are two clear situations when these approaches are inadequate. The first
occurs when the data are continuous but display non-normal marginal behavior
such as heavy tails or skewness, rendering an assumption of normality
inappropriate. The second occurs when a part of the data is ordinal or discrete
(e.g., presence or absence of a mutation) and the other part is continuous
(e.g., expression levels of genes or proteins). In this case, the existing
Bayesian approaches typically employ a latent variable framework for the
discrete part that precludes inferring conditional independence among the data
that are actually observed. The current article overcomes these two challenges
in a unified framework using Gaussian scale mixtures. Our framework is able to
handle continuous data that are not normal and data that are of mixed
continuous and discrete nature, while still being able to infer a sparse
conditional sign independence structure among the observed data. Extensive
performance comparison in simulations with alternative techniques and an
analysis of a real cancer genomics data set demonstrate the effectiveness of
the proposed approach.

We propose a new prior for ultra-sparse signal detection that we term the
"horseshoe+ prior." The horseshoe+ prior is a natural extension of the
horseshoe prior that has achieved success in the estimation and detection of
sparse signals and has been shown to possess a number of desirable theoretical
properties while enjoying computational feasibility in high dimensions. The
horseshoe+ prior builds upon these advantages. Our work proves that the
horseshoe+ posterior concentrates at a rate faster than that of the horseshoe
in the Kullback-Leibler (KL) sense. We also establish theoretically that the
proposed estimator has lower posterior mean squared error in estimating signals
compared to the horseshoe and achieves the optimal Bayes risk in testing up to
a constant. For global-local scale mixture priors, we develop a new technique
for analyzing the marginal sparse prior densities using the class of Meijer-G
functions. In simulations, the horseshoe+ estimator demonstrates superior
performance in a standard design setting against competing methods, including
the horseshoe and Dirichlet-Laplace estimators. We conclude with an
illustration on a prostate cancer data set and by pointing out some directions
for future research.
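
The difference between the two priors can be sketched by sampling their local scales. The hierarchy used here is an assumption, since the abstract does not state it: for the horseshoe, $\lambda_i \sim C^+(0,\tau)$; for the horseshoe+, $\lambda_i \mid \eta_i \sim C^+(0,\tau\eta_i)$ with $\eta_i \sim C^+(0,1)$; in both cases $\theta_i \mid \lambda_i \sim N(0,\lambda_i^2)$. Function names are hypothetical.

```python
import numpy as np

def half_cauchy(rng, scale, size):
    """Draw half-Cauchy variates; scale may be a scalar or an array of length size."""
    return np.abs(rng.standard_cauchy(size)) * scale

def horseshoe_local_scales(rng, n, tau=1.0):
    """Assumed horseshoe hierarchy: lambda_i ~ C+(0, tau)."""
    return half_cauchy(rng, tau, n)

def horseshoe_plus_local_scales(rng, n, tau=1.0):
    """Assumed horseshoe+ hierarchy: eta_i ~ C+(0, 1), lambda_i | eta_i ~ C+(0, tau * eta_i)."""
    eta = half_cauchy(rng, 1.0, n)
    return half_cauchy(rng, tau * eta, n)

def sample_theta(rng, local_scales):
    """theta_i | lambda_i ~ N(0, lambda_i^2)."""
    return rng.normal(0.0, local_scales)

rng = np.random.default_rng(0)
n = 200_000
lam_hs = horseshoe_local_scales(rng, n)
lam_hsp = horseshoe_plus_local_scales(rng, n)
theta_hsp = sample_theta(rng, lam_hsp)

# The extra half-Cauchy layer puts more prior mass both near zero and far in the tails.
for q in (0.01, 0.5, 0.99):
    print(f"quantile {q}: horseshoe {np.quantile(lam_hs, q):.3f}, "
          f"horseshoe+ {np.quantile(lam_hsp, q):.3f}")
```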

Inference for partially observed Markov process models has been a
longstanding methodological challenge with many scientific and engineering
applications. Iterated filtering algorithms maximize the likelihood function
for partially observed Markov process models by solving a recursive sequence of
filtering problems. We present new theoretical results pertaining to the
convergence of iterated filtering algorithms implemented via sequential Monte
Carlo filters. This theory complements the growing body of empirical evidence
that iterated filtering algorithms provide an effective inference strategy for
scientific models of nonlinear dynamic systems. The first step in our theory
involves studying a new recursive approach for maximizing the likelihood
function of a latent variable model, when this likelihood is evaluated via
importance sampling. This leads to the consideration of an iterated importance
sampling algorithm which serves as a simple special case of iterated filtering,
and may have applicability in its own right.

In this article a flexible Bayesian nonparametric model is proposed for
nonhomogeneous hidden Markov models. The model is developed through the
amalgamation of the ideas of hidden Markov models and predictor-dependent
stick-breaking processes. Computation is carried out using an auxiliary variable
representation of the model, which enables us to perform exact MCMC sampling from
the posterior. Furthermore, the model is extended to the situation when the
predictors can simultaneously influence the transition dynamics of the
hidden states as well as the emission distribution. Estimates of few-step-ahead
conditional predictive distributions of the response have been used as
performance diagnostics for these models. The proposed methodology is
illustrated through simulation experiments as well as analysis of a real data
set concerned with the prediction of rainfall-induced malaria epidemics.
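
The stick-breaking construction behind such a model can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's specification: the break proportions depend on a scalar predictor through a logistic link (the link, the truncation level, and all names and coefficient values are hypothetical), and the weights are $w_k = v_k \prod_{j<k}(1-v_j)$ with the leftover stick assigned to a final component.

```python
import numpy as np

def stick_breaking_weights(v):
    """Turn break proportions v_1..v_K in (0, 1) into K + 1 weights that sum to one."""
    v = np.asarray(v, dtype=float)
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v)))  # stick left before each break
    return np.append(v * remaining[:-1], remaining[-1])       # last weight is the leftover

def predictor_dependent_weights(x, intercepts, slopes):
    """Assumed logistic link: v_k(x) = sigmoid(intercepts[k] + slopes[k] * x)."""
    v = 1.0 / (1.0 + np.exp(-(intercepts + slopes * x)))
    return stick_breaking_weights(v)

# Illustrative coefficients only.
intercepts = np.array([0.0, 1.0, -1.0])
slopes = np.array([1.0, -2.0, 0.5])
for x in (-1.0, 0.0, 1.0):
    print(f"x = {x:+.1f} -> weights {np.round(predictor_dependent_weights(x, intercepts, slopes), 3)}")
```

In a nonhomogeneous hidden Markov model, weights of this kind could serve as one row of a predictor-dependent transition matrix, so that the probability of moving to each hidden state changes with the covariate.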