
Modeling correlation (and covariance) matrices is a challenging problem due
to the large dimensionality and positivedefiniteness constraint. In this
paper, we propose a novel Bayesian framework based on modeling the correlations
as products of vectors on unit spheres. The covariance matrix is then modeled
by using its decomposition into correlation and variance matrices. This
approach allows us to induce flexible prior distributions for covariance
matrices (that go beyond the commonly used inverseWishart prior) by specifying
a wide range of distributions on spheres (e.g. the squaredDirichlet
distribution). For modeling reallife spatiotemporal processes with complex
dependence structures, we extend our method to dynamic cases and introduce
unitvector Gaussian process priors in order to capture the evolution of
correlation among multiple time series. To handle the intractability of the
resulting posterior, we introduce the adaptive $\Delta$Spherical Hamiltonian
Monte Carlo. Using an example of NormalInverseWishart problem, a simulated
periodic process, and an analysis of local field potential data (collected from
the hippocampus of rats performing a complex sequence memory task), we
demonstrate the validity and effectiveness of our proposed framework for
(dynamic) modeling covariance and correlation matrices.

Climate projections continue to be marred by large uncertainties, which
originate in processes that need to be parameterized, such as clouds,
convection, and ecosystems. But rapid progress is now within reach. New
computational tools and methods from data assimilation and machine learning
make it possible to integrate global observations and local highresolution
simulations in an Earth system model (ESM) that systematically learns from
both. Here we propose a blueprint for such an ESM. We outline how
parameterization schemes can learn from global observations and targeted
highresolution simulations, for example, of clouds and convection, through
matching loworder statistics between ESMs, observations, and highresolution
simulations. We illustrate learning algorithms for ESMs with a simple dynamical
system that shares characteristics of the climate system; and we discuss the
opportunities the proposed framework presents and the challenges that remain to
realize it.

It is well known that the Fisher information induces a Riemannian geometry on
parametric families of probability density functions. We generalize this
formalism to a geometry over all density functions regardless of
parameterization. The resulting nonparametric Fisher information geometry is
then shown to be equivalent to a familiar, albeit infinite dimensional,
geometric objectthe sphere. By shifting focus away from density functions
and toward \emph{squareroot} density functions, one may calculate theoretical
quantities of interest with ease. More importantly, the sphere of squareroot
densities is much more computationally tractable. This insight leads to a novel
Bayesian nonparametric density estimation model. We construct the
$\chi^2$process density prior by modeling the squareroot density with a
Gaussian process prior. Inference over squareroot densities is fast, and the
model retains the flexibility characteristic of Bayesian nonparametric models.

Bayesian inverse problems often involve sampling posterior distributions on
infinitedimensional function spaces. Traditional Markov chain Monte Carlo
(MCMC) algorithms are characterized by deteriorating mixing times upon
meshrefinement, when the finitedimensional approximations become more
accurate. Such methods are typically forced to reduce stepsizes as the
discretization gets finer, and thus are expensive as a function of dimension.
Recently, a new class of MCMC methods with meshindependent convergence times
has emerged. However, few of them take into account the geometry of the
posterior informed by the data. At the same time, recently developed geometric
MCMC algorithms have been found to be powerful in exploring complicated
distributions that deviate significantly from elliptic Gaussian laws, but are
in general computationally intractable for models defined in infinite
dimensions. In this work, we combine geometric methods on a finitedimensional
subspace with meshindependent infinitedimensional approaches. Our objective
is to speed up MCMC mixing times, without significantly increasing the
computational cost per step (for instance, in comparison with the vanilla
preconditioned CrankNicolson (pCN) method). This is achieved by using ideas
from geometric MCMC to probe the complex structure of an intrinsic
finitedimensional subspace where most data information concentrates, while
retaining robust mixing times as the dimension grows by using pCNlike methods
in the complementary subspace. The resulting algorithms are demonstrated in the
context of three challenging inverse problems arising in subsurface flow, heat
conduction and incompressible flow control. The algorithms exhibit up to two
orders of magnitude improvement in sampling efficiency when compared with the
pCN method.

We extend the application of Hamiltonian Monte Carlo to allow for sampling
from probability distributions defined over symmetric or Hermitian positive
definite matrices. To do so, we exploit the Riemannian structure induced by
Cartan's centuryold canonical metric. The geodesics that correspond to this
metric are available in closedform andwithin the context of Lagrangian
Monte Carloprovide a principled way to travel around the space of positive
definite matrices. Our method improves Bayesian inference on such matrices by
allowing for a broad range of priors, so we are not limited to conjugate priors
only. In the context of spectral density estimation, we use the (nonconjugate)
complex reference prior as an example modeling option made available by the
algorithm. Results based on simulated and realworld multivariate time series
are presented in this context, and future directions are outlined.

We introduce phylodyn, an R package for phylodynamic analysis based on gene
genealogies. The package main functionality is Bayesian nonparametric
estimation of effective population size fluctuations over time. Our
implementation includes several Markov chain Monte Carlobased methods and an
integrated nested Laplace approximationbased approach for phylodynamic
inference that have been developed in recent years. Genealogical data describe
the timed ancestral relationships of individuals sampled from a population of
interest. Here, individuals are assumed to be sampled at the same point in time
(isochronous sampling) or at different points in time (heterochronous
sampling); in addition, sampling events can be modeled with preferential
sampling, which means that the intensity of sampling events is allowed to
depend on the effective population size trajectory. We assume the coalescent
and the sequentially Markov coalescent processes as generative models of
genealogies. We include several coalescent simulation functions that are useful
for testing our phylodynamics methods via simulation studies. We compare the
performance and outputs of various methods implemented in phylodyn and outline
their strengths and weaknesses. R package phylodyn is available at
https://github.com/mdkarcher/phylodyn.

The Bayesian approach to Inverse Problems relies predominantly on Markov
Chain Monte Carlo methods for posterior inference. The typical nonlinear
concentration of posterior measure observed in many such Inverse Problems
presents severe challenges to existing simulation based inference methods.
Motivated by these challenges the exploitation of local geometric information
in the form of covariant gradients, metric tensors, LeviCivita connections,
and local geodesic flows, have been introduced to more effectively locally
explore the configuration space of the posterior measure. However, obtaining
such geometric quantities usually requires extensive computational effort and
despite their effectiveness affect the applicability of these
geometricallybased Monte Carlo methods. In this paper we explore one way to
address this issue by the construction of an emulator of the model from which
all geometric objects can be obtained in a much more computationally feasible
manner. The main concept is to approximate the geometric quantities using a
Gaussian Process emulator which is conditioned on a carefully chosen design set
of configuration points, which also determines the quality of the emulator. To
this end we propose the use of statistical experiment design methods to refine
a potentially arbitrarily initialized design online without destroying the
convergence of the resulting Markov chain to the desired invariant measure. The
practical examples considered in this paper provide a demonstration of the
significant improvement possible in terms of computational loading suggesting
this is a promising avenue of further development.

Statistical models with constrained probability distributions are abundant in
machine learning. Some examples include regression models with norm constraints
(e.g., Lasso), probit, many copula models, and latent Dirichlet allocation
(LDA). Bayesian inference involving probability distributions confined to
constrained domains could be quite challenging for commonly used sampling
algorithms. In this paper, we propose a novel augmentation technique that
handles a wide range of constraints by mapping the constrained domain to a
sphere in the augmented space. By moving freely on the surface of this sphere,
sampling algorithms handle constraints implicitly and generate proposals that
remain within boundaries when mapped back to the original space. Our proposed
method, called {Spherical Augmentation}, provides a mathematically natural and
computationally efficient framework for sampling from constrained probability
distributions. We show the advantages of our method over stateoftheart
sampling algorithms, such as exact Hamiltonian Monte Carlo, using several
examples including truncated Gaussian distributions, Bayesian Lasso, Bayesian
bridge regression, reconstruction of quantized stationary Gaussian process, and
LDA for topic modeling.

Phylodynamics focuses on the problem of reconstructing past population size
dynamics from current genetic samples taken from the population of interest.
This technique has been extensively used in many areas of biology, but is
particularly useful for studying the spread of quickly evolving infectious
diseases agents, e.g.,\ influenza virus. Phylodynamics inference uses a
coalescent model that defines a probability density for the genealogy of
randomly sampled individuals from the population. When we assume that such a
genealogy is known, the coalescent model, equipped with a Gaussian process
prior on population size trajectory, allows for nonparametric Bayesian
estimation of population size dynamics. While this approach is quite powerful,
large data sets collected during infectious disease surveillance challenge the
stateoftheart of Bayesian phylodynamics and demand computationally more
efficient inference framework. To satisfy this demand, we provide a
computationally efficient Bayesian inference framework based on Hamiltonian
Monte Carlo for coalescent process models. Moreover, we show that by splitting
the Hamiltonian function we can further improve the efficiency of this
approach. Using several simulated and real datasets, we show that our method
provides accurate estimates of population size dynamics and is substantially
faster than alternative methods based on elliptical slice sampler and
Metropolisadjusted Langevin algorithm.

In machine learning and statistics, probabilistic inference involving
multimodal distributions is quite difficult. This is especially true in high
dimensional problems, where most existing algorithms cannot easily move from
one mode to another. To address this issue, we propose a novel Bayesian
inference approach based on Markov Chain Monte Carlo. Our method can
effectively sample from multimodal distributions, especially when the dimension
is high and the modes are isolated. To this end, it exploits and modifies the
Riemannian geometric properties of the target distribution to create
\emph{wormholes} connecting modes in order to facilitate moving between them.
Further, our proposed method uses the regeneration technique in order to adapt
the algorithm by identifying new modes and updating the network of wormholes
without affecting the stationary distribution. To find new modes, as opposed to
rediscovering those previously identified, we employ a novel mode searching
algorithm that explores a \emph{residual energy} function obtained by
subtracting an approximate Gaussian mixture density (based on previously
discovered modes) from the target density function.

We propose a scalable semiparametric Bayesian model to capture dependencies
among multiple neurons by detecting their cofiring (possibly with some lag
time) patterns over time. After discretizing time so there is at most one spike
at each interval, the resulting sequence of 1's (spike) and 0's (silence) for
each neuron is modeled using the logistic function of a continuous latent
variable with a Gaussian process prior. For multiple neurons, the corresponding
marginal distributions are coupled to their joint probability distribution
using a parametric copula model. The advantages of our approach are as follows:
the nonparametric component (i.e., the Gaussian process model) provides a
flexible framework for modeling the underlying firing rates; the parametric
component (i.e., the copula model) allows us to make inference regarding both
contemporaneous and lagged relationships among neurons; using the copula model,
we construct multivariate probabilistic models by separating the modeling of
univariate marginal distributions from the modeling of dependence structure
among variables; our method is easy to implement using a computationally
efficient sampling algorithm that can be easily extended to high dimensional
problems. Using simulated data, we show that our approach could correctly
capture temporal dependencies in firing rates and identify synchronous neurons.
We also apply our model to spike train data obtained from prefrontal cortical
areas in rat's brain.

Contributed discussion and rejoinder to "Geodesic Monte Carlo on Embedded
Manifolds" (arXiv:1301.6064)

We propose a new Markov Chain Monte Carlo (MCMC) method for constrained
target distributions. Our method first maps the $D$dimensional constrained
domain of parameters to the unit ball ${\bf B}_0^D(1)$. Then, it augments the
resulting parameter space to the $D$dimensional sphere, ${\bf S}^D$. The
boundary of ${\bf B}_0^D(1)$ corresponds to the equator of ${\bf S}^D$. This
change of domains enables us to implicitly handle the original constraints
because while the sampler moves freely on the sphere, it proposes states that
are within the constraints imposed on the original parameter space. To improve
the computational efficiency of our algorithm, we split the Lagrangian dynamics
into several parts such that a part of the dynamics can be handled analytically
by finding the geodesic flow on the sphere. We apply our method to several
examples including truncated Gaussian, Bayesian Lasso, Bayesian bridge
regression, and a copula model for identifying synchrony among multiple
neurons. Our results show that the proposed method can provide a natural and
efficient framework for handling several types of constraints on target
distributions.

Hamiltonian Monte Carlo (HMC) improves the computational efficiency of the
Metropolis algorithm by reducing its random walk behavior. Riemannian Manifold
HMC (RMHMC) further improves HMC's performance by exploiting the geometric
properties of the parameter space. However, the geometric integrator used for
RMHMC involves implicit equations that require costly numerical analysis (e.g.,
fixedpoint iteration). In some cases, the computational overhead for solving
implicit equations undermines RMHMC's benefits. To avoid this problem, we
propose an explicit geometric integrator that replaces the momentum variable in
RMHMC by velocity. We show that the resulting transformation is equivalent to
transforming Riemannian Hamilton dynamics to Lagrangian dynamics. Experimental
results show that our method improves RMHMC's overall computational efficiency.
All computer programs and data sets are available online
(http://www.ics.uci.edu/~babaks/Site/Codes.html) in order to allow replications
of the results reported in this paper.

In this paper, we discuss an extension of the Split Hamiltonian Monte Carlo
(Split HMC) method for Gaussian process model (GPM). This method is based on
splitting the Hamiltonian in a way that allows much of the movement around the
state space to be done at low computational cost. To this end, we approximate
the negative log density (i.e., the energy function) of the distribution of
interest by a quadratic function U0 for which Hamiltonian dynamics can be
solved analytically. The overall energy function U is then written as U0 + U1,
where U1 is the approximation error. The Hamiltonian is then split into two
parts; one part is based on U0 is handled analytically, the other part is based
on U1 for which we approximate Hamiltonian's equations by discretizing time. We
use simulated and real data to compare the performance of our method to the
standard HMC. We find that splitting the Hamiltonian for GP models could lead
to substantial improvement (up to 10 folds) of sampling efficiency, which is
measured in terms of the amount of time required for producing an independent
sample with high acceptance probability from posterior distributions.

We show how the Hamiltonian Monte Carlo algorithm can sometimes be speeded up
by "splitting" the Hamiltonian in a way that allows much of the movement around
the state space to be done at low computational cost. One context where this is
possible is when the log density of the distribution of interest (the potential
energy function) can be written as the log of a Gaussian density, which is a
quadratic function, plus a slowly varying function. Hamiltonian dynamics for
quadratic energy functions can be analytically solved. With the splitting
technique, only the slowlyvarying part of the energy needs to be handled
numerically, and this can be done with a larger stepsize (and hence fewer
steps) than would be necessary with a direct simulation of the dynamics.
Another context where splitting helps is when the most important terms of the
potential energy function and its gradient can be evaluated quickly, with only
a slowlyvarying part requiring costly computations. With splitting, the quick
portion can be handled with a small stepsize, while the costly portion uses a
larger stepsize. We show that both of these splitting approaches can reduce the
computational cost of sampling from the posterior distribution for a logistic
regression model, using either a Gaussian approximation centered on the
posterior mode, or a Hamiltonian split into a term that depends on only a small
number of critical cases, and another term that involves the larger number of
cases whose influence on the posterior distribution is small. Supplemental
materials for this paper are available online.