• ### Goal-oriented optimal approximations of Bayesian linear inverse problems(1607.01881)

March 14, 2017 math.NA, stat.CO, stat.ME
We propose optimal dimensionality reduction techniques for the solution of goal-oriented linear-Gaussian inverse problems, where the quantity of interest (QoI) is a function of the inversion parameters. These approximations are suitable for large-scale applications. In particular, we study the approximation of the posterior covariance of the QoI as a low-rank negative update of its prior covariance, and prove optimality of this update with respect to the natural geodesic distance on the manifold of symmetric positive definite matrices. Assuming exact knowledge of the posterior mean of the QoI, the optimality results extend to optimality in distribution with respect to the Kullback-Leibler divergence and the Hellinger distance between the associated distributions. We also propose approximation of the posterior mean of the QoI as a low-rank linear function of the data, and prove optimality of this approximation with respect to a weighted Bayes risk. Both of these optimal approximations avoid the explicit computation of the full posterior distribution of the parameters and instead focus on directions that are well informed by the data and relevant to the QoI. These directions stem from a balance among all the components of the goal-oriented inverse problem: prior information, forward model, measurement noise, and ultimate goals. We illustrate the theory using a high-dimensional inverse problem in heat transfer.
• ### Bayesian inverse problems with $l_1$ priors: a Randomize-then-Optimize approach(1607.01904)

Dec. 29, 2016 stat.CO
Prior distributions for Bayesian inference that rely on the $l_1$-norm of the parameters are of considerable interest, in part because they promote parameter fields with less regularity than Gaussian priors (e.g., discontinuities and blockiness). These $l_1$-type priors include the total variation (TV) prior and the Besov $B^s_{1,1}$ space prior, and in general yield non-Gaussian posterior distributions. Sampling from these posteriors is challenging, particularly in the inverse problem setting where the parameter space is high-dimensional and the forward problem may be nonlinear. This paper extends the randomize-then-optimize (RTO) method, an optimization-based sampling algorithm developed for Bayesian inverse problems with Gaussian priors, to inverse problems with $l_1$-type priors. We use a variable transformation to convert an $l_1$-type prior to a standard Gaussian prior, such that the posterior distribution of the transformed parameters is amenable to Metropolized sampling via RTO. We demonstrate this approach on several deconvolution problems and an elliptic PDE inverse problem, using TV or Besov $B^s_{1,1}$ space priors. Our results show that the transformed RTO algorithm characterizes the correct posterior distribution and can be more efficient than other sampling algorithms. The variable transformation can also be extended to other non-Gaussian priors.
• ### Scalable posterior approximations for large-scale Bayesian inverse problems via likelihood-informed parameter and state reduction(1510.06053)

April 30, 2016 math.NA, stat.CO, stat.ME
Two major bottlenecks to the solution of large-scale Bayesian inverse problems are the scaling of posterior sampling algorithms to high-dimensional parameter spaces and the computational cost of forward model evaluations. Yet incomplete or noisy data, the state variation and parameter dependence of the forward model, and correlations in the prior collectively provide useful structure that can be exploited for dimension reduction in this setting--both in the parameter space of the inverse problem and in the state space of the forward model. To this end, we show how to jointly construct low-dimensional subspaces of the parameter space and the state space in order to accelerate the Bayesian solution of the inverse problem. As a byproduct of state dimension reduction, we also show how to identify low-dimensional subspaces of the data in problems with high-dimensional observations. These subspaces enable approximation of the posterior as a product of two factors: (i) a projection of the posterior onto a low-dimensional parameter subspace, wherein the original likelihood is replaced by an approximation involving a reduced model; and (ii) the marginal prior distribution on the high-dimensional complement of the parameter subspace. We present and compare several strategies for constructing these subspaces using only a limited number of forward and adjoint model simulations. The resulting posterior approximations can rapidly be characterized using standard sampling techniques, e.g., Markov chain Monte Carlo. Two numerical examples demonstrate the accuracy and efficiency of our approach: inversion of an integral equation in atmospheric remote sensing, where the data dimension is very high; and the inference of a heterogeneous transmissivity field in a groundwater system, which involves a partial differential equation forward model with high dimensional state and parameters.
• ### On dimension reduction in Gaussian filters(1508.06452)

Feb. 23, 2016 math.PR, stat.CO, stat.ME
A priori dimension reduction is a widely adopted technique for reducing the computational complexity of stationary inverse problems. In this setting, the solution of an inverse problem is parameterized by a low-dimensional basis that is often obtained from the truncated Karhunen-Loeve expansion of the prior distribution. For high-dimensional inverse problems equipped with smoothing priors, this technique can lead to drastic reductions in parameter dimension and significant computational savings. In this paper, we extend the concept of a priori dimension reduction to non-stationary inverse problems, in which the goal is to sequentially infer the state of a dynamical system. Our approach proceeds in an offline-online fashion. We first identify a low-dimensional subspace in the state space before solving the inverse problem (the offline phase), using either the method of "snapshots" or regularized covariance estimation. Then this subspace is used to reduce the computational complexity of various filtering algorithms - including the Kalman filter, extended Kalman filter, and ensemble Kalman filter - within a novel subspace-constrained Bayesian prediction-and-update procedure (the online phase). We demonstrate the performance of our new dimension reduction approach on various numerical examples. In some test cases, our approach reduces the dimensionality of the original problem by orders of magnitude and yields up to two orders of magnitude in computational savings.
• ### Dimension-independent likelihood-informed MCMC(1411.3688)

Oct. 17, 2015 math.NA, stat.CO, stat.ME
Many Bayesian inference problems require exploring the posterior distribution of high-dimensional parameters that represent the discretization of an underlying function. This work introduces a family of Markov chain Monte Carlo (MCMC) samplers that can adapt to the particular structure of a posterior distribution over functions. Two distinct lines of research intersect in the methods developed here. First, we introduce a general class of operator-weighted proposal distributions that are well defined on function space, such that the performance of the resulting MCMC samplers is independent of the discretization of the function. Second, by exploiting local Hessian information and any associated low-dimensional structure in the change from prior to posterior distributions, we develop an inhomogeneous discretization scheme for the Langevin stochastic differential equation that yields operator-weighted proposals adapted to the non-Gaussian structure of the posterior. The resulting dimension-independent, likelihood-informed (DILI) MCMC samplers may be useful for a large class of high-dimensional problems where the target probability measure has a density with respect to a Gaussian reference measure. Two nonlinear inverse problems are used to demonstrate the efficiency of these DILI samplers: an elliptic PDE coefficient inverse problem and path reconstruction in a conditioned diffusion.
• ### Optimal low-rank approximations of Bayesian linear inverse problems(1407.3463)

July 6, 2015 math.NA, stat.CO, stat.ME
In the Bayesian approach to inverse problems, data are often informative, relative to the prior, only on a low-dimensional subspace of the parameter space. Significant computational savings can be achieved by using this subspace to characterize and approximate the posterior distribution of the parameters. We first investigate approximation of the posterior covariance matrix as a low-rank update of the prior covariance matrix. We prove optimality of a particular update, based on the leading eigendirections of the matrix pencil defined by the Hessian of the negative log-likelihood and the prior precision, for a broad class of loss functions. This class includes the F\"{o}rstner metric for symmetric positive definite matrices, as well as the Kullback-Leibler divergence and the Hellinger distance between the associated distributions. We also propose two fast approximations of the posterior mean and prove their optimality with respect to a weighted Bayes risk under squared-error loss. These approximations are deployed in an offline-online manner, where a more costly but data-independent offline calculation is followed by fast online evaluations. As a result, these approximations are particularly useful when repeated posterior mean evaluations are required for multiple data sets. We demonstrate our theoretical results with several numerical examples, including high-dimensional X-ray tomography and an inverse heat conduction problem. In both of these examples, the intrinsic low-dimensional structure of the inference problem can be exploited while producing results that are essentially indistinguishable from solutions computed in the full space.
• ### Likelihood-informed dimension reduction for nonlinear inverse problems(1403.4680)

July 15, 2014 math.NA, stat.CO, stat.ME
The intrinsic dimensionality of an inverse problem is affected by prior information, the accuracy and number of observations, and the smoothing properties of the forward operator. From a Bayesian perspective, changes from the prior to the posterior may, in many problems, be confined to a relatively low-dimensional subspace of the parameter space. We present a dimension reduction approach that defines and identifies such a subspace, called the "likelihood-informed subspace" (LIS), by characterizing the relative influences of the prior and the likelihood over the support of the posterior distribution. This identification enables new and more efficient computational methods for Bayesian inference with nonlinear forward models and Gaussian priors. In particular, we approximate the posterior distribution as the product of a lower-dimensional posterior defined on the LIS and the prior distribution marginalized onto the complementary subspace. Markov chain Monte Carlo sampling can then proceed in lower dimensions, with significant gains in computational efficiency. We also introduce a Rao-Blackwellization strategy that de-randomizes Monte Carlo estimates of posterior expectations for additional variance reduction. We demonstrate the efficiency of our methods using two numerical examples: inference of permeability in a groundwater system governed by an elliptic PDE, and an atmospheric remote sensing problem based on Global Ozone Monitoring System (GOMOS) observations.
• ### Data-Driven Model Reduction for the Bayesian Solution of Inverse Problems(1403.4290)

July 15, 2014 math.NA, stat.CO, stat.ME
One of the major challenges in the Bayesian solution of inverse problems governed by partial differential equations (PDEs) is the computational cost of repeatedly evaluating numerical PDE models, as required by Markov chain Monte Carlo (MCMC) methods for posterior sampling. This paper proposes a data-driven projection-based model reduction technique to reduce this computational cost. The proposed technique has two distinctive features. First, the model reduction strategy is tailored to inverse problems: the snapshots used to construct the reduced-order model are computed adaptively from the posterior distribution. Posterior exploration and model reduction are thus pursued simultaneously. Second, to avoid repeated evaluations of the full-scale numerical model as in a standard MCMC method, we couple the full-scale model and the reduced-order model together in the MCMC algorithm. This maintains accurate inference while reducing its overall computational cost. In numerical experiments considering steady-state flow in a porous medium, the data-driven reduced-order model achieves better accuracy than a reduced-order model constructed using the classical approach. It also improves posterior sampling efficiency by several orders of magnitude compared to a standard MCMC method.