
We introduce a new framework for efficient sampling from complex probability
distributions, using a combination of optimal transport maps and the
Metropolis-Hastings rule. The core idea is to use continuous transportation to
transform typical Metropolis proposal mechanisms (e.g., random walks, Langevin
methods) into non-Gaussian proposal distributions that can more effectively
explore the target density. Our approach adaptively constructs a lower
triangular transport map (an approximation of the Knothe-Rosenblatt
rearrangement) using information from previous MCMC states, via the solution of
an optimization problem. This optimization problem is convex regardless of the
form of the target distribution. It is solved efficiently using a Newton method
that requires no gradient information from the target probability distribution;
the target distribution is instead represented via samples. Sequential updates
enable efficient and parallelizable adaptation of the map even for large
numbers of samples. We show that this approach uses inexact or truncated maps
to produce an adaptive MCMC algorithm that is ergodic for the exact target
distribution. Numerical demonstrations on a range of parameter inference
problems show order-of-magnitude speedups over standard MCMC techniques,
measured by the number of effectively independent samples produced per target
density evaluation and per unit of wall-clock time.
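
As a rough illustration of the core idea, the sketch below runs a random-walk
Metropolis-Hastings chain in reference coordinates and pushes each state
through a lower triangular map into target space. It is not the adaptive
algorithm described above: the map is fixed and hand-picked (an assumption),
and the banana-shaped target, the names `transport`, `log_pullback`, and
`map_preconditioned_mh`, and the step size are all hypothetical choices made
for this example.

```python
import numpy as np

def log_target(x):
    # Unnormalized banana-shaped density: x1 ~ N(0, 1), x2 | x1 ~ N(x1^2, 1).
    return -0.5 * x[0] ** 2 - 0.5 * (x[1] - x[0] ** 2) ** 2

def transport(r):
    # Hypothetical lower triangular map from reference to target space:
    # component k depends only on r[0..k].
    return np.array([r[0], r[1] + r[0] ** 2])

def log_det_jacobian(r):
    # The Jacobian is [[1, 0], [2*r1, 1]], so log|det| = 0 everywhere.
    return 0.0

def log_pullback(r):
    # Target density expressed in reference coordinates.
    return log_target(transport(r)) + log_det_jacobian(r)

def map_preconditioned_mh(n_steps, step=1.0, seed=0):
    rng = np.random.default_rng(seed)
    r = np.zeros(2)
    logp = log_pullback(r)
    samples = np.empty((n_steps, 2))
    accepted = 0
    for i in range(n_steps):
        # Symmetric Gaussian random-walk proposal in reference space.
        r_prop = r + step * rng.standard_normal(2)
        logp_prop = log_pullback(r_prop)
        # Standard Metropolis accept/reject on the pulled-back density.
        if np.log(rng.uniform()) < logp_prop - logp:
            r, logp = r_prop, logp_prop
            accepted += 1
        # Store the state in target coordinates.
        samples[i] = transport(r)
    return samples, accepted / n_steps

samples, acc_rate = map_preconditioned_mh(20000)
```

Because the proposal is Gaussian in reference space, the induced proposal in
target space is non-Gaussian and follows the curvature of the target. An
inexact map would change only the efficiency of the chain, since the
accept/reject step still uses the exact pulled-back density.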

In many inverse problems, model parameters cannot be precisely determined
from observational data. Bayesian inference provides a mechanism for capturing
the resulting parameter uncertainty, but typically at a high computational
cost. This work introduces a multiscale decomposition that exploits conditional
independence across scales, when present in certain classes of inverse
problems, to decouple Bayesian inference into two stages: (1) a computationally
tractable coarse-scale inference problem; and (2) a mapping of the
low-dimensional coarse-scale posterior distribution into the original
high-dimensional parameter space. This decomposition relies on a
characterization of the non-Gaussian joint distribution of coarse- and
fine-scale quantities via optimal transport maps. We demonstrate our approach
on a sequence of inverse problems arising in subsurface flow, using the
multiscale finite element method to discretize the steady-state pressure
equation. We compare the multiscale strategy with full-dimensional Markov chain
Monte Carlo on a problem of moderate dimension (100 parameters) and then use it
to infer a conductivity field described by over 10,000 parameters.
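
The two-stage structure can be sketched on a toy linear-Gaussian problem, where
every distribution is available in closed form and the map from coarse to fine
scale is linear. This is only an illustration under strong assumptions: the
averaging operator `A`, the standard normal prior, the noise level, and the
dimensions are hypothetical, and the non-Gaussian setting described above
would call for nonlinear transport maps instead.

```python
import numpy as np

rng = np.random.default_rng(0)
n_fine, n_coarse, noise_sd, n_samples = 8, 2, 0.1, 20000

# Hypothetical coarsening operator: each coarse value averages 4 fine values.
A = np.kron(np.eye(n_coarse), np.full((1, 4), 0.25))

# Synthetic data that depend on theta only through zeta = A @ theta.
theta_true = rng.standard_normal(n_fine)
y = A @ theta_true + noise_sd * rng.standard_normal(n_coarse)

# Stage 1: coarse-scale posterior of zeta given y, with prior theta ~ N(0, I).
prior_cov_zeta = A @ A.T
post_cov_zeta = np.linalg.inv(
    np.linalg.inv(prior_cov_zeta) + np.eye(n_coarse) / noise_sd ** 2
)
post_mean_zeta = post_cov_zeta @ y / noise_sd ** 2
zeta = rng.multivariate_normal(post_mean_zeta, post_cov_zeta, size=n_samples)

# Stage 2: map (zeta, xi) -> theta so that theta | zeta follows the prior
# conditional. K lifts zeta to the fine scale; P fills in fine-scale
# variability that the coarse quantity does not determine.
K = A.T @ np.linalg.inv(prior_cov_zeta)
P = np.eye(n_fine) - K @ A
xi = rng.standard_normal((n_samples, n_fine))
theta = zeta @ K.T + xi @ P.T

# Reference: full-dimensional posterior computed directly, for comparison.
full_cov = np.linalg.inv(np.eye(n_fine) + A.T @ A / noise_sd ** 2)
full_mean = full_cov @ A.T @ y / noise_sd ** 2
```

In this toy case the fine-scale samples `theta` reproduce the full-dimensional
posterior because the data are conditionally independent of `theta` given
`zeta`, which is the property the decomposition relies on.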

We present the fundamentals of a measure transport approach to sampling. The
idea is to construct a deterministic coupling (i.e., a transport map) between
a complex "target" probability measure of interest and a simpler reference
measure. Given a transport map, one can generate arbitrarily many independent
and unweighted samples from the target simply by pushing forward reference
samples through the map. We consider two different and complementary scenarios:
first, when only evaluations of the unnormalized target density are available,
and second, when the target distribution is known only through a finite
collection of samples. We show that in both settings the desired transports can
be characterized as the solutions of variational problems. We then address
practical issues associated with the optimization-based construction of
transports: choosing finite-dimensional parameterizations of the map, enforcing
monotonicity, quantifying the error of approximate transports, and refining
approximate transports by enriching the corresponding approximation spaces.
Approximate transports can also be used to "Gaussianize" complex distributions
and thus precondition conventional asymptotically exact sampling schemes. We
place the measure transport approach in broader context, describing connections
with other optimization-based samplers, with inference and density estimation
schemes using optimal transport, and with alternative transformation-based
approaches to simulation. We also sketch current work aimed at the construction
of transport maps in high dimensions, exploiting essential features of the
target distribution (e.g., conditional independence, low-rank structure). The
approaches and algorithms presented here have direct applications to Bayesian
computation and to broader problems of stochastic simulation.
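
A minimal sketch of the second scenario (target known only through samples)
follows. It fits a lower triangular map S from target to reference by
minimizing the sample average of 0.5 * S_k(x)^2 - log(dS_k/dx_k) for each
component k, then inverts the map to generate new samples. The quadratic basis
in `features`, the banana-shaped test data, and all function names are
assumptions made for this example. Because each component is linear in its
coefficients here, the convex problems have closed-form minimizers, so no
iterative optimizer appears.

```python
import numpy as np

def features(x1):
    # Hypothetical basis in x1 for the second map component: 1, x1, x1^2.
    return np.column_stack([np.ones_like(x1), x1, x1 ** 2])

def fit_triangular_map(x):
    # Component 1: S_1(x1) = (x1 - mu1) / sd1.
    mu1, sd1 = x[:, 0].mean(), x[:, 0].std()
    # Component 2: S_2(x1, x2) = (x2 - features(x1) @ beta) / sd2.
    # For a fixed slope in x2, the optimal x1-dependence is a least-squares
    # fit; the optimal slope is then 1 / (root-mean-square residual).
    beta, *_ = np.linalg.lstsq(features(x[:, 0]), x[:, 1], rcond=None)
    resid = x[:, 1] - features(x[:, 0]) @ beta
    sd2 = np.sqrt(np.mean(resid ** 2))
    return mu1, sd1, beta, sd2

def forward(params, x):
    # S: target samples -> approximately standard normal reference samples.
    mu1, sd1, beta, sd2 = params
    r1 = (x[:, 0] - mu1) / sd1
    r2 = (x[:, 1] - features(x[:, 0]) @ beta) / sd2
    return np.column_stack([r1, r2])

def inverse(params, r):
    # S^{-1}: invert one component at a time, top to bottom.
    mu1, sd1, beta, sd2 = params
    x1 = mu1 + sd1 * r[:, 0]
    x2 = features(x1) @ beta + sd2 * r[:, 1]
    return np.column_stack([x1, x2])

# Synthetic "target" samples: x1 ~ N(0, 1), x2 | x1 ~ N(x1^2, 1).
rng = np.random.default_rng(0)
z = rng.standard_normal((5000, 2))
target_samples = np.column_stack([z[:, 0], z[:, 1] + z[:, 0] ** 2])

params = fit_triangular_map(target_samples)
reference_like = forward(params, target_samples)   # "Gaussianized" data
new_samples = inverse(params, rng.standard_normal((5000, 2)))
```

Monotonicity is enforced here simply by keeping the diagonal slopes `1 / sd1`
and `1 / sd2` positive. Richer parameterizations would need an explicit
monotonicity constraint and a numerical solver.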