
We propose optimal dimensionality reduction techniques for the solution of
goal-oriented linear-Gaussian inverse problems, where the quantity of interest
(QoI) is a function of the inversion parameters. These approximations are
suitable for large-scale applications. In particular, we study the
approximation of the posterior covariance of the QoI as a low-rank negative
update of its prior covariance, and prove optimality of this update with
respect to the natural geodesic distance on the manifold of symmetric positive
definite matrices. Assuming exact knowledge of the posterior mean of the QoI,
the optimality results extend to optimality in distribution with respect to the
Kullback-Leibler divergence and the Hellinger distance between the associated
distributions. We also propose approximation of the posterior mean of the QoI
as a low-rank linear function of the data, and prove optimality of this
approximation with respect to a weighted Bayes risk. Both of these optimal
approximations avoid the explicit computation of the full posterior
distribution of the parameters and instead focus on directions that are well
informed by the data and relevant to the QoI. These directions stem from a
balance among all the components of the goal-oriented inverse problem: prior
information, forward model, measurement noise, and ultimate goals. We
illustrate the theory using a high-dimensional inverse problem in heat
transfer.
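
As a small-scale illustration of the low-rank negative update described above, the following sketch builds the prior and posterior covariances of a linear QoI $Z = O x$ in a linear-Gaussian model $y = G x + e$, then truncates the update in coordinates whitened by the prior QoI covariance. The names `G`, `O`, `cov_prior`, `cov_noise`, and both function names are assumptions made for this example. Keeping the leading whitened eigenpairs is our reading of the optimal update, not a transcription of the paper's theorem. The dense matrices are formed only for clarity, whereas a large-scale implementation would avoid them.

```python
import numpy as np

def qoi_covariance_lowrank(G, O, cov_prior, cov_noise, rank):
    """Rank-`rank` negative update of the prior QoI covariance (illustrative)."""
    cov_qoi_prior = O @ cov_prior @ O.T             # prior covariance of Z = O x
    cov_data = cov_noise + G @ cov_prior @ G.T      # marginal covariance of the data
    cross = O @ cov_prior @ G.T                     # cross-covariance Cov(Z, y)
    # Exact update: prior QoI covariance minus posterior QoI covariance.
    update = cross @ np.linalg.solve(cov_data, cross.T)
    # Whiten the update by the prior QoI covariance and diagonalize it.
    L = np.linalg.cholesky(cov_qoi_prior)
    whitened = np.linalg.solve(L, np.linalg.solve(L, update).T)
    lam, U = np.linalg.eigh(0.5 * (whitened + whitened.T))
    keep = np.argsort(lam)[::-1][:rank]             # largest eigenvalues first
    W = L @ U[:, keep]                              # directions in QoI space
    approx = cov_qoi_prior - (W * lam[keep]) @ W.T  # low-rank negative update
    exact = cov_qoi_prior - update
    return approx, exact

def forstner_distance(A, B):
    """Geodesic (Forstner) distance between SPD matrices A and B."""
    L = np.linalg.cholesky(B)
    M = np.linalg.solve(L, np.linalg.solve(L, A).T)
    sigma = np.linalg.eigvalsh(0.5 * (M + M.T))     # generalized eigenvalues of (A, B)
    return np.sqrt(np.sum(np.log(sigma) ** 2))
```

In this sketch the distance between `approx` and `exact` does not increase as `rank` grows, and it vanishes (up to round-off) once `rank` equals the QoI dimension.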

We describe stochastic Newton and stochastic quasi-Newton approaches to
efficiently solve large linear least-squares problems where the very large data
sets present a significant computational burden (e.g., the size may exceed
computer memory or data are collected in real time). In our proposed approach,
stochasticity is introduced in two different frameworks as a means to overcome
these computational limitations, and probability distributions that can exploit
structure and/or sparsity are considered. Theoretical results on consistency of
the approximations for both the stochastic Newton and the stochastic
quasi-Newton methods are provided. The results show, in particular, that
stochastic Newton iterates, in contrast to stochastic quasi-Newton iterates,
may not converge to the desired least-squares solution. Numerical examples,
including an example from extreme learning machines, demonstrate the potential
applications of these methods.
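
The contrast between the two kinds of iterates can be illustrated with a simplified sketch based on random row blocks. This is not the algorithm of the paper: the block sampling by random reshuffling, the function name `stochastic_lsq`, the `accumulate` flag, and the small regularization `delta` are all assumptions made for the example. With `accumulate=True` the curvature from every block seen so far is accumulated (a quasi-Newton-like, recursive least-squares update). With `accumulate=False` only the current block's curvature is used (a Newton-like update on the sampled problem).

```python
import numpy as np

def stochastic_lsq(A, b, block_size, epochs, accumulate, seed=0, delta=1e-8):
    """Block-sampled iteration for min ||A x - b||^2 (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    x = np.zeros(n)
    M = delta * np.eye(n)                # accumulated curvature, used if accumulate=True
    for _ in range(epochs):
        order = rng.permutation(m)       # visit the rows in a random order
        for start in range(0, m, block_size):
            rows = order[start:start + block_size]
            Ak, bk = A[rows], b[rows]    # only this block needs to be in memory
            grad = Ak.T @ (Ak @ x - bk)  # gradient of the sampled objective
            if accumulate:
                M = M + Ak.T @ Ak        # add the new block's curvature
                x = x - np.linalg.solve(M, grad)
            else:
                # Newton-like step using the current block only.
                x = x - np.linalg.pinv(Ak.T @ Ak) @ grad
    return x
```

For noisy data, the accumulated variant returns (up to the effect of `delta`) the full least-squares solution at the end of each pass, while the block-only variant ends at the least-squares solution of the last block, which generally differs from it.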

In the Bayesian approach to inverse problems, data are often informative,
relative to the prior, only on a low-dimensional subspace of the parameter
space. Significant computational savings can be achieved by using this subspace
to characterize and approximate the posterior distribution of the parameters.
We first investigate approximation of the posterior covariance matrix as a
low-rank update of the prior covariance matrix. We prove optimality of a
particular update, based on the leading eigendirections of the matrix pencil
defined by the Hessian of the negative log-likelihood and the prior precision,
for a broad class of loss functions. This class includes the F\"{o}rstner
metric for symmetric positive definite matrices, as well as the
Kullback-Leibler divergence and the Hellinger distance between the associated
distributions. We also propose two fast approximations of the posterior mean
and prove their optimality with respect to a weighted Bayes risk under
squared-error loss. These approximations are deployed in an offline-online
manner, where a more costly but data-independent offline calculation is
followed by fast online evaluations. As a result, these approximations are
particularly useful when repeated posterior mean evaluations are required for
multiple data sets. We demonstrate our theoretical results with several
numerical examples, including high-dimensional X-ray tomography and an inverse
heat conduction problem. In both of these examples, the intrinsic
low-dimensional structure of the inference problem can be exploited while
producing results that are essentially indistinguishable from solutions
computed in the full space.
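
A minimal sketch of the covariance update for a linear-Gaussian model $y = G x + e$ is given below. It computes the leading eigenpairs of the pencil formed by the Hessian $H = G^\top \Gamma_{\mathrm{obs}}^{-1} G$ and the prior precision through the prior-preconditioned Hessian, then subtracts a rank-$r$ term from the prior covariance. The function name, the argument names, and the use of dense factorizations are assumptions made for illustration. A large-scale implementation would use matrix-free products and an iterative or randomized eigensolver instead.

```python
import numpy as np

def posterior_covariance_lowrank(G, cov_prior, cov_noise, rank):
    """Approximate the posterior covariance as prior minus a rank-`rank` term."""
    H = G.T @ np.linalg.solve(cov_noise, G)     # Hessian of the negative log-likelihood
    S = np.linalg.cholesky(cov_prior)           # prior square root: cov_prior = S S^T
    # Eigenpairs of the prior-preconditioned Hessian S^T H S correspond to
    # generalized eigenpairs (delta2, w = S v) of the pencil (H, prior precision).
    delta2, V = np.linalg.eigh(S.T @ H @ S)
    keep = np.argsort(delta2)[::-1][:rank]      # most data-informed directions first
    W = S @ V[:, keep]
    weights = delta2[keep] / (1.0 + delta2[keep])
    return cov_prior - (W * weights) @ W.T
```

When `rank` reaches the rank of `H`, the update reproduces the exact posterior covariance $(H + \Gamma_{\mathrm{pr}}^{-1})^{-1}$. For smaller ranks the approximation stays positive definite because each weight is strictly less than one.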

We present a local density estimator based on first order statistics. To
estimate the density at a point, $x$, the original sample is divided into
subsets and the average minimum sample distance to $x$ over all such subsets is
used to define the density estimate at $x$. The tuning parameter is thus the
number of subsets instead of the typical bandwidth of kernel or histogram-based
density estimators. The proposed method is similar to nearest-neighbor density
estimators but it provides smoother estimates. We derive the asymptotic
distribution of this minimum sample distance statistic to study globally
optimal values for the number and size of the subsets. Simulations are used to
illustrate and compare the convergence properties of the estimator. The results
show that the method provides good estimates of a wide variety of densities
without changes of the tuning parameter, and that it offers competitive
convergence performance.
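
The construction can be sketched in one dimension as follows. The sample is split at random into `n_subsets` subsets of equal size $s$, and the minimum distance to $x$ is averaged over the subsets. The normalization $1 / (2 (s + 1) \bar{d})$ comes from treating the density as locally uniform around $x$, in which case the expected minimum distance is $1 / (2 f(x) (s + 1))$. That normalization and the function name are assumptions of this sketch and may differ from the estimator analyzed in the paper.

```python
import numpy as np

def min_distance_density(x, sample, n_subsets, seed=0):
    """1-D density estimate at x from averaged minimum distances (illustrative)."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(np.asarray(sample, dtype=float))
    size = len(shuffled) // n_subsets                  # points per subset (s)
    subsets = shuffled[: size * n_subsets].reshape(n_subsets, size)
    min_dist = np.abs(subsets - x).min(axis=1)         # first order statistic per subset
    # Locally uniform approximation: E[min distance] = 1 / (2 f(x) (s + 1)).
    return 1.0 / (2.0 * (size + 1) * min_dist.mean())
```

For example, with a large sample drawn uniformly on $[0, 1]$, calling `min_distance_density(0.5, sample, n_subsets)` should return a value close to 1. The only tuning choice is `n_subsets`.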