• We provide a general discussion of Smolyak's algorithm for the acceleration of scientific computations. The algorithm first appeared in Smolyak's work on multidimensional integration and interpolation. Since then, it has been generalized in multiple directions and has been associated with the keywords: sparse grids, hyperbolic cross approximation, combination technique, and multilevel methods. Variants of Smolyak's algorithm have been employed in the computation of high-dimensional integrals in finance, chemistry, and physics, in the numerical solution of partial and stochastic differential equations, and in uncertainty quantification. Motivated by this broad and ever-increasing range of applications, we describe a general framework that summarizes fundamental results and assumptions in a concise application-independent manner.
  • We provide a framework for the sparse approximation of multilinear problems and show that several problems in uncertainty quantification fit within this framework. In these problems, the value of a multilinear map has to be approximated using approximations of different accuracy and computational work of the arguments of this map. We propose and analyze a generalized version of Smolyak's algorithm, which provides sparse approximation formulas with convergence rates that mitigate the curse of dimension that appears in multilinear approximation problems with a large number of arguments. We apply the general framework to response surface approximation and optimization under uncertainty for parametric partial differential equations using kernel-based approximation. The theoretical results are supplemented by numerical experiments.
  • In this work we propose a stochastic model for estimating the occurrence of crack initiations on the surface of metallic specimens in fatigue problems that can be applied to a general class of geometries. The stochastic model is based on spatial Poisson processes with intensity function that combines stress-life (S-N) curves with averaged effective stress, $\sigma_{{\mathrm{eff}}}^{\Delta} (\mathbf{x})$, which is computed after solving numerically the linear elasticity equations on the specimen domains using finite element methods. Here, $\Delta$ is a parameter that characterizes the size of the neighbors covering the domain boundary. The averaged effective stress, parameterized by $\Delta$, maps the stress tensor to a scalar field upon the specimen domain. Data from fatigue experiments on notched and unnotched sheet specimens of 75S-T6 aluminum alloys are used to calibrate the model parameters for the individual data sets and for their combination. Bayesian and classical approaches are applied to estimate the survival-probability function for any specimen tested under a prescribed fatigue experimental setup. Our proposed model can predict the remaining life of specimens from the same material with new geometries.
  • In this work, we present the ensemble-marginalized Kalman filter (EnMKF), a sequential algorithm analogous to our previously proposed approach [1,2], for estimating the state and parameters of linear parabolic partial differential equations in initial-boundary value problems when the boundary data are noisy. We apply EnMKF to infer the thermal properties of building walls and to estimate the corresponding heat flux from real and synthetic data. Compared with a modified Ensemble Kalman Filter (EnKF) that is not marginalized, EnMKF reduces the bias error, avoids the collapse of the ensemble without needing to add inflation, and converges to the mean field posterior using $50\%$ or less of the ensemble size required by EnKF. According to our results, the marginalization technique in EnMKF is key to performance improvement with smaller ensembles at any fixed time.
  • We design and analyse the performance of a multilevel ensemble Kalman filter method (MLEnKF) for filtering settings where the underlying state-space model is an infinite-dimensional spatio-temporal process. We consider underlying models that needs to be simulated by numerical methods, with discretization in both space and time. The multilevel Monte Carlo (MLMC) sampling strategy, achieving variance reduction through pairwise coupling of ensemble particles on neighboring resolutions, is used in the sample-moment step of MLEnKF to produce an efficient hierarchical filtering method for spatio-temporal models. Under sufficient regularity, MLEnKF is proven to be more efficient for weak approximations than EnKF, asymptotically in the large-ensemble and fine-numerical-resolution limit. Numerical examples support our theoretical findings.
  • In calculating expected information gain in optimal Bayesian experimental design, the computation of the inner loop in the classical double-loop Monte Carlo requires a large number of samples and suffers from underflow if the number of samples is small. These drawbacks can be avoided by using an importance sampling approach. We present a computationally efficient method for optimal Bayesian experimental design that introduces importance sampling based on the Laplace method to the inner loop. We derive the optimal values for the method parameters in which the average computational cost is minimized according to the desired error tolerance. We use three numerical examples to demonstrate the computational efficiency of our method compared with the classical double-loop Monte Carlo, and a more recent single-loop Monte Carlo method that uses the Laplace method as an approximation of the return value of the inner loop. The first example is a scalar problem that is linear in the uncertain parameter. The second example is a nonlinear scalar problem. The third example deals with the optimal sensor placement for an electrical impedance tomography experiment to recover the fiber orientation in laminate composites.
  • Weighted least squares polynomial approximation uses random samples to determine projections of functions onto spaces of polynomials. It has been shown that, using an optimal distribution of sample locations, the number of samples required to achieve quasi-optimal approximation in a given polynomial subspace scales, up to a logarithmic factor, linearly in the dimension of this space. However, in many applications, the computation of samples includes a numerical discretization error. Thus, obtaining polynomial approximations with a single level method can become prohibitively expensive, as it requires a sufficiently large number of samples, each computed with a sufficiently small discretization error. As a solution to this problem, we propose a multilevel method that utilizes samples computed with different accuracies and is able to match the accuracy of single-level approximations with reduced computational cost. We derive complexity bounds under certain assumptions about polynomial approximability and sample work. Furthermore, we propose an adaptive algorithm for situations where such assumptions cannot be verified a priori. Finally, we provide an efficient algorithm for the sampling from optimal distributions and an analysis of computationally favorable alternative distributions. Numerical experiments underscore the practical applicability of our method.
  • The assessment of the thermal properties of walls is essential for accurate building energy simulations that are needed to make effective energy-saving policies. These properties are usually investigated through in-situ measurements of temperature and heat flux over extended time periods. The one-dimensional heat equation with unknown Dirichlet boundary conditions is used to model the heat transfer process through the wall. In [F. Ruggeri, Z. Sawlan, M. Scavino, R. Tempone, A hierarchical Bayesian setting for an inverse problem in linear parabolic PDEs with noisy boundary conditions, Bayesian Analysis 12 (2) (2017) 407--433], it was assessed the uncertainty about the thermal diffusivity parameter using different synthetic data sets. In this work, we adapt this methodology to an experimental study conducted in an environmental chamber, with measurements recorded every minute from temperature probes and heat flux sensors placed on both sides of a solid brick wall over a five-day period. The observed time series are locally averaged, according to a smoothing procedure determined by the solution of a criterion function optimization problem, to fit the required set of noise model assumptions. Therefore, after preprocessing, we can reasonably assume that the temperature and the heat flux measurements have stationary Gaussian noise and we can avoid working with full covariance matrices. The results show that our technique reduces the bias error of the estimated parameters when compared to other approaches. Finally, we compute the information gain under two experimental setups to recommend how the user can efficiently determine the duration of the measurement campaign and the range of the external temperature oscillation.
  • This work addresses the problem of pricing American basket options in a multivariate setting, which includes among others, the Bachelier and the Black-Scholes models. In high dimensions, nonlinear partial differential equation methods for solving the problem become prohibitively costly due to the curse of dimensionality. Instead, this work proposes to use a stopping rule that depends on the dynamics of a low-dimensional Markovian projection of the given basket of assets. It is shown that the ability to approximate the original value function by a lower-dimensional approximation is a feature of the dynamics of the system and is unaffected by the path-dependent nature of the American basket option. Assuming that we know the density of the forward process and using the Laplace approximation, we first efficiently evaluate the diffusion coefficient corresponding to the low-dimensional Markovian projection of the basket. Then, we approximate the optimal early-exercise boundary of the option by solving a Hamilton-Jacobi-Bellman partial differential equation in the projected, low-dimensional space. The resulting near-optimal early-exercise boundary is used to produce an exercise strategy for the high-dimensional option, thereby providing a lower bound for the price of the American basket option. A corresponding upper bound is also provided. These bounds allow to assess the accuracy of the proposed pricing method. Indeed, our approximate early-exercise strategy provides a straightforward lower bound for the American basket option price. Following a duality argument due to Rogers, we derive a corresponding upper bound solving only the low-dimensional optimal control problem. Numerically, we show the feasibility of the method using baskets with dimensions up to fifty. In these examples, the resulting option price relative errors are only of the order of few percent.
  • The sum of Log-normal variates is encountered in many challenging applications such as in performance analysis of wireless communication systems and in financial engineering. Several approximation methods have been developed in the literature, the accuracy of which is not ensured in the tail regions. These regions are of primordial interest wherein small probability values have to be evaluated with high precision. Variance reduction techniques are known to yield accurate, yet efficient, estimates of small probability values. Most of the existing approaches, however, have considered the problem of estimating the right-tail of the sum of Log-normal random variables (RVS). In the present work, we consider instead the estimation of the left-tail of the sum of correlated Log-normal variates with Gaussian copula under a mild assumption on the covariance matrix. We propose an estimator combining an existing mean-shifting importance sampling approach with a control variate technique. The main result is that the proposed estimator has an asymptotically vanishing relative error which represents a major finding in the context of the left-tail simulation of the sum of Log-normal RVs. Finally, we assess by various simulation results the performances of the proposed estimator compared to existing estimators.
  • We address the approximation of functionals depending on a system of particles, described by stochastic differential equations (SDEs), in the mean-field limit when the number of particles approaches infinity. This problem is equivalent to estimating the weak solution of the limiting McKean-Vlasov SDE. To that end, our approach uses systems with finite numbers of particles and a time-stepping scheme. In this case, there are two discretization parameters: the number of time steps and the number of particles. Based on these two parameters, we consider different variants of the Monte Carlo and Multilevel Monte Carlo (MLMC) methods and show that, in the best case, the optimal work complexity of MLMC, to estimate the functional in one typical setting with an error tolerance of $\mathrm{TOL}$, is $\mathcal O\left({\mathrm{TOL}^{-3}}\right)$ when using the partitioning estimator and the Milstein time-stepping scheme. We also consider a method that uses the recent Multi-index Monte Carlo method and show an improved work complexity in the same typical setting of $\mathcal O\left(\mathrm{TOL}^{-2}\log(\mathrm{TOL}^{-1})^2\right)$. Our numerical experiments are carried out on the so-called Kuramoto model, a system of coupled oscillators.
  • We consider the problem of pricing basket options in a multivariate Black Scholes or Variance Gamma model. From a numerical point of view, pricing such options corresponds to moderate and high dimensional numerical integration problems with non-smooth integrands. Due to this lack of regularity, higher order numerical integration techniques may not be directly available, requiring the use of methods like Monte Carlo specifically designed to work for non-regular problems. We propose to use the inherent smoothing property of the density of the underlying in the above models to mollify the payoff function by means of an exact conditional expectation. The resulting conditional expectation is unbiased and yields a smooth integrand, which is amenable to the efficient use of adaptive sparse grid cubature. Numerical examples indicate that the high-order method may perform orders of magnitude faster compared to Monte Carlo or Quasi Monte Carlo in dimensions up to 35.
  • In the following article we consider approximate Bayesian computation (ABC) inference. We introduce a method for numerically approximating ABC posteriors using the multilevel Monte Carlo (MLMC). A sequential Monte Carlo version of the approach is developed and it is shown under some assumptions that for a given level of mean square error, this method for ABC has a lower cost than i.i.d. sampling from the most accurate ABC approximation. Several numerical examples are given.
  • This work embeds a multilevel Monte Carlo (MLMC) sampling strategy into the Monte Carlo step of the ensemble Kalman filter (EnKF), thereby yielding a multilevel ensemble Kalman filter (MLEnKF) which has provably superior asymptotic cost to a given accuracy level. The development of MLEnKF for finite-dimensional state-spaces in the work [20] is here extended to models with infinite-dimensional state- spaces in the form of spatial fields. A concrete example is given to illustrate the results.
  • We derive computable error estimates for finite element approximations of linear elliptic partial differential equations (PDE) with rough stochastic coefficients. In this setting, the exact solutions contain high frequency content that standard a posteriori error estimates fail to capture. We propose goal-oriented estimates, based on local error indicators, for the pathwise Galerkin and expected quadrature errors committed in standard, continuous, piecewise linear finite element approximations. Derived using easily validated assumptions, these novel estimates can be computed at a relatively low cost and have applications to subsurface flow problems in geophysics where the conductivities are assumed to have lognormal distributions with low regularity. Our theory is supported by numerical experiments on test problems in one and two dimensions.
  • We analyze the recent Multi-index Stochastic Collocation (MISC) method for computing statistics of the solution of a partial differential equation (PDEs) with random data, where the random coefficient is parametrized by means of a countable sequence of terms in a suitable expansion. MISC is a combination technique based on mixed differences of spatial approximations and quadratures over the space of random data and, naturally, the error analysis uses the joint regularity of the solution with respect to both the variables in the physical domain and parametric variables. In MISC, the number of problem solutions performed at each discretization level is not determined by balancing the spatial and stochastic components of the error, but rather by suitably extending the knapsack-problem approach employed in the construction of the quasi-optimal sparse-grids and Multi-index Monte Carlo methods. We use a greedy optimization procedure to select the most effective mixed differences to include in the MISC estimator. We apply our theoretical estimates to a linear elliptic PDEs in which the log-diffusion coefficient is modeled as a random field, with a covariance similar to a Mat\'ern model, whose realizations have spatial regularity determined by a scalar parameter. We conduct a complexity analysis based on a summability argument showing algebraic rates of convergence with respect to the overall computational work. The rate of convergence depends on the smoothness parameter, the physical dimensionality and the efficiency of the linear solver. Numerical experiments show the effectiveness of MISC in this infinite-dimensional setting compared with the Multi-index Monte Carlo method and compare the convergence rate against the rates predicted in our theoretical analysis.
  • In this work we introduce the Multi-Index Stochastic Collocation method (MISC) for computing statistics of the solution of a PDE with random data. MISC is a combination technique based on mixed differences of spatial approximations and quadratures over the space of random data. We propose an optimization procedure to select the most effective mixed differences to include in the MISC estimator: such optimization is a crucial step and allows us to build a method that, provided with sufficient solution regularity, is potentially more effective than other multi-level collocation methods already available in literature. We then provide a complexity analysis that assumes decay rates of product type for such mixed differences, showing that in the optimal case the convergence rate of MISC is only dictated by the convergence of the deterministic solver applied to a one dimensional problem. We show the effectiveness of MISC with some computational tests, comparing it with other related methods available in the literature, such as the Multi-Index and Multilevel Monte Carlo, Multilevel Stochastic Collocation, Quasi Optimal Stochastic Collocation and Sparse Composite Collocation methods.
  • This work embeds a multilevel Monte Carlo sampling strategy into the Monte Carlo step of the ensemble Kalman filter (EnKF) in the setting of finite dimensional signal evolution and noisy discrete-time observations. The signal dynamics is assumed to be governed by a stochastic differential equation (SDE), and a hierarchy of time grids is introduced for multilevel numerical integration of that SDE. The resulting multilevel EnKF is proved to asymptotically outperform EnKF in terms of computational cost versus approximation accuracy. The theoretical results are illustrated numerically.
  • The proof of convergence of the standard ensemble Kalman filter (EnKF) from Legland etal. (2011) is extended to non-Gaussian state space models. A density-based deterministic approximation of the mean-field limit EnKF (DMFEnKF) is proposed, consisting of a PDE solver and a quadrature rule. Given a certain minimal order of convergence $\kappa$ between the two, this extends to the deterministic filter approximation, which is therefore asymptotically superior to standard EnKF when the dimension $d<2\kappa$. The fidelity of approximation of the true distribution is also established using an extension of total variation metric to random measures. This is limited by a Gaussian bias term arising from non-linearity/non-Gaussianity of the model, which exists for both DMFEnKF and standard EnKF. Numerical results support and extend the theory.
  • In biochemically reactive systems with small copy numbers of one or more reactant molecules, the dynamics is dominated by stochastic effects. To approximate those systems, discrete state-space and stochastic simulation approaches have been shown to be more relevant than continuous state-space and deterministic ones. In systems characterized by having simultaneously fast and slow timescales, existing discrete space-state stochastic path simulation methods, such as the stochastic simulation algorithm (SSA) and the explicit tau-leap (Explicit-TL) method, can be very slow. Implicit approximations have been developed to improve numerical stability and provide efficient simulation algorithms for those systems. Here, we propose an efficient Multilevel Monte Carlo (MLMC) method in the spirit of the work by Anderson and Higham (2012). This method uses split-step implicit tau-leap (SSI-TL) at levels where the SSI-TL method is not applicable due to numerical stability issues. We present numerical examples that illustrate the performance of the proposed method.
  • In this work, we present a statistical treatment of stress-life (S-N) data drawn from a collection of records of fatigue experiments that were performed on 75S-T6 aluminum alloys. Our main objective is to predict the fatigue life of materials by providing a systematic approach to model calibration, model selection and model ranking with reference to S-N data. To this purpose, we consider fatigue-limit models and random fatigue-limit models that are specially designed to allow the treatment of the run-outs (right-censored data). We first fit the models to the data by maximum likelihood methods and estimate the quantiles of the life distribution of the alloy specimen. To assess the robustness of the estimation of the quantile functions, we obtain bootstrap confidence bands by stratified resampling with respect to the cycle ratio. We then compare and rank the models by classical measures of fit based on information criteria. We also consider a Bayesian approach that provides, under the prior distribution of the model parameters selected by the user, their simulation-based posterior distributions. We implement and apply Bayesian model comparison methods, such as Bayes factor ranking and predictive information criteria based on cross-validation techniques under various a priori scenarios.
  • We provide a bound for the error committed when using a Fourier method to price European options when the underlying follows an exponential \levy dynamic. The price of the option is described by a partial integro-differential equation (PIDE). Applying a Fourier transformation to the PIDE yields an ordinary differential equation that can be solved analytically in terms of the characteristic exponent of the \levy process. Then, a numerical inverse Fourier transform allows us to obtain the option price. We present a novel bound for the error and use this bound to set the parameters for the numerical method. We analyse the properties of the bound for a dissipative and pure-jump example. The bound presented is independent of the asymptotic behaviour of option prices at extreme asset prices. The error bound can be decomposed into a product of terms resulting from the dynamics and the option payoff, respectively. The analysis is supplemented by numerical examples that demonstrate results comparable to and superior to the existing literature.
  • We consider the wave equation with highly oscillatory initial data, where there is uncertainty in the wave speed, initial phase and/or initial amplitude. To estimate quantities of interest related to the solution and their statistics, we combine a high-frequency method based on Gaussian beams with sparse stochastic collocation. Although the wave solution, $u^\varepsilon$, is highly oscillatory in both physical and stochastic spaces, we provide theoretical arguments and numerical evidence that quantities of interest based on local averages of $|u^\varepsilon|^2$ are smooth, with derivatives in the stochastic space uniformly bounded in $\varepsilon$, where $\varepsilon$ denotes the short wavelength. This observable related regularity makes the sparse stochastic collocation approach more efficient than Monte Carlo methods. We present numerical tests that demonstrate this advantage.
  • A formal mean square error expansion (MSE) is derived for Euler--Maruyama numerical solutions of stochastic differential equations (SDE). The error expansion is used to construct a pathwise a posteriori adaptive time stepping Euler--Maruyama method for numerical solutions of SDE, and the resulting method is incorporated into a multilevel Monte Carlo (MLMC) method for weak approximations of SDE. This gives an efficient MSE adaptive MLMC method for handling a number of low-regularity approximation problems. In low-regularity numerical example problems, the developed adaptive MLMC method is shown to outperform the uniform time stepping MLMC method by orders of magnitude, producing output whose error with high probability is bounded by TOL>0 at the near-optimal MLMC cost rate O(TOL^{-2}log(TOL)^4).
  • In this work, we present an extension to the context of Stochastic Reaction Networks (SRNs) of the forward-reverse representation introduced in "Simulation of forward-reverse stochastic representations for conditional diffusions", a 2014 paper by Bayer and Schoenmakers. We apply this stochastic representation in the computation of efficient approximations of expected values of functionals of SNR bridges, i.e., SRNs conditioned to its values in the extremes of given time-intervals. We then employ this SNR bridge-generation technique to the statistical inference problem of approximating the reaction propensities based on discretely observed data. To this end, we introduce a two-phase iterative inference method in which, during phase I, we solve a set of deterministic optimization problems where the SRNs are replaced by their reaction-rate Ordinary Differential Equations (ODEs) approximation; then, during phase II, we apply the Monte Carlo version of the Expectation-Maximization (EM) algorithm starting from the phase I output. By selecting a set of over dispersed seeds as initial points for phase I, the output of parallel runs from our two-phase method is a cluster of approximate maximum likelihood estimates. Our results are illustrated by numerical examples.