• We discuss problems the null hypothesis significance testing (NHST) paradigm poses for replication and more broadly in the biomedical and social sciences as well as how these problems remain unresolved by proposals involving modified p-value thresholds, confidence intervals, and Bayes factors. We then discuss our own proposal, which is to abandon statistical significance. We recommend dropping the NHST paradigm--and the p-value thresholds intrinsic to it--as the default statistical paradigm for research, publication, and discovery in the biomedical and social sciences. Specifically, we propose that the p-value be demoted from its threshold screening role and instead, treated continuously, be considered along with currently subordinate factors (e.g., related prior evidence, plausibility of mechanism, study design and data quality, real world costs and benefits, novelty of finding, and other factors that vary by research domain) as just one among many pieces of evidence. We have no desire to "ban" p-values or other purely statistical measures. Rather, we believe that such measures should not be thresholded and that, thresholded or not, they should not take priority over the currently subordinate factors. We also argue that it seldom makes sense to calibrate evidence as a function of p-values or other purely statistical measures. We offer recommendations for how our proposal can be implemented in the scientific publication process as well as in statistical decision making more broadly.
  • This chapter will appear in the forthcoming Handbook of Approximate Bayesian Computation (2018). The conceptual and methodological framework that underpins approximate Bayesian computation (ABC) is targeted primarily towards problems in which the likelihood is either challenging or missing. ABC uses a simulation-based non-parametric estimate of the likelihood of a summary statistic and assumes that the generation of data from the model is computationally cheap. This chapter reviews two alternative approaches for estimating the intractable likelihood, with the goal of reducing the necessary model simulations to produce an approximate posterior. The first of these is a Bayesian version of the synthetic likelihood (SL), initially developed by Wood (2010), which uses a multivariate normal approximation to the summary statistic likelihood. Using the parametric approximation as opposed to the non-parametric approximation of ABC, it is possible to reduce the number of model simulations required. The second likelihood approximation method we consider in this chapter is based on the empirical likelihood (EL), which is a non-parametric technique and involves maximising a likelihood constructed empirically under a set of moment constraints. Mengersen et al. (2013) adapt the EL framework so that it can be used to form an approximate posterior for problems where ABC can be applied, that is, for models with intractable likelihoods. However, unlike ABC and the Bayesian SL (BSL), the Bayesian EL (BCel) approach can be used to completely avoid model simulations in some cases. The BSL and BCel methods are illustrated on models of varying complexity.
  • A common approach for Bayesian computation with big data is to partition the data into smaller pieces, perform local inference for each piece separately, and finally combine the results to obtain an approximation to the global posterior. Looking at this from the bottom up, one can perform separate analyses on individual sources of data and then combine these in a larger Bayesian model. In either case, the idea of distributed modeling and inference has both conceptual and computational appeal, but from the Bayesian perspective there is no general way of handling the prior distribution: if the prior is included in each separate inference, it will be multiply-counted when the inferences are combined; but if the prior is itself divided into pieces, it may not provide enough regularization for each separate computation, thus eliminating one of the key advantages of Bayesian methods. To resolve this dilemma, we propose expectation propagation (EP) as a general prototype for distributed Bayesian inference. The central idea is to factor the likelihood according to the data partitions, and to iteratively combine each factor with an approximate model of the prior and all other parts of the data, thus producing an overall approximation to the global posterior at convergence. In this paper, we give an introduction to EP and an overview of some recent developments of the method, with particular emphasis on its use in combining inferences from partitioned data. In addition to distributed modeling of large datasets, our unified treatment also includes hierarchical modeling of data with a naturally partitioned structure. The paper describes a general algorithmic framework, rather than a specific algorithm, and presents an example implementation for it.
  • In this paper, we compare two numerical methods for approximating the probability that the sum of dependent regularly varying random variables exceeds a high threshold under Archimedean copula models. The first method is based on conditional Monte Carlo. We present four estimators and show that most of them have bounded relative errors. The second method is based on analytical expressions of the multivariate survival or cumulative distribution functions of the regularly varying random variables and provides sharp and deterministic bounds of the probability of exceedance. We discuss implementation issues and illustrate the accuracy of both procedures through numerical studies.
  • We consider the Voronoi tessellation based on a homogeneous Poisson point process in $\mathbf{R}^{d}$. For a geometric characteristic of the cells (e.g. the inradius, the circumradius, the volume), we investigate the point process of the nuclei of the cells with large values. Conditions are obtained for the convergence in distribution of this point process of exceedances to a homogeneous compound Poisson point process. We provide a characterization of the asymptotic cluster size distribution which is based on the Palm version of the point process of exceedances. This characterization allows us to compute efficiently the values of the extremal index and the cluster size probabilities by simulation for various geometric characteristics. The extension to the Poisson-Delaunay tessellation is also discussed.
  • While Jeffreys priors usually are well-defined for the parameters of mixtures of distributions, they are not available in closed form. Furthermore, they often are improper priors. Hence, they have never been used to draw inference on the mixture parameters. We study in this paper the implementation and the properties of Jeffreys priors in several mixture settings, show that the associated posterior distributions most often are improper, and then propose a noninformative alternative for the analysis of mixtures.
  • Natural disasters may have considerable impact on society as well as on the (re)insurance industry. Max-stable processes are ideally suited for the modeling of the spatial extent of such extreme events, but it is often assumed that there is no temporal dependence. Only a few papers have introduced spatio-temporal max-stable models, extending the Smith, Schlather and Brown-Resnick spatial processes. These models suffer from two major drawbacks: time plays a similar role as space and the temporal dynamics are not explicit. In order to overcome these defects, we introduce spatio-temporal max-stable models where we partly decouple the influence of time and space in their spectral representations. We introduce both continuous and discrete-time versions. We then consider particular Markovian cases with a max-autoregressive representation and discuss their properties. Finally, we briefly propose an inference methodology which is tested through a simulation study.
  • Assume that claims in a portfolio of insurance contracts are described by independent and identically distributed random variables with regularly varying tails and occur according to a near mixed Poisson process. We provide a collection of results pertaining to the joint asymptotic Laplace transforms of the normalized sums of the smallest and largest claims, when the length of the considered time interval tends to infinity. The results crucially depend on the value of the tail index of the claim distribution, as well as on the number of largest claims under consideration.
  • We attempt to trace the history and development of Markov chain Monte Carlo (MCMC) from its early inception in the late 1940s through its use today. We see how the earlier stages of Monte Carlo (MC, not MCMC) research have led to the algorithms currently in use. More importantly, we see how the development of this methodology has not only changed our solutions to problems, but has changed the way we think about problems.
  • In the Bayesian community, an ongoing imperative is to develop efficient algorithms. An appealing approach is to form a hybrid algorithm by combining ideas from competing existing techniques. This paper addresses issues in designing hybrid methods by considering selected case studies: the delayed rejection algorithm, the pinball sampler, the Metropolis adjusted Langevin algorithm, and the population Monte Carlo algorithm. We observe that even if each component of a hybrid algorithm has individual strengths, they may not contribute equally or even positively when they are combined. Moreover, even if the statistical efficiency is improved, from a practical perspective there are technical issues to be considered such as applicability and computational workload. In order to optimize performance of the algorithm in real time, these issues should be taken into account.
  • The development of statistical methods and numerical algorithms for model choice is vital to many real-world applications. In practice, the ABC approach can be instrumental for sequential model design; however, the theoretical basis of its use has been questioned. We present a measure-theoretic framework for using the ABC error towards model choice and describe how easily existing rejection, Metropolis-Hastings and sequential importance sampling ABC algorithms are extended for the purpose of model checking. Considering a panel of applications from evolutionary biology to dynamic systems, we discuss the choice of summaries which differs from standard ABC approaches. The methods and algorithms presented here may provide the workhorse machinery for an exploratory approach to ABC model choice, particularly as the application of standard Bayesian tools can prove impossible.
  • Approximate Bayesian computation (ABC), also known as likelihood-free methods, has become a favourite tool for the analysis of complex stochastic models, primarily in population genetics but also in financial analyses. We advocated in Grelaud et al. (2009) the use of ABC for Bayesian model choice in the specific case of Gibbs random fields (GRF), relying on a sufficiency property mainly enjoyed by GRFs to show that the approach was legitimate. Despite having previously suggested the use of ABC for model choice in a wider range of models in the DIY ABC software (Cornuet et al., 2008), we present theoretical evidence that the general use of ABC for model choice is fraught with danger in the sense that no amount of computation, however large, can guarantee a proper approximation of the posterior probabilities of the models under comparison.
  • The Savage-Dickey ratio is known as a specialised representation of the Bayes factor (O'Hagan and Forster, 2004) that allows for a functional plugging approximation of this quantity. We demonstrate here that the Savage-Dickey representation is in fact a generic representation of the Bayes factor that relies on specific measure-theoretic versions of the densities involved in the ratio, instead of a special identity imposing the above constraints on the prior distributions. We completely clarify the measure-theoretic foundations of the representation as well as the generalisation of Verdinelli and Wasserman (1995) and propose a comparison of this new approximation with their version, as well as with bridge sampling and Chib's approaches.
  • Gibbs random fields (GRF) are polymorphous statistical models that can be used to analyse different types of dependence, in particular for spatially correlated data. However, when those models are faced with the challenge of selecting a dependence structure from many, the use of standard model choice methods is hampered by the unavailability of the normalising constant in the Gibbs likelihood. In particular, from a Bayesian perspective, the computation of the posterior probabilities of the models under competition requires special likelihood-free simulation techniques like the Approximate Bayesian Computation (ABC) algorithm that is intensively used in population genetics. We show in this paper how to implement an ABC algorithm geared towards model choice in the general setting of Gibbs random fields, demonstrating in particular that there exists a sufficient statistic across models. The accuracy of the approximation to the posterior probabilities can be further improved by importance sampling on the distribution of the models. The practical aspects of the method are detailed through two applications, the test of an iid Bernoulli model versus a first-order Markov chain, and the choice of a folding structure for two proteins.
  • In Scott (2002) and Congdon (2006), a new method is advanced to compute posterior probabilities of models under consideration. It is based solely on MCMC outputs restricted to single models, i.e., it bypasses reversible jump and other model exploration techniques. While it is indeed possible to approximate posterior probabilities based solely on MCMC outputs from single models, as demonstrated by Gelfand and Dey (1994) and Bartolucci et al. (2006), we show that the proposals of Scott (2002) and Congdon (2006) are biased and advance several arguments towards this thesis, the primary one being the confusion between model-based posteriors and joint pseudo-posteriors. From a practical point of view, the bias in Scott's (2002) approximation appears to be much more severe than the one in Congdon's (2006), the latter being often of the same magnitude as the posterior probability it approximates, although we also exhibit an example where the divergence from the true posterior probability is extreme.
  • In Chib (1995), a method for approximating marginal densities in a Bayesian setting is proposed, with one prominent application being the estimation of the number of components in a normal mixture. As pointed out in Neal (1999) and Fruhwirth-Schnatter (2004), the approximation often falls short of providing a proper approximation to the true marginal densities because of the well-known label switching problem (Celeux et al., 2000). While there exist other alternatives to the derivation of approximate marginal densities, we reconsider the original proposal here and show as in Berkhof et al. (2003) and Lee et al. (2008) that it truly approximates the marginal densities once the label switching issue has been solved.
  • Population Monte Carlo has been introduced as a sequential importance sampling technique to overcome poor fit of the importance function. In this paper, we compare the performances of the original Population Monte Carlo algorithm with a modified version that eliminates the influence of the transition particle via a double Rao-Blackwellisation. This modification is shown to improve the exploration of the modes through a large simulation experiment on posterior distributions of mean mixtures of distributions.
  • The k-nearest-neighbour procedure is a well-known deterministic method used in supervised classification. This paper proposes a reassessment of this approach as a statistical technique derived from a proper probabilistic model; in particular, we modify the assessment made in a previous analysis of this method undertaken by Holmes and Adams (2002, 2003), and evaluated by Manocha and Girolami (2007), where the underlying probabilistic model is not completely well-defined. Once a clear probabilistic basis for the k-nearest-neighbour procedure is established, we derive computational tools for conducting Bayesian inference on the parameters of the corresponding model. In particular, we assess the difficulties inherent to pseudo-likelihood and to path sampling approximations of an intractable normalising constant, and propose a perfect sampling strategy to implement a correct MCMC sampler associated with our model. If perfect sampling is not available, we suggest using a Gibbs sampling approximation. Illustrations of the performance of the corresponding Bayesian classifier are provided for several benchmark datasets, demonstrating in particular the limitations of the pseudo-likelihood approximation in this set-up.
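
Several of the abstracts above rely on approximate Bayesian computation (ABC) by rejection: draw a parameter from the prior, simulate data from the model, and keep the draw only if a summary statistic of the simulated data falls close to the observed one. The following is a minimal sketch of that idea, not the implementation used in any of the papers listed. The toy model (a normal mean with unit variance), the uniform prior, the sample mean as summary statistic, the tolerance, and the names `abc_rejection` and `demo` are all assumptions chosen for illustration.

```python
import random
import statistics


def abc_rejection(observed, prior_sampler, simulator, summary,
                  tolerance, n_draws, rng):
    """Rejection ABC: keep prior draws whose simulated summary statistic
    lies within `tolerance` of the observed summary statistic."""
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_sampler(rng)                    # draw from the prior
        fake = simulator(theta, len(observed), rng)   # simulate a dataset
        if abs(summary(fake) - s_obs) <= tolerance:   # compare summaries
            accepted.append(theta)
    return accepted


def demo(seed=0):
    """Toy example (hypothetical): infer the mean of a N(theta, 1) sample."""
    rng = random.Random(seed)
    observed = [rng.gauss(2.0, 1.0) for _ in range(50)]
    accepted = abc_rejection(
        observed,
        prior_sampler=lambda r: r.uniform(-10.0, 10.0),
        simulator=lambda theta, n, r: [r.gauss(theta, 1.0) for _ in range(n)],
        summary=statistics.fmean,
        tolerance=0.1,
        n_draws=20000,
        rng=rng,
    )
    return observed, accepted


if __name__ == "__main__":
    observed, accepted = demo()
    print("accepted draws:", len(accepted))
    print("observed mean:", statistics.fmean(observed))
    print("ABC posterior mean:", statistics.fmean(accepted))
```

In this toy setting the sample mean is sufficient for the parameter, so the accepted draws approximate the true posterior as the tolerance shrinks. With a non-sufficient summary, or when comparing models rather than estimating a parameter, that guarantee is lost, which is the concern raised in the model-choice abstracts above.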