
We discuss problems the null hypothesis significance testing (NHST) paradigm
poses for replication and more broadly in the biomedical and social sciences as
well as how these problems remain unresolved by proposals involving modified
p-value thresholds, confidence intervals, and Bayes factors. We then discuss
our own proposal, which is to abandon statistical significance. We recommend
dropping the NHST paradigm, and the p-value thresholds intrinsic to it, as the
default statistical paradigm for research, publication, and discovery in the
biomedical and social sciences. Specifically, we propose that the p-value be
demoted from its threshold screening role and instead, treated continuously, be
considered along with currently subordinate factors (e.g., related prior
evidence, plausibility of mechanism, study design and data quality, real world
costs and benefits, novelty of finding, and other factors that vary by research
domain) as just one among many pieces of evidence. We have no desire to "ban"
p-values or other purely statistical measures. Rather, we believe that such
measures should not be thresholded and that, thresholded or not, they should
not take priority over the currently subordinate factors. We also argue that it
seldom makes sense to calibrate evidence as a function of p-values or other
purely statistical measures. We offer recommendations for how our proposal can
be implemented in the scientific publication process as well as in statistical
decision making more broadly.

This chapter will appear in the forthcoming Handbook of Approximate Bayesian
Computation (2018).
The conceptual and methodological framework that underpins approximate
Bayesian computation (ABC) is targeted primarily towards problems in which the
likelihood is either challenging or missing. ABC uses a simulation-based
nonparametric estimate of the likelihood of a summary statistic and assumes
that the generation of data from the model is computationally cheap. This
chapter reviews two alternative approaches for estimating the intractable
likelihood, with the goal of reducing the necessary model simulations to
produce an approximate posterior. The first of these is a Bayesian version of
the synthetic likelihood (SL), initially developed by Wood (2010), which uses a
multivariate normal approximation to the summary statistic likelihood. Using
the parametric approximation as opposed to the nonparametric approximation of
ABC, it is possible to reduce the number of model simulations required. The
second likelihood approximation method we consider in this chapter is based on
the empirical likelihood (EL), which is a nonparametric technique and involves
maximising a likelihood constructed empirically under a set of moment
constraints. Mengersen et al. (2013) adapt the EL framework so that it can be
used to form an approximate posterior for problems where ABC can be applied,
that is, for models with intractable likelihoods. However, unlike ABC and the
Bayesian SL (BSL), the Bayesian EL (BCel) approach can be used to completely
avoid model simulations in some cases. The BSL and BCel methods are illustrated
on models of varying complexity.
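
As an illustration of the synthetic likelihood idea described above, the following minimal Python sketch uses the univariate special case of the normal approximation, assuming a single scalar summary statistic. The toy simulator `simulate_summary` (a Normal location model), the function names, and all tuning values are assumptions made for illustration; they are not taken from the chapter.

```python
import math
import random
import statistics


def simulate_summary(theta, n_obs, rng):
    """Hypothetical toy simulator: mean of n_obs draws from N(theta, 1)."""
    return statistics.fmean([rng.gauss(theta, 1.0) for _ in range(n_obs)])


def synthetic_loglik(theta, s_obs, n_sims, n_obs, rng):
    """Normal approximation to the log-likelihood of the summary at theta."""
    sims = [simulate_summary(theta, n_obs, rng) for _ in range(n_sims)]
    mu = statistics.fmean(sims)        # estimated mean of the summary
    var = statistics.variance(sims)    # estimated (sample) variance of the summary
    return -0.5 * math.log(2.0 * math.pi * var) - 0.5 * (s_obs - mu) ** 2 / var


rng = random.Random(1)
s_obs = 2.0  # stand-in for an observed summary statistic (assumed value)
ll_near = synthetic_loglik(2.0, s_obs, n_sims=200, n_obs=50, rng=rng)
ll_far = synthetic_loglik(3.0, s_obs, n_sims=200, n_obs=50, rng=rng)
print(ll_near, ll_far)
```

In a Bayesian treatment, an estimate of this kind could be evaluated at each proposed parameter value inside a sampler; with a vector of summaries the mean and variance would be replaced by a mean vector and covariance matrix.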

A common approach for Bayesian computation with big data is to partition the
data into smaller pieces, perform local inference for each piece separately,
and finally combine the results to obtain an approximation to the global
posterior. Looking at this from the bottom up, one can perform separate
analyses on individual sources of data and then combine these in a larger
Bayesian model. In either case, the idea of distributed modeling and inference
has both conceptual and computational appeal, but from the Bayesian perspective
there is no general way of handling the prior distribution: if the prior is
included in each separate inference, it will be multiply-counted when the
inferences are combined; but if the prior is itself divided into pieces, it may
not provide enough regularization for each separate computation, thus
eliminating one of the key advantages of Bayesian methods. To resolve this
dilemma, we propose expectation propagation (EP) as a general prototype for
distributed Bayesian inference. The central idea is to factor the likelihood
according to the data partitions, and to iteratively combine each factor with
an approximate model of the prior and all other parts of the data, thus
producing an overall approximation to the global posterior at convergence. In
this paper, we give an introduction to EP and an overview of some recent
developments of the method, with particular emphasis on its use in combining
inferences from partitioned data. In addition to distributed modeling of large
datasets, our unified treatment also includes hierarchical modeling of data
with a naturally partitioned structure. The paper describes a general
algorithmic framework, rather than a specific algorithm, and presents an
example implementation for it.
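
The prior-counting dilemma mentioned above can be made concrete in a conjugate Gaussian model with known variance, where multiplying densities amounts to adding natural parameters. The sketch below is not EP itself and is not the paper's implementation; the data, the prior, and the function names are hypothetical. It only shows that naively multiplying local posteriors counts the prior once per partition.

```python
SIGMA2 = 1.0              # known observation variance (assumed)
prior = (0.25, 0.25)      # N(1, 4) prior stored as (precision, precision * mean)
data = [2.1, 1.9, 2.4, 1.6, 2.0, 2.2]
pieces = [data[0:2], data[2:4], data[4:6]]


def likelihood_site(ys):
    """Natural parameters of the Gaussian likelihood of ys as a function of the mean."""
    return (len(ys) / SIGMA2, sum(ys) / SIGMA2)


def multiply(*factors):
    """Multiplying Gaussian densities adds their natural parameters."""
    return (sum(f[0] for f in factors), sum(f[1] for f in factors))


full = multiply(prior, likelihood_site(data))                   # exact posterior
local = [multiply(prior, likelihood_site(p)) for p in pieces]   # one posterior per piece
naive = multiply(*local)                                        # prior counted 3 times

k = len(pieces)
# Dividing out the k - 1 surplus copies of the prior restores the exact posterior.
corrected = (naive[0] - (k - 1) * prior[0], naive[1] - (k - 1) * prior[1])

print("full     ", full)
print("naive    ", naive)
print("corrected", corrected)
```

In this conjugate case the correction is exact. The difficulty raised in the abstract concerns non-conjugate models, where each local computation needs enough regularization to be stable.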

In this paper, we compare two numerical methods for approximating the
probability that the sum of dependent regularly varying random variables
exceeds a high threshold under Archimedean copula models. The first method is
based on conditional Monte Carlo. We present four estimators and show that most
of them have bounded relative errors. The second method is based on analytical
expressions of the multivariate survival or cumulative distribution functions
of the regularly varying random variables and provides sharp and deterministic
bounds of the probability of exceedance. We discuss implementation issues and
illustrate the accuracy of both procedures through numerical studies.

We consider the Voronoi tessellation based on a homogeneous Poisson point
process in $\mathbf{R}^{d}$. For a geometric characteristic of the cells (e.g.
the inradius, the circumradius, the volume), we investigate the point process
of the nuclei of the cells with large values. Conditions are obtained for the
convergence in distribution of this point process of exceedances to a
homogeneous compound Poisson point process. We provide a characterization of
the asymptotic cluster size distribution which is based on the Palm version of
the point process of exceedances. This characterization allows us to compute
efficiently the values of the extremal index and the cluster size probabilities
by simulation for various geometric characteristics. The extension to the
Poisson-Delaunay tessellation is also discussed.

While Jeffreys priors usually are well-defined for the parameters of mixtures
of distributions, they are not available in closed form. Furthermore, they
often are improper priors. Hence, they have never been used to draw inference
on the mixture parameters. We study in this paper the implementation and the
properties of Jeffreys priors in several mixture settings, show that the
associated posterior distributions most often are improper, and then propose a
noninformative alternative for the analysis of mixtures.

Natural disasters may have considerable impact on society as well as on
the (re)insurance industry. Max-stable processes are ideally suited for the
modeling of the spatial extent of such extreme events, but it is often assumed
that there is no temporal dependence. Only a few papers have introduced
spatio-temporal max-stable models, extending the Smith, Schlather and
Brown-Resnick spatial processes. These models suffer from two major drawbacks:
time plays a similar role as space and the temporal dynamics are not explicit.
In order to overcome these defects, we introduce spatio-temporal max-stable
models where we partly decouple the influence of time and space in their
spectral representations. We introduce both continuous- and discrete-time
versions. We then consider particular Markovian cases with a max-autoregressive
representation and discuss their properties. Finally, we briefly propose an
inference methodology which is tested through a simulation study.

Assume that claims in a portfolio of insurance contracts are described by
independent and identically distributed random variables with regularly varying
tails and occur according to a near mixed Poisson process. We provide a
collection of results pertaining to the joint asymptotic Laplace transforms of
the normalized sums of the smallest and largest claims, when the length of the
considered time interval tends to infinity. The results crucially depend on the
value of the tail index of the claim distribution, as well as on the number of
largest claims under consideration.

We attempt to trace the history and development of Markov chain Monte Carlo
(MCMC) from its early inception in the late 1940s through its use today. We see
how the earlier stages of Monte Carlo (MC, not MCMC) research have led to the
algorithms currently in use. More importantly, we see how the development of
this methodology has not only changed our solutions to problems, but has
changed the way we think about problems.

In the Bayesian community, an ongoing imperative is to develop efficient
algorithms. An appealing approach is to form a hybrid algorithm by combining
ideas from competing existing techniques. This paper addresses issues in
designing hybrid methods by considering selected case studies: the delayed
rejection algorithm, the pinball sampler, the Metropolis adjusted Langevin
algorithm, and the population Monte Carlo algorithm. We observe that even if
each component of a hybrid algorithm has individual strengths, they may not
contribute equally or even positively when they are combined. Moreover, even if
the statistical efficiency is improved, from a practical perspective there are
technical issues to be considered such as applicability and computational
workload. In order to optimize performance of the algorithm in real time, these
issues should be taken into account.

The development of statistical methods and numerical algorithms for model
choice is vital to many real-world applications. In practice, the ABC approach
can be instrumental for sequential model design; however, the theoretical basis
of its use has been questioned. We present a measure-theoretic framework for
using the ABC error towards model choice and describe how easily existing
rejection, Metropolis-Hastings and sequential importance sampling ABC
algorithms are extended for the purpose of model checking. Considering a panel
of applications from evolutionary biology to dynamic systems, we discuss the
choice of summaries, which differs from standard ABC approaches. The methods and
algorithms presented here may provide the workhorse machinery for an
exploratory approach to ABC model choice, particularly as the application of
standard Bayesian tools can prove impossible.

Approximate Bayesian computation (ABC), also known as likelihood-free
methods, has become a favourite tool for the analysis of complex stochastic
models, primarily in population genetics but also in financial analyses. We
advocated in Grelaud et al. (2009) the use of ABC for Bayesian model choice in
the specific case of Gibbs random fields (GRF), relying on a sufficiency
property mainly enjoyed by GRFs to show that the approach was legitimate.
Despite having previously suggested the use of ABC for model choice in a wider
range of models in the DIY ABC software (Cornuet et al., 2008), we present
theoretical evidence that the general use of ABC for model choice is fraught
with danger in the sense that no amount of computation, however large, can
guarantee a proper approximation of the posterior probabilities of the models
under comparison.

The Savage-Dickey ratio is known as a specialised representation of the Bayes
factor (O'Hagan and Forster, 2004) that allows for a functional plug-in
approximation of this quantity. We demonstrate here that the Savage-Dickey
representation is in fact a generic representation of the Bayes factor that
relies on specific measure-theoretic versions of the densities involved in the
ratio, instead of a special identity imposing constraints on the
prior distributions. We completely clarify the measure-theoretic foundations of
the representation as well as the generalisation of Verdinelli and Wasserman
(1995) and propose a comparison of this new approximation with their version,
as well as with bridge sampling and Chib's approaches.
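
For reference, the classical form of the Savage-Dickey representation can be written as follows. The notation is an assumption introduced for illustration: $\theta$ is the parameter of interest, $\psi$ a nuisance parameter, the null hypothesis $H_0: \theta = \theta_0$ is embedded in a model with prior $\pi_1(\theta, \psi)$, $\pi_0(\psi)$ is the prior under $H_0$, and $m_0$, $m_1$ are the corresponding marginal likelihoods.

```latex
B_{01}(x) \;=\; \frac{m_0(x)}{m_1(x)}
          \;=\; \frac{\pi_1(\theta_0 \mid x)}{\pi_1(\theta_0)},
\qquad \text{provided that } \pi_1(\psi \mid \theta_0) = \pi_0(\psi).
```

The proviso on the right is the usual condition linking the two priors; the numerator is the marginal posterior density of $\theta$ evaluated at $\theta_0$, which is what makes a plug-in approximation from posterior simulations possible.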

Gibbs random fields (GRF) are polymorphous statistical models that can be
used to analyse different types of dependence, in particular for spatially
correlated data. However, when those models are faced with the challenge of
selecting a dependence structure from many, the use of standard model choice
methods is hampered by the unavailability of the normalising constant in the
Gibbs likelihood. In particular, from a Bayesian perspective, the computation
of the posterior probabilities of the models under competition requires special
likelihood-free simulation techniques like the Approximate Bayesian Computation
(ABC) algorithm that is intensively used in population genetics. We show in
this paper how to implement an ABC algorithm geared towards model choice in the
general setting of Gibbs random fields, demonstrating in particular that there
exists a sufficient statistic across models. The accuracy of the approximation
to the posterior probabilities can be further improved by importance sampling
on the distribution of the models. The practical aspects of the method are
detailed through two applications, the test of an iid Bernoulli model versus a
first-order Markov chain, and the choice of a folding structure for two
proteins.
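
A rejection-ABC scheme for the first application (i.i.d. Bernoulli versus first-order Markov chain) might be sketched as below. This is an illustration under stated assumptions, not the authors' implementation: the symmetric Markov chain with stay probability `q`, the uniform priors, the observed sequence, the tolerance `eps`, and all function names are hypothetical. The summary used here, the number of ones together with the number of equal neighbouring pairs, is sufficient for both toy models as parameterised in the code.

```python
import math
import random


def summaries(x):
    """Return (number of ones, number of equal neighbouring pairs)."""
    ones = sum(x)
    agree = sum(1 for a, b in zip(x, x[1:]) if a == b)
    return ones, agree


def simulate(model, n, rng):
    """Draw a parameter from its prior, then a binary sequence of length n."""
    if model == 0:                        # i.i.d. Bernoulli(p), p ~ U(0, 1)
        p = rng.random()
        return [1 if rng.random() < p else 0 for _ in range(n)]
    q = rng.random()                      # Markov chain: keep the state with prob. q
    x = [rng.randint(0, 1)]
    for _ in range(n - 1):
        x.append(x[-1] if rng.random() < q else 1 - x[-1])
    return x


def abc_model_choice(x_obs, n_sims, eps, rng):
    """Estimate posterior model probabilities by acceptance frequencies."""
    s_obs = summaries(x_obs)
    accepted = [0, 0]
    for _ in range(n_sims):
        model = rng.randint(0, 1)         # uniform prior on the model index
        s = summaries(simulate(model, len(x_obs), rng))
        if math.dist(s, s_obs) <= eps:    # keep simulations close to the data
            accepted[model] += 1
    total = sum(accepted)
    return [a / total for a in accepted] if total else None


rng = random.Random(0)
x_obs = [0] * 25 + [1] * 25               # strongly persistent toy sequence
probs = abc_model_choice(x_obs, n_sims=20000, eps=5.0, rng=rng)
print(probs)
```

The returned frequencies approximate the posterior model probabilities only up to the tolerance and the Monte Carlo error; a smaller `eps` reduces the approximation error at the cost of fewer accepted simulations.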

In Scott (2002) and Congdon (2006), a new method is advanced to compute
posterior probabilities of models under consideration. It is based solely on
MCMC outputs restricted to single models, i.e., it bypasses reversible jump
and other model exploration techniques. While it is indeed possible to
approximate posterior probabilities based solely on MCMC outputs from single
models, as demonstrated by Gelfand and Dey (1994) and Bartolucci et al. (2006),
we show that the proposals of Scott (2002) and Congdon (2006) are biased and
advance several arguments towards this thesis, the primary one being the
confusion between model-based posteriors and joint pseudo-posteriors. From a
practical point of view, the bias in Scott's (2002) approximation appears to be
much more severe than the one in Congdon's (2006), the latter being often of the
same magnitude as the posterior probability it approximates, although we also
exhibit an example where the divergence from the true posterior probability is
extreme.

In Chib (1995), a method for approximating marginal densities in a Bayesian
setting is proposed, with one prominent application being the estimation of
the number of components in a normal mixture. As pointed out in Neal (1999) and
Frühwirth-Schnatter (2004), the approximation often falls short of providing a
proper approximation to the true marginal densities because of the well-known
label switching problem (Celeux et al., 2000). While there exist other
alternatives to the derivation of approximate marginal densities, we reconsider
the original proposal here and show as in Berkhof et al. (2003) and Lee et al.
(2008) that it truly approximates the marginal densities once the label
switching issue has been solved.
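
The identity underlying this approach, $m(x) = f(x \mid \theta^*)\,\pi(\theta^*) / \pi(\theta^* \mid x)$ for any point $\theta^*$, can be checked numerically in a conjugate model where every term is available in closed form. The sketch below uses a Beta-Binomial example chosen purely for illustration; the function names and numerical values are assumptions. In mixture models the posterior ordinate is not available exactly and has to be estimated from simulation output, which is where label switching matters.

```python
import math


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def log_binom_coef(n, x):
    """Logarithm of the binomial coefficient C(n, x)."""
    return math.lgamma(n + 1) - math.lgamma(x + 1) - math.lgamma(n - x + 1)


def log_beta_pdf(t, a, b):
    """Log-density of a Beta(a, b) distribution at t in (0, 1)."""
    return (a - 1) * math.log(t) + (b - 1) * math.log(1 - t) - log_beta(a, b)


def log_marginal_identity(theta_star, x, n, a, b):
    """log m(x) = log f(x | theta*) + log pi(theta*) - log pi(theta* | x)."""
    log_lik = (log_binom_coef(n, x) + x * math.log(theta_star)
               + (n - x) * math.log(1 - theta_star))
    log_prior = log_beta_pdf(theta_star, a, b)
    log_post = log_beta_pdf(theta_star, a + x, b + n - x)  # conjugate posterior
    return log_lik + log_prior - log_post


def log_marginal_closed_form(x, n, a, b):
    """Beta-Binomial marginal likelihood, used here as the reference value."""
    return log_binom_coef(n, x) + log_beta(a + x, b + n - x) - log_beta(a, b)


x, n, a, b = 7, 20, 2.0, 3.0   # hypothetical data and prior
exact = log_marginal_closed_form(x, n, a, b)
at_03 = log_marginal_identity(0.3, x, n, a, b)
at_07 = log_marginal_identity(0.7, x, n, a, b)
print(exact, at_03, at_07)
```

The value does not depend on the choice of $\theta^*$ when the posterior ordinate is exact; in practice a high-density point is usually chosen so that the estimated ordinate is stable.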

Population Monte Carlo has been introduced as a sequential importance
sampling technique to overcome poor fit of the importance function. In this
paper, we compare the performances of the original Population Monte Carlo
algorithm with a modified version that eliminates the influence of the
transition particle via a double Rao-Blackwellisation. This modification is
shown to improve the exploration of the modes through a large simulation
experiment on posterior distributions of mean mixtures of distributions.

The k-nearest-neighbour procedure is a well-known deterministic method used
in supervised classification. This paper proposes a reassessment of this
approach as a statistical technique derived from a proper probabilistic model;
in particular, we modify the assessment made in a previous analysis of this
method undertaken by Holmes and Adams (2002, 2003), and evaluated by Manocha and
Girolami (2007), where the underlying probabilistic model is not completely
well-defined. Once a clear probabilistic basis for the k-nearest-neighbour
procedure is established, we derive computational tools for conducting Bayesian
inference on the parameters of the corresponding model. In particular, we
assess the difficulties inherent to pseudo-likelihood and to path sampling
approximations of an intractable normalising constant, and propose a perfect
sampling strategy to implement a correct MCMC sampler associated with our
model. If perfect sampling is not available, we suggest using a Gibbs sampling
approximation. Illustrations of the performance of the corresponding Bayesian
classifier are provided for several benchmark datasets, demonstrating in
particular the limitations of the pseudo-likelihood approximation in this
setup.
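
For readers unfamiliar with the baseline, the plain deterministic k-nearest-neighbour rule that the paper takes as its starting point can be sketched in a few lines. This is the standard majority-vote procedure with Euclidean distance, not the probabilistic model or the samplers proposed in the paper; the training points, labels, and function name are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train, query, k):
    """Majority vote among the k training points closest to query."""
    # Sort the (point, label) pairs by Euclidean distance to the query point.
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


train = [((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((0.2, 0.1), "a"),
         ((1.0, 1.0), "b"), ((0.9, 1.1), "b"), ((1.1, 0.9), "b")]
label_low = knn_predict(train, (0.15, 0.15), k=3)
label_high = knn_predict(train, (0.95, 1.0), k=3)
print(label_low, label_high)
```

The rule returns a label but no measure of uncertainty and treats `k` as fixed, which is the gap a probabilistic formulation with inference on its parameters is meant to fill.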