
Stochastic compartmental models are important tools for understanding the
course of infectious diseases epidemics in populations and in prospective
evaluation of intervention policies. However, calculating the likelihood for
discretely observed data from even simple models  such as the ubiquitous
susceptibleinfectiousremoved (SIR) model  has been considered
computationally intractable, since its formulation almost a century ago.
Recently researchers have proposed methods to circumvent this limitation
through data augmentation or approximation, but these approaches often suffer
from high computational cost or loss of accuracy. We develop the mathematical
foundation and an efficient algorithm to compute the likelihood for discretely
observed data from a broad class of stochastic compartmental models. We also
give expressions for the derivatives of the transition probabilities using the
same technique, making possible inference via Hamiltonian Monte Carlo (HMC). We
use the 17th century plague in Eyam, a classic example of the SIR model, to
compare our recursion method to sequential Monte Carlo, analyze using HMC, and
assess the model assumptions. We also apply our direct likelihood evaluation to
perform Bayesian inference for the 20142015 Ebola outbreak in Guinea. The
results suggest that the epidemic infectious rates have decreased since October
2014 in the Southeast region of Guinea, while rates remain the same in other
regions, facilitating understanding of the outbreak and the effectiveness of
Ebola control interventions.

Concerns over reproducibility in science extend to research using existing
healthcare data; many observational studies investigating the same topic
produce conflicting results, even when using the same data. To address this
problem, we propose a paradigm shift. The current paradigm centers on
generating one estimate at a time using a unique study design with unknown
reliability and publishing (or not) one estimate at a time. The new paradigm
advocates for highthroughput observational studies using consistent and
standardized methods, allowing evaluation, calibration, and unbiased
dissemination to generate a more reliable and complete evidence base. We
demonstrate this new paradigm by comparing all depression treatments for a set
of outcomes, producing 17,718 hazard ratios, each using methodology on par with
stateoftheart studies. We furthermore include control hypotheses to evaluate
and calibrate our evidence generation process. Results show good transitivity
and consistency between databases, and agree with four out of the five findings
from clinical trials. The distribution of effect size estimates reported in
literature reveals an absence of small or null effects, with a sharp cutoff at
p = 0.05. No such phenomena were observed in our results, suggesting more
complete and more reliable evidence.

It is common in phylogenetics to have some, perhaps partial, information
about the overall evolutionary tree of a group of organisms and wish to find an
evolutionary tree of a specific gene for those organisms. There may not be
enough information in the gene sequences alone to accurately reconstruct the
correct "gene tree." Although the gene tree may deviate from the "species tree"
due to a variety of genetic processes, in the absence of evidence to the
contrary it is parsimonious to assume that they agree. A common statistical
approach in these situations is to develop a likelihood penalty to incorporate
such additional information. Recent studies using simulation and empirical data
suggest that a likelihood penalty quantifying concordance with a species tree
can significantly improve the accuracy of gene tree reconstruction compared to
using sequence data alone. However, the consistency of such an approach has not
yet been established, nor have convergence rates been bounded. Because
phylogenetics is a nonstandard inference problem, the standard theory does not
apply. In this paper, we propose a penalized maximum likelihood estimator for
gene tree reconstruction, where the penalty is the square of the
BilleraHolmesVogtmann geodesic distance from the gene tree to the species
tree. We prove that this method is consistent, and derive its convergence rate
for estimating the discrete gene tree structure and continuous edge lengths
(representing the amount of evolution that has occurred on that branch)
simultaneously. We find that the regularized estimator is "adaptive fast
converging," meaning that it can reconstruct all edges of length greater than
any given threshold from gene sequences of polynomial length. Our method does
not require the species tree to be known exactly; in fact, our asymptotic
theory holds for any such guide tree.

Birthdeath processes track the size of a univariate population, but many
biological systems involve interaction between populations, necessitating
models for two or more populations simultaneously. A lack of efficient methods
for evaluating finitetime transition probabilities of bivariate processes,
however, has restricted statistical inference in these models. Researchers rely
on computationally expensive methods such as matrix exponentiation or Monte
Carlo approximation, restricting likelihoodbased inference to small systems,
or indirect methods such as approximate Bayesian computation. In this paper, we
introduce the birth(death)/birthdeath process, a tractable bivariate extension
of the birthdeath process. We develop an efficient and robust algorithm to
calculate the transition probabilities of birth(death)/birthdeath processes
using a continued fraction representation of their Laplace transforms. Next, we
identify several exemplary models arising in molecular epidemiology,
macroparasite evolution, and infectious disease modeling that fall within this
class, and demonstrate advantages of our proposed method over existing
approaches to inference in these models. Notably, the ubiquitous stochastic
susceptibleinfectiousremoved (SIR) model falls within this class, and we
emphasize that computable transition probabilities newly enable direct
inference of parameters in the SIR model. We also propose a very fast method
for approximating the transition probabilities under the SIR model via a novel
branching process simplification, and compare it to the continued fraction
representation method with application to the 17th century plague in Eyam.
Although the two methods produce similar maximum a posteriori estimates, the
branching process approximation fails to capture the correlation structure in
the joint posterior distribution.

Phylogenetic comparative methods explore the relationships between
quantitative traits adjusting for shared evolutionary history. This adjustment
often occurs through a Brownian diffusion process along the branches of the
phylogeny that generates model residuals or the traits themselves. For
highdimensional traits, inferring all pairwise correlations within the
multivariate diffusion is limiting. To circumvent this problem, we propose
phylogenetic factor analysis (PFA) that assumes a small unknown number of
independent evolutionary factors arise along the phylogeny and these factors
generate clusters of dependent traits. Set in a Bayesian framework, PFA
provides measures of uncertainty on the factor number and groupings, combines
both continuous and discrete traits, integrates over missing measurements and
incorporates phylogenetic uncertainty with the help of molecular sequences. We
develop Gibbs samplers based on dynamic programming to estimate the PFA
posterior distribution, over threefold faster than for multivariate diffusion
and a further orderofmagnitude more efficiently in the presence of latent
traits. We further propose a novel marginal likelihood estimator for previously
impractical models with discrete data and find that PFA also provides a better
fit than multivariate diffusion in evolutionary questions in columbine flower
development, placental reproduction transitions and triggerfish fin
morphometry.

Effective population size characterizes the genetic variability in a
population and is a parameter of paramount importance in population genetics.
Kingman's coalescent process enables inference of past population dynamics
directly from molecular sequence data, and researchers have developed a number
of flexible coalescentbased models for Bayesian nonparametric estimation of
the effective population size as a function of time. A major goal of
demographic reconstruction is understanding the association between the
effective population size and potential explanatory factors. Building upon
Bayesian nonparametric coalescentbased approaches, we introduce a flexible
framework that incorporates timevarying covariates through Gaussian Markov
random fields. To approximate the posterior distribution, we adapt efficient
Markov chain Monte Carlo algorithms designed for highly structured Gaussian
models. Incorporating covariates into the demographic inference framework
enables the modeling of associations between the effective population size and
covariates while accounting for uncertainty in population histories.
Furthermore, it can lead to more precise estimates of population dynamics. We
apply our model to four examples. We reconstruct the demographic history of
raccoon rabies in North America and find a significant association with the
spatiotemporal spread of the outbreak. Next, we examine the effective
population size trajectory of the DENV4 virus in Puerto Rico along with viral
isolate count data and find similar cyclic patterns. We compare the population
history of the HIV1 CRF02_AG clade in Cameroon with HIV incidence and
prevalence data and find that the effective population size is more reflective
of incidence rate. Finally, we explore the hypothesis that the population
dynamics of musk ox during the Late Quaternary period were related to climate
change.

Understanding the processes that give rise to quantitative measurements
associated with molecular sequence data remains an important issue in
statistical phylogenetics. Examples of such measurements include geographic
coordinates in the context of phylogeography and phenotypic traits in the
context of comparative studies. A popular approach is to model the evolution of
continuously varying traits as a Brownian diffusion process. However, standard
Brownian diffusion is quite restrictive and may not accurately characterize
certain trait evolutionary processes. Here, we relax one of the major
restrictions of standard Brownian diffusion by incorporating a nontrivial
estimable drift into the process. We introduce a relaxed drift diffusion model
for the evolution of multivariate continuously varying traits along a
phylogenetic tree via Brownian diffusion with drift. Notably, the relaxed drift
model accommodates branchspecific variation of drift rates while preserving
model identifiability. We implement the relaxed drift model in a Bayesian
inference framework to simultaneously reconstruct the evolutionary histories of
molecular sequence data and associated multivariate continuous trait data, and
provide tools to visualize evolutionary reconstructions. We illustrate our
approach in three viral examples. In the first two, we examine the
spatiotemporal spread of HIV1 in central Africa and West Nile virus in North
America and show that a relaxed drift approach uncovers a clearer, more
detailed picture of the dynamics of viral dispersal than standard Brownian
diffusion. Finally, we study antigenic evolution in the context of HIV1
resistance to three broadly neutralizing antibodies. Our analysis reveals
evidence of a continuous drift at the HIV1 population level towards enhanced
resistance to neutralization by the VRC01 monoclonal antibody over the course
of the epidemic.

Phylodynamics seeks to estimate effective population size fluctuations from
molecular sequences of individuals sampled from a population of interest. One
way to accomplish this task formulates an observed sequence data likelihood
exploiting a coalescent model for the sampled individuals' genealogy and then
integrating over all possible genealogies via Monte Carlo or, less efficiently,
by conditioning on one genealogy estimated from the sequence data. However,
when analyzing sequences sampled serially through time, current methods
implicitly assume either that sampling times are fixed deterministically by the
data collection protocol or that their distribution does not depend on the size
of the population. Through simulation, we first show that, when sampling times
do probabilistically depend on effective population size, estimation methods
may be systematically biased. To correct for this deficiency, we propose a new
model that explicitly accounts for preferential sampling by modeling the
sampling times as an inhomogeneous Poisson process dependent on effective
population size. We demonstrate that in the presence of preferential sampling
our new model not only reduces bias, but also improves estimation precision.
Finally, we compare the performance of the currently used phylodynamic methods
with our proposed model through clinicallyrelevant, seasonal human influenza
examples.

Understanding which phenotypic traits are consistently correlated throughout
evolution is a highly pertinent problem in modern evolutionary biology. Here,
we propose a multivariate phylogenetic latent liability model for assessing the
correlation between multiple types of data, while simultaneously controlling
for their unknown shared evolutionary history informed through molecular
sequences. The latent formulation enables us to consider in a single model
combinations of continuous traits, discrete binary traits and discrete traits
with multiple ordered and unordered states. Previous approaches have
entertained a single data type generally along a fixed history, precluding
estimation of correlation between traits and ignoring uncertainty in the
history. We implement our model in a Bayesian phylogenetic framework, and
discuss inference techniques for hypothesis testing. Finally, we showcase the
method through applications to columbine flower morphology, antibiotic
resistance in Salmonella and epitope evolution in influenza.

Surveys often ask respondents to report nonnegative counts, but respondents
may misremember or round to a nearby multiple of 5 or 10. This phenomenon is
called heaping, and the error inherent in heaped selfreported numbers can bias
estimation. Heaped data may be collected crosssectionally or longitudinally
and there may be covariates that complicate the inferential task. Heaping is a
wellknown issue in many survey settings, and inference for heaped data is an
important statistical problem. We propose a novel reporting distribution whose
underlying parameters are readily interpretable as rates of misremembering and
rounding. The process accommodates a variety of heaping grids and allows for
quasiheaping to values nearly but not equal to heaping multiples. We present a
Bayesian hierarchical model for longitudinal samples with covariates to infer
both the unobserved true distribution of counts and the parameters that control
the heaping process. Finally, we apply our methods to longitudinal
selfreported counts of sex partners in a study of highrisk behavior in
HIVpositive youth.

Although footandmouth disease virus (FMDV) incidence has decreased in South
America over the last years, the pathogen still circulates in the region and
the risk of reemergence in previously FMDVfree areas is a veterinary public
health concern. In this paper we merge environmental, epidemiological and
genetic data to reconstruct spatiotemporal patterns and determinants of FMDV
serotypes A and O dispersal in South America. Our dating analysis suggests that
serotype A emerged in South America around 1930, while serotype O emerged
around 1990. The rate of evolution for serotype A was significantly higher
compared to serotype O. Phylogeographic inference identified two wellconnected
sub networks of viral flow, one including Venezuela, Colombia and Ecuador;
another including Brazil, Uruguay and Argentina. The spread of serotype A was
best described by geographic distances, while trade of live cattle was the
predictor that best explained serotype O spread. Our findings show that the two
serotypes have different underlying evolutionary and spatial dynamics and may
pose different threats to control programmes. Keywords: Phylogeography,
footandmouth disease virus, South America, animal trade.

We analyze longitudinal selfreported counts of sexual partners from youth
living with HIV. In selfreported survey data, subjects recall counts of events
or behaviors such as the number of sexual partners or the number of drug uses
in the past three months. Subjects with small counts may report the exact
number, whereas subjects with large counts may have difficulty recalling the
exact number. Thus, selfreported counts are noisy, and misreporting induces
errors in the count variable. As a naive method for analyzing selfreported
counts, the Poisson random effects model treats the observed counts as true
counts and reporting errors in the outcome variable are ignored. Inferences are
therefore based on incorrect information and may lead to conclusions
unsupported by the data. We describe a Bayesian model for analyzing
longitudinal selfreported count data that formally accounts for reporting
error. We model reported counts conditional on underlying true counts using a
linear birthdeath process and use a Poisson random effects model to model the
underlying true counts. A regression version of our model can identify
characteristics of subjects with greater or lesser reporting error. We
demonstrate several approaches to prior specification.

Many important stochastic counting models can be written as general
birthdeath processes (BDPs). BDPs are continuoustime Markov chains on the
nonnegative integers and can be used to easily parameterize a rich variety of
probability distributions. Although the theoretical properties of general BDPs
are well understood, traditionally statistical work on BDPs has been limited to
the simple linear (Kendall) process, which arises in ecology and evolutionary
applications. Aside from a few simple cases, it remains impossible to find
analytic expressions for the likelihood of a discretelyobserved BDP, and
computational difficulties have hindered development of tools for statistical
inference. But the gap between BDP theory and practical methods for estimation
has narrowed in recent years. There are now robust methods for evaluating
likelihoods for realizations of BDPs: finitetime transition, first passage,
equilibrium probabilities, and distributions of summary statistics that arise
commonly in applications. Recent work has also exploited the connection between
continuously and discretelyobserved BDPs to derive EM algorithms for maximum
likelihood estimation. Likelihoodbased inference for previously intractable
BDPs is much easier than previously thought and regression approaches analogous
to Poisson regression are straightforward to derive. In this review, we outline
the basic mathematical theory for BDPs and demonstrate new tools for
statistical inference using data from BDPs. We give six examples of BDPs and
derive EM algorithms to fit their parameters by maximum likelihood. We show how
to compute the distribution of integral summary statistics and give an example
application to the total cost of an epidemic. Finally, we suggest future
directions for innovation in this important class of stochastic processes.

Continuoustime linear birthdeathimmigration (BDI) processes are frequently
used in ecology and epidemiology to model stochastic dynamics of the population
of interest. In clinical settings, multiple birthdeath processes can describe
disease trajectories of individual patients, allowing for estimation of the
effects of individual covariates on the birth and death rates of the process.
Such estimation is usually accomplished by analyzing patient data collected at
unevenly spaced time points, referred to as panel data in the biostatistics
literature. Fitting linear BDI processes to panel data is a nontrivial
optimization problem because birth and death rates can be functions of many
parameters related to the covariates of interest. We propose a novel
expectationmaximization (EM) algorithm for fitting linear BDI models with
covariates to panel data. We derive a closedform expression for the joint
generating function of some of the BDI process statistics and use this
generating function to reduce the Estep of the EM algorithm, as well as
calculation of the Fisher information, to onedimensional integration. This
analytical technique yields a computationally efficient and robust optimization
algorithm that we implemented in an opensource R package. We apply our method
to DNA fingerprinting of Mycobacterium tuberculosis, the causative agent of
tuberculosis, to study intrapatient time evolution of IS6110 copy number, a
genetic marker frequently used during estimation of epidemiological clusters of
Mycobacterium tuberculosis infections. Our analysis reveals previously
undocumented differences in IS6110 birthdeath rates among three major lineages
of Mycobacterium tuberculosis, which has important implications for
epidemiologists that use IS6110 for DNA fingerprinting of Mycobacterium
tuberculosis.

Influenza viruses undergo continual antigenic evolution allowing mutant
viruses to evade host immunity acquired to previous virus strains. Antigenic
phenotype is often assessed through pairwise measurement of crossreactivity
between influenza strains using the hemagglutination inhibition (HI) assay.
Here, we extend previous approaches to antigenic cartography, and
simultaneously characterize antigenic and genetic evolution by modeling the
diffusion of antigenic phenotype over a shared virus phylogeny. Using HI data
from influenza lineages A/H3N2, A/H1N1, B/Victoria and B/Yamagata, we determine
patterns of antigenic drift across viral lineages, showing that A/H3N2 evolves
faster and in a more punctuated fashion than other influenza lineages. We also
show that yeartoyear antigenic drift appears to drive incidence patterns
within each influenza lineage. This work makes possible substantial future
advances in investigating the dynamics of influenza and other
antigenicallyvariable pathogens by providing a model that intimately combines
molecular and antigenic evolution.

Background: Simulated nucleotide or amino acid sequences are frequently used
to assess the performance of phylogenetic reconstruction methods. BEAST, a
Bayesian statistical framework that focuses on reconstructing timecalibrated
molecular evolutionary processes, supports a wide array of evolutionary models,
but lacked matching machinery for simulation of character evolution along
phylogenies.
Results: We present a flexible Monte Carlo simulation tool, called piBUSS,
that employs the BEAGLE high performance library for phylogenetic computations
within BEAST to rapidly generate large sequence alignments under complex
evolutionary models. piBUSS sports a userfriendly graphical user interface
(GUI) that allows combining a rich array of models across an arbitrary number
of partitions. A commandline interface mirrors the options available through
the GUI and facilitates scripting in largescale simulation studies. Analogous
to BEAST model and analysis setup, more advanced simulation options are
supported through an extensible markup language (XML) specification, which in
addition to generating sequence output, also allows users to combine simulation
and analysis in a single BEAST run.
Conclusions: piBUSS offers a unique combination of flexibility and
easeofuse for sequence simulation under realistic evolutionary scenarios.
Through different interfaces, piBUSS supports simulation studies ranging from
modest endeavors for illustrative purposes to complex and largescale
assessments of evolutionary inference procedures. The software aims at
implementing new models and data types that are continuously being developed as
part of BEAST/BEAGLE.

Global mobility flow data are at the heart of spatial epidemiological models
used to predict infectious disease behavior but this wealth of data on human
mobility has been largely neglected by reconstructions of pathogen evolutionary
dynamics using viral genetic data. Although stochastic models of viral
evolution may potentially be informed by such data, a major challenge lies in
deciding which mobility processes are critical and to what extent they
contribute to shaping contemporaneous distributions of pathogen diversity.
Here, we develop a framework to integrate predictors of viral diffusion with
phylogeographic inference and estimate human influenza H3N2 migration history
while simultaneously testing and quantifying the factors that underly it. We
provide evidence for air travel governing the global dynamics of human
influenza whereas other processes act at a more local scale.

Following a series of highprofile drug safety disasters in recent years,
many countries are redoubling their efforts to ensure the safety of licensed
medical products. Largescale observational databases such as claims databases
or electronic health record systems are attracting particular attention in this
regard, but present significant methodological and computational concerns. In
this paper we show how highperformance statistical computation, including
graphics processing units, relatively inexpensive highly parallel computing
devices, can enable complex methods in large databases. We focus on
optimization and massive parallelization of cyclic coordinate descent
approaches to fit a conditioned generalized linear model involving tens of
millions of observations and thousands of predictors in a Bayesian context. We
find ordersofmagnitude improvement in overall runtime. Coordinate descent
approaches are ubiquitous in highdimensional statistics and the algorithms we
propose open up exciting new methodological possibilities with the potential to
significantly improve drug safety.

The branching structure of biological evolution confers statistical
dependencies on phenotypic trait values in related organisms. For this reason,
comparative macroevolutionary studies usually begin with an inferred phylogeny
that describes the evolutionary relationships of the organisms of interest. The
probability of the observed trait data can be computed by assuming a model for
trait evolution, such as Brownian motion, over the branches of this fixed tree.
However, the phylogenetic tree itself contributes statistical uncertainty to
estimates of other evolutionary quantities, and many comparative evolutionary
biologists regard the tree as a nuisance parameter. In this paper, we present a
framework for analytically integrating over unknown phylogenetic trees in
comparative evolutionary studies by assuming that the tree arises from a
continuoustime Markov branching model called the Yule process. To do this, we
derive a closedform expression for the distribution of phylogenetic diversity,
which is the sum of branch lengths connecting a set of taxa. We then present a
generalization of phylogenetic diversity which is equivalent to the expected
trait disparity in a set of taxa whose evolutionary relationships are generated
by a Yule process and whose traits evolve by Brownian motion. We derive
expressions for the distribution of expected trait disparity under a Yule tree.
Given one or more observations of trait disparity in a clade, we perform fast
likelihoodbased estimation of the Brownian variance for unresolved clades. Our
method does not require simulation or a fixed phylogenetic tree. We conclude
with a brief example illustrating Brownian rate estimation for thirteen
families in the Mammalian order Carnivora, in which the phylogenetic tree for
each family is unresolved.

A birthdeath process is a continuoustime Markov chain that counts the
number of particles in a system over time. In the general process with $n$
current particles, a new particle is born with instantaneous rate $\lambda_n$
and a particle dies with instantaneous rate $\mu_n$. Currently no robust and
efficient method exists to evaluate the finitetime transition probabilities in
a general birthdeath process with arbitrary birth and death rates. In this
paper, we first revisit the theory of continued fractions to obtain expressions
for the Laplace transforms of these transition probabilities and make explicit
an important derivation connecting transition probabilities and continued
fractions. We then develop an efficient algorithm for computing these
probabilities that analyzes the error associated with approximations in the
method. We demonstrate that this errorcontrolled method agrees with known
solutions and outperforms previous approaches to computing these probabilities.
Finally, we apply our novel method to several important problems in ecology,
evolution, and genetics.

Birthdeath processes (BDPs) are continuoustime Markov chains that track the
number of "particles" in a system over time. While widely used in population
biology, genetics and ecology, statistical inference of the instantaneous
particle birth and death rates remains largely limited to restrictive linear
BDPs in which perparticle birth and death rates are constant. Researchers
often observe the number of particles at discrete times, necessitating data
augmentation procedures such as expectationmaximization (EM) to find maximum
likelihood estimates. The Estep in the EM algorithm is available in
closedform for some linear BDPs, but otherwise previous work has resorted to
approximation or simulation. Remarkably, the Estep conditional expectations
can also be expressed as convolutions of computable transition probabilities
for any general BDP with arbitrary rates. This important observation, along
with a convenient continued fraction representation of the Laplace transforms
of the transition probabilities, allows novel and efficient computation of the
conditional expectations for all BDPs, eliminating the need for approximation
or costly simulation. We use this insight to derive EM algorithms that yield
maximum likelihood estimation for general BDPs characterized by various rate
models, including generalized linear models. We show that our Laplace
convolution technique outperforms competing methods when available and
demonstrate a technique to accelerate EM algorithm convergence. Finally, we
validate our approach using synthetic data and then apply our methods to
estimation of mutation parameters in microsatellite evolution.

Massive datasets in the gigabyte and terabyte range combined with the
availability of increasingly sophisticated statistical tools yield analyses at
the boundary of what is computationally feasible. Compromising in the face of
this computational burden by partitioning the dataset into more tractable sizes
results in stratified analyses, removed from the context that justified the
initial data collection. In a Bayesian framework, these stratified analyses
generate intermediate realizations, often compared using point estimates that
fail to account for the variability within and correlation between the
distributions these realizations approximate. However, although the initial
concession to stratify generally precludes the more sensible analysis using a
single joint hierarchical model, we can circumvent this outcome and capitalize
on the intermediate realizations by extending the dynamic iterative reweighting
MCMC algorithm. In doing so, we reuse the available realizations by reweighting
them with importance weights, recycling them into a now tractable joint
hierarchical model. We apply this technique to intermediate realizations
generated from stratified analyses of 687 influenza A genomes spanning 13 years
allowing us to revisit hypotheses regarding the evolutionary history of
influenza within a hierarchical statistical framework.

This paper discusses the potential of graphics processing units (GPUs) in
highdimensional optimization problems. A single GPU card with hundreds of
arithmetic cores can be inserted in a personal computer and dramatically
accelerates many statistical algorithms. To exploit these devices fully,
optimization algorithms should reduce to multiple parallel tasks, each
accessing a limited amount of data. These criteria favor EM and MM algorithms
that separate parameters and data. To a lesser extent block relaxation and
coordinate descent and ascent also qualify. We demonstrate the utility of GPUs
in nonnegative matrix factorization, PET image reconstruction, and
multidimensional scaling. Speedups of 100 fold can easily be attained. Over the
next decade, GPUs will fundamentally alter the landscape of computational
statistics. It is time for more statisticians to get onboard.