• It is well recognised that animal and plant pathogens form complex ecological communities of interacting organisms within their hosts. Although community ecology approaches have been applied to determine pathogen interactions at the within-host scale, methodologies enabling robust inference of the epidemiological impact of pathogen interactions are lacking. Here we developed a novel statistical framework to identify statistical covariances from the infection time-series of multiple pathogens simultaneously. Our framework extends Bayesian multivariate disease mapping models to analyse multivariate time series data by accounting for within- and between-year dependencies in infection risk and incorporating a between-pathogen covariance matrix which we estimate. Importantly, our approach accounts for possible confounding drivers of temporal patterns in pathogen infection frequencies, enabling robust inference of pathogen-pathogen interactions. We illustrate the validity of our statistical framework using simulated data and applied it to diagnostic data available for five respiratory viruses co-circulating in a major urban population between 2005 and 2013: adenovirus, human coronavirus, human metapneumovirus, influenza B virus and respiratory syncytial virus. We found positive and negative covariances indicative of epidemiological interactions among specific virus pairs. This statistical framework enables a community ecology perspective to be applied to infectious disease epidemiology with important utility for public health planning and preparedness.
  • Understanding how genetic changes allow emerging virus strains to escape the protection afforded by vaccination is vital for the maintenance of effective vaccines. In the current work, we use structural and phylogenetic differences between pairs of virus strains to identify important antigenic sites on the surface of the influenza A(H1N1) virus through the prediction of haemagglutination inhibition (HI) assay, pairwise measures of the antigenic similarity of virus strains. We propose a sparse hierarchical Bayesian model that can deal with the pairwise structure and inherent experimental variability in the H1N1 data through the introduction of latent variables. The latent variables represent the underlying HI assay measurement of any given pair of virus strains and help account for the fact that for any HI assay measurement between the same pair of virus strains, the difference in the viral sequence remains the same. Through accurately representing the structure of the H1N1 data, the model is able to select virus sites which are antigenic, while its latent structure achieves the computational efficiency required to deal with large virus sequence data, as typically available for the influenza virus. In addition to the latent variable model, we also propose a new method, block integrated Widely Applicable Information Criterion (biWAIC), for selecting between competing models. We show how this allows us to effectively select the random effects when used with the proposed model and apply both methods to an A(H1N1) dataset.
  • Diversity measurement underpins the study of biological systems, but measures used vary across disciplines. Despite their common use and broad utility, no unified framework has emerged for measuring, comparing and partitioning diversity. The introduction of information theory into diversity measurement has laid the foundations, but the framework is incomplete without the ability to partition diversity, which is central to fundamental questions across the life sciences: How do we prioritise communities for conservation? How do we identify reservoirs and sources of pathogenic organisms? How do we measure ecological disturbance arising from climate change? The lack of a common framework means that diversity measures from different fields have conflicting fundamental properties, allowing conclusions reached to depend on the measure chosen. This conflict is unnecessary and unhelpful. A mathematically consistent framework would transform disparate fields by delivering scientific insights in a common language. It would also allow the transfer of theoretical and practical developments between fields. We meet this need, providing a versatile unified framework for partitioning biological diversity. It encompasses any kind of similarity between individuals, from functional to genetic, allowing comparisons between qualitatively different kinds of diversity. Where existing partitioning measures aggregate information across the whole population, our approach permits the direct comparison of subcommunities, allowing us to pinpoint distinct, diverse or representative subcommunities and investigate population substructure. The framework is provided as a ready-to-use R package to easily test our approach.
  • Determining phenotype from genetic data is a fundamental challenge. Influenza A viruses undergo rapid antigenic drift and identification of emerging antigenic variants is critical to the vaccine selection process. Using former seasonal influenza A(H1N1) viruses, hemagglutinin sequence and corresponding antigenic data were analyzed in combination with 3-D structural information. We attributed variation in hemagglutination inhibition to individual amino acid substitutions and quantified their antigenic impact, validating a subset experimentally using reverse genetics. Substitutions identified as low-impact were shown to be a critical component of influenza antigenic evolution and by including these, as well as the high-impact substitutions often focused on, the accuracy of predicting antigenic phenotypes of emerging viruses from genotype was doubled. The ability to quantify the phenotypic impact of specific amino acid substitutions should help refine techniques that predict the fitness and evolutionary success of variant viruses, leading to stronger theoretical foundations for selection of candidate vaccine viruses.