
We introduce a tensor-based clustering method to extract sparse,
low-dimensional structure from high-dimensional, multi-indexed datasets. This
framework is designed to enable detection of clusters of data in the presence
of structural requirements which we encode as algebraic constraints in a linear
program. Our clustering method is general and can be tailored to a variety of
applications in science and industry. We illustrate our method on a collection
of experiments measuring the response of genetically diverse breast cancer cell
lines to an array of ligands. Each experiment consists of a cell line-ligand
combination, and contains time-course measurements of the early-signalling
kinases MAPK and AKT at two different ligand dose levels. By imposing
appropriate structural constraints and respecting the multi-indexed structure
of the data, the analysis of clusters can be optimized for biological
interpretation and therapeutic understanding. We then perform a systematic,
large-scale exploration of mechanistic models of MAPK-AKT crosstalk for each
cluster. This analysis allows us to quantify the heterogeneity of breast cancer
cell subtypes, and leads to hypotheses about the signalling mechanisms that
mediate the response of the cell lines to ligands.

Network theory provides a useful framework for studying interconnected
systems of interacting agents. Many networked systems evolve continuously in
time, but most existing methods for analyzing time-dependent networks rely on
discrete or discretized time. In this paper, we propose a novel approach for
studying networks that evolve in continuous time by distinguishing between
interactions, which we model as discrete contacts, and \emph{ties}, which
represent strengths of relationships as functions of time. To illustrate our
framework of tie-decay networks, we show how to examine, in a mathematically
tractable and computationally efficient way, important (i.e., 'central')
nodes in networks in which tie strengths decay in time after individuals
interact. As a concrete illustration, we introduce a continuous-time
generalization of PageRank centrality and apply it to a network of retweets
during the 2012 National Health Service controversy in the United Kingdom. Our
work also provides guidance for similar generalizations of other tools from
network theory to continuous-time networks with tie decay, including for
applications to streaming data.
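
The distinction between discrete interactions and continuously decaying ties can be sketched in a few lines of Python. This is a minimal illustration, not the method of the paper: it assumes that each interaction adds one unit to a tie, which then decays exponentially at a rate `alpha` (the decay law and its rate are not specified above), and it ranks nodes with a standard damped PageRank power iteration on the resulting weighted graph. The names `tie_strengths`, `pagerank`, `alpha` and `damping` are hypothetical.

```python
import math

def tie_strengths(events, alpha, t_now):
    """Return {(i, j): strength} at time t_now.

    Each event (t, i, j) is an instantaneous interaction from i to j.
    Assumption: it adds 1 to the tie i -> j, which then decays as
    exp(-alpha * elapsed_time).
    """
    strengths = {}
    for t, i, j in events:
        if t <= t_now:
            decayed = math.exp(-alpha * (t_now - t))
            strengths[(i, j)] = strengths.get((i, j), 0.0) + decayed
    return strengths

def pagerank(strengths, nodes, damping=0.85, iters=200):
    """Damped PageRank by power iteration on the weighted directed graph."""
    n = len(nodes)
    out_weight = {u: 0.0 for u in nodes}
    for (i, _), w in strengths.items():
        out_weight[i] += w
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iters):
        # Nodes without outgoing ties spread their rank uniformly.
        dangling = sum(rank[u] for u in nodes if out_weight[u] == 0.0)
        new = {u: (1.0 - damping) / n + damping * dangling / n for u in nodes}
        for (i, j), w in strengths.items():
            new[j] += damping * rank[i] * w / out_weight[i]
        rank = new
    return rank

# Three interactions observed at times 0, 1 and 2.
events = [(0.0, "a", "b"), (1.0, "c", "b"), (2.0, "b", "a")]
alpha = math.log(2.0)  # half-life of one time unit (illustrative choice)
B = tie_strengths(events, alpha, t_now=2.0)
ranks = pagerank(B, ["a", "b", "c"])
```

Because the tie strengths depend on `t_now`, re-evaluating `tie_strengths` at a later time changes the weights, and hence the ranking, even when no new interaction has occurred.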

Cells adapt their metabolic fluxes in response to changes in the environment.
We present a framework for the systematic construction of flux-based graphs
derived from organism-wide metabolic networks. Our graphs encode the
directionality of metabolic fluxes via edges that represent the flow of
metabolites from source to target reactions. The methodology can be applied in
the absence of a specific biological context by modelling fluxes
probabilistically, or can be tailored to different environmental conditions by
incorporating flux distributions computed through constraint-based approaches
such as Flux Balance Analysis. We illustrate our approach on the central carbon
metabolism of Escherichia coli and on a metabolic model of human hepatocytes.
The flux-dependent graphs under various environmental conditions and genetic
perturbations exhibit systemic changes in their topological and community
structure, which capture the rerouting of metabolic fluxes and the varying
importance of specific reactions and pathways. By integrating constraint-based
models and tools from network science, our framework allows the study of
context-specific metabolic responses at a system level beyond standard pathway
descriptions.

We examine the relationship between social structure and sentiment through
the analysis of a large collection of tweets about the Irish Marriage
Referendum of 2015. We obtain the sentiment of every tweet with the hashtags
#marref and #marriageref that was posted in the days leading to the referendum,
and construct networks to aggregate sentiment and use it to study the
interactions among users. Our results show that the sentiment of mention tweets
posted by users is correlated with the sentiment of received mentions, and
there are significantly more connections between users with similar sentiment
scores than among users with opposite scores in the mention and follower
networks. We combine the community structure of the two networks with the
activity level of the users and sentiment scores to find groups of users who
support voting `yes' or `no' in the referendum. There were numerous
conversations between users on opposing sides of the debate in the absence of
follower connections, which suggests that there were efforts by some users to
establish dialogue and debate across ideological divisions. Our analysis shows
that social structure can be integrated successfully with sentiment to analyse
and understand the disposition of social media users. These results have
potential applications in the integration of data and metadata to study
opinion dynamics, public opinion modelling, and polling.

Social media are being increasingly used for health promotion, yet the
landscape of users, messages and interactions in such fora is poorly
understood. Studies of social media and diabetes have focused mostly on
patients, or public agencies addressing it, but have not looked broadly at all
the participants or the diversity of content they contribute. We study Twitter
conversations about diabetes through the systematic analysis of 2.5 million
tweets collected over 8 months and the interactions between their authors. We
address three questions: (1) what themes arise in these tweets?, (2) who are
the most influential users?, (3) which types of users contribute to which
themes? We answer these questions using a mixed-methods approach, integrating
techniques from anthropology, network science and information retrieval such as
thematic coding, temporal network analysis, and community and topic detection.
Diabetes-related tweets fall within broad thematic groups: health information,
news, social interaction, and commercial. At the same time, humorous messages
and references to popular culture appear consistently, more than any other type
of tweet. We classify authors according to their temporal 'hub' and 'authority'
scores. Whereas the hub landscape is diffuse and fluid over time, top
authorities are highly persistent across time and comprise bloggers, advocacy
groups and NGOs related to diabetes, as well as for-profit entities without
specific diabetes expertise. Top authorities fall into seven interest
communities as derived from their Twitter follower network. Our findings have
implications for public health professionals and policy makers who seek to use
social media as an engagement tool and to inform policy design.
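
Hub and authority scores can be illustrated with the classic HITS iteration on a single static snapshot of a directed interaction graph. The sketch below is an assumption-laden simplification: the temporal scores mentioned above are computed over time, and how that is done is not described here. The function name `hits` and the toy node labels are hypothetical.

```python
import math

def hits(edges, nodes, iters=100):
    """Classic HITS on a static directed graph given as (source, target) pairs.

    A good hub points to good authorities; a good authority is pointed to
    by good hubs. Scores are L2-normalised at every step.
    """
    hub = {u: 1.0 for u in nodes}
    auth = {u: 1.0 for u in nodes}
    for _ in range(iters):
        # Authority update: sum the hub scores of nodes pointing at each node.
        auth = {u: 0.0 for u in nodes}
        for i, j in edges:
            auth[j] += hub[i]
        norm = math.sqrt(sum(v * v for v in auth.values())) or 1.0
        auth = {u: v / norm for u, v in auth.items()}
        # Hub update: sum the authority scores of the nodes each node points at.
        hub = {u: 0.0 for u in nodes}
        for i, j in edges:
            hub[i] += auth[j]
        norm = math.sqrt(sum(v * v for v in hub.values())) or 1.0
        hub = {u: v / norm for u, v in hub.items()}
    return hub, auth

# Toy snapshot: three users mention "org"; one of them also mentions "blog".
nodes = ["u1", "u2", "u3", "org", "blog"]
edges = [("u1", "org"), ("u2", "org"), ("u3", "org"), ("u3", "blog")]
hub, auth = hits(edges, nodes)
```

In this toy graph `org` receives the highest authority score and `u3` the highest hub score, since it points at both authorities.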

Cellular signal transduction usually involves activation cascades, the
sequential activation of a series of proteins following the reception of an
input signal. Here we study the classic model of weakly activated cascades and
obtain analytical solutions for a variety of inputs. We show that in the
special but important case of optimal-gain cascades (i.e., when the
deactivation rates are identical) the downstream output of the cascade can be
represented exactly as a lumped nonlinear module containing an incomplete gamma
function with real parameters that depend on the rates and length of the
cascade, as well as parameters of the input signal. The expressions obtained
can be applied to the non-identical case when the deactivation rates are random
to capture the variability in the cascade outputs. We also show that cascades
can be rearranged so that blocks with similar rates can be lumped and
represented through our nonlinear modules. Our results can be used both to
represent cascades in computational models of differential equations and to fit
data efficiently, by reducing the number of equations and parameters involved.
In particular, the length of the cascade appears as a real-valued parameter and
can thus be fitted in the same manner as Hill coefficients. Finally, we show
how the obtained nonlinear modules can be used instead of delay differential
equations to model delays in signal transduction.
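
The lumped representation can be checked numerically in a simple case. The sketch below assumes the standard linear form of a weakly activated cascade, dx_i/dt = alpha_i x_{i-1} - beta x_i, with identical deactivation rates `beta` and a constant step input `u0` switched on at t = 0 (only one of the inputs the text refers to). Under those assumptions the output of an n-step cascade is u0 (prod alpha_i / beta^n) P(n, beta t), where P is the regularized lower incomplete gamma function. All function and parameter names are hypothetical, and the closed form is coded only for integer n.

```python
import math

def gamma_p_int(n, x):
    """Regularized lower incomplete gamma P(n, x) for a positive integer n."""
    term, total = 1.0, 1.0  # k = 0 term of sum_{k < n} x^k / k!
    for k in range(1, n):
        term *= x / k
        total += term
    return 1.0 - math.exp(-x) * total

def cascade_closed_form(t, alphas, beta, u0=1.0):
    """Output of the last cascade step for a step input of height u0."""
    n = len(alphas)
    return u0 * math.prod(alphas) / beta ** n * gamma_p_int(n, beta * t)

def cascade_ode(t_end, alphas, beta, u0=1.0, dt=1e-3):
    """Integrate dx_i/dt = alpha_i * x_{i-1} - beta * x_i with classical RK4."""
    def rhs(x):
        upstream = [u0] + x[:-1]  # x_0 is the constant input
        return [a * p - beta * xi for a, p, xi in zip(alphas, upstream, x)]

    x = [0.0] * len(alphas)
    for _ in range(int(round(t_end / dt))):
        k1 = rhs(x)
        k2 = rhs([xi + 0.5 * dt * k for xi, k in zip(x, k1)])
        k3 = rhs([xi + 0.5 * dt * k for xi, k in zip(x, k2)])
        k4 = rhs([xi + dt * k for xi, k in zip(x, k3)])
        x = [xi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
             for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]
    return x[-1]

alphas = [1.5, 0.8, 1.2, 2.0]  # illustrative activation rates
beta = 1.0                     # shared deactivation rate
lumped = cascade_closed_form(3.0, alphas, beta)
simulated = cascade_ode(3.0, alphas, beta)
```

Here four differential equations collapse into one expression with the cascade length as a parameter. To treat the length as real-valued, `gamma_p_int` could be swapped for a general regularized incomplete gamma function such as `scipy.special.gammainc`.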

We exploit flow propagation on the directed neuronal network of the nematode
Caenorhabditis elegans to reveal dynamically relevant features of its
connectome. We find flow-based groupings of neurons at different levels of
granularity, which we relate to functional and anatomical constituents of its
nervous system. A systematic in silico evaluation of the full set of single and
double neuron ablations is used to identify deletions that induce the most
severe disruptions of the multi-resolution flow structure. Such ablations are
linked to functionally relevant neurons, and suggest potential candidates for
further in vivo investigation. In addition, we use the directional patterns of
incoming and outgoing network flows at all scales to identify flow profiles for
the neurons in the connectome, without pre-imposing a priori categories. The
four flow roles identified are linked to signal propagation motivated by
biological input-response scenarios.

Directionality is a crucial ingredient in many complex networks in which
information, energy or influence is transmitted. In such directed networks,
analysing flows (and not only the strength of connections) is crucial to reveal
important features of the network that might go undetected if the orientation
of connections is ignored. We showcase here a flow-based approach for community
detection in networks through the study of the network of the most influential
Twitter users during the 2011 riots in England. Firstly, we use directed Markov
Stability to extract descriptions of the network at different levels of
coarseness in terms of interest communities, i.e., groups of nodes within which
flows of information are contained and reinforced. Such interest communities
reveal user groupings according to location, profession, employer, and topic.
The study of flows also allows us to generate an interest distance, which
affords a personalised view of the attention in the network as viewed from the
vantage point of any given user. Secondly, we analyse the profiles of incoming
and outgoing long-range flows with a combined approach of role-based similarity
and the novel relaxed minimum spanning tree algorithm to reveal that the users
in the network can be classified into five roles. These flow roles go beyond
the standard leader/follower dichotomy and differ from classifications based on
regular/structural equivalence. We then show that the interest communities fall
into distinct informational organigrams characterised by a different mix of
user roles reflecting the quality of dialogue within them. Our generic
framework can be used to provide insight into how flows are generated,
distributed, preserved and consumed in directed networks.

We present a framework to cluster nodes in directed networks according to
their roles by combining Role-Based Similarity (RBS) and Markov Stability, two
techniques based on flows. First we compute the RBS matrix, which contains the
pairwise similarities between nodes according to the scaled number of in- and
out-directed paths of different lengths. The weighted RBS similarity matrix is
then transformed into an undirected similarity network using the Relaxed
Minimum-Spanning Tree (RMST) algorithm, which uses the geometric structure of
the RBS matrix to unblur the network, such that edges between nodes with high,
direct RBS are preserved. Finally, we partition the RMST similarity network
into role-communities of nodes at all scales using Markov Stability to find a
robust set of roles in the network. We showcase our framework through a
biological and a man-made network.
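
The first step of this pipeline, the path-based similarity, can be sketched as follows. Each node is described by the numbers of incoming and outgoing paths of lengths 1 to `k_max`, and two nodes are compared through the cosine of the angle between their path-count vectors. The scaling of paths of length k by `beta**k`, the cut-off `k_max`, and the convention that a node without any paths has similarity 0 to every node are assumptions made for this illustration; the RMST and Markov Stability steps are not covered. The function name `rbs_similarity` is hypothetical.

```python
import math

def rbs_similarity(adj, k_max=3, beta=0.9):
    """Pairwise cosine similarity of scaled in- and out-path counts.

    adj[i][j] = 1 if there is a directed edge i -> j, else 0.
    """
    n = len(adj)
    adj_t = [[adj[j][i] for j in range(n)] for i in range(n)]

    def matvec(m, v):
        return [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]

    features = [[] for _ in range(n)]
    # (A^T)^k 1 counts in-paths of length k; A^k 1 counts out-paths.
    for matrix in (adj_t, adj):
        v = [1.0] * n
        for _ in range(k_max):
            v = [beta * x for x in matvec(matrix, v)]  # one more step, scaled
            for i in range(n):
                features[i].append(v[i])

    norms = [math.sqrt(sum(x * x for x in f)) for f in features]
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if norms[i] > 0.0 and norms[j] > 0.0:
                dot = sum(a * b for a, b in zip(features[i], features[j]))
                sim[i][j] = dot / (norms[i] * norms[j])
    return sim

# Toy directed graph: 0 -> 2, 1 -> 2, 2 -> 3.
adj = [
    [0, 0, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
S = rbs_similarity(adj)
```

Nodes 0 and 1 are pure sources with identical path profiles and so receive the maximal similarity, whereas node 3, a pure sink, shares no path type with them.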

Motivation: Estimating parameters from data is a key stage of the modelling
process, particularly in biological systems where many parameters need to be
estimated from sparse and noisy data sets. Over the years, a variety of
heuristics have been proposed to solve this complex optimisation problem, with
good results in some cases yet with limitations in the biological setting.
Results: In this work, we develop an algorithm for model parameter fitting
that combines ideas from evolutionary algorithms, sequential Monte Carlo and
direct search optimisation. Our method performs well even when the order of
magnitude and/or the range of the parameters is unknown. The method refines
iteratively a sequence of parameter distributions through local optimisation
combined with partial resampling from a historical prior defined over the
support of all previous iterations. We exemplify our method with biological
models using both simulated and real experimental data and estimate the
parameters efficiently even in the absence of a priori knowledge about the
parameters.

We present a dynamical model for rewiring and attachment in bipartite
networks in which edges are added between nodes that belong to catalogs that
can either be fixed in size or growing in size. The model is motivated by an
empirical study of data from the video rental service Netflix, which invites
its users to give ratings to the videos available in its catalog. We find that
the distribution of the number of ratings given by users and that of the number
of ratings received by videos both follow a power law with an exponential
cutoff. We also examine the activity patterns of Netflix users and find bursts
of intense video-rating activity followed by long periods of inactivity. We
derive ordinary differential equations to model the acquisition of edges by the
nodes over time and obtain the corresponding time-dependent degree
distributions. We then compare our results with the Netflix data and find good
agreement. We conclude with a discussion of how catalog models can be used to
study systems in which agents are forced to choose, rate, or prioritize their
interactions from a very large set of options.
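
For concreteness, a power law with an exponential cutoff for a degree k can be written as p(k) proportional to k^(-gamma) exp(-k / kappa). The sketch below evaluates this form on a finite range of degrees and normalises it; the parameter names `gamma` and `kappa`, the truncation at `k_max`, and the numerical values are illustrative assumptions, not fitted quantities from the Netflix data.

```python
import math

def power_law_cutoff_pmf(k_max, gamma, kappa):
    """Normalised p(k) ~ k**(-gamma) * exp(-k / kappa) for k = 1..k_max."""
    weights = [k ** (-gamma) * math.exp(-k / kappa) for k in range(1, k_max + 1)]
    total = sum(weights)
    return [w / total for w in weights]

# Illustrative parameters only.
pmf = power_law_cutoff_pmf(k_max=1000, gamma=1.5, kappa=200.0)
```

For k much smaller than `kappa` the distribution behaves like a pure power law, while for k much larger than `kappa` the exponential factor dominates and suppresses the tail.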