• From 1837, when he returned to England aboard the \textit{HMS Beagle}, to 1860, just after publication of \textit{The Origin of Species}, Charles Darwin kept detailed notes of each book he read or wanted to read. His notes and manuscripts provide information about decades of individual scientific practice. Previously, we trained topic models on the full texts of each reading, and applied information-theoretic measures to detect that changes in his reading patterns coincided with the boundaries of his three major intellectual projects in the period 1837-1860. In this new work we apply the reading model to five additional documents, four of them by Darwin: the first edition of \textit{The Origin of Species}, two private essays stating intermediate forms of his theory in 1842 and 1844, a third essay of disputed dating, and Alfred Russel Wallace's essay, which Darwin received in 1858. We address three historical inquiries, previously treated qualitatively: 1) the mythology of "Darwin's Delay," that despite completing an extensive draft in 1844, Darwin waited until 1859 to publish \textit{The Origin of Species} due to external pressures; 2) the relationship between Darwin's and Wallace's contemporaneous theories, especially in light of their joint presentation; and 3) dating of the "Outline and Draft," which was rediscovered in 1975 and postulated first as an 1839 draft preceding the Sketch of 1842, then as an interstitial draft between the 1842 and 1844 essays.
  • Due to the interdisciplinary nature of complex systems as a field, students studying complex systems at the university level have diverse disciplinary backgrounds. This brings challenges (e.g., a wide range of computer programming skills) but also opportunities (e.g., facilitating interdisciplinary interactions and projects) for the classroom. However, little has been published on how these challenges and opportunities are handled in teaching and learning complex systems as an explicit subject in higher education, or on how this differs from other subject areas. We explore these particular challenges and opportunities via an interview-based study of pioneering teachers and learners (conducted amongst the authors) regarding their experiences. We compare and contrast those experiences, and analyse them with respect to the educational literature. Our discussions explored: approaches to curriculum design, how theories/models/frameworks of teaching and learning informed decisions and experience, how diversity in student backgrounds was addressed, and assessment task design. We found a striking level of commonality in the issues expressed as well as in the strategies to handle them, for example a significant focus on problem-based learning, and the use of major student-led creative projects for both achieving and assessing learning outcomes.
  • The French Revolution brought principles of "liberty, equality, and brotherhood" to bear on the day-to-day challenges of governing what was then the largest country in Europe. Its experiments provided a model for future revolutions and democracies across the globe, but this first modern revolution had no model to follow. Using reconstructed transcripts of debates held in the Revolution's first parliament, we present a quantitative analysis of how this system managed innovation. We use information theory to track the creation, transmission, and destruction of patterns of word-use across over 40,000 speeches and more than one thousand speakers. The parliament as a whole was biased toward the adoption of new patterns, but speakers' individual qualities could break these overall trends. Speakers on the left innovated at higher rates while speakers on the right acted, often successfully, to preserve prior patterns. Key players such as Robespierre (on the left) and Abb\'e Maury (on the right) played information-processing roles emblematic of their politics. Newly-created organizational functions---such as the Assembly's President and committee chairs---had significant effects on debate outcomes, and a distinct transition appears mid-way through the parliament when committees, external to the debate process, gain new powers to "propose and dispose" to the body as a whole. Taken together, these quantitative results align with existing qualitative interpretations but also reveal crucial information-processing dynamics that have hitherto been overlooked. Great orators had the public's attention, but deputies (mostly on the political left) who mastered the committee system gained new powers to shape revolutionary legislation.
  • In response to failures of central planning, the Chinese government has experimented not only with free-market trade zones, but with allowing non-profit foundations to operate in a decentralized fashion. A network study shows how these foundations have connected together by sharing board members, in a structural parallel to what is seen in corporations in the United States and Europe. This board interlocking leads to the emergence of an elite group with privileged network positions. While the presence of government officials on non-profit boards is widespread, government officials are much less common in a subgroup of foundations that control just over half of all revenue in the network. This subgroup, associated with business elites, not only enjoys higher levels of within-elite links, but even preferentially excludes government officials from the NGOs with higher degree. The emergence of this structurally autonomous sphere is associated with major political and social events in the state-society relationship. Cluster analysis reveals multiple internal components within this sphere that share similar levels of network influence. Rather than a core-periphery structure centered around government officials, the Chinese non-profit world appears to be a multipolar one of distinct elite groups, many of which achieve high levels of independence from direct government control.
  • Search in an environment with an uncertain distribution of resources involves a trade-off between exploitation of past discoveries and further exploration. This extends to information foraging, where a knowledge-seeker shifts between reading in depth and studying new domains. To study this decision-making process, we examine the reading choices made by one of the most celebrated scientists of the modern era: Charles Darwin. From the full text of books listed in his chronologically-organized reading journals, we generate topic models to quantify his local (text-to-text) and global (text-to-past) reading decisions using Kullback-Leibler divergence, a cognitively-validated, information-theoretic measure of relative surprise. Rather than a pattern of surprise-minimization, corresponding to a pure exploitation strategy, Darwin's behavior shifts from early exploitation to later exploration, seeking unusually high levels of cognitive surprise relative to previous eras. These shifts, detected by an unsupervised Bayesian model, correlate with major intellectual epochs of his career as identified both by qualitative scholarship and Darwin's own self-commentary. Our methods allow us to compare his consumption of texts with their publication order. We find Darwin's consumption more exploratory than the culture's production, suggesting that underneath gradual societal changes are the explorations of individual synthesis and discovery. Our quantitative methods advance the study of cognitive search through a framework for testing interactions between individual and collective behavior and between short- and long-term consumption choices. This novel application of topic modeling to characterize individual reading complements widespread studies of collective scientific behavior.
  • The organism is a fundamental concept in biology. However, there is no universally accepted, formal, and yet broadly applicable definition of what an organism is. Here we introduce a candidate definition. We adopt the view that the "organism" is a functional concept, used by scientists to address particular questions concerning the future state of a biological system, rather than something wholly defined by that system. In this approach organisms are a coarse-graining of a fine-grained dynamical model of a biological system. Crucially, the coarse-graining of the system into organisms is chosen so that their dynamics can be used by scientists to make accurate predictions of those features of the biological system that interest them, and to do so with minimal computational burden. To illustrate our framework we apply it to a dynamic model of lichen symbiosis---a system where either the lichen or its constituent fungi and algae could reasonably be considered "organisms." We find that the best choice of organisms in this scenario is a set of complex mixtures of many entities that do not resemble standard notions of organisms. When we restrict our allowed coarse-grainings to more traditional types of organisms, we find that ecological conditions, such as niche competition and predation pressure, play a significant role in determining the best choice of organisms.
  • What is the boundary between a vigorous argument and a breakdown of relations? What drives a group of individuals across it? Taking Wikipedia as a test case, we use a hidden Markov model to approximate the computational structure and social grammar of more than a decade of cooperation and conflict among its editors. Across a wide range of pages, we discover a bursty war/peace structure where the systems can become trapped, sometimes for months, in a computational subspace associated with significantly higher levels of conflict-tracking "revert" actions. Distinct patterns of behavior characterize the lower-conflict subspace, including tit-for-tat reversion. While a fraction of the transitions between these subspaces are associated with top-down actions taken by administrators, the effects are weak. Surprisingly, we find no statistical signal that transitions are associated with the appearance of particularly anti-social users, and only weak association with significant news events outside the system. These findings are consistent with transitions being driven by decentralized processes with no clear locus of control. Models of belief revision in the presence of a common resource for information-sharing predict the existence of two distinct phases: a disordered high-conflict phase, and a frozen phase with spontaneously-broken symmetry. The bistability we observe empirically may be a consequence of editor turn-over, which drives the system to a critical point between them.
  • When we use machine learning for public policy, we find that many useful variables are associated with others on which it would be ethically problematic to base decisions. This problem becomes particularly acute in the Big Data era, when predictions are often made in the absence of strong theories for underlying causal mechanisms. We describe the dangers to democratic decision-making when high-performance algorithms fail to provide an explicit account of causation. We then demonstrate how information theory allows us to degrade predictions so that they decorrelate from protected variables with minimal loss of accuracy. Enforcing total decorrelation is at best a near-term solution, however. The role of causal argument in ethical debate urges the development of new, interpretable machine-learning algorithms that reference causal mechanisms.
  • For many organisms, the number of sensory neurons is largely determined during development, before strong environmental cues are present. This is despite the fact that environments can fluctuate drastically both from generation to generation and within an organism's lifetime. How can organisms get by while hard-coding the number of sensory neurons? We approach this question using rate-distortion theory. A combination of simulation and theory suggests that when environments are large, the rate-distortion function---a proxy for material costs, timing delays, and energy requirements---depends only on coarse-grained environmental statistics that are expected to change on evolutionary, rather than ontogenetic, timescales.
  • We present three major transitions that occur on the way to the elaborate and diverse societies of the modern era. Our account links the worlds of social animals such as pigtail macaques and monk parakeets to examples from human history, including 18th Century London and the contemporary online phenomenon of Wikipedia. From the first awareness and use of group-level social facts to the emergence of norms and their self-assembly into normative bundles, each transition represents a new relationship between the individual and the group. At the center of this relationship is the use of coarse-grained information gained via lossy compression. The role of top-down causation in the origin of society parallels that conjectured to occur in the origin and evolution of life itself.
  • Social norms have traditionally been difficult to quantify. In any particular society, their sheer number and complex interdependencies often limit a system-level analysis. One exception is that of the network of norms that sustain the online Wikipedia community. We study the fifteen-year evolution of this network using the interconnected set of pages that establish, describe, and interpret the community's norms. Despite Wikipedia's reputation for \textit{ad hoc} governance, we find that its normative evolution is highly conservative. The earliest users create norms that both dominate the network and persist over time. These core norms govern both content and interpersonal interactions using abstract principles such as neutrality, verifiability, and assume good faith. As the network grows, norm neighborhoods decouple topologically from each other, while increasing in semantic coherence. Taken together, these results suggest that the evolution of Wikipedia's norm network is akin to bureaucratic systems that predate the information age.
  • Common knowledge of intentions is crucial to basic social tasks ranging from cooperative hunting to oligopoly collusion, riots, revolutions, and the evolution of social norms and human culture. Yet little is known about how common knowledge leaves a trace on the dynamics of a social network. Here we show how an individual's network properties---primarily local clustering and betweenness centrality---provide strong signals of the ability to successfully participate in common knowledge tasks. These signals are distinct from those expected when practices are contagious, or when people use less-sophisticated heuristics that do not yield true coordination. This makes it possible to infer decision rules from observation. We also find that tasks that require common knowledge can yield significant inequalities in success, in contrast to the relative equality that results when practices spread by contagion alone.
  • In complex environments, there are costs to both ignorance and perception. An organism needs to track fitness-relevant information about its world, but the more information it tracks, the more resources it must devote to memory and processing. Rate-distortion theory shows that, when errors are allowed, remarkably efficient internal representations can be found by biologically-plausible hill-climbing mechanisms. We identify two regimes: a high-fidelity regime where perceptual costs scale logarithmically with environmental complexity, and a low-fidelity regime where perceptual costs are, remarkably, independent of the environment. When environmental complexity is rising, Darwinian evolution should drive organisms to the threshold between the high- and low-fidelity regimes. Organisms that code efficiently will find themselves able to make, just barely, the most subtle distinctions in their environment.
  • To analyze high-dimensional systems, many fields in science and engineering rely on high-level descriptions, sometimes called "macrostates," "coarse-grainings," or "effective theories". Examples of such descriptions include the thermodynamic properties of a large collection of point particles undergoing reversible dynamics, the variables in a macroeconomic model describing the individuals that participate in an economy, and the summary state of a cell composed of a large set of biochemical networks. Often these high-level descriptions are constructed without considering the ultimate reason for needing them in the first place. Here, we formalize and quantify one such purpose: the need to predict observables of interest concerning the high-dimensional system with as high accuracy as possible, while minimizing the computational cost of doing so. The resulting State Space Compression (SSC) framework provides a guide for how to solve for the \textit{optimal} high-level description of a given dynamical system, rather than constructing it based on human intuition alone. In this preliminary report, we introduce SSC, and illustrate it with several information-theoretic quantifications of "accuracy", all with different implications for the optimal compression. We also discuss some other possible applications of SSC beyond the goal of accurate prediction. These include SSC as a measure of the complexity of a dynamical system, and as a way to quantify information flow between the scales of a system.
  • Group-level cognitive states are widely observed in human social systems, but their discussion is often ruled out a priori in quantitative approaches. In this paper, we show how reference to the irreducible mental states and psychological dynamics of a group is necessary to make sense of large scale social phenomena. We introduce the problem of mental boundaries by reference to a classic problem in the evolution of cooperation. We then provide an explicit quantitative example drawn from ongoing work on cooperation and conflict among Wikipedia editors, showing how some, but not all, effects of individual experience persist in the aggregate. We show the limitations of methodological individualism, and the substantial benefits that come from being able to refer to collective intentions, and attributions of cognitive states of the form "what the group believes" and "what the group values".
  • We propose a novel method for clustering data which is grounded in information-theoretic principles and requires no parametric assumptions. Previous attempts to use information theory to define clusters in an assumption-free way are based on maximizing mutual information between data and cluster labels. We demonstrate that this intuition suffers from a fundamental conceptual flaw that causes clustering performance to deteriorate as the amount of data increases. Instead, we return to the axiomatic foundations of information theory to define a meaningful clustering measure based on the notion of consistency under coarse-graining for finite data.
  • In compressed sensing, we wish to reconstruct a sparse signal $x$ from observed data $y$. In sparse coding, on the other hand, we wish to find a representation of an observed signal $y$ as a sparse linear combination, with coefficients $x$, of elements from an overcomplete dictionary. While many algorithms are competitive at both problems when $x$ is very sparse, it can be challenging to recover $x$ when it is less sparse. We present the Difference Map, which excels at sparse recovery when sparseness is lower and noise is higher. The Difference Map out-performs the state of the art with reconstruction from random measurements and natural image reconstruction via sparse coding.
  • We consider Bayesian estimation of information-theoretic quantities from data, using a Dirichlet prior. Acknowledging the uncertainty of the event space size $m$ and the Dirichlet prior's concentration parameter $c$, we treat both as random variables set by a hyperprior. We show that the associated hyperprior, $P(c, m)$, obeys a simple "Irrelevance of Unseen Variables" (IUV) desideratum iff $P(c, m) = P(c) P(m)$. Thus, requiring IUV greatly reduces the number of degrees of freedom of the hyperprior. Some information-theoretic quantities can be expressed multiple ways, in terms of different event spaces, e.g., mutual information. With all hyperpriors (implicitly) used in earlier work, different choices of this event space lead to different posterior expected values of these information-theoretic quantities. We show that there is no such dependence on the choice of event space for a hyperprior that obeys IUV. We also derive a result that allows us to exploit IUV to greatly simplify calculations, like the posterior expected mutual information or posterior expected multi-information. We also use computer experiments to favorably compare an IUV-based estimator of entropy to three alternative methods in common use. We end by discussing how seemingly innocuous changes to the formalization of an estimation problem can substantially affect the resultant estimates of posterior expectations.
  • Reciprocity is a vital feature of social networks, but relatively little is known about its temporal structure or the mechanisms underlying its persistence in real world behavior. In pursuit of these two questions, we study the stationary and dynamical signals of reciprocity in a network of manioc beer (Spanish: chicha; Tsimane': shocdye') drinking events in a Tsimane' village in lowland Bolivia. At the stationary level, our analysis reveals that social exchange within the community is heterogeneously patterned according to kinship and spatial proximity. A positive relationship between the frequencies at which two families host each other, controlling for kinship and proximity, provides evidence for stationary reciprocity. Our analysis of the dynamical structure of this network presents a novel method for the study of conditional, or non-stationary, reciprocity effects. We find evidence that short-timescale reciprocity (within three days) is present among non- and distant-kin pairs; conversely, we find that levels of cooperation among close kin can be accounted for on the stationary hypothesis alone.
  • We investigate the computational structure of a paradigmatic example of distributed social interaction: that of the open-source Wikipedia community. We examine the statistical properties of its cooperative behavior, and perform model selection to determine whether this aspect of the system can be described by a finite-state process, or whether reference to an effectively unbounded resource allows for a more parsimonious description. We find strong evidence, in a majority of the most-edited pages, in favor of a collective-state model, where the probability of a "revert" action declines as the square root of the number of non-revert actions seen since the last revert. We provide evidence that the emergence of this social counter is driven by collective interaction effects, rather than properties of individual users.
  • We characterize the statistical bootstrap for the estimation of information-theoretic quantities from data, with particular reference to its use in the study of large-scale social phenomena. Our methods allow one to preserve, approximately, the underlying axiomatic relationships of information theory---in particular, consistency under arbitrary coarse-graining---that motivate use of these quantities in the first place, while providing reliability comparable to the state of the art for Bayesian estimators. We show how information-theoretic quantities allow for rigorous empirical study of the decision-making capacities of rational agents and the time-asymmetric flows of information in distributed systems. We provide illustrative examples by reference to ongoing collaborative work on the semantic structure of the British Criminal Court system and the conflict dynamics of the contemporary Afghanistan insurgency.
  • A common feature of biological networks is the geometric property of self-similarity. Networks ranging from molecular regulatory networks through to circulatory systems, nervous systems, social systems, and ecological trophic networks show self-similar connectivity at multiple scales. We analyze the relationship between topology and signaling in contrasting classes of such topologies. We find that networks differ in their ability to contain or propagate signals between arbitrary nodes in a network depending on whether they possess branching or loop-like features. Networks also differ in how they respond to noise, such that one allows for greater integration at high noise, and this performance is reversed at low noise. Surprisingly, small-world topologies, with diameters logarithmic in system size, have slower dynamical timescales, and may be less integrated (more modular) than networks with longer path lengths. All of these phenomena are essentially mesoscopic, vanishing in the infinite limit but producing strong effects at sizes and timescales relevant to biology.
  • Abstracting an effective theory from a complicated process is central to the study of complexity. Even when the underlying mechanisms are understood, or at least measurable, the presence of dissipation and irreversibility in biological, computational and social systems makes the problem harder. Here we demonstrate the construction of effective theories in the presence of both irreversibility and noise, in a dynamical model with underlying feedback. We use the Krohn-Rhodes theorem to show how the composition of underlying mechanisms can lead to innovations in the emergent effective theory. We show how dissipation and irreversibility fundamentally limit the lifetimes of these emergent structures, even though, on short timescales, the group properties may be enriched compared to their noiseless counterparts.
  • Random instances of feedforward Boolean circuits are studied both analytically and numerically. Evaluating these circuits is known to be a P-complete problem and thus, in the worst case, believed to be impossible to perform, even given a massively parallel computer, in time much less than the depth of the circuit. Nonetheless, it is found that for some ensembles of random circuits, saturation to a fixed truth value occurs rapidly so that evaluation of the circuit can be accomplished in much less parallel time than the depth of the circuit. For other ensembles saturation does not occur and circuit evaluation is apparently hard. In particular, for some random circuits composed of connectives with five or more inputs, the number of true outputs at each level is a chaotic sequence. Finally, while the average case complexity depends on the choice of ensemble, it is shown that for all ensembles it is possible to simultaneously construct a typical circuit together with its solution in polylogarithmic parallel time.
  • We analyze the timescales of conflict decision-making in a primate society. We present evidence for multiple, periodic timescales associated with social decision-making and behavioral patterns. We demonstrate the existence of periodicities that are not directly coupled to environmental cycles or known ultradian mechanisms. Among specific biological and socially-defined demographic classes, periodicities span timescales between hours and days, and many are not driven by exogenous or internal regularities. Our results indicate that they are instead driven by strategic responses to social interaction patterns. Analyses also reveal that a class of individuals playing a critical functional role, policing, has a signature timescale on the order of one hour. We propose a classification of behavioral timescales analogous to those of the nervous system, with high-frequency, or $\alpha$-scale, behavior occurring on hour-long scales, through to multi-hour, or $\beta$-scale, behavior, and, finally, $\gamma$ periodicities observed on a timescale of days.
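
Several abstracts in this list rely on Kullback-Leibler divergence as a measure of surprise; the Darwin reading-journal abstract, for instance, distinguishes local (text-to-text) from global (text-to-past) surprise over topic distributions. The following is a minimal sketch of that idea, not the authors' implementation. The function names, the direction of the divergence (current text relative to what came before), and the use of a simple average of earlier topic distributions as the "past" are all assumptions made for illustration.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in bits. Assumes q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def reading_surprise(topic_dists):
    """Hypothetical helper: for each text after the first, return a
    (local, global) pair of surprise values.

    topic_dists is a chronologically ordered list of topic distributions,
    one per text, each a list of probabilities summing to 1.
    """
    results = []
    for i in range(1, len(topic_dists)):
        current = topic_dists[i]
        previous = topic_dists[i - 1]
        past = topic_dists[:i]
        # Assumption: the "past" is the unweighted mean of all earlier texts.
        past_mean = [
            sum(dist[k] for dist in past) / len(past)
            for k in range(len(current))
        ]
        local = kl_divergence(current, previous)      # text-to-text
        global_ = kl_divergence(current, past_mean)   # text-to-past
        results.append((local, global_))
    return results


if __name__ == "__main__":
    # Toy two-topic example: two balanced texts, then a single-topic text.
    texts = [[0.5, 0.5], [0.5, 0.5], [1.0, 0.0]]
    for local, global_ in reading_surprise(texts):
        print(f"local={local:.3f} bits, global={global_:.3f} bits")
```

In this toy setting, low values for both quantities would correspond to exploitation (reading more of the same), while high values would correspond to exploration. A real analysis would need smoothed topic distributions so that the divergence stays finite.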