• Here we study polysemy as a potential learning bias in vocabulary learning in children. Words of low polysemy could be preferred as they reduce the disambiguation effort for the listener. However, such a preference could be a side-effect of another bias: the preference of children for nouns in combination with the lower polysemy of nouns with respect to other part-of-speech categories. Our results show that mean polysemy in children increases over time in two phases, i.e. a fast growth until the 31st month followed by a slower tendency towards adult speech. In contrast, this evolution is not found in adults interacting with children. This suggests that children have a preference for non-polysemous words in their early stages of vocabulary acquisition. Interestingly, the evolutionary pattern described above weakens when controlling for syntactic category (noun, verb, adjective or adverb) but it does not disappear completely, suggesting that it could result from a combination of a standalone bias for low polysemy and a preference for nouns.
  • Lane changing is one of the most common maneuvers on motorways. Although macroscopic traffic models are well known for their suitability to describe fast-moving crowded traffic, most of these models are developed in a one-dimensional framework; hence lane-changing behavior is largely neglected. In this paper, we propose a macroscopic model, which accounts for lane-changing behavior on motorways, based on a two-dimensional extension of the Aw and Rascle [Aw and Rascle, SIAM J.Appl.Math., 2000] and Zhang [Zhang, Transport.Res.B-Meth., 2002] macroscopic model for traffic flow. Under conditions in which lane-changing maneuvers are no longer possible, the model "relaxes" to the one-dimensional Aw-Rascle-Zhang model. Following the same approach as in [Aw, Klar, Materne and Rascle, SIAM J.Appl.Math., 2002], we derive the two-dimensional macroscopic model through scaling of the time discretization of a microscopic follow-the-leader model with driving direction. We provide a detailed analysis of the space-time discretization of the proposed macroscopic model as well as an approximation of the solution to the associated Riemann problem. Furthermore, we illustrate some features of the proposed model through numerical experiments.
  • This paper introduces a theoretical framework for the analysis and control of the stochastic susceptible-infected-removed (SIR) spreading process over a network of heterogeneous agents. We analyze the exact networked Markov process describing the SIR model, without resorting to mean-field approximations, and introduce a convex optimization framework to find an efficient allocation of resources to contain the expected number of accumulated infections over time. Numerical simulations are presented to illustrate the effectiveness of the obtained results.
  • The occurrence of discrimination is an important problem in the social and economic sciences. Much of the discrimination observed in empirical studies can be explained by the theory of in-group favoritism, which states that people tend to act more positively towards peers whose appearances are more similar to their own. Some studies, however, find hierarchical structures in inter-group relations, where members of low-status groups also favor the high-status group members. These observations cannot be understood in the light of in-group favoritism. Here we present an agent-based model in which evolutionary dynamics can result in a hierarchical discrimination between two groups characterized by a meaningless, but observable binary label. We find that discriminating strategies end up dominating the system when the selection pressure is high, i.e. when agents have a much higher probability of imitating their neighbor with the highest payoff. These findings suggest that the puzzling persistence of hierarchical discrimination may result from the evolutionary dynamics of the social system itself, namely the social imitation dynamics. They also predict that discrimination will occur more often in highly competitive societies.
  • Ancient regional routes were vital for interactions between settlements and deeply influenced the development of past societies and their "complexification". At the same time, since any transportation infrastructure needs some level of inter-settlement cooperation to be established, they can also be regarded as an epiphenomenon of social interactions at the regional scale. Here, we propose to analyze ancient pathway networks to understand the organization of cities and villages located in a certain territory, attempting to clarify whether such organization existed and if so, how it functioned. To address such a question, we chose a quantitative approach. Adopting network science as a general framework, by means of formal models, we try to identify how the collective effort that produced the terrestrial infrastructure was directed and organized. We selected a paradigmatic case study: Iron Age southern Etruria, a very well-studied context, with detailed archaeological information about settlement patterns and an established tradition of studies on terrestrial transportation routes, perfectly suitable for testing new techniques. The results of the modelling suggest that a balanced coordinated decision-making process was shaping the route network in Etruria, a scenario which correlates well with the picture elaborated by different scholars using a more traditional technique.
  • Covering problems are classical computational problems concerning whether a certain combinatorial structure 'covers' another. For example, the minimum vertex covering problem aims to find the smallest set of vertices in a graph so that each edge is incident to at least one vertex in that set. Interestingly, the computational complexity of the minimum vertex covering problem in graphs is closely related to the core percolation problem, where the core is a special subgraph obtained by the greedy leaf removal procedure. Here, by generalizing the greedy leaf removal procedure in graphs to hypergraphs, we introduce two generalizations of core percolation in graphs to hypergraphs, related to the minimum hyperedge cover problem and the minimum vertex cover problem on hypergraphs, respectively. We offer analytical solutions of these two core percolations for random hypergraphs with arbitrary vertex degree and hyperedge cardinality distributions. We also compute these two cores in several real-world hypergraphs, finding that they tend to be much smaller than their randomized counterparts. This result suggests that both the minimum hyperedge cover problem and the minimum vertex cover problem in those real-world hypergraphs can actually be solved in polynomial time. Finally, we map the minimum dominating set problem in graphs to the minimum hyperedge cover problem in hypergraphs. We show that our generalized greedy leaf removal procedure significantly outperforms the state-of-the-art method in solving the minimum dominating set problem.
  • Although viral spreading processes taking place in networks are often analyzed using Markovian models in which both the transmission and the recovery times follow exponential distributions, empirical studies show that, in many real scenarios, the distributions of these times are not necessarily exponential. To overcome this limitation, we first introduce a generalized susceptible-infected-susceptible (SIS) spreading model that allows transmission and recovery times to follow phase-type distributions. In this context, we derive a lower bound on the exponential decay rate towards the infection-free equilibrium of the spreading model without relying on mean-field approximations. Based on our results, we illustrate how the particular shape of the transmission/recovery distribution influences the exponential rate of convergence towards the equilibrium.
  • In domains as different as neuroscience, spin glasses, social science, economics and finance, large ensembles of interacting individuals following (mainstreams) or opposing (hipsters) the majority are ubiquitous. In these systems, interactions generally occur after specific delays associated with the transport, transmission or integration of information. We investigate here the impact of anti-conformism combined with delays on the emergent dynamics of large populations of mainstreams and hipsters. To this purpose, we introduce a class of simple statistical systems of interacting agents composed of (i) mainstreams and anti-conformists in the presence of (ii) delays, possibly heterogeneous, in the transmission of information. In this simple model, each agent can be in one of two states, and can change state in continuous time with a rate depending on the state of others in the past. We express the thermodynamic limit of these systems as the number of agents diverges, and investigate the solutions of the limit equation, with a particular focus on synchronized oscillations induced by delayed interactions. We show that when hipsters are too slow in detecting the trends, they will consistently make the same choice and, realizing this too late, they will switch all together to another state where they remain alike. Similar synchronizations arise when the impact of mainstreams on hipsters' choices (and reciprocally) dominates the impact of other hipsters' choices, and we show that these may emerge only when the randomness in the hipsters' decisions is sufficiently large. Beyond the choice of the best suit to wear this winter, this study may have important implications in understanding synchronization of nerve cells, investment strategies in finance, or emergent dynamics in social science, domains in which delays of communication and the geometry of information accessibility are prominent.
  • Benford's Law predicts that the first significant digit on the leftmost side of numbers in real-life data is proportioned between all possible 1 to 9 digits approximately as in LOG(1 + 1/digit), so that low digits occur much more frequently than high digits in the first place. The two essential prerequisites for data configuration with regard to compliance with Benford's Law are high order of magnitude and positive skewness with a tail falling to the right of the histogram, so that quantitative configuration is such that the small is numerous and the big is rare. In this article various quantitative partition models are examined in terms of the quantitative and digital behavior of the resultant set of parts. The universal feature found across all partition models is having many small parts but only very few big parts, while Benford's Law is valid only in some particular partition cases and under certain constraints. Hence another suggested vista of Benford's Law is viewing it as a particular subset of the broader positive skewness phenomenon in quantitative partitioning. Significantly, such a vista is true in all other causes and explanations of Benford's Law, where the small consistently outnumbers the big also in partial structures of the model or well before full convergence to Benford is achieved, endowing the principle with universality in a sense. In conclusion, either the active act of partitioning or the passive consideration of a large quantity as the composition of smaller parts can be considered as another independent explanation for the widespread empirical observation of Benford's Law in the physical sciences.
  • Collective, especially group-based, managerial decision making is crucial in organizations. Using an evolutionary theoretic approach to collective decision making, agent-based simulations were conducted to investigate how human collective decision making would be affected by the agents' diversity in problem understanding and/or behavior in discussion, as well as by their social network structure. Simulation results indicated that groups with consistent problem understanding tended to produce higher utility values of ideas and displayed better decision convergence, but only if there was no group-level bias in collective problem understanding. Simulation results also indicated the importance of balance between selection-oriented (i.e., exploitative) and variation-oriented (i.e., explorative) behaviors in discussion to achieve quality final decisions. Expanding the group size and introducing non-trivial social network structure generally improved the quality of ideas at the cost of decision convergence. Simulations with different social network topologies revealed collective decision making on small-world networks with high local clustering tended to achieve highest decision quality more often than on random or scale-free networks. Implications of this evolutionary theory and simulation approach for future managerial research on collective, group, and multi-level decision making are discussed.
  • Modern society depends on the flow of information over online social networks, and users of popular platforms generate significant behavioral data about themselves and their social ties. However, it remains unclear what fundamental limits exist when using these data to predict the activities and interests of individuals, and to what accuracy such predictions can be made using an individual's social ties. Here we show that 95% of the potential predictive accuracy for an individual is achievable using their social ties only, without requiring that individual's data. We use information theoretic tools to estimate the predictive information within the writings of Twitter users, providing an upper bound on the available predictive information that holds for any predictive or machine learning methods. As few as 8-9 of an individual's contacts are sufficient to obtain predictability comparable to that of the individual alone. Distinct temporal and social effects are visible by measuring information flow along social ties, allowing us to better study the dynamics of online activity. Our results have distinct privacy implications: information is so strongly embedded in a social network that in principle one can profile an individual from their available social ties even when the individual forgoes the platform completely.
  • We introduce a tensor-based clustering method to extract sparse, low-dimensional structure from high-dimensional, multi-indexed datasets. This framework is designed to enable detection of clusters of data in the presence of structural requirements which we encode as algebraic constraints in a linear program. Our clustering method is general and can be tailored to a variety of applications in science and industry. We illustrate our method on a collection of experiments measuring the response of genetically diverse breast cancer cell lines to an array of ligands. Each experiment consists of a cell line-ligand combination, and contains time-course measurements of the early-signalling kinases MAPK and AKT at two different ligand dose levels. By imposing appropriate structural constraints and respecting the multi-indexed structure of the data, the analysis of clusters can be optimized for biological interpretation and therapeutic understanding. We then perform a systematic, large-scale exploration of mechanistic models of MAPK-AKT crosstalk for each cluster. This analysis allows us to quantify the heterogeneity of breast cancer cell subtypes, and leads to hypotheses about the signalling mechanisms that mediate the response of the cell lines to ligands.
  • We sketch the history of spectral ranking, a general umbrella name for techniques that apply the theory of linear maps (in particular, eigenvalues and eigenvectors) to matrices that do not represent geometric transformations, but rather some kind of relationship between entities. Albeit recently made famous by the ample press coverage of Google's PageRank algorithm, spectral ranking was devised more than a century ago, and has been studied in tournament ranking, psychology, social sciences, bibliometrics, economy and choice theory. We describe the contribution given by previous scholars in precise and modern mathematical terms: along the way, we show how to express in a general way damped rankings, such as Katz's index, as dominant eigenvectors of perturbed matrices, and then use results on the Drazin inverse to go back to the dominant eigenvectors by a limit process. The result suggests a regularized definition of spectral ranking that yields for a general matrix a unique vector depending on a boundary condition.
  • Centrality is widely recognized as one of the most critical measures to provide insight into the structure and function of complex networks. While various centrality measures have been proposed for single-layer networks, a general framework for studying centrality in multilayer networks (i.e., multicentrality) is still lacking. In this study, a tensor-based framework is introduced to study eigenvector multicentrality, which enables the quantification of the impact of interlayer influence on multicentrality, providing a systematic way to describe how multicentrality propagates across different layers. This framework can leverage prior knowledge about the interplay among layers to better characterize multicentrality for varying scenarios. Two interesting cases are presented to illustrate how to model multilayer influence by choosing appropriate functions of interlayer influence and design algorithms to calculate eigenvector multicentrality. This framework is applied to analyze several empirical multilayer networks, and the results corroborate that it can quantify the influence among layers and multicentrality of nodes effectively.
  • The structure of the International Trade Network (ITN), whose nodes and links represent world countries and their trade relations respectively, affects key economic processes worldwide, including globalization, economic integration, industrial production, and the propagation of shocks and instabilities. Characterizing the ITN via a simple yet accurate model is an open problem. The traditional Gravity Model (GM) successfully reproduces the volume of trade between connected countries, using macroeconomic properties such as GDP, geographic distance, and possibly other factors. However, it predicts a network with complete or homogeneous topology, thus failing to reproduce the highly heterogeneous structure of the ITN. On the other hand, recent maximum-entropy network models successfully reproduce the complex topology of the ITN, but provide no information about trade volumes. Here we integrate these two currently incompatible approaches via the introduction of an Enhanced Gravity Model (EGM) of trade. The EGM is the simplest model combining the GM with the network approach within a maximum-entropy framework. Via a unified and principled mechanism that is transparent enough to be generalized to any economic network, the EGM provides a new econometric framework wherein trade probabilities and trade volumes can be separately controlled by any combination of dyadic and country-specific macroeconomic variables. The model successfully reproduces both the global topology and the local link weights of the ITN, parsimoniously reconciling the conflicting approaches. It also indicates that the probability that any two countries trade a certain volume should follow a geometric or exponential distribution with an additional point mass at zero volume.
  • Various models have recently been proposed to reflect and predict different properties of complex networks. However, the community structure, which is one of the most important properties, is not well studied and modeled. In this paper, we suggest a principle called "preferential placement", which allows one to model a realistic clustering structure. We provide an extensive empirical analysis of the obtained structure as well as some theoretical results.
  • Recent progress in applying complex network theory to problems in quantum information has resulted in a beneficial crossover. Complex network methods have successfully been applied to transport and entanglement models while information physics is setting the stage for a theory of complex systems with quantum information-inspired methods. Novel quantum-induced effects have been predicted in random graphs (where edges represent entangled links), and quantum computer algorithms have been proposed to offer enhancement for several network problems. Here we review the results at the cutting edge, pinpointing the similarities and the differences found at the intersection of these two fields.
  • Circuity, the ratio of network distances to straight-line distances, is an important measure of urban street network structure and transportation efficiency. Circuity results from a circulation network's configuration, planning, and underlying terrain. In turn, it impacts how humans use urban space for settlement and travel. Although past research has examined overall street network circuity, researchers have not studied the relative circuity of walkable versus drivable circulation networks. This study uses OpenStreetMap data to explore relative network circuity. We download walkable and drivable networks for 40 US cities using the OSMnx software, which we then use to simulate four million routes and analyze circuity to characterize network structure. We find that walking networks tend to allow for more direct routes than driving networks do in most cities: average driving circuity exceeds average walking circuity in all but four of the cities that exhibit statistically significant differences between network types. We discuss various reasons for this phenomenon, illustrated with case studies. Network circuity also varies substantially between different types of places. These findings underscore the value of using network-based distances and times rather than straight-line when studying urban travel and access. They also suggest the importance of differentiating between walkable and drivable circulation networks when modeling and characterizing urban street networks: although different modes' networks overlap in any given city, their relative structure and performance vary in most cities.
  • We develop an algorithm that forecasts cascading events by employing a Green's function scheme on the basis of the self-exciting point process model. This method is applied to open data on 10 types of crimes that occurred in Chicago. It shows a prediction accuracy superior or comparable to that of the standard methods, namely the expectation-maximization method and the prospective hotspot maps method. We find a cascade influence of the crimes that has a long-time, logarithmic tail; this result is consistent with an earlier study on burglaries. This long-tail feature cannot be reproduced by the other standard methods. In addition, a merit of the Green's function method is its low computational cost in the case of a high density of events and/or a large amount of training data.
  • Positioning data offer a remarkable source of information to analyze the urban dynamics of crowds. However, discovering urban activity patterns from the emergent behavior of crowds involves complex system modeling. An alternative approach is to adopt computational techniques belonging to the emergent paradigm, which enables self-organization of data and allows adaptive analysis. Specifically, our approach is based on stigmergy. By using stigmergy, each sample position is associated with a digital pheromone deposit, which progressively evaporates and aggregates with other deposits according to their spatiotemporal proximity. Based on this principle, we exploit positioning data to identify high-density areas (hotspots) and characterize their activity over time. This characterization allows the comparison of dynamics occurring on different days, providing a similarity measure exploitable by clustering techniques. Thus, we cluster days according to their activity behavior, discovering unexpected urban activity patterns. As a case study, we analyze taxi traces in New York City during 2015.
  • Accounting for undecided and uncertain voters is a challenging issue for predicting election results from public opinion polls. Undecided voters typify the uncertainty of swing voters in polls but are often ignored or allocated to each candidate in a simple, deterministic manner. Historically this may have been adequate because the number of undecided voters was small enough to assume that they do not affect the relative proportions of the decided voters. However, in the presence of high numbers of undecided voters, these static rules may in fact bias election predictions from election poll authors and meta-poll analysts. In this paper, we compare the effect of undecided voters in the 2016 US presidential election with that in the previous three presidential elections. We show there were a relatively high number of undecided voters over the campaign and on election day, and that the allocation of undecided voters in this election was not consistent with two-party proportional (or even) allocations. We find evidence that static allocation regimes are inadequate for election prediction models and that probabilistic allocations may be superior. We also estimate the bias attributable to polling agencies, often referred to as "house effects".
  • Social media are massive marketplaces where ideas and news compete for our attention. Previous studies have shown that quality is not a necessary condition for online virality and that knowledge about peer choices can distort the relationship between quality and popularity. However, these results do not explain the viral spread of low-quality information, such as the digital misinformation that threatens our democracy. We investigate quality discrimination in a stylized model of an online social network, where individual agents prefer quality information, but have behavioral limitations in managing a heavy flow of information. We measure the relationship between the quality of an idea and its likelihood to become prevalent at the system level. We find that both information overload and limited attention contribute to a degradation in the market's discriminative power. A good tradeoff between discriminative power and diversity of information is possible according to the model. However, calibration with empirical data characterizing information load and finite attention in real social media reveals a weak correlation between quality and popularity of information. In these realistic conditions, the model predicts that high-quality information has little advantage over low-quality information.
  • Segregation is the separation of social groups in the physical or in the online world. Segregation discovery consists of finding contexts of segregation. In the modern digital society, discovering segregation is challenging, due to the large amount and the variety of social data. We present a tool in support of segregation discovery from relational and graph data. The SCube system builds on attributed graph clustering and frequent itemset mining. It offers to the analyst a multi-dimensional segregation data cube for exploratory data analysis. The demonstration first guides the audience through the relevant social science concepts. Then, it focuses on scenarios around case studies of gender occupational segregation. Two real and large datasets about the boards of directors of Italian and Estonian companies will be explored in search of segregation contexts. The architecture of the SCube system and its computational efficiency challenges and solutions are discussed.
  • Percolation transitions are widely observed in networks ranging from biology to engineering. While much attention has been paid to network topologies, studies rarely focus on critical percolation phenomena driven by network dynamics. Using extensive real data, we study the critical percolation properties in city traffic dynamics. Our results suggest that two modes of different critical percolation behaviors alternate in the same network topology under different traffic dynamics. One mode of city traffic (during nonrush hours or days off) has critical percolation characteristics similar to those of small-world networks, while the other mode (during rush hours on working days) tends to behave as a 2D lattice. This switching behavior can be understood by the fact that the high-speed urban roads during nonrush hours or days off (that are congested during rush hours) represent effective long-range connections, as in small-world networks. Our results might be useful for understanding and improving traffic resilience.
  • Understanding the mechanisms responsible for the emergence and evolution of oscillations in traffic flow has been subject to intensive research by the traffic flow theory community. In our previous work, we proposed a new mechanism to explain the generation of traffic oscillations: traffic instability caused by the competition between speed adaptation and the cumulative effect of stochastic factors. In this paper, by conducting a closer examination of car-following data obtained in a 25-car platoon experiment, we discover that the speed difference plays a more important role in car-following dynamics than the spacing: when its amplitude is small, the growth of oscillations is mainly determined by the stochastic factors that follow the mean reversion process; when its amplitude increases, the growth of the oscillations is determined by the competition between the stochastic factors and the speed difference. An explanation is then provided, based on the above findings, as to why the speed variance in the oscillatory traffic grows in a concave way along the platoon. Finally, we propose a mode-switching stochastic car-following model that incorporates the speed adaptation and spacing indifference behaviors of drivers, which captures the observed characteristics of oscillation and discharge rate. Sensitivity analysis shows that the reaction delay has only a slight effect, whereas the indifference region boundary has a significant effect, on the oscillation growth rate and discharge rate.
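
As a small numerical illustration of the Benford's Law abstract in the list above, the first-digit proportions LOG(1 + 1/digit) can be tabulated directly. This is only a sketch: it assumes LOG denotes the base-10 logarithm, and the function name `benford_probability` is a hypothetical helper, not something taken from the article.

```python
import math


def benford_probability(digit: int) -> float:
    """Expected share of numbers whose first significant digit is `digit`.

    Assumes the base-10 form of Benford's Law: log10(1 + 1/digit).
    """
    if not 1 <= digit <= 9:
        raise ValueError("first significant digit must be between 1 and 9")
    return math.log10(1 + 1 / digit)


# Table of expected first-digit proportions for digits 1..9.
benford_table = {d: benford_probability(d) for d in range(1, 10)}

for d, p in benford_table.items():
    print(f"digit {d}: {p:.4f}")
```

The nine proportions sum to one because the product of (d + 1)/d over d = 1..9 telescopes to 10, and they decrease monotonically, which is the "small is numerous, big is rare" pattern the abstract describes.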
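
The circuity abstract in the list above defines circuity as the ratio of network distance to straight-line distance. The sketch below computes that ratio on a toy network. It is not the OSMnx workflow used in the study: the node names, planar coordinates, and the helpers `shortest_path_length` and `circuity` are all assumptions made for illustration, and real street data would need great-circle or projected distances.

```python
import heapq
import math


def shortest_path_length(adj, source, target):
    """Dijkstra shortest-path length on a weighted adjacency dict."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            return d
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return math.inf


def circuity(coords, edges, source, target):
    """Network distance divided by straight-line distance (planar coords)."""
    adj = {node: [] for node in coords}
    for u, v in edges:
        w = math.dist(coords[u], coords[v])  # edge length from coordinates
        adj[u].append((v, w))
        adj[v].append((u, w))
    network = shortest_path_length(adj, source, target)
    straight = math.dist(coords[source], coords[target])
    return network / straight


# Hypothetical unit-square block: the "drive" network follows the block edges,
# the "walk" network adds a diagonal footpath.
coords = {"a": (0, 0), "b": (1, 0), "c": (1, 1), "d": (0, 1)}
drive_edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
walk_edges = drive_edges + [("a", "c")]

drive_circuity = circuity(coords, drive_edges, "a", "c")
walk_circuity = circuity(coords, walk_edges, "a", "c")
```

In this toy case the walking route is perfectly direct while the driving route must go around the block, mirroring the direction of the difference the abstract reports for most cities.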
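
The covering-problems abstract in the list above builds on the greedy leaf removal procedure on graphs. The sketch below shows only that classical graph version, as commonly described: repeatedly pick a leaf (degree-one vertex), delete its neighbor together with all of the neighbor's edges, and call whatever non-isolated vertices remain the core. The hypergraph generalizations are the paper's contribution and are not reproduced here; the function name `greedy_leaf_removal_core` is a hypothetical helper.

```python
def greedy_leaf_removal_core(edges):
    """Return the vertex set of the core left by greedy leaf removal."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops in this sketch
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    leaves = [n for n, nbrs in adj.items() if len(nbrs) == 1]
    while leaves:
        leaf = leaves.pop()
        # Skip stale entries: vertex already removed or no longer a leaf.
        if leaf not in adj or len(adj[leaf]) != 1:
            continue
        (neighbor,) = adj[leaf]
        # Remove the leaf's neighbor and every edge incident to it.
        for w in adj.pop(neighbor):
            adj[w].discard(neighbor)
            if len(adj[w]) == 1:
                leaves.append(w)  # w has just become a leaf
        # The leaf itself is now isolated; drop it too.
        adj.pop(leaf, None)

    # Isolated vertices are not part of the core.
    return {n for n, nbrs in adj.items() if nbrs}


path = [(1, 2), (2, 3), (3, 4)]
triangle = [(1, 2), (2, 3), (3, 1)]
triangle_with_pendant = triangle + [(3, 4)]

print(greedy_leaf_removal_core(path))
print(greedy_leaf_removal_core(triangle))
print(greedy_leaf_removal_core(triangle_with_pendant))
```

A path or tree is stripped completely, a bare cycle has no leaves and survives intact, and a single pendant vertex attached to a triangle is enough to dissolve it.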