-
Here we study polysemy as a potential learning bias in vocabulary learning in
children. Words of low polysemy could be preferred as they reduce the
disambiguation effort for the listener. However, such preference could be a
side-effect of another bias: the preference of children for nouns in
combination with the lower polysemy of nouns with respect to other
part-of-speech categories. Our results show that mean polysemy in children
increases over time in two phases, i.e. a fast growth till the 31st month
followed by a slower tendency towards adult speech. In contrast, this evolution
is not found in adults interacting with children. This suggests that children
have a preference for non-polysemous words in their early stages of vocabulary
acquisition. Interestingly, the evolutionary pattern described above weakens
when controlling for syntactic category (noun, verb, adjective or adverb) but
it does not disappear completely, suggesting that it could result from
a combination of a standalone bias for low polysemy and a preference for nouns.
-
Lane changing is one of the most common maneuvers on motorways. Although
macroscopic traffic models are well known for their suitability to describe
fast moving crowded traffic, most of these models are developed in a
one-dimensional framework; hence lane-changing behavior is largely
neglected. In this paper, we propose a macroscopic model, which accounts for
lane-changing behavior on motorways, based on a two-dimensional extension of the
Aw and Rascle [Aw and Rascle, SIAM J.Appl.Math., 2000] and Zhang [Zhang,
Transport.Res.B-Meth., 2002] macroscopic model for traffic flow. Under
conditions in which lane-changing maneuvers are no longer possible, the model
"relaxes" to the one-dimensional Aw-Rascle-Zhang model. Following the same
approach as in [Aw, Klar, Materne and Rascle, SIAM J.Appl.Math., 2002], we
derive the two-dimensional macroscopic model through scaling of time
discretization of a microscopic follow-the-leader model with driving direction.
We provide a detailed analysis of the space-time discretization of the proposed
macroscopic model as well as an approximation of the solution to the associated
Riemann problem. Furthermore, we illustrate some features of the proposed model
through some numerical experiments.
-
This paper introduces a theoretical framework for the analysis and control of
the stochastic susceptible-infected-removed (SIR) spreading process over a
network of heterogeneous agents. We analyze the exact
networked Markov process describing the SIR model, without resorting to
mean-field approximations, and introduce a convex optimization framework to
find an efficient allocation of resources to contain the expected number of
accumulated infections over time. Numerical simulations are presented to
illustrate the effectiveness of the obtained results.
-
The occurrence of discrimination is an important problem in the social and
economic sciences. Much of the discrimination observed in empirical studies
can be explained by the theory of in-group favoritism, which states that people
tend to act more positively towards peers whose appearances are more similar to
their own. Some studies, however, find hierarchical structures in inter-group
relations, where members of low-status groups also favor the high-status group
members. These observations cannot be understood in the light of in-group
favoritism. Here we present an agent-based model in which evolutionary dynamics
can result in a hierarchical discrimination between two groups characterized by
a meaningless, but observable binary label. We find that discriminating
strategies end up dominating the system when the selection pressure is high,
i.e. when agents have a much higher probability of imitating their neighbor
with the highest payoff. These findings suggest that the puzzling persistence
of hierarchical discrimination may result from the evolutionary dynamics of the
social system itself, namely the social imitation dynamics. The model also predicts
that discrimination will occur more often in highly competitive societies.
-
Ancient regional routes were vital for interactions between settlements and
deeply influenced the development of past societies and their
"complexification". At the same time, since any transportation infrastructure
needs some level of inter-settlement cooperation to be established, they can
also be regarded as an epiphenomenon of social interactions at the regional
scale. Here, we propose to analyze ancient pathway networks to understand the
organization of cities and villages located in a certain territory, attempting
to clarify whether such organization existed and if so, how it functioned. To
address such a question, we chose a quantitative approach. Adopting network
science as a general framework, by means of formal models, we try to identify
how the collective effort that produced the terrestrial infrastructure was
directed and organized. We selected a paradigmatic case study: Iron Age
southern Etruria, a very well-studied context, with detailed archaeological
information about settlement patterns and an established tradition of studies
on terrestrial transportation routes, perfectly suitable for testing new
techniques. The results of the modelling suggest that a balanced coordinated
decision-making process was shaping the route network in Etruria, a scenario
which correlates well with the picture elaborated by different scholars using a
more traditional technique.
-
Covering problems are classical computational problems concerning whether a
certain combinatorial structure 'covers' another. For example, the minimum
vertex covering problem aims to find the smallest set of vertices in a graph so
that each edge is incident to at least one vertex in that set. Interestingly,
the computational complexity of the minimum vertex covering problem in graphs
is closely related to the core percolation problem, where the core is a special
subgraph obtained by the greedy leaf removal procedure. Here, by generalizing
the greedy leaf removal procedure in graphs to hypergraphs, we introduce two
generalizations of core percolation in graphs to hypergraphs, related to the
minimum hyperedge cover problem and the minimum vertex cover problem on
hypergraphs, respectively. We offer analytical solutions of these two core
percolations for random hypergraphs with arbitrary vertex degree and hyperedge
cardinality distributions. We also compute these two cores in several
real-world hypergraphs, finding that they tend to be much smaller than their
randomized counterparts. This result suggests that both the minimum hyperedge
cover problem and the minimum vertex cover problem in those real-world
hypergraphs can actually be solved in polynomial time. Finally, we map the
minimum dominating set problem in graphs to the minimum hyperedge cover problem
in hypergraphs. We show that our generalized greedy leaf removal procedure
significantly outperforms the state-of-the-art method in solving the minimum
dominating set problem.
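
As a rough illustration of the graph-level procedure that the abstract generalizes, the following sketch applies greedy leaf removal to an ordinary graph: a leaf's only neighbor is put in the cover, and both are removed with their edges. This is the classical graph version only, not the hypergraph generalization described above; the function name `greedy_leaf_removal` and the edge-list input format are assumptions made for this example.

```python
def greedy_leaf_removal(edges):
    """Greedy leaf removal on a simple undirected graph (illustrative sketch).

    Returns (cover, core): the vertices chosen for the cover, and the
    adjacency of the subgraph left once no leaves remain (the core).
    """
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops in this sketch
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    cover = set()
    leaves = [v for v, nb in adj.items() if len(nb) == 1]
    while leaves:
        leaf = leaves.pop()
        # Skip stale entries: the vertex was removed or is no longer a leaf.
        if leaf not in adj or len(adj[leaf]) != 1:
            continue
        (parent,) = adj[leaf]
        cover.add(parent)
        # Remove the parent and all of its edges; this may create new leaves.
        for w in adj.pop(parent):
            adj[w].discard(parent)
            if len(adj[w]) == 1:
                leaves.append(w)

    # Drop isolated vertices; whatever still has edges is the core.
    core = {v: nb for v, nb in adj.items() if nb}
    return cover, core


path_cover, path_core = greedy_leaf_removal([(1, 2), (2, 3), (3, 4)])
triangle_cover, triangle_core = greedy_leaf_removal([(1, 2), (2, 3), (1, 3)])
```

When the core is empty, as for the path, the cover found this way is a minimum vertex cover; a non-empty core, as for the triangle, is the part that leaf removal alone cannot resolve.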
-
Although viral spreading processes taking place in networks are often
analyzed using Markovian models in which both the transmission and the recovery
times follow exponential distributions, empirical studies show that, in many
real scenarios, the distributions of these times are not necessarily
exponential. To overcome this limitation, we first introduce a generalized
susceptible-infected-susceptible (SIS) spreading model that allows transmission
and recovery times to follow phase-type distributions. In this context, we
derive a lower bound on the exponential decay rate towards the infection-free
equilibrium of the spreading model without relying on mean-field
approximations. Based on our results, we illustrate how the particular shape of
the transmission/recovery distribution influences the exponential rate of
convergence towards the equilibrium.
-
In domains as different as neuroscience, spin glasses, social science,
economics and finance, large ensembles of interacting individuals who either
follow the majority (mainstreams) or oppose it (hipsters) are ubiquitous. In these
systems, interactions generally occur after specific delays associated with
transport, transmission or integration of information. We investigate here the
impact of anti-conformism combined with delays on the emergent dynamics of large
populations of mainstreams and hipsters. To this end, we introduce a class
of simple statistical systems of interacting agents composed of (i) mainstreams
and anti-conformists in the presence of (ii) delays, possibly heterogeneous, in
the transmission of information. In this simple model, each agent can be in one
of two states, and can change state in continuous time with a rate depending on
the state of others in the past. We express the thermodynamic limit of these
systems as the number of agents diverges, and investigate the solutions of the
limit equation, with a particular focus on synchronized oscillations induced by
delayed interactions. We show that when hipsters are too slow in detecting the
trends, they will consistently make the same choice and, realizing this too
late, they will switch all together to another state where they remain alike.
Similar synchronizations arise when the impact of mainstreams on hipsters'
choices (and reciprocally) dominates the impact of other hipsters' choices, and
we show that these may emerge only when the randomness in the hipsters'
decisions is sufficiently large. Beyond the choice of the best suit to wear
this winter, this study may have important implications in understanding
synchronization of nerve cells, investment strategies in finance, or emergent
dynamics in social science, domains in which delays of communication and the
geometry of information accessibility are prominent.
-
Benford's Law predicts that the first significant (leftmost) digit of
numbers in real-life data is distributed among the digits 1 to 9
approximately as LOG10(1 + 1/digit), so that low digits occur much more
frequently than high digits in the first position. The two essential prerequisites
for a data configuration to comply with Benford's Law are high
order of magnitude and positive skewness with a tail falling to the right of
the histogram, so that quantitative configuration is such that the small is
numerous and the big is rare. In this article various quantitative partition
models are examined in terms of the quantitative and digital behavior of the
resultant set of parts. The universal feature found across all partition models
is having many small parts but only very few big parts, while Benford's Law is
valid only in some particular partition cases and under certain constraints.
Hence another suggested vista of Benford's Law is viewing it as a particular
subset of the broader positive skewness phenomenon in quantitative
partitioning. Significantly, such a vista is true in all other causes and
explanations of Benford's Law where the small consistently outnumbers the big
also in partial structures of the model or well before full convergence to
Benford is achieved - endowing the principle universality in a sense. In
conclusion, either the active act of partitioning or the passive consideration
of a large quantity as the composition of smaller parts can be considered as
another independent explanation for the widespread empirical observation of
Benford's Law in the physical sciences.
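
The first-digit proportions stated at the start of this abstract can be tabulated directly. The sketch below computes log10(1 + 1/d) for d = 1..9 and compares it with the observed first-digit frequencies of the powers of 2, a textbook sequence chosen here purely for illustration; it is not one of the partition models examined in the article, and the helper names are assumptions.

```python
import math


def benford_probabilities():
    """Expected first-digit proportions under Benford's Law: log10(1 + 1/d)."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(x):
    """Leading significant digit of a nonzero number."""
    # Scientific notation always puts the leading digit first, e.g. '3.14e+02'.
    return int(f"{abs(x):.15e}"[0])


def first_digit_frequencies(values):
    """Observed proportion of each leading digit among the nonzero values."""
    counts = {d: 0 for d in range(1, 10)}
    n = 0
    for v in values:
        if v == 0:
            continue  # zero has no significant digit
        counts[first_digit(v)] += 1
        n += 1
    return {d: c / n for d, c in counts.items()}


expected = benford_probabilities()
observed = first_digit_frequencies(2 ** k for k in range(1, 1001))

for d in range(1, 10):
    print(d, round(expected[d], 4), round(observed[d], 4))
```

The nine expected proportions sum to one, with digit 1 expected in about 30.1% of cases and digit 9 in about 4.6%.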
-
Collective, especially group-based, managerial decision making is crucial in
organizations. Using an evolutionary theoretic approach to collective decision
making, agent-based simulations were conducted to investigate how human
collective decision making would be affected by the agents' diversity in
problem understanding and/or behavior in discussion, as well as by their social
network structure. Simulation results indicated that groups with consistent
problem understanding tended to produce higher utility values of ideas and
displayed better decision convergence, but only if there was no group-level
bias in collective problem understanding. Simulation results also indicated the
importance of balance between selection-oriented (i.e., exploitative) and
variation-oriented (i.e., explorative) behaviors in discussion to achieve
quality final decisions. Expanding the group size and introducing non-trivial
social network structure generally improved the quality of ideas at the cost of
decision convergence. Simulations with different social network topologies
revealed that collective decision making on small-world networks with high local
clustering tended to achieve the highest decision quality more often than on random
or scale-free networks. Implications of this evolutionary theory and simulation
approach for future managerial research on collective, group, and multi-level
decision making are discussed.
-
Modern society depends on the flow of information over online social
networks, and users of popular platforms generate significant behavioral data
about themselves and their social ties. However, it remains unclear what
fundamental limits exist when using these data to predict the activities and
interests of individuals, and to what accuracy such predictions can be made
using an individual's social ties. Here we show that 95% of the potential
predictive accuracy for an individual is achievable using their social ties
only, without requiring that individual's data. We use information theoretic
tools to estimate the predictive information within the writings of Twitter
users, providing an upper bound on the available predictive information that
holds for any predictive or machine learning methods. As few as 8-9 of an
individual's contacts are sufficient to obtain predictability comparable to
that of the individual alone. Distinct temporal and social effects are visible
by measuring information flow along social ties, allowing us to better study
the dynamics of online activity. Our results have distinct privacy
implications: information is so strongly embedded in a social network that in
principle one can profile an individual from their available social ties even
when the individual forgoes the platform completely.
-
We introduce a tensor-based clustering method to extract sparse,
low-dimensional structure from high-dimensional, multi-indexed datasets. This
framework is designed to enable detection of clusters of data in the presence
of structural requirements which we encode as algebraic constraints in a linear
program. Our clustering method is general and can be tailored to a variety of
applications in science and industry. We illustrate our method on a collection
of experiments measuring the response of genetically diverse breast cancer cell
lines to an array of ligands. Each experiment consists of a cell line-ligand
combination, and contains time-course measurements of the early-signalling
kinases MAPK and AKT at two different ligand dose levels. By imposing
appropriate structural constraints and respecting the multi-indexed structure
of the data, the analysis of clusters can be optimized for biological
interpretation and therapeutic understanding. We then perform a systematic,
large-scale exploration of mechanistic models of MAPK-AKT crosstalk for each
cluster. This analysis allows us to quantify the heterogeneity of breast cancer
cell subtypes, and leads to hypotheses about the signalling mechanisms that
mediate the response of the cell lines to ligands.
-
We sketch the history of spectral ranking, a general umbrella name for
techniques that apply the theory of linear maps (in particular, eigenvalues and
eigenvectors) to matrices that do not represent geometric transformations, but
rather some kind of relationship between entities. Albeit recently made famous
by the ample press coverage of Google's PageRank algorithm, spectral ranking
was devised more than a century ago, and has been studied in tournament
ranking, psychology, social sciences, bibliometrics, economics and choice theory.
We describe the contribution given by previous scholars in precise and modern
mathematical terms: along the way, we show how to express in a general way
damped rankings, such as Katz's index, as dominant eigenvectors of perturbed
matrices, and then use results on the Drazin inverse to go back to the dominant
eigenvectors by a limit process. The result suggests a regularized definition
of spectral ranking that yields for a general matrix a unique vector depending
on a boundary condition.
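
To make the notion of a damped ranking concrete, here is a minimal sketch that computes r = v (I + aA + a^2 A^2 + ...) by fixed-point iteration on r = v + a r A, which converges when the damping factor a is below the reciprocal of the spectral radius of A. The row-vector convention, the inclusion of the k = 0 term, and the name `damped_ranking` are assumptions of this example rather than the formulation used in the paper; Katz's index in its usual form omits the k = 0 term, which only shifts every score by the corresponding entry of v.

```python
def damped_ranking(A, alpha, v=None, tol=1e-12, max_iter=10_000):
    """Damped (Katz-style) ranking via the iteration r <- v + alpha * r A.

    A      : square adjacency matrix as a list of rows; A[i][j] != 0 means
             that i endorses j (row-vector convention, an assumption here).
    alpha  : damping factor, must be below 1 / spectral_radius(A).
    v      : preference (boundary) vector; defaults to all ones.
    """
    n = len(A)
    v = [1.0] * n if v is None else list(v)
    r = v[:]
    for _ in range(max_iter):
        new = [
            v[j] + alpha * sum(r[i] * A[i][j] for i in range(n))
            for j in range(n)
        ]
        if max(abs(a - b) for a, b in zip(new, r)) < tol:
            return new
        r = new
    raise RuntimeError("no convergence; alpha may be too large for this matrix")


# Directed chain 0 -> 1 -> 2: scores accumulate along the chain.
chain = [[0, 1, 0],
         [0, 0, 1],
         [0, 0, 0]]
chain_scores = damped_ranking(chain, alpha=0.5)

# Two nodes endorsing each other: symmetric scores.
pair_scores = damped_ranking([[0, 1], [1, 0]], alpha=0.5)
```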
-
Centrality is widely recognized as one of the most critical measures to
provide insight into the structure and function of complex networks. While
various centrality measures have been proposed for single-layer networks, a
general framework for studying centrality in multilayer networks (i.e.,
multicentrality) is still lacking. In this study, a tensor-based framework is
introduced to study eigenvector multicentrality, which enables the
quantification of the impact of interlayer influence on multicentrality,
providing a systematic way to describe how multicentrality propagates across
different layers. This framework can leverage prior knowledge about the
interplay among layers to better characterize multicentrality for varying
scenarios. Two interesting cases are presented to illustrate how to model
multilayer influence by choosing appropriate functions of interlayer influence
and design algorithms to calculate eigenvector multicentrality. This framework
is applied to analyze several empirical multilayer networks, and the results
corroborate that it can quantify the influence among layers and multicentrality
of nodes effectively.
-
The structure of the International Trade Network (ITN), whose nodes and links
represent world countries and their trade relations respectively, affects key
economic processes worldwide, including globalization, economic integration,
industrial production, and the propagation of shocks and instabilities.
Characterizing the ITN via a simple yet accurate model is an open problem. The
traditional Gravity Model (GM) successfully reproduces the volume of trade
between connected countries, using macroeconomic properties such as GDP,
geographic distance, and possibly other factors. However, it predicts a network
with complete or homogeneous topology, thus failing to reproduce the highly
heterogeneous structure of the ITN. On the other hand, recent maximum-entropy
network models successfully reproduce the complex topology of the ITN, but
provide no information about trade volumes. Here we integrate these two
currently incompatible approaches via the introduction of an Enhanced Gravity
Model (EGM) of trade. The EGM is the simplest model combining the GM with the
network approach within a maximum-entropy framework. Via a unified and
principled mechanism that is transparent enough to be generalized to any
economic network, the EGM provides a new econometric framework wherein trade
probabilities and trade volumes can be separately controlled by any combination
of dyadic and country-specific macroeconomic variables. The model successfully
reproduces both the global topology and the local link weights of the ITN,
parsimoniously reconciling the conflicting approaches. It also indicates that
the probability that any two countries trade a certain volume should follow a
geometric or exponential distribution with an additional point mass at zero
volume.
-
Various models have been recently proposed to reflect and predict different
properties of complex networks. However, the community structure, which is one
of the most important properties, is not well studied and modeled. In this
paper, we suggest a principle called "preferential placement", which allows us to
model a realistic clustering structure. We provide an extensive empirical
analysis of the obtained structure as well as some theoretical results.
-
Recent progress in applying complex network theory to problems in quantum
information has resulted in a beneficial crossover. Complex network methods
have successfully been applied to transport and entanglement models while
information physics is setting the stage for a theory of complex systems with
quantum information-inspired methods. Novel quantum induced effects have been
predicted in random graphs---where edges represent entangled links---and
quantum computer algorithms have been proposed to offer enhancement for several
network problems. Here we review the results at the cutting edge, pinpointing
the similarities and the differences found at the intersection of these two
fields.
-
Circuity, the ratio of network distances to straight-line distances, is an
important measure of urban street network structure and transportation
efficiency. Circuity results from a circulation network's configuration,
planning, and underlying terrain. In turn, it impacts how humans use urban
space for settlement and travel. Although past research has examined overall
street network circuity, researchers have not studied the relative circuity of
walkable versus drivable circulation networks. This study uses OpenStreetMap
data to explore relative network circuity. We download walkable and drivable
networks for 40 US cities using the OSMnx software, which we then use to
simulate four million routes and analyze circuity to characterize network
structure. We find that walking networks tend to allow for more direct routes
than driving networks do in most cities: average driving circuity exceeds
average walking circuity in all but four of the cities that exhibit
statistically significant differences between network types. We discuss various
reasons for this phenomenon, illustrated with case studies. Network circuity
also varies substantially between different types of places. These findings
underscore the value of using network-based distances and times rather than
straight-line distances when studying urban travel and access. They also suggest the
importance of differentiating between walkable and drivable circulation
networks when modeling and characterizing urban street networks: although
different modes' networks overlap in any given city, their relative structure
and performance vary in most cities.
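
The definition of circuity given at the start of this abstract reduces to a simple ratio per route. The sketch below computes it for one hypothetical route, using the haversine great-circle distance as the straight-line distance; the Earth radius, the coordinates and the 1500 m network length are illustrative assumptions, and this is not the OSMnx-based pipeline used in the study.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (assumed constant)


def great_circle_m(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance in meters between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def circuity(network_m, straight_m):
    """Ratio of network distance to straight-line distance for one route."""
    if straight_m <= 0:
        raise ValueError("straight-line distance must be positive")
    return network_m / straight_m


# Hypothetical route: 1500 m along the street network between two points
# that lie 0.01 degrees of latitude apart.
straight = great_circle_m(40.00, -75.0, 40.01, -75.0)
route_circuity = circuity(1500.0, straight)
```

A circuity of 1.0 would mean the network route is as short as the straight line; larger values indicate more roundabout routes.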
-
We develop an algorithm that forecasts cascading events, by employing a
Green's function scheme on the basis of the self-exciting point process model.
This method is applied to open data on 10 types of crimes that occurred in Chicago.
Its prediction accuracy is superior or comparable to that of the standard
methods, namely the expectation-maximization method and the prospective hotspot
maps method. We find a cascade influence of the crimes that has a long-time,
logarithmic tail; this result is consistent with an earlier study on
burglaries. This long-tail feature cannot be reproduced by the other standard
methods. In addition, a merit of the Green's function method is its low
computational cost when the density of events is high and/or the amount of
training data is large.
-
Positioning data offer a remarkable source of information to analyze
urban crowd dynamics. However, discovering urban activity patterns from the emergent
behavior of crowds involves complex system modeling. An alternative approach is
to adopt computational techniques belonging to the emergent paradigm, which
enables self-organization of data and allows adaptive analysis. Specifically,
our approach is based on stigmergy. By using stigmergy each sample position is
associated with a digital pheromone deposit, which progressively evaporates and
aggregates with other deposits according to their spatiotemporal proximity.
Based on this principle, we exploit positioning data to identify high density
areas (hotspots) and characterize their activity over time. This
characterization allows the comparison of dynamics occurring on different days,
providing a similarity measure exploitable by clustering techniques. Thus, we
cluster days according to their activity behavior, discovering unexpected urban
activity patterns. As a case study, we analyze taxi traces in New York City
during 2015.
-
Accounting for undecided and uncertain voters is a challenging issue for
predicting election results from public opinion polls. Undecided voters typify
the uncertainty of swing voters in polls but are often ignored or allocated to
each candidate in a simple, deterministic manner. Historically this may have
been adequate because the number of undecided voters was small enough to assume
that they did not affect the relative proportions of the decided voters.
However, in the presence of high numbers of undecided voters, these static
rules may in fact bias election predictions from election poll authors and
meta-poll analysts. In this paper, we examine the effect of undecided voters in
the 2016 US presidential election compared with the previous three presidential elections.
We show that there was a relatively high number of undecided voters over the
campaign and on election day, and that the allocation of undecided voters in
this election was not consistent with two-party proportional (or even)
allocations. We find evidence that static allocation regimes are inadequate for
election prediction models and that probabilistic allocations may be superior.
We also estimate the bias attributable to polling agencies, often referred to
as "house effects".
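
The two static allocation rules mentioned above, proportional and even, amount to simple arithmetic. The following sketch applies both to a hypothetical two-candidate poll; the poll figures and function names are invented for illustration and are not data from the paper.

```python
def allocate_proportional(decided, undecided):
    """Split the undecided share in proportion to each candidate's decided share."""
    total = sum(decided.values())
    return {c: s + undecided * s / total for c, s in decided.items()}


def allocate_even(decided, undecided):
    """Split the undecided share equally among the candidates."""
    share = undecided / len(decided)
    return {c: s + share for c, s in decided.items()}


# Hypothetical poll (percent of respondents), not taken from the paper.
poll = {"A": 44.0, "B": 40.0}
undecided = 16.0

proportional = allocate_proportional(poll, undecided)
even = allocate_even(poll, undecided)
```

Both rules are deterministic: they return a single adjusted share per candidate and carry no uncertainty about how undecided voters actually break, which is the limitation the abstract points to.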
-
Social media are massive marketplaces where ideas and news compete for our
attention. Previous studies have shown that quality is not a necessary
condition for online virality and that knowledge about peer choices can distort
the relationship between quality and popularity. However, these results do not
explain the viral spread of low-quality information, such as the digital
misinformation that threatens our democracy. We investigate quality
discrimination in a stylized model of an online social network, where individual
agents prefer quality information, but have behavioral limitations in managing
a heavy flow of information. We measure the relationship between the quality of
an idea and its likelihood to become prevalent at the system level. We find
that both information overload and limited attention contribute to a
degradation in the market's discriminative power. A good tradeoff between
discriminative power and diversity of information is possible according to the
model. However, calibration with empirical data characterizing information load
and finite attention in real social media reveals a weak correlation between
quality and popularity of information. In these realistic conditions, the model
predicts that high-quality information has little advantage over low-quality
information.
-
Segregation is the separation of social groups in the physical or in the
online world. Segregation discovery consists of finding contexts of
segregation. In the modern digital society, discovering segregation is
challenging, due to the large amount and the variety of social data. We present
a tool in support of segregation discovery from relational and graph data. The
SCube system builds on attributed graph clustering and frequent itemset mining.
It offers to the analyst a multi-dimensional segregation data cube for
exploratory data analysis. The demonstration first guides the audience through
the relevant social science concepts. Then, it focuses on scenarios around case
studies of gender occupational segregation. Two real and large datasets about
the boards of directors of Italian and Estonian companies will be explored in
search of segregation contexts. The architecture of the SCube system and its
computational efficiency challenges and solutions are discussed.
-
Percolation transition is widely observed in networks ranging from biology to
engineering. While much attention has been paid to network topologies, studies
rarely focus on critical percolation phenomena driven by network dynamics.
Using extensive real data, we study the critical percolation properties in city
traffic dynamics. Our results suggest that two modes of different critical
percolation behaviors are switching in the same network topology under
different traffic dynamics. One mode of city traffic (during nonrush hours or
days off) has critical percolation characteristics similar to those of small-world
networks, while the other mode (during rush hours on working days) tends to
behave as a 2D lattice. This switching behavior can be understood by the fact
that the high-speed urban roads during nonrush hours or days off (that are
congested during rush hours) represent effective long-range connections, like
in small world networks. Our results might be useful for understanding and
improving traffic resilience.
-
Understanding the mechanisms responsible for the emergence and evolution of
oscillations in traffic flow has been the subject of intensive research by the
traffic flow theory community. In our previous work, we proposed a new
mechanism to explain the generation of traffic oscillations: traffic
instability caused by the competition between speed adaptation and the
cumulative effect of stochastic factors. In this paper, by conducting a closer
examination of car-following data obtained in a 25-car platoon experiment, we
find that the speed difference plays a more important role in
car-following dynamics than the spacing, and that when its amplitude is small, the
growth of oscillations is mainly determined by the stochastic factors that
follow the mean reversion process; when its amplitude increases, the growth of
the oscillations is determined by the competition between the stochastic
factors and the speed difference. An explanation is then provided, based on the
above findings, of why the speed variance in oscillatory traffic grows in a
concave way along the platoon. Finally, we propose a mode-switching stochastic
car-following model that incorporates the speed adaptation and spacing
indifference behaviors of drivers, which captures the observed characteristics
of oscillation and discharge rate. Sensitivity analysis shows that the reaction
delay has only a slight effect, whereas the indifference region boundary has a
significant effect, on oscillation growth rate and discharge rate.