
For many years, Herlihy's elegant computability based Consensus Hierarchy has
been our best explanation of the relative power of various types of
multiprocessor synchronization objects when used in deterministic algorithms.
However, key to this hierarchy is treating synchronization instructions as
distinct objects, an approach that is far from the realworld, where
multiprocessor programs apply synchronization instructions to collections of
arbitrary memory locations. We were surprised to realize that, when considering
instructions applied to memory locations, the computability based hierarchy
collapses. This leaves open the question of how to better capture the power of
various synchronization instructions.
In this paper, we provide an approach to answering this question. We present
a hierarchy of synchronization instructions, classified by their space
complexity in solving obstructionfree consensus. Our hierarchy provides a
classification of combinations of known instructions that seems to fit with our
intuition of how useful some are in practice, while questioning the
effectiveness of others. We prove an essentially tight characterization of the
power of buffered read and write instructions.Interestingly, we show a similar
result for multilocation atomic assignments.

Determining the space complexity of $x$obstructionfree $k$set agreement
for $x\leq k$ is an open problem. In $x$obstructionfree protocols, processes
are required to return in executions where at most $x$ processes take steps.
The best known upper bound on the number of registers needed to solve this
problem among $n>k$ processes is $nk+x$ registers. No general lower bound
better than $2$ was known.
We prove that any $x$obstructionfree protocol solving $k$set agreement
among $n>k$ processes uses at least $\lfloor(nx)/(k+1x)\rfloor+1$ registers.
Our main tool is a simulation that serves as a reduction from the impossibility
of deterministic waitfree $k$set agreement: if a protocol uses fewer
registers, then it is possible for $k+1$ processes to simulate the protocol and
deterministically solve $k$set agreement in a waitfree manner, which is
impossible. A critical component of the simulation is the ability of simulating
processes to revise the past of simulated processes. We introduce a new
augmented snapshot object, which facilitates this.
We also prove that any space lower bound on the number of registers used by
obstructionfree protocols applies to protocols that satisfy nondeterministic
solo termination. Hence, our lower bound of $\lfloor(n1)/k\rfloor+1$ for the
obstructionfree ($x=1$) case also holds for randomized waitfree free
protocols. In particular, this gives a tight lower bound of exactly $n$
registers for solving obstructionfree and randomized waitfree consensus.
Finally, our new techniques can be applied to get a space lower of $\lfloor
n/2\rfloor+1$ for $\epsilon$approximate agreement, for sufficiently small
$\epsilon$. It requires participating processes to return values within
$\epsilon$ of each other. The best known upper bounds are
$\lceil\log(1/\epsilon)\rceil$ and $n$, while no general lower bounds were
known.

Population protocols are a model of distributed computing, in which $n$
agents with limited local state interact randomly, and cooperate to
collectively compute global predicates. An extensive series of papers, across
different communities, has examined the computability and complexity
characteristics of this model. Majority, or consensus, is a central task, in
which agents need to collectively reach a decision as to which one of two
states $A$ or $B$ had a higher initial count. Two complexity metrics are
important: the time that a protocol requires to stabilize to an output
decision, and the state space size that each agent requires.
It is known that majority requires $\Omega(\log \log n)$ states per agent to
allow for polylogarithmic time stabilization, and that $O(\log^2 n)$ states
are sufficient. Thus, there is an exponential gap between the upper and lower
bounds.
We address this question. We provide a new lower bound of $\Omega(\log n)$
states for any protocol which stabilizes in $O( n^{1c} )$ time, for any $c >
0$ constant. This result is conditional on basic monotonicity and output
assumptions, satisfied by all known protocols. Technically, it represents a
significant departure from previous lower bounds. Instead of relying on dense
configurations, we introduce a new surgery technique to construct executions
which contradict the correctness of algorithms that stabilize too fast.
Subsequently, our lower bound applies to general initial configurations.
We give an algorithm for majority which uses $O(\log n)$ states, and
stabilizes in $O(\log^2 n)$ time. Central to the algorithm is a new leaderless
phase clock, which allows nodes to synchronize in phases of $\Theta(n \log{n})$
consecutive interactions using $O(\log n)$ states per node. We also employ our
phase clock to build a leader election algorithm with $O(\log n )$ states,
which stabilizes in $O(\log^2 n)$ time.

Contrary to common belief, a recent work by Ellen, Gelashvili, Shavit, and
Zhu has shown that computability does not require multicore architectures to
support "strong" synchronization instructions like compareandswap, as opposed
to combinations of "weaker" instructions like decrement and multiply. However,
this is the status quo, and in turn, most efficient concurrent datastructures
heavily rely on compareandswap (e.g. for swinging pointers and in general,
conflict resolution).
We show that this need not be the case, by designing and implementing a
concurrent linearizable Log datastructure (also known as a History object),
supporting two operations: append(item), which appends the item to the log, and
getlog(), which returns the appended items so far, in order. Readers are
waitfree and writers are lockfree, and this datastructure can be used in a
lockfree universal construction to implement any concurrent object with a
given sequential specification. Our implementation uses atomic read, xor,
decrement, and fetchandincrement instructions supported on X86 architectures,
and provides similar performance to a compareandswapbased solution on
today's hardware. This raises a fundamental question about minimal set of
synchronization instructions that the architectures have to support.

Population protocols are a popular model of distributed computing, in which
randomlyinteracting agents with little computational power cooperate to
jointly perform computational tasks. Inspired by developments in molecular
computation, and in particular DNA computing, recent algorithmic work has
focused on the complexity of solving simple yet fundamental tasks in the
population model, such as leader election (which requires stabilization to a
single agent in a special "leader" state), and majority (in which agents must
stabilize to a decision as to which of two possible initial states had higher
initial count). Known results point towards an inherent tradeoff between the
time complexity of such algorithms, and the space complexity, i.e. size of the
memory available to each agent.
In this paper, we explore this tradeoff and provide new upper and lower
bounds for majority and leader election. First, we prove a unified lower bound,
which relates the space available per node with the time complexity achievable
by a protocol: for instance, our result implies that any protocol solving
either of these tasks for $n$ agents using $O( \log \log n )$ states must take
$\Omega( n / \rm{polylog} n )$ expected time. This is the first result to
characterize time complexity for protocols which employ superconstant number
of states per node, and proves that fast, polylogarithmic running times
require protocols to have relatively large space costs.
On the positive side, we give algorithms showing that fast, polylogarithmic
stabilization time can be achieved using $O( \log^2 n )$ space per node, in the
case of both tasks. Overall, our results highlight a time complexity separation
between $O(\log \log n)$ and $\Theta( \log^2 n )$ state space size for both
majority and leader election in population protocols, and introduce new
techniques, which should be applicable more broadly.

Population protocols are networks of finitestate agents, interacting
randomly, and updating their states using simple rules. Despite their extreme
simplicity, these systems have been shown to cooperatively perform complex
computational tasks, such as simulating register machines to compute standard
arithmetic functions. The election of a unique leader agent is a key
requirement in such computational constructions. Yet, the fastest currently
known population protocol for electing a leader only has linear stabilization
time, and, it has recently been shown that no population protocol using a
constant number of states per node may overcome this linear bound.
In this paper, we give the first population protocol for leader election with
polylogarithmic stabilization time, using polylogarithmic memory states per
node. The protocol structure is quite simple: each node has an associated
value, and is either a leader (still in contention) or a minion (following some
leader). A leader keeps incrementing its value and "defeats" other leaders in
onetoone interactions, and will drop from contention and become a minion if
it meets a leader with higher value. Importantly, a leader also drops out if it
meets a minion with higher absolute value. While these rules are quite simple,
the proof that this algorithm achieves polylogarithmic stabilization time is
nontrivial. In particular, the argument combines careful use of concentration
inequalities with anticoncentration bounds, showing that the leaders' values
become spread apart as the execution progresses, which in turn implies that
straggling leaders get quickly eliminated. We complement our analysis with
empirical results, showing that our protocol stabilizes extremely fast, even
for large network sizes.

The optimal space complexity of consensus in shared memory is a decadesold
open problem. For a system of $n$ processes, no algorithm is known that uses a
sublinear number of registers. However, the best known lower bound due to Fich,
Herlihy, and Shavit requires $\Omega(\sqrt{n})$ registers.
The special symmetric case of the problem where processes are anonymous (run
the same algorithm) has also attracted attention. Even in this case, the best
lower and upper bounds are still $\Omega(\sqrt{n})$ and $O(n)$. Moreover, Fich,
Herlihy, and Shavit first proved their lower bound for anonymous processes, and
then extended it to the general case. As such, resolving the anonymous case
might be a significant step towards understanding and solving the general
problem.
In this work, we show that in a system of anonymous processes, any consensus
algorithm satisfying nondeterministic solo termination has to use $\Omega(n)$
readwrite registers in some execution. This implies an $\Omega(n)$ lower bound
on the space complexity of deterministic obstructionfree and randomized
waitfree consensus, matching the upper bound and closing the symmetric case of
the open problem.

The Restricted Isometry Property (RIP) is a fundamental property of a matrix
which enables sparse recovery. Informally, an $m \times n$ matrix satisfies RIP
of order $k$ for the $\ell_p$ norm, if $\Ax\_p \approx \x\_p$ for every
vector $x$ with at most $k$ nonzero coordinates.
For every $1 \leq p < \infty$ we obtain almost tight bounds on the minimum
number of rows $m$ necessary for the RIP property to hold. Prior to this work,
only the cases $p = 1$, $1 + 1 / \log k$, and $2$ were studied. Interestingly,
our results show that the case $p = 2$ is a "singularity" point: the optimal
number of rows $m$ is $\widetilde{\Theta}(k^{p})$ for all $p\in
[1,\infty)\setminus \{2\}$, as opposed to $\widetilde{\Theta}(k)$ for $k=2$.
We also obtain almost tight bounds for the column sparsity of RIP matrices
and discuss implications of our results for the Stable Sparse Recovery problem.

The problem of electing a leader from among $n$ contenders is one of the
fundamental questions in distributed computing. In its simplest formulation,
the task is as follows: given $n$ processors, all participants must eventually
return a win or lose indication, such that a single contender may win. Despite
a considerable amount of work on leader election, the following question is
still open: can we elect a leader in an asynchronous faultprone system faster
than just running a $\Theta(\log n)$time tournament, against a strong adaptive
adversary?
In this paper, we answer this question in the affirmative, improving on a
decadesold upper bound. We introduce two new algorithmic ideas to reduce the
time complexity of electing a leader to $O(\log^* n)$, using $O(n^2)$
pointtopoint messages. A nontrivial application of our algorithm is a new
upper bound for the tight renaming problem, assigning $n$ items to the $n$
participants in expected $O(\log^2 n)$ time and $O(n^2)$ messages. We
complement our results with lower bound of $\Omega(n^2)$ messages for solving
these two problems, closing the question of their message complexity.

JohnsonLindenstrauss (JL) matrices implemented by sparse random synaptic
connections are thought to be a prime candidate for how convergent pathways in
the brain compress information. However, to date, there is no complete
mathematical support for such implementations given the constraints of real
neural tissue. The fact that neurons are either excitatory or inhibitory
implies that every so implementable JL matrix must be signconsistent (i.e.,
all entries in a single column must be either all nonnegative or all
nonpositive), and the fact that any given neuron connects to a relatively
small subset of other neurons implies that the JL matrix had better be sparse.
We construct sparse JL matrices that are signconsistent, and prove that our
construction is essentially optimal. Our work answers a mathematical question
that was triggered by earlier work and is necessary to justify the existence of
JL compression in the brain, and emphasizes that inhibition is crucial if
neurons are to perform efficient, correlationpreserving compression.

All consensus hierarchies in the literature assume that we have, in addition
to copies of a given object, an unbounded number of registers. But why do we
really need these registers?
This paper considers what would happen if one attempts to solve consensus
using various objects but without any registers. We show that under a
reasonable assumption, objects like queues and stacks cannot emulate the
missing registers. We also show that, perhaps surprisingly, initialization,
shown to have no computational consequences when registers are readily
available, is crucial in determining the synchronization power of objects when
no registers are allowed. Finally, we show that without registers, the number
of available objects affects the level of consensus that can be solved.
Our work thus raises the question of whether consensus hierarchies which
assume an unbounded number of registers truly capture synchronization power,
and begins a line of research aimed at better understanding the interaction
between readwrite memory and the powerful synchronization operations available
on modern architectures.