
We show that a broad class of $(+,\diamond)$ vector products (for binary
integer functions $\diamond$) are equivalent under onetopolylog reductions to
the computation of the Hamming distance. Examples include: the dominance
product, the threshold product and $\ell_{2p+1}$ distances for constant $p$.
Our results imply equivalence (up to polylog factors) between complexity of
computation of All Pairs: Hamming Distances, $\ell_{2p+1}$ Distances, Dominance
Products and Threshold Products. As a consequence, Yuster's~(SODA'09) algorithm
improves not only Matou\v{s}ek's (IPL'91), but also the results of Indyk,
Lewenstein, Lipsky and Porat (ICALP'04) and Min, Kao and Zhu (COCOON'09).
Furthermore, our reductions apply to the pattern matching setting, showing
equivalence (up to polylog factors) between pattern matching under Hamming
Distance, $\ell_{2p+1}$ Distance, Dominance Product and Threshold Product, with
current best upperbounds due to results of Abrahamson (SICOMP'87), Amir and
Farach (Ann.~Math.~Artif.~Intell.'91), Atallah and Duket (IPL'11), Clifford,
Clifford and Iliopoulous (CPM'05) and Amir, Lipsky, Porat and Umanski (CPM'05).
The resulting algorithms for $\ell_{2p+1}$ Pattern Matching and All Pairs
$\ell_{2p+1}$, for $2p+1 = 3,5,7,\dots$ are new.
Additionally, we show that the complexity of AllPairsHammingDistances (and
thus of other aforementioned AllPairs problems) is within polylog from the
time it takes to multiply matrices $n \times (n\cdot d)$ and $(n\cdot d) \times
n$, each with $(n\cdot d)$ nonzero entries. This means that the current
upperbounds by Yuster (SODA'09) cannot be improved without improving the sparse
matrix multiplication algorithm by Yuster and Zwick~(ACM TALG'05) and vice
versa.

The problem of finding \emph{distance} between \emph{pattern} of length $m$
and \emph{text} of length $n$ is a typical way of generalizing pattern matching
to incorporate dissimilarity score. For both Hamming and $L_1$ distances only a
super linear upper bound $\widetilde{O}(n\sqrt{m})$ are known, which prompts
the question of relaxing the problem: either by asking for $(1 \pm
\varepsilon)$ approximate distance (every distance is reported up to a
multiplicative factor), or $k$approximated distance (distances exceeding $k$
are reported as $\infty$). We focus on $L_1$ distance, for which we show new
algorithms achieving complexities respectively $\widetilde{O}(\varepsilon^{1}
n)$ and $\widetilde{O}((m+k\sqrt{m}) \cdot n/m)$. This is a significant
improvement upon previous algorithms with runtime
$\widetilde{O}(\varepsilon^{2} n)$ of Lipsky and Porat [Algorithmica 2011] and
$\widetilde{O}(n\sqrt{k})$ of Amir, Lipsky, Porat and Umanski [CPM 2005].

A population protocol describes a set of state change rules for a population
of $n$ indistinguishable finitestate agents (automata), undergoing random
pairwise interactions. Within this very basic framework, it is possible to
resolve a number of fundamental tasks in distributed computing, including:
leader election, aggregate and threshold functions on the population, such as
majority computation, and plurality consensus. For the first time, we show that
solutions to all of these problems can be obtained \emph{quickly} using
finitestate protocols. For any input, the designed finitestate protocols
converge under a fair random scheduler to an output which is correct with high
probability in expected $O(\mathrm{poly} \log n)$ parallel time. In the same
setting, we also show protocols which always reach a valid solution, in
expected parallel time $O(n^\varepsilon)$, where the number of states of the
interacting automata depends only on the choice of $\varepsilon>0$. The stated
time bounds hold for \emph{any} semilinear predicate computable in the
population protocol framework.
The key ingredient of our result is the decentralized design of a hierarchy
of phaseclocks, which tick at different rates, with the rates of adjacent
clocks separated by a factor of $\Theta(\log n)$. The construction of this
clock hierarchy relies on a new protocol composition technique, combined with
an adapted analysis of a selforganizing process of oscillatory dynamics. This
clock hierarchy is used to provide nested synchronization primitives, which
allow us to view the population in a global manner and design protocols using a
highlevel imperative programming language with a (limited) capacity for loops
and branching instructions.

We consider two types of searching models, where the goal is to design an
adaptive algorithm that locates an unknown vertex in a graph by repeatedly
performing queries. In the vertexquery model, each query points to a vertex
$v$ and the response either admits that $v$ is the target or provides a
neighbor of $v$ on a shortest path from $v$ to the target. This model has been
introduced for trees by Onak and Parys [FOCS 2006] and by EmamjomehZadeh et
al. [STOC 2016] for arbitrary graphs. In the edgequery model, each query
chooses an edge and the response reveals which endpoint of the edge is closer
to the target, breaking ties arbitrarily.
Our goal is to analyze solutions to these problems assuming that some
responses may be erroneous. We develop a scheme for tackling such noisy models
with the following line of arguments: For each of the two models, we analyze a
generic strategy that assumes a fixed number of lies and give a precise bound
for its length via an amortized analysis. From this, we derive bounds for both
a linearly bounded error rate, where the number of errors in $T$ queries is
bounded by $rT$ for some $r<1/2$, and a probabilistic model in which each
response is incorrect with some probability $p<1/2$. The bounds for adversarial
case turn out to be strong enough for nonadversarial scenarios as well.
We obtain thus a much simpler strategy performing fewer vertexqueries than
one by EmamjomehZadeh et al. [STOC 2016]. For edgequeries, not studied before
for general graphs, we obtain bounds that are tight up to $\log \Delta$ factors
in all error models. Applying our graphtheoretic results to the setting of
edgequeries for paths, we obtain a number of improvements over existing bounds
for searching in a sorted array in the presence of errors, including an
exponential improvement for the prefixbounded model in unbounded domains.

The model of population protocols refers to a large collection of simple
indistinguishable entities, frequently called agents. The agents communicate
and perform computation through pairwise interactions. We study fast and space
efficient leader election in population of cardinality $n$ governed by a random
scheduler, where during each time step the scheduler uniformly at random
selects for interaction exactly one pair of agents.
We propose the first $o(\log^2 n)$time leader election protocol. Our
solution operates in expected parallel time $O(\log n\log\log n)$ which is
equivalent to $O(n \log n\log\log n)$ pairwise interactions. This is the
fastest currently known leader election algorithm in which each agent utilises
asymptotically optimal number of $O(\log\log n)$ states. The new protocol
incorporates and amalgamates successfully the power of assorted synthetic coins
with variable rate phase clocks.

Depth first search is a natural algorithmic technique for constructing a
closed route that visits all vertices of a graph. The length of such route
equals, in an edgeweighted tree, twice the total weight of all edges of the
tree and this is asymptotically optimal over all exploration strategies. This
paper considers a variant of such search strategies where the length of each
route is bounded by a positive integer $B$ (e.g. due to limited energy
resources of the searcher). The objective is to cover all the edges of a tree
$T$ using the minimum number of routes, each starting and ending at the root
and each being of length at most $B$. To this end, we analyze the following
natural greedy tree traversal process that is based on decomposing a depth
first search traversal into a sequence of limited length routes. Given any
arbitrary depth first search traversal $R$ of the tree $T$, we cover $R$ with
routes $R_1,\ldots,R_l$, each of length at most $B$ such that: $R_i$ starts at
the root, reaches directly the farthest point of $R$ visited by $R_{i1}$, then
$R_i$ continues along the path $R$ as far as possible, and finally $R_i$
returns to the root. We call the above algorithm \emph{piecemealDFS} and we
prove that it achieves the asymptotically minimal number of routes $l$,
regardless of the choice of $R$. Our analysis also shows that the total length
of the traversal (and thus the traversal time) of piecemealDFS is
asymptotically minimum over all energyconstrained exploration strategies. The
fact that $R$ can be chosen arbitrarily means that the exploration strategy can
be constructed in an online fashion when the input tree $T$ is not known in
advance. Surprisingly, our results show that depth first search is efficient
for energy constrained exploration of trees, even though it is known that the
same does not hold for energy constrained exploration of arbitrary graphs.

We give a new, simple distributed algorithm for graph colouring in paths and
cycles. Our algorithm is fast and selfcontained, it does not need any globally
consistent orientation, and it reduces the number of colours from $10^{100}$ to
$3$ in three iterations.

In the $2$reachability problem we are given a directed graph $G$ and we wish
to determine if there are two (edge or vertex) disjoint paths from $u$ to $v$,
for a given pair of vertices $u$ and $v$. In this paper, we present an
algorithm that computes $2$reachability information for all pairs of vertices
in $\mathcal{O}(n^{\omega}\log n)$ time, where $n$ is the number of vertices
and $\omega$ is the matrix multiplication exponent. Hence, we show that the
running time of allpairs $2$reachability is only within a $\log$ factor of
transitive closure.
Moreover, our algorithm produces a witness (i.e., a separating edge or a
separating vertex) for all pair of vertices where $2$reachability does not
hold. By processing these witnesses, we can compute all the edge and
vertexdominator trees of $G$ in $\mathcal{O}(n^2)$ additional time, which in
turn enables us to answer various connectivity queries in $\mathcal{O}(1)$
time. For instance, we can test in constant time if there is a path from $u$ to
$v$ avoiding an edge $e$, for any pair of query vertices $u$ and $v$, and any
query edge $e$, or if there is a path from $u$ to $v$ avoiding a vertex $w$,
for any query vertices $u$, $v$, and $w$.

We consider a dynamical process in a network which distributes all tokens
located at a node among its neighbors, in a roundrobin manner.
We show that in the recurrent state of this dynamics (i.e., disregarding a
polynomially long initialization phase of the system), the number of particles
located on a given edge, averaged over an interval of time, is tightly
concentrated around the average particle density in the system. Formally, for a
system of $k$ particles in a graph of $m$ edges, during any interval of length
$T$, this timeaveraged value is $k/m \pm \widetilde{O}(1/T)$, whenever
$\gcd(m,k) = \widetilde{O}(1)$ (and so, e.g., whenever $m$ is a prime number).
To achieve these bounds, we link the behavior of the studied dynamics to
ergodic properties of traversals based on Eulerian circuits on a symmetric
directed graph. These results are proved through sum set methods and are likely
to be of independent interest.
As a corollary, we also obtain bounds on the idleness of the studied
dynamics, i.e., on the longest possible time between two consecutive
appearances of a token on an edge, taken over all edges. Designing trajectories
for $k$ tokens in a way which minimizes idleness is fundamental to the study of
the patrolling problem in networks. Our results immediately imply a bound of
$\widetilde{O}(m/k)$ on the idleness of the studied process, showing that it is
a distributed $\widetilde{O}(1)$competitive solution to the patrolling task,
for all of the covered cases. Our work also provides some further insights may
be interesting in loadbalancing applications.

In contrast to electronic computation, chemical computation is noisy and
susceptible to a variety of sources of error, which has prevented the
construction of robust complex systems. To be effective, chemical algorithms
must be designed with an appropriate error model in mind. Here we consider the
model of chemical reaction networks that preserve molecular count (population
protocols), and ask whether computation can be made robust to a natural model
of unintended "leak" reactions. Our definition of leak is motivated by both the
particular spurious behavior seen when implementing chemical reaction networks
with DNA strand displacement cascades, as well as the unavoidable side
reactions in any implementation due to the basic laws of chemistry. We develop
a new "Robust Detection" algorithm for the problem of fast (logarithmic time)
single molecule detection, and prove that it is robust to this general model of
leaks. Besides potential applications in single molecule detection, the
errorcorrection ideas developed here might enable a new class of
robustbydesign chemical algorithms. Our analysis is based on a nonstandard
hybrid argument, combining ideas from discrete analysis of population protocols
with classic Markov chain techniques.

LCLs or locally checkable labelling problems (e.g. maximal independent set,
maximal matching, and vertex colouring) in the LOCAL model of computation are
very wellunderstood in cycles (toroidal 1dimensional grids): every problem
has a complexity of $O(1)$, $\Theta(\log^* n)$, or $\Theta(n)$, and the design
of optimal algorithms can be fully automated.
This work develops the complexity theory of LCL problems for toroidal
2dimensional grids. The complexity classes are the same as in the
1dimensional case: $O(1)$, $\Theta(\log^* n)$, and $\Theta(n)$. However, given
an LCL problem it is undecidable whether its complexity is $\Theta(\log^* n)$
or $\Theta(n)$ in 2dimensional grids.
Nevertheless, if we correctly guess that the complexity of a problem is
$\Theta(\log^* n)$, we can completely automate the design of optimal
algorithms. For any problem we can find an algorithm that is of a normal form
$A' \circ S_k$, where $A'$ is a finite function, $S_k$ is an algorithm for
finding a maximal independent set in $k$th power of the grid, and $k$ is a
constant.
Finally, partially with the help of automated design tools, we classify the
complexity of several concrete LCL problems related to colourings and
orientations.

Given a pattern of length $m$ and a text of length $n$, the goal in
$k$mismatch pattern matching is to compute, for every $m$substring of the
text, the exact Hamming distance to the pattern or report that it exceeds $k$.
This can be solved in either $\widetilde{O}(n \sqrt{k})$ time as shown by Amir
et al. [J. Algorithms 2004] or $\widetilde{O}((m + k^2) \cdot n/m)$ time due to
a result of Clifford et al. [SODA 2016]. We provide a smooth time tradeoff
between these two bounds by designing an algorithm working in time
$\widetilde{O}( (m + k \sqrt{m}) \cdot n/m)$. We complement this with a
matching conditional lower bound, showing that a significantly faster
combinatorial algorithm is not possible, unless the combinatorial matrix
multiplication conjecture fails.

A distance labeling scheme is an assignments of labels, that is binary
strings, to all nodes of a graph, so that the distance between any two nodes
can be computed from their labels and the labels are as short as possible. A
major open problem is to determine the complexity of distance labeling in
unweighted and undirected planar graphs. It is known that, in such a graph on
$n$ nodes, some labels must consist of $\Omega(n^{1/3})$ bits, but the best
known labeling scheme uses labels of length $O(\sqrt{n}\log n)$ [Gavoille,
Peleg, P\'erennes, and Raz, J. Algorithms, 2004]. We show that, in fact, labels
of length $O(\sqrt{n})$ are enough.

We consider computing a longest palindrome in the streaming model, where the
symbols arrive onebyone and we do not have random access to the input. While
computing the answer exactly using sublinear space is not possible in such a
setting, one can still hope for a good approximation guarantee. Our
contribution is twofold. First, we provide lower bounds on the space
requirements for randomized approximation algorithms processing inputs of
length $n$. We rule out Las Vegas algorithms, as they cannot achieve sublinear
space complexity. For Monte Carlo algorithms, we prove a lower bounds of
$\Omega( M \log\min\{\Sigma,M\})$ bits of memory; here $M=n/E$ for
approximating the answer with additive error $E$, and $M= \frac{\log n}{\log
(1+\varepsilon)}$ for approximating the answer with multiplicative error $(1 +
\varepsilon)$. Second, we design three realtime algorithms for this problem.
Our Monte Carlo approximation algorithms for both additive and multiplicative
versions of the problem use $O(M)$ words of memory. Thus the obtained lower
bounds are asymptotically tight up to a logarithmic factor. The third algorithm
is deterministic and finds a longest palindrome exactly if it is short. This
algorithm can be run in parallel with a Monte Carlo algorithm to obtain better
results in practice. Overall, both the time and space complexity of finding a
longest palindrome in a stream are essentially settled.

A distance labeling scheme is an assignment of bitlabels to the vertices of
an undirected, unweighted graph such that the distance between any pair of
vertices can be decoded solely from their labels. We propose a series of new
labeling schemes within the framework of socalled hub labeling (HL, also known
as landmark labeling or 2hopcover labeling), in which each node $u$ stores
its distance to all nodes from an appropriately chosen set of hubs $S(u)
\subseteq V$. For a queried pair of nodes $(u,v)$, the length of a shortest
$uv$path passing through a hub node from $S(u)\cap S(v)$ is then used as an
upper bound on the distance between $u$ and $v$.
We present a hub labeling which allows us to decode exact distances in sparse
graphs using labels of size sublinear in the number of nodes. For graphs with
at most $n$ nodes and average degree $\Delta$, the tradeoff between label bit
size $L$ and query decoding time $T$ for our approach is given by $L = O(n \log
\log_\Delta T / \log_\Delta T)$, for any $T \leq n$. Our simple approach is
thus the first sublinearspace distance labeling for sparse graphs that
simultaneously admits small decoding time (for constant $\Delta$, we can
achieve any $T=\omega(1)$ while maintaining $L=o(n)$), and it also provides an
improvement in terms of label size with respect to previous slower approaches.
By using similar techniques, we then present a $2$additive labeling scheme
for general graphs, i.e., one in which the decoder provides a
2additiveapproximation of the distance between any pair of nodes. We achieve
almost the same label sizetime tradeoff $L = O(n \log^2 \log T / \log T)$, for
any $T \leq n$. To our knowledge, this is the first additive scheme with
constant absolute error to use labels of sublinear size. The corresponding
decoding time is then small (any $T=\omega(1)$ is sufficient).

Given $n$ colored balls, we want to detect if more than $\lfloor n/2\rfloor$
of them have the same color, and if so find one ball with such majority color.
We are only allowed to choose two balls and compare their colors, and the goal
is to minimize the total number of such operations. A wellknown exercise is to
show how to find such a ball with only $2n$ comparisons while using only a
logarithmic number of bits for bookkeeping. The resulting algorithm is called
the BoyerMoore majority vote algorithm. It is known that any deterministic
method needs $\lceil 3n/2\rceil2$ comparisons in the worst case, and this is
tight. However, it is not clear what is the required number of comparisons if
we allow randomization. We construct a randomized algorithm which always
correctly finds a ball of the majority color (or detects that there is none)
using, with high probability, only $7n/6+o(n)$ comparisons. We also prove that
the expected number of comparisons used by any such randomized method is at
least $1.019n$.

We consider computing the longest palindrome in a text of length $n$ in the
streaming model, where the characters arrive onebyone, and we do not have
random access to the input. While computing the answer exactly using sublinear
memory is not possible in such a setting, one can still hope for a good
approximation guarantee.
We focus on the two most natural variants, where we aim for either additive
or multiplicative approximation of the length of the longest palindrome. We
first show that there is no point in considering Las Vegas algorithms in such a
setting, as they cannot achieve sublinear space complexity. For Monte Carlo
algorithms, we provide a lowerbound of $\Omega(\frac{n}{E})$ bits for
approximating the answer with additive error $E$, and $\Omega(\frac{\log
n}{\log(1+\varepsilon)})$ bits for approximating the answer with multiplicative
error $(1+\varepsilon)$ for the binary alphabet. Then, we construct a generic
Monte Carlo algorithm, which by choosing the parameters appropriately achieves
space complexity matching up to a logarithmic factor for both variants. This
substantially improves the previous results by Berenbrink et al. (STACS 2014)
and essentially settles the space complexity.

We consider the problem of exploration of an anonymous, portlabeled,
undirected graph with $n$ nodes and $m$ edges and diameter $D$, by a single
mobile agent. Initially the agent does not know the graph topology nor any of
the global parameters. Moreover, the agent does not know the incoming port when
entering to a vertex. Each vertex is endowed with memory that can be read and
modified by the agent upon its visit to that node. However the agent has no
operational memory i.e., it cannot carry any state while traversing an edge. In
such a model at least $\log_2 d$ bits are needed at each vertex of degree $d$
for the agent to be able to traverse each graph edge. This number of bits is
always sufficient to explore any graph in time $O(mD)$ using algorithm
RotorRouter. We show that even if the available node memory is unlimited then
time $\Omega(n^3)$ is sometimes required for any algorithm. This shows that
RotorRouter is asymptotically optimal in the worstcase graphs. Secondly we
show that for the case of the path the RotorRouter attains exactly optimal
time.

We study the problem of enumerating all rooted directed spanning trees
(arborescences) of a directed graph (digraph) $G=(V,E)$ of $n$ vertices. An
arborescence $A$ consisting of edges $e_1,\ldots,e_{n1}$ can be represented as
a monomial $e_1\cdot e_2 \cdots e_{n1}$ in variables $e \in E$. All
arborescences $\mathsf{arb}(G)$ of a digraph then define the Kirchhoff
polynomial $\sum_{A \in \mathsf{arb}(G)} \prod_{e\in A} e$. We show how to
compute a compact representation of the Kirchhoff polynomial  its prime
factorization, and how it relates to combinatorial properties of digraphs such
as strong connectivity and vertex domination. In particular, we provide digraph
decomposition rules that correspond to factorization steps of the polynomial,
and also give necessary and sufficient primality conditions of the resulting
factors expressed by connectivity properties of the corresponding decomposed
components. Thereby, we obtain a linear time algorithm for decomposing a
digraph into components corresponding to factors of the initial polynomial, and
a guarantee that no finer factorization is possible. The decomposition serves
as a starting point for a recursive deletioncontraction algorithm, and also as
a preprocessing phase for iterative enumeration algorithms. Both approaches
produce a compressed output and retain some structural properties in the
resulting polynomial. This proves advantageous in practical applications such
as calculating steady states on digraphs governed by Laplacian dynamics, or
computing the greatest common divisor of Kirchhoff polynomials. Finally, we
initiate the study of a class of digraphs which allow for a practical
enumeration of arborescences. Using our decomposition rules we observe that
various digraphs from realworld applications fall into this class or are
structurally similar to it.

We prove that deciding whether a given input word contains as subsequence
every possible permutation of integers $\{1,2,\ldots,n\}$ is coNPcomplete. The
coNPcompleteness holds even when given the guarantee that the input word
contains as subsequences all of length $n1$ sequences over the same set of
integers. We also show NPcompleteness of a related problem of Partially
Noncrossing Perfect Matching in Bipartite Graphs, i.e. to find a perfect
matching in an ordered bipartite graph where edges of the matching incident to
selected vertices (even only from one side) are noncrossing.