
The degree splitting problem requires coloring the edges of a graph red or
blue such that each node has almost the same number of edges in each color, up
to a small additive discrepancy. The directed variant of the problem requires
orienting the edges such that each node has almost the same number of incoming
and outgoing edges, again up to a small additive discrepancy.
We present deterministic distributed algorithms for both variants, which
improve on their counterparts presented by Ghaffari and Su [SODA'17]: our
algorithms are significantly simpler and faster, and have a much smaller
discrepancy. This also leads to a faster and simpler deterministic algorithm
for $(2+o(1))\Delta$edgecoloring, improving on that of Ghaffari and Su.

The congested clique model of distributed computing has been receiving
attention as a model for densely connected distributed systems. While there has
been significant progress on the side of upper bounds, we have very little in
terms of lower bounds for the congested clique; indeed, it is now know that
proving explicit congested clique lower bounds is as difficult as proving
circuit lower bounds.
In this work, we use various more traditional complexitytheoretic tools to
build a clearer picture of the complexity landscape of the congested clique:
 Nondeterminism and beyond: We introduce the nondeterministic congested
clique model (analogous to NP) and show that there is a natural canonical
problem family that captures all problems solvable in constant time with
nondeterministic algorithms. We further generalise these notions by introducing
the constantround decision hierarchy (analogous to the polynomial hierarchy).
 Nonconstructive lower bounds: We lift the prior nonuniform counting
arguments to a general technique for proving nonconstructive uniform lower
bounds for the congested clique. In particular, we prove a time hierarchy
theorem for the congested clique, showing that there are decision problems of
essentially all complexities, both in the deterministic and nondeterministic
settings.
 Finegrained complexity: We map out relationships between various natural
problems in the congested clique model, arguing that a reductionbased
complexity theory currently gives us a fairly good picture of the complexity
landscape of the congested clique.

A number of recent papers  e.g. Brandt et al. (STOC 2016), Chang et al.
(FOCS 2016), Ghaffari & Su (SODA 2017), Brandt et al. (PODC 2017), and Chang &
Pettie (FOCS 2017)  have advanced our understanding of one of the most
fundamental questions in theory of distributed computing: what are the possible
time complexity classes of LCL problems in the LOCAL model? In essence, we have
a graph problem $\Pi$ in which a solution can be verified by checking all
radius$O(1)$ neighbourhoods, and the question is what is the smallest $T$ such
that a solution can be computed so that each node chooses its own output based
on its radius$T$ neighbourhood. Here $T$ is the distributed time complexity of
$\Pi$.
The time complexity classes for deterministic algorithms in boundeddegree
graphs that are known to exist by prior work are $\Theta(1)$, $\Theta(\log^*
n)$, $\Theta(\log n)$, $\Theta(n^{1/k})$, and $\Theta(n)$. It is also known
that there are two gaps: one between $\omega(1)$ and $o(\log \log^* n)$, and
another between $\omega(\log^* n)$ and $o(\log n)$. It has been conjectured
that many more gaps exist, and that the overall time hierarchy is relatively
simple  indeed, this is known to be the case in restricted graph families
such as cycles and grids.
We show that the picture is much more diverse than previously expected. We
present a general technique for engineering LCL problems with numerous
different deterministic time complexities, including $\Theta(\log^{\alpha}n)$
for any $\alpha\ge1$, $2^{\Theta(\log^{\alpha}n)}$ for any $\alpha\le 1$, and
$\Theta(n^{\alpha})$ for any $\alpha <1/2$ in the high end of the complexity
spectrum, and $\Theta(\log^{\alpha}\log^* n)$ for any $\alpha\ge 1$,
$\smash{2^{\Theta(\log^{\alpha}\log^* n)}}$ for any $\alpha\le 1$, and
$\Theta((\log^* n)^{\alpha})$ for any $\alpha \le 1$ in the low end; here
$\alpha$ is a positive rational number.

Given two colorings of a graph, we consider the following problem: can we
recolor the graph from one coloring to the other through a series of elementary
changes, such that the graph is properly colored after each step?
We introduce the notion of distributed recoloring: The input graph represents
a network of computers that needs to be recolored. Initially, each node is
aware of its own input color and target color. The nodes can exchange messages
with each other, and eventually each node has to stop and output its own
recoloring schedule, indicating when and how the node changes its color. The
recoloring schedules have to be globally consistent so that the graph remains
properly colored at each point, and we require that adjacent nodes do not
change their colors simultaneously.
We are interested in the following questions: How many communication rounds
are needed (in the LOCAL model of distributed computing) to find a recoloring
schedule? What is the length of the recoloring schedule? And how does the
picture change if we can use extra colors to make recoloring easier?
The main contributions of this work are related to distributed recoloring
with one extra color in the following graph classes: trees, $3$regular graphs,
and toroidal grids.

We give a new, simple distributed algorithm for graph colouring in paths and
cycles. Our algorithm is fast and selfcontained, it does not need any globally
consistent orientation, and it reduces the number of colours from $10^{100}$ to
$3$ in three iterations.

LCLs or locally checkable labelling problems (e.g. maximal independent set,
maximal matching, and vertex colouring) in the LOCAL model of computation are
very wellunderstood in cycles (toroidal 1dimensional grids): every problem
has a complexity of $O(1)$, $\Theta(\log^* n)$, or $\Theta(n)$, and the design
of optimal algorithms can be fully automated.
This work develops the complexity theory of LCL problems for toroidal
2dimensional grids. The complexity classes are the same as in the
1dimensional case: $O(1)$, $\Theta(\log^* n)$, and $\Theta(n)$. However, given
an LCL problem it is undecidable whether its complexity is $\Theta(\log^* n)$
or $\Theta(n)$ in 2dimensional grids.
Nevertheless, if we correctly guess that the complexity of a problem is
$\Theta(\log^* n)$, we can completely automate the design of optimal
algorithms. For any problem we can find an algorithm that is of a normal form
$A' \circ S_k$, where $A'$ is a finite function, $S_k$ is an algorithm for
finding a maximal independent set in $k$th power of the grid, and $k$ is a
constant.
Finally, partially with the help of automated design tools, we classify the
complexity of several concrete LCL problems related to colourings and
orientations.

Consider a complete communication network of $n$ nodes, where the nodes
receive a common clock pulse. We study the synchronous $c$counting problem:
given any starting state and up to $f$ faulty nodes with arbitrary behaviour,
the task is to eventually have all correct nodes labeling the pulses with
increasing values modulo $c$ in agreement. Thus, we are considering algorithms
that are selfstabilising despite Byzantine failures. In this work, we give new
algorithms for the synchronous counting problem that (1) are deterministic, (2)
have optimal resilience, (3) have a linear stabilisation time in $f$
(asymptotically optimal), (4) use a small number of states, and consequently,
(5) communicate a small number of bits per round. Prior algorithms either
resort to randomisation, use a large number of states and need high
communication bandwidth, or have suboptimal resilience. In particular, we
achieve an exponential improvement in both state complexity and message size
for deterministic algorithms. Moreover, we present two complementary approaches
for reducing the number of bits communicated during and after stabilisation.

While the relationship of time and space is an established topic in
traditional centralised complexity theory, this is not the case in distributed
computing. We aim to remedy this by studying the time and space complexity of
algorithms in a weak messagepassing model of distributed computing. While a
constant number of communication rounds implies a constant number of states
visited during the execution, the other direction is not clear at all. We
consider several graph families and show that indeed, there exist nontrivial
graph problems that are solvable by constantspace algorithms but that require
a nonconstant running time. This provides us with a new complexity class for
distributed computing and raises interesting questions about the existence of
further combinations of time and space complexity.

Given $n$ colored balls, we want to detect if more than $\lfloor n/2\rfloor$
of them have the same color, and if so find one ball with such majority color.
We are only allowed to choose two balls and compare their colors, and the goal
is to minimize the total number of such operations. A wellknown exercise is to
show how to find such a ball with only $2n$ comparisons while using only a
logarithmic number of bits for bookkeeping. The resulting algorithm is called
the BoyerMoore majority vote algorithm. It is known that any deterministic
method needs $\lceil 3n/2\rceil2$ comparisons in the worst case, and this is
tight. However, it is not clear what is the required number of comparisons if
we allow randomization. We construct a randomized algorithm which always
correctly finds a ball of the majority color (or detects that there is none)
using, with high probability, only $7n/6+o(n)$ comparisons. We also prove that
the expected number of comparisons used by any such randomized method is at
least $1.019n$.

We study the minimum backlog problem (MBP). This online problem arises, e.g.,
in the context of sensor networks. We focus on two main variants of MBP.
The discrete MBP is a 2person game played on a graph $G=(V,E)$. The player
is initially located at a vertex of the graph. In each time step, the adversary
pours a total of one unit of water into cups that are located on the vertices
of the graph, arbitrarily distributing the water among the cups. The player
then moves from her current vertex to an adjacent vertex and empties the cup at
that vertex. The player's objective is to minimize the backlog, i.e., the
maximum amount of water in any cup at any time.
The geometric MBP is a continuoustime version of the MBP: the cups are
points in the twodimensional plane, the adversary pours water continuously at
a constant rate, and the player moves in the plane with unit speed. Again, the
player's objective is to minimize the backlog.
We show that the competitive ratio of any algorithm for the MBP has a lower
bound of $\Omega(D)$, where $D$ is the diameter of the graph (for the discrete
MBP) or the diameter of the point set (for the geometric MBP). Therefore we
focus on determining a strategy for the player that guarantees a uniform upper
bound on the absolute value of the backlog.
For the absolute value of the backlog there is a trivial lower bound of
$\Omega(D)$, and the deamortization analysis of Dietz and Sleator gives an
upper bound of $O(D\log N)$ for $N$ cups. Our main result is a tight upper
bound for the geometric MBP: we show that there is a strategy for the player
that guarantees a backlog of $O(D)$, independently of the number of cups.

This work bridges the gap between distributed and centralised models of
computing in the context of sublineartime graph algorithms. A priori, typical
centralised models of computing (e.g., parallel decision trees or centralised
local algorithms) seem to be much more powerful than distributed
messagepassing algorithms: centralised algorithms can directly probe any part
of the input, while in distributed algorithms nodes can only communicate with
their immediate neighbours. We show that for a large class of graph problems,
this extra freedom does not help centralised algorithms at all: for example,
efficient stateless deterministic centralised local algorithms can be simulated
with efficient distributed messagepassing algorithms. In particular, this
enables us to transfer existing lower bound results from distributed algorithms
to centralised local algorithms.

In the relay placement problem the input is a set of sensors and a number $r
\ge 1$, the communication range of a relay. In the onetier version of the
problem the objective is to place a minimum number of relays so that between
every pair of sensors there is a path through sensors and/or relays such that
the consecutive vertices of the path are within distance $r$ if both vertices
are relays and within distance 1 otherwise. The twotier version adds the
restrictions that the path must go through relays, and not through sensors. We
present a 3.11approximation algorithm for the onetier version and a PTAS for
the twotier version. We also show that the onetier version admits no PTAS,
assuming P $\ne$ NP.

We show that any randomised Monte Carlo distributed algorithm for the
Lov\'asz local lemma requires $\Omega(\log \log n)$ communication rounds,
assuming that it finds a correct assignment with high probability. Our result
holds even in the special case of $d = O(1)$, where $d$ is the maximum degree
of the dependency graph. By prior work, there are distributed algorithms for
the Lov\'asz local lemma with a running time of $O(\log n)$ rounds in
boundeddegree graphs, and the best lower bound before our work was
$\Omega(\log^* n)$ rounds [Chung et al. 2014].

The role of unique node identifiers in network computing is well understood
as far as symmetry breaking is concerned. However, the unique identifiers also
leak information about the computing environment  in particular, they provide
some nodes with information related to the size of the network. It was recently
proved that in the context of local decision, there are some decision problems
such that (1) they cannot be solved without unique identifiers, and (2) unique
node identifiers leak a sufficient amount of information such that the problem
becomes solvable (PODC 2013).
In this work we give study what is the minimal amount of information that we
need to leak from the environment to the nodes in order to solve local decision
problems. Our key results are related to scalar oracles $f$ that, for any given
$n$, provide a multiset $f(n)$ of $n$ labels; then the adversary assigns the
labels to the $n$ nodes in the network. This is a direct generalisation of the
usual assumption of unique node identifiers. We give a complete
characterisation of the weakest oracle that leaks at least as much information
as the unique identifiers.
Our main result is the following dichotomy: we classify scalar oracles as
large and small, depending on their asymptotic behaviour, and show that (1) any
large oracle is at least as powerful as the unique identifiers in the context
of local decision problems, while (2) for any small oracle there are local
decision problems that still benefit from unique identifiers.

Consider a complete communication network of $n$ nodes, where the nodes
receive a common clock pulse. We study the synchronous $c$counting problem:
given any starting state and up to $f$ faulty nodes with arbitrary behaviour,
the task is to eventually have all correct nodes counting modulo $c$ in
agreement. Thus, we are considering algorithms that are selfstabilizing
despite Byzantine failures. In this work, we give new algorithms for the
synchronous counting problem that (1) are deterministic, (2) have linear
stabilisation time in $f$, (3) use a small number of states, and (4) achieve
almostoptimal resilience. Prior algorithms either resort to randomisation, use
a large number of states, or have poor resilience. In particular, we achieve an
exponential improvement in the space complexity of deterministic algorithms,
while still achieving linear stabilisation time and almostlinear resilience.

While algebrisation constitutes a powerful technique in the design and
analysis of centralised algorithms, to date there have been hardly any
applications of algebraic techniques in the context of distributed graph
algorithms. This work is a case study that demonstrates the potential of
algebrisation in the distributed context. We will focus on distributed graph
algorithms in the congested clique model; the graph problems that we will
consider include, e.g., the triangle detection problem and the allpairs
shortest path problem (APSP). There is plenty of prior work on combinatorial
algorithms in the congested clique model: for example, Dolev et al. (DISC 2012)
gave an algorithm for triangle detection with a running time of $\tilde
O(n^{1/3})$, and Nanongkai (STOC 2014) gave an approximation algorithm for APSP
with a running time of $\tilde O(n^{1/2})$. In this work, we will use algebraic
techniques  in particular, algorithms based on fast matrix multiplication 
to solve both triangle detection and the unweighted APSP in time
$O(n^{0.15715})$; for weighted APSP, we give a $(1+o(1))$approximation with
this running time, as well as an exact $\tilde O(n^{1/3})$ solution.

In this work, we use algebraic methods for studying distance computation and
subgraph detection tasks in the congested clique model. Specifically, we adapt
parallel matrix multiplication implementations to the congested clique,
obtaining an $O(n^{12/\omega})$ round matrix multiplication algorithm, where
$\omega < 2.3728639$ is the exponent of matrix multiplication. In conjunction
with known techniques from centralised algorithmics, this gives significant
improvements over previous best upper bounds in the congested clique model. The
highlight results include:
 triangle and 4cycle counting in $O(n^{0.158})$ rounds, improving upon the
$O(n^{1/3})$ triangle detection algorithm of Dolev et al. [DISC 2012],
 a $(1 + o(1))$approximation of allpairs shortest paths in $O(n^{0.158})$
rounds, improving upon the $\tilde{O} (n^{1/2})$round $(2 +
o(1))$approximation algorithm of Nanongkai [STOC 2014], and
 computing the girth in $O(n^{0.158})$ rounds, which is the first
nontrivial solution in this model.
In addition, we present a novel constantround combinatorial algorithm for
detecting 4cycles.

We prove exact bounds on the time complexity of distributed graph colouring.
If we are given a directed path that is properly coloured with $n$ colours, by
prior work it is known that we can find a proper 3colouring in $\frac{1}{2}
\log^*(n) \pm O(1)$ communication rounds. We close the gap between upper and
lower bounds: we show that for infinitely many $n$ the time complexity is
precisely $\frac{1}{2} \log^* n$ communication rounds.

This work studies distributed algorithms for locally optimal loadbalancing:
We are given a graph of maximum degree $\Delta$, and each node has up to $L$
units of load. The task is to distribute the load more evenly so that the loads
of adjacent nodes differ by at most $1$.
If the graph is a path ($\Delta = 2$), it is easy to solve the fractional
version of the problem in $O(L)$ communication rounds, independently of the
number of nodes. We show that this is tight, and we show that it is possible to
solve also the discrete version of the problem in $O(L)$ rounds in paths.
For the general case ($\Delta > 2$), we show that fractional load balancing
can be solved in $\operatorname{poly}(L,\Delta)$ rounds and discrete load
balancing in $f(L,\Delta)$ rounds for some function $f$, independently of the
number of nodes.

Consider a complete communication network on $n$ nodes, each of which is a
state machine. In synchronous 2counting, the nodes receive a common clock
pulse and they have to agree on which pulses are "odd" and which are "even". We
require that the solution is selfstabilising (reaching the correct operation
from any initial state) and it tolerates $f$ Byzantine failures (nodes that
send arbitrary misinformation). Prior algorithms are expensive to implement in
hardware: they require a source of random bits or a large number of states.
This work consists of two parts. In the first part, we use computational
techniques (often known as synthesis) to construct very compact deterministic
algorithms for the first nontrivial case of $f = 1$. While no algorithm exists
for $n < 4$, we show that as few as 3 states per node are sufficient for all
values $n \ge 4$. Moreover, the problem cannot be solved with only 2 states per
node for $n = 4$, but there is a 2state solution for all values $n \ge 6$.
In the second part, we develop and compare two different approaches for
synthesising synchronous counting algorithms. Both approaches are based on
casting the synthesis problem as a propositional satisfiability (SAT) problem
and employing modern SATsolvers. The difference lies in how to solve the SAT
problem: either in a direct fashion, or incrementally within a counterexample
guided abstraction refinement loop. Empirical results suggest that the former
technique is more efficient if we want to synthesise timeoptimal algorithms,
while the latter technique discovers nonoptimal algorithms more quickly.

This work shows that the following problems are equivalent, both in theory
and in practice:
 median filtering: given an $n$element vector, compute the sliding window
median with window size $k$,
 piecewise sorting: given an $n$element vector, divide it in $n/k$ blocks
of length $k$ and sort each block.
By prior work, median filtering is known to be at least as hard as piecewise
sorting: with a single median filter operation we can sort $\Theta(n/k)$ blocks
of length $\Theta(k)$. The present work shows that median filtering is also as
easy as piecewise sorting: we can do median filtering with one piecewise
sorting operation and lineartime postprocessing. In particular, median
filtering can directly benefit from the vast literature on sorting
algorithmsfor example, adaptive sorting algorithms imply adaptive median
filtering algorithms.
The reduction is very efficient in practicefor random inputs the
performance of the new sortingbased algorithm is on a par with the fastest
heapbased algorithms, and for benign data distributions it typically
outperforms prior algorithms.
The key technical idea is that we can represent the sliding window with a
pair of sorted doublylinked lists: we delete items from one list and add items
to the other list. Deletions are easy; additions can be done efficiently if we
reverse the time twice: First we construct the full list and delete the items
in the reverse order. Then we undo each deletion with Knuth's dancing links
technique.

We study the problem of finding large cuts in $d$regular trianglefree
graphs. In prior work, Shearer (1992) gives a randomised algorithm that finds a
cut of expected size $(1/2 + 0.177/\sqrt{d})m$, where $m$ is the number of
edges. We give a simpler algorithm that does much better: it finds a cut of
expected size $(1/2 + 0.28125/\sqrt{d})m$. As a corollary, this shows that in
any $d$regular trianglefree graph there exists a cut of at least this size.
Our algorithm can be interpreted as a very efficient randomised distributed
algorithm: each node needs to produce only one random bit, and the algorithm
runs in one synchronous communication round. This work is also a case study of
applying computational techniques in the design of distributed algorithms: our
algorithm was designed by a computer program that searched for optimal
algorithms for small values of $d$.

We show that there is no deterministic local algorithm (constanttime
distributed graph algorithm) that finds a $(7\epsilon)$approximation of a
minimum dominating set on planar graphs, for any positive constant $\epsilon$.
In prior work, the best lower bound on the approximation ratio has been
$5\epsilon$; there is also an upper bound of $52$.

Linial's seminal result shows that any deterministic distributed algorithm
that finds a $3$colouring of an $n$cycle requires at least $\log^*(n)/2  1$
communication rounds. We give a new simpler proof of this theorem.

This work presents a classification of weak models of distributed computing.
We focus on deterministic distributed algorithms, and study models of computing
that are weaker versions of the widelystudied portnumbering model. In the
portnumbering model, a node of degree d receives messages through d input
ports and sends messages through d output ports, both numbered with 1,2,...,d.
In this work, VVc is the class of all graph problems that can be solved in the
standard portnumbering model. We study the following subclasses of VVc:
VV: Input port i and output port i are not necessarily connected to the same
neighbour.
MV: Input ports are not numbered; algorithms receive a multiset of messages.
SV: Input ports are not numbered; algorithms receive a set of messages.
VB: Output ports are not numbered; algorithms send the same message to all
output ports.
MB: Combination of MV and VB.
SB: Combination of SV and VB.
Now we have many trivial containment relations, such as SB \subseteq MB
\subseteq VB \subseteq VV \subseteq VVc, but it is not obvious if, for example,
either of VB \subseteq SV or SV \subseteq VB should hold. Nevertheless, it
turns out that we can identify a linear order on these classes. We prove that
SB \subsetneq MB = VB \subsetneq SV = MV = VV \subsetneq VVc. The same holds
for the constanttime versions of these classes.
We also show that the constanttime variants of these classes can be
characterised by a corresponding modal logic. Hence the linear order identified
in this work has direct implications in the study of the expressibility of
modal logic. Conversely, one can use tools from modal logic to study these
classes.