
String matching is the problem of deciding whether a given $n$bit string
contains a given $k$bit pattern. We study the complexity of this problem in
three settings.
Communication complexity. For small $k$, we provide nearoptimal upper and
lower bounds on the communication complexity of string matching. For large $k$,
our bounds leave open an exponential gap; we exhibit some evidence for the
existence of a better protocol.
Circuit complexity. We present several upper and lower bounds on the size of
circuits with threshold and DeMorgan gates solving the string matching problem.
Similarly to the above, our bounds are nearoptimal for small $k$.
Learning. We consider the problem of learning a hidden pattern of length at
most $k$ relative to the classifier that assigns 1 to every string that
contains the pattern. We prove optimal bounds on the VC dimension and sample
complexity of this problem.

For any $n$bit boolean function $f$, we show that the randomized
communication complexity of the composed function $f\circ g^n$, where $g$ is an
index gadget, is characterized by the randomized decision tree complexity of
$f$. In particular, this means that many query complexity separations involving
randomized models (e.g., classical vs. quantum) automatically imply analogous
separations in communication complexity.

While exponential separations are known between quantum and randomized
communication complexity for partial functions (Raz, STOC 1999), the best known
separation between these measures for a total function is quadratic, witnessed
by the disjointness function. We give the first superquadratic separation
between quantum and randomized communication complexity for a total function,
giving an example exhibiting a power 2.5 gap. We further present a 1.5 power
separation between exact quantum and randomized communication complexity,
improving on the previous ~1.15 separation by Ambainis (STOC 2013). Finally, we
present a nearly optimal quadratic separation between randomized communication
complexity and the logarithm of the partition number, improving upon the
previous best power 1.5 separation due to G\"o\"os, Jayram, Pitassi, and
Watson.
Our results are the communication analogues of separations in query
complexity proved using the recent cheat sheet framework of Aaronson,
BenDavid, and Kothari (STOC 2016). Our main technical results are randomized
communication and information complexity lower bounds for a family of
functions, called lookup functions, that generalize and port the cheat sheet
framework to communication complexity.

We use critical block sensitivity, a new complexity measure introduced by
Huynh and Nordstr\"om (STOC 2012), to study the communication complexity of
search problems. To begin, we give a simple new proof of the following central
result of Huynh and Nordstr\"om: if $S$ is a search problem with critical block
sensitivity $b$, then every randomised twoparty protocol solving a certain
twoparty lift of $S$ requires $\Omega(b)$ bits of communication. Besides
simplicity, our proof has the advantage of generalising to the multiparty
setting. We combine these results with new critical block sensitivity lower
bounds for Tseitin and Pebbling search problems to obtain the following
applications:
(1) Monotone Circuit Depth: We exhibit a monotone $n$variable function in NP
whose monotone circuits require depth $\Omega(n/\log n)$; previously, a bound
of $\Omega(\sqrt{n})$ was known (Raz and Wigderson, JACM 1992). Moreover, we
prove a $\Theta(\sqrt{n})$ monotone depth bound for a function in monotone P.
(2) Proof Complexity: We prove new rank lower bounds as well as obtain the
first lengthspace lower bounds for semialgebraic proof systems, including
Lov\'aszSchrijver and Lasserre (SOS) systems. In particular, these results
extend and simplify the works of Beame et al. (SICOMP 2007) and Huynh and
Nordstr\"om.

We exhibit an $n$node graph whose independent set polytope requires extended
formulations of size exponential in $\Omega(n/\log n)$. Previously, no explicit
examples of $n$dimensional $0/1$polytopes were known with extension
complexity larger than exponential in $\Theta(\sqrt{n})$. Our construction is
inspired by a relatively littleknown connection between extended formulations
and (monotone) circuit depth.

This work bridges the gap between distributed and centralised models of
computing in the context of sublineartime graph algorithms. A priori, typical
centralised models of computing (e.g., parallel decision trees or centralised
local algorithms) seem to be much more powerful than distributed
messagepassing algorithms: centralised algorithms can directly probe any part
of the input, while in distributed algorithms nodes can only communicate with
their immediate neighbours. We show that for a large class of graph problems,
this extra freedom does not help centralised algorithms at all: for example,
efficient stateless deterministic centralised local algorithms can be simulated
with efficient distributed messagepassing algorithms. In particular, this
enables us to transfer existing lower bound results from distributed algorithms
to centralised local algorithms.

The Pattern selfAssembly Tile set Synthesis (PATS) problem, which arises in
the theory of structured DNA selfassembly, is to determine a set of coloured
tiles that, starting from a bordering seed structure, selfassembles to a given
rectangular colour pattern. The task of finding minimumsize tile sets is known
to be NPhard. We explore several complete and incomplete search techniques for
finding minimal, or at least small, tile sets and also assess the reliability
of the solutions obtained according to the kinetic Tile Assembly Model.

Given a boolean n by n matrix A we consider arithmetic circuits for computing
the transformation x>Ax over different semirings. Namely, we study three
circuit models: monotone ORcircuits, monotone SUMcircuits (addition of
nonnegative integers), and nonmonotone XORcircuits (addition modulo 2). Our
focus is on \emph{separating} these models in terms of their circuit
complexities. We give three results towards this goal:
(1) We prove a direct sum type theorem on the monotone complexity of tensor
product matrices. As a corollary, we obtain matrices that admit ORcircuits of
size O(n), but require SUMcircuits of size \Omega(n^{3/2}/\log^2n).
(2) We construct socalled \emph{kuniform} matrices that admit XORcircuits
of size O(n), but require ORcircuits of size \Omega(n^2/\log^2n).
(3) We consider the task of \emph{rewriting} a given ORcircuit as a
XORcircuit and prove that any subquadratictime algorithm for this task
violates the strong exponential time hypothesis.

By prior work, there is a distributed algorithm that finds a maximal
fractional matching (maximal edge packing) in $O(\Delta)$ rounds, where
$\Delta$ is the maximum degree of the graph. We show that this is optimal:
there is no distributed algorithm that finds a maximal fractional matching in
$o(\Delta)$ rounds.
Our work gives the first linearin$\Delta$ lower bound for a natural graph
problem in the standard model of distributed computingprior lower bounds for
a wide range of graph problems have been at best logarithmic in $\Delta$.

Do unique node identifiers help in deciding whether a network $G$ has a
prescribed property $P$? We study this question in the context of distributed
local decision, where the objective is to decide whether $G \in P$ by having
each node run a constanttime distributed decision algorithm. If $G \in P$, all
the nodes should output yes; if $G \notin P$, at least one node should output
no.
A recent work (Fraigniaud et al., OPODIS 2012) studied the role of
identifiers in local decision and gave several conditions under which
identifiers are not needed. In this article, we answer their original question.
More than that, we do so under all combinations of the following two critical
variations on the underlying model of distributed computing:
($B$): the size of the identifiers is bounded by a function of the size of
the input network; as opposed to ($\neg B$): the identifiers are unbounded.
($C$): the nodes run a computable algorithm; as opposed to ($\neg C$): the
nodes can compute any, possibly uncomputable function.
While it is easy to see that under ($\neg B, \neg C$) identifiers are not
needed, we show that under all other combinations there are properties that can
be decided locally if and only if identifiers are present. Our constructions
use ideas from classical computability theory.

K\"onig's theorem states that on bipartite graphs the size of a maximum
matching equals the size of a minimum vertex cover. It is known from prior work
that for every \epsilon > 0 there exists a constanttime distributed algorithm
that finds a (1+\epsilon)approximation of a maximum matching on 2coloured
graphs of bounded degree. In this work, we showsomewhat surprisinglythat
no sublogarithmictime approximation scheme exists for the dual problem: there
is a constant \delta > 0 so that no randomised distributed algorithm with
running time o(\log n) can find a (1+\delta)approximation of a minimum vertex
cover on 2coloured graphs of maximum degree 3. In fact, a simple application
of the LinialSaks (1993) decomposition demonstrates that this lower bound is
tight.
Our lowerbound construction is simple and, to some extent, independent of
previous techniques. Along the way we prove that a certain cut minimisation
problem, which might be of independent interest, is hard to approximate locally
on expander graphs.

In the study of deterministic distributed algorithms it is commonly assumed
that each node has a unique $O(\log n)$bit identifier. We prove that for a
general class of graph problems, local algorithms (constanttime distributed
algorithms) do not need such identifiers: a port numbering and orientation is
sufficient.
Our result holds for socalled simple POcheckable graph optimisation
problems; this includes many classical packing and covering problems such as
vertex covers, edge covers, matchings, independent sets, dominating sets, and
edge dominating sets. We focus on the case of boundeddegree graphs and show
that if a local algorithm finds a constantfactor approximation of a simple
POcheckable graph problem with the help of unique identifiers, then the same
approximation ratio can be achieved on anonymous networks.
As a corollary of our result and by prior work, we derive a tight lower bound
on the local approximability of the minimum edge dominating set problem.
Our main technical tool is an algebraic construction of homogeneously ordered
graphs: We say that a graph is $(\alpha,r)$homogeneous if its nodes are
linearly ordered so that an $\alpha$ fraction of nodes have pairwise isomorphic
radius$r$ neighbourhoods. We show that there exists a finite
$(\alpha,r)$homogeneous $2k$regular graph of girth at least $g$ for any
$\alpha < 1$ and any $r$, $k$, and $g$.

The Pattern selfAssembly Tile set Synthesis (PATS) problem is to determine a
set of coloured tiles that selfassemble to implement a given rectangular
colour pattern. We give an exhaustive branchandbound algorithm to find tile
sets of minimum cardinality for the PATS problem. Our algorithm makes use of a
search tree in the lattice of partitions of the ambient rectangular grid, and
an efficient bounding function to prune this search tree. Empirical data on the
performance of the algorithm shows that it compares favourably to previously
presented heuristic solutions to the problem.