• String matching is the problem of deciding whether a given $n$-bit string contains a given $k$-bit pattern. We study the complexity of this problem in three settings. Communication complexity. For small $k$, we provide near-optimal upper and lower bounds on the communication complexity of string matching. For large $k$, our bounds leave open an exponential gap; we exhibit some evidence for the existence of a better protocol. Circuit complexity. We present several upper and lower bounds on the size of circuits with threshold and DeMorgan gates solving the string matching problem. Similarly to the above, our bounds are near-optimal for small $k$. Learning. We consider the problem of learning a hidden pattern of length at most $k$ relative to the classifier that assigns 1 to every string that contains the pattern. We prove optimal bounds on the VC dimension and sample complexity of this problem.
  • For any $n$-bit boolean function $f$, we show that the randomized communication complexity of the composed function $f\circ g^n$, where $g$ is an index gadget, is characterized by the randomized decision tree complexity of $f$. In particular, this means that many query complexity separations involving randomized models (e.g., classical vs. quantum) automatically imply analogous separations in communication complexity.
  • While exponential separations are known between quantum and randomized communication complexity for partial functions (Raz, STOC 1999), the best known separation between these measures for a total function is quadratic, witnessed by the disjointness function. We give the first super-quadratic separation between quantum and randomized communication complexity for a total function, giving an example exhibiting a power 2.5 gap. We further present a 1.5 power separation between exact quantum and randomized communication complexity, improving on the previous ~1.15 separation by Ambainis (STOC 2013). Finally, we present a nearly optimal quadratic separation between randomized communication complexity and the logarithm of the partition number, improving upon the previous best power 1.5 separation due to G\"o\"os, Jayram, Pitassi, and Watson. Our results are the communication analogues of separations in query complexity proved using the recent cheat sheet framework of Aaronson, Ben-David, and Kothari (STOC 2016). Our main technical results are randomized communication and information complexity lower bounds for a family of functions, called lookup functions, that generalize and port the cheat sheet framework to communication complexity.
  • We use critical block sensitivity, a new complexity measure introduced by Huynh and Nordstr\"om (STOC 2012), to study the communication complexity of search problems. To begin, we give a simple new proof of the following central result of Huynh and Nordstr\"om: if $S$ is a search problem with critical block sensitivity $b$, then every randomised two-party protocol solving a certain two-party lift of $S$ requires $\Omega(b)$ bits of communication. Besides simplicity, our proof has the advantage of generalising to the multi-party setting. We combine these results with new critical block sensitivity lower bounds for Tseitin and Pebbling search problems to obtain the following applications: (1) Monotone Circuit Depth: We exhibit a monotone $n$-variable function in NP whose monotone circuits require depth $\Omega(n/\log n)$; previously, a bound of $\Omega(\sqrt{n})$ was known (Raz and Wigderson, JACM 1992). Moreover, we prove a $\Theta(\sqrt{n})$ monotone depth bound for a function in monotone P. (2) Proof Complexity: We prove new rank lower bounds as well as obtain the first length--space lower bounds for semi-algebraic proof systems, including Lov\'asz--Schrijver and Lasserre (SOS) systems. In particular, these results extend and simplify the works of Beame et al. (SICOMP 2007) and Huynh and Nordstr\"om.
  • We exhibit an $n$-node graph whose independent set polytope requires extended formulations of size exponential in $\Omega(n/\log n)$. Previously, no explicit examples of $n$-dimensional $0/1$-polytopes were known with extension complexity larger than exponential in $\Theta(\sqrt{n})$. Our construction is inspired by a relatively little-known connection between extended formulations and (monotone) circuit depth.
  • This work bridges the gap between distributed and centralised models of computing in the context of sublinear-time graph algorithms. A priori, typical centralised models of computing (e.g., parallel decision trees or centralised local algorithms) seem to be much more powerful than distributed message-passing algorithms: centralised algorithms can directly probe any part of the input, while in distributed algorithms nodes can only communicate with their immediate neighbours. We show that for a large class of graph problems, this extra freedom does not help centralised algorithms at all: for example, efficient stateless deterministic centralised local algorithms can be simulated with efficient distributed message-passing algorithms. In particular, this enables us to transfer existing lower bound results from distributed algorithms to centralised local algorithms.
  • The Pattern self-Assembly Tile set Synthesis (PATS) problem, which arises in the theory of structured DNA self-assembly, is to determine a set of coloured tiles that, starting from a bordering seed structure, self-assembles to a given rectangular colour pattern. The task of finding minimum-size tile sets is known to be NP-hard. We explore several complete and incomplete search techniques for finding minimal, or at least small, tile sets and also assess the reliability of the solutions obtained according to the kinetic Tile Assembly Model.
  • Given a boolean n by n matrix A we consider arithmetic circuits for computing the transformation x->Ax over different semirings. Namely, we study three circuit models: monotone OR-circuits, monotone SUM-circuits (addition of non-negative integers), and non-monotone XOR-circuits (addition modulo 2). Our focus is on \emph{separating} these models in terms of their circuit complexities. We give three results towards this goal: (1) We prove a direct sum type theorem on the monotone complexity of tensor product matrices. As a corollary, we obtain matrices that admit OR-circuits of size O(n), but require SUM-circuits of size \Omega(n^{3/2}/\log^2n). (2) We construct so-called \emph{k-uniform} matrices that admit XOR-circuits of size O(n), but require OR-circuits of size \Omega(n^2/\log^2n). (3) We consider the task of \emph{rewriting} a given OR-circuit as a XOR-circuit and prove that any subquadratic-time algorithm for this task violates the strong exponential time hypothesis.
  • By prior work, there is a distributed algorithm that finds a maximal fractional matching (maximal edge packing) in $O(\Delta)$ rounds, where $\Delta$ is the maximum degree of the graph. We show that this is optimal: there is no distributed algorithm that finds a maximal fractional matching in $o(\Delta)$ rounds. Our work gives the first linear-in-$\Delta$ lower bound for a natural graph problem in the standard model of distributed computing---prior lower bounds for a wide range of graph problems have been at best logarithmic in $\Delta$.
  • Do unique node identifiers help in deciding whether a network $G$ has a prescribed property $P$? We study this question in the context of distributed local decision, where the objective is to decide whether $G \in P$ by having each node run a constant-time distributed decision algorithm. If $G \in P$, all the nodes should output yes; if $G \notin P$, at least one node should output no. A recent work (Fraigniaud et al., OPODIS 2012) studied the role of identifiers in local decision and gave several conditions under which identifiers are not needed. In this article, we answer their original question. More than that, we do so under all combinations of the following two critical variations on the underlying model of distributed computing: ($B$): the size of the identifiers is bounded by a function of the size of the input network; as opposed to ($\neg B$): the identifiers are unbounded. ($C$): the nodes run a computable algorithm; as opposed to ($\neg C$): the nodes can compute any, possibly uncomputable function. While it is easy to see that under ($\neg B, \neg C$) identifiers are not needed, we show that under all other combinations there are properties that can be decided locally if and only if identifiers are present. Our constructions use ideas from classical computability theory.
  • K\"onig's theorem states that on bipartite graphs the size of a maximum matching equals the size of a minimum vertex cover. It is known from prior work that for every \epsilon > 0 there exists a constant-time distributed algorithm that finds a (1+\epsilon)-approximation of a maximum matching on 2-coloured graphs of bounded degree. In this work, we show---somewhat surprisingly---that no sublogarithmic-time approximation scheme exists for the dual problem: there is a constant \delta > 0 so that no randomised distributed algorithm with running time o(\log n) can find a (1+\delta)-approximation of a minimum vertex cover on 2-coloured graphs of maximum degree 3. In fact, a simple application of the Linial--Saks (1993) decomposition demonstrates that this lower bound is tight. Our lower-bound construction is simple and, to some extent, independent of previous techniques. Along the way we prove that a certain cut minimisation problem, which might be of independent interest, is hard to approximate locally on expander graphs.
  • In the study of deterministic distributed algorithms it is commonly assumed that each node has a unique $O(\log n)$-bit identifier. We prove that for a general class of graph problems, local algorithms (constant-time distributed algorithms) do not need such identifiers: a port numbering and orientation is sufficient. Our result holds for so-called simple PO-checkable graph optimisation problems; this includes many classical packing and covering problems such as vertex covers, edge covers, matchings, independent sets, dominating sets, and edge dominating sets. We focus on the case of bounded-degree graphs and show that if a local algorithm finds a constant-factor approximation of a simple PO-checkable graph problem with the help of unique identifiers, then the same approximation ratio can be achieved on anonymous networks. As a corollary of our result and by prior work, we derive a tight lower bound on the local approximability of the minimum edge dominating set problem. Our main technical tool is an algebraic construction of homogeneously ordered graphs: We say that a graph is $(\alpha,r)$-homogeneous if its nodes are linearly ordered so that an $\alpha$ fraction of nodes have pairwise isomorphic radius-$r$ neighbourhoods. We show that there exists a finite $(\alpha,r)$-homogeneous $2k$-regular graph of girth at least $g$ for any $\alpha < 1$ and any $r$, $k$, and $g$.
  • The Pattern self-Assembly Tile set Synthesis (PATS) problem is to determine a set of coloured tiles that self-assemble to implement a given rectangular colour pattern. We give an exhaustive branch-and-bound algorithm to find tile sets of minimum cardinality for the PATS problem. Our algorithm makes use of a search tree in the lattice of partitions of the ambient rectangular grid, and an efficient bounding function to prune this search tree. Empirical data on the performance of the algorithm shows that it compares favourably to previously presented heuristic solutions to the problem.