• A parallel algorithm for maximal independent set (MIS) in hypergraphs has been a long-standing algorithmic challenge, dating back nearly 30 years to a survey of Karp & Ramachandran (1990). The best randomized parallel algorithm for hypergraphs of fixed rank $r$ was developed by Beame & Luby (1990) and Kelsen (1992), running in time roughly $(\log n)^{r!}$. We improve the randomized algorithm of Kelsen, reducing the runtime to roughly $(\log n)^{2^r}$ and simplifying the analysis through the use of more-modern concentration inequalities. We also give a method for derandomizing concentration bounds for low-degree polynomials, which are the key technical tool used to analyze that algorithm. This leads to a deterministic PRAM algorithm also running in $(\log n)^{2^{r+3}}$ time and $\text{poly}(m,n)$ processors. This is the first deterministic algorithm with sub-polynomial runtime for hypergraphs of rank $r > 3$. Our analysis can also apply when $r$ is slowly growing; using this in conjunction with a strategy of Bercea et al. (2015) gives a deterministic MIS algorithm running in time $\exp(O( \frac{\log (mn)}{\log \log (mn)}))$.
  • A variety of powerful extremal results have been shown for the chromatic number of triangle-free graphs. Three noteworthy bounds are in terms of the number of vertices, edges, and maximum degree given by Poljak \& Tuza (1994), and Johansson. There have been comparatively fewer works extending these types of bounds to graphs with a small number of triangles. One noteworthy exception is a result of Alon et. al (1999) bounding the chromatic number for graphs with low degree and few triangles per vertex; this bound is nearly the same as for triangle-free graphs. This type of parametrization is much less rigid, and has appeared in dozens of combinatorial constructions. In this paper, we show a similar type of result for $\chi(G)$ as a function of the number of vertices $n$, the number of edges $m$, as well as the triangle count (both local and global measures). Our results smoothly interpolate between the generic bounds true for all graphs and bounds for triangle-free graphs. Our results are tight for most of these cases; we show how an open problem regarding fractional chromatic number and degeneracy in triangle-free graphs can resolve the small remaining gap in our bounds.
  • Motivated by the Erd\H{o}s-Faber-Lov\'{a}sz (EFL) conjecture for hypergraphs, we consider the list edge coloring of linear hypergraphs. We show that if the hyper-edge sizes are bounded between $i$ and $C_{i,\epsilon} \sqrt{n}$ inclusive, then there is a list edge coloring using $(1 + \epsilon) \frac{n}{i - 1}$ colors. The dependence on $n$ in the upper bound is optimal (up to the value of $C_{i,\epsilon}$).
  • The Lov\'asz Local Lemma (LLL) is a probabilistic tool which shows that, if a collection of "bad" events $\mathcal B$ in a probability space are not too likely and not too interdependent, then there is a positive probability that no bad-events in $\mathcal B$ occur. Moser & Tardos (2010) gave sequential and parallel algorithms which transformed most applications of the variable-assignment LLL into efficient algorithms. A framework of Harvey & Vondr\'ak (2015) based on "resampling oracles" extended this give very general sequential algorithms for other probability spaces satisfying the Lopsided Lov\'asz Local Lemma (LLLL). We describe a new structural property of resampling oracles, which we refer to as "obliviousness." Essentially, it means that the interaction between two bad-events $B, B'$ depends only on the randomness used to resample $B$, and not on the precise state within $B$ itself. This property has two major benefits. First, it is the key to achieving a unified parallel LLLL algorithm, which is faster than previous, problem-specific algorithms of Harris (2016) for the variable-assignment LLLL algorithm and of Harris & Srinivasan (2014) for permutations. This gives the first RNC algorithms for perfect matchings and hamiltonian cycles of $K_n$. Second, this property allows us to build LLLL probability spaces out of a relatively simple "atomic" set of events. It was intuitively clear that existing LLLL spaces were built in this way, but the obliviousness property formalizes this and gives a way of automatically turning a resampling oracle for atomic events into a resampling oracle for conjunctions of them. Using this framework, we get the first sequential resampling oracle for perfect matchings on the complete $s$-uniform hypergraph $K_n^{(s)}$, and the first commutative resampling oracle for hamiltonian cycles of $K_n$.
  • Clustering problems are fundamental to unsupervised learning. There is an increased emphasis on "fairness" in machine learning and AI; one representative notion of fairness is that no single demographic group should be over-represented among the cluster-centers. This, and much more general clustering problems, can be formulated with "knapsack" and "partition" constraints. We develop new randomized algorithms targeting such problems, and study two in particular: multi-knapsack median and multi-knapsack center. Our rounding algorithms give new approximation and pseudo-approximation algorithms for these problems. One key technical tool we develop and use, which may be of independent interest, is a new tail bound analogous to Feige (2006) for sums of random variables with unbounded variances. Such bounds are very useful in inferring properties of large networks using few samples.
  • The resampling algorithm of Moser \& Tardos is a powerful approach to develop constructive versions of the Lov\'{a}sz Local Lemma (LLL). We generalize this to partial resampling: when a bad event holds, we resample an appropriately-random subset of the variables that define this event, rather than the entire set as in Moser & Tardos. This is particularly useful when the bad events are determined by sums of random variables. This leads to several improved algorithmic applications in scheduling, graph transversals, packet routing etc. For instance, we settle a conjecture of Szab\'{o} & Tardos (2006) on graph transversals asymptotically, and obtain improved approximation ratios for a packet routing problem of Leighton, Maggs, & Rao (1994).
  • We consider column-sparse covering integer programs, a generalization of set cover, which have attracted a long line of research developing (randomized) approximation algorithms. We develop a new rounding scheme based on the Partial Resampling variant of the Lov\'{a}sz Local Lemma developed by Harris & Srinivasan (2013). This achieves an approximation ratio of $1 + \frac{\ln (\Delta_1+1)}{a_{\text{min}}} + O\Big( \log(1 + \sqrt{ \frac{\log (\Delta_1+1)}{a_{\text{min}}}}) \Big)$, where $a_{\text{min}}$ is the minimum covering constraint and $\Delta_1$ is the maximum $\ell_1$-norm of any column of the covering matrix (whose entries are scaled to lie in $[0,1]$). When there are additional constraints on the sizes of the variables, we show an approximation ratio of $\ln \Delta_0 + O(\log \log \Delta_0)$ (where $\Delta_0$ is the maximum number of non-zero entries in any column of the covering matrix). We also show nearly-matching inapproximability and integrality-gap lower bounds. These results improve asymptotically, in several different ways, over results shown by Srinivasan (2006) and Kolliopoulos & Young (2005). We show also that the rounding process leads to negative correlation among the variables. This allows us to automatically handle multi-criteria programs, efficiently achieving approximation ratios which are essentially equivalent to the single-criterion case and apply even when the number of criteria is large.
  • Many randomized algorithms can be derandomized efficiently using either the method of conditional expectations or probability spaces with low (almost-) independence. A series of papers, beginning with Luby (1993) and continuing with Berger & Rompel (1991) and Chari et al. (2000), showed that these techniques can be combined to give deterministic parallel algorithms for combinatorial optimization problems involving sums of $w$-juntas. We improve these algorithms through derandomized variable partitioning and a new code construction for fooling Fourier characters over $GF(2)$. This reduces the processor complexity to essentially independent of $w$ while the running time is reduced from exponential in $w$ to linear in $w$. As a key subroutine, we give a new algorithm to generate a probability space which can fool a given set of neighborhoods. Schulman (1992) gave an NC algorithm to do so for neighborhoods of size $w \leq O(\log n)$. Our new algorithm is NC1, with essentially optimal time and processor complexity, when $w = O(\log n)$; it remains NC up to $w = \text{polylog}(n)$. This answers an open problem of Schulman. One major application of these algorithms is an NC algorithm for the Lov\'{a}sz Local Lemma. Previous NC algorithms, including the seminal algorithm of Moser & Tardos (2010) and the work of Chandrasekaran et. al (2013), required that (essentially) the bad-events could span only $O(\log n)$ variables; we relax this to $\text{polylog}(n)$ variables. We use this for an $\text{NC}^2$ algorithm for defective vertex coloring, which works for arbitrary degree graphs.
  • The Lopsided Lovasz Local Lemma (LLLL) is a probabilistic tool which is a cornerstone of the probabilistic method of combinatorics, which shows that it is possible to avoid a collection of "bad" events as long as their probabilities and interdependencies are sufficiently small. The strongest possible criterion that can be stated in these terms is due to Shearer (1985), although it is technically difficult to apply to constructions in combinatorics. The original formulation of the LLLL was non-constructive, a seminal algorithm of Moser & Tardos (2010) gave an efficient constructive algorithm for nearly all applications of it, including applications to $k$-SAT instances with a bounded number of occurrences per variables. Harris (2015) later gave an alternate criterion for this algorithm to converge, which appeared to give stronger bounds than the standard LLL. Unlike the LLL criterion or its variants, this criterion depends in a fundamental way on the decomposition of bad-events into variables. In this note, we show that the criterion given by Harris can be stronger in some cases even than Shearer's criterion. We construct $k$-SAT formulas with bounded variable occurrence, and show that the criterion of Harris is satisfied while the criterion of Shearer is violated. In fact, there is an exponentially growing gap between the bounds provable from any form of the LLLL and from the bound shown by Harris.
  • The gap between the known randomized and deterministic local distributed algorithms underlies arguably the most fundamental and central open question in distributed graph algorithms. In this paper, we develop a generic and clean recipe for derandomizing randomized LOCAL algorithms and transforming them into efficient deterministic LOCAL algorithm. We also exhibit how this simple recipe leads to significant improvements on a number of problems, in cases resolving known open problems. Two sample end-results are as follows: - An improved distributed hypergraph maximal matching algorithm, which improves on that of Fischer, Ghaffari, and Kuhn [FOCS'17], and leads to improved algorithms for edge-coloring, maximum matching approximation, and low out-degree edge orientation. The first gives an improved algorithm for Open Problem 11.4 of the book of Barenboim and Elkin, and the last gives the first positive resolution of their Open Problem 11.10. - An improved distributed algorithm for the Lov\'{a}sz Local Lemma, which gets closer to a conjecture of Chang and Pettie [FOCS'17], and moreover leads to improved distributed algorithms for problems such as defective coloring and $k$-SAT.
  • We give a new randomized distributed algorithm for $(\Delta+1)$-coloring in the LOCAL model, running in $O(\sqrt{\log \Delta})+ 2^{O(\sqrt{\log \log n})}$ rounds in a graph of maximum degree~$\Delta$. This implies that the $(\Delta+1)$-coloring problem is easier than the maximal independent set problem and the maximal matching problem, due to their lower bounds of $\Omega \left( \min \left( \sqrt{\frac{\log n}{\log \log n}}, \frac{\log \Delta}{\log \log \Delta} \right) \right)$ by Kuhn, Moscibroda, and Wattenhofer [PODC'04]. Our algorithm also extends to list-coloring where the palette of each node contains $\Delta+1$ colors. We extend the set of distributed symmetry-breaking techniques by performing a decomposition of graphs into dense and sparse parts.
  • Given a hypergraph $H$ and a weight function $w: V \rightarrow \{1, \dots, M\}$ on its vertices, we say that $w$ is isolating if there is exactly one edge of minimum weight $w(e) = \sum_{i \in e} w(i)$. The Isolation Lemma is a combinatorial principle introduced in Mulmuley et. al (1987) which gives a lower bound on the number of isolating weight functions. Mulmuley used this as the basis of a parallel algorithm for finding perfect graph matchings. It has a number of other applications to parallel algorithms and to reductions of general search problems to unique search problems (in which there are one or zero solutions). The original bound given by Mulmuley et al. was recently improved by Ta-Shma (2015). In this paper, we show improved lower bounds on the number of isolating weight functions, and we conjecture that the extremal case is when $H$ consists of $n$ singleton edges. When $M \gg n$ our improved bound matches this extremal case asymptotically. We are able to show that this conjecture holds in a number of special cases: when $H$ is a linear hypergraph or is 1-degenerate, or when $M = 2$. We also show that it holds asymptotically when $M \gg n \gg 1$.
  • In this paper, we give tight approximation algorithms for the $k$-center and matroid center problems with outliers. Unfairness arises naturally in this setting: certain clients could always be considered as outliers. To address this issue, we introduce a lottery model in which each client $j$ is allowed to submit a parameter $p_j \in [0,1]$ and we look for a random solution that covers every client $j$ with probability at least $p_j$. Our techniques include a randomized rounding procedure to round a point inside a matroid intersection polytope to a basis plus at most one extra item such that all marginal probabilities are preserved and such that a certain linear function of the variables does not decrease in the process with probability one.
  • The Lov\'{a}sz Local Lemma (LLL) is a cornerstone principle in the probabilistic method of combinatorics, and a seminal algorithm of Moser & Tardos (2010) provides an efficient randomized algorithm to implement it. This can be parallelized to give an algorithm that uses polynomially many processors and runs in $O(\log^3 n)$ time on an EREW PRAM, stemming from $O(\log n)$ adaptive computations of a maximal independent set (MIS). Chung et al. (2014) developed faster local and parallel algorithms, potentially running in time $O(\log^2 n)$, but these algorithms require more stringent conditions than the LLL. We give a new parallel algorithm that works under essentially the same conditions as the original algorithm of Moser & Tardos but uses only a single MIS computation, thus running in $O(\log^2 n)$ time on an EREW PRAM. This can be derandomized to give an NC algorithm running in time $O(\log^2 n)$ as well, speeding up a previous NC LLL algorithm of Chandrasekaran et al. (2013). We also provide improved and tighter bounds on the run-times of the sequential and parallel resampling-based algorithms originally developed by Moser & Tardos. These apply to any problem instance in which the tighter Shearer LLL criterion is satisfied.
  • We consider an issue of much current concern: could fairness, an issue that is already difficult to guarantee, worsen when algorithms run much of our lives? We consider this in the context of resource-allocation problems, we show that algorithms can guarantee certain types of fairness in a verifiable way. Our conceptual contribution is a simple approach to fairness in this context, which only requires that all users trust some public lottery. Our technical contributions are in ways to address the $k$-center and knapsack-center problems that arise in this context: we develop a novel dependent-rounding technique that, via the new ingredients of "slowing down" and additional randomization, guarantees stronger correlation properties than known before.
  • Karger (SIAM Journal on Computing, 1999) developed the first fully-polynomial approximation scheme to estimate the probability that a graph $G$ becomes disconnected, given that its edges are removed independently with probability $p$. This algorithm runs in $n^{5+o(1)} \epsilon^{-3}$ time to obtain an estimate within relative error $\epsilon$. We improve this run-time through algorithmic and graph-theoretic advances. First, there is a certain key sub-problem encountered by Karger, for which a generic estimation procedure is employed, we show that this has a special structure for which a much more efficient algorithm can be used. Second, we show better bounds on the number of edge cuts which are likely to fail. Here, Karger's analysis uses a variety of bounds for various graph parameters, we show that these bounds cannot be simultaneously tight. We describe a new graph parameter, which simultaneously influences all the bounds used by Karger, and obtain much tighter estimates of the cut structure of $G$. These techniques allow us to improve the runtime to $n^{3+o(1)} \epsilon^{-2}$, our results also rigorously prove certain experimental observations of Karger & Tai (Proc. ACM-SIAM Symposium on Discrete Algorithms, 1997). Our rigorous proofs are motivated by certain non-rigorous differential-equation approximations which, however, provably track the worst-case trajectories of the relevant parameters. A key driver of Karger's approach (and other cut-related results) is a bound on the number of small cuts: we improve these estimates when the min-cut size is "small" and odd, augmenting, in part, a result of Bixby (Bulletin of the AMS, 1974).
  • The Lovasz Local Lemma (LLL) is a probabilistic tool which has been used to show the existence of a variety of combinatorial structures with good "local" properties. The "LLL-distribution" can be used to show that the resulting structures have good global properties in expectation. The simplest, variable-based setting of the LLL was covered by the seminal algorithm of Moser & Tardos (2010). This has since been extended to other probability spaces including random permutations. One can similarly define an "MT-distribution" for these algorithms, that is, the distribution of the configuration they produce. Haeupler et al. (2011) showed bounds on the MT-distribution which essentially match the LLL-distribution for the variable-assignment setting; Harris & Srinivasan showed similar results for the permutation setting. In this work, we show new bounds on the MT-distribution which are significantly stronger than those known to hold for the LLL-distribution. In the variable-assignment setting, we show a tighter bound on the probability of a disjunctive event or singleton event. As a consequence, in $k$-SAT instances with bounded variable occurrence, the MT-distribution satisfies an $\epsilon$-approximate independence condition asymptotically stronger than the LLL-distribution. We use this to show a nearly tight bound on the minimum implicate size of a CNF boolean formula. Another noteworthy application is constructing independent transversals which avoid a given subset of vertices; this provides a constructive analogue to a result of Rabern (2014). In the permutation LLL setting, we show a new type of bound which is similar to the cluster-expansion LLL criterion of Bissacot et al. (2011), but is stronger and takes advantage of the extra structure in permutations. We illustrate with improved bounds on weighted Latin transversals and partial Latin transversals.
  • Moser & Tardos have developed a powerful algorithmic approach (henceforth "MT") to the Lovasz Local Lemma (LLL); the basic operation done in MT and its variants is a search for "bad" events in a current configuration. In the initial stage of MT, the variables are set independently. We examine the distributions on these variables which arise during intermediate stages of MT. We show that these configurations have a more or less "random" form, building further on the "MT-distribution" concept of Haeupler et al. in understanding the (intermediate and) output distribution of MT. This has a variety of algorithmic applications; the most important is that bad events can be found relatively quickly, improving upon MT across the complexity spectrum: it makes some polynomial-time algorithms sub-linear (e.g., for Latin transversals, which are of basic combinatorial interest), gives lower-degree polynomial run-times in some settings, transforms certain super-polynomial-time algorithms into polynomial-time ones, and leads to Las Vegas algorithms for some coloring problems for which only Monte Carlo algorithms were known. We show that in certain conditions when the LLL condition is violated, a variant of the MT algorithm can still produce a distribution which avoids most of the bad events. We show in some cases this MT variant can run faster than the original MT algorithm itself, and develop the first-known criterion for the case of the asymmetric LLL. This can be used to find partial Latin transversals -- improving upon earlier bounds of Stein (1975) -- among other applications. We furthermore give applications in enumeration, showing that most applications (where we aim for all or most of the bad events to be avoided) have many more solutions than known before by proving that the MT-distribution has "large" min-entropy and hence that its support-size is large.
  • Set Cover is a classic NP-hard problem; as shown by Slav\'{i}k (1997) the greedy algorithm gives an approximation ratio of $\ln n - \ln \ln n + \Theta(1)$. A series of works by Lund \& Yannakakis (1994), Feige (1998), Moshkovitz (2015) have shown that, under the assumption $P \neq NP$, it is impossible to obtain a polynomial-time approximation ratio with approximation ratio $(1 - \alpha) \ln n$, for any constant $\alpha > 0$. In this note, we show that under the Exponential Time Hypothesis (a stronger complexity-theoretic assumptions than $P \neq NP$), there are no polynomial-time algorithms achieving approximation ratio $\ln n - C \ln \ln n$, where $C$ is some universal constant. Thus, the greedy algorithm achieves an essentially optimal approximation ratio (up to the coefficient of $\ln \ln n$).
  • While there has been significant progress on algorithmic aspects of the Lov\'{a}sz Local Lemma (LLL) in recent years, a noteworthy exception is when the LLL is used in the context of random permutations. The breakthrough algorithm of Moser & Tardos only works in the setting of independent variables, and does not apply in this context. We resolve this by developing a randomized polynomial-time algorithm for such applications. A noteworthy application is for Latin transversals: the best-known general result here (Bissacot et al., improving on Erd\H{o}s and Spencer), states that any $n \times n$ matrix in which each entry appears at most $(27/256)n$ times, has a Latin transversal. We present the first polynomial-time algorithm to construct such a transversal. We also develop RNC algorithms for Latin transversals, rainbow Hamiltonian cycles, strong chromatic number, and hypergraph packing. In addition to efficiently finding a configuration which avoids bad-events, the algorithm of Moser & Tardos has many powerful extensions and properties. These include a well-characterized distribution on the output distribution, parallel algorithms, and a partial resampling variant. We show that our algorithm has nearly all of the same useful properties as the Moser-Tardos algorithm, and present a comparison of this aspect with recent works on the LLL in general probability spaces.
  • The Lopsided Lov\'{a}sz Local Lemma (LLLL) is a powerful probabilistic principle which has been used in a variety of combinatorial constructions. While originally a general statement about probability spaces, it has recently been transformed into a variety of polynomial-time algorithms. The resampling algorithm of Moser & Tardos (2010) is the most well-known example of this. A variety of criteria have been shown for the LLLL; the strongest possible criterion was shown by Shearer, and other criteria which are easier to use computationally have been shown by Bissacot et al (2011), Pegden (2014), Kolipaka & Szegedy (2011), and Kolipaka, Szegedy, Xu (2012). We show a new criterion for the Moser-Tardos algorithm to converge. This criterion is stronger than the LLLL criterion; this is possible because it does not apply in the same generality as the original LLLL; yet, it is strong enough to cover many applications of the LLLL in combinatorics. We show a variety of new bounds and algorithms. A noteworthy application is for $k$-SAT, with bounded occurrences of variables. As shown in Gebauer, Sz\'{a}bo, and Tardos (2011), a $k$-SAT instance in which every variable appears $L \leq \frac{2^{k+1}}{e (k+1)}$ times, is satisfiable. Although this bound is asymptotically tight (in $k$), we improve it to $L \leq \frac{2^{k+1} (1 - 1/k)^k}{k-1} - \frac{2}{k}$ which can be significantly stronger when $k$ is small. We introduce a new parallel algorithm for the LLLL. While Moser & Tardos described a simple parallel algorithm for the Lov\'{a}sz Local Lemma, and described a simple sequential algorithm for a form of the Lopsided Lemma, they were not able to combine the two. Our new algorithm applies in nearly all settings in which the sequential algorithm works --- this includes settings covered by our new stronger LLLL criterion.
  • Suppose that $a$ and $d$ are positive integers with $a \geq 2$. Let $h_{a,d}(n)$ be the largest integer $t$ such that any set of $n$ points in $\mathbb{R}^d$ contains a subset of $t$ points for which all the non-zero volumes of the ${t \choose a}$ subsets of order $a$ are distinct. Beginning with Erd\H{o}s in 1957, the function $h_{2,d}(n)$ has been closely studied and is known to be at least a power of $n$. We improve the best known bound for $h_{2,d}(n)$ and show that $h_{a,d}(n)$ is at least a power of $n$ for all $a$ and $d$.
  • Whether or not the problem of finding maximal independent sets (MIS) in hypergraphs is in (R)NC is one of the fundamental problems in the theory of parallel computing. Unlike the well-understood case of MIS in graphs, for the hypergraph problem, our knowledge is quite limited despite considerable work. It is known that the problem is in \emph{RNC} when the edges of the hypergraph have constant size. For general hypergraphs with $n$ vertices and $m$ edges, the fastest previously known algorithm works in time $O(\sqrt{n})$ with $\text{poly}(m,n)$ processors. In this paper we give an EREW PRAM algorithm that works in time $n^{o(1)}$ with $\text{poly}(m,n)$ processors on general hypergraphs satisfying $m \leq n^{\frac{\log^{(2)}n}{8(\log^{(3)}n)^2}}$, where $\log^{(2)}n = \log\log n$ and $\log^{(3)}n = \log\log\log n$. Our algorithm is based on a sampling idea that reduces the dimension of the hypergraph and employs the algorithm for constant dimension hypergraphs as a subroutine.