• We consider the $(\ell_p,\ell_r)$-Grothendieck problem, which seeks to maximize the bilinear form $y^T A x$ for an input matrix $A$ over vectors $x,y$ with $\|x\|_p=\|y\|_r=1$. The problem is equivalent to computing the $p \to r^*$ operator norm of $A$. The case $p=r=\infty$ corresponds to the classical Grothendieck problem. Our main result is an algorithm for arbitrary $p,r \ge 2$ with approximation ratio $(1+\epsilon_0)/(\sinh^{-1}(1)\cdot \gamma_{p^*} \,\gamma_{r^*})$ for some fixed $\epsilon_0 \le 0.00863$. Comparing this with Krivine's approximation ratio of $(\pi/2)/\sinh^{-1}(1)$ for the original Grothendieck problem, our guarantee is off from the best known hardness factor of $(\gamma_{p^*} \gamma_{r^*})^{-1}$ for the problem by a factor similar to Krivine's defect. Our approximation follows by bounding the value of the natural vector relaxation for the problem which is convex when $p,r \ge 2$. We give a generalization of random hyperplane rounding and relate the performance of this rounding to certain hypergeometric functions, which prescribe necessary transformations to the vector solution before the rounding is applied. Unlike Krivine's Rounding where the relevant hypergeometric function was $\arcsin$, we have to study a family of hypergeometric functions. The bulk of our technical work then involves methods from complex analysis to gain detailed information about the Taylor series coefficients of the inverses of these hypergeometric functions, which then dictate our approximation factor. Our result also implies improved bounds for "factorization through $\ell_{2}^{\,n}$" of operators from $\ell_{p}^{\,n}$ to $\ell_{q}^{\,m}$ (when $p\geq 2 \geq q$)--- such bounds are of significant interest in functional analysis and our work provides modest supplementary evidence for an intriguing parallel between factorizability, and constant-factor approximability.
  • We study the problem of deleting the smallest set $S$ of vertices (resp.\ edges) from a given graph $G$ such that the induced subgraph (resp.\ subgraph) $G \setminus S$ belongs to some class $\mathcal{H}$. We consider the case where graphs in $\mathcal{H}$ have treewidth bounded by $t$, and give a general framework to obtain approximation algorithms for both vertex and edge-deletion settings from approximation algorithms for certain natural graph partitioning problems called $k$-Subset Vertex Separator and $k$-Subset Edge Separator, respectively. For the vertex deletion setting, our framework combined with the current best result for $k$-Subset Vertex Separator, yields a significant improvement in the approximation ratios for basic problems such as $k$-Treewidth Vertex Deletion and Planar-$F$ Vertex Deletion. Our algorithms are simpler than previous works and give the first uniform approximation algorithms under the natural parameterization. For the edge deletion setting, we give improved approximation algorithms for $k$-Subset Edge Separator combining ideas from LP relaxations and important separators. We present their applications in bounded-degree graphs, and also give an APX-hardness result for the edge deletion problems.
  • We study the problem of computing the $p\rightarrow q$ norm of a matrix $A \in R^{m \times n}$, defined as \[ \|A\|_{p\rightarrow q} ~:=~ \max_{x \,\in\, R^n \setminus \{0\}} \frac{\|Ax\|_q}{\|x\|_p} \] This problem generalizes the spectral norm of a matrix ($p=q=2$) and the Grothendieck problem ($p=\infty$, $q=1$), and has been widely studied in various regimes. When $p \geq q$, the problem exhibits a dichotomy: constant factor approximation algorithms are known if $2 \in [q,p]$, and the problem is hard to approximate within almost polynomial factors when $2 \notin [q,p]$. The regime when $p < q$, known as \emph{hypercontractive norms}, is particularly significant for various applications but much less well understood. The case with $p = 2$ and $q > 2$ was studied by [Barak et al, STOC'12] who gave sub-exponential algorithms for a promise version of the problem (which captures small-set expansion) and also proved hardness of approximation results based on the Exponential Time Hypothesis. However, no NP-hardness of approximation is known for these problems for any $p < q$. We study the hardness of approximating matrix norms in both the above cases and prove the following results: - We show that for any $1< p < q < \infty$ with $2 \notin [p,q]$, $\|A\|_{p\rightarrow q}$ is hard to approximate within $2^{O(\log^{1-\epsilon}\!n)}$ assuming $NP \not\subseteq BPTIME(2^{\log^{O(1)}\!n})$. This suggests that, similar to the case of $p \geq q$, the hypercontractive setting may be qualitatively different when $2$ does not lie between $p$ and $q$. - For all $p \geq q$ with $2 \in [q,p]$, we show $\|A\|_{p\rightarrow q}$ is hard to approximate within any factor than $1/(\gamma_{p^*} \cdot \gamma_q)$, where for any $r$, $\gamma_r$ denotes the $r^{th}$ norm of a gaussian, and $p^*$ is the dual norm of $p$.
  • Given a web-scale graph that grows over time, how should its edges be stored and processed on multiple machines for rapid and accurate estimation of the count of triangles? The count of triangles (i.e., cliques of size three) has proven useful in many applications, including anomaly detection, community detection, and link recommendation. For triangle counting in large and dynamic graphs, recent work has focused largely on streaming algorithms and distributed algorithms. To achieve the advantages of both approaches, we propose DiSLR, a distributed streaming algorithm that estimates the counts of global triangles and local triangles associated with each node. Making one pass over the input stream, DiSLR carefully processes and stores the edges across multiple machines so that the redundant use of computational and storage resources is minimized. Compared to its best competitors, DiSLR is (a) Accurate: giving up to 39X smaller estimation error, (b) Fast: up to 10.4X faster, scaling linearly with the number of edges in the input stream, and (c) Theoretically sound: yielding unbiased estimates with variances decreasing faster as the number of machines is scaled up.
  • Given a set of vertices $V$ with $|V| = n$, a weight vector $w \in (\mathbb{R}^+ \cup \{ 0 \})^{\binom{V}{2}}$, and a probability vector $x \in [0, 1]^{\binom{V}{2}}$ in the matching polytope, we study the quantity $\frac{E_{G}[ \nu_w(G)]}{\sum_{(u, v) \in \binom{V}{2}} w_{u, v} x_{u, v}}$ where $G$ is a random graph where each edge $e$ with weight $w_e$ appears with probability $x_e$ independently, and let $\nu_w(G)$ denotes the weight of the maximum matching of $G$. This quantity is closely related to correlation gap and contention resolution schemes, which are important tools in the design of approximation algorithms, algorithmic game theory, and stochastic optimization. We provide lower bounds for the above quantity for general and bipartite graphs, and for weighted and unweighted settings. he best known upper bound is $0.54$ by Karp and Sipser, and the best lower bound is $0.4$. We show that it is at least $0.47$ for unweighted bipartite graphs, at least $0.45$ for weighted bipartite graphs, and at lea st $0.43$ for weighted general graphs. To achieve our results, we construct local distribution schemes on the dual which may be of independent interest.
  • The computational complexity of multicut-like problems may vary significantly depending on whether the terminals are fixed or not. In this work we present a comprehensive study of this phenomenon in two types of cut problems in directed graphs: double cut and bicut. 1. The fixed-terminal edge-weighted double cut is known to be solvable efficiently. We show a tight approximability factor of $2$ for the fixed-terminal node-weighted double cut. We show that the global node-weighted double cut cannot be approximated to a factor smaller than $3/2$ under the Unique Games Conjecture (UGC). 2. The fixed-terminal edge-weighted bicut is known to have a tight approximability factor of $2$. We show that the global edge-weighted bicut is approximable to a factor strictly better than $2$, and that the global node-weighted bicut cannot be approximated to a factor smaller than $3/2$ under UGC. 3. In relation to these investigations, we also prove two results on undirected graphs which are of independent interest. First, we show NP-completeness and a tight inapproximability bound of $4/3$ for the node-weighted $3$-cut problem. Second, we show that for constant $k$, there exists an efficient algorithm to solve the minimum $\{s,t\}$-separating $k$-cut problem. Our techniques for the algorithms are combinatorial, based on LPs and based on enumeration of approximate min-cuts. Our hardness results are based on combinatorial reductions and integrality gap instances.
  • For an $n$-variate order-$d$ tensor $A$, define $ A_{\max} := \sup_{\| x \|_2 = 1} \langle A , x^{\otimes d} \rangle$ to be the maximum value taken by the tensor on the unit sphere. It is known that for a random tensor with i.i.d $\pm 1$ entries, $A_{\max} \lesssim \sqrt{n\cdot d\cdot\log d}$ w.h.p. We study the problem of efficiently certifying upper bounds on $A_{\max}$ via the natural relaxation from the Sum of Squares (SoS) hierarchy. Our results include: - When $A$ is a random order-$q$ tensor, we prove that $q$ levels of SoS certifies an upper bound $B$ on $A_{\max}$ that satisfies \[ B ~~~~\leq~~ A_{\max} \cdot \biggl(\frac{n}{q^{\,1-o(1)}}\biggr)^{q/4-1/2} \quad \text{w.h.p.} \] Our upper bound improves a result of Montanari and Richard (NIPS 2014) when $q$ is large. - We show the above bound is the best possible up to lower order terms, namely the optimum of the level-$q$ SoS relaxation is at least \[ A_{\max} \cdot \biggl(\frac{n}{q^{\,1+o(1)}}\biggr)^{q/4-1/2} \ . \] - When $A$ is a random order-$d$ tensor, we prove that $q$ levels of SoS certifies an upper bound $B$ on $A_{\max}$ that satisfies \[ B ~~\leq ~~ A_{\max} \cdot \biggl(\frac{\widetilde{O}(n)}{q}\biggr)^{d/4 - 1/2} \quad \text{w.h.p.} \] For growing $q$, this improves upon the bound certified by constant levels of SoS. This answers in part, a question posed by Hopkins, Shi, and Steurer (COLT 2015), who established the tight characterization for constant levels of SoS.
  • We consider goods that can be shared with k-hop neighbors (i.e., the set of nodes within k hops from an owner) on a social network. We examine incentives to buy such a good by devising game-theoretic models where each node decides whether to buy the good or free ride. First, we find that social inefficiency, specifically excessive purchase of the good, occurs in Nash equilibria. Second, the social inefficiency decreases as k increases and thus a good can be shared with more nodes. Third, and most importantly, the social inefficiency can also be significantly reduced by charging free riders an access cost and paying it to owners, leading to the conclusion that organizations and system designers should impose such a cost. These findings are supported by our theoretical analysis in terms of the price of anarchy and the price of stability; and by simulations based on synthetic and real social networks.
  • We consider the following basic problem: given an $n$-variate degree-$d$ homogeneous polynomial $f$ with real coefficients, compute a unit vector $x \in \mathbb{R}^n$ that maximizes $|f(x)|$. Besides its fundamental nature, this problem arises in diverse contexts ranging from tensor and operator norms to graph expansion to quantum information theory. The homogeneous degree $2$ case is efficiently solvable as it corresponds to computing the spectral norm of an associated matrix, but the higher degree case is NP-hard. We give approximation algorithms for this problem that offer a trade-off between the approximation ratio and running time: in $n^{O(q)}$ time, we get an approximation within factor $O_d((n/q)^{d/2-1})$ for arbitrary polynomials, $O_d((n/q)^{d/4-1/2})$ for polynomials with non-negative coefficients, and $O_d(\sqrt{m/q})$ for sparse polynomials with $m$ monomials. The approximation guarantees are with respect to the optimum of the level-$q$ sum-of-squares (SoS) SDP relaxation of the problem. Known polynomial time algorithms for this problem rely on "decoupling lemmas." Such tools are not capable of offering a trade-off like our results as they blow up the number of variables by a factor equal to the degree. We develop new decoupling tools that are more efficient in the number of variables at the expense of less structure in the output polynomials. This enables us to harness the benefits of higher level SoS relaxations. We complement our algorithmic results with some polynomially large integrality gaps, albeit for a slightly weaker (but still very natural) relaxation. Toward this, we give a method to lift a level-$4$ solution matrix $M$ to a higher level solution, under a mild technical condition on $M$.
  • Given a graph $G = (V, E)$ and an integer $k$, we study $k$-Vertex Seperator (resp. $k$-Edge Separator), where the goal is to remove the minimum number of vertices (resp. edges) such that each connected component in the resulting graph has at most $k$ vertices. Our primary focus is on the case where $k$ is either a constant or a slowly growing function of $n$ (e.g. $O(\log n)$ or $n^{o(1)}$). Our problems can be interpreted as a special case of three general classes of problems that have been studied separately (balanced graph partitioning, Hypergraph Vertex Cover (HVC), and fixed parameter tractability (FPT)). Our main result is an $O(\log k)$-approximation algorithm for $k$-Vertex Seperator that runs in time $2^{O(k)} n^{O(1)}$, and an $O(\log k)$-approximation algorithm for $k$-Edge Separator that runs in time $n^{O(1)}$. Our result on $k$-Edge Seperator improves the best previous graph partitioning algorithm for small $k$. Our result on $k$-Vertex Seperator improves the simple $(k+1)$-approximation from HVC. When $OPT > k$, the running time $2^{O(k)} n^{O(1)}$ is faster than the lower bound $k^{\Omega(OPT)} n^{\Omega(1)}$ for exact algorithms assuming the Exponential Time Hypothesis. While the running time of $2^{O(k)} n^{O(1)}$ for $k$-Vertex Separator seems unsatisfactory, we show that the superpolynomial dependence on $k$ may be needed to achieve a polylogarithmic approximation ratio, based on hardness of Densest $k$-Subgraph. We also study $k$-Path Transversal, where the goal is to remove the minimum number of vertices such that there is no simple path of length $k$. With additional ideas from FPT algorithms and graph theory, we present an $O(\log k)$-approximation algorithm for $k$-Path Transversal that runs in time $2^{O(k^3 \log k)} n^{O(1)}$. Previously, the existence of even $(1 - \delta)k$-approximation algorithm for fixed $\delta > 0$ was open.
  • We study variants of the classic $s$-$t$ cut problem and prove the following improved hardness results assuming the Unique Games Conjecture (UGC). - For any constant $k \geq 2$ and $\epsilon > 0$, we show that Directed Multicut with $k$ source-sink pairs is hard to approximate within a factor $k - \epsilon$. This matches the trivial $k$-approximation algorithm. By a simple reduction, our result for $k = 2$ implies that Directed Multiway Cut with two terminals (also known as $s$-$t$ Bicut) is hard to approximate within a factor $2 - \epsilon$, matching the trivial $2$-approximation algorithm. Previously, the best hardness factor for these problems (for constant $k$) was $1.5 - \epsilon$ under the UGC. - For Length-Bounded Cut and Shortest Path Interdiction, we show that both problems are hard to approximate within any constant factor, even if we allow bicriteria approximation. If we want to cut vertices or the graph is directed, our hardness factor for Length-Bounded Cut matches the best approximation ratio up to a constant. Previously, the best hardness factor was $1.1377$ for Length-Bounded Cut and $2$ for Shortest Path Interdiction. - Assuming a variant of the UGC (implied by another variant of Bansal and Khot), we prove that it is hard to approximate Resource Minimization Fire Containment within any constant factor. Previously, the best hardness factor was $2$. Our results are based on a general method of converting an integrality gap instance to a length-control dictatorship test for variants of the $s$-$t$ cut problem, which may be useful for other problems.
  • The k-means problem consists of finding k centers in the d-dimensional Euclidean space that minimize the sum of the squared distances of all points in an input set P to their closest respective center. Awasthi et. al. recently showed that there exists a constant c > 1 such that it is NP-hard to approximate the k-means objective within a factor of c. We establish that the constant c is at least 1.0013.
  • We study the problem of allocating a set of indivisible items to agents with additive utilities to maximize the Nash social welfare. Cole and Gkatzelis recently proved that this problem admits a constant factor approximation. We complement their result by showing that this problem is APX-hard.
  • A hypergraph is said to be $\chi$-colorable if its vertices can be colored with $\chi$ colors so that no hyperedge is monochromatic. $2$-colorability is a fundamental property (called Property B) of hypergraphs and is extensively studied in combinatorics. Algorithmically, however, given a $2$-colorable $k$-uniform hypergraph, it is NP-hard to find a $2$-coloring miscoloring fewer than a fraction $2^{-k+1}$ of hyperedges (which is achieved by a random $2$-coloring), and the best algorithms to color the hypergraph properly require $\approx n^{1-1/k}$ colors, approaching the trivial bound of $n$ as $k$ increases. In this work, we study the complexity of approximate hypergraph coloring, for both the maximization (finding a $2$-coloring with fewest miscolored edges) and minimization (finding a proper coloring using fewest number of colors) versions, when the input hypergraph is promised to have the following stronger properties than $2$-colorability: (A) Low-discrepancy: If the hypergraph has discrepancy $\ell \ll \sqrt{k}$, we give an algorithm to color the it with $\approx n^{O(\ell^2/k)}$ colors. However, for the maximization version, we prove NP-hardness of finding a $2$-coloring miscoloring a smaller than $2^{-O(k)}$ (resp. $k^{-O(k)}$) fraction of the hyperedges when $\ell = O(\log k)$ (resp. $\ell=2$). Assuming the UGC, we improve the latter hardness factor to $2^{-O(k)}$ for almost discrepancy-$1$ hypergraphs. (B) Rainbow colorability: If the hypergraph has a $(k-\ell)$-coloring such that each hyperedge is polychromatic with all these colors, we give a $2$-coloring algorithm that miscolors at most $k^{-\Omega(k)}$ of the hyperedges when $\ell \ll \sqrt{k}$, and complement this with a matching UG hardness result showing that when $\ell =\sqrt{k}$, it is hard to even beat the $2^{-k+1}$ bound achieved by a random coloring.
  • Given an undirected graph $G = (V_G, E_G)$ and a fixed "pattern" graph $H = (V_H, E_H)$ with $k$ vertices, we consider the $H$-Transversal and $H$-Packing problems. The former asks to find the smallest $S \subseteq V_G$ such that the subgraph induced by $V_G \setminus S$ does not have $H$ as a subgraph, and the latter asks to find the maximum number of pairwise disjoint $k$-subsets $S_1, ..., S_m \subseteq V_G$ such that the subgraph induced by each $S_i$ has $H$ as a subgraph. We prove that if $H$ is 2-connected, $H$-Transversal and $H$-Packing are almost as hard to approximate as general $k$-Hypergraph Vertex Cover and $k$-Set Packing, so it is NP-hard to approximate them within a factor of $\Omega (k)$ and $\widetilde \Omega (k)$ respectively. We also show that there is a 1-connected $H$ where $H$-Transversal admits an $O(\log k)$-approximation algorithm, so that the connectivity requirement cannot be relaxed from 2 to 1. For a special case of $H$-Transversal where $H$ is a (family of) cycles, we mention the implication of our result to the related Feedback Vertex Set problem, and give a different hardness proof for directed graphs.
  • The Graph Pricing problem is among the fundamental problems whose approximability is not well-understood. While there is a simple combinatorial 1/4-approximation algorithm, the best hardness result remains at 1/2 assuming the Unique Games Conjecture (UGC). We show that it is NP-hard to approximate within a factor better than 1/4 under the UGC, so that the simple combinatorial algorithm might be the best possible. We also prove that for any $\epsilon > 0$, there exists $\delta > 0$ such that the integrality gap of $n^{\delta}$-rounds of the Sherali-Adams hierarchy of linear programming for Graph Pricing is at most 1/2 + $\epsilon$. This work is based on the effort to view the Graph Pricing problem as a Constraint Satisfaction Problem (CSP) simpler than the standard and complicated formulation. We propose the problem called Generalized Max-Dicut($T$), which has a domain size $T + 1$ for every $T \geq 1$. Generalized Max-Dicut(1) is well-known Max-Dicut. There is an approximation-preserving reduction from Generalized Max-Dicut on directed acyclic graphs (DAGs) to Graph Pricing, and both our results are achieved through this reduction. Besides its connection to Graph Pricing, the hardness of Generalized Max-Dicut is interesting in its own right since in most arity two CSPs studied in the literature, SDP-based algorithms perform better than LP-based or combinatorial algorithms --- for this arity two CSP, a simple combinatorial algorithm does the best.
  • Random (dv,dc)-regular LDPC codes are well-known to achieve the Shannon capacity of the binary symmetric channel (for sufficiently large dv and dc) under exponential time decoding. However, polynomial time algorithms are only known to correct a much smaller fraction of errors. One of the most powerful polynomial-time algorithms with a formal analysis is the LP decoding algorithm of Feldman et al. which is known to correct an Omega(1/dc) fraction of errors. In this work, we show that fairly powerful extensions of LP decoding, based on the Sherali-Adams and Lasserre hierarchies, fail to correct much more errors than the basic LP-decoder. In particular, we show that: 1) For any values of dv and dc, a linear number of rounds of the Sherali-Adams LP hierarchy cannot correct more than an O(1/dc) fraction of errors on a random (dv,dc)-regular LDPC code. 2) For any value of dv and infinitely many values of dc, a linear number of rounds of the Lasserre SDP hierarchy cannot correct more than an O(1/dc) fraction of errors on a random (dv,dc)-regular LDPC code. Our proofs use a new stretching and collapsing technique that allows us to leverage recent progress in the study of the limitations of LP/SDP hierarchies for Maximum Constraint Satisfaction Problems (Max-CSPs). The problem then reduces to the construction of special balanced pairwise independent distributions for Sherali-Adams and special cosets of balanced pairwise independent subgroups for Lasserre. Some of our techniques are more generally applicable to a large class of Boolean CSPs called Min-Ones. In particular, for k-Hypergraph Vertex Cover, we obtain an improved integrality gap of $k-1-\epsilon$ that holds after a \emph{linear} number of rounds of the Lasserre hierarchy, for any k = q+1 with q an arbitrary prime power. The best previous gap for a linear number of rounds was equal to $2-\epsilon$ and due to Schoenebeck.