
We study highdimensional distribution learning in an agnostic setting where
an adversary is allowed to arbitrarily corrupt an $\varepsilon$fraction of the
samples. Such questions have a rich history spanning statistics, machine
learning and theoretical computer science. Even in the most basic settings, the
only known approaches are either computationally inefficient or lose
dimensiondependent factors in their error guarantees. This raises the
following question:Is highdimensional agnostic distribution learning even
possible, algorithmically?
In this work, we obtain the first computationally efficient algorithms with
dimensionindependent error guarantees for agnostically learning several
fundamental classes of highdimensional distributions: (1) a single Gaussian,
(2) a product distribution on the hypercube, (3) mixtures of two product
distributions (under a natural balancedness condition), and (4) mixtures of
spherical Gaussians. Our algorithms achieve error that is independent of the
dimension, and in many cases scales nearlylinearly with the fraction of
adversarially corrupted samples. Moreover, we develop a general recipe for
detecting and correcting corruptions in highdimensions, that may be applicable
to many other problems.

We investigate the problem of learning Bayesian networks in a robust model
where an $\epsilon$fraction of the samples are adversarially corrupted. In
this work, we study the fully observable discrete case where the structure of
the network is given. Even in this basic setting, previous learning algorithms
either run in exponential time or lose dimensiondependent factors in their
error guarantees. We provide the first computationally efficient robust
learning algorithm for this problem with dimensionindependent error
guarantees. Our algorithm has nearoptimal sample complexity, runs in
polynomial time, and achieves error that scales nearlylinearly with the
fraction of adversarially corrupted samples. Finally, we show on both synthetic
and semisynthetic data that our algorithm performs well in practice.

In the Orthogonal Vectors (OV) problem, we wish to determine if there is an
orthogonal pair of vectors among $n$ Boolean vectors in $d$ dimensions. The OV
Conjecture (OVC) posits that OV requires $n^{2o(1)}$ time to solve, for all
$d=\omega(\log n)$. Assuming the OVC, optimal time lower bounds have been
proved for many prominent problems in $P$.
We prove that OVC is true in several computational models of interest:
* For all sufficiently large $n$ and $d$, OV for $n$ vectors in $\{0,1\}^d$
has branching program complexity $\tilde{\Theta}(n\cdot \min(n,2^d))$. In
particular, the lower bounds match the upper bounds up to polylog factors.
* OV has Boolean formula complexity $\tilde{\Theta}(n\cdot \min(n,2^d))$,
over all complete bases of $O(1)$ fanin.
* OV requires $\tilde{\Theta}(n\cdot \min(n,2^d))$ wires, in formulas
comprised of gates computing arbitrary symmetric functions of unbounded fanin.
Our lower bounds basically match the best known (quadratic) lower bounds for
any explicit function in those models. Analogous lower bounds hold for many
related problems shown to be hard under OVC, such as Batch Partial Match, Batch
Subset Queries, and Batch Hamming Nearest Neighbors, all of which have very
succinct reductions to OV.
The proofs use a certain kind of input restriction that is different from
typical random restrictions where variables are assigned independently. We give
a sense in which independent random restrictions cannot be used to show
hardness, in that OVC is false in the "average case" even for $AC^0$ formulas:
* For every fixed $p \in (0,1)$ there is an $\epsilon_p > 0$ such that for
every $n$ and $d$, OV instances where input bits are independently set to $1$
with probability $p$ (and $0$ otherwise) can be solved with $AC^0$ formulas of
size $O(n^{2\epsilon_p})$, on all but a $o_n(1)$ fraction of instances.

We consider the problem of robust polynomial regression, where one receives
samples $(x_i, y_i)$ that are usually within $\sigma$ of a polynomial $y =
p(x)$, but have a $\rho$ chance of being arbitrary adversarial outliers.
Previously, it was known how to efficiently estimate $p$ only when $\rho <
\frac{1}{\log d}$. We give an algorithm that works for the entire feasible
range of $\rho < 1/2$, while simultaneously improving other parameters of the
problem. We complement our algorithm, which gives a factor 2 approximation,
with impossibility results that show, for example, that a $1.09$ approximation
is impossible even with infinitely many samples.

Let $X$ be a finite collection of sets (or "clusters"). We consider the
problem of counting the number of ways a cluster $A \in X$ can be partitioned
into two disjoint clusters $A_1, A_2 \in X$, thus $A = A_1 \uplus A_2$ is the
disjoint union of $A_1$ and $A_2$; this problem arises in the run time analysis
of the ASTRAL algorithm in phylogenetic reconstruction. We obtain the bound $$
 \{ (A_1,A_2,A) \in X \times X \times X: A = A_1 \uplus A_2 \}  \leq
X^{3/p} $$ where $X$ denotes the cardinality of $X$, and $p := \log_3
\frac{27}{4} = 1.73814\dots$, so that $\frac{3}{p} = 1.72598\dots$.
Furthermore, the exponent $p$ cannot be replaced by any larger quantity. This
improves upon the trivial bound of $X^2$. The argument relies on establishing
a onedimensional convolution inequality that can be established by elementary
calculus combined with some numerical verification.
In a similar vein, we show that for any subset $A$ of a discrete cube
$\{0,1\}^n$, the additive energy of $A$ (the number of quadruples
$(a_1,a_2,a_3,a_4)$ in $A^4$ with $a_1+a_2=a_3+a_4$) is at most $A^{\log_2
6}$, and that this exponent is best possible.

Maximally recoverable codes are codes designed for distributed storage which
combine quick recovery from single node failure and optimal recovery from
catastrophic failure. Gopalan et al [SODA 2017] studied the alphabet size
needed for such codes in grid topologies and gave a combinatorial
characterization for it.
Consider a labeling of the edges of the complete bipartite graph $K_{n,n}$
with labels coming from $F_2^d$ , that satisfies the following condition: for
any simple cycle, the sum of the labels over its edges is nonzero. The minimal
d where this is possible controls the alphabet size needed for maximally
recoverable codes in n x n grid topologies.
Prior to the current work, it was known that d is between $(\log n)^2$ and
$n\log n$. We improve both bounds and show that d is linear in n. The upper
bound is a recursive construction which beats the random construction. The
lower bound follows by first relating the problem to the independence number of
the Birkhoff polytope graph, and then providing tight bounds for it using the
representation theory of the symmetric group.

We construct sets $A, B$ in a vector space over $\mathbb{F}_2$ with the
property that $A$ is "statistically" almost closed under addition by $B$ in the
sense that $a + b$ almost always lies in $A$ when $a \in A, b \in B$, but which
is extremely far from being "combinatorially" almost closed under addition by
$B$: if $A' \subset A$, $B' \subset B$ and $A' + B'$ is comparable in size to
$A'$ then $B' \lessapprox B^{1/2}$.

If $E$ is an elliptic curve with a point of order two, then work of Klagsbrun
and Lemke Oliver shows that the distribution of
$\dim_{\mathbb{F}_2}\mathrm{Sel}_\phi(E^d/\mathbb{Q})  \dim_{\mathbb{F}_2}
\mathrm{Sel}_{\hat\phi}(E^{\prime d}/\mathbb{Q})$ within the quadratic twist
family tends to the discrete normal distribution $\mathcal{N}(0,\frac{1}{2}
\log \log X)$ as $X \rightarrow \infty$.
We consider the distribution of $\mathrm{dim}_{\mathbb{F}_2}
\mathrm{Sel}_\phi(E^d/\mathbb{Q})$ within such a quadratic twist family when
$\dim_{\mathbb{F}_2} \mathrm{Sel}_\phi(E^d/\mathbb{Q})  \dim_{\mathbb{F}_2}
\mathrm{Sel}_{\hat\phi}(E^{\prime d}/\mathbb{Q})$ has a fixed value $u$.
Specifically, we show that for every $r$, the limiting probability that
$\dim_{\mathbb{F}_2} \mathrm{Sel}_\phi(E^d/\mathbb{Q}) = r$ is given by an
explicit constant $\alpha_{r,u}$. The constants $\alpha_{r,u}$ are closely
related to the $u$probabilities introduced in Cohen and Lenstra's work on the
distribution of class groups, and thus provide a connection between the
distribution of Selmer groups of elliptic curves and random abelian groups.
Our analysis of this problem has two steps. The first step uses algebraic and
combinatorial methods to directly relate the ranks of the Selmer groups in
question to the dimensions of the kernels of random $\mathbb{F}_2$matrices.
This proves that the density of twists with a given $\phi$Selmer rank $r$ is
given by $\alpha_{r,u}$ for an unusual notion of density. The second step of
the analysis utilizes techniques from analytic number theory to show that this
result implies the correct asymptotics in terms of the natural notion of
density.

This work initiates a systematic investigation of testing highdimensional
structured distributions by focusing on testing Bayesian networks  the
prototypical family of directed graphical models. A Bayesian network is defined
by a directed acyclic graph, where we associate a random variable with each
node. The value at any particular node is conditionally independent of all the
other nondescendant nodes once its parents are fixed. Specifically, we study
the properties of identity testing and closeness testing of Bayesian networks.
Our main contribution is the first nontrivial efficient testing algorithms for
these problems and corresponding informationtheoretic lower bounds. For a wide
range of parameter settings, our testing algorithms have sample complexity
sublinear in the dimension and are sampleoptimal, up to constant factors.

We present a new approach to constructing unconditional pseudorandom
generators against classes of functions that involve computing a linear
function of the inputs. We give an explicit construction of a pseudorandom
generator that fools the discrete Fourier transforms of linear functions with
seedlength that is nearly logarithmic (up to polyloglog factors) in the input
size and the desired error parameter. Our result gives a single pseudorandom
generator that fools several important classes of tests computable in logspace
that have been considered in the literature, including halfspaces (over general
domains), modular tests and combinatorial shapes. For all these classes, our
generator is the first that achieves near logarithmic seedlength in both the
input length and the error parameter. Getting such a seedlength is a natural
challenge in its own right, which needs to be overcome in order to derandomize
RL  a central question in complexity theory.
Our construction combines ideas from a large body of prior work, ranging from
a classical construction of [NN93] to the recent gradually increasing
independence paradigm of [KMN11, CRSW13, GMRTV12], while also introducing some
novel analytic machinery which might find other applications.

The problem of constructing pseudorandom generators that fool halfspaces has
been studied intensively in recent times. For fooling halfspaces over the
hypercube with polynomially small error, the best construction known requires
seedlength O(log^2 n) (MekaZ13). Getting the seedlength down to O(log(n)) is
a natural challenge in its own right, which needs to be overcome in order to
derandomize RL. In this work we make progress towards this goal by obtaining
nearoptimal generators for two important special cases:
1) We give a near optimal derandomization of the Chernoff bound for
independent, uniformly random bits. Specifically, we show how to generate a x
in {1,1}^n using $\tilde{O}(\log (n/\epsilon))$ random bits such that for any
unit vector u, <u,x> matches the subGaussian tail behaviour predicted by the
Chernoff bound up to error eps.
2) We construct a generator which fools halfspaces with {0,1,1} coefficients
with error eps with a seedlength of $\tilde{O}(\log(n/\epsilon))$. This
includes the important special case of majorities.
In both cases, the best previous results required seedlength of $O(\log n +
\log^2(1/\epsilon))$.
Technically, our work combines new Fourieranalytic tools with the iterative
dimension reduction techniques and the gradually increasing independence
paradigm of previous works (KaneMN11, CelisRSW13, GopalanMRTV12).

We give improved pseudorandom generators (PRGs) for Lipschitz functions of
lowdegree polynomials over the hypercube. These are functions of the form
psi(P(x)), where P is a lowdegree polynomial and psi is a function with small
Lipschitz constant. PRGs for smooth functions of lowdegree polynomials have
received a lot of attention recently and play an important role in constructing
PRGs for the natural class of polynomial threshold functions. In spite of the
recent progress, no nontrivial PRGs were known for fooling Lipschitz functions
of degree O(log n) polynomials even for constant error rate. In this work, we
give the first such generator obtaining a seedlength of (log
n)\tilde{O}(d^2/eps^2) for fooling degree d polynomials with error eps.
Previous generators had an exponential dependence on the degree.
We use our PRG to get better integrality gap instances for sparsest cut, a
fundamental problem in graph theory with many applications in graph
optimization. We give an instance of uniform sparsest cut for which a powerful
semidefinite relaxation (SDP) first introduced by Goemans and Linial and
studied in the seminal work of Arora, Rao and Vazirani has an integrality gap
of exp(\Omega((log log n)^{1/2})). Understanding the performance of the
GoemansLinial SDP for uniform sparsest cut is an important open problem in
approximation algorithms and metric embeddings and our work gives a
nearexponential improvement over previous lower bounds which achieved a gap of
\Omega(log log n).

Alexandrov's Theorem states that every metric with the global topology and
local geometry required of a convex polyhedron is in fact the intrinsic metric
of a unique convex polyhedron. Recent work by Bobenko and Izmestiev describes a
differential equation whose solution leads to the polyhedron corresponding to a
given metric. We describe an algorithm based on this differential equation to
compute the polyhedron to arbitrary precision given the metric, and prove a
pseudopolynomial bound on its running time. Along the way, we develop
pseudopolynomial algorithms for computing shortest paths and weighted Delaunay
triangulations on a polyhedral surface, even when the surface edges are not
shortest paths.

When studying the least common multiple of some finite sequences of integers,
the first author introduced the interesting arithmetic functions $g_k$ $(k \in
\mathbb{N})$, defined by $g_k(n) := \frac{n (n + 1) ... (n + k)}{\lcm(n, n + 1,
>..., n + k)}$ $(\forall n \in \mathbb{N} \setminus \{0\})$. He proved that
$g_k$ $(k \in \mathbb{N})$ is periodic and $k!$ is a period of $g_k$. He raised
the open problem consisting to determine the smallest positive period $P_k$ of
$g_k$. Very recently, S. Hong and Y. Yang have improved the period $k!$ of
$g_k$ to $\lcm(1, 2, ..., k)$. In addition, they have conjectured that $P_k$ is
always a multiple of the positive integer $\frac{\lcm(1, 2, >..., k, k + 1)}{k
+ 1}$. An immediate consequence of this conjecture states that if $(k + 1)$ is
prime then the exact period of $g_k$ is precisely equal to $\lcm(1, 2, ...,
k)$.
In this paper, we first prove the conjecture of S. Hong and Y. Yang and then
we give the exact value of $P_k$ $(k \in \mathbb{N})$. We deduce, as a
corollary, that $P_k$ is equal to the part of $\lcm(1, 2, ..., k)$ not
divisible by some prime.