
We consider a robust analog of the planted clique problem. In this analog, a
set $S$ of vertices is chosen and all edges in $S$ are included; then, edges
between $S$ and the rest of the graph are included with probability
$\frac{1}{2}$, while edges not touching $S$ are allowed to vary arbitrarily.
For this semirandom model, we show that the informationtheoretic threshold
for recovery is $\tilde{\Theta}(\sqrt{n})$, in sharp contrast to the classical
informationtheoretic threshold of $\Theta(\log(n))$. This matches the
conjectured computational threshold for the classical planted clique problem,
and thus raises the intriguing possibility that, once we require robustness,
there is no computationalstatistical gap for planted clique. Our lower bound
involves establishing a result regarding the KL divergence of a family of
perturbed Bernoulli distributions, which may be of independent interest.

In high dimensions, most machine learning methods are brittle to even a small
fraction of structured outliers. To address this, we introduce a new
metaalgorithm that can take in a base learner such as least squares or
stochastic gradient descent, and harden the learner to be resistant to
outliers. Our method, Sever, possesses strong theoretical guarantees yet is
also highly scalable  beyond running the base learner itself, it only
requires computing the top singular vector of a certain $n \times d$ matrix. We
apply Sever on a drug design dataset and a spam classification dataset, and
find that in both cases it has substantially greater robustness than several
baselines. On the spam dataset, with $1\%$ corruptions, we achieved $7.4\%$
test error, compared to $13.4\%20.5\%$ for the baselines, and $3\%$ error on
the uncorrupted dataset. Similarly, on the drug design dataset, with $10\%$
corruptions, we achieved $1.42$ meansquared error test error, compared to
$1.51$$2.33$ for the baselines, and $1.23$ error on the uncorrupted dataset.

This report surveys the landscape of potential security threats from
malicious uses of AI, and proposes ways to better forecast, prevent, and
mitigate these threats. After analyzing the ways in which AI may influence the
threat landscape in the digital, physical, and political domains, we make four
highlevel recommendations for AI researchers and other stakeholders. We also
suggest several promising areas for further research that could expand the
portfolio of defenses, or make attacks less effective or harder to execute.
Finally, we discuss, but do not conclusively resolve, the longterm equilibrium
of attackers and defenders.

While neural networks have achieved high accuracy on standard image
classification benchmarks, their accuracy drops to nearly zero in the presence
of small adversarial perturbations to test inputs. Defenses based on
regularization and adversarial training have been proposed, but often followed
by new, stronger attacks that defeat these defenses. Can we somehow end this
arms race? In this work, we study this problem for neural networks with one
hidden layer. We first propose a method based on a semidefinite relaxation that
outputs a certificate that for a given network and test input, no attack can
force the error to exceed a certain value. Second, as this certificate is
differentiable, we jointly optimize it with the network parameters, providing
an adaptive regularizer that encourages robustness against all attacks. On
MNIST, our approach produces a network and a certificate that no attack that
perturbs each pixel by at most \epsilon = 0.1 can cause more than 35% test
error.

The vast majority of theoretical results in machine learning and statistics
assume that the available training data is a reasonably reliable reflection of
the phenomena to be learned or estimated. Similarly, the majority of machine
learning and statistical techniques used in practice are brittle to the
presence of large amounts of biased or malicious data. In this work we consider
two frameworks in which to study estimation, learning, and optimization in the
presence of significant fractions of arbitrary data.
The first framework, listdecodable learning, asks whether it is possible to
return a list of answers, with the guarantee that at least one of them is
accurate. For example, given a dataset of $n$ points for which an unknown
subset of $\alpha n$ points are drawn from a distribution of interest, and no
assumptions are made about the remaining $(1\alpha)n$ points, is it possible
to return a list of $\operatorname{poly}(1/\alpha)$ answers, one of which is
correct? The second framework, which we term the semiverified learning model,
considers the extent to which a small dataset of trusted data (drawn from the
distribution in question) can be leveraged to enable the accurate extraction of
information from a much larger but untrusted dataset (of which only an
$\alpha$fraction is drawn from the distribution).
We show strong positive results in both settings, and provide an algorithm
for robust learning in a very general stochastic optimization setting. This
general result has immediate implications for robust estimation in a number of
settings, including for robustly estimating the mean of distributions with
bounded second moments, robustly learning mixtures of such distributions, and
robustly finding planted partitions in random graphs in which significant
portions of the graph have been perturbed by an adversary.

Machine learning systems trained on userprovided data are susceptible to
data poisoning attacks, whereby malicious users inject false training data with
the aim of corrupting the learned model. While recent work has proposed a
number of attacks and defenses, little is understood about the worstcase loss
of a defense in the face of a determined attacker. We address this by
constructing approximate upper bounds on the loss across a broad family of
attacks, for defenders that first perform outlier removal followed by empirical
risk minimization. Our bound comes paired with a candidate attack that nearly
realizes the bound, giving us a powerful tool for quickly assessing defenses on
a given dataset. Empirically, we find that even under a simple defense, the
MNIST17 and Dogfish datasets are resilient to attack, while in contrast the
IMDB sentiment dataset can be driven from 12% to 23% test error by adding only
3% poisoned data.

We introduce a criterion, resilience, which allows properties of a dataset
(such as its mean or best low rank approximation) to be robustly computed, even
in the presence of a large fraction of arbitrary additional data. Resilience is
a weaker condition than most other properties considered so far in the
literature, and yet enables robust estimation in a broader variety of settings,
including the previously unstudied problem of robust mean estimation in
$\ell_p$norms.

Rapid progress in machine learning and artificial intelligence (AI) has
brought increasing attention to the potential impacts of AI technologies on
society. In this paper we discuss one such potential impact: the problem of
accidents in machine learning systems, defined as unintended and harmful
behavior that may emerge from poor design of realworld AI systems. We present
a list of five practical research problems related to accident risk,
categorized according to whether the problem originates from having the wrong
objective function ("avoiding side effects" and "avoiding reward hacking"), an
objective function that is too expensive to evaluate frequently ("scalable
supervision"), or undesirable behavior during the learning process ("safe
exploration" and "distributional shift"). We review previous work in these
areas as well as suggesting research directions with a focus on relevance to
cuttingedge AI systems. Finally, we consider the highlevel question of how to
think most productively about the safety of forwardlooking applications of AI.

We show how to estimate a model's test error from unlabeled data, on
distributions very different from the training distribution, while assuming
only that certain conditional independencies are preserved between train and
test. We do not need to assume that the optimal predictor is the same between
train and test, or that the true distribution lies in any parametric family. We
can also efficiently differentiate the error estimate to perform unsupervised
discriminative learning. Our technical tool is the method of moments, which
allows us to exploit conditional independencies in the absence of a
fullyspecified model. Our framework encompasses a large family of losses
including the log and exponential loss, and extends to structured output
settings such as hidden Markov models.

We consider a crowdsourcing model in which $n$ workers are asked to rate the
quality of $n$ items previously generated by other workers. An unknown set of
$\alpha n$ workers generate reliable ratings, while the remaining workers may
behave arbitrarily and possibly adversarially. The manager of the experiment
can also manually evaluate the quality of a small number of items, and wishes
to curate together almost all of the highquality items with at most an
$\epsilon$ fraction of lowquality items. Perhaps surprisingly, we show that
this is possible with an amount of work required of the manager, and each
worker, that does not scale with $n$: the dataset can be curated with
$\tilde{O}\Big(\frac{1}{\beta\alpha^3\epsilon^4}\Big)$ ratings per worker, and
$\tilde{O}\Big(\frac{1}{\beta\epsilon^2}\Big)$ ratings by the manager, where
$\beta$ is the fraction of highquality items. Our results extend to the more
general setting of peer prediction, including peer grading in online
classrooms.

A classic tension exists between exact inference in a simple model and
approximate inference in a complex model. The latter offers expressivity and
thus accuracy, but the former provides coverage of the space, an important
property for confidence estimation and learning with indirect supervision. In
this work, we introduce a new approach, reified context models, to reconcile
this tension. Specifically, we let the amount of context (the arity of the
factors in a graphical model) be chosen "at runtime" by reifying itthat is,
letting this choice itself be a random variable inside the model. Empirically,
we show that our approach obtains expressivity and coverage on three natural
language tasks.

Markov Chain Monte Carlo (MCMC) algorithms are often used for approximate
inference inside learning, but their slow mixing can be difficult to diagnose
and the approximations can seriously degrade learning. To alleviate these
issues, we define a new model family using strong Doeblin Markov chains, whose
mixing times can be precisely controlled by a parameter. We also develop an
algorithm to learn such models, which involves maximizing the data likelihood
under the induced stationary distribution of these chains. We show empirical
improvements on two challenging inference tasks.

We present a sparse analogue to stochastic gradient descent that is
guaranteed to perform well under similar conditions to the lasso. In the linear
regression setup with irrepresentable noise features, our algorithm recovers
the support set of the optimal parameter vector with high probability, and
achieves a statistically quasioptimal rate of convergence of Op(k log(d)/T),
where k is the sparsity of the solution, d is the number of features, and T is
the number of training examples. Meanwhile, our algorithm does not require any
more computational resources than stochastic gradient descent. In our
experiments, we find that our method substantially outperforms existing
streaming algorithms on both real and simulated data.

We investigate permutations in terms of their cycle structure and descent
set. To do this, we generalize the classical bijection of Gessel and Reutenauer
to deal with permutations that have some ascending and some descending blocks.
We then provide the first bijective proofs of some known results. We also solve
some problems posed in [3] by Eriksen, Freij, and Wastlund, who study
derangements that descend in blocks of prescribed lengths.

We continue the work of Eriksen, Freij, and Wastlund [3], who study
derangements that descend in blocks of prescribed lengths. We generalize their
work to derangements that ascend in some blocks and descend in others. In
particular, we obtain a generating function for the derangements that ascend in
blocks of prescribed lengths, thus solving a problem posed in [3]. We also work
towards a combinatorial interpretation of a polynomial sum appearing in [3]. As
a result, we obtain a new combinatorial sum for counting derangements with
ascending and descending blocks.

We present a proof, using spectral techniques, that there is no finite
measurable coloring of the odddistance graph.

We investigate subsets of the symmetric group with structure similar to that
of a graph. The trees of these subsets correspond to minimal conjugate
generating sets of the symmetric group. There are two main theorems in this
paper. The first is a characterization of minimal conjugate generating sets of
S_n. The second is a generalization of a result due to Feng characterizing the
automorphism groups of the Cayley graphs formed by certain generating sets
composed of cycles. We compute the full automorphism groups subject to a weak
condition and conjecture that the characterization still holds without the
condition. We also present some computational results in relation to
hamiltonicity of Cayley graphs, including a generalization of the work on
quasihamiltonicity by Gutin and Yeo to undirected graphs.