
Many statistical models seek relationship between variables via subspaces of
reduced dimensions. For instance, in factor models, variables are roughly
distributed around a low dimensional subspace determined by the loading matrix;
in mixed linear regression models, the coefficient vectors for different
mixtures form a subspace that captures all regression functions; in multiple
index models, the effect of covariates is summarized by the effective dimension
reduction space.
Such subspaces are typically unknown, and good estimates are crucial for data
visualization, dimension reduction, diagnostics and estimation of unknown
parameters. Usually, we can estimate these subspaces by computing moments from
data. Often, there are many ways to estimate a subspace, by using moments of
different orders, transformed moments, etc. A natural question is: how can we
combine all these moment conditions and achieve optimality for subspace
estimation?
In this paper, we formulate our problem as estimation of an unknown subspace
$\mathcal{S}$ of dimension $r$, given a set of overidentifying vectors $\{
\mathrm{\bf v}_\ell \}_{\ell=1}^m$ (namely $m \ge r$) that satisfy $\mathbb{E}
\mathrm{\bf v}_{\ell} \in \mathcal{S}$ and have the form $$ \mathrm{\bf v}_\ell
= \frac{1}{n} \sum_{i=1}^n \mathrm{\bf f}_\ell(\mathbf{x}_i, y_i), $$ where
data are i.i.d. and each function $\mathrm{\bf f}_\ell$ is known. By exploiting
certain covariance information related to $\mathrm{\bf v}_\ell$, our estimator
of $\mathcal{S}$ uses an optimal weighting matrix and achieves the smallest
asymptotic error, in terms of canonical angles. The analysis is based on the
generalized method of moments that is tailored to our problem. Our method is
applied to aforementioned models and distributed estimation of heterogeneous
datasets, and may be potentially extended to analyze matrix completion, neural
nets, among others.

Recovering lowrank structures via eigenvector perturbation analysis is a
common problem in statistical machine learning, such as in factor analysis,
community detection, ranking, matrix completion, among others. While a large
variety of bounds are available for average errors between empirical and
population statistics of eigenvectors, few results are tight for entrywise
analyses, which are critical for a number of problems such as community
detection.
This paper investigates entrywise behaviors of eigenvectors for a large class
of random matrices whose expectations are lowrank, which helps settle the
conjecture in Abbe et al. (2014b) that the spectral algorithm achieves exact
recovery in the stochastic block model without any trimming or cleaning steps.
The key is a firstorder approximation of eigenvectors under the $\ell_\infty$
norm: $$u_k \approx \frac{A u_k^*}{\lambda_k^*},$$ where $\{u_k\}$ and
$\{u_k^*\}$ are eigenvectors of a random matrix $A$ and its expectation
$\mathbb{E} A$, respectively. The fact that the approximation is both tight and
linear in $A$ facilitates sharp comparisons between $u_k$ and $u_k^*$. In
particular, it allows for comparing the signs of $u_k$ and $u_k^*$ even if $\
u_k  u_k^*\_{\infty}$ is large. The results are further extended to
perturbations of eigenspaces, yielding new $\ell_\infty$type bounds for
synchronization ($\mathbb{Z}_2$spiked Wigner model) and noisy matrix
completion.

In statistics and machine learning, people are often interested in the
eigenvectors (or singular vectors) of certain matrices (e.g. covariance
matrices, data matrices, etc). However, those matrices are usually perturbed by
noises or statistical errors, either from random sampling or structural
patterns. One usually employs DavisKahan $\sin \theta$ theorem to bound the
difference between the eigenvectors of a matrix $A$ and those of a perturbed
matrix $\widetilde{A} = A + E$, in terms of $\ell_2$ norm. In this paper, we
prove that when $A$ is a lowrank and incoherent matrix, the $\ell_{\infty}$
norm perturbation bound of singular vectors (or eigenvectors in the symmetric
case) is smaller by a factor of $\sqrt{d_1}$ or $\sqrt{d_2}$ for left and right
vectors, where $d_1$ and $d_2$ are the matrix dimensions. The power of this new
perturbation result is shown in robust covariance estimation, particularly when
random variables have heavy tails. There, we propose new robust covariance
estimators and establish their asymptotic properties using the newly developed
perturbation bound. Our theoretical results are verified through extensive
numerical experiments.

The problem of phase synchronization is to estimate the phases (angles) of a
complex unitmodulus vector $z$ from their noisy pairwise relative measurements
$C = zz^* + \sigma W$, where $W$ is a complexvalued Gaussian random matrix.
The maximum likelihood estimator (MLE) is a solution to a unitmodulus
constrained quadratic programming problem, which is nonconvex. Existing works
have proposed polynomialtime algorithms such as a semidefinite relaxation
(SDP) approach or the generalized power method (GPM) to solve it. Numerical
experiments suggest both of these methods succeed with high probability for
$\sigma$ up to $\tilde{\mathcal{O}}(n^{1/2})$, yet, existing analyses only
confirm this observation for $\sigma$ up to $\mathcal{O}(n^{1/4})$. In this
paper, we bridge the gap, by proving SDP is tight for $\sigma =
\mathcal{O}(\sqrt{n /\log n})$, and GPM converges to the global optimum under
the same regime. Moreover, we establish a linear convergence rate for GPM, and
derive a tighter $\ell_\infty$ bound for the MLE. A novel technique we develop
in this paper is to track (theoretically) $n$ closely related sequences of
iterates, in addition to the sequence of iterates GPM actually produces. As a
byproduct, we obtain an $\ell_\infty$ perturbation bound for leading
eigenvectors. Our result also confirms intuitions that use techniques from
statistical mechanics.

RayleighSchr\"{o}dinger perturbation theory is a wellknown theory in
quantum mechanics and it offers useful characterization of eigenvectors of a
perturbed matrix. Suppose $A$ and perturbation $E$ are both Hermitian matrices,
$A^t = A + tE$, $\{\lambda_j\}_{j=1}^n$ are eigenvalues of $A$ in descending
order, and $u_1, u^t_1$ are leading eigenvectors of $A$ and $A^t$.
RayleighSchr\"{o}dinger theory shows asymptotically, $\langle u^t_1, u_j
\rangle \propto t / (\lambda_1  \lambda_j)$ where $ t = o(1)$. However, the
asymptotic theory does not apply to larger $t$; in particular, it fails when $
t \ E \_2 > \lambda_1  \lambda_2$. In this paper, we present a nonasymptotic
theory with $E$ being a random matrix. We prove that, when $t = 1$ and $E$ has
independent and centered subgaussian entries above its diagonal, with high
probability, \begin{equation*}  \langle u^1_1, u_j \rangle  = O(\sqrt{\log n}
/ (\lambda_1  \lambda_j)), \end{equation*} for all $j>1$ simultaneously, under
a condition on eigenvalues of $A$ that involves all gaps $\lambda_1 
\lambda_j$. This bound is valid, even in cases where $\ E \_2 \gg \lambda_1 
\lambda_2$. The result is optimal, except for a log term. It also leads to an
improvement of DavisKahan theorem.

In this paper, we study robust covariance estimation under the approximate
factor model with observed factors. We propose a novel framework to first
estimate the initial joint covariance matrix of the observed data and the
factors, and then use it to recover the covariance matrix of the observed data.
We prove that once the initial matrix estimator is good enough to maintain the
elementwise optimal rate, the whole procedure will generate an estimated
covariance with desired properties. For data with only bounded fourth moments,
we propose to use Huber loss minimization to give the initial joint covariance
estimation. This approach is applicable to a much wider range of distributions,
including subGaussian and elliptical distributions. We also present an
asymptotic result for Huber's Mestimator with a diverging parameter. The
conclusions are demonstrated by extensive simulations and real data analysis.

We consider accurately answering smooth queries while preserving differential
privacy. A query is said to be $K$smooth if it is specified by a function
defined on $[1,1]^d$ whose partial derivatives up to order $K$ are all
bounded. We develop an $\epsilon$differentially private mechanism for the
class of $K$smooth queries. The major advantage of the algorithm is that it
outputs a synthetic database. In real applications, a synthetic database output
is appealing. Our mechanism achieves an accuracy of $O
(n^{\frac{K}{2d+K}}/\epsilon )$, and runs in polynomial time. We also
generalize the mechanism to preserve $(\epsilon, \delta)$differential privacy
with slightly improved accuracy. Extensive experiments on benchmark datasets
demonstrate that the mechanisms have good accuracy and are efficient.