• A Super-Resolution Framework for Tensor Decomposition(1602.08614)

March 8, 2019 cs.IT, math.IT
This work considers a super-resolution framework for overcomplete tensor decomposition. Specifically, we view tensor decomposition as a super-resolution problem of recovering a sum of Dirac measures on the sphere and solve it by minimizing a continuous analog of the $\ell_1$ norm on the space of measures. The optimal value of this optimization defines the tensor nuclear norm. Similar to the separation condition in the super-resolution problem, by explicitly constructing a dual certificate, we develop incoherence conditions of the tensor factors so that they form the unique optimal solution of the continuous analog of $\ell_1$ norm minimization. Remarkably, the derived incoherence conditions are satisfied with high probability by random tensor factors uniformly distributed on the sphere, implying global identifiability of random tensor factors.
• The Non-convex Geometry of Low-rank Matrix Optimization(1611.03060)

Feb. 21, 2019 cs.IT, math.IT, math.OC
This work considers two popular minimization problems: (i) the minimization of a general convex function $f(\mathbf{X})$ with the domain being positive semi-definite matrices; (ii) the minimization of a general convex function $f(\mathbf{X})$ regularized by the matrix nuclear norm $\|\mathbf{X}\|_*$ with the domain being general matrices. Despite their optimal statistical performance in the literature, these two optimization problems have a high computational complexity even when solved using tailored fast convex solvers. To develop faster and more scalable algorithms, we follow the proposal of Burer and Monteiro to factor the low-rank variable $\mathbf{X} = \mathbf{U}\mathbf{U}^\top$ (for semi-definite matrices) or $\mathbf{X}=\mathbf{U}\mathbf{V}^\top$ (for general matrices) and also replace the nuclear norm $\|\mathbf{X}\|_*$ with $(\|\mathbf{U}\|_F^2+\|\mathbf{V}\|_F^2)/2$. In spite of the non-convexity of the resulting factored formulations, we prove that each critical point either corresponds to the global optimum of the original convex problems or is a strict saddle where the Hessian matrix has a strictly negative eigenvalue. Such a nice geometric structure of the factored formulations allows many local search algorithms to find a global optimizer even with random initializations.
• Approximate Support Recovery of Atomic Line Spectral Estimation: A Tale of Resolution and Precision(1612.01459)

Oct. 23, 2018 cs.IT, math.IT, math.OC
This work investigates the parameter estimation performance of super-resolution line spectral estimation using atomic norm minimization. The focus is on analyzing the algorithm's accuracy of inferring the frequencies and complex magnitudes from noisy observations. When the Signal-to-Noise Ratio is reasonably high and the true frequencies are separated by $O(\frac{1}{n})$, the atomic norm estimator is shown to localize the correct number of frequencies, each within a neighborhood of size $O(\sqrt{{\log n}/{n^3}} \sigma)$ of one of the true frequencies. Here $n$ is half the number of temporal samples and $\sigma^2$ is the Gaussian noise variance. The analysis is based on a primal-dual witness construction procedure. The obtained error bound matches the Cram\'er-Rao lower bound up to a logarithmic factor. The relationship between resolution (separation of frequencies) and precision or accuracy of the estimator is highlighted. Our analysis also reveals that the atomic norm minimization can be viewed as a convex way to solve a $\ell_1$-norm regularized, nonlinear and nonconvex least-squares problem to global optimality.
• Global Optimality in Low-rank Matrix Optimization(1702.07945)

March 2, 2018 cs.IT, math.IT, math.OC
This paper considers the minimization of a general objective function $f(X)$ over the set of rectangular $n\times m$ matrices that have rank at most $r$. To reduce the computational burden, we factorize the variable $X$ into a product of two smaller matrices and optimize over these two matrices instead of $X$. Despite the resulting nonconvexity, recent studies in matrix completion and sensing have shown that the factored problem has no spurious local minima and obeys the so-called strict saddle property (the function has a directional negative curvature at all critical points but local minima). We analyze the global geometry for a general and yet well-conditioned objective function $f(X)$ whose restricted strong convexity and restricted strong smoothness constants are comparable. In particular, we show that the reformulated objective function has no spurious local minima and obeys the strict saddle property. These geometric properties imply that a number of iterative optimization algorithms (such as gradient descent) can provably solve the factored problem with global convergence.
• The Global Optimization Geometry of Low-Rank Matrix Optimization(1703.01256)

Jan. 4, 2018 cs.IT, math.IT, math.OC
This paper considers general rank-constrained optimization problems that minimize a general objective function $f(X)$ over the set of rectangular $n\times m$ matrices that have rank at most $r$. To tackle the rank constraint and also to reduce the computational burden, we factorize $X$ into $UV^T$ where $U$ and $V$ are $n\times r$ and $m\times r$ matrices, respectively, and then optimize over the small matrices $U$ and $V$. We characterize the global optimization geometry of the nonconvex factored problem and show that the corresponding objective function satisfies the robust strict saddle property as long as the original objective function $f$ satisfies restricted strong convexity and smoothness properties, ensuring global convergence of many local search algorithms (such as noisy gradient descent) in polynomial time for solving the factored problem. We also provide a comprehensive analysis for the optimization geometry of a matrix factorization problem where we aim to find $n\times r$ and $m\times r$ matrices $U$ and $V$ such that $UV^T$ approximates a given matrix $X^\star$. Aside from the robust strict saddle property, we show that the objective function of the matrix factorization problem has no spurious local minima and obeys the strict saddle property not only for the exact-parameterization case where $rank(X^\star) = r$, but also for the over-parameterization case where $rank(X^\star) < r$ and the under-parameterization case where $rank(X^\star) > r$. These geometric properties imply that a number of iterative optimization algorithms (such as gradient descent) converge to a global solution with random initialization.
• Atomic Norm Minimization for Modal Analysis from Random and Compressed Samples(1703.00938)

Dec. 21, 2017 cs.IT, math.IT
Modal analysis is the process of estimating a system's modal parameters such as its natural frequencies and mode shapes. One application of modal analysis is in structural health monitoring (SHM), where a network of sensors may be used to collect vibration data from a physical structure such as a building or bridge. There is a growing interest in developing automated techniques for SHM based on data collected in a wireless sensor network. In order to conserve power and extend battery life, however, it is desirable to minimize the amount of data that must be collected and transmitted in such a sensor network. In this paper, we highlight the fact that modal analysis can be formulated as an atomic norm minimization (ANM) problem, which can be solved efficiently and in some cases recover perfectly a structure's mode shapes and frequencies. We survey a broad class of sampling and compression strategies that one might consider in a physical sensor network, and we provide bounds on the sample complexity of these compressive schemes in order to recover a structure's mode shapes and frequencies via ANM. A main contribution of our paper is to establish a bound on the sample complexity of modal analysis with random temporal compression, and in this scenario we prove that the samples per sensor can actually decrease as the number of sensors increases. We also extend an atomic norm denoising problem to the multiple measurement vector (MMV) setting in the case of uniform sampling.
• Geometry of Factored Nuclear Norm Regularization(1704.01265)

April 5, 2017 cs.IT, math.IT, cs.NA, math.OC, cs.LG
This work investigates the geometry of a nonconvex reformulation of minimizing a general convex loss function $f(X)$ regularized by the matrix nuclear norm $\|X\|_*$. Nuclear-norm regularized matrix inverse problems are at the heart of many applications in machine learning, signal processing, and control. The statistical performance of nuclear norm regularization has been studied extensively in literature using convex analysis techniques. Despite its optimal performance, the resulting optimization has high computational complexity when solved using standard or even tailored fast convex solvers. To develop faster and more scalable algorithms, we follow the proposal of Burer-Monteiro to factor the matrix variable $X$ into the product of two smaller rectangular matrices $X=UV^T$ and also replace the nuclear norm $\|X\|_*$ with $(\|U\|_F^2+\|V\|_F^2)/2$. In spite of the nonconvexity of the factored formulation, we prove that when the convex loss function $f(X)$ is $(2r,4r)$-restricted well-conditioned, each critical point of the factored problem either corresponds to the optimal solution $X^\star$ of the original convex optimization or is a strict saddle point where the Hessian matrix has a strictly negative eigenvalue. Such a geometric structure of the factored formulation allows many local search algorithms to converge to the global optimum with random initializations.
• Demixing Sines and Spikes: Robust Spectral Super-resolution in the Presence of Outliers(1609.02247)

March 21, 2017 cs.IT, math.IT, math.OC
We consider the problem of super-resolving the line spectrum of a multisinusoidal signal from a finite number of samples, some of which may be completely corrupted. Measurements of this form can be modeled as an additive mixture of a sinusoidal and a sparse component. We propose to demix the two components and super-resolve the spectrum of the multisinusoidal signal by solving a convex program. Our main theoretical result is that-- up to logarithmic factors-- this approach is guaranteed to be successful with high probability for a number of spectral lines that is linear in the number of measurements, even if a constant fraction of the data are outliers. The result holds under the assumption that the phases of the sinusoidal and sparse components are random and the line spectrum satisfies a minimum-separation condition. We show that the method can be implemented via semidefinite programming and explain how to adapt it in the presence of dense perturbations, as well as exploring its connection to atomic-norm denoising. In addition, we propose a fast greedy demixing method which provides good empirical results when coupled with a local nonconvex-optimization step.
• Super-Resolution of Complex Exponentials from Modulations with Unknown Waveforms(1601.03712)

Aug. 8, 2016 cs.IT, math.IT
Super-resolution is generally referred to as the task of recovering fine details from coarse information. Motivated by applications such as single-molecule imaging, radar imaging, etc., we consider parameter estimation of complex exponentials from their modulations with unknown waveforms, allowing for non-stationary blind super-resolution. This problem, however, is ill-posed since both the parameters associated with the complex exponentials and the modulating waveforms are unknown. To alleviate this, we assume that the unknown waveforms live in a common low-dimensional subspace. Using a lifting trick, we recast the blind super-resolution problem as a structured low-rank matrix recovery problem. Atomic norm minimization is then used to enforce the structured low-rankness, and is reformulated as a semidefinite program that is solvable in polynomial time. We show that, up to scaling ambiguities, exact recovery of both of the complex exponential parameters and the unknown waveforms is possible when the waveform subspace is random and the number of measurements is proportional to the number of degrees of freedom in the problem. Numerical simulations support our theoretical findings, showing that non-stationary blind super-resolution using atomic norm minimization is possible.
• Experimental robustness of Fourier Ptychography phase retrieval algorithms(1511.02986)

Dec. 18, 2015 cs.CV, physics.optics
Fourier ptychography is a new computational microscopy technique that provides gigapixel-scale intensity and phase images with both wide field-of-view and high resolution. By capturing a stack of low-resolution images under different illumination angles, a nonlinear inverse algorithm can be used to computationally reconstruct the high-resolution complex field. Here, we compare and classify multiple proposed inverse algorithms in terms of experimental robustness. We find that the main sources of error are noise, aberrations and mis-calibration (i.e. model mis-match). Using simulations and experiments, we demonstrate that the choice of cost function plays a critical role, with amplitude-based cost functions performing better than intensity-based ones. The reason for this is that Fourier ptychography datasets consist of images from both brightfield and darkfield illumination, representing a large range of measured intensities. Both noise (e.g. Poisson noise) and model mis-match errors are shown to scale with intensity. Hence, algorithms that use an appropriate cost function will be more tolerant to both noise and model mis-match. Given these insights, we propose a global Newton's method algorithm which is robust and computationally efficient. Finally, we discuss the impact of procedures for algorithmic correction of aberrations and mis-calibration.
• Optimal Low-Rank Tensor Recovery from Separable Measurements: Four Contractions Suffice(1505.04085)

May 15, 2015 cs.IT, math.IT, math.OC, cs.LG, stat.ML
Tensors play a central role in many modern machine learning and signal processing applications. In such applications, the target tensor is usually of low rank, i.e., can be expressed as a sum of a small number of rank one tensors. This motivates us to consider the problem of low rank tensor recovery from a class of linear measurements called separable measurements. As specific examples, we focus on two distinct types of separable measurement mechanisms (a) Random projections, where each measurement corresponds to an inner product of the tensor with a suitable random tensor, and (b) the completion problem where measurements constitute revelation of a random set of entries. We present a computationally efficient algorithm, with rigorous and order-optimal sample complexity results (upto logarithmic factors) for tensor recovery. Our method is based on reduction to matrix completion sub-problems and adaptation of Leurgans' method for tensor decomposition. We extend the methodology and sample complexity results to higher order tensors, and experimentally validate our theoretical results.
• Compressed Sensing off the Grid(1207.6053)

July 10, 2013 cs.IT, math.IT
We consider the problem of estimating the frequency components of a mixture of s complex sinusoids from a random subset of n regularly spaced samples. Unlike previous work in compressed sensing, the frequencies are not assumed to lie on a grid, but can assume any values in the normalized frequency domain [0,1]. We propose an atomic norm minimization approach to exactly recover the unobserved samples. We reformulate this atomic norm minimization as an exact semidefinite program. Even with this continuous dictionary, we show that most sampling sets of size O(s log s log n) are sufficient to guarantee the exact frequency estimation with high probability, provided the frequencies are well separated. Numerical experiments are performed to illustrate the effectiveness of the proposed method.
• The Sample Complexity of Search over Multiple Populations(1209.1380)

May 1, 2013 cs.IT, math.IT, stat.ML
This paper studies the sample complexity of searching over multiple populations. We consider a large number of populations, each corresponding to either distribution P0 or P1. The goal of the search problem studied here is to find one population corresponding to distribution P1 with as few samples as possible. The main contribution is to quantify the number of samples needed to correctly find one such population. We consider two general approaches: non-adaptive sampling methods, which sample each population a predetermined number of times until a population following P1 is found, and adaptive sampling methods, which employ sequential sampling schemes for each population. We first derive a lower bound on the number of samples required by any sampling scheme. We then consider an adaptive procedure consisting of a series of sequential probability ratio tests, and show it comes within a constant factor of the lower bound. We give explicit expressions for this constant when samples of the populations follow Gaussian and Bernoulli distributions. An alternative adaptive scheme is discussed which does not require full knowledge of P1, and comes within a constant factor of the optimal scheme. For comparison, a lower bound on the sampling requirements of any non-adaptive scheme is presented.
• Near Minimax Line Spectral Estimation(1303.4348)

April 1, 2013 cs.IT, math.IT
This paper establishes a nearly optimal algorithm for estimating the frequencies and amplitudes of a mixture of sinusoids from noisy equispaced samples. We derive our algorithm by viewing line spectral estimation as a sparse recovery problem with a continuous, infinite dictionary. We show how to compute the estimator via semidefinite programming and provide guarantees on its mean-square error rate. We derive a complementary minimax lower bound on this estimation rate, demonstrating that our approach nearly achieves the best possible estimation error. Furthermore, we establish bounds on how well our estimator localizes the frequencies in the signal, showing that the localization error tends to zero as the number of samples grows. We verify our theoretical results in an array of numerical experiments, demonstrating that the semidefinite programming approach outperforms two classical spectral estimation techniques.
• Atomic norm denoising with applications to line spectral estimation(1204.0562)

Feb. 16, 2013 cs.IT, math.IT
Motivated by recent work on atomic norms in inverse problems, we propose a new approach to line spectral estimation that provides theoretical guarantees for the mean-squared-error (MSE) performance in the presence of noise and without knowledge of the model order. We propose an abstract theory of denoising with atomic norms and specialize this theory to provide a convex optimization problem for estimating the frequencies and phases of a mixture of complex exponentials. We show that the associated convex optimization problem can be solved in polynomial time via semidefinite programming (SDP). We also show that the SDP can be approximated by an l1-regularized least-squares problem that achieves nearly the same error rate as the SDP but can scale to much larger problems. We compare both SDP and l1-based approaches with classical line spectral analysis methods and demonstrate that the SDP outperforms the l1 optimization which outperforms MUSIC, Cadzow's, and Matrix Pencil approaches in terms of MSE over a wide range of signal-to-noise ratios.
• The Stability of Low-Rank Matrix Reconstruction: a Constrained Singular Value View(1006.4088)

Sept. 7, 2012 cs.IT, math.IT
The stability of low-rank matrix reconstruction with respect to noise is investigated in this paper. The $\ell_*$-constrained minimal singular value ($\ell_*$-CMSV) of the measurement operator is shown to determine the recovery performance of nuclear norm minimization based algorithms. Compared with the stability results using the matrix restricted isometry constant, the performance bounds established using $\ell_*$-CMSV are more concise, and their derivations are less complex. Isotropic and subgaussian measurement operators are shown to have $\ell_*$-CMSVs bounded away from zero with high probability, as long as the number of measurements is relatively large. The $\ell_*$-CMSV for correlated Gaussian operators are also analyzed and used to illustrate the advantage of $\ell_*$-CMSV compared with the matrix restricted isometry constant. We also provide a fixed point characterization of $\ell_*$-CMSV that is potentially useful for its computation.
• Linear System Identification via Atomic Norm Regularization(1204.0590)

April 3, 2012 cs.IT, math.IT, math.OC
This paper proposes a new algorithm for linear system identification from noisy measurements. The proposed algorithm balances a data fidelity term with a norm induced by the set of single pole filters. We pose a convex optimization problem that approximately solves the atomic norm minimization problem and identifies the unknown system from noisy linear measurements. This problem can be solved efficiently with standard, freely available software. We provide rigorous statistical guarantees that explicitly bound the estimation error (in the H_2-norm) in terms of the stability radius, the Hankel singular values of the true system and the number of measurements. These results in turn yield complexity bounds and asymptotic consistency. We provide numerical experiments demonstrating the efficacy of our method for estimating linear systems from a variety of linear measurements.
• Fixed point theory and semidefinite programming for computable performance analysis of block-sparsity recovery(1110.1078)

Oct. 5, 2011 math.NA, cs.IT, math.IT
In this paper, we employ fixed point theory and semidefinite programming to compute the performance bounds on convex block-sparsity recovery algorithms. As a prerequisite for optimal sensing matrix design, a computable performance bound would open doors for wide applications in sensor arrays, radar, DNA microarrays, and many other areas where block-sparsity arises naturally. We define a family of goodness measures for arbitrary sensing matrices as the optimal values of certain optimization problems. The reconstruction errors of convex recovery algorithms are bounded in terms of these goodness measures. We demonstrate that as long as the number of measurements is relatively large, these goodness measures are bounded away from zero for a large class of random sensing matrices, a result parallel to the probabilistic analysis of the block restricted isometry property. As the primary contribution of this work, we associate the goodness measures with the fixed points of functions defined by a series of semidefinite programs. This relation with fixed point theory yields efficient algorithms with global convergence guarantees to compute the goodness measures.
• Verifiable and computable performance analysis of sparsity recovery(1102.4868)

Oct. 5, 2011 math.NA, cs.IT, math.IT
In this paper, we develop verifiable and computable performance analysis of sparsity recovery. We define a family of goodness measures for arbitrary sensing matrices as a set of optimization problems, and design algorithms with a theoretical global convergence guarantee to compute these goodness measures. The proposed algorithms solve a series of second-order cone programs, or linear programs. As a by-product, we implement an efficient algorithm to verify a sufficient condition for exact sparsity recovery in the noise-free case. We derive performance bounds on the recovery errors in terms of these goodness measures. We also analytically demonstrate that the developed goodness measures are non-degenerate for a large class of random sensing matrices, as long as the number of measurements is relatively large. Numerical experiments show that, compared with the restricted isometry based performance bounds, our error bounds apply to a wider range of problems and are tighter, when the sparsity levels of the signals are relatively low.
• Performance Analysis of Sparse Recovery Based on Constrained Minimal Singular Values(1004.4222)

Feb. 24, 2011 cs.IT, math.IT
The stability of sparse signal reconstruction is investigated in this paper. We design efficient algorithms to verify the sufficient condition for unique $\ell_1$ sparse recovery. One of our algorithm produces comparable results with the state-of-the-art technique and performs orders of magnitude faster. We show that the $\ell_1$-constrained minimal singular value ($\ell_1$-CMSV) of the measurement matrix determines, in a very concise manner, the recovery performance of $\ell_1$-based algorithms such as the Basis Pursuit, the Dantzig selector, and the LASSO estimator. Compared with performance analysis involving the Restricted Isometry Constant, the arguments in this paper are much less complicated and provide more intuition on the stability of sparse signal recovery. We show also that, with high probability, the subgaussian ensemble generates measurement matrices with $\ell_1$-CMSVs bounded away from zero, as long as the number of measurements is relatively large. To compute the $\ell_1$-CMSV and its lower bound, we design two algorithms based on the interior point algorithm and the semi-definite relaxation.
• Performance analysis for sparse support recovery(0908.0744)

Nov. 5, 2009 cs.IT, math.IT
The performance of estimating the common support for jointly sparse signals based on their projections onto lower-dimensional space is analyzed. Support recovery is formulated as a multiple-hypothesis testing problem. Both upper and lower bounds on the probability of error are derived for general measurement matrices, by using the Chernoff bound and Fano's inequality, respectively. The upper bound shows that the performance is determined by a quantity measuring the measurement matrix incoherence, while the lower bound reveals the importance of the total measurement gain. The lower bound is applied to derive the minimal number of samples needed for accurate direction-of-arrival (DOA) estimation for a sparse representation based algorithm. When applied to Gaussian measurement ensembles, these bounds give necessary and sufficient conditions for a vanishing probability of error for majority realizations of the measurement matrix. Our results offer surprising insights into sparse signal recovery. For example, as far as support recovery is concerned, the well-known bound in Compressive Sensing with the Gaussian measurement matrix is generally not sufficient unless the noise level is low. Our study provides an alternative performance measure, one that is natural and important in practice, for signal recovery in Compressive Sensing and other application areas exploiting signal sparsity.