• In this paper, we propose a new approach to solve low-rank tensor completion and robust tensor PCA. Our approach is based on some novel notion of (even-order) tensor ranks, to be called the M-rank, the symmetric M-rank, and the strongly symmetric M-rank. We discuss the connections between these new tensor ranks and the CP-rank and the symmetric CP-rank of an even-order tensor. We show that the M-rank provides a reliable and easy-computable approximation to the CP-rank. As a result, we propose to replace the CP-rank by the M-rank in the low-CP-rank tensor completion and robust tensor PCA. Numerical results suggest that our new approach based on the M-rank outperforms existing methods that are based on low-n-rank, t-SVD and KBR approaches for solving low-rank tensor completion and robust tensor PCA when the underlying tensor has low CP-rank.
  • The modeling and analysis of an LRU cache is extremely challenging as exact results for the main performance metrics (e.g. hit rate) are either lacking or cannot be used because of their high computational complexity for large caches. As a result, various approximations have been proposed. The state-of-the-art method is the so-called TTL approximation, first proposed and shown to be asymptotically exact for IRM requests by Fagin. It has been applied to various other workload models and numerically demonstrated to be accurate but without theoretical justification. In this paper we provide theoretical justification for the approximation in the case where distinct contents are described by independent stationary and ergodic processes. We show that this approximation is exact as the cache size and the number of contents go to infinity. This extends earlier results for the independent reference model. Moreover, we establish results not only for the aggregate cache hit probability but also for every individual content. Last, we obtain bounds on the rate of convergence.
  • Recently, weighted patch representation has been widely studied for alleviating the impact of background information included in bounding box to improve visual tracking results. However, existing weighted patch representation models generally exploit spatial structure information among patches in each frame separately which ignore (1) unary featureof each patch and (2) temporal correlation among patches in different frames. To address this problem, we propose a novel unified temporal coherence and graph optimized ranking model for weighted patch representation in visual tracking problem. There are three main contributions of this paper. First, we propose to employ a flexible graph ranking for patch weight computation which exploits both structure information among patches and unary feature of each patch simultaneously. Second, we propose a new more discriminative ranking model by further considering the temporal correlation among patches in different frames. Third, a neighborhood preserved, low-rank graph is learned and incorporated to build a unified optimized ranking model. Experiments on two benchmark datasets show the benefits of our model.
  • In this paper, localized information privacy (LIP) is proposed, as a new privacy definition, which allows statistical aggregation while protecting users' privacy without relying on a trusted third party. The notion of context-awareness is incorporated in LIP by the introduction of priors, which enables the design of privacy-preserving data aggregation with knowledge of priors. We show that LIP relaxes the Localized Differential Privacy (LDP) notion by explicitly modeling the adversary's knowledge. However, it is stricter than $2\epsilon$-LDP and $\epsilon$-mutual information privacy. The incorporation of local priors allows LIP to achieve higher utility compared to other approaches. We then present an optimization framework for privacy-preserving data aggregation, with the goal of minimizing the expected squared error while satisfying the LIP privacy constraints. Utility-privacy tradeoffs are obtained under several models in closed-form. We then validate our analysis by {numerical analysis} using both synthetic and real-world data. Results show that our LIP mechanism provides better utility-privacy tradeoffs than LDP and when the prior is not uniformly distributed, the advantage of LIP is even more significant.
  • Conjugate partial-symmetric (CPS) tensors are the high-order generalization of Hermitian matrices. As the role played by Hermitian matrices in matrix theory and quadratic optimization, CPS tensors have shown growing interest recently in tensor theory and optimization, particularly in many application-driven complex polynomial optimization problems. In this paper, we study CPS tensors with a focus on ranks, rank-one decompositions and approximations, as well as their applications. The analysis is conducted along side with a more general class of complex tensors called partial-symmetric tensors. We prove constructively that any CPS tensor can be decomposed into a sum of rank-one CPS tensors, which provides an alternative definition of CPS tensors via linear combinations of rank-one CPS tensors. Three types of ranks for CPS tensors are defined and shown to be different in general. This leads to the invalidity of the conjugate version of Comon's conjecture. We then study rank-one approximations and matricizations of CPS tensors. By carefully unfolding CPS tensors to Hermitian matrices, rank-one equivalence can be preserved. This enables us to develop new convex optimization models and algorithms to compute best rank-one approximation of CPS tensors. Numerical experiments from various data are performed to justify the capability of our methods.
  • In this paper, we consider an unconstrained optimization model where the objective is a sum of a large number of possibly nonconvex functions, though overall the objective is assumed to be smooth and convex. Our bid to solving such model uses the framework of cubic regularization of Newton's method.As well known, the crux in cubic regularization is its utilization of the Hessian information, which may be computationally expensive for large-scale problems. To tackle this, we resort to approximating the Hessian matrix via sub-sampling. In particular, we propose to compute an approximated Hessian matrix by either uniformly or non-uniformly sub-sampling the components of the objective. Based upon sub-sampling, we develop both standard and accelerated adaptive cubic regularization approaches and provide theoretical guarantees on global iteration complexity. We show that the standard and accelerated sub-sampled cubic regularization methods achieve iteration complexity in the order of $O(\epsilon^{-1/2})$ and $O(\epsilon^{-1/3})$ respectively, which match those of the original standard and accelerated cubic regularization methods \cite{Cartis-2012-Evaluation, Jiang-2017-Unified} using the full Hessian information. The performances of the proposed methods on regularized logistic regression problems show a clear effect of acceleration in terms of epochs on several real data sets.
  • Computed tomography (CT) examinations are commonly used to predict lung nodule malignancy in patients, which are shown to improve noninvasive early diagnosis of lung cancer. It remains challenging for computational approaches to achieve performance comparable to experienced radiologists. Here we present NoduleX, a systematic approach to predict lung nodule malignancy from CT data, based on deep learning convolutional neural networks (CNN). For training and validation, we analyze >1000 lung nodules in images from the LIDC/IDRI cohort. All nodules were identified and classified by four experienced thoracic radiologists who participated in the LIDC project. NoduleX achieves high accuracy for nodule malignancy classification, with an AUC of ~0.99. This is commensurate with the analysis of the dataset by experienced radiologists. Our approach, NoduleX, provides an effective framework for highly accurate nodule malignancy prediction with the model trained on a large patient population. Our results are replicable with software available at http://bioinformatics.astate.edu/NoduleX.
  • Nonconvex and nonsmooth optimization problems are frequently encountered in much of statistics, business, science and engineering, but they are not yet widely recognized as a technology in the sense of scalability. A reason for this relatively low degree of popularity is the lack of a well developed system of theory and algorithms to support the applications, as is the case for its convex counterpart. This paper aims to take one step in the direction of disciplined nonconvex and nonsmooth optimization. In particular, we consider in this paper some constrained nonconvex optimization models in block decision variables, with or without coupled affine constraints. In the case of without coupled constraints, we show a sublinear rate of convergence to an $\epsilon$-stationary solution in the form of variational inequality for a generalized conditional gradient method, where the convergence rate is shown to be dependent on the H\"olderian continuity of the gradient of the smooth part of the objective. For the model with coupled affine constraints, we introduce corresponding $\epsilon$-stationarity conditions, and apply two proximal-type variants of the ADMM to solve such a model, assuming the proximal ADMM updates can be implemented for all the block variables except for the last block, for which either a gradient step or a majorization-minimization step is implemented. We show an iteration complexity bound of $O(1/\epsilon^2)$ to reach an $\epsilon$-stationary solution for both algorithms. Moreover, we show that the same iteration complexity of a proximal BCD method follows immediately. Numerical results are provided to illustrate the efficacy of the proposed algorithms for tensor robust PCA.
  • In this paper we propose a unified two-phase scheme for convex optimization to accelerate: (1) the adaptive cubic regularization methods with exact/inexact Hessian matrices, and (2) the adaptive gradient method, without any knowledge of the Lipschitz constants for the gradient or the Hessian. This is achieved by tuning the parameters used in the algorithm $\textit{adaptively}$ in its process of progression, which can be viewed as a relaxation over the existing algorithms in the literature. Under the assumption that the sub-problems can be solved approximately, we establish overall iteration complexity bounds for three newly proposed algorithms to obtain an $\epsilon$-approximate solution. Specifically, we show that the adaptive cubic regularization methods with the exact/inexact Hessian matrix both achieve an iteration complexity in the order of $O\left( 1 / \epsilon^{1/3} \right)$, which matches that of the original accelerated cubic regularization method presented in [Nesterov-2008-Accelerating] assuming the availability of the exact Hessian information and the Lipschitz constants, and that the sub-problems are solved to optimality. Under the same two-phase adaptive acceleration framework, the gradient method achieves an iteration complexity in the order $O\left( 1 / \epsilon^{1/2} \right)$, which is known to be best possible (cf. Nesterov-2013-Introductory). Our numerical experiment results show a clear effect of acceleration displayed in the adaptive Newton method with cubic regularization on a set of regularized logistic regression instances.
  • The discrete moment problem is a foundational problem in distribution-free robust optimization, where the goal is to find a worst-case distribution that satisfies a given set of moments. This paper studies the discrete moment problems with additional "shape constraints" that guarantee the worst case distribution is either log-concave or has an increasing failure rate. These classes of shape constraints have not previously been studied in the literature, in part due to their inherent nonconvexities. Nonetheless, these classes of distributions are useful in practice. We characterize the structure of optimal extreme point distributions by developing new results in reverse convex optimization, a lesser-known tool previously employed in designing global optimization algorithms. We are able to show, for example, that an optimal extreme point solution to a moment problem with $m$ moments and log-concave shape constraints is piecewise geometric with at most $m$ pieces. Moreover, this structure allows us to design an exact algorithm for computing optimal solutions in a low-dimensional space of parameters. Moreover, We describe a computational approach to solving these low-dimensional problems, including numerical results for a representative set of instances.
  • In many real-world applications, data usually contain outliers. One popular approach is to use L2,1 norm function as a robust error/loss function. However, the robustness of L2,1 norm function is not well understood so far. In this paper, we propose a new Vector Outlier Regularization (VOR) framework to understand and analyze the robustness of L2,1 norm function. Our VOR function defines a data point to be outlier if it is outside a threshold with respect to a theoretical prediction, and regularize it-pull it back to the threshold line. We then prove that L2,1 function is the limiting case of this VOR with the usual least square/L2 error function as the threshold shrinks to zero. One interesting property of VOR is that how far an outlier lies away from its theoretically predicted value does not affect the final regularization and analysis results. This VOR property unmasks one of the most peculiar property of L2,1 norm function: The effects of outliers seem to be independent of how outlying they are-if an outlier is moved further away from the intrinsic manifold/subspace, the final analysis results do not change. VOR provides a new way to understand and analyze the robustness of L2,1 norm function. Applying VOR to matrix factorization leads to a new VORPCA model. We give a comprehensive comparison with trace-norm based L21-norm PCA to demonstrate the advantages of VORPCA.
  • In many real-world applications, data come with corruptions, large errors or outliers. One popular approach is to use L1-norm function. However, the robustness of L1-norm function is not well understood so far. In this paper, we present a new outlier regularization framework to understand and analyze the robustness of L1-norm function. There are two main features for the proposed outlier regularization. (1) A key property of outlier regularization is that how far an outlier lies away from its theoretically predicted value does not affect the final regularization and analysis results. (2) Another important feature of outlier regularization is that it has an equivalent continuous representation that closely relates to L1 function. This provides a new way to understand and analyze the robustness of L1 function. We apply our outlier regularization framework to PCA and propose an outlier regularized PCA (ORPCA) model. Comparing to the trace-normbased robust PCA, ORPCA has several benefits: (1) It does not suffer singular value suppression. (2) It can retain small high rank components which help retain fine details of data. (3) ORPCA can be computed more efficiently.
  • In this paper, we propose a vector transport-free stochastic variance reduced gradient (SVRG) method with general retraction for empirical risk minimization over Riemannian manifold. Existing SVRG methods on manifold usually consider a specific retraction operation, and involve additional computational costs such as parallel transport or vector transport. The vector transport-free SVRG with general retraction we propose in this paper handles general retraction operations, and do not need additional computational costs mentioned above. As a result, we name our algorithm S-SVRG, where the first "S" means simple. We analyze the iteration complexity of S-SVRG for obtaining an $\epsilon$-stationary point and its local linear convergence by assuming the \L ojasiewicz inequality, which naturally holds for PCA and holds with high probability for matrix completion problem. We also incorporate the Barzilai-Borwein step size and design a very practical S-SVRG-BB method. Numerical results on PCA and matrix completion problems are reported to demonstrate the efficiency of our methods.
  • In many real-world applications, image data often come with noises, corruptions or large errors. One approach to deal with noise image data is to use data recovery techniques which aim to recover the true uncorrupted signals from the observed noise images. In this paper, we first introduce a novel corruption recovery transformation (CRT) model which aims to recover multiple (or a collection of) corrupted images using a single affine transformation. Then, we show that the introduced CRT can be efficiently constructed through learning from training data. Once CRT is learned, we can recover the true signals from the new incoming/test corrupted images explicitly. As an application, we apply our CRT to image recognition task. Experimental results on six image datasets demonstrate that the proposed CRT model is effective in recovering noise image data and thus leads to better recognition results.
  • Cumulative advantage (CA) refers to the notion that accumulated resources foster the accumulation of further resources in competitions, a phenomenon that has been empirically observed in various contexts. The oldest and arguably simplest mathematical model that embodies this general principle is the P\'olya urn process, which finds applications in a myriad of problems. The original model captures the dynamics of competitions between two equally fit agents under linear CA effects, which can be readily generalized to incorporate different fitnesses and nonlinear CA effects. We study two statistics of competitions under the generalized model, namely duration (i.e., time of the last tie) and intensity (i.e., number of ties). We give rigorous mathematical characterizations of the tail distributions of both duration and intensity under the various regimes for fitness and nonlinearity, which reveal very interesting behaviors. For example, fitness superiority induces much shorter competitions in the sublinear regime while much longer competitions in the superlinear regime. Our findings can shed light on the application of P\'olya urn processes in more general contexts where fitness and nonlinearity may be present.
  • The auxiliary problem principle of augmented Lagrangian (APP-AL), proposed by Cohen and Zhu (1984), aims to find the solution of a constrained optimization problem through a sequence of auxiliary problems involving augmented Lagrangian. The merits of this approach are two folds. First, the core function is usually separable, which makes the subproblems at each step decomposable and particularly attractive for parallel computing. Second, the choice of the core function is quite flexible. Consequently, by carefully specifying this function, APP-AL may reduce to some standard optimization algorithms. In this paper, we pursue enhancing such flexibility by allowing the core function to be non-identical at each step of the algorithm, and name it varying auxiliary problem principle (VAPP-AL). Depending on the problem structure, the varying core functions in VAPP-AL can be adapted to design new flexible and suitable algorithm for parallel and distributed computing. The convergence and O(1/t) convergence rate of VAPP-AL for convex problem with coupling objective and constraints is proved. Moreover, if this function is specialized to be quadratic, an o(1/t) convergence rate can be established. Interestingly, the new VAPP framework can cover several variants of Jacobian type augmented Lagrangian decomposition methods as special cases. Furthermore, our technique works for the convex problem with nonseparable objective and multi-blocks coupled linear constraints, which usually can not be handled by ADMM.
  • Complex polynomial optimization has recently gained more and more attention in both theory and practice. In this paper, we study the optimization of a real-valued general conjugate complex form over various popular constraint sets including the m-th roots of complex unity, the complex unit circle, and the complex unit sphere. A real-valued general conjugate complex form is a homogenous polynomial function of complex variables as well as their conjugates, and always takes real values. General conjugate form optimization is a wide class of complex polynomial optimization models, which include many homogenous polynomial optimization in the real domain with either discrete or continuous variables, and Hermitian quadratic form optimization as well as its higher degree extensions. All the problems under consideration are NP-hard in general and we focus on polynomial-time approximation algorithms with worst-case performance ratios. These approximation ratios improve previous results when restricting our problems to some special classes of complex polynomial optimization, and improve or equate previous results when restricting our problems to some special classes of polynomial optimization in the real domain. These algorithms are based on tensor relaxation and random sampling. Our novel technical contributions are to establish the first set of probability lower bounds for random sampling over the m-th root of unity, the complex unit circle, and the complex unit sphere, and propose the first polarization formula linking general conjugate forms and complex multilinear forms.
  • Detecting dependence between two random variables is a fundamental problem. Although the Pearson correlation is effective for capturing linear dependency, it can be entirely powerless for detecting nonlinear and/or heteroscedastic patterns. We introduce a new measure, G-squared, to test whether two univariate random variables are independent and to measure the strength of their relationship. The G-squared is almost identical to the square of the Pearson correlation coefficient, R-squared, for linear relationships with constant error variance, and has the intuitive meaning of the piecewise R-squared between the variables. It is particularly effective in handling nonlinearity and heteroscedastic errors. We propose two estimators of G-squared and show their consistency. Simulations demonstrate that G-squared estimates are among the most powerful test statistics compared with several state-of-the-art methods.
  • In [13], Hillar and Lim famously demonstrated that "multilinear (tensor) analogues of many efficiently computable problems in numerical linear algebra are NP-hard". Despite many recent advancements, the state-of-the-art methods for computing such `tensor analogues' still suffer severely from the curse of dimensionality. In this paper we show that the Tucker core of a tensor however, retains many properties of the original tensor, including the CP rank, the border rank, the tensor Schatten quasi norms, and the Z-eigenvalues. When the core tensor is smaller than the original tensor, this property leads to considerable computational advantages as confirmed by our numerical experiments. In our analysis, we in fact work with a generalized Tucker-like decomposition that can accommodate any full column-rank factor matrices.
  • Optimization problems over permutation matrices appear widely in facility layout, chip design, scheduling, pattern recognition, computer vision, graph matching, etc. Since this problem is NP-hard due to the combinatorial nature of permutation matrices, we relax the variable to be the more tractable doubly stochastic matrices and add an $L_p$-norm ($0 < p < 1$) regularization term to the objective function. The optimal solutions of the $L_p$-regularized problem are the same as the original problem if the regularization parameter is sufficiently large. A lower bound estimation of the nonzero entries of the stationary points and some connections between the local minimizers and the permutation matrices are further established. Then we propose an $L_p$ regularization algorithm with local refinements. The algorithm approximately solves a sequence of $L_p$ regularization subproblems by the projected gradient method using a nonmontone line search with the Barzilai-Borwein step sizes. Its performance can be further improved if it is combined with certain local search methods, the cutting plane techniques as well as a new negative proximal point scheme. Extensive numerical results on QAPLIB and the bandwidth minimization problem show that our proposed algorithms can often find reasonably high quality solutions within a competitive amount of time.
  • Rechargeable lithium ion batteries are an attractive alternative power source for a wide variety of applications. To optimize their performances, a complete description of the solvation properties of the ion in the electrolyte is crucial. A comprehensive understanding at the nanoscale of the solvation structure of lithium ions in nonaqueous carbonate electrolytes is, however, still unclear. We have measured by femtosecond vibrational spectroscopy the orientational correlation time of the CO stretching mode of Li+-bound and Li+-unbound ethylene carbonate molecules, in LiBF4, LiPF6, and LiClO4 ethylene carbonate solutions with different concentrations. Surprisingly, we have found that the coordination number of ethylene carbonate in the first solvation shell of Li+ is only two, in all solutions with concentrations higher than 0.5 M. Density functional theory calculations indicate that the presence of anions in the first coordination shell modifies the generally accepted tetrahedral structure of the complex, allowing only two EC molecules to coordinate to Li+ directly. Our results demonstrate for the first time, to the best of our knowledge, the anion influence on the overall structure of the first solvation shell of the Li+ ion. The formation of such a cation/solvent/anion complex provides a rational explanation for the ionic conductivity drop of lithium/carbonate electrolyte solutions at high concentrations.
  • In this paper we study multivariate polynomial functions in complex variables and the corresponding associated symmetric tensor representations. The focus is on finding conditions under which such complex polynomials/tensors always take real values. We introduce the notion of symmetric conjugate forms and general conjugate forms, and present characteristic conditions for such complex polynomials to be real-valued. As applications of our results, we discuss the relation between nonnegative polynomials and sums of squares in the context of complex polynomials. Moreover, new notions of eigenvalues/eigenvectors for complex tensors are introduced, extending properties from the Hermitian matrices. Finally, we discuss an important property for symmetric tensors, which states that the largest absolute value of eigenvalue of a symmetric real tensor is equal to its largest singular value; the result is known as Banach's theorem. We show that a similar result holds in the complex case as well.
  • An Android-based smart Television (TV) must reliably run its applications in an embedded program environment under diverse hardware resource conditions. Owing to the diverse hardware components used to build numerous TV models, TV simulators are usually not high enough in fidelity to simulate various TV models, and thus are only regarded as unreliable alternatives when stress testing such applications. Therefore, even though stress testing on real TV sets is tedious, it is the de facto approach to ensure the reliability of these applications in the industry. In this paper, we study to what extent stress testing of smart TV applications can be fully automated in the industrial environments. To the best of our knowledge, no previous work has addressed this important question. We summarize the find-ings collected from 10 industrial test engineers to have tested 20 such TV applications in a real production environment. Our study shows that the industry required test automation supports on high-level GUI object controls and status checking, setup of resource conditions and the interplay between the two. With such supports, 87% of the industrial test specifications of one TV model can be fully automated and 71.4% of them were found to be fully reusable to test a subsequent TV model with major up-grades of hardware, operating system and application. It repre-sents a significant improvement with margins of 28% and 38%, respectively, compared to stress testing without such supports.
  • Directed links -- representing asymmetric social ties or interactions (e.g., "follower-followee") -- arise naturally in many social networks and other complex networks, giving rise to directed graphs (or digraphs) as basic topological models for these networks. Reciprocity, defined for a digraph as the percentage of edges with a reciprocal edge, is a key metric that has been used in the literature to compare different directed networks and provide "hints" about their structural properties: for example, are reciprocal edges generated randomly by chance or are there other processes driving their generation? In this paper we study the problem of maximizing achievable reciprocity for an ensemble of digraphs with the same prescribed in- and out-degree sequences. We show that the maximum reciprocity hinges crucially on the in- and out-degree sequences, which may be intuitively interpreted as constraints on some "social capacities" of nodes and impose fundamental limits on achievable reciprocity. We show that it is NP-complete to decide the achievability of a simple upper bound on maximum reciprocity, and provide conditions for achieving it. We demonstrate that many real networks exhibit reciprocities surprisingly close to the upper bound, which implies that users in these social networks are in a sense more "social" than suggested by the empirical reciprocity alone in that they are more willing to reciprocate, subject to their "social capacity" constraints. We find some surprising linear relationships between empirical reciprocity and the bound. We also show that a particular type of small network motifs that we call 3-paths are the major source of loss in reciprocity for real networks.
  • The Peaceman-Rachford splitting method is very efficient for minimizing sum of two functions each depends on its variable, and the constraint is a linear equality. However, its convergence was not guaranteed without extra requirements. Very recently, He et al. (SIAM J. Optim. 24: 1011 - 1040, 2014) proved the convergence of a strictly contractive Peaceman-Rachford splitting method by employing a suitable underdetermined relaxation factor. In this paper, we further extend the so-called strictly contractive Peaceman-Rachford splitting method by using two different relaxation factors, and to make the method more flexible, we introduce semi-proximal terms to the subproblems. We characterize the relation of these two factors, and show that one factor is always underdetermined while the other one is allowed to be larger than 1. Such a flexible conditions makes it possible to cover the Glowinski's ADMM whith larger stepsize. We show that the proposed modified strictly contractive Peaceman-Rachford splitting method is convergent and also prove $O(1/t)$ convergence rate in ergodic and nonergodic sense, respectively. The numerical tests on an extensive collection of problems demonstrate the efficiency of the proposed method.