
In this paper, we propose a new approach to solve lowrank tensor completion
and robust tensor PCA. Our approach is based on some novel notion of
(evenorder) tensor ranks, to be called the Mrank, the symmetric Mrank, and
the strongly symmetric Mrank. We discuss the connections between these new
tensor ranks and the CPrank and the symmetric CPrank of an evenorder tensor.
We show that the Mrank provides a reliable and easycomputable approximation
to the CPrank. As a result, we propose to replace the CPrank by the Mrank in
the lowCPrank tensor completion and robust tensor PCA. Numerical results
suggest that our new approach based on the Mrank outperforms existing methods
that are based on lownrank, tSVD and KBR approaches for solving lowrank
tensor completion and robust tensor PCA when the underlying tensor has low
CPrank.

The modeling and analysis of an LRU cache is extremely challenging as exact
results for the main performance metrics (e.g. hit rate) are either lacking or
cannot be used because of their high computational complexity for large caches.
As a result, various approximations have been proposed. The stateoftheart
method is the socalled TTL approximation, first proposed and shown to be
asymptotically exact for IRM requests by Fagin. It has been applied to various
other workload models and numerically demonstrated to be accurate but without
theoretical justification. In this paper we provide theoretical justification
for the approximation in the case where distinct contents are described by
independent stationary and ergodic processes. We show that this approximation
is exact as the cache size and the number of contents go to infinity. This
extends earlier results for the independent reference model. Moreover, we
establish results not only for the aggregate cache hit probability but also for
every individual content. Last, we obtain bounds on the rate of convergence.

Recently, weighted patch representation has been widely studied for
alleviating the impact of background information included in bounding box to
improve visual tracking results. However, existing weighted patch
representation models generally exploit spatial structure information among
patches in each frame separately which ignore (1) unary featureof each patch
and (2) temporal correlation among patches in different frames. To address this
problem, we propose a novel unified temporal coherence and graph optimized
ranking model for weighted patch representation in visual tracking problem.
There are three main contributions of this paper. First, we propose to employ a
flexible graph ranking for patch weight computation which exploits both
structure information among patches and unary feature of each patch
simultaneously. Second, we propose a new more discriminative ranking model by
further considering the temporal correlation among patches in different frames.
Third, a neighborhood preserved, lowrank graph is learned and incorporated to
build a unified optimized ranking model. Experiments on two benchmark datasets
show the benefits of our model.

In this paper, localized information privacy (LIP) is proposed, as a new
privacy definition, which allows statistical aggregation while protecting
users' privacy without relying on a trusted third party. The notion of
contextawareness is incorporated in LIP by the introduction of priors, which
enables the design of privacypreserving data aggregation with knowledge of
priors. We show that LIP relaxes the Localized Differential Privacy (LDP)
notion by explicitly modeling the adversary's knowledge. However, it is
stricter than $2\epsilon$LDP and $\epsilon$mutual information privacy. The
incorporation of local priors allows LIP to achieve higher utility compared to
other approaches. We then present an optimization framework for
privacypreserving data aggregation, with the goal of minimizing the expected
squared error while satisfying the LIP privacy constraints. Utilityprivacy
tradeoffs are obtained under several models in closedform. We then validate
our analysis by {numerical analysis} using both synthetic and realworld data.
Results show that our LIP mechanism provides better utilityprivacy tradeoffs
than LDP and when the prior is not uniformly distributed, the advantage of LIP
is even more significant.

Conjugate partialsymmetric (CPS) tensors are the highorder generalization
of Hermitian matrices. As the role played by Hermitian matrices in matrix
theory and quadratic optimization, CPS tensors have shown growing interest
recently in tensor theory and optimization, particularly in many
applicationdriven complex polynomial optimization problems. In this paper, we
study CPS tensors with a focus on ranks, rankone decompositions and
approximations, as well as their applications. The analysis is conducted along
side with a more general class of complex tensors called partialsymmetric
tensors. We prove constructively that any CPS tensor can be decomposed into a
sum of rankone CPS tensors, which provides an alternative definition of CPS
tensors via linear combinations of rankone CPS tensors. Three types of ranks
for CPS tensors are defined and shown to be different in general. This leads to
the invalidity of the conjugate version of Comon's conjecture. We then study
rankone approximations and matricizations of CPS tensors. By carefully
unfolding CPS tensors to Hermitian matrices, rankone equivalence can be
preserved. This enables us to develop new convex optimization models and
algorithms to compute best rankone approximation of CPS tensors. Numerical
experiments from various data are performed to justify the capability of our
methods.

In this paper, we consider an unconstrained optimization model where the
objective is a sum of a large number of possibly nonconvex functions, though
overall the objective is assumed to be smooth and convex. Our bid to solving
such model uses the framework of cubic regularization of Newton's method.As
well known, the crux in cubic regularization is its utilization of the Hessian
information, which may be computationally expensive for largescale problems.
To tackle this, we resort to approximating the Hessian matrix via subsampling.
In particular, we propose to compute an approximated Hessian matrix by either
uniformly or nonuniformly subsampling the components of the objective. Based
upon subsampling, we develop both standard and accelerated adaptive cubic
regularization approaches and provide theoretical guarantees on global
iteration complexity. We show that the standard and accelerated subsampled
cubic regularization methods achieve iteration complexity in the order of
$O(\epsilon^{1/2})$ and $O(\epsilon^{1/3})$ respectively, which match those
of the original standard and accelerated cubic regularization methods
\cite{Cartis2012Evaluation, Jiang2017Unified} using the full Hessian
information. The performances of the proposed methods on regularized logistic
regression problems show a clear effect of acceleration in terms of epochs on
several real data sets.

Computed tomography (CT) examinations are commonly used to predict lung
nodule malignancy in patients, which are shown to improve noninvasive early
diagnosis of lung cancer. It remains challenging for computational approaches
to achieve performance comparable to experienced radiologists. Here we present
NoduleX, a systematic approach to predict lung nodule malignancy from CT data,
based on deep learning convolutional neural networks (CNN). For training and
validation, we analyze >1000 lung nodules in images from the LIDC/IDRI cohort.
All nodules were identified and classified by four experienced thoracic
radiologists who participated in the LIDC project. NoduleX achieves high
accuracy for nodule malignancy classification, with an AUC of ~0.99. This is
commensurate with the analysis of the dataset by experienced radiologists. Our
approach, NoduleX, provides an effective framework for highly accurate nodule
malignancy prediction with the model trained on a large patient population. Our
results are replicable with software available at
http://bioinformatics.astate.edu/NoduleX.

Nonconvex and nonsmooth optimization problems are frequently encountered in
much of statistics, business, science and engineering, but they are not yet
widely recognized as a technology in the sense of scalability. A reason for
this relatively low degree of popularity is the lack of a well developed system
of theory and algorithms to support the applications, as is the case for its
convex counterpart. This paper aims to take one step in the direction of
disciplined nonconvex and nonsmooth optimization. In particular, we consider in
this paper some constrained nonconvex optimization models in block decision
variables, with or without coupled affine constraints. In the case of without
coupled constraints, we show a sublinear rate of convergence to an
$\epsilon$stationary solution in the form of variational inequality for a
generalized conditional gradient method, where the convergence rate is shown to
be dependent on the H\"olderian continuity of the gradient of the smooth part
of the objective. For the model with coupled affine constraints, we introduce
corresponding $\epsilon$stationarity conditions, and apply two proximaltype
variants of the ADMM to solve such a model, assuming the proximal ADMM updates
can be implemented for all the block variables except for the last block, for
which either a gradient step or a majorizationminimization step is
implemented. We show an iteration complexity bound of $O(1/\epsilon^2)$ to
reach an $\epsilon$stationary solution for both algorithms. Moreover, we show
that the same iteration complexity of a proximal BCD method follows
immediately. Numerical results are provided to illustrate the efficacy of the
proposed algorithms for tensor robust PCA.

In this paper we propose a unified twophase scheme for convex optimization
to accelerate: (1) the adaptive cubic regularization methods with exact/inexact
Hessian matrices, and (2) the adaptive gradient method, without any knowledge
of the Lipschitz constants for the gradient or the Hessian. This is achieved by
tuning the parameters used in the algorithm $\textit{adaptively}$ in its
process of progression, which can be viewed as a relaxation over the existing
algorithms in the literature. Under the assumption that the subproblems can be
solved approximately, we establish overall iteration complexity bounds for
three newly proposed algorithms to obtain an $\epsilon$approximate solution.
Specifically, we show that the adaptive cubic regularization methods with the
exact/inexact Hessian matrix both achieve an iteration complexity in the order
of $O\left( 1 / \epsilon^{1/3} \right)$, which matches that of the original
accelerated cubic regularization method presented in
[Nesterov2008Accelerating] assuming the availability of the exact Hessian
information and the Lipschitz constants, and that the subproblems are solved
to optimality. Under the same twophase adaptive acceleration framework, the
gradient method achieves an iteration complexity in the order $O\left( 1 /
\epsilon^{1/2} \right)$, which is known to be best possible (cf.
Nesterov2013Introductory). Our numerical experiment results show a clear
effect of acceleration displayed in the adaptive Newton method with cubic
regularization on a set of regularized logistic regression instances.

The discrete moment problem is a foundational problem in distributionfree
robust optimization, where the goal is to find a worstcase distribution that
satisfies a given set of moments. This paper studies the discrete moment
problems with additional "shape constraints" that guarantee the worst case
distribution is either logconcave or has an increasing failure rate. These
classes of shape constraints have not previously been studied in the
literature, in part due to their inherent nonconvexities. Nonetheless, these
classes of distributions are useful in practice. We characterize the structure
of optimal extreme point distributions by developing new results in reverse
convex optimization, a lesserknown tool previously employed in designing
global optimization algorithms. We are able to show, for example, that an
optimal extreme point solution to a moment problem with $m$ moments and
logconcave shape constraints is piecewise geometric with at most $m$ pieces.
Moreover, this structure allows us to design an exact algorithm for computing
optimal solutions in a lowdimensional space of parameters. Moreover, We
describe a computational approach to solving these lowdimensional problems,
including numerical results for a representative set of instances.

In many realworld applications, data usually contain outliers. One popular
approach is to use L2,1 norm function as a robust error/loss function. However,
the robustness of L2,1 norm function is not well understood so far. In this
paper, we propose a new Vector Outlier Regularization (VOR) framework to
understand and analyze the robustness of L2,1 norm function. Our VOR function
defines a data point to be outlier if it is outside a threshold with respect to
a theoretical prediction, and regularize itpull it back to the threshold line.
We then prove that L2,1 function is the limiting case of this VOR with the
usual least square/L2 error function as the threshold shrinks to zero. One
interesting property of VOR is that how far an outlier lies away from its
theoretically predicted value does not affect the final regularization and
analysis results. This VOR property unmasks one of the most peculiar property
of L2,1 norm function: The effects of outliers seem to be independent of how
outlying they areif an outlier is moved further away from the intrinsic
manifold/subspace, the final analysis results do not change. VOR provides a new
way to understand and analyze the robustness of L2,1 norm function. Applying
VOR to matrix factorization leads to a new VORPCA model. We give a
comprehensive comparison with tracenorm based L21norm PCA to demonstrate the
advantages of VORPCA.

In many realworld applications, data come with corruptions, large errors or
outliers. One popular approach is to use L1norm function. However, the
robustness of L1norm function is not well understood so far. In this paper, we
present a new outlier regularization framework to understand and analyze the
robustness of L1norm function. There are two main features for the proposed
outlier regularization. (1) A key property of outlier regularization is that
how far an outlier lies away from its theoretically predicted value does not
affect the final regularization and analysis results. (2) Another important
feature of outlier regularization is that it has an equivalent continuous
representation that closely relates to L1 function. This provides a new way to
understand and analyze the robustness of L1 function. We apply our outlier
regularization framework to PCA and propose an outlier regularized PCA (ORPCA)
model. Comparing to the tracenormbased robust PCA, ORPCA has several benefits:
(1) It does not suffer singular value suppression. (2) It can retain small high
rank components which help retain fine details of data. (3) ORPCA can be
computed more efficiently.

In this paper, we propose a vector transportfree stochastic variance reduced
gradient (SVRG) method with general retraction for empirical risk minimization
over Riemannian manifold. Existing SVRG methods on manifold usually consider a
specific retraction operation, and involve additional computational costs such
as parallel transport or vector transport. The vector transportfree SVRG with
general retraction we propose in this paper handles general retraction
operations, and do not need additional computational costs mentioned above. As
a result, we name our algorithm SSVRG, where the first "S" means simple. We
analyze the iteration complexity of SSVRG for obtaining an
$\epsilon$stationary point and its local linear convergence by assuming the \L
ojasiewicz inequality, which naturally holds for PCA and holds with high
probability for matrix completion problem. We also incorporate the
BarzilaiBorwein step size and design a very practical SSVRGBB method.
Numerical results on PCA and matrix completion problems are reported to
demonstrate the efficiency of our methods.

In many realworld applications, image data often come with noises,
corruptions or large errors. One approach to deal with noise image data is to
use data recovery techniques which aim to recover the true uncorrupted signals
from the observed noise images. In this paper, we first introduce a novel
corruption recovery transformation (CRT) model which aims to recover multiple
(or a collection of) corrupted images using a single affine transformation.
Then, we show that the introduced CRT can be efficiently constructed through
learning from training data. Once CRT is learned, we can recover the true
signals from the new incoming/test corrupted images explicitly. As an
application, we apply our CRT to image recognition task. Experimental results
on six image datasets demonstrate that the proposed CRT model is effective in
recovering noise image data and thus leads to better recognition results.

Cumulative advantage (CA) refers to the notion that accumulated resources
foster the accumulation of further resources in competitions, a phenomenon that
has been empirically observed in various contexts. The oldest and arguably
simplest mathematical model that embodies this general principle is the P\'olya
urn process, which finds applications in a myriad of problems. The original
model captures the dynamics of competitions between two equally fit agents
under linear CA effects, which can be readily generalized to incorporate
different fitnesses and nonlinear CA effects. We study two statistics of
competitions under the generalized model, namely duration (i.e., time of the
last tie) and intensity (i.e., number of ties). We give rigorous mathematical
characterizations of the tail distributions of both duration and intensity
under the various regimes for fitness and nonlinearity, which reveal very
interesting behaviors. For example, fitness superiority induces much shorter
competitions in the sublinear regime while much longer competitions in the
superlinear regime. Our findings can shed light on the application of P\'olya
urn processes in more general contexts where fitness and nonlinearity may be
present.

The auxiliary problem principle of augmented Lagrangian (APPAL), proposed by
Cohen and Zhu (1984), aims to find the solution of a constrained optimization
problem through a sequence of auxiliary problems involving augmented
Lagrangian. The merits of this approach are two folds. First, the core function
is usually separable, which makes the subproblems at each step decomposable and
particularly attractive for parallel computing. Second, the choice of the core
function is quite flexible. Consequently, by carefully specifying this
function, APPAL may reduce to some standard optimization algorithms. In this
paper, we pursue enhancing such flexibility by allowing the core function to be
nonidentical at each step of the algorithm, and name it varying auxiliary
problem principle (VAPPAL). Depending on the problem structure, the varying
core functions in VAPPAL can be adapted to design new flexible and suitable
algorithm for parallel and distributed computing. The convergence and O(1/t)
convergence rate of VAPPAL for convex problem with coupling objective and
constraints is proved. Moreover, if this function is specialized to be
quadratic, an o(1/t) convergence rate can be established. Interestingly, the
new VAPP framework can cover several variants of Jacobian type augmented
Lagrangian decomposition methods as special cases. Furthermore, our technique
works for the convex problem with nonseparable objective and multiblocks
coupled linear constraints, which usually can not be handled by ADMM.

Complex polynomial optimization has recently gained more and more attention
in both theory and practice. In this paper, we study the optimization of a
realvalued general conjugate complex form over various popular constraint sets
including the mth roots of complex unity, the complex unit circle, and the
complex unit sphere. A realvalued general conjugate complex form is a
homogenous polynomial function of complex variables as well as their
conjugates, and always takes real values. General conjugate form optimization
is a wide class of complex polynomial optimization models, which include many
homogenous polynomial optimization in the real domain with either discrete or
continuous variables, and Hermitian quadratic form optimization as well as its
higher degree extensions. All the problems under consideration are NPhard in
general and we focus on polynomialtime approximation algorithms with
worstcase performance ratios. These approximation ratios improve previous
results when restricting our problems to some special classes of complex
polynomial optimization, and improve or equate previous results when
restricting our problems to some special classes of polynomial optimization in
the real domain. These algorithms are based on tensor relaxation and random
sampling. Our novel technical contributions are to establish the first set of
probability lower bounds for random sampling over the mth root of unity, the
complex unit circle, and the complex unit sphere, and propose the first
polarization formula linking general conjugate forms and complex multilinear
forms.

Detecting dependence between two random variables is a fundamental problem.
Although the Pearson correlation is effective for capturing linear dependency,
it can be entirely powerless for detecting nonlinear and/or heteroscedastic
patterns. We introduce a new measure, Gsquared, to test whether two univariate
random variables are independent and to measure the strength of their
relationship. The Gsquared is almost identical to the square of the Pearson
correlation coefficient, Rsquared, for linear relationships with constant
error variance, and has the intuitive meaning of the piecewise Rsquared
between the variables. It is particularly effective in handling nonlinearity
and heteroscedastic errors. We propose two estimators of Gsquared and show
their consistency. Simulations demonstrate that Gsquared estimates are among
the most powerful test statistics compared with several stateoftheart
methods.

In [13], Hillar and Lim famously demonstrated that "multilinear (tensor)
analogues of many efficiently computable problems in numerical linear algebra
are NPhard". Despite many recent advancements, the stateoftheart methods
for computing such `tensor analogues' still suffer severely from the curse of
dimensionality. In this paper we show that the Tucker core of a tensor however,
retains many properties of the original tensor, including the CP rank, the
border rank, the tensor Schatten quasi norms, and the Zeigenvalues. When the
core tensor is smaller than the original tensor, this property leads to
considerable computational advantages as confirmed by our numerical
experiments. In our analysis, we in fact work with a generalized Tuckerlike
decomposition that can accommodate any full columnrank factor matrices.

Optimization problems over permutation matrices appear widely in facility
layout, chip design, scheduling, pattern recognition, computer vision, graph
matching, etc. Since this problem is NPhard due to the combinatorial nature of
permutation matrices, we relax the variable to be the more tractable doubly
stochastic matrices and add an $L_p$norm ($0 < p < 1$) regularization term to
the objective function. The optimal solutions of the $L_p$regularized problem
are the same as the original problem if the regularization parameter is
sufficiently large. A lower bound estimation of the nonzero entries of the
stationary points and some connections between the local minimizers and the
permutation matrices are further established. Then we propose an $L_p$
regularization algorithm with local refinements. The algorithm approximately
solves a sequence of $L_p$ regularization subproblems by the projected gradient
method using a nonmontone line search with the BarzilaiBorwein step sizes. Its
performance can be further improved if it is combined with certain local search
methods, the cutting plane techniques as well as a new negative proximal point
scheme. Extensive numerical results on QAPLIB and the bandwidth minimization
problem show that our proposed algorithms can often find reasonably high
quality solutions within a competitive amount of time.

Rechargeable lithium ion batteries are an attractive alternative power source
for a wide variety of applications. To optimize their performances, a complete
description of the solvation properties of the ion in the electrolyte is
crucial. A comprehensive understanding at the nanoscale of the solvation
structure of lithium ions in nonaqueous carbonate electrolytes is, however,
still unclear. We have measured by femtosecond vibrational spectroscopy the
orientational correlation time of the CO stretching mode of Li+bound and
Li+unbound ethylene carbonate molecules, in LiBF4, LiPF6, and LiClO4 ethylene
carbonate solutions with different concentrations. Surprisingly, we have found
that the coordination number of ethylene carbonate in the first solvation shell
of Li+ is only two, in all solutions with concentrations higher than 0.5 M.
Density functional theory calculations indicate that the presence of anions in
the first coordination shell modifies the generally accepted tetrahedral
structure of the complex, allowing only two EC molecules to coordinate to Li+
directly. Our results demonstrate for the first time, to the best of our
knowledge, the anion influence on the overall structure of the first solvation
shell of the Li+ ion. The formation of such a cation/solvent/anion complex
provides a rational explanation for the ionic conductivity drop of
lithium/carbonate electrolyte solutions at high concentrations.

In this paper we study multivariate polynomial functions in complex variables
and the corresponding associated symmetric tensor representations. The focus is
on finding conditions under which such complex polynomials/tensors always take
real values. We introduce the notion of symmetric conjugate forms and general
conjugate forms, and present characteristic conditions for such complex
polynomials to be realvalued. As applications of our results, we discuss the
relation between nonnegative polynomials and sums of squares in the context of
complex polynomials. Moreover, new notions of eigenvalues/eigenvectors for
complex tensors are introduced, extending properties from the Hermitian
matrices. Finally, we discuss an important property for symmetric tensors,
which states that the largest absolute value of eigenvalue of a symmetric real
tensor is equal to its largest singular value; the result is known as Banach's
theorem. We show that a similar result holds in the complex case as well.

An Androidbased smart Television (TV) must reliably run its applications in
an embedded program environment under diverse hardware resource conditions.
Owing to the diverse hardware components used to build numerous TV models, TV
simulators are usually not high enough in fidelity to simulate various TV
models, and thus are only regarded as unreliable alternatives when stress
testing such applications. Therefore, even though stress testing on real TV
sets is tedious, it is the de facto approach to ensure the reliability of these
applications in the industry. In this paper, we study to what extent stress
testing of smart TV applications can be fully automated in the industrial
environments. To the best of our knowledge, no previous work has addressed this
important question. We summarize the findings collected from 10 industrial
test engineers to have tested 20 such TV applications in a real production
environment. Our study shows that the industry required test automation
supports on highlevel GUI object controls and status checking, setup of
resource conditions and the interplay between the two. With such supports, 87%
of the industrial test specifications of one TV model can be fully automated
and 71.4% of them were found to be fully reusable to test a subsequent TV model
with major upgrades of hardware, operating system and application. It
represents a significant improvement with margins of 28% and 38%,
respectively, compared to stress testing without such supports.

Directed links  representing asymmetric social ties or interactions (e.g.,
"followerfollowee")  arise naturally in many social networks and other
complex networks, giving rise to directed graphs (or digraphs) as basic
topological models for these networks. Reciprocity, defined for a digraph as
the percentage of edges with a reciprocal edge, is a key metric that has been
used in the literature to compare different directed networks and provide
"hints" about their structural properties: for example, are reciprocal edges
generated randomly by chance or are there other processes driving their
generation? In this paper we study the problem of maximizing achievable
reciprocity for an ensemble of digraphs with the same prescribed in and
outdegree sequences. We show that the maximum reciprocity hinges crucially on
the in and outdegree sequences, which may be intuitively interpreted as
constraints on some "social capacities" of nodes and impose fundamental limits
on achievable reciprocity. We show that it is NPcomplete to decide the
achievability of a simple upper bound on maximum reciprocity, and provide
conditions for achieving it. We demonstrate that many real networks exhibit
reciprocities surprisingly close to the upper bound, which implies that users
in these social networks are in a sense more "social" than suggested by the
empirical reciprocity alone in that they are more willing to reciprocate,
subject to their "social capacity" constraints. We find some surprising linear
relationships between empirical reciprocity and the bound. We also show that a
particular type of small network motifs that we call 3paths are the major
source of loss in reciprocity for real networks.

The PeacemanRachford splitting method is very efficient for minimizing sum
of two functions each depends on its variable, and the constraint is a linear
equality. However, its convergence was not guaranteed without extra
requirements. Very recently, He et al. (SIAM J. Optim. 24: 1011  1040, 2014)
proved the convergence of a strictly contractive PeacemanRachford splitting
method by employing a suitable underdetermined relaxation factor. In this
paper, we further extend the socalled strictly contractive PeacemanRachford
splitting method by using two different relaxation factors, and to make the
method more flexible, we introduce semiproximal terms to the subproblems. We
characterize the relation of these two factors, and show that one factor is
always underdetermined while the other one is allowed to be larger than 1. Such
a flexible conditions makes it possible to cover the Glowinski's ADMM whith
larger stepsize. We show that the proposed modified strictly contractive
PeacemanRachford splitting method is convergent and also prove $O(1/t)$
convergence rate in ergodic and nonergodic sense, respectively. The numerical
tests on an extensive collection of problems demonstrate the efficiency of the
proposed method.