
In this paper, we consider a fundamental problem in modern digital
communications known as multiinput multioutput (MIMO) detection, which can be
formulated as a complex quadratic programming problem subject to unitmodulus
and discrete argument constraints. Various semidefinite relaxation (SDR) based
algorithms have been proposed to solve the problem in the literature. In this
paper, we first show that the conventional SDR is generally not tight for the
problem. Then, we propose a new and enhanced SDR and show its tightness under
an easily checkable condition, which essentially requires the level of the
noise to be below a certain threshold. The above results have answered an open
question posed by So in [35]. Numerical simulation results show that our
proposed SDR significantly outperforms the conventional SDR in terms of the
relaxation gap.

In this paper, we propose a new approach to solve lowrank tensor completion
and robust tensor PCA. Our approach is based on some novel notion of
(evenorder) tensor ranks, to be called the Mrank, the symmetric Mrank, and
the strongly symmetric Mrank. We discuss the connections between these new
tensor ranks and the CPrank and the symmetric CPrank of an evenorder tensor.
We show that the Mrank provides a reliable and easycomputable approximation
to the CPrank. As a result, we propose to replace the CPrank by the Mrank in
the lowCPrank tensor completion and robust tensor PCA. Numerical results
suggest that our new approach based on the Mrank outperforms existing methods
that are based on lownrank, tSVD and KBR approaches for solving lowrank
tensor completion and robust tensor PCA when the underlying tensor has low
CPrank.

In this paper, we consider an unconstrained optimization model where the
objective is a sum of a large number of possibly nonconvex functions, though
overall the objective is assumed to be smooth and convex. Our bid to solving
such model uses the framework of cubic regularization of Newton's method.As
well known, the crux in cubic regularization is its utilization of the Hessian
information, which may be computationally expensive for largescale problems.
To tackle this, we resort to approximating the Hessian matrix via subsampling.
In particular, we propose to compute an approximated Hessian matrix by either
uniformly or nonuniformly subsampling the components of the objective. Based
upon subsampling, we develop both standard and accelerated adaptive cubic
regularization approaches and provide theoretical guarantees on global
iteration complexity. We show that the standard and accelerated subsampled
cubic regularization methods achieve iteration complexity in the order of
$O(\epsilon^{1/2})$ and $O(\epsilon^{1/3})$ respectively, which match those
of the original standard and accelerated cubic regularization methods
\cite{Cartis2012Evaluation, Jiang2017Unified} using the full Hessian
information. The performances of the proposed methods on regularized logistic
regression problems show a clear effect of acceleration in terms of epochs on
several real data sets.

Computed tomography (CT) examinations are commonly used to predict lung
nodule malignancy in patients, which are shown to improve noninvasive early
diagnosis of lung cancer. It remains challenging for computational approaches
to achieve performance comparable to experienced radiologists. Here we present
NoduleX, a systematic approach to predict lung nodule malignancy from CT data,
based on deep learning convolutional neural networks (CNN). For training and
validation, we analyze >1000 lung nodules in images from the LIDC/IDRI cohort.
All nodules were identified and classified by four experienced thoracic
radiologists who participated in the LIDC project. NoduleX achieves high
accuracy for nodule malignancy classification, with an AUC of ~0.99. This is
commensurate with the analysis of the dataset by experienced radiologists. Our
approach, NoduleX, provides an effective framework for highly accurate nodule
malignancy prediction with the model trained on a large patient population. Our
results are replicable with software available at
http://bioinformatics.astate.edu/NoduleX.

Nonconvex and nonsmooth optimization problems are frequently encountered in
much of statistics, business, science and engineering, but they are not yet
widely recognized as a technology in the sense of scalability. A reason for
this relatively low degree of popularity is the lack of a well developed system
of theory and algorithms to support the applications, as is the case for its
convex counterpart. This paper aims to take one step in the direction of
disciplined nonconvex and nonsmooth optimization. In particular, we consider in
this paper some constrained nonconvex optimization models in block decision
variables, with or without coupled affine constraints. In the case of without
coupled constraints, we show a sublinear rate of convergence to an
$\epsilon$stationary solution in the form of variational inequality for a
generalized conditional gradient method, where the convergence rate is shown to
be dependent on the H\"olderian continuity of the gradient of the smooth part
of the objective. For the model with coupled affine constraints, we introduce
corresponding $\epsilon$stationarity conditions, and apply two proximaltype
variants of the ADMM to solve such a model, assuming the proximal ADMM updates
can be implemented for all the block variables except for the last block, for
which either a gradient step or a majorizationminimization step is
implemented. We show an iteration complexity bound of $O(1/\epsilon^2)$ to
reach an $\epsilon$stationary solution for both algorithms. Moreover, we show
that the same iteration complexity of a proximal BCD method follows
immediately. Numerical results are provided to illustrate the efficacy of the
proposed algorithms for tensor robust PCA.

The alternating direction method of multipliers (ADMM) has been successfully
applied to solve structured convex optimization problems due to its superior
practical performance. The convergence properties of the 2block ADMM have been
studied extensively in the literature. Specifically, it has been proven that
the 2block ADMM globally converges for any penalty parameter $\gamma>0$. In
this sense, the 2block ADMM allows the parameter to be free, i.e., there is no
need to restrict the value for the parameter when implementing this algorithm
in order to ensure convergence. However, for the 3block ADMM, Chen \etal
\cite{Chenadmmfailure2013} recently constructed a counterexample showing
that it can diverge if no further condition is imposed. The existing results on
studying further sufficient conditions on guaranteeing the convergence of the
3block ADMM usually require $\gamma$ to be smaller than a certain bound, which
is usually either difficult to compute or too small to make it a practical
algorithm. In this paper, we show that the 3block ADMM still globally
converges with any penalty parameter $\gamma>0$ if the third function $f_3$ in
the objective is smooth and strongly convex, and its condition number is in
$[1,1.0798)$, besides some other mild conditions. This requirement covers an
important class of problems to be called regularized least squares
decomposition (RLSD) in this paper.

In this paper we propose a unified twophase scheme for convex optimization
to accelerate: (1) the adaptive cubic regularization methods with exact/inexact
Hessian matrices, and (2) the adaptive gradient method, without any knowledge
of the Lipschitz constants for the gradient or the Hessian. This is achieved by
tuning the parameters used in the algorithm $\textit{adaptively}$ in its
process of progression, which can be viewed as a relaxation over the existing
algorithms in the literature. Under the assumption that the subproblems can be
solved approximately, we establish overall iteration complexity bounds for
three newly proposed algorithms to obtain an $\epsilon$approximate solution.
Specifically, we show that the adaptive cubic regularization methods with the
exact/inexact Hessian matrix both achieve an iteration complexity in the order
of $O\left( 1 / \epsilon^{1/3} \right)$, which matches that of the original
accelerated cubic regularization method presented in
[Nesterov2008Accelerating] assuming the availability of the exact Hessian
information and the Lipschitz constants, and that the subproblems are solved
to optimality. Under the same twophase adaptive acceleration framework, the
gradient method achieves an iteration complexity in the order $O\left( 1 /
\epsilon^{1/2} \right)$, which is known to be best possible (cf.
Nesterov2013Introductory). Our numerical experiment results show a clear
effect of acceleration displayed in the adaptive Newton method with cubic
regularization on a set of regularized logistic regression instances.

In this paper we study nonconvex and nonsmooth multiblock optimization over
Riemannian manifolds with coupled linear constraints. Such optimization
problems naturally arise from machine learning, statistical learning,
compressive sensing, image processing, and tensor PCA, among others. We develop
an ADMMlike primaldual approach based on decoupled solvable subroutines such
as linearized proximal mappings. First, we introduce the optimality conditions
for the aforementioned optimization models. Then, the notion of
$\epsilon$stationary solutions is introduced as a result. The main part of the
paper is to show that the proposed algorithms enjoy an iteration complexity of
$O(1/\epsilon^2)$ to reach an $\epsilon$stationary solution. For prohibitively
largesize tensor or machine learning models, we present a samplingbased
stochastic algorithm with the same iteration complexity bound in expectation.
In case the subproblems are not analytically solvable, a feasible curvilinear
linesearch variant of the algorithm based on retraction operators is proposed.
Finally, we show specifically how the algorithms can be implemented to solve a
variety of practical problems such as the NPhard maximum bisection problem,
the $\ell_q$ regularized sparse tensor principal component analysis and the
community detection problem. Our preliminary numerical results show great
potentials of the proposed methods.

In this paper, we consider the community detection problem under either the
stochastic block model (SBM) assumption or the degreecorrelated stochastic
block model (DCSBM) assumption. The modularity maximization formulation for the
community detection problem is NPhard in general. In this paper, we propose a
sparse and lowrank completely positive relaxation for the modularity
maximization problem, we then develop an efficient rowbyrow (RBR) type block
coordinate descent (BCD) algorithm to solve the relaxation and prove an
$\mathcal{O}(1/\sqrt{N})$ convergence rate to a stationary point where $N$ is
the number of iterations. A fast rounding scheme is constructed to retrieve the
community structure from the solution. Nonasymptotic high probability bounds
on the misclassification rate are established to justify our approach. We
further develop an asynchronous parallel RBR algorithm to speed up the
convergence. Extensive numerical experiments on both synthetic and real world
networks show that the proposed approach enjoys advantages in both clustering
accuracy and numerical efficiency. Our numerical results indicate that the
newly proposed method is a quite competitive alternative for community
detection on sparse networks with over 50 million nodes.

In this paper, we propose a vector transportfree stochastic variance reduced
gradient (SVRG) method with general retraction for empirical risk minimization
over Riemannian manifold. Existing SVRG methods on manifold usually consider a
specific retraction operation, and involve additional computational costs such
as parallel transport or vector transport. The vector transportfree SVRG with
general retraction we propose in this paper handles general retraction
operations, and do not need additional computational costs mentioned above. As
a result, we name our algorithm SSVRG, where the first "S" means simple. We
analyze the iteration complexity of SSVRG for obtaining an
$\epsilon$stationary point and its local linear convergence by assuming the \L
ojasiewicz inequality, which naturally holds for PCA and holds with high
probability for matrix completion problem. We also incorporate the
BarzilaiBorwein step size and design a very practical SSVRGBB method.
Numerical results on PCA and matrix completion problems are reported to
demonstrate the efficiency of our methods.

Block Coordinate Update (BCU) methods enjoy low perupdate computational
complexity because every time only one or a few block variables would need to
be updated among possibly a large number of blocks. They are also easily
parallelized and thus have been particularly popular for solving problems
involving largescale dataset and/or variables. In this paper, we propose a
primaldual BCU method for solving linearly constrained convex program in
multiblock variables. The method is an accelerated version of a primaldual
algorithm proposed by the authors, which applies randomization in selecting
block variables to update and establishes an $O(1/t)$ convergence rate under
weak convexity assumption. We show that the rate can be accelerated to
$O(1/t^2)$ if the objective is strongly convex. In addition, if one block
variable is independent of the others in the objective, we then show that the
algorithm can be modified to achieve a linear rate of convergence. The
numerical experiments show that the accelerated method performs stably with a
single set of parameters while the original method needs to tune the parameters
for different datasets in order to achieve a comparable level of performance.

In this paper we propose a randomized primaldual proximal block coordinate
updating framework for a general multiblock convex optimization model with
coupled objective function and linear constraints. Assuming mere convexity, we
establish its $O(1/t)$ convergence rate in terms of the objective value and
feasibility measure. The framework includes several existing algorithms as
special cases such as a primaldual method for bilinear saddlepoint problems
(PDS), the proximal Jacobian ADMM (ProxJADMM) and a randomized variant of the
ADMM method for multiblock convex optimization. Our analysis recovers and/or
strengthens the convergence properties of several existing algorithms. For
example, for PDS our result leads to the same order of convergence rate
without the previously assumed boundedness condition on the constraint sets,
and for ProxJADMM the new result provides convergence rate in terms of the
objective value and the feasibility violation. It is well known that the
original ADMM may fail to converge when the number of blocks exceeds two. Our
result shows that if an appropriate randomization procedure is invoked to
select the updating blocks, then a sublinear rate of convergence in expectation
can be guaranteed for multiblock ADMM, without assuming any strong convexity.
The new approach is also extended to solve problems where only a stochastic
approximation of the (sub)gradient of the objective is available, and we
establish an $O(1/\sqrt{t})$ convergence rate of the extended approach for
solving stochastic programming.

In [13], Hillar and Lim famously demonstrated that "multilinear (tensor)
analogues of many efficiently computable problems in numerical linear algebra
are NPhard". Despite many recent advancements, the stateoftheart methods
for computing such `tensor analogues' still suffer severely from the curse of
dimensionality. In this paper we show that the Tucker core of a tensor however,
retains many properties of the original tensor, including the CP rank, the
border rank, the tensor Schatten quasi norms, and the Zeigenvalues. When the
core tensor is smaller than the original tensor, this property leads to
considerable computational advantages as confirmed by our numerical
experiments. In our analysis, we in fact work with a generalized Tuckerlike
decomposition that can accommodate any full columnrank factor matrices.

In this paper we study multivariate polynomial functions in complex variables
and the corresponding associated symmetric tensor representations. The focus is
on finding conditions under which such complex polynomials/tensors always take
real values. We introduce the notion of symmetric conjugate forms and general
conjugate forms, and present characteristic conditions for such complex
polynomials to be realvalued. As applications of our results, we discuss the
relation between nonnegative polynomials and sums of squares in the context of
complex polynomials. Moreover, new notions of eigenvalues/eigenvectors for
complex tensors are introduced, extending properties from the Hermitian
matrices. Finally, we discuss an important property for symmetric tensors,
which states that the largest absolute value of eigenvalue of a symmetric real
tensor is equal to its largest singular value; the result is known as Banach's
theorem. We show that a similar result holds in the complex case as well.

The alternating direction method of multipliers (ADMM) is widely used in
solving structured convex optimization problems. Despite of its success in
practice, the convergence properties of the standard ADMM for minimizing the
sum of $N$ $(N\geq 3)$ convex functions with $N$ block variables linked by
linear constraints, have remained unclear for a very long time. In this paper,
we present convergence and convergence rate results for the standard ADMM
applied to solve $N$block $(N\geq 3)$ convex minimization problem, under the
condition that one of these functions is convex (not necessarily strongly
convex) and the other $N1$ functions are strongly convex. Specifically, in
that case the ADMM is proven to converge with rate $O(1/t)$ in a certain
ergodic sense, and $o(1/t)$ in nonergodic sense, where $t$ denotes the number
of iterations. As a byproduct, we also provide a simple proof for the $O(1/t)$
convergence rate of twoblock ADMM in terms of both objective error and
constraint violation, without assuming any condition on the penalty parameter
and strong convexity on the functions.

In this paper, we consider the problem of minimizing the sum of two convex
functions subject to linear linking constraints. The classical alternating
direction type methods usually assume that the two convex functions have
relatively easy proximal mappings. However, many problems arising from
statistics, image processing and other fields have the structure that while one
of the two functions has easy proximal mapping, the other function is smoothly
convex but does not have an easy proximal mapping. Therefore, the classical
alternating direction methods cannot be applied. To deal with the difficulty,
we propose in this paper an alternating direction method based on
extragradients. Under the assumption that the smooth function has a Lipschitz
continuous gradient, we prove that the proposed method returns an
$\epsilon$optimal solution within $O(1/\epsilon)$ iterations. We apply the
proposed method to solve a new statistical model called fused logistic
regression. Our numerical experiments show that the proposed method performs
very well when solving the test problems. We also test the performance of the
proposed method through solving the lasso problem arising from statistics and
compare the result with several existing efficient solvers for this problem;
the results are very encouraging indeed.

The alternating direction method of multipliers (ADMM) has been widely used
for solving structured convex optimization problems. In particular, the ADMM
can solve convex programs that minimize the sum of $N$ convex functions with
$N$block variables linked by some linear constraints. While the convergence of
the ADMM for $N=2$ was well established in the literature, it remained an open
problem for a long time whether or not the ADMM for $N \ge 3$ is still
convergent. Recently, it was shown in [3] that without further conditions the
ADMM for $N\ge 3$ may actually fail to converge. In this paper, we show that
under some easily verifiable and reasonable conditions the global linear
convergence of the ADMM when $N\geq 3$ can still be assured, which is important
since the ADMM is a popular method for solving large scale multiblock
optimization models and is known to perform very well in practice even when
$N\ge 3$. Our study aims to offer an explanation for this phenomenon.

The alternating direction method of multipliers (ADMM) is widely used in
solving structured convex optimization problems due to its superior practical
performance. On the theoretical side however, a counterexample was shown in [7]
indicating that the multiblock ADMM for minimizing the sum of $N$ $(N\geq 3)$
convex functions with $N$ block variables linked by linear constraints may
diverge. It is therefore of great interest to investigate further sufficient
conditions on the input side which can guarantee convergence for the
multiblock ADMM. The existing results typically require the strong convexity
on parts of the objective. In this paper, we present convergence and
convergence rate results for the multiblock ADMM applied to solve certain
$N$block $(N\geq 3)$ convex minimization problems without requiring strong
convexity. Specifically, we prove the following two results: (1) the
multiblock ADMM returns an $\epsilon$optimal solution within
$O(1/\epsilon^2)$ iterations by solving an associated perturbation to the
original problem; (2) the multiblock ADMM returns an $\epsilon$optimal
solution within $O(1/\epsilon)$ iterations when it is applied to solve a
certain sharing problem, under the condition that the augmented Lagrangian
function satisfies the KurdykaLojasiewicz property, which essentially covers
most convex optimization models except for some pathological cases.

We establish local linear convergence bounds for the ISTA and FISTA
iterations on the model LASSO problem. We show that FISTA can be viewed as an
accelerated ISTA process. Using a spectral analysis, we show that, when close
enough to the solution, both iterations converge linearly, but FISTA slows down
compared to ISTA, making it advantageous to switch to ISTA toward the end of
the iteration processs. We illustrate the results with some synthetic numerical
examples.

In this paper we study proximal conditionalgradient (CG) and proximal
gradientprojection type algorithms for a blockstructured constrained
nonconvex optimization model, which arises naturally from tensor data analysis.
First, we introduce a new notion of $\epsilon$stationarity, which is suitable
for the structured problem under consideration. %, compared with other similar
solution concepts. We then propose two types of firstorder algorithms for the
model based on the proximal conditionalgradient (CG) method and the proximal
gradientprojection method respectively. If the nonconvex objective function is
in the form of mathematical expectation, we then discuss how to incorporate
randomized sampling to avoid computing the expectations exactly. For the
general block optimization model, the proximal subroutines are performed for
each block according to either the blockcoordinatedescent (BCD) or the
maximumblockimprovement (MBI) updating rule. If the gradient of the nonconvex
part of the objective $f$ satisfies $\ \nabla f(x)  \nabla f(y)\_q \le M
\xy\_p^\delta$ where $\delta=p/q$ with $1/p+1/q=1$, then we prove that the
new algorithms have an overall iteration complexity bound of $O(1/\epsilon^q)$
in finding an $\epsilon$stationary solution. If $f$ is concave then the
iteration complexity reduces to $O(1/\epsilon)$. Our numerical experiments for
tensor approximation problems show promising performances of the new solution
algorithms.

The composite $L_q~(0<q<1)$ minimization problem over a general polyhedron
has received various applications in machine learning, wireless communications,
image restoration, signal reconstruction, etc. This paper aims to provide a
theoretical study on this problem. Firstly, we show that for any fixed $0<q<1$,
finding the global minimizer of the problem, even its unconstrained
counterpart, is strongly NPhard. Secondly, we derive KarushKuhnTucker (KKT)
optimality conditions for local minimizers of the problem. Thirdly, we propose
a smoothing sequential quadratic programming framework for solving this
problem. The framework requires a (approximate) solution of a convex quadratic
program at each iteration. Finally, we analyze the worstcase iteration
complexity of the framework for returning an $\epsilon$KKT point; i.e., a
feasible point that satisfies a perturbed version of the derived KKT optimality
conditions. To the best of our knowledge, the proposed framework is the first
one with a worstcase iteration complexity guarantee for solving composite
$L_q$ minimization over a general polyhedron.

This paper is concerned with the computation of the principal components for
a general tensor, known as the tensor principal component analysis (PCA)
problem. We show that the general tensor PCA problem is reducible to its
special case where the tensor in question is supersymmetric with an even
degree. In that case, the tensor can be embedded into a symmetric matrix. We
prove that if the tensor is rankone, then the embedded matrix must be rankone
too, and vice versa. The tensor PCA problem can thus be solved by means of
matrix optimization under a rankone constraint, for which we propose two
solution methods: (1) imposing a nuclear norm penalty in the objective to
enforce a lowrank solution; (2) relaxing the rankone constraint by
Semidefinite Programming. Interestingly, our experiments show that both methods
yield a rankone solution with high probability, thereby solving the original
tensor PCA problem to optimality with high probability. To further cope with
the size of the resulting convex optimization models, we propose to use the
alternating direction method of multipliers, which reduces significantly the
computational efforts. Various extensions of the model are considered as well.

In this paper we study the relationship between the optimal value of a
homogeneous quadratic optimization problem and that of its Semidefinite
Programming (SDP) relaxation. We consider two quadratic optimization models:
(1) $\min \{x^* C x \mid x^* A_k x \ge 1, x\in\mathbb{F}^n, k=0,1,...,m\}$; and
(2) $\max \{x^* C x \mid x^* A_k x \le 1, x\in\mathbb{F}^n, k=0,1,...,m\}$. If
\emph{one} of $A_k$'s is indefinite while others and $C$ are positive
semidefinite, we prove that the ratio between the optimal value of (1) and its
SDP relaxation is upper bounded by $O(m^2)$ when $\mathbb{F}$ is the real line
$\mathbb{R}$, and by $O(m)$ when $\mathbb{F}$ is the complex plane
$\mathbb{C}$. This result is an extension of the recent work of Luo {\em et
al.} \cite{LSTZ}. For (2), we show that the same ratio is bounded from below by
$O(1/\log m)$ for both the real and complex case, whenever all but one of
$A_k$'s are positive semidefinite while $C$ can be indefinite. This result
improves the socalled approximate SLemma of BenTal {\em et al.}
\cite{BNR02}. We also consider (2) with multiple indefinite quadratic
constraints and derive a general bound in terms of the problem data and the SDP
solution. Throughout the paper, we present examples showing that all of our
results are essentially tight.