• Semidefinite programs (SDPs) -- some of the most useful and versatile optimization problems of the last few decades -- are often pathological: the optimal values of the primal and dual problems may differ and may not be attained. Such SDPs are both theoretically interesting and often impossible to solve; yet, the pathological SDPs in the literature look strikingly similar. Based on our recent work \cite{Pataki:17} we characterize pathological semidefinite systems by certain {\em excluded matrices}, which are easy to spot in all published examples. Our main tool is a normal (canonical) form of semidefinite systems, which makes their pathological behavior easy to verify. The normal form is constructed in a surprisingly simple fashion, using mostly elementary row operations inherited from Gaussian elimination. The proofs are elementary and can be followed by a reader at the advanced undergraduate level. As a byproduct, we show how to transform any linear map acting on symmetric matrices into a normal form, which allows us to quickly check whether the image of the semidefinite cone under the map is closed. We can thus introduce readers to a fundamental issue in convex analysis: the linear image of a closed convex set may not be closed, and often simple conditions are available to verify the closedness, or lack of it.
  • In conic linear programming -- in contrast to linear programming -- the Lagrange dual is not an exact dual: it may not attain its optimal value, or there may be a positive duality gap. The corresponding Farkas' lemma is also not exact (it does not always prove infeasibility). We describe exact duals, and certificates of infeasibility and weak infeasibility for conic LPs which are nearly as simple as the Lagrange dual, but do not rely on any constraint qualification. Some of our exact duals generalize the SDP duals of Ramana, and Klep and Schweighofer to the context of general conic LPs. Some of our infeasibility certificates generalize the row echelon form of a linear system of equations: they consist of a small, trivially infeasible subsystem obtained by elementary row operations. We prove analogous results for weakly infeasible systems. We obtain some fundamental geometric corollaries: an exact characterization of when the linear image of a closed convex cone is closed, and an exact characterization of nice cones. Our infeasibility certificates provide algorithms to generate {\em all} infeasible conic LPs over several important classes of cones; and {\em all} weakly infeasible SDPs in a natural class. Using these algorithms we generate a public domain library of infeasible and weakly infeasible SDPs. The status of our instances can be verified by inspection in exact arithmetic, but they turn out to be challenging for commercial and research codes.
  • Conic linear programs, among them semidefinite programs, often behave pathologically: the optimal values of the primal and dual programs may differ, and may not be attained. We present a novel analysis of these pathological behaviors. We call a conic linear system $Ax <= b$ {\em badly behaved} if the value of $\sup { < c, x > | A x <= b }$ is finite but the dual program has no solution with the same value for {\em some} $c.$ We describe simple and intuitive geometric characterizations of badly behaved conic linear systems. Our main motivation is the striking similarity of badly behaved semidefinite systems in the literature; we characterize such systems by certain {\em excluded matrices}, which are easy to spot in all published examples. We show how to transform semidefinite systems into a canonical form, which allows us to easily verify whether they are badly behaved. We prove several other structural results about badly behaved semidefinite systems; for example, we show that they are in $NP \cap co-NP$ in the real number model of computing. As a byproduct, we prove that all linear maps that act on symmetric matrices can be brought into a canonical form; this canonical form allows us to easily check whether the image of the semidefinite cone under the given linear map is closed.
  • We propose a very simple preprocessing algorithm for semidefinite programming. Our algorithm inspects the constraints of the problem, deletes redundant rows and columns in the constraints, and reduces the size of the variable matrix. It often detects infeasibility. Our algorithm does not rely on any optimization solver: the only subroutine it needs is Cholesky factorization, hence it can be implemented with a few lines of code in machine precision. We present computational results on a set of problems arising mostly from polynomial optimization.
  • In semidefinite programming (SDP), unlike in linear programming, Farkas' lemma may fail to prove infeasibility. Here we obtain an exact, short certificate of infeasibility in SDP by an elementary approach: we reformulate any semidefinite system of the form Ai*X = bi (i=1,...,m) (P) X >= 0 using only elementary row operations, and rotations. When (P) is infeasible, the reformulated system is trivially infeasible. When (P) is feasible, the reformulated system has strong duality with its Lagrange dual for all objective functions. As a corollary, we obtain algorithms to generate the constraints of {\em all} infeasible SDPs and the constraints of {\em all} feasible SDPs with a fixed rank maximal solution.
  • The facial reduction algorithm of Borwein and Wolkowicz and the extended dual of Ramana provide a strong dual for the conic linear program $$ (P) \sup {<c, x> | Ax \leq_K b} $$ in the absence of any constraint qualification. The facial reduction algorithm solves a sequence of auxiliary optimization problems to obtain such a dual. Ramana's dual is applicable when (P) is a semidefinite program (SDP) and is an explicit SDP itself. Ramana, Tuncel, and Wolkowicz showed that these approaches are closely related; in particular, they proved the correctness of Ramana's dual using certificates from a facial reduction algorithm. Here we give a clear and self-contained exposition of facial reduction, of extended duals, and generalize Ramana's dual: -- we state a simple facial reduction algorithm and prove its correctness; and -- building on this algorithm we construct a family of extended duals when $K$ is a {\em nice} cone. This class of cones includes the semidefinite cone and other important cones.
  • A closed convex cone K is called nice, if the set K^* + F^\perp is closed for all F faces of K, where K^* is the dual cone of K, and F^\perp is the orthogonal complement of the linear span of F. The niceness property is important for two reasons: it plays a role in the facial reduction algorithm of Borwein and Wolkowicz, and the question whether the linear image of a nice cone is closed also has a simple answer. We prove several characterizations of nice cones and show a strong connection with facial exposedness. We prove that a nice cone must be facially exposed; in reverse, facial exposedness with an added condition implies niceness. We conjecture that nice, and facially exposed cones are actually the same, and give supporting evidence.
  • Object Oriented Data Analysis is a new area in statistics that studies populations of general data objects. In this article we consider populations of tree-structured objects as our focus of interest. We develop improved analysis tools for data lying in a binary tree space analogous to classical Principal Component Analysis methods in Euclidean space. Our extensions of PCA are analogs of one dimensional subspaces that best fit the data. Previous work was based on the notion of tree-lines. In this paper, a generalization of the previous tree-line notion is proposed: k-tree-lines. Previously proposed tree-lines are k-tree-lines where k=1. New sub-cases of k-tree-lines studied in this work are the 2-tree-lines and tree-curves, which explain much more variation per principal component than tree-lines. The optimal principal component tree-lines were computable in linear time. Because 2-tree-lines and tree-curves are more complex, they are computationally more expensive, but yield improved data analysis results. We provide a comparative study of all these methods on a motivating data set consisting of brain vessel structures of 98 subjects.
  • This study introduces a new method of visualizing complex tree structured objects. The usefulness of this method is illustrated in the context of detecting unexpected features in a data set of very large trees. The major contribution is a novel two-dimensional graphical representation of each tree, with a covariate coded by color. The motivating data set contains three dimensional representations of brain artery systems of 105 subjects. Due to inaccuracies inherent in the medical imaging techniques, issues with the reconstruction algo- rithms and inconsistencies introduced by manual adjustment, various discrepancies are present in the data. The proposed representation enables quick visual detection of the most common discrepancies. For our driving example, this tool led to the modification of 10% of the artery trees and deletion of 6.7%. The benefits of our cleaning method are demonstrated through a statistical hypothesis test on the effects of aging on vessel structure. The data cleaning resulted in improved significance levels.
  • We propose a very simple preconditioning method for integer programming feasibility problems: replacing the problem b' <= Ax <= b, x \in Z^n with b' <= AUy <= b, y \in Z^n, where U is a unimodular matrix computed via basis reduction, to make the columns of $AU$ short and nearly orthogonal. The reformulation is called rangespace reformulation. It is motivated by the reformulation technique proposed for equality constrained IPs by Aardal, Hurkens and Lenstra. We also study a family of IP instances, called decomposable knapsack problems (DKPs). DKPs generalize the instances proposed by Jeroslow, Chvatal and Todd, Avis, Aardal and Lenstra, and Cornuejols et al. DKPs are knapsack problems with a constraint vector of the form $pM + r, $ with $p >0$ and $r$ integral vectors, and $M$ a large integer. If the parameters are suitably chosen in DKPs, we prove 1) hardness results for these problems, when branch-and-bound branching on individual variables is applied; 2) that they are easy, if one branches on the constraint $px$ instead; and 3) that branching on the last few variables in either the rangespace- or the AHL-reformulations is equivalent to branching on $px$ in the original problem. We also provide recipes to generate such instances. Our computational study confirms that the behavior of the studied instances in practice is as predicted by the theoretical results.
  • The classical branch-and-bound algorithm for the integer feasibility problem has exponential worst case complexity. We prove that it is surprisingly efficient on reformulated problems, in which the columns of the constraint matrix are short, and near orthogonal, i.e. a reduced basis of the generated lattice; when the entries of A (the dense part of the constraint matrix) are from {1, ..., M} for a large enough M, branch-and-bound solves almost all reformulated instances at the rootnode. We also prove an upper bound on the width of the reformulations along the last unit vector. The analysis builds on the ideas of Furst and Kannan to bound the number of integral matrices for which the shortest vectors of certain lattices are long, and also uses a bound on the size of the branch-and-bound tree based on the norms of the Gram-Schmidt vectors of the constraint matrix. We explore practical aspects of these results. First, we compute numerical values of M which guarantee that 90, and 99 percent of the reformulated problems solve at the root: these turn out to be surprisingly small when the problem size is moderate. Second, we confirm with a computational study that random integer programs become easier, as the coefficients grow.
  • The active field of Functional Data Analysis (about understanding the variation in a set of curves) has been recently extended to Object Oriented Data Analysis, which considers populations of more general objects. A particularly challenging extension of this set of ideas is to populations of tree-structured objects. We develop an analog of Principal Component Analysis for trees, based on the notion of tree-lines, and propose numerically fast (linear time) algorithms to solve the resulting optimization problems. The solutions we obtain are used in the analysis of a data set of 73 individuals, where each data object is a tree of blood vessels in one person's brain.
  • We prove that the subset sum problem has a polynomial time computable certificate of infeasibility for all $a$ weight vectors with density at most $1/(2n)$ and for almost all integer right hand sides. The certificate is branching on a hyperplane, i.e. by a methodology dual to the one explored by Lagarias and Odlyzko; Frieze; Furst and Kannan; and Coster et. al. The proof has two ingredients. We first prove that a vector that is near parallel to $a$ is a suitable branching direction, regardless of the density. Then we show that for a low density $a$ such a near parallel vector can be computed using diophantine approximation, via a methodology introduced by Frank and Tardos. We also show that there is a small number of long intervals whose disjoint union covers the integer right hand sides, for which the infeasibility is proven by branching on the above hyperplane.
  • We show that in a knapsack feasibility problem an integral vector $p$, which is short, and near parallel to the constraint vector gives a branching direction with small integer width. We use this result to analyze two computationally efficient reformulation techniques on low density knapsack problems. Both reformulations have a constraint matrix with columns reduced in the sense of Lenstra, Lenstra, and Lov\'asz. We prove an upper bound on the integer width along the last variable, which becomes 1, when the density is sufficiently small. In the proof we extract from the transformation matrices a vector which is near parallel to the constraint vector $a.$ The near parallel vector is a good branching direction in the original knapsack problem, and this transfers to the last variable in the reformulations.
  • We prove several inequalities on the determinants of sublattices in LLL-reduced bases. They generalize the inequalities on the length of the shortest vector proven by Lenstra, Lenstra, and Lovasz, and show that LLL-reduction finds not only a short vector, but more generally, sublattices with small determinants. We also prove new upper bounds on the product of the norms of the first few vectors.