• We consider maximum likelihood estimation for Gaussian Mixture Models (Gmms). This task is almost invariably solved (in theory and practice) via the Expectation Maximization (EM) algorithm. EM owes its success to various factors, of which is its ability to fulfill positive definiteness constraints in closed form is of key importance. We propose an alternative to EM by appealing to the rich Riemannian geometry of positive definite matrices, using which we cast Gmm parameter estimation as a Riemannian optimization problem. Surprisingly, such an out-of-the-box Riemannian formulation completely fails and proves much inferior to EM. This motivates us to take a closer look at the problem geometry, and derive a better formulation that is much more amenable to Riemannian optimization. We then develop (Riemannian) batch and stochastic gradient algorithms that outperform EM, often substantially. We provide a non-asymptotic convergence analysis for our stochastic method, which is also the first (to our knowledge) such global analysis for Riemannian stochastic gradient. Numerous empirical results are included to demonstrate the effectiveness of our methods.
  • We revisit the task of learning a Euclidean metric from data. We approach this problem from first principles and formulate it as a surprisingly simple optimization problem. Indeed, our formulation even admits a closed form solution. This solution possesses several very attractive properties: (i) an innate geometric appeal through the Riemannian geometry of positive definite matrices; (ii) ease of interpretability; and (iii) computational speed several orders of magnitude faster than the widely used LMNN and ITML methods. Furthermore, on standard benchmark datasets, our closed-form solution consistently attains higher classification accuracy.
  • We study modeling and inference with the Elliptical Gamma Distribution (EGD). We consider maximum likelihood (ML) estimation for EGD scatter matrices, a task for which we develop new fixed-point algorithms. Our algorithms are efficient and converge to global optima despite nonconvexity. Moreover, they turn out to be much faster than both a well-known iterative algorithm of Kent & Tyler (1991) and sophisticated manifold optimization algorithms. Subsequently, we invoke our ML algorithms as subroutines for estimating parameters of a mixture of EGDs. We illustrate our methods by applying them to model natural image statistics---the proposed EGD mixture model yields the most parsimonious model among several competing approaches.
  • Mixture models are powerful statistical models used in many applications ranging from density estimation to clustering and classification. When dealing with mixture models, there are many issues that the experimenter should be aware of and needs to solve. The MixEst toolbox is a powerful and user-friendly package for MATLAB that implements several state-of-the-art approaches to address these problems. Additionally, MixEst gives the possibility of using manifold optimization for fitting the density model, a feature specific to this toolbox. MixEst simplifies using and integration of mixture models in statistical models and applications. For developing mixture models of new densities, the user just needs to provide a few functions for that statistical distribution and the toolbox takes care of all the issues regarding mixture models. MixEst is available at visionlab.ut.ac.ir/mixest and is fully documented and is licensed under GPL.
  • We take a new look at parameter estimation for Gaussian Mixture Models (GMMs). In particular, we propose using \emph{Riemannian manifold optimization} as a powerful counterpart to Expectation Maximization (EM). An out-of-the-box invocation of manifold optimization, however, fails spectacularly: it converges to the same solution but vastly slower. Driven by intuition from manifold convexity, we then propose a reparamerization that has remarkable empirical consequences. It makes manifold optimization not only match EM---a highly encouraging result in itself given the poor record nonlinear programming methods have had against EM so far---but also outperform EM in many practical settings, while displaying much less variability in running times. We further highlight the strengths of manifold optimization by developing a somewhat tuned manifold LBFGS method that proves even more competitive and reliable than existing manifold optimization tools. We hope that our results encourage a wider consideration of manifold optimization for parameter estimation problems.
  • We develop \emph{geometric optimisation} on the manifold of Hermitian positive definite (HPD) matrices. In particular, we consider optimising two types of cost functions: (i) geodesically convex (g-convex); and (ii) log-nonexpansive (LN). G-convex functions are nonconvex in the usual euclidean sense, but convex along the manifold and thus allow global optimisation. LN functions may fail to be even g-convex, but still remain globally optimisable due to their special structure. We develop theoretical tools to recognise and generate g-convex functions as well as cone theoretic fixed-point optimisation algorithms. We illustrate our techniques by applying them to maximum-likelihood parameter estimation for elliptically contoured distributions (a rich class that substantially generalises the multivariate normal distribution). We compare our fixed-point algorithms with sophisticated manifold optimisation methods and obtain notable speedups.
  • GRavitational lEnsing Accuracy Testing 2010 (GREAT10) is a public image analysis challenge aimed at the development of algorithms to analyze astronomical images. Specifically, the challenge is to measure varying image distortions in the presence of a variable convolution kernel, pixelization and noise. This is the second in a series of challenges set to the astronomy, computer science and statistics communities, providing a structured environment in which methods can be improved and tested in preparation for planned astronomical surveys. GREAT10 extends upon previous work by introducing variable fields into the challenge. The "Galaxy Challenge" involves the precise measurement of galaxy shape distortions, quantified locally by two parameters called shear, in the presence of a known convolution kernel. Crucially, the convolution kernel and the simulated gravitational lensing shape distortion both now vary as a function of position within the images, as is the case for real data. In addition, we introduce the "Star Challenge" that concerns the reconstruction of a variable convolution kernel, similar to that in a typical astronomical observation. This document details the GREAT10 Challenge for potential participants. Continually updated information is also available from http://www.greatchallenges.info.
  • We present a probabilistic model for natural images which is based on Gaussian scale mixtures and a simple multiscale representation. In contrast to the dominant approach to modeling whole images focusing on Markov random fields, we formulate our model in terms of a directed graphical model. We show that it is able to generate images with interesting higher-order correlations when trained on natural images or samples from an occlusion based model. More importantly, the directed model enables us to perform a principled evaluation. While it is easy to generate visually appealing images, we demonstrate that our model also yields the best performance reported to date when evaluated with respect to the cross-entropy rate, a measure tightly linked to the average log-likelihood.
  • We present the results of the GREAT08 Challenge, a blind analysis challenge to infer weak gravitational lensing shear distortions from images. The primary goal was to stimulate new ideas by presenting the problem to researchers outside the shear measurement community. Six GREAT08 Team methods were presented at the launch of the Challenge and five additional groups submitted results during the 6 month competition. Participants analyzed 30 million simulated galaxies with a range in signal to noise ratio, point-spread function ellipticity, galaxy size, and galaxy type. The large quantity of simulations allowed shear measurement methods to be assessed at a level of accuracy suitable for currently planned future cosmic shear observations for the first time. Different methods perform well in different parts of simulation parameter space and come close to the target level of accuracy in several of these. A number of fresh ideas have emerged as a result of the Challenge including a re-examination of the process of combining information from different galaxies, which reduces the dependence on realistic galaxy modelling. The image simulations will become increasingly sophisticated in future GREAT challenges, meanwhile the GREAT08 simulations remain as a benchmark for additional developments in shear measurement algorithms.