
Sparse principal component analysis (SPCA) has emerged as a powerful
technique for modern data analysis. We discuss a robust and scalable algorithm
for computing sparse principal component analysis. Specifically, we model SPCA
as a matrix factorization problem with orthogonality constraints, and develop
specialized optimization algorithms that partially minimize a subset of the
variables (variable projection). The framework incorporates a wide variety of
sparsity-inducing regularizers for SPCA. We also extend the variable projection
approach to robust SPCA, for any robust loss that can be expressed as the
Moreau envelope of a simple function, with the canonical example of the Huber
loss. Finally, randomized methods for linear algebra are used to extend the
approach to the large-scale (big data) setting. The proposed algorithms are
demonstrated using both synthetic and real-world data.
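
As a minimal sketch of the variable projection idea, assume the objective
(1/2)||X - X B A^T||_F^2 + alpha ||B||_1 subject to A^T A = I, where B holds the
sparse loadings and A is an orthonormal companion matrix. This specific
objective, the l1 regularizer, and the names spca_varpro, alpha, and n_iter are
illustrative assumptions, not necessarily the exact formulation of the paper.
The A-step is solved exactly as an orthogonal Procrustes problem, and the
B-step takes one proximal gradient step.

```python
import numpy as np

def soft_threshold(Z, t):
    """Proximal operator of t * ||.||_1 (entrywise shrinkage)."""
    return np.sign(Z) * np.maximum(np.abs(Z) - t, 0.0)

def spca_varpro(X, k, alpha=0.1, n_iter=200):
    """Alternate an exact A-update with a proximal gradient B-update.

    X is (n_samples, n_features); returns A (orthonormal) and B (sparse),
    both of shape (n_features, k). Hypothetical helper for illustration.
    """
    _, s, Vt = np.linalg.svd(X, full_matrices=False)
    A = Vt[:k].T.copy()          # initialize with ordinary PCA loadings
    B = A.copy()
    G = X.T @ X                  # Gram matrix, reused in every iteration
    step = 1.0 / s[0] ** 2       # 1 / Lipschitz constant of the smooth part
    for _ in range(n_iter):
        # A-step: maximize trace(A^T G B) subject to A^T A = I (Procrustes).
        U, _, Wt = np.linalg.svd(G @ B, full_matrices=False)
        A = U @ Wt
        # B-step: the gradient of the smooth term with respect to B is G (B - A).
        B = soft_threshold(B - step * (G @ (B - A)), step * alpha)
    return A, B
```

Larger values of alpha drive more entries of B to exactly zero; other
regularizers would only change the proximal operator used in the B-step.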

Topological data analysis (TDA) has emerged as one of the most promising
techniques to reconstruct the unknown shapes of high-dimensional spaces from
observed data samples. TDA, thus, yields key shape descriptors in the form of
persistent topological features that can be used for any supervised or
unsupervised learning task, including multi-way classification. Sparse
sampling, on the other hand, provides a highly efficient technique to
reconstruct signals in the spatial-temporal domain from just a few
carefully chosen samples. Here, we present a new method, referred to as the
SparseTDA algorithm, that combines favorable aspects of the two techniques.
This combination is realized by selecting an optimal set of sparse pixel
samples from the persistent features generated by a vector-based TDA algorithm.
These sparse samples are selected from a low-rank matrix representation of
persistent features using QR pivoting. We show that the SparseTDA method
demonstrates promising performance on three benchmark problems related to human
posture recognition and image texture classification.

Optimal sensor placement is a central challenge in the design, prediction,
estimation, and control of high-dimensional systems. High-dimensional states
can often leverage a latent low-dimensional representation, and this inherent
compressibility enables sparse sensing. This article explores optimized sensor
placement for signal reconstruction based on a tailored library of features
extracted from training data. Sparse point sensors are discovered using the
singular value decomposition and QR pivoting, which are two ubiquitous matrix
computations that underpin modern linear dimensionality reduction. Sparse
sensing in a tailored basis is contrasted with compressed sensing, a universal
signal recovery method in which an unknown signal is reconstructed via a sparse
representation in a universal basis. Although compressed sensing can recover a
wider class of signals, we demonstrate the benefits of exploiting known
patterns in data with optimized sensing. In particular, drastic reductions in
the required number of sensors and improved reconstruction are observed in
examples ranging from facial images to fluid vorticity fields. Principled
sensor placement may be critically enabling when sensors are costly and
provides faster state estimation for low-latency, high-bandwidth control.
MATLAB code is provided for all examples.
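
The MATLAB code mentioned above is not reproduced here. As a rough Python
sketch of the idea, the tailored basis can be taken as the leading left
singular vectors of the training data, and sensor locations as the first
column pivots of a pivoted QR factorization of the transposed basis. The
function names qr_pivots, place_sensors, and reconstruct are hypothetical, and
the pivot selection is written out as a greedy deflation loop (which follows
the usual largest-remaining-column-norm pivoting rule) rather than a call to a
library routine.

```python
import numpy as np

def qr_pivots(M, n_pivots):
    """Greedy column pivoting: repeatedly pick the column with the largest
    remaining norm, then project that direction out of every column."""
    R = np.array(M, dtype=float, copy=True)
    pivots = []
    for _ in range(n_pivots):
        norms = np.sum(R ** 2, axis=0)
        j = int(np.argmax(norms))
        pivots.append(j)
        q = R[:, j] / np.sqrt(norms[j])
        R -= np.outer(q, q @ R)
    return pivots

def place_sensors(X_train, r):
    """X_train is (n_states, n_snapshots). Returns the rank-r basis Psi and
    r sensor (row) indices chosen by pivoting on Psi^T."""
    U, _, _ = np.linalg.svd(X_train, full_matrices=False)
    Psi = U[:, :r]
    sensors = qr_pivots(Psi.T, r)
    return Psi, sensors

def reconstruct(Psi, sensors, y):
    """Estimate the full state from point measurements y = x[sensors]."""
    coeffs, *_ = np.linalg.lstsq(Psi[sensors, :], y, rcond=None)
    return Psi @ coeffs
```

With r sensors and r modes the measured rows of Psi form a square system;
using more sensors than modes turns the same call into a least-squares fit.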

The CANDECOMP/PARAFAC (CP) tensor decomposition is a popular
dimensionality-reduction method for multi-way data. Dimensionality reduction is
often sought since many high-dimensional tensors have low intrinsic rank
relative to the dimension of the ambient measurement space. However, the
emergence of 'big data' poses significant computational challenges for
computing this fundamental tensor decomposition. Leveraging modern randomized
algorithms, we demonstrate that the coherent structure can be learned from a
smaller representation of the tensor in a fraction of the time. Moreover, the
high-dimensional signal can be faithfully approximated from the compressed
measurements. Thus, this simple but powerful algorithm enables one to compute
the approximate CP decomposition even for massive tensors. The approximation
error can thereby be controlled via oversampling and the computation of power
iterations. In addition to theoretical results, several empirical results
demonstrate the performance of the proposed algorithm.
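
The compression stage can be sketched as follows, assuming a standard
randomized range finder applied to each mode unfolding of the tensor. Only the
compression and expansion steps are shown; a CP solver (for example
alternating least squares) would then presumably run on the small tensor, with
each factor matrix mapped back through the corresponding basis Q. The function
names and the parameters oversample and n_power are illustrative choices, not
taken from the paper.

```python
import numpy as np

def unfold(T, mode):
    """Matricize T so that the chosen mode indexes the rows."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def mode_product(T, M, mode):
    """Multiply tensor T along `mode` by matrix M of shape (new_dim, T.shape[mode])."""
    out = np.tensordot(M, T, axes=(1, mode))   # new axis comes out first
    return np.moveaxis(out, 0, mode)

def randomized_range(A, k, oversample=10, n_power=2, rng=None):
    """Orthonormal basis that approximately spans the range of A."""
    rng = np.random.default_rng(rng)
    n_cols = min(k + oversample, A.shape[0])
    Q, _ = np.linalg.qr(A @ rng.standard_normal((A.shape[1], n_cols)))
    for _ in range(n_power):                   # power iterations sharpen the basis
        Z, _ = np.linalg.qr(A.T @ Q)
        Q, _ = np.linalg.qr(A @ Z)
    return Q

def compress_tensor(T, k, **kwargs):
    """Project every mode onto a randomized basis; returns small tensor and bases."""
    small, bases = T, []
    for mode in range(T.ndim):
        Q = randomized_range(unfold(small, mode), k, **kwargs)
        small = mode_product(small, Q.T, mode)
        bases.append(Q)
    return small, bases

def expand_tensor(small, bases):
    """Map the compressed tensor back to the original measurement space."""
    T = small
    for mode, Q in enumerate(bases):
        T = mode_product(T, Q, mode)
    return T
```

Increasing oversample or n_power trades extra computation for a more accurate
basis, which matters most when the singular values of the unfoldings decay
slowly.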

This paper addresses the problem of identifying different flow environments
from sparse data collected by wing strain sensors. Insects regularly perform
this feat using a sparse ensemble of noisy strain sensors on their wing. First,
we obtain strain data from numerical simulation of a Manduca sexta hawkmoth
wing undergoing different flow environments. Our data-driven method learns
low-dimensional strain features originating from different aerodynamic
environments using proper orthogonal decomposition (POD) modes in the frequency
domain, and leverages sparse approximation to classify a set of strain
frequency signatures using a dictionary of POD modes. This bio-inspired machine
learning architecture for dictionary learning and sparse classification permits
fewer costly physical strain sensors while being simultaneously robust to
sensor noise. A measurement selection algorithm identifies frequencies that
best discriminate the different aerodynamic environments in low-rank POD
feature space. In this manner, sparse and noisy wing strain data can be
exploited to robustly identify different aerodynamic environments encountered
in flight, providing insight into the stereotyped placement of neurons that act
as strain sensors on a Manduca sexta hawkmoth wing.