
This paper presents a randomized algorithm for computing the nearoptimal
lowrank dynamic mode decomposition (DMD). Randomized algorithms are emerging
techniques to compute lowrank matrix approximations at a fraction of the cost
of deterministic algorithms, easing the computational challenges arising in the
area of `big data'. The idea is to derive a small matrix from the
highdimensional data, which is then used to efficiently compute the dynamic
modes and eigenvalues. The algorithm is presented in a modular probabilistic
framework, and the approximation quality can be controlled via oversampling and
power iterations. The effectiveness of the resulting randomized DMD (rDMD)
algorithm is demonstrated on several benchmark examples of increasing
complexity, providing an accurate and efficient approach to extract
spatiotemporal coherent structures from big data in a framework that scales
with the intrinsic rank of the data, rather than the ambient measurement
dimension.

Nonnegative matrix factorization (NMF) is a powerful tool for data mining.
However, the emergence of `big data' has severely challenged our ability to
compute this fundamental decomposition using deterministic algorithms. This
paper presents a randomized hierarchical alternating least squares (HALS)
algorithm to compute the NMF. By deriving a smaller matrix from the nonnegative
input data, a more efficient nonnegative decomposition can be computed. Our
algorithm scales to big data applications while attaining a nearoptimal
factorization. The proposed algorithm is evaluated using synthetic and real
world data and shows substantial speedups compared to deterministic HALS.

Sparse principal component analysis (SPCA) has emerged as a powerful
technique for modern data analysis. We discuss a robust and scalable algorithm
for computing sparse principal component analysis. Specifically, we model SPCA
as a matrix factorization problem with orthogonality constraints, and develop
specialized optimization algorithms that partially minimize a subset of the
variables (variable projection). The framework incorporates a wide variety of
sparsityinducing regularizers for SPCA. We also extend the variable projection
approach to robust SPCA, for any robust loss that can be expressed as the
Moreau envelope of a simple function, with the canonical example of the Huber
loss. Finally, randomized methods for linear algebra are used to extend the
approach to the largescale (big data) setting. The proposed algorithms are
demonstrated using both synthetic and real world data.

Matrix decompositions are fundamental tools in the area of applied
mathematics, statistical computing, and machine learning. In particular,
lowrank matrix decompositions are vital, and widely used for data analysis,
dimensionality reduction, and data compression. Massive datasets, however, pose
a computational challenge for traditional algorithms, placing significant
constraints on both memory and processing power. Recently, the powerful concept
of randomness has been introduced as a strategy to ease the computational load.
The essential idea of probabilistic algorithms is to employ some amount of
randomness in order to derive a smaller matrix from a highdimensional data
matrix. The smaller matrix is then used to compute the desired lowrank
approximation. Such algorithms are shown to be computationally efficient for
approximating matrices with lowrank structure. We present the R package rsvd,
and provide a tutorial introduction to randomized matrix decompositions.
Specifically, randomized routines for the singular value decomposition,
(robust) principal component analysis, interpolative decomposition, and CUR
decomposition are discussed. Several examples demonstrate the routines, and
show the computational advantage over other methods implemented in R.

Diffusion maps are an emerging datadriven technique for nonlinear
dimensionality reduction, which are especially useful for the analysis of
coherent structures and nonlinear embeddings of dynamical systems. However, the
computational complexity of the diffusion maps algorithm scales with the number
of observations. Thus, long timeseries data presents a significant challenge
for fast and efficient embedding. We propose integrating the Nystr\"om method
with diffusion maps in order to ease the computational demand. We achieve a
speedup of roughly two to four times when approximating the dominant diffusion
map components.

The CANDECOMP/PARAFAC (CP) tensor decomposition is a popular
dimensionalityreduction method for multiway data. Dimensionality reduction is
often sought since many highdimensional tensors have low intrinsic rank
relative to the dimension of the ambient measurement space. However, the
emergence of `big data' poses significant computational challenges for
computing this fundamental tensor decomposition. Leveraging modern randomized
algorithms, we demonstrate that the coherent structure can be learned from a
smaller representation of the tensor in a fraction of the time. Moreover, the
highdimensional signal can be faithfully approximated from the compressed
measurements. Thus, this simple but powerful algorithm enables one to compute
the approximate CP decomposition even for massive tensors. The approximation
error can thereby be controlled via oversampling and the computation of power
iterations. In addition to theoretical results, several empirical results
demonstrate the performance of the proposed algorithm.

We introduce the method of compressed dynamic mode decomposition (cDMD) for
background modeling. The dynamic mode decomposition (DMD) is a regression
technique that integrates two of the leading data analysis methods in use
today: Fourier transforms and singular value decomposition. Borrowing ideas
from compressed sensing and matrix sketching, cDMD eases the computational
workload of high resolution video processing. The key principal of cDMD is to
obtain the decomposition on a (small) compressed matrix representation of the
video feed. Hence, the cDMD algorithm scales with the intrinsic rank of the
matrix, rather then the size of the actual video (data) matrix. Selection of
the optimal modes characterizing the background is formulated as a
sparsityconstrained sparse coding problem. Our results show, that the quality
of the resulting background model is competitive, quantified by the Fmeasure,
Recall and Precision. A GPU (graphics processing unit) accelerated
implementation is also presented which further boosts the computational
efficiency of the algorithm.

This paper introduces a fast algorithm for randomized computation of a
lowrank Dynamic Mode Decomposition (DMD) of a matrix. Here we consider this
matrix to represent the development of a spatial grid through time e.g. data
from a static video source. DMD was originally introduced in the fluid
mechanics community, but is also suitable for motion detection in video streams
and its use for background subtraction has received little previous
investigation. In this study we present a comprehensive evaluation of
background subtraction, using the randomized DMD and compare the results with
leading robust principal component analysis algorithms. The results are
convincing and show the random DMD is an efficient and powerful approach for
background modeling, allowing processing of high resolution videos in
realtime. Supplementary materials include implementations of the algorithms in
Python.