
In physics, biology and engineering, network systems abound. How does the
connectivity of a network system combine with the behavior of its individual
components to determine its collective function? We approach this question for
networks with linear time-invariant dynamics by relating internal network
feedbacks to the statistical prevalence of connectivity motifs, a set of
surprisingly simple and local statistics of connectivity. This results in a
reduced-order model of the network input-output dynamics in terms of motif
structures. As an example, the new formulation dramatically simplifies the
classic Erdős-Rényi graph, reducing the overall network behavior to one
proportional feedback wrapped around the dynamics of a single node. For general
networks, higher-order motifs systematically provide further layers and types
of feedback to regulate the network response. Thus, the local connectivity
shapes temporal and spectral processing by the network as a whole, and we show
how this enables robust, yet tunable, functionality such as extending the time
constant with which networks remember past signals. The theory also extends to
networks composed from heterogeneous nodes with distinct dynamics and
connectivity, and patterned input to (and readout from) subsets of nodes. These
statistical descriptions provide a powerful theoretical framework to understand
the functionality of real-world network systems, as we illustrate with examples
including the mouse brain connectome.

This paper presents a randomized algorithm for computing the near-optimal
low-rank dynamic mode decomposition (DMD). Randomized algorithms are emerging
techniques to compute low-rank matrix approximations at a fraction of the cost
of deterministic algorithms, easing the computational challenges arising in the
area of `big data'. The idea is to derive a small matrix from the
high-dimensional data, which is then used to efficiently compute the dynamic
modes and eigenvalues. The algorithm is presented in a modular probabilistic
framework, and the approximation quality can be controlled via oversampling and
power iterations. The effectiveness of the resulting randomized DMD algorithm
is demonstrated on several benchmark examples of increasing complexity,
providing an accurate and efficient approach to extract spatiotemporal coherent
structures from big data in a framework that scales with the intrinsic rank of
the data, rather than the ambient measurement dimension. For this work we
assume that the dynamics of the problem under consideration is evolving on a
low-dimensional subspace that is well characterized by a fast-decaying singular
value spectrum.
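
The two-stage idea described above (derive a small matrix, then compute DMD on
it) can be sketched as follows. This is a minimal illustration under stated
assumptions, not the authors' implementation: the function name
`randomized_dmd`, its default parameters, and the use of a Gaussian test matrix
are choices made here for illustration, and real-valued data is assumed.

```python
import numpy as np

def randomized_dmd(D, rank, oversample=10, power_iters=2, seed=0):
    """Sketch of a randomized DMD for a real snapshot matrix D (n x m).

    Columns of D are assumed to be equally spaced snapshots in time.
    """
    rng = np.random.default_rng(seed)
    n, m = D.shape
    k = min(rank + oversample, m)  # sketch size, including oversampling

    # Stage 1: orthonormal basis Q for the range of D from a random sketch.
    Q, _ = np.linalg.qr(D @ rng.standard_normal((m, k)))
    for _ in range(power_iters):  # power iterations sharpen the basis
        Z, _ = np.linalg.qr(D.T @ Q)
        Q, _ = np.linalg.qr(D @ Z)

    # Stage 2: standard (exact) DMD on the small matrix B = Q^T D.
    B = Q.T @ D
    X, Y = B[:, :-1], B[:, 1:]
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    U, s, V = U[:, :rank], s[:rank], Vh[:rank].T
    A_tilde = (U.T @ Y @ V) / s  # U^T Y V S^{-1}
    eigvals, W = np.linalg.eig(A_tilde)

    # Lift the small-space modes back to the measurement space.
    modes = Q @ (((Y @ V) / s) @ W)
    return eigvals, modes
```

Here `oversample` and `power_iters` play the role of the oversampling and power
iteration controls mentioned in the abstract; all cost after the sketching
stage depends on the sketch size rather than on the ambient dimension `n`.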

The problem of optimally placing sensors under a cost constraint arises
naturally in the design of industrial and commercial products, as well as in
scientific experiments. We consider a relaxation of the full optimization
formulation of this problem and then extend a well-established QR-based greedy
algorithm for the optimal sensor placement problem without cost constraints. We
demonstrate the effectiveness of this algorithm on data sets related to facial
recognition, climate science, and fluid mechanics. This algorithm is scalable
and often identifies sparse sensors with near-optimal reconstruction
performance, while dramatically reducing the overall cost of the sensors. We
find that the cost-error landscape varies by application, with intuitive
connections to the underlying physics. Additionally, we include experiments for
various preprocessing techniques and find that a popular technique based on
the singular value decomposition is often suboptimal.
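
One way to picture a cost-aware extension of greedy QR pivoting is sketched
below. The scoring rule (remaining column norm minus `gamma` times the sensor
cost) and the names `cost_aware_qr_sensors`, `costs`, and `gamma` are
assumptions made for illustration; the abstract does not spell out the exact
rule used by the authors.

```python
import numpy as np

def cost_aware_qr_sensors(Psi, costs, n_sensors, gamma):
    """Greedy pivoting with a cost penalty (illustrative sketch).

    Psi   : (n_locations, r) basis; each row is a candidate sensor location.
    costs : (n_locations,) cost of placing a sensor at each location.
    gamma : trade-off weight; gamma = 0 gives plain norm-based pivoting.
    """
    R = Psi.T.astype(float).copy()  # columns are candidate locations
    available = np.ones(R.shape[1], dtype=bool)
    chosen = []
    for _ in range(n_sensors):
        norms = np.linalg.norm(R, axis=0)
        # Assumed scoring rule: information (residual norm) minus weighted cost.
        score = np.where(available, norms - gamma * costs, -np.inf)
        j = int(np.argmax(score))
        chosen.append(j)
        available[j] = False
        if norms[j] > 1e-12:
            # Deflate: remove the chosen direction from all remaining columns.
            q = R[:, j] / norms[j]
            R -= np.outer(q, q @ R)
    return chosen
```

Sweeping `gamma` from zero upward traces out a cost-error trade-off: small
values favor informative locations, large values favor cheap ones.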

Sparse sensor placement is a central challenge in the efficient
characterization of complex systems when the cost of acquiring and processing
data is high. Leading sparse sensing methods typically exploit either spatial
or temporal correlations, but rarely both. This work introduces a new sparse
sensor optimization that is designed to leverage the rich spatiotemporal
coherence exhibited by many systems. Our approach is inspired by the remarkable
performance of flying insects, which use a few embedded strain-sensitive
neurons to achieve rapid and robust flight control despite large gust
disturbances. Specifically, we draw on nature to identify targeted
neural-inspired sensors on a flapping wing to detect body rotation. This task
is particularly challenging as the rotational twisting mode is three
orders of magnitude smaller than the flapping modes. We show that nonlinear
filtering in time, built to mimic strain-sensitive neurons, is essential to
detect rotation, whereas instantaneous measurements fail. Optimized sparse
sensor placement results in efficient classification with approximately ten
sensors, achieving the same accuracy and noise robustness as full measurements
consisting of hundreds of sensors. Sparse sensing with neural-inspired encoding
establishes a new paradigm in hyper-efficient, embodied sensing of
spatiotemporal data and sheds light on principles of biological sensing for
agile flight control.

Identifying coordinate transformations that make strongly nonlinear dynamics
approximately linear is a central challenge in modern dynamical systems. These
transformations have the potential to enable prediction, estimation, and
control of nonlinear systems using standard linear theory. The Koopman operator
has emerged as a leading data-driven embedding, as eigenfunctions of this
operator provide intrinsic coordinates that globally linearize the dynamics.
However, identifying and representing these eigenfunctions has proven to be
mathematically and computationally challenging. This work leverages the power
of deep learning to discover representations of Koopman eigenfunctions from
trajectory data of dynamical systems. Our network is parsimonious and
interpretable by construction, embedding the dynamics on a low-dimensional
manifold that is of the intrinsic rank of the dynamics and parameterized by the
Koopman eigenfunctions. In particular, we identify nonlinear coordinates on
which the dynamics are globally linear using a modified autoencoder. We also
generalize Koopman representations to include a ubiquitous class of systems
that exhibit continuous spectra, ranging from the simple pendulum to nonlinear
optics and broadband turbulence. Our framework parametrizes the continuous
frequency using an auxiliary network, enabling a compact and efficient
embedding at the intrinsic rank, while connecting our models to half a century
of asymptotics. In this way, we benefit from the power and generality of deep
learning, while retaining the physical interpretability of Koopman embeddings.

Data-driven transformations that reformulate nonlinear systems in a linear
framework have the potential to enable the prediction, estimation, and control
of strongly nonlinear dynamics using linear systems theory. The Koopman
operator has emerged as a principled linear embedding of nonlinear dynamics,
and its eigenfunctions establish intrinsic coordinates along which the dynamics
behave linearly. Previous studies have used finite-dimensional approximations
of the Koopman operator for model-predictive control approaches. In this work,
we illustrate a fundamental closure issue of this approach and argue that it is
beneficial to represent the dynamics directly in eigenfunction coordinates.
These coordinates form a Koopman-invariant subspace by design and, thus, have
improved predictive power. We then show how the control is formulated in these
intrinsic coordinates and discuss potential benefits and caveats of this
perspective. The resulting control architecture is termed Koopman Reduced Order
Nonlinear Identification and Control (KRONIC). It is further demonstrated that
these eigenfunctions can be approximated with data-driven regression and power
series expansions, based on the partial differential equation governing the
infinitesimal generator of the Koopman operator. Validating discovered
eigenfunctions is crucial and we show that lightly damped eigenfunctions may be
faithfully extracted. These lightly damped eigenfunctions are particularly
relevant for control, as they correspond to nearly conserved quantities that
are associated with persistent dynamics, such as the Hamiltonian. KRONIC is
then demonstrated on a number of relevant examples, including 1) a nonlinear
system with a known linear embedding, 2) a variety of Hamiltonian systems, and
3) a high-dimensional double-gyre model for ocean mixing.
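
To make the notion of an eigenfunction coordinate concrete, the sketch below
checks numerically that a candidate function evolves linearly along a
trajectory. The system used here, dx1/dt = mu*x1 and dx2/dt = lam*(x2 - x1^2),
is a common textbook example with a known linear embedding; it is an assumption
for illustration and is not confirmed by the abstract to be the system the
authors use. For this system, phi(x) = x2 - b*x1^2 with b = lam/(lam - 2*mu)
satisfies d(phi)/dt = lam*phi.

```python
import numpy as np

MU, LAM = -0.1, -1.0            # illustrative parameter values (assumed)
B_COEF = LAM / (LAM - 2 * MU)   # makes phi an eigenfunction with eigenvalue LAM

def rhs(x):
    """Right-hand side of the assumed example system."""
    return np.array([MU * x[0], LAM * (x[1] - x[0] ** 2)])

def phi(x):
    """Candidate Koopman eigenfunction phi(x) = x2 - b * x1^2."""
    return x[1] - B_COEF * x[0] ** 2

def integrate_rk4(x0, dt, n_steps):
    """Classical fourth-order Runge-Kutta integration of rhs."""
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        k1 = rhs(x)
        k2 = rhs(x + 0.5 * dt * k1)
        k3 = rhs(x + 0.5 * dt * k2)
        k4 = rhs(x + dt * k3)
        x = x + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
    return x
```

Along any trajectory, `phi(x(t))` should then equal `exp(LAM * t) * phi(x(0))`
up to integration error, which is the kind of validation check the abstract
calls crucial for discovered eigenfunctions.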

Sparse principal component analysis (SPCA) has emerged as a powerful
technique for modern data analysis. We discuss a robust and scalable algorithm
for computing sparse principal component analysis. Specifically, we model SPCA
as a matrix factorization problem with orthogonality constraints, and develop
specialized optimization algorithms that partially minimize a subset of the
variables (variable projection). The framework incorporates a wide variety of
sparsity-inducing regularizers for SPCA. We also extend the variable projection
approach to robust SPCA, for any robust loss that can be expressed as the
Moreau envelope of a simple function, with the canonical example of the Huber
loss. Finally, randomized methods for linear algebra are used to extend the
approach to the large-scale (big data) setting. The proposed algorithms are
demonstrated using both synthetic and real-world data.

Matrix decompositions are fundamental tools in the area of applied
mathematics, statistical computing, and machine learning. In particular,
low-rank matrix decompositions are vital, and widely used for data analysis,
dimensionality reduction, and data compression. Massive datasets, however, pose
a computational challenge for traditional algorithms, placing significant
constraints on both memory and processing power. Recently, the powerful concept
of randomness has been introduced as a strategy to ease the computational load.
The essential idea of probabilistic algorithms is to employ some amount of
randomness in order to derive a smaller matrix from a high-dimensional data
matrix. The smaller matrix is then used to compute the desired low-rank
approximation. Such algorithms are shown to be computationally efficient for
approximating matrices with low-rank structure. We present the \proglang{R}
package rsvd, and provide a tutorial introduction to randomized matrix
decompositions. Specifically, randomized routines for the singular value
decomposition, (robust) principal component analysis, interpolative
decomposition, and CUR decomposition are discussed. Several examples
demonstrate the routines, and show the computational advantage over other
methods implemented in R.
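
The "smaller matrix" idea can be illustrated with a short randomized SVD. The
package described above is written in R; the sketch below is in Python with
NumPy and does not reproduce the rsvd package's interface. The function name
`rsvd_sketch` and its defaults are hypothetical, and real-valued input is
assumed.

```python
import numpy as np

def rsvd_sketch(A, rank, oversample=10, power_iters=2, seed=0):
    """Approximate rank-`rank` SVD of a real matrix A via random projection."""
    rng = np.random.default_rng(seed)
    k = min(rank + oversample, min(A.shape))

    # Sample the range of A with a Gaussian test matrix and orthonormalize.
    Q, _ = np.linalg.qr(A @ rng.standard_normal((A.shape[1], k)))
    for _ in range(power_iters):
        # Power iterations improve accuracy when singular values decay slowly.
        Q, _ = np.linalg.qr(A.T @ Q)
        Q, _ = np.linalg.qr(A @ Q)

    # Deterministic SVD of the small (k x n) matrix, then map back.
    Ub, s, Vh = np.linalg.svd(Q.T @ A, full_matrices=False)
    return (Q @ Ub)[:, :rank], s[:rank], Vh[:rank]
```

For a matrix with exact rank at most `rank`, the result should agree with the
deterministic SVD up to round-off; for slowly decaying spectra the error is
governed by the discarded singular values, the oversampling, and the number of
power iterations.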

Big data has become a critically enabling component of emerging mathematical
methods aimed at the automated discovery of dynamical systems, where first
principles modeling may be intractable. However, in many engineering systems,
abrupt changes must be rapidly characterized based on limited, incomplete, and
noisy data. Many leading automated learning techniques rely on unrealistically
large data sets and it is unclear how to leverage prior knowledge effectively
to reidentify a model after an abrupt change. In this work, we propose a
conceptual framework to recover parsimonious models of a system in response to
abrupt changes in the low-data limit. First, the abrupt change is detected by
comparing the estimated Lyapunov time of the data with the model prediction.
Next, we apply the sparse identification of nonlinear dynamics (SINDy)
regression to update a previously identified model with the fewest changes,
either by addition, deletion, or modification of existing model terms. We
demonstrate this sparse model recovery on several examples for abrupt system
change detection in periodic and chaotic dynamical systems. Our examples show
that sparse updates to a previously identified model perform better with less
data, have lower runtime complexity, and are less sensitive to noise than
identifying an entirely new model. The proposed abrupt-SINDy architecture
provides a new paradigm for the rapid and efficient recovery of a system model
after abrupt changes.
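
One simple way to realize "update a previous model with the fewest changes" is
to run a sparse regression on the residual that the old model leaves on the new
data, so that only the correction is sparse. The sketch below does this with
sequentially thresholded least squares. It is an illustration under
assumptions: the helper names `stlsq` and `sparse_update`, the threshold, and
the residual-regression formulation are choices made here, and the published
procedure may differ.

```python
import numpy as np

def stlsq(Theta, dx, threshold, n_iter=10):
    """Sequentially thresholded least squares for Theta @ xi ~ dx."""
    xi = np.linalg.lstsq(Theta, dx, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0
        big = ~small
        if not big.any():
            break
        # Refit using only the surviving library terms.
        xi[big] = np.linalg.lstsq(Theta[:, big], dx, rcond=None)[0]
    return xi

def sparse_update(Theta, dx, xi_old, threshold):
    """Correct an existing model with a sparse change (assumed formulation)."""
    residual = dx - Theta @ xi_old        # what the old model fails to explain
    delta = stlsq(Theta, residual, threshold)
    return xi_old + delta                 # added, deleted, or modified terms
```

For example, with a library of monomials [1, x, x^2, x^3], an old model
dx/dt = -x, and new data generated by dx/dt = -1.5x + 0.3x^2, the sparse
correction has two nonzero entries: one modifies the linear term and one adds
the quadratic term.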

Diffusion maps are an emerging data-driven technique for nonlinear
dimensionality reduction, which are especially useful for the analysis of
coherent structures and nonlinear embeddings of dynamical systems. However, the
computational complexity of the diffusion maps algorithm scales with the number
of observations. Thus, long time-series data presents a significant challenge
for fast and efficient embedding. We propose integrating the Nystr\"om method
with diffusion maps in order to ease the computational demand. We achieve a
speedup of roughly two to four times when approximating the dominant diffusion
map components.

Self-tuning optical systems are of growing importance in technological
applications such as mode-locked fiber lasers. Such self-tuning paradigms
require {\em intelligent} algorithms capable of inferring approximate models of
the underlying physics and discovering appropriate control laws in order to
maintain robust performance for a given objective. In this work, we demonstrate
the first integration of a {\em deep learning} (DL) architecture with {\em
model predictive control} (MPC) in order to self-tune a mode-locked fiber
laser. Not only can our DL-MPC algorithmic architecture approximate the unknown
fiber birefringence, it also builds a dynamical model of the laser and
appropriate control law for maintaining robust, high-energy pulses despite a
stochastically drifting birefringence. We demonstrate the effectiveness of this
method on a fiber laser which is mode-locked by nonlinear polarization
rotation. The method advocated can be broadly applied to a variety of optical
systems that require robust controllers.

Topological data analysis (TDA) has emerged as one of the most promising
techniques to reconstruct the unknown shapes of high-dimensional spaces from
observed data samples. TDA, thus, yields key shape descriptors in the form of
persistent topological features that can be used for any supervised or
unsupervised learning task, including multi-way classification. Sparse
sampling, on the other hand, provides a highly efficient technique to
reconstruct signals in the spatial-temporal domain from just a few
carefully chosen samples. Here, we present a new method, referred to as the
SparseTDA algorithm, that combines favorable aspects of the two techniques.
This combination is realized by selecting an optimal set of sparse pixel
samples from the persistent features generated by a vector-based TDA algorithm.
These sparse samples are selected from a low-rank matrix representation of
persistent features using QR pivoting. We show that the SparseTDA method
demonstrates promising performance on three benchmark problems related to human
posture recognition and image texture classification.

Optimal sensor placement is a central challenge in the design, prediction,
estimation, and control of high-dimensional systems. High-dimensional states
can often leverage a latent low-dimensional representation, and this inherent
compressibility enables sparse sensing. This article explores optimized sensor
placement for signal reconstruction based on a tailored library of features
extracted from training data. Sparse point sensors are discovered using the
singular value decomposition and QR pivoting, which are two ubiquitous matrix
computations that underpin modern linear dimensionality reduction. Sparse
sensing in a tailored basis is contrasted with compressed sensing, a universal
signal recovery method in which an unknown signal is reconstructed via a sparse
representation in a universal basis. Although compressed sensing can recover a
wider class of signals, we demonstrate the benefits of exploiting known
patterns in data with optimized sensing. In particular, drastic reductions in
the required number of sensors and improved reconstruction are observed in
examples ranging from facial images to fluid vorticity fields. Principled
sensor placement may be critically enabling when sensors are costly and
provides faster state estimation for low-latency, high-bandwidth control.
MATLAB code is provided for all examples.
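
The pipeline of tailored basis, pivoted sensor selection, and reconstruction
can be sketched in a few lines. The article's own code is in MATLAB; the Python
below is an independent illustration, with hypothetical function names, in
which the pivoting is written out as a greedy deflation loop (equivalent in
spirit to column-pivoted QR) so that only NumPy is needed.

```python
import numpy as np

def qr_pivot_sensors(Psi_r, n_sensors):
    """Pick sensor rows of the basis Psi_r (n x r) by greedy column pivoting.

    Assumes n_sensors <= r and that Psi_r has full column rank.
    """
    R = Psi_r.T.astype(float).copy()  # columns correspond to sensor locations
    picked = []
    for _ in range(n_sensors):
        norms = np.linalg.norm(R, axis=0)
        j = int(np.argmax(norms))     # pivot: largest remaining column norm
        picked.append(j)
        q = R[:, j] / norms[j]
        R -= np.outer(q, q @ R)       # remove the chosen direction
    return np.array(picked)

def reconstruct(Psi_r, sensors, y):
    """Estimate the full state from point measurements y = x[sensors]."""
    a, *_ = np.linalg.lstsq(Psi_r[sensors], y, rcond=None)
    return Psi_r @ a
```

In use, `Psi_r` would hold the leading left singular vectors of a training data
matrix, and `reconstruct` recovers any state that lies in their span from as
few as `r` point measurements.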

Simple aerodynamic configurations under even modest conditions can exhibit
complex flows with a wide range of temporal and spatial features. It has become
common practice in the analysis of these flows to look for and extract
physically important features, or modes, as a first step in the analysis. This
step typically starts with a modal decomposition of an experimental or
numerical dataset of the flow field, or of an operator relevant to the system.
We describe herein some of the dominant techniques for accomplishing these
modal decompositions and analyses that have seen a surge of activity in recent
decades. For a non-expert, keeping track of recent developments can be
daunting, and the intent of this document is to provide an introduction to
modal analysis that is accessible to the larger fluid dynamics community. In
particular, we present a brief overview of several of the well-established
techniques and clearly lay the framework of these methods using familiar linear
algebra. The modal analysis techniques covered in this paper include the proper
orthogonal decomposition (POD), balanced proper orthogonal decomposition
(Balanced POD), dynamic mode decomposition (DMD), Koopman analysis, global
linear stability analysis, and resolvent analysis.
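
As a small example of the "familiar linear algebra" behind these methods, POD
of a snapshot matrix reduces to a singular value decomposition of the
mean-subtracted data. The sketch below is a generic illustration with a
hypothetical function name, not code from the paper.

```python
import numpy as np

def pod(snapshots, n_modes):
    """POD of a snapshot matrix (n_points x n_snapshots) via the SVD."""
    mean_field = snapshots.mean(axis=1, keepdims=True)
    U, s, Vh = np.linalg.svd(snapshots - mean_field, full_matrices=False)
    energy = s**2 / np.sum(s**2)                 # energy fraction per mode
    modes = U[:, :n_modes]                       # orthonormal spatial modes
    coeffs = s[:n_modes, None] * Vh[:n_modes]    # temporal coefficients
    return mean_field, modes, coeffs, energy[:n_modes]
```

The fluctuating field is then approximated by `mean_field + modes @ coeffs`,
and the returned energy fractions indicate how many modes are worth keeping.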

A networked-oscillator-based analysis is performed for periodic bluff body
flows to examine and control the transfer of kinetic energy. Spatial modes
extracted from the flow field with corresponding amplitudes form a set of
oscillators describing unsteady fluctuations. These oscillators are connected
through a network that captures the energy exchanges amongst them. To extract
the network of interactions among oscillators, amplitude and phase
perturbations are impulsively introduced to the oscillators and the ensuing
dynamics are analyzed. Using linear regression techniques, a networked
oscillator model is constructed that reveals energy transfers and phase
interactions among the modes. The model captures the nonlinear interactions
amongst the modal oscillators through a linear approximation. A large
collection of system responses is aggregated into a network model that
captures interactions for general perturbations. The networked oscillator model
describes the modal perturbation dynamics better than the empirical Galerkin
reduced-order models. A model-based feedback controller is then designed to
suppress modal amplitudes and the resulting wake unsteadiness leading to drag
reduction. The strength of the proposed approach is demonstrated for a
canonical example of two-dimensional unsteady flow over a circular cylinder.
The present formulation enables the characterization of modal interactions to
control fundamental energy transfers in unsteady vortical flows.

We propose a general dynamic reduced-order modeling framework for typical
experimental data: time-resolved sensor data and optional non-time-resolved PIV
snapshots. This framework contains four steps. First, the sensor signals are
lifted to a dynamic feature space. Second, we identify a sparse
human-interpretable nonlinear dynamical system for the feature state based on
the sparse identification of nonlinear dynamics (SINDy). Third, if PIV
snapshots are available, a local linear mapping from the feature state to
velocity fields is shown to be orders of magnitude more accurate than optimal
modal expansions of the same order. Fourth, a generalized feature-based modal
decomposition identifies coherent structures that are most dynamically
correlated with the linear and nonlinear interaction terms in the sparse model,
adding interpretability. Steps 1 and 2 define a black-box model. Optional steps
3 and 4 lift the black-box dynamics to a 'gray-box' model of the coherent
structures, if non-time-resolved full-state data is available. This gray-box
modeling strategy is successfully applied to the transient and post-transient
laminar cylinder wake, and compares favorably with a POD model. We foresee
numerous applications of this highly flexible modeling strategy, including
estimation, prediction and control. Moreover, the feature space may be based on
intrinsic coordinates, which are unaffected by a key challenge of modal
expansion: the slow change of low-dimensional coherent structures with changing
geometry and varying parameters.

The present paper reports on our effort to characterize vortical interactions
in complex fluid flows through the use of network analysis. In particular, we
examine the vortex interactions in two-dimensional decaying isotropic
turbulence and find that the vortical interaction network can be characterized
by a weighted scale-free network. It is found that the turbulent flow network
retains its scale-free behavior until the characteristic value of circulation
reaches a critical value. Furthermore, we show that the two-dimensional
turbulence network is resilient against random perturbations but can be greatly
influenced when forcing is focused towards the vortical structures that are
categorized as network hubs. These findings can serve as a network-analytic
foundation to examine complex geophysical and thin-film flows and take
advantage of the rapidly growing field of network theory, which complements
ongoing turbulence research based on vortex dynamics, hydrodynamic stability,
and statistics. While additional work is essential to extend the mathematical
tools from network analysis to extract deeper physical insights of turbulence,
an understanding of turbulence based on the interaction-based network-theoretic
framework presents a promising alternative in turbulence modeling and control
efforts.

The CANDECOMP/PARAFAC (CP) tensor decomposition is a popular
dimensionality-reduction method for multi-way data. Dimensionality reduction is
often sought after since many high-dimensional tensors have low intrinsic rank
relative to the dimension of the ambient measurement space. However, the
emergence of `big data' poses significant computational challenges for
computing this fundamental tensor decomposition. By leveraging modern
randomized algorithms, we demonstrate that coherent structures can be learned
from a smaller representation of the tensor in a fraction of the time. Thus,
this simple but powerful algorithm enables one to compute the approximate CP
decomposition even for massive tensors. The approximation error can thereby be
controlled via oversampling and the computation of power iterations. In
addition to theoretical results, several empirical results demonstrate the
performance of the proposed algorithm.

This paper addresses the problem of identifying different flow environments
from sparse data collected by wing strain sensors. Insects regularly perform
this feat using a sparse ensemble of noisy strain sensors on their wing. First,
we obtain strain data from numerical simulation of a Manduca sexta hawkmoth
wing undergoing different flow environments. Our data-driven method learns
low-dimensional strain features originating from different aerodynamic
environments using proper orthogonal decomposition (POD) modes in the frequency
domain, and leverages sparse approximation to classify a set of strain
frequency signatures using a dictionary of POD modes. This bio-inspired machine
learning architecture for dictionary learning and sparse classification permits
fewer costly physical strain sensors while being simultaneously robust to
sensor noise. A measurement selection algorithm identifies frequencies that
best discriminate the different aerodynamic environments in low-rank POD
feature space. In this manner, sparse and noisy wing strain data can be
exploited to robustly identify different aerodynamic environments encountered
in flight, providing insight into the stereotyped placement of neurons that act
as strain sensors on a Manduca sexta hawkmoth wing.

We develop an algorithm for model selection which allows for the
consideration of a combinatorially large number of candidate models governing a
dynamical system. The innovation circumvents a disadvantage of standard model
selection which typically limits the number of candidate models considered due to
the intractability of computing information criteria. Using a recently
developed sparse identification of nonlinear dynamics algorithm, the
subselection of candidate models near the Pareto frontier allows for a
tractable computation of AIC (Akaike information criterion) or BIC (Bayesian
information criterion) scores for the remaining candidate models. The
information criteria hierarchically rank the most informative models, enabling
the automatic and principled selection of the model with the strongest support
in relation to the time series data. Specifically, we show that AIC scores
place each candidate model in the {\em strong support}, {\em weak support} or
{\em no support} category. The method correctly identifies several canonical
dynamical systems, including an SEIR (susceptible-exposed-infectious-recovered)
disease model and the Lorenz equations, giving the correct dynamical system as
the only candidate model with strong support.
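
The ranking step can be illustrated with the least-squares form of AIC,
AIC = n ln(RSS/n) + 2k, where RSS is the residual sum of squares, n the number
of samples, and k the number of free parameters. The sketch below ranks
candidates by their AIC difference from the best model. The cutoffs (2 and 10)
that separate the three support categories are illustrative assumptions, not
values taken from the abstract, and the function name is hypothetical.

```python
import numpy as np

def rank_by_aic(rss, n_params, n_samples, strong_cut=2.0, none_cut=10.0):
    """Relative AIC scores and support labels for candidate models.

    rss       : residual sum of squares of each candidate on the data
    n_params  : number of nonzero coefficients in each candidate
    Cutoffs are illustrative assumptions.
    """
    rss = np.asarray(rss, dtype=float)
    k = np.asarray(n_params, dtype=float)
    aic = n_samples * np.log(rss / n_samples) + 2.0 * k
    delta = aic - aic.min()   # 0 for the best-supported model
    labels = [
        "strong" if d <= strong_cut else "weak" if d <= none_cut else "none"
        for d in delta
    ]
    return delta, labels
```

A model with a slightly lower residual but many more terms is penalized through
the 2k term, which is how an over-fitted candidate can end up with only weak
support even though it fits the data marginally better.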

This work develops a parallelized algorithm to compute the dynamic mode
decomposition (DMD) on a graphics processing unit using the streaming method of
snapshots singular value decomposition. This allows the algorithm to operate
efficiently on streaming data by avoiding redundant inner products as new data
becomes available. In addition, it is possible to leverage the native
compressed format of many data streams, such as HD video and computational
physics codes that are represented sparsely in the Fourier domain, to massively
reduce data transfer from CPU to GPU and to enable sparse matrix
multiplications. Taken together, these algorithms facilitate real-time
streaming DMD on high-dimensional data streams. We demonstrate the proposed
method on numerous high-dimensional data sets ranging from video background
modeling to scientific computing applications, where DMD is becoming a mainstay
algorithm. The computational framework is developed as an open-source library
written in C++ with CUDA, and the algorithms may be generalized to include
other DMD advances, such as compressed sensing DMD, multi-resolution DMD, or
DMD with control. Keywords: Singular value decomposition, dynamic mode
decomposition, streaming computations, graphics processing unit, video
background modeling, scientific computing.

Although major advances have been achieved over the past decades for the
reduction and identification of linear systems, deriving nonlinear low-order
models is still a challenging task. In this work, we develop a new
data-driven framework to identify nonlinear reduced-order models of a fluid by
combining dimensionality reduction techniques (e.g. proper orthogonal
decomposition) and sparse regression techniques from machine learning. In
particular, we extend the sparse identification of nonlinear dynamics (SINDy)
algorithm to enforce physical constraints in the regression, namely
energy-preserving quadratic nonlinearities. The resulting models, hereafter
referred to as Galerkin regression models, incorporate many beneficial aspects
of Galerkin projection, but without the need for a full-order or high-fidelity
solver to project the Navier-Stokes equations. Instead, the most parsimonious
nonlinear model is determined that is consistent with observed measurement
data and satisfies necessary constraints. Galerkin regression models also
readily generalize to include higher-order nonlinear terms that model the
effect of truncated modes. The effectiveness of Galerkin regression is
demonstrated on two different flow configurations: the two-dimensional flow
past a circular cylinder and the shear-driven cavity flow. For both cases, the
accuracy of the identified models compares favorably against reduced-order
models obtained from a standard Galerkin projection procedure. Present results
highlight the importance of cubic nonlinearities in the construction of
accurate nonlinear low-dimensional approximations of the flow systems,
something which cannot be readily obtained using a standard Galerkin projection
of the Navier-Stokes equations. Finally, the entire code base for our
constrained sparse Galerkin regression algorithm is freely available online.

We propose a sparse regression method capable of discovering the governing
partial differential equation(s) of a given system by time series measurements
in the spatial domain. The regression framework relies on sparsity-promoting
techniques to select the nonlinear and partial derivative terms of the
governing equations that most accurately represent the data, bypassing a
combinatorially large search through all possible candidate models. The method
balances model complexity and regression accuracy by selecting a parsimonious
model via Pareto analysis. Time series measurements can be made in an Eulerian
framework where the sensors are fixed spatially, or in a Lagrangian framework
where the sensors move with the dynamics. The method is computationally
efficient, robust, and demonstrated to work on a variety of canonical problems
of mathematical physics including Navier-Stokes, the quantum harmonic
oscillator, and the diffusion equation. Moreover, the method is capable of
disambiguating between potentially non-unique dynamical terms by using multiple
time series taken with different initial data. Thus for a traveling wave, the
method can distinguish between a linear wave equation and the Korteweg-de Vries
equation, for instance. The method provides a promising new technique for
discovering governing equations and physical laws in parametrized
spatiotemporal systems where first-principles derivations are intractable.
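
A toy version of this idea, for a periodic one-dimensional field sampled on a
fixed (Eulerian) grid, is sketched below. It builds a small library of
candidate terms, estimates the time derivative by finite differences, and
applies thresholded least squares. The four-term library, the threshold, and
the function names are assumptions for illustration; the method described in
the abstract uses a richer library and a Pareto analysis to select the model.

```python
import numpy as np

def spectral_dx(U, dx, order=1):
    """Derivative along axis 0 of a periodic field, computed with the FFT."""
    ik = 2j * np.pi * np.fft.fftfreq(U.shape[0], d=dx)
    return np.real(np.fft.ifft((ik[:, None] ** order) * np.fft.fft(U, axis=0), axis=0))

def sparse_pde_fit(U, dx, dt, threshold=0.05, n_iter=5):
    """Fit u_t as a sparse combination of candidate terms.

    U has shape (n_space, n_time). The first and last time samples are
    dropped so that only central differences in time are used.
    """
    u_t = np.gradient(U, dt, axis=1)[:, 1:-1]
    u = U[:, 1:-1]
    u_x = spectral_dx(U, dx, 1)[:, 1:-1]
    u_xx = spectral_dx(U, dx, 2)[:, 1:-1]
    names = ["u", "u_x", "u_xx", "u*u_x"]  # assumed candidate library
    Theta = np.column_stack([c.ravel() for c in (u, u_x, u_xx, u * u_x)])
    rhs = u_t.ravel()

    xi = np.linalg.lstsq(Theta, rhs, rcond=None)[0]
    for _ in range(n_iter):
        keep = np.abs(xi) >= threshold     # terms that survive thresholding
        xi[:] = 0.0
        if keep.any():
            xi[keep] = np.linalg.lstsq(Theta[:, keep], rhs, rcond=None)[0]
    return dict(zip(names, xi))
```

Applied to data from the diffusion equation u_t = nu * u_xx, the fit should
retain only the `u_xx` term with a coefficient close to `nu`, without
enumerating every subset of the library.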

We introduce the method of compressed dynamic mode decomposition (cDMD) for
background modeling. The dynamic mode decomposition (DMD) is a regression
technique that integrates two of the leading data analysis methods in use
today: Fourier transforms and singular value decomposition. Borrowing ideas
from compressed sensing and matrix sketching, cDMD eases the computational
workload of high-resolution video processing. The key principle of cDMD is to
obtain the decomposition on a (small) compressed matrix representation of the
video feed. Hence, the cDMD algorithm scales with the intrinsic rank of the
matrix, rather than the size of the actual video (data) matrix. Selection of
the optimal modes characterizing the background is formulated as a
sparsity-constrained sparse coding problem. Our results show that the quality
of the resulting background model is competitive, quantified by the F-measure,
Recall and Precision. A GPU (graphics processing unit) accelerated
implementation is also presented which further boosts the computational
efficiency of the algorithm.

Understanding the interplay of order and disorder in chaotic systems is a
central challenge in modern quantitative science. We present a universal,
data-driven decomposition of chaos as an intermittently forced linear system.
This work combines Takens' delay embedding with modern Koopman operator theory
and sparse regression to obtain linear representations of strongly nonlinear
dynamics. The result is a decomposition of chaotic dynamics into a linear model
in the leading delay coordinates with forcing by low energy delay coordinates;
we call this the Hankel alternative view of Koopman (HAVOK) analysis. This
analysis is applied to the canonical Lorenz system, as well as to real-world
examples such as the Earth's magnetic field reversal, and data from
electrocardiogram, electroencephalogram, and measles outbreaks. In each case,
the forcing statistics are non-Gaussian, with long tails corresponding to rare
events that trigger intermittent switching and bursting phenomena; this forcing
is highly predictive, providing a clear signature that precedes these events.
Moreover, the activity of the forcing signal demarcates large coherent regions
of phase space where the dynamics are approximately linear from those that are
strongly nonlinear.
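
The structure of the decomposition (delay embedding, SVD, then a linear model
on the leading delay coordinates forced by the last retained one) can be
sketched as follows. This is a simplified illustration: the function name,
plain least squares in place of sparse regression, and finite-difference
derivatives are assumptions made here rather than details from the abstract.

```python
import numpy as np

def havok_sketch(x, dt, n_delays, r):
    """Fit dv/dt = A v + B v_r on delay coordinates of a scalar series x.

    v holds the first r-1 delay coordinates; v_r (the r-th) acts as forcing.
    """
    n_cols = len(x) - n_delays + 1
    # Hankel matrix: row i is the time series shifted by i samples.
    H = np.stack([x[i:i + n_cols] for i in range(n_delays)])
    U, s, Vh = np.linalg.svd(H, full_matrices=False)
    V = Vh[:r].T                          # (time, r) delay coordinates
    dV = np.gradient(V, dt, axis=0)       # finite-difference time derivative

    # Regress the derivatives of the first r-1 coordinates on all r coordinates.
    coef, *_ = np.linalg.lstsq(V, dV[:, :r - 1], rcond=None)
    A = coef[:r - 1].T                    # linear dynamics, (r-1) x (r-1)
    B = coef[r - 1:].T                    # forcing input, (r-1) x 1
    forcing = V[:, r - 1]
    return A, B, forcing
```

For a chaotic signal, the returned `forcing` series is the quantity whose
heavy-tailed statistics the abstract describes; large excursions in it would be
expected to accompany switching or bursting events.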