• In physics, biology and engineering, network systems abound. How does the connectivity of a network system combine with the behavior of its individual components to determine its collective function? We approach this question for networks with linear time-invariant dynamics by relating internal network feedbacks to the statistical prevalence of connectivity motifs, a set of surprisingly simple and local statistics of connectivity. This results in a reduced-order model of the network input-output dynamics in terms of motif structures. As an example, the new formulation dramatically simplifies the classic Erdős-Rényi graph, reducing the overall network behavior to one proportional feedback wrapped around the dynamics of a single node. For general networks, higher-order motifs systematically provide further layers and types of feedback to regulate the network response. Thus, the local connectivity shapes temporal and spectral processing by the network as a whole, and we show how this enables robust, yet tunable, functionality such as extending the time constant with which networks remember past signals. The theory also extends to networks composed of heterogeneous nodes with distinct dynamics and connectivity, and patterned input to (and readout from) subsets of nodes. These statistical descriptions provide a powerful theoretical framework to understand the functionality of real-world network systems, as we illustrate with examples including the mouse brain connectome.
  • This paper presents a randomized algorithm for computing the near-optimal low-rank dynamic mode decomposition (DMD). Randomized algorithms are emerging techniques to compute low-rank matrix approximations at a fraction of the cost of deterministic algorithms, easing the computational challenges arising in the area of `big data'. The idea is to derive a small matrix from the high-dimensional data, which is then used to efficiently compute the dynamic modes and eigenvalues. The algorithm is presented in a modular probabilistic framework, and the approximation quality can be controlled via oversampling and power iterations. The effectiveness of the resulting randomized DMD (rDMD) algorithm is demonstrated on several benchmark examples of increasing complexity, providing an accurate and efficient approach to extract spatiotemporal coherent structures from big data in a framework that scales with the intrinsic rank of the data, rather than the ambient measurement dimension.
  • The problem of optimally placing sensors under a cost constraint arises naturally in the design of industrial and commercial products, as well as in scientific experiments. We consider a relaxation of the full optimization formulation of this problem and then extend a well-established QR-based greedy algorithm for the optimal sensor placement problem without cost constraints. We demonstrate the effectiveness of this algorithm on data sets related to facial recognition, climate science, and fluid mechanics. This algorithm is scalable and often identifies sparse sensors with near optimal reconstruction performance, while dramatically reducing the overall cost of the sensors. We find that the cost-error landscape varies by application, with intuitive connections to the underlying physics. Additionally, we include experiments for various pre-processing techniques and find that a popular technique based on the singular value decomposition is often sub-optimal.
  • Sparse sensor placement is a central challenge in the efficient characterization of complex systems when the cost of acquiring and processing data is high. Leading sparse sensing methods typically exploit either spatial or temporal correlations, but rarely both. This work introduces a new sparse sensor optimization that is designed to leverage the rich spatiotemporal coherence exhibited by many systems. Our approach is inspired by the remarkable performance of flying insects, which use a few embedded strain-sensitive neurons to achieve rapid and robust flight control despite large gust disturbances. Specifically, we draw on nature to identify targeted neural-inspired sensors on a flapping wing to detect body rotation. This task is particularly challenging as the rotational twisting mode is three orders-of-magnitude smaller than the flapping modes. We show that nonlinear filtering in time, built to mimic strain-sensitive neurons, is essential to detect rotation, whereas instantaneous measurements fail. Optimized sparse sensor placement results in efficient classification with approximately ten sensors, achieving the same accuracy and noise robustness as full measurements consisting of hundreds of sensors. Sparse sensing with neural inspired encoding establishes a new paradigm in hyper-efficient, embodied sensing of spatiotemporal data and sheds light on principles of biological sensing for agile flight control.
  • Identifying coordinate transformations that make strongly nonlinear dynamics approximately linear is a central challenge in modern dynamical systems. These transformations have the potential to enable prediction, estimation, and control of nonlinear systems using standard linear theory. The Koopman operator has emerged as a leading data-driven embedding, as eigenfunctions of this operator provide intrinsic coordinates that globally linearize the dynamics. However, identifying and representing these eigenfunctions has proven to be mathematically and computationally challenging. This work leverages the power of deep learning to discover representations of Koopman eigenfunctions from trajectory data of dynamical systems. Our network is parsimonious and interpretable by construction, embedding the dynamics on a low-dimensional manifold that is of the intrinsic rank of the dynamics and parameterized by the Koopman eigenfunctions. In particular, we identify nonlinear coordinates on which the dynamics are globally linear using a modified auto-encoder. We also generalize Koopman representations to include a ubiquitous class of systems that exhibit continuous spectra, ranging from the simple pendulum to nonlinear optics and broadband turbulence. Our framework parametrizes the continuous frequency using an auxiliary network, enabling a compact and efficient embedding at the intrinsic rank, while connecting our models to half a century of asymptotics. In this way, we benefit from the power and generality of deep learning, while retaining the physical interpretability of Koopman embeddings.
  • Data-driven transformations that reformulate nonlinear systems in a linear framework have the potential to enable the prediction, estimation, and control of strongly nonlinear dynamics using linear systems theory. The Koopman operator has emerged as a principled linear embedding of nonlinear dynamics, and its eigenfunctions establish intrinsic coordinates along which the dynamics behave linearly. In this work, we demonstrate a data-driven control architecture, termed Koopman Reduced Order Nonlinear Identification and Control (KRONIC), that utilizes Koopman eigenfunctions to manipulate nonlinear systems using linear systems theory. We approximate these eigenfunctions with data-driven regression and power series expansions, based on the partial differential equation governing the infinitesimal generator of the Koopman operator. Although previous regression-based methods may identify spurious dynamics, we show that lightly damped eigenfunctions may be faithfully extracted using sparse regression. These lightly damped eigenfunctions are particularly relevant for control, as they correspond to nearly conserved quantities that are associated with persistent dynamics, such as the Hamiltonian. We derive the form of control in these intrinsic eigenfunction coordinates and design nonlinear controllers using standard linear control theory. KRONIC is then demonstrated on a number of relevant examples, including 1) a nonlinear system with a known linear embedding, 2) a variety of Hamiltonian systems, and 3) a high-dimensional double-gyre model for ocean mixing.
  • Sparse principal component analysis (SPCA) has emerged as a powerful technique for modern data analysis. We discuss a robust and scalable algorithm for computing sparse principal component analysis. Specifically, we model SPCA as a matrix factorization problem with orthogonality constraints, and develop specialized optimization algorithms that partially minimize a subset of the variables (variable projection). The framework incorporates a wide variety of sparsity-inducing regularizers for SPCA. We also extend the variable projection approach to robust SPCA, for any robust loss that can be expressed as the Moreau envelope of a simple function, with the canonical example of the Huber loss. Finally, randomized methods for linear algebra are used to extend the approach to the large-scale (big data) setting. The proposed algorithms are demonstrated using both synthetic and real world data.
  • Matrix decompositions are fundamental tools in the area of applied mathematics, statistical computing, and machine learning. In particular, low-rank matrix decompositions are vital, and widely used for data analysis, dimensionality reduction, and data compression. Massive datasets, however, pose a computational challenge for traditional algorithms, placing significant constraints on both memory and processing power. Recently, the powerful concept of randomness has been introduced as a strategy to ease the computational load. The essential idea of probabilistic algorithms is to employ some amount of randomness in order to derive a smaller matrix from a high-dimensional data matrix. The smaller matrix is then used to compute the desired low-rank approximation. Such algorithms are shown to be computationally efficient for approximating matrices with low-rank structure. We present the R package rsvd, and provide a tutorial introduction to randomized matrix decompositions. Specifically, randomized routines for the singular value decomposition, (robust) principal component analysis, interpolative decomposition, and CUR decomposition are discussed. Several examples demonstrate the routines, and show the computational advantage over other methods implemented in R.
  • Big data has become a critically enabling component of emerging mathematical methods aimed at the automated discovery of dynamical systems, where first principles modeling may be intractable. However, in many engineering systems, abrupt changes must be rapidly characterized based on limited, incomplete, and noisy data. Many leading automated learning techniques rely on unrealistically large data sets and it is unclear how to leverage prior knowledge effectively to re-identify a model after an abrupt change. In this work, we propose a conceptual framework to recover parsimonious models of a system in response to abrupt changes in the low-data limit. First, the abrupt change is detected by comparing the estimated Lyapunov time of the data with the model prediction. Next, we apply the sparse identification of nonlinear dynamics (SINDy) regression to update a previously identified model with the fewest changes, either by addition, deletion, or modification of existing model terms. We demonstrate this sparse model recovery on several examples for abrupt system change detection in periodic and chaotic dynamical systems. Our examples show that sparse updates to a previously identified model perform better with less data, have lower runtime complexity, and are less sensitive to noise than identifying an entirely new model. The proposed abrupt-SINDy architecture provides a new paradigm for the rapid and efficient recovery of a system model after abrupt changes.
  • Diffusion maps are an emerging data-driven technique for non-linear dimensionality reduction, which are especially useful for the analysis of coherent structures and nonlinear embeddings of dynamical systems. However, the computational complexity of the diffusion maps algorithm scales with the number of observations. Thus, long time-series data presents a significant challenge for fast and efficient embedding. We propose integrating the Nyström method with diffusion maps in order to ease the computational demand. We achieve a speedup of roughly two to four times when approximating the dominant diffusion map components.
  • Self-tuning optical systems are of growing importance in technological applications such as mode-locked fiber lasers. Such self-tuning paradigms require {\em intelligent} algorithms capable of inferring approximate models of the underlying physics and discovering appropriate control laws in order to maintain robust performance for a given objective. In this work, we demonstrate the first integration of a {\em deep learning} (DL) architecture with {\em model predictive control} (MPC) in order to self-tune a mode-locked fiber laser. Not only can our DL-MPC algorithmic architecture approximate the unknown fiber birefringence, it also builds a dynamical model of the laser and appropriate control law for maintaining robust, high-energy pulses despite a stochastically drifting birefringence. We demonstrate the effectiveness of this method on a fiber laser which is mode-locked by nonlinear polarization rotation. The method advocated can be broadly applied to a variety of optical systems that require robust controllers.
  • Topological data analysis (TDA) has emerged as one of the most promising techniques to reconstruct the unknown shapes of high-dimensional spaces from observed data samples. TDA, thus, yields key shape descriptors in the form of persistent topological features that can be used for any supervised or unsupervised learning task, including multi-way classification. Sparse sampling, on the other hand, provides a highly efficient technique to reconstruct signals in the spatial-temporal domain from just a few carefully-chosen samples. Here, we present a new method, referred to as the Sparse-TDA algorithm, that combines favorable aspects of the two techniques. This combination is realized by selecting an optimal set of sparse pixel samples from the persistent features generated by a vector-based TDA algorithm. These sparse samples are selected from a low-rank matrix representation of persistent features using QR pivoting. We show that the Sparse-TDA method demonstrates promising performance on three benchmark problems related to human posture recognition and image texture classification.
  • Optimal sensor placement is a central challenge in the design, prediction, estimation, and control of high-dimensional systems. High-dimensional states can often leverage a latent low-dimensional representation, and this inherent compressibility enables sparse sensing. This article explores optimized sensor placement for signal reconstruction based on a tailored library of features extracted from training data. Sparse point sensors are discovered using the singular value decomposition and QR pivoting, which are two ubiquitous matrix computations that underpin modern linear dimensionality reduction. Sparse sensing in a tailored basis is contrasted with compressed sensing, a universal signal recovery method in which an unknown signal is reconstructed via a sparse representation in a universal basis. Although compressed sensing can recover a wider class of signals, we demonstrate the benefits of exploiting known patterns in data with optimized sensing. In particular, drastic reductions in the required number of sensors and improved reconstruction are observed in examples ranging from facial images to fluid vorticity fields. Principled sensor placement may be critically enabling when sensors are costly and provides faster state estimation for low-latency, high-bandwidth control. MATLAB code is provided for all examples.
  • Simple aerodynamic configurations under even modest conditions can exhibit complex flows with a wide range of temporal and spatial features. It has become common practice in the analysis of these flows to look for and extract physically important features, or modes, as a first step in the analysis. This step typically starts with a modal decomposition of an experimental or numerical dataset of the flow field, or of an operator relevant to the system. We describe herein some of the dominant techniques for accomplishing these modal decompositions and analyses that have seen a surge of activity in recent decades. For a non-expert, keeping track of recent developments can be daunting, and the intent of this document is to provide an introduction to modal analysis that is accessible to the larger fluid dynamics community. In particular, we present a brief overview of several of the well-established techniques and clearly lay the framework of these methods using familiar linear algebra. The modal analysis techniques covered in this paper include the proper orthogonal decomposition (POD), balanced proper orthogonal decomposition (Balanced POD), dynamic mode decomposition (DMD), Koopman analysis, global linear stability analysis, and resolvent analysis.
  • A networked oscillator based analysis is performed for periodic bluff body flows to examine and control the transfer of kinetic energy. Spatial modes extracted from the flow field with corresponding amplitudes form a set of oscillators describing unsteady fluctuations. These oscillators are connected through a network that captures the energy exchanges amongst them. To extract the network of interactions among oscillators, amplitude and phase perturbations are impulsively introduced to the oscillators and the ensuing dynamics are analyzed. Using linear regression techniques, a networked oscillator model is constructed that reveals energy transfers and phase interactions among the modes. The model captures the nonlinear interactions amongst the modal oscillators through a linear approximation. A large collection of system responses is aggregated into a network model that captures interactions for general perturbations. The networked oscillator model describes the modal perturbation dynamics better than the empirical Galerkin reduced-order models. A model-based feedback controller is then designed to suppress modal amplitudes and the resulting wake unsteadiness, leading to drag reduction. The strength of the proposed approach is demonstrated for a canonical example of two-dimensional unsteady flow over a circular cylinder. The present formulation enables the characterization of modal interactions to control fundamental energy transfers in unsteady vortical flows.
  • We propose a general dynamic reduced-order modeling framework for typical experimental data: time-resolved sensor data and optional non-time-resolved PIV snapshots. This framework contains four steps. First, the sensor signals are lifted to a dynamic feature space. Second, we identify a sparse human-interpretable nonlinear dynamical system for the feature state based on the sparse identification of nonlinear dynamics (SINDy). Third, if PIV snapshots are available, a local linear mapping from the feature state to velocity fields is shown to be orders of magnitude more accurate than optimal modal expansions of the same order. Fourth, a generalized feature-based modal decomposition identifies coherent structures that are most dynamically correlated with the linear and nonlinear interaction terms in the sparse model, adding interpretability. Steps 1 and 2 define a black-box model. Optional steps 3 and 4 lift the black-box dynamics to a 'gray-box' model of the coherent structures, if non-time-resolved full-state data is available. This gray-box modeling strategy is successfully applied to the transient and post-transient laminar cylinder wake, and compares favorably with a POD model. We foresee numerous applications of this highly flexible modeling strategy, including estimation, prediction and control. Moreover, the feature space may be based on intrinsic coordinates, which are unaffected by a key challenge of modal expansion: the slow change of low-dimensional coherent structures with changing geometry and varying parameters.
  • The present paper reports on our effort to characterize vortical interactions in complex fluid flows through the use of network analysis. In particular, we examine the vortex interactions in two-dimensional decaying isotropic turbulence and find that the vortical interaction network can be characterized by a weighted scale-free network. It is found that the turbulent flow network retains its scale-free behavior until the characteristic value of circulation reaches a critical value. Furthermore, we show that the two-dimensional turbulence network is resilient against random perturbations but can be greatly influenced when forcing is focused towards the vortical structures that are categorized as network hubs. These findings can serve as a network-analytic foundation to examine complex geophysical and thin-film flows and take advantage of the rapidly growing field of network theory, which complements ongoing turbulence research based on vortex dynamics, hydrodynamic stability, and statistics. While additional work is essential to extend the mathematical tools from network analysis to extract deeper physical insights of turbulence, an understanding of turbulence based on the interaction-based network-theoretic framework presents a promising alternative in turbulence modeling and control efforts.
  • The CANDECOMP/PARAFAC (CP) tensor decomposition is a popular dimensionality-reduction method for multiway data. Dimensionality reduction is often sought since many high-dimensional tensors have low intrinsic rank relative to the dimension of the ambient measurement space. However, the emergence of `big data' poses significant computational challenges for computing this fundamental tensor decomposition. Leveraging modern randomized algorithms, we demonstrate that the coherent structure can be learned from a smaller representation of the tensor in a fraction of the time. Moreover, the high-dimensional signal can be faithfully approximated from the compressed measurements. Thus, this simple but powerful algorithm enables one to compute the approximate CP decomposition even for massive tensors. The approximation error can thereby be controlled via oversampling and the computation of power iterations. In addition to theoretical results, several empirical results demonstrate the performance of the proposed algorithm.
  • This paper addresses the problem of identifying different flow environments from sparse data collected by wing strain sensors. Insects regularly perform this feat using a sparse ensemble of noisy strain sensors on their wing. First, we obtain strain data from numerical simulation of a Manduca sexta hawkmoth wing undergoing different flow environments. Our data-driven method learns low-dimensional strain features originating from different aerodynamic environments using proper orthogonal decomposition (POD) modes in the frequency domain, and leverages sparse approximation to classify a set of strain frequency signatures using a dictionary of POD modes. This bio-inspired machine learning architecture for dictionary learning and sparse classification permits fewer costly physical strain sensors while being simultaneously robust to sensor noise. A measurement selection algorithm identifies frequencies that best discriminate the different aerodynamic environments in low-rank POD feature space. In this manner, sparse and noisy wing strain data can be exploited to robustly identify different aerodynamic environments encountered in flight, providing insight into the stereotyped placement of neurons that act as strain sensors on a Manduca sexta hawkmoth wing.
  • We develop an algorithm for model selection which allows for the consideration of a combinatorially large number of candidate models governing a dynamical system. The innovation circumvents a disadvantage of standard model selection, which typically limits the number of candidate models considered due to the intractability of computing information criteria. Using a recently developed sparse identification of nonlinear dynamics algorithm, the sub-selection of candidate models near the Pareto frontier allows for a tractable computation of AIC (Akaike information criterion) or BIC (Bayesian information criterion) scores for the remaining candidate models. The information criteria hierarchically rank the most informative models, enabling the automatic and principled selection of the model with the strongest support in relation to the time series data. Specifically, we show that AIC scores place each candidate model in the {\em strong support}, {\em weak support} or {\em no support} category. The method correctly identifies several canonical dynamical systems, including an SEIR (susceptible-exposed-infectious-recovered) disease model and the Lorenz equations, giving the correct dynamical system as the only candidate model with strong support.
  • This work develops a parallelized algorithm to compute the dynamic mode decomposition (DMD) on a graphics processing unit using the streaming method of snapshots singular value decomposition. This allows the algorithm to operate efficiently on streaming data by avoiding redundant inner-products as new data becomes available. In addition, it is possible to leverage the native compressed format of many data streams, such as HD video and computational physics codes that are represented sparsely in the Fourier domain, to massively reduce data transfer from CPU to GPU and to enable sparse matrix multiplications. Taken together, these algorithms facilitate real-time streaming DMD on high-dimensional data streams. We demonstrate the proposed method on numerous high-dimensional data sets ranging from video background modeling to scientific computing applications, where DMD is becoming a mainstay algorithm. The computational framework is developed as an open-source library written in C++ with CUDA, and the algorithms may be generalized to include other DMD advances, such as compressed sensing DMD, multi resolution DMD, or DMD with control. Keywords: Singular value decomposition, dynamic mode decomposition, streaming computations, graphics processing unit, video background modeling, scientific computing.
  • Although major advances have been achieved over the past decades for the reduction and identification of linear systems, deriving nonlinear low-order models is still a challenging task. In this work, we develop a new data-driven framework to identify nonlinear reduced-order models of a fluid by combining dimensionality reduction techniques (e.g. proper orthogonal decomposition) and sparse regression techniques from machine learning. In particular, we extend the sparse identification of nonlinear dynamics (SINDy) algorithm to enforce physical constraints in the regression, namely energy-preserving quadratic nonlinearities. The resulting models, hereafter referred to as Galerkin regression models, incorporate many beneficial aspects of Galerkin projection, but without the need for a full-order or high-fidelity solver to project the Navier-Stokes equations. Instead, the most parsimonious nonlinear model is determined that is consistent with observed measurement data and satisfies necessary constraints. Galerkin regression models also readily generalize to include higher-order nonlinear terms that model the effect of truncated modes. The effectiveness of Galerkin regression is demonstrated on two different flow configurations: the two-dimensional flow past a circular cylinder and the shear-driven cavity flow. For both cases, the accuracy of the identified models compares favorably against reduced-order models obtained from a standard Galerkin projection procedure. Present results highlight the importance of cubic nonlinearities in the construction of accurate nonlinear low-dimensional approximations of the flow systems, something which cannot be readily obtained using a standard Galerkin projection of the Navier-Stokes equations. Finally, the entire code base for our constrained sparse Galerkin regression algorithm is freely available online.
  • We propose a sparse regression method capable of discovering the governing partial differential equation(s) of a given system from time series measurements in the spatial domain. The regression framework relies on sparsity-promoting techniques to select the nonlinear and partial derivative terms of the governing equations that most accurately represent the data, bypassing a combinatorially large search through all possible candidate models. The method balances model complexity and regression accuracy by selecting a parsimonious model via Pareto analysis. Time series measurements can be made in an Eulerian framework where the sensors are fixed spatially, or in a Lagrangian framework where the sensors move with the dynamics. The method is computationally efficient, robust, and demonstrated to work on a variety of canonical problems of mathematical physics including Navier-Stokes, the quantum harmonic oscillator, and the diffusion equation. Moreover, the method is capable of disambiguating between potentially non-unique dynamical terms by using multiple time series taken with different initial data. Thus for a traveling wave, the method can distinguish between a linear wave equation and the Korteweg-de Vries equation, for instance. The method provides a promising new technique for discovering governing equations and physical laws in parametrized spatio-temporal systems where first-principles derivations are intractable.
  • We introduce the method of compressed dynamic mode decomposition (cDMD) for background modeling. The dynamic mode decomposition (DMD) is a regression technique that integrates two of the leading data analysis methods in use today: Fourier transforms and singular value decomposition. Borrowing ideas from compressed sensing and matrix sketching, cDMD eases the computational workload of high resolution video processing. The key principle of cDMD is to obtain the decomposition on a (small) compressed matrix representation of the video feed. Hence, the cDMD algorithm scales with the intrinsic rank of the matrix, rather than the size of the actual video (data) matrix. Selection of the optimal modes characterizing the background is formulated as a sparsity-constrained sparse coding problem. Our results show that the quality of the resulting background model is competitive, as quantified by the F-measure, Recall and Precision. A GPU (graphics processing unit) accelerated implementation is also presented which further boosts the computational efficiency of the algorithm.
  • Understanding the interplay of order and disorder in chaotic systems is a central challenge in modern quantitative science. We present a universal, data-driven decomposition of chaos as an intermittently forced linear system. This work combines Takens' delay embedding with modern Koopman operator theory and sparse regression to obtain linear representations of strongly nonlinear dynamics. The result is a decomposition of chaotic dynamics into a linear model in the leading delay coordinates with forcing by low energy delay coordinates; we call this the Hankel alternative view of Koopman (HAVOK) analysis. This analysis is applied to the canonical Lorenz system, as well as to real-world examples such as the Earth's magnetic field reversal, and data from electrocardiogram, electroencephalogram, and measles outbreaks. In each case, the forcing statistics are non-Gaussian, with long tails corresponding to rare events that trigger intermittent switching and bursting phenomena; this forcing is highly predictive, providing a clear signature that precedes these events. Moreover, the activity of the forcing signal demarcates large coherent regions of phase space where the dynamics are approximately linear from those that are strongly nonlinear.
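
The decomposition described in the last abstract (a linear model in the leading delay coordinates, forced by a low-energy delay coordinate) can be illustrated with a short sketch. The Python below is only a minimal illustration under stated assumptions, not the authors' implementation: the function names `hankel_matrix` and `havok_sketch`, the parameters `n_delays` and `rank`, the use of ordinary least squares in place of sparse regression, and the finite-difference derivative are all choices made here for brevity. The toy input signal is likewise hypothetical and is not chaotic.

```python
import numpy as np


def hankel_matrix(x, n_delays):
    """Stack n_delays time-shifted copies of the scalar series x as rows."""
    n_cols = len(x) - n_delays + 1
    return np.array([x[i:i + n_cols] for i in range(n_delays)])


def havok_sketch(x, dt, n_delays=20, rank=3):
    """Fit dv/dt = A v + B v_r in delay coordinates (illustrative only).

    v holds the first rank-1 delay coordinates and v_r, the last retained
    coordinate, is treated as the forcing signal.
    """
    H = hankel_matrix(x, n_delays)
    # Rows of Vt are the delay coordinates ordered by singular value.
    U, s, Vt = np.linalg.svd(H, full_matrices=False)
    V = Vt[:rank].T                      # shape (n_cols, rank)
    # Assumption: finite differences stand in for a more careful derivative.
    dV = np.gradient(V, dt, axis=0)
    # Assumption: plain least squares; solves V @ Xi = dV[:, :rank-1].
    Xi, *_ = np.linalg.lstsq(V, dV[:, :rank - 1], rcond=None)
    A = Xi[:rank - 1].T                  # linear dynamics, (rank-1, rank-1)
    B = Xi[rank - 1:].T                  # forcing coupling, (rank-1, 1)
    forcing = V[:, rank - 1]
    return A, B, V, forcing


# Toy usage on a hypothetical two-frequency signal.
dt = 0.01
t = np.arange(2000) * dt
x = np.sin(t) + 0.5 * np.sin(3.1 * t)
A, B, V, forcing = havok_sketch(x, dt, n_delays=20, rank=3)
print(A.shape, B.shape, forcing.shape)
```

For a chaotic series such as a Lorenz trajectory, the abstract reports that the statistics of the forcing coordinate are non-Gaussian with long tails; inspecting a histogram of `forcing` would be the natural next step, with `n_delays` and `rank` tuned to the data.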