• Consider the problem of modeling memory effects in discrete-state random walks using higher-order Markov chains. This paper explores cross validation and information criteria as proxies for a model's predictive accuracy. The objective is to select, from data, the number of prior states of recent history upon which a trajectory is statistically dependent. Through simulations, I evaluate these criteria in the case where data are drawn from systems with fixed orders of history, noting trends in the relative performance of the criteria. As a real-world illustrative example of these methods, this manuscript evaluates the problem of detecting statistical dependencies in shot outcomes in free throw shooting. Over the three NBA seasons analyzed, several players exhibited statistical dependencies in free throw hitting probability of various types: hot handedness, cold handedness, and error correction. For the 2013-2014 through 2015-2016 NBA seasons, I detected statistical dependencies in 23% of all player-seasons. Focusing on a single player, in two of these three seasons, LeBron James shot a better percentage after an immediate miss than otherwise. In those seasons, conditioning on the previous outcome makes for a more predictive model than treating free throw makes as independent. When the analysis is extended to LeBron James's data from the 2016-2017 NBA season, a model depending on the previous shot (single-step Markovian) does not clearly beat a model with independent outcomes. An error-correcting variable-length model of two parameters, where James shoots a higher percentage after a missed free throw than otherwise, is more predictive than either model.
  • The Kaplan-Meier product-limit estimator is a simple and powerful tool in time-to-event analysis. An extension exists for populations stratified into cohorts, where a population survival curve is generated by weighted averaging of cohort-level survival curves. To make population-level comparisons, we analyze the statistics of the area between two such weighted survival curves. We derive the large-sample behavior of this statistic based on an empirical process of product-limit estimators. This estimator was used by an interdisciplinary NIH-SSA team in the identification of medical conditions to prioritize for adjudication in disability benefits processing.
  • Consider the problem of modeling hysteresis for finite-state random walks using higher-order Markov chains. This Letter introduces a Bayesian framework to determine, from data, the number of prior states of recent history upon which a trajectory is statistically dependent. The general recommendation is to use leave-one-out cross validation, using an easily computable formula that is provided in closed form. Importantly, Bayes factors using flat model priors are biased in favor of too complex a model (more hysteresis) when a large amount of data is present, and the Akaike information criterion (AIC) is biased in favor of too sparse a model (less hysteresis) when few data are present.
  • We develop a method to reconstruct, from measured displacements of an underlying elastic substrate, the spatially dependent forces that cells or tissues impart on it. Given newly available high-resolution images of substrate displacements, it is desirable to be able to reconstruct small-scale, compactly supported focal adhesions, which are often localized and exist only within the footprint of a cell. In addition to the standard quadratic data mismatch terms that define least-squares fitting, we motivate a regularization term in the objective function that penalizes vectorial invariants of the reconstructed surface stress while preserving boundaries. We solve this inverse problem by providing a numerical method for setting up a discretized inverse problem that is solvable by standard convex optimization techniques. By minimizing the objective function subject to a number of important physically motivated constraints, we are able to efficiently reconstruct stress fields with localized structure from simulated and experimental substrate displacements. Our method incorporates the exact solution for the stress tensor accurate to first-order finite differences and motivates the use of distance-based cut-offs for data inclusion and problem sparsification.
  • In vertebrates, insufficient availability of calcium and phosphate ions in extracellular fluids leads to loss of bone density and neuronal hyper-excitability. To counteract this problem, calcium ions are present at high concentrations throughout body fluids -- at concentrations exceeding the saturation point. This condition leads to the opposite situation where unwanted mineral sedimentation may occur. Remarkably, ectopic or out-of-place sedimentation into soft tissues is rare, in spite of the thermodynamic driving factors. This fortunate fact is due to the presence of auto-regulatory proteins that are found in abundance in bodily fluids. Yet, many important inflammatory disorders such as atherosclerosis and osteoarthritis are associated with this undesired calcification. Hence, it is important to gain an understanding of the regulatory process and the conditions under which it can go awry. In this manuscript, we adapt mean-field classical nucleation theory to the case of surface-shielding in order to study the regulation of sedimentation of calcium phosphate salts in biological tissues through the mechanism of post-nuclear shielding of nascent mineral particles by binding proteins. We develop a mathematical description of this phenomenon using a countable system of hyperbolic partial differential equations. A critical concentration of regulatory protein is identified as a function of the physical parameters that describe the system.
  • Quantifying the forces between and within macromolecules is a necessary first step in understanding the mechanics of molecular structure, protein folding, and enzyme function and performance. In such macromolecular settings, dynamic single-molecule force spectroscopy (DFS) has been used to distort bonds. The resulting responses, in the form of rupture forces, work applied, and trajectories of displacements, have been used to reconstruct bond potentials. Such approaches often rely on simple parameterizations of one-dimensional bond potentials, assumptions on equilibrium starting states, and/or large amounts of trajectory data. Parametric approaches typically fail at inferring complex-shaped bond potentials with multiple minima, while piecewise estimation may not guarantee smooth results with the appropriate behavior at large distances. Existing techniques, particularly those based on work theorems, also do not address spatial variations in the diffusivity that may arise from spatially inhomogeneous coupling to other degrees of freedom in the macromolecule, thereby presenting an incomplete picture of the overall bond dynamics. To address these challenges, we have developed a comprehensive empirical Bayesian approach that incorporates data and regularization terms directly into a path integral. All experimental and statistical parameters in our method are estimated empirically, directly from the data. When tested on simulated data, our regularized approach requires fewer data and allows simultaneous inference of both complex bond potentials and diffusivity profiles.
  • Cortical spreading depression (CSD) is a slow-moving ionic and metabolic disturbance that propagates in cortical brain tissue. In addition to massive cellular depolarization, CSD also involves significant changes in perfusion and metabolism -- aspects of CSD that had not been modeled and are important to traumatic brain injury, subarachnoid hemorrhage, stroke, and migraine. In this study, we develop a mathematical model for CSD where we focus on modeling the features essential to understanding the implications of neurovascular coupling during CSD. In our model, the sodium-potassium ATPase, mainly responsible for ionic homeostasis and active during CSD, operates at a rate that is dependent on the supply of oxygen. The supply of oxygen is determined by modeling blood flow through a lumped vascular tree with an effective local vessel radius that is controlled by the extracellular potassium concentration. We show that during CSD, the metabolic demands of the cortex exceed the physiological limits placed on oxygen delivery, regardless of vascular constriction or dilation. However, vasoconstriction and vasodilation play important roles in the propagation of CSD and its recovery. Our model replicates the qualitative and quantitative behavior of CSD -- vasoconstriction, oxygen depletion, extracellular potassium elevation, prolonged depolarization -- found in experimental studies. We predict faster, longer-duration CSD in vivo than in vitro due to the contribution of the vasculature. Our results also help explain some of the variability of CSD between species and even within the same animal. These results have clinical and translational implications, as they allow for more precise in vitro, in vivo, and in silico exploration of a phenomenon broadly relevant to neurological disease.
  • Shape-based regularization has proven to be a useful method for delineating objects within noisy images where one has prior knowledge of the shape of the targeted object. When a collection of possible shapes is available, the specification of a shape prior using kernel density estimation is a natural technique. Unfortunately, energy functionals arising from kernel density estimation are of a form that makes them impossible to directly minimize using efficient optimization algorithms such as graph cuts. Our main contribution is to show how one may recast the energy functional into a form that is minimizable iteratively and efficiently using graph cuts.
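
The order-selection problem in the two Markov-chain abstracts above (choosing how many prior states a trajectory depends on) can be illustrated with a minimal sketch. This is an assumption-laden illustration, not the authors' method: it scores each candidate order with the standard AIC using maximum-likelihood transition estimates, rather than the closed-form leave-one-out formula the Letter refers to, which is not reproduced in the abstracts. The function names `markov_log_likelihood`, `aic`, and `select_order` are hypothetical.

```python
import math
from collections import Counter


def markov_log_likelihood(seq, order, start):
    """Maximized log-likelihood of seq[start:] under a chain of the given order.

    Every candidate order is scored on the same observations (from index
    `start` onward) so that the likelihoods are comparable across orders.
    """
    context_counts = Counter()
    transition_counts = Counter()
    for t in range(start, len(seq)):
        ctx = tuple(seq[t - order:t])  # empty tuple when order == 0
        context_counts[ctx] += 1
        transition_counts[(ctx, seq[t])] += 1
    # Plug in the maximum-likelihood estimate n(ctx, s) / n(ctx).
    return sum(
        n * math.log(n / context_counts[ctx])
        for (ctx, _state), n in transition_counts.items()
    )


def aic(seq, order, n_states, max_order):
    """AIC = 2k - 2 log L, with k free transition probabilities."""
    k = (n_states ** order) * (n_states - 1)
    log_lik = markov_log_likelihood(seq, order, start=max_order)
    return 2 * k - 2 * log_lik


def select_order(seq, n_states, max_order):
    """Return the order with the lowest AIC and the per-order scores."""
    scores = {h: aic(seq, h, n_states, max_order) for h in range(max_order + 1)}
    return min(scores, key=scores.get), scores


if __name__ == "__main__":
    # Toy data: a strictly alternating make/miss sequence depends on
    # exactly one previous outcome.
    shots = [0, 1] * 100
    best, scores = select_order(shots, n_states=2, max_order=3)
    print(best, scores)
```

On the toy alternating sequence, the first-order model fits perfectly with only two free parameters, so it scores lowest. As the Letter's abstract notes, AIC can favor too sparse a model when few data are present, so this sketch is best read as a baseline for comparison rather than a recommended procedure.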