• We present a systematic coarse-graining (CG) strategy for many-particle molecular systems based on cluster expansion techniques. We construct a hierarchy of coarse-grained Hamiltonians with interaction potentials consisting of two-, three- and higher-body interactions. The accuracy of the derived cluster expansion based on interatomic potentials is examined over a range of temperatures and densities and compared to direct computation of the pair potential of mean force. The coarse-grained simulations are compared against detailed all-atom data on the basis of structural properties. We give specific examples for methane and ethane molecules, in which the coarse-grained variable is the center of mass of the molecule. We investigate different temperature and density regimes, and we examine differences between the methane and ethane systems. Results show that the cluster expansion formalism can provide accurate effective pair and three-body CG potentials in the high-temperature ($T$) and low-density ($\rho$) regimes. In the liquid regime, the three-body effective CG potentials give a small improvement over the typical pair CG ones; however, significantly better results require even higher-order terms.
  • In this paper, we discuss information-theoretic tools for obtaining optimized coarse-grained molecular models for both equilibrium and non-equilibrium molecular dynamics. The latter are ubiquitous in physicochemical and biological applications, where they are typically associated with coupling mechanisms, multi-physics and/or boundary conditions. In general, the non-equilibrium steady states are not known explicitly, as they do not necessarily have a Gibbs structure. The presented approach compares the microscopic behavior of molecular systems to parametric and non-parametric coarse-grained models by using the relative entropy between distributions on the path space and setting up a corresponding path-space variational inference problem. The methods can become entirely data-driven when the microscopic dynamics are replaced with corresponding correlated data in the form of time series. Furthermore, we present connections and generalizations of force matching methods in coarse-graining with path-space information methods, and we demonstrate the enhanced transferability of information-based parameterizations to general observables due to information inequalities. We further discuss methodological connections between information-based coarse-graining of molecular systems and variational inference methods primarily developed in the machine learning community. However, we note that the work presented here addresses variational inference for correlated time series due to the focus on dynamics. The applicability of the proposed methods is demonstrated on high-dimensional stochastic processes given by Langevin, overdamped and driven Langevin dynamics of interacting particles.
  • Using the probabilistic language of conditional expectations, we reformulate the force matching method for coarse-graining of molecular systems as a projection on spaces of coarse observables. A practical outcome of this probabilistic description is the link between the force matching method and thermodynamic integration. This connection provides a way to systematically construct a local mean force in order to optimally approximate the potential of mean force through force matching. We introduce a generalized force matching condition for the local mean force, in the sense that it allows the approximation of the potential of mean force under both linear and non-linear coarse-graining mappings (e.g., reaction coordinates, end-to-end length of chains). Furthermore, we study the equivalence of force matching with relative entropy minimization, which we derive for general non-linear coarse-graining maps. We present the generalized force matching condition in detail through applications to specific examples in molecular systems.
  • In this paper, we extend the parametric sensitivity analysis (SA) methodology proposed in Ref. [Y. Pantazis and M. A. Katsoulakis, J. Chem. Phys. 138, 054115 (2013)] to continuous-time and continuous-space Markov processes represented by stochastic differential equations and, in particular, to stochastic molecular dynamics as described by the Langevin equation. The SA method is based on computing the information-theoretic (and thermodynamic) quantity of relative entropy rate (RER) and the associated Fisher information matrix (FIM) between path distributions. A major advantage of the pathwise SA method is that both the RER and the pathwise FIM depend only on averages of the force field; therefore, they are tractable and computable as ergodic averages from a single run of the molecular dynamics simulation, both in equilibrium and in non-equilibrium steady-state regimes. We validate the performance of the extended SA method on two different molecular stochastic systems, a standard Lennard-Jones fluid and an all-atom methane liquid, and compare the obtained parameter sensitivities with parameter sensitivities of three popular and well-studied observable functions, namely the radial distribution function, the mean squared displacement and the pressure. Results show that the RER-based sensitivities are highly correlated with the observable-based sensitivities.
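
The last abstract notes that the pathwise FIM can be computed as an ergodic average from a single simulation run. A minimal sketch of that idea follows, under assumptions that are not taken from the paper: a one-dimensional overdamped Langevin equation $dX = -kX\,dt + \sigma\,dW$ in a harmonic potential, with constant noise amplitude $\sigma^2 = 2/\beta$, for which the Girsanov-type expression for the pathwise FIM with respect to $k$ reduces to $\mathbb{E}[(\partial_k b)^2]/\sigma^2 = \mathbb{E}[X^2]/\sigma^2 = 1/(2k)$. The function name `pathwise_fim_harmonic` and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def pathwise_fim_harmonic(k=1.0, beta=1.0, dt=0.01,
                          n_steps=500_000, burn_in=10_000, seed=0):
    """Estimate the pathwise FIM with respect to the spring constant k
    from one overdamped Langevin trajectory in V(x) = k x^2 / 2.

    Illustrative toy model (an assumption, not the systems of the paper).
    """
    rng = np.random.default_rng(seed)
    sigma2 = 2.0 / beta  # squared noise amplitude, sigma^2 = 2 / beta
    # Pre-generate the Brownian increments sigma * sqrt(dt) * N(0, 1).
    noise = (rng.standard_normal(n_steps) * np.sqrt(sigma2 * dt)).tolist()

    x = 0.0
    acc = 0.0
    for i in range(n_steps):
        # Euler-Maruyama step for dX = -k X dt + sigma dW.
        x += -k * x * dt + noise[i]
        if i >= burn_in:
            # The drift is b = -k x, so d(b)/dk = -x and the
            # FIM integrand is (d(b)/dk)^2 / sigma^2 = x^2 / sigma^2.
            acc += x * x / sigma2
    # Ergodic (time) average over the post-burn-in part of the run.
    return acc / (n_steps - burn_in)

fim = pathwise_fim_harmonic()
print(f"estimated pathwise FIM: {fim:.3f}")
print(f"continuous-time value 1/(2k): {0.5:.3f}")
```

The estimate carries both statistical error from the finite trajectory and a small time-discretization bias, so it should only be expected to agree with $1/(2k)$ approximately. For a multi-parameter force field, the same running average would be taken over the outer product of the parameter gradient of the drift.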