-
Most of our current understanding of mechanisms of photosynthesis comes from
spectroscopy. However, the classical definition of a radio antenna can be extended to
the optical regime to discuss the function of light-harvesting antennae. Further to
our previously proposed model of a loop antenna, we provide several more
physical explanations by considering the non-reciprocal properties of the light
harvesters of bacteria. We explain the function of the non-heme iron at the
reaction center, and present reasons for each module of the light harvester
being composed of one carotenoid, two short α-helical polypeptides and
three bacteriochlorophylls; we also explain the toroidal shape of the light
harvester, the upper bound of the characteristic length of the light harvester,
the functional role played by the long-lasting spectrometric signal observed,
and the photon anti-bunching observed. Based on these analyses, two mechanisms
that might be used by the radiation-durable bacterium {\it Deinococcus radiodurans}, and
the non-reciprocity of an archaeon, {\it Haloquadratum walsbyi}, are analyzed.
The physical lessons involved are useful for designing artificial light
harvesters, optical sensors, wireless power chargers, passive super-Planckian
heat radiators, photocatalytic hydrogen generators, and radiation protective
cloaks. In particular, they can predict what kind of particles should be used to
separate sunlight into a photovoltaically and thermally useful range to enhance
the efficiency of solar cells.
-
Ligand diffusion through proteins is a fundamental process governing
biological signaling and enzymatic catalysis. The complex topology of protein
tunnels results in difficulties with computing ligand escape pathways by
standard molecular dynamics (MD) simulations. Here, two novel methods for
finding ligand exit pathways and exploring cavities are proposed: memory
random acceleration MD (mRAMD), and memetic algorithms (MA). In mRAMD, finding
exit pathways is based on a non-Markovian biasing that is introduced to
optimize the unbinding force. In MA, hybrid learning protocols are exploited to
predict optimal ligand exit paths. The methods are tested on three proteins
with increasing complexity of tunnels: M2 muscarinic receptor, nitrile
hydratase, and cytochrome P450cam. In these cases, the proposed methods
outperform standard techniques that are used currently to find ligand egress
pathways. The proposed approach is general and appropriate for accelerated
transport of an object through a network of protein tunnels.
-
Predicting three-dimensional residue-residue contacts from evolutionary
information in protein sequences was already attempted in the early 1990s.
However, contact prediction accuracies of methods evaluated in CASP experiments
before CASP11 remained quite low, typically with <20% true positives.
Recently, contact prediction has been significantly improved to the level that
an accurate three-dimensional model of a large protein can be generated on the
basis of predicted contacts. This improvement was attained by disentangling
direct from indirect correlations in amino acid covariations or cosubstitutions
between sites in protein evolution. Here, we review statistical methods for
extracting causative correlations and various approaches to describing protein
structures, complexes, and flexibility based on predicted contacts.
-
The role of proton tunneling in biological catalysis is investigated here
within the frameworks of quantum information theory and thermodynamics. We
consider the quantum correlations generated through two hydrogen bonds between
a substrate and a prototypical enzyme that first catalyzes the tautomerization
of the substrate to move on to a subsequent catalysis, and discuss how the
enzyme can derive its catalytic potency from these correlations. In particular,
we show that classical changes induced in the binding site of the enzyme
spread the quantum correlations among all four hydrogen-bonded atoms
thanks to the directionality of hydrogen bonds. If the enzyme rapidly returns
to its initial state after the binding stage, the substrate ends in a new
transition state corresponding to a quantum superposition. Open quantum system
dynamics can then naturally drive the reaction in the forward direction from
the major tautomeric form to the minor tautomeric form without needing any
additional catalytic activity. We find that in this scenario the enzyme lowers
the activation energy so much that there is no energy barrier left in the
tautomerization, even if the quantum correlations quickly decay.
-
The nonequilibrium energetics of the single-molecule translational motor kinesin were
investigated by measuring heat dissipation from the violation of the
fluctuation-response relation of a probe attached to the motor using optical
tweezers. The sum of the dissipation and work did not amount to the input free
energy change, indicating that large hidden dissipation exists. Possible sources of
the hidden dissipation were explored by analyzing the Langevin dynamics of the
probe, which incorporates the two-state Markov stepper as a kinesin model. We
conclude that internal dissipation is dominant.
-
One of the key limitations of Molecular Dynamics simulations is the
computational intractability of sampling protein conformational landscapes
associated with either large system size or long timescales. To overcome this
bottleneck, we present the REinforcement learning based Adaptive samPling
(REAP) algorithm that aims to efficiently sample conformational space by
learning the relative importance of each reaction coordinate as it samples the
landscape. To achieve this, the algorithm uses concepts from the field of
reinforcement learning, a subset of machine learning, which rewards sampling
along important degrees of freedom and disregards others that do not facilitate
exploration or exploitation. We demonstrate the effectiveness of REAP by
comparing the sampling to long continuous MD simulations and least-counts
adaptive sampling on two model landscapes (L-shaped and circular), and
realistic systems such as alanine dipeptide and Src kinase. In all four
systems, the REAP algorithm consistently demonstrates its ability to explore
conformational space faster than the other two methods when comparing the
expected values of the landscape discovered for a given amount of time. The key
advantage of REAP is on-the-fly estimation of the importance of collective
variables, which makes it particularly useful for systems with limited
structural information.
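
As an illustration of the least-counts baseline mentioned above, the following minimal sketch picks restart states by visit count. It assumes the trajectory frames have already been discretised into integer state labels (for example by clustering); the function name and the toy labels are hypothetical, and the REAP reward weighting itself is not reproduced here.

```python
from collections import Counter

def least_counts_seeds(state_labels, n_seeds):
    """Return the n_seeds least-visited states as restart points."""
    counts = Counter(state_labels)
    # Sort by visit count first, then by label so that ties are deterministic
    ranked = sorted(counts.items(), key=lambda kv: (kv[1], kv[0]))
    return [state for state, _ in ranked[:n_seeds]]

# Hypothetical discretised trajectory: state 2 was visited once, state 1 twice
labels = [0, 0, 0, 1, 1, 2, 3, 3, 3, 3]
seeds = least_counts_seeds(labels, n_seeds=2)
```

New simulations would then be launched from frames belonging to the returned states, which biases sampling towards poorly explored regions.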
-
Determining the principal energy pathways for allosteric communication in
biomolecules, which occur as a result of thermal motion, remains challenging due
to the intrinsic complexity of the systems involved. Graph theory provides an
approach for making sense of such complexity, where allosteric proteins can be
represented as networks of amino acids. In this work, we establish the
eigenvector centrality metric in terms of the mutual information, as a means of
elucidating the allosteric mechanism that regulates the enzymatic activity of
proteins. Moreover, we propose a strategy to characterize the range of the
physical interactions that underlie the allosteric process. In particular, the
well-known enzyme imidazole glycerol phosphate synthase (IGPS) is utilized to
test the proposed methodology. The eigenvector centrality measurement
successfully describes the allosteric pathways of IGPS, and allows us to pinpoint
key amino acids in terms of their relevance in the momentum transfer process.
The resulting insight can be utilized for refining the control of IGPS
activity, widening the scope for its engineering. Furthermore, we propose a new
centrality metric quantifying the relevance of the surroundings of each
residue. In addition, the proposed technique is validated against experimental
solution NMR measurements, yielding fully consistent results. Overall, the
methodologies proposed in the present work constitute a powerful and
cost-effective strategy to gain insight into the allosteric mechanism of proteins.
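
To make the centrality idea concrete, the sketch below computes eigenvector centrality by power iteration on a small symmetric matrix. It is a generic illustration, not the authors' implementation: the 4x4 matrix is a hypothetical stand-in for pairwise mutual information between residues, and the function name and parameters are assumptions.

```python
import numpy as np

def eigenvector_centrality(adjacency, n_iter=1000, tol=1e-10):
    """Leading eigenvector of a symmetric, non-negative matrix (power iteration)."""
    a = np.asarray(adjacency, dtype=float)
    if a.ndim != 2 or a.shape[0] != a.shape[1]:
        raise ValueError("adjacency must be a square matrix")
    # Adding the identity keeps the eigenvectors but avoids oscillation
    # on bipartite-like networks.
    shifted = a + np.eye(a.shape[0])
    c = np.full(a.shape[0], 1.0 / np.sqrt(a.shape[0]))
    for _ in range(n_iter):
        nxt = shifted @ c
        nxt /= np.linalg.norm(nxt)
        converged = np.linalg.norm(nxt - c) < tol
        c = nxt
        if converged:
            break
    return c

# Hypothetical 4-residue network; entries stand in for mutual information
mi = np.array([
    [0.0, 0.8, 0.1, 0.0],
    [0.8, 0.0, 0.6, 0.1],
    [0.1, 0.6, 0.0, 0.7],
    [0.0, 0.1, 0.7, 0.0],
])
centrality = eigenvector_centrality(mi)
```

Residues with larger entries in `centrality` are those most strongly coupled to other well-coupled residues, which is the sense in which the metric ranks nodes.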
-
Protein misfolding is implicated in many diseases, including the
serpinopathies. For the canonical inhibitory serpin α1-antitrypsin
(A1AT), mutations can result in protein deficiencies leading to lung disease,
and misfolded mutants can accumulate in hepatocytes leading to liver disease.
Using all-atom simulations based on the recently developed Bias Functional
algorithm, we elucidate how wild-type A1AT folds and how the disease-associated
S (Glu264Val) and Z (Glu342Lys) mutations lead to misfolding. The deleterious Z
mutation disrupts folding at an early stage, while the relatively benign S
mutant shows late stage minor misfolding. A number of suppressor mutations
ameliorate the effects of the Z mutation and simulations on these mutants help
to elucidate the relative roles of steric clashes and electrostatic
interactions in Z misfolding. These results demonstrate a striking correlation
between atomistic events and disease severity and shed light on the mechanisms
driving chains away from their correct folding routes.
-
A theoretical analysis of the unfolding pathway of simple modular proteins in
length-controlled pulling experiments is put forward. Within this framework,
we predict the first module to unfold in a chain of identical units,
emphasizing the ranges of pulling speeds in which we expect our theory to hold.
These theoretical predictions are checked by means of steered molecular
dynamics of a simple construct, specifically a chain composed of two
coiled-coil motifs, where anisotropic features are revealed. These
simulations also allow us to give an estimate for the range of pulling
velocities in which our theoretical approach is valid.
-
The response of proteins to chemical reactions or impulsive excitation that
occurs within the molecule has fascinated chemists for decades. In recent years
ultrafast X-ray studies have provided ever more detailed information about the
evolution of protein structural change following ligand photolysis, and
time-resolved IR and Raman techniques, e.g., have provided detailed pictures of
the nature and rate of energy transport in peptides and proteins, including
recent advances in identifying transport through individual amino acids of
several heme proteins. Computational tools to locate energy transport pathways
in proteins have also been advancing. Energy transport pathways in proteins
have for some time been identified by molecular dynamics (MD) simulations,
and more recent efforts have focused on the development of coarse graining
approaches, some of which have exploited analogies to thermal transport in
other molecular materials. With the identification of pathways in proteins and
protein complexes, network analysis has been applied to locate residues that
control protein dynamics and possibly allostery, where chemical reactions at
one binding site mediate reactions at distant sites of the protein. In this
chapter we review approaches for computationally locating energy transport
networks in proteins. We present background on energy and thermal transport
in the condensed phase and macromolecules that underlies the approaches we discuss
before turning to a description of the approaches themselves. We also
illustrate the application of the computational methods for locating energy
transport networks and simulating energy dynamics in proteins with several
examples.
-
Organization and maintenance of the chromosomal DNA in living cells strongly
depends on the DNA interactions with a plethora of DNA-binding proteins.
Single-molecule studies show that formation of nucleoprotein complexes on DNA
by such proteins is frequently subject to force and torque constraints applied
to the DNA. Although the existing experimental techniques allow one to exert these
types of mechanical constraints on individual DNA biopolymers, their exact
effects in regulation of DNA-protein interactions are still not completely
understood due to the lack of systematic theoretical methods able to
efficiently interpret complex experimental observations. To fill this gap, we
have developed a general theoretical framework based on the transfer-matrix
calculations that can be used to accurately describe the behaviour of DNA-protein
interactions under force and torque constraints. Potential applications of the
constructed theoretical approach are demonstrated by predicting how these
constraints affect the DNA-binding properties of different types of
architectural proteins. The obtained results provide important insights into
potential physiological functions of mechanical forces in the chromosomal DNA
organization by architectural proteins as well as into single-DNA manipulation
studies of DNA-protein interactions.
-
During the last decade coarse-grained nucleotide models have emerged that
allow us to study DNA and RNA on unprecedented time and length scales. Among them is
oxDNA, a coarse-grained, sequence-specific model that captures the
hybridisation transition of DNA and many structural properties of single- and
double-stranded DNA. oxDNA was previously only available as standalone
software, but has now been implemented into the popular LAMMPS molecular
dynamics code. This article describes the new implementation and analyses its
parallel performance. Practical applications are presented that focus on
single-stranded DNA, an area of research which has been so far
under-investigated. The LAMMPS implementation of oxDNA lowers the entry barrier
for using the oxDNA model significantly, facilitates future code development
and interfacing with existing LAMMPS functionality as well as other
coarse-grained and atomistic DNA models.
-
The assembly and maturation of viruses with icosahedral capsids must be
coordinated with icosahedral symmetry. The icosahedral symmetry also imposes
restrictions on the cooperative specific interactions between genomic
RNA/DNA and coat proteins that should be reflected in quasi-regular
segmentation of viral genomic sequences. Combining discrete direct and double
Fourier transforms, we studied the quasi-regular large-scale segmentation in
genomic sequences of different ssRNA, ssDNA, and dsDNA viruses. The particular
representatives included satellite tobacco mosaic virus and the strains of
satellite tobacco necrosis virus, STNV-C, STNV-1, STNV-2, Escherichia phages
MS2, phiX174, alpha3, and HK97, and Simian virus 40. In all their genomes, we
found significant quasi-regular segmentation of genomic sequences related
to the virion assembly and the genome packaging within the icosahedral capsid. We
also found good correspondence between our results and available cryo-electron
microscopy data on capsid structures and genome packaging in these viruses.
Fourier analysis of genomic sequences provides additional insight into
mechanisms of hierarchical genome packaging and may be used for verification of
the concepts of 3-fold or 5-fold intermediates in virion assembly. The results
of sequence analysis should be taken into account when choosing models and
interpreting data. They may also be helpful for the development of antiviral
drugs.
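
The direct Fourier step described above can be sketched as follows. This is a minimal illustration under stated assumptions: each nucleotide is encoded as a binary indicator sequence, the power spectra of the four indicators are summed, and the toy sequence with a built-in 10-nt repeat is hypothetical. The double Fourier transform used in the study is not reproduced, and the function names are illustrative.

```python
import numpy as np

def indicator_spectrum(sequence, alphabet="ACGT"):
    """Summed DFT power spectrum of the per-nucleotide indicator sequences."""
    seq = sequence.upper()
    n = len(seq)
    power = np.zeros(n // 2 + 1)
    for base in alphabet:
        indicator = np.array([1.0 if s == base else 0.0 for s in seq])
        # Remove the mean so the zero-frequency term does not dominate
        indicator -= indicator.mean()
        power += np.abs(np.fft.rfft(indicator)) ** 2
    return power / n

def dominant_period(sequence):
    """Period (in nucleotides) of the strongest non-zero harmonic."""
    power = indicator_spectrum(sequence)
    k = int(np.argmax(power[1:])) + 1  # skip the zero-frequency bin
    return len(sequence) / k

# Hypothetical toy sequence: a 10-nt unit repeated 12 times
toy = "AAAAACCCCC" * 12
period = dominant_period(toy)
```

For a real genome the peaks are much weaker than in this toy case, so their significance would have to be judged against a suitable random-sequence background.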
-
This paper presents a novel Differential Evolution algorithm for protein
folding optimization that is applied to a three-dimensional AB off-lattice
model. The proposed algorithm includes two new mechanisms. A local search is
used to improve convergence speed and to reduce the runtime complexity of the
energy calculation. For this purpose, a local movement is introduced within the
local search. The designed evolutionary algorithm has fast convergence speed
and, therefore, when it is trapped in a local optimum or a relatively good
solution is located, it is hard to locate a better similar solution. The
similar solution is different from the good solution in only a few components.
A component reinitialization method is designed to mitigate this problem. Both
the new mechanisms and the proposed algorithm were analyzed on well-known amino
acid sequences that are used frequently in the literature. Experimental results
show that the employed new mechanisms improve the efficiency of our algorithm
and that the proposed algorithm is superior to other state-of-the-art
algorithms. It obtained a hit ratio of 100% for sequences up to 18 monomers,
within a budget of $10^{11}$ solution evaluations. New best-known solutions
were obtained for most of the sequences. The existence of the symmetric
best-known solutions is also demonstrated in the paper.
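
For readers unfamiliar with Differential Evolution, the sketch below shows the classic DE/rand/1/bin scheme on which such algorithms build. It is not the proposed algorithm: the local search and component reinitialization are omitted, the sphere function stands in for the AB off-lattice energy, and all names and parameter values are assumptions chosen for illustration.

```python
import numpy as np

def differential_evolution(objective, bounds, pop_size=20, f=0.5, cr=0.9,
                           generations=200, seed=0):
    """Minimise `objective` with the classic DE/rand/1/bin scheme."""
    rng = np.random.default_rng(seed)
    low, high = np.asarray(bounds, dtype=float).T
    dim = low.size
    pop = rng.uniform(low, high, size=(pop_size, dim))
    fitness = np.array([objective(x) for x in pop])
    for _ in range(generations):
        for i in range(pop_size):
            # Three distinct individuals, all different from the target i
            candidates = [j for j in range(pop_size) if j != i]
            r1, r2, r3 = rng.choice(candidates, size=3, replace=False)
            mutant = np.clip(pop[r1] + f * (pop[r2] - pop[r3]), low, high)
            # Binomial crossover; one forced index guarantees a mutant component
            cross = rng.random(dim) < cr
            cross[rng.integers(dim)] = True
            trial = np.where(cross, mutant, pop[i])
            trial_fitness = objective(trial)
            # Greedy selection: keep the trial only if it is no worse
            if trial_fitness <= fitness[i]:
                pop[i], fitness[i] = trial, trial_fitness
    best = int(np.argmin(fitness))
    return pop[best], fitness[best]

def sphere(x):
    """Toy objective standing in for the protein energy function."""
    return float(np.sum(x ** 2))

best_x, best_f = differential_evolution(sphere, [(-5.0, 5.0)] * 3)
```

In a folding application the vector `x` would hold the bond and torsional angles of the chain, and the objective would be the model energy.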
-
The meaningful comparison of ion mobility (IM) results and of collision cross
section (CCS) values on different platforms is a prerequisite for using CCS for
identification or structural assignment. The amount of internal energy imparted
to the ions prior to the ion mobility cell is a source of experimental
variation. Here we investigated the effects of virtually all tuning parameters
of the Agilent 6560 IM-Q-TOF on the arrival time distributions of Ubiquitin7+,
and found conditions in which the native state prevails. We will discuss the
effects of solvent evaporation conditions in the source, in the entire pre-IM
DC voltage gradient, and with the funnel RF amplitudes, and will also report on
ubiquitin7+ conformations in different solvents, including native supercharging
conditions. Collision-induced unfolding (CIU) can be conveniently provoked in
two distinct regions: behind the source capillary (by changing the fragmentor
voltage) and in the trapping funnel (by changing the trap entrance grid delta
voltage). The softness of the instrumental conditions was then optimized with
the benchmark DNA G-quadruplex [(dG4T4G4)2.(NH4+)3-8H]5-, for which ion
activation results in ammonia loss. To reduce the ion internal energy and
obtain the intact 3-NH4+ complex, we reduced the post-IM voltage gradient, but
this resulted in a lower IM resolving power due to increased diffusion behind
the drift tube. The article thus describes the various trade-offs between ion
activation, ion transmission, and ion mobility performance for native MS of
very fragile structures.
-
The predominant structural protein in vertebrates is collagen, which plays a
key role in extracellular matrix and connective tissue mechanics. Despite its
prevalence and physical importance in biology, the mechanical properties of
molecular collagen are far from established. The flexibility of its triple
helix is unresolved, with descriptions from different experimental techniques
ranging from flexible to semirigid. Furthermore, it is unknown how collagen
type (homo- vs. heterotrimeric) and source (tissue-derived vs. recombinant)
influence flexibility. Using SmarTrace, a chain tracing algorithm we devised,
we performed statistical analysis of collagen conformations collected with
atomic force microscopy (AFM) to determine the protein's mechanical properties.
Our results show that types I, II and III collagens - the key fibrillar
varieties - exhibit molecular flexibilities that are very similar. However,
collagen conformations are strongly modulated by salt, transitioning from
compact to extended as KCl concentration increases, at both neutral and acidic
pH. While analysis with a standard worm-like chain model suggests that the
persistence length of collagen can attain almost any value within the
literature range, closer inspection reveals that this modulation of collagen's
conformational behavior is not due to changes in flexibility, but rather arises
from the induction of curvature (either intrinsic or induced by interactions
with the mica surface). By modifying standard polymer theory to include innate
curvature, we show that collagen behaves as an equilibrated curved worm-like
chain (cWLC) in two dimensions. Analysis within the cWLC model shows that
collagen's curvature depends strongly on pH and salt, while its persistence
length does not. Thus, we find that triple-helical collagen is well described
as semiflexible, irrespective of source, type, pH and salt environment.
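
For reference, the standard (uncurved) worm-like chain predictions for a chain equilibrated in two dimensions can be evaluated as in the sketch below. These are the textbook 2D expressions, with contour separation `s` and persistence length `p` in the same length units; the function names are illustrative, and the curved-chain (cWLC) expressions derived in the work are not reproduced here.

```python
import math

def wlc_2d_mean_square_distance(s, p):
    """<R^2(s)> = 4 p s [1 - (2p/s)(1 - exp(-s/(2p)))] for a 2D WLC."""
    x = s / (2.0 * p)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for small x
    return 4.0 * p * s * (1.0 - (-math.expm1(-x)) / x)

def wlc_2d_tangent_correlation(s, p):
    """<cos theta(s)> = exp(-s/(2p)) for a 2D WLC."""
    return math.exp(-s / (2.0 * p))

# Limiting behaviour: rod-like for s << p, random-coil-like for s >> p
short_limit = wlc_2d_mean_square_distance(1.0, 1000.0)   # close to s**2
long_limit = wlc_2d_mean_square_distance(1e5, 10.0)      # close to 4*p*s
```

Fitting both quantities to traced chains is a common consistency check: a single persistence length should describe the two curves if the standard model applies.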
-
The function of proteins arises from cooperative interactions and
rearrangements of their amino acids, which exhibit large-scale dynamical modes.
Long-range correlations have also been revealed in protein sequences, and this
has motivated the search for physical links between the observed genetic and
dynamic cooperativity. We outline here a simplified theory of proteins, which
relates sequence correlations to physical interactions and to the emergence of
mechanical function. Our protein is modeled as a strongly-coupled amino acid
network whose interactions and motions are captured by the mechanical
propagator, the Green function. The propagator describes how the gene
determines the connectivity of the amino acids, and thereby the transmission of
forces. Mutations introduce localized perturbations to the propagator which
scatter the force field. The emergence of function is manifested by a
topological transition when a band of such perturbations divides the protein
into subdomains. We find that epistasis -- the interaction among mutations in
the gene -- is related to the nonlinearity of the Green function, which can be
interpreted as a sum over multiple scattering paths. We apply this mechanical
framework to simulations of protein evolution, and observe long-range epistasis
which facilitates collective functional modes.
-
Graphical models are powerful tools for modeling and making statistical
inferences regarding complex associations among variables in multivariate data.
In this paper we introduce the R package netgwas, which is designed based on
undirected graphical models to accomplish three important and interrelated
goals in genetics: constructing linkage maps, reconstructing linkage
disequilibrium (LD) networks from multi-loci genotype data, and detecting
high-dimensional genotype-phenotype networks. The netgwas package deals with
species with any chromosome copy number in a unified way, unlike other
software. It implements recent improvements in both linkage map construction
(Behrouzi and Wit, 2018), and reconstructing conditional independence networks
for non-Gaussian continuous data, discrete data, and mixed
discrete-and-continuous data (Behrouzi and Wit, 2017). Such datasets routinely
occur in genetics and genomics such as genotype data, and genotype-phenotype
data. We demonstrate the value of our package functionality by applying it to
various multivariate example datasets taken from the literature. We show, in
particular, that our package allows a more realistic analysis of data, as it
adjusts for the effect of all other variables while performing pairwise
associations. This feature controls for spurious associations between variables
that can arise from classical multiple testing approaches. This paper includes a
brief overview of the statistical methods which have been implemented in the
package. The main body of the paper explains how to use the package. The
package uses a parallelization strategy on multi-core processors to speed up
computations for large datasets. In addition, it contains several functions for
simulation and visualization. The netgwas package is freely available at
https://cran.r-project.org/web/packages/netgwas
-
Targeting the mitochondrial enzyme FoF1-ATP synthase and modulating its
catalytic activities with small molecules is a promising new approach for
treatment of autoimmune diseases. The immuno-modulatory compound Bz-423 is such
a drug that binds to subunit OSCP of the mitochondrial FoF1-ATP synthase and
induces apoptosis via increased reactive oxygen production in coupled, actively
respiring mitochondria. Here we review the experimental progress to reveal the
binding of Bz-423 to the mitochondrial target and discuss how subunit rotation
of FoF1-ATP synthase is affected by Bz-423. Briefly, we report how Förster
resonance energy transfer (FRET) can be employed to colocalize the enzyme and
the fluorescently tagged Bz-423 within the mitochondria of living cells with
nanometer resolution.
-
Advanced mathematics, such as multiscale weighted colored graph and element
specific persistent homology, and machine learning including deep neural
networks were integrated to construct mathematical deep learning models for
pose and binding affinity prediction and ranking in the last two D3R grand
challenges in computer-aided drug design and discovery. D3R Grand Challenge 2
(GC2) focused on the pose prediction and binding affinity ranking and free
energy prediction for Farnesoid X receptor ligands. Our models obtained the top
place in absolute free energy prediction for free energy Set 1 in Stage 2. The
latest competition, D3R Grand Challenge 3 (GC3), is considered the most
difficult challenge so far. It has 5 subchallenges involving Cathepsin S and
five other kinase targets, namely VEGFR2, JAK2, p38-α, TIE2, and ABL1.
There is a total of 26 official competitive tasks for GC3. Our predictions were
ranked 1st in 10 out of 26 official competitive tasks.
-
Inferential methods can be used to integrate experimental information and
molecular simulations. The maximum entropy principle provides a framework for
using equilibrium experimental data and it has been shown that replica-averaged
simulations, restrained using a static potential, are a practical and powerful
implementation of this principle. Here we show that replica-averaged
simulations restrained using a time-dependent potential are equivalent to the
principle of maximum caliber, the dynamic version of the principle of maximum
entropy, and thus may allow time-resolved data to be integrated into molecular
dynamics simulations. We provide an analytical proof of the equivalence as well
as a computational validation making use of simple models and synthetic data.
Some limitations and possible solutions are also discussed.
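
The two principles can be written side by side in their generic textbook form. The notation below is an assumption made for illustration (prior ensemble $P_0$, observable $f$, Lagrange multiplier $\lambda$, experimental value $f^{\mathrm{exp}}$), and the sign convention of $\lambda$ is arbitrary; the replica-averaged formulation itself is not reproduced.

```latex
% Maximum entropy: static observable f(x), prior ensemble P_0(x)
P_{\mathrm{ME}}(x) = \frac{P_0(x)\, e^{-\lambda f(x)}}{Z(\lambda)},
\qquad
Z(\lambda) = \int P_0(x)\, e^{-\lambda f(x)}\, \mathrm{d}x,
\qquad
\langle f \rangle_{P_{\mathrm{ME}}} = f^{\mathrm{exp}}

% Maximum caliber: trajectories \Gamma = \{x_t\}, prior path ensemble P_0[\Gamma]
P_{\mathrm{MC}}[\Gamma] = \frac{P_0[\Gamma]\,
  \exp\!\left(-\int \lambda(t)\, f(x_t)\, \mathrm{d}t\right)}{Z[\lambda]},
\qquad
\langle f(x_t) \rangle_{P_{\mathrm{MC}}} = f^{\mathrm{exp}}(t)
```

In the static case a single multiplier is fixed by one average, whereas in the dynamic case a time-dependent multiplier $\lambda(t)$ enforces the constraint at every time point of the time-resolved data.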
-
Many important analgesics relieve pain by binding to the μ-Opioid
Receptor (μOR), which makes the μOR among the most clinically relevant
proteins of the G Protein Coupled Receptor (GPCR) family. Despite previous
studies on the activation pathways of the GPCRs, the mechanism of opiate
binding and the selectivity of μOR are largely unknown. We performed
extensive molecular dynamics (MD) simulation and analysis to find the selective
allosteric binding sites of the μOR and the path opiates take to bind to
the orthosteric site. In this study, we predicted that the allosteric site is
responsible for the attraction and selection of opiates. Using Markov state
models and machine learning, we traced the pathway of opiates in binding to the
orthosteric site, the main binding pocket. Our results have important
implications in designing novel analgesics.
-
Managing blood lipid levels is important for the treatment and prevention of
diabetes, cardiovascular disease, and obesity. An easy-to-use, portable lipid
blood test will accelerate more frequent testing by patients and at-risk
populations. We used smartphone systems that are already familiar to many
people. Because smartphone systems can be carried around everywhere, blood can
be measured easily and frequently. We compared the results of lipid tests with
those of existing clinical diagnostic laboratory methods. We found that
smartphone-based point-of-care lipid blood tests are as accurate as
hospital-grade laboratory tests. Our system will be useful for those who need
to manage blood lipid levels to motivate them to track and control their
behavior.
-
MOTIVATION: Proteins fold into complex structures that are crucial for their
biological functions. Experimental determination of protein structures is
costly and therefore limited to a small fraction of all known proteins. Hence,
different computational structure prediction methods are necessary for the
modelling of the vast majority of all proteins. In most structure prediction
pipelines, the last step is to select the best available model and to estimate
its accuracy. This model quality estimation problem has been growing in
importance during the last decade, and progress is believed to be important for
large scale modelling of proteins. The current generation of model quality
estimation programs performs well at separating incorrect and good models, but
fails to consistently identify the best possible model. State-of-the-art model
quality assessment methods use a combination of features that describe a model
and the agreement of the model with features predicted from the protein
sequence.
RESULTS: We first introduce a deep neural network architecture to predict
model quality using significantly fewer input features than state-of-the-art
methods. Thereafter, we propose a methodology to train the deep network that
leverages the comparative structure of the problem. We also show the
possibility of applying transfer learning on databases of known protein
structures. We demonstrate its viability by reaching state-of-the-art
performance using only a reduced set of input features and a coarse description
of the models.
AVAILABILITY: The code will be freely available for download at
github.com/ElofssonLab/ProQ4.
-
Bayesian network models are finding success in characterizing
enzyme-catalyzed reactions and slow conformational changes, in predicting enzyme
inhibition, and in genomics. In this work, we apply them to statistical modeling
of peptides by simultaneously identifying amino acid sequence motifs and using
a motif-based model to clarify the role motifs may play in antimicrobial
activity. We construct models of increasing sophistication, demonstrating how
chemical knowledge of a peptide system may be embedded without requiring new
derivation of model fitting equations after changing model structure. These
models are used to construct classifiers with good performance (94% accuracy,
Matthews correlation coefficient of 0.87) at predicting antimicrobial activity
in peptides, while at the same time being built of interpretable parameters. We
demonstrate use of these models to identify peptides that are potentially both
antimicrobial and antifouling, and show that the background distribution of
amino acids could play a greater role in activity than sequence motifs do. This
provides an advancement in the type of peptide activity modeling that can be
done and the ease with which models can be constructed.
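
The two performance measures quoted above can be computed from a binary confusion matrix as in the following sketch. The confusion-matrix counts are hypothetical and are not taken from the study; the function names are illustrative.

```python
import math

def accuracy(tp, tn, fp, fn):
    """Fraction of correctly classified samples."""
    return (tp + tn) / (tp + tn + fp + fn)

def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary classifier."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Conventionally reported as 0 when a row or column of the matrix is empty
        return 0.0
    return (tp * tn - fp * fn) / denom

# Hypothetical confusion matrix for 100 peptides
acc = accuracy(tp=45, tn=49, fp=1, fn=5)
mcc = matthews_corrcoef(tp=45, tn=49, fp=1, fn=5)
```

Unlike accuracy, the Matthews coefficient stays informative when the active and inactive classes are unbalanced, which is why the two are often reported together.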