We will calculate the lifetime of intramolecular hydrogen bonds in a protein.

Last updated: December 2022

Minimum version of MDAnalysis: 2.0.0-dev0

Packages required:

Note

Please cite Smith et al. (2018) when using HydrogenBondAnaysis in published work.

:

from tqdm.auto import tqdm
import numpy as np
import matplotlib.pyplot as plt

import MDAnalysis as mda
from MDAnalysis.tests.datafiles import PSF, DCD
from MDAnalysis.analysis.hydrogenbonds import HydrogenBondAnalysis

import warnings
# suppress some MDAnalysis warnings
warnings.filterwarnings('ignore')


:

u = mda.Universe(PSF, DCD)


The test files we will be working with here feature adenylate kinase (AdK), a phosophotransferase enzyme. ([BDPW09])

## Find all hydrogen bonds¶

First, find the hydrogen bonds.

:

hbonds = HydrogenBondAnalysis(universe=u)

:

hbonds.run(verbose=True)

:

<MDAnalysis.analysis.hydrogenbonds.hbond_analysis.HydrogenBondAnalysis at 0x7fb6c56cd610>


The hydrogen bond lifetime is calculated via the time autocorrelation function of the presence of a hydrogen bond:

$C(\tau) = \bigg\langle \frac{h_{ij}(t_0) h_{ij}(t_0 + \tau)}{h_{ij}(t_0)^2} \bigg\rangle$

where $$h_{ij}$$ indicates the presence of a hydrogen bond between atoms $$i$$ and $$j$$:

• $$h_{ij}=1$$ if there is a hydrogen bond

• $$h_{ij}=0$$ if there is no hydrogen bond

$$h_{ij}(t_0)=1$$ indicates there is a hydrogen bond between atoms $$i$$ and $$j$$ at a time origin $$t_0$$, and $$h_{ij}(t_0+\tau)=1$$ indicates these atoms remain hydrogen bonded throughout the period $$t_0$$ to $$t_0+\tau$$. To improve statistics, multiple time origins, $$t_0$$, are used in the calculation and the average is taken over all time origins.

See Gowers and Carbonne (2015) for further discussion on hydrogen bonds lifetimes.

Note

The period between time origins, $$t_0$$, should be chosen such that consecutive $$t_0$$ are uncorrelated.

The hbonds.lifetime method calculates the above time autocorrelation function. The period between time origins is set using window_step, and the maximum value of $$\tau$$ (in frames) is set using tau_max.

:

tau_max = 25
window_step = 1

:

tau_frames, hbond_lifetime = hbonds.lifetime(
tau_max=tau_max,
window_step=window_step
)

:

tau_times = tau_frames * u.trajectory.dt

plt.xlabel(r"$\tau\ \rm (ps)$")
plt.ylabel(r"$C(\tau)$")

plt.show() ## Calculating the time constant¶

To obtain the hydrogen bond lifetime, you can fit a biexponential to the time autocorrelation curve. We will fit the following biexponential:

$A\exp(-t / \tau_1) + B\exp(-t / \tau_2)$

where $$\tau_1$$ and $$\tau_2$$ represent two time constants - one corresponding to a short-timescale process and the other to a longer timescale process. $$A$$ and $$B$$ will sum to $$1$$, and they represent the relative importance of the short- and longer-timescale processes in the overall autocorrelation curve.

:

def fit_biexponential(tau_timeseries, ac_timeseries):
"""Fit a biexponential function to a hydrogen bond time autocorrelation function

Return the two time constants
"""
from scipy.optimize import curve_fit

def model(t, A, tau1, B, tau2):
"""Fit data to a biexponential function.
"""
return A * np.exp(-t / tau1) + B * np.exp(-t / tau2)

params, params_covariance = curve_fit(model, tau_timeseries, ac_timeseries, [1, 0.5, 1, 2])

fit_t = np.linspace(tau_timeseries, tau_timeseries[-1], 1000)
fit_ac = model(fit_t, *params)

return params, fit_t, fit_ac

:

params, fit_t, fit_ac = fit_biexponential(tau_frames, hbond_lifetime)

:

# Plot the fit
plt.plot(fit_t, fit_ac, label="fit")

plt.xlabel(r"$\tau\ \rm (ps)$")
plt.ylabel(r"$C(\tau)$")

plt.legend()
plt.show() :

# Check the decay time constant
A, tau1, B, tau2 = params
time_constant = A * tau1 + B * tau2
print(f"time_constant = {time_constant:.2f} ps")

time_constant = 6.14 ps


The above example shows you how to calculate the continuous hydrogen bond lifetime. This means that the hydrogen bond must be present at every frame from $$t_0$$ to $$t_0 + \tau$$. To allow for small fluctuations in the DA distance or DHA angle, the intermittent hydrogen bond lifetime may be calculated. This allows a hydrogen bond to break for up to a specified number of frames and still be considered present.

In the lifetime method, the intermittency argument is used to set the maxium number of frames for which a hydrogen bond is allowed to break. The default is intermittency=0, which means that if a hydrogen bond is missing at any frame between $$t_0$$ and $$t_0 + \tau$$, it will not be considered present at $$t_0+\tau$$. This is equivalent to the continuous lifetime. However, with a value of intermittency=2, all hydrogen bonds are allowed to break for up to a maximum of consecutive two frames.

Below we see how changing the intermittency affects the hydrogen bond lifetime.

:

tau_max = 25
window_step = 1
intermittencies = [0, 1, 10, 100]

:

for intermittency in intermittencies:

tau_max=tau_max,
window_step=window_step,
intermittency=intermittency
)

times_times = tau_frames * u.trajectory.dt

plt.xlabel(r"$\tau\ \rm (ps)$")
plt.ylabel(r"$C(\tau)$")

plt.legend(title="Intermittency=", loc=(1.02, 0.0))
plt.show() ## Hydrogen bond lifetime of individual hydrogen bonds¶

Let’s first find the 10 most prevalent hydrogen bonds.

:

hbonds = HydrogenBondAnalysis(universe=u)

:

hbonds.run(verbose=True)

:

<MDAnalysis.analysis.hydrogenbonds.hbond_analysis.HydrogenBondAnalysis at 0x7fb6c4b02730>

:

# Print donor, hydrogen, acceptor and count info for these hbonds
counts = hbonds.count_by_ids()
for donor, hydrogen, acceptor, count in counts[:10]:
d, h, a = u.atoms[donor], u.atoms[hydrogen], u.atoms[acceptor]
print(f"{d.resname}-{d.resid}-{d.name}\t{h.name}\t{a.resname}-{a.resid}-{a.name}\tcount={count}")

ARG-71-NH2      HH22    ASP-76-OD1      count=98
TYR-193-OH      HH      GLU-108-OE1     count=97
ARG-2-NH1       HH11    ASP-104-OD1     count=96
THR-199-OG1     HG1     ASP-197-OD1     count=95
TYR-24-OH       HH      GLY-214-OT2     count=95
TYR-133-OH      HH      ASP-146-OD1     count=95
ARG-206-NE      HE      GLU-210-OE1     count=93
ARG-124-NH2     HH22    GLU-143-OE1     count=93
LYS-200-NZ      HZ2     ASP-208-OD2     count=93
LYS-211-NZ      HZ3     GLU-204-OE1     count=92


Now we’ll calculate the lifetime of these hydrogen bonds. To do this, the simplest way is to run HydrogenBondAnalysis for each hydrogen bond then use the lifetime method. It is very efficient to find hydrogen bonds between two specific atoms, especially with update_selections=False.

:

tau_max = 25
window_step = 1
intermittency = 0

:

hbond_lifetimes = []
labels = [] # for plotting

for hbond in counts[:10]:

# find hbonds between specific atoms
d_ix, h_ix, a_ix = hbond[:3]
tmp_hbonds = HydrogenBondAnalysis(
universe=u,
hydrogens_sel=f"index {h_ix}",
acceptors_sel=f"index {a_ix}",
update_selections=False
)
tmp_hbonds.run()

tau_max=tau_max,
intermittency=intermittency
)

# label for plotting
donor, acceptor = u.atoms[d_ix], u.atoms[a_ix]
label = f"{donor.resname}:{donor.resid} to {acceptor.resname}:{acceptor.resid}"
labels.append(label)

labels = np.array(labels)

:

# Plot the lifetimes
times = taus * u.trajectory.dt
for hbl, label in zip(hbond_lifetimes, labels):
plt.plot(times, hbl, label=label, lw=2)

plt.title(r"Hydrogen bond lifetime of specific hbonds", weight="bold")
plt.xlabel(r"$\tau\ \rm (ps)$")
plt.ylabel(r"$C(\tau)$")

plt.legend(ncol=1, loc=(1.02, 0))
plt.show() Note

The shape of these curves indicates we have poor statistics in our lifetime calculations - we used only 100 frames and consider a single hydrogen bond.

The curve should decay smoothly toward 0, as seen in the first hydrogen bond lifetime plot we produced in this notebook. If the curve does not decay smoothly, more statistics are required either by increasing the value of tau_max, using a greater number of time origins, or increasing the length of your simulation.

See Gowers and Carbonne (2015) for further discussion on hydrogen bonds lifetimes.