Calculating the solvent density around a protein

Here we use density.DensityAnalysis to analyse the solvent density around an enzyme.

Last updated: December 2022 with MDAnalysis 2.4.0-dev0

Minimum version of MDAnalysis: 1.0.0

Packages required:

Optional packages for visualisation:

  • nglview

  • matplotlib

  • scikit-image

  • pyvista

  • ipygany

[1]:
import MDAnalysis as mda
from MDAnalysis.tests.datafiles import TPR, XTC
from MDAnalysis.analysis import density

import numpy as np
import matplotlib.pyplot as plt
import nglview as nv
%matplotlib inline

Loading files

The test files we will be working with here feature adenylate kinase (AdK), a phosophotransferase enzyme. ([BDPW09]). It is solvated in TIP4P water and broken across periodic boundaries.

[2]:
u = mda.Universe(TPR, XTC)
view1 = nv.show_mdanalysis(u)
view1.add_representation('point',
                         selection='resname SOL')
view1
[3]:
view1.render_image()

Centering, aligning, and making molecules whole with on-the-fly transformations

DensityAnalysis uses a fixed grid to analyse the density of molecules. As it is likely that this grid may cross the unit cell wall, this means that molecules that have broken across the periodic boundary must be made whole. Because we want to analyse the density of water around a protein, this means:

  • that the solvent must be mapped so they are closest to the protein, and

  • we need to align the trajectory on the protein for a fixed frame of reference

In practice, the transformations that we need (in order) are shown in the table below. GROMACS’s trjconv is often used to perform these transformations; the equivalent command is also given. MDAnalysis offers on-the-fly transformations to accomplish much the same task; however, where trjconv saves the transformed trajectory into a file, MDAnalysis does not alter the initial trajectory. Instead, it transforms each frame “on-the-fly” as it is loaded into MDAnalysis.

Transformation

On-the-fly transformation

GROMACS trjconv argument

Making molecules whole

wrap.unwrap()

-pbc whole

Moving the protein to the center of the box for more symmetric density

translate.center_in_box()

-center

Wrapping water back into the box

wrap.wrap()

Aligning the trajectory onto the protein

fit.fit_rot_trans()

-fit rot+trans

We want wrap water back into the box before we fit the trajectory, in order to avoid odd placements from the rotation in the alignment.

You can do this yourself with external tools such as gmx trjconv, using the arguments above. Here, we use on-the-fly transformations so we can avoid writing out new trajectories.

[4]:
from MDAnalysis import transformations as trans

protein = u.select_atoms('protein')
water = u.select_atoms('resname SOL')

workflow = [trans.unwrap(u.atoms),  # unwrap all fragments
            trans.center_in_box(protein, # move atoms so protein
                                center='geometry'), # is centered
            trans.wrap(water, # wrap water back into box
                       compound='residues'), # keep each water whole
            trans.fit_rot_trans(protein, # align protein to first frame
                                protein,
                                weights='mass'),
           ]

u.trajectory.add_transformations(*workflow)

When we visualise the transformed trajectory, we can see that it is now centered in the box and whole.

[5]:
view2 = nv.show_mdanalysis(u)
view2.add_representation('point',
                         selection='resname SOL')
view2
[6]:
view2.render_image()

Analysing the density of water around the protein

Now that the input trajectory has been pre-processed, we can carry out our analysis. We only want to look at the density of numbers of water molecules, so we choose the oxygen atoms only (see LinearDensity for mass and charge density analysis).

[7]:
ow = u.select_atoms('name OW')
dens = density.DensityAnalysis(ow,
                               delta=4.0,
                               padding=2)
dens.run()
[7]:
<MDAnalysis.analysis.density.DensityAnalysis at 0x7fe803a3eca0>

The results are stored in dens.density, a Density object. dens.density.grid is a numpy array with the average density of the water oxygen atoms, histogrammed onto a grid with 1 \(Å\) spacing on each axis.

[8]:
grid = dens.results.density.grid
grid.shape
[8]:
(31, 42, 20)

When first calculated, these are in the default units of \(Å^{-3}\).

[9]:
dens.results.density.units
[9]:
{'length': 'Angstrom', 'density': 'Angstrom^{-3}'}

You can convert the units both for the length (convert_length) and for the density (convert_density). MDAnalysis stores a number of precomputed ways to convert units. Densities can be converted to \(nm^{-3}\), or converted to a density relative to the bulk density. After executing the code below, the array at dens.density.grid now contains the density of water relative to bulk TIP4P water at ambient positions.

[10]:
dens.results.density.convert_density('TIP4P')
dens.results.density.units
[10]:
{'length': 'Angstrom', 'density': 'TIP4P'}

Visualisation

A number of 3D and 2D visualization methods are illustrated below.

matplotlib (3D static plot)

You may want to visualise your densities as part of your analysis. One trivial way is to plot the density of water around the protein as a 3D scatter plot.

First we need to obtain the x, y, and z axes for the plot by taking the midpoints of the histogram bins. These are stored as dens.density.midpoints.

[11]:
mx, my, mz = dens.results.density.midpoints

In the plot below we represent the density of water in a particular bin with the opacity of the scatter point. To do that, we need to first normalise the density values. In the flat vector below before, the highest opacity (i.e. the point with the highest density of water oxygen atoms) is 0.1. The array is also flattened so we can treat it as a list of values.

[12]:
grid = dens.results.density.grid
flat = grid.ravel() / (grid.max()*10)

We set the colour to an RGBA array representing the colour blue. The last number in an RGBA array represents the alpha channel, which controls the opacity of the point.

[13]:
blue = [44, 130, 201, 1]
colors = [blue] * len(mx) * len(my) * len(mz)
colors = np.array(colors, dtype=float)
colors[:, -1] *= flat
colors[:, :3] /= 255

Finally we can plot the points on a 3D plot. Axes3D must be imported for a 3D plot, even though we do not directly use it. In this case, the plot is not very interesting; it just looks like a box of water.

[14]:
from mpl_toolkits.mplot3d import Axes3D

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')

x, y, z = np.meshgrid(mx, my, mz)

ax.scatter(x, y, z, c=colors);
../../../_images/examples_analysis_volumetric_density_analysis_30_0.png

nglview (interactive)

You could also view the density in NGLView by exporting it to a DX format:

[15]:
dens.results.density.export("water.dx")

Use the surface representation in NGLViewer to visualize the loaded density at different isolevels: - contour = False shows a continuous surface (the default); True shows a wirefram - isolevel = float sets the contour level and with isolevel_type="value" is in the units of the density - isolevel_type="value" for densities (the default is “sigma”) and then isolevel has a different meanin - One can use multiple surfaces at different isolevels (although the current example trajectory has too few frames to generate a well resolved density - smooth = float controls the surface smoothing of the representation

[16]:
view3 = nv.show_mdanalysis(u)
d = view3.add_component("water.dx")
d.clear_representations()
d.add_surface(isolevel=0.5, isolevel_type="value", opacity=0.1, contour=False, smooth=1, color="blue")
d.add_surface(isolevel=1.2, isolevel_type="value", opacity=1, contour=True, smooth=1, color="cyan")

view3

scikit-image (triangulated surface)

You could use the Marching Cube (Lewiner) algorithm to triangulate the surface (following this tutorial and StackOverflow post.

This will require the skimage library.

[17]:
from skimage import measure
from mpl_toolkits.mplot3d import Axes3D

iso_val = 0.5
verts, faces, _, _ = measure.marching_cubes(dens.results.density.grid, iso_val,
                                                    spacing=dens.results.density.delta)

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.plot_trisurf(verts[:, 0], verts[:,1], faces, verts[:, 2],
                lw=1, alpha=0.1);
../../../_images/examples_analysis_volumetric_density_analysis_37_0.png

pyvista (3D surface)

Similarly, using PyVista, you can plot both static and interactive visualizations of the surface at different iso levels (following this StackOverflow post). Uncomment the last lines to show the plot in your local machine.

[18]:
import pyvista as pv

pv.set_plot_theme("document")

x, y, z = np.meshgrid(mx, my, mz, indexing="ij")
mesh = pv.StructuredGrid(x, y, z)
mesh["density"] = dens.results.density.grid.T.flatten() # note transpose
contours = mesh.contour([0.5, 1.2])
p = pv.Plotter(notebook=True)
p.background_color = 'white'
p.add_mesh(mesh.outline(), color="k");  # box lines
p.add_mesh(contours, opacity=0.2);  # surfaces
# p.show()
# p.screenshot("./density_analysis_images/surface.png");
surface

Unfortunately plotting interactively appears to render everything with opaque surfaces. Note that this code snippet requires ipygany to be installed.

[19]:
p = pv.Plotter(notebook=True)
p.background_color = 'white'
p.add_mesh(mesh.outline(), color="k")  # box lines
p.add_mesh(contours, opacity=0.2)  # surfaces
p.show(jupyter_backend="ipygany")
/home/pbarletta/mambaforge/envs/mda-user-guide/lib/python3.9/site-packages/traittypes/traittypes.py:97: UserWarning: Given trait value dtype "float32" does not match required type "float64". A coerced copy has been created.
  warnings.warn(

2D averaging

Alternatively, you could plot the average density of water on the xy-plane. We get the average x-y positions by averaging over the z-axis.

[20]:
avg = grid.mean(axis=-1)
avg.shape
[20]:
(31, 42)

Below, it is plotted as a heat map.

[21]:
fig, ax = plt.subplots()

im = ax.imshow(avg)
cbar = plt.colorbar(im)
cbar.set_label('Mean density of water over TIP4P literature value')
plt.xlabel('X-axis ($\AA$)')
plt.ylabel('Y-axis ($\AA$)');
../../../_images/examples_analysis_volumetric_density_analysis_47_0.png

You can interpolate values for a smoother average:

[22]:
fig, ax = plt.subplots()

im = ax.imshow(avg, interpolation="bicubic")
cbar = plt.colorbar(im)
cbar.set_label('Mean density of water over TIP4P literature value')
plt.xlabel('X-axis ($\AA$)')
plt.ylabel('Y-axis ($\AA$)');
../../../_images/examples_analysis_volumetric_density_analysis_49_0.png

References

[1] Oliver Beckstein, Elizabeth J. Denning, Juan R. Perilla, and Thomas B. Woolf. Zipping and Unzipping of Adenylate Kinase: Atomistic Insights into the Ensemble of Open↔Closed Transitions. Journal of Molecular Biology, 394(1):160–176, November 2009. 00107. URL: https://linkinghub.elsevier.com/retrieve/pii/S0022283609011164, doi:10.1016/j.jmb.2009.09.009.

[2] Richard J. Gowers, Max Linke, Jonathan Barnoud, Tyler J. E. Reddy, Manuel N. Melo, Sean L. Seyler, Jan Domański, David L. Dotson, Sébastien Buchoux, Ian M. Kenney, and Oliver Beckstein. MDAnalysis: A Python Package for the Rapid Analysis of Molecular Dynamics Simulations. Proceedings of the 15th Python in Science Conference, pages 98–105, 2016. 00152. URL: https://conference.scipy.org/proceedings/scipy2016/oliver_beckstein.html, doi:10.25080/Majora-629e541a-00e.

[3] Oliver Beckstein, Elizabeth J. Denning, Juan R. Perilla, and Thomas B. Woolf. Zipping and Unzipping of Adenylate Kinase: Atomistic Insights into the Ensemble of Open↔Closed Transitions. Journal of Molecular Biology, 394(1):160–176, November 2009. 00107. URL: https://linkinghub.elsevier.com/retrieve/pii/S0022283609011164, doi:10.1016/j.jmb.2009.09.009.