{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Aligning a trajectory to a reference\n", "\n", "We use `align.AlignTraj` to align a trajectory to a frame in a reference trajectory and write it to a file.\n", "\n", "**Last updated:** December 2022\n", "\n", "**Minimum version of MDAnalysis:** 2.0.0\n", "\n", "**Packages required:**\n", " \n", "* MDAnalysis (Michaud-Agrawal *et al.*, 2011, Gowers *et al.*, 2016)\n", "* MDAnalysisTests\n", " \n", "**Optional packages for molecular visualisation:**\n", " \n", "* [nglview](http://nglviewer.org/nglview/latest) (Nguyen *et al.*, 2018)\n", "\n", "Throughout this tutorial we will include cells for visualising Universes with the [NGLView](http://nglviewer.org/nglview/latest/api.html) library. However, these will be commented out, and we will show the expected images generated instead of the interactive widgets.\n", "\n", "\n", "**See also**\n", "\n", "* [Aligning a trajectory to a frame from itself](aligning_trajectory_to_frame.ipynb)\n", "* [Aligning a structure to another](aligning_structure_to_another.ipynb)\n", "* [RMSD](rmsd.ipynb)\n", "\n", "
\n", " \n", "**Note**\n", "\n", "MDAnalysis implements RMSD calculation using the fast QCP algorithm (Theobald, 2005) and a rotation matrix *R* that minimises the RMSD (Liu *et al.*, 2009). Please cite (Theobald, 2005) and (Liu *et al.*, 2009) when using the ``MDAnalysis.analysis.align`` module in published work.\n", "\n", "
" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "ExecuteTime": { "end_time": "2020-12-30T11:02:48.967650Z", "start_time": "2020-12-30T11:02:47.891539Z" } }, "outputs": [], "source": [ "import MDAnalysis as mda\n", "from MDAnalysis.analysis import align\n", "from MDAnalysis.tests.datafiles import CRD, PSF, DCD, DCD2\n", "# import nglview as nv\n", "\n", "import warnings\n", "# suppress some MDAnalysis warnings when writing PDB files\n", "warnings.filterwarnings('ignore')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Loading files\n", "\n", "The test files we will be working with here are trajectories of a adenylate kinase (AdK), a phosophotransferase enzyme. (Beckstein *et al.*, 2009) The trajectories sample a transition from a closed to an open conformation. " ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "ExecuteTime": { "end_time": "2020-12-30T11:02:49.347627Z", "start_time": "2020-12-30T11:02:48.969475Z" } }, "outputs": [], "source": [ "adk_open = mda.Universe(CRD, DCD2)\n", "adk_closed = mda.Universe(PSF, DCD)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Currently, the proteins are not aligned to each other. The difference becomes obvious when the closed conformation is compared to the open. Below, we set `adk_open` to the last frame and see the relative positions of each protein in a merged Universe." ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "ExecuteTime": { "end_time": "2020-12-30T11:02:49.554328Z", "start_time": "2020-12-30T11:02:49.352046Z" } }, "outputs": [], "source": [ "adk_open.trajectory[-1] # last frame\n", "merged = mda.Merge(adk_open.atoms, adk_closed.atoms)" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "scrolled": true }, "outputs": [], "source": [ "# merged_view = nv.show_mdanalysis(merged)\n", "# merged_view" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "
\n", " \n", "![merged_view](images/merged_view.png)\n", " \n", "
\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Aligning a trajectory with AlignTraj\n", "\n", "While [align.alignto](https://docs.mdanalysis.org/stable/documentation_pages/analysis/align.html#MDAnalysis.analysis.align.alignto) aligns structures, or a frame of a trajectory, `align.AlignTraj` ([API docs](https://docs.mdanalysis.org/stable/documentation_pages/analysis/align.html#MDAnalysis.analysis.align.AlignTraj)) efficiently aligns an entire trajectory to a reference. Unlike most other analysis modules, `AlignTraj` allows you to write the output of the analysis to a file. This is because when `Universe`s are created by loading from a file, changes to frame-by-frame (dynamic) information [do not persist](https://userguide.mdanalysis.org/stable/trajectories/trajectories.html) when the frame is changed. If the trajectory is not written to a file, or pulled into memory (below), `AlignTraj` will have no effect." ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "ExecuteTime": { "end_time": "2020-12-30T11:02:50.469639Z", "start_time": "2020-12-30T11:02:50.179185Z" } }, "outputs": [], "source": [ "align.AlignTraj(adk_closed, # trajectory to align\n", " adk_open, # reference\n", " select='name CA', # selection of atoms to align\n", " filename='aligned.dcd', # file to write the trajectory to\n", " match_atoms=True, # whether to match atoms based on mass\n", " ).run()\n", "# merge adk_closed and adk_open into the same universe\n", "merged1 = mda.Merge(adk_closed.atoms, adk_open.atoms)" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "# merged1_view = nv.show_mdanalysis(merged1)\n", "# merged1_view" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "
\n", " \n", "![merged1_view](images/aligning_trajectory-merged1_view.png)\n", " \n", "
\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As you can see, the `adk_closed` and `adk_open` trajectories still look the same. However, when we load our aligned trajectory from `aligned.dcd`, we can see them superposed:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "ExecuteTime": { "end_time": "2020-12-30T11:02:51.406526Z", "start_time": "2020-12-30T11:02:51.135804Z" } }, "outputs": [], "source": [ "aligned = mda.Universe(PSF, 'aligned.dcd')\n", "aligned.segments.segids = ['Aligned'] # rename our segments\n", "adk_open.segments.segids = ['Open'] # so they're coloured differently\n", "merged2 = mda.Merge(aligned.atoms, adk_open.atoms)" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "scrolled": true }, "outputs": [], "source": [ "# merged2_view = nv.show_mdanalysis(merged2)\n", "# merged2_view" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "
\n", " \n", "![merged2_view](images/aligning_trajectory-merged2_view.png)\n", " \n", "
\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you don't want to write a file, you can also choose to load the entire trajectory into memory. (This is not always feasible depending on how large your trajectory is, and how much memory your device has, in which case it is much more efficient to write an aligned trajectory to a file as above). You can accomplish this in one of two ways:\n", "\n", "1. Load the trajectory into memory in the first place\n", "```python\n", "adk_closed = mda.Universe(PSF, DCD, in_memory=True)\n", "```\n", "\n", "2. Select `in_memory=True` during `AlignTraj` (below)" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "ExecuteTime": { "end_time": "2020-12-30T11:02:52.851594Z", "start_time": "2020-12-30T11:02:52.595887Z" } }, "outputs": [], "source": [ "align.AlignTraj(adk_closed, # trajectory to align\n", " adk_open, # reference\n", " select='name CA', # selection of atoms to align\n", " filename='aligned.dcd', # file to write the trajectory to\n", " match_atoms=True, # whether to match atoms based on mass\n", " in_memory=True\n", " ).run()\n", "# merge adk_closed and adk_open into the same universe\n", "merged3 = mda.Merge(adk_closed.atoms, adk_open.atoms)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Copying coordinates into a new Universe" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`MDAnalysis.Merge` does not automatically load coordinates for a trajectory. We can do this ourselves. Below, we copy the coordinates of the 98 frames in the `aligned` universe." ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "ExecuteTime": { "end_time": "2020-12-30T11:02:54.272804Z", "start_time": "2020-12-30T11:02:54.238434Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(98, 3341, 3)\n" ] } ], "source": [ "from MDAnalysis.analysis.base import AnalysisFromFunction\n", "import numpy as np\n", "from MDAnalysis.coordinates.memory import MemoryReader\n", "\n", "def copy_coords(ag):\n", " return ag.positions.copy()\n", "\n", "aligned_coords = AnalysisFromFunction(copy_coords, \n", " aligned.atoms).run().results\n", "\n", "print(aligned_coords['timeseries'].shape)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To contrast, we will keep the open conformation static." ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "ExecuteTime": { "end_time": "2020-12-30T11:02:55.773852Z", "start_time": "2020-12-30T11:02:55.762008Z" } }, "outputs": [ { "data": { "text/plain": [ "(3341, 3)" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "adk_coords = adk_open.atoms.positions.copy()\n", "adk_coords.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Because there are 98 frames of the `aligned` Universe, we copy the coordinates of the `adk_open` positions and stack them." ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "ExecuteTime": { "end_time": "2020-12-30T11:02:56.532217Z", "start_time": "2020-12-30T11:02:56.523271Z" } }, "outputs": [ { "data": { "text/plain": [ "(98, 3341, 3)" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "adk_traj_coords = np.stack([adk_coords] * 98)\n", "adk_traj_coords.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We join `aligned_coords` and `adk_traj_coords` on the second axis with `np.hstack` and load the coordinates into memory into the `merged2` Universe." ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "ExecuteTime": { "end_time": "2020-12-30T11:02:58.252058Z", "start_time": "2020-12-30T11:02:58.065857Z" } }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "merged_coords = np.hstack([aligned_coords['timeseries'], \n", " adk_traj_coords])\n", "merged2.load_new(merged_coords, format=MemoryReader)" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "# m2_view = nv.show_mdanalysis(merged2)\n", "# m2_view" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Online notebooks do not show the molecule trajectory, but here you can use `nglview.contrib.movie.MovieMaker` to make a gif of the trajectory. This requires you to install `moviepy`." ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "ExecuteTime": { "end_time": "2020-12-30T11:02:23.357079Z", "start_time": "2020-12-30T11:02:23.331440Z" }, "scrolled": true, "tags": [ "a" ] }, "outputs": [], "source": [ "# from nglview.contrib.movie import MovieMaker\n", "# movie = MovieMaker(\n", "# m2_view,\n", "# step=4, # only render every 4th frame\n", "# output='merged.gif',\n", "# render_params={\"factor\": 3}, # set to 4 for higher quality\n", "# )\n", "# movie.make()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "
\n", " \n", "![merged](merged.gif)\n", " \n", "
\n", "
" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Writing trajectories to a file\n", "\n", "Finally, we can also save this new trajectory to a file. " ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "execution": { "iopub.execute_input": "2021-05-19T05:56:30.239319Z", "iopub.status.busy": "2021-05-19T05:56:30.237909Z", "iopub.status.idle": "2021-05-19T05:56:32.988341Z", "shell.execute_reply": "2021-05-19T05:56:32.988743Z" } }, "outputs": [], "source": [ "with mda.Writer('aligned.xyz', merged2.atoms.n_atoms) as w:\n", " for ts in merged2.trajectory:\n", " w.write(merged2.atoms)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## References\n", "\n", "[1] Oliver Beckstein, Elizabeth J. Denning, Juan R. Perilla, and Thomas B. Woolf.\n", "Zipping and Unzipping of Adenylate Kinase: Atomistic Insights into the Ensemble of OpenClosed Transitions.\n", "Journal of Molecular Biology, 394(1):160–176, November 2009.\n", "00107.\n", "URL: https://linkinghub.elsevier.com/retrieve/pii/S0022283609011164, doi:10.1016/j.jmb.2009.09.009.\n", "\n", "[2] Richard J. Gowers, Max Linke, Jonathan Barnoud, Tyler J. E. Reddy, Manuel N. Melo, Sean L. Seyler, Jan Domański, David L. Dotson, Sébastien Buchoux, Ian M. Kenney, and Oliver Beckstein.\n", "MDAnalysis: A Python Package for the Rapid Analysis of Molecular Dynamics Simulations.\n", "Proceedings of the 15th Python in Science Conference, pages 98–105, 2016.\n", "00152.\n", "URL: https://conference.scipy.org/proceedings/scipy2016/oliver_beckstein.html, doi:10.25080/Majora-629e541a-00e.\n", "\n", "[3] Pu Liu, Dimitris K. Agrafiotis, and Douglas L. Theobald.\n", "Fast determination of the optimal rotational matrix for macromolecular superpositions.\n", "Journal of Computational Chemistry, pages n/a–n/a, 2009.\n", "URL: http://doi.wiley.com/10.1002/jcc.21439, doi:10.1002/jcc.21439.\n", "\n", "[4] Naveen Michaud-Agrawal, Elizabeth J. Denning, Thomas B. Woolf, and Oliver Beckstein.\n", "MDAnalysis: A toolkit for the analysis of molecular dynamics simulations.\n", "Journal of Computational Chemistry, 32(10):2319–2327, July 2011.\n", "00778.\n", "URL: http://doi.wiley.com/10.1002/jcc.21787, doi:10.1002/jcc.21787.\n", "\n", "[5] Hai Nguyen, David A Case, and Alexander S Rose.\n", "NGLview–interactive molecular graphics for Jupyter notebooks.\n", "Bioinformatics, 34(7):1241–1242, April 2018.\n", "00024.\n", "URL: https://academic.oup.com/bioinformatics/article/34/7/1241/4721781, doi:10.1093/bioinformatics/btx789.\n", "\n", "[6] Douglas L. Theobald.\n", "Rapid calculation of RMSDs using a quaternion-based characteristic polynomial.\n", "Acta Crystallographica Section A Foundations of Crystallography, 61(4):478–480, July 2005.\n", "00127.\n", "URL: http://scripts.iucr.org/cgi-bin/paper?S0108767305015266, doi:10.1107/S0108767305015266." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3.9.15 ('mda-user-guide')", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.15" }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": false, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": {}, "toc_section_display": true, "toc_window_display": false }, "vscode": { "interpreter": { "hash": "7b52aa17ef4e9358c0e91ff3f0bf50d10667bb8b55636d4b9fb967a2ff94bd8c" } } }, "nbformat": 4, "nbformat_minor": 4 }