{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Aligning a structure to another\n", "\n", "We use `align.alignto` to align a structure to another.\n", "\n", "**Last updated:** December 2022\n", "\n", "**Minimum version of MDAnalysis:** 1.0.0\n", "\n", "**Packages required:**\n", " \n", "* MDAnalysis (Michaud-Agrawal *et al.*, 2011, Gowers *et al.*, 2016)\n", "* MDAnalysisTests\n", " \n", "**Optional packages for molecular visualisation:**\n", " \n", "* [nglview](http://nglviewer.org/nglview/latest) (Nguyen *et al.*, 2018)\n", "\n", "Throughout this tutorial we will include cells for visualising Universes with the [NGLView](http://nglviewer.org/nglview/latest/api.html) library. However, these will be commented out, and we will show the expected images generated instead of the interactive widgets.\n", "\n", "**See also**\n", "\n", "* [Aligning a trajectory to a frame from another](aligning_trajectory.ipynb)\n", "* [Aligning a trajectory to a frame from itself](aligning_trajectory_to_frame.ipynb)\n", "* [RMSD](rmsd.ipynb)\n", "\n", "\n", "
\n", " \n", "**Note**\n", "\n", "MDAnalysis implements RMSD calculation using the fast QCP algorithm (Theobald, 2005) and a rotation matrix *R* that minimises the RMSD (Liu *et al.*, 2009). Please cite (Theobald, 2005) and (Liu *et al.*, 2009) when using the ``MDAnalysis.analysis.align`` module in published work.\n", "\n", "
\n", "\n" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "ExecuteTime": { "end_time": "2021-05-19T06:10:04.491889Z", "start_time": "2021-05-19T06:10:03.591889Z" }, "execution": { "iopub.execute_input": "2021-05-19T05:56:20.048242Z", "iopub.status.busy": "2021-05-19T05:56:20.047164Z", "iopub.status.idle": "2021-05-19T05:56:20.785802Z", "shell.execute_reply": "2021-05-19T05:56:20.786179Z" }, "scrolled": false }, "outputs": [], "source": [ "import MDAnalysis as mda\n", "from MDAnalysis.analysis import align\n", "from MDAnalysis.tests.datafiles import CRD, PSF, DCD, DCD2\n", "# import nglview as nv\n", "\n", "import warnings\n", "# suppress some MDAnalysis warnings about writing PDB files\n", "warnings.filterwarnings('ignore')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Loading files\n", "\n", "The test files we will be working with here are trajectories of a adenylate kinase (AdK), a phosophotransferase enzyme. (Beckstein *et al.*, 2009) The trajectories sample a transition from a closed to an open conformation." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "ExecuteTime": { "end_time": "2021-05-19T06:10:05.643507Z", "start_time": "2021-05-19T06:10:05.326336Z" }, "execution": { "iopub.execute_input": "2021-05-19T05:56:20.789837Z", "iopub.status.busy": "2021-05-19T05:56:20.789276Z", "iopub.status.idle": "2021-05-19T05:56:21.192752Z", "shell.execute_reply": "2021-05-19T05:56:21.187197Z" } }, "outputs": [], "source": [ "adk_open = mda.Universe(CRD, DCD2)\n", "adk_closed = mda.Universe(PSF, DCD)" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "# adk_open_view = nv.show_mdanalysis(adk_open)\n", "# adk_open_view" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "
\n", " \n", "![adk_open_view](images/aligning_structure_to_another-adk_open_view.gif)\n", " \n", "
\n", "
" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "ExecuteTime": { "end_time": "2021-05-19T06:10:07.078934Z", "start_time": "2021-05-19T06:10:06.312021Z" }, "execution": { "iopub.execute_input": "2021-05-19T05:56:21.197706Z", "iopub.status.busy": "2021-05-19T05:56:21.196586Z", "iopub.status.idle": "2021-05-19T05:56:22.140567Z", "shell.execute_reply": "2021-05-19T05:56:22.136583Z" }, "scrolled": true }, "outputs": [], "source": [ "# adk_closed_view = nv.show_mdanalysis(adk_closed)\n", "# adk_closed_view" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "
\n", " \n", "![adk_closed_view](images/aligning_structure_to_another-adk_closed_view.gif)\n", " \n", "
\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Currently, the proteins are not aligned to each other. The difference becomes even more obvious when the closed conformation is compared to the open. Below, we set `adk_open` to the last frame and see the relative positions of each protein in a merged Universe." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "adk_open.trajectory[-1] # last frame\n", "merged = mda.Merge(adk_open.atoms, adk_closed.atoms)" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "ExecuteTime": { "end_time": "2021-05-19T06:10:08.199887Z", "start_time": "2021-05-19T06:10:07.996998Z" }, "execution": { "iopub.execute_input": "2021-05-19T05:56:22.146963Z", "iopub.status.busy": "2021-05-19T05:56:22.146260Z", "iopub.status.idle": "2021-05-19T05:56:22.384558Z", "shell.execute_reply": "2021-05-19T05:56:22.382441Z" } }, "outputs": [], "source": [ "# merged_view = nv.show_mdanalysis(merged)\n", "# merged_view" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "
\n", " \n", "![merged_view](images/merged_view.png)\n", " \n", "
\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Aligning a structure with align.alignto" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`alignto` ([API docs](https://docs.mdanalysis.org/stable/documentation_pages/analysis/align.html#MDAnalysis.analysis.align.alignto)) aligns the mobile AtomGroup to the target AtomGroup by minimising the [root mean square deviation (RMSD) between particle positions](rmsd.ipynb) (please see the linked notebook for an explanation of RMSD). It returns *(old_rmsd, new_rmsd)*. By default (`match_atoms=True`), it will attempt to match the atoms between the mobile and reference structures by mass." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "ExecuteTime": { "end_time": "2021-05-19T06:10:09.818954Z", "start_time": "2021-05-19T06:10:09.618166Z" }, "execution": { "iopub.execute_input": "2021-05-19T05:56:22.389484Z", "iopub.status.busy": "2021-05-19T05:56:22.388810Z", "iopub.status.idle": "2021-05-19T05:56:22.656138Z", "shell.execute_reply": "2021-05-19T05:56:22.649598Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(21.712154435976014, 6.817293751703919)\n" ] } ], "source": [ "rmsds = align.alignto(adk_open, # mobile\n", " adk_closed, # reference\n", " select='name CA', # selection to operate on\n", " match_atoms=True) # whether to match atoms\n", "print(rmsds)" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "scrolled": true }, "outputs": [], "source": [ "# aligned_view = nv.show_mdanalysis(mda.Merge(adk_open.atoms, adk_closed.atoms))\n", "# aligned_view" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "
\n", " \n", "![aligned_view](images/aligning_structure_to_another-aligned_view.png)\n", " \n", "
\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "However, you may want to align to a structure that where there is not a clear match in particle mass. For example, you could be aligning the alpha-carbons of an atomistic protein to the backbone beads of a coarse-grained structure. Below, we use the somewhat contrived example of aligning 214 alpha-carbons to the first 214 atoms of the reference structure. In this case, we need to switch `match_atoms=False` or the alignment will error." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(18.991465038265208, 16.603704620787127)\n" ] } ], "source": [ "rmsds = align.alignto(adk_open.select_atoms('name CA'), # mobile\n", " adk_closed.atoms[:214], # reference\n", " select='all', # selection to operate on\n", " match_atoms=False) # whether to match atoms\n", "print(rmsds)" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "# shifted_aligned_view = nv.show_mdanalysis(mda.Merge(adk_open.atoms, adk_closed.atoms))\n", "# shifted_aligned_view" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "
\n", " \n", "![shifted_aligned_view](images/aligning_structure_to_another-shifted_aligned_view.png)\n", " \n", "
\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When we align structures, positions are set temporarily. If we flip to the first frame of `adk_open` and back to the last frame, we can see that it has returned to its original location." ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "ExecuteTime": { "end_time": "2021-05-19T06:10:12.047089Z", "start_time": "2021-05-19T06:10:11.811488Z" }, "execution": { "iopub.execute_input": "2021-05-19T05:56:22.917222Z", "iopub.status.busy": "2021-05-19T05:56:22.916597Z", "iopub.status.idle": "2021-05-19T05:56:23.175025Z", "shell.execute_reply": "2021-05-19T05:56:23.170802Z" } }, "outputs": [ { "data": { "text/plain": [ "< Timestep 101 >" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "adk_open.trajectory[0] # set to first frame\n", "adk_open.trajectory[-1] # set to last frame" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "# reset_view = nv.show_mdanalysis(mda.Merge(adk_open.atoms, adk_closed.atoms))\n", "# reset_view" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "
\n", " \n", "![reset_view](images/aligning_structure_to_another-reset_view.png)\n", " \n", "
\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can save the aligned positions by writing them out to a PDB file and creating a new Universe." ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "ExecuteTime": { "end_time": "2021-05-19T06:10:13.318130Z", "start_time": "2021-05-19T06:10:12.938865Z" }, "execution": { "iopub.execute_input": "2021-05-19T05:56:23.179349Z", "iopub.status.busy": "2021-05-19T05:56:23.178353Z", "iopub.status.idle": "2021-05-19T05:56:23.725915Z", "shell.execute_reply": "2021-05-19T05:56:23.719656Z" } }, "outputs": [], "source": [ "align.alignto(adk_open, adk_closed, select='name CA')\n", "adk_open.atoms.write('aligned.pdb')" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "# from_file_view = nv.show_mdanalysis(mda.Universe('aligned.pdb'))\n", "# from_file_view" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "
\n", " \n", "![from_file_view](images/aligning_structure_to_another-from_file_view.png)\n", " \n", "
\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## References\n", "\n", "[1] Oliver Beckstein, Elizabeth J. Denning, Juan R. Perilla, and Thomas B. Woolf.\n", "Zipping and Unzipping of Adenylate Kinase: Atomistic Insights into the Ensemble of OpenClosed Transitions.\n", "Journal of Molecular Biology, 394(1):160–176, November 2009.\n", "00107.\n", "URL: https://linkinghub.elsevier.com/retrieve/pii/S0022283609011164, doi:10.1016/j.jmb.2009.09.009.\n", "\n", "[2] Richard J. Gowers, Max Linke, Jonathan Barnoud, Tyler J. E. Reddy, Manuel N. Melo, Sean L. Seyler, Jan Domański, David L. Dotson, Sébastien Buchoux, Ian M. Kenney, and Oliver Beckstein.\n", "MDAnalysis: A Python Package for the Rapid Analysis of Molecular Dynamics Simulations.\n", "Proceedings of the 15th Python in Science Conference, pages 98–105, 2016.\n", "00152.\n", "URL: https://conference.scipy.org/proceedings/scipy2016/oliver_beckstein.html, doi:10.25080/Majora-629e541a-00e.\n", "\n", "[3] Pu Liu, Dimitris K. Agrafiotis, and Douglas L. Theobald.\n", "Fast determination of the optimal rotational matrix for macromolecular superpositions.\n", "Journal of Computational Chemistry, pages n/a–n/a, 2009.\n", "URL: http://doi.wiley.com/10.1002/jcc.21439, doi:10.1002/jcc.21439.\n", "\n", "[4] Naveen Michaud-Agrawal, Elizabeth J. Denning, Thomas B. Woolf, and Oliver Beckstein.\n", "MDAnalysis: A toolkit for the analysis of molecular dynamics simulations.\n", "Journal of Computational Chemistry, 32(10):2319–2327, July 2011.\n", "00778.\n", "URL: http://doi.wiley.com/10.1002/jcc.21787, doi:10.1002/jcc.21787.\n", "\n", "[5] Hai Nguyen, David A Case, and Alexander S Rose.\n", "NGLview–interactive molecular graphics for Jupyter notebooks.\n", "Bioinformatics, 34(7):1241–1242, April 2018.\n", "00024.\n", "URL: https://academic.oup.com/bioinformatics/article/34/7/1241/4721781, doi:10.1093/bioinformatics/btx789.\n", "\n", "[6] Douglas L. Theobald.\n", "Rapid calculation of RMSDs using a quaternion-based characteristic polynomial.\n", "Acta Crystallographica Section A Foundations of Crystallography, 61(4):478–480, July 2005.\n", "00127.\n", "URL: http://scripts.iucr.org/cgi-bin/paper?S0108767305015266, doi:10.1107/S0108767305015266." ] } ], "metadata": { "celltoolbar": "Raw Cell Format", "kernelspec": { "display_name": "Python 3.9.15 ('mda-user-guide')", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.15" }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": false, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": {}, "toc_section_display": true, "toc_window_display": true }, "vscode": { "interpreter": { "hash": "7b52aa17ef4e9358c0e91ff3f0bf50d10667bb8b55636d4b9fb967a2ff94bd8c" } } }, "nbformat": 4, "nbformat_minor": 2 }