{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Aligning a structure to another\n", "\n", "We use `align.alignto` to align a structure to another.\n", "\n", "**Last updated:** December 2022\n", "\n", "**Minimum version of MDAnalysis:** 1.0.0\n", "\n", "**Packages required:**\n", " \n", "* MDAnalysis (<a data-cite=\"michaud-agrawal_mdanalysis_2011\" href=\"https://doi.org/10.1002/jcc.21787\">Michaud-Agrawal *et al.*, 2011</a>, <a data-cite=\"gowers_mdanalysis_2016\" href=\"https://doi.org/10.25080/Majora-629e541a-00e\">Gowers *et al.*, 2016</a>)\n", "* MDAnalysisTests\n", " \n", "**Optional packages for molecular visualisation:**\n", " \n", "* [nglview](http://nglviewer.org/nglview/latest) (<a data-cite=\"nguyen_nglviewinteractive_2018\" href=\"https://doi.org/10.1093/bioinformatics/btx789\">Nguyen *et al.*, 2018</a>)\n", "\n", "Throughout this tutorial we will include cells for visualising Universes with the [NGLView](http://nglviewer.org/nglview/latest/api.html) library. However, these will be commented out, and we will show the expected images generated instead of the interactive widgets.\n", "\n", "**See also**\n", "\n", "* [Aligning a trajectory to a frame from another](aligning_trajectory.ipynb)\n", "* [Aligning a trajectory to a frame from itself](aligning_trajectory_to_frame.ipynb)\n", "* [RMSD](rmsd.ipynb)\n", "\n", "\n", "<div class=\"alert alert-info\">\n", " \n", "**Note**\n", "\n", "MDAnalysis implements RMSD calculation using the fast QCP algorithm (<a data-cite=\"theobald_rapid_2005\" href=\"https://doi.org/10.1107/S0108767305015266\">Theobald, 2005</a>) and a rotation matrix *R* that minimises the RMSD (<a data-cite=\"liu_fast_2009\" href=\"https://doi.org/10.1002/jcc.21439\">Liu *et al.*, 2009</a>). Please cite (<a data-cite=\"theobald_rapid_2005\" href=\"https://doi.org/10.1107/S0108767305015266\">Theobald, 2005</a>) and (<a data-cite=\"liu_fast_2009\" href=\"https://doi.org/10.1002/jcc.21439\">Liu *et al.*, 2009</a>) when using the ``MDAnalysis.analysis.align`` module in published work.\n", "\n", "</div>\n", "\n" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "ExecuteTime": { "end_time": "2021-05-19T06:10:04.491889Z", "start_time": "2021-05-19T06:10:03.591889Z" }, "execution": { "iopub.execute_input": "2021-05-19T05:56:20.048242Z", "iopub.status.busy": "2021-05-19T05:56:20.047164Z", "iopub.status.idle": "2021-05-19T05:56:20.785802Z", "shell.execute_reply": "2021-05-19T05:56:20.786179Z" }, "scrolled": false }, "outputs": [], "source": [ "import MDAnalysis as mda\n", "from MDAnalysis.analysis import align\n", "from MDAnalysis.tests.datafiles import CRD, PSF, DCD, DCD2\n", "# import nglview as nv\n", "\n", "import warnings\n", "# suppress some MDAnalysis warnings about writing PDB files\n", "warnings.filterwarnings('ignore')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Loading files\n", "\n", "The test files we will be working with here are trajectories of a adenylate kinase (AdK), a phosophotransferase enzyme. (<a data-cite=\"beckstein_zipping_2009\" href=\"https://doi.org/10.1016/j.jmb.2009.09.009\">Beckstein *et al.*, 2009</a>) The trajectories sample a transition from a closed to an open conformation." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "ExecuteTime": { "end_time": "2021-05-19T06:10:05.643507Z", "start_time": "2021-05-19T06:10:05.326336Z" }, "execution": { "iopub.execute_input": "2021-05-19T05:56:20.789837Z", "iopub.status.busy": "2021-05-19T05:56:20.789276Z", "iopub.status.idle": "2021-05-19T05:56:21.192752Z", "shell.execute_reply": "2021-05-19T05:56:21.187197Z" } }, "outputs": [], "source": [ "adk_open = mda.Universe(CRD, DCD2)\n", "adk_closed = mda.Universe(PSF, DCD)" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "# adk_open_view = nv.show_mdanalysis(adk_open)\n", "# adk_open_view" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<center>\n", "<div style='width: 800px'>\n", " \n", "\n", " \n", "</div>\n", "</center>" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "ExecuteTime": { "end_time": "2021-05-19T06:10:07.078934Z", "start_time": "2021-05-19T06:10:06.312021Z" }, "execution": { "iopub.execute_input": "2021-05-19T05:56:21.197706Z", "iopub.status.busy": "2021-05-19T05:56:21.196586Z", "iopub.status.idle": "2021-05-19T05:56:22.140567Z", "shell.execute_reply": "2021-05-19T05:56:22.136583Z" }, "scrolled": true }, "outputs": [], "source": [ "# adk_closed_view = nv.show_mdanalysis(adk_closed)\n", "# adk_closed_view" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<center>\n", "<div style='width: 800px'>\n", " \n", "\n", " \n", "</div>\n", "</center>" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Currently, the proteins are not aligned to each other. The difference becomes even more obvious when the closed conformation is compared to the open. Below, we set `adk_open` to the last frame and see the relative positions of each protein in a merged Universe." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "adk_open.trajectory[-1] # last frame\n", "merged = mda.Merge(adk_open.atoms, adk_closed.atoms)" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "ExecuteTime": { "end_time": "2021-05-19T06:10:08.199887Z", "start_time": "2021-05-19T06:10:07.996998Z" }, "execution": { "iopub.execute_input": "2021-05-19T05:56:22.146963Z", "iopub.status.busy": "2021-05-19T05:56:22.146260Z", "iopub.status.idle": "2021-05-19T05:56:22.384558Z", "shell.execute_reply": "2021-05-19T05:56:22.382441Z" } }, "outputs": [], "source": [ "# merged_view = nv.show_mdanalysis(merged)\n", "# merged_view" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<center>\n", "<div style='width: 800px'>\n", " \n", "\n", " \n", "</div>\n", "</center>" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Aligning a structure with align.alignto" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`alignto` ([API docs](https://docs.mdanalysis.org/stable/documentation_pages/analysis/align.html#MDAnalysis.analysis.align.alignto)) aligns the mobile AtomGroup to the target AtomGroup by minimising the [root mean square deviation (RMSD) between particle positions](rmsd.ipynb) (please see the linked notebook for an explanation of RMSD). It returns *(old_rmsd, new_rmsd)*. By default (`match_atoms=True`), it will attempt to match the atoms between the mobile and reference structures by mass." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "ExecuteTime": { "end_time": "2021-05-19T06:10:09.818954Z", "start_time": "2021-05-19T06:10:09.618166Z" }, "execution": { "iopub.execute_input": "2021-05-19T05:56:22.389484Z", "iopub.status.busy": "2021-05-19T05:56:22.388810Z", "iopub.status.idle": "2021-05-19T05:56:22.656138Z", "shell.execute_reply": "2021-05-19T05:56:22.649598Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(np.float64(21.712154435976014), 6.817293751703919)\n" ] } ], "source": [ "rmsds = align.alignto(adk_open, # mobile\n", " adk_closed, # reference\n", " select='name CA', # selection to operate on\n", " match_atoms=True) # whether to match atoms\n", "print(rmsds)" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "scrolled": true }, "outputs": [], "source": [ "# aligned_view = nv.show_mdanalysis(mda.Merge(adk_open.atoms, adk_closed.atoms))\n", "# aligned_view" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<center>\n", "<div style='width: 800px'>\n", " \n", "\n", " \n", "</div>\n", "</center>" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "However, you may want to align to a structure that where there is not a clear match in particle mass. For example, you could be aligning the alpha-carbons of an atomistic protein to the backbone beads of a coarse-grained structure. Below, we use the somewhat contrived example of aligning 214 alpha-carbons to the first 214 atoms of the reference structure. In this case, we need to switch `match_atoms=False` or the alignment will error." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(np.float64(18.991465038265208), 16.603704620787127)\n" ] } ], "source": [ "rmsds = align.alignto(adk_open.select_atoms('name CA'), # mobile\n", " adk_closed.atoms[:214], # reference\n", " select='all', # selection to operate on\n", " match_atoms=False) # whether to match atoms\n", "print(rmsds)" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "# shifted_aligned_view = nv.show_mdanalysis(mda.Merge(adk_open.atoms, adk_closed.atoms))\n", "# shifted_aligned_view" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<center>\n", "<div style='width: 800px'>\n", " \n", "\n", " \n", "</div>\n", "</center>" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When we align structures, positions are set temporarily. If we flip to the first frame of `adk_open` and back to the last frame, we can see that it has returned to its original location." ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "ExecuteTime": { "end_time": "2021-05-19T06:10:12.047089Z", "start_time": "2021-05-19T06:10:11.811488Z" }, "execution": { "iopub.execute_input": "2021-05-19T05:56:22.917222Z", "iopub.status.busy": "2021-05-19T05:56:22.916597Z", "iopub.status.idle": "2021-05-19T05:56:23.175025Z", "shell.execute_reply": "2021-05-19T05:56:23.170802Z" } }, "outputs": [ { "data": { "text/plain": [ "< Timestep 101 >" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "adk_open.trajectory[0] # set to first frame\n", "adk_open.trajectory[-1] # set to last frame" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "# reset_view = nv.show_mdanalysis(mda.Merge(adk_open.atoms, adk_closed.atoms))\n", "# reset_view" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<center>\n", "<div style='width: 800px'>\n", " \n", "\n", " \n", "</div>\n", "</center>" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can save the aligned positions by writing them out to a PDB file and creating a new Universe." ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "ExecuteTime": { "end_time": "2021-05-19T06:10:13.318130Z", "start_time": "2021-05-19T06:10:12.938865Z" }, "execution": { "iopub.execute_input": "2021-05-19T05:56:23.179349Z", "iopub.status.busy": "2021-05-19T05:56:23.178353Z", "iopub.status.idle": "2021-05-19T05:56:23.725915Z", "shell.execute_reply": "2021-05-19T05:56:23.719656Z" } }, "outputs": [], "source": [ "align.alignto(adk_open, adk_closed, select='name CA')\n", "adk_open.atoms.write('aligned.pdb')" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "# from_file_view = nv.show_mdanalysis(mda.Universe('aligned.pdb'))\n", "# from_file_view" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<center>\n", "<div style='width: 800px'>\n", " \n", "\n", " \n", "</div>\n", "</center>" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## References\n", "\n", "[1] Oliver Beckstein, Elizabeth J. Denning, Juan R. Perilla, and Thomas B. Woolf.\n", "Zipping and <span class=\"bibtex-protected\">Unzipping</span> of <span class=\"bibtex-protected\">Adenylate</span> <span class=\"bibtex-protected\">Kinase</span>: <span class=\"bibtex-protected\">Atomistic</span> <span class=\"bibtex-protected\">Insights</span> into the <span class=\"bibtex-protected\">Ensemble</span> of <span class=\"bibtex-protected\">Open</span>↔<span class=\"bibtex-protected\">Closed</span> <span class=\"bibtex-protected\">Transitions</span>.\n", "<em>Journal of Molecular Biology</em>, 394(1):160–176, November 2009.\n", "00107.\n", "URL: <a href=\"https://linkinghub.elsevier.com/retrieve/pii/S0022283609011164\">https://linkinghub.elsevier.com/retrieve/pii/S0022283609011164</a>, <a href=\"https://doi.org/10.1016/j.jmb.2009.09.009\">doi:10.1016/j.jmb.2009.09.009</a>.\n", "\n", "[2] Richard J. Gowers, Max Linke, Jonathan Barnoud, Tyler J. E. Reddy, Manuel N. Melo, Sean L. Seyler, Jan Domański, David L. Dotson, Sébastien Buchoux, Ian M. Kenney, and Oliver Beckstein.\n", "<span class=\"bibtex-protected\">MDAnalysis</span>: <span class=\"bibtex-protected\">A</span> <span class=\"bibtex-protected\">Python</span> <span class=\"bibtex-protected\">Package</span> for the <span class=\"bibtex-protected\">Rapid</span> <span class=\"bibtex-protected\">Analysis</span> of <span class=\"bibtex-protected\">Molecular</span> <span class=\"bibtex-protected\">Dynamics</span> <span class=\"bibtex-protected\">Simulations</span>.\n", "<em>Proceedings of the 15th Python in Science Conference</em>, pages 98–105, 2016.\n", "00152.\n", "URL: <a href=\"https://conference.scipy.org/proceedings/scipy2016/oliver_beckstein.html\">https://conference.scipy.org/proceedings/scipy2016/oliver_beckstein.html</a>, <a href=\"https://doi.org/10.25080/Majora-629e541a-00e\">doi:10.25080/Majora-629e541a-00e</a>.\n", "\n", "[3] Pu Liu, Dimitris K. Agrafiotis, and Douglas L. Theobald.\n", "Fast determination of the optimal rotational matrix for macromolecular superpositions.\n", "<em>Journal of Computational Chemistry</em>, pages n/a–n/a, 2009.\n", "URL: <a href=\"http://doi.wiley.com/10.1002/jcc.21439\">http://doi.wiley.com/10.1002/jcc.21439</a>, <a href=\"https://doi.org/10.1002/jcc.21439\">doi:10.1002/jcc.21439</a>.\n", "\n", "[4] Naveen Michaud-Agrawal, Elizabeth J. Denning, Thomas B. Woolf, and Oliver Beckstein.\n", "<span class=\"bibtex-protected\">MDAnalysis</span>: <span class=\"bibtex-protected\">A</span> toolkit for the analysis of molecular dynamics simulations.\n", "<em>Journal of Computational Chemistry</em>, 32(10):2319–2327, July 2011.\n", "00778.\n", "URL: <a href=\"http://doi.wiley.com/10.1002/jcc.21787\">http://doi.wiley.com/10.1002/jcc.21787</a>, <a href=\"https://doi.org/10.1002/jcc.21787\">doi:10.1002/jcc.21787</a>.\n", "\n", "[5] Hai Nguyen, David A Case, and Alexander S Rose.\n", "<span class=\"bibtex-protected\">NGLview</span>–interactive molecular graphics for <span class=\"bibtex-protected\">Jupyter</span> notebooks.\n", "<em>Bioinformatics</em>, 34(7):1241–1242, April 2018.\n", "00024.\n", "URL: <a href=\"https://academic.oup.com/bioinformatics/article/34/7/1241/4721781\">https://academic.oup.com/bioinformatics/article/34/7/1241/4721781</a>, <a href=\"https://doi.org/10.1093/bioinformatics/btx789\">doi:10.1093/bioinformatics/btx789</a>.\n", "\n", "[6] Douglas L. Theobald.\n", "Rapid calculation of <span class=\"bibtex-protected\">RMSDs</span> using a quaternion-based characteristic polynomial.\n", "<em>Acta Crystallographica Section A Foundations of Crystallography</em>, 61(4):478–480, July 2005.\n", "00127.\n", "URL: <a href=\"http://scripts.iucr.org/cgi-bin/paper?S0108767305015266\">http://scripts.iucr.org/cgi-bin/paper?S0108767305015266</a>, <a href=\"https://doi.org/10.1107/S0108767305015266\">doi:10.1107/S0108767305015266</a>." ] } ], "metadata": { "celltoolbar": "Raw Cell Format", "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.16" }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": false, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": {}, "toc_section_display": true, "toc_window_display": true }, "vscode": { "interpreter": { "hash": "7b52aa17ef4e9358c0e91ff3f0bf50d10667bb8b55636d4b9fb967a2ff94bd8c" } } }, "nbformat": 4, "nbformat_minor": 2 }