{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Aligning a structure to another\n",
    "\n",
    "We use `align.alignto` to align a structure to another.\n",
    "\n",
    "**Last updated:** December 2022\n",
    "\n",
    "**Minimum version of MDAnalysis:** 1.0.0\n",
    "\n",
    "**Packages required:**\n",
    "    \n",
    "* MDAnalysis (<a data-cite=\"michaud-agrawal_mdanalysis_2011\" href=\"https://doi.org/10.1002/jcc.21787\">Michaud-Agrawal *et al.*, 2011</a>, <a data-cite=\"gowers_mdanalysis_2016\" href=\"https://doi.org/10.25080/Majora-629e541a-00e\">Gowers *et al.*, 2016</a>)\n",
    "* MDAnalysisTests\n",
    "   \n",
    "**Optional packages for molecular visualisation:**\n",
    "    \n",
    "* [nglview](http://nglviewer.org/nglview/latest) (<a data-cite=\"nguyen_nglviewinteractive_2018\" href=\"https://doi.org/10.1093/bioinformatics/btx789\">Nguyen *et al.*, 2018</a>)\n",
    "\n",
    "Throughout this tutorial we will include cells for visualising Universes with the [NGLView](http://nglviewer.org/nglview/latest/api.html) library. However, these will be commented out, and we will show the expected images generated instead of the interactive widgets.\n",
    "\n",
    "**See also**\n",
    "\n",
    "* [Aligning a trajectory to a frame from another](aligning_trajectory.ipynb)\n",
    "* [Aligning a trajectory to a frame from itself](aligning_trajectory_to_frame.ipynb)\n",
    "* [RMSD](rmsd.ipynb)\n",
    "\n",
    "\n",
    "<div class=\"alert alert-info\">\n",
    "    \n",
    "**Note**\n",
    "\n",
    "MDAnalysis implements RMSD calculation using the fast QCP algorithm (<a data-cite=\"theobald_rapid_2005\" href=\"https://doi.org/10.1107/S0108767305015266\">Theobald, 2005</a>) and a rotation matrix *R* that minimises the RMSD (<a data-cite=\"liu_fast_2009\" href=\"https://doi.org/10.1002/jcc.21439\">Liu *et al.*, 2009</a>). Please cite (<a data-cite=\"theobald_rapid_2005\" href=\"https://doi.org/10.1107/S0108767305015266\">Theobald, 2005</a>) and (<a data-cite=\"liu_fast_2009\" href=\"https://doi.org/10.1002/jcc.21439\">Liu *et al.*, 2009</a>) when using the ``MDAnalysis.analysis.align`` module in published work.\n",
    "\n",
    "</div>\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-05-19T06:10:04.491889Z",
     "start_time": "2021-05-19T06:10:03.591889Z"
    },
    "execution": {
     "iopub.execute_input": "2021-05-19T05:56:20.048242Z",
     "iopub.status.busy": "2021-05-19T05:56:20.047164Z",
     "iopub.status.idle": "2021-05-19T05:56:20.785802Z",
     "shell.execute_reply": "2021-05-19T05:56:20.786179Z"
    },
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "import MDAnalysis as mda\n",
    "from MDAnalysis.analysis import align\n",
    "from MDAnalysis.tests.datafiles import CRD, PSF, DCD, DCD2\n",
    "# import nglview as nv\n",
    "\n",
    "import warnings\n",
    "# suppress some MDAnalysis warnings about writing PDB files\n",
    "warnings.filterwarnings('ignore')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Loading files\n",
    "\n",
    "The test files we will be working with here are trajectories of a adenylate kinase (AdK), a phosophotransferase enzyme. (<a data-cite=\"beckstein_zipping_2009\" href=\"https://doi.org/10.1016/j.jmb.2009.09.009\">Beckstein *et al.*, 2009</a>) The trajectories sample a transition from a closed to an open conformation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-05-19T06:10:05.643507Z",
     "start_time": "2021-05-19T06:10:05.326336Z"
    },
    "execution": {
     "iopub.execute_input": "2021-05-19T05:56:20.789837Z",
     "iopub.status.busy": "2021-05-19T05:56:20.789276Z",
     "iopub.status.idle": "2021-05-19T05:56:21.192752Z",
     "shell.execute_reply": "2021-05-19T05:56:21.187197Z"
    }
   },
   "outputs": [],
   "source": [
    "adk_open = mda.Universe(CRD, DCD2)\n",
    "adk_closed = mda.Universe(PSF, DCD)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "# adk_open_view = nv.show_mdanalysis(adk_open)\n",
    "# adk_open_view"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<center>\n",
    "<div style='width: 800px'>\n",
    "    \n",
    "![adk_open_view](images/aligning_structure_to_another-adk_open_view.gif)\n",
    "    \n",
    "</div>\n",
    "</center>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-05-19T06:10:07.078934Z",
     "start_time": "2021-05-19T06:10:06.312021Z"
    },
    "execution": {
     "iopub.execute_input": "2021-05-19T05:56:21.197706Z",
     "iopub.status.busy": "2021-05-19T05:56:21.196586Z",
     "iopub.status.idle": "2021-05-19T05:56:22.140567Z",
     "shell.execute_reply": "2021-05-19T05:56:22.136583Z"
    },
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# adk_closed_view = nv.show_mdanalysis(adk_closed)\n",
    "# adk_closed_view"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<center>\n",
    "<div style='width: 800px'>\n",
    "    \n",
    "![adk_closed_view](images/aligning_structure_to_another-adk_closed_view.gif)\n",
    "    \n",
    "</div>\n",
    "</center>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Currently, the proteins are not aligned to each other. The difference becomes even more obvious when the closed conformation is compared to the open. Below, we set `adk_open` to the last frame and see the relative positions of each protein in a merged Universe."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "adk_open.trajectory[-1] # last frame\n",
    "merged = mda.Merge(adk_open.atoms, adk_closed.atoms)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-05-19T06:10:08.199887Z",
     "start_time": "2021-05-19T06:10:07.996998Z"
    },
    "execution": {
     "iopub.execute_input": "2021-05-19T05:56:22.146963Z",
     "iopub.status.busy": "2021-05-19T05:56:22.146260Z",
     "iopub.status.idle": "2021-05-19T05:56:22.384558Z",
     "shell.execute_reply": "2021-05-19T05:56:22.382441Z"
    }
   },
   "outputs": [],
   "source": [
    "# merged_view = nv.show_mdanalysis(merged)\n",
    "# merged_view"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<center>\n",
    "<div style='width: 800px'>\n",
    "    \n",
    "![merged_view](images/merged_view.png)\n",
    "    \n",
    "</div>\n",
    "</center>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Aligning a structure with align.alignto"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`alignto` ([API docs](https://docs.mdanalysis.org/stable/documentation_pages/analysis/align.html#MDAnalysis.analysis.align.alignto)) aligns the mobile AtomGroup to the target AtomGroup by minimising the [root mean square deviation (RMSD) between particle positions](rmsd.ipynb) (please see the linked notebook for an explanation of RMSD). It returns *(old_rmsd, new_rmsd)*. By default (`match_atoms=True`), it will attempt to match the atoms between the mobile and reference structures by mass."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-05-19T06:10:09.818954Z",
     "start_time": "2021-05-19T06:10:09.618166Z"
    },
    "execution": {
     "iopub.execute_input": "2021-05-19T05:56:22.389484Z",
     "iopub.status.busy": "2021-05-19T05:56:22.388810Z",
     "iopub.status.idle": "2021-05-19T05:56:22.656138Z",
     "shell.execute_reply": "2021-05-19T05:56:22.649598Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(np.float64(21.712154435976014), 6.817293751703919)\n"
     ]
    }
   ],
   "source": [
    "rmsds = align.alignto(adk_open,  # mobile\n",
    "                      adk_closed,  # reference\n",
    "                      select='name CA', # selection to operate on\n",
    "                      match_atoms=True) # whether to match atoms\n",
    "print(rmsds)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# aligned_view = nv.show_mdanalysis(mda.Merge(adk_open.atoms, adk_closed.atoms))\n",
    "# aligned_view"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<center>\n",
    "<div style='width: 800px'>\n",
    "    \n",
    "![aligned_view](images/aligning_structure_to_another-aligned_view.png)\n",
    "    \n",
    "</div>\n",
    "</center>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "However, you may want to align to a structure that where there is not a clear match in particle mass. For example, you could be aligning the alpha-carbons of an atomistic protein to the backbone beads of a coarse-grained structure. Below, we use the somewhat contrived example of aligning 214 alpha-carbons to the first 214 atoms of the reference structure. In this case, we need to switch `match_atoms=False` or the alignment will error."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(np.float64(18.991465038265208), 16.603704620787127)\n"
     ]
    }
   ],
   "source": [
    "rmsds = align.alignto(adk_open.select_atoms('name CA'),  # mobile\n",
    "                      adk_closed.atoms[:214],  # reference\n",
    "                      select='all', # selection to operate on\n",
    "                      match_atoms=False) # whether to match atoms\n",
    "print(rmsds)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "# shifted_aligned_view = nv.show_mdanalysis(mda.Merge(adk_open.atoms, adk_closed.atoms))\n",
    "# shifted_aligned_view"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<center>\n",
    "<div style='width: 800px'>\n",
    "    \n",
    "![shifted_aligned_view](images/aligning_structure_to_another-shifted_aligned_view.png)\n",
    "    \n",
    "</div>\n",
    "</center>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "When we align structures, positions are set temporarily. If we flip to the first frame of `adk_open` and back to the last frame, we can see that it has returned to its original location."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-05-19T06:10:12.047089Z",
     "start_time": "2021-05-19T06:10:11.811488Z"
    },
    "execution": {
     "iopub.execute_input": "2021-05-19T05:56:22.917222Z",
     "iopub.status.busy": "2021-05-19T05:56:22.916597Z",
     "iopub.status.idle": "2021-05-19T05:56:23.175025Z",
     "shell.execute_reply": "2021-05-19T05:56:23.170802Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "< Timestep 101 >"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "adk_open.trajectory[0] # set to first frame\n",
    "adk_open.trajectory[-1] # set to last frame"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [],
   "source": [
    "# reset_view = nv.show_mdanalysis(mda.Merge(adk_open.atoms, adk_closed.atoms))\n",
    "# reset_view"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<center>\n",
    "<div style='width: 800px'>\n",
    "    \n",
    "![reset_view](images/aligning_structure_to_another-reset_view.png)\n",
    "    \n",
    "</div>\n",
    "</center>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can save the aligned positions by writing them out to a PDB file and creating a new Universe."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-05-19T06:10:13.318130Z",
     "start_time": "2021-05-19T06:10:12.938865Z"
    },
    "execution": {
     "iopub.execute_input": "2021-05-19T05:56:23.179349Z",
     "iopub.status.busy": "2021-05-19T05:56:23.178353Z",
     "iopub.status.idle": "2021-05-19T05:56:23.725915Z",
     "shell.execute_reply": "2021-05-19T05:56:23.719656Z"
    }
   },
   "outputs": [],
   "source": [
    "align.alignto(adk_open, adk_closed, select='name CA')\n",
    "adk_open.atoms.write('aligned.pdb')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [],
   "source": [
    "# from_file_view = nv.show_mdanalysis(mda.Universe('aligned.pdb'))\n",
    "# from_file_view"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<center>\n",
    "<div style='width: 800px'>\n",
    "    \n",
    "![from_file_view](images/aligning_structure_to_another-from_file_view.png)\n",
    "    \n",
    "</div>\n",
    "</center>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## References\n",
    "\n",
    "[1] Oliver Beckstein, Elizabeth&nbsp;J. Denning, Juan&nbsp;R. Perilla, and Thomas&nbsp;B. Woolf.\n",
    "Zipping and <span class=\"bibtex-protected\">Unzipping</span> of <span class=\"bibtex-protected\">Adenylate</span> <span class=\"bibtex-protected\">Kinase</span>: <span class=\"bibtex-protected\">Atomistic</span> <span class=\"bibtex-protected\">Insights</span> into the <span class=\"bibtex-protected\">Ensemble</span> of <span class=\"bibtex-protected\">Open</span>↔<span class=\"bibtex-protected\">Closed</span> <span class=\"bibtex-protected\">Transitions</span>.\n",
    "<em>Journal of Molecular Biology</em>, 394(1):160–176, November 2009.\n",
    "00107.\n",
    "URL: <a href=\"https://linkinghub.elsevier.com/retrieve/pii/S0022283609011164\">https://linkinghub.elsevier.com/retrieve/pii/S0022283609011164</a>, <a href=\"https://doi.org/10.1016/j.jmb.2009.09.009\">doi:10.1016/j.jmb.2009.09.009</a>.\n",
    "\n",
    "[2] Richard&nbsp;J. Gowers, Max Linke, Jonathan Barnoud, Tyler J.&nbsp;E. Reddy, Manuel&nbsp;N. Melo, Sean&nbsp;L. Seyler, Jan Domański, David&nbsp;L. Dotson, Sébastien Buchoux, Ian&nbsp;M. Kenney, and Oliver Beckstein.\n",
    "<span class=\"bibtex-protected\">MDAnalysis</span>: <span class=\"bibtex-protected\">A</span> <span class=\"bibtex-protected\">Python</span> <span class=\"bibtex-protected\">Package</span> for the <span class=\"bibtex-protected\">Rapid</span> <span class=\"bibtex-protected\">Analysis</span> of <span class=\"bibtex-protected\">Molecular</span> <span class=\"bibtex-protected\">Dynamics</span> <span class=\"bibtex-protected\">Simulations</span>.\n",
    "<em>Proceedings of the 15th Python in Science Conference</em>, pages 98–105, 2016.\n",
    "00152.\n",
    "URL: <a href=\"https://conference.scipy.org/proceedings/scipy2016/oliver_beckstein.html\">https://conference.scipy.org/proceedings/scipy2016/oliver_beckstein.html</a>, <a href=\"https://doi.org/10.25080/Majora-629e541a-00e\">doi:10.25080/Majora-629e541a-00e</a>.\n",
    "\n",
    "[3] Pu&nbsp;Liu, Dimitris&nbsp;K. Agrafiotis, and Douglas&nbsp;L. Theobald.\n",
    "Fast determination of the optimal rotational matrix for macromolecular superpositions.\n",
    "<em>Journal of Computational Chemistry</em>, pages n/a–n/a, 2009.\n",
    "URL: <a href=\"http://doi.wiley.com/10.1002/jcc.21439\">http://doi.wiley.com/10.1002/jcc.21439</a>, <a href=\"https://doi.org/10.1002/jcc.21439\">doi:10.1002/jcc.21439</a>.\n",
    "\n",
    "[4] Naveen Michaud-Agrawal, Elizabeth&nbsp;J. Denning, Thomas&nbsp;B. Woolf, and Oliver Beckstein.\n",
    "<span class=\"bibtex-protected\">MDAnalysis</span>: <span class=\"bibtex-protected\">A</span> toolkit for the analysis of molecular dynamics simulations.\n",
    "<em>Journal of Computational Chemistry</em>, 32(10):2319–2327, July 2011.\n",
    "00778.\n",
    "URL: <a href=\"http://doi.wiley.com/10.1002/jcc.21787\">http://doi.wiley.com/10.1002/jcc.21787</a>, <a href=\"https://doi.org/10.1002/jcc.21787\">doi:10.1002/jcc.21787</a>.\n",
    "\n",
    "[5] Hai Nguyen, David&nbsp;A Case, and Alexander&nbsp;S Rose.\n",
    "<span class=\"bibtex-protected\">NGLview</span>–interactive molecular graphics for <span class=\"bibtex-protected\">Jupyter</span> notebooks.\n",
    "<em>Bioinformatics</em>, 34(7):1241–1242, April 2018.\n",
    "00024.\n",
    "URL: <a href=\"https://academic.oup.com/bioinformatics/article/34/7/1241/4721781\">https://academic.oup.com/bioinformatics/article/34/7/1241/4721781</a>, <a href=\"https://doi.org/10.1093/bioinformatics/btx789\">doi:10.1093/bioinformatics/btx789</a>.\n",
    "\n",
    "[6] Douglas&nbsp;L. Theobald.\n",
    "Rapid calculation of <span class=\"bibtex-protected\">RMSDs</span> using a quaternion-based characteristic polynomial.\n",
    "<em>Acta Crystallographica Section A Foundations of Crystallography</em>, 61(4):478–480, July 2005.\n",
    "00127.\n",
    "URL: <a href=\"http://scripts.iucr.org/cgi-bin/paper?S0108767305015266\">http://scripts.iucr.org/cgi-bin/paper?S0108767305015266</a>, <a href=\"https://doi.org/10.1107/S0108767305015266\">doi:10.1107/S0108767305015266</a>."
   ]
  }
 ],
 "metadata": {
  "celltoolbar": "Raw Cell Format",
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.16"
  },
  "toc": {
   "base_numbering": 1,
   "nav_menu": {},
   "number_sections": false,
   "sideBar": true,
   "skip_h1_title": false,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": false,
   "toc_position": {},
   "toc_section_display": true,
   "toc_window_display": true
  },
  "vscode": {
   "interpreter": {
    "hash": "7b52aa17ef4e9358c0e91ff3f0bf50d10667bb8b55636d4b9fb967a2ff94bd8c"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}