{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Using ParmEd with MDAnalysis and OpenMM to simulate a selection of atoms\n", "\n", "Here we use MDAnalysis to convert a ParmEd structure to an MDAnalysis Universe, select a subset of atoms, and convert it back to ParmEd to simulate with OpenMM.\n", "\n", "**Last updated:** December 2022 with MDAnalysis 2.4.0-dev0\n", "\n", "**Last updated:** December 2022\n", "\n", "**Minimum version of MDAnalysis:** 1.0.0\n", "\n", "**Packages required:**\n", " \n", "* MDAnalysis [[1, 2]](#References)\n", "* MDAnalysisTests\n", "* [ParmEd](http://parmed.github.io/ParmEd/html/index.html)\n", "* [OpenMM](http://openmm.org) [[3]](#References)\n" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "ExecuteTime": { "end_time": "2020-12-27T11:14:15.046111Z", "start_time": "2020-12-27T11:14:14.063236Z" } }, "outputs": [], "source": [ "import parmed as pmd\n", "import MDAnalysis as mda\n", "from MDAnalysis.tests.datafiles import PRM7_ala2, RST7_ala2\n", "\n", "import warnings\n", "# suppress some MDAnalysis warnings when writing PDB files\n", "warnings.filterwarnings('ignore')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Loading files: the difference between ParmEd and MDAnalysis" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Both ParmEd and MDAnalysis read a number of file formats. However, while MDAnalysis is typically used to analyse simulations, ParmEd is often used to set them up. This requires ParmEd to read topology parameter information that MDAnalysis typically ignores, such as the equilibrium length and force constants of bonds in the system. For example, the ParmEd structure below." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "ExecuteTime": { "end_time": "2020-12-27T11:14:15.724236Z", "start_time": "2020-12-27T11:14:15.621676Z" } }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pprm = pmd.load_file(PRM7_ala2, RST7_ala2)\n", "pprm" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "ExecuteTime": { "end_time": "2020-12-27T11:14:16.212194Z", "start_time": "2020-12-27T11:14:16.208658Z" } }, "outputs": [ { "data": { "text/plain": [ "--; type=>" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pprm.bonds[0]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When MDAnalysis reads these files in, it does not include that information." ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "ExecuteTime": { "end_time": "2020-12-27T11:14:17.242414Z", "start_time": "2020-12-27T11:14:17.202841Z" } }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mprm = mda.Universe(PRM7_ala2, RST7_ala2, format='RESTRT')\n", "mprm" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The bond type simply shows the atom types involved in the connection." ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "ExecuteTime": { "end_time": "2020-12-27T11:14:18.176132Z", "start_time": "2020-12-27T11:14:18.164195Z" } }, "outputs": [ { "data": { "text/plain": [ "('N3', 'H')" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mprm.atoms.bonds[0].type" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you then convert this Universe to ParmEd, you can see that the resulting Structure is not `parametrized`." ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "ExecuteTime": { "end_time": "2020-12-27T11:14:22.378889Z", "start_time": "2020-12-27T11:14:19.116079Z" } }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mprm_converted = mprm.atoms.convert_to('PARMED')\n", "mprm_converted" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "While the bonds are present, there is no `type` information associated." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "ExecuteTime": { "end_time": "2020-12-27T11:14:22.384699Z", "start_time": "2020-12-27T11:14:22.380604Z" }, "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "--; type=None>" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mprm_converted.bonds[0]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Therefore, if you wish to use ParmEd functionality that requires parametrization on a MDAnalysis Universe, you need to create that Universe *from* a ParmEd structure in order to convert it *back to* something useable in ParmEd." ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "ExecuteTime": { "end_time": "2020-12-27T11:13:42.628191Z", "start_time": "2020-12-27T11:13:42.542442Z" } }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mprm_from_parmed = mda.Universe(pprm)\n", "mprm_from_parmed" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now the bond type is actually a ParmEd Bond object." ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "ExecuteTime": { "end_time": "2020-12-27T11:13:43.364679Z", "start_time": "2020-12-27T11:13:43.341629Z" }, "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "--; type=>" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mprm_from_parmed.bonds[0].type" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Using MDAnalysis to select atoms" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "One reason we might want to convert a ParmEd structure into MDAnalysis is to use its sophisticated [atom selection syntax](https://www.mdanalysis.org/UserGuide/selections.html). While ParmEd has its [own ways to select atoms](https://parmed.github.io/ParmEd/html/structure.html#structure-manipulation-slicing-combining-replicating-and-splitting), MDAnalysis allows you to select atoms based on geometric distance." ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "ExecuteTime": { "end_time": "2020-12-27T11:13:44.643467Z", "start_time": "2020-12-27T11:13:44.623799Z" } }, "outputs": [], "source": [ "water = mprm_from_parmed.select_atoms('around 5 protein').residues.atoms\n", "protein_shell = mprm_from_parmed.select_atoms('protein') + water\n", "prm_protein_shell = protein_shell.convert_to('PARMED')" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "prm_protein_shell" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Using ParmEd and OpenMM to create a simulation system" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "import sys\n", "import openmm as mm\n", "import openmm.app as app\n", "from parmed import unit as u\n", "from parmed.openmm import StateDataReporter, MdcrdReporter" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can create an OpenMM simulation system directly from a ParmEd structure, providing that it is parametrized." ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [], "source": [ "system = prm_protein_shell.createSystem(nonbondedMethod=app.NoCutoff,\n", " constraints=app.HBonds, \n", " implicitSolvent=app.GBn2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here we set the integrator to do Langevin dynamics." ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "integrator = mm.LangevinIntegrator(\n", " 300*u.kelvin, # Temperature of heat bath\n", " 1.0/u.picoseconds, # Friction coefficient\n", " 2.0*u.femtoseconds, # Time step\n", " )" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We create the Simulation object and set particle positions." ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "sim = app.Simulation(prm_protein_shell.topology, system, integrator)\n", "sim.context.setPositions(prm_protein_shell.positions)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We now minimise the energy." ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [], "source": [ "sim.minimizeEnergy(maxIterations=500)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The reporter below reports energies and coordinates every 100 steps to stdout, but every 10 steps to the `ala2_shell.nc` file." ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [], "source": [ "sim.reporters.append(\n", " StateDataReporter(sys.stdout, 100, step=True, potentialEnergy=True,\n", " kineticEnergy=True, temperature=True, volume=True,\n", " density=True)\n", ")\n", "sim.reporters.append(MdcrdReporter('ala2_shell.trj', 10, crds=True))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can run dynamics for 500 steps (1 picosecond)." ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "tags": [ "nbval-ignore-output" ] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "#\"Step\",\"Time (ps)\",\"Potential Energy (kilocalorie/mole)\",\"Kinetic Energy (kilocalorie/mole)\",\"Total Energy (kilocalorie/mole)\",\"Temperature (K)\",\"Box Volume (angstrom**3)\",\"Density (gram/(item*milliliter))\"\n", "100,0.20000000000000015,-623.6779995219885,20.140631869613383,-603.5373676523751,63.74314071570579,45325.8064191062,0.034909350700361955\n", "200,0.4000000000000003,-614.1849904397706,38.40737137186695,-575.7776190679035,121.55559436896063,45325.8064191062,0.034909350700361955\n", "300,0.6000000000000004,-606.5526783580306,48.61919832973248,-557.933480028298,153.87503334950912,45325.8064191062,0.034909350700361955\n", "400,0.8000000000000006,-600.0374380078872,57.988937528818184,-542.0485004790689,183.52934648642113,45325.8064191062,0.034909350700361955\n", "500,1.0000000000000007,-603.2854886173518,79.46589388029852,-523.8195947370533,251.50182419815255,45325.8064191062,0.034909350700361955\n" ] } ], "source": [ "sim.step(500)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we write a topology file out from our former `protein_shell` atomgroup, we can load the trajectory in for further analysis." ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "scrolled": true }, "outputs": [], "source": [ "protein_shell.write('ala2_shell.pdb')" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [], "source": [ "u = mda.Universe('ala2_shell.pdb', 'ala2_shell.trj')" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "u.trajectory" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## References" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[1] R. J. Gowers, M. Linke, J. Barnoud, T. J. E. Reddy, M. N. Melo, S. L. Seyler, D. L. Dotson, J. Domanski, S. Buchoux, I. M. Kenney, and O. Beckstein. [MDAnalysis: A Python package for the rapid analysis of molecular dynamics simulations](http://conference.scipy.org/proceedings/scipy2016/oliver_beckstein.html). In S. Benthall and S. Rostrup, editors, *Proceedings of the 15th Python in Science Conference*, pages 98-105, Austin, TX, 2016. SciPy, doi: [10.25080/majora-629e541a-00e](https://doi.org/10.25080/majora-629e541a-00e).\n", "\n", "[2] N. Michaud-Agrawal, E. J. Denning, T. B. Woolf, and O. Beckstein. MDAnalysis: A Toolkit for the Analysis of Molecular Dynamics Simulations. *J. Comput. Chem*. 32 (2011), 2319-2327, [doi:10.1002/jcc.21787](https://dx.doi.org/10.1002/jcc.21787). PMCID:[PMC3144279](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3144279/)\n", "\n", "[3] Peter Eastman, Jason Swails, John D. Chodera, Robert T. McGibbon, Yutong Zhao, Kyle A. Beauchamp, Lee-Ping Wang, Andrew C. Simmonett, Matthew P. Harrigan, Chaya D. Stern, Rafal P. Wiewiora, Bernard R. Brooks, Vijay S. Pande. OpenMM 7: Rapid Development of High Performance Algorithms for Molecular Dynamics. *PLoS Comput. Biol.* 13:e1005659, 2017.\n", "\n", "[4] Hai Nguyen, David A Case, Alexander S Rose. NGLview - Interactive molecular graphics for Jupyter notebooks. *Bioinformatics*. 34 (2018), 1241–1242, [doi:10.1093/bioinformatics/btx789](https://doi.org/10.1093/bioinformatics/btx789)\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3.8.13 ('mda-user-guide')", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.6" }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": false, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": {}, "toc_section_display": true, "toc_window_display": false }, "vscode": { "interpreter": { "hash": "4674db5753660d730fe8f86bd51b34e795360b5e8cfad0948ce63ba2d64a0720" } } }, "nbformat": 4, "nbformat_minor": 2 }