The topology system

MDAnalysis groups static data about a Universe into its topology. This is typically loaded from a topology file. Topology information falls into 3 categories:

Users will almost never interact directly with a Topology. Modifying atom containers or topology attributes is typically done through Universe. Methods for viewing containers or topology attributes, or for calculating topology object values, are accessed through AtomGroup.

Topology attributes

MDAnalysis supports a range of topology attributes for each Atom and AtomGroup. If an attribute is defined for an Atom, it will be for an AtomGroup, and vice versa – however, they are accessed with singular and plural versions of the attribute specifically.

Canonical attributes

These attributes are derived for every Universe, including Universes created with empty(). They encode the MDAnalysis order of each object.

Atom

AtomGroup

Description

index

indices

MDAnalysis canonical atom index (from 0)

resindex

resindices

MDAnalysis canonical residue index (from 0)

segindex

segindices

MDAnalysis segment index (from 0)

The following attributes are read or guessed from every format supported by MDAnalysis.

Atom

AtomGroup

Description

id

ids

atom serial (from 1, except PSF/DMS/TPR formats)

mass

masses

atom mass (guessed, default: 0.0)

resid

resids

residue number (from 1, except for TPR)

resnum

resnums

alias of resid

segid

segids

names of segments (default: ‘SYSTEM’)

type

types

atom name, atom element, or force field atom type

Format-specific attributes

The table below lists attributes that are read from supported formats. These can also be added to a Universe created from a file that does not support them.

Atom

AtomGroup

Description

Supported formats

altLoc

altLocs

Alternate location

MMTF, PARMED, PDB, ENT, PDBQT, XPDB

angles

angles

DATA, GSD, PARMED, PSF, TOP, PRMTOP, PARM7, TPR, XML

atomiccharge

atomiccharges

Atomic number

GMS

atomnum

atomnums

?

DMS

bfactor

bfactors

alias of tempfactor

MMTF

bonds

bonds

DATA, DMS, GSD, MMTF, MOL2, PARMED, PDB, ENT, PSF, TOP, PRMTOP, PARM7, TPR, TXYZ, ARC, XML, XPDB

chainID

chainIDs

chain ID

DMS, PARMED, PDB, ENT, XPDB

charge

charges

partial atomic charge

DATA, DMS, GSD, MMTF, MOL2, PARMED, PDBQT, PQR, PSF, TOP, PRMTOP, PARM7, TPR, XML

chargegroup

chargegroups

cmaps

cmaps

PARMED

dihedrals

dihedrals

DATA, GSD, PARMED, PSF, TOP, PRMTOP, PARM7, TPR, XML

element

elements

atom element

IN, FHIAIMS, PARMED, PDB, ENT, XYZ

epsilon

epsilons

PARMED

epsilon14

epsilon14s

PARMED

gbscreen

gbscreens

PARMED

icode

icodes

atom insertion code

MMTF, PDB, ENT, PDBQT, PQR, XPDB

impropers

impropers

DATA, GSD, PARMED, PSF, TOP, PRMTOP, PARM7, TPR, XML

model

models

model number (from 0)

MMTF

molnum

molnums

[molecules] number (from 0)

TPR

moltype

moltypes

[moleculetype] name

TPR

name

names

atom names

CONFIG, CRD, DMS, GMS, GRO, GSD, HISTORY, IN, FHIAIMS, MMTF, MOL2, PARMED, PDB, ENT, PDBQT, PQR, PSF, TOP, PRMTOP, PARM7, TPR, TXYZ, ARC, XPDB, XYZ

nbindex

nbindices

PARMED

occupancy

occupancies

atom occupancy

MMTF, PARMED, PDB, ENT, PDBQT, XPDB

radius

radii

atomic radius

GSD, PQR, XML

record_type

record_types

ATOM / HETATM

PDB, ENT, PDBQT, PQR, XPDB

resname

resnames

residue name (except GSD has ints)

CRD, DMS, GRO, GSD, MMTF, MOL2, PARMED, PDB, ENT, PDBQT, PQR, PSF, TOP, PRMTOP, PARM7, TPR, XPDB

rmin

rmins

PARMED

rmin14

rmin14s

PARMED

solventradius

solventradii

PARMED

tempfactor

tempfactors

B-factor

CRD, PARMED, PDB, ENT, PDBQT, XPDB

type_index

type_indices

amber atom type number

TOP, PRMTOP, PARM7

ureybradleys

ureybradleys

PARMED

Connectivity information

MDAnalysis can also read connectivity information, if the file provides it. These become available as Topology objects, which have additional functionality.

Atom

AtomGroup

Supported formats

angles

angles

DATA, GSD, PARMED, PSF, TOP, PRMTOP, PARM7, TPR, XML

bonds

bonds

DATA, DMS, GSD, MMTF, MOL2, PARMED, PDB, ENT, PSF, TOP, PRMTOP, PARM7, TPR, TXYZ, ARC, XML, XPDB

dihedrals

dihedrals

DATA, GSD, PARMED, PSF, TOP, PRMTOP, PARM7, TPR, XML

impropers

impropers

DATA, GSD, PARMED, PSF, TOP, PRMTOP, PARM7, TPR, XML

Adding TopologyAttrs

Each of the attributes above can be added to a Universe if it was not available in the file with add_TopologyAttr().

add_TopologyAttr() takes two arguments:

  • topologyattr : the singular or plural name of a TopologyAttr, or a MDAnalysis TopologyAttr object. This must already have been defined as a TopologyAttr (see Adding custom TopologyAttrs for an example of adding a custom topology attribute).

  • values (optional) : if topologyattr is a string, the values for that attribute. This can be None if the attribute has default values defined, e.g. bfactors.

In [1]: import MDAnalysis as mda

In [2]: from MDAnalysis.tests.datafiles import PSF

In [3]: psf = mda.Universe(PSF)

In [4]: hasattr(psf.atoms, 'bfactors')
Out[4]: False

In [5]: psf.add_TopologyAttr('bfactors')

In [6]: psf.atoms.bfactors
Out[6]: array([0., 0., 0., ..., 0., 0., 0.])

One way to modify topology attributes is to simply replace them with add_TopologyAttr():

In [7]: psf.add_TopologyAttr('bfactors', range(len(psf.atoms)))

In [8]: psf.atoms.bfactors
Out[8]: 
array([0.000e+00, 1.000e+00, 2.000e+00, ..., 3.338e+03, 3.339e+03,
       3.340e+03])

The number of values provided should correspond with the “level” of the attribute. For example, B-factors are atomic-level information. However, residue names and residue ids apply to residues. See a table of attribute levels and default values for more information.

Modifying TopologyAttrs

Existing topology attributes can be directly modified by assigning new values.

In [9]: import MDAnalysis as mda

In [10]: from MDAnalysis.tests.datafiles import PDB

In [11]: pdb = mda.Universe(PDB)

In [12]: pdb.residues[:3].resnames
Out[12]: array(['MET', 'ARG', 'ILE'], dtype=object)

In [13]: pdb.residues[:3].resnames = ['RES1', 'RES2', 'RES3']

In [14]: pdb.residues[:3].atoms.resnames
Out[14]: 
array(['RES1', 'RES1', 'RES1', 'RES1', 'RES1', 'RES1', 'RES1', 'RES1',
       'RES1', 'RES1', 'RES1', 'RES1', 'RES1', 'RES1', 'RES1', 'RES1',
       'RES1', 'RES1', 'RES1', 'RES2', 'RES2', 'RES2', 'RES2', 'RES2',
       'RES2', 'RES2', 'RES2', 'RES2', 'RES2', 'RES2', 'RES2', 'RES2',
       'RES2', 'RES2', 'RES2', 'RES2', 'RES2', 'RES2', 'RES2', 'RES2',
       'RES2', 'RES2', 'RES2', 'RES3', 'RES3', 'RES3', 'RES3', 'RES3',
       'RES3', 'RES3', 'RES3', 'RES3', 'RES3', 'RES3', 'RES3', 'RES3',
       'RES3', 'RES3', 'RES3', 'RES3', 'RES3', 'RES3'], dtype=object)

Note

This method cannot be used with connectivity attributes, i.e. bonds, angles, dihedrals, and impropers.

Similarly to adding topology attributes with add_TopologyAttr(), the “level” of the attribute matters. Residue attributes can only be assigned to attributes at the Residue or ResidueGroup level. The same applies to attributes for Atoms and Segments. For example, we would get a NotImplementedError if we tried to assign resnames to an AtomGroup.

In [15]: pdb.residues[0].atoms.resnames = ['new_name']

NotImplementedErrorTraceback (most recent call last)
<ipython-input-15-0f99b0dc5f49> in <module>
----> 1 pdb.residues[0].atoms.resnames = ['new_name']
...
NotImplementedError: Cannot set resnames from AtomGroup. Use 'AtomGroup.residues.resnames = '

Default values and attribute levels

Topology information in MDAnalysis is always associated with a level: one of atom, residue, or segment. For example, indices is Atom information, resindices is Residue information, and segindices is Segment information. Many topology attributes also have default values, so that they can be added to a Universe without providing explicit values, and expected types. The table below lists which attributes have default values, what they are, and the information level.

Atom

AtomGroup

default

level

type

altLocs

altLoc

‘’

atom

<class ‘object’>

angles

angles

No default values

atom

aromaticities

aromaticity

False

atom

<class ‘bool’>

atomiccharges

atomiccharge

No default values

atom

atomnums

atomnum

No default values

atom

bonds

bonds

No default values

atom

chainIDs

chainID

‘’

atom

<class ‘object’>

chargegroups

chargegroup

No default values

atom

charges

charge

0.0

atom

<class ‘float’>

chiralities

chirality

No default values

atom

U1

cmaps

cmaps

No default values

atom

dihedrals

dihedrals

No default values

atom

elements

element

‘’

atom

<class ‘object’>

epsilon14s

epsilon14

0.0

atom

<class ‘float’>

epsilons

epsilon

0.0

atom

<class ‘float’>

gbscreens

gbscreen

0.0

atom

<class ‘float’>

icodes

icode

‘’

residue

<class ‘object’>

ids

id

continuous sequence from 1 to n_atoms

atom

<class ‘int’>

impropers

impropers

No default values

atom

masses

mass

0.0

atom

<class ‘numpy.float64’>

models

model

No default values

segment

molnums

molnum

No default values

residue

<class ‘numpy.int64’>

moltypes

moltype

‘’

residue

<class ‘object’>

names

name

‘’

atom

<class ‘object’>

nbindices

nbindex

0

atom

<class ‘int’>

occupancies

occupancy

0.0

atom

<class ‘float’>

radii

radius

0.0

atom

<class ‘float’>

record_types

record_type

‘ATOM’

atom

<class ‘object’>

resids

resid

continuous sequence from 1 to n_residues

residue

<class ‘int’>

resnames

resname

‘’

residue

<class ‘object’>

resnums

resnum

continuous sequence from 1 to n_residues

residue

<class ‘int’>

rmin14s

rmin14

0.0

atom

<class ‘float’>

rmins

rmin

0.0

atom

<class ‘float’>

segids

segid

‘’

segment

<class ‘object’>

solventradii

solventradius

0.0

atom

<class ‘float’>

tempfactors

tempfactor

0.0

atom

<class ‘float’>

type_indices

type_index

No default values

atom

types

type

‘’

atom

<class ‘object’>

ureybradleys

ureybradleys

No default values

atom

Topology objects

MDAnalysis defines four types of TopologyObject by connectivity:

The value of each topology object can be calculated with value().

Each TopologyObject also contains the following attributes:

  • atoms : the ordered atoms in the object

  • indices : the ordered atom indices in the object

  • type : this is either the ‘type’ of the bond/angle/dihedral/improper, or a tuple of the atom types.

  • is_guessed : MDAnalysis can guess bonds. This property records if the object was read from a file or guessed.

Groups of these are held in TopologyGroups. The master groups of TopologyObjects are accessible as properties of a Universe. TopologyObjects are typically read from a file with connectivity information (see the supported formats here). However, they can be created in two ways: by adding them to a Universe, or by creating them from an AtomGroup. Bonds can be guessed based on distance and Van der Waals’ radii with AtomGroup.guess_bonds.

Adding to a Universe

As of version 0.21.0, there are specific methods for adding TopologyObjects to a Universe:

  • add_Bonds()

  • add_Angles()

  • add_Dihedrals()

  • add_Impropers()

These accept the following values:

Prior to version 0.21.0, objects could be added to a Universe with add_TopologyAttr().

In [15]: hasattr(pdb, 'angles')
Out[15]: False

In [16]: pdb.add_TopologyAttr('angles', [(0, 1, 2), (2, 3, 4)])

In [17]: pdb.angles
Out[17]: <TopologyGroup containing 2 angles>

Both of these methods add the new objects to the associated master TopologyGroup in the Universe.

Creating with an AtomGroup

An AtomGroup can be represented as a bond, angle, dihedral angle, or improper angle TopologyObject through the respective properties:

  • bond

  • angle

  • dihedral

  • improper

The AtomGroup must contain the corresponding number of atoms, in the desired order. For example, a bond cannot be created from three atoms.

In [18]: pdb.atoms[[3, 4, 2]].bond

ValueErrorTraceback (most recent call last)
<ipython-input-21-e59c36ab66f4> in <module>
----> 1 pdb.atoms[[3, 4, 2]].bond
...
ValueError: bond only makes sense for a group with exactly 2 atoms

However, the angle Atom 2 —– Atom 4 —— Atom 3 can be calculated, even if the atoms are not connected with bonds.

In [18]: a = pdb.atoms[[3, 4, 2]].angle

In [19]: print(a.value())
47.770653826924175

These AtomGroup TopologyObjects are not added to the associated master TopologyGroup in the Universe.

Deleting from a Universe

As of version 0.21.0, there are specific methods for deleting TopologyObjects from a Universe:

  • delete_Bonds()

  • delete_Angles()

  • delete_Dihedrals()

  • delete_Impropers()

Topology-specific methods

A number of analysis and transformation methods are defined for AtomGroup, ResidueGroup, and SegmentGroup that require specific properties to be available. The primary requirement is the positions attribute. With positions, you can easily compute a center of geometry:

>>> u.atoms.center_of_geometry()
array([-0.04223882,  0.01418196, -0.03504874])

The following methods all require coordinates.

  • bbox()

  • bsphere()

  • center()

  • center_of_geometry()

  • centroid()

  • pack_into_box()

  • rotate()

  • rotate_by()

  • transform()

  • translate()

  • unwrap()

  • wrap()

Other methods are made available when certain topology attributes are defined in the Universe. These are listed below.

Method

Description

Requires

center_of_charge()

Center of (absolute) charge of (compounds of) the group .. math:: boldsymbol R = frac{sum_i vert q_i vert boldsymbol r_i} {sum_i vert q_i vert} where \(q_i\) is the charge and \(\boldsymbol r_i\) the position of atom \(i\) in the given MDAnalysis.core.groups.AtomGroup

charges

total_charge()

Total charge of (compounds of) the group

charges

align_principal_axis()

Align principal axis with index axis with vector

masses

asphericity()

Asphericity

masses

center_of_mass()

Center of mass of (compounds of) the group .. math:: boldsymbol R = frac{sum_i m_i boldsymbol r_i}{sum m_i} where \(m_i\) is the mass and \(\boldsymbol r_i\) the position of atom \(i\) in the given MDAnalysis.core.groups.AtomGroup

masses

moment_of_inertia()

Moment of inertia tensor relative to center of mass

masses

principal_axes()

Calculate the principal axes from the moment of inertia

masses

radius_of_gyration()

Radius of gyration

masses

shape_parameter()

Shape parameter

masses

total_mass()

Total mass of (compounds of) the group

masses