Trajectories

In MDAnalysis, static data is contained in your universe Topology, while dynamic data is drawn from its trajectory at Universe.trajectory. This is typically loaded from a trajectory file and includes information such as:

  • atom coordinates (Universe.atoms.positions)

  • box size (Universe.dimensions)

  • velocities and forces (if your file format contains the data) (Universe.atoms.velocities)

Although these properties look static, they are actually dynamic, and the data contained within can change. In order to remain memory-efficient, MDAnalysis does not load every frame of your trajectory into memory at once. Instead, a Universe has a state: the particular timestep that it is currently associated with in the trajectory. When the timestep changes, the data in the properties above shifts accordingly.

The typical way to change a timestep is to index it. Universe.trajectory can be thought of as a list of Timesteps, a data structure that holds information for the current time frame. For example, you can query its length.

In [1]: import MDAnalysis as mda

In [2]: from MDAnalysis.tests.datafiles import PSF, DCD

In [3]: u = mda.Universe(PSF, DCD)

In [4]: len(u.trajectory)
Out[4]: 98

When a trajectory is first loaded from a file, it is set to the first frame (with index 0), by default.

In [5]: print(u.trajectory.ts, u.trajectory.time)
< Timestep 0 with unit cell dimensions [ 0.  0.  0. 90. 90. 90.] > 0.9999999119200186

Indexing the trajectory returns the timestep for that frame, and sets the Universe to point to that frame until the timestep next changes.

In [6]: u.trajectory[3]
Out[6]: < Timestep 3 with unit cell dimensions [ 0.  0.  0. 90. 90. 90.] >
In [7]: print('Time of fourth frame', u.trajectory.time)
Time of fourth frame 3.9999996476800743

Many tasks involve applying a function to each frame of a trajectory. For these, you need to iterate through the frames, even if you don’t directly use the timestep. This is because the act of iterating moves the Universe onto the next frame, changing the dynamic atom coordinates.

Trajectories can also be sliced if you only want to work on a subset of frames.

In [8]: protein = u.select_atoms('protein')

In [9]: for ts in u.trajectory[:20:4]:
   ...:     rad = protein.radius_of_gyration()
   ...:     print('frame={}: radgyr={}'.format(ts.frame, rad))
   ...: 
frame=0: radgyr=16.66901836864977
frame=4: radgyr=16.74396089321755
frame=8: radgyr=16.789386458745813
frame=12: radgyr=16.872313632082165
frame=16: radgyr=17.003316543310994

Note that after iterating over the trajectory, the frame is always set back to the first frame, even if your loop stopped before the trajectory end.

In [10]: u.trajectory.frame
Out[10]: 0

Because MDAnalysis will pull trajectory data directly from the file it is reading from, changes to atom coordinates and box dimensions will not persist once the frame is changed. The only way to make these changes permanent is to load the trajectory into memory, or to write a new trajectory to file for every frame. For example, to set a cubic box size for every frame and write it out to a file:

with mda.Writer('with_box.trr', 'w', n_atoms=u.atoms.n_atoms) as w:
    for ts in u.trajectory:
        ts.dimensions = [10, 10, 10, 90, 90, 90]
        w.write(u.atoms)

u_with_box = mda.Universe(PSF, 'with_box.trr')

Sometimes you may wish to only transform part of the trajectory, or to not write a file out. In these cases, MDAnalysis supports “on-the-fly” transformations that are performed on a frame when it is read.