Auxiliary files¶
Auxiliary readers allow you to read in timeseries data accompanying a trajectory, that is not stored in the regular trajectory file.
Supported formats¶
Reader
Format
Extension (if file)
Remarks
XVGReader
XVG
xvg
(default)
Produced by Gromacs during simulation or analysis. Reads full file on initialisation.
XVGFileReader
XVG-F
xvg
Alternate xvg file reader, reading each step from the file in turn for a lower memory footprint.
XVGReader
is the default reader for .xvg files.
Reading data directly¶
In [1]: import MDAnalysis as mda
In [2]: from MDAnalysis.tests.datafiles import XVG_BZ2 # cobrotoxin protein forces
In [3]: aux = mda.auxiliary.core.auxreader(XVG_BZ2)
In [4]: aux
Out[4]: <MDAnalysis.auxiliary.XVG.XVGReader at 0x7fe68cf50710>
In stand-alone use, an auxiliary reader allows you to iterate over each step in a set of auxiliary data.
In [5]: for step in aux:
...: print(step.data)
...:
[ 0. 200.71288 -1552.2849 ... 128.4072 1386.0378
-2699.3118 ]
[ 50. -1082.6454 -658.32166 ... -493.02954 589.8844
-739.2124 ]
[ 100. -246.27269 146.52911 ... 484.32501 2332.3767
-1801.6234 ]
Use slicing to skip steps.
In [6]: for step in aux[1:2]:
...: print(step.time)
...:
50.0
The auxreader()
function uses the get_auxreader_for()
to return an appropriate class. This can guess the format either from a filename, ‘
In [7]: mda.auxiliary.core.get_auxreader_for(XVG_BZ2)
Out[7]: MDAnalysis.auxiliary.XVG.XVGReader
or return the reader for a specified format.
In [8]: mda.auxiliary.core.get_auxreader_for(format='XVG-F')
Out[8]: MDAnalysis.auxiliary.XVG.XVGFileReader
Loading data into a Universe¶
Auxiliary data may be added to a trajectory Reader through the
add_auxiliary()
method. Auxiliary data
may be passed in as a AuxReader instance, or directly as e.g. a filename, in
which case get_auxreader_for()
is used to
guess an appropriate reader.
In [9]: from MDAnalysis.tests.datafiles import PDB_xvf, TRR_xvf
In [10]: u = mda.Universe(PDB_xvf, TRR_xvf)
In [11]: u.trajectory.add_auxiliary('protein_force', XVG_BZ2)
In [12]: for ts in u.trajectory:
....: print(ts.aux.protein_force)
....:
[ 0. 200.71288 -1552.2849 ... 128.4072 1386.0378
-2699.3118 ]
[ 50. -1082.6454 -658.32166 ... -493.02954 589.8844
-739.2124 ]
[ 100. -246.27269 146.52911 ... 484.32501 2332.3767
-1801.6234 ]
Passing arguments to auxiliary data¶
For alignment with trajectory data, auxiliary readers provide methods to assign each auxiliary step to the nearest trajectory timestep, read all steps assigned to a trajectory timestep and calculate ‘representative’ value(s) of the auxiliary data for that timestep.
To set a timestep or ??
‘Assignment’ of auxiliary steps to trajectory timesteps is determined from the time
of the auxiliary step, dt
of the trajectory and time at the first frame of the
trajectory. If there are no auxiliary steps assigned to a given timestep (or none within
cutoff
, if set), the representative value(s) are set to np.nan
.
Iterating over auxiliary data¶
Auxiliary data may not perfectly line up with the trajectory, or have missing data.
In [13]: from MDAnalysis.tests.datafiles import PDB, TRR
In [14]: u_long = mda.Universe(PDB, TRR)
In [15]: u_long.trajectory.add_auxiliary('protein_force', XVG_BZ2, dt=200)
In [16]: for ts in u_long.trajectory:
....: print(ts.time, ts.aux.protein_force[:4])
....:
0.0 [ 0. 200.71288 -1552.2849 -967.21124]
100.00000762939453 [ 100. -246.27269 146.52911 -1084.2484 ]
200.00001525878906 [nan nan nan nan]
300.0 [nan nan nan nan]
400.0000305175781 [nan nan nan nan]
500.0000305175781 [nan nan nan nan]
600.0 [nan nan nan nan]
700.0000610351562 [nan nan nan nan]
800.0000610351562 [nan nan nan nan]
900.0000610351562 [nan nan nan nan]
The trajectory ProtoReader
methods
next_as_aux()
and
iter_as_aux()
allow for movement
through only trajectory timesteps for which auxiliary data is available.
In [17]: for ts in u_long.trajectory.iter_as_aux('protein_force'):
....: print(ts.time, ts.aux.protein_force[:4])
....:
0.0 [ 0. 200.71288 -1552.2849 -967.21124]
100.00000762939453 [ 100. -246.27269 146.52911 -1084.2484 ]
This may be used to
avoid representative values set to np.nan
, particularly when auxiliary data
is less frequent.
Sometimes the auxiliary data is longer than the trajectory.
In [18]: u_short = mda.Universe(PDB)
In [19]: u_short.trajectory.add_auxiliary('protein_force', XVG_BZ2)
In [20]: for ts in u_short.trajectory:
....: print(ts.time, ts.aux.protein_force)
....:
0.0 [ 0. 200.71288 -1552.2849 ... 128.4072 1386.0378
-2699.3118 ]
In order to acess auxiliary values at every individual step, including those
outside the time range of the trajectory,
iter_auxiliary()
allows iteration
over the auxiliary independent of the trajectory.
In [21]: for step in u_short.trajectory.iter_auxiliary('protein_force'):
....: print(step.data)
....:
[ 0. 200.71288 -1552.2849 ... 128.4072 1386.0378
-2699.3118 ]
[ 50. -1082.6454 -658.32166 ... -493.02954 589.8844
-739.2124 ]
[ 100. -246.27269 146.52911 ... 484.32501 2332.3767
-1801.6234 ]
To iterate over only a certain section of the auxiliary:
In [22]: for step in u_short.trajectory.iter_auxiliary('protein_force', start=1, step=2):
....: print(step.time)
....:
50.0
The trajectory remains unchanged, and the auxiliary will be returned to the current timestep after iteration is complete.
Accessing auxiliary attributes¶
To check the values of attributes of an added auxiliary, use
get_aux_attribute()
.
In [23]: u.trajectory.get_aux_attribute('protein_force', 'dt')
Out[23]: 50.0
If attributes are settable, they can be changed using
set_aux_attribute()
.
In [24]: u.trajectory.set_aux_attribute('protein_force', 'data_selector', [1])
The auxiliary may be renamed using set_aux_attribute
with ‘auxname’, or more
directly by using rename_aux()
.
In [25]: u.trajectory.ts.aux.protein_force
Out[25]:
array([ 0. , 200.71288, -1552.2849 , ..., 128.4072 ,
1386.0378 , -2699.3118 ])
In [26]: u.trajectory.rename_aux('protein_force', 'f')
In [27]: u.trajectory.ts.aux.f
Out[27]:
array([ 0. , 200.71288, -1552.2849 , ..., 128.4072 ,
1386.0378 , -2699.3118 ])
Recreating auxiliaries¶
To recreate an auxiliary, the set of attributes necessary to replicate it can
first be obtained with get_description()
.
The returned dictionary can then be passed to
auxreader()
to load a new copy of the
original auxiliary reader.
In [28]: description = aux.get_description()
In [29]: list(description.keys())
Out[29]:
['represent_ts_as',
'cutoff',
'dt',
'initial_time',
'time_selector',
'data_selector',
'constant_dt',
'auxname',
'format',
'auxdata']
In [30]: del aux
In [31]: mda.auxiliary.core.auxreader(**description)
Out[31]: <MDAnalysis.auxiliary.XVG.XVGReader at 0x7fe68d9618d0>
The ‘description’ of any or all the auxiliaries added to a trajectory can be
obtained using get_aux_descriptions()
.
In [32]: descriptions = u.trajectory.get_aux_descriptions(['f'])
To reload, pass the dictionary into add_auxiliary()
.
In [1]: u2 = mda.Universe(PDB, TRR)
In [2]: for desc in descriptions:
...: u2.trajectory.add_auxiliary(**desc)
...: