PDB, ENT (Standard PDB file)

Coordinate reader: MDAnalysis.coordinates.PDB.PDBReader

Coordinate writer: MDAnalysis.coordinates.PDB.PDBWriter

Topology parser: MDAnalysis.topology.PDBParser.PDBParser

Reading in

MDAnalysis parses the following PDB records (see PDB coordinate section for details):

  • CRYST1 for unit cell dimensions a, b, c, alpha, beta, gamma

  • ATOM or HETATM for serial, name, resName, chainID, resSeq, x, y, z, occupancy, tempFactor, segID

  • CONECT records for bonds

  • HEADER (Universe.trajectory.header)

  • TITLE (Universe.trajectory.title)

  • COMPND (Universe.trajectory.compound)

  • REMARK (Universe.trajectory.remarks)

All other lines are ignored. Multi-MODEL PDB files are read as trajectories with a default timestep of 1 ps (pass the dt argument to change this). Currently, MDAnalysis cannot read multi-model PDB files written by VMD, because VMD separates models with the keyword “END” instead of the “MODEL”/“ENDMDL” keywords.
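
The frame handling described above can be illustrated with a small standard-library sketch. It is not MDAnalysis's implementation: the function name `split_models` and the convention that the first frame sits at time 0 are assumptions made for illustration. It groups ATOM/HETATM lines by MODEL/ENDMDL block and assigns each frame a time of frame index times dt.

```python
def split_models(lines, dt=1.0):
    """Group ATOM/HETATM lines by MODEL/ENDMDL block.

    Returns a list of (time_in_ps, atom_lines) tuples. Hypothetical helper;
    the time of frame i is assumed to be i * dt.
    """
    frames = []
    current = None  # atom lines of the model being read, or None outside a model
    for line in lines:
        record = line[:6].strip()
        if record == "MODEL":
            current = []
        elif record == "ENDMDL":
            if current is not None:
                frames.append(current)
            current = None
        elif record in ("ATOM", "HETATM") and current is not None:
            current.append(line)
    return [(index * dt, atoms) for index, atoms in enumerate(frames)]


# A two-model file reduced to the records that matter here.
sample = [
    "MODEL        1",
    "ATOM      1  CA  ALA A   1       0.000   0.000   0.000  1.00  0.00",
    "ENDMDL",
    "MODEL        2",
    "ATOM      1  CA  ALA A   1       1.000   0.000   0.000  1.00  0.00",
    "ENDMDL",
]
frames = split_models(sample, dt=2.0)
```

Because this sketch only recognises MODEL and ENDMDL, a file that separates models with END yields no frames at all, which mirrors the VMD limitation noted above.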

Important

Previously, MDAnalysis did not read elements from a file. Now, if valid elements are provided, MDAnalysis will read them in and will not guess them from atom names.

MDAnalysis attempts to read segid attributes from the segID column. If this column does not contain information, segments are instead created from chainIDs. If chainIDs are also not present, then segids are set to the default 'SYSTEM' value.
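
The fallback order for segids can be written out as a short sketch. The function `resolve_segid` is hypothetical and works on a single atom for simplicity; MDAnalysis itself may make this decision for the whole file rather than per atom.

```python
def resolve_segid(segid_field, chainid_field, default="SYSTEM"):
    """Pick a segid: segID column first, then chainID, then the default."""
    segid = segid_field.strip()
    if segid:
        return segid
    chainid = chainid_field.strip()
    if chainid:
        return chainid
    return default
```

For example, `resolve_segid("PROA", "A")` keeps the segID column, `resolve_segid("", "A")` falls back to the chainID, and `resolve_segid("", " ")` falls back to the default.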

Writing out

MDAnalysis can write single-frame PDBs and can convert trajectories to multi-model PDBs. If the Universe is missing fields that are required in a PDB file, MDAnalysis provides default values and issues a warning. There are two exceptions to this:

  • chainIDs: if a Universe does not have chainIDs, MDAnalysis uses the first character of the segment segid instead.

  • elements: elements are always guessed from the atom name.

These are the default values:

  • names: ‘X’

  • altLocs: ‘’

  • resnames: ‘UNK’

  • icodes: ‘’

  • segids: ‘’

  • resids: 1

  • occupancies: 1.0

  • tempfactors: 0.0
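
The default-filling behaviour can be sketched with the standard library alone. Everything here is illustrative: `PDB_DEFAULTS`, `fill_missing`, the plain dict standing in for a Universe's attributes, and the warning text are assumptions, not MDAnalysis's code. Elements are left out because, as stated above, they are guessed from atom names rather than given a default.

```python
import warnings

# Default values from the list above, keyed by attribute name.
PDB_DEFAULTS = {
    "names": "X",
    "altLocs": "",
    "resnames": "UNK",
    "icodes": "",
    "segids": "",
    "resids": 1,
    "occupancies": 1.0,
    "tempfactors": 0.0,
}


def fill_missing(attrs, n_atoms):
    """Return a copy of attrs with every missing PDB field filled in."""
    filled = dict(attrs)
    for field, default in PDB_DEFAULTS.items():
        if field not in filled:
            warnings.warn(f"no '{field}' found; using default {default!r}")
            filled[field] = [default] * n_atoms
    # chainIDs get no fixed default: use the first character of each segid.
    if "chainIDs" not in filled:
        filled["chainIDs"] = [segid[:1] for segid in filled["segids"]]
    return filled
```

Calling `fill_missing({"names": ["CA", "CB"], "segids": ["PROA", "PROA"]}, 2)` keeps the supplied names and segids, fills the remaining fields with their defaults, and derives the chainIDs from the segids.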

PDB specification

CRYST1 fields

COLUMNS    DATA TYPE     FIELD      DEFINITION

1 - 6      Record name   “CRYST1”
7 - 15     Real(9.3)     a          a (Angstroms).
16 - 24    Real(9.3)     b          b (Angstroms).
25 - 33    Real(9.3)     c          c (Angstroms).
34 - 40    Real(7.2)     alpha      alpha (degrees).
41 - 47    Real(7.2)     beta       beta (degrees).
48 - 54    Real(7.2)     gamma      gamma (degrees).
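
As a worked example of the column layout above, the following sketch slices a CRYST1 line at the listed positions. The 1-based inclusive columns in the table become 0-based half-open Python slices, so columns 7 - 15 are `line[6:15]`. The helper `parse_cryst1` is hypothetical and is not the parser MDAnalysis uses.

```python
def parse_cryst1(line):
    """Extract the unit cell from a CRYST1 record using fixed columns."""
    if line[0:6] != "CRYST1":
        raise ValueError("not a CRYST1 record")
    return {
        "a": float(line[6:15]),       # columns 7 - 15
        "b": float(line[15:24]),      # columns 16 - 24
        "c": float(line[24:33]),      # columns 25 - 33
        "alpha": float(line[33:40]),  # columns 34 - 40
        "beta": float(line[40:47]),   # columns 41 - 47
        "gamma": float(line[47:54]),  # columns 48 - 54
    }


# Build a record with the Real(9.3) and Real(7.2) widths from the table.
cryst1_line = "CRYST1" + "%9.3f%9.3f%9.3f%7.2f%7.2f%7.2f" % (
    52.0, 58.6, 61.9, 90.0, 90.0, 90.0
)
box = parse_cryst1(cryst1_line)
```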

ATOM/HETATM fields

COLUMNS    DATA TYPE      FIELD        DEFINITION

1 - 6      Record name    “ATOM  ”
7 - 11     Integer        serial       Atom serial number.
13 - 16    Atom           name         Atom name.
17         Character      altLoc       Alternate location indicator.
18 - 21    Residue name   resName      Residue name.
22         Character      chainID      Chain identifier.
23 - 26    Integer        resSeq       Residue sequence number.
27         AChar          iCode        Code for insertion of residues.
31 - 38    Real(8.3)      x            Orthogonal coordinates for X in Angstroms.
39 - 46    Real(8.3)      y            Orthogonal coordinates for Y in Angstroms.
47 - 54    Real(8.3)      z            Orthogonal coordinates for Z in Angstroms.
55 - 60    Real(6.2)      occupancy    Occupancy.
61 - 66    Real(6.2)      tempFactor   Temperature factor.
67 - 76    String         segID        (unofficial CHARMM extension?)
77 - 78    LString(2)     element      Element symbol, right-justified.
79 - 80    LString(2)     charge       Charge on the atom.
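
The ATOM/HETATM layout can be exercised the same way. This is a minimal sketch that follows the column ranges in the table above; `parse_atom` and the sample values are made up for illustration, and the real MDAnalysis parser handles more edge cases than this.

```python
def parse_atom(line):
    """Slice an ATOM/HETATM record at the columns listed in the table."""
    line = line.ljust(80)  # tolerate lines that stop before column 80
    return {
        "record": line[0:6].strip(),
        "serial": int(line[6:11]),
        "name": line[12:16].strip(),
        "altLoc": line[16].strip(),
        "resName": line[17:21].strip(),
        "chainID": line[21].strip(),
        "resSeq": int(line[22:26]),
        "iCode": line[26].strip(),
        "x": float(line[30:38]),
        "y": float(line[38:46]),
        "z": float(line[46:54]),
        "occupancy": float(line[54:60]),
        "tempFactor": float(line[60:66]),
        "segID": line[66:76].strip(),
        "element": line[76:78].strip(),
        "charge": line[78:80].strip(),
    }


# Build a record field by field so every value lands in its column.
atom_line = (
    "{:<6}{:>5} {:<4}{:1}{:<4}{:1}{:>4}{:1}   "
    "{:8.3f}{:8.3f}{:8.3f}{:6.2f}{:6.2f}{:>10}{:>2}{:<2}"
).format("ATOM", 1, " CA", "", "ALA", "A", 12, "",
         11.104, 6.134, -6.504, 1.0, 20.5, "PROA", "C", "")
atom = parse_atom(atom_line)
```

Note that columns 12 and 28 - 30 are unused in the table, which is why the format string inserts one and three literal spaces at those positions.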