This document is for OpenStructure version 1.11; the latest version is 2.8.

IO Profiles for entity importer

As of version 1.1, OpenStructure introduces IO profiles to fine-tune the behaviour of the molecule importers. A profile aggregates flags and methods that affect the import of molecular structures and influence the behaviour of both conop and io.

Basic usage of IO profiles

You are most likely reading this document because you had trouble loading PDB files. In that case, the first step is to set the profile parameter of LoadPDB(). The profile parameter can be either the name of a profile or an instance of IOProfile. The following two examples are equivalent:

from ost import io
ent = io.LoadPDB('weird.pdb', profile=io.profiles['SLOPPY'])
ent = io.LoadPDB('weird.pdb', profile='SLOPPY')

io.profiles is a dictionary-like object containing all the profiles known to OpenStructure. You can add new ones by inserting them into the dictionary. If you are loading many structures, you may want to change the default profile so that you do not have to pass the profile on every load. This is done by assigning a different profile to DEFAULT:

io.profiles['DEFAULT'] = 'SLOPPY'
ent = io.LoadPDB('weird.pdb')

Again, you can either assign the name of the profile, or the profile itself. If none of the profiles available by default suits your needs, feel free to create one to your liking.
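
As a minimal sketch of creating and registering a custom profile: the helper below is hypothetical (it is not part of OpenStructure) and receives the io module as an argument so that it can be exercised without an OpenStructure installation. It relies only on what this document states, namely that IOProfile accepts the keyword flags listed in the class description and that io.profiles accepts new entries by item assignment. The profile name 'CA_TRACE' is made up for illustration.

```python
def register_profile(io_module, name, **flags):
    """Create an IOProfile from keyword flags and store it under `name`.

    `io_module` is expected to be `ost.io`; passing it in explicitly is a
    choice made for this sketch, not an OpenStructure convention.
    """
    profile = io_module.IOProfile(**flags)
    io_module.profiles[name] = profile  # insert into the dictionary-like registry
    return profile


# Intended usage with OpenStructure installed (hypothetical profile name):
#
#   from ost import io
#   register_profile(io, 'CA_TRACE', dialect='PDB', calpha_only=True,
#                    fault_tolerant=True)
#   ent = io.LoadPDB('weird.pdb', profile='CA_TRACE')
```

Flags that are not passed keep the constructor defaults shown in the IOProfile class signature.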

Available default profiles

The following profiles are available by default. For a detailed description of what the different parameters mean, consult the documentation of the IOProfile class.

STRICT:

This profile is the default and is known to work very well with PDB files coming from the official PDB website. It is equivalent to the following profile:

IOProfile(dialect='PDB', strict_hydrogens=False, quack_mode=False,
          fault_tolerant=False, join_spread_atom_records=False,
          no_hetatms=False, bond_feasibility_check=False)

SLOPPY:

This profile loads essentially everything:

IOProfile(dialect='PDB', strict_hydrogens=False, quack_mode=True,
          fault_tolerant=True, join_spread_atom_records=False,
          no_hetatms=False, bond_feasibility_check=True)
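
One possible way to combine the two PDB profiles is to try STRICT first and fall back to SLOPPY only when the import fails. The helper below is a hypothetical sketch, not an OpenStructure function. It takes the loader as an argument (intended to be io.LoadPDB) and assumes that a failed import surfaces as a Python exception; the exact exception type is not specified in this document, so the sketch catches Exception broadly.

```python
def load_with_fallback(path, loader, profile_names=('STRICT', 'SLOPPY')):
    """Try each profile in order and return (entity, profile_name).

    `loader` is a callable with the call shape loader(path, profile=name),
    for example io.LoadPDB.
    """
    if not profile_names:
        raise ValueError('at least one profile name is required')
    last_error = None
    for name in profile_names:
        try:
            return loader(path, profile=name), name
        except Exception as err:  # assumption: failed imports raise an exception
            last_error = err
    raise last_error  # every profile failed; re-raise the last error


# Intended usage with OpenStructure installed:
#
#   from ost import io
#   ent, used = load_with_fallback('weird.pdb', io.LoadPDB)
#   print('loaded with profile', used)
```

Returning the name of the profile that succeeded makes it easy to log which files needed the more permissive settings.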

CHARMM:

This profile is the default when importing CHARMM trajectories and turns on the CHARMM-specific compound dictionary:

IOProfile(dialect='CHARMM', strict_hydrogens=False, quack_mode=True,
          fault_tolerant=True, join_spread_atom_records=True,
          no_hetatms=False, bond_feasibility_check=True)
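
To see at a glance how the three default profiles differ, the flag values listed above can be compared as plain dictionaries. This sketch does not touch OpenStructure at all; the dictionaries are copied by hand from the listings in this section, and differing_flags is a hypothetical helper.

```python
# Flag values copied from the three profile listings in this section.
STRICT = dict(dialect='PDB', strict_hydrogens=False, quack_mode=False,
              fault_tolerant=False, join_spread_atom_records=False,
              no_hetatms=False, bond_feasibility_check=False)
SLOPPY = dict(dialect='PDB', strict_hydrogens=False, quack_mode=True,
              fault_tolerant=True, join_spread_atom_records=False,
              no_hetatms=False, bond_feasibility_check=True)
CHARMM = dict(dialect='CHARMM', strict_hydrogens=False, quack_mode=True,
              fault_tolerant=True, join_spread_atom_records=True,
              no_hetatms=False, bond_feasibility_check=True)


def differing_flags(a, b):
    """Return {flag: (value_in_a, value_in_b)} for flags whose values differ."""
    return {key: (a[key], b[key]) for key in a if a[key] != b[key]}


print('STRICT vs SLOPPY:', differing_flags(STRICT, SLOPPY))
print('SLOPPY vs CHARMM:', differing_flags(SLOPPY, CHARMM))
```

According to these listings, SLOPPY differs from STRICT in quack_mode, fault_tolerant and bond_feasibility_check, while CHARMM differs from SLOPPY only in dialect and join_spread_atom_records.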

The IOProfile Class

class IOProfile(dialect='PDB', strict_hydrogens=False, quack_mode=False, join_spread_atom_records=False, no_hetatms=False, calpha_only=False, fault_tolerant=False, bond_feasibility_check=True)

Aggregates flags that control the import of molecular structures.

quack_mode
Type: bool

Read/write property. When quack_mode is enabled, the chemical class of unknown residues is guessed from their atoms and connectivity. Turn this on if you are working with PDB files that do not conform to the standard and you experience problems with the rendering of the backbone trace and/or see peptidic residues with unknown chemical classes.

dialect
Type: str

The dialect to be used for PDB files. At the moment, this is either CHARMM or PDB; more will most likely follow. Setting the dialect to CHARMM optimizes loading for CHARMM PDB files: it turns on support for chain names of up to 4 characters (columns 72-76) and increases the size of the residue name to 4 characters.

strict_hydrogens
Type: bool

Whether hydrogen names should be strictly checked. It is very common for PDB files not to follow the correct naming conventions for hydrogen atoms. For that reason, the names of the hydrogens are not required to be correct by default; instead, the connectivity is inferred with distance-based checks. When this flag is turned on, the names of the hydrogen atoms are checked against the names in the database, like all other atom types.

no_hetatms

If set to true, HETATM records are ignored during import.
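
The effect of this flag can be pictured with a few lines of plain Python that operate on the text of a PDB file. This is only an illustration of the described behaviour, not OpenStructure's importer; drop_hetatm_records and the example records are made up.

```python
def drop_hetatm_records(pdb_lines):
    """Return all lines except those that start with the HETATM record name."""
    return [line for line in pdb_lines if not line.startswith('HETATM')]


example = [
    'ATOM      1  N   GLY A   1      11.104   6.134  -6.504',
    'HETATM    2  O   HOH A 101      15.000   8.000  -2.000',
    'END',
]

kept = drop_hetatm_records(example)
print(len(example), 'lines in,', len(kept), 'lines kept')
```

With the real importer the same result would be requested through a profile, for instance IOProfile(no_hetatms=True).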

fault_tolerant
Type: bool

If true, the import will succeed even if the PDB file contains faulty records. The faulty records are ignored and the import continues as if they were not present.
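
The difference between a strict and a fault-tolerant reader can be sketched in plain Python. The function below is hypothetical and far simpler than the real importer: it only parses the x, y, z fields of coordinate records (columns 31 to 54 of the PDB format) and treats a record as faulty when those fields are not numbers.

```python
def read_coordinates(pdb_lines, fault_tolerant=False):
    """Parse (x, y, z) from ATOM/HETATM lines; skip or reject faulty ones."""
    coords = []
    for number, line in enumerate(pdb_lines, start=1):
        if not line.startswith(('ATOM', 'HETATM')):
            continue
        try:
            # three 8-character fields starting at zero-based offsets 30, 38, 46
            xyz = tuple(float(line[i:i + 8]) for i in (30, 38, 46))
        except ValueError:
            if fault_tolerant:
                continue  # ignore the faulty record and carry on
            raise ValueError(f'faulty record on line {number}')
        coords.append(xyz)
    return coords


# Two made-up records: the second has a non-numeric x field.
good = 'ATOM'.ljust(30) + f'{1.0:8.3f}{2.0:8.3f}{3.0:8.3f}'
bad = 'ATOM'.ljust(30) + '   ?.???' + f'{2.0:8.3f}{3.0:8.3f}'

print(read_coordinates([good, bad], fault_tolerant=True))
```

Calling read_coordinates([good, bad]) without fault_tolerant=True raises ValueError instead of skipping the second record.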

join_spread_atom_records

If set to true, atom records belonging to the same residue are joined, even if they do not appear sequentially in the PDB file.
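
The following plain-Python sketch contrasts the two grouping strategies on made-up records in which the atoms of one residue are interrupted by another residue. It is a conceptual illustration only; how the importer itself handles the non-joined case is not described in this document, and all names in the sketch are hypothetical.

```python
from itertools import groupby

# (chain, residue number, residue name, atom name); GLY 1 is split by ALA 2
records = [
    ('A', 1, 'GLY', 'N'),
    ('A', 1, 'GLY', 'CA'),
    ('A', 2, 'ALA', 'N'),
    ('A', 1, 'GLY', 'C'),
]


def residue_key(record):
    """Identify a residue by chain, residue number and residue name."""
    return record[:3]


def group_sequential(records):
    """Start a new group whenever the key changes between adjacent records."""
    return [(key, [r[3] for r in run])
            for key, run in groupby(records, key=residue_key)]


def group_joined(records):
    """Collect all records with the same key into one group, first-seen order."""
    residues = {}
    for record in records:
        residues.setdefault(residue_key(record), []).append(record[3])
    return list(residues.items())


print(group_sequential(records))
print(group_joined(records))
```

Sequential grouping yields three groups, with GLY 1 appearing twice, whereas the joined grouping yields two groups and places N, CA and C together under GLY 1.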

calpha_only

When set to true, forces the importer to only load atoms named CA. This is most useful in combination with protein-only PDB files to speed up subsequent processing and importing.
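
A name-based filter of this kind can be sketched on raw PDB text as follows. The helpers are hypothetical and only mimic the described behaviour; they assume the standard fixed-width layout in which the atom name occupies columns 13 to 16. The made-up example also shows that a calcium ion, which is conventionally named CA as well, passes a filter that looks at the atom name alone. Whether the OpenStructure importer treats such records differently is not stated in this document.

```python
def atom_record(record, serial, name, res_name, chain, res_num, x, y, z):
    """Format a fixed-width coordinate record (atom name in columns 13-16)."""
    return (f"{record:<6}{serial:>5} {name:<4}{res_name:>4} {chain}"
            f"{res_num:>4}    {x:8.3f}{y:8.3f}{z:8.3f}")


def keep_atoms_named_ca(pdb_lines):
    """Keep only coordinate records whose atom name is exactly 'CA'."""
    return [line for line in pdb_lines
            if line.startswith(('ATOM', 'HETATM'))
            and line[12:16].strip() == 'CA']


lines = [
    atom_record('ATOM', 1, ' N', 'GLY', 'A', 1, 11.104, 6.134, -6.504),
    atom_record('ATOM', 2, ' CA', 'GLY', 'A', 1, 11.639, 6.071, -5.147),
    atom_record('HETATM', 3, 'CA', 'CA', 'A', 101, 20.000, 5.000, 1.000),
]

kept = keep_atoms_named_ca(lines)
for line in kept:
    print(line)
```

Of the three records, the alpha carbon and the calcium ion are kept and the nitrogen is dropped.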

bond_feasibility_check

When set to true, adds an additional distance feasibility check to decide whether two atoms should be connected. Atoms are only connected if they are within a certain distance range. Set this to false to completely disable distance checks for intra-residue bonds. Peptide bonds, as well as bonds between nucleotides, involve more than one residue and still use the distance check to decide whether the two residues should be connected.
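
A distance feasibility check of the kind described can be sketched in a few lines. The function and, in particular, the distance limits below are placeholders chosen for illustration; the range actually used by OpenStructure is not given in this document.

```python
import math


def is_feasible_bond(pos_a, pos_b, min_dist=0.4, max_dist=1.9):
    """Return True if the two positions are within [min_dist, max_dist].

    Positions are (x, y, z) tuples in angstrom. The default limits are
    placeholder values for this sketch, not OpenStructure's thresholds.
    """
    return min_dist <= math.dist(pos_a, pos_b) <= max_dist


print(is_feasible_bond((0.0, 0.0, 0.0), (1.5, 0.0, 0.0)))  # within the range
print(is_feasible_bond((0.0, 0.0, 0.0), (4.0, 0.0, 0.0)))  # too far apart
```

A lower limit is included so that atoms placed almost on top of each other, which usually indicates a faulty record, are not connected either.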
