The compound library
Compound libraries contain information on chemical compounds, such as their
connectivity, chemical class and one-letter-code. The compound library has
several uses, but the most important one is to provide the connectivity
information for the rule-based processor.
The compound definitions for standard PDB files are taken from the
components.cif dictionary provided by the PDB. The dictionary is updated with
every PDB release and augmented with the compound definitions of newly
crystallized compounds.
If you downloaded the bundle, a recent version of the compound library is
already included. If you are compiling from source or want to incorporate the
latest compound definitions, follow these instructions to
build the compound library manually.
-
GetDefaultLib ()
Get the default compound library. This is set by SetDefaultLib().
If you obtained OpenStructure as a container or you
compiled it with a specified COMPOUND_LIB flag,
this function will return a compound library.
You can override the default compound library by pointing the
OST_COMPOUNDS_CHEMLIB environment variable to a valid compound library
file.
Returns: | Default compound library. |
Return type: | CompoundLib or None if no library set |
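Since GetDefaultLib() may return None, a script can check for that case before using the library. The following is a minimal sketch, not part of the API: the helper name require_default_lib is hypothetical, and the conop module is passed in as an argument so that the helper does not depend on how OpenStructure was installed.

```python
def require_default_lib(conop_module):
    """Return the default compound library or raise if none is set.

    conop_module is expected to be ost.conop; it is passed in explicitly
    so this hypothetical helper has no import-time dependency on
    OpenStructure.
    """
    lib = conop_module.GetDefaultLib()
    if lib is None:
        raise RuntimeError(
            "No default compound library set. Call SetDefaultLib() or "
            "point OST_COMPOUNDS_CHEMLIB to a valid compound library file."
        )
    return lib


# Typical use inside an OpenStructure session:
#   from ost import conop
#   lib = require_default_lib(conop)
```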
-
SetDefaultLib (lib)
Parameters: | lib (CompoundLib) – Library to be set as default compound library. |
-
class
CompoundLib
-
static
Load (database, readonly=True)
Load the compound lib from database with the given name.
Parameters: | readonly (bool) – Whether the library should be opened in read-only mode. It
is important to note that only one program at a time can have write access to
the compound library. If multiple programs try to open the compound library in
write mode, the programs can deadlock. |
Returns: | The loaded compound lib or None if it failed. |
-
static
Create (database)
Create a new compound library
-
FindCompound (tlc, dialect='PDB')
Look up a compound by its three-letter-code, e.g. ALA. If no compound with that
name exists, the function returns None. Compounds are cached after they have
been loaded with FindCompound. To delete the compound cache, use
ClearCache().
Returns: | The found compound |
Return type: | Compound |
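Because FindCompound returns None for unknown codes, it can help to separate known from unknown codes before further processing. This is a minimal sketch under that assumption; the helper name split_known_unknown is hypothetical and only FindCompound is taken from the API described here.

```python
def split_known_unknown(lib, codes):
    """Look up each distinct code once and return (found, missing).

    found maps each code to the Compound returned by FindCompound;
    missing lists the codes for which FindCompound returned None.
    """
    found = {}
    missing = []
    for tlc in codes:
        if tlc in found or tlc in missing:
            continue  # already handled, skip duplicates
        compound = lib.FindCompound(tlc)
        if compound is None:
            missing.append(tlc)
        else:
            found[tlc] = compound
    return found, missing


# Typical use, assuming compound_lib is a loaded CompoundLib:
#   found, missing = split_known_unknown(compound_lib, ['ALA', 'MSE', 'XXX'])
```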
-
Copy (dst_filename)
Copy database to dst_filename. The new library will be an exact copy of the
database. The special name :memory: will create an in-memory version of
the database. At the expense of memory, database lookups will become much
faster.
Returns: | The copied compound library |
Return type: | CompoundLib |
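As an illustration of the :memory: option, the sketch below loads a library from disk and returns an in-memory copy. The helper name load_in_memory is hypothetical; the CompoundLib class is passed in as an argument so the function can be tried without an OpenStructure installation.

```python
def load_in_memory(compound_lib_class, path):
    """Load a compound library from path and return an in-memory copy.

    compound_lib_class is expected to be conop.CompoundLib.
    """
    on_disk = compound_lib_class.Load(path)
    if on_disk is None:
        # Load returns None if the library could not be opened.
        raise RuntimeError("could not load compound library: %s" % path)
    # The special name ':memory:' trades memory for faster lookups.
    return on_disk.Copy(":memory:")


# Typical use inside an OpenStructure session:
#   from ost import conop
#   lib = load_in_memory(conop.CompoundLib, 'compounds.chemlib')
```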
-
ClearCache ()
Clear the compound cache.
-
SetChemLibInfo ()
Stores the current date and the version of OST in use into the table
chemlib_info. This is done when a new library is created.
-
GetOSTVersionUsed ()
Returns: | OST version (ost_version_used from the table chemlib_info) |
Return type: | str |
-
GetCreationDate ()
Returns: | creation date (creation_date from the table chemlib_info) |
Return type: | str |
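The two getters can be combined to report which library a script is working with, for example in a log message. A minimal sketch; the helper name describe_library is hypothetical.

```python
def describe_library(lib):
    """Return a one-line description of a compound library's provenance."""
    return "compound library created on %s with OST %s" % (
        lib.GetCreationDate(),
        lib.GetOSTVersionUsed(),
    )


# Typical use, assuming compound_lib is a loaded CompoundLib:
#   print(describe_library(compound_lib))
```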
-
class
Compound
Holds the description of a chemical compound, such as three-letter-code, and
chemical class.
-
id
Alias for three_letter_code
-
three_letter_code
Three-letter code of the residue, e.g. ALA for alanine. The three-letter
code is unique for each compound, always in uppercase letters and is between
1 and 3 characters long.
-
one_letter_code
The one letter code of the residue, e.g. ‘G’ for glycine. If undefined, the
one letter code of the residue is set to ‘?’
-
formula
The chemical composition, e.g. ‘H2 O’ for water. The elements are listed in
alphabetical order.
-
dialect
The dialect of the compound.
-
atom_specs
The atom definitions of this compound. Read-only.
-
bond_specs
The bond definitions of this compound. Read-only.
-
chem_class
The ChemClass of this compound. Read-only.
-
chem_type
The ChemType of this compound. Read-only.
-
inchi
The InChI code of this compound, without the ‘InChI=’ part, e.g.
‘1S/H2O/h1H2’ for water. Read-only.
-
inchi_key
The InChIKey of this compound without the ‘InChIKey=’ part, e.g.
‘XLYOFNOQVPJJNP-UHFFFAOYSA-N’ for water. Read-only.
-
class
AtomSpec
Definition of an atom
-
element
The element of the atom
-
name
The primary name of the atom
-
alt_name
Alternative atom name. If the atom has only one name, this is identical to
name
-
is_leaving
Whether this atom is a leaving atom, i.e. an atom that is not required for a
residue to be complete. The best example of a leaving atom is the OXT atom of
amino acids that gets lost when a peptide bond is formed.
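As a small illustration of how the AtomSpec attributes fit together, the sketch below collects the names of all leaving atoms of a compound. The helper name leaving_atom_names is hypothetical; it only reads Compound.atom_specs, AtomSpec.name and AtomSpec.is_leaving.

```python
def leaving_atom_names(compound):
    """Return the primary names of all atoms flagged as leaving atoms."""
    return [atom.name for atom in compound.atom_specs if atom.is_leaving]


# Typical use, assuming compound_lib is a loaded CompoundLib:
#   ala = compound_lib.FindCompound('ALA')
#   print(leaving_atom_names(ala))
```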
-
class
BondSpec
Definition of a bond
-
atom_one
The first atom of the bond, encoded as index into the
Compound.atom_specs array.
-
atom_two
The second atom of the bond, encoded as index into the
Compound.atom_specs array.
-
order
The bond order: 1 for single bonds, 2 for double bonds and 3 for
triple bonds.
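Since atom_one and atom_two are indices into Compound.atom_specs, they need to be resolved to obtain atom names. The sketch below shows one way to do this; the helper name named_bonds is hypothetical, and it assumes that atom_specs can be converted to a Python list.

```python
def named_bonds(compound):
    """Return the bonds of a compound as (name_one, name_two, order) tuples."""
    atoms = list(compound.atom_specs)
    return [
        # atom_one and atom_two index into the atom_specs array.
        (atoms[bond.atom_one].name, atoms[bond.atom_two].name, bond.order)
        for bond in compound.bond_specs
    ]


# Typical use, assuming compound_lib is a loaded CompoundLib:
#   for one, two, order in named_bonds(compound_lib.FindCompound('ALA')):
#       print(one, two, order)
```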
Example: Translating SEQRES entries
In this example we will translate the three-letter-codes given in the SEQRES record to one-letter-codes. Note that this automatically takes care of modified amino acids such as selenomethionine (MSE).
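from ost import conop

# Load the compound library from disk (read-only by default).
compound_lib = conop.CompoundLib.Load('compounds.chemlib')
seqres = 'ALA GLY MSE VAL PHE'
sequence = ''
for tlc in seqres.split():
    # FindCompound returns None for unknown three-letter-codes.
    compound = compound_lib.FindCompound(tlc)
    if compound:
        sequence += compound.one_letter_code
print(sequence)  # prints 'AGMVF'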
Creating a compound library
The simplest way to create a compound library is to use the chemdict_tool. The program allows you to import the chemical
description of the compounds from an mmCIF dictionary, e.g. the components.cif dictionary provided by the PDB. The latest dictionary can be downloaded from the wwPDB site. The files are rather large, so it is recommended to download the gzipped version.
After downloading the file, use chemdict_tool to convert the mmCIF dictionary into our internal format.
chemdict_tool create <components.cif> <compounds.chemlib>
Notes:
- The chemdict_tool only understands .cif and .cif.gz files. If you would like to use other sources for the compound definitions, consider writing a script that uses the compound library API.
- This also loads compounds which are obsoleted by the PDB to maximize compatibility with older PDB files.
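If the conversion is part of an automated setup, the command above can be called from a script. The following is a minimal sketch that assumes chemdict_tool is available on the PATH; the helper names chemdict_create_command and build_compound_lib are hypothetical.

```python
import subprocess


def chemdict_create_command(cif_path, chemlib_path):
    """Build the argument list for 'chemdict_tool create'."""
    return ["chemdict_tool", "create", cif_path, chemlib_path]


def build_compound_lib(cif_path, chemlib_path):
    """Run chemdict_tool; raises CalledProcessError if the tool fails.

    Assumes chemdict_tool can be found on the PATH.
    """
    subprocess.run(chemdict_create_command(cif_path, chemlib_path), check=True)


# Typical use:
#   build_compound_lib('components.cif.gz', 'compounds.chemlib')
```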
If you are working with CHARMM trajectory files, you will also have to add the
definitions for CHARMM. Assuming you are in the top-level source directory of
OpenStructure, this can be achieved by:
chemdict_tool update modules/conop/data/charmm.cif <compounds.chemlib> charmm
Once your library has been created, you need to tell cmake where to find it and
make sure it gets staged.
cmake -DCOMPOUND_LIB=compounds.chemlib
make