Molecular Checker (Molck)¶
The Molecular Checker (Molck) is a tool for cleaning up molecular structures and making them conform to the compound library.
Molck removes any residues and atoms that are not defined in the compound library. This means that if the structure contains residues or atoms that are not part of the compound library, they will be removed during the cleaning process.
Caution
Do not use Molck if you need to preserve residues or atoms that are not defined in the compound library. For example, if your structure contains ligands or other custom molecules that are not in the compound library, using Molck would not preserve these components.
Programmatic usage¶
The Molecular Checker (Molck) can be called directly from Python code using the Molck function:
#! /bin/env python
"""Run Molck with Python API.

This is an exemplary procedure on how to run Molck using Python API which is
equivalent to the command line:

    molck <PDB PATH> --rm=hyd,oxt,nonstd,unk \
                     --fix-ele --out=<OUTPUT PATH> \
                     --complib=<PATH TO compounds.chemlib>
"""
from ost.io import LoadPDB, SavePDB
from ost.mol.alg import MolckSettings, Molck
from ost.conop import CompoundLib

pdbid = "<PDB PATH>"
lib = CompoundLib.Load("<PATH TO compounds.chemlib>")

# Using Molck function
ent = LoadPDB(pdbid)
ms = MolckSettings(rm_unk_atoms=True,
                   rm_non_std=True,
                   rm_hyd_atoms=True,
                   rm_oxt_atoms=True,
                   rm_zero_occ_atoms=False,
                   colored=False,
                   map_nonstd_res=False,
                   assign_elem=True)
Molck(ent, lib, ms)
SavePDB(ent, "<OUTPUT PATH>")
It can also be split into separate function calls for greater control:
#! /bin/env python
"""Run Molck with Python API.

This is an exemplary procedure on how to run Molck using Python API which is
equivalent to the command line:

    molck <PDB PATH> --rm=hyd,oxt,nonstd,unk \
                     --fix-ele --out=<OUTPUT PATH> \
                     --complib=<PATH TO compounds.chemlib>
"""
from ost.io import LoadPDB, SavePDB
from ost.mol.alg import (RemoveAtoms, MapNonStandardResidues,
                         CleanUpElementColumn)
from ost.conop import CompoundLib

pdbid = "<PDB PATH>"
lib = CompoundLib.Load("<PATH TO compounds.chemlib>")
map_nonstd = False

# Using function chain
ent = LoadPDB(pdbid)
if map_nonstd:
    MapNonStandardResidues(lib=lib, ent=ent)
RemoveAtoms(lib=lib,
            ent=ent,
            rm_unk_atoms=True,
            rm_non_std=True,
            rm_hyd_atoms=True,
            rm_oxt_atoms=True,
            rm_zero_occ_atoms=False,
            colored=False)
CleanUpElementColumn(lib=lib, ent=ent)
SavePDB(ent, "<OUTPUT PATH>")
API¶
- class MolckSettings(rm_unk_atoms=True, rm_non_std=False, rm_hyd_atoms=True, rm_oxt_atoms=False, rm_zero_occ_atoms=False, colored=False, map_nonstd_res=True, assign_elem=True)¶
Stores settings used for Molecular Checker.
- Parameters:
  - rm_unk_atoms – Sets rm_unk_atoms.
  - rm_non_std – Sets rm_non_std.
  - rm_hyd_atoms – Sets rm_hyd_atoms.
  - rm_oxt_atoms – Sets rm_oxt_atoms.
  - rm_zero_occ_atoms – Sets rm_zero_occ_atoms.
  - colored – Sets colored.
  - map_nonstd_res – Sets map_nonstd_res.
  - assign_elem – Sets assign_elem.
- rm_unk_atoms¶
Tip
This flag should always be set to True. Other flags will behave unexpectedly otherwise.
Remove unknown atoms. That is 1) any atom from residues that are not present in the compound library (provided at Molck call) and 2) any atom with a name that is not present in the respective entries of the compound library.
- Type:
bool
- rm_non_std¶
Remove all residues that are not one of the 20 standard amino acids. This removes all other residues, including unknown residues, ligands, saccharides and nucleotides (including the 4 standard nucleotides).
- Type:
bool
- rm_hyd_atoms¶
Remove hydrogen atoms, that is, all atoms whose element is specified as H or D in the respective entries of the compound library (provided at Molck call). Unknown atoms (see rm_unk_atoms) are not removed by this flag. If you really want to get rid of every hydrogen, you need to combine it with rm_unk_atoms.
- Type:
bool
- rm_oxt_atoms¶
Remove all atoms with name “OXT”. These are typically terminal oxygens in protein chains, but this might remove arbitrary atoms in other molecules. You should only use this flag in combination with rm_non_std.
- Type:
bool
- rm_zero_occ_atoms¶
Remove atoms with zero occupancy.
- Type:
bool
- colored¶
Whether output should be colored.
- Type:
bool
- map_nonstd_res¶
Maps modified residues back to the parent amino acid, for example MSE -> MET, SEP -> SER.
- Type:
bool
- assign_elem¶
Assigns elements as defined in the respective entries of the compound library (provided at Molck call). For unknown atoms (see definition in rm_unk_atoms), the element is set to an empty string. To avoid empty strings as elements, this property should only be applied in combination with rm_unk_atoms.
- Type:
bool
- ToString()¶
- Returns:
String representation of the MolckSettings.
- Return type:
str
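The interplay between rm_unk_atoms, rm_hyd_atoms and assign_elem described above can be illustrated with a toy model. The sketch below does not use OpenStructure at all: the compound library is stood in for by a plain dict (TOY_LIB), atoms are (residue name, atom name) tuples, and the helper names (is_unknown, is_library_hydrogen, clean, assign_elements) are hypothetical. It only mirrors the documented rules and is not how Molck is implemented.

```python
# Toy compound library: residue name -> {atom name: element}.
# Purely illustrative; the real ost.conop.CompoundLib is not a dict.
TOY_LIB = {
    "GLY": {"N": "N", "CA": "C", "C": "C", "O": "O", "HA2": "H"},
}

# Toy structure: list of (residue name, atom name) pairs.
atoms = [
    ("GLY", "CA"),   # heavy atom known to the library
    ("GLY", "HA2"),  # hydrogen known to the library
    ("GLY", "H99"),  # atom name missing from the GLY entry -> unknown
    ("LIG", "H1"),   # residue missing from the library -> unknown
]


def is_unknown(res, atom, lib):
    """Criteria 1) and 2) from the rm_unk_atoms description."""
    return res not in lib or atom not in lib[res]


def is_library_hydrogen(res, atom, lib):
    """rm_hyd_atoms only sees elements recorded in the library."""
    return not is_unknown(res, atom, lib) and lib[res][atom] in ("H", "D")


def clean(atoms, lib, rm_unk_atoms, rm_hyd_atoms):
    """Return the atoms that survive the two removal flags."""
    kept = []
    for res, atom in atoms:
        if rm_unk_atoms and is_unknown(res, atom, lib):
            continue
        if rm_hyd_atoms and is_library_hydrogen(res, atom, lib):
            continue
        kept.append((res, atom))
    return kept


def assign_elements(atoms, lib):
    """Mimic assign_elem: unknown atoms get an empty element string."""
    return [(res, atom, lib.get(res, {}).get(atom, "")) for res, atom in atoms]


# rm_hyd_atoms alone leaves hydrogens that the library does not know about.
hyd_only = clean(atoms, TOY_LIB, rm_unk_atoms=False, rm_hyd_atoms=True)
# Combining both flags removes every hydrogen in this toy structure.
both = clean(atoms, TOY_LIB, rm_unk_atoms=True, rm_hyd_atoms=True)

print(hyd_only)
print(both)
print(assign_elements(hyd_only, TOY_LIB))
```

In this toy model, the two unknown atoms survive when only rm_hyd_atoms is set and then receive an empty element, which is the situation the rm_unk_atoms tip warns about.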
Warning
The API here is set such that the functions modify the passed structure ent in-place. If this is not ok, please work on a copy of the structure.
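One way to follow this advice is a small wrapper that cleans a copy and returns it. The sketch below is a minimal example under two assumptions: that EntityHandle provides a Copy() method returning a deep copy, and that the wrapper name run_molck_on_copy and its optional molck argument (a hook for substituting the cleaning function, for instance in a dry run) are hypothetical and not part of OpenStructure.

```python
def run_molck_on_copy(ent, lib, settings, molck=None):
    """Return a cleaned copy of ent and leave ent itself untouched.

    molck is a hypothetical hook for swapping in another cleaning function;
    by default ost.mol.alg.Molck is used.
    """
    if molck is None:
        # Imported lazily so the helper can be defined without OpenStructure.
        from ost.mol.alg import Molck as molck
    work = ent.Copy()  # assumption: EntityHandle.Copy() returns a deep copy
    molck(work, lib, settings)  # modifies work in-place, not ent
    return work
```

With OpenStructure available, this could be called as cleaned = run_molck_on_copy(ent, lib, ms), after which ent still holds the original structure.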
- Molck(ent, lib, settings[, prune=True])¶
Runs Molck on the provided entity. Reprocesses ent with ost.conop.HeuristicProcessor and the given lib once done.
- Parameters:
  - ent (EntityHandle) – Structure to check
  - lib (CompoundLib) – Compound library
  - settings (MolckSettings) – Molck settings
  - prune (bool) – Whether to remove residues/chains that no longer contain atoms after Molck cleanup
- MapNonStandardResidues(ent, lib, reprocess=True)¶
Maps modified residues back to the parent amino acid, for example MSE -> MET.
- Parameters:
  - ent (EntityHandle) – Structure to check
  - lib (CompoundLib) – Compound library
  - reprocess – The function generates a deep copy of ent. It is highly recommended to enable reprocess, which runs ost.conop.HeuristicProcessor with the given lib. If set to False, you’ll have no connectivity etc. after calling this function.
- RemoveAtoms(ent, lib, rm_unk_atoms=True, rm_non_std=False, rm_hyd_atoms=True, rm_oxt_atoms=False, rm_zero_occ_atoms=False, colored=False, reprocess=True)
Removes atoms and residues according to some criteria.
- Parameters:
  - ent (EntityHandle) – Structure to check
  - lib (CompoundLib) – Compound library
  - rm_unk_atoms – See MolckSettings.rm_unk_atoms
  - rm_non_std – See MolckSettings.rm_non_std
  - rm_hyd_atoms – See MolckSettings.rm_hyd_atoms
  - rm_oxt_atoms – See MolckSettings.rm_oxt_atoms
  - rm_zero_occ_atoms – See MolckSettings.rm_zero_occ_atoms
  - colored – See MolckSettings.colored
  - reprocess – Removing atoms may impact certain annotations on the structure (chem class etc.) which are set by ost.conop.Processor. If set to True, an ost.conop.HeuristicProcessor with the given lib reprocesses ent.
- CleanUpElementColumn(ent, lib)¶
Assigns elements as defined in the respective entries of the compound library as described in MolckSettings.assign_elem. This should only be called after RemoveAtoms() with rm_unk_atoms set to True.
- Parameters:
  - ent (EntityHandle) – Structure to check
  - lib (CompoundLib) – Compound library