`ligand_scoring` – Ligand scoring functions¶

Note

Extra requirements:

Python modules numpy and networkx must be available (e.g. use pip install numpy networkx)

class LigandScorer(model, target, model_ligands, target_ligands, resnum_alignments=False, substructure_match=False, coverage_delta=0.2, max_symmetries=100000.0, rename_ligand_chain=False, min_pep_length=6, min_nuc_length=4, pep_seqid_thr=95.0, nuc_seqid_thr=95.0, mdl_map_pep_seqid_thr=0.0, mdl_map_nuc_seqid_thr=0.0, seqres=None, trg_seqres_mapping=None)¶

Scorer to compute various small molecule ligand (non polymer) scores.

LigandScorer is an abstract base class that handles all the setup, data storage, enumeration of ligand symmetries and target/model ligand matching/assignment. The actual score computation is delegated to child classes.

At the moment, two such classes are available:

ost.mol.alg.ligand_scoring_lddtpli.LDDTPLIScorer that assesses the conservation of protein-ligand contacts (LDDT-PLI);
ost.mol.alg.ligand_scoring_scrmsd.SCRMSDScorer that computes a binding-site superposed, symmetry-corrected RMSD (BiSyRMSD) and ligand pocket LDDT (LDDT-LP).

All-versus-all scores are available through the lazily computed score_matrix. However, many things can go wrong, even something as simple as two ligands not matching. Error states therefore encode scoring issues. An issue for a particular ligand is indicated by a non-zero state in model_ligand_states/target_ligand_states. This invalidates all pairwise scores involving that ligand. This and other issues in pairwise score computation are reported in state_matrix, which has the same size as score_matrix. A valid pairwise score can only be expected if the respective location is 0. The states and their meaning can be explored with the following code:

for state_code, (short_desc, desc) in scorer_obj.state_decoding.items():
    print(state_code)
    print(short_desc)
    print(desc)
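The relationship between state_matrix and score_matrix can be sketched without OpenStructure. The following is a minimal illustration in which nested lists stand in for the ndarrays returned by a scorer; the state code 3 and all score values are made up for the example.

```python
def valid_pairs(state_matrix, score_matrix):
    """Return (trg_idx, mdl_idx, score) for every pair whose state is 0.

    Target ligands are in rows, model ligands in columns.
    """
    pairs = []
    for trg_idx, state_row in enumerate(state_matrix):
        for mdl_idx, state in enumerate(state_row):
            if state == 0:
                pairs.append((trg_idx, mdl_idx, score_matrix[trg_idx][mdl_idx]))
    return pairs


# Made-up data: 2 target ligands (rows) x 3 model ligands (columns).
nan = float("nan")
example_states = [[0, 3, 0],
                  [3, 0, 3]]
example_scores = [[1.2, nan, 4.5],
                  [nan, 0.8, nan]]

for trg_idx, mdl_idx, score in valid_pairs(example_states, example_scores):
    print(trg_idx, mdl_idx, score)
```

With a real scorer object, the same function should work when passed scorer_obj.state_matrix and scorer_obj.score_matrix, since both have the same size.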

A common use case is to derive a one-to-one mapping between ligands in the model and the target, for which LigandScorer provides an automated assignment procedure. By default, only exact matches between target and model ligands are considered. This is a problem when the target only contains a subset of the expected atoms (for instance if atoms are missing in an experimental structure, which often happens in the PDB). With substructure_match=True, complete model ligands can be scored against partial target ligands. One problem with this approach is that it is very easy to find good matches to small, irrelevant ligands like EDO, CO2 or GOL. The assignment algorithm therefore considers the coverage, expressed as the fraction of model ligand atoms covered in the target. Higher coverage matches are prioritized, but a match with a better score will be preferred if it falls within a window of coverage_delta (by default 0.2) of a worse-scoring match. As a result, for instance, with a delta of 0.2, a low-score match with coverage 0.96 would be preferred over a high-score match with coverage 0.70, because 0.70 lies outside the window [0.76, 0.96].
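The coverage arithmetic from the paragraph above can be checked with a few lines of Python. This is only a sketch: the helper names and the atom counts are hypothetical and not part of the LigandScorer API.

```python
def coverage(n_covered_atoms, n_model_atoms):
    """Fraction of model ligand atoms covered in the target."""
    return n_covered_atoms / n_model_atoms


def in_coverage_window(cov, max_coverage, coverage_delta=0.2):
    """True if cov lies in [max_coverage - coverage_delta, max_coverage]."""
    return max_coverage - coverage_delta <= cov <= max_coverage


# Made-up atom counts reproducing the coverages used in the text.
high = coverage(24, 25)  # 0.96
low = coverage(7, 10)    # 0.70

print(in_coverage_window(low, max_coverage=high))   # 0.70 is outside the window
print(in_coverage_window(0.80, max_coverage=high))  # 0.80 is inside the window
```

Only matches inside the window compete on score, so a match with coverage 0.80 could still win over the 0.96 match if it scores better, whereas the 0.70 match cannot.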

Assumptions:

Unlike most of OpenStructure, this class does not assume that the ligands (either for the model or the target) are part of the PDB component dictionary. They may have arbitrary residue names. Residue names do not have to match between the model and the target. Matching is based on the calculation of isomorphisms which depend on the atom element name and atom connectivity (bond order is ignored). It is up to the caller to ensure that the connectivity of atoms is properly set before passing any ligands to this class. Ligands with improper connectivity will lead to bogus results.

This only applies to the ligand. The rest of the model and target structures (protein, nucleic acids) must still follow the usual rules and contain only residues from the compound library. Structures are cleaned up according to the constructor documentation. We advise using ost.mol.alg.scoring_base.MMCIFPrep() and ost.mol.alg.scoring_base.PDBPrep() for loading. Both already remove hydrogens and, in the case of MMCIF, optionally extract ligands ready to be used by the LigandScorer based on “non-polymer” entity types. In the case of the PDB file format, ligands must be loaded separately as SDF files.

Only polymers (protein and nucleic acids) of model and target are considered for ligand binding sites. The ost.mol.alg.chain_mapping.ChainMapper is used to enumerate possible mappings of these chains. In short: identical chains in the target are grouped based on pairwise sequence identity (see pep_seqid_thr/nuc_seqid_thr param). Each model chain is assigned to one of these groups (see mdl_map_pep_seqid_thr/mdl_map_nuc_seqid_thr param). To avoid spurious matches, only polymers of a certain length are considered in this matching procedure (see min_pep_length/min_nuc_length param). Shorter polymers are never mapped and do not contribute to scoring.

Here is an example of how to set up a scorer:

from ost import io
from ost.mol.alg.ligand_scoring_scrmsd import SCRMSDScorer
from ost.mol.alg.scoring_base import MMCIFPrep
from ost.mol.alg.scoring_base import PDBPrep

# Load data
# Structure model in PDB format, containing the receptor only
model = PDBPrep("path_to_model.pdb")
# Ligand model as SDF file
model_ligand = io.LoadEntity("path_to_ligand.sdf", format="sdf")
# Target loaded from mmCIF, containing the ligand
target, target_ligands = MMCIFPrep("path_to_target.cif",
                                   extract_nonpoly=True)

# Setup scorer object and compute SCRMSD
model_ligands = [model_ligand.Select("ele != H")]
sc = SCRMSDScorer(model, target, model_ligands, target_ligands)

# Perform assignment and read respective scores
for lig_pair in sc.assignment:
    trg_lig = sc.target_ligands[lig_pair[0]]
    mdl_lig = sc.model_ligands[lig_pair[1]]
    score = sc.score_matrix[lig_pair[0], lig_pair[1]]
    print(f"Score for {trg_lig} and {mdl_lig}: {score}")

# check cleanup in model and target structure:
print("model cleanup:", sc.model_cleanup_log)
print("target cleanup:", sc.target_cleanup_log)

Parameters:

model (ost.mol.EntityHandle/ost.mol.EntityView) – Model structure - a deep copy is available as model. The model undergoes the following cleanup steps, which depend on the ost.conop.CompoundLib returned by ost.conop.GetDefaultLib(): 1) removal of hydrogens, 2) removal of residues for which there is no entry in ost.conop.CompoundLib, 3) removal of residues that are not peptide linking or nucleotide linking according to ost.conop.CompoundLib, 4) removal of atoms that are not defined for respective residues in ost.conop.CompoundLib. Except for step 1), every cleanup is logged with ost.LogLevel Warning and a report is available as model_cleanup_log.
target (ost.mol.EntityHandle/ost.mol.EntityView) – Target structure - same processing as model.
model_ligands (list) – Model ligands, as a list of ost.mol.ResidueHandle/ ost.mol.ResidueView/ ost.mol.EntityHandle/ ost.mol.EntityView. For ost.mol.EntityHandle/ ost.mol.EntityView, each residue is considered to be an individual ligand. All ligands are copied into a separate ost.mol.EntityHandle available as model_ligand_ent and the respective list of ligands is available as model_ligands.
target_ligands (list) – Target ligands, same processing as model ligands.
resnum_alignments (bool) – Whether alignments between chemically equivalent chains in model and target can be computed based on residue numbers. This can be assumed in benchmarking setups such as CAMEO/CASP.
substructure_match (bool) – Set this to True to allow incomplete (i.e. partially resolved) target ligands.
coverage_delta (float) – the coverage delta for partial ligand assignment.
max_symmetries (int) – If more than that many isomorphisms exist for a target/model ligand pair, it will be ignored and reported as unassigned.
min_pep_length (int) – Relevant parameter if short peptides are involved in the polymer binding site. Minimum peptide length for a chain to be considered in chain mapping. The chain mapping algorithm first performs an all vs. all pairwise sequence alignment to identify “equal” chains within the target structure, based on simple sequence identity. Short sequences can be problematic as they may produce high sequence identity alignments by pure chance.
min_nuc_length (int) – Same for nucleotides
pep_seqid_thr (float) – Parameter that affects identification of identical chains in target - see ost.mol.alg.chain_mapping.ChainMapper
nuc_seqid_thr (float) – Parameter that affects identification of identical chains in target - see ost.mol.alg.chain_mapping.ChainMapper
mdl_map_pep_seqid_thr (float) – Parameter that affects mapping of model chains to target chains - see ost.mol.alg.chain_mapping.ChainMapper
mdl_map_nuc_seqid_thr (float) – Parameter that affects mapping of model chains to target chains - see ost.mol.alg.chain_mapping.ChainMapper
seqres (ost.seq.SequenceList) – Parameter that affects identification of identical chains in target - see ost.mol.alg.chain_mapping.ChainMapper
trg_seqres_mapping (dict) – Parameter that affects identification of identical chains in target - see ost.mol.alg.chain_mapping.ChainMapper

property model¶

Model receptor structure

Processed according to docs in LigandScorer constructor

property target¶

Target receptor structure

Processed according to docs in LigandScorer constructor

property model_cleanup_log¶

Reports residues/atoms that were removed in model during cleanup

Residues and atoms are described as str in format <chain_name>.<resnum>.<ins_code> (residue) and <chain_name>.<resnum>.<ins_code>.<aname> (atom).

dict with keys:

‘cleaned_residues’: another dict with keys:
- ‘no_clib’: residues that have been removed because no entry could be found in ost.conop.CompoundLib
- ‘not_linking’: residues that have been removed because they’re not peptide or nucleotide linking according to ost.conop.CompoundLib
‘cleaned_atoms’: another dict with keys:
- ‘unknown_atoms’: atoms that have been removed as they’re not part of their respective residue according to ost.conop.CompoundLib
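A cleanup log with this layout can be summarized as sketched below. The example log is made up, and the parser assumes that chain names and atom names contain no dots and that an absent insertion code appears as an empty field; neither assumption is confirmed by the documentation above.

```python
def parse_descriptor(descriptor):
    """Split '<chain_name>.<resnum>.<ins_code>[.<aname>]' into a dict."""
    keys = ("chain_name", "resnum", "ins_code", "aname")
    return dict(zip(keys, descriptor.split(".")))


def count_removed(cleanup_log):
    """Count removed residues/atoms per reason."""
    counts = {}
    for group in ("cleaned_residues", "cleaned_atoms"):
        for reason, descriptors in cleanup_log[group].items():
            counts[reason] = len(descriptors)
    return counts


# Made-up log following the documented key structure.
example_log = {
    "cleaned_residues": {"no_clib": ["A.42."],
                         "not_linking": ["B.7.A", "B.8."]},
    "cleaned_atoms": {"unknown_atoms": ["A.10..XX"]},
}

print(count_removed(example_log))
print(parse_descriptor("A.10..XX"))
```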

property target_cleanup_log¶: Same for target

property model_ligands¶

Residues representing model ligands

list of ost.mol.ResidueHandle

property target_ligands¶

Residues representing target ligands

list of ost.mol.ResidueHandle

property resnum_alignments¶: Given at LigandScorer construction

property min_pep_length¶: Given at LigandScorer construction

property min_nuc_length¶: Given at LigandScorer construction

property pep_seqid_thr¶: Given at LigandScorer construction

property nuc_seqid_thr¶: Given at LigandScorer construction

property mdl_map_pep_seqid_thr¶: Given at LigandScorer construction

property mdl_map_nuc_seqid_thr¶: Given at LigandScorer construction

property seqres¶: Given at LigandScorer construction

property trg_seqres_mapping¶: Given at LigandScorer construction

property substructure_match¶: Given at LigandScorer construction

property coverage_delta¶: Given at LigandScorer construction

property max_symmetries¶: Given at LigandScorer construction

property state_matrix¶

Encodes states of ligand pairs

Ligand pairs can be matched and a valid score can be expected if the respective location in this matrix is 0. Target ligands are in rows, model ligands in columns. States are encoded as integers <= 9. Larger numbers encode errors for child classes. Use something like self.state_decoding[3] to get a description.

Return type:: ndarray

property model_ligand_states¶

Encodes states of model ligands

Non-zero state in any of the model ligands invalidates the full respective column in state_matrix.

Return type:: ndarray

property target_ligand_states¶

Encodes states of target ligands

Non-zero state in any of the target ligands invalidates the full respective row in state_matrix.

Return type:: ndarray

property score_matrix¶

Get the matrix of scores.

Target ligands are in rows, model ligands in columns.

NaN values indicate that no value could be computed (i.e. different ligands). In other words: values are only valid if the respective location in state_matrix is 0.

Return type:: ndarray

property coverage_matrix¶

Get the matrix of model ligand atom coverage in the target.

Target ligands are in rows, model ligands in columns.

NaN values indicate that no value could be computed (i.e. different ligands). In other words: values are only valid if the respective location in state_matrix is 0. If substructure_match=False, only full match isomorphisms are considered, and therefore only values of 1.0 can be observed.

Return type:: ndarray

property aux_matrix¶

Get the matrix of scorer specific auxiliary data.

Target ligands are in rows, model ligands in columns.

Auxiliary data consists of arbitrary data dicts which allow a child class to provide additional information for a scored ligand pair. Empty dictionaries indicate that the child class simply didn’t return anything or that no value could be computed (e.g. different ligands). In other words: values are only valid if the respective location in state_matrix is 0.

Return type:: ndarray

property assignment¶

Ligand assignment based on computed scores

Implements a greedy algorithm to assign target and model ligands with each other. Starts from each valid ligand pair as indicated by a state of 0 in state_matrix. Each iteration first selects high coverage pairs. Given max_coverage defined as the highest coverage observed in the available pairs, all pairs with coverage in [max_coverage-coverage_delta, max_coverage] are selected. The best scoring pair among those is added to the assignment and the whole process is repeated until there are no ligands to assign anymore.

Return type:: list of tuple (trg_lig_idx, mdl_lig_idx)
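The greedy procedure described above can be sketched in plain Python. This is an illustration of the described steps under stated assumptions, not the library's implementation: nested lists stand in for the matrices, the function name is hypothetical, and the higher_is_better flag is an assumption needed because the preferred score direction depends on the scorer (lower for an RMSD, higher for an LDDT-based score).

```python
def greedy_assignment(states, scores, coverage,
                      coverage_delta=0.2, higher_is_better=False):
    """Return a list of (trg_lig_idx, mdl_lig_idx) tuples."""
    # Start from every valid pair, i.e. state 0.
    candidates = [(t, m)
                  for t, row in enumerate(states)
                  for m, state in enumerate(row) if state == 0]
    assignment = []
    while candidates:
        # Select the high coverage pairs first.
        max_cov = max(coverage[t][m] for t, m in candidates)
        window = [(t, m) for t, m in candidates
                  if coverage[t][m] >= max_cov - coverage_delta]
        # The best scoring pair in the window is added to the assignment.
        best = max if higher_is_better else min
        pick = best(window, key=lambda p: scores[p[0]][p[1]])
        assignment.append(pick)
        # Both ligands of the picked pair are no longer available.
        candidates = [(t, m) for t, m in candidates
                      if t != pick[0] and m != pick[1]]
    return assignment


# Made-up RMSD-like data: 2 target ligands x 3 model ligands.
# Model ligand 2 carries a non-zero state and is never assigned.
nan = float("nan")
states = [[0, 0, 4], [0, 0, 4]]
scores = [[0.5, 3.0, nan], [2.5, 1.0, nan]]
cov = [[1.0, 1.0, nan], [1.0, 1.0, nan]]

result = greedy_assignment(states, scores, cov)
print(result)
```

Indices that never show up in the result correspond to what unassigned_target_ligands and unassigned_model_ligands report.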

property score¶

Get a dictionary of score values, keyed by model ligand

Extract score with something like: scorer.score[lig.GetChain().GetName()][lig.GetNumber()]. The returned scores are based on assignment.

Return type:: dict
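The nested layout implied by the access pattern above (chain name first, then residue number) can be flattened as sketched here. The dictionary content is made up: the chain names and the integer residue numbers are stand-ins for whatever lig.GetChain().GetName() and lig.GetNumber() return for real ligands.

```python
def flatten_scores(score_dict):
    """Turn {chain_name: {resnum: score}} into a list of tuples."""
    rows = []
    for chain_name, per_residue in score_dict.items():
        for resnum, value in per_residue.items():
            rows.append((chain_name, resnum, value))
    return rows


# Made-up scores for two assigned model ligands.
example_score = {"L_1": {1: 0.83}, "L_2": {1: 0.41}}

for chain_name, resnum, value in flatten_scores(example_score):
    print(chain_name, resnum, value)
```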

property aux¶

Get a dictionary of score details, keyed by model ligand

Extract dict with something like: scorer.aux[lig.GetChain().GetName()][lig.GetNumber()]. The returned info dicts are based on assignment. The content is documented in the respective child class.

Return type:: dict

property unassigned_target_ligands¶

Get indices of target ligands which are not assigned

Return type:: list of int

property unassigned_model_ligands¶

Get indices of model ligands which are not assigned

Return type:: list of int

get_target_ligand_state_report(trg_lig_idx)¶

Get summary of states observed with respect to all model ligands

Mainly for debug purposes

Parameters:: trg_lig_idx (int) – Index of target ligand for which report should be generated

get_model_ligand_state_report(mdl_lig_idx)¶

Get summary of states observed with respect to all target ligands

Mainly for debug purposes

Parameters:: mdl_lig_idx (int) – Index of model ligand for which report should be generated

guess_target_ligand_unassigned_reason(trg_lig_idx)¶

Makes an educated guess why target ligand is not assigned

This either returns actual error states or custom states that are derived from them. Currently, the following reasons are reported:

no_ligand: there was no ligand in the model.
disconnected: the ligand graph was disconnected.
identity: the ligand was not found in the model (by graph isomorphism). Check your ligand connectivity.
no_iso: no full isomorphic match could be found. Try enabling substructure_match=True if the target ligand is incomplete.
symmetries: too many symmetries were found (by graph isomorphisms). Try to increase max_symmetries.
stoichiometry: there was a possible assignment in the model, but the model ligand was already assigned to a different target ligand. This indicates different stoichiometries.
no_contact (LDDT-PLI only): There were no LDDT contacts between the binding site and the ligand, and LDDT-PLI is undefined.
target_binding_site (SCRMSD only): no polymer residues were in proximity of the target ligand.
model_binding_site (SCRMSD only): the binding site was not found in the model. Either the binding site was not modeled or the model ligand was positioned too far in combination with full_bs_search=False.

Parameters:: trg_lig_idx (int) – Index of target ligand
Returns:: tuple with two elements: 1) keyword 2) human readable sentence describing the issue, (“unknown”,”unknown”) if nothing obvious can be found.
Raises:: RuntimeError if specified target ligand is assigned
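A typical debugging step is to tally the reported reasons over all unassigned target ligands. The sketch below relies only on unassigned_target_ligands and guess_target_ligand_unassigned_reason() as documented above. To stay runnable without OpenStructure, it is exercised on a hypothetical stub class with made-up reasons.

```python
from collections import Counter


def tally_unassigned_target_reasons(scorer):
    """Count reason keywords over all unassigned target ligands."""
    counts = Counter()
    for trg_lig_idx in scorer.unassigned_target_ligands:
        keyword, _sentence = \
            scorer.guess_target_ligand_unassigned_reason(trg_lig_idx)
        counts[keyword] += 1
    return counts


class StubScorer:
    """Hypothetical stand-in that mimics the two members used above."""
    unassigned_target_ligands = [0, 2, 3]
    _reasons = {0: ("no_iso", "made-up sentence"),
                2: ("stoichiometry", "made-up sentence"),
                3: ("no_iso", "made-up sentence")}

    def guess_target_ligand_unassigned_reason(self, trg_lig_idx):
        return self._reasons[trg_lig_idx]


print(tally_unassigned_target_reasons(StubScorer()))
```

Iterating over unassigned_target_ligands only avoids the RuntimeError that is raised for assigned ligands.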

guess_model_ligand_unassigned_reason(mdl_lig_idx)¶

Makes an educated guess why model ligand is not assigned

This either returns actual error states or custom states that are derived from them. Currently, the following reasons are reported:

no_ligand: there was no ligand in the target.
disconnected: the ligand graph is disconnected.
identity: the ligand was not found in the target (by graph or subgraph isomorphism). Check your ligand connectivity.
no_iso: no full isomorphic match could be found. Try enabling substructure_match=True if the target ligand is incomplete.
symmetries: too many symmetries were found (by graph isomorphisms). Try to increase max_symmetries.
stoichiometry: there was a possible assignment in the target, but the target ligand was already assigned to a different model ligand. This indicates different stoichiometries.
no_contact (LDDT-PLI only): There were no LDDT contacts between the binding site and the ligand, and LDDT-PLI is undefined.
target_binding_site (SCRMSD only): a potential assignment was found in the target, but there were no polymer residues in proximity of the ligand in the target.
model_binding_site (SCRMSD only): a potential assignment was found in the target, but no binding site was found in the model. Either the binding site was not modeled or the model ligand was positioned too far in combination with full_bs_search=False.

Parameters:: mdl_lig_idx (int) – Index of model ligand
Returns:: tuple with two elements: 1) keyword 2) human readable sentence describing the issue, (“unknown”,”unknown”) if nothing obvious can be found.
Raises:: RuntimeError if specified model ligand is assigned

ComputeSymmetries(model_ligand, target_ligand, substructure_match=False, by_atom_index=False, return_symmetries=True, max_symmetries=1000000.0, model_graph=None, target_graph=None)¶

Return a list of symmetries (isomorphisms) of the model onto the target residues.

Parameters:

model_ligand (ost.mol.ResidueHandle or ost.mol.ResidueView) – The model ligand
target_ligand (ost.mol.ResidueHandle or ost.mol.ResidueView) – The target ligand
substructure_match (bool) – Set this to True to allow partial ligands in the reference.
by_atom_index (bool) – Set this parameter to True if you need the symmetries to refer to atom index (within the residue). Otherwise, if False, the symmetries refer to atom names.
max_symmetries (int) – If more than that many isomorphisms exist, raise a TooManySymmetriesError. This can only be assessed by generating at least that many isomorphisms and can take some time.

Raises:

NoSymmetryError when no symmetry can be found; NoIsomorphicSymmetryError in case of isomorphic subgraph but substructure_match is False; TooManySymmetriesError when more than max_symmetries isomorphisms are found; DisconnectedGraphError if graph for model_ligand/target_ligand is disconnected.
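What a symmetry means here can be shown on a toy molecule. The following brute-force sketch enumerates full-match isomorphisms that preserve element names and connectivity. It is not how ComputeSymmetries is implemented: it tries every permutation and is therefore only usable for a handful of atoms, and the function name and input layout are assumptions made for the example.

```python
from itertools import permutations


def enumerate_symmetries(model_elements, model_bonds,
                         target_elements, target_bonds):
    """Return tuples where entry i is the target atom index for model atom i."""
    n = len(model_elements)
    if n != len(target_elements):
        return []
    target_edges = {frozenset(bond) for bond in target_bonds}
    symmetries = []
    for perm in permutations(range(n)):
        # Element names must match atom by atom.
        if any(model_elements[i] != target_elements[perm[i]] for i in range(n)):
            continue
        # Connectivity must be preserved (bond order is ignored).
        mapped = {frozenset((perm[a], perm[b])) for a, b in model_bonds}
        if mapped == target_edges:
            symmetries.append(perm)
    return symmetries


# CO2: the two oxygens are interchangeable, giving two symmetries.
elements = ["O", "C", "O"]
bonds = [(0, 1), (1, 2)]
co2_symmetries = enumerate_symmetries(elements, bonds, elements, bonds)
print(co2_symmetries)
```

The number of symmetries grows quickly for larger symmetric ligands, which is why max_symmetries exists.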

exception NoSymmetryError¶: Exception raised when no symmetry can be found.

exception NoIsomorphicSymmetryError¶

Exception raised when no isomorphic symmetry can be found

There would be isomorphic subgraphs for which symmetries can be found, but substructure_match is disabled

exception TooManySymmetriesError¶: Exception raised when too many symmetries are found.

exception DisconnectedGraphError¶: Exception raised when the ligand graph is disconnected.

class LDDTPLIScorer(model, target, model_ligands, target_ligands, resnum_alignments=False, rename_ligand_chain=False, substructure_match=False, coverage_delta=0.2, max_symmetries=10000.0, lddt_pli_radius=6.0, add_mdl_contacts=True, lddt_pli_thresholds=[0.5, 1.0, 2.0, 4.0], lddt_pli_binding_site_radius=None, min_pep_length=6, min_nuc_length=4, pep_seqid_thr=95.0, nuc_seqid_thr=95.0, mdl_map_pep_seqid_thr=0.0, mdl_map_nuc_seqid_thr=0.0, seqres=None, trg_seqres_mapping=None)¶

LigandScorer implementing LDDT-PLI.

LDDT-PLI is an LDDT score considering contacts between ligand and receptor, where the receptor consists of protein and nucleic acid chains that pass the criteria for chain mapping. This means ignoring other ligands, waters, short polymers as well as any incorrectly connected chains that may be in proximity.

LDDTPLIScorer computes a score for a specific pair of target/model ligands. Given a target/model ligand pair, all possible mappings of model chains onto their chemically equivalent target chains are enumerated. For each of these enumerations, all possible symmetries, i.e. atom-atom assignments of the ligand as given by LigandScorer, are evaluated and an LDDT-PLI score is computed. The best possible LDDT-PLI score is returned.

The LDDT-PLI score is a variant of LDDT with a custom inclusion radius (lddt_pli_radius) and no stereochemistry checks. By default, it penalizes contacts added in the model within lddt_pli_radius (this can be changed with the add_mdl_contacts flag), but only if the involved atoms can be mapped to the target. This is a requirement to 1) extract the respective reference distance from the target and 2) avoid usage of contacts for which we have no experimental evidence. One special case is contacts from chains that are not mapped to the target binding site. It is very well possible that we have experimental evidence for such a chain, though it's just too far away from the target binding site. We therefore try to map these contacts to the chain in the target with equivalent sequence that is closest to the target binding site. If the respective atoms can be mapped there, the contact is considered not fulfilled and added as penalty.
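The general idea of an LDDT-style contact score with penalized contacts can be sketched as follows. This is a simplified illustration, not the LDDTPLIScorer implementation: contacts are given directly as distance lists, the use of <= in the comparison is an assumption, and n_penalized_contacts is a hypothetical parameter standing in for added model contacts that count as not fulfilled.

```python
def lddt_like(ref_distances, mdl_distances,
              thresholds=(0.5, 1.0, 2.0, 4.0), n_penalized_contacts=0):
    """Fraction of conserved contacts, averaged over all thresholds."""
    n_contacts = len(ref_distances) + n_penalized_contacts
    if n_contacts == 0:
        return None  # no contacts: the score is undefined
    conserved = 0
    for threshold in thresholds:
        conserved += sum(1 for ref, mdl in zip(ref_distances, mdl_distances)
                         if abs(ref - mdl) <= threshold)
    return conserved / (len(thresholds) * n_contacts)


# Two made-up contacts with distance differences of 0.2 and 1.5.
ref = [3.0, 4.0]
mdl = [3.2, 5.5]
print(lddt_like(ref, mdl))
print(lddt_like(ref, mdl, n_penalized_contacts=1))
```

The first contact is conserved at all four thresholds and the second at two of them, so 6 of 8 checks pass. One penalized contact adds four failed checks to the denominator and lowers the score.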

Populates LigandScorer.aux_data with following dict keys:

lddt_pli: The LDDT-PLI score
lddt_pli_n_contacts: Number of contacts considered in LDDT computation
target_ligand: The actual target ligand for which the score was computed
model_ligand: The actual model ligand for which the score was computed
chain_mapping: dict with a chain mapping of chains involved in binding site - key: trg chain name, value: mdl chain name

Parameters:

model (ost.mol.EntityHandle/ost.mol.EntityView) – Passed to parent constructor - see LigandScorer.
target (ost.mol.EntityHandle/ost.mol.EntityView) – Passed to parent constructor - see LigandScorer.
model_ligands (list) – Passed to parent constructor - see LigandScorer.
target_ligands (list) – Passed to parent constructor - see LigandScorer.
resnum_alignments (bool) – Passed to parent constructor - see LigandScorer.
rename_ligand_chain (bool) – Passed to parent constructor - see LigandScorer.
substructure_match (bool) – Passed to parent constructor - see LigandScorer.
coverage_delta (float) – Passed to parent constructor - see LigandScorer.
max_symmetries (int) – Passed to parent constructor - see LigandScorer.
lddt_pli_radius (float) – LDDT inclusion radius for LDDT-PLI.
add_mdl_contacts (bool) – Whether to penalize added model contacts.
lddt_pli_thresholds (list of float) – Distance difference thresholds for LDDT.
lddt_pli_binding_site_radius (float) – Pro param - don't use. Providing a value restores the behaviour of the previous implementation, which first extracted a binding site with a strict distance threshold and computed LDDT-PLI only on those target residues, whereas the current implementation includes every atom within lddt_pli_radius.
min_pep_length (int) – See ost.mol.alg.ligand_scoring_base.LigandScorer.
min_nuc_length (int) – See ost.mol.alg.ligand_scoring_base.LigandScorer
pep_seqid_thr (float) – See ost.mol.alg.ligand_scoring_base.LigandScorer
nuc_seqid_thr (float) – See ost.mol.alg.ligand_scoring_base.LigandScorer
mdl_map_pep_seqid_thr (float) – See ost.mol.alg.ligand_scoring_base.LigandScorer
mdl_map_nuc_seqid_thr (float) – See ost.mol.alg.ligand_scoring_base.LigandScorer

class SCRMSDScorer(model, target, model_ligands, target_ligands, resnum_alignments=False, rename_ligand_chain=False, substructure_match=False, coverage_delta=0.2, max_symmetries=100000.0, bs_radius=4.0, lddt_lp_radius=15.0, model_bs_radius=25, binding_sites_topn=100000, full_bs_search=False, min_pep_length=6, min_nuc_length=4, pep_seqid_thr=95.0, nuc_seqid_thr=95.0, mdl_map_pep_seqid_thr=0.0, mdl_map_nuc_seqid_thr=0.0, seqres=None, trg_seqres_mapping=None)¶

LigandScorer implementing symmetry corrected RMSD (BiSyRMSD).

SCRMSDScorer computes a score for a specific pair of target/model ligands.

The returned RMSD is based on a binding site superposition. The binding site of the target structure is defined as all residues with at least one atom within bs_radius around the target ligand. It only contains protein and nucleic acid residues from chains that pass the criteria for the chain mapping. This means ignoring other ligands, waters, short polymers as well as any incorrectly connected chains that may be in proximity. The respective model binding site for superposition is identified by naively enumerating all possible mappings of model chains onto their chemically equivalent target counterparts from the target binding site. The binding_sites_topn with respect to lDDT score are evaluated and an RMSD is computed. You can either try to map ALL model chains onto the target binding site by enabling full_bs_search or restrict the model chains for a specific target/model ligand pair to the chains with at least one atom within model_bs_radius around the model ligand. The latter can be significantly faster in case of large complexes. Symmetry correction is achieved by simply computing an RMSD value for each symmetry, i.e. atom-atom assignments of the ligand as given by LigandScorer. The lowest RMSD value is returned.
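The symmetry correction step on its own can be sketched in plain Python. The example assumes that the model coordinates have already been transformed by the binding site superposition, and it represents each symmetry as a list of (model atom index, target atom index) pairs; both the function names and this input layout are assumptions for illustration.

```python
import math


def rmsd(model_pos, target_pos, symmetry):
    """RMSD over the atom pairs given by one symmetry."""
    squared = sum(math.dist(model_pos[m], target_pos[t]) ** 2
                  for m, t in symmetry)
    return math.sqrt(squared / len(symmetry))


def symmetry_corrected_rmsd(model_pos, target_pos, symmetries):
    """Lowest RMSD over all symmetries."""
    return min(rmsd(model_pos, target_pos, s) for s in symmetries)


# Made-up linear three-atom ligand whose end atoms are interchangeable.
# Model and target overlap perfectly, but their atoms are listed in
# opposite order.
model_pos = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
target_pos = [(2.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
identity = [(0, 0), (1, 1), (2, 2)]
swapped = [(0, 2), (1, 1), (2, 0)]

print(rmsd(model_pos, target_pos, identity))
print(symmetry_corrected_rmsd(model_pos, target_pos, [identity, swapped]))
```

The identity mapping alone reports a large deviation although the pose is perfect. Taking the minimum over both symmetries gives the RMSD of 0 that the pose deserves.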

Populates LigandScorer.aux_data with following dict keys:

rmsd: The BiSyRMSD score
lddt_lp: lDDT of the binding pocket used for superposition (lDDT-LP)
bs_ref_res: list of binding site residues in target
bs_ref_res_mapped: list of target binding site residues that are mapped to model
bs_mdl_res_mapped: list of same length with respective model residues
bb_rmsd: Backbone RMSD (CA, C3’ for nucleotides; full backbone for binding sites with fewer than 3 residues) for mapped binding site residues after superposition
target_ligand: The actual target ligand for which the score was computed
model_ligand: The actual model ligand for which the score was computed
chain_mapping: dict with a chain mapping of chains involved in binding site - key: trg chain name, value: mdl chain name
transform: geom.Mat4 to transform model binding site onto target binding site
inconsistent_residues: list of tuple representing residues with inconsistent residue names upon mapping (which is given by bs_ref_res_mapped and bs_mdl_res_mapped). Tuples have two elements: 1) trg residue 2) mdl residue

Parameters:

model (ost.mol.EntityHandle/ost.mol.EntityView) – Passed to parent constructor - see LigandScorer.
target (ost.mol.EntityHandle/ost.mol.EntityView) – Passed to parent constructor - see LigandScorer.
model_ligands (list) – Passed to parent constructor - see LigandScorer.
target_ligands (list) – Passed to parent constructor - see LigandScorer.
resnum_alignments (bool) – Passed to parent constructor - see LigandScorer.
rename_ligand_chain (bool) – Passed to parent constructor - see LigandScorer.
substructure_match (bool) – Passed to parent constructor - see LigandScorer.
coverage_delta (float) – Passed to parent constructor - see LigandScorer.
max_symmetries (int) – Passed to parent constructor - see LigandScorer.
bs_radius (float) – Inclusion radius for the binding site. Residues with atoms within this distance of the ligand will be considered for inclusion in the binding site.
lddt_lp_radius (float) – lDDT inclusion radius for lDDT-LP.
model_bs_radius (float) – inclusion radius for model binding sites. Only used when full_bs_search=False, otherwise the radius is effectively infinite. Only chains with atoms within this distance of a model ligand will be considered in the chain mapping.
binding_sites_topn (int) – maximum number of model binding site representations to assess per target binding site.
full_bs_search (bool) – If True, all potential binding sites in the model are searched for each target binding site. If False, the search space in the model is reduced to chains around (model_bs_radius Å) model ligands. This speeds up computations, but may result in ligands not being scored if the predicted ligand pose is too far from the actual binding site.
min_pep_length (int) – See ost.mol.alg.ligand_scoring_base.LigandScorer.
min_nuc_length (int) – See ost.mol.alg.ligand_scoring_base.LigandScorer
pep_seqid_thr (float) – See ost.mol.alg.ligand_scoring_base.LigandScorer
nuc_seqid_thr (float) – See ost.mol.alg.ligand_scoring_base.LigandScorer
mdl_map_pep_seqid_thr (float) – See ost.mol.alg.ligand_scoring_base.LigandScorer
mdl_map_nuc_seqid_thr (float) – See ost.mol.alg.ligand_scoring_base.LigandScorer
seqres (ost.seq.SequenceList) – See ost.mol.alg.ligand_scoring_base.LigandScorer
trg_seqres_mapping (dict) – See ost.mol.alg.ligand_scoring_base.LigandScorer

SCRMSD(model_ligand, target_ligand, transformation=geom.Mat4(1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1), substructure_match=False, max_symmetries=1000000.0)¶

Calculate symmetry-corrected RMSD.

Binding site superposition must be computed separately and passed as transformation.

Parameters:

model_ligand (ost.mol.ResidueHandle or ost.mol.ResidueView) – The model ligand
target_ligand (ost.mol.ResidueHandle or ost.mol.ResidueView) – The target ligand
transformation (ost.geom.Mat4) – Optional transformation to apply on each atom position of model_ligand.
substructure_match (bool) – Set this to True to allow partial target ligand.
max_symmetries (int) – If more than that many isomorphisms exist, raise a TooManySymmetriesError. This can only be assessed by generating at least that many isomorphisms and can take some time.

Return type:

float

Raises:

ost.mol.alg.ligand_scoring_base.NoSymmetryError when no symmetry can be found, ost.mol.alg.ligand_scoring_base.DisconnectedGraphError when ligand graph is disconnected, ost.mol.alg.ligand_scoring_base.TooManySymmetriesError when more than max_symmetries isomorphisms are found.
