scoring – Specialized Scoring Functions

class lDDTBSScorer(reference, model, residue_number_alignment=False)

Scorer specific for a reference/model pair

Finds best possible binding site representation of reference in model given lDDT score. Uses ost.mol.alg.chain_mapping.ChainMapper to deal with chain mapping.

Parameters:
ScoreBS(ligand, radius=4.0, lddt_radius=10.0)

Computes binding site lDDT score given ligand. Best possible binding site representation is selected by lDDT but other scores such as CA based RMSD and GDT are computed too and returned.

Parameters:
  • ligand (r'((Residue)|(Chain)|(Entity))((View)|(Handle))') – Defines the scored binding site, i.e. provides positions to perform proximity search

  • radius (float) – Reference residues with any atom position within radius of ligand constitute the scored binding site

  • lddt_radius (float) – Passed as inclusion_radius to ost.mol.alg.lddt.lDDTScorer

Returns:

Object of type ost.mol.alg.chain_mapping.ReprResult containing all atom lDDT score and mapping information. None if no representation could be found.
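The proximity search behind the radius parameter can be illustrated without OpenStructure. The following is only a sketch: the residue labels, the plain coordinate tuples and the helper name binding_site_residues are hypothetical, and treating "within radius" as an inclusive comparison is an assumption.

```python
import math

def binding_site_residues(residues, ligand_positions, radius=4.0):
    """Return labels of residues with any atom within `radius` of the ligand.

    `residues` maps a (hypothetical) residue label to a list of (x, y, z)
    atom positions; `ligand_positions` is a list of (x, y, z) tuples.
    """
    selected = []
    for label, atoms in residues.items():
        # A single atom close enough to any ligand position is sufficient.
        if any(math.dist(atom, lig) <= radius
               for atom in atoms for lig in ligand_positions):
            selected.append(label)
    return selected

# Toy data: two residues are close to the ligand, one is not.
residues = {
    "A.10": [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)],
    "A.11": [(9.0, 0.0, 0.0)],
    "A.12": [(3.0, 3.0, 0.0)],
}
ligand = [(4.0, 0.0, 0.0)]
site = binding_site_residues(residues, ligand)
```

In this toy setup, A.10 and A.12 end up in the binding site, while A.11 lies 5A away from the ligand and is excluded.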

class Scorer(model, target, resnum_alignments=False, molck_settings=None, cad_score_exec=None, custom_mapping=None, custom_rigid_mapping=None, usalign_exec=None, lddt_no_stereochecks=False, n_max_naive=40320, oum=False, min_pep_length=6, min_nuc_length=4, lddt_add_mdl_contacts=False, dockq_capri_peptide=False)

Helper class to access the various scores available from ost.mol.alg

Deals with structure cleanup, chain mapping, interface identification etc. Intermediate results are available as attributes.

Parameters:
  • model (ost.mol.EntityHandle/ost.mol.EntityView) – Model structure - a deep copy is available as model. Additionally, ost.mol.alg.Molck() using molck_settings is applied.

  • target (ost.mol.EntityHandle/ost.mol.EntityView) – Target structure - a deep copy is available as target. Additionally, ost.mol.alg.Molck() using molck_settings is applied.

  • resnum_alignments (bool) – Whether alignments between chemically equivalent chains in model and target can be computed based on residue numbers. This can be assumed in benchmarking setups such as CAMEO/CASP.

  • molck_settings (ost.mol.alg.MolckSettings) – Settings used for Molck on model and target, if set to None, a default object is constructed by setting everything except rm_zero_occ_atoms and colored to True in ost.mol.alg.MolckSettings constructor.

  • cad_score_exec (str) – Explicit path to voronota-cadscore executable from voronota installation from https://github.com/kliment-olechnovic/voronota. If not given, voronota-cadscore must be in PATH if any of the CAD score related attributes is requested.

  • custom_mapping (dict) – Provide custom chain mapping between model and target. Dictionary with target chain names as key and model chain names as value. mapping is constructed from this.

  • custom_rigid_mapping (dict) – Provide custom chain mapping between model and target. Dictionary with target chain names as key and model chain names as value. rigid_mapping is constructed from this.

  • usalign_exec (str) – Explicit path to USalign executable used to compute TM-score. If not given, TM-score will be computed with OpenStructure internal copy of USalign code.

  • lddt_no_stereochecks (bool) – Whether to compute lDDT without stereochemistry checks

  • n_max_naive (int) – Parameter for chain mapping. If the number of possible mappings is <= n_max_naive, the full mapping solution space is enumerated to find the optimum. A heuristic is used otherwise. The default of 40320 corresponds to an octamer (8! = 40320). A structure with stoichiometry A6B2 would be 6!*2! = 1440 etc.

  • oum (bool) – Override USalign Mapping. Inject rigid_mapping of Scorer object into USalign to compute TM-score. Experimental feature with limitations.

  • min_pep_length (int) – Relevant parameter if short peptides are involved in scoring. Minimum peptide length for a chain in the target structure to be considered in chain mapping. The chain mapping algorithm first performs an all vs. all pairwise sequence alignment to identify “equal” chains within the target structure. We go for simple sequence identity there. Short sequences can be problematic as they may produce high sequence identity alignments by pure chance.

  • min_nuc_length (int) – Relevant parameter if short nucleotides are involved in scoring. Minimum nucleotide length for a chain in the target structure to be considered in chain mapping. The chain mapping algorithm first performs an all vs. all pairwise sequence alignment to identify “equal” chains within the target structure. We go for simple sequence identity there. Short sequences can be problematic as they may produce high sequence identity alignments by pure chance.

  • lddt_add_mdl_contacts (bool) – lDDT specific flag. Only using contacts in lDDT that are within a certain distance threshold in the target does not penalize for added model contacts. If set to True, this flag will also consider target contacts that are within the specified distance threshold in the model but not necessarily in the target. No contact will be added if the respective atom pair is not resolved in the target.

  • dockq_capri_peptide (bool) – Flag that changes two things in the way DockQ and its underlying scores are computed which is proposed by the CAPRI community when scoring peptides (PMID: 31886916). ONE: Two residues are considered in contact if any of their atoms is within 5A. This is relevant for fnat and fnonat scores. CAPRI suggests to lower this threshold to 4A for protein-peptide interactions. TWO: irmsd is computed on interface residues. A residue is defined as interface residue if any of its atoms is within 10A of another chain. CAPRI suggests to lower the default of 10A to 8A in combination with only considering CB atoms for protein-peptide interactions. This flag has no influence on patch_dockq scores.
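The arithmetic behind n_max_naive can be reproduced in a few lines. This sketch assumes that model and target contain the same number of chains in every group of equivalent chains; the helper name n_possible_mappings is hypothetical and not part of OpenStructure.

```python
from math import factorial, prod

def n_possible_mappings(group_sizes):
    """Number of chain mappings for groups of chemically equivalent chains.

    Chains can only be permuted within their own group, so the total is the
    product of the factorials of the group sizes.
    """
    return prod(factorial(n) for n in group_sizes)

n_max_naive = 40320  # default from the Scorer signature

octamer = n_possible_mappings([8])     # homo-octamer: 8!
a6b2 = n_possible_mappings([6, 2])     # stoichiometry A6B2: 6! * 2!

# Full enumeration is only used if the solution space is small enough.
enumerate_octamer = octamer <= n_max_naive
enumerate_nonamer = n_possible_mappings([9]) <= n_max_naive
```

With the default setting, a homo-octamer is still enumerated exhaustively, while a homo-nonamer (9! = 362880) falls back to the heuristic.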

property model

Model with Molck cleanup

Type:

ost.mol.EntityHandle

property model_orig

The original model passed at object construction

Type:

ost.mol.EntityHandle/ost.mol.EntityView

property target

Target with Molck cleanup

Type:

ost.mol.EntityHandle

property target_orig

The original target passed at object construction

Type:

ost.mol.EntityHandle/ost.mol.EntityView

property aln

Alignments of model/target chains

Alignments for each pair of chains mapped in mapping. First sequence is target sequence, second sequence the model sequence.

Type:

list of ost.seq.AlignmentHandle

property stereochecked_aln

Stereochecked equivalent of aln

The alignments may differ, as stereochecks potentially remove residues

Type:

list of ost.seq.AlignmentHandle

property pepnuc_aln

Alignments of model_orig/target_orig chains

Selects for peptide and nucleotide residues before sequence extraction. Includes residues that would be removed by molck in structure preprocessing.

Type:

list of ost.seq.AlignmentHandle

property stereochecked_model

View of model that has stereochemistry checks applied

First, a selection for peptide/nucleotide residues is performed, secondly peptide sidechains with stereochemical irregularities are removed (full residue if backbone atoms are involved). Irregularities are clashes or bond lengths/angles more than 12 standard deviations from expected values.

Type:

ost.mol.EntityView

property model_clashes

Clashing model atoms

Type:

list of ost.mol.alg.stereochemistry.ClashInfo

property model_bad_bonds

Model bonds with unexpected stereochemistry

Type:

list of ost.mol.alg.stereochemistry.BondViolationInfo

property model_bad_angles

Model angles with unexpected stereochemistry

Type:

list of ost.mol.alg.stereochemistry.AngleViolationInfo

property stereochecked_target

Same as stereochecked_model for target

Type:

ost.mol.EntityView

property target_clashes

Clashing target atoms

Type:

list of ost.mol.alg.stereochemistry.ClashInfo

property target_bad_bonds

Target bonds with unexpected stereochemistry

Type:

list of ost.mol.alg.stereochemistry.BondViolationInfo

property target_bad_angles

Target angles with unexpected stereochemistry

Type:

list of ost.mol.alg.stereochemistry.AngleViolationInfo

property chain_mapper

Chain mapper object for given target

Type:

ost.mol.alg.chain_mapping.ChainMapper

property mapping

Full chain mapping result for target/model

Computed with ost.mol.alg.ChainMapper.GetMapping()

Type:

ost.mol.alg.chain_mapping.MappingResult

property rigid_mapping

Full chain mapping result for target/model

Computed with ost.mol.alg.ChainMapper.GetRMSDMapping()

Type:

ost.mol.alg.chain_mapping.MappingResult

property model_interface_residues

Interface residues in model

These are all residues having a contact with at least one residue from another chain (CB-CB distance <= 8A, CA in case of Glycine)

Type:

dict with chain names as keys and lists with residue numbers of the respective interface residues as values.
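The interface definition above can be sketched in plain Python. The input layout (one representative position per residue, i.e. CB or CA for glycine) and the helper name interface_residues are assumptions made for illustration; the result mimics the documented dict of chain names to residue numbers.

```python
import math

def interface_residues(chains, cutoff=8.0):
    """Find residues within `cutoff` of any residue from another chain.

    `chains` maps chain name -> {residue number: (x, y, z)}, where the
    position is the representative atom (CB, or CA for glycine).
    """
    result = {name: [] for name in chains}
    for name, residues in chains.items():
        # Representative positions of all residues in all other chains.
        others = [pos for other, res in chains.items() if other != name
                  for pos in res.values()]
        for rnum, pos in residues.items():
            if any(math.dist(pos, o) <= cutoff for o in others):
                result[name].append(rnum)
    return result

chains = {
    "A": {1: (0.0, 0.0, 0.0), 2: (20.0, 0.0, 0.0)},
    "B": {1: (5.0, 0.0, 0.0)},
}
interface = interface_residues(chains)
```

Here residue 1 of chain A and residue 1 of chain B are 5A apart and therefore form an interface, whereas residue 2 of chain A is too far from chain B.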

property target_interface_residues

Same as model_interface_residues for target

Type:

dict with chain names as keys and lists with residue numbers of the respective interface residues as values.

property lddt_scorer

lDDT scorer for stereochecked_target (default parameters)

Type:

ost.mol.alg.lddt.lDDTScorer

property bb_lddt_scorer

Backbone only lDDT scorer for target

No stereochecks applied for bb only lDDT which considers CA atoms for peptides and C3’ atoms for nucleotides.

Type:

ost.mol.alg.lddt.lDDTScorer

property qs_scorer

QS scorer constructed from mapping

The scorer object is constructed with default parameters and relates to model and target (no stereochecks).

Type:

ost.mol.alg.qsscore.QSScorer

property lddt

Global lDDT score in range [0.0, 1.0]

Computed based on stereochecked_model. In case of oligomers, mapping is used.

Type:

float

property local_lddt

Per residue lDDT scores in range [0.0, 1.0]

Computed based on stereochecked_model but scores for all residues in model are reported. If a residue has been removed by stereochemistry checks, the respective score is set to 0.0. If a residue is not covered by the target or is in a chain skipped by the chain mapping procedure (happens for super short chains), the respective score is set to None. In case of oligomers, mapping is used.

Type:

dict

property bb_lddt

Backbone only global lDDT score in range [0.0, 1.0]

Computed based on model on backbone atoms only. This is CA for peptides and C3’ for nucleotides. No stereochecks are performed. In case of oligomers, mapping is used.

Type:

float

property bb_local_lddt

Backbone only per residue lDDT scores in range [0.0, 1.0]

Computed based on model on backbone atoms only. This is CA for peptides and C3’ for nucleotides. No stereochecks are performed. If a residue is not covered by the target or is in a chain skipped by the chain mapping procedure (happens for super short chains), the respective score is set to None. In case of oligomers, mapping is used.

Type:

dict

property ilddt

Global interface lDDT score in range [0.0, 1.0]

This is lDDT only based on inter-chain contacts. Value is None if no such contacts are present, for example when dealing with a monomer. Computed based on stereochecked_model, using mapping for chain mapping.

Type:

float

property qs_global

Global QS-score

Computed based on model using mapping

Type:

float

property qs_best

Global QS-score - only computed on aligned residues

Computed based on model using mapping. The QS-score computation only considers contacts between residues with a mapping between target and model. As a result, the score won’t be lowered in case of additional chains/residues in any of the structures.

Type:

float

property qs_target_interfaces

Interfaces in target with non-zero contribution to qs_global/qs_best

Chain names are lexicographically sorted.

Type:

list of tuple with 2 elements each: (trg_ch1, trg_ch2)

property qs_model_interfaces

Interfaces in model with non-zero contribution to qs_global/qs_best

Chain names are lexicographically sorted.

Type:

list of tuple with 2 elements each: (mdl_ch1, mdl_ch2)

property qs_interfaces

Interfaces in qs_target_interfaces that can be mapped to model.

Target chain names are lexicographically sorted.

Type:

list of tuple with 4 elements each: (trg_ch1, trg_ch2, mdl_ch1, mdl_ch2)

property per_interface_qs_global

QS-score for each interface in qs_interfaces

Type:

list of float

property per_interface_qs_best

QS-score for each interface in qs_interfaces

Only computed on aligned residues

Type:

list of float

property native_contacts

Native contacts

A contact is a pair of residues from distinct chains that have a minimal heavy atom distance < 5A. Contacts are specified as tuple with two strings in format: <cname>.<rnum>.<ins_code>

Type:

list of tuple
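For downstream processing it can be handy to split the contact strings back into their components. The helpers below are hypothetical and only a sketch; they assume that a residue without insertion code leaves the last field empty, which should be verified against actual output.

```python
def format_contact_residue(cname, rnum, ins_code=""):
    """Build a '<cname>.<rnum>.<ins_code>' string."""
    return f"{cname}.{rnum}.{ins_code}"

def parse_contact_residue(label):
    """Split '<cname>.<rnum>.<ins_code>' into (cname, rnum, ins_code).

    Splitting from the right keeps chain names intact even if they
    contain a dot themselves.
    """
    cname, rnum, ins_code = label.rsplit(".", 2)
    return cname, int(rnum), ins_code

# A contact is a tuple of two such strings.
contact = (format_contact_residue("A", 12), format_contact_residue("B", 100, "A"))
parsed = tuple(parse_contact_residue(label) for label in contact)
```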

property model_contacts

Same as native_contacts but for model

property contact_target_interfaces

Interfaces in target which have at least one contact

Contact as defined in native_contacts, chain names are lexicographically sorted.

Type:

list of tuple with 2 elements each (trg_ch1, trg_ch2)

property contact_model_interfaces

Interfaces in model which have at least one contact

Contact as defined in native_contacts, chain names are lexicographically sorted.

Type:

list of tuple with 2 elements each (mdl_ch1, mdl_ch2)

property ics_precision

Fraction of model contacts that are also present in target

Type:

float

property ics_recall

Fraction of target contacts that are correctly reproduced in model

Type:

float

property ics

ICS (Interface Contact Similarity) score

Combination of ics_precision and ics_recall using the F1-measure

Type:

float
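The relation between ics_precision, ics_recall and ics can be sketched on plain sets. This example assumes that model contacts have already been translated into target chain naming via the chain mapping; the helper name ics_scores is hypothetical, and returning 0.0 for empty inputs is a choice made here, not documented behaviour.

```python
def ics_scores(target_contacts, model_contacts):
    """Return (precision, recall, F1) for two collections of contacts."""
    target = set(target_contacts)
    model = set(model_contacts)
    shared = target & model
    precision = len(shared) / len(model) if model else 0.0
    recall = len(shared) / len(target) if target else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    # F1-measure: harmonic mean of precision and recall.
    f1 = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f1

target_contacts = [("A.1.", "B.5."), ("A.2.", "B.6."),
                   ("A.3.", "B.7."), ("A.4.", "B.8.")]
model_contacts = [("A.1.", "B.5."), ("A.2.", "B.6."), ("A.9.", "B.9.")]
precision, recall, f1 = ics_scores(target_contacts, model_contacts)
```

Two of the three model contacts are native (precision 2/3), and two of the four native contacts are reproduced (recall 1/2), giving an F1 of 4/7.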

property per_interface_ics_precision

Per-interface ICS precision

ics_precision for each interface in contact_target_interfaces

Type:

list of float

property per_interface_ics_recall

Per-interface ICS recall

ics_recall for each interface in contact_target_interfaces

Type:

list of float

property per_interface_ics

Per-interface ICS (Interface Contact Similarity) score

ics for each interface in contact_target_interfaces

Type:

list of float

property ips_precision

Fraction of model interface residues that are also interface residues in target

Type:

float

property ips_recall

Fraction of target interface residues that are also interface residues in model

Type:

float

property ips

IPS (Interface Patch Similarity) score

Jaccard coefficient of interface residues in target and their mapped counterparts in model

Type:

float
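The Jaccard coefficient used for ips is the size of the intersection divided by the size of the union. A minimal sketch, assuming model interface residues are already expressed in target chain naming and residue numbering (the labels and the helper name jaccard are hypothetical):

```python
def jaccard(a, b):
    """Jaccard coefficient |a & b| / |a | b| of two collections."""
    a, b = set(a), set(b)
    union = a | b
    # Returning 0.0 for two empty sets is a choice made for this sketch.
    return len(a & b) / len(union) if union else 0.0

target_interface = {"A.1", "A.2", "A.3", "B.7"}
model_interface = {"A.2", "A.3", "B.7", "B.8"}
ips_value = jaccard(target_interface, model_interface)
```

Three residues are shared and five residues occur in total, so the coefficient is 3/5.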

property per_interface_ips_precision

Per-interface IPS precision

ips_precision for each interface in contact_target_interfaces

Type:

list of float

property per_interface_ips_recall

Per-interface IPS recall

ips_recall for each interface in contact_target_interfaces

Type:

list of float

property per_interface_ips

Per-interface IPS (Interface Patch Similarity) score

ips for each interface in contact_target_interfaces

Type:

list of float

property dockq_target_interfaces

Interfaces in target that are relevant for DockQ

All interfaces in target with non-zero contacts that are relevant for DockQ. Chain names are lexicographically sorted.

Type:

list of tuple with 2 elements each: (trg_ch1, trg_ch2)

property dockq_interfaces

Interfaces in dockq_target_interfaces that can be mapped to model

Target chain names are lexicographically sorted

Type:

list of tuple with 4 elements each: (trg_ch1, trg_ch2, mdl_ch1, mdl_ch2)

property dockq_scores

DockQ scores for interfaces in dockq_interfaces

Type:

list of float

property fnat

fnat scores for interfaces in dockq_interfaces

fnat: Fraction of native contacts that are also present in model

Type:

list of float

property nnat

N native contacts for interfaces in dockq_interfaces

Type:

list of int

property nmdl

N model contacts for interfaces in dockq_interfaces

Type:

list of int

property fnonnat

fnonnat scores for interfaces in dockq_interfaces

fnonnat: Fraction of model contacts that are not present in target

Type:

list of float

property irmsd

irmsd scores for interfaces in dockq_interfaces

irmsd: RMSD of interface (RMSD computed on N, CA, C, O atoms) which consists of each residue that has at least one heavy atom within 10A of other chain.

Type:

list of float

property lrmsd

lrmsd scores for interfaces in dockq_interfaces

lrmsd: The interfaces are superposed based on the receptor (rigid min RMSD superposition) and RMSD for the ligand is reported. Superposition and RMSD are based on N, CA, C and O positions, receptor is the chain contributing to the interface with more residues in total.

Type:

list of float
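For orientation, fnat, irmsd and lrmsd are combined into a single DockQ value per interface. The sketch below uses the formula and scaling constants (1.5A for irmsd, 8.5A for lrmsd) from the original DockQ publication; it is not taken from the OpenStructure source, and whether the same constants apply when dockq_capri_peptide is set is not covered here.

```python
def dockq(fnat, irmsd, lrmsd, d1=1.5, d2=8.5):
    """Combine fnat, irmsd and lrmsd into a DockQ value in range [0.0, 1.0]."""
    def scaled(rms, d):
        # Maps an RMSD in [0, inf) to (0, 1]; 1.0 means perfect agreement.
        return 1.0 / (1.0 + (rms / d) ** 2)
    return (fnat + scaled(irmsd, d1) + scaled(lrmsd, d2)) / 3.0

perfect = dockq(fnat=1.0, irmsd=0.0, lrmsd=0.0)
halfway = dockq(fnat=0.5, irmsd=1.5, lrmsd=8.5)
```

An RMSD equal to its scaling constant contributes exactly 0.5, which is why the second example evaluates to 0.5 overall.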

property dockq_ave

Average of DockQ scores in dockq_scores

In its original implementation, DockQ only operates on single interfaces, hence the need to combine scores for higher order oligomers.

Type:

float

property dockq_wave

Same as dockq_ave, weighted by native contacts

Type:

float

property dockq_ave_full

Same as dockq_ave but penalizing for missing interfaces

Interfaces that are not covered in model are added as 0.0 in average computation.

Type:

float

property dockq_wave_full

Same as dockq_ave_full, but weighted

Interfaces that are not covered in model are added as 0.0 in average computations and the respective weights are derived from number of contacts in respective target interface.
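The four DockQ averages differ only in weighting and in how missing interfaces are treated. The following sketch mirrors the descriptions above on plain lists; the function and argument names are hypothetical, and returning 0.0 when nothing can be averaged is an assumption.

```python
def _weighted_mean(values, weights):
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(v * w for v, w in zip(values, weights)) / total

def dockq_averages(scores, nnat, nnat_missing=()):
    """Return (ave, wave, ave_full, wave_full).

    scores, nnat: DockQ scores and native contact counts of the mapped
                  interfaces (parallel lists).
    nnat_missing: native contact counts of target interfaces that are
                  not covered in the model.
    """
    scores, nnat, nnat_missing = list(scores), list(nnat), list(nnat_missing)
    # Missing interfaces enter the "full" variants with a score of 0.0.
    full_scores = scores + [0.0] * len(nnat_missing)
    full_weights = nnat + nnat_missing
    ave = _weighted_mean(scores, [1] * len(scores))
    wave = _weighted_mean(scores, nnat)
    ave_full = _weighted_mean(full_scores, [1] * len(full_scores))
    wave_full = _weighted_mean(full_scores, full_weights)
    return ave, wave, ave_full, wave_full

ave, wave, ave_full, wave_full = dockq_averages(
    scores=[0.8, 0.4], nnat=[30, 10], nnat_missing=[10])
```

With one missing interface of 10 native contacts, the plain average drops from 0.6 to 0.4 and the weighted average from 0.7 to 0.56.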

property mapped_target_pos

Mapped representative positions in target

These are CA positions for peptide residues and C3’ positions for nucleotides. Has same length as mapped_model_pos and mapping is based on mapping.

Type:

ost.geom.Vec3List

property mapped_model_pos

Mapped representative positions in model

These are CA positions for peptide residues and C3’ positions for nucleotides. Has same length as mapped_target_pos and mapping is based on mapping.

Type:

ost.geom.Vec3List

property transformed_mapped_model_pos

mapped_model_pos with transform applied

Type:

ost.geom.Vec3List

property n_target_not_mapped

Number of target residues which have no mapping to model

Type:

int

property transform

Transform: mapped_model_pos onto mapped_target_pos

Computed using Kabsch minimal rmsd algorithm

Type:

ost.geom.Mat4

property rigid_mapped_target_pos

Mapped representative positions in target

These are CA positions for peptide residues and C3’ positions for nucleotides. Has same length as rigid_mapped_model_pos and mapping is based on rigid_mapping.

Type:

ost.geom.Vec3List

property rigid_mapped_model_pos

Mapped representative positions in model

These are CA positions for peptide residues and C3’ positions for nucleotides. Has same length as rigid_mapped_target_pos and mapping is based on rigid_mapping.

Type:

ost.geom.Vec3List

property rigid_transformed_mapped_model_pos

rigid_mapped_model_pos with rigid_transform applied

Type:

ost.geom.Vec3List

property rigid_n_target_not_mapped

Number of target residues which have no rigid mapping to model

Type:

int

property rigid_transform

Transform: rigid_mapped_model_pos onto rigid_mapped_target_pos

Computed using Kabsch minimal rmsd algorithm

Type:

ost.geom.Mat4

property gdt_05

Fraction CA (C3’ for nucleotides) that can be superposed within 0.5A

Uses rigid_mapped_model_pos and rigid_mapped_target_pos. Similar iterative algorithm as LGA tool

Type:

float

property gdt_1

Fraction CA (C3’ for nucleotides) that can be superposed within 1.0A

Uses rigid_mapped_model_pos and rigid_mapped_target_pos. Similar iterative algorithm as LGA tool

Type:

float

property gdt_2

Fraction CA (C3’ for nucleotides) that can be superposed within 2.0A

Uses rigid_mapped_model_pos and rigid_mapped_target_pos. Similar iterative algorithm as LGA tool

Type:

float

property gdt_4

Fraction CA (C3’ for nucleotides) that can be superposed within 4.0A

Uses rigid_mapped_model_pos and rigid_mapped_target_pos. Similar iterative algorithm as LGA tool

Type:

float

property gdt_8

Fraction CA (C3’ for nucleotides) that can be superposed within 8.0A

Similar iterative algorithm as LGA tool

Type:

float

property gdtts

Average GDT with thresholds: 8.0A, 4.0A, 2.0A and 1.0A

Type:

float

property gdtha

Average GDT with thresholds: 4.0A, 2.0A, 1.0A and 0.5A

Type:

float
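How the per-threshold GDT values combine into gdtts and gdtha can be shown with a simplified sketch. It counts distances for one fixed superposition only, whereas the actual computation iteratively searches for the superposition that maximizes the fraction at each threshold. Normalizing by the number of mapped positions and using an inclusive threshold are assumptions made here.

```python
def gdt_fraction(distances, threshold):
    """Fraction of position pairs within `threshold` (fixed superposition)."""
    if not distances:
        return 0.0
    return sum(1 for d in distances if d <= threshold) / len(distances)

def gdt_average(distances, thresholds):
    """Average of the per-threshold fractions."""
    return sum(gdt_fraction(distances, t) for t in thresholds) / len(thresholds)

# Hypothetical model/target distances after superposition, in Angstrom.
distances = [0.3, 0.8, 1.5, 3.0, 6.0, 10.0]

gdtts_like = gdt_average(distances, (8.0, 4.0, 2.0, 1.0))
gdtha_like = gdt_average(distances, (4.0, 2.0, 1.0, 0.5))
```

Since the thresholds for the high accuracy variant are stricter, its value can never exceed the one obtained with the 8.0A to 1.0A thresholds for the same set of distances.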

property rmsd

RMSD

Computed on rigid_transformed_mapped_model_pos and rigid_mapped_target_pos

Type:

float
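Given two equally long lists of positions that have already been superposed, the RMSD is the square root of the mean squared distance between corresponding positions. A minimal sketch on plain coordinate tuples (the helper name rmsd and the toy coordinates are hypothetical):

```python
import math

def rmsd(pos_a, pos_b):
    """Root mean square deviation between paired (x, y, z) positions."""
    if len(pos_a) != len(pos_b) or not pos_a:
        raise ValueError("need two non-empty position lists of equal length")
    sq_sum = sum(math.dist(p, q) ** 2 for p, q in zip(pos_a, pos_b))
    return math.sqrt(sq_sum / len(pos_a))

transformed_model_pos = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
target_pos = [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0)]
value = rmsd(transformed_model_pos, target_pos)
```

No superposition is performed in this sketch; the model positions are expected to have the transform applied already.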

property cad_score

The global CAD atom-atom (AA) score

Computed based on model. In case of oligomers, mapping is used.

Type:

float

property local_cad_score

The per-residue CAD atom-atom (AA) scores

Computed based on model. In case of oligomers, mapping is used.

Type:

dict

property patch_qs

Patch QS-scores for each residue in model_interface_residues

Representative patches for each residue r in chain c are computed as follows:

  • mdl_patch_one: All residues in c with CB (CA for GLY) positions within 8A of r and within 12A of residues from any other chain.

  • mdl_patch_two: Closest residue x to r in any other chain gets identified. Patch is then constructed by selecting all residues from any other chain within 8A of x and within 12A from any residue in c.

  • trg_patch_one: Chain name and residue number based mapping from mdl_patch_one

  • trg_patch_two: Chain name and residue number based mapping from mdl_patch_two

Results are stored in the same manner as model_interface_residues, with corresponding scores instead of residue numbers. Scores for residues which are not mol.ChemType.AMINOACIDS are set to None. Additionally, interface patches are derived from model. If they contain residues which are not covered by target, the score is set to None too.

Type:

dict with chain names as keys and lists with scores of the respective interface residues as values.

property patch_dockq

Same as patch_qs but for DockQ scores

property tm_score

TM-score computed with USalign

USalign executable can be specified with usalign_exec kwarg at Scorer construction, an OpenStructure internal copy of the USalign code is used otherwise.

Type:

float

property usalign_mapping

Mapping computed with USalign

Dictionary with target chain names as key and model chain names as values. No guarantee that all chains are mapped. USalign executable can be specified with usalign_exec kwarg at Scorer construction, an OpenStructure internal copy of the USalign code is used otherwise.

Type:

dict
