`scoring` – Specialized Scoring Functions¶

class lDDTBSScorer(reference, model, residue_number_alignment=False)¶

Scorer specific for a reference/model pair

Finds best possible binding site representation of reference in model given lDDT score. Uses ost.mol.alg.chain_mapping.ChainMapper to deal with chain mapping.

Parameters:

reference (ost.mol.EntityView/ost.mol.EntityHandle) – Reference structure
model (ost.mol.EntityView/ost.mol.EntityHandle) – Model structure
residue_number_alignment (bool) – Passed to ChainMapper constructor

ScoreBS(ligand, radius=4.0, lddt_radius=10.0)¶

Computes binding site lDDT score given ligand. Best possible binding site representation is selected by lDDT but other scores such as CA based RMSD and GDT are computed too and returned.

Parameters:

ligand (r'((Residue)|(Chain)|(Entity))((View)|(Handle))') – Defines the scored binding site, i.e. provides positions to perform proximity search
radius (float) – Reference residues with any atom position within radius of ligand constitute the scored binding site
lddt_radius (float) – Passed as inclusion_radius to ost.mol.alg.lddt.lDDTScorer

Returns:

Object of type ost.mol.alg.chain_mapping.ReprResult containing all atom lDDT score and mapping information. None if no representation could be found.
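
The radius criterion used to define the scored binding site can be illustrated with a small standalone sketch. It does not use OpenStructure; the function name binding_site_residues, the residue labels and the coordinates are hypothetical, and treating "within" as an inclusive comparison is an assumption.

```python
from math import dist

def binding_site_residues(residues, ligand_positions, radius=4.0):
    """Return labels of residues with any atom within radius of any ligand position.

    residues: dict mapping a residue label to a list of (x, y, z) atom positions
    ligand_positions: list of (x, y, z) ligand atom positions
    """
    selected = []
    for label, atoms in residues.items():
        # A single atom close enough to the ligand is sufficient
        if any(dist(a, l) <= radius for a in atoms for l in ligand_positions):
            selected.append(label)
    return selected

# Toy data, hypothetical labels and coordinates
reference = {
    "A.10": [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)],
    "A.11": [(9.0, 0.0, 0.0)],
    "A.12": [(0.0, 3.5, 0.0)],
}
ligand = [(0.0, 0.0, 2.0)]

site = binding_site_residues(reference, ligand)
wider_site = binding_site_residues(reference, ligand, radius=4.1)
```

With these toy coordinates, A.12 lies just outside the default radius and only enters the site once the radius is increased slightly.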

class Scorer(model, target, resnum_alignments=False, molck_settings=None, cad_score_exec=None, custom_mapping=None, custom_rigid_mapping=None, usalign_exec=None, lddt_no_stereochecks=False, n_max_naive=40320, oum=False, min_pep_length=6, min_nuc_length=4, lddt_add_mdl_contacts=False, dockq_capri_peptide=False)¶

Helper class to access the various scores available from ost.mol.alg

Deals with structure cleanup, chain mapping, interface identification etc. Intermediate results are available as attributes.

Parameters:

model (ost.mol.EntityHandle/ost.mol.EntityView) – Model structure - a deep copy is available as model. Additionally, ost.mol.alg.Molck() using molck_settings is applied.
target (ost.mol.EntityHandle/ost.mol.EntityView) – Target structure - a deep copy is available as target. Additionally, ost.mol.alg.Molck() using molck_settings is applied.
resnum_alignments (bool) – Whether alignments between chemically equivalent chains in model and target can be computed based on residue numbers. This can be assumed in benchmarking setups such as CAMEO/CASP.
molck_settings (ost.mol.alg.MolckSettings) – Settings used for Molck on model and target, if set to None, a default object is constructed by setting everything except rm_zero_occ_atoms and colored to True in ost.mol.alg.MolckSettings constructor.
cad_score_exec (str) – Explicit path to voronota-cadscore executable from voronota installation from https://github.com/kliment-olechnovic/voronota. If not given, voronota-cadscore must be in PATH if any of the CAD score related attributes is requested.
custom_mapping (dict) – Provide custom chain mapping between model and target. Dictionary with target chain names as key and model chain names as value. mapping is constructed from this.
custom_rigid_mapping (dict) – Provide custom chain mapping between model and target. Dictionary with target chain names as key and model chain names as value. rigid_mapping is constructed from this.
usalign_exec (str) – Explicit path to USalign executable used to compute TM-score. If not given, TM-score will be computed with OpenStructure internal copy of USalign code.
lddt_no_stereochecks (bool) – Whether to compute lDDT without stereochemistry checks
n_max_naive (int) – Parameter for chain mapping. If the number of possible mappings is <= n_max_naive, the full mapping solution space is enumerated to find the optimum. A heuristic is used otherwise. The default of 40320 corresponds to an octamer (8! = 40320). A structure with stoichiometry A6B2 would be 6!*2! = 1440 etc.
oum (bool) – Override USalign Mapping. Inject rigid_mapping of Scorer object into USalign to compute TM-score. Experimental feature with limitations.
min_pep_length (int) – Relevant parameter if short peptides are involved in scoring. Minimum peptide length for a chain in the target structure to be considered in chain mapping. The chain mapping algorithm first performs an all vs. all pairwise sequence alignment to identify “equal” chains within the target structure. We go for simple sequence identity there. Short sequences can be problematic as they may produce high sequence identity alignments by pure chance.
min_nuc_length (int) – Relevant parameter if short nucleotides are involved in scoring. Minimum nucleotide length for a chain in the target structure to be considered in chain mapping. The chain mapping algorithm first performs an all vs. all pairwise sequence alignment to identify “equal” chains within the target structure. We go for simple sequence identity there. Short sequences can be problematic as they may produce high sequence identity alignments by pure chance.
lddt_add_mdl_contacts (bool) – lDDT specific flag. Only using contacts in lDDT that are within a certain distance threshold in the target does not penalize for added model contacts. If set to True, this flag will also consider target contacts that are within the specified distance threshold in the model but not necessarily in the target. No contact will be added if the respective atom pair is not resolved in the target.
dockq_capri_peptide (bool) – Flag that changes two things in the way DockQ and its underlying scores are computed which is proposed by the CAPRI community when scoring peptides (PMID: 31886916). ONE: Two residues are considered in contact if any of their atoms is within 5A. This is relevant for fnat and fnonnat scores. CAPRI suggests to lower this threshold to 4A for protein-peptide interactions. TWO: irmsd is computed on interface residues. A residue is defined as interface residue if any of its atoms is within 10A of another chain. CAPRI suggests to lower the default of 10A to 8A in combination with only considering CB atoms for protein-peptide interactions. This flag has no influence on patch_dockq scores.
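
The arithmetic behind the n_max_naive default can be reproduced with a short calculation. This is a sketch that assumes model and target contain the same number of chains in every group of equivalent chains; the helper name n_possible_mappings is hypothetical and not part of OpenStructure.

```python
from math import factorial, prod

def n_possible_mappings(group_sizes):
    """Number of one-to-one chain mappings for groups of equivalent chains.

    group_sizes: number of chains per group, e.g. [6, 2] for A6B2
    """
    return prod(factorial(n) for n in group_sizes)

n_octamer = n_possible_mappings([8])   # homo-octamer
n_a6b2 = n_possible_mappings([6, 2])   # stoichiometry A6B2
n_max_naive = 40320

# Both cases would still be enumerated exhaustively under the default
octamer_enumerated = n_octamer <= n_max_naive
a6b2_enumerated = n_a6b2 <= n_max_naive
```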

property model¶

Model with Molck cleanup

Type:: ost.mol.EntityHandle

property model_orig¶

The original model passed at object construction

Type:: ost.mol.EntityHandle/ost.mol.EntityView

property target¶

Target with Molck cleanup

Type:: ost.mol.EntityHandle

property target_orig¶

The original target passed at object construction

Type:: ost.mol.EntityHandle/ost.mol.EntityView

property aln¶

Alignments of model/target chains

Alignments for each pair of chains mapped in mapping. First sequence is target sequence, second sequence the model sequence.

Type:: list of ost.seq.AlignmentHandle

property stereochecked_aln¶

Stereochecked equivalent of aln

The alignments may differ, as stereochecks potentially remove residues

Type:: list of ost.seq.AlignmentHandle

property pepnuc_aln¶

Alignments of model_orig/target_orig chains

Selects for peptide and nucleotide residues before sequence extraction. Includes residues that would be removed by molck in structure preprocessing.

Type:: list of ost.seq.AlignmentHandle

property stereochecked_model¶

View of model that has stereochemistry checks applied

First, a selection for peptide/nucleotide residues is performed, secondly peptide sidechains with stereochemical irregularities are removed (full residue if backbone atoms are involved). Irregularities are clashes or bond lengths/angles more than 12 standard deviations from expected values.

Type:: ost.mol.EntityView

property model_clashes¶

Clashing model atoms

Type:: list of ost.mol.alg.stereochemistry.ClashInfo

property model_bad_bonds¶

Model bonds with unexpected stereochemistry

Type:: list of ost.mol.alg.stereochemistry.BondViolationInfo

property model_bad_angles¶

Model angles with unexpected stereochemistry

Type:: list of ost.mol.alg.stereochemistry.AngleViolationInfo

property stereochecked_target¶

Same as stereochecked_model for target

Type:: ost.mol.EntityView

property target_clashes¶

Clashing target atoms

Type:: list of ost.mol.alg.stereochemistry.ClashInfo

property target_bad_bonds¶

Target bonds with unexpected stereochemistry

Type:: list of ost.mol.alg.stereochemistry.BondViolationInfo

property target_bad_angles¶

Target angles with unexpected stereochemistry

Type:: list of ost.mol.alg.stereochemistry.AngleViolationInfo

property chain_mapper¶

Chain mapper object for given target

Type:: ost.mol.alg.chain_mapping.ChainMapper

property mapping¶

Full chain mapping result for target/model

Computed with ost.mol.alg.ChainMapper.GetMapping()

Type:: ost.mol.alg.chain_mapping.MappingResult

property rigid_mapping¶

Full chain mapping result for target/model

Computed with ost.mol.alg.ChainMapper.GetRMSDMapping()

Type:: ost.mol.alg.chain_mapping.MappingResult

property model_interface_residues¶

Interface residues in model

That is, all residues having a contact with at least one residue from another chain (CB-CB distance <= 8A, CA in case of Glycine)

Type:: dict with chain names as key and list with residue numbers of the respective interface residues.
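
The interface definition above can be sketched on plain coordinates. This is an illustration only, not the OpenStructure implementation; the input layout (one representative position per residue, CB or CA for glycine) and the function name interface_residues are assumptions.

```python
from math import dist

def interface_residues(rep_positions, cutoff=8.0):
    """Find residues in contact with at least one residue of another chain.

    rep_positions: {chain_name: {residue_number: (x, y, z)}} holding one
                   representative position per residue (CB, CA for glycine)
    Returns a dict with chain names as key and a list of residue numbers.
    """
    result = {cname: [] for cname in rep_positions}
    for cname, residues in rep_positions.items():
        for rnum, pos in residues.items():
            in_contact = any(
                dist(pos, other_pos) <= cutoff
                for other_cname, other in rep_positions.items()
                if other_cname != cname
                for other_pos in other.values()
            )
            if in_contact:
                result[cname].append(rnum)
    return result

# Toy dimer, hypothetical coordinates
positions = {
    "A": {1: (0.0, 0.0, 0.0), 2: (20.0, 0.0, 0.0)},
    "B": {1: (5.0, 0.0, 0.0), 2: (40.0, 0.0, 0.0)},
}
iface = interface_residues(positions)
```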

property target_interface_residues¶

Same as model_interface_residues for target

Type:: dict with chain names as key and list with residue numbers of the respective interface residues.

property lddt_scorer¶

lDDT scorer for stereochecked_target (default parameters)

Type:: ost.mol.alg.lddt.lDDTScorer

property bb_lddt_scorer¶

Backbone only lDDT scorer for target

No stereochecks applied for bb only lDDT which considers CA atoms for peptides and C3’ atoms for nucleotides.

Type:: ost.mol.alg.lddt.lDDTScorer

property qs_scorer¶

QS scorer constructed from mapping

The scorer object is constructed with default parameters and relates to model and target (no stereochecks).

Type:: ost.mol.alg.qsscore.QSScorer

property lddt¶

Global lDDT score in range [0.0, 1.0]

Computed based on stereochecked_model. In case of oligomers, mapping is used.

Type:: float

property local_lddt¶

Per residue lDDT scores in range [0.0, 1.0]

Computed based on stereochecked_model but scores for all residues in model are reported. If a residue has been removed by stereochemistry checks, the respective score is set to 0.0. If a residue is not covered by the target or is in a chain skipped by the chain mapping procedure (happens for super short chains), the respective score is set to None. In case of oligomers, mapping is used.

Type:: dict

property bb_lddt¶

Backbone only global lDDT score in range [0.0, 1.0]

Computed based on model on backbone atoms only. This is CA for peptides and C3’ for nucleotides. No stereochecks are performed. In case of oligomers, mapping is used.

Type:: float

property bb_local_lddt¶

Backbone only per residue lDDT scores in range [0.0, 1.0]

Computed based on model on backbone atoms only. This is CA for peptides and C3’ for nucleotides. No stereochecks are performed. If a residue is not covered by the target or is in a chain skipped by the chain mapping procedure (happens for super short chains), the respective score is set to None. In case of oligomers, mapping is used.

Type:: dict

property ilddt¶

Global interface lDDT score in range [0.0, 1.0]

This is lDDT only based on inter-chain contacts. Value is None if no such contacts are present, for example when dealing with a monomer. Computed based on stereochecked_model and mapping for chain mapping.

Type:: float

property qs_global¶

Global QS-score

Computed based on model using mapping

Type:: float

property qs_best¶

Global QS-score - only computed on aligned residues

Computed based on model using mapping. The QS-score computation only considers contacts between residues with a mapping between target and model. As a result, the score won’t be lowered in case of additional chains/residues in any of the structures.

Type:: float

property qs_target_interfaces¶

Interfaces in target with non-zero contribution to qs_global/qs_best

Chain names are lexicographically sorted.

Type:: list of tuple with 2 elements each: (trg_ch1, trg_ch2)

property qs_model_interfaces¶

Interfaces in model with non-zero contribution to qs_global/qs_best

Chain names are lexicographically sorted.

Type:: list of tuple with 2 elements each: (mdl_ch1, mdl_ch2)

property qs_interfaces¶

Interfaces in qs_target_interfaces that can be mapped to model.

Target chain names are lexicographically sorted.

Type:: list of tuple with 4 elements each: (trg_ch1, trg_ch2, mdl_ch1, mdl_ch2)

property per_interface_qs_global¶

QS-score for each interface in qs_interfaces

Type:: list of float

property per_interface_qs_best¶

QS-score for each interface in qs_interfaces

Only computed on aligned residues

Type:: list of float

property native_contacts¶

Native contacts

A contact is a pair of residues from distinct chains that have a minimal heavy atom distance < 5A. Contacts are specified as tuple with two strings in format: <cname>.<rnum>.<ins_code>

Type:: list of tuple
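
The residue strings in a contact can be split back into their components. The sketch below assumes that a residue without insertion code is written with an empty trailing field (for example A.42.), which is a guess based on the format stated above; the helper name parse_contact_residue is hypothetical.

```python
def parse_contact_residue(label):
    """Split '<cname>.<rnum>.<ins_code>' into (cname, rnum, ins_code).

    Splitting from the right keeps chain names intact even if they
    contain a dot themselves.
    """
    cname, rnum, ins_code = label.rsplit(".", 2)
    return cname, int(rnum), ins_code

# Hypothetical contact between residue 42 of chain A and residue 7A of chain B
contact = ("A.42.", "B.7.A")
parsed = tuple(parse_contact_residue(r) for r in contact)
```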

property model_contacts¶: Same for model

property contact_target_interfaces¶

Interfaces in target which have at least one contact

Contact as defined in native_contacts, chain names are lexicographically sorted.

Type:: list of tuple with 2 elements each (trg_ch1, trg_ch2)

property contact_model_interfaces¶

Interfaces in model which have at least one contact

Contact as defined in native_contacts, chain names are lexicographically sorted.

Type:: list of tuple with 2 elements each (mdl_ch1, mdl_ch2)

property ics_precision¶

Fraction of model contacts that are also present in target

Type:: float

property ics_recall¶

Fraction of target contacts that are correctly reproduced in model

Type:: float

property ics¶

ICS (Interface Contact Similarity) score

Combination of ics_precision and ics_recall using the F1-measure

Type:: float
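
How precision, recall and the F1-measure combine can be shown on toy contact sets. This sketch assumes that model contacts have already been translated to target chain names and that every contact is stored in a consistent order; returning 0.0 for empty inputs is an assumption as well, and ics_scores is a hypothetical helper.

```python
def ics_scores(target_contacts, model_contacts):
    """Return (precision, recall, F1) for two collections of contacts."""
    target_contacts = set(target_contacts)
    model_contacts = set(model_contacts)
    n_shared = len(target_contacts & model_contacts)
    precision = n_shared / len(model_contacts) if model_contacts else 0.0
    recall = n_shared / len(target_contacts) if target_contacts else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    # F1-measure: harmonic mean of precision and recall
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Toy data: 4 target contacts, 3 model contacts, 2 of them shared
target = {("A.1.", "B.5."), ("A.2.", "B.5."), ("A.2.", "B.6."), ("A.3.", "B.7.")}
model = {("A.1.", "B.5."), ("A.2.", "B.6."), ("A.9.", "B.9.")}
precision, recall, ics = ics_scores(target, model)
```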

property per_interface_ics_precision¶

Per-interface ICS precision

ics_precision for each interface in contact_target_interfaces

Type:: list of float

property per_interface_ics_recall¶

Per-interface ICS recall

ics_recall for each interface in contact_target_interfaces

Type:: list of float

property per_interface_ics¶

Per-interface ICS (Interface Contact Similarity) score

ics for each interface in contact_target_interfaces

Type:: list of float

property ips_precision¶

Fraction of model interface residues that are also interface residues in target

Type:: float

property ips_recall¶

Fraction of target interface residues that are also interface residues in model

Type:: float

property ips¶

IPS (Interface Patch Similarity) score

Jaccard coefficient of interface residues in target and their mapped counterparts in model

Type:: float
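
The Jaccard coefficient and the related precision and recall values can be illustrated on toy residue sets. This sketch assumes that model interface residues have already been mapped to target residue labels; the labels, the handling of empty sets and the helper name ips_scores are hypothetical.

```python
def ips_scores(target_iface, model_iface):
    """Return (precision, recall, jaccard) for two sets of interface residues."""
    target_iface = set(target_iface)
    model_iface = set(model_iface)
    n_shared = len(target_iface & model_iface)
    n_union = len(target_iface | model_iface)
    precision = n_shared / len(model_iface) if model_iface else 0.0
    recall = n_shared / len(target_iface) if target_iface else 0.0
    # Jaccard coefficient: intersection over union
    jaccard = n_shared / n_union if n_union else 0.0
    return precision, recall, jaccard

# Toy data: 3 shared residues, 6 residues in the union
target_residues = {"A.1", "A.2", "A.3", "B.5"}
model_residues = {"A.2", "A.3", "B.5", "B.6", "B.7"}
ips_precision, ips_recall, ips = ips_scores(target_residues, model_residues)
```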

property per_interface_ips_precision¶

Per-interface IPS precision

ips_precision for each interface in contact_target_interfaces

Type:: list of float

property per_interface_ips_recall¶

Per-interface IPS recall

ips_recall for each interface in contact_target_interfaces

Type:: list of float

property per_interface_ips¶

Per-interface IPS (Interface Patch Similarity) score

ips for each interface in contact_target_interfaces

Type:: list of float

property dockq_target_interfaces¶

Interfaces in target that are relevant for DockQ

All interfaces in target with non-zero contacts that are relevant for DockQ. Chain names are lexicographically sorted.

Type:: list of tuple with 2 elements each: (trg_ch1, trg_ch2)

property dockq_interfaces¶

Interfaces in dockq_target_interfaces that can be mapped to model

Target chain names are lexicographically sorted

Type:: list of tuple with 4 elements each: (trg_ch1, trg_ch2, mdl_ch1, mdl_ch2)

property dockq_scores¶

DockQ scores for interfaces in dockq_interfaces

Type:: list of float

property fnat¶

fnat scores for interfaces in dockq_interfaces

fnat: Fraction of native contacts that are also present in model

Type:: list of float

property nnat¶

N native contacts for interfaces in dockq_interfaces

Type:: list of int

property nmdl¶

N model contacts for interfaces in dockq_interfaces

Type:: list of int

property fnonnat¶

fnonnat scores for interfaces in dockq_interfaces

fnonnat: Fraction of model contacts that are not present in target

Type:: list of float
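
The relation between nnat, nmdl, fnat and fnonnat for a single interface can be written down in a few lines. This is a sketch on plain counts; the helper name contact_fractions and returning 0.0 for an interface without contacts are assumptions.

```python
def contact_fractions(nnat, nmdl, n_shared):
    """Return (fnat, fnonnat) for one interface.

    nnat: number of native (target) contacts
    nmdl: number of model contacts
    n_shared: number of contacts present in both target and model
    """
    fnat = n_shared / nnat if nnat else 0.0
    fnonnat = (nmdl - n_shared) / nmdl if nmdl else 0.0
    return fnat, fnonnat

# Toy interface: 10 native contacts, 8 model contacts, 6 of them shared
fnat_value, fnonnat_value = contact_fractions(nnat=10, nmdl=8, n_shared=6)
```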

property irmsd¶

irmsd scores for interfaces in dockq_interfaces

irmsd: RMSD of interface (RMSD computed on N, CA, C, O atoms) which consists of each residue that has at least one heavy atom within 10A of other chain.

Type:: list of float

property lrmsd¶

lrmsd scores for interfaces in dockq_interfaces

lrmsd: The interfaces are superposed based on the receptor (rigid min RMSD superposition) and RMSD for the ligand is reported. Superposition and RMSD are based on N, CA, C and O positions, receptor is the chain contributing to the interface with more residues in total.

Type:: list of float

property dockq_ave¶

Average of DockQ scores in dockq_scores

In its original implementation, DockQ only operates on single interfaces. Scores therefore need to be combined for higher order oligomers.

Type:: float

property dockq_wave¶

Same as dockq_ave, weighted by native contacts

Type:: float

property dockq_ave_full¶

Same as dockq_ave but penalizing for missing interfaces

Interfaces that are not covered in model are added as 0.0 in average computation.

Type:: float

property dockq_wave_full¶

Same as dockq_ave_full, but weighted

Interfaces that are not covered in model are added as 0.0 in average computations and the respective weights are derived from number of contacts in respective target interface.

Type:: float
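
The four ways of combining per-interface DockQ scores can be compared side by side. This sketch works on plain lists; the function name dockq_averages, the argument layout and returning 0.0 when nothing can be averaged are assumptions.

```python
def dockq_averages(scores, nnat, nnat_missing=()):
    """Combine per-interface DockQ scores.

    scores: DockQ scores of interfaces that are mapped to the model
    nnat: native contact counts of those interfaces (used as weights)
    nnat_missing: native contact counts of target interfaces that are
                  not covered by the model (they contribute 0.0)
    """
    n = len(scores)
    weighted_sum = sum(s * w for s, w in zip(scores, nnat))
    total_weight = sum(nnat)
    n_full = n + len(nnat_missing)
    total_weight_full = total_weight + sum(nnat_missing)
    return {
        "ave": sum(scores) / n if n else 0.0,
        "wave": weighted_sum / total_weight if total_weight else 0.0,
        "ave_full": sum(scores) / n_full if n_full else 0.0,
        "wave_full": weighted_sum / total_weight_full if total_weight_full else 0.0,
    }

# Toy trimer: two mapped interfaces, one target interface missing in the model
combined = dockq_averages(scores=[0.8, 0.4], nnat=[30, 10], nnat_missing=[20])
```

In this toy case the weighted average is higher than the plain average because the better scoring interface has more native contacts, and both "full" variants drop once the missing interface is counted.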

property mapped_target_pos¶

Mapped representative positions in target

That is, CA positions for peptide residues and C3’ positions for nucleotides. Has same length as mapped_model_pos and mapping is based on mapping.

Type:: ost.geom.Vec3List

property mapped_model_pos¶

Mapped representative positions in model

That is, CA positions for peptide residues and C3’ positions for nucleotides. Has same length as mapped_target_pos and mapping is based on mapping.

Type:: ost.geom.Vec3List

property transformed_mapped_model_pos¶

mapped_model_pos with transform applied

Type:: ost.geom.Vec3List

property n_target_not_mapped¶

Number of target residues which have no mapping to model

Type:: int

property transform¶

Transform: mapped_model_pos onto mapped_target_pos

Computed using Kabsch minimal rmsd algorithm

Type:: ost.geom.Mat4

property rigid_mapped_target_pos¶

Mapped representative positions in target

That is, CA positions for peptide residues and C3’ positions for nucleotides. Has same length as rigid_mapped_model_pos and mapping is based on rigid_mapping.

Type:: ost.geom.Vec3List

property rigid_mapped_model_pos¶

Mapped representative positions in model

That is, CA positions for peptide residues and C3’ positions for nucleotides. Has same length as rigid_mapped_target_pos and mapping is based on rigid_mapping.

Type:: ost.geom.Vec3List

property rigid_transformed_mapped_model_pos¶

rigid_mapped_model_pos with rigid_transform applied

Type:: ost.geom.Vec3List

property rigid_n_target_not_mapped¶

Number of target residues which have no rigid mapping to model

Type:: int

property rigid_transform¶

Transform: rigid_mapped_model_pos onto rigid_mapped_target_pos

Computed using Kabsch minimal rmsd algorithm

Type:: ost.geom.Mat4

property gdt_05¶

Fraction CA (C3’ for nucleotides) that can be superposed within 0.5A

Uses rigid_mapped_model_pos and rigid_mapped_target_pos. Similar iterative algorithm as LGA tool

Type:: float

property gdt_1¶

Fraction CA (C3’ for nucleotides) that can be superposed within 1.0A

Uses rigid_mapped_model_pos and rigid_mapped_target_pos. Similar iterative algorithm as LGA tool

Type:: float

property gdt_2¶

Fraction CA (C3’ for nucleotides) that can be superposed within 2.0A

Uses rigid_mapped_model_pos and rigid_mapped_target_pos. Similar iterative algorithm as LGA tool

Type:: float

property gdt_4¶

Fraction CA (C3’ for nucleotides) that can be superposed within 4.0A

Uses rigid_mapped_model_pos and rigid_mapped_target_pos. Similar iterative algorithm as LGA tool

Type:: float

property gdt_8¶

Fraction CA (C3’ for nucleotides) that can be superposed within 8.0A

Similar iterative algorithm as LGA tool

Type:: float

property gdtts¶

avg GDT with thresholds: 8.0A, 4.0A, 2.0A and 1.0A

Type:: float

property gdtha¶

avg GDT with thresholds: 4.0A, 2.0A, 1.0A and 0.5A

Type:: float
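
Both averages follow directly from the per-threshold GDT fractions. A minimal sketch with made-up fractions, keyed by threshold in Angstrom:

```python
# Hypothetical per-threshold GDT fractions, keyed by threshold in Angstrom
gdt = {0.5: 0.2, 1.0: 0.4, 2.0: 0.6, 4.0: 0.8, 8.0: 1.0}

def average_gdt(gdt_values, thresholds):
    """Average the GDT fractions of the given thresholds."""
    return sum(gdt_values[t] for t in thresholds) / len(thresholds)

gdtts = average_gdt(gdt, [8.0, 4.0, 2.0, 1.0])  # total score
gdtha = average_gdt(gdt, [4.0, 2.0, 1.0, 0.5])  # high accuracy
```

Since gdtha replaces the 8.0A threshold with 0.5A, it can never exceed gdtts for the same superposition data.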

property rmsd¶

RMSD

Computed on rigid_transformed_mapped_model_pos and rigid_mapped_target_pos

Type:: float
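
Given two equally long lists of already superposed positions, the RMSD is the square root of the mean squared distance between paired positions. The sketch below uses plain tuples instead of ost.geom.Vec3List; the function name rmsd_of_positions and the coordinates are hypothetical.

```python
from math import sqrt

def rmsd_of_positions(model_pos, target_pos):
    """RMSD between paired positions, no superposition is performed here."""
    if len(model_pos) != len(target_pos):
        raise ValueError("position lists must have the same length")
    if not model_pos:
        raise ValueError("at least one position pair is required")
    squared_sum = sum(
        (m - t) ** 2
        for mp, tp in zip(model_pos, target_pos)
        for m, t in zip(mp, tp)
    )
    return sqrt(squared_sum / len(model_pos))

# Toy data: one position deviates by 2A, the other one matches exactly
transformed_model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
target_positions = [(0.0, 0.0, 2.0), (1.0, 0.0, 0.0)]
rmsd_value = rmsd_of_positions(transformed_model, target_positions)
```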

property cad_score¶

The global CAD atom-atom (AA) score

Computed based on model. In case of oligomers, mapping is used.

Type:: float

property local_cad_score¶

The per-residue CAD atom-atom (AA) scores

Computed based on model. In case of oligomers, mapping is used.

Type:: dict

property patch_qs¶

Patch QS-scores for each residue in model_interface_residues

Representative patches for each residue r in chain c are computed as follows:

mdl_patch_one: All residues in c with CB (CA for GLY) positions within 8A of r and within 12A of residues from any other chain.
mdl_patch_two: Closest residue x to r in any other chain gets identified. Patch is then constructed by selecting all residues from any other chain within 8A of x and within 12A from any residue in c.
trg_patch_one: Chain name and residue number based mapping from mdl_patch_one
trg_patch_two: Chain name and residue number based mapping from mdl_patch_two

Results are stored in the same manner as model_interface_residues, with corresponding scores instead of residue numbers. Scores for residues which are not mol.ChemType.AMINOACIDS are set to None. Additionally, interface patches are derived from model. If they contain residues which are not covered by target, the score is set to None too.

Type:: dict with chain names as key and list with scores of the respective interface residues.

property patch_dockq¶: Same as patch_qs but for DockQ scores

property tm_score¶

TM-score computed with USalign

USalign executable can be specified with usalign_exec kwarg at Scorer construction, an OpenStructure internal copy of the USalign code is used otherwise.

Type:: float

property usalign_mapping¶

Mapping computed with USalign

Dictionary with target chain names as key and model chain names as values. No guarantee that all chains are mapped. USalign executable can be specified with usalign_exec kwarg at Scorer construction, an OpenStructure internal copy of the USalign code is used otherwise.

Type:: dict
