scoring – Specialized Scoring Functions¶
- class lDDTBSScorer(reference, model, residue_number_alignment=False)¶
Scorer specific for a reference/model pair
Finds best possible binding site representation of reference in model given lDDT score. Uses
ost.mol.alg.chain_mapping.ChainMapper
to deal with chain mapping.
- Parameters:
reference (
ost.mol.EntityView
/ost.mol.EntityHandle
) – Reference structure
model (
ost.mol.EntityView
/ost.mol.EntityHandle
) – Model structure
residue_number_alignment (
bool
) – Passed to ChainMapper constructor
- ScoreBS(ligand, radius=4.0, lddt_radius=10.0)¶
Computes binding site lDDT score given ligand. Best possible binding site representation is selected by lDDT but other scores such as CA based RMSD and GDT are computed too and returned.
- Parameters:
ligand (r'((Residue)|(Chain)|(Entity))((View)|(Handle))') – Defines the scored binding site, i.e. provides positions to perform proximity search
radius (
float
) – Reference residues with any atom position within radius of ligand constitute the scored binding site
lddt_radius (
float
) – Passed as inclusion_radius to ost.mol.alg.lddt.lDDTScorer
- Returns:
Object of type
ost.mol.alg.chain_mapping.ReprResult
containing all atom lDDT score and mapping information. None if no representation could be found.
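The role of the radius parameter of ScoreBS can be illustrated with a minimal, library-free sketch. It is not the OpenStructure implementation: the function name binding_site_residues, the plain coordinate tuples, the residue labels and the inclusive comparison (<=) are all assumptions made for illustration.

```python
import math

def binding_site_residues(residue_atoms, ligand_atoms, radius=4.0):
    """Return labels of residues with any atom within `radius` of any ligand atom.

    residue_atoms: dict mapping a (hypothetical) residue label to a list of
                   (x, y, z) atom positions
    ligand_atoms:  list of (x, y, z) ligand atom positions
    """
    selected = []
    for label, atoms in residue_atoms.items():
        # A single atom close enough to the ligand is sufficient
        if any(math.dist(a, l) <= radius for a in atoms for l in ligand_atoms):
            selected.append(label)
    return selected

# Toy data: three residues and a one-atom ligand
residues = {
    "A.10": [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)],
    "A.11": [(9.0, 0.0, 0.0)],
    "A.12": [(3.0, 3.0, 0.0)],
}
ligand = [(4.0, 0.0, 0.0)]
site = binding_site_residues(residues, ligand)
```

In this toy case, A.10 and A.12 fall within 4.0 of the ligand atom, while A.11 (5.0 away) does not.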
- class Scorer(model, target, resnum_alignments=False, molck_settings=None, cad_score_exec=None, custom_mapping=None, custom_rigid_mapping=None, usalign_exec=None, lddt_no_stereochecks=False, n_max_naive=40320, oum=False, min_pep_length=6, min_nuc_length=4, lddt_add_mdl_contacts=False, dockq_capri_peptide=False)¶
Helper class to access the various scores available from ost.mol.alg
Deals with structure cleanup, chain mapping, interface identification etc. Intermediate results are available as attributes.
- Parameters:
model (
ost.mol.EntityHandle
/ost.mol.EntityView
) – Model structure - a deep copy is available as model
. Additionally, ost.mol.alg.Molck()
using molck_settings is applied.
target (
ost.mol.EntityHandle
/ost.mol.EntityView
) – Target structure - a deep copy is available as target
. Additionally, ost.mol.alg.Molck()
using molck_settings is applied.
resnum_alignments (
bool
) – Whether alignments between chemically equivalent chains in model and target can be computed based on residue numbers. This can be assumed in benchmarking setups such as CAMEO/CASP.
molck_settings (
ost.mol.alg.MolckSettings
) – Settings used for Molck on model and target. If set to None, a default object is constructed by setting everything except rm_zero_occ_atoms and colored to True in ost.mol.alg.MolckSettings
constructor.
cad_score_exec (
str
) – Explicit path to voronota-cadscore executable from voronota installation from https://github.com/kliment-olechnovic/voronota. If not given, voronota-cadscore must be in PATH if any of the CAD score related attributes is requested.
custom_mapping (
dict
) – Provide custom chain mapping between model and target. Dictionary with target chain names as key and model chain names as value. mapping
is constructed from this.
custom_rigid_mapping (
dict
) – Provide custom chain mapping between model and target. Dictionary with target chain names as key and model chain names as value. rigid_mapping
is constructed from this.
usalign_exec (
str
) – Explicit path to USalign executable used to compute TM-score. If not given, TM-score will be computed with OpenStructure internal copy of USalign code.
lddt_no_stereochecks (
bool
) – Whether to compute lDDT without stereochemistry checks
n_max_naive (
int
) – Parameter for chain mapping. If the number of possible mappings is <= n_max_naive, the full mapping solution space is enumerated to find the optimum. A heuristic is used otherwise. The default of 40320 corresponds to an octamer (8! = 40320). A structure with stoichiometry A6B2 would be 6!*2! = 1440 etc.
oum (
bool
) – Override USalign Mapping. Inject rigid_mapping of Scorer
object into USalign to compute TM-score. Experimental feature with limitations.
min_pep_length (
int
) – Relevant parameter if short peptides are involved in scoring. Minimum peptide length for a chain in the target structure to be considered in chain mapping. The chain mapping algorithm first performs an all vs. all pairwise sequence alignment to identify “equal” chains within the target structure, using simple sequence identity. Short sequences can be problematic as they may produce high sequence identity alignments by pure chance.
min_nuc_length (
int
) – Relevant parameter if short nucleotides are involved in scoring. Minimum nucleotide length for a chain in the target structure to be considered in chain mapping. The chain mapping algorithm first performs an all vs. all pairwise sequence alignment to identify “equal” chains within the target structure, using simple sequence identity. Short sequences can be problematic as they may produce high sequence identity alignments by pure chance.
lddt_add_mdl_contacts (
bool
) – lDDT specific flag. Only using contacts in lDDT that are within a certain distance threshold in the target does not penalize for added model contacts. If set to True, this flag will also consider target contacts that are within the specified distance threshold in the model but not necessarily in the target. No contact will be added if the respective atom pair is not resolved in the target.
dockq_capri_peptide (
bool
) – Flag that changes two things in the way DockQ and its underlying scores are computed, as proposed by the CAPRI community for scoring peptides (PMID: 31886916). ONE: Two residues are considered in contact if any of their atoms is within 5A. This is relevant for fnat and fnonnat scores. CAPRI suggests to lower this threshold to 4A for protein-peptide interactions. TWO: irmsd is computed on interface residues. A residue is defined as interface residue if any of its atoms is within 10A of another chain. CAPRI suggests to lower the default of 10A to 8A in combination with only considering CB atoms for protein-peptide interactions. This flag has no influence on patch_dockq scores.
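The arithmetic behind the n_max_naive default can be reproduced with a short sketch. It only counts mappings under the simplifying assumption that each group of chemically equivalent chains has the same number of chains in model and target. The helper names n_possible_mappings and use_naive_enumeration are hypothetical and not part of OpenStructure.

```python
from math import factorial, prod

def n_possible_mappings(group_sizes):
    """Number of one-to-one chain assignments for the given group sizes.

    group_sizes: number of chains in each group of chemically equivalent
                 chains, e.g. [6, 2] for stoichiometry A6B2 (assumes equal
                 group sizes in model and target).
    """
    return prod(factorial(n) for n in group_sizes)

def use_naive_enumeration(group_sizes, n_max_naive=40320):
    """True if the full solution space would be enumerated."""
    return n_possible_mappings(group_sizes) <= n_max_naive

octamer = n_possible_mappings([8])   # homo-octamer: 8!
a6b2 = n_possible_mappings([6, 2])   # A6B2: 6! * 2!
```

With the default threshold, a homo-octamer is still enumerated exhaustively, whereas a homo-nonamer (9! = 362880 mappings) would fall back to the heuristic.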
- property model¶
Model with Molck cleanup
- Type:
- property model_orig¶
The original model passed at object construction
- property target¶
Target with Molck cleanup
- Type:
- property target_orig¶
The original target passed at object construction
- property aln¶
Alignments of
model
/target
chains
Alignments for each pair of chains mapped in
mapping
. First sequence is the target sequence, second sequence the model sequence.
- Type:
list
of ost.seq.AlignmentHandle
- property stereochecked_aln¶
Stereochecked equivalent of
aln
The alignments may differ, as stereochecks potentially remove residues
- Type:
list
of ost.seq.AlignmentHandle
- property pepnuc_aln¶
Alignments of
model_orig
/target_orig
chains
Selects for peptide and nucleotide residues before sequence extraction. Includes residues that would be removed by molck in structure preprocessing.
- Type:
list
of ost.seq.AlignmentHandle
- property stereochecked_model¶
View of
model
that has stereochemistry checks applied
First, a selection for peptide/nucleotide residues is performed. Second, peptide sidechains with stereochemical irregularities are removed (full residue if backbone atoms are involved). Irregularities are clashes or bond lengths/angles more than 12 standard deviations from expected values.
- Type:
- property model_clashes¶
Clashing model atoms
- Type:
- property model_bad_bonds¶
Model bonds with unexpected stereochemistry
- Type:
- property model_bad_angles¶
Model angles with unexpected stereochemistry
- Type:
- property stereochecked_target¶
Same as
stereochecked_model
for target
- Type:
- property target_clashes¶
Clashing target atoms
- Type:
- property target_bad_bonds¶
Target bonds with unexpected stereochemistry
- Type:
- property target_bad_angles¶
Target angles with unexpected stereochemistry
- Type:
- property mapping¶
Full chain mapping result for
target
/model
Computed with
ost.mol.alg.ChainMapper.GetMapping()
- property rigid_mapping¶
Full chain mapping result for
target
/model
Computed with
ost.mol.alg.ChainMapper.GetRMSDMapping()
- property model_interface_residues¶
Interface residues in
model
That is, all residues that have a contact with at least one residue from another chain (CB-CB distance <= 8A, CA in case of glycine)
- Type:
dict
with chain names as key and list
with residue numbers of the respective interface residues.
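The interface residue definition above (CB-CB distance <= 8A to any residue of another chain) can be sketched without OpenStructure. The input layout, a dictionary of chain name to a dictionary of residue number to representative position, and the function name interface_residues are assumptions for illustration. The returned layout mirrors the documented dict of chain names to lists of residue numbers.

```python
import math

def interface_residues(chains, cutoff=8.0):
    """Find residues in contact with at least one residue of another chain.

    chains: dict chain name -> dict residue number -> (x, y, z) of the
            representative atom (CB, or CA for glycine)
    Returns dict chain name -> sorted list of interface residue numbers.
    """
    result = {name: [] for name in chains}
    for name, residues in chains.items():
        # Representative positions of all residues in all other chains
        others = [pos for other, res in chains.items()
                  if other != name for pos in res.values()]
        for rnum, pos in sorted(residues.items()):
            if any(math.dist(pos, o) <= cutoff for o in others):
                result[name].append(rnum)
    return result

toy = {
    "A": {1: (0.0, 0.0, 0.0), 2: (20.0, 0.0, 0.0)},
    "B": {1: (5.0, 0.0, 0.0), 2: (40.0, 0.0, 0.0)},
}
iface = interface_residues(toy)
```

Only residue 1 of each chain ends up in the result here, since A.1 and B.1 are 5A apart and all other inter-chain distances exceed 8A.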
- property target_interface_residues¶
Same as
model_interface_residues
for target
- Type:
dict
with chain names as key and list
with residue numbers of the respective interface residues.
- property lddt_scorer¶
lDDT scorer for
stereochecked_target
(default parameters)
- property bb_lddt_scorer¶
Backbone only lDDT scorer for
target
No stereochecks applied for bb only lDDT which considers CA atoms for peptides and C3’ atoms for nucleotides.
- property qs_scorer¶
QS scorer constructed from
mapping
The scorer object is constructed with default parameters and relates to
model
and target
(no stereochecks).
- property lddt¶
Global lDDT score in range [0.0, 1.0]
Computed based on
stereochecked_model
. In case of oligomers, mapping
is used.
- Type:
float
- property local_lddt¶
Per residue lDDT scores in range [0.0, 1.0]
Computed based on
stereochecked_model
but scores for all residues in model
are reported. If a residue has been removed by stereochemistry checks, the respective score is set to 0.0. If a residue is not covered by the target or is in a chain skipped by the chain mapping procedure (happens for super short chains), the respective score is set to None. In case of oligomers, mapping
is used.
- Type:
dict
- property bb_lddt¶
Backbone only global lDDT score in range [0.0, 1.0]
Computed based on
model
on backbone atoms only. This is CA for peptides and C3’ for nucleotides. No stereochecks are performed. In case of oligomers, mapping
is used.
- Type:
float
- property bb_local_lddt¶
Backbone only per residue lDDT scores in range [0.0, 1.0]
Computed based on
model
on backbone atoms only. This is CA for peptides and C3’ for nucleotides. No stereochecks are performed. If a residue is not covered by the target or is in a chain skipped by the chain mapping procedure (happens for super short chains), the respective score is set to None. In case of oligomers, mapping
is used.
- Type:
dict
- property ilddt¶
Global interface lDDT score in range [0.0, 1.0]
This is lDDT based only on inter-chain contacts. Value is None if no such contacts are present, for example in the case of a monomer. Computed based on
stereochecked_model
and mapping
for chain mapping.
- Type:
float
- property qs_best¶
Global QS-score - only computed on aligned residues
Computed based on
model
using mapping
. The QS-score computation only considers contacts between residues with a mapping between target and model. As a result, the score won’t be lowered in case of additional chains/residues in any of the structures.
- Type:
float
- property qs_target_interfaces¶
Interfaces in
target
with non-zero contribution to qs_global
/qs_best
Chain names are lexicographically sorted.
- Type:
list
of tuple
with 2 elements each: (trg_ch1, trg_ch2)
- property qs_model_interfaces¶
Interfaces in
model
with non-zero contribution to qs_global
/qs_best
Chain names are lexicographically sorted.
- Type:
list
of tuple
with 2 elements each: (mdl_ch1, mdl_ch2)
- property qs_interfaces¶
Interfaces in
qs_target_interfaces
that can be mapped to model
. Target chain names are lexicographically sorted.
- Type:
list
of tuple
with 4 elements each: (trg_ch1, trg_ch2, mdl_ch1, mdl_ch2)
- property per_interface_qs_global¶
QS-score for each interface in
qs_interfaces
- Type:
list
of float
- property per_interface_qs_best¶
QS-score for each interface in
qs_interfaces
Only computed on aligned residues
- Type:
list
of float
- property native_contacts¶
Native contacts
A contact is a pair of residues from distinct chains that have a minimal heavy atom distance < 5A. Contacts are specified as
tuple
with two strings in format: <cname>.<rnum>.<ins_code>
- Type:
list
of tuple
- property model_contacts¶
Same as
native_contacts
for model
- property contact_target_interfaces¶
Interfaces in
target
which have at least one contact
Contact as defined in
native_contacts
, chain names are lexicographically sorted.
- Type:
list
of tuple
with 2 elements each (trg_ch1, trg_ch2)
- property contact_model_interfaces¶
Interfaces in
model
which have at least one contact
Contact as defined in
native_contacts
, chain names are lexicographically sorted.
- Type:
list
of tuple
with 2 elements each (mdl_ch1, mdl_ch2)
- property ics_precision¶
Fraction of model contacts that are also present in target
- Type:
float
- property ics_recall¶
Fraction of target contacts that are correctly reproduced in model
- Type:
float
- property ics¶
ICS (Interface Contact Similarity) score
Combination of
ics_precision
and ics_recall
using the F1-measure
- Type:
float
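How ics_precision, ics_recall and their F1 combination relate can be sketched on plain Python sets. This is an illustrative sketch, not the OpenStructure code: it assumes model contacts have already been translated into target chain naming via the chain mapping, uses simplified placeholder residue labels, and returns 0.0 for empty inputs, which is an assumption about edge-case handling.

```python
def ics_scores(target_contacts, model_contacts):
    """Return (precision, recall, F1) for two sets of contacts.

    Each contact is a tuple of two residue labels; model contacts are
    assumed to be expressed in target chain naming already.
    """
    shared = target_contacts & model_contacts
    precision = len(shared) / len(model_contacts) if model_contacts else 0.0
    recall = len(shared) / len(target_contacts) if target_contacts else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    # F1-measure: harmonic mean of precision and recall
    f1 = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f1

target_contacts = {("A.10", "B.5"), ("A.11", "B.5"),
                   ("A.12", "B.7"), ("A.20", "B.9")}
model_contacts = {("A.10", "B.5"), ("A.11", "B.5"), ("A.30", "B.9")}
precision, recall, ics = ics_scores(target_contacts, model_contacts)
```

Two of three model contacts are native (precision 2/3) and two of four native contacts are reproduced (recall 1/2), giving an F1 of 4/7.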
- property per_interface_ics_precision¶
Per-interface ICS precision
ics_precision
for each interface in contact_target_interfaces
- Type:
list
of float
- property per_interface_ics_recall¶
Per-interface ICS recall
ics_recall
for each interface in contact_target_interfaces
- Type:
list
of float
- property per_interface_ics¶
Per-interface ICS (Interface Contact Similarity) score
ics
for each interface in contact_target_interfaces
- Type:
float
- property ips_precision¶
Fraction of model interface residues that are also interface residues in target
- Type:
float
- property ips_recall¶
Fraction of target interface residues that are also interface residues in model
- Type:
float
- property ips¶
IPS (Interface Patch Similarity) score
Jaccard coefficient of interface residues in target and their mapped counterparts in model
- Type:
float
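The Jaccard coefficient used for IPS is the size of the intersection divided by the size of the union. A minimal sketch follows, assuming both residue sets are already expressed with the same (hypothetical) labels, i.e. that model residues have been mapped onto their target counterparts. Returning 0.0 for two empty sets is an assumption.

```python
def jaccard(target_interface, mapped_model_interface):
    """Jaccard coefficient of two sets of interface residue labels."""
    union = target_interface | mapped_model_interface
    if not union:
        return 0.0
    shared = target_interface & mapped_model_interface
    return len(shared) / len(union)

target_interface = {"A.1", "A.2", "B.7", "B.8"}
mapped_model_interface = {"A.2", "B.7", "B.9"}
ips = jaccard(target_interface, mapped_model_interface)
```

Here two residues are shared and the union holds five residues, so the coefficient is 2/5.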
- property per_interface_ips_precision¶
Per-interface IPS precision
ips_precision
for each interface in contact_target_interfaces
- Type:
list
of float
- property per_interface_ips_recall¶
Per-interface IPS recall
ips_recall
for each interface in contact_target_interfaces
- Type:
list
of float
- property per_interface_ips¶
Per-interface IPS (Interface Patch Similarity) score
ips
for each interface in contact_target_interfaces
- Type:
list
of float
- property dockq_target_interfaces¶
Interfaces in
target
that are relevant for DockQ
All interfaces in
target
with non-zero contacts that are relevant for DockQ. Chain names are lexicographically sorted.
- Type:
list
of tuple
with 2 elements each: (trg_ch1, trg_ch2)
- property dockq_interfaces¶
Interfaces in
dockq_target_interfaces
that can be mapped to model
Target chain names are lexicographically sorted
- Type:
list
of tuple
with 4 elements each: (trg_ch1, trg_ch2, mdl_ch1, mdl_ch2)
- property dockq_scores¶
DockQ scores for interfaces in
dockq_interfaces
list
of float
- property fnat¶
fnat scores for interfaces in
dockq_interfaces
fnat: Fraction of native contacts that are also present in model
list
of float
- property nnat¶
N native contacts for interfaces in
dockq_interfaces
list
of int
- property nmdl¶
N model contacts for interfaces in
dockq_interfaces
list
of int
- property fnonnat¶
fnonnat scores for interfaces in
dockq_interfaces
fnonnat: Fraction of model contacts that are not present in target
list
of float
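For a single interface, fnat and fnonnat can be derived from three counts: the native contacts (nnat), the model contacts (nmdl) and the number of contacts present in both. The sketch below follows the definitions given above. The function name, the n_shared argument and the 0.0 fallback for empty denominators are assumptions, not OpenStructure API.

```python
def fnat_fnonnat(n_shared, nnat, nmdl):
    """Return (fnat, fnonnat) for one interface.

    n_shared: contacts present in both target and model (hypothetical input)
    nnat:     number of native (target) contacts
    nmdl:     number of model contacts
    """
    fnat = n_shared / nnat if nnat else 0.0
    fnonnat = (nmdl - n_shared) / nmdl if nmdl else 0.0
    return fnat, fnonnat

fnat, fnonnat = fnat_fnonnat(n_shared=30, nnat=40, nmdl=50)
```

With 30 shared contacts, 40 native contacts and 50 model contacts, three quarters of the native contacts are reproduced and 20 of the 50 model contacts are non-native.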
- property irmsd¶
irmsd scores for interfaces in
dockq_interfaces
irmsd: RMSD of interface (RMSD computed on N, CA, C, O atoms) which consists of each residue that has at least one heavy atom within 10A of other chain.
list
of float
- property lrmsd¶
lrmsd scores for interfaces in
dockq_interfaces
lrmsd: The interfaces are superposed based on the receptor (rigid min RMSD superposition) and RMSD for the ligand is reported. Superposition and RMSD are based on N, CA, C and O positions, receptor is the chain contributing to the interface with more residues in total.
list
of float
- property dockq_ave¶
Average of DockQ scores in
dockq_scores
In its original implementation, DockQ only operates on single interfaces. Thus the requirement to combine scores for higher order oligomers.
- Type:
float
- property dockq_ave_full¶
Same as
dockq_ave
but penalizing for missing interfaces
Interfaces that are not covered in model are added as 0.0 in average computation.
- Type:
float
- property dockq_wave_full¶
Same as
dockq_ave_full
, but weighted
Interfaces that are not covered in model are added as 0.0 in average computations and the respective weights are derived from number of contacts in respective target interface.
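The difference between the three DockQ summary values can be made concrete with a small sketch. It assumes that the weight of an interface is simply its number of native contacts, which is one plausible reading of the description above rather than a confirmed detail. The function name and argument layout are hypothetical.

```python
def dockq_averages(scores, weights, missing_weights):
    """Return (ave, ave_full, wave_full).

    scores:          DockQ scores of target interfaces mapped to the model
    weights:         native contact counts of those interfaces
    missing_weights: native contact counts of target interfaces that are
                     not covered by the model (scored as 0.0)
    """
    ave = sum(scores) / len(scores) if scores else 0.0
    # Missing interfaces enter the full averages with a score of 0.0
    full_scores = list(scores) + [0.0] * len(missing_weights)
    full_weights = list(weights) + list(missing_weights)
    ave_full = sum(full_scores) / len(full_scores) if full_scores else 0.0
    total = sum(full_weights)
    wave_full = (sum(s * w for s, w in zip(full_scores, full_weights)) / total
                 if total else 0.0)
    return ave, ave_full, wave_full

ave, ave_full, wave_full = dockq_averages(
    scores=[0.8, 0.6], weights=[30, 10], missing_weights=[10])
```

In this example the plain average is 0.7, the missing interface pulls the full average down to 1.4/3, and the weighted variant lands at 0.6 because the best-scored interface also carries the most contacts.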
- property mapped_target_pos¶
Mapped representative positions in target
That is, CA positions for peptide residues and C3’ positions for nucleotides. Has same length as
mapped_model_pos
and mapping is based on mapping
.
- Type:
ost.geom.Vec3List
- property mapped_model_pos¶
Mapped representative positions in model
That is, CA positions for peptide residues and C3’ positions for nucleotides. Has same length as
mapped_target_pos
and mapping is based on mapping
.
- Type:
ost.geom.Vec3List
- property transformed_mapped_model_pos¶
mapped_model_pos
with transform
applied
- Type:
ost.geom.Vec3List
- property n_target_not_mapped¶
Number of target residues which have no mapping to model
- Type:
int
- property transform¶
Transform:
mapped_model_pos
onto mapped_target_pos
Computed using Kabsch minimal rmsd algorithm
- Type:
- property rigid_mapped_target_pos¶
Mapped representative positions in target
That is, CA positions for peptide residues and C3’ positions for nucleotides. Has same length as
rigid_mapped_model_pos
and mapping is based on rigid_mapping
.
- Type:
ost.geom.Vec3List
- property rigid_mapped_model_pos¶
Mapped representative positions in model
That is, CA positions for peptide residues and C3’ positions for nucleotides. Has same length as
rigid_mapped_target_pos
and mapping is based on rigid_mapping
.
- Type:
ost.geom.Vec3List
- property rigid_transformed_mapped_model_pos¶
rigid_mapped_model_pos
with rigid_transform
applied
- Type:
ost.geom.Vec3List
- property rigid_n_target_not_mapped¶
Number of target residues which have no rigid mapping to model
- Type:
int
- property rigid_transform¶
Transform:
rigid_mapped_model_pos
onto rigid_mapped_target_pos
Computed using Kabsch minimal rmsd algorithm
- Type:
- property gdt_05¶
Fraction CA (C3’ for nucleotides) that can be superposed within 0.5A
Uses
rigid_mapped_model_pos
and rigid_mapped_target_pos
. Similar iterative algorithm as LGA tool
- Type:
float
- property gdt_1¶
Fraction CA (C3’ for nucleotides) that can be superposed within 1.0A
Uses
rigid_mapped_model_pos
and rigid_mapped_target_pos
. Similar iterative algorithm as LGA tool
- Type:
float
- property gdt_2¶
Fraction CA (C3’ for nucleotides) that can be superposed within 2.0A
Uses
rigid_mapped_model_pos
and rigid_mapped_target_pos
. Similar iterative algorithm as LGA tool
- Type:
float
- property gdt_4¶
Fraction CA (C3’ for nucleotides) that can be superposed within 4.0A
Uses
rigid_mapped_model_pos
and rigid_mapped_target_pos
. Similar iterative algorithm as LGA tool
- Type:
float
- property gdt_8¶
Fraction CA (C3’ for nucleotides) that can be superposed within 8.0A
Similar iterative algorithm as LGA tool
- Type:
float
- property gdtts¶
Average GDT with thresholds: 8.0A, 4.0A, 2.0A and 1.0A
- Type:
float
- property gdtha¶
Average GDT with thresholds: 4.0A, 2.0A, 1.0A and 0.5A
- Type:
float
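The averaging behind gdtts and gdtha can be sketched as follows. Note the simplifications: the sketch evaluates a single, already given superposition instead of the iterative LGA-like search mentioned above, and it normalizes by the number of position pairs, which is an assumption. Function names are hypothetical.

```python
import math

def gdt_fraction(model_pos, target_pos, threshold):
    """Fraction of paired positions within `threshold` for a fixed
    superposition (no iterative search for the best superposition)."""
    within = sum(1 for m, t in zip(model_pos, target_pos)
                 if math.dist(m, t) <= threshold)
    return within / len(target_pos)

def gdt_average(model_pos, target_pos, thresholds):
    """Mean of gdt_fraction over the given distance thresholds."""
    return sum(gdt_fraction(model_pos, target_pos, t)
               for t in thresholds) / len(thresholds)

# Toy positions with deviations of 0.3, 0.9, 1.5, 3.0 and 10.0 Angstrom
target_pos = [(0.0, 0.0, 0.0)] * 5
model_pos = [(0.3, 0.0, 0.0), (0.9, 0.0, 0.0), (1.5, 0.0, 0.0),
             (3.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
gdtts = gdt_average(model_pos, target_pos, (8.0, 4.0, 2.0, 1.0))
gdtha = gdt_average(model_pos, target_pos, (4.0, 2.0, 1.0, 0.5))
```

For these deviations the per-threshold fractions are 0.2 (0.5A), 0.4 (1A), 0.6 (2A), 0.8 (4A) and 0.8 (8A), so the high-accuracy variant is the stricter of the two averages.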
- property rmsd¶
RMSD
Computed on
rigid_transformed_mapped_model_pos
andrigid_mapped_target_pos
- Type:
float
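Once the model positions have been transformed onto the target, RMSD reduces to the root of the mean squared distance over paired positions. A minimal sketch on plain coordinate tuples follows. The function name and input types are assumptions; the Kabsch superposition itself is not shown.

```python
import math

def rmsd(pos_a, pos_b):
    """Root mean square deviation between two equally long position lists."""
    if len(pos_a) != len(pos_b) or not pos_a:
        raise ValueError("position lists must be non-empty and of equal length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(pos_a, pos_b))
    return math.sqrt(squared / len(pos_a))

transformed_model = [(3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
target = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
value = rmsd(transformed_model, target)
```

The two deviations are 3A and 4A, so the result is the square root of (9 + 16) / 2.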
- property cad_score¶
The global CAD atom-atom (AA) score
Computed based on
model
. In case of oligomers, mapping
is used.
- Type:
float
- property local_cad_score¶
The per-residue CAD atom-atom (AA) scores
Computed based on
model
. In case of oligomers, mapping
is used.
- Type:
dict
- property patch_qs¶
Patch QS-scores for each residue in
model_interface_residues
Representative patches for each residue r in chain c are computed as follows:
mdl_patch_one: All residues in c with CB (CA for GLY) positions within 8A of r and within 12A of residues from any other chain.
mdl_patch_two: Closest residue x to r in any other chain gets identified. Patch is then constructed by selecting all residues from any other chain within 8A of x and within 12A from any residue in c.
trg_patch_one: Chain name and residue number based mapping from mdl_patch_one
trg_patch_two: Chain name and residue number based mapping from mdl_patch_two
Results are stored in the same manner as
model_interface_residues
, with corresponding scores instead of residue numbers. Scores for residues which are not mol.ChemType.AMINOACIDS
are set to None. Additionally, interface patches are derived from model
. If they contain residues which are not covered by target
, the score is set to None too.
- Type:
dict
with chain names as key and list
with scores of the respective interface residues.
- property tm_score¶
TM-score computed with USalign
USalign executable can be specified with usalign_exec kwarg at Scorer construction, an OpenStructure internal copy of the USalign code is used otherwise.
- Type:
float
- property usalign_mapping¶
Mapping computed with USalign
Dictionary with target chain names as key and model chain names as values. No guarantee that all chains are mapped. USalign executable can be specified with usalign_exec kwarg at Scorer construction, an OpenStructure internal copy of the USalign code is used otherwise.
- Type:
dict