Local Distance Difference Test (LDDT)

Note

This is a new implementation of LDDT, introduced in OpenStructure 2.4 with a focus on supporting quaternary structure and compounds beyond the 20 standard proteinogenic amino acids. The previous LDDT code that comes with Mariani et al. is considered deprecated.

Note

lddt.lDDTScorer provides the raw Python API to compute LDDT, but stereochemistry checks as described in Mariani et al. must be done separately. You may want to check out the compare-structures action (Comparing two structures) to compute LDDT with pre-processing and support for quaternary structures.

class lDDTScorer(target, compound_lib=None, custom_compounds=None, inclusion_radius=15, sequence_separation=0, symmetry_settings=None, seqres_mapping={}, bb_only=False)

LDDT scorer object for a specific target

Sets up everything to score models of that target. LDDT (local distance difference test) is defined as the fraction of pairwise distances whose difference between target and model is below a threshold. If multiple thresholds are given, the average over all thresholds is returned. See:

V. Mariani, M. Biasini, A. Barbato, T. Schwede, lDDT: a local superposition-free score for comparing protein structures and models using distance difference tests, Bioinformatics, 2013
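The definition above can be illustrated with a minimal, standard-library-only sketch. It is not the OpenStructure implementation: toy_lddt and its dictionary-based input format are hypothetical, symmetries and chains are ignored, and the default thresholds and inclusion radius are simply taken from the signatures documented on this page.

```python
import math
from itertools import combinations

def toy_lddt(target, model, thresholds=(0.5, 1.0, 2.0, 4.0),
             inclusion_radius=15.0, sequence_separation=0):
    """Toy LDDT for a single chain (hypothetical helper, not part of ost).

    target/model map (resnum, atom_name) -> (x, y, z).
    """
    conserved = 0
    total = 0
    for a, b in combinations(sorted(target), 2):
        # Residues must be further apart than sequence_separation;
        # the default 0 therefore only drops intra-residue pairs.
        if abs(a[0] - b[0]) <= sequence_separation:
            continue
        d_trg = math.dist(target[a], target[b])
        if d_trg >= inclusion_radius:
            continue
        in_model = a in model and b in model
        d_mdl = math.dist(model[a], model[b]) if in_model else None
        for t in thresholds:
            total += 1
            # Distances missing in the model count as not conserved
            if in_model and abs(d_trg - d_mdl) < t:
                conserved += 1
    # Every threshold sees the same number of target distances, so this
    # ratio equals the average of the per-threshold fractions.
    return conserved / total if total else None

target = {(1, "CA"): (0.0, 0.0, 0.0),
          (2, "CA"): (3.8, 0.0, 0.0),
          (3, "CA"): (7.6, 0.0, 0.0)}
shifted = dict(target)
shifted[(3, "CA")] = (9.1, 0.0, 0.0)  # residue 3 displaced by 1.5 A

print(toy_lddt(target, target))   # identical model: all checks pass
print(toy_lddt(target, shifted))  # 8 of 12 threshold checks pass
```

In the shifted model, the two distances involving residue 3 differ by 1.5 Å and therefore only pass the 2.0 and 4.0 Å thresholds, while the unaffected distance passes all four.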

Parameters:

target (ost.mol.EntityHandle/ost.mol.EntityView) – The target
compound_lib (ost.conop.CompoundLib) – Compound library from which a compound for each residue is extracted based on its name. Uses ost.conop.GetDefaultLib() if not given, raises if this returns no valid compound library. Atoms defined in the compound are searched in the residue and build the reference for scoring. If the residue has atoms with names [“A”, “B”, “C”] but the corresponding compound only has [“A”, “B”], “A” and “B” are considered for scoring. If the residue has atoms [“A”, “B”] but the compound has [“A”, “B”, “C”], “C” is considered missing and does not influence scoring, even if present in the model.
custom_compounds (dict with residue names (str) as key and CustomCompound as value) – Custom compounds defining reference atoms. If given, custom_compounds take precedence over compound_lib.
inclusion_radius (float) – All pairwise distances < inclusion_radius are considered for scoring
sequence_separation (int) – Only pairwise distances between atoms of residues which are further apart than this threshold are considered. Residue distance is based on resnum. The default (0) considers all pairwise distances except intra-residue distances.
symmetry_settings (SymmetrySettings) – Define residues exhibiting internal symmetry, uses GetDefaultSymmetrySettings() if not given.
seqres_mapping (dict (key: str, value: ost.seq.AlignmentHandle)) – Mapping of model residues at the scoring stage happens with residue numbers defining their location in a reference sequence (SEQRES) using one-based indexing. If the residue numbers in target don’t correspond to that SEQRES, you can specify the mapping manually by providing a dictionary with a reference sequence (SEQRES) for one or more chains. Key: chain name, value: alignment (seq1: SEQRES, seq2: sequence of residues in chain). Example: The residues in a chain with name “A” have sequence “YEAH” and residue numbers [42,43,44,45]. You can provide an alignment with seq1 “HELLYEAH” and seq2 “----YEAH”. “Y” gets assigned residue number 5, “E” gets assigned 6 and so on, no matter what the original residue numbers were.
bb_only (bool) – Only consider atoms with name “CA” in case of amino acids and “C3’” in case of nucleotides. This invalidates compound_lib. Raises if any residue in target is neither r.chem_class.IsPeptideLinking() nor r.chem_class.IsNucleotideLinking().
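The atom selection rule described for compound_lib can be sketched in a few lines. This is an illustration only: reference_atoms is a hypothetical helper working on plain lists of atom names, whereas the scorer itself looks up compounds in an ost.conop.CompoundLib.

```python
def reference_atoms(residue_atom_names, compound_atom_names):
    """Split compound atoms into those found in the residue and those missing.

    Hypothetical helper: atoms that are in the residue but not in the
    compound are ignored entirely, as described for compound_lib.
    """
    in_residue = set(residue_atom_names)
    present = [a for a in compound_atom_names if a in in_residue]
    missing = [a for a in compound_atom_names if a not in in_residue]
    return present, missing

# Residue has an extra atom "C" that the compound does not define
print(reference_atoms(["A", "B", "C"], ["A", "B"]))  # (['A', 'B'], [])

# Compound defines "C" but the target residue lacks it
print(reference_atoms(["A", "B"], ["A", "B", "C"]))  # (['A', 'B'], ['C'])
```

In the second case "C" is reported as missing in the target, so it would not contribute to scoring even if a model contained it.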

Raises:

RuntimeError if target contains a compound which is not in compound_lib
RuntimeError if symmetry_settings specifies symmetric atoms that are not present in the corresponding compound in compound_lib
RuntimeError if seqres_mapping is not provided and target contains residue numbers with insertion codes or the residue numbers of a chain are not monotonically increasing
RuntimeError if seqres_mapping is provided but an alignment is invalid (seq1 contains gaps, mismatch in seq1/seq2, seq2 does not match the residues in the corresponding chain)
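How a seqres_mapping alignment translates into residue numbers, and when it would be rejected, can be sketched as follows. The function resnums_from_alignment is hypothetical and operates on plain strings rather than an ost.seq.AlignmentHandle; comparing seq2 against the actual residues of the chain is left out.

```python
def resnums_from_alignment(seqres, chain_seq):
    """Return one-based SEQRES positions for the residues in chain_seq.

    Hypothetical helper. seqres is seq1 (no gaps allowed), chain_seq is
    seq2 with '-' marking SEQRES positions not present in the chain.
    """
    if "-" in seqres:
        raise RuntimeError("seq1 (SEQRES) must not contain gaps")
    if len(seqres) != len(chain_seq):
        raise RuntimeError("aligned sequences differ in length")
    resnums = []
    for column, (ref, res) in enumerate(zip(seqres, chain_seq), start=1):
        if res == "-":
            continue  # SEQRES position without a residue in the chain
        if res != ref:
            raise RuntimeError(f"mismatch in seq1/seq2 at column {column}")
        resnums.append(column)
    return resnums

# The "YEAH" example from the seqres_mapping description
print(resnums_from_alignment("HELLYEAH", "----YEAH"))  # [5, 6, 7, 8]
```

The original residue numbers [42,43,44,45] play no role here; only the column of each residue in the alignment matters.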

DRMSD(model, dist_cap=5, chain_mapping=None, no_interchain=False, no_intrachain=False, residue_mapping=None, check_resnames=True, add_mdl_contacts=False, interaction_data=None)

DRMSD of model - globally and per-residue

DRMSD is the distance RMSD, i.e. the RMSD of distance differences. It is closely related to LDDT, as it operates on distance differences for all interatomic distances within the same inclusion radius as in LDDT. Distance differences are capped at dist_cap, which is also the value assigned to missing distances.
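A minimal sketch of this computation, using only the standard library, might look like the following. The function toy_drmsd and its input format are hypothetical; the real method additionally handles chain mappings, symmetries and per-residue values.

```python
import math
from itertools import combinations

def toy_drmsd(target, model, dist_cap=5.0, inclusion_radius=15.0):
    """Toy DRMSD (hypothetical helper, not part of ost).

    target/model map (resnum, atom_name) -> (x, y, z).
    """
    squared_sum = 0.0
    n_distances = 0
    for a, b in combinations(sorted(target), 2):
        if a[0] == b[0]:
            continue  # intra-residue distances are not considered
        d_trg = math.dist(target[a], target[b])
        if d_trg >= inclusion_radius:
            continue
        if a in model and b in model:
            diff = abs(d_trg - math.dist(model[a], model[b]))
            diff = min(diff, dist_cap)  # cap large distance differences
        else:
            diff = dist_cap  # missing distances get the cap value
        squared_sum += diff * diff
        n_distances += 1
    return math.sqrt(squared_sum / n_distances) if n_distances else None

target = {(1, "CA"): (0.0, 0.0, 0.0),
          (2, "CA"): (3.8, 0.0, 0.0),
          (3, "CA"): (7.6, 0.0, 0.0)}
far_off = dict(target)
far_off[(3, "CA")] = (100.0, 0.0, 0.0)
without_res3 = {k: v for k, v in target.items() if k[0] != 3}

# A grossly misplaced residue and a missing residue give the same value,
# because both end up contributing dist_cap per affected distance.
print(toy_drmsd(target, far_off), toy_drmsd(target, without_res3))
```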

Parameters:

model (ost.mol.EntityHandle/ost.mol.EntityView) – Model to be scored. Models should preferably be scored after performing stereochemistry checks in order to penalize nonsensical irregularities. This must be done separately as a pre-processing step. Target contacts that are not covered by model are considered not conserved, thus increasing the DRMSD score. This also includes missing model chains or model chains for which no mapping is provided in chain_mapping.
dist_cap (float) – Cap for distance differences.
chain_mapping (dict with str as keys/values) – Mapping of model chains (key) onto target chains (value). This is required if target or model have more than one chain.
no_interchain (bool) – Whether to exclude interchain contacts
no_intrachain (bool) – Whether to exclude intrachain contacts (i.e. only consider interface related contacts)
residue_mapping (dict with key: str, value: ost.seq.AlignmentHandle) – By default, residue mapping is based on residue numbers. That means, a model chain and the respective target chain map to the same underlying reference sequence (SEQRES). Alternatively, you can specify one or several alignment(s) between model and target chains by providing a dictionary. key: Name of chain in model (respective target chain is extracted from chain_mapping), value: Alignment with first sequence corresponding to target chain and second sequence to model chain. There is NO reference sequence involved, so the two sequences MUST exactly match the actual residues observed in the respective target/model chains (ATOMSEQ).
check_resnames (bool) – On by default. Enforces residue name matches between mapped model and target residues.
add_mdl_contacts (bool) – Adds model contacts. By default, only contacts that are within a certain distance threshold in the target are scored, which does not penalize for added model contacts. If set to True, contacts that are within the specified distance threshold in the model but not necessarily in the target are considered too. No contact will be added if the respective atom pair is not resolved in the target.
interaction_data (tuple) – Expert parameter, don’t use
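One way to read the add_mdl_contacts description is as a union of two contact sets, sketched below. This reflects an interpretation of the text above rather than the library's code, and contact_pairs/scored_pairs are hypothetical helpers on dictionary-based toy structures.

```python
import math
from itertools import combinations

def contact_pairs(coords, radius):
    """Inter-residue atom pairs closer than radius (hypothetical helper)."""
    return {(a, b) for a, b in combinations(sorted(coords), 2)
            if a[0] != b[0] and math.dist(coords[a], coords[b]) < radius}

def scored_pairs(target, model, radius=15.0, add_mdl_contacts=False):
    """Atom pairs that enter scoring under the interpretation stated above."""
    pairs = contact_pairs(target, radius)
    if add_mdl_contacts:
        for a, b in contact_pairs(model, radius):
            # Only pairs resolved in the target can be added
            if a in target and b in target:
                pairs.add((a, b))
    return pairs

target = {(1, "CA"): (0.0, 0.0, 0.0), (2, "CA"): (20.0, 0.0, 0.0)}
model = {(1, "CA"): (0.0, 0.0, 0.0), (2, "CA"): (5.0, 0.0, 0.0),
         (3, "CA"): (3.0, 0.0, 0.0)}

print(scored_pairs(target, model))                         # empty set
print(scored_pairs(target, model, add_mdl_contacts=True))  # pair 1-2 only
```

Residues 1 and 2 are 20 Å apart in the target but 5 Å apart in the model, so the pair is only scored with add_mdl_contacts=True. Pairs involving residue 3 are never added because that residue is not resolved in the target.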

Returns:

global and per-residue DRMSD scores as a tuple - first element is global DRMSD score (None if target has no contacts) and second element a list of per-residue scores with length len(model.residues). None is assigned to residues that are not covered by target and to covered residues that have no contacts in target.

GetNChainContacts(target_chain, no_interchain=False)

Returns number of contacts expected for a certain chain in target

Parameters:

target_chain (str) – Chain in target for which you want the number of expected contacts
no_interchain (bool) – Whether to exclude interchain contacts

Raises:

RuntimeError if the specified chain does not exist

lDDT(model, thresholds=[0.5, 1.0, 2.0, 4.0], local_lddt_prop=None, local_contact_prop=None, chain_mapping=None, no_interchain=False, no_intrachain=False, penalize_extra_chains=False, residue_mapping=None, return_dist_test=False, check_resnames=True, add_mdl_contacts=False, interaction_data=None, set_atom_props=False)

Computes LDDT of model - globally and per-residue

Parameters:

model (ost.mol.EntityHandle/ost.mol.EntityView) – Model to be scored. Models should preferably be scored after performing stereochemistry checks in order to penalize nonsensical irregularities. This must be done separately as a pre-processing step. Target contacts that are not covered by model are considered not conserved, thus decreasing the LDDT score. This also includes missing model chains or model chains for which no mapping is provided in chain_mapping.
thresholds (list of floats) – Thresholds of distance differences to be considered as correct - see docs in constructor for more info. default: [0.5, 1.0, 2.0, 4.0]
local_lddt_prop (str) – If set, per-residue scores will be assigned as generic float property of that name
local_contact_prop (str) – If set, number of expected contacts as well as number of conserved contacts will be assigned as generic int property. Expected contacts will be set as <local_contact_prop>_exp, conserved contacts as <local_contact_prop>_cons. Values are summed over all thresholds.
chain_mapping (dict with str as keys/values) – Mapping of model chains (key) onto target chains (value). This is required if target or model have more than one chain.
no_interchain (bool) – Whether to exclude interchain contacts
no_intrachain (bool) – Whether to exclude intrachain contacts (i.e. only consider interface related contacts)
penalize_extra_chains (bool) – Whether to include a fixed penalty for additional chains in the model that are not mapped to the target. ONLY AFFECTS RETURNED GLOBAL SCORE. In detail: adds the number of intra-chain contacts of each extra chain to the expected contacts, thus adding a penalty.
residue_mapping (dict with key: str, value: ost.seq.AlignmentHandle) – By default, residue mapping is based on residue numbers. That means, a model chain and the respective target chain map to the same underlying reference sequence (SEQRES). Alternatively, you can specify one or several alignment(s) between model and target chains by providing a dictionary. key: Name of chain in model (respective target chain is extracted from chain_mapping), value: Alignment with first sequence corresponding to target chain and second sequence to model chain. There is NO reference sequence involved, so the two sequences MUST exactly match the actual residues observed in the respective target/model chains (ATOMSEQ).
return_dist_test (bool) – Whether to additionally return the underlying per-residue data for the distance difference test. Adds five objects to the return tuple. First: number of total contacts summed over all thresholds. Second: number of conserved contacts summed over all thresholds. Third: list with one entry per scored residue, containing indices referring to model.residues. Fourth: numpy array of size len(scored_residues) containing the number of total contacts. Fifth: numpy matrix of shape (len(scored_residues), len(thresholds)) specifying how many contacts are conserved for each threshold.
check_resnames (bool) – On by default. Enforces residue name matches between mapped model and target residues.
add_mdl_contacts (bool) – Adds model contacts. By default, only contacts that are within a certain distance threshold in the target are scored, which does not penalize for added model contacts. If set to True, contacts that are within the specified distance threshold in the model but not necessarily in the target are considered too. No contact will be added if the respective atom pair is not resolved in the target.
interaction_data (tuple) – Expert parameter, don’t use
set_atom_props (bool) – If True, sets generic properties on a per-atom level if local_lddt_prop/local_contact_prop are set as well. In other words: this is the only way you can get per-atom LDDT values.
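The effect of penalize_extra_chains on the global score is plain arithmetic and can be sketched as below. The function global_score is hypothetical, and it assumes that all contact counts are expressed on the same footing (for instance, all summed over thresholds); how the library counts them internally is not stated in this section.

```python
def global_score(n_conserved, n_expected, extra_chain_contacts=()):
    """Global LDDT from contact counts (hypothetical helper).

    extra_chain_contacts holds the number of intra-chain contacts of each
    model chain that is not mapped to the target; they only enlarge the
    denominator and can therefore only lower the score.
    """
    n_expected_total = n_expected + sum(extra_chain_contacts)
    if n_expected_total == 0:
        return None
    return n_conserved / n_expected_total

print(global_score(800, 1000))         # without penalty
print(global_score(800, 1000, [250]))  # one unmapped chain with 250 contacts
```

With 800 of 1000 expected contacts conserved, an unmapped chain contributing 250 intra-chain contacts lowers the ratio from 800/1000 to 800/1250. Per-residue scores are unaffected, matching the remark that only the returned global score changes.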

Returns:

global and per-residue LDDT scores as a tuple - first element is global LDDT score (None if target has no contacts) and second element a list of per-residue scores with length len(model.residues). None is assigned to residues that are not covered by target. If a residue is covered but has no contacts in target, 0.0 is assigned.
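The per-residue values can be related to the data returned with return_dist_test. The sketch below uses plain lists instead of numpy arrays and assumes that the fourth object holds per-residue contact counts that are not summed over thresholds, while the fifth holds conserved counts per threshold; per_residue_lddt is a hypothetical helper.

```python
def per_residue_lddt(n_total, n_conserved):
    """Per-residue LDDT from distance test data (hypothetical helper).

    n_total: number of contacts per scored residue
    n_conserved: per scored residue, conserved contacts for each threshold
    """
    scores = []
    for total, per_threshold in zip(n_total, n_conserved):
        if total == 0:
            # Covered residue without contacts in target
            scores.append(0.0)
        else:
            # Average of the per-threshold fractions
            scores.append(sum(per_threshold) / (len(per_threshold) * total))
    return scores

n_total = [10, 0, 4]
n_conserved = [[2, 5, 8, 10], [0, 0, 0, 0], [4, 4, 4, 4]]
print(per_residue_lddt(n_total, n_conserved))  # [0.625, 0.0, 1.0]
```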

class SymmetrySettings

Container for symmetric compounds

LDDT considers symmetries and selects the one resulting in the highest possible score.

A symmetry is defined as a renaming operation on one or more atoms that leads to a chemically equivalent residue. An example is OD1 and OD2 in ASP: renaming OD1 to OD2 and vice versa gives a chemically equivalent residue.

Use AddSymmetricCompound() to define a symmetry which can then directly be accessed through the symmetric_compounds member.
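The idea of trying a renaming and keeping the better result can be sketched with plain dictionaries. All names below are hypothetical, and the sketch swaps atom names on the model side for illustration; which side the library renames is not stated in this section.

```python
import math

def swap_names(atoms, swaps):
    """Return a copy of atoms with the names of each pair exchanged."""
    renamed = dict(atoms)
    for a, b in swaps:
        if a in atoms and b in atoms:
            renamed[a], renamed[b] = atoms[b], atoms[a]
    return renamed

def n_conserved(target_res, model_res, trg_partner, mdl_partner, threshold=0.5):
    """Count atoms whose distance to a partner atom is conserved."""
    count = 0
    for name, pos in target_res.items():
        if name not in model_res:
            continue
        d_trg = math.dist(pos, trg_partner)
        d_mdl = math.dist(model_res[name], mdl_partner)
        if abs(d_trg - d_mdl) < threshold:
            count += 1
    return count

def best_over_symmetries(target_res, model_res, trg_partner, mdl_partner, swaps):
    """Evaluate the original and the renamed model residue, keep the best."""
    candidates = [model_res, swap_names(model_res, swaps)]
    return max(n_conserved(target_res, c, trg_partner, mdl_partner)
               for c in candidates)

# ASP side chain oxygens, labelled the other way round in the model
target_asp = {"OD1": (0.0, 0.0, 0.0), "OD2": (2.2, 0.0, 0.0)}
model_asp = {"OD1": (2.2, 0.0, 0.0), "OD2": (0.0, 0.0, 0.0)}
partner = (-3.0, 0.0, 0.0)  # some atom of a neighbouring residue

print(n_conserved(target_asp, model_asp, partner, partner))
print(best_over_symmetries(target_asp, model_asp, partner, partner,
                           [("OD1", "OD2")]))
```

Without the renaming neither oxygen has a conserved distance to the partner atom, while after swapping OD1 and OD2 both do, so the model is not penalized for an arbitrary labelling choice.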

AddSymmetricCompound(name, symmetric_atoms)

Adds symmetry for compound with name

Parameters:

name (str) – Name of compound with symmetry
symmetric_atoms (list of tuple) – Pairs of atom names that define renaming operation, i.e. after applying all switches defined in the tuples, the resulting residue should be chemically equivalent. Atom names must refer to the PDB component dictionary.

GetDefaultSymmetrySettings()

Constructs and returns SymmetrySettings object for natural amino acids

class CustomCompound(atom_names)

Defines atoms for custom compounds

LDDT requires the reference atoms of a compound, which are typically extracted from an ost.conop.CompoundLib. This lightweight container allows handling arbitrary compounds which are not necessarily in the compound library.

Parameters:

atom_names (list of str) – Names of atoms of custom compound

static FromResidue(res)

Construct custom compound from residue

Parameters:

res (ost.mol.ResidueView/ost.mol.ResidueHandle) – Residue from which reference atom names are extracted. Hydrogen/deuterium atoms are filtered out.

Returns:

CustomCompound

class lDDTSettings(radius=15, sequence_separation=0, cutoffs=(0.5, 1.0, 2.0, 4.0), label='locallddt')

Object containing the settings used for LDDT calculations.

Parameters:

radius – Sets radius.
sequence_separation – Sets sequence_separation.
cutoffs – Sets cutoffs.
label – Sets label.

radius

Distance inclusion radius.

Type: float

sequence_separation

Sequence separation.

Type: int

cutoffs

List of thresholds used to determine distance conservation.

Type: list of float

label

The base name for the ResidueHandle properties that store the local scores.

Type: str

PrintParameters()

Print settings.

ToString()

Returns:

String representation of the lDDTSettings object.

Return type: str
