You are reading the documentation for version 1.9 of OpenStructure.
|
Parameters: |
|
---|---|
Returns: | a tuple containing the counts of the conserved distances in the model and of all the checked distances |
LocalDistDiffTest
(model, reference_list, distance_list, settings)Wrapper around LocalDistDiffTest() above.
Parameters: |
|
---|---|
Returns: | the Local Distance Difference Test score (conserved distances divided by all the checked distances) |
Return type: |
|
LocalDistDiffTest
(model, target, cutoff, max_dist, local_lddt_property_string="")Wrapper around LocalDistDiffTest() above using: distance_list = CreateDistanceList() with target and max_dist as parameters and tolerance_list = [cutoff].
Parameters: |
|
---|---|
Returns: | the Local Distance Difference Test score (conserved distances divided by all the checked distances) |
Return type: |
|
LocalDistDiffTest
(alignment, tolerance, radius, ref_index=0, mdl_index=1)Calculates the Local Distance Difference Test score (see previous function) starting from an alignment between a reference structure and a model. The AlignmentHandle parameter used to provide the alignment to the function needs to have the two structures attached to it. By default the first structure in the alignment is considered to be the reference structure, and the second structure is taken as the model. This can however be changed by passing the indexes of the two structures in the AlignmentHandle as parameters to the function.
Note
This function uses the old implementation of the Local Distance Difference Test algorithm and will give slightly different results from the new one.
Parameters: |
|
---|---|
Returns: | the Local Distance Difference Test score |
LDDTHA
(model, distance_list, sequence_separation=0)¶This function calculates the Local Distance Difference Test, using the same threshold values as the GDT-HA test (the default set of thresholds used for the lDDT score) (See previous functions). The thresholds are 0.5, 1, 2, and 4 Angstroms.
The function only compares the input distance list to the first chain of the model structure.
The local residue-based lDDT score values are stored in the ResidueHandles of the model passed to the function in a float property called “locallddt”.
A sequence separation parameter can be passed to the function. If this happens, only distances between residues whose separation is higher than the provided parameter are considered when computing the score.
Parameters: |
|
---|---|
Returns: | the Local Distance Difference Test score |
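As a rough illustration of how the four thresholds enter the score, the sketch below checks every reference distance against every threshold and divides the number of conserved checks by the number of checks made. It is plain Python and does not call the OpenStructure API. The helper name `lddt_from_distances`, the dictionary layout, the strict `<` comparison, and counting a distance that is missing from the model as not conserved are all assumptions made for this example.

```python
# Illustrative sketch only: not the OpenStructure implementation.
GDT_HA_THRESHOLDS = (0.5, 1.0, 2.0, 4.0)  # Angstroms, as listed above


def lddt_from_distances(reference, model, thresholds=GDT_HA_THRESHOLDS):
    """Return (conserved, checked, score) for two distance dictionaries.

    reference and model map a hypothetical atom-pair key to a distance in
    Angstroms. A pair missing from the model counts as not conserved
    (assumption made for this sketch).
    """
    conserved = 0
    checked = 0
    for pair, ref_dist in reference.items():
        for threshold in thresholds:
            checked += 1
            if pair in model and abs(model[pair] - ref_dist) < threshold:
                conserved += 1
    score = float(conserved) / checked if checked else 0.0
    return conserved, checked, score


reference = {("A.GLY2.CA", "A.ALA5.CA"): 6.0,
             ("A.GLY2.CA", "A.SER9.CA"): 10.0}
model = {("A.GLY2.CA", "A.ALA5.CA"): 6.3,   # deviation 0.3: within all four
         ("A.GLY2.CA", "A.SER9.CA"): 11.5}  # deviation 1.5: within 2 and 4 only
conserved, checked, score = lddt_from_distances(reference, model)
```

Here six of the eight checks are conserved, so the sketch reports a score of 0.75.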
DistanceRMSDTest
(model, distance_list, cap_difference, sequence_separation=0, local_drmsd_property_string="")¶This function performs a Distance RMSD Test on a provided model, and calculates the two values that are necessary to determine the Distance RMSD Score, namely the sum of squared distance deviations and the number of distances on which the sum was computed.
The Distance RMSD Test (or DRMSD Test) computes the deviation in the length of local contacts between a model and a reference structure and expresses it in the form of a score value. The score has an RMSD-like form, with the deviations in the RMSD formula computed as contact distance differences. The score is open-ended, with a value of zero meaning complete agreement of local contact distances, and a positive value revealing a disagreement of magnitude proportional to the score value itself. This score does not require any superposition between the model and the reference.
This function processes a list of distances provided by the user, together with their length in the reference structure. For each distance that is found in the model, its difference from the reference length is computed and used as the deviation term in the RMSD-like formula. When a distance is not present in the model because one or both of the atoms are missing, a default deviation value provided by the user is used.
The function only processes distances between atoms that do not belong to the same residue, and considers only standard residues in the first chain of the model. For residues with symmetric sidechains (GLU, ASP, ARG, VAL, PHE, TYR), the naming of the atoms is ambiguous. For these residues, the function computes the Distance RMSD Test score that each naming convention would generate when considering all non-ambiguous surrounding atoms. The solution that gives the lower score is then picked to compute the final Distance RMSD Score for the whole model.
A sequence separation parameter can be passed to the function. If this happens, only distances between residues whose separation is higher than the provided parameter are considered when computing the score.
If a string is passed as last parameter to the function, the function computes the Distance RMSD Score for each residue and saves it as a float property in the ResidueHandle, with the passed string as property name. Additionally, the actual sum of squared deviations and the number of distances on which it was computed are stored as properties in the ResidueHandle. The property names are respectively <passed string>_sum (a float property) and <passed string>_count (an integer property).
Parameters: |
|
---|---|
Returns: | a tuple containing the sum of squared distance deviations, and the number of distances on which it was computed. |
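The two returned values can be pictured with a small stand-alone sketch. It accumulates the squared deviation of each reference distance and substitutes the user-supplied default when the distance is absent from the model. The function names, the dictionary layout, and the square-root form of the final score in `drmsd_score` are assumptions for illustration; the text above only states that the score is RMSD-like.

```python
import math


def distance_rmsd_test(reference, model, cap_difference):
    """Return (sum of squared deviations, number of distances).

    reference and model map a hypothetical atom-pair key to a distance in
    Angstroms. A reference distance missing from the model contributes
    cap_difference as its deviation, as described above.
    """
    squared_sum = 0.0
    count = 0
    for pair, ref_dist in reference.items():
        if pair in model:
            deviation = model[pair] - ref_dist
        else:
            deviation = cap_difference
        squared_sum += deviation * deviation
        count += 1
    return squared_sum, count


def drmsd_score(squared_sum, count):
    """RMSD-like score; the square-root form is an assumption of this sketch."""
    return math.sqrt(squared_sum / count) if count else 0.0


reference = {"p1": 5.0, "p2": 8.0, "p3": 4.0}
model = {"p1": 5.0, "p2": 11.0}          # "p3" is missing in the model
squared_sum, count = distance_rmsd_test(reference, model, cap_difference=5.0)
score = drmsd_score(squared_sum, count)
```

The deviations are 0, 3 and the default 5, giving a sum of squares of 34 over 3 distances.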
DRMSD
(model, distance_list, cap_difference, sequence_separation=0)¶This function calculates the Distance RMSD Test score (see DistanceRMSDTest()).
The function only considers distances between atoms not belonging to the same residue, and only compares the input distance list to the first chain of the model structure. It requires, in addition to the model and the list themselves, a default deviation value to be used in the DRMSD Test when a distance is not found in the model.
The local Distance RMSD score values are stored in the ResidueHandles of the model passed to the function in a float property called “localdrmsd”.
A sequence separation parameter can be passed to the function. If this happens, only distances between residues whose separation is higher than the provided parameter are considered when computing the score.
Parameters: |
|
---|---|
Returns: | the Distance RMSD Test score |
CreateDistanceList
(reference, radius)¶CreateDistanceListFromMultipleReferences
(reference_list, tolerance_list, sequence_separation, radius)¶Both these functions create lists of distances to be checked during a Local Distance Difference Test (see description of the functions above).
Note
These functions process only standard residues present in the first chain of the reference structures.
The only difference between the two functions is that one takes a single reference structure and the other a list of reference structures. The structures in the list have to be properly prepared before being passed to the function. Corresponding residues in the structures must have the same residue number, the same chain name, etc. Gaps are allowed and automatically dealt with: if information about a distance is present in at least one of the structures, it will be considered.
If a distance between two atoms is shorter than the inclusion radius in all structures in which the two atoms are present, it is included in the list. However, if the distance is longer than the inclusion radius in at least one of the structures, it is not considered to be a local interaction and is excluded from the list.
The multiple-reference function takes care of residues with ambiguous symmetric sidechains. To decide which naming convention to use, the function computes a Local Distance Difference Test score for each reference against the first reference structure in the list, using only unambiguously named atoms. It then picks the naming convention that gives the highest score, guaranteeing that all references are processed with the correct atom names.
The cutoff list that will later be used to compute the Local Distance Difference Test score and the sequence separation parameter must be passed to the multi-reference function. These parameters do not influence the output distance list, which always includes all distances within the provided radius (to make it consistent with the single-reference corresponding function). However, the parameters are used when dealing with the naming convention of residues with ambiguous nomenclature.
Parameters: |
|
---|---|
Returns: |
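The inclusion rule for several references can be sketched in plain Python: a pair is kept only if its distance stays below the radius in every reference where both atoms are present. The function name, the input layout, and storing the shortest and longest observed distance per pair are assumptions made for this example, not a description of the returned object.

```python
def merge_reference_distances(per_reference, radius):
    """Combine distance dictionaries from several references.

    per_reference is a list with one dictionary per reference structure,
    mapping a hypothetical atom-pair key to the distance observed in that
    structure; a pair is simply absent where one of its atoms is missing.
    Returns {pair: (shortest, longest)} for the pairs that are kept.
    """
    observed = {}
    for distances in per_reference:
        for pair, dist in distances.items():
            observed.setdefault(pair, []).append(dist)
    kept = {}
    for pair, dists in observed.items():
        # Keep the pair only if it is within the radius everywhere it occurs.
        if max(dists) < radius:
            kept[pair] = (min(dists), max(dists))
    return kept


ref_1 = {"ab": 5.0, "ac": 14.0, "ad": 7.0}
ref_2 = {"ab": 6.0, "ac": 16.0}           # atom d is missing here (a gap)
kept = merge_reference_distances([ref_1, ref_2], radius=15.0)
```

In this example `ac` is dropped because it exceeds the radius in the second reference, while `ad` is kept even though only one reference contains it.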
PreparelDDTGlobalRDMap
(reference_list, cutoff_list, sequence_separation, max_dist)¶A wrapper around CreateDistanceList() and CreateDistanceListFromMultipleReferences(). Depending on the length of the reference_list, it calls one or the other.
Parameters: |
|
---|---|
Returns: |
CleanlDDTReferences
(reference_list)¶Prepares references to be used in lDDT calculation. It checks that all references have the same chain name and selects this chain for further calculations.
Warning
This function modifies the passed reference_list list.
Parameters: | reference_list (list of EntityView ) – A list of reference structures from which distances are
derived |
---|
CheckStructure
(ent, bond_table, angle_table, nonbonded_table, bond_tolerance, angle_tolerance)¶Performs structural checks and filters the structure.
Parameters: |
|
---|
GetlDDTPerResidueStats
(model, distance_list, structural_checks, label)¶Get the per-residue statistics from the lDDT calculation.
Parameters: |
|
---|---|
Returns: | Per-residue local lDDT scores |
Return type: |
|
PrintlDDTPerResidueStats
(scores, structural_checks, cutoffs_length)¶Print per-residue statistics from lDDT calculation.
Parameters: |
|
---|
lDDTLocalScore
(cname, rname, rnum, is_assessed, quality_problems, local_lddt, conserved_dist, total_dist)¶Object containing per-residue information about calculated lDDT.
Parameters: |
|
---|
cname
¶Chain name.
Type: | str |
---|
rname
¶Residue name.
Type: | str |
---|
rnum
¶Residue number.
Type: | int |
---|
is_assessed
¶Is the residue taken into account? Yes or No.
Type: | str |
---|
quality_problems
¶Does the residue have quality problems? No if there are no problems, NA if the problems were not assessed, Yes if there are sidechain problems and Yes+ if there are backbone problems.
Type: | str |
---|
local_lddt
¶Local lDDT score for residue.
Type: | float |
---|
conserved_dist
¶Number of conserved distances.
Type: | int |
---|
total_dist
¶Total number of distances.
Type: | int |
---|
ToString
(structural_checks)¶Returns: | String representation of the lDDTLocalScore object. |
---|---|
Return type: | str |
Parameters: | structural_checks (bool) – Were structural checks applied during calculations? |
GetHeader
(structural_checks, cutoffs_length)¶Get the names of the fields as printed by ToString method.
Parameters: |
|
---|
StereoChemicalProps
(bond_table, angle_table, nonbonded_table)¶Object containing the stereo-chemical properties read from the stereo_chemical_props.txt file.
Parameters: |
|
---|
bond_table
¶Object containing bond parameters
Type: | StereoChemicalParams |
---|
angle_table
¶Object containing angle parameters
Type: | StereoChemicalParams |
---|
nonbonded_table
¶Object containing clashing distances parameters
Type: | ClashingDistances |
---|
lDDTSettings
(radius=15, sequence_separation=0, cutoffs=(0.5, 1.0, 2.0, 4.0), label="locallddt")¶Object containing the settings used for lDDT calculations.
Parameters: |
|
---|
radius
¶Distance inclusion radius.
Type: | float |
---|
sequence_separation
¶Sequence separation.
Type: | int |
---|
cutoffs
¶List of thresholds used to determine distance conservation.
Type: | list of float |
---|
label
¶The base name for the ResidueHandle properties that store the local scores.
Type: | str |
---|
PrintParameters
()¶Print settings.
ToString
()¶Returns: | String representation of the lDDTSettings object. |
---|---|
Return type: | str |
lDDTScorer
(references, model, settings)¶Object to compute lDDT scores using LocalDistDiffTest() as in Mariani et al.
Example usage:
#! /bin/env python
"""Run lDDT from within script."""
from ost.io import LoadPDB
from ost.mol.alg import (CleanlDDTReferences,
                         lDDTSettings, lDDTScorer)
ent_full = LoadPDB('3ia3', remote=True)
model_view = ent_full.Select('cname=A')
references = [ent_full.Select('cname=C')]
#
# Initialize settings with default parameters and print them
settings = lDDTSettings()
settings.PrintParameters()
# Clean up references
CleanlDDTReferences(references)
#
# Calculate lDDT
scorer = lDDTScorer(references=references, model=model_view, settings=settings)
print "Global score:", scorer.global_score
scorer.PrintPerResidueStats()
Parameters: |
|
---|
references
¶A list of reference structures.
Type: | list(EntityView ) |
---|
model
¶A model structure.
Type: | EntityView |
---|
settings
¶Settings used to calculate lDDT.
Type: | lDDTSettings |
---|
global_dist_list
¶Global map of residue properties.
Type: | GlobalRDMap |
---|
global_score
¶Global lDDT score. It is calculated as conserved_contacts divided by total_contacts.
Type: | float |
---|
conserved_contacts
¶Number of conserved distances.
Type: | int |
---|
total_contacts
¶Number of total distances.
Type: | int |
---|
local_scores
¶Local scores. For each residue, the local lDDT is calculated as the residue's conserved contacts divided by the residue's total contacts.
Type: | list(lDDTLocalScore ) |
---|
is_valid
¶Is the calculated score valid?
Type: | bool |
---|
PrintPerResidueStats
()¶Print per-residue statistics.
UniqueAtomIdentifier
(chain, residue_number, residue_name, atom_name)¶Object containing enough information to uniquely identify an atom in a structure.
Parameters: |
|
---|
GetChainName
()¶Returns the name of the chain to which the atom belongs, as a String
GetResidueName
()¶Returns the name of the residue to which the atom belongs, as a String
GetAtomName
()¶Returns the name of the atom, as a String
GetQualifiedAtomName
()¶Returns the qualified name of the atom: the chain name, followed by a unique residue identifier and the atom name (for example “A.GLY2.CA”).
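The layout of the qualified name in the example “A.GLY2.CA” can be reproduced with a one-line helper. This is a plain-Python illustration with a hypothetical function name, and it assumes an integer residue number without insertion code.

```python
def qualified_atom_name(chain, residue_name, residue_number, atom_name):
    """Build a string such as "A.GLY2.CA" from its parts (illustrative only)."""
    return "%s.%s%d.%s" % (chain, residue_name, residue_number, atom_name)


name = qualified_atom_name("A", "GLY", 2, "CA")
```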
ResidueRDMap
¶Dictionary-like object containing the list of interatomic distances that originate from a single residue to be checked during a run of the Local Distance Difference Test algorithm (key = pair of UniqueAtomIdentifier, value = pair of floats)
GlobalRDMap
¶Dictionary-like object containing all the ResidueRDMap objects related to residues of a single structure (key = ResNum, value = ResidueRDMap)
PrintResidueRDMap
(residue_distance_list)¶Prints to standard output all the distances contained in a ResidueRDMap object
PrintGlobalRDMap
(global_distance_list)¶Prints to standard output all the distances contained in each of the ResidueRDMap objects that make up a GlobalRDMap object.
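The nesting of the two maps can be mimicked with ordinary dictionaries. The sketch below uses a `namedtuple` as a stand-in for UniqueAtomIdentifier and an integer as a stand-in for ResNum, and it formats one line per stored distance. All names are hypothetical, and reading the pair of floats as a lower and an upper distance is an assumption of this example.

```python
from collections import namedtuple

# Stand-in for UniqueAtomIdentifier (hypothetical, plain Python).
AtomId = namedtuple("AtomId",
                    ["chain", "residue_number", "residue_name", "atom_name"])


def format_global_rd_map(global_map):
    """Return one text line per distance, grouped by residue number."""
    lines = []
    for res_num in sorted(global_map):
        residue_map = global_map[res_num]
        for (first, second), (low, high) in sorted(residue_map.items()):
            lines.append("%d: %s-%s %.2f %.2f" % (
                res_num, first.atom_name, second.atom_name, low, high))
    return lines


ca2 = AtomId("A", 2, "GLY", "CA")
ca5 = AtomId("A", 5, "ALA", "CA")
# Outer key: residue number; inner key: pair of atoms; value: pair of floats.
global_map = {2: {(ca2, ca5): (5.9, 6.1)}}
lines = format_global_rd_map(global_map)
```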
qsscoring
– Quaternary Structure (QS) scores¶Scoring of quaternary structures (QS). QS scoring follows the paper by Bertoni et al.
Note
Requirements for use:
- A compound library must be defined and accessible via GetDefaultLib(). This is set by default when executing scripts with ost. Otherwise, you must set this with SetDefaultLib().
- The Python modules scipy and numpy must be installed (e.g. pip install scipy numpy).
QSscoreError
¶Exception to be raised for “acceptable” exceptions in QS scoring.
Those are cases we might want to capture for default behavior.
QSscorer
(ent_1, ent_2, res_num_alignment=False)¶Object to compute QS scores.
Simple usage without any precomputed contacts, symmetries and mappings:
import ost
from ost.mol.alg import qsscoring
# load two biounits to compare
ent_full = ost.io.LoadPDB('3ia3', remote=True)
ent_1 = ent_full.Select('cname=A,D')
ent_2 = ent_full.Select('cname=B,C')
# get score
ost.PushVerbosityLevel(3)
try:
  qs_scorer = qsscoring.QSscorer(ent_1, ent_2)
  ost.LogScript('QSscore:', str(qs_scorer.global_score))
  ost.LogScript('Chain mapping used:', str(qs_scorer.chain_mapping))
  # commonly you want the QS global score as output
  qs_score = qs_scorer.global_score
except qsscoring.QSscoreError as ex:
  # default handling: report failure and set score to 0
  ost.LogError('QSscore failed:', str(ex))
  qs_score = 0
For maximal performance when computing QS scores of the same entity with many others, it is advisable to construct and reuse QSscoreEntity objects.
Any known / precomputed information can be filled into the appropriate attribute here (no checks done!). Otherwise most quantities are computed on first access and cached (lazy evaluation). Setters are provided to set values with extra checks (e.g. SetSymmetries()).
All necessary seq. alignments are done by global BLOSUM62-based alignment. A multiple sequence alignment is performed with ClustalW unless chain_mapping is provided manually. You will need to have an executable clustalw or clustalw2 in your PATH or you must set clustalw_bin accordingly. Otherwise an exception (ost.settings.FileNotFound) is thrown.
Formulas for QS scores:
- QS_best = weighted_scores / (weight_sum + weight_extra_mapped)
- QS_global = weighted_scores / (weight_sum + weight_extra_all)
-> weighted_scores = sum(w(min(d1,d2)) * (1 - abs(d1-d2)/12)) for shared
-> weight_sum = sum(w(min(d1,d2))) for shared
-> weight_extra_mapped = sum(w(d)) for all mapped but non-shared
-> weight_extra_all = sum(w(d)) for all non-shared
-> w(d) = 1 if d <= 5, exp(-2 * ((d-5.0)/4.28)^2) else
In the formulas above:
alignments
.calpha_only
is True).
Parameters: |
|
---|---|
Raises: |
|
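The QS score formulas listed above translate almost directly into code. The following stand-alone sketch evaluates w(d), QS_best and QS_global from plain lists of contact distances. The function names and the way the inputs are split into three lists are assumptions made for this example, and the sketch expects at least one contact so that the denominators are non-zero.

```python
import math


def weight(d):
    """w(d) = 1 if d <= 5, exp(-2 * ((d - 5.0) / 4.28)^2) otherwise."""
    if d <= 5.0:
        return 1.0
    return math.exp(-2.0 * ((d - 5.0) / 4.28) ** 2)


def qs_scores(shared, extra_mapped, extra_unmapped):
    """Return (QS_best, QS_global).

    shared: list of (d1, d2) distance pairs for shared contacts.
    extra_mapped: distances of contacts that are mapped but not shared.
    extra_unmapped: distances of the remaining non-shared contacts.
    """
    weighted_scores = sum(weight(min(d1, d2)) * (1.0 - abs(d1 - d2) / 12.0)
                          for d1, d2 in shared)
    weight_sum = sum(weight(min(d1, d2)) for d1, d2 in shared)
    weight_extra_mapped = sum(weight(d) for d in extra_mapped)
    # "All non-shared" covers the mapped and the unmapped extras together.
    weight_extra_all = weight_extra_mapped + sum(weight(d)
                                                 for d in extra_unmapped)
    qs_best = weighted_scores / (weight_sum + weight_extra_mapped)
    qs_global = weighted_scores / (weight_sum + weight_extra_all)
    return qs_best, qs_global


qs_best, qs_global = qs_scores(shared=[(4.0, 4.0), (6.0, 9.0)],
                               extra_mapped=[5.0],
                               extra_unmapped=[3.0])
```

Because weight_extra_all includes everything counted in weight_extra_mapped, QS_global can never exceed QS_best in this sketch.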
qs_ent_1
¶QSscoreEntity object for ent_1 given at construction. If entity names (original_name) are not unique, we set it to ‘pdb_1’ using SetName().
qs_ent_2
¶QSscoreEntity object for ent_2 given at construction. If entity names (original_name) are not unique, we set it to ‘pdb_2’ using SetName().
calpha_only
¶True if any of the two structures is CA-only (after cleanup).
Type: | bool |
---|
max_ca_per_chain_for_cm
¶Maximal number of CA atoms to use in each chain to determine chain mappings. Setting this to -1 disables the limit. Limiting it speeds up determination of symmetries and chain mappings. By default it is set to 100.
Type: | int |
---|
max_mappings_extensive
¶Maximal number of chain mappings to test for ‘extensive’ chain_mapping_scheme. The extensive chain mapping search must in the worst case check O(N^2) * O(N!) possible mappings for complexes with N chains. Two octamers without symmetry would require 322560 mappings to be checked. To limit computations, a QSscoreError is thrown if we try more than the maximal number of chain mappings. The value must be set before the first use of chain_mapping.
By default it is set to 100000.
Type: | int |
---|
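The figure quoted for two octamers equals 8 × 8!, which can be checked with a few lines of arithmetic. How the search actually enumerates its candidates is not described here, so reading the factor 8 as a number of starting choices is only an interpretation; the variable names are hypothetical.

```python
import math


def chain_permutations(n_chains):
    """Number of one-to-one assignments between two sets of n_chains chains."""
    return math.factorial(n_chains)


n = 8
assignments = chain_permutations(n)       # 8! = 40320
quoted_total = n * assignments            # 8 * 8! = 322560
exceeds_default_limit = quoted_total > 100000
```

With the default limit of 100000, such a search would therefore stop with a QSscoreError unless the limit is raised beforehand.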
res_num_alignment
¶Forces each alignment in alignments to be based on residue numbers instead of using a global BLOSUM62-based alignment.
Type: | bool |
---|
GetOligoLDDTScorer
(settings, penalize_extra_chains=True)¶Returns: |
|
---|---|
Parameters: |
|
SetSymmetries
(symm_1, symm_2)¶Set user-provided symmetry groups.
These groups are restricted to chain names appearing in ent_to_cm_1 and ent_to_cm_2 respectively. They are only valid if they cover all chains and both symm_1 and symm_2 have the same lengths of symmetry group tuples. Otherwise, a trivial symmetry group is used (see symm_1).
Parameters: |
---|
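The two validity conditions can be written down as a small check. This sketch is not the library's code: the function name is hypothetical, and "same lengths of symmetry group tuples" is read here as the two lists having tuples of equal length position by position, which is one possible interpretation.

```python
def symmetries_are_valid(symm_1, symm_2, chains_1, chains_2):
    """Check the two validity conditions stated for user-provided groups.

    symm_1, symm_2: lists of tuples of chain names.
    chains_1, chains_2: chain names present in the two entities.
    """
    def covers_all(symm, chains):
        listed = [name for group in symm for name in group]
        return sorted(listed) == sorted(chains)

    def group_lengths(symm):
        return [len(group) for group in symm]

    return (covers_all(symm_1, chains_1)
            and covers_all(symm_2, chains_2)
            and group_lengths(symm_1) == group_lengths(symm_2))


symm_1 = [("A", "B", "C"), ("D", "E", "F"), ("G", "H", "I")]
symm_2 = [("J", "K", "L"), ("M", "N", "O"), ("P", "Q", "R")]
valid = symmetries_are_valid(symm_1, symm_2,
                             list("ABCDEFGHI"), list("JKLMNOPQR"))
```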
alignments
¶List of successful sequence alignments using chain_mapping.
There will be one alignment for each mapped chain and they are ordered by their chain names in qs_ent_1.
The first sequence of each alignment belongs to qs_ent_1 and the second one to qs_ent_2. The sequences are named according to the mapped chain names and have views attached into QSscoreEntity.ent of qs_ent_1 and qs_ent_2.
If res_num_alignment is False, each alignment is performed using a global BLOSUM62-based alignment. Otherwise, the positions in the alignment sequences are simply given by the residue number so that residues with matching numbers are aligned.
Getter: | Computed on first use (cached) |
---|---|
Type: | list of AlignmentHandle |
best_score
¶QS-score without penalties.
Like global_score, but neglecting additional residues or chains in one of the biounits (i.e. the score is calculated considering only mapped chains and residues).
Getter: | Computed on first use (cached) |
---|---|
Type: | float |
Raises: | QSscoreError if only one chain is mapped |
chain_mapping
¶Mapping from ent_to_cm_1 to ent_to_cm_2.
Properties:
chem_mapping)
symm_1, symm_2) is taken into account
Details on algorithms used to find mapping:
global_score can range from 0.12 to 0.43 for mappings with very similar multi-chain-RMSD.
Getter: | Computed on first use (cached) |
---|---|
Type: | dict with key / value = str (chain names, key
for ent_to_cm_1 , value for ent_to_cm_2 ) |
Raises: | QSscoreError if there are too many combinations to check
to find a chain mapping (see max_mappings_extensive ). |
chain_mapping_scheme
¶Mapping scheme used to get chain_mapping.
Possible values:
chain_mapping was set by user before first use of this attribute.
Getter: | Computed with chain_mapping on first use (cached) |
---|---|
Type: | str |
Raises: | QSscoreError as in chain_mapping . |
chem_mapping
¶Inter-complex mapping of chemical groups.
Each group (see QSscoreEntity.chem_groups) is mapped according to highest sequence identity. Alignment is between longest sequences in groups.
Limitations:
Getter: | Computed on first use (cached) |
---|---|
Type: | dict with key = tuple of chain names in
qs_ent_1 and value = tuple of chain names in
qs_ent_2 . |
Raises: | QSscoreError if we end up having no chains for either
entity in the mapping (can happen if chains do not have CA atoms). |
clustalw_bin
¶Full path to clustalw or clustalw2 executable to use for multiple sequence alignments (unless chain_mapping is provided manually).
Getter: | Located in path on first use (cached) |
---|---|
Type: | str |
ent_to_cm_1
¶Subset of qs_ent_1 used to compute chain mapping and symmetries.
Properties:
chem_mapping
max_ca_per_chain_for_cm atoms per chain
ent_to_cm_2 according to chem_mapping)
chem_mapping appear in this entity (so the two can be safely used together)
This entity might be transformed (i.e. all positions rotated/translated by same transformation matrix) if this can speed up computations. So do not assume fixed global positions (but relative distances will remain fixed).
Getter: | Computed on first use (cached) |
---|---|
Type: | EntityHandle |
Raises: | QSscoreError if any chain ends up having less than 5 res. |
ent_to_cm_2
¶Subset of qs_ent_2 used to compute chain mapping and symmetries (see ent_to_cm_1 for details).
global_score
¶QS-score with penalties.
The range of the score is between 0 (i.e. no interface residues are shared between biounits) and 1 (i.e. the interfaces are identical).
The global QS-score is computed applying penalties when interface residues or entire chains are missing (i.e. anything that is not mapped in mapped_residues / chain_mapping) in one of the biounits.
Getter: | Computed on first use (cached) |
---|---|
Type: | float |
Raises: | QSscoreError if only one chain is mapped |
mapped_residues
¶Mapping of shared residues in alignments.
Getter: | Computed on first use (cached) |
---|---|
Type: | dict mapped_residues[c1][r1] = r2 with:
c1 = Chain name in first entity (= first sequence in aln),
r1 = Residue number in first entity,
r2 = Residue number in second entity |
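The layout `mapped_residues[c1][r1] = r2` can be illustrated by walking through the columns of a pairwise alignment and recording every column in which both sequences carry a residue. The helper below is a plain-Python sketch with hypothetical names; it assumes gapped strings as input and consecutive residue numbering from a given start value.

```python
def map_residues(chain_name, seq_1, seq_2, start_1=1, start_2=1):
    """Return {chain_name: {r1: r2}} for columns where both have a residue.

    seq_1 and seq_2 are equal-length gapped strings ('-' marks a gap).
    start_1 and start_2 are hypothetical first residue numbers; residues
    are assumed to be numbered consecutively.
    """
    if len(seq_1) != len(seq_2):
        raise ValueError("aligned sequences must have equal length")
    mapping = {}
    r1, r2 = start_1, start_2
    for a, b in zip(seq_1, seq_2):
        if a != "-" and b != "-":
            mapping[r1] = r2
        if a != "-":
            r1 += 1
        if b != "-":
            r2 += 1
    return {chain_name: mapping}


mapped = map_residues("A", "MK-LV", "MKTL-", start_1=10, start_2=1)
```

Residue 13 of the first sequence and residue 3 of the second have no partner and are therefore absent from the mapping.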
superposition
¶Superposition result based on shared CA atoms in alignments.
The superposition can be used to map QSscoreEntity.ent of qs_ent_1 onto the one of qs_ent_2. Use ost.geom.Invert() if you need the opposite transformation.
Getter: | Computed on first use (cached) |
---|---|
Type: | ost.mol.alg.SuperpositionResult |
symm_1
¶Symmetry groups for qs_ent_1 used to speed up chain mapping.
This is a list of chain-lists where each chain-list can be used to reconstruct the others via cyclic C or dihedral D symmetry. The first chain-list is used as a representative symmetry group. For heteromers, the group-members must contain all different seqres in oligomer.
Example: symm. groups [(A,B,C), (D,E,F), (G,H,I)] means that there are symmetry transformations to get (D,E,F) and (G,H,I) from (A,B,C).
Properties:
ent_to_cm_1 appear (w/o duplicates)
symm_2 or trivial symmetry groups used. Compatibility requires same lengths of symmetry group tuples and it must be possible to get an overlap (80% of residues covered within 6 A of a (chem. mapped) chain) of all chains in representative symmetry groups by superposing one pair of chains.
Getter: | Computed on first use (cached) |
---|---|
Type: | list of tuple of str (chain names) |
QSscoreEntity
(ent)¶Entity with cached entries for QS scoring.
Any known / precomputed information can be filled into the appropriate attribute here as long as they are labelled as read/write. Otherwise the quantities are computed on first access and cached (lazy evaluation). The heaviest load is expected when computing contacts and contacts_ca.
Parameters: | ent (EntityHandle or EntityView ) – Entity to be used for QS scoring. A copy of it will be processed. |
---|
is_valid
¶True, if successfully initialized. False, if input structure is a monomer or has fewer than 2 protein chains with >= 20 residues.
Type: | bool |
---|
original_name
¶Name set for ent when object was created.
Type: | str |
---|
ent
¶Cleaned version of ent passed at construction. Hydrogens are removed, the entity is processed with a RuleBasedProcessor and chains listed in removed_chains have been removed. The name of this entity might change during scoring (see GetName()). Otherwise, this will be fixed.
Type: | EntityHandle |
---|
removed_chains
¶Chains removed from ent passed at construction. These are ligand and water chains as well as small (< 20 res.) peptides or chains with no amino acids (determined by chem. type, which is set by rule based processor).
Type: | list of str |
---|
calpha_only
¶Whether entity is CA-only (i.e. it has 0 CB atoms)
Type: | bool |
---|
GetAlignment
(c1, c2)¶Get sequence alignment of chain c1 with chain c2.
Computed on first use based on ca_chains (cached).
Parameters: |
|
---|---|
Return type: |
|
GetAngles
(c1, c2)¶Get Euler angles from superposition of chain c1 with chain c2.
Computed on first use based on ca_chains (cached).
Parameters: |
|
---|---|
Returns: | 3 Euler angles (may contain nan if something fails). |
Return type: |
|
GetAxis
(c1, c2)¶Get axis of symmetry from superposition of chain c1 with chain c2.
Computed on first use based on ca_chains (cached).
Parameters: |
|
---|---|
Returns: | Rotational axis (may contain nan if something fails). |
Return type: |
|
GetName
()¶Wrapper to GetName() of ent.
This is used to uniquely identify the entity while scoring. The name may therefore change while original_name remains fixed.
SetName
(new_name)¶Wrapper to SetName() of ent.
Use this to change unique identifier while scoring (see GetName()).
ca_chains
¶Map of chain names in ent to sequences with attached view to CA-only chains (into ca_entity). Useful for alignments and superpositions.
Getter: | Computed on first use (cached) |
---|---|
Type: | dict (key = str ,
value = SequenceHandle ) |
ca_entity
¶Reduced representation of ent with only CA atoms. This guarantees that each included residue has exactly one atom.
Getter: | Computed on first use (cached) |
---|---|
Type: | EntityHandle |
chem_groups
¶Intra-complex group of chemically identical (seq. id. > 95%) polypeptide chains as extracted from ca_chains. First chain in group is the one with the longest sequence.
Getter: | Computed on first use (cached) |
---|---|
Type: | list of list of str (chain names) |
contacts
¶Connectivity dictionary (read/write).
As given by GetContacts() with calpha_only = False on ent.
Getter: | Computed on first use (cached) |
---|---|
Setter: | Uses FilterContacts() to ensure that we only keep contacts
for chains in the cleaned entity. |
Type: | See return type of GetContacts() |
contacts_ca
¶CA-only connectivity dictionary (read/write).
Like contacts but with calpha_only = True in GetContacts().
FilterContacts
(contacts, chain_names)¶Filter contacts to contain only contacts for chains in chain_names.
Parameters: |
|
---|---|
Returns: | New connectivity dictionary (format as in |
Return type: |
|
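Filtering a connectivity dictionary amounts to dropping every entry whose chain names are not in the requested list. The sketch below works on a plain nested dictionary of the form `contacts[ch1][ch2][res1][res2] = dist`; the function name is hypothetical, and requiring both chains of a pair to be listed is an assumption of this example.

```python
def filter_contacts(contacts, chain_names):
    """Return a new nested dictionary restricted to the given chains."""
    allowed = set(chain_names)
    filtered = {}
    for ch1, partners in contacts.items():
        if ch1 not in allowed:
            continue
        for ch2, residue_pairs in partners.items():
            if ch2 not in allowed:
                continue
            # Copy the inner dictionaries so the input is left untouched.
            filtered.setdefault(ch1, {})[ch2] = dict(
                (r1, dict(r2_map)) for r1, r2_map in residue_pairs.items())
    return filtered


contacts = {"A": {"B": {10: {42: 7.5}}, "C": {11: {5: 9.0}}}}
only_ab = filter_contacts(contacts, ["A", "B"])
```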
GetContacts
(entity, calpha_only, dist_thr=12.0)¶Get inter-chain contacts of a macromolecular entity.
Contacts are pairs of residues within a given distance belonging to different chains. They are stored once per pair and include the CA/CB-CA/CB distance.
Parameters: |
|
---|---|
Returns: | A connectivity dictionary. A pair of residues with chain names ch_name1 & ch_name2 (ch_name1 < ch_name2), residue numbers res_num1 & res_num2 and distance dist (<= dist_thr) are stored as result[ch_name1][ch_name2][res_num1][res_num2] = dist. |
Return type: |
|
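The shape of the returned dictionary can be reproduced from a handful of coordinates. This sketch does not use OpenStructure entities: it takes one representative position per residue (a hypothetical input layout), measures all inter-chain distances and stores those within the threshold once per pair, with the alphabetically smaller chain name as the outer key.

```python
import math


def get_contacts(coords, dist_thr=12.0):
    """Build result[ch1][ch2][res1][res2] = dist for inter-chain pairs.

    coords maps (chain name, residue number) to the (x, y, z) position of
    one representative atom per residue (hypothetical input layout).
    """
    contacts = {}
    keys = sorted(coords)
    for i, (ch_a, res_a) in enumerate(keys):
        for ch_b, res_b in keys[i + 1:]:
            if ch_a == ch_b:
                continue  # only pairs from different chains are contacts
            pos_a, pos_b = coords[(ch_a, res_a)], coords[(ch_b, res_b)]
            dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(pos_a, pos_b)))
            if dist <= dist_thr:
                # keys are sorted, so ch_a < ch_b holds here
                by_res = contacts.setdefault(ch_a, {}).setdefault(ch_b, {})
                by_res.setdefault(res_a, {})[res_b] = dist
    return contacts


coords = {("A", 1): (0.0, 0.0, 0.0), ("A", 2): (3.8, 0.0, 0.0),
          ("B", 1): (0.0, 6.0, 8.0), ("B", 2): (0.0, 30.0, 0.0)}
contacts = get_contacts(coords)
```

Residue 2 of chain B is more than 12 Angstroms from both residues of chain A, so it does not appear in the result.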
OligoLDDTScorer
(ref, mdl, alignments, calpha_only, settings, penalize_extra_chains=False, chem_mapping=None)¶Helper class to calculate oligomeric lDDT scores.
This class can be used independently, but commonly it will be created by calling QSscorer.GetOligoLDDTScorer().
Note
By construction, lDDT scores are not symmetric and hence it matters which structure is the reference (ref) and which one is the model (mdl). Extra residues in the model are generally not considered. Extra chains in both model and reference can be considered by setting the penalize_extra_chains flag to True.
Parameters: |
|
---|
ref
¶mdl
¶Full reference/model entity to be scored. The entity must contain all chains mapped in alignments and may also contain additional ones which are considered if penalize_extra_chains is True.
Type: | EntityHandle |
---|
alignments
¶One alignment for each mapped chain of ref/mdl as defined in QSscorer.alignments. The first sequence of each alignment belongs to ref and the second one to mdl. Sequences must have sequence naming and attached views as defined in QSscorer.alignments.
Type: | list of AlignmentHandle |
---|
calpha_only
¶If True, restricts lDDT score to CA only.
Type: | bool |
---|
settings
¶Settings to use for lDDT scoring.
Type: | lDDTSettings |
---|
penalize_extra_chains
¶If True, extra chains in both ref and mdl will penalize the lDDT scores.
Type: | bool |
---|
chem_mapping
¶Inter-complex mapping of chemical groups as defined in QSscorer.chem_mapping. Used to find “chem-mapped” chains in ref for unmapped chains in mdl when penalizing scores. Each unmapped model chain can add extra reference-contacts according to the average total contacts of each single “chem-mapped” reference chain. If there is no “chem-mapped” reference chain, a warning is shown and the model chain is ignored.
Only relevant if penalize_extra_chains is True.
Type: | dict with key = tuple of chain names in
ref and value = tuple of chain names in
mdl . |
---|
lddt_mdl
¶The model entity used for oligomeric lDDT scoring (oligo_lddt / oligo_lddt_scorer).
Like lddt_ref, this is a single chain X containing all chains of mdl. The residue numbers match the ones in lddt_ref where aligned and have unique numbers for additional residues.
Getter: | Computed on first use (cached) |
---|---|
Type: | EntityHandle |
lddt_ref
¶The reference entity used for oligomeric lDDT scoring (oligo_lddt / oligo_lddt_scorer).
Since the lDDT computation requires a single chain with mapped residue numbering, all chains of ref are appended into a single chain X with unique residue numbers according to the column-index in the alignment. The alignments are in the same order as they appear in alignments. Additional residues are appended at the end of the chain with unique residue numbers. Unmapped chains are only added if penalize_extra_chains is True. Only CA atoms are considered if calpha_only is True.
Getter: | Computed on first use (cached) |
---|---|
Type: | EntityHandle |
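Numbering residues by alignment column can be pictured as follows. Each aligned pair of chains contributes its columns in order, and a residue's number is the column index plus an offset that grows with every alignment already processed, so aligned residues of reference and model share a number. The function name and the exact offset scheme are assumptions of this plain-Python sketch; additional unaligned residues and unmapped chains are left out.

```python
def number_by_alignment_column(alignments):
    """Assign shared residue numbers to aligned reference/model residues.

    alignments: list of (ref_seq, mdl_seq) pairs of equal-length gapped
    strings, one pair per mapped chain. Returns two lists of
    (residue number, one-letter code) for the merged reference and model.
    """
    ref_chain, mdl_chain = [], []
    offset = 0
    for ref_seq, mdl_seq in alignments:
        for column, (r, m) in enumerate(zip(ref_seq, mdl_seq), 1):
            number = offset + column
            if r != "-":
                ref_chain.append((number, r))
            if m != "-":
                mdl_chain.append((number, m))
        offset += len(ref_seq)
    return ref_chain, mdl_chain


ref_x, mdl_x = number_by_alignment_column([("MK-L", "MKTL"), ("GA", "G-")])
```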
mapped_lddt_scorers
¶List of scorer objects for each chain mapped in alignments.
Getter: | Computed on first use (cached) |
---|---|
Type: | list of MappedLDDTScorer |
oligo_lddt
¶Oligomeric lDDT score.
The score is computed as conserved contacts divided by the total contacts in the reference using the oligo_lddt_scorer, which uses the full complex as reference/model structure. If penalize_extra_chains is True, the reference/model complexes contain all chains (otherwise only the mapped ones) and additional contacts are added to the reference’s total contacts for unmapped model chains according to the chem_mapping.
The main difference with weighted_lddt is that the lDDT scorer “sees” the full complex here (incl. inter-chain contacts), while the weighted single chain score looks at each chain separately.
Getter: | Computed on first use (cached) |
---|---|
Type: | float |
oligo_lddt_scorer
¶lDDT Scorer object for lddt_ref
and lddt_mdl
.
Getter: | Computed on first use (cached) |
---|---|
Type: | lDDTScorer |
sc_lddt
¶List of global scores extracted from sc_lddt_scorers
.
If scoring for a mapped chain fails, an error is displayed and a score of 0 is assigned.
Getter: | Computed on first use (cached) |
---|---|
Type: | list of float |
sc_lddt_scorers
¶List of lDDT scorer objects extracted from mapped_lddt_scorers
.
Type: | list of lDDTScorer |
---|
weighted_lddt
¶Weighted average of single chain lDDT scores.
The score is computed as a weighted average of single chain lDDT scores
(see sc_lddt_scorers
) using the total contacts of each single
reference chain as weights. If penalize_extra_chains
is True,
unmapped chains are added with a 0 score and total contacts taken from
the actual reference chains or (for unmapped model chains) using the
chem_mapping
.
See oligo_lddt
for a comparison of the two scores.
Getter: | Computed on first use (cached) |
---|---|
Type: | float |
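The weighting described above can be illustrated with a small pure-Python sketch. It does not call OpenStructure: the per-chain contact counts are made-up numbers and `weighted_average` is a hypothetical helper, so treat it as an illustration of the arithmetic rather than the scorer's actual implementation.

```python
def weighted_average(scores, weights):
    """Weighted average of per-chain scores (weights = total reference contacts)."""
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(s * w for s, w in zip(scores, weights)) / total

# Hypothetical per-chain data: (conserved contacts, total reference contacts).
chains = {"A": (80, 100), "B": (30, 50)}

sc_scores = [conserved / total for conserved, total in chains.values()]
sc_weights = [total for _, total in chains.values()]
weighted = weighted_average(sc_scores, sc_weights)

# Penalizing one unmapped chain: it enters with a score of 0 and its own
# total contacts (here an assumed value of 50) as weight.
penalized = weighted_average(sc_scores + [0.0], sc_weights + [50])
```

In this toy case the unmapped chain only enlarges the denominator, so the penalized score is lower than the unpenalized one.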
MappedLDDTScorer
(alignment, calpha_only, settings)¶A simple class to calculate a single-chain lDDT score on a given chain to
chain mapping as extracted from OligoLDDTScorer
.
Parameters: |
|
---|
alignment
¶Alignment with two sequences named according to the mapped chains and with
views attached to both sequences (e.g. one of the items of
QSscorer.alignments
).
The first sequence is assumed to be the reference and the second one the model. Since the lDDT score is not symmetric (extra residues in model are ignored), the order is important.
Type: | AlignmentHandle |
---|
calpha_only
¶If True, restricts lDDT score to CA only.
Type: | bool |
---|
settings
¶Settings to use for lDDT scoring.
Type: | lDDTSettings |
---|
lddt_scorer
¶lDDT Scorer object for the given chains.
Type: | lDDTScorer |
---|
reference_chain_name
¶Chain name of the reference.
Type: | str |
---|
model_chain_name
¶Chain name of the model.
Type: | str |
---|
GetPerResidueScores
()¶Returns: | Scores for each residue |
---|---|
Return type: | list of dict with one item for each residue
existing in model and reference:
|
The following function detects steric clashes in atomic structures. Two atoms are clashing if their Euclidean distance is smaller than a threshold value (minus a tolerance offset).
FilterClashes
(entity, clashing_distances, always_remove_bb=False)¶This function filters out residues with non-bonded clashing atoms. If the clashing atom is a backbone atom, the complete residue is removed from the structure; if the atom is part of the sidechain, only the sidechain atoms are removed. This behavior is changed by the always_remove_bb flag: when the flag is set to True, the whole residue is removed even if the clash is only detected in the sidechain.
The function returns a view containing all elements (residues, atoms) that
have not been removed from the input structure, plus a
ClashingInfo
object containing information about the
detected clashes.
Two atoms are defined as clashing if their distance is shorter than the reference distance minus a tolerance threshold. The information about the clashing distances and the tolerance thresholds for all possible pairs of atoms is passed to the function as a parameter.
Hydrogen and deuterium atoms are ignored by this function.
Parameters: |
|
---|---|
Returns: | A tuple of two elements: The filtered |
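The clash criterion stated above can be sketched in plain Python. This is not the OpenStructure implementation: `CLASH_TABLE`, its values and `is_clash` are hypothetical stand-ins for a `ClashingDistances` object, chosen only to show the comparison.

```python
import math

# Hypothetical table: element pair -> (reference distance, tolerance), in Angstroms.
# The values are illustrative, not the defaults shipped with OpenStructure.
CLASH_TABLE = {
    frozenset(("C", "O")): (3.0, 0.5),
    frozenset(("C", "C")): (3.4, 0.5),
}

def is_clash(ele1, pos1, ele2, pos2, table=CLASH_TABLE):
    """Return True if two non-bonded atoms are closer than reference - tolerance."""
    # Hydrogen and deuterium atoms are ignored.
    if ele1 in ("H", "D") or ele2 in ("H", "D"):
        return False
    ref_dist, tolerance = table[frozenset((ele1, ele2))]
    return math.dist(pos1, pos2) < ref_dist - tolerance
```

With the illustrative values above, a C/O pair at 2.0 Angstroms counts as a clash, while the same pair at 2.6 Angstroms does not.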
CheckStereoChemistry
(entity, bond_stats, angle_stats, bond_tolerance, angle_tolerance, always_remove_bb=False)¶This function filters out residues with severe stereo-chemical violations. If the violation involves a backbone atom, the complete residue is removed from the structure; if it involves an atom that is part of the sidechain, only the sidechain is removed. This behavior is changed by the always_remove_bb flag: when the flag is set to True, the whole residue is removed even if the violation is only detected in the sidechain.
The function returns a view containing all elements (residues, atoms) that
have not been removed from the input structure, plus a
StereoChemistryInfo
object containing information about
the detected stereo-chemical violations.
A violation is defined as a bond length that lies outside of the range: [mean_length-std_dev*bond_tolerance, mean_length+std_dev*bond_tolerance] or an angle width outside of the range [mean_width-std_dev*angle_tolerance, mean_width+std_dev*angle_tolerance ]. The information about the mean lengths and widths and the corresponding standard deviations is passed to the function using two parameters.
Hydrogen and deuterium atoms are ignored by this function.
Parameters: |
|
---|---|
Returns: | A tuple of two elements: The filtered |
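The allowed range used to define a violation can be written out as a short pure-Python sketch. The function names and the numbers in the usage note are hypothetical; only the formula is taken from the description above.

```python
def allowed_range(mean, std_dev, tolerance):
    """Range [mean - std_dev*tolerance, mean + std_dev*tolerance]."""
    return (mean - std_dev * tolerance, mean + std_dev * tolerance)

def is_violation(value, mean, std_dev, tolerance):
    """True if a bond length or angle width lies outside the allowed range."""
    low, high = allowed_range(mean, std_dev, tolerance)
    return value < low or value > high
```

For instance, with an assumed mean of 1.5, a standard deviation of 0.02 and a tolerance of 12, the allowed range is roughly 1.26 to 1.74, so an observed value of 1.80 would be flagged and 1.60 would not.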
ClashingInfo
¶This object is returned by the FilterClashes()
function, and contains
information about the clashes detected by the function.
GetClashCount
()¶Returns: | number of clashes between non-bonded atoms detected in the input structure |
---|
GetAverageOffset
()¶Returns: | a value in Angstroms representing the average offset by which clashing atoms lie closer than the minimum acceptable distance (which of course differs for each possible pair of elements) |
---|
GetClashList
()¶Returns: | list of detected inter-atomic clashes |
---|---|
Return type: | list of ClashEvent |
ClashEvent
¶This object contains all the information relative to a single clash detected
by the FilterClashes()
function
GetFirstAtom
()¶GetSecondAtom
()¶Returns: | atoms which clash |
---|---|
Return type: | UniqueAtomIdentifier |
GetModelDistance
()¶Returns: | distance (in Angstroms) between the two clashing atoms as observed in the model |
---|
GetAdjustedReferenceDistance
()¶Returns: | minimum acceptable distance (in Angstroms) between the two atoms
involved in the clash, as defined in ClashingDistances |
---|
StereoChemistryInfo
¶This object is returned by the CheckStereoChemistry()
function, and
contains information about bond lengths and planar angle widths in the
structure that diverge from the parameters tabulated by Engh and Huber in the
International Tables of Crystallography. Only elements that diverge from the
tabulated value by a minimum number of standard deviations (defined when the
CheckStereoChemistry function is called) are reported.
GetBadBondCount
()¶Returns: | number of bonds where a serious violation was detected |
---|
GetBondCount
()¶Returns: | total number of bonds in the structure checked by the CheckStereoChemistry function |
---|
GetAvgZscoreBonds
()¶Returns: | average z-score of all the bond lengths in the structure, computed using Engh and Huber’s mean and standard deviation values |
---|
GetBadAngleCount
()¶Returns: | number of planar angles where a serious violation was detected |
---|
GetAngleCount
()¶Returns: | total number of planar angles in the structure checked by the CheckStereoChemistry function |
---|
GetAvgZscoreAngles
()¶Returns: | average z-score of all the planar angle widths, computed using Engh and Huber’s mean and standard deviation values. |
---|
GetBondViolationList
()¶Returns: | list of bond length violations detected in the structure |
---|---|
Return type: | list of StereoChemicalBondViolation |
GetAngleViolationList
()¶Returns: | list of angle width violations detected in the structure |
---|---|
Return type: | list of StereoChemicalAngleViolation |
StereoChemicalBondViolation
¶This object contains all the information relative to a single detected violation of stereo-chemical parameters in a bond length
GetFirstAtom
()¶GetSecondAtom
()¶Returns: | first / second atom of the bond |
---|---|
Return type: | UniqueAtomIdentifier |
GetBondLength
()¶Returns: | length of the bond (in Angstroms) as observed in the model |
---|
GetAllowedRange
()¶Returns: | allowed range of bond lengths (in Angstroms), according to Engh and
Huber’s tabulated parameters and the tolerance threshold used when
the CheckStereoChemistry() function was called |
---|---|
Return type: | tuple (minimum and maximum) |
StereoChemicalAngleViolation
¶This object contains all the information relative to a single detected violation of stereo-chemical parameters in a planar angle width
GetFirstAtom
()¶GetSecondAtom
()¶GetThirdAtom
()¶Returns: | first / second (vertex) / third atom that defines the planar angle |
---|---|
Return type: | UniqueAtomIdentifier |
GetAngleWidth
()¶Returns: | width of the planar angle (in degrees) as observed in the model |
---|
GetAllowedRange
()¶Returns: | allowed range of angle widths (in degrees), according to Engh and
Huber’s tabulated parameters and the tolerance threshold used when
the CheckStereoChemistry() function was called |
---|---|
Return type: | tuple (minimum and maximum) |
ClashingDistances
¶Object containing information about clashing distances between non-bonded atoms
ClashingDistances
()¶Creates an empty distance list
SetClashingDistance
(ele1, ele2, clash_distance, tolerance)¶Adds or replaces an entry in the list
Parameters: |
|
---|
GetClashingDistance
(ele1, ele2)¶Returns: | reference distance and a tolerance threshold (both in Angstroms) for two elements |
---|---|
Return type: |
|
Parameters: |
|
GetAdjustedClashingDistance
(ele1, ele2)¶Returns: | reference distance (in Angstroms) for two elements, already adjusted by the tolerance threshold |
---|---|
Parameters: |
|
GetMaxAdjustedDistance
()¶Returns: | longest clashing distance (in Angstroms) in the list, after adjustment with tolerance threshold |
---|
IsEmpty
()¶Returns: | True if the list is empty (i.e. in an invalid, useless state) |
---|
PrintAllDistances
()¶Prints all distances in the list to standard output
StereoChemicalParams
¶Object containing stereo-chemical information about bonds and angles. For each item (bond or angle in a specific residue), stores the mean and standard deviation
StereoChemicalParams
()¶Creates an empty parameter list
SetParam
(item, residue, mean, standard_dev)¶Adds or replaces an entry in the list
Parameters: |
|
---|
IsEmpty
()¶Returns: | True if the list is empty (i.e. in an invalid, useless state) |
---|
PrintAllParameters
()¶Prints all entries in the list to standard output
FillClashingDistances
(file_content)¶FillBondStereoChemicalParams
(file_content)¶FillAngleStereoChemicalParams
(file_content)¶These three functions fill a list of reference clashing distances, a list of stereo-chemical parameters for bonds and a list of stereo-chemical parameters for angles, respectively, starting from the content of a parameter file.
Parameters: | file_content (list of str ) – list of lines from the parameter file |
---|---|
Return type: | ClashingDistances or
StereoChemicalParams |
FillClashingDistancesFromFile
(filename)¶FillBondStereoChemicalParamsFromFile
(filename)¶FillAngleStereoChemicalParamsFromFile
(filename)¶These three functions fill a list of reference clashing distances, a list of stereo-chemical parameters for bonds and a list of stereo-chemical parameters for angles, respectively, starting from a file path.
Parameters: | filename (str ) – path to parameter file |
---|---|
Return type: | ClashingDistances or
StereoChemicalParams |
DefaultClashingDistances
()¶DefaultBondStereoChemicalParams
()¶DefaultAngleStereoChemicalParams
()¶These three functions fill a list of reference clashing distances, a list of stereo-chemical parameters for bonds and a list of stereo-chemical parameters for angles, respectively, using the default parameter files distributed with OpenStructure.
Return type: | ClashingDistances or
StereoChemicalParams |
---|
ResidueNamesMatch
(probe, reference)¶The function requires a reference structure and a probe structure. It checks that all the residues in the reference structure that appear in the probe structure (i.e., that have the same ResNum) are of the same residue type. Chains are compared by order, not by chain name (i.e., the first chain of the reference is compared with the first chain of the probe structure, etc.).
Parameters: |
|
---|---|
Returns: | True if the residue names are the same, False otherwise |
Superpose
(ent_a, ent_b, match='number', atoms='all', iterative=False, max_iterations=5, distance_threshold=3.0)¶Superposes the model entity onto the reference. To do so, two views are
created, returned with the result. atoms describes what goes into these
views and match the selection method. For superposition,
SuperposeSVD()
or IterativeSuperposeSVD()
are called (depending on
iterative). For matching, the following methods are recognised:
number
- select residues by residue number, includes atoms, calls
MatchResidueByNum()
index
- select residues by index in chain, includes atoms, calls
MatchResidueByIdx()
local-aln
- select residues from a Smith/Waterman alignment, includes
atoms, calls MatchResidueByLocalAln()
global-aln
- select residues from a Needleman/Wunsch alignment, includes
atoms, calls MatchResidueByGlobalAln()
Parameters: |
|
---|---|
Returns: | An instance of |
ParseAtomNames
(atoms)¶Parses different representations of a list of atom names and returns a
set
, understandable by MatchResidueByNum()
. In
essence, this function translates
None
None
set(['N', 'CA', 'C', 'O'])
set(['aname1', 'aname2'])
['aname1', 'aname2']
to set(['aname1', 'aname2'])
Parameters: | atoms (str , list , set ) – Identifier or list of atoms |
---|---|
Returns: | A set of atoms. |
MatchResidueByNum
(ent_a, ent_b, atoms='all')¶Returns a tuple of views containing exactly the same number of atoms. Residues are matched by residue number. A subset of atoms to be included in the views can be specified in the atoms argument. Regardless of what the list of atoms says, only those present in two matched residues will be included in the views. Chains are processed in the order they occur in the entities. If ent_a and ent_b contain a different number of chains, processing stops with the lower count.
Parameters: |
|
---|---|
Returns: | Two |
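The matching rule described above (same residue number, and only atoms present in both residues) can be sketched without OpenStructure. Here a chain is modelled as a plain dict mapping residue number to a dict of atom name to position; this data layout and the function name are assumptions made for the illustration and are not how entities are stored in OpenStructure.

```python
def match_residues_by_number(chain_a, chain_b, atoms=None):
    """Return two equally long position lists for atoms shared by matched residues.

    chain_a, chain_b: dict residue_number -> dict atom_name -> (x, y, z)
    atoms: optional set of atom names to keep; None keeps all shared atoms
    """
    positions_a, positions_b = [], []
    for number in sorted(chain_a):
        if number not in chain_b:
            continue  # residue number missing in the other chain
        shared = set(chain_a[number]) & set(chain_b[number])
        if atoms is not None:
            shared &= set(atoms)
        for name in sorted(shared):
            positions_a.append(chain_a[number][name])
            positions_b.append(chain_b[number][name])
    return positions_a, positions_b

chain_a = {1: {"N": (0, 0, 0), "CA": (1, 0, 0)}, 2: {"CA": (2, 0, 0)}, 3: {"CA": (3, 0, 0)}}
chain_b = {1: {"CA": (1, 1, 0), "CB": (1, 2, 0)}, 3: {"CA": (3, 1, 0)}}
views = match_residues_by_number(chain_a, chain_b)
```

Residue 2 is dropped because it has no partner, and the N and CB atoms of residue 1 are dropped because each exists in only one of the two residues.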
MatchResidueByIdx
(ent_a, ent_b, atoms='all')¶Returns a tuple of views containing exactly the same number of atoms. Residues are matched by position in the chains of an entity. A subset of atoms to be included in the views can be specified in the atoms argument. Regardless of what the list of atoms says, only those present in two matched residues will be included in the views. Chains are processed in order of appearance. If ent_a and ent_b contain a different number of chains, processing stops with the lower count. The number of residues per chain is supposed to be the same.
Parameters: |
|
---|---|
Returns: | Two |
MatchResidueByLocalAln
(ent_a, ent_b, atoms='all')¶Match residues by local alignment. Takes ent_a and ent_b, extracts the
sequences chain-wise and aligns them in Smith/Waterman manner using the
BLOSUM62 matrix for scoring. Only residues which are marked as peptide
linking
are considered for alignment.
The residues of the entities are then matched based on this alignment. Only
atoms present in both residues are included in the views. Chains are processed
in order of appearance. If ent_a and ent_b contain a different number
of chains, processing stops with the lower count.
Parameters: |
|
---|---|
Returns: | Two |
MatchResidueByGlobalAln
(ent_a, ent_b, atoms='all')¶Match residues by global alignment.
Same as MatchResidueByLocalAln()
but performs a global Needleman/Wunsch
alignment of the sequences using the BLOSUM62 matrix for scoring.
Parameters: |
|
---|---|
Returns: | Two |
SuperpositionResult
¶rmsd
¶RMSD of the superposed entities.
view1
¶view2
¶Two EntityView
used in superposition (not set if methods
with Vec3List
used).
fraction_superposed
¶rmsd_superposed_atoms
¶ncycles
¶For iterative superposition (IterativeSuperposeSVD()
): fraction and
RMSD of atoms that were superposed with a distance below the given
threshold and the number of iteration cycles performed.
SuperposeSVD
(view1, view2, apply_transform=True)¶SuperposeSVD
(list1, list2)Superposition of two sets of atoms minimizing RMSD using a classic SVD based algorithm.
Note that the atom positions in the view are taken blindly in the order in which the atoms appear.
Parameters: |
|
---|---|
Returns: | An instance of |
IterativeSuperposeSVD
(view1, view2, max_iterations=5, distance_threshold=3.0, apply_transform=True)¶IterativeSuperposeSVD
(list1, list2, max_iterations=5, distance_threshold=3.0)Iterative superposition of two sets of atoms. In each iteration cycle, we keep a fraction of atoms with distances below distance_threshold and get the superposition considering only those atoms.
Note that the atom positions in the view are taken blindly in the order in which the atoms appear.
Parameters: |
|
---|---|
Returns: | An instance of |
Raises: | Exception if atom counts do not match or if there are fewer than 3 atoms. |
CalculateRMSD
(view1, view2, transformation=geom.Mat4())¶Returns: | RMSD of atom positions (taken blindly in the order in which the atoms appear) in the two given views. |
---|---|
Return type: |
|
Parameters: |
|
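Because the positions are paired purely by order, the order of atoms in the two views matters. The following pure-Python sketch shows the RMSD formula on plain coordinate lists; it ignores the optional transformation and is an illustration of the definition, not the OpenStructure code.

```python
import math

def rmsd(positions1, positions2):
    """RMSD between two position lists, pairing atoms in the order given."""
    if len(positions1) != len(positions2):
        raise ValueError("both lists must contain the same number of positions")
    if not positions1:
        raise ValueError("at least one position is required")
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(positions1, positions2))
    return math.sqrt(squared / len(positions1))

coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
same_order = rmsd(coords, coords)
reversed_order = rmsd(coords, list(reversed(coords)))
```

The same set of coordinates gives an RMSD of zero when the order matches and a non-zero value when one list is reversed.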
Accessibility
(ent, probe_radius=1.4, include_hydrogens=False, include_hetatm=False, include_water=False, oligo_mode=False, selection="", asa_abs="asaAbs", asa_rel="asaRel", asa_atom="asaAtom", algorithm=NACCESS)¶Calculates the accessible surface area for every atom in ent. The algorithm mimics the behaviour of the bindings available for the NACCESS and DSSP tools and has been tested to reproduce the numbers accordingly.
Parameters: |
|
---|---|
Returns: | The summed solvent accessibility of each atom in ent. |
AccessibilityAlgorithm
¶The accessibility algorithm enum specifies the algorithm used by the respective tools. Available are:
NACCESS, DSSP
AssignSecStruct
(ent)¶Assigns secondary structures to all residues based on hydrogen bond patterns as described by DSSP.
Parameters: | ent (EntityView /
EntityHandle ) – Entity on which to assign secondary structures |
---|
FindMemParam
¶Result object for the membrane detection algorithm described below
axis
¶Initial search axis from which the optimal membrane slab could be found
tilt_axis
¶Axis around which we tilt the membrane starting from the initial axis
tilt
¶Angle to tilt around tilt axis
angle
¶After the tilt operation we perform a rotation around the initial axis with this angle to get the final membrane axis
membrane_axis
¶The result of applying the tilt and rotation procedure described above. The membrane_axis is orthogonal to the membrane plane and has unit length.
pos
¶Real number that describes the membrane center point. To get the actual position you can do: pos * membrane_axis
width
¶Total width of the membrane in Angstroms
energy
¶Pseudo energy of the implicit solvation model
membrane_representation
¶Dummy atoms that represent the membrane. This entity is only valid if the according flag has been set to True when calling FindMembrane.
FindMembrane
(ent, assign_membrane_representation=True, fast=False)¶Estimates the optimal membrane position of a protein by using an implicit solvation model. The original algorithm and the used energy function are described in: Lomize AL, Pogozheva ID, Lomize MA, Mosberg HI (2006) Positioning of proteins in membranes: A computational approach.
There are some modifications in this implementation and the procedure is as follows:
FindMemParam
). The top 20 parametrizations
(only top parametrization if fast is True) are stored for further
processing.
Parameters: |
|
---|---|
Returns: | The results object |
Return type: |
This is a set of functions used for basic trajectory analysis such as extracting
positions, distances, angles and RMSDs. The organization is such that most
functions have their counterpart at the individual frame level
so that they can also be called on one frame instead of
the whole trajectory.
All these functions have a “stride” argument that defaults to stride=1, which is used to skip frames in the analysis.
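As a rough illustration of what skipping frames means, the sketch below lists the frame indices that would be visited for a given stride. It assumes the analysis starts at frame 0 and steps forward by `stride`; `analyzed_frames` is a hypothetical helper, not part of OpenStructure.

```python
def analyzed_frames(n_frames, stride=1):
    """Indices of the frames visited when every stride-th frame is analyzed."""
    if stride < 1:
        raise ValueError("stride must be a positive integer")
    return list(range(0, n_frames, stride))
```

Under that assumption, a trajectory of 10 frames analyzed with stride=3 would yield one value each for frames 0, 3, 6 and 9.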
SuperposeFrames
(frames, sel, from=0, to=-1, ref=-1)¶This function superposes the frames of the given coord group and returns them as a new coord group.
Parameters: |
|
---|---|
Returns: | A newly created coord group containing the superposed frames. |
SuperposeFrames
(frames, sel, ref_view, from=0, to=-1)Same as SuperposeFrames above, but the superposition is done on a reference view and not on another frame of the trajectory.
Parameters: |
|
---|---|
Returns: | A newly created coord group containing the superposed frames. |
AnalyzeAtomPos
(traj, atom1, stride=1)¶This function extracts the position of an atom from a trajectory. It returns a vector containing the position of the atom for each analyzed frame.
Parameters: |
|
---|
AnalyzeCenterOfMassPos
(traj, sele, stride=1)¶This function extracts the position of the center-of-mass of a selection
(EntityView
) from a trajectory and returns it as a vector.
Parameters: |
|
---|
AnalyzeDistanceBetwAtoms
(traj, atom1, atom2, stride=1)¶This function extracts the distance between two atoms from a trajectory and returns it as a vector.
Parameters: |
|
---|
AnalyzeAngle
(traj, atom1, atom2, atom3, stride=1)¶This function extracts the angle between three atoms from a trajectory and returns it as a vector. The second atom is taken as being the central atom, so that the angle is between the vectors (atom1.pos-atom2.pos) and (atom3.pos-atom2.pos).
Parameters: |
|
---|
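For a single frame, the angle defined above can be computed as follows. This is a pure-Python sketch on coordinate tuples with a hypothetical function name; it returns the angle in radians.

```python
import math

def angle_at_center(pos1, pos2, pos3):
    """Angle (radians) between (pos1 - pos2) and (pos3 - pos2); pos2 is the central atom."""
    v1 = [a - b for a, b in zip(pos1, pos2)]
    v2 = [a - b for a, b in zip(pos3, pos2)]
    dot = sum(a * b for a, b in zip(v1, v2))
    norm1 = math.sqrt(sum(a * a for a in v1))
    norm2 = math.sqrt(sum(a * a for a in v2))
    # Clamp to [-1, 1] to guard against rounding errors before acos.
    cosine = max(-1.0, min(1.0, dot / (norm1 * norm2)))
    return math.acos(cosine)
```

Calling this once per analyzed frame and collecting the results gives a list analogous to the vector described above.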
AnalyzeDihedralAngle
(traj, atom1, atom2, atom3, atom4, stride=1)¶This function extracts the dihedral angle between four atoms from a trajectory and returns it as a vector. The angle is between the planes containing the first three and the last three atoms.
Parameters: |
|
---|
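The dihedral between the two planes can be computed for one frame with the common atan2 formulation sketched below. The function name is hypothetical and the sign convention is the one of this sketch; it may differ from the convention used by OpenStructure, so compare signs before relying on it.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def dihedral(p1, p2, p3, p4):
    """Dihedral angle (radians) between planes (p1, p2, p3) and (p2, p3, p4)."""
    b1, b2, b3 = _sub(p2, p1), _sub(p3, p2), _sub(p4, p3)
    n1 = _cross(b1, b2)          # normal of the first plane
    n2 = _cross(b2, b3)          # normal of the second plane
    b2_len = math.sqrt(_dot(b2, b2))
    b2_unit = tuple(x / b2_len for x in b2)
    m1 = _cross(n1, b2_unit)     # in-plane vector orthogonal to n1 and b2
    return math.atan2(_dot(m1, n2), _dot(n1, n2))
```

A cis arrangement gives 0 and a trans arrangement gives an angle of magnitude pi, independent of the sign convention.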
AnalyzeDistanceBetwCenterOfMass
(traj, sele1, sele2, stride=1)¶This function extracts the distance between the center-of-mass of two
selections (EntityView
) from a trajectory and returns it as
a vector.
Parameters: |
|
---|
AnalyzeRMSD
(traj, reference_view, sele_view, stride=1)¶This function extracts the rmsd between two EntityView
and
returns it as a vector. The views don’t have to be from the same entity. The
reference positions are taken directly from the reference_view, evaluated only
once. The positions from the sele_view are evaluated for each frame.
If you want to compare to frame i of the trajectory t, first use
t.CopyFrame(i) for example:
eh = io.LoadPDB(...)
t = io.LoadCHARMMTraj(eh, ...)
sele = eh.Select(...)
t.CopyFrame(0)
mol.alg.AnalyzeRMSD(t, sele, sele)
Parameters: |
|
---|
AnalyzeMinDistance
(traj, view1, view2, stride=1)¶This function extracts the minimal distance between two sets of atoms (view1 and view2) for each frame in a trajectory and returns it as a vector.
Parameters: |
|
---|
AnalyzeMinDistanceBetwCenterOfMassAndView
(traj, view_cm, view_atoms, stride=1)¶This function extracts the minimal distance between a set of atoms (view_atoms) and the center of mass of a second set of atoms (view_cm) for each frame in a trajectory and returns it as a vector.
Parameters: |
|
---|
AnalyzeAromaticRingInteraction
(traj, view_ring1, view_ring2, stride=1)¶This function is a crude analysis of aromatic ring interactions. For each frame in a trajectory, it calculates the minimal distance between the atoms in one view and the center of mass of the other and vice versa, and returns the minimum of these two minimal distances. Concretely, if the two views are the heavy atoms of two rings, then it returns the minimal center of mass to heavy atom distance between the two rings.
Parameters: |
|
---|
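For a single frame, the quantity described above can be sketched in pure Python. The helper names are hypothetical, and the sketch uses the geometric center (all atoms weighted equally) in place of a true center of mass, which is an assumption of this example.

```python
import math

def center(points):
    """Geometric center of a list of positions (equal weights assumed)."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(3))

def min_distance_to_point(points, point):
    """Smallest distance between any position in points and a single point."""
    return min(math.dist(p, point) for p in points)

def ring_interaction_distance(ring1, ring2):
    """Minimum over both directions of the atom-to-center distance between two rings."""
    return min(min_distance_to_point(ring1, center(ring2)),
               min_distance_to_point(ring2, center(ring1)))

# Two toy four-atom "rings" stacked 4 Angstroms apart along z.
ring_a = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0)]
ring_b = [(x, y, z + 4) for x, y, z in ring_a]
stacking_distance = ring_interaction_distance(ring_a, ring_b)
```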
helix_kinks
– Algorithms to calculate Helix Kinks¶Functions to calculate helix kinks: bend, face shift and wobble angles
Author: Niklaus Johner
AnalyzeHelixKink
(t, sele, proline=False)¶This function calculates the bend, wobble and face-shift angles in an alpha-helix over a trajectory. The determination is more stable if there are at least 4 residues on each side (8 is even better) of the proline around which the helix is kinked. The selection should contain all residues in the correct order and with no gaps and no missing C-alphas.
Parameters: |
|
---|---|
Returns: | A tuple (bend_angle, face_shift, wobble_angle). |
Return type: | (FloatList, FloatList, FloatList) |
CalculateHelixKink
(sele, proline=False)¶This function calculates the bend, wobble and face-shift angles in an alpha-helix of an EntityView. The determination is more stable if there are at least 4 residues on each side (8 is even better) of the proline around which the helix is kinked. The selection should contain all residues in the correct order and with no gaps and no missing C-alphas.
Parameters: |
|
---|---|
Returns: | A tuple (bend_angle, face_shift, wobble_angle). |
Return type: | (float, float, float) |
trajectory_analysis
– DRMSD, pairwise distances and more¶This module requires numpy.
This module contains functions to analyze trajectories, mainly similarity measures based on RMSDs and pairwise distances.
Author: Niklaus Johner (niklaus.johner@unibas.ch)
AverageDistanceMatrixFromTraj
(t, sele, first=0, last=-1)¶This function calculates the distance between each pair of atoms in sele, averaged over the trajectory t.
Parameters: |
|
---|---|
Returns: | a numpy NpairsxNpairs matrix, where Npairs is the number of atom pairs in sele. |
DistRMSDFromTraj
(t, sele, ref_sele, radius=7.0, average=False, seq_sep=4, first=0, last=-1)¶This function calculates the distance RMSD from a trajectory. The distances selected for the calculation are all the distances between pairs of atoms from residues that are at least seq_sep apart in the sequence and that are smaller than radius in ref_sele. The number and order of atoms in ref_sele and sele should be the same.
Parameters: |
|
---|---|
Returns: | a numpy vector dist_rmsd(Nframes). |
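The pair selection and the distance RMSD for a single frame can be sketched in pure Python as below. Atoms are modelled as (residue number, position) tuples, which is an assumption of this sketch. The `average` option is not covered, and the function is an illustration of the description above rather than the module's code.

```python
import math

def dist_rmsd(atoms, ref_atoms, radius=7.0, seq_sep=4):
    """Distance RMSD of one frame against a reference.

    atoms, ref_atoms: lists of (residue_number, (x, y, z)) in the same order.
    Only pairs at least seq_sep residues apart and closer than radius in the
    reference contribute.
    """
    squared_sum, count = 0.0, 0
    for i in range(len(ref_atoms)):
        for j in range(i + 1, len(ref_atoms)):
            (res_i, pos_i), (res_j, pos_j) = ref_atoms[i], ref_atoms[j]
            if abs(res_i - res_j) < seq_sep:
                continue  # too close in sequence
            d_ref = math.dist(pos_i, pos_j)
            if d_ref >= radius:
                continue  # too far apart in the reference
            d = math.dist(atoms[i][1], atoms[j][1])
            squared_sum += (d - d_ref) ** 2
            count += 1
    return math.sqrt(squared_sum / count) if count else 0.0

ref = [(1, (0, 0, 0)), (5, (3, 0, 0)), (6, (20, 0, 0))]
frame = [(1, (0, 0, 0)), (5, (5, 0, 0)), (6, (20, 0, 0))]
value = dist_rmsd(frame, ref)
```

In this toy example only the pair of residues 1 and 5 passes both filters, and its distance changes from 3 to 5.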
DistanceMatrixFromPairwiseDistances
(distances, p=2)¶This function calculates a distance matrix M(NframesxNframes) from the pairwise distances matrix D(NpairsxNframes), where Nframes is the number of frames in the trajectory and Npairs the number of atom pairs. M[i,j] is the distance between frame i and frame j calculated as a p-norm of the differences in distances from the two frames (distance-RMSD for p=2).
Parameters: |
|
---|---|
Returns: | a numpy NframesxNframes matrix, where Nframes is the number of frames. |
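A pure-Python sketch of the frame-to-frame distance is given below. It assumes the p-norm is normalized by the number of pairs, so that p=2 yields a root-mean-square value as the description suggests; that normalization and the function name are assumptions of this sketch and were not checked against the module's source.

```python
def frame_distance_matrix(distances, p=2):
    """Nframes x Nframes matrix from an Npairs x Nframes list of distance rows."""
    n_pairs = len(distances)
    n_frames = len(distances[0])
    matrix = [[0.0] * n_frames for _ in range(n_frames)]
    for i in range(n_frames):
        for j in range(i + 1, n_frames):
            total = sum(abs(row[i] - row[j]) ** p for row in distances)
            # Normalized p-norm: root-mean-square difference for p=2.
            matrix[i][j] = matrix[j][i] = (total / n_pairs) ** (1.0 / p)
    return matrix

# Two atom pairs observed in two frames.
pairwise = [[1.0, 2.0],
            [3.0, 5.0]]
frame_matrix = frame_distance_matrix(pairwise)
```

The result is symmetric with a zero diagonal, since each frame is at distance zero from itself.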
PairwiseDistancesFromTraj
(t, sele, first=0, last=-1, seq_sep=1)¶This function calculates the distances between any pair of atoms in sele with sequence separation larger than seq_sep from a trajectory t. It returns a matrix containing one line for each atom pair and Nframes columns, where Nframes is the number of frames in the trajectory.
Parameters: |
|
---|---|
Returns: | a numpy NpairsxNframes matrix. |
RMSD_Matrix_From_Traj
(t, sele, first=0, last=-1, align=True, align_sele=None)¶This function calculates a matrix M such that M[i,j] is the RMSD (calculated on sele) between frames i and j of the trajectory t aligned on sele.
Parameters: |
|
---|---|
Returns: | a numpy NframesxNframes matrix, where Nframes is the number of frames. |
structure_analysis
– Functions to analyze structures¶Some functions for analyzing structures
Author: Niklaus Johner (Niklaus.Johner@unibas.ch)
CalculateBestFitLine
(sele1)¶This function calculates the best fit line to the atoms in sele1.
Parameters: | sele1 (EntityView ) – |
---|---|
Returns: | Line3 |
CalculateBestFitPlane
(sele1)¶This function calculates the best fit plane to the atoms in sele1.
Parameters: | sele1 (EntityView ) – |
---|---|
Returns: | Plane |
CalculateDistanceDifferenceMatrix
(sele1, sele2)¶This function calculates the pairwise distance differences between two selections (EntityView
).
The two selections should have the same number of atoms
It returns an NxN DistanceDifferenceMatrix M (where N is the number of atoms in sele1)
where M[i,j]=||(sele2.atoms[i].pos-sele2.atoms[j].pos)||-||(sele1.atoms[i].pos-sele1.atoms[j].pos)||
Parameters: |
|
---|---|
Returns: | NxN numpy matrix |
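The formula for M[i,j] can be written out directly on coordinate lists. The sketch below uses nested Python lists instead of a numpy matrix and a hypothetical function name.

```python
import math

def distance_difference_matrix(positions1, positions2):
    """M[i][j] = |pos2[i] - pos2[j]| - |pos1[i] - pos1[j]| for equally long lists."""
    if len(positions1) != len(positions2):
        raise ValueError("both selections must have the same number of atoms")
    n = len(positions1)
    return [[math.dist(positions2[i], positions2[j])
             - math.dist(positions1[i], positions1[j])
             for j in range(n)]
            for i in range(n)]

sele1_positions = [(0, 0, 0), (1, 0, 0)]
sele2_positions = [(0, 0, 0), (3, 0, 0)]
ddm = distance_difference_matrix(sele1_positions, sele2_positions)
```

A positive entry means the two atoms are further apart in the second selection than in the first.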
CalculateHelixAxis
(sele1)¶This function calculates the best fit cylinder to the CA atoms in sele1, and returns its axis. Residues should be ordered correctly in sele1.
Parameters: | sele1 (EntityView ) – |
---|---|
Returns: | Line3 |
GetAlphaHelixContent
(sele1)¶This function calculates the content of alpha helix in a view. All residues in the view have to be ordered and adjacent (no gaps allowed).
Parameters: | sele1 (EntityView ) – |
---|---|
Returns: | float |
GetDistanceBetwCenterOfMass
(sele1, sele2)¶This function calculates the distance between the centers of mass of sele1 and sele2, two selections from the same Entity.
Parameters: |
|
---|---|
Returns: |
|
GetFrameFromEntity
(eh)¶This function returns a CoordFrame from an EntityHandle
Parameters: | eh (EntityHandle ) – |
---|---|
Returns: | ost.mol.CoordFrame |
GetMinDistBetwCenterOfMassAndView
(sele1, sele2)¶This function calculates the minimal distance between sele2 and the center of mass of sele1, two selections from the same Entity.
Parameters: |
|
---|---|
Returns: | distance ( |
GetMinDistanceBetweenViews
(sele1, sele2)¶This function calculates the minimal distance between sele1 and sele2, two selections from the same Entity.
Parameters: |
|
---|---|
Returns: |
|
The following functions help to convert one residue into another by reusing as much as possible from the present atoms. They are mainly meant to map from standard amino acid to other standard amino acids or from modified amino acids to standard amino acids.
CopyResidue
(src_res, dst_res, editor)¶Copies the atoms of src_res
to dst_res
using the residue names
as guide to decide which of the atoms should be copied. If src_res
and
dst_res
have the same name, or src_res
is a modified version of
dst_res
(i.e. have the same single letter code), CopyConserved will be
called, otherwise CopyNonConserved will be called.
Parameters: |
|
---|---|
Returns: | True if the residue could be copied, False if not. |
CopyConserved
(src_res, dst_res, editor)¶Copies the atoms of src_res
to dst_res
assuming that the parent
amino acid of src_res
(or src_res
itself) are identical to dst_res
.
If src_res
and dst_res
are identical, all heavy atoms are copied
to dst_res
. If src_res
is a modified version of dst_res
and the
modification is a pure addition (e.g. the phosphate group of phosphoserine),
the modification is stripped off and all other heavy atoms are copied to
dst_res
. If the modification is not a pure addition, only the backbone
heavy atoms are copied to dst_res
.
Additionally, the selenium atom of MSE
is converted to sulphur.
Parameters: |
|
---|---|
Returns: | A tuple of bools stating whether the residue could be copied and
whether the Cbeta atom was inserted into the |
CopyNonConserved
(src_res, dst_res, editor)¶Copies the heavy backbone atoms and Cbeta (except for GLY
) of src_res
to dst_res
.
Parameters: |
|
---|---|
Returns: | A tuple of bools stating whether the residue could be copied and
whether the Cbeta atom was inserted into the |
Molecular Checker (Molck) can be called directly from Python code using the Molck function:
#! /bin/env python
"""Run Molck with Python API.
This is an exemplary procedure on how to run Molck using Python API which is
equivalent to the command line:
molck <PDB PATH> --rm=hyd,oxt,nonstd,unk \
--fix-ele --out=<OUTPUT PATH> \
--complib=<PATH TO compounds.chemlib>
"""
from ost.io import LoadPDB, SavePDB
from ost.mol.alg import MolckSettings, Molck
from ost.conop import CompoundLib
pdbid = "<PDB PATH>"
lib = CompoundLib.Load("<PATH TO compounds.chemlib>")
# Using Molck function
ent = LoadPDB(pdbid)
ms = MolckSettings(rm_unk_atoms=True,
                   rm_non_std=True,
                   rm_hyd_atoms=True,
                   rm_oxt_atoms=True,
                   rm_zero_occ_atoms=False,
                   colored=False,
                   map_nonstd_res=False,
                   assign_elem=True)
Molck(ent, lib, ms)
SavePDB(ent, "<OUTPUT PATH>")
The same procedure can also be split into separate calls for greater control:
#! /bin/env python
"""Run Molck with Python API.
This is an exemplary procedure on how to run Molck using Python API which is
equivalent to the command line:
molck <PDB PATH> --rm=hyd,oxt,nonstd,unk \
--fix-ele --out=<OUTPUT PATH> \
--complib=<PATH TO compounds.chemlib>
"""
from ost.io import LoadPDB, SavePDB
from ost.mol.alg import (RemoveAtoms, MapNonStandardResidues,
                         CleanUpElementColumn)
from ost.conop import CompoundLib
pdbid = "<PDB PATH>"
lib = CompoundLib.Load("<PATH TO compounds.chemlib>")
map_nonstd = False
# Using function chain
ent = LoadPDB(pdbid)
if map_nonstd:
    MapNonStandardResidues(lib=lib, ent=ent)
RemoveAtoms(lib=lib,
            ent=ent,
            rm_unk_atoms=True,
            rm_non_std=True,
            rm_hyd_atoms=True,
            rm_oxt_atoms=True,
            rm_zero_occ_atoms=False,
            colored=False)
CleanUpElementColumn(lib=lib, ent=ent)
SavePDB(ent, "<OUTPUT PATH>")
MolckSettings
(rm_unk_atoms=False, rm_non_std=False, rm_hyd_atoms=True, rm_oxt_atoms=False, rm_zero_occ_atoms=False, colored=False, map_nonstd_res=True, assign_elem=True)¶Stores settings used for Molecular Checker.
Parameters: |
|
---|
rm_unk_atoms
¶Remove unknown atoms and atoms not following the nomenclature.
Type: | bool |
---|
rm_non_std
¶Remove all residues that are not one of the 20 standard amino acids
Type: | bool |
---|
rm_hyd_atoms
¶Remove hydrogen atoms
Type: | bool |
---|
rm_oxt_atoms
¶Remove terminal oxygens
Type: | bool |
---|
rm_zero_occ_atoms
¶Remove atoms with zero occupancy
Type: | bool |
---|
colored
¶Whether output should be colored
Type: | bool |
---|
map_nonstd_res
¶Maps modified residues back to the parent amino acid, for example MSE -> MET, SEP -> SER
Type: | bool |
---|
assign_elem
¶Clean up element column
Type: | bool |
---|
ToString
()¶Returns: | String representation of the MolckSettings. |
---|---|
Return type: | str |
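The example scripts above state that a given molck command line is equivalent to a particular set of MolckSettings keywords. A minimal sketch of that correspondence follows. The helper molck_kwargs is hypothetical and not part of OpenStructure; the token-to-keyword mapping is inferred from the command line quoted in the example docstrings (--rm=hyd,oxt,nonstd,unk and --fix-ele) and covers only those tokens.

```python
# Hypothetical helper, not part of OpenStructure: translate the molck
# command-line options quoted in the example scripts into keyword
# arguments for MolckSettings.
RM_TOKEN_TO_KEYWORD = {
    "hyd": "rm_hyd_atoms",
    "oxt": "rm_oxt_atoms",
    "nonstd": "rm_non_std",
    "unk": "rm_unk_atoms",
}


def molck_kwargs(rm="", fix_ele=False):
    """Build keyword arguments for MolckSettings from CLI-style options."""
    # Start with every removal switch off so the result does not depend
    # on the constructor defaults (rm_hyd_atoms defaults to True).
    kwargs = {keyword: False for keyword in RM_TOKEN_TO_KEYWORD.values()}
    for token in filter(None, (t.strip() for t in rm.split(","))):
        if token not in RM_TOKEN_TO_KEYWORD:
            raise ValueError("unsupported --rm token: %r" % token)
        kwargs[RM_TOKEN_TO_KEYWORD[token]] = True
    kwargs["assign_elem"] = fix_ele
    return kwargs


kwargs = molck_kwargs(rm="hyd,oxt,nonstd,unk", fix_ele=True)
print(kwargs)
```

Where OpenStructure is available, the result could be passed on as MolckSettings(**kwargs). Keywords the helper does not set (rm_zero_occ_atoms, colored, map_nonstd_res) keep their constructor defaults; note that map_nonstd_res defaults to True, whereas the example script sets it to False explicitly.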
Warning
The functions here modify the passed structure ent in place. If this is not acceptable, work on a copy of the structure.
Molck
(ent, lib, settings)¶Runs Molck on the provided entity.
Parameters: |
|
---|
MapNonStandardResidues
(ent, lib)¶Maps modified residues back to the parent amino acid, for example MSE -> MET.
Parameters: |
|
---|
RemoveAtoms
(ent, lib, rm_unk_atoms=False, rm_non_std=False, rm_hyd_atoms=True, rm_oxt_atoms=False, rm_zero_occ_atoms=False, colored=False)¶Removes atoms and residues according to some criteria.
Parameters: |
|
---|
CleanUpElementColumn
(ent, lib)¶Clean up element column.
Parameters: |
|
---|