`tmtools` - Structural superposition¶

The tmtools module provides access to the structural superposition programs TMscore, Tmalign and MMalign developed by Y. Zhang and J. Skolnick. These programs superpose a model onto a reference structure, using the positions of the Calpha atoms only. While at their core, these programs essentially use the same algorithm, they differ on how the Calphas are paired. TMscore pairs the Calpha atom based on the residue number, TMalign calculates an optimal pairing of Calpha atom based on heuristics.

Citation:

Yang Zhang and Jeffrey Skolnick, Proteins 2004 57: 702-710 Y. Zhang and J. Skolnick, Nucl. Acids Res. 2005 33, 2302-9

Besides using the standalone TM-align program, ost also provides a wrapper around TM-align as published in:

Sha Gong, Chengxin Zhang, Yang Zhang, Bioinformatics 2019

The advantage is that no intermediate files must be generated, a wrapper on the c++ layer is used instead. However, only the basic TM-align superposition functionality is available.

Distance measures used by TMscore¶

There are many different ways to describe the structural similarity of two protein structures at the Calpha level. TMscore calculate several of these measures. The most common is to describe the difference in terms of the root mean square deviation of the Calpha positions, the RMSD. Despite its common use, RMSD has several drawbacks when working with incomplete models. Since the RMSD highly depends on the set of included atoms, it is relatively easy to obtain a smaller RMSD by omitting flexible parts of a protein structure. This has lead to the introduction of the global distance test (GDT). A model is compared to a reference by calculating the fraction of Calpha atoms that can be superposed below a certain cutoff, e.g. 1Å. The fractions of several such cutoffs are combined into the GDT_TS (1, 2, 4 and 8Å) and GDT_HA (0.5, 1, 2, 4Å) and divided by four to obtain the final measure. In contrast to RSMD, GDT is an agreement measure. The higher the value, the more similar the two structures are. TM-score (not to be confused by TMscore, the program), additionally adds a size dependences to the GDT measure by taking the protein length into account. As with GDT, the bigger the value, the more similar the two structures are.

Common Usage¶

The following example shows how to use TMscore to superpose two protein structures and print the RMSD as well as the GDT_TS and GDT_HA similarity measures.

from ost.bindings import tmtools

pdb1=io.LoadPDB('1ake.pdb', restrict_chains='A')
pdb2=io.LoadPDB('4ake.pdb', restrict_chains='A')
result=tmtools.TMScore(pdb1, pdb2)
print(result.rmsd_below_five) # 1.9
print(result.gdt_ha) # 0.41
print(result.gdt_ts) # 0.56

Usage of TMalign¶

TMAlign(model1, model2, tmalign=None)¶

Performs a sequence independent superposition of model1 onto model2, the reference.

Parameters:

model1 (EntityView or EntityHandle) – The model structure. If the superposition is successful, will be superposed onto the reference structure
model2 (EntityView or EntityHandle) – The reference structure
tmalign – If not None, the path to the tmalign executable.

Returns:

The result of the tmscore superposition

Return type:

ost.bindings.TMAlignResult

Raises:

FileNotFound if tmalign could not be located.

Raises:

RuntimeError if the superposition failed

Usage of TMscore¶

TMScore(model1, model2, tmscore=None)¶

Performs a sequence dependent superposition of model1 onto model2, the reference.

Parameters:

model1 (EntityView or EntityHandle) – The model structure. If the superposition is successful, will be superposed onto the reference structure
model2 (EntityView or EntityHandle) – The reference structure
tmscore – If not None, the path to the tmscore executable.

Returns:

The result of the tmscore superposition

Return type:

TMScoreResult

Raises:

FileNotFound if tmalign could not be located.

Raises:

RuntimeError if the superposition failed

class TMScoreResult(rmsd_common, tm_score, max_sub, gdt_ts, gdt_ha, rmsd_below_five, transform)¶

Holds the result of running TMscore

rmsd_common¶: The RMSD of the common Calpha atoms of both structures

rmsd_below_five¶: The RMSD of all Calpha atoms that can be superposed below five Angstroem

tm_score¶: The TM-score of the structural superposition

transform¶

The transform that superposes the model onto the reference structure.

Type:: Mat4

gdt_ha¶: The GDT_HA of the model to the reference structure.

gdt_ts¶: The GDT_TS of the model to the reference structure.

TMalign C++ wrapper¶

Instead of calling the TMalign executable, ost also provides a wrapper around its C++ implementation. The advantage is that no intermediate files need to be generated in order to call the executable.

from ost import bindings

pdb1=io.LoadPDB('1ake.pdb').Select("peptide=true")
pdb2=io.LoadPDB('4ake.pdb').Select("peptide=true")
result = bindings.WrappedTMAlign(pdb1.chains[0], pdb2.chains[0],
                                 fast=True)
print(result.tm_score)
print(result.alignment.ToString(80))

class TMAlignResult(rmsd, tm_score, aligned_length, transform, alignment)¶

All parameters of the constructor are available as attributes of the class

Parameters:

rmsd (float) – RMSD of the superposed residues
tm_score (float) – TMScore of the superposed residues
aligned_length (int) – Number of superposed residues
transform (geom.Mat4) – Transformation matrix to superpose first chain onto reference
alignment (ost.seq.AlignmentHandle) – The sequence alignment given the structural superposition

WrappedTMAlign(chain1, chain2[, fast=False])¶

Takes two chain views and runs TMalign with chain2 as reference. The positions and sequences are directly extracted from the chain residues for every residue that fulfills:

peptide linking

valid one letter code(no ‘?’)

valid CA atom

Parameters:

chain1 (ost.mol.ChainView) – Chain from which position and sequence are extracted to run TMalign.
chain2 (ost.mol.ChainView) – Chain from which position and sequence are extracted to run TMalign, this is the reference.
fast (bool) – Whether to apply the fast flag to TMAlign

Return type:

ost.bindings.TMAlignResult

WrappedTMAlign(pos1, pos2, seq1, seq2 [fast=False])¶

Similar as described above, but directly feeding in raw data.

Parameters:

pos1 (ost.geom.Vec3List) – CA positions of the first chain
pos2 (ost.geom.Vec3List) – CA positions of the second chain, this is the reference.
seq1 (ost.seq.SequenceHandle) – Sequence of first chain
seq2 (ost.seq.SequenceHandle) – Sequence of second chain
fast (bool) – Whether to apply the fast flag to TMAlign

Return type:

ost.bindings.TMAlignResult

Raises:

ost.Error if pos1 and seq1, pos2 and seq2 respectively are not consistent in size.

`tmtools` - Structural superposition¶

Distance measures used by TMscore¶

Common Usage¶

Usage of TMalign¶

Usage of TMscore¶

TMalign C++ wrapper¶

Table of Contents

Previous topic

Next topic

This Page

tmtools - Structural superposition¶

Distance measures used by TMscore¶

Common Usage¶

Usage of TMalign¶

Usage of TMscore¶

TMalign C++ wrapper¶

`tmtools` - Structural superposition¶