Handling All Atom Positions¶
To represent and handle all atom loops, we provide a container
(AllAtomPositions
) to represent arbitrary amino acid sequences with the
positions of all their heavy atoms and an environment (AllAtomEnv
) to
handle changes during loop modelling.
The example below showcases some operations on the two classes:
from ost import io, geom
from promod3 import loop
# load example (has res. numbering starting at 1)
ent = io.LoadPDB("data/1CRN.pdb")
res_list = ent.residues
seqres_str = ''.join([r.one_letter_code for r in res_list])
# initialize an environment
env = loop.AllAtomEnv(seqres_str)
env.SetInitialEnvironment(ent)
# get all atom position for part of it
all_atoms = loop.AllAtomPositions(res_list[35:44])
# change pos. at res. index 1 (= res. number 37)
idx_ca_res_37 = all_atoms.GetIndex(1, "CA")
new_pos = all_atoms.GetPos(idx_ca_res_37) + geom.Vec3(0.2, 0.2, 0.2)
all_atoms.SetPos(idx_ca_res_37, new_pos)
# insert into ent and store updated entity
all_atoms.InsertInto(1, ent.chains[0], 37)
io.SavePDB(ent, "all_atom_pos.pdb")
# update environment with new positions
env.SetEnvironment(all_atoms, 36)
# store all positions of environment
io.SavePDB(env.GetAllAtomPositions().ToEntity(), "all_atom_env.pdb")
The AllAtomEnv class¶
- class promod3.loop.AllAtomEnv(seqres)¶
The all atom environment contains and handles positions of all atoms during loop modelling. It is linked to a (list of) seqres (one per chain) at construction. The idea is to initialize it at the beginning of the modelling process with all known positions and then update the environment whenever a new loop is being added.
- Parameters:
seqres (
str
/ost.seq.SequenceHandle
/list
ofstr
/ost.seq.SequenceList
) – Internal SEQRES to be set (single chain or list with one per chain). Whenever setting structural data, consistency with this SEQRES is enforced.
Indexing to access parts of the environment:
chain_idx: Index of chain as it occurs in seqres (0 for single sequence)
start_resnum: Residue number defining the position in the SEQRES of chain with index chain_idx. The numbering starts for every chain with the value 1.
internal residue indexing: all residues of all chains are simply concatenated one after each other (indexing starts at 0)
- SetInitialEnvironment(env_structure)¶
Sets full environment. Existing data is cleared first.
- Parameters:
env_structure (
ost.mol.EntityHandle
) – Structral data to be set as environment. The chains in env_structure are expected to be in the same order as the SEQRES items provided in constructor.- Raises:
RuntimeError
if env is inconsistent with SEQRES set in constructor. This can be because of corrupt residue numbers or sequence mismatches.
- SetEnvironment(new_env_pos)¶
- SetEnvironment(new_pos, start_resnum, chain_idx=0)
- SetEnvironment(bb_list, start_resnum, chain_idx=0)
Add/update atom positions in environment. In the end, all residues covered in new_env_pos / new_pos / bb_list will be set as defined there. This means, that positions in the env. may be reset, newly set or cleared.
- Parameters:
new_env_pos (
AllAtomEnvPositions
) – Structural data to be set as environment.new_pos (
AllAtomPositions
) – Structural data to be set as environment.bb_list (
BackboneList
) – Backbone data to be set as environment.start_resnum (
int
/ost.mol.ResNum
) – Res. number defining the start position in the SEQRES.chain_idx (
int
) – Index of chain the structural data belongs to.
- Raises:
RuntimeError
if new_pos / new_env_pos / bb_list is inconsistent with SEQRES set in constructor or when either start_resnum or chain_idx point to invalid positions in the SEQRES.
- ClearEnvironment(start_resnum, num_residues, chain_idx=0)¶
Clears a stretch of length num_residues in the environment in chain with idx chain_idx starting from residue number start_resnum
- GetEnvironment(start_resnum, num_residues, chain_idx=0)¶
- Returns:
Copy of stretch of structural data in environment. Useful to store a loop to reset later with
SetEnvironment()
.- Return type:
- Parameters:
- Raises:
RuntimeError
when either start_resnum or chain_idx point to invalid positions in the SEQRES.
- GetSeqres()¶
- Returns:
SEQRES that was set in constructor (one sequence per chain).
- Return type:
- GetAllAtomPositions()¶
- Returns:
Reference (use with caution!) to the internal storage of all atom positions for the environment. All residues of all chains are stored continuously in there. To get a copy of some positions use
GetEnvironment()
.- Return type:
The AllAtomPositions class¶
- class promod3.loop.AllAtomPositions¶
Container for the positions of all heavy atoms of an arbitrary amino acid sequence. This is tailored for fast operations within C++ codes. The Python export described here, is mainly meant for debugging or to initialize the object and feed it to other classes using it.
Indexing of positions and residues:
residue indexing is in the range [0,
GetNumResidues()
-1] and is in the order of the sequence used to initialize the objectindexing of single atoms is in the range [0,
GetNumAtoms()
-1]. For each residue you can find the bounds withGetFirstIndex()
andGetLastIndex()
or find a single atom withGetIndex()
Each atom position is initially unset (unless a list of residues is passed when constructing it) and can only be set with calls to
SetPos()
orSetResidue()
.- AllAtomPositions(sequence)¶
Creates a container for the given sequence with all positions unset.
- Parameters:
sequence (
str
) – Sequence of amino acid one letter codes.- Raises:
RuntimeError
if sequence contains a one letter code which is not one of the 20 default amino acids.
- AllAtomPositions(residues)¶
Creates a container for the given residues. Both sequence and positions are extracted from the given residues.
- Parameters:
residues (
ost.mol.ResidueHandleList
) – List of residues- Raises:
RuntimeError
if any residue has a one letter code which is not one of the 20 default amino acids.
- AllAtomPositions(sequence, residues)¶
Creates a container for the given sequence and extracts positions from residues. The residues may be different amino acids than the given sequence (see
SetResidue()
).- Parameters:
sequence (
str
) – Sequence of amino acid one letter codes.residues (
ost.mol.ResidueHandleList
) – List of residues from which positions are extracted.
- Raises:
RuntimeError
if sequence contains a one letter code which is not one of the 20 default amino acids or if sequence and residues are inconsistent in size.
- AllAtomPositions(bb_list)¶
Creates a container for the given backbone. Both sequence and backbone positions are extracted from the given residues.
- Parameters:
bb_list (
BackboneList
) – Backbone list of residues- Raises:
RuntimeError
if any residue has a one letter code which is not one of the 20 default amino acids.
- SetResidue(res_index, res)¶
Set positions for residue at index res_index given the atoms in res. For each expected heavy atom, we search for an atom of that name in res and if found set the corresponding position, otherwise we unset it.
- Parameters:
res_index (
int
) – Residue indexres (
ost.mol.ResidueHandle
) – Residue providing atoms
- Raises:
RuntimeError
if res_index out of bounds.
- SetResidue(res_index, other, other_res_index)¶
Set positions for residue at index res_index given the positions of residue at index other_res_index in other position container. Each position is set or cleared according to the data in other.
- Parameters:
res_index (
int
) – Residue indexother (
AllAtomPositions
) – Position container from which we take dataother_res_index (
int
) – Residue index in other
- Raises:
RuntimeError
if res_index or other_res_index out of bounds or if residues in the two containers are inconsistent (different amino acids).
- SetPos(index, pos)¶
- Parameters:
index (
int
) – Set position at that index.pos (
ost.geom.Vec3
) – Set position to pos.
- Raises:
RuntimeError
if index out of bounds.
- ClearPos(index)¶
- Parameters:
index (
int
) – Unset position at that index.- Raises:
RuntimeError
if index out of bounds.
- ClearResidue(res_index)¶
- Parameters:
res_index (
int
) – Unset all positions for residue at that index.- Raises:
RuntimeError
if res_index out of bounds.
- GetPos(index)¶
- Returns:
Position at given index.
- Return type:
- Parameters:
index (
int
) – Atom position index.- Raises:
RuntimeError
if index out of bounds.
- IsSet(index)¶
- GetIndex(res_index, atom_name)¶
- Returns:
Atom position index for atom named atom_name (standard PDB naming) for residue at index res_index.
- Return type:
- Parameters:
- Raises:
RuntimeError
if res_index out of bounds or if atom_name is not one of that residue’s heavy atoms.
- GetFirstIndex(res_index)¶
- GetLastIndex(res_index)¶
- GetAA(res_index)¶
- Returns:
Amino acid type of residue at index res_index.
- Return type:
- Parameters:
res_index (
int
) – Residue index- Raises:
RuntimeError
if res_index out of bounds.
- IsAnySet(res_index)¶
- IsAllSet(res_index)¶
- GetPhiTorsion(res_index, def_angle=-1.0472)¶
- GetPsiTorsion(res_index, def_angle=-0.7854)¶
- GetOmegaTorsion(res_index, def_angle=3.14159)¶
- Returns:
Omega torsion angle of residue at index res_index or def_angle if required atom positions (CA-C-N-CA) are not set. Here, we use CA-C of residue res_index - 1 and N-CA of residue res_index (consistent with OST’s
GetOmegaTorsion()
).- Return type:
- Parameters:
- Raises:
RuntimeError
if res_index - 1 or res_index out of bounds.
- Copy()¶
- Returns:
Full copy of this object.
- Return type:
- Extract(from, to)¶
- Returns:
Container with residues with indices in range [from, to-1].
- Return type:
- Parameters:
- Raises:
RuntimeError
if from >= to or if any residue index is out of bounds.
- Extract(res_indices)¶
- Returns:
Container with residues with indices in res_indices.
- Return type:
- Parameters:
- Raises:
RuntimeError
if any residue index is out of bounds.
- ExtractBackbone(from, to)¶
- Returns:
Backbone list of residues with indices in range [from, to-1]. CB atoms are reconstructed if unset.
- Return type:
- Parameters:
- Raises:
RuntimeError
if from >= to, if any residue index is out of bounds or if any residue has any unset backbone atom (N, CA, C, O).
- ToEntity()¶
- Returns:
All residues packed in a single chain as an OST entity. Connectivity resolved with
ost.conop.HeuristicProcessor
.- Return type:
- InsertInto(res_index, chain, res_num)¶
Insert a single residue (taken from given index) into the chain (with given res. number). Existing data is replaced and atoms are (re)connected according to the default connectivity of that amino acid. Peptide links to neighboring residues are set according to residue numbering. To make this function efficient, we require the backbone atoms (N, C, CA) to be set.
- Parameters:
res_index (
int
) – Residue indexchain (
ost.mol.ChainHandle
) – Chain into which we insertstart_resnum (
int
/ost.mol.ResNum
) – Residue number for the inserted residue
- Raises:
RuntimeError
if res_index out of bounds, if chain is invalid or if not all backbone atoms (N, C, CA) set.
The AllAtomEnvPositions class¶
- class promod3.loop.AllAtomEnvPositions¶
To link the arbitrary amino acid sequence defined in
AllAtomPositions
and the SEQRES ofAllAtomEnv
, we provide a helper class containing structural data as well as a mapping to the internal residue indices ofAllAtomEnv
.- all_pos¶
Container for the positions of all heavy atoms of some residues.
- Type:
- res_indices¶
Residue indices to be used by
AllAtomEnv
for each residue defined in all_pos.
Distinguishing amino acid atoms¶
- class promod3.loop.AminoAcidAtom¶
Enumerates all heavy atoms of all amino acids. The naming scheme is TLC_AN, where TLC is the standard three letter code of the amino acid and AN is the atom name (standard PDB naming) of the heavy atom. Examples: ALA_CB, ARG_CA, ASN_C, ASP_O, CYS_SG, GLU_OE1, GLN_NE2, GLY_N.
We include all heavy atoms that amino acids have when they are peptide bound to other residues (i.e. no OXT).
Additionally, there is the value XXX_NUM_ATOMS, which corresponds to the number of atoms in the enumerator. Each heavy atom hence corresponds to an integer in the range [0, XXX_NUM_ATOMS-1].
- class promod3.loop.AminoAcidHydrogen¶
Enumerates all hydrogens of all amino acids. The naming scheme is TLC_AN, where TLC is the standard three letter code of the amino acid and AN is the atom name (standard PDB naming) of the hydrogen. Examples: ALA_H, ARG_HD3, ASN_HB2, ASP_HA, CYS_HB3, LEU_H.
We include all hydrogens that amino acids can have including H1 (def. PDB name = “H”), H2 and (if not PRO) H3 for charged N-terminal residues. Note that the H atom attached to N when peptide bound (H) is distinct from the N terminal hydrogens (e.g. ALA_H != ALA_H1). For HIS we consider the fully protonated state, while ASP and GLU are included in their charged state.
Additionally, there is the value XXX_NUM_HYDROGENS, which corresponds to the number of hydrogens in the enumerator. Each hydrogen hence corresponds to an integer in the range [0, XXX_NUM_HYDROGENS-1].
- class promod3.loop.AminoAcidLookup¶
Collection of static methods to lookup properties of amino acid types (
ost.conop.AminoAcid
), heavy atom types (AminoAcidAtom
) and hydrogen types (AminoAcidHydrogen
).- static GetOLC(aa)¶
- static GetAAA(aa, atom_idx)¶
- static GetAAA(aa, atom_name)
- Returns:
Heavy atom type for the given amino acid and atom.
- Return type:
- Parameters:
- Raises:
RuntimeError
if atom_idx out of bounds or if atom_name is not one of the heavy atoms of aa.
- static GetAAH(aa, atom_idx)¶
- static GetAAH(aa, atom_name)
- Returns:
Hydrogen type for the given amino acid and atom.
- Return type:
- Parameters:
- Raises:
RuntimeError
if atom_idx out of bounds or if atom_name is not one of the hydrogens of aa.
- static GetIndex(aa, atom_name)¶
- static GetHydrogenIndex(aa, atom_name)¶
- static GetNumAtoms(aa)¶
- static GetNumHydrogens(aa)¶
- static GetAA(aaa)¶
- static GetAA(aah)
- Returns:
Amino acid type of the given heavy atom type
- Return type:
- Parameters:
aaa (
AminoAcidAtom
) – Heavy atom typeaah (
AminoAcidHydrogen
) – Hydrogen type
- static GetAtomName(aaa)¶
- static GetAtomName(aah)
- static GetAtomNameCharmm(aaa)¶
- static GetAtomNameCharmm(aah)
- static GetAtomNameAmber(aaa)¶
- static GetAtomNameAmber(aah)
- Returns:
Atom name of the given heavy atom type according to PDB (default), CHARMM or AMBER naming.
- Return type:
- Parameters:
aaa (
AminoAcidAtom
) – Heavy atom typeaah (
AminoAcidHydrogen
) – Hydrogen type
- static GetElement(aaa)¶
- Returns:
Chemical element of the given heavy atom type
- Return type:
- Parameters:
aaa (
AminoAcidAtom
) – Heavy atom type
- static GetAnchorAtomIndex(aah)¶
- Returns:
Atom index (in [0, GetNumAtoms(GetAA(aah))-1]) of the anchor to which the given hydrogen is attached.
- Return type:
- Parameters:
aah (
AminoAcidHydrogen
) – Hydrogen type
- static GetHNIndex(aa)¶