Sidechain Reconstruction

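Two methods are provided to fully reconstruct sidechains of residues:

  • ReconstructSidechains() handles a complete structure (ost.mol.EntityHandle) and reconstructs its sidechains in place.

  • SidechainReconstructor is linked to an all atom environment and reconstructs sidechains of single loops or residues.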

Example usage of ReconstructSidechains():

from ost import io, mol
from promod3 import modelling

# load a protein 
prot = io.LoadPDB('data/1CRN.pdb')
# get only amino acids
prot = mol.CreateEntityFromView(prot.Select("peptide=true"), True)
io.SavePDB(prot, 'sidechain_test_orig.pdb')
# reconstruct sidechains
modelling.ReconstructSidechains(prot, keep_sidechains=False)
io.SavePDB(prot, 'sidechain_test_rec.pdb')

Example usage of SidechainReconstructor:

from ost import io
from promod3 import loop, modelling

# load example (has res. numbering starting at 1)
prot = io.LoadPDB('data/1CRN.pdb')
res_list = prot.residues
seqres_str = ''.join([r.one_letter_code for r in res_list])

# initialize AllAtom environment and sidechain reconstructor
env = loop.AllAtomEnv(seqres_str)
env.SetInitialEnvironment(prot)
sc_rec = modelling.SidechainReconstructor(keep_sidechains=False)
sc_rec.AttachEnvironment(env)

# reconstruct subset (res. num. 6..10)
res = sc_rec.Reconstruct(6, 5)
# reconstruct two loops (6..10 and 20..25)
res = sc_rec.Reconstruct(start_resnum_list=[6, 20],
                         num_residues_list=[5, 6],
                         chain_idx_list=[0, 0])
# update environment with solution
env.SetEnvironment(res.env_pos)
# store all positions of environment
io.SavePDB(env.GetAllAtomPositions().ToEntity(), 'sc_rec_test.pdb')

Reconstruct Function

promod3.modelling.ReconstructSidechains(ent, keep_sidechains=False, build_disulfids=True, rotamer_model='frm', consider_ligands=True, rotamer_library=None, optimize_subrotamers=True, graph_max_complexity=100000000, graph_initial_epsilon=0.02, energy_function='SCWRL4')

Reconstruct sidechains for the given structure.

Parameters:
  • ent (ost.mol.EntityHandle) – Structure for sidechain reconstruction. Note that the reconstruction is applied in place, i.e. the structure itself is modified.

  • keep_sidechains (bool) – Flag, whether complete sidechains in ent (i.e. containing all required atoms) should be kept rigid and directly be added to the frame.

  • build_disulfids (bool) – Flag, whether possible disulfide bonds should be searched. If a disulfide bond is found, the two participating cysteines are fixed and added to the frame.

  • rotamer_model (str) – Rotamer model to be used, either “frm” (flexible rotamer model) or “rrm” (rigid rotamer model).

  • consider_ligands (bool) – Flag, whether to add ligands (anything in chain ‘_’) as static objects.

  • rotamer_library (BBDepRotamerLib / RotamerLib) – A rotamer library to extract the rotamers from. The default is to call LoadBBDepLib().

  • optimize_subrotamers (bool) – Only considered when rotamer_model is “frm”. If set to True, the FRM solution undergoes some postprocessing by calling SubrotamerOptimizer() with default parametrization.

  • graph_max_complexity (int) – Max. complexity for RotamerGraph.TreeSolve().

  • graph_initial_epsilon (float) – Initial epsilon for RotamerGraph.TreeSolve().

  • energy_function (str) – Energy function to use. Must be one of “SCWRL4”, “SCWRL3” or “VINA”.

SidechainReconstructor Class

class promod3.modelling.SidechainReconstructor(keep_sidechains=True, build_disulfids=True, optimize_subrotamers=False, remodel_cutoff=20, rigid_frame_cutoff=0, graph_max_complexity=100000000, graph_intial_epsilon=0.02, disulfid_score_thresh=45)

Reconstruct sidechains for single loops or residues. The reconstructor must be linked to an all atom environment (AttachEnvironment()) containing the structural data. Residues are identified as N- or C-terminal according to the seqres in the environment. This means that residues preceded or followed by gaps are not treated as terminal!

In the reconstruction procedure you specify the residues that should be remodelled. Everything within remodel_cutoff of those residues is also considered and potentially remodelled. To make the rigid frame visible to all of those close residues, you can additionally specify rigid_frame_cutoff. For example, with remodel_cutoff=20 and rigid_frame_cutoff=10, all residues within 20 Å of any of the input residues are considered for remodelling. Everything further away than 20 Å but within 20 Å + 10 Å = 30 Å is added to the rigid frame (all backbone atoms and the sidechain, if present). The distance criterion is the CB atom distance between residues (CA in case of glycine).

Parameters:
  • keep_sidechains (bool) – Flag, whether complete sidechains in env. (i.e. containing all required atoms) should be kept rigid and directly be added to the result.

  • build_disulfids (bool) – Flag, whether possible disulfide bonds should be searched. If a disulfide bond is found, the two participating cysteines are fixed and added to the result.

  • optimize_subrotamers (bool) – Flag, whether the SubrotamerOptimizer() with default parametrization should be called if we’re dealing with FRM rotamers.

  • remodel_cutoff (float) – Cutoff to identify all residues that need to be remodelled.

  • rigid_frame_cutoff (float) – Cutoff to control the visibility of the rigid frame to the reconstruction procedure. Everything within [remodel_cutoff, remodel_cutoff + rigid_frame_cutoff] will be considered as rigid frame. Note that if the keep_sidechains flag is True and all residues within remodel_cutoff already have a sidechain, rigid_frame_cutoff has no effect.

  • graph_max_complexity (int) – Max. complexity for promod3.sidechain.RotamerGraph.TreeSolve().

  • graph_intial_epsilon (float) – Initial epsilon for promod3.sidechain.RotamerGraph.TreeSolve().

  • disulfid_score_thresh (float) – If DisulfidScore() between two CYS is below this threshold, we consider them to be disulfide-bonded.
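
The interplay of remodel_cutoff and rigid_frame_cutoff can be pictured with a small, library-independent sketch. It is not ProMod3 code: classify_residues and its labels are hypothetical names, the inclusive treatment of the cutoff boundaries is an assumption, and the effect of keep_sidechains is ignored. Each residue is represented only by its reference atom position (CB, or CA for glycine).

```python
import math


def classify_residues(ref_positions, input_indices,
                      remodel_cutoff=20.0, rigid_frame_cutoff=0.0):
    """Label each residue by its distance to the closest input residue.

    ref_positions: one (x, y, z) tuple per residue (CB, or CA for glycine).
    input_indices: set of indices of the residues requested for remodelling.
    Returns a list with 'input', 'remodel', 'rigid' or 'ignored' per residue.
    """
    labels = []
    for i, pos in enumerate(ref_positions):
        if i in input_indices:
            labels.append('input')
            continue
        # distance to the closest explicitly requested residue
        d = min(math.dist(pos, ref_positions[j]) for j in input_indices)
        if d <= remodel_cutoff:
            labels.append('remodel')   # considered for remodelling
        elif d <= remodel_cutoff + rigid_frame_cutoff:
            labels.append('rigid')     # only visible as rigid frame
        else:
            labels.append('ignored')   # too far away to matter
    return labels


# toy reference positions along the x-axis (in Angstrom)
positions = [(0.0, 0.0, 0.0), (15.0, 0.0, 0.0),
             (25.0, 0.0, 0.0), (40.0, 0.0, 0.0)]
labels = classify_residues(positions, {0},
                           remodel_cutoff=20.0, rigid_frame_cutoff=10.0)
print(labels)
```

With the default rigid_frame_cutoff=0, the residue at 25 Å in this toy example would no longer be visible at all, which is the situation the additional cutoff is meant to address.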

Reconstruct(start_resnum, num_residues, chain_idx=0)
Reconstruct(start_resnum_list, num_residues_list, chain_idx_list)

Reconstruct sidechains for one or several loops extracted from the environment. Overlapping loops are merged and 0-length loops are removed. All residues in the loop(s) are expected to contain valid CB positions (or CA for GLY), which are used to look for other potentially relevant residues in the surroundings. The resulting structural data will contain all residues in the loop(s) and in the surroundings with all backbone and sidechain heavy atom positions set.

Note that the structural data of the loop(s) is expected to be in the linked environment before calling this!

Parameters:
  • start_resnum (int / ost.mol.ResNum) – Start of loop.

  • num_residues (int) – Length of loop.

  • chain_idx (int) – Chain the loop belongs to.

  • start_resnum_list (list of int) – Starts of loops.

  • num_residues_list (list of int) – Lengths of loops.

  • chain_idx_list (list of int) – Chains the loops belong to.

Returns:

A helper object with all the reconstruction results.

Return type:

SidechainReconstructionData

Raises:

RuntimeError if reconstructor was never attached to an environment or if parameters lead to invalid / unset positions in environment.
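
The handling of overlapping and 0-length loops described for Reconstruct() can be illustrated with a plain Python sketch. This is not the ProMod3 implementation: merge_loops is a hypothetical helper, and treating directly adjacent loops (e.g. 6..10 and 11..15) as separate is an assumption, since the text above only mentions overlapping loops.

```python
def merge_loops(start_resnum_list, num_residues_list, chain_idx_list):
    """Return sorted, non-overlapping (chain_idx, start_resnum, num_residues)."""
    # drop 0-length loops and sort by chain first, then by start
    loops = sorted(
        (chain, start, length)
        for start, length, chain in zip(start_resnum_list,
                                        num_residues_list,
                                        chain_idx_list)
        if length > 0
    )
    merged = []
    for chain, start, length in loops:
        if merged:
            prev_chain, prev_start, prev_length = merged[-1]
            prev_end = prev_start + prev_length  # one past the last residue
            if chain == prev_chain and start < prev_end:
                # overlap on the same chain: extend the previous loop
                new_end = max(prev_end, start + length)
                merged[-1] = (prev_chain, prev_start, new_end - prev_start)
                continue
        merged.append((chain, start, length))
    return merged


# loops 6..10 and 8..12 overlap, 20..25 is separate, the last one is empty
merged = merge_loops([6, 8, 20, 30], [5, 5, 6, 0], [0, 0, 0, 0])
print(merged)
```

Because of this kind of preprocessing, the number of loops reported in the result can be smaller than the number of loops passed in.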

AttachEnvironment(env, use_frm=True, use_bbdep_lib=True)
AttachEnvironment(env, use_frm, rotamer_library)

Link reconstructor to given env. A helper class is used in the background to provide sidechain-objects for the environment. As this class is reused by every reconstructor linked to env, the used parameters must be consistent if multiple reconstructors are used (or you must use a distinct env).

Parameters:
  • env (AllAtomEnv) – Link to this environment.

  • use_frm (bool) – If True, use flexible rotamer model, else rigid.

  • use_bbdep_lib (bool) – If True, use the default backbone dependent rotamer library (LoadBBDepLib()), else use the backbone independent one (LoadLib()).

  • rotamer_library (BBDepRotamerLib / RotamerLib) – Custom rotamer library to be used.

Raises:

RuntimeError if env was already linked to another reconstructor with inconsistent parameters. Acceptable changes:

  • keep_sidechains = True, if previously False

  • build_disulfids = False, if previously True

The SidechainReconstructionData class

class promod3.modelling.SidechainReconstructionData

Contains the results of a sidechain reconstruction (SidechainReconstructor.Reconstruct()). All attributes are read-only!

env_pos

Container for structural data and mapping to the internal residue indices of the used AllAtomEnv. Useful for scoring and env. updates.

Type:

AllAtomEnvPositions

loop_start_indices
loop_lengths

The first sum(loop_lengths) residues in res_indices of env_pos are guaranteed to belong to the actual input, all the rest comes from the close environment.

Each input loop (apart from overlapping and 0-length loops) is defined by an entry in loop_start_indices and loop_lengths. For loop i_loop, res_indices[loop_start_indices[i_loop]] is the N-stem and res_indices[loop_start_indices[i_loop] + loop_lengths[i_loop] - 1] is the C-stem of the loop. The loop indices are contiguous in res_indices between the stems.

Type:

list of int
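
The indexing scheme of loop_start_indices and loop_lengths can be made concrete with plain lists. The sketch below does not call ProMod3: split_loops is a hypothetical helper and the values of res_indices are invented for illustration, standing in for env_pos.res_indices of an actual result.

```python
def split_loops(res_indices, loop_start_indices, loop_lengths):
    """Split res_indices into per-loop index lists and surrounding residues."""
    # the first sum(loop_lengths) entries belong to the input loops
    n_input = sum(loop_lengths)
    loops = [res_indices[start:start + length]
             for start, length in zip(loop_start_indices, loop_lengths)]
    surrounding = res_indices[n_input:]
    return loops, surrounding


# invented example: two loops of length 5 and 6, then three close residues
res_indices = [5, 6, 7, 8, 9, 19, 20, 21, 22, 23, 24, 2, 3, 30]
loop_start_indices = [0, 5]
loop_lengths = [5, 6]

loops, surrounding = split_loops(res_indices, loop_start_indices, loop_lengths)
for loop in loops:
    # first entry is the N-stem, last entry is the C-stem of the loop
    print('N-stem:', loop[0], 'C-stem:', loop[-1], 'indices:', loop)
print('surrounding:', surrounding)
```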

rotamer_res_indices

Indices of residues within env_pos for which we generated a new sidechain (in [0, len(env_pos.res_indices)-1]).

Type:

list of int

disulfid_bridges

Pairs of residue indices within env_pos for which we generated a disulfide bridge (indices in [0, len(env_pos.res_indices)-1]).

Type:

list of tuple with two int

is_n_ter

True/False depending on whether a given residue in env_pos is N-terminal in the environment (same length as env_pos.res_indices).

Type:

list of bool

is_c_ter

True/False depending on whether a given residue in env_pos is C-terminal in the environment (same length as env_pos.res_indices).

Type:

list of bool
