Sidechain Reconstruction
Two methods are provided to fully reconstruct sidechains of residues:
- the ReconstructSidechains() function handles a full OST EntityHandle
- the SidechainReconstructor is linked to an all atom environment and used to reconstruct sidechains of single loops
Example usage:
from ost import io, mol
from promod3 import modelling
# load a protein
prot = io.LoadPDB('data/1CRN.pdb')
# get only amino acids
prot = mol.CreateEntityFromView(prot.Select("peptide=true"), True)
io.SavePDB(prot, 'sidechain_test_orig.pdb')
# reconstruct sidechains
modelling.ReconstructSidechains(prot, keep_sidechains=False)
io.SavePDB(prot, 'sidechain_test_rec.pdb')

from ost import io
from promod3 import loop, modelling
# load example (has res. numbering starting at 1)
prot = io.LoadPDB('data/1CRN.pdb')
res_list = prot.residues
seqres_str = ''.join([r.one_letter_code for r in res_list])
# initialize AllAtom environment and sidechain reconstructor
env = loop.AllAtomEnv(seqres_str)
env.SetInitialEnvironment(prot)
sc_rec = modelling.SidechainReconstructor(keep_sidechains=False)
sc_rec.AttachEnvironment(env)
# reconstruct subset (res. num. 6..10)
res = sc_rec.Reconstruct(6, 5)
# reconstruct two loops (6..10 and 20..25)
res = sc_rec.Reconstruct(start_resnum_list=[6, 20],
                         num_residues_list=[5, 6],
                         chain_idx_list=[0, 0])
# update environment with solution
env.SetEnvironment(res.env_pos)
# store all positions of environment
io.SavePDB(env.GetAllAtomPositions().ToEntity(), 'sc_rec_test.pdb')
Reconstruct Function
- promod3.modelling.ReconstructSidechains(ent, keep_sidechains=False, build_disulfids=True, rotamer_model='frm', consider_ligands=True, rotamer_library=None, optimize_subrotamers=True, graph_max_complexity=100000000, graph_initial_epsilon=0.02, energy_function='SCWRL4')
Reconstruct sidechains for the given structure.
- Parameters:
ent (ost.mol.EntityHandle) – Structure for sidechain reconstruction. Note that the sidechain reconstruction is applied directly to the structure itself.
keep_sidechains (bool) – Flag, whether complete sidechains in ent (i.e. containing all required atoms) should be kept rigid and directly be added to the frame.
build_disulfids (bool) – Flag, whether possible disulfide bonds should be searched. If a disulfide bond is found, the two participating cysteines are fixed and added to the frame.
rotamer_model (str) – Rotamer model to be used, can either be “frm” or “rrm”.
consider_ligands (bool) – Flag, whether to add ligands (anything in chain ‘_’) as static objects.
rotamer_library (BBDepRotamerLib / RotamerLib) – A rotamer library to extract the rotamers from. The default is to call LoadBBDepLib().
optimize_subrotamers (bool) – Only considered when rotamer_model is “frm”. If set to True, the FRM solution undergoes some postprocessing by calling SubrotamerOptimizer() with default parametrization.
graph_max_complexity (int) – Max. complexity for RotamerGraph.TreeSolve().
graph_initial_epsilon (float) – Initial epsilon for RotamerGraph.TreeSolve().
energy_function (str) – Energy function to use; one of “SCWRL4”, “SCWRL3” or “VINA”.
SidechainReconstructor Class
- class promod3.modelling.SidechainReconstructor(keep_sidechains=True, build_disulfids=True, optimize_subrotamers=False, remodel_cutoff=20, rigid_frame_cutoff=0, graph_max_complexity=100000000, graph_intial_epsilon=0.02, disulfid_score_thresh=45)
Reconstruct sidechains for single loops or residues. Must be linked to an all atom environment (AttachEnvironment()) containing the structural data. Residues are identified as N- or C-terminal according to the seqres in the environment. This means that residues preceded / followed by gaps are not treated as terminal!
In the reconstruction procedure you can specify residues that should be remodelled. Everything within remodel_cutoff will also be considered and potentially remodelled. To enforce the visibility of the rigid frame to all of those close residues you can specify the rigid_frame_cutoff. For example, with remodel_cutoff=20 and rigid_frame_cutoff=10, all residues within 20 Å of any of the input residues are considered for remodelling. Everything further away than 20 Å but within 20 Å + 10 Å = 30 Å is also considered as rigid frame (all backbone atoms and the sidechain if present). The distance criterion is the CB atom distance between residues (CA in case of glycine).
- Parameters:
keep_sidechains (bool) – Flag, whether complete sidechains in env. (i.e. containing all required atoms) should be kept rigid and directly be added to the result.
build_disulfids (bool) – Flag, whether possible disulfide bonds should be searched. If a disulfide bond is found, the two participating cysteines are fixed and added to the result.
optimize_subrotamers (bool) – Flag, whether the SubrotamerOptimizer() with default parametrization should be called if we’re dealing with FRM rotamers.
remodel_cutoff (float) – Cutoff to identify all residues that need to be remodelled.
rigid_frame_cutoff (float) – Cutoff to control the visibility of the rigid frame to the reconstruction procedure. Everything within [remodel_cutoff, remodel_cutoff + rigid_frame_cutoff] will be considered as rigid frame. Note: if the keep_sidechains flag is true and all residues within remodel_cutoff already have a sidechain, the rigid_frame_cutoff won’t have any effect.
graph_max_complexity (int) – Max. complexity for promod3.sidechain.RotamerGraph.TreeSolve().
graph_intial_epsilon (float) – Initial epsilon for promod3.sidechain.RotamerGraph.TreeSolve().
disulfid_score_thresh (float) – If DisulfidScore() between two CYS is below this threshold, we consider them to be disulfide-bonded.
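The interplay of remodel_cutoff and rigid_frame_cutoff described for this class can be illustrated with a small, library-independent sketch. It is not ProMod3's implementation: the function name classify_by_cutoffs, the label strings and the inclusive boundary handling are assumptions made for illustration, and the coordinates are made up.

```python
import math


def classify_by_cutoffs(cb_positions, input_indices,
                        remodel_cutoff=20.0, rigid_frame_cutoff=10.0):
    """Label residues by CB distance to the closest input residue.

    Hypothetical helper: boundary handling (<=) is an assumption.
    """
    labels = []
    for idx, pos in enumerate(cb_positions):
        if idx in input_indices:
            labels.append("input")
            continue
        # distance to the nearest residue that was requested for remodelling
        dist = min(math.dist(pos, cb_positions[i]) for i in input_indices)
        if dist <= remodel_cutoff:
            labels.append("remodel")       # considered for remodelling
        elif dist <= remodel_cutoff + rigid_frame_cutoff:
            labels.append("rigid_frame")   # only visible as rigid frame
        else:
            labels.append("ignored")       # too far away to matter
    return labels


# made-up CB positions (CA for glycine) along the x axis, in Angstrom
cb = [(0.0, 0.0, 0.0), (15.0, 0.0, 0.0), (25.0, 0.0, 0.0), (40.0, 0.0, 0.0)]
labels = classify_by_cutoffs(cb, input_indices={0})
print(labels)
```

With remodel_cutoff=20 and rigid_frame_cutoff=10, the residue at 15 Å falls into the remodelling shell, the one at 25 Å into the rigid frame shell, and the one at 40 Å is outside both.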
- Reconstruct(start_resnum, num_residues, chain_idx=0)
- Reconstruct(start_resnum_list, num_residues_list, chain_idx_list)
Reconstruct sidechains for one or several loops extracted from the environment. Overlapping loops are merged and 0-length loops are removed. All residues in the loop(s) are expected to contain valid CB positions (or CA for GLY), which are used to look for other potentially relevant residues in the surroundings. The resulting structural data will contain all residues in the loop(s) and in the surroundings with all backbone and sidechain heavy atom positions set.
Note that the structural data of the loop(s) is expected to be in the linked environment before calling this!
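- Parameters:
start_resnum (int) – Residue number of the first residue of the loop.
num_residues (int) – Number of residues in the loop.
chain_idx (int) – Index of the chain the loop belongs to.
start_resnum_list (list of int) – First residue number of each loop.
num_residues_list (list of int) – Number of residues of each loop.
chain_idx_list (list of int) – Chain index of each loop.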
- Returns:
A helper object with all the reconstruction results.
- Return type:
SidechainReconstructionData
- Raises:
RuntimeError
if reconstructor was never attached to an environment or if parameters lead to invalid / unset positions in environment.
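The statement that overlapping loops are merged and 0-length loops are removed can be pictured with the following plain-Python sketch. It only mimics the documented behaviour on lists of integers; merge_loops is a hypothetical helper, and treating directly adjacent loops as non-overlapping is an assumption.

```python
def merge_loops(start_resnum_list, num_residues_list, chain_idx_list):
    """Drop 0-length loops and merge overlapping ones per chain.

    Hypothetical helper, returns (chain_idx, start_resnum, num_residues).
    """
    # represent every non-empty loop as (chain, start, end_exclusive)
    spans = sorted(
        (chain, start, start + length)
        for start, length, chain in zip(start_resnum_list,
                                        num_residues_list,
                                        chain_idx_list)
        if length > 0
    )
    merged = []
    for chain, start, end in spans:
        last = merged[-1] if merged else None
        if last is not None and last[0] == chain and start < last[2]:
            # overlaps the previous loop on the same chain: extend it
            last[2] = max(last[2], end)
        else:
            merged.append([chain, start, end])
    return [(chain, start, end - start) for chain, start, end in merged]


# loops 6..10 and 8..12 overlap, the loop starting at 30 is empty
loops = merge_loops([6, 8, 20, 30], [5, 5, 6, 0], [0, 0, 0, 0])
print(loops)
```

Here the first two loops collapse into one loop covering residues 6 to 12, the loop 20..25 is kept unchanged and the empty loop disappears.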
- AttachEnvironment(env, use_frm=True, use_bbdep_lib=True)
- AttachEnvironment(env, use_frm, rotamer_library)
Link reconstructor to given env. A helper class is used in the background to provide sidechain-objects for the environment. As this class is reused by every reconstructor linked to env, the used parameters must be consistent if multiple reconstructors are used (or you must use a distinct env).
- Parameters:
env (AllAtomEnv) – Link to this environment.
use_frm (bool) – If True, use flexible rotamer model, else rigid.
use_bbdep_lib (bool) – If True, use default backbone dependent rot. library (LoadBBDepLib()), else use backbone independent one (LoadLib()).
rotamer_library (BBDepRotamerLib / RotamerLib) – Custom rotamer library to be used.
- Raises:
RuntimeError if env was already linked to another reconstructor with inconsistent parameters. Acceptable changes:
- keep_sidechains = True, if previously False
- build_disulfids = False, if previously True
The SidechainReconstructionData class
- class promod3.modelling.SidechainReconstructionData
Contains the results of a sidechain reconstruction (SidechainReconstructor.Reconstruct()). All attributes are read only!
- env_pos
Container for structural data and mapping to the internal residue indices of the used AllAtomEnv. Useful for scoring and env. updates.
- Type: AllAtomEnvPositions
- loop_start_indices
- loop_lengths
The first sum(loop_lengths) residues in res_indices of env_pos are guaranteed to belong to the actual input, all the rest comes from the close environment.
Each input loop (apart from overlapping and 0-length loops) is defined by an entry in loop_start_indices and loop_lengths. For loop i_loop, res_indices[loop_start_indices[i_loop]] is the N-stem and res_indices[loop_start_indices[i_loop] + loop_lengths[i_loop] - 1] is the C-stem of the loop. The loop indices are contiguous in res_indices between the stems.
- rotamer_res_indices
Indices of residues within env_pos for which we generated a new sidechain (in [0, len(env_pos.res_indices)-1]).
- disulfid_bridges
Pairs of residue indices within env_pos for which we generated a disulfide bridge (indices in [0, len(env_pos.res_indices)-1]).
- is_n_ter
True/False depending on whether a given residue in env_pos is N-terminal in the environment (same length as env_pos.res_indices)
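The indexing rules given for loop_start_indices and loop_lengths can be checked on plain lists. The sketch below uses made-up values in place of a real reconstruction result; split_env_pos_indices is a hypothetical helper, not part of the library.

```python
def split_env_pos_indices(res_indices, loop_start_indices, loop_lengths):
    """Split res_indices into per-loop index lists and the surroundings.

    Hypothetical helper following the documented layout of env_pos.
    """
    loops = []
    for start, length in zip(loop_start_indices, loop_lengths):
        # loop indices are contiguous between N-stem and C-stem
        loops.append(res_indices[start:start + length])
    # everything after the first sum(loop_lengths) entries is environment
    surroundings = res_indices[sum(loop_lengths):]
    return loops, surroundings


# made-up data: two input loops followed by four surrounding residues
res_indices = [5, 6, 7, 8, 9, 19, 20, 21, 22, 23, 24, 3, 4, 10, 11]
loop_start_indices = [0, 5]
loop_lengths = [5, 6]

loops, surroundings = split_env_pos_indices(res_indices,
                                            loop_start_indices,
                                            loop_lengths)
n_stem, c_stem = loops[1][0], loops[1][-1]
print(loops, surroundings, n_stem, c_stem)
```

The N-stem and C-stem of each loop are simply the first and last entry of its slice, which matches res_indices[loop_start_indices[i_loop]] and res_indices[loop_start_indices[i_loop] + loop_lengths[i_loop] - 1].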