Modelling Pipeline¶
A protein homology modelling pipeline has the following main steps:
Build a raw model from the template (see
BuildRawModel()
function)Perform loop modelling to close (or remove) all gaps (see functions
CloseSmallDeletions()
,RemoveTerminalGaps()
,MergeGapsByDistance()
,FillLoopsByDatabase()
,FillLoopsByMonteCarlo()
,CloseLargeDeletions()
orCloseGaps()
that calls all these functions using predefined heuristics)Build sidechains (see
BuildSidechains()
function)Minimize energy of final model using molecular mechanics (see
MinimizeModelEnergy()
function)
The last steps to go from a raw model to a final model can easily be executed
with the BuildFromRawModel()
function. If you want to run and tweak the
internal steps, you can start with the following code and adapt it to your
purposes:
from ost import io
from promod3 import modelling, loop
# setup
merge_distance = 4
fragment_db = loop.LoadFragDB()
structure_db = loop.LoadStructureDB()
torsion_sampler = loop.LoadTorsionSamplerCoil()
# get raw model
tpl = io.LoadPDB('data/1crn_cut.pdb')
aln = io.LoadAlignment('data/1crn.fasta')
aln.AttachView(1, tpl.CreateFullView())
mhandle = modelling.BuildRawModel(aln)
# we're not modelling termini
modelling.RemoveTerminalGaps(mhandle)
# perform loop modelling to close all gaps
modelling.CloseGaps(mhandle, merge_distance, fragment_db,
structure_db, torsion_sampler)
# build sidechains
modelling.BuildSidechains(mhandle, merge_distance, fragment_db,
structure_db, torsion_sampler)
# minimize energy of final model using molecular mechanics
modelling.MinimizeModelEnergy(mhandle)
# check final model and report issues
modelling.CheckFinalModel(mhandle)
# extract final model
final_model = mhandle.model
io.SavePDB(final_model, 'model.pdb')
Build Raw Modelling Handle¶
- class promod3.modelling.ModellingHandle¶
Handles the result for structure model building and provides high-level methods to turn an initial raw model (see
BuildRawModel()
) into a complete protein model by removing any existing gaps.- model¶
The resulting model. This includes one chain per target chain (in the same order as the sequences in
seqres
) and (if they were included) a chain named ‘_’ for ligands. You can therefore access model.chains items andseqres
items with the same indexing and the optional ligand chain follows afterwards.- Type:
- gaps¶
List of gaps in the model that could not be copied from the template. These gaps may be the result of insertions/deletions in the alignment or due to missing or incomplete backbone coordinates in the template structure. Gaps of different chains are appended one after another.
- Type:
- seqres¶
List of sequences with one
SequenceHandle
for each chain of the target protein.- Type:
- profiles¶
List of profiles with one
ost.seq.ProfileHandle
for each chain of the target protein (same order as inseqres
). Please note, that this attribute won’t be set by simply callingBuildFromRawModel()
. You have to fill it manually or even better by the convenient functionSetSequenceProfiles()
, to ensure consistency with the seqres.- Type:
- psipred_predictions¶
List of predictions with one
promod3.loop.PsipredPrediction
for each chain of the target protein (same order as inseqres
). Please note, that this attribute won’t be set by simply callingBuildFromRawModel()
. You have to fill it manually or even better by the convenient functionSetPsipredPredictions()
, to ensure consistency with the seqres.- Type:
- backbone_scorer_env¶
Backbone score environment attached to this handle. A default environment is set with
SetupDefaultBackboneScoring()
when needed. Additional information can be added to the environment before running the pipeline steps.- Type:
- backbone_scorer¶
Backbone scorer container attached to this handle. A default set of scorers is initialized with
SetupDefaultBackboneScoring()
when needed.- Type:
- all_atom_scorer_env¶
All atom environment attached to this handle for scoring. A default environment is set with
SetupDefaultAllAtomScoring()
when needed. This environment is for temporary work only and is only updated to score loops. It is not to be updated when loops are chosen and added to the final model.- Type:
- all_atom_scorer¶
All atom scorer container attached to this handle. A default set of scorers is initialized with
SetupDefaultAllAtomScoring()
when needed.- Type:
- all_atom_sidechain_env¶
All atom environment attached to this handle for sidechain reconstruction. A default environment is set with
SetupDefaultAllAtomScoring()
when needed.- Type:
- sidechain_reconstructor¶
A sidechain reconstructor to add sidechains to loops prior to all atom scoring. A default one is set with
SetupDefaultAllAtomScoring()
when needed.- Type:
- fragger_handles¶
Optional attribute which is set in
SetFraggerHandles()
. Usehasattr()
to check if it’s available. If it’s set, it is used inBuildFromRawModel()
.- Type:
list
ofFraggerHandle
- modelling_issues¶
Optional attribute which is set in
AddModellingIssue()
. Usehasattr()
to check if it’s available. If it’s set, it can be used to check issues which occurred inBuildFromRawModel()
(seeMinimizeModelEnergy()
andCheckFinalModel()
for details).- Type:
list
ofModellingIssue
- Copy()¶
Generates a deep copy. Everything will be copied over to the returned
ModellingHandle
, except the potentially set scoring membersbackbone_scorer
,backbone_scorer_env
,all_atom_scorer_env
,all_atom_scorer
,all_atom_sidechain_env
andsidechain_reconstructor
.- Returns:
A deep copy of the current handle
- Return type:
The Default Pipeline¶
Modelling Steps¶
- promod3.modelling.SetupDefaultBackboneScoring(mhandle)¶
Setup scorers and environment for meddling with backbones. This one is already tailored towards a certain modelling job. The scorers added (with their respective keys) are:
“cb_packing”:
CBPackingScorer
“cbeta”:
CBetaScorer
“reduced”:
ReducedScorer
“clash”:
ClashScorer
“hbond”:
HBondScorer
“torsion”:
TorsionScorer
“pairwise”:
PairwiseScorer
- Parameters:
mhandle (
ModellingHandle
) – The modelling handle. This will set the propertiesbackbone_scorer
andbackbone_scorer_env
of mhandle.
- promod3.modelling.IsBackboneScoringSetUp(mhandle)¶
- Returns:
True, if
backbone_scorer
andbackbone_scorer_env
of mhandle are set.- Return type:
- Parameters:
mhandle (
ModellingHandle
) – Modelling handle to check.
- promod3.modelling.SetupDefaultAllAtomScoring(mhandle)¶
Setup scorers and environment to perform all atom scoring. This one is already tailored towards a certain modelling job, where we reconstruct sidechains for loop candidates and score them. The scorers added (with their respective keys) are:
“aa_interaction”:
AllAtomInteractionScorer
“aa_packing”:
AllAtomPackingScorer
“aa_clash”:
AllAtomClashScorer
- Parameters:
mhandle (
ModellingHandle
) – The modelling handle. This will set the propertiesall_atom_scorer_env
,all_atom_scorer
,all_atom_sidechain_env
andsidechain_reconstructor
.
- promod3.modelling.IsAllAtomScoringSetUp(mhandle)¶
- Returns:
True, if
all_atom_scorer_env
,all_atom_scorer
,all_atom_sidechain_env
andsidechain_reconstructor
of mhandle are set.- Return type:
- Parameters:
mhandle (
ModellingHandle
) – Modelling handle to check.
- promod3.modelling.InsertLoop(mhandle, bb_list, start_resnum, chain_idx)¶
Insert loop into model and ensure consistent updating of scoring environments. Note that we do not update
all_atom_scorer_env
as that one is meant to be updated only while scoring. To clear a gap while inserting a loop, use the simplerInsertLoopClearGaps()
.- Parameters:
mhandle (
ModellingHandle
) – Modelling handle on which to apply change.bb_list (
BackboneList
) – Loop to insert (backbone only).start_resnum (
int
) – Res. number defining the start position in the SEQRES.chain_idx (
int
) – Index of chain the loop belongs to.
- promod3.modelling.RemoveTerminalGaps(mhandle)¶
Removes terminal gaps without modelling them (just removes them from the list of gaps). This is useful for pipelines which lack the possibility to properly model loops at the termini.
- Parameters:
mhandle (
ModellingHandle
) – Modelling handle on which to apply change.- Returns:
Number of gaps which were removed.
- Return type:
- promod3.modelling.ReorderGaps(mhandle)¶
Reorders all gaps to ensure sequential order by performing lexicographical comparison on the sequence formed by chain index of the gap and start residue number.
- promod3.modelling.MergeMHandle(source_mhandle, target_mhandle, source_chain_idx, target_chain_idx, start_resnum, end_resnum, transform)¶
Merges the specified stretch of source_mhandle into target_mhandle by replacing all structural information and gaps in the stretch start_resnum and end_resnum (inclusive). The residues specified by start_resnum and end_resnum must be valid in the source_mhandle, i.e. not be enclosed by a gap. If a gap encloses start_resnum or end_resnum in the target_mhandle, the gap gets replaced by a shortened version not including the part overlapping with the defined stretch. If there is any scoring set up (backbone or all atom), the according environments get updated in target_mhandle.
- Parameters:
source_mhandle (
ModellingHandle
) – Source of structural information and gapstarget_mhandle (
ModellingHandle
) – Structural information and gaps will be copied in heresource_chain_idx (
int
) – This is the chain where the info comes fromtarget_chain_idx (
int
) – This is the chain where the info goes tostart_resnum (
int
) – First residue of the copied stretchend_resnum (
int
) – Last residue of the copied stretchtransform (
ost.geom.Mat4
) – Transformation to be applied to all atom positions when they’re copied over
- Raises:
A
RuntimeError
when:the chain indices are invalid
the SEQRES of the specified chains do not match
the start and end residue numbers are invalid or when the residues at the specified positions in the source_mhandle do not exist
a gap in the source_mhandle encloses the residues specified by start_resnum and end_resnum
- promod3.modelling.SetSequenceProfiles(mhandle, profiles)¶
Sets the
sequence profiles
of mhandle while ensuring consistency with theseqres
.- Parameters:
mhandle (
ModellingHandle
) – Will have the profiles attached afterwardsprofiles (
list
ofost.seq.ProfileHandle
) – The sequence profiles to attach
- Raises:
ValueError
when the given profiles are not consistent with seqres in mhandle
- promod3.modelling.SetPsipredPredictions(mhandle, predictions)¶
Sets the
predictions
of mhandle while ensuring consistency with theseqres
.- Parameters:
mhandle (
ModellingHandle
) – Will have the predictions attached afterwardspredictions (
list
ofPsipredPrediction
) – The predictions to attach
- Raises:
ValueError
when the given predictions are not consistent with seqres in mhandle