You are reading the documentation for version 3.3 of ProMod3.
You may also want to read the documentation for:
1.3
2.0
2.1
3.0
3.1
3.2
Modelling Pipeline
A protein homology modelling pipeline has the following main steps:
The last steps to go from a raw model to a final model can easily be executed
with the BuildFromRawModel() function. If you want to run and tweak the
internal steps, you can start with the following code and adapt it to your
purposes:
from ost import io
from promod3 import modelling, loop
# setup
merge_distance = 4
fragment_db = loop.LoadFragDB()
structure_db = loop.LoadStructureDB()
torsion_sampler = loop.LoadTorsionSamplerCoil()
# get raw model
tpl = io.LoadPDB('data/1crn_cut.pdb')
aln = io.LoadAlignment('data/1crn.fasta')
aln.AttachView(1, tpl.CreateFullView())
mhandle = modelling.BuildRawModel(aln)
# we're not modelling termini
modelling.RemoveTerminalGaps(mhandle)
# perform loop modelling to close all gaps
modelling.CloseGaps(mhandle, merge_distance, fragment_db,
structure_db, torsion_sampler)
# build sidechains
modelling.BuildSidechains(mhandle, merge_distance, fragment_db,
structure_db, torsion_sampler)
# minimize energy of final model using molecular mechanics
modelling.MinimizeModelEnergy(mhandle)
# check final model and report issues
modelling.CheckFinalModel(mhandle)
# extract final model
final_model = mhandle.model
io.SavePDB(final_model, 'model.pdb')
Build Raw Modelling Handle
-
class
promod3.modelling. ModellingHandle
Handles the result for structure model building and provides high-level methods
to turn an initial raw model (see BuildRawModel() )
into a complete protein model by removing any existing gaps.
-
model
The resulting model. This includes one chain per target chain (in the same
order as the sequences in seqres ) and (if they were included) a
chain named ‘_’ for ligands. You can therefore access model.chains items
and seqres items with the same indexing and the optional ligand
chain follows afterwards.
-
gaps
List of gaps in the model that could not be copied from the template. These
gaps may be the result of insertions/deletions in the alignment or due to
missing or incomplete backbone coordinates in the template structure.
Gaps of different chains are appended one after another.
-
seqres
List of sequences with one SequenceHandle for each chain
of the target protein.
-
profiles
List of profiles with one ost.seq.ProfileHandle for each chain of
the target protein (same order as in seqres ). Please note, that this
attribute won’t be set by simply calling BuildFromRawModel() . You have
to fill it manually or even better by the convenient function
SetSequenceProfiles() , to ensure consistency with the seqres.
-
psipred_predictions
List of predictions with one promod3.loop.PsipredPrediction for
each chain of the target protein (same order as in seqres ). Please
note, that this attribute won’t be set by simply calling
BuildFromRawModel() . You have to fill it manually or even better by
the convenient function SetPsipredPredictions() , to ensure
consistency with the seqres.
-
backbone_scorer_env
Backbone score environment attached to this handle. A default environment is
set with SetupDefaultBackboneScoring() when needed. Additional
information can be added to the environment before running the pipeline
steps.
-
backbone_scorer
Backbone scorer container attached to this handle. A default set of scorers
is initialized with SetupDefaultBackboneScoring() when needed.
-
all_atom_scorer_env
All atom environment attached to this handle for scoring. A default
environment is set with SetupDefaultAllAtomScoring() when needed. This
environment is for temporary work only and is only updated to score loops.
It is not to be updated when loops are chosen and added to the final model.
-
all_atom_scorer
All atom scorer container attached to this handle. A default set of scorers
is initialized with SetupDefaultAllAtomScoring() when needed.
-
all_atom_sidechain_env
All atom environment attached to this handle for sidechain reconstruction. A
default environment is set with SetupDefaultAllAtomScoring() when
needed.
-
sidechain_reconstructor
A sidechain reconstructor to add sidechains to loops prior to all atom
scoring. A default one is set with SetupDefaultAllAtomScoring() when
needed.
-
fragger_handles
Optional attribute which is set in SetFraggerHandles() . Use
hasattr() to check if it’s available. If it’s set, it is used in
BuildFromRawModel() .
-
modelling_issues
Optional attribute which is set in AddModellingIssue() . Use
hasattr() to check if it’s available. If it’s set, it can be used to
check issues which occurred in BuildFromRawModel() (see
MinimizeModelEnergy() and CheckFinalModel() for details).
-
Copy ()
Generates a deep copy. Everything will be copied over to the returned
ModellingHandle , except the potentially set scoring members
backbone_scorer ,
backbone_scorer_env ,
all_atom_scorer_env ,
all_atom_scorer ,
all_atom_sidechain_env and
sidechain_reconstructor .
-
promod3.modelling. BuildRawModel (aln, chain_names=None, include_ligands=False, spdbv_style=False, aln_preprocessing='default')
Builds a raw (pseudo) model from the alignment. Can either take a single
alignment handle or an alignment handle list. Every list item is treated as a
single chain in the final raw model.
Each alignment handle must contain exactly two sequences and the second
sequence is considered the template sequence, which must have a
EntityView attached.
Before extracting the coordinates, the alignments are pre-processed
according to aln_preprocessing.
This is a basic protein core modelling algorithm that copies backbone
coordinates based on the sequence alignment. For matching residues, the
side chain coordinates are also copied. Gaps are ignored. Hydrogen an
deuterium atoms are not copied into the model.
The function tries to reuse as much as possible from the template. Modified
residues are treated as follows:
- Selenium methionine residues are converted to methionine
- Side chains which contain all atoms of the parent amino acid, e.g.
phosphoserine are copied as a whole with the modifications stripped off.
Residues with missing backbone atoms and D-peptides are generally skipped and
treated as gaps. Missing Cbeta atoms in backbone are ok and reconstructed.
If all residues are skipped (e.g. Calpha traces), we report an error and
return an empty model.
Residue numbers are set such that missing residue in gaps are honoured and
subsequent loop modelling can insert new residues without having to renumber.
The numbering of residues starts for every chain with the value 1.
The returned ModellingHandle stores the obtained raw model as well
as information about insertions and deletions in the gaps list.
Parameters: |
- aln (
AlignmentHandle / AlignmentList ) – Single alignment handle for raw model with single chain or
list of alignment handles for raw model with multiple chains.
- include_ligands (
bool ) – True, if we wish to include ligands in the model. This
searches for ligands in all OST handles of the views
attached to the alignments. Ligands are identified
with the ligand property in the handle (set by OST
based on HET records) or by the chain name ‘_’ (as set
in SMTL). All ligands are added to a new chain named
‘_’.
- chain_names (
str / list ) – If set, this overrides the default chain naming
(chains are consecutively named according to
characters in
‘ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789abcdefghijklmnopqrstuvwxyz’).
If aln is of type ost.seq.AlignmentHandle ,
chain_names is expected to be a str .
If aln is of type ost.seq.AlignmentList ,
chain_names is expected to be a list of
str of same size as aln or a str .
For the latter case, chains will consecutively named
according to characters in chain_names.
- spdbv_style (
bool ) – True, if we need a model in the old SPDBV style.
- aln_preprocessing – Calls
promod3.modelling.PullTerminalDeletions()
if set to ‘default’. Can be disabled when set
to False.
|
Returns: | Raw (pseudo) model from the alignment.
|
Return type: | ModellingHandle
|
Raises: | A RuntimeError when:
- the alignments do not have two sequences
- the second sequence does not have an attached structure
- the residues of the template structure do not match with the
alignment sequence (note that you can set an “offset” (see
SetSequenceOffset() ) for the
template sequence (but not for the target))
- the target sequence has a non-zero offset (cannot be honored as
the resulting model will always start its residue numbering at 1)
|
The Default Pipeline
-
promod3.modelling. BuildFromRawModel (mhandle, use_amber_ff=False, extra_force_fields=[], model_termini=False)
Build a model starting with a raw model (see BuildRawModel() ).
This function implements a recommended pipeline to generate complete models
from a raw model. The steps are shown in detail in the code example
above. If you wish to use your own
pipeline, you can use that code as a starting point for your own custom
modelling pipeline. For reproducibility, we recommend that you keep copies
of custom pipelines.
To adapt the scoring used during loop closing, you can call
SetupDefaultBackboneScoring() and SetupDefaultAllAtomScoring()
and adapt the default scoring members. Alternatively, you can setup the
scoring manually, but you must ensure consistency yourself!
By default, a simple backbone dihedral sampling is performed when entering
Monte Carlo. If mhandle has a list of FraggerHandle objects
attached as “fragger_handles” attribute, the sampling will be performed with
structural fragments. To ensure consistency, the fragger handles should be
attached using SetFraggerHandles() .
But be aware of increased runtime due to the fragment search step.
If the function fails to close all gaps, it will produce a warning and
return an incomplete model.
Parameters: |
- mhandle (
ModellingHandle ) – The prepared template coordinates loaded with the input
alignment.
- use_amber_ff (
bool ) – if True, use the AMBER force field instead of the def.
CHARMM one (see ost.mol.mm.LoadAMBERForcefield()
and ost.mol.mm.LoadCHARMMForcefield() ).
Both do a similarly good job without ligands (CHARMM
slightly better), but you will want to be consistent
with the optional force fields in extra_force_fields.
- extra_force_fields (
list of ost.mol.mm.Forcefield ) – Additional list of force fields to use if a
(ligand) residue cannot be parametrized with the
default force field. The force fields are tried
in the order as given and ligands without an
existing parametrization are skipped.
- model_termini (
bool ) – The default modelling pipeline in ProMod3 is optimized
to generate a gap-free model of the region in the
target sequence(s) that is covered with template
information. Terminal extensions without template
coverage are negelected.
You can activate this flag to enforce a model of the
full target sequence(s). The terminal parts will be
modelled with a crude Monte Carlo approach. Be aware
that the accuracy of those termini is likely to be
limited.
|
Returns: | Delivers the model as an OST entity.
|
Return type: | Entity
|
Modelling Steps
-
promod3.modelling. SetupDefaultBackboneScoring (mhandle)
Setup scorers and environment for meddling with backbones.
This one is already tailored towards a certain modelling job.
The scorers added (with their respective keys) are:
-
promod3.modelling. IsBackboneScoringSetUp (mhandle)
-
-
promod3.modelling. SetupDefaultAllAtomScoring (mhandle)
Setup scorers and environment to perform all atom scoring.
This one is already tailored towards a certain modelling job, where we
reconstruct sidechains for loop candidates and score them.
The scorers added (with their respective keys) are:
-
promod3.modelling. IsAllAtomScoringSetUp (mhandle)
-
-
promod3.modelling. InsertLoop (mhandle, bb_list, start_resnum, chain_idx)
Insert loop into model and ensure consistent updating of scoring environments.
Note that we do not update all_atom_scorer_env as
that one is meant to be updated only while scoring. To clear a gap while
inserting a loop, use the simpler InsertLoopClearGaps() .
Parameters: |
- mhandle (
ModellingHandle ) – Modelling handle on which to apply change.
- bb_list (
BackboneList ) – Loop to insert (backbone only).
- start_resnum (
int ) – Res. number defining the start position in the SEQRES.
- chain_idx (
int ) – Index of chain the loop belongs to.
|
-
promod3.modelling. RemoveTerminalGaps (mhandle)
Removes terminal gaps without modelling them (just removes them from the list
of gaps). This is useful for pipelines which lack the possibility to properly
model loops at the termini.
Parameters: | mhandle (ModellingHandle ) – Modelling handle on which to apply change. |
Returns: | Number of gaps which were removed. |
Return type: | int |
-
promod3.modelling. ReorderGaps (mhandle)
Reorders all gaps to ensure sequential order by performing lexicographical
comparison on the sequence formed by chain index of the gap and
start residue number.
-
promod3.modelling. MergeMHandle (source_mhandle, target_mhandle, source_chain_idx, target_chain_idx, start_resnum, end_resnum, transform)
Merges the specified stretch of source_mhandle into target_mhandle by
replacing all structural information and gaps in the stretch
start_resnum and end_resnum (inclusive). The residues specified by
start_resnum and end_resnum must be valid in the source_mhandle,
i.e. not be enclosed by a gap. If a gap encloses start_resnum or
end_resnum in the target_mhandle, the gap gets replaced by a
shortened version not including the part overlapping with the defined stretch.
If there is any scoring set up (backbone or all atom), the according
environments get updated in target_mhandle.
Parameters: |
- source_mhandle (
ModellingHandle ) – Source of structural information and gaps
- target_mhandle (
ModellingHandle ) – Structural information and gaps will be copied in here
- source_chain_idx (
int ) – This is the chain where the info comes from
- target_chain_idx (
int ) – This is the chain where the info goes to
- start_resnum (
int ) – First residue of the copied stretch
- end_resnum (
int ) – Last residue of the copied stretch
- transform (
ost.geom.Mat4 ) – Transformation to be applied to all atom positions when
they’re copied over
|
Raises: | A RuntimeError when:
- the chain indices are invalid
- the SEQRES of the specified chains do not match
- the start and end residue numbers are invalid or when the residues
at the specified positions in the source_mhandle do not exist
- a gap in the source_mhandle encloses the residues specified by
start_resnum and end_resnum
|
-
promod3.modelling. SetSequenceProfiles (mhandle, profiles)
Sets the sequence profiles of mhandle
while ensuring consistency with the seqres .
Parameters: |
|
Raises: | ValueError when the given profiles are not consistent
with seqres in mhandle
|
-
promod3.modelling. SetPsipredPredictions (mhandle, predictions)
Sets the predictions of
mhandle while ensuring consistency with the
seqres .
Parameters: |
|
Raises: | ValueError when the given predictions are not consistent
with seqres in mhandle
|
-
promod3.modelling. SetFraggerHandles (mhandle, fragger_handles)
Sets fragger_handles in mhandle while
ensuring consistency with the ModellingHandle.seqres .
Parameters: |
|
Raises: | ValueError when the given fragger_handles are not
consistent with seqres in mhandle
|
-
promod3.modelling. CloseGaps (mhandle, merge_distance=4, fragment_db=None, structure_db=None, torsion_sampler=None, fragger_handles=None, chain_idx=None, resnum_range=None, length_dep_weights=False)
Tries to close all gaps in a model, except termini. It will go through
following steps:
Parameters: |
- mhandle (
ModellingHandle ) – Modelling handle on which to apply change.
- merge_distance (
int ) – Max. merge distance when performing the database
approach
- fragment_db (
FragDB ) – Database for searching fragments in database
approach, must be consistent with provided
structure_db. A default is loaded if None.
- structure_db (
StructureDB ) – Structure db from which the fragment_db gets
it’s structural information. A default is loaded
if None.
- torsion_sampler (
promod3.loop.TorsionSampler ) – Used as parameter for FillLoopsByDatabase()
and FillLoopsByMonteCarlo() A default one is
loaded if None.
- fragger_handles (
list ) – A list of promod3.modelling.FraggerHandle
objects for each chain in mhandle.
If provided, fragments will be used for
sampling when the FillLoopsByMonteCarlo()
gets executed.
- chain_idx (
int ) – If not None, only gaps from chain with given index get
processed
- resnum_range (
tuple containing two int ) – If not None, only gaps within this resnum range get
processed.
- length_dep_weights (
bool ) – ScoringWeights provides different sets
of weights that have been trained on different
loop subsets. If this flag is true, the length
dependent weights are used to close loops with
database / Monte Carlo.
|
-
promod3.modelling. CloseSmallDeletions (mhandle, max_extension=9, clash_thresh=1.0, e_thresh=200, use_scoring_extender=True, use_full_extender=True, chain_idx=None, resnum_range=None, ff_lookup=None)
Close small deletions by relaxing neighbouring residues.
Small deletions in the template from the target-template alignment have a
good chance to be bridged just by relaxing neighbours around a tiny gap.
Before diving into the more demanding tasks in modeling, those may be closed
already in the raw-model. After closure some checks are done to see if the
solution is stereochemically sensible.
Closed gaps are removed from mhandle.gaps .
from ost import io, seq
from promod3 import modelling
# setup
tpl = io.LoadPDB('data/gly.pdb')
aln = seq.CreateAlignment(seq.CreateSequence('trg', 'GGGG-GGGG'),
seq.CreateSequence('tpl', 'GGGGAGGGG'))
aln.AttachView(1, tpl.CreateFullView())
mhandle = modelling.BuildRawModel(aln)
# close small deletion
print('Number of gaps before: %d' % len(mhandle.gaps))
modelling.CloseSmallDeletions(mhandle)
print('Number of gaps after: %d' % len(mhandle.gaps))
-
promod3.modelling. MergeGapsByDistance (mhandle, distance, chain_idx=None, resnum_range=None)
Merge 2 neighbouring gaps by deleting residues in-between.
Check if two neighbouring gaps are at max. distance residues apart from
each other. Then delete the residues and store a new gap spanning the whole
stretch of original gaps and the deleted region. Original gaps will be
removed. Stem residues count to the gap, so A-A-A has a distance of 0.
IMPORTANT: we assume here that mhandle stores gaps sequentially.
Non-sequential gaps are ignored!
from ost import io, seq
from promod3 import modelling
# setup
tpl = io.LoadPDB('data/1crn_cut.pdb')
seq_trg = 'TTCCPSIVARSNFNVCRLPGTPEAICATGYTCIIIPGATCPGDYAN'
seq_tpl = 'TTCCPSIVARSNFNVCRLPGTPEA----G--CIIIPGATCPGDYAN'
aln = seq.CreateAlignment(seq.CreateSequence('trg', seq_trg),
seq.CreateSequence('tpl', seq_tpl))
aln.AttachView(1, tpl.CreateFullView())
mhandle = modelling.BuildRawModel(aln)
# merge gaps
print('Number of gaps before: %d' % len(mhandle.gaps))
modelling.MergeGapsByDistance(mhandle, 0)
print('Number of gaps after: %d' % len(mhandle.gaps))
Parameters: |
- mhandle (
ModellingHandle ) – Modelling handle on which to apply change.
- distance (
int ) – The max. no. of residues between two gaps up to which
merge happens.
- chain_idx (
int ) – If not None, only gaps from chain with given index get
processed
- resnum_range (
tuple containing two int ) – If not None, two gaps only get merged if they’re
both in this resnum range.
|
-
promod3.modelling. FillLoopsByDatabase (mhandle, fragment_db, structure_db, torsion_sampler=None, max_loops_to_search=40, min_loops_required=4, max_res_extension=-1, extended_search=True, use_scoring_extender=True, use_full_extender=True, score_variant=0, ring_punch_detection=1, chain_idx=None, resnum_range=None, max_num_all_atom=0, clash_thresh=-1, length_dep_weights=False)
Try to fill up loops from a structural database.
Usually this will extend the gaps a bit to match candidates from the
database. Do not expect a gap being filled in between its actual stem
residues.
This function cannot fill gaps at C- or N-terminal.
from ost import io, seq
from promod3 import modelling, loop
# setup
tpl = io.LoadPDB('data/1crn_cut.pdb')
seq_trg = 'TTCCPSIVARSNFNVCRLPGTPEAICATYTGCIIIPGATCPGDYAN'
seq_tpl = 'TTCCPSIVARSNFNVCRLPGTPEA------GCIIIPGATCPGDYAN'
aln = seq.CreateAlignment(seq.CreateSequence('trg', seq_trg),
seq.CreateSequence('tpl', seq_tpl))
aln.AttachView(1, tpl.CreateFullView())
mhandle = modelling.BuildRawModel(aln)
# close gaps
print('Number of gaps before: %d' % len(mhandle.gaps))
modelling.FillLoopsByDatabase(mhandle, loop.LoadFragDB(),
loop.LoadStructureDB(),
loop.LoadTorsionSamplerCoil())
print('Number of gaps after: %d' % len(mhandle.gaps))
-
promod3.modelling. FillLoopsByMonteCarlo (mhandle, torsion_sampler=None, max_loops_to_search=6, max_extension=30, mc_num_loops=2, mc_steps=5000, use_scoring_extender=True, use_full_extender=True, score_variant=0, ring_punch_detection=1, fragger_handles=None, chain_idx=None, resnum_range=None, length_dep_weights=False)
Try to fill up loops with Monte Carlo sampling.
This is meant as a “last-resort” approach when it is not possible to fill
the loops from the database with FillLoopsByDatabase() .
This will extend the gaps (up to max_extension times) a bit to allow for
more loop candidates to be found.
The loops are modelled by either sampling the dihedral angles or (if
fragger_handles is given) Fragger lists. The latter
is only used if the gap length is >= the length of fragments stored.
This function cannot fill gaps at C- or N-terminal.
from ost import io, seq
from promod3 import modelling, loop
# setup
tpl = io.LoadPDB('data/1crn_cut.pdb')
seq_trg = 'TTCCPSIVARSNFNVCRLPGTPEAICATYTGCIIIPGATCPGDYAN'
seq_tpl = 'TTCCPSIVARSNFNVCRLPGTPEA------GCIIIPGATCPGDYAN'
aln = seq.CreateAlignment(seq.CreateSequence('trg', seq_trg),
seq.CreateSequence('tpl', seq_tpl))
aln.AttachView(1, tpl.CreateFullView())
mhandle = modelling.BuildRawModel(aln)
# close gaps
print('Number of gaps before: %d' % len(mhandle.gaps))
modelling.FillLoopsByMonteCarlo(mhandle,
loop.LoadTorsionSamplerCoil())
print('Number of gaps after: %d' % len(mhandle.gaps))
-
promod3.modelling. CloseLargeDeletions (mhandle, structure_db, linker_length=8, num_fragments=500, use_scoring_extender=True, use_full_extender=True, chain_idx=None, resnum_range=None)
Try to close large deletions.
This is meant as a “last-resort” approach. In some cases you cannot
close very large deletions simply because the two parts separated
by a deletion are too far apart. The idea is to sample a linker region
and always move the whole chain towards the n-terminus.
Parameters: |
- mhandle (
ModellingHandle ) – Modelling handle on which to apply change.
- structure_db (
StructureDB ) – The database from which to extract fragments for
the linker region.
- linker_length (
int ) – Desired length (in residues w/o stems) for the
linker. This may be shorter if extender cannot
extend further.
- num_fragments (
int ) – Number of fragments to sample the linker.
- use_scoring_extender (
bool ) – True = use ScoringGapExtender instead
of GapExtender .
See CloseSmallDeletions() .
- use_full_extender (
bool ) – True = use FullGapExtender instead of
GapExtender .
See CloseSmallDeletions() .
- chain_idx (
int ) – If not None, only gaps from chain with given index get
processed
- resnum_range (
tuple containing two int ) – If not None, only gaps within this resnum range get
processed
|
-
promod3.modelling. ModelTermini (mhandle, torsion_sampler, fragger_handles=None, mc_num_loops=20, mc_steps=5000)
Try to model termini with Monte Carlo sampling.
Use with care! This is an experimental feature which will increase coverage
but we do not assume that the resulting termini are of high quality!
The termini are modelled by either sampling the dihedral angles or (if
fragger_handles is given) Fragger lists. The latter
is only used if the gap length is >= the length of fragments stored.
from ost import io, seq
from promod3 import modelling, loop
# setup
tpl = io.LoadPDB('data/gly.pdb')
seq_trg = 'AAAAGGGGGGGGGGGGGGGGGGGGAAAAAA'
seq_tpl = '----GGGGGGGGGGGGGGGGGGGG------'
aln = seq.CreateAlignment(seq.CreateSequence('trg', seq_trg),
seq.CreateSequence('tpl', seq_tpl))
aln.AttachView(1, tpl.CreateFullView())
mhandle = modelling.BuildRawModel(aln)
# close gaps
print('Number of gaps before: %d' % len(mhandle.gaps))
modelling.ModelTermini(mhandle,
loop.LoadTorsionSamplerCoil())
print('Number of gaps after: %d' % len(mhandle.gaps))
-
promod3.modelling. BuildSidechains (mhandle, merge_distance=4, fragment_db=None, structure_db=None, torsion_sampler=None, rotamer_library=None)
Build sidechains for model.
This is a wrapper for promod3.modelling.ReconstructSidechains() ,
followed by a check for ring punches. If ring punches are found it
introduces gaps for the residues with punched rings and tries to fill them
with FillLoopsByDatabase() with ring_punch_detection=2.
-
promod3.modelling. MinimizeModelEnergy (mhandle, max_iterations=12, max_iter_sd=20, max_iter_lbfgs=10, use_amber_ff=False, extra_force_fields=[], tolerance_sd=1.0, tolerance_lbfgs=1.0)
Minimize energy of final model using molecular mechanics.
Uses ost.mol.mm to perform energy minimization.
It will iteratively (at most max_iterations times):
- run up to max_iter_sd minimization iter. of a steepest descend method
- run up to max_iter_lbfgs minimization iter. of a Limited-memory
Broyden-Fletcher-Goldfarb-Shanno method
- abort if no stereochemical problems found
The idea is that we don’t want to minimize “too much”. So, we iteratively
minimize until there are no stereochemical problems and not more.
To speed things up, this can run on multiple CPU threads by setting the
env. variable PM3_OPENMM_CPU_THREADS to the number of desired threads.
If the variable is not set, 1 thread will be used by default.
If the starting model is so bad that the energy is NaN or Inf from the start
(happens if atoms are on top of each other or almost), the energy
minimization is aborted. This issue is logged and added as a major issue to
modelling_issues of mhandle.
Parameters: |
|
Returns: | The model including all oxygens as used in the minimizer.
|
Return type: | Entity
|
-
promod3.modelling. CheckFinalModel (mhandle)
Performs samity checks on final models and reports problems.
Issues are logged and tracked in modelling_issues
of mhandle. Major issues:
- Chains with less than 3 residues (usually due to bad templates).
- Incomplete models (i.e. some gaps couldn’t be closed). One issue is
created per unclosed gap and the stems of the gap are added to the issue.
- Complete models with sequence mismatches (should never happen).
- Residues with rings which have been punched by another bond.
Minor issues:
- Remaining stereo-chemical problems after energy minimization. The affected
residues will have the boolean property “stereo_chemical_problem_backbone”
set to True, if the problem affects backbone atoms.
- Residues that contain non-planar rings.
Parameters: | mhandle (ModellingHandle ) – Modelling handle for which to perform checks. |
-
class
promod3.modelling. ModellingIssue (text, severity, residue_list=[])
-
-
text
Description of issue.
-
severity
Severity of issue.
-
residue_list
List of residues affected by issue (or empty list if global issue).
-
class
Severity
Enumerates severities.
-
MAJOR = 10
Major issues like MM-failures, incomplete models and ring punches.
-
MINOR = 0
Minor issues like remaining stereo-chemistry problems after MM.
-
ModellingIssue. is_major ()
Returns: | True if this is a major issue. |
Return type: | bool |
-
promod3.modelling. AddModellingIssue (mhandle, text, severity, residue_list=[])
Adds a new ModellingIssue to
modelling_issues in mhandle.
If mhandle doesn’t contain the modelling_issues
attribute yet, it is added.
Alignment Fiddling
-
promod3.modelling. DeleteGapCols (aln)
Deletes alignment columns that only contain gaps.
Columns that only contain gaps (‘-‘) are removed. If no such column can be
identified, the input alignment gets returned. A new alignment gets
constructed otherwise. The sequences of the new alignment retain name, offset
and the potentially attached ost.mol.EntityView from the original
sequences.
-
promod3.modelling. PullTerminalDeletions (aln, min_terminal_anchor_size=4)
Fixes deletions close to termini.
Some alignment tools may produce alignments with deletions close to the
termini. For example:
SEQRES: A-----BCDE...
ATOMSEQ: ABCDEFGHIJ...
where A is the anchor residue. The default loop modelling pipeline would
keep the position of A fixed and start to omit structural information from
B, C, ... until it is able to resolve the deletion. If the anchor is very
short, a shift typically results in a better model:
SEQRES: -----ABCDE...
ATOMSEQ: ABCDEFGHIJ...
This function checks whether the gap closest to any termini is a deletion
(one or several ‘-‘ in the first sequence) and estimates the anchor size
(number of aligned residues towards the respective termini from that gap).
If the anchor size is smaller than min_terminal_anchor_size, a shift is
applied and the deletion removed.
This is done iteratively, until no deletion can be removed anymore.
SEQRES: A-B--CDEF...
ATOMSEQ: ABCDEFGHI...
becomes
SEQRES: ---ABCDEF...
ATOMSEQ: ABCDEFGHI...
given a min_terminal_anchor_size>2. If no shift can be performed, the
input alignment gets returned. A new alignment gets constructed otherwise.
The sequences of the new alignment retain name, offset and the potentially
attached ost.mol.EntityView from the original sequences.
|
Contents
Search
Enter search terms or a module, class or function name.
Previous topic
modelling - Protein Modelling
Next topic
Model Checking
You are here
|