Generate ost.mol.mm
systems¶
To simplify the creation of ost.mol.mm
/ OpenMM simulations for loops in
proteins, we define a system creator for loops (MmSystemCreator
) and a
specialized forcefield lookup for amino acids (ForcefieldLookup
).
The example below showcases the creation and use of an MM system:
from ost import io, geom
from promod3 import loop
# setup system creator
ff_lookup = loop.ForcefieldLookup.GetDefault()
mm_sys = loop.MmSystemCreator(ff_lookup)
# load example (has res. numbering starting at 1)
ent = io.LoadPDB("data/1CRN.pdb")
res_list = ent.residues
num_residues = len(res_list)
# get all atom positions for full protein
all_atoms = loop.AllAtomPositions(res_list)
# here full structure in res_indices but in practice this could
# be just a subset of residues relevant to the loop of interest
res_indices = list(range(num_residues))
# define two loops (indices into res_indices list)
loop_start_indices = [10, 20]
loop_lengths = [6, 4]
# define which of the res_indices is terminal
is_n_ter = [True] + [False] * (num_residues - 1)
is_c_ter = [False] * (num_residues - 1) + [True]
# get disulfid bridges
disulfid_bridges = mm_sys.GetDisulfidBridges(all_atoms, res_indices)
# setup MM system
mm_sys.SetupSystem(all_atoms, res_indices, loop_start_indices,
loop_lengths, is_n_ter, is_c_ter, disulfid_bridges)
# run simulation
sim = mm_sys.GetSimulation()
print("Potential energy before: %g" % sim.GetPotentialEnergy())
sim.ApplySD(0.01, 100)
print("Potential energy after: %g" % sim.GetPotentialEnergy())
# extract new loop positions and store it
mm_sys.ExtractLoopPositions(all_atoms, res_indices)
io.SavePDB(all_atoms.ToEntity(), "mm_sys_output.pdb")
Note
To simplify use when sidechains may be missing and the region of interest for
the MM system has to be determined, it might be better to use the simpler
promod3.modelling.SidechainReconstructor
and
promod3.modelling.AllAtomRelaxer
classes. Even if all the sidechains
are available, those classes will be helpful since the overhead to check
sidechains without reconstructing them is minimal.
Create MM systems for loops¶
- class promod3.loop.MmSystemCreator(ff_lookup, fix_surrounding_hydrogens=True, kill_electrostatics=False, nonbonded_cutoff=8, inaccurate_pot_energy=False)¶
Setup a system creator for a specific forcefield. The constructor only stores the settings. Most setup work is done by
SetupSystem()
.The idea is to have a set of movable loop residues and a set of fixed surrounding residues which interact with the loop.
- Parameters:
ff_lookup (
ForcefieldLookup
) – Forcefield to use with this system creator.fix_surrounding_hydrogens (
bool
) – If False, the hydrogens of the surrounding residues can move to improve H-bond building (True by default as it only has a minor impact on the result and a big one on performance).kill_electrostatics (
bool
) – If True, all charges are removed from the system. This is good for H-bond building, but may be bad if residues in the surrounding are missing (gaps).nonbonded_cutoff (
float
) – Defines cutoff to set for non bonded interactions. Recommended values: 5 if kill_electrostatics = True, 8 otherwise. Negative value means no cutoff.inaccurate_pot_energy (
bool
) – If True, we do not set correct non-bonded interactions between fixed atoms. This leads to inaccurate pot. energies but it is faster and has no effect on simulation runs (e.g. ApplySD) otherwise.
- GetDisulfidBridges(all_pos, res_indices)¶
- Returns:
Pairs of indices (i,j), where res_indices[i] and res_indices[j] are assumed to have a disulfid bridge (CYS-SG pairs within 2.5 A).
- Return type:
- Parameters:
all_pos (
AllAtomPositions
) – Provides positions for each res_indices[i].
- SetupSystem(all_pos, res_indices, loop_length, is_n_ter, is_c_ter, disulfid_bridges)¶
- SetupSystem(all_pos, res_indices, loop_start_indices, loop_lengths, is_n_ter, is_c_ter, disulfid_bridges)
Setup a new system for the loop / surrounding defined by all_pos and res_indices. Positions are taken from all_pos[res_indices[i]] and each of them must have all heavy atoms defined. Residue all_pos[i] is assumed to be peptide bound to all_pos[i-1] unless res. i is labelled as N-terminal. If all_pos[i-1] has an unset C (could happen for gaps while modelling), res. i will still be treated as if peptide-bound and the hydrogen attached to N is built assuming a helical structure as an approximation. Similarly, we do not generate OXT atoms for residues followed by a gap unless they are specifically labelled as C-terminal.
Each loop is defined by an entry in loop_start_indices and loop_lengths. For loop i_loop, res_indices[loop_start_indices[i_loop]] is the N-stem and res_indices[loop_start_indices[i_loop] + loop_lengths[i_loop] - 1] is the C-stem of the loop. The loop indices are expected to be contiguous in res_indices between the stems. The stems of the loop are kept fixed (N, CA, CB for N-stem / CA, CB, C, O for C-stem) unless they are terminal residues. Overlapping loops are merged and 0-length loops are removed.
If loop_length is given, a single loop with start index 0 and the given loop length is defined.
If a system was setup previously, it is overwritten completely.
If possible, this uses the “CPU” platform in OpenMM using the env. variable
PM3_OPENMM_CPU_THREADS
to define the number of desired threads (1 thread is used if variable is undefined).- Parameters:
all_pos (
AllAtomPositions
) – Provides positions for each res_indices[i].loop_length (
int
) – Length of single loop (incl. stems).loop_lengths (
list
ofint
) – Lengths of loops (incl. stems).is_n_ter (
list
ofbool
) – For each res_indices[i], is_n_ter[i] defines whether that residue is N-terminal.is_c_ter (
list
ofbool
) – For each res_indices[i], is_c_ter[i] defines whether that residue is C-terminal.disulfid_bridges (
list
oftuple
with twoint
) – Pairs of indices (i,j), where res_indices[i] and res_indices[j] form a disulfid bridge (seeGetDisulfidBridges()
)
- Raises:
RuntimeError
if loops out of bounds with respect to res_indices, loop_start_indices / loop_lengths or res_indices / is_n_ter / is_c_ter have inconsistent lengths, or if any all_pos[res_indices[i]] is invalid or has unset heavy atoms.
- UpdatePositions(all_pos, res_indices)¶
Updates the positions in the system. Even though
SetupSystem()
is already very fast, this can speed up resetting a simulation. The data must be consistent with the data used in the lastSetupSystem()
call.- Parameters:
all_pos (
AllAtomPositions
) – Provides positions for each res_indices[i].
- Raises:
RuntimeError
if data is inconsistent with lastSetupSystem()
call (same number of residues, same AA).
- ExtractLoopPositions(loop_pos)¶
- ExtractLoopPositions(out_pos, res_indices)
Extracts loop positions from the current simulation. If the simulation was run (via
GetSimulation()
), we internally have new positions for the residues corresponding to all_pos[res_indices[i]] passed inSetupSystem()
. This function then extracts these positions back into a compatible output storage.If loop_pos is passed, only the loops are stored. The loop residues are stored contiguously and the loops are stored in the order given by
GetLoopStartIndices()
/GetLoopLengths()
. The loop_pos storage must have at leastGetNumLoopResidues()
residues and must have matching amino acid types with respect to the loop residues passed as all_pos ofSetupSystem()
.If out_pos and res_indices are passed, residues must have matching amino acid types for out_pos[res_indices[i]] and all_pos[res_indices[i]] of
SetupSystem()
. Only loop residues with i in the ranges defined byGetLoopStartIndices()
/GetLoopLengths()
are touched.- Parameters:
loop_pos (
AllAtomPositions
) – Reduced storage only covering loop positions.out_pos (
AllAtomPositions
) – Storage for loop positions linked to res_indices.
- Raises:
RuntimeError
if data is inconsistent with lastSetupSystem()
call (big enough and matching AA).
- GetSimulation()¶
- Returns:
Simulation object setup by
SetupSystem()
. Use this to run MM simulations.- Return type:
- GetNumLoopResidues()¶
- Returns:
Number of loop residues (incl. stems) of current simulation setup.
- Return type:
- GetLoopStartIndices()¶
- Returns:
Start indices of loops (see
SetupSystem()
).- Return type:
- GetLoopLengths()¶
- Returns:
Lengths of loops (see
SetupSystem()
).- Return type:
- GetForcefieldAminoAcids()¶
- Returns:
Forcefield-specific amino acid type for each residue of last
SetupSystem()
call.- Return type:
- GetIndexing()¶
The atoms of residue i are stored from idx[i] to idx[i+1]-1, where idx is the list returned by this function. The atoms themselves are ordered according to the indexing defined by
ForcefieldLookup
.
- GetCpuPlatformSupport()¶
- Returns:
True, if we will use OpenMM’s “CPU” platform (enabled by default if platform is available). False, if we use “Reference” platform.
- Return type:
- SetCpuPlatformSupport(cpu_platform_support)¶
Override “CPU” platform support setting. Useful to force the use of the OpenMM’s “Reference” platform for testing (by default we use “CPU” if it is available).
- Parameters:
cpu_platform_support (
bool
) – True, if “CPU” platform desired (ignored if platform not available). False, if “Reference” platform desired.
Forcefield lookup for amino acids¶
The ForcefieldLookup
class and its helpers define a fast way to extract
FF specific data for amino acids in a protein. We distinguish amino acid types
(and a few variants thereof) which may all be N- and/or C-terminal.
- class promod3.loop.ForcefieldLookup¶
This class provides all functionality to generate
ost.mol.mm.Simulation
objects. Specifically, we can:get a consistent indexing of each atom of each residue in [0, N-1], where N =
GetNumAtoms()
(note that only OXT indexing depends on whether a residue is terminal)extract masses, charges and LJ-parameters for each atom (list of length N)
extract connectivities (
ForcefieldConnectivity
), which include all possible bonds, angles, dihedrals, impropers and LJ pairs
There is functionality to adapt the lookup and store it as needed or you can load a predefined one with
GetDefault()
orLoadCHARMM()
.The atom indexing and
GetAA()
are independent of the loaded file.- static GetDefault()¶
- Returns:
A default singleton instance (shared throughout the Python process) of this class with all data defined. Using this instance has the advantage that the object is only loaded once!
- Return type:
- static SetDefault(new_default)¶
Sets default singleton instance for all future
GetDefault()
calls.- Parameters:
new_default (
ForcefieldLookup
) – Lookup object to use as the new default.
- static LoadCHARMM()¶
- Returns:
Predefined lookup object extracted from a CHARMM forcefield (loaded from disk)
- Return type:
- static Load(filename)¶
- static LoadPortable(filename)¶
Loads raw binary file generated with
Save()
(optimized for fast reading) / portable file generated withSavePortable()
(slower but less machine-dependent).
- Save(filename)¶
- SavePortable(filename)¶
Saves a raw / portable binary representation. Use portable files for distribution and convert locally to raw files. See here for details.
- Parameters:
filename (
str
) – Path to the file where it will be saved.- Raises:
RuntimeError
if file cannot be opened.
- GetAA(ff_aa)¶
- Returns:
Amino acid type for given ff_aa
- Return type:
- Parameters:
ff_aa (
ForcefieldAminoAcid
) – Forcefield-specific amino acid type
- GetNumAtoms(ff_aa, is_nter, is_cter)¶
- Returns:
Number of atoms for given input.
- Return type:
- Parameters:
ff_aa (
ForcefieldAminoAcid
) – Forcefield-specific amino acid typeis_nter (
bool
) – True if N-terminal variant desiredis_cter (
bool
) – True if C-terminal variant desired
- GetHeavyIndex(ff_aa, atom_idx)¶
- GetHeavyIndex(ff_aa, atom_name)
- Returns:
Internal index for given heavy atom in [0,
GetNumAtoms()
]- Return type:
- Parameters:
ff_aa (
ForcefieldAminoAcid
) – Forcefield-specific amino acid typeatom_idx (
int
) – Atom index as returned byAminoAcidLookup.GetIndex()
atom_name (
str
) – Atom name
- GetHydrogenIndex(ff_aa, atom_idx)¶
- GetHydrogenIndex(ff_aa, atom_name)
- Returns:
Internal index for given hydrogen atom in [0,
GetNumAtoms()
]- Return type:
- Parameters:
ff_aa (
ForcefieldAminoAcid
) – Forcefield-specific amino acid typeatom_idx (
int
) – Atom index as returned byAminoAcidLookup.GetHydrogenIndex()
atom_name (
str
) – Atom name
- GetOXTIndex(ff_aa, is_nter)¶
- Returns:
Internal index of OXT atom for C-terminal residue
- Return type:
- Parameters:
ff_aa (
ForcefieldAminoAcid
) – Forcefield-specific amino acid typeis_nter (
bool
) – True if N-terminal variant desired
- GetFudgeLJ()¶
- Returns:
Dampening factor for LJ 1,4 interactions (see
ost.mol.mm.Topology.SetFudgeLJ()
)- Return type:
- GetFudgeQQ()¶
- Returns:
Dampening factor for electrostatic 1,4 interactions (see
ost.mol.mm.Topology.SetFudgeQQ()
)- Return type:
- GetMasses(ff_aa, is_nter, is_cter)¶
- Returns:
Mass for each atom (see
ost.mol.mm.Topology.SetMasses()
)- Return type:
list
offloat
(length =GetNumAtoms()
)- Parameters:
ff_aa (
ForcefieldAminoAcid
) – Forcefield-specific amino acid typeis_nter (
bool
) – True if N-terminal variant desiredis_cter (
bool
) – True if C-terminal variant desired
- GetCharges(ff_aa, is_nter, is_cter)¶
- Returns:
Charge for each atom (see
ost.mol.mm.Topology.SetCharges()
)- Return type:
list
offloat
(length =GetNumAtoms()
)- Parameters:
ff_aa (
ForcefieldAminoAcid
) – Forcefield-specific amino acid typeis_nter (
bool
) – True if N-terminal variant desiredis_cter (
bool
) – True if C-terminal variant desired
- GetSigmas(ff_aa, is_nter, is_cter)¶
- Returns:
Sigma in nm for each atom (see
ost.mol.mm.Topology.SetSigmas()
)- Return type:
list
offloat
(length =GetNumAtoms()
)- Parameters:
ff_aa (
ForcefieldAminoAcid
) – Forcefield-specific amino acid typeis_nter (
bool
) – True if N-terminal variant desiredis_cter (
bool
) – True if C-terminal variant desired
- GetEpsilons(ff_aa, is_nter, is_cter)¶
- Returns:
Epsilon in kJ/mol for each atom (see
ost.mol.mm.Topology.SetEpsilons()
)- Return type:
list
offloat
(length =GetNumAtoms()
)- Parameters:
ff_aa (
ForcefieldAminoAcid
) – Forcefield-specific amino acid typeis_nter (
bool
) – True if N-terminal variant desiredis_cter (
bool
) – True if C-terminal variant desired
- GetInternalConnectivity(ff_aa, is_nter, is_cter)¶
- Returns:
Internal connectivity of a residue
- Return type:
- Parameters:
ff_aa (
ForcefieldAminoAcid
) – Forcefield-specific amino acid typeis_nter (
bool
) – True if N-terminal variant desiredis_cter (
bool
) – True if C-terminal variant desired
- GetPeptideBoundConnectivity(ff_aa_one, ff_aa_two, is_nter, is_cter)¶
- Returns:
All connectivity which include peptide bond between two residues (additional to
GetInternalConnectivity()
)- Return type:
- Parameters:
ff_aa_one (
ForcefieldAminoAcid
) – Forcefield-specific amino acid type of first residueff_aa_two (
ForcefieldAminoAcid
) – Forcefield-specific amino acid type of second residueis_nter (
bool
) – True if first residue is N-terminalis_cter (
bool
) – True if second residue is C-terminal
- GetDisulfidConnectivity()¶
- Returns:
All connectivity which include disulfid bridge between two cysteins (additional to
GetInternalConnectivity()
)- Return type:
- SetFudgeLJ(fudge)¶
Set value for future
GetFudgeLJ()
calls to fudge.
- SetFudgeQQ(fudge)¶
Set value for future
GetFudgeQQ()
calls to fudge.
- SetMasses(ff_aa, is_nter, is_cter, masses)¶
Set value for future
GetMasses()
calls to masses.
- SetCharges(ff_aa, is_nter, is_cter, charges)¶
Set value for future
GetCharges()
calls to charges.
- SetSigmas(ff_aa, is_nter, is_cter, sigmas)¶
Set value for future
GetSigmas()
calls to sigmas.
- SetEpsilons(ff_aa, is_nter, is_cter, epsilons)¶
Set value for future
GetEpsilons()
calls to epsilons.
- SetInternalConnectivity(ff_aa, is_nter, is_cter, connectivity)¶
Set value for future
GetInternalConnectivity()
calls to connectivity.
- SetPeptideBoundConnectivity(ff_aa_one, ff_aa_two, is_nter, is_cter, connectivity)¶
Set value for future
GetPeptideBoundConnectivity()
calls to connectivity.
- SetDisulfidConnectivity(connectivity)¶
Set value for future
GetDisulfidConnectivity()
calls to connectivity.
- class promod3.loop.ForcefieldAminoAcid¶
Enumerates the amino acid types for forcefields. The first 20 values correspond to the 20 values of
ost.conop.AminoAcid
. Additionally, there are values for disulfid bridges (FF_CYS2), d-protonated histidine (FF_HISD, default for ost.conop.HIS is FF_HISE) and FF_XXX for unknown types. The full list of values is:FF_ALA, FF_ARG, FF_ASN, FF_ASP, FF_GLN, FF_GLU, FF_LYS, FF_SER, FF_CYS, FF_MET, FF_TRP, FF_TYR, FF_THR, FF_VAL, FF_ILE, FF_LEU, FF_GLY, FF_PRO FF_HISE, FF_PHE, FF_CYS2, FF_HISD, FF_XXX
- class promod3.loop.ForcefieldConnectivity¶
Contains lists of bonds, angles, dihedrals, impropers and LJ pairs (exclusions are the combination of all bonds and 1,3 pairs of angles and are not stored separately). Each type of connectivity has it’s own class (see below) storing indices and parameters to be used for methods of
ost.mol.mm.Topology
. The indexing of atoms for internal connectivities is in [0, N-1], where N =ForcefieldLookup.GetNumAtoms()
. For connectivities of pairs of residues, atoms of the first residue are in [0, N1-1] and atoms of the second one are in [N1, N1+N2-1], where N1 and N2 are the number of atoms of the two residues. For disulfid bridges, N1 = N2 = GetNumAtoms(FF_CYS2, False, False).- harmonic_bonds¶
List of harmonic bonds.
- Type:
- harmonic_angles¶
List of harmonic angles.
- Type:
- urey_bradley_angles¶
List of Urey-Bradley angles.
- Type:
- periodic_dihedrals¶
List of periodic dihedrals.
- Type:
- periodic_impropers¶
List of periodic impropers.
- Type:
- harmonic_impropers¶
List of harmonic impropers.
- Type:
- lj_pairs¶
List of LJ pairs.
- Type:
- class promod3.loop.ForcefieldBondInfo¶
Define harmonic bond (see
ost.mol.mm.Topology.AddHarmonicBond()
)
- class promod3.loop.ForcefieldHarmonicAngleInfo¶
Define harmonic angle (see
ost.mol.mm.Topology.AddHarmonicAngle()
)
- class promod3.loop.ForcefieldUreyBradleyAngleInfo¶
Define Urey-Bradley angle (see
ost.mol.mm.Topology.AddUreyBradleyAngle()
)
- class promod3.loop.ForcefieldPeriodicDihedralInfo¶
Define periodic dihedral or improper (see
ost.mol.mm.Topology.AddPeriodicDihedral()
andost.mol.mm.Topology.AddPeriodicImproper()
)
- class promod3.loop.ForcefieldHarmonicImproperInfo¶
Define harmonic improper (see
ost.mol.mm.Topology.AddHarmonicImproper()
)