OpenStructure
ligand_scoring_base.py
Go to the documentation of this file.
1 import numpy as np
2 import networkx
3 
4 from ost import mol
5 from ost import LogWarning, LogScript, LogVerbose, LogDebug
6 from ost.mol.alg import chain_mapping
7 
9  """ Scorer to compute various small molecule ligand (non polymer) scores.
10 
11  .. note ::
12  Extra requirements:
13 
14  - Python modules `numpy` and `networkx` must be available
15  (e.g. use ``pip install numpy networkx``)
16 
17  :class:`LigandScorer` is an abstract base class dealing with all the setup,
18  data storage, enumerating ligand symmetries and target/model ligand
19  matching/assignment. But actual score computation is delegated to child
20  classes.
21 
22  At the moment, two such classes are available:
23 
24  * :class:`ost.mol.alg.ligand_scoring_lddtpli.LDDTPLIScorer`
25  that assesses the conservation of protein-ligand
26  contacts
27  * :class:`ost.mol.alg.ligand_scoring_scrmsd.SCRMSDScorer`
28  that computes a binding-site superposed, symmetry-corrected RMSD.
29 
30  All versus all scores are available through the lazily computed
31  :attr:`score_matrix`. However, many things can go wrong... be it even
32  something as simple as two ligands not matching. Error states therefore
33  encode scoring issues. An Issue for a particular ligand is indicated by a
34  non-zero state in :attr:`model_ligand_states`/:attr:`target_ligand_states`.
35  This invalidates pairwise scores of such a ligand with all other ligands.
36  This and other issues in pairwise score computation are reported in
37  :attr:`state_matrix` which has the same size as :attr:`score_matrix`.
38  Only if the respective location is 0, a valid pairwise score can be
39  expected. The states and their meaning can be explored with code::
40 
41  for state_code, (short_desc, desc) in scorer_obj.state_decoding.items():
42  print(state_code)
43  print(short_desc)
44  print(desc)
45 
46  A common use case is to derive a one-to-one mapping between ligands in
47  the model and the target for which :class:`LigandScorer` provides an
48  automated assignment procedure.
49  By default, only exact matches between target and model ligands are
50  considered. This is a problem when the target only contains a subset
51  of the expected atoms (for instance if atoms are missing in an
52  experimental structure, which often happens in the PDB). With
53  `substructure_match=True`, complete model ligands can be scored against
54  partial target ligands. One problem with this approach is that it is
55  very easy to find good matches to small, irrelevant ligands like EDO, CO2
56  or GOL. The assignment algorithm therefore considers the coverage,
57  expressed as the fraction of atoms of the model ligand atoms covered in the
58  target. Higher coverage matches are prioritized, but a match with a better
59  score will be preferred if it falls within a window of `coverage_delta`
60  (by default 0.2) of a worse-scoring match. As a result, for instance,
61  with a delta of 0.2, a low-score match with coverage 0.96 would be
62  preferred over a high-score match with coverage 0.70.
63 
64  Assumptions:
65 
66  :class:`LigandScorer` generally assumes that the
67  :attr:`~ost.mol.ResidueHandle.is_ligand` property is properly set on all
68  the ligand atoms, and only ligand atoms. This is typically the case for
69  entities loaded from mmCIF (tested with mmCIF files from the PDB and
70  SWISS-MODEL). Legacy PDB files must contain `HET` headers (which is usually
71  the case for files downloaded from the PDB but not elsewhere).
72 
73  The class doesn't perform any cleanup of the provided structures.
74  It is up to the caller to ensure that the data is clean and suitable for
75  scoring. :ref:`Molck <molck>` should be used with extra
76  care, as many of the options (such as `rm_non_std` or `map_nonstd_res`) can
77  cause ligands to be removed from the structure. If cleanup with Molck is
78  needed, ligands should be kept aside and passed separately. Non-ligand
79  residues should be valid compounds with atom names following the naming
80  conventions of the component dictionary. Non-standard residues are
81  acceptable, and if the model contains a standard residue at that position,
82  only atoms with matching names will be considered.
83 
84  Unlike most of OpenStructure, this class does not assume that the ligands
85  (either in the model or the target) are part of the PDB component
86  dictionary. They may have arbitrary residue names. Residue names do not
87  have to match between the model and the target. Matching is based on
88  the calculation of isomorphisms which depend on the atom element name and
89  atom connectivity (bond order is ignored).
90  It is up to the caller to ensure that the connectivity of atoms is properly
91  set before passing any ligands to this class. Ligands with improper
92  connectivity will lead to bogus results.
93 
94  Note, however, that atom names should be unique within a residue (ie two
95  distinct atoms cannot have the same atom name).
96 
97  This only applies to the ligand. The rest of the model and target
98  structures (protein, nucleic acids) must still follow the usual rules and
99  contain only residues from the compound library.
100 
101  Although it isn't a requirement, hydrogen atoms should be removed from the
102  structures. Here is an example code snippet that will perform a reasonable
103  cleanup. Keep in mind that this is most likely not going to work as
104  expected with entities loaded from PDB files, as the `is_ligand` flag is
105  probably not set properly.
106 
107  Here is an example of how to use setup a scorer code::
108 
109  from ost.mol.alg.ligand_scoring_scrmsd import SCRMSDScorer
110  from ost.mol.alg import Molck, MolckSettings
111 
112  # Load data
113  # Structure model in PDB format, containing the receptor only
114  model = io.LoadPDB("path_to_model.pdb")
115  # Ligand model as SDF file
116  model_ligand = io.LoadEntity("path_to_ligand.sdf", format="sdf")
117  # Target loaded from mmCIF, containing the ligand
118  target = io.LoadMMCIF("path_to_target.cif")
119 
120  # Cleanup a copy of the structures
121  cleaned_model = model.Copy()
122  cleaned_target = target.Copy()
123  molck_settings = MolckSettings(rm_unk_atoms=True,
124  rm_non_std=False,
125  rm_hyd_atoms=True,
126  rm_oxt_atoms=False,
127  rm_zero_occ_atoms=False,
128  colored=False,
129  map_nonstd_res=False,
130  assign_elem=True)
131  Molck(cleaned_model, conop.GetDefaultLib(), molck_settings)
132  Molck(cleaned_target, conop.GetDefaultLib(), molck_settings)
133 
134  # Setup scorer object and compute lDDT-PLI
135  model_ligands = [model_ligand.Select("ele != H")]
136  sc = SCRMSDScorer(cleaned_model, cleaned_target, model_ligands)
137 
138  # Perform assignment and read respective scores
139  for lig_pair in sc.assignment:
140  trg_lig = sc.target_ligands[lig_pair[0]]
141  mdl_lig = sc.model_ligands[lig_pair[1]]
142  score = sc.score_matrix[lig_pair[0], lig_pair[1]]
143  print(f"Score for {trg_lig} and {mdl_lig}: {score}")
144 
145  :param model: Model structure - a deep copy is available as :attr:`model`.
146  No additional processing (ie. Molck), checks,
147  stereochemistry checks or sanitization is performed on the
148  input. Hydrogen atoms are kept.
149  :type model: :class:`ost.mol.EntityHandle`/:class:`ost.mol.EntityView`
150  :param target: Target structure - a deep copy is available as
151  :attr:`target`. No additional processing (ie. Molck), checks
152  or sanitization is performed on the input. Hydrogen atoms are
153  kept.
154  :type target: :class:`ost.mol.EntityHandle`/:class:`ost.mol.EntityView`
155  :param model_ligands: Model ligands, as a list of
156  :class:`~ost.mol.ResidueHandle` belonging to the model
157  entity. Can be instantiated with either a :class:list
158  of :class:`~ost.mol.ResidueHandle`/
159  :class:`ost.mol.ResidueView` or of
160  :class:`ost.mol.EntityHandle`/
161  :class:`ost.mol.EntityView`.
162  If `None`, ligands will be extracted based on the
163  :attr:`~ost.mol.ResidueHandle.is_ligand` flag (this is
164  normally set properly in entities loaded from mmCIF).
165  :type model_ligands: :class:`list`
166  :param target_ligands: Target ligands, as a list of
167  :class:`~ost.mol.ResidueHandle` belonging to the
168  target entity. Can be instantiated either a
169  :class:list of :class:`~ost.mol.ResidueHandle`/
170  :class:`ost.mol.ResidueView` or of
171  :class:`ost.mol.EntityHandle`/
172  :class:`ost.mol.EntityView` containing a single
173  residue each. If `None`, ligands will be extracted
174  based on the :attr:`~ost.mol.ResidueHandle.is_ligand`
175  flag (this is normally set properly in entities
176  loaded from mmCIF).
177  :type target_ligands: :class:`list`
178  :param resnum_alignments: Whether alignments between chemically equivalent
179  chains in *model* and *target* can be computed
180  based on residue numbers. This can be assumed in
181  benchmarking setups such as CAMEO/CASP.
182  :type resnum_alignments: :class:`bool`
183  :param rename_ligand_chain: If a residue with the same chain name and
184  residue number than an explicitly passed model
185  or target ligand exits in the structure,
186  and `rename_ligand_chain` is False, a
187  RuntimeError will be raised. If
188  `rename_ligand_chain` is True, the ligand will
189  be moved to a new chain instead, and the move
190  will be logged to the console with SCRIPT
191  level.
192  :type rename_ligand_chain: :class:`bool`
193  :param substructure_match: Set this to True to allow incomplete (i.e.
194  partially resolved) target ligands.
195  :type substructure_match: :class:`bool`
196  :param coverage_delta: the coverage delta for partial ligand assignment.
197  :type coverage_delta: :class:`float`
198  :param max_symmetries: If more than that many isomorphisms exist for
199  a target-ligand pair, it will be ignored and reported
200  as unassigned.
201  :type max_symmetries: :class:`int`
202  """
203 
204  def __init__(self, model, target, model_ligands=None, target_ligands=None,
205  resnum_alignments=False, rename_ligand_chain=False,
206  substructure_match=False, coverage_delta=0.2,
207  max_symmetries=1e5):
208 
209  if isinstance(model, mol.EntityView):
210  self.modelmodel = mol.CreateEntityFromView(model, False)
211  elif isinstance(model, mol.EntityHandle):
212  self.modelmodel = model.Copy()
213  else:
214  raise RuntimeError("model must be of type EntityView/EntityHandle")
215 
216  if isinstance(target, mol.EntityView):
217  self.targettarget = mol.CreateEntityFromView(target, False)
218  elif isinstance(target, mol.EntityHandle):
219  self.targettarget = target.Copy()
220  else:
221  raise RuntimeError("target must be of type EntityView/EntityHandle")
222 
223  # Extract ligands from target
224  if target_ligands is None:
225  self.target_ligandstarget_ligands = self._extract_ligands_extract_ligands(self.targettarget)
226  else:
227  self.target_ligandstarget_ligands = self._prepare_ligands_prepare_ligands(self.targettarget, target,
228  target_ligands,
229  rename_ligand_chain)
230  if len(self.target_ligandstarget_ligands) == 0:
231  LogWarning("No ligands in the target")
232 
233  # Extract ligands from model
234  if model_ligands is None:
235  self.model_ligandsmodel_ligands = self._extract_ligands_extract_ligands(self.modelmodel)
236  else:
237  self.model_ligandsmodel_ligands = self._prepare_ligands_prepare_ligands(self.modelmodel, model,
238  model_ligands,
239  rename_ligand_chain)
240  if len(self.model_ligandsmodel_ligands) == 0:
241  LogWarning("No ligands in the model")
242  if len(self.target_ligandstarget_ligands) == 0:
243  raise ValueError("No ligand in the model and in the target")
244 
245  self.resnum_alignmentsresnum_alignments = resnum_alignments
246  self.rename_ligand_chainrename_ligand_chain = rename_ligand_chain
247  self.substructure_matchsubstructure_match = substructure_match
248  self.coverage_deltacoverage_delta = coverage_delta
249  self.max_symmetriesmax_symmetries = max_symmetries
250 
251  # lazily computed attributes
252  self.__chain_mapper__chain_mapper = None
253 
254  # keep track of states
255  # simple integers instead of enums - encoding available in
256  # self.state_decoding
257  self._state_matrix_state_matrix = None
258  self._model_ligand_states_model_ligand_states = None
259  self._target_ligand_states_target_ligand_states = None
260 
261  # score matrices
262  self._score_matrix_score_matrix = None
263  self._coverage_matrix_coverage_matrix = None
264  self._aux_matrix_aux_matrix = None
265 
266  # assignment and derived data
267  self._assignment_assignment = None
268  self._score_dict_score_dict = None
269  self._aux_dict_aux_dict = None
270 
271  # human readable description of states - child class must extend with
272  # with child class specific states
273  # each state code comes with a tuple of two elements:
274  # 1) short description 2) human readable description
275  # The actual states are set in _compute_scores in :class:`LigandScorer`
276  # or _compute_score of the child class.
277  if self.substructure_matchsubstructure_match:
278  iso = "subgraph isomorphism"
279  else:
280  iso = "full graph isomorphism"
281 
282  self.state_decodingstate_decoding = \
283  {0: ("OK", "OK"),
284  1: ("identity", f"Ligands could not be matched (by {iso})"),
285  2: ("symmetries", "Too many symmetries between ligand atoms were "
286  "found - increasing max_symmetries might help"),
287  3: ("no_iso", "No fully isomorphic match could be found - enabling "
288  "substructure_match might allow a match"),
289  4: ("disconnected", "Ligand graph is disconnected"),
290  5: ("stoichiometry", "Ligand was already assigned to another ligand "
291  "(different stoichiometry)"),
292  6: ("single_ligand_issue", "Cannot compute valid pairwise score as "
293  "either model or target ligand have non-zero state."),
294  9: ("unknown", "An unknown error occured in LigandScorer")}
295 
296  @property
297  def state_matrix(self):
298  """ Encodes states of ligand pairs
299 
300  Ligand pairs can be matched and a valid score can be expected if
301  respective location in this matrix is 0.
302  Target ligands are in rows, model ligands in columns. States are encoded
303  as integers <= 9. Larger numbers encode errors for child classes.
304  Use something like ``self.state_decoding[3]`` to get a decscription.
305 
306  :rtype: :class:`~numpy.ndarray`
307  """
308  if self._state_matrix_state_matrix is None:
309  self._compute_scores_compute_scores()
310  return self._state_matrix_state_matrix
311 
312  @property
314  """ Encodes states of model ligands
315 
316  Non-zero state in any of the model ligands invalidates the full
317  respective column in :attr:`~state_matrix`.
318 
319  :rtype: :class:`~numpy.ndarray`
320  """
321  if self._model_ligand_states_model_ligand_states is None:
322  self._compute_scores_compute_scores()
323  return self._model_ligand_states_model_ligand_states
324 
325  @property
327  """ Encodes states of target ligands
328 
329  Non-zero state in any of the target ligands invalidates the full
330  respective row in :attr:`~state_matrix`.
331 
332  :rtype: :class:`~numpy.ndarray`
333  """
334  if self._target_ligand_states_target_ligand_states is None:
335  self._compute_scores_compute_scores()
336  return self._target_ligand_states_target_ligand_states
337 
338  @property
339  def score_matrix(self):
340  """ Get the matrix of scores.
341 
342  Target ligands are in rows, model ligands in columns.
343 
344  NaN values indicate that no value could be computed (i.e. different
345  ligands). In other words: values are only valid if the respective
346  location in :attr:`~state_matrix` is 0.
347 
348  :rtype: :class:`~numpy.ndarray`
349  """
350  if self._score_matrix_score_matrix is None:
351  self._compute_scores_compute_scores()
352  return self._score_matrix_score_matrix
353 
354  @property
355  def coverage_matrix(self):
356  """ Get the matrix of model ligand atom coverage in the target.
357 
358  Target ligands are in rows, model ligands in columns.
359 
360  NaN values indicate that no value could be computed (i.e. different
361  ligands). In other words: values are only valid if the respective
362  location in :attr:`~state_matrix` is 0. If `substructure_match=False`,
363  only full match isomorphisms are considered, and therefore only values
364  of 1.0 can be observed.
365 
366  :rtype: :class:`~numpy.ndarray`
367  """
368  if self._coverage_matrix_coverage_matrix is None:
369  self._compute_scores_compute_scores()
370  return self._coverage_matrix_coverage_matrix
371 
372  @property
373  def aux_matrix(self):
374  """ Get the matrix of scorer specific auxiliary data.
375 
376  Target ligands are in rows, model ligands in columns.
377 
378  Auxiliary data consists of arbitrary data dicts which allow a child
379  class to provide additional information for a scored ligand pair.
380  empty dictionaries indicate that the child class simply didn't return
381  anything or that no value could be computed (e.g. different ligands).
382  In other words: values are only valid if respective location in the
383  :attr:`~state_matrix` is 0.
384 
385  :rtype: :class:`~numpy.ndarray`
386  """
387  if self._aux_matrix_aux_matrix is None:
388  self._compute_scores_compute_scores()
389  return self._aux_matrix_aux_matrix
390 
391  @property
392  def assignment(self):
393  """ Ligand assignment based on computed scores
394 
395  Implements a greedy algorithm to assign target and model ligands
396  with each other. Starts from each valid ligand pair as indicated
397  by a state of 0 in :attr:`state_matrix`. Each iteration first selects
398  high coverage pairs. Given max_coverage defined as the highest
399  coverage observed in the available pairs, all pairs with coverage
400  in [max_coverage-*coverage_delta*, max_coverage] are selected.
401  The best scoring pair among those is added to the assignment
402  and the whole process is repeated until there are no ligands to
403  assign anymore.
404 
405  :rtype: :class:`list` of :class:`tuple` (trg_lig_idx, mdl_lig_idx)
406  """
407  if self._assignment_assignment is None:
408  self._assignment_assignment = list()
409  # Build working array that contains tuples for all mdl/trg ligand
410  # pairs with valid score as indicated by a state of 0:
411  # (score, coverage, trg_ligand_idx, mdl_ligand_idx)
412  tmp = list()
413  for trg_idx in range(self.score_matrixscore_matrix.shape[0]):
414  for mdl_idx in range(self.score_matrixscore_matrix.shape[1]):
415  if self.state_matrixstate_matrix[trg_idx, mdl_idx] == 0:
416  tmp.append((self.score_matrixscore_matrix[trg_idx, mdl_idx],
417  self.coverage_matrixcoverage_matrix[trg_idx, mdl_idx],
418  trg_idx, mdl_idx))
419 
420  # sort by score, such that best scoring item is in front
421  if self._score_dir_score_dir() == '+':
422  tmp.sort(reverse=True)
423  elif self._score_dir_score_dir() == '-':
424  tmp.sort()
425  else:
426  raise RuntimeError("LigandScorer._score_dir must return one in "
427  "['+', '-']")
428 
429  while len(tmp) > 0:
430  # select high coverage ligand pairs in working array
431  coverage_thresh = max([x[1] for x in tmp]) - self.coverage_deltacoverage_delta
432  top_coverage = [x for x in tmp if x[1] >= coverage_thresh]
433 
434  # working array is sorted by score => just pick first one
435  a = top_coverage[0][2] # selected trg_ligand_idx
436  b = top_coverage[0][3] # selected mdl_ligand_idx
437  self._assignment_assignment.append((a, b))
438 
439  # kick out remaining pairs involving these ligands
440  tmp = [x for x in tmp if (x[2] != a and x[3] != b)]
441 
442  return self._assignment_assignment
443 
444  @property
445  def score(self):
446  """ Get a dictionary of score values, keyed by model ligand
447 
448  Extract score with something like:
449  ``scorer.score[lig.GetChain().GetName()][lig.GetNumber()]``.
450  The returned scores are based on :attr:`~assignment`.
451 
452  :rtype: :class:`dict`
453  """
454  if self._score_dict_score_dict is None:
455  self._score_dict_score_dict = dict()
456  for (trg_lig_idx, mdl_lig_idx) in self.assignmentassignment:
457  mdl_lig = self.model_ligandsmodel_ligands[mdl_lig_idx]
458  cname = mdl_lig.GetChain().GetName()
459  rnum = mdl_lig.GetNumber()
460  if cname not in self._score_dict_score_dict:
461  self._score_dict_score_dict[cname] = dict()
462  score = self.score_matrixscore_matrix[trg_lig_idx, mdl_lig_idx]
463  self._score_dict_score_dict[cname][rnum] = score
464  return self._score_dict_score_dict
465 
466  @property
467  def aux(self):
468  """ Get a dictionary of score details, keyed by model ligand
469 
470  Extract dict with something like:
471  ``scorer.score[lig.GetChain().GetName()][lig.GetNumber()]``.
472  The returned info dicts are based on :attr:`~assignment`. The content is
473  documented in the respective child class.
474 
475  :rtype: :class:`dict`
476  """
477  if self._aux_dict_aux_dict is None:
478  self._aux_dict_aux_dict = dict()
479  for (trg_lig_idx, mdl_lig_idx) in self.assignmentassignment:
480  mdl_lig = self.model_ligandsmodel_ligands[mdl_lig_idx]
481  cname = mdl_lig.GetChain().GetName()
482  rnum = mdl_lig.GetNumber()
483  if cname not in self._aux_dict_aux_dict:
484  self._aux_dict_aux_dict[cname] = dict()
485  d = self.aux_matrixaux_matrix[trg_lig_idx, mdl_lig_idx]
486  self._aux_dict_aux_dict[cname][rnum] = d
487  return self._aux_dict_aux_dict
488 
489  @property
491  """ Get indices of target ligands which are not assigned
492 
493  :rtype: :class:`list` of :class:`int`
494  """
495  # compute on-the-fly, no need for caching
496  assigned = set([x[0] for x in self.assignmentassignment])
497  return [x for x in range(len(self.target_ligandstarget_ligands)) if x not in assigned]
498 
499  @property
501  """ Get indices of model ligands which are not assigned
502 
503  :rtype: :class:`list` of :class:`int`
504  """
505  # compute on-the-fly, no need for caching
506  assigned = set([x[1] for x in self.assignmentassignment])
507  return [x for x in range(len(self.model_ligandsmodel_ligands)) if x not in assigned]
508 
509  def get_target_ligand_state_report(self, trg_lig_idx):
510  """ Get summary of states observed with respect to all model ligands
511 
512  Mainly for debug purposes
513 
514  :param trg_lig_idx: Index of target ligand for which report should be
515  generated
516  :type trg_lig_idx: :class:`int`
517  """
518  return self._get_report_get_report(self.target_ligand_statestarget_ligand_states[trg_lig_idx],
519  self.state_matrixstate_matrix[trg_lig_idx,:])
520 
521  def get_model_ligand_state_report(self, mdl_lig_idx):
522  """ Get summary of states observed with respect to all target ligands
523 
524  Mainly for debug purposes
525 
526  :param mdl_lig_idx: Index of model ligand for which report should be
527  generated
528  :type mdl_lig_idx: :class:`int`
529  """
530  return self._get_report_get_report(self.model_ligand_statesmodel_ligand_states[mdl_lig_idx],
531  self.state_matrixstate_matrix[:, mdl_lig_idx])
532 
533  def _get_report(self, ligand_state, pair_states):
534  """ Helper
535  """
536  pair_report = list()
537  for s in np.unique(pair_states):
538  desc = self.state_decodingstate_decoding[s]
539  indices = np.flatnonzero(pair_states == s).tolist()
540  pair_report.append({"state": s,
541  "short desc": desc[0],
542  "desc": desc[1],
543  "indices": indices})
544 
545  desc = self.state_decodingstate_decoding[ligand_state]
546  ligand_report = {"state": ligand_state,
547  "short desc": desc[0],
548  "desc": desc[1]}
549 
550  return (ligand_report, pair_report)
551 
553  """ Makes an educated guess why target ligand is not assigned
554 
555  This either returns actual error states or custom states that are
556  derived from them.
557 
558  :param trg_lig_idx: Index of target ligand
559  :type trg_lig_idx: :class:`int`
560  :returns: :class:`tuple` with two elements: 1) keyword 2) human readable
561  sentence describing the issue, (\"unknown\",\"unknown\") if
562  nothing obvious can be found.
563  :raises: :class:`RuntimeError` if specified target ligand is assigned
564  """
565  if trg_lig_idx not in self.unassigned_target_ligandsunassigned_target_ligands:
566  raise RuntimeError("Specified target ligand is not unassigned")
567 
568  # hardcoded tuple if there is simply nothing we can assign it to
569  if len(self.model_ligandsmodel_ligands) == 0:
570  return ("no_ligand", "No ligand in the model")
571 
572  # if something with the ligand itself is wrong, we can be pretty sure
573  # thats why the ligand is unassigned
574  if self.target_ligand_statestarget_ligand_states[trg_lig_idx] != 0:
575  return self.state_decodingstate_decoding[self.target_ligand_statestarget_ligand_states[trg_lig_idx]]
576 
577  # The next best guess comes from looking at pair states
578  tmp = np.unique(self.state_matrixstate_matrix[trg_lig_idx,:])
579 
580  # In case of any 0, it could have been assigned so it's probably
581  # just not selected due to different stoichiometry - this is no
582  # defined state, we just return a hardcoded tuple in this case
583  if 0 in tmp:
584  return ("stoichiometry",
585  "Ligand was already assigned to an other "
586  "model ligand (different stoichiometry)")
587 
588  # maybe its a symmetry issue?
589  if 2 in tmp:
590  return self.state_decodingstate_decoding[2]
591 
592  # if the state is 6 (single_ligand_issue), there is an issue with its
593  # target counterpart.
594  if 6 in tmp:
595  mdl_idx = np.where(self.state_matrixstate_matrix[trg_lig_idx,:]==6)[0]
596  # we're reporting everything except disconnected error...
597  # don't ask...
598  for i in mdl_idx:
599  if self.model_ligand_statesmodel_ligand_states[i] == 0:
600  raise RuntimeError("This should never happen")
601  if self.model_ligand_statesmodel_ligand_states[i] != 4:
602  return self.state_decodingstate_decoding[self.model_ligand_statesmodel_ligand_states[i]]
603 
604  # get rid of remaining single ligand issues (only disconnected error)
605  if 6 in tmp and len(tmp) > 1:
606  tmp = tmp[tmp!=6]
607 
608  # prefer everything over identity state
609  if 1 in tmp and len(tmp) > 1:
610  tmp = tmp[tmp!=1]
611 
612  # just return whatever is left
613  return self.state_decodingstate_decoding[tmp[0]]
614 
615 
616  def guess_model_ligand_unassigned_reason(self, mdl_lig_idx):
617  """ Makes an educated guess why model ligand is not assigned
618 
619  This either returns actual error states or custom states that are
620  derived from them.
621 
622  :param mdl_lig_idx: Index of model ligand
623  :type mdl_lig_idx: :class:`int`
624  :returns: :class:`tuple` with two elements: 1) keyword 2) human readable
625  sentence describing the issue, (\"unknown\",\"unknown\") if
626  nothing obvious can be found.
627  :raises: :class:`RuntimeError` if specified model ligand is assigned
628  """
629  if mdl_lig_idx not in self.unassigned_model_ligandsunassigned_model_ligands:
630  raise RuntimeError("Specified model ligand is not unassigned")
631 
632  # hardcoded tuple if there is simply nothing we can assign it to
633  if len(self.target_ligandstarget_ligands) == 0:
634  return ("no_ligand", "No ligand in the target")
635 
636  # if something with the ligand itself is wrong, we can be pretty sure
637  # thats why the ligand is unassigned
638  if self.model_ligand_statesmodel_ligand_states[mdl_lig_idx] != 0:
639  return self.state_decodingstate_decoding[self.model_ligand_statesmodel_ligand_states[mdl_lig_idx]]
640 
641  # The next best guess comes from looking at pair states
642  tmp = np.unique(self.state_matrixstate_matrix[:,mdl_lig_idx])
643 
644  # In case of any 0, it could have been assigned so it's probably
645  # just not selected due to different stoichiometry - this is no
646  # defined state, we just return a hardcoded tuple in this case
647  if 0 in tmp:
648  return ("stoichiometry",
649  "Ligand was already assigned to an other "
650  "model ligand (different stoichiometry)")
651 
652  # maybe its a symmetry issue?
653  if 2 in tmp:
654  return self.state_decodingstate_decoding[2]
655 
656  # if the state is 6 (single_ligand_issue), there is an issue with its
657  # target counterpart.
658  if 6 in tmp:
659  trg_idx = np.where(self.state_matrixstate_matrix[:,mdl_lig_idx]==6)[0]
660  # we're reporting everything except disconnected error...
661  # don't ask...
662  for i in trg_idx:
663  if self.target_ligand_statestarget_ligand_states[i] == 0:
664  raise RuntimeError("This should never happen")
665  if self.target_ligand_statestarget_ligand_states[i] != 4:
666  return self.state_decodingstate_decoding[self.target_ligand_statestarget_ligand_states[i]]
667 
668  # get rid of remaining single ligand issues (only disconnected error)
669  if 6 in tmp and len(tmp) > 1:
670  tmp = tmp[tmp!=6]
671 
672  # prefer everything over identity state
673  if 1 in tmp and len(tmp) > 1:
674  tmp = tmp[tmp!=1]
675 
676  # just return whatever is left
677  return self.state_decodingstate_decoding[tmp[0]]
678 
679  @property
681  return_dict = dict()
682  for i in self.unassigned_model_ligandsunassigned_model_ligands:
683  lig = self.model_ligandsmodel_ligands[i]
684  cname = lig.GetChain().GetName()
685  rnum = lig.GetNumber()
686  if cname not in return_dict:
687  return_dict[cname] = dict()
688  return_dict[cname][rnum] = \
689  self.guess_model_ligand_unassigned_reasonguess_model_ligand_unassigned_reason(i)[0]
690  return return_dict
691 
692  @property
694  return_dict = dict()
695  for i in self.unassigned_target_ligandsunassigned_target_ligands:
696  lig = self.target_ligandstarget_ligands[i]
697  cname = lig.GetChain().GetName()
698  rnum = lig.GetNumber()
699  if cname not in return_dict:
700  return_dict[cname] = dict()
701  return_dict[cname][rnum] = \
702  self.guess_target_ligand_unassigned_reasonguess_target_ligand_unassigned_reason(i)[0]
703  return return_dict
704 
705  @property
706  def _chain_mapper(self):
707  """ Chain mapper object for the given :attr:`target`.
708 
709  Can be used by child classes if needed, constructed with
710  *resnum_alignments* flag
711 
712  :type: :class:`ost.mol.alg.chain_mapping.ChainMapper`
713  """
714  if self.__chain_mapper__chain_mapper is None:
715  self.__chain_mapper__chain_mapper = \
717  n_max_naive=1e9,
718  resnum_alignments=self.resnum_alignmentsresnum_alignments)
719  return self.__chain_mapper__chain_mapper
720 
721  @staticmethod
722  def _extract_ligands(entity):
723  """Extract ligands from entity. Return a list of residues.
724 
725  Assumes that ligands have the :attr:`~ost.mol.ResidueHandle.is_ligand`
726  flag set. This is typically the case for entities loaded from mmCIF
727  (tested with mmCIF files from the PDB and SWISS-MODEL).
728  Legacy PDB files must contain `HET` headers (which is usually the
729  case for files downloaded from the PDB but not elsewhere).
730 
731  This function performs basic checks to ensure that the residues in this
732  chain are not forming polymer bonds (ie peptide/nucleotide ligands) and
733  will raise a RuntimeError if this assumption is broken.
734 
735  :param entity: the entity to extract ligands from
736  :type entity: :class:`~ost.mol.EntityHandle`
737  :rtype: :class:`list` of :class:`~ost.mol.ResidueHandle`
738 
739  """
740  extracted_ligands = []
741  for residue in entity.residues:
742  if residue.is_ligand:
743  if mol.InSequence(residue, residue.next):
744  raise RuntimeError("Residue %s connected in polymer sequen"
745  "ce %s" % (residue.qualified_name))
746  extracted_ligands.append(residue)
747  LogVerbose("Detected residue %s as ligand" % residue)
748  return extracted_ligands
749 
750  @staticmethod
751  def _prepare_ligands(new_entity, old_entity, ligands, rename_chain):
752  """Prepare the ligands given into a list of ResidueHandles which are
753  part of the copied entity, suitable for the model_ligands and
754  target_ligands properties.
755 
756  This function takes a list of ligands as (Entity|Residue)(Handle|View).
757  Entities can contain multiple ligands, which will be considered as
758  separate ligands.
759 
760  Ligands which are part of the entity are simply fetched in the new
761  copied entity. Otherwise, they are copied over to the copied entity.
762  """
763  extracted_ligands = []
764 
765  next_chain_num = 1
766  new_editor = None
767 
768  def _copy_residue(residue, rename_chain):
769  """ Copy the residue into the new chain.
770  Return the new residue handle."""
771  nonlocal next_chain_num, new_editor
772 
773  # Instantiate the editor
774  if new_editor is None:
775  new_editor = new_entity.EditXCS()
776 
777  new_chain = new_entity.FindChain(residue.chain.name)
778  if not new_chain.IsValid():
779  new_chain = new_editor.InsertChain(residue.chain.name)
780  new_editor.SetChainType(new_chain,
781  mol.ChainType.CHAINTYPE_NON_POLY)
782  else:
783  # Does a residue with the same name already exist?
784  already_exists = new_chain.FindResidue(residue.number).IsValid()
785  if already_exists:
786  if rename_chain:
787  chain_ext = 2 # Extend the chain name by this
788  while True:
789  new_chain_name = \
790  residue.chain.name + "_" + str(chain_ext)
791  new_chain = new_entity.FindChain(new_chain_name)
792  if new_chain.IsValid():
793  chain_ext += 1
794  continue
795  else:
796  new_chain = \
797  new_editor.InsertChain(new_chain_name)
798  break
799  LogScript("Moved ligand residue %s to new chain %s" % (
800  residue.qualified_name, new_chain.name))
801  else:
802  msg = \
803  "A residue number %s already exists in chain %s" % (
804  residue.number, residue.chain.name)
805  raise RuntimeError(msg)
806 
807  # Add the residue with its original residue number
808  new_res = new_editor.AppendResidue(new_chain, residue.name,
809  residue.number)
810  # Add atoms
811  for old_atom in residue.atoms:
812  new_editor.InsertAtom(new_res, old_atom.name, old_atom.pos,
813  element=old_atom.element, occupancy=old_atom.occupancy,
814  b_factor=old_atom.b_factor, is_hetatm=old_atom.is_hetatom)
815  # Add bonds
816  for old_atom in residue.atoms:
817  for old_bond in old_atom.bonds:
818  new_first = new_res.FindAtom(old_bond.first.name)
819  new_second = new_res.FindAtom(old_bond.second.name)
820  new_editor.Connect(new_first, new_second)
821  return new_res
822 
823  def _process_ligand_residue(res, rename_chain):
824  """Copy or fetch the residue. Return the residue handle."""
825  new_res = None
826  if res.entity.handle == old_entity.handle:
827  # Residue is part of the old_entity handle.
828  # However, it may not be in the copied one, for instance it may
829  # have been a view We try to grab it first, otherwise we copy it
830  new_res = new_entity.FindResidue(res.chain.name, res.number)
831  if new_res and new_res.valid:
832  qname = res.handle.qualified_name
833  LogVerbose("Ligand residue %s already in entity" % qname)
834  else:
835  # Residue is not part of the entity, need to copy it first
836  new_res = _copy_residue(res, rename_chain)
837  qname = res.handle.qualified_name
838  LogVerbose("Copied ligand residue %s" % qname)
839  new_res.SetIsLigand(True)
840  return new_res
841 
842  for ligand in ligands:
843  is_eh = isinstance(ligand, mol.EntityHandle)
844  is_ev = isinstance(ligand, mol.EntityView)
845  is_rh = isinstance(ligand, mol.ResidueHandle)
846  is_rv = isinstance(ligand, mol.ResidueView)
847  if is_eh or is_ev:
848  for residue in ligand.residues:
849  new_residue = _process_ligand_residue(residue, rename_chain)
850  extracted_ligands.append(new_residue)
851  elif is_rh or is_rv:
852  new_residue = _process_ligand_residue(ligand, rename_chain)
853  extracted_ligands.append(new_residue)
854  else:
855  raise RuntimeError("Ligands should be given as Entity or "
856  "Residue")
857 
858  if new_editor is not None:
859  new_editor.UpdateICS()
860  return extracted_ligands
861 
862  def _compute_scores(self):
863  """
864  Compute score for every possible target-model ligand pair and store the
865  result in internal matrices.
866  """
867 
870  shape = (len(self.target_ligandstarget_ligands), len(self.model_ligandsmodel_ligands))
871  self._score_matrix_score_matrix = np.full(shape, np.nan, dtype=np.float32)
872  self._coverage_matrix_coverage_matrix = np.full(shape, np.nan, dtype=np.float32)
873  self._state_matrix_state_matrix = np.full(shape, 0, dtype=np.int32)
874  self._model_ligand_states_model_ligand_states = np.zeros(len(self.model_ligandsmodel_ligands))
875  self._target_ligand_states_target_ligand_states = np.zeros(len(self.target_ligandstarget_ligands))
876  self._aux_matrix_aux_matrix = np.empty(shape, dtype=dict)
877 
878  # precompute ligand graphs
879  target_graphs = \
880  [_ResidueToGraph(l, by_atom_index=True) for l in self.target_ligandstarget_ligands]
881  model_graphs = \
882  [_ResidueToGraph(l, by_atom_index=True) for l in self.model_ligandsmodel_ligands]
883 
884  for g_idx, g in enumerate(target_graphs):
885  if not networkx.is_connected(g):
886  self._target_ligand_states_target_ligand_states[g_idx] = 4
887  self._state_matrix_state_matrix[g_idx,:] = 6
888  msg = "Disconnected graph observed for target ligand "
889  msg += str(self.target_ligandstarget_ligands[g_idx])
890  LogVerbose(msg)
891 
892  for g_idx, g in enumerate(model_graphs):
893  if not networkx.is_connected(g):
894  self._model_ligand_states_model_ligand_states[g_idx] = 4
895  self._state_matrix_state_matrix[:,g_idx] = 6
896  msg = "Disconnected graph observed for model ligand "
897  msg += str(self.model_ligandsmodel_ligands[g_idx])
898  LogVerbose(msg)
899 
900 
901  for target_id, target_ligand in enumerate(self.target_ligandstarget_ligands):
902  LogVerbose("Analyzing target ligand %s" % target_ligand)
903 
904  if self._target_ligand_states_target_ligand_states[target_id] == 4:
905  # Disconnected graph - already updated states and reported
906  # to LogVerbose
907  continue
908 
909  for model_id, model_ligand in enumerate(self.model_ligandsmodel_ligands):
910  LogVerbose("Compare to model ligand %s" % model_ligand)
911 
912 
915 
916  if self._model_ligand_states_model_ligand_states[model_id] == 4:
917  # Disconnected graph - already updated states and reported
918  # to LogVerbose
919  continue
920 
921  try:
922  symmetries = ComputeSymmetries(
923  model_ligand, target_ligand,
924  substructure_match=self.substructure_matchsubstructure_match,
925  by_atom_index=True,
926  max_symmetries=self.max_symmetriesmax_symmetries,
927  model_graph = model_graphs[model_id],
928  target_graph = target_graphs[target_id])
929  LogVerbose("Ligands %s and %s symmetry match" % (
930  str(model_ligand), str(target_ligand)))
931  except NoSymmetryError:
932  # Ligands are different - skip
933  LogVerbose("No symmetry between %s and %s" % (
934  str(model_ligand), str(target_ligand)))
935  self._state_matrix_state_matrix[target_id, model_id] = 1
936  continue
937  except TooManySymmetriesError:
938  # Ligands are too symmetrical - skip
939  LogVerbose("Too many symmetries between %s and %s" % (
940  str(model_ligand), str(target_ligand)))
941  self._state_matrix_state_matrix[target_id, model_id] = 2
942  continue
943  except NoIsomorphicSymmetryError:
944  # Ligands are different - skip
945  LogVerbose("No isomorphic symmetry between %s and %s" % (
946  str(model_ligand), str(target_ligand)))
947  self._state_matrix_state_matrix[target_id, model_id] = 3
948  continue
949  except DisconnectedGraphError:
950  # this should never happen as we guard against
951  # DisconnectedGraphError when precomputing the graph
952  LogVerbose("Disconnected graph observed for %s and %s" % (
953  str(model_ligand), str(target_ligand)))
954  # just set both ligand states to 4
955  self._model_ligand_states_model_ligand_states[model_id] = 4
956  self._model_ligand_states_model_ligand_states[target_id] = 4
957  self._state_matrix_state_matrix[target_id, model_id] = 6
958  continue
959 
960 
963  score, pair_state, target_ligand_state, model_ligand_state, aux\
964  = self._compute_compute(symmetries, target_ligand, model_ligand)
965 
966 
969 
970  # Ensure that returned states are associated with a
971  # description. This is a requirement when subclassing
972  # LigandScorer => state_decoding dict from base class must
973  # be modified in subclass constructor
974  if pair_state not in self.state_decodingstate_decoding or \
975  target_ligand_state not in self.state_decodingstate_decoding or \
976  model_ligand_state not in self.state_decodingstate_decoding:
977  raise RuntimeError(f"Subclass returned state "
978  f"\"{state}\" for which no "
979  f"description is available. Point "
980  f"the developer of the used scorer "
981  f"to this error message.")
982 
983  # if any of the ligand states is non-zero, this must be marked
984  # by the scorer subclass by specifying a specific pair state
985  if target_ligand_state != 0 and pair_state != 6:
986  raise RuntimeError("Observed non-zero target-ligand "
987  "state which must trigger certain "
988  "pair state. Point the developer of "
989  "the used scorer to this error message")
990 
991  if model_ligand_state != 0 and pair_state != 6:
992  raise RuntimeError("Observed non-zero model-ligand "
993  "state which must trigger certain "
994  "pair state. Point the developer of "
995  "the used scorer to this error message")
996 
997  self._state_matrix_state_matrix[target_id, model_id] = pair_state
998  self._target_ligand_states_target_ligand_states[target_id] = target_ligand_state
999  self._model_ligand_states_model_ligand_states[model_id] = model_ligand_state
1000  if pair_state == 0:
1001  if score is None or np.isnan(score):
1002  raise RuntimeError("LigandScorer returned invalid "
1003  "score despite 0 error state")
1004  # it's a valid score!
1005  self._score_matrix_score_matrix[target_id, model_id] = score
1006  cvg = len(symmetries[0][0]) / len(model_ligand.atoms)
1007  self._coverage_matrix_coverage_matrix[target_id, model_id] = cvg
1008  self._aux_matrix_aux_matrix[target_id, model_id] = aux
1009 
1010  def _compute(self, symmetries, target_ligand, model_ligand):
1011  """ Compute score for specified ligand pair - defined by child class
1012 
1013  Raises :class:`NotImplementedError` if not implemented by child class.
1014 
1015  :param symmetries: Defines symmetries between *target_ligand* and
1016  *model_ligand*. Return value of
1017  :func:`ComputeSymmetries`
1018  :type symmetries: :class:`list` of :class:`tuple` with two elements
1019  each: 1) :class:`list` of atom indices in
1020  *target_ligand* 2) :class:`list` of respective atom
1021  indices in *model_ligand*
1022  :param target_ligand: The target ligand
1023  :type target_ligand: :class:`ost.mol.ResidueHandle` or
1024  :class:`ost.mol.ResidueView`
1025  :param model_ligand: The model ligand
1026  :type model_ligand: :class:`ost.mol.ResidueHandle` or
1027  :class:`ost.mol.ResidueView`
1028 
1029  :returns: A :class:`tuple` with three elements: 1) a score
1030  (:class:`float`) 2) state (:class:`int`).
1031  3) auxiliary data for this ligand pair (:class:`dict`).
1032  If state is 0, the score and auxiliary data will be
1033  added to :attr:`~score_matrix` and :attr:`~aux_matrix` as well
1034  as the respective value in :attr:`~coverage_matrix`.
1035  Returned score must be valid in this case (not None/NaN).
1036  Child specific non-zero states must be >= 10.
1037  """
1038  raise NotImplementedError("_compute must be implemented by child "
1039  "class")
1040 
1041  def _score_dir(self):
1042  """ Return direction of score - defined by child class
1043 
1044  Relevant for ligand assignment. Must return a string in ['+', '-'].
1045  '+' for ascending scores, i.e. higher is better (lddt etc.)
1046  '-' for descending scores, i.e. lower is better (rmsd etc.)
1047  """
1048  raise NotImplementedError("_score_dir must be implemented by child "
1049  "class")
1050 
1051 
1052 def _ResidueToGraph(residue, by_atom_index=False):
1053  """Return a NetworkX graph representation of the residue.
1054 
1055  :param residue: the residue from which to derive the graph
1056  :type residue: :class:`ost.mol.ResidueHandle` or
1057  :class:`ost.mol.ResidueView`
1058  :param by_atom_index: Set this parameter to True if you need the nodes to
1059  be labeled by atom index (within the residue).
1060  Otherwise, if False, the nodes will be labeled by
1061  atom names.
1062  :type by_atom_index: :class:`bool`
1063  :rtype: :class:`~networkx.classes.graph.Graph`
1064 
1065  Nodes are labeled with the Atom's uppercase
1066  :attr:`~ost.mol.AtomHandle.element`.
1067  """
1068  nxg = networkx.Graph()
1069 
1070  for atom in residue.atoms:
1071  nxg.add_node(atom.name, element=atom.element.upper())
1072 
1073  # This will list all edges twice - once for every atom of the pair.
1074  # But as of NetworkX 3.0 adding the same edge twice has no effect,
1075  # so we're good.
1076  nxg.add_edges_from([(
1077  b.first.name,
1078  b.second.name) for a in residue.atoms for b in a.GetBondList()])
1079 
1080  if by_atom_index:
1081  nxg = networkx.relabel_nodes(nxg,
1082  {a: b for a, b in zip(
1083  [a.name for a in residue.atoms],
1084  range(len(residue.atoms)))},
1085  True)
1086  return nxg
1087 
1088 def ComputeSymmetries(model_ligand, target_ligand, substructure_match=False,
1089  by_atom_index=False, return_symmetries=True,
1090  max_symmetries=1e6, model_graph = None,
1091  target_graph = None):
1092  """Return a list of symmetries (isomorphisms) of the model onto the target
1093  residues.
1094 
1095  :param model_ligand: The model ligand
1096  :type model_ligand: :class:`ost.mol.ResidueHandle` or
1097  :class:`ost.mol.ResidueView`
1098  :param target_ligand: The target ligand
1099  :type target_ligand: :class:`ost.mol.ResidueHandle` or
1100  :class:`ost.mol.ResidueView`
1101  :param substructure_match: Set this to True to allow partial ligands
1102  in the reference.
1103  :type substructure_match: :class:`bool`
1104  :param by_atom_index: Set this parameter to True if you need the symmetries
1105  to refer to atom index (within the residue).
1106  Otherwise, if False, the symmetries refer to atom
1107  names.
1108  :type by_atom_index: :class:`bool`
1109  :type return_symmetries: If Truthy, return the mappings, otherwise simply
1110  return True if a mapping is found (and raise if
1111  no mapping is found). This is useful to quickly
1112  find out if a mapping exist without the expensive
1113  step to find all the mappings.
1114  :type return_symmetries: :class:`bool`
1115  :param max_symmetries: If more than that many isomorphisms exist, raise
1116  a :class:`TooManySymmetriesError`. This can only be assessed by
1117  generating at least that many isomorphisms and can take some time.
1118  :type max_symmetries: :class:`int`
1119  :raises: :class:`NoSymmetryError` when no symmetry can be found;
1120  :class:`NoIsomorphicSymmetryError` in case of isomorphic
1121  subgraph but *substructure_match* is False;
1122  :class:`TooManySymmetriesError` when more than `max_symmetries`
1123  isomorphisms are found; :class:`DisconnectedGraphError` if
1124  graph for *model_ligand*/*target_ligand* is disconnected.
1125  """
1126 
1127  # Get the Graphs of the ligands
1128  if model_graph is None:
1129  model_graph = _ResidueToGraph(model_ligand,
1130  by_atom_index=by_atom_index)
1131 
1132  if not networkx.is_connected(model_graph):
1133  msg = "Disconnected graph for model ligand %s" % model_ligand
1134  raise DisconnectedGraphError(msg)
1135 
1136 
1137  if target_graph is None:
1138  target_graph = _ResidueToGraph(target_ligand,
1139  by_atom_index=by_atom_index)
1140 
1141  if not networkx.is_connected(target_graph):
1142  msg = "Disconnected graph for target ligand %s" % target_ligand
1143  raise DisconnectedGraphError(msg)
1144 
1145  # Note the argument order (model, target) which differs from spyrmsd.
1146  # This is because a subgraph of model is isomorphic to target - but not the
1147  # opposite as we only consider partial ligands in the reference.
1148  # Make sure to generate the symmetries correctly in the end
1149  gm = networkx.algorithms.isomorphism.GraphMatcher(
1150  model_graph, target_graph, node_match=lambda x, y:
1151  x["element"] == y["element"])
1152  if gm.is_isomorphic():
1153  if not return_symmetries:
1154  return True
1155  symmetries = []
1156  for i, isomorphism in enumerate(gm.isomorphisms_iter()):
1157  if i >= max_symmetries:
1158  raise TooManySymmetriesError(
1159  "Too many symmetries between %s and %s" % (
1160  str(model_ligand), str(target_ligand)))
1161  symmetries.append((list(isomorphism.values()),
1162  list(isomorphism.keys())))
1163  assert len(symmetries) > 0
1164  LogDebug("Found %s isomorphic mappings (symmetries)" % len(symmetries))
1165  elif gm.subgraph_is_isomorphic() and substructure_match:
1166  if not return_symmetries:
1167  return True
1168  symmetries = []
1169  for i, isomorphism in enumerate(gm.subgraph_isomorphisms_iter()):
1170  if i >= max_symmetries:
1171  raise TooManySymmetriesError(
1172  "Too many symmetries between %s and %s" % (
1173  str(model_ligand), str(target_ligand)))
1174  symmetries.append((list(isomorphism.values()),
1175  list(isomorphism.keys())))
1176  assert len(symmetries) > 0
1177  # Assert that all the atoms in the target are part of the substructure
1178  assert len(symmetries[0][0]) == len(target_ligand.atoms)
1179  n_sym = len(symmetries)
1180  LogDebug("Found %s subgraph isomorphisms (symmetries)" % n_sym)
1181  elif gm.subgraph_is_isomorphic():
1182  LogDebug("Found subgraph isomorphisms (symmetries), but"
1183  " ignoring because substructure_match=False")
1184  raise NoIsomorphicSymmetryError("No symmetry between %s and %s" % (
1185  str(model_ligand), str(target_ligand)))
1186  else:
1187  LogDebug("Found no isomorphic mappings (symmetries)")
1188  raise NoSymmetryError("No symmetry between %s and %s" % (
1189  str(model_ligand), str(target_ligand)))
1190 
1191  return symmetries
1192 
1193 class NoSymmetryError(ValueError):
1194  """ Exception raised when no symmetry can be found.
1195  """
1196  pass
1197 
1198 class NoIsomorphicSymmetryError(ValueError):
1199  """ Exception raised when no isomorphic symmetry can be found
1200 
1201  There would be isomorphic subgraphs for which symmetries can
1202  be found, but substructure_match is disabled
1203  """
1204  pass
1205 
1206 class TooManySymmetriesError(ValueError):
1207  """ Exception raised when too many symmetries are found.
1208  """
1209  pass
1210 
1211 class DisconnectedGraphError(Exception):
1212  """ Exception raised when the ligand graph is disconnected.
1213  """
1214  pass
1215 
1216 # specify public interface
1217 __all__ = ('LigandScorer', 'ComputeSymmetries', 'NoSymmetryError',
1218  'NoIsomorphicSymmetryError', 'TooManySymmetriesError',
1219  'DisconnectedGraphError')
Protein or molecule.
definition of EntityView
Definition: entity_view.hh:86
def __init__(self, model, target, model_ligands=None, target_ligands=None, resnum_alignments=False, rename_ligand_chain=False, substructure_match=False, coverage_delta=0.2, max_symmetries=1e5)
def _compute(self, symmetries, target_ligand, model_ligand)
def _prepare_ligands(new_entity, old_entity, ligands, rename_chain)
def _get_report(self, ligand_state, pair_states)
def ComputeSymmetries(model_ligand, target_ligand, substructure_match=False, by_atom_index=False, return_symmetries=True, max_symmetries=1e6, model_graph=None, target_graph=None)