io
- Input and Output of Sequences, Structures and Maps¶
The io module deals with the input and output of entities, alignments, sequences, and images. Importers for common file formats containing molecules
such as PDB, SDF and CHARMM trajectory files are available. Sequence and
alignment file formats such as FASTA and CLUSTALW are supported as well as
various image data (e.g. png, dm3) and density map files (e.g. CCP4, MRC).
Molecular Structures¶
Loading Molecular Structures¶
The io module offers several ways to load molecular structures depending on your requirements. The most general way is offered by LoadEntity(), which will automatically detect the file format based on the file extension.
- LoadEntity(filename, format='auto')¶
Load entity from disk. If format is set to ‘auto’, the function guesses the filetype based on the extension of the file, e.g. files ending in ‘.pdb’, ‘.ent’, ‘.ent.gz’ and ‘.pdb.gz’ will automatically be loaded as PDB files. For files with no extension or an unusual one, the format can be set explicitly as the second parameter.
# Recognizes SDF file by file extension
ent = io.LoadEntity('file.sdf')
# In this case, there is no file extension, so you have to say it's an
# SDF file explicitly
ent = io.LoadEntity('file', 'sdf')
For a list of file formats supported by LoadEntity(), see Supported Structure File Formats.
Raises: - IOUnknownFormatException if the format string supplied is not recognized or the file format cannot be detected based on the file extension.
- IOException if the import fails due to an erroneous or nonexistent file.
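The ‘auto’ behaviour described above can be pictured with a small, library-independent sketch. It is not the module's actual implementation: guess_format, the suffix table, and the use of ValueError as a stand-in for IOUnknownFormatException are all assumptions made for illustration. Only the suffixes named in the text above are listed.

```python
# Hypothetical suffix table, limited to the extensions mentioned in the text.
# The real list of supported formats lives in the library, not here.
_SUFFIXES = [
    (('.pdb', '.ent', '.pdb.gz', '.ent.gz'), 'pdb'),
    (('.sdf',), 'sdf'),
]


def guess_format(filename, format='auto'):
    """Return a format name, mimicking the documented 'auto' rule."""
    # An explicit format always wins over the file extension.
    if format != 'auto':
        return format
    for suffixes, name in _SUFFIXES:
        # str.endswith accepts a tuple of candidate suffixes.
        if filename.endswith(suffixes):
            return name
    # Stand-in for IOUnknownFormatException in this sketch.
    raise ValueError('cannot detect file format of %r' % filename)
```

With this sketch, guess_format('file.sdf') resolves through the table, while guess_format('file') raises unless a format such as 'sdf' is passed explicitly.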
Some of the formats have a dedicated function that allows you to tweak many parameters that affect the import. PDB files can be loaded with LoadPDB() and mmCIF files with LoadMMCIF() (this also gives you access to the MMCifInfo class). These functions offer tighter control over the exact loading behaviour.
- LoadPDB(filename, restrict_chains='', no_hetatms=None, fault_tolerant=None, load_multi=False, quack_mode=None, join_spread_atom_records=None, calpha_only=None, profile='DEFAULT', remote=False, dialect=None, seqres=False, bond_feasibility_check=None)¶
Load PDB file from disk and return one or more entities. Several options allow you to customize the exact behaviour of the PDB import. For more information on these options, see IO Profiles for entity importer.
Residues are flagged as ligand if they are mentioned in a HET record.
Parameters: - restrict_chains – If not an empty string, only chains listed in the string will be imported.
- fault_tolerant – Enable/disable fault-tolerant import. If set, overrides the value of IOProfile.fault_tolerant.
- no_hetatms – If set to True, HETATM records will be ignored. Overrides the value of IOProfile.no_hetatms.
- load_multi – If set to True, a list of entities will be returned instead of only the first. This is useful when dealing with multi-PDB files.
- join_spread_atom_records – If set, overrides the value of IOProfile.join_spread_atom_records.
- remote – If set to True, the method tries to load the PDB file from the remote PDB repository www.pdb.org. The filename is then interpreted as the PDB ID.
- dialect (str) – Specifies the particular dialect to use. If set, overrides the value of IOProfile.dialect.
- seqres – Whether to read SEQRES records. If set to True, the loaded entity and seqres entry will be returned as a tuple.
Return type: EntityHandle or a list thereof if load_multi is True.
Raises: IOException if the import fails due to an erroneous or nonexistent file.
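To make the effect of restrict_chains and no_hetatms more concrete, the following sketch applies the same two rules to raw PDB lines in plain Python. It is only an illustration of the documented behaviour, not the importer's code: filter_pdb_lines is a hypothetical helper, and it relies solely on the fixed-column PDB layout (record name in columns 1 to 6, chain identifier in column 22).

```python
def filter_pdb_lines(lines, restrict_chains='', no_hetatms=False):
    """Keep only the coordinate records the two options would let through."""
    kept = []
    for line in lines:
        record = line[:6].strip()
        # Non-coordinate records (HEADER, END, ...) are passed through as-is.
        if record not in ('ATOM', 'HETATM'):
            kept.append(line)
            continue
        # no_hetatms=True drops every HETATM record.
        if no_hetatms and record == 'HETATM':
            continue
        # Column 22 (index 21) holds the chain identifier.
        chain = line[21] if len(line) > 21 else ' '
        # A non-empty restrict_chains keeps only the chains it lists.
        if restrict_chains and chain not in restrict_chains:
            continue
        kept.append(line)
    return kept
```

Calling filter_pdb_lines(lines, restrict_chains='AB', no_hetatms=True) on the lines of a PDB file would keep the ATOM records of chains A and B and discard all HETATM records.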
Saving Molecular Structures¶
Saving a complete entity or a view is a matter of calling SaveEntity().
ent = io.LoadEntity('protein.pdb')
# save full entity
io.SaveEntity(ent, 'full.pdb')
# only save C-alpha atoms
io.SaveEntity(ent.Select('aname=CA and peptide=true'), 'calpha.pdb')
SavePDB() provides a simple way to save several entities into one file:
ent = io.LoadEntity('protein.pdb')
# Save complete entity
io.SavePDB(ent, 'full.pdb')
# Save chain A and chain B as separate models in one multi PDB file
io.SavePDB([ent.Select('cname=A'), ent.Select('cname=B')], 'split.pdb')
- SaveEntity(ent, filename, format='auto')¶
Save entity to disk. If format is set to ‘auto’, the function guesses the filetype based on the file extension, otherwise the supplied format is checked against the available export plugins.
Parameters: - ent (EntityHandle or EntityView) – The entity to be saved
- filename (string) – The filename
- format (string) – Name of the format
Raises: IOUnknownFormatException if the format string supplied is not recognized or the file format cannot be detected based on the file extension.
- SavePDB(models, filename, dialect=None, pqr=False, profile='DEFAULT')¶
Save entity or list of entities to disk. If a list of entities is supplied, the PDB file will be saved as a multi PDB file. Each of the entities is wrapped into a MODEL/ENDMDL pair.
If the atom number exceeds 99999, ‘*‘ is used.
Parameters: - models – The entity or list of entities (handles or views) to be saved
- filename (string) – The filename
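The MODEL/ENDMDL wrapping mentioned above can be sketched without the library. The helper below, write_multi_model, is hypothetical and works on plain lists of already formatted coordinate records; the trailing END record is an assumption of this sketch, not something the description above states.

```python
def write_multi_model(models):
    """Join several lists of coordinate records into one multi-model text.

    models is a list in which each item is a list of ATOM/HETATM lines.
    """
    out = []
    for number, records in enumerate(models, start=1):
        # MODEL record: name in columns 1-6, serial right-justified in 11-14.
        out.append('MODEL     %4d' % number)
        out.extend(records)
        out.append('ENDMDL')
    # Assumption: close the file with a single END record.
    out.append('END')
    return '\n'.join(out) + '\n'
```

Each inner list plays the role of one entity or view passed to SavePDB() in a list.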
Sequences and Alignments¶
Loading sequence or alignment files¶
- LoadSequence(filename, format='auto')¶
Load sequence data from disk. If format is set to ‘auto’, the function guesses the filetype based on the extension of the file. Files ending in ‘.fasta’ or ‘.aln’ will automatically be loaded.
For files with non-standard extensions, the format can be set explicitly by specifying the format parameter.
# recognizes FASTA file by file extension
myseq = io.LoadSequence('seq.fasta')
# for obtaining a SequenceList
seqlist = io.LoadSequenceList('seqs.fasta')
# or for multiple aligned fasta files use
aln = io.LoadAlignment('algnm.aln', format="clustal")
For a list of file formats supported by LoadSequence(), see Supported Sequence File Formats.
Raises: - IOUnknownFormatException if the format string supplied is not recognized or the file format cannot be detected based on the file extension.
- IOException if the import fails due to an erroneous or nonexistent file.
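For readers unfamiliar with the FASTA layout these loaders consume, here is a minimal stand-alone reader. It is a sketch of the file format only and does not reflect how the library parses files or what objects it returns; parse_fasta and its (name, sequence) tuples are hypothetical.

```python
def parse_fasta(text):
    """Return a list of (name, sequence) tuples from FASTA-formatted text."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith('>'):
            # A new header closes the previous record, if there was one.
            if name is not None:
                records.append((name, ''.join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is not None:
            # Sequence data may span several lines; collect the pieces.
            chunks.append(line)
    if name is not None:
        records.append((name, ''.join(chunks)))
    return records
```

A file with one record corresponds to the single-sequence case, and a file with several records to the list case.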
- LoadSequenceList(filename, format='auto')¶
For a description of how to use LoadSequenceList(), please refer to LoadSequence(). For a list of file formats supported by LoadSequenceList(), see Supported Sequence File Formats.
- LoadAlignment(filename, format='auto')¶
For a description of how to use LoadAlignment(), please refer to LoadSequence(). For a list of file formats supported by LoadAlignment(), see Supported Sequence File Formats.
- LoadSequenceProfile(filename, format='auto')¶
Load sequence profile data from disk. If format is set to ‘auto’, the function guesses the filetype based on the extension of the file. Files ending in ‘.hhm’ (output of HHblits) and ‘.pssm’ (ASCII Table (PSSM) output of PSI-BLAST as generated with blastpgp and flag -Q) will automatically be loaded.
For files with non-standard extensions, the format can be set explicitly by specifying the format parameter.
# recognizes hhm file by file extension
myprof = io.LoadSequenceProfile('myhmm.hhm')
# recognizes pssm file by file extension
myprof = io.LoadSequenceProfile('myprof.pssm')
# to override format
myprof = io.LoadSequenceProfile('myfile', format='hhm')
myprof = io.LoadSequenceProfile('myfile', format='pssm')
For a list of file formats supported by LoadSequenceProfile(), see Supported Sequence Profile File Formats.
Return type:
Raises: - IOUnknownFormatException if the format string supplied is not recognized or the file format cannot be detected based on the file extension.
- IOException if the import fails due to an erroneous or nonexistent file.
Saving Sequence Data¶
- SaveSequence(sequence, filename, format='auto')¶
Saving sequence data is performed by calling SaveSequence(). For files with non-standard extensions, the format can be set explicitly by specifying the ‘format’ parameter.
# recognizes FASTA file by file extension
io.SaveSequence(myseq, 'seq.fasta')
# for saving a SequenceList
io.SaveSequenceList(seqlist, 'seqlist.fasta')
# or multiple aligned fasta files
io.SaveAlignment(aln, 'algnm.aln', format="clustal")
For a list of file formats supported by SaveSequence(), see Supported Sequence File Formats.
Raises: - IOUnknownFormatException if the format string supplied is not recognized or the file format cannot be detected based on the file extension.
- IOException if the import fails due to an erroneous or nonexistent file.
- SaveSequenceList(seq_list, filename, format='auto')¶
For a description of how to use SaveSequenceList(), please refer to SaveSequence(). For a list of file formats supported by SaveSequenceList(), see Supported Sequence File Formats.
- SaveAlignment(aln, filename, format='auto')¶
For a description of how to use SaveAlignment(), please refer to SaveSequence(). For a list of file formats supported by SaveAlignment(), see Supported Sequence File Formats.
Density Maps¶
Loading Density Maps¶
- LoadImage(filename)¶
Load density map from disk with the format being guessed by the function.
Parameters: filename (string) – The filename
- LoadImage(filename, format)
Load density map from disk. If no format is given, the function guesses the filetype based on the extension of the file. If the extension is unknown or not present, the filetype will be guessed based on the content of the file if possible.
Parameters: - filename (string) – The filename
- format – The file format
Raises: - IOUnknownFormatException if the format supplied is not recognized or the file format cannot be detected based on the file extension and content.
- IOException if the import fails due to an erroneous or nonexistent file.
# recognizes mrc file by file extension
ent = io.LoadImage('file.mrc')
# it is always possible to set the image format explicitly,
# here for a DAT file
ent = io.LoadImage('file', Dat())
For a list of file formats supported by LoadImage(), see Supported Image File Formats.
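The two-stage guess described for LoadImage(), extension first and file content second, can be sketched as follows. This is an illustration only and not the library's detection code: guess_image_format, sniff_content, the extension table, and the use of ValueError in place of IOUnknownFormatException are assumptions. The content checks use two well-known signatures, the 8-byte PNG magic number and the 'MAP ' marker at byte offset 208 of MRC/CCP4 headers.

```python
import os

# Hypothetical extension table, for illustration only.
_BY_EXTENSION = {'.mrc': 'mrc', '.ccp4': 'ccp4', '.png': 'png'}


def sniff_content(header):
    """Guess a format name from the first bytes of a file, or return None."""
    if header.startswith(b'\x89PNG\r\n\x1a\n'):
        return 'png'
    # MRC and CCP4 maps share a header with 'MAP ' at bytes 208-211.
    if header[208:212] == b'MAP ':
        return 'mrc'
    return None


def guess_image_format(filename):
    """Try the extension first, then fall back to inspecting the content."""
    ext = os.path.splitext(filename)[1]
    if ext in _BY_EXTENSION:
        return _BY_EXTENSION[ext]
    with open(filename, 'rb') as handle:
        header = handle.read(212)
    fmt = sniff_content(header)
    if fmt is None:
        # Stand-in for IOUnknownFormatException in this sketch.
        raise ValueError('cannot detect image format of %r' % filename)
    return fmt
```

Note that the file is only opened when the extension gives no answer, so a missing file with a known extension is not detected at this stage.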
Saving Density Maps¶
- SaveImage(image, filename)¶
Save density map to disk with the function guessing the filetype based on the file extension.
- SaveImage(image, filename, format)
Save density map to disk. If no format is set, the function guesses the filetype based on the file extension.
Parameters: - image (ImageHandle) – The density map to be saved
- filename (string) – The filename
- format – The file format
Raises: IOUnknownFormatException if the file format cannot be detected based on the file extension.
For a list of file formats supported by SaveImage(), see Supported Image File Formats.
# load density map
image = io.LoadImage('density_map.ccp4')
# save density map
io.SaveImage(image, 'new_map.map', CCP4())