clustalw - Perform multiple sequence alignment
- ClustalW(seq1, seq2=None, clustalw=None, keep_files=False, nopgap=False, clustalw_option_string=False)
Runs a ClustalW multiple sequence alignment. The result is returned as an AlignmentHandle instance.
There are two ways to use this function:
Align exactly two sequences:
- seq1 (SequenceHandle or str) – the first sequence
- seq2 (SequenceHandle or str) – the second sequence
The two sequences are specified as two separate function parameters (seq1, seq2). Each parameter can be either a SequenceHandle or a str, but both must have the same type.
Align two or more sequences:
- seq1 (SequenceList) – the list of sequences
- seq2 – must be None
Two or more sequences are specified by using a SequenceList, which is passed as the first function parameter (seq1). The second parameter (seq2) must be None.
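The two calling conventions can be sketched as follows. This is a minimal illustration rather than code from the OST sources: the wrapper names `align_pair` and `align_many` are hypothetical, and the sketch assumes that OpenStructure is installed, that a clustalw executable can be located, and that `seq.CreateSequence` and `seq.CreateSequenceList` are used to build the inputs.

```python
def align_pair(first, second):
    """Pairwise mode: two arguments of the same type (both str here)."""
    # Imported lazily: this needs OpenStructure and a clustalw executable.
    from ost.bindings import clustalw
    return clustalw.ClustalW(first, second)


def align_many(named_sequences):
    """List mode: one SequenceList as seq1, with seq2 left at None.

    named_sequences is a hypothetical input format for this sketch:
    an iterable of (name, residues) pairs.
    """
    from ost import seq
    from ost.bindings import clustalw

    seq_list = seq.CreateSequenceList()
    for name, residues in named_sequences:
        seq_list.AddSequence(seq.CreateSequence(name, residues))
    # seq2 is omitted, so it keeps its default value of None.
    return clustalw.ClustalW(seq_list)
```

Both helpers return whatever `ClustalW` returns, which according to the description above is an AlignmentHandle.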
Parameters: - clustalw (str) – path to clustalw executable (used in Locate())
- nopgap (bool) – turn residue-specific gaps off
- clustalw_option_string (str) – additional clustalw flags (see http://toolkit.tuebingen.mpg.de/clustalw/help_params)
- keep_files (bool) – do not delete temporary files
Note: ClustalW converts lowercase to uppercase and changes every ‘.’ to ‘-’. OST converts every ‘?’ to ‘X’ before aligning sequences with ClustalW.
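The combined effect of these conversions on a raw sequence string can be emulated in plain Python. The helper below is a hypothetical sketch for previewing what the aligner will see; it is not part of OST or ClustalW, and it only mirrors the three conversions named in the note.

```python
def normalize_for_clustalw(sequence: str) -> str:
    """Emulate the character conversions described in the note."""
    # OST step: every '?' becomes 'X'.
    converted = sequence.replace("?", "X")
    # ClustalW steps: lowercase becomes uppercase, every '.' becomes '-'.
    return converted.upper().replace(".", "-")


print(normalize_for_clustalw("ac?d.e"))
```

Running the helper on a string before alignment makes it easier to compare the input with the rows of the returned alignment.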
ClustalW will accept only IUB/IUPAC amino acid and nucleic acid codes:
Residue  Name                      Residue  Name
A        alanine                   P        proline
B        aspartate or asparagine   Q        glutamine
C        cysteine                  R        arginine
D        aspartate                 S        serine
E        glutamate                 T        threonine
F        phenylalanine             U        selenocysteine
G        glycine                   V        valine
H        histidine                 W        tryptophan
I        isoleucine                Y        tyrosine
K        lysine                    Z        glutamate or glutamine
L        leucine                   X        any
M        methionine                *        translation stop
N        asparagine                -        gap of indeterminate length
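A sequence can be checked against the codes listed above before it is handed to the aligner. The following is a minimal sketch, not an OST function: the names `ACCEPTED_CODES` and `unsupported_characters` are hypothetical, and the check first applies the conversions described in the note (‘?’ to ‘X’, lowercase to uppercase, ‘.’ to ‘-’) so that only characters that would still be unrecognized are reported.

```python
from typing import List

# One-letter codes from the table above, including '*' and '-'.
ACCEPTED_CODES = frozenset("ABCDEFGHIKLMNPQRSTUVWXYZ*-")


def unsupported_characters(sequence: str) -> List[str]:
    """Return the sorted characters that are not in the accepted code table."""
    # Apply the conversions from the note before checking.
    normalized = sequence.replace("?", "X").upper().replace(".", "-")
    return sorted(set(normalized) - ACCEPTED_CODES)


print(unsupported_characters("ACDJO"))
```

An empty list means every character maps to one of the listed codes; ‘J’ and ‘O’, for example, are reported because they do not appear in the table.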