clustalw - Perform multiple sequence alignment
ClustalW(seq1, seq2=None, clustalw=None, keep_files=False, nopgap=False, clustalw_option_string=False)
Runs a clustalw multiple sequence alignment. The results are returned as an AlignmentHandle instance.
There are two ways to use this function:
- Align exactly two sequences:
  seq1 (SequenceHandle or str) – sequence_one
  seq2 (SequenceHandle or str) – sequence_two
  The two sequences are passed as two separate function parameters (seq1, seq2). Each can be either a SequenceHandle or a str, but both parameters must have the same type.
- Align two or more sequences:
  seq1 (SequenceList) – sequence_list
  seq2 – must be None
  Two or more sequences are collected in a SequenceList, which is passed as the first function parameter (seq1). The second parameter (seq2) must be None.
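The two calling conventions described above can be sketched as follows. This is a minimal illustration, not the canonical usage: the wrapper names `align_pair` and `align_many` are hypothetical, and the sketch assumes that OpenStructure is installed, that a clustalw executable can be located, and that `seq.CreateSequence` and `seq.CreateSequenceList` from `ost.seq` are used to build the input.

```python
def align_pair(first, second, clustalw_path=None):
    """Align exactly two sequences (hypothetical wrapper).

    Both arguments must have the same type: either two SequenceHandle
    objects or two plain strings.
    """
    # Imported lazily so that this module can be loaded without OpenStructure.
    from ost.bindings import clustalw
    return clustalw.ClustalW(first, second, clustalw=clustalw_path)


def align_many(named_sequences, clustalw_path=None):
    """Align two or more (name, sequence) pairs (hypothetical wrapper)."""
    from ost import seq
    from ost.bindings import clustalw

    # Collect all sequences in a SequenceList and pass it as seq1.
    seq_list = seq.CreateSequenceList()
    for name, residues in named_sequences:
        seq_list.AddSequence(seq.CreateSequence(name, residues))
    # seq2 keeps its default value of None in this mode.
    return clustalw.ClustalW(seq_list, clustalw=clustalw_path)
```

With OpenStructure available, `align_pair("MKLAVT", "MKIAVS")` or `align_many([("a", "MKLAVT"), ("b", "MKIAVS"), ("c", "MRLAVT")])` would each be expected to return an AlignmentHandle.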
Parameters:
- clustalw (str) – path to clustalw executable (used in Locate())
- nopgap (bool) – turn residue-specific gaps off
- clustalw_option_string (str) – additional clustalw flags (see http://toolkit.tuebingen.mpg.de/clustalw/help_params)
- keep_files (bool) – do not delete temporary files
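As a rough sketch of how the optional parameters might be combined: the helper `build_option_string` and the wrapper `align_with_options` below are hypothetical names introduced for illustration. The sketch assumes that `-GAPOPEN` and `-GAPEXT` are accepted by the installed clustalw version and that the option string is a space-separated list of command-line flags.

```python
def build_option_string(gap_open=None, gap_ext=None):
    """Join optional clustalw flags into one string (hypothetical helper)."""
    flags = []
    if gap_open is not None:
        flags.append("-GAPOPEN=%s" % gap_open)
    if gap_ext is not None:
        flags.append("-GAPEXT=%s" % gap_ext)
    return " ".join(flags)


def align_with_options(seq_list, clustalw_path=None):
    """Run ClustalW on a SequenceList with non-default settings (sketch)."""
    # Imported lazily so that this module can be loaded without OpenStructure.
    from ost.bindings import clustalw
    return clustalw.ClustalW(
        seq_list,
        clustalw=clustalw_path,  # explicit path to the executable, if any
        nopgap=True,             # turn residue-specific gaps off
        clustalw_option_string=build_option_string(gap_open=10, gap_ext=0.5),
        keep_files=True,         # keep temporary files for inspection
    )
```

Keeping the flag assembly in a separate function makes it possible to check the resulting string without running clustalw.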
Note: ClustalW will convert lowercase to uppercase, and change all ‘.’ to ‘-’. OST will convert ‘?’ to ‘X’ before aligning sequences with ClustalW.
ClustalW will accept only IUB/IUPAC amino acid and nucleic acid codes:
Residue  Name                       Residue  Name
A        alanine                    P        proline
B        aspartate or asparagine    Q        glutamine
C        cysteine                   R        arginine
D        aspartate                  S        serine
E        glutamate                  T        threonine
F        phenylalanine              U        selenocysteine
G        glycine                    V        valine
H        histidine                  W        tryptophan
I        isoleucine                 Y        tyrosine
K        lysine                     Z        glutamate or glutamine
L        leucine                    X        any
M        methionine                 *        translation stop
N        asparagine                 -        gap of indeterminate length
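The conversions in the note and the accepted codes listed above can be mirrored in plain Python to pre-check input before it is aligned. This is only an illustrative sketch: `normalize` and `invalid_codes` are hypothetical helpers, not part of OST or ClustalW, and the order in which the conversions are applied is an assumption.

```python
# One-letter codes from the table above (note that J and O are absent).
VALID_CODES = set("ABCDEFGHIKLMNPQRSTUVWXYZ*-")


def normalize(sequence):
    """Apply the documented conversions to a raw sequence string."""
    # '?' becomes 'X' (done by OST); lowercase becomes uppercase and
    # '.' becomes '-' (done by ClustalW).
    return sequence.replace("?", "X").upper().replace(".", "-")


def invalid_codes(sequence):
    """Return the sorted characters that ClustalW would not accept."""
    return sorted(set(normalize(sequence)) - VALID_CODES)


print(normalize("ac.g?t"))    # AC-GXT
print(invalid_codes("MKJO"))  # ['J', 'O']
```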