clustalw - Perform multiple sequence alignment
- ClustalW(seq1, seq2=None, clustalw=None, keep_files=False, nopgap=False, clustalw_option_string=False)
Runs a ClustalW multiple sequence alignment. The results are returned as an AlignmentHandle instance.

There are two ways to use this function:
Align exactly two sequences:
- param seq1: sequence_one
- type seq1: SequenceHandle or str
- param seq2: sequence_two
- type seq2: SequenceHandle or str

The two sequences can be specified as two separate function parameters (seq1, seq2). Each parameter can be either a SequenceHandle or a str, but both parameters must have the same type.

Align two or more sequences:
- param seq1: sequence_list
- type seq1: SequenceList
- param seq2: must be None

Two or more sequences can be specified by using a SequenceList, which is passed as the first function parameter (seq1). The second parameter (seq2) must be None.
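The two calling conventions described above can be sketched as follows. This is a minimal illustration, not a definitive recipe: the wrapper names `align_pair` and `align_many` are hypothetical, and the sketch assumes that OpenStructure is installed, that `seq.CreateSequence` and `seq.CreateSequenceList` are used to build the inputs, and that a ClustalW executable can be located on the system.

```python
def align_pair(seq_a, seq_b):
    """Align exactly two sequences given as plain strings (hypothetical helper)."""
    # Imported lazily so this sketch can be loaded without OpenStructure.
    from ost.bindings import clustalw
    # Both arguments are str here; seq1 and seq2 must share the same type.
    return clustalw.ClustalW(seq_a, seq_b)


def align_many(named_sequences):
    """Align two or more (name, sequence) pairs via a SequenceList (hypothetical helper)."""
    from ost import seq
    from ost.bindings import clustalw
    seq_list = seq.CreateSequenceList()
    for name, residues in named_sequences:
        seq_list.AddSequence(seq.CreateSequence(name, residues))
    # seq2 keeps its default of None when a SequenceList is passed as seq1.
    return clustalw.ClustalW(seq_list)
```

With ClustalW available, `align_pair("MKTAYIAKQR", "MKTAYIGKQR")` would be expected to return an AlignmentHandle, as would `align_many` when given a list of uniquely named sequences.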
- Parameters:
  - clustalw (str) – path to ClustalW executable (used in Locate())
  - nopgap (bool) – turn residue-specific gaps off
  - clustalw_option_string (str) – additional ClustalW flags (see http://www.clustal.org/download/clustalw_help.txt)
  - keep_files (bool) – do not delete temporary files
Note
- In the passed sequences, ClustalW will convert lowercase to uppercase and change all ‘.’ to ‘-’. OST will convert any ‘?’ to ‘X’ before aligning sequences with ClustalW.
- If a sequence name contains spaces, only the part before the space is considered as the sequence name. To avoid surprises, you should remove spaces from the sequence name.
- Sequence names must be unique (a ValueError exception is raised otherwise).
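To make the effect of these rules visible before an alignment is run, the inputs can be cleaned up front. The following is a minimal sketch in plain Python that does not call OST; the helper name `prepare_for_clustalw` is hypothetical, and replacing whitespace with underscores is an assumption chosen for illustration, not something OST does itself.

```python
def prepare_for_clustalw(named_sequences):
    """Return cleaned (name, sequence) pairs; raise ValueError on duplicate names."""
    cleaned = []
    seen = set()
    for name, residues in named_sequences:
        # Replace whitespace so the full name is kept instead of being cut
        # at the first space (assumption: underscores are acceptable).
        safe_name = "_".join(name.split())
        if safe_name in seen:
            raise ValueError(f"duplicate sequence name: {safe_name!r}")
        seen.add(safe_name)
        # Apply the conversions described in the note ahead of time:
        # lowercase to uppercase, '.' to '-', and '?' to 'X'.
        normalized = residues.upper().replace(".", "-").replace("?", "X")
        cleaned.append((safe_name, normalized))
    return cleaned
```

For example, `prepare_for_clustalw([("chain A", "ac.d?")])` returns `[("chain_A", "AC-DX")]`, and two entries whose cleaned names collide raise a ValueError before ClustalW is ever invoked.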
ClustalW will accept only IUB/IUPAC amino acid and nucleic acid codes:
Residue  Name                       Residue  Name
-------  -------------------------  -------  ---------------------------
A        alanine                    P        proline
B        aspartate or asparagine    Q        glutamine
C        cysteine                   R        arginine
D        aspartate                  S        serine
E        glutamate                  T        threonine
F        phenylalanine              U        selenocysteine
G        glycine                    V        valine
H        histidine                  W        tryptophan
I        isoleucine                 Y        tyrosine
K        lysine                     Z        glutamate or glutamine
L        leucine                    X        any
M        methionine                 *        translation stop
N        asparagine                 -        gap of indeterminate length
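
The codes listed above can be used to screen a sequence before it is passed to ClustalW. The sketch below is plain Python and independent of OST; the names `ACCEPTED_CODES` and `find_rejected_characters` are hypothetical, and the accepted set is taken only from the codes listed above. It assumes the conversions of lowercase to uppercase, ‘.’ to ‘-’ and ‘?’ to ‘X’ are applied first.

```python
# One-letter codes from the list above: every letter except J and O,
# plus '*' (translation stop) and '-' (gap).
ACCEPTED_CODES = frozenset("ABCDEFGHIKLMNPQRSTUVWXYZ*-")


def find_rejected_characters(residues):
    """Return the sorted characters of `residues` that are not accepted codes."""
    # Apply the same conversions that happen before alignment.
    normalized = residues.upper().replace(".", "-").replace("?", "X")
    return sorted(set(normalized) - ACCEPTED_CODES)
```

An empty result means every character maps to an accepted code; for instance, `find_rejected_characters("MKT?a.")` returns `[]`, while `find_rejected_characters("MJO1")` returns `['1', 'J', 'O']`.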