Old algorithm
The MUSCLE algorithm (before v5) proceeds in three stages: the draft progressive, improved progressive, and refinement stage.
Stage 2: Improved Progressive
This stage focuses on obtaining a more optimal tree by calculating the Kimura distance for each pair of input sequences using the multiple sequence alignment obtained in Stage one, and creates a second distance matrix. UPGMA clusters this distance matrix to obtain a second binary tree. A progressive alignment is performed to obtain a multiple sequence alignment like in Stage one, but it is optimized by only computing alignments in subtrees whose branching orders have changed from the first binary tree, resulting in a more accurate alignment.
Refined alignments are made in the second stage by recalculating a more accurate tree via the Kimura distance. Thus, the algorithm analysis involves the initial subprocedures of pairwise distance calculations and progressive alignment; however, optimizations in computation are made by limiting re-alignment to only those subtrees with altered branching orders. The optimization is thus given as
,
where the variable
denotes the number of subtree realignments. Similarly, the space complexity is
,
as profiles and alignments for the input sequences are stored for the progressive alignment process.[1]
Stage 3: Refinement
In this final stage, an edge is chosen from the second tree, with edges being visited in decreasing distance from the root. The chosen edge is deleted, dividing the tree into two subtrees. The profile of the multiple alignment is then computed for each subtree. A new multiple sequence alignment is produced by re-aligning the subtree profiles. If the SP score is improved, the new alignment is kept, otherwise, it is discarded. The process of deleting an edge and aligning is repeated until convergence, or until a user-defined limit is reached.
The time complexity of the refinement stage is given as
. Here,
denotes the number of edge deletions and
denotes the average sequence length, where re-alignment of the subtree profiles is still the dominant cost per iteration. The space complexity remains the same as given in Stage one and two:
. Since the same iterative refinement process occurs, the asymptotic complexity remains polynomial as the dominant term grows linearly with respect to the number of refinement steps.
In comparison, the CLUSTALW algorithm includes an optimized iterative refinement step such that selective re-alignment of the tree occurs in order to maximize alignment accuracy without repeating the entire process. The time and space complexity, however, do not change for this optimized iterative refinement step. The time complexity is
, where
is the number of refinement steps and
is the average alignment length. The space complexity is given as
, again, for alignment profiles and sequence data for all
input sequences.[1][2]
Comparison of Space Complexity and Alignment Accuracy
| Feature | CLUSTALW | MUSCLE v3 Version |
| Time Complexity |
 |  |
| Alignment Accuracy |
Moderate | Higher, due to iterative refinement improving SP score |
Algorithm Flowchart

Complexity and Comparison
In the first two stages of the algorithm, the time complexity is O(N2L + NL2), the space complexity is O(N2 + NL + L2). The refinement stage adds to the time complexity another term, O(N3L).[1] MUSCLE is often used as a replacement for Clustal, since it usually (but not always) gives better sequence alignments. Depending on the chosen options, MUSCLE is significantly faster than Clustal, more so for larger alignments.[1][2]
Most modern multiple sequence alignment programs are generally accepted when presenting aligned sequences but there are few differences amongst them. The main difference between programs is the method used to align the sequences. For instance, T-COFFEE and Clustal use the progressive method while MUSCLE and MAFFT perform using the iterative method of alignment.[9] These two methods differ in their ability to handle low similarity sequences with the iterative method providing more accurate results. The other way methods differ is with their computational needs.
Originally MUSCLE had middling CPU demands in comparison to other programs but was definitely higher than the progressive methods.[1] Comparisons with modern versions of MSA programs reveal that many are quite similar in capabilities. The alignments were assessed based on their sum of pairs (SP), correctly matching two nucleotides/amino acids across two sequences, and their total columns (TC), matching columns divided by the total columns. In these cases, MUSCLE was average in its ability to maximize matching pairs and columns, being slightly worse than the ProbCons, T-Coffee, Probalign and MAFFT.[10] Outside the alignment scores, MUSCLE was less computationally demanding in both time to execute the alignment and the memory demand.
MSA Computational Properties [10]
| Program |
Average Alignment Time (sec) |
Average Memory Usage (Mb) |
Short Sequence Average SP |
Short Sequence Average TC |
Long Sequence Average SP |
Long Sequence Average TC |
| MUSCLE |
301.4 |
60.8 |
0.718 |
0.341 |
0.789 |
0.437 |
| Probalign |
1410.2 |
162.7 |
0.774 |
0.219 |
0.800 |
0.455 |
| ProbCons |
1781.7 |
192.5 |
0.763 |
0.220 |
0.831 |
0.524 |
| MAFFT |
309.0 |
231.6 |
0.767 |
0.421 |
0.803 |
0.470 |
| T-Coffee |
1964.9 |
372.2 |
0.760 |
0.407 |
0.830 |
0.520 |