Step | Number | Script |
---|---|---|
Download from GenBank | [I] | proseqco |
Standardize headers | [a.I], [b.I] | header_standardizer |
Split sequences to single genes | [b.II] | multiple_sequence_splitter |
Check strand polarity and sequence similarity | [b.III] | checking_seq |
Choose longest sequence per species and gene | [a.IV], [b.IV] | choose_longest_seq |
Translate coding mitochondrial sequences from nucleotides to amino acids | [b.V] | dna2aa |
Delete groups of orthologs with three or fewer species | [II], [III], [XIII] | small_groups_deleter |
Delete species with only one sequence | [III], [XIII] | taxon_deleter |
Backtranslate coding mitochondrial sequences from amino acids to nucleotides | [VI] | aa2dna |
Mask gappy regions in alignment | [VII] | gap_killer |
Select maximum clique of overlapping sequences | [IX], [X] | minimum_sequence_overlap |
Filter out compositional heterogeneity | [XI], [XII] | nucleotide_chi |
Prune genera to best represented species | [XIV] | prune_genera |
Select largest group of species that overlap in at least one group of orthologs | [XV] | reduce2leading_gene |
Concatenate alignments | [XVI] | concatenator |
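
To make two of the filtering steps above more concrete, here is a minimal Python sketch of "choose longest sequence per species and gene" followed by "delete groups of orthologs with three or fewer species". It is not the code of `choose_longest_seq` or `small_groups_deleter`. The function names `choose_longest` and `drop_small_groups`, the `(species, gene, sequence)` tuple layout, the `min_species` parameter, and the choice to measure length without gap characters are all assumptions made for illustration.

```python
from collections import defaultdict


def ungapped_length(seq):
    # Assumption: length is measured without alignment gap characters.
    return len(seq.replace("-", ""))


def choose_longest(records):
    """Keep the longest sequence for each (species, gene) pair.

    records: iterable of (species, gene, sequence) tuples (hypothetical layout).
    Ties keep the sequence that was seen first.
    """
    best = {}
    for species, gene, seq in records:
        key = (species, gene)
        if key not in best or ungapped_length(seq) > ungapped_length(best[key]):
            best[key] = seq
    return best


def drop_small_groups(best, min_species=4):
    """Drop genes (ortholog groups) covered by three or fewer species."""
    species_per_gene = defaultdict(set)
    for species, gene in best:
        species_per_gene[gene].add(species)
    keep = {g for g, sp in species_per_gene.items() if len(sp) >= min_species}
    return {key: seq for key, seq in best.items() if key[1] in keep}


# Toy input: COX1 is covered by four species, ND1 by only two.
records = [
    ("Homo_sapiens", "COX1", "ATGC"),
    ("Homo_sapiens", "COX1", "ATGCATGC"),
    ("Pan_troglodytes", "COX1", "ATGCAT"),
    ("Mus_musculus", "COX1", "ATG"),
    ("Rattus_norvegicus", "COX1", "ATGCA"),
    ("Homo_sapiens", "ND1", "ATGAAA"),
    ("Mus_musculus", "ND1", "ATGA"),
]

best = choose_longest(records)
filtered = drop_small_groups(best)
```

In this toy input, the longer of the two `Homo_sapiens` COX1 sequences is retained, and the ND1 group is removed because only two species carry it.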