From: Transcriptome, proteome and draft genome of Euglena gracilis

Large paralog gene families are present in the Euglena gracilis genome. Several orthogroups contain many E. gracilis paralogs. The phylogenetic distribution of one large orthogroup, the nucleotidylcyclase III domain-containing proteins, is shown. Lineage groupings are colour coded: gray, all eukaryotes (collapsed for clarity); red, N. gruberi; amber, B. saltans; and green, E. gracilis. Clades containing only Euglena sequences are boxed in green. Each sequence is annotated with its domain composition (colour gradient from black to teal to blue) and its number of predicted trans-membrane domains (colour gradient from red to orange to black). To obtain this phylogenetic tree, sequences with likely low coverage (less than 30% of the length of the overall alignment) were removed during alignment to avoid conflicting homology or artefact generation. The domains identified are nucleotidylcyclase III, BLUF, NIT, P-loop NTPase, HAMP and Cache1.
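
The low-coverage filter described in the caption can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes that coverage means the fraction of alignment columns in which a sequence has a residue rather than a gap, and that sequences are held as equal-length aligned strings. The function names `coverage` and `filter_low_coverage` and the toy sequences are hypothetical.

```python
# Minimal sketch of a coverage filter on a multiple sequence alignment.
# Assumption: coverage = non-gap characters / alignment length.
# All names and the toy data are illustrative, not from the source.

def coverage(aligned_seq: str, gap_chars: str = "-.") -> float:
    """Return the fraction of alignment columns occupied by a residue."""
    if not aligned_seq:
        return 0.0
    residues = sum(1 for c in aligned_seq if c not in gap_chars)
    return residues / len(aligned_seq)


def filter_low_coverage(alignment: dict[str, str],
                        min_coverage: float = 0.30) -> dict[str, str]:
    """Keep sequences covering at least min_coverage of the alignment."""
    return {
        name: seq
        for name, seq in alignment.items()
        if coverage(seq) >= min_coverage
    }


# Toy alignment of length 10 (hypothetical sequences).
toy = {
    "seq_full": "MKTAYIAKQR",   # coverage 1.0
    "seq_half": "MKTAY-----",   # coverage 0.5
    "seq_short": "MK--------",  # coverage 0.2, below the 30% threshold
}

kept = filter_low_coverage(toy)
print(sorted(kept))
```

In this toy case only `seq_short` falls below the 30% threshold and is dropped. In practice the alignment would be re-computed or trimmed after removing such sequences.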

