Fig. 4 | BMC Biology

Fig. 4

From: A new lineage of non-photosynthetic green algae with extreme organellar genomes

Distribution of repeats in organellar genomes of Leontynka pallida. a The most abundant imperfect palindromes and their characteristics. The “Spacer” corresponds to the presumed loop separating the palindromic regions that presumably pair to form a stem structure. The “Mismatch” column indicates the number of positions that deviate from a perfect palindrome. The occurrence of the repeats is given for the plastome (ptDNA) and mitogenome (mtDNA), with the number of cases indicated for the whole organellar genome, and separately for exons in protein-coding genes. In two cases of mitogenome repeats, two variants, a shorter and a longer one, are considered, with the latter indicated in parentheses. b Distribution of the imperfect palindrome AAGCCAGC|NNN|GCTGACTT and its most common variants within exons of the plastome. The numbers show the abundance of the given repeat in the direct/reverse complement orientation (relative to the coding sequence). In the case of “variant 1”, the repeat has the same sequence in both directions, so only one number per gene is presented. Note that the variants considered are not mutually exclusive alternatives, but correspond to nested categories with a different degree of relaxation of the sequence pattern. c Characterisation of repeats from b and their abundance in various regions of the plastome and the mitogenome of L. pallida as well as other plastomes of Chlamydomonadales deposited in NCBI databases. The numbers show the abundance of the given repeat in the direct/reverse complement orientation (relative to the coding sequence in the case of exons, or relative to the DNA strand corresponding to the reference organellar sequence in the case of the values for the whole organellar genome). d Occurrence of the “variant 8” repeat (translated in the reading frame +0 as KDKPANLTS) in a variable region of the ribosomal protein Rps8 (detail; the full alignment is available as Additional file 1: Fig. S9).
e Occurrence of the “variant 4” repeat in protein-coding sequences and its translation for all six reading frames. The category of rare codons (“rare 2%”) is defined as the sum of the least used codons, together representing less than 2% of all codons in the plastome (100% = 19,899 codons); the categories of the 4%, 10%, and 20% rarest codons and that of more than 50% of the most frequent codons are defined following the same convention (listed in Additional file 3: Table S5). The numbers indicated for “codon usage” correspond to the minimum number of the codons of the respective category present in the respective reading frame, with the “max X” numbers indicating the maximum number of such codons, depending on the actual nucleotide sequence of the degenerate “variant 4” repeat. Note that some rare codons (2–4% category) are not observed in the actual L. pallida plastid gene sequences, although their presence would be theoretically possible (see the asterisks). The column “Rare AA” indicates the occurrence of amino acids belonging to the category of amino acids generally rarely used in plastome-encoded proteins in L. pallida (see Additional file 3: Table S6). The occurrence of the repeat variants indicated for coding sequences (CDS) corresponds to their occurrence as counted at the nucleotide level, whereas the occurrence in proteins is counted at the amino acid sequence level (and may be higher due to different nucleotide sequences encoding the same amino acid sequence). The analysis of intraexonic repeat insertions is discussed in more detail in Additional file 2: Note S4.
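
Two definitions used in the legend can be illustrated with a minimal Python sketch: the “Mismatch” count of an imperfect palindrome (positions where the second arm deviates from the reverse complement of the first) and a cumulative rare-codon category (the least used codons whose combined share stays below a threshold). This is not the authors' pipeline. The function names, the tie-breaking rule, and the toy codon counts are assumptions made for illustration only; the actual codon categories are those listed in Additional file 3: Table S5.

```python
# Illustrative sketch only; names and toy data are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def palindrome_mismatches(left_arm: str, right_arm: str) -> int:
    """Count positions where right_arm deviates from a perfect palindrome,
    i.e. from the reverse complement of left_arm."""
    if len(left_arm) != len(right_arm):
        raise ValueError("arms must have equal length")
    expected = reverse_complement(left_arm)
    return sum(a != b for a, b in zip(expected, right_arm))


def rarest_codons(counts: dict, threshold: float) -> list:
    """Return the least used codons whose combined share of all codons
    stays strictly below threshold (e.g. 0.02 for a "rare 2%" category).
    Assumption: ties in usage are broken alphabetically."""
    total = sum(counts.values())
    selected, running = [], 0
    for codon, n in sorted(counts.items(), key=lambda kv: (kv[1], kv[0])):
        if running + n >= threshold * total:
            break
        selected.append(codon)
        running += n
    return selected


# The repeat from panel b, split into left arm, spacer, and right arm.
left, spacer, right = "AAGCCAGC|NNN|GCTGACTT".split("|")
print(palindrome_mismatches(left, right))  # 1

# Toy codon counts (NOT the L. pallida values).
toy_counts = {"AAA": 500, "GAA": 300, "TTA": 150, "TGC": 35, "AGG": 10, "CGG": 5}
print(rarest_codons(toy_counts, 0.02))  # ['CGG', 'AGG']
```

In the toy example, CGG and AGG together make up 1.5% of the 1,000 codons, while adding TGC would raise the share to 5%, so only the first two fall into the 2% category.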
