We sequenced the mitochondrial genome of Liriodendron tulipifera, the first from the large (>10,000 species) magnoliid lineage, to fill an important phylogenetic gap and provide an outgroup for comparison to the previously studied monocot and eudicot lineages. The phylogenetic position of Liriodendron allowed us to polarize changes in monocots and eudicots, leading to a more detailed understanding of the patterns of loss of RNA editing, gains of plastid tRNAs, and gene cluster conservation across flowering plants. These efforts were bolstered by the fact that the Liriodendron mitochondrial genome evolves exceptionally slowly in terms of gene sequence, content and order, allowing an unprecedented look into the early evolution of plant mitochondrial genomes. Thus, in many striking ways, Liriodendron has a “fossilized” mitochondrial genome, having undergone remarkably little change over the last ∽100 million years.
Insights into the acquisition of plastid-derived tRNAs
The evidence presented here points to a different evolutionary history of mitochondrial plastid-derived tRNAs in angiosperms than previously postulated [16, 54], generally pushing back their origins earlier in flowering plant evolution (Figure 2). Whereas Wang et al. posited a recent origin of trnP(TGG)-cp on the branch leading to Nicotiana, its presence in monocots, eudicots and now magnoliids (Figure 2) suggests that its acquisition likely predated the common ancestor of these three lineages. Similarly, the presence of trnD(GTC)-cp in Liriodendron likely pushes the origin of that tRNA back from the common ancestor of eudicots to sometime after the gymnosperm/angiosperm divergence. It should be noted, however, that parallel gains in the magnoliids and eudicots is possible in this case as well. The small size and conserved nature of tRNA genes is such that these competing hypotheses are difficult, if not impossible, to test with phylogenetic analysis.
We know from other angiosperm mitochondrial genomes that sequence transfer from the plastid genome is frequent on an evolutionary timescale [14, 55] and that on occasion these transfer events led to the gain of functional tRNAs, based on their widespread conservation across angiosperms . However, the timing of functional transfers has been unclear. Due to its slow rates of gene loss, sequence change and gene-cluster fragmentation, Liriodendron may have retained one or more regions of plastid DNA that date back to the original sequence transfers that permanently seeded some of the plastid tRNAs found across angiosperm mitochondrial genomes (Figures 2 and 3). Other interpretations are possible, however. For example, that Liriodendron and most eudicots have trnD(GTC)-cp (Figures 2 and 3A) could be due to independent parallel gains, once in a magnoliid ancestor and once early in eudicot evolution.
Part of our reasoning that the plastid-derived sequences in Figure 3A, B may be remnants of early functional plastid tRNA transfers is that the tRNA appears to be more strongly conserved than the flanking regions that were simultaneously transferred, suggesting that purifying selection has preserved the tRNA while the surrounding noncoding sequence deteriorated. The fragment in Figure 3A appears to be the oldest, having accumulated 15% pairwise sequence divergence. Given the inferred low rates of sequence evolution in both the mitochondrial and plastid genomes of Liriodendron, its transfer may well date to early in angiosperm evolution. We hesitate, however, to estimate the actual timing of the transfer event for several reasons. The current low substitution rates in the magnoliid lineage are possibly lower than rates were earlier in angiosperm evolution, precluding the use of a strict molecular clock. The transferred regions contain plastid sequence with intergenic DNA, as well as synonymous and nonsynonymous sites, which are under different constraints in the plastid relative to the mitochondrial genome, further complicating fragment-wide divergence time estimates. The plastid-derived fragment containing trnP(TTG)-cp (Figure 3B) appears to be more recently transferred than the fragment in Figure 3A, given the lower overall divergence from its cognate plastid sequence. In this case, however, more of the fragment consists of protein-coding genes, which would likely decrease the overall rate of pairwise sequence divergence following the transfer event.
Our interpretation of the time since transfer may also be complicated by the possibility that concerted evolution homogenizes homologous plastid and mitochondrial sequences . For example, it is possible that a divergent, plastid-derived sequence fragment containing trnN(GTT)-cp (Figure 3C) was already present in the Liriodendron mitochondrial genome from an earlier transfer, and the short stretch containing the tRNA was “updated” via gene conversion between it and a reintroduced copy of the same stretch of plastid DNA, restoring the sequence identity between the plastid and mitochondrial copies. This concerted-evolution mechanism was postulated to explain patterns of sequence divergence in a stretch of plastid-derived sequence in the mitochondrial genomes of Oryza and Zea, where within-species plastid/mitochondrial divergence is less than between species in the mitochondrial region, despite the putatively shared origin of the transferred fragment . If mitochondrial and plastid copies are evolving in concert, the nearly identical plastid-derived fragment in Figure 3C could be much older than suggested by the high sequence similarity.
Low mitochondrial and plastid substitution rates in magnoliids
The mitochondrial genes in Liriodendron evolve at an exceptionally low rate, accumulating just 0.035 nucleotide substitutions per silent site per billion years. As a point of reference, using the same computational approach as employed for the plant mitochondrial rate analysis, we aligned all 13 protein coding genes from the full mitochondrial genomes of a human , a Neanderthal , a more distantly related Denisova hominin , and a chimpanzee outgroup . We calculated an absolute silent substitution rate of 69.5 ssb in humans, using the relevant divergence dates from Krause et al.. The human mitochondrial substitution rate is more than 5,000 times faster than Magnolia and 2,000 times faster than Liriodendron. Stated differently, the average amount of silent site mitochondrial divergence accrued over the course of a single generation (25 years) in humans would take roughly 50,000 years in Liriodendron and 130,000 years in Magnolia.
Mower et al. characterized mitochondrial silent substitution rates across approximately 600 plant species with datasets of one to five genes and also found that Silene noctiflora is the fastest . The slowest evolving mitochondrial genome reported by Mower et al. was Cycas at 0.02 +/− 0.1 ssb, similar to the Liriodendron rate reported here, and greater than our estimate for Magnolia using an 18-gene concatenated alignment. To our knowledge, the estimated rate of 0.013 ssb in Magnolia is the lowest reported genome-wide substitution rate in any organism, but this conclusion is tempered by the associated error in our estimates. For Magnolia and Liriodendron, the 95% likelihood confidence interval about the ssb estimate due to errors in branch specific synonymous substitution estimation was 0.003 to 0.034 and 0.015 to 0.065, respectively (Additional file 1: Table S3). In addition, our estimates rely heavily on fossil-calibrated divergence times, which add an additional source of error (for example, see [30, 31, 62, 63]). We used two widely accepted fossils within magnoliids [64, 65], which together should provide a relatively accurate divergence time estimate for the relevant Liriodendron–Magnolia split. The 95% highest probability density interval for this split was 94.9 to 102.2 mya, and the median value we used for our estimate was 97.4 mya (see Methods). Therefore, in our study, errors in absolute rate estimation for Liriodendron and Magnolia are less influenced by divergence time uncertainty than by error in the likelihood estimate of the branch-specific synonymous substitution rates.
We found that mitochondrial and chloroplast substitution rates were roughly correlated in the taxa examined here (Figure 4), an observation deserving of more detailed follow-up study. Although it is too early to extrapolate too much, growth habit (annual vs. perennial, shrub vs. tree) might underlie this pattern . Generation time and rates of synonymous substitution are generally inversely correlated in plants (for review, see ). The driving forces behind this relationship are unclear, however, as plants do not have a dedicated germ line, so generation time and number of reproductive cell divisions per year are not as closely linked as they are in animals. Differences between annuals and perennials, in terms of speciation rate and/or metabolism, could underlie the generation time substitution rate relationship , and might be expected to similarly influence each of the plant’s three genomes. As nuclear genomic data become available for a broader diversity of plants, it will be interesting to determine whether this correlation extends across all three genetic compartments.
Our data also recovered a greater ratio of plastid to mitochondrial silent substitution rates than was found previously [9, 13, 51, 52]. Our estimate benefited from considerably more sequence data and much broader taxon sampling than previous studies, which might account for the discrepancy. In addition, given the 5,000-fold and 40-fold range in mitochondrial and plastid substitution rates, respectively, that we found, it appears that taxon sampling can have a large effect on average inferred ratios. “High-rate” mitochondrial and plastid lineages do not always have proportionally elevated rates in both organelle genomes , leading to extreme plastid–mitochondrial rate relationships (for example, 0.08 in Silene conica) (Figure 4). Gene-to-gene variation in mitochondrial  and plastid [48, 53] silent substitution rates are common as well, underscoring the need to consider many mitochondrial and plastid genes for an accurate determination of relative rates.
Retention of RNA editing sites lost in many lineages
The overall high level of C-to-U RNA editing in Liriodendron, along with its large number of unique edit sites, add further support for a model of relatively high levels of RNA editing in the ancestral angiosperm mitochondrial genome (approximately 700 sites in protein-coding genes), followed by various degrees of subsequent loss in different lineages (Figure 5) [26, 27]. RNA editing data from an angiosperm from an “early diverging” lineage, such as Amborella or Nymphaea, would help polarize the degree of editing loss in Liriodendron, which looks to be exceptionally low based on these data. There is no clear adaptive explanation for the emergence and maintenance of RNA editing in plants [25, 68, 69], but it may have emerged through neutral processes, only to become essential following substitutions at functionally important cytosines that required post-transcriptional editing to produce the conserved amino acid  – a hypothesis falling under the category of ‘constructive neutral evolution’ [71, 72]. Consistent with this model, most edit sites change the translated amino acid sequence [21, 73], a pattern underscored in Liriodendron, in which 82% of the edits were at nonsynonymous sites. While the emergence of RNA editing may be due to neutral processes, comparative work has found support for selection favoring loss of editing over time [26, 27], and it is likely that such selection would be stronger at nonsynonymous sites, where unreliable editing would be most deleterious. Consistent with this hypothesis, we found the ratio of loss to gain was 14:1 at nonsynonymous sites compared to 2:1 at silent sites across angiosperms (Figure 5).
Conservation of ancient gene clusters
Although overall gene order is highly variable among angiosperm mitochondrial genomes , even between closely related taxa , the results here underscore countervailing constraints on short clusters of gene linkage operating across angiosperm evolution. While some of the conserved clusters (for example, rrnS–rrn5 and rpl2–rps19–rps3–rpl16) date back to the original bacterial ancestor of mitochondria , others are unique to angiosperms, such as the atp8–cox3–sdh4 and rps13–nad1.x2.x3 clusters. The five clusters shared by Liriodendron and Cycas most likely were present early in seed plant evolution, and we can look outside of seed plants to infer which of these were also present early in vascular plant evolution as well. A comparative gene order analysis showed Huperzia to have experienced fewer rearrangements relative to bryophytes than any other vascular plant mitochondrial genome , making it a meaningful comparison for vascular plant gene order conservation. Of the five clusters shared by Cycas and Liriodendron, three are shared with Huperzia and two are not. All of the gene clusters found in Liriodendron to the exclusion of Cycas are also lacking in Huperzia, suggesting such clusters are indeed angiosperm-specific.
Transcription is likely an important constraint, whereby adjacent genes share a single promoter and are co-transcribed, as was shown for three conserved gene clusters in Nicotiana. This could explain why all of the clusters conserved across angiosperms involve genes encoded on the same strand. Interestingly, three of the clusters inferred to be present in the ancestral angiosperm involve internal fragments of trans-spliced genes (Figure 6), which may, upon further examination, provide clues as to the regulation and reconstruction of full-length transcripts from trans-spliced genes.
The Liriodendron mitochondrial genome appears to have been subject to both low silent-substitution rates and infrequent gene-cluster fragmentation relative to sequenced eudicot and monocot mitochondrial genomes (Figures 4 and 6). However, levels of silent substitution and gene cluster fragmentation do not necessarily covary across all angiosperms in our study. For example, one of the taxa with a relatively high silent substitution rate (>30 × faster than Liriodendron), Cucurbita, has 11 conserved gene clusters compared to 12 in Liriodendron, whereas Zea, with a relatively slower rate (10 × faster than Liriodendron), has only five. In angiosperm plastid genomes, there is support for a positive relationship between rates of structural and sequence evolution , but this relationship is not universal [48, 53]. In Silene, for example, although rates of plastid gene order rearrangement are higher in species with higher substitution rates, many of these substitutions occur at nonsynonymous sites and so are not easily explained by a simple, mutationally-driven model .