Basal jawed vertebrate phylogeny inferred from multiple nuclear DNA-coded genes.

BACKGROUND
Phylogenetic analyses of jawed vertebrates based on mitochondrial sequences often result in confusing inferences which are obviously inconsistent with generally accepted trees. In particular, in a hypothesis by Rasmussen and Arnason based on mitochondrial trees, cartilaginous fishes have a terminal position in a paraphyletic cluster of bony fishes. No previous analysis based on nuclear DNA-coded genes could significantly reject the mitochondrial trees of jawed vertebrates.


RESULTS
We have cloned and sequenced seven nuclear DNA-coded genes from 13 vertebrate species. These sequences, together with sequences available from databases including 13 jawed vertebrates from eight major groups (cartilaginous fishes, bichir, chondrosteans, gar, bowfin, teleost fishes, lungfishes and tetrapods) and an outgroup (a cyclostome and a lancelet), have been subjected to phylogenetic analyses based on the maximum likelihood method.


CONCLUSION
Cartilaginous fishes have been inferred to be basal to other jawed vertebrates, which is consistent with the generally accepted view. The minimum log-likelihood difference between the maximum likelihood tree and trees not supporting the basal position of cartilaginous fishes is 18.3 +/- 13.1. The hypothesis by Rasmussen and Arnason has been significantly rejected with the minimum log-likelihood difference of 123 +/- 23.3. Our tree has also shown that living holosteans, comprising bowfin and gar, form a monophyletic group which is the sister group to teleost fishes. This is consistent with a formerly prevalent view of vertebrate classification, although inconsistent with both of the current morphology-based and mitochondrial sequence-based trees. Furthermore, the bichir has been shown to be the basal ray-finned fish. Tetrapods and lungfish have formed a monophyletic cluster in the tree inferred from the concatenated alignment, being consistent with the currently prevalent view. It also remains possible that tetrapods are more closely related to ray-finned fishes than to lungfishes.


Background
The evolutionary relationship among jawed vertebrates is currently a controversial issue. Cartilaginous fishes are traditionally considered to be ancestral to other jawed vertebrates (Figure 1A). Arnason and colleagues challenged the traditional view, based on phylogenetic analyses of complete mitochondrial sequences from several vertebrates [1][2][3]. According to their mitochondrial tree (Figure 1B), cartilaginous fishes have a terminal position in the phylogeny of bony fishes (coelacanth, lungfishes, bichirs, teleost fishes and other ray-finned fishes), implying that bony fishes are ancestral to cartilaginous fishes. Furthermore, the mitochondrial tree shows a basal split between tetrapods and other jawed vertebrates.
Phylogenetic analyses based on mitochondrial sequences, however, often result in misleading trees when distantly related vertebrates are compared [4][5][6][7]. Several groups have attempted to obtain robust phylogenetic trees of jawed vertebrates based on nuclear DNA-coded genes. In the LSU rRNA tree by Zardoya and Meyer [6], the basal position of cartilaginous fishes is not significantly supported; the bootstrap probabilities are 72%, 68% and 74% for the maximum parsimony (MP) method, the neighbor joining (NJ) method and the maximum likelihood (ML) method, respectively. On the basis of presence or absence of insertions or deletions within conserved sequences [8], Venkatesh et al. [9] claimed to have found robust molecular evidence (molecular synapomorphy) against the mitochondrial tree [1][2][3]. However, their tree is basically an unrooted tree of major groups of jawed vertebrates, as pointed out by Dimmick [10], because none of the molecular synapomorphies they found included an outgroup (cyclostomes or lancelets). Apart from the position of bichir, the tree by Venkatesh et al. is equivalent to that by Rasmussen et al., when compared as unrooted trees.
Martin [11] analyzed multiple nuclear DNA-coded genes but could not refute the hypothesis by Rasmussen et al. [1][2][3]. Hedges [12] analyzed 10 nuclear DNA-coded genes from two cyclostomes and three jawed vertebrates, and concluded that cartilaginous fishes are basal in the jawed vertebrate tree. Takezaki et al. [13] confirmed this finding based on a comparison of 31 nuclear DNA-coded genes. Because only a single bony fish lineage represented by teleost fishes is included in these analyses, it remains possible that other bony fishes (lungfishes or bichir) are more deeply branching than cartilaginous fishes are. If this is the case, one cannot refute the hypothesis by Rasmussen and Arnason [3] that bony fishes are ancestral to cartilaginous fishes. The phylogenetic position of bichir is particularly important; bichir is often inferred to be the outgroup to all other jawed vertebrates in mitochondrial trees when amphibian data are included in the comparison (data not shown).
The phylogenetic relationship amongst teleost fishes and two holosteans is also controversial. Living holosteans, comprising bowfin and gar, are possible sister groups of teleost fishes [14,15]. All three possible topologies (Figure 2A, 2B, 2C) have been proposed by morphologists to date (see references cited in [15] and [16]). Partial mitochondrial and LSU rRNA data, on the other hand, do not support any of these morphology-based trees at a statistically significant level [17,18]. Venkatesh et al. [9] noted the possibility of an alternative tree (Figure 2D) based on a molecular synapomorphy. Inoue et al. [16] recently reported that this tree was supported by complete mitochondrial sequences. Mitochondrial sequences, however, may not be suitable for inferring phylogenetic relationships among such distantly related groups [6,19].
To test the mitochondrial trees at a statistically significant level, it is therefore necessary to perform phylogenetic analyses based on nuclear DNA-coded genes. There is, however, a possible error from paralogous comparisons when a nuclear DNA-coded gene tree is used for inferring the phylogenetic relationship of organisms. To avoid this, we selected basically single-copy genes, such as those encoding enzymes in glycolysis, which evolve at roughly constant rates over a wide range of animal taxa [20,21]. Since their evolutionary rates are generally low, no single gene amongst them has detailed phylogenetic information. Thus a large
Figure 1. Two hypotheses on jawed vertebrates.
We have cloned and sequenced seven nuclear DNA-coded genes comprising ~3,000 amino acid residues in total, from eleven jawed vertebrate species, two cyclostomes (lamprey and/or hagfish) and a lancelet. These amino acid sequences, together with those available from the DDBJ/EMBL/GenBank databases, were subjected to phylogenetic analyses and statistical tests based on the ML method. We report here that the nuclear DNA-coded gene tree differs sharply from the mitochondrial tree on the two phylogenetic problems of jawed vertebrates; our tree supports the deepest position of cartilaginous fishes in jawed-vertebrate phylogeny, and the monophyly of holosteans.

Results and discussion
Phylogenetic tree inference
Teleost fishes have two TPI genes (TPI-A and TPI-B) [22]; A. baerii has two ALDa genes (AB111402 and AB111403); mammals have two PGK genes (M11968 and X05246 for human, M15668 and M17299 for mouse); and the mouse has two G6PD genes (Z84471 and AF326207). Preliminary phylogenetic analyses show that each of these four gene pairs multiplied within the respective taxonomic group. To avoid the 'long branch attraction' (LBA) artifact [23], the slowly evolving counterpart for each of these gene pairs was selected for phylogenetic inference: O. latipes ortholog of TPI-B, AB111402 for A. baerii ALDa, M11968 for human PGK, M15668 for mouse PGK and AF326207 for mouse G6PD. Cyclostomes have muscle and non-muscle types of aldolase (ALD) genes [24]. Although the relationship between these two cyclostome ALD genes and three ALD genes (a, b and c) from jawed vertebrates is not clearly resolved, each of the jawed-vertebrate ALD genes was inferred to be orthologous [25]. The muscle-type ALD gene of hagfish, the non-muscle-type ALD gene of hagfish and the non-muscle-type ALD gene of lamprey were used as outgroups for ALDa, ALDb and ALDc of jawed vertebrates, respectively.
For each of the seven proteins, the amino acid sequences from 15 vertebrate species listed in Materials and Methods have been aligned, and phylogenetic tree analyses have been carried out for regions comprising 317 amino acid residues (aa) in ALDa, 316 aa in ALDb, 317 aa in ALDc, 463 aa in G6PD, 940 aa in GAG, 383 aa in PGK and 206 aa in TPI, for each of which unambiguous alignment is possible. The total data set of 2,942 aa was subjected to phylogenetic analyses based on the GAMT program [26] as described in Materials and Methods.
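The bookkeeping behind the concatenated data set can be illustrated with a minimal sketch. The region lengths are those quoted in the text; the function name `concatenate`, the dictionary layout and the toy sequences are assumptions made only for illustration and do not describe the actual pipeline used with GAMT.

```python
# Lengths (aa) of the unambiguously aligned regions, as quoted in the text.
REGION_LENGTHS = {
    "ALDa": 317, "ALDb": 316, "ALDc": 317, "G6PD": 463,
    "GAG": 940, "PGK": 383, "TPI": 206,
}

# The seven regions add up to the size of the concatenated data set.
TOTAL_LENGTH = sum(REGION_LENGTHS.values())


def concatenate(alignments, gene_order):
    """Join per-gene aligned sequences species by species.

    alignments: {gene: {species: aligned_sequence}} (hypothetical layout).
    Only species present in every gene are kept, so that all concatenated
    sequences have the same length.
    """
    shared = set.intersection(*(set(alignments[g]) for g in gene_order))
    return {
        sp: "".join(alignments[g][sp] for g in gene_order)
        for sp in sorted(shared)
    }


# Toy example with made-up two-residue "alignments".
toy = {
    "ALDa": {"human": "MK", "mouse": "MR"},
    "TPI": {"human": "AP", "mouse": "AS", "fugu": "AT"},
}
toy_concat = concatenate(toy, ["ALDa", "TPI"])
```

Restricting the output to species shared by all genes is one possible design choice; a taxon with a missing gene, such as the chimaera without ALDb, would otherwise need gap padding or a separate, shorter alignment.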
We have selected the candidate topologies, a set of topologies with log-likelihood values close to that of the ML tree, from the seven protein data sets and from the concatenated alignment, as described in Materials and Methods. The numbers of candidate topologies selected are 379 from the ALDa data set, 91 from the ALDb data set, 1,121 from the ALDc data set, 665 from the G6PD data set, 11 from the GAG data set, 652 from the PGK data set, 2,860 from the TPI data set and 103 from the concatenated alignment. Excluding identical topologies, a total of 5,801 topologies were subjected to further analyses as the candidate topologies. For each candidate topology, the likelihood value of totalml and that of concatenated alignment were computed. Figure 3 shows the ML tree inferred from the concatenated alignment. This tree strongly supports the basal position of cartilaginous fishes and the monophyly of holosteans, although individual trees inferred from each of the seven proteins did not give statistically significant results, probably because of the limited phylogenetic information held in a single gene (data not shown). Tables 1 and 2 show the ML topology and some topologies with large likelihood values inferred from concatenated alignment analysis and totalml analysis, respectively. Each table includes only
Figure 2. Four hypotheses of phylogenetic relationship among ray-finned fishes.
Table 2) differs from that in concatenated alignment analysis (Figure 3 and topology a in Table 1).
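The pooling of candidate topologies and the totalml-style score can be sketched as follows. This is a minimal illustration under stated assumptions: topologies are represented by already-canonicalized strings, the totalml value of a topology is taken to be the sum of log-likelihoods estimated separately for each gene, and all names and numbers in the example are hypothetical.

```python
def pool_candidates(per_dataset):
    """Merge candidate topologies from all data sets, dropping duplicates.

    per_dataset: {dataset_name: iterable of canonical topology strings}.
    """
    pooled = set()
    for topologies in per_dataset.values():
        pooled.update(topologies)
    return pooled


def total_log_likelihood(per_gene_lnl, topology):
    """Sum the separately estimated per-gene log-likelihoods of a topology.

    per_gene_lnl: {gene: {topology: log-likelihood}}.
    """
    return sum(lnl[topology] for lnl in per_gene_lnl.values())


def best_topology(per_gene_lnl, candidates):
    """Return the candidate with the largest summed log-likelihood."""
    return max(candidates, key=lambda t: total_log_likelihood(per_gene_lnl, t))


# Hypothetical toy input: two genes, three candidate topologies.
candidates = pool_candidates({"geneA": ["t1", "t2"], "geneB": ["t2", "t3"]})
toy_lnl = {
    "geneA": {"t1": -100.0, "t2": -101.0, "t3": -110.0},
    "geneB": {"t1": -205.0, "t2": -200.0, "t3": -202.0},
}
winner = best_topology(toy_lnl, candidates)
```

In this toy input no single gene's best topology is guaranteed to win overall; the summed score decides, which mirrors why the per-gene and combined analyses can rank topologies differently.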

Statistical tests
Topologies a and b in Tables 1 and 2 differ considerably from other topologies in their bootstrap probabilities and P-values. These two topologies have approximately equal likelihood values in each of the totalml and concatenated alignment analyses, although the ML tree in concatenated alignment analysis is the second best tree in totalml analysis, and vice versa.
In addition to the bootstrap probability and the KH test, a test based on Bayesian posterior probability (BPP) has been carried out. The resulting BPP values are self-contradictory; topology a, which was the best topology in concatenated alignment analysis (Table 1), is significantly rejected in totalml analysis (Table 2; the BPP value was 0.005). Thus the BPP test might be too liberal, as already pointed out [27,28]. The approximately unbiased (AU) test has also been carried out for reference.
Focusing on some phylogenetic problems, the support values for each competing hypothesis were computed based on the intact bootstrap probability (BP) analysis (see Materials and Methods), the TREE-PUZZLE (TP) program [29] and the MRBAYES [30] program (Table 3). In addition, the RELL BP value, the BPP value and the P-values by the KH test and the AU test, which are based on the concatenated alignment analysis described above, are also shown. The intact BP value is largely in accord with the RELL BP value, whereas low support values are observed in the TP method. This may be an artifact derived from the limited topology searches in the TP method, because the same result as shown in Table 1 was obtained when the candidate topologies described above were subjected to the TREE-PUZZLE program with the 'user defined trees' option.
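The idea behind the RELL BP values can be illustrated with a short sketch: instead of re-optimizing each tree on every bootstrap replicate, the per-site log-likelihoods computed once on the original alignment are resampled, and each replicate is credited to the topology with the largest resampled sum. The function name, the replicate count and the toy input are assumptions for illustration only and are not the settings used in the analysis.

```python
import random


def rell_bootstrap(site_lnl, n_replicates=1000, seed=0):
    """Approximate bootstrap probabilities by resampling site log-likelihoods.

    site_lnl: list of rows, one per topology; each row holds the per-site
    log-likelihoods of that topology over the same alignment columns.
    Returns the fraction of replicates won by each topology.
    """
    rng = random.Random(seed)
    n_topologies = len(site_lnl)
    n_sites = len(site_lnl[0])
    wins = [0] * n_topologies
    for _ in range(n_replicates):
        # Draw alignment columns with replacement.
        sample = [rng.randrange(n_sites) for _ in range(n_sites)]
        totals = [sum(row[i] for i in sample) for row in site_lnl]
        best = max(range(n_topologies), key=totals.__getitem__)
        wins[best] += 1
    return [w / n_replicates for w in wins]


# Toy input: topology 0 is better at every site, so it wins every replicate.
toy_bp = rell_bootstrap([[-1.0] * 50, [-2.0] * 50], n_replicates=200)
```

Because only sums are recomputed, this resampling is cheap enough to apply to thousands of candidate topologies, which is the usual motivation for RELL over a full bootstrap.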

Cartilaginous fishes have a basal position among jawed vertebrates
Cartilaginous fishes are thought to be ancestral to other jawed vertebrates in the traditional view (Figure 1A). In contrast, Rasmussen and Arnason [1,2] and Arnason et al. [3] pointed out another possibility that bony fishes are ancestral to cartilaginous fishes (Figure 1B). The present results strongly support the traditional view as shown in Figure 3 and Table 3. The bootstrap probabilities of the topologies having a basal position of cartilaginous fishes totaled 88.2% and 87.8% in concatenated alignment analysis and totalml analysis, respectively. The minimum log-likelihood difference between the ML tree and trees not supporting a basal split between cartilaginous fishes and remaining jawed vertebrates was 18.3 ± 13.1 (P-value = 0.09) and 15.3 ± 12.7 (P-value = 0.12), in concatenated alignment analysis and totalml analysis, respectively. The minimum log-likelihood difference between the ML tree and that supporting the bony fish origin of cartilaginous fishes was 123 ± 23.3 (P-value < 0.01) and 137 ± 29.6 (P-value < 0.01) in concatenated alignment analysis and totalml analysis, respectively, providing strong evidence against the hypothesis by Arnason's group [1][2][3]. When the lancelet (a distant outgroup) sequences are excluded from the analysis, the minimum log-likelihood difference between the ML tree and trees that support their hypothesis was 122 ± 25.9, still being statistically significant.
Figure 3. The maximum likelihood tree inferred from the concatenated amino acid sequences (2,942 residues) of seven proteins. Reliability index [26] and the bootstrap probability for each branch are indicated before and after a slant, respectively. This tree corresponds to topology a in Tables 1 and 2. Topology b in Tables 1 and 2
According to the phylogenetic analysis based on mitochondrial sequences, however, all topologies consistent with the present analysis are significantly rejected (P-value < 0.01). This conflicting result may be due to the incompleteness of phylogenetic information retained in the mitochondrial sequences; the amino acid composition of mitochondrial DNA-coded proteins is highly biased toward hydrophobic residues and thus multiple and reverse substitutions may occur very frequently [4]. In addition, the evolutionary rates of mitochondrial sequences often differ greatly for different lineages; the mitochondrial sequences of most tetrapods evolve more rapidly than those of fishes [31,32]. These evolutionary features characteristic of mitochondrial sequences might result in the LBA artifact [23].
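The log-likelihood differences quoted in this section are reported as a difference, its standard error and a P-value. A minimal sketch of how such numbers relate is given below, assuming the standard Kishino-Hasegawa variance estimate from site-wise differences and a one-sided normal approximation for the P-value. This is an illustration only; the P-values in the text come from the authors' own test procedure and need not match this approximation exactly.

```python
import math


def kh_statistic(site_lnl_a, site_lnl_b):
    """Log-likelihood difference between two trees and its standard error.

    Both arguments are per-site log-likelihoods over the same sites.
    The variance of the summed difference is estimated as
    n / (n - 1) * sum((d_i - mean(d))**2).
    """
    diffs = [a - b for a, b in zip(site_lnl_a, site_lnl_b)]
    n = len(diffs)
    delta = sum(diffs)
    mean = delta / n
    variance = n / (n - 1) * sum((d - mean) ** 2 for d in diffs)
    return delta, math.sqrt(variance)


def one_sided_p(delta_lnl, std_err):
    """One-sided P-value under a normal approximation (an assumption here)."""
    z = delta_lnl / std_err
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Hypothetical site-wise values, chosen only to exercise the functions.
toy_delta, toy_se = kh_statistic([1.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 0.0])

# A difference of 123 with a standard error of 23.3 is more than five
# standard errors from zero, whereas 18.3 with 13.1 is under two.
p_strong = one_sided_p(123.0, 23.3)
p_weak = one_sided_p(18.3, 13.1)
```

Under this approximation a difference only counts as significant at the 5% level when it exceeds roughly 1.64 standard errors, which is why 18.3 ± 13.1 does not reject the alternative trees while 123 ± 23.3 does.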

Did tetrapods originate from lobe-finned fishes?
Several molecular phylogenetic analyses were carried out to clarify the phylogenetic relationship among tetrapods, coelacanth and lungfishes, using ray-finned fishes [33][34][35][36] and/or cartilaginous fishes [5,9,37] as an outgroup. The validity of these two rootings needs to be confirmed with molecular evidence [2,5]. Although no coelacanth sequence is included, the present analysis provides a confirmation for the cartilaginous fish rooting. Ray-finned fishes, however, cannot be used as an outgroup, because it remains possible that tetrapods are more closely related to ray-finned fishes than to lobe-finned fishes (topology b in Tables 1 and 2).
tetrapods. This is consistent with the present study suggesting the almost simultaneous divergence of tetrapods, lungfishes and ray-finned fishes.

Living holosteans form a natural group
The phylogenetic relationship among teleost fishes and holosteans comprising bowfin and gar is controversial [15]. Four different tree topologies (Figure 2A, 2B, 2C, 2D) have been proposed to date from morphological and molecular data. According to a formerly accepted view, living ray-finned fishes are divided into three major groups (Figure 2A): Chondrostei (chondrosteans including sturgeons and paddlefishes), 'Holostei' (holosteans comprising bowfin and gar), and Teleostei (teleost fishes consisting of all other living ray-finned fishes). 'Holostei' is, however, a term that has fallen into disuse in formal classifications. Instead, in the currently accepted view, holosteans are considered to be paraphyletic; bowfin is thought to be more closely related to teleost fishes than gar is [14,38], as shown in Figure 2B, and therefore ray-finned fishes are classified into two monophyletic groups: Chondrostei and Neopterygii (holosteans and teleost fishes). Another possibility, that gar is closely related to teleost fishes (Figure 2C), was also proposed [42]. Furthermore, mitochondrial sequences suggest a distinct tree topology (Figure 2D), in which holosteans and chondrosteans form a monophyletic group [16].
In the present analysis, holosteans are inferred to form a monophyletic group that is the sister group to teleost fishes, as shown in Figure 3 and Table 3. The bootstrap probabilities for the holostean clade are 92.2% and 83.8% in concatenated alignment analysis and totalml analysis, respectively. The topologies not supporting the holostean clade have relatively small P-values (≤ 0.12), as shown in Tables 1 and 2. This result is consistent with a formerly accepted view of vertebrate classification, but is inconsistent with the currently accepted view. The mitochondrial tree shown in Figure 2D was significantly rejected by the KH test when its likelihood value was calculated using nuclear DNA-coded genes; its log-likelihood difference from the ML tree was 34.9 ± 18.0 (P-value = 0.03) and 42.3 ± 20.8 (P-value = 0.02) in concatenated alignment analysis and totalml analysis, respectively.
We also analyzed a mitochondrial data set and confirmed the monophyly of holosteans and chondrosteans. In contrast to the high support value (100%) by the MRBAYES program for this relationship, however, the RELL BP value was only 71%. The likelihood difference between topologies A and D of Figure 2 was 11.9 ± 13.3 (P-value = 0.18), which is not significant, as Inoue et al. [16] noted. Considering that Bayesian inference often results in erroneously high support values [28,43], the inconsistency between the present inference and that based on mitochondrial sequences might be an artifact of the Bayesian inference.
are basal in the jawed-vertebrate tree. Our result strongly confirms the result from molecular synapomorphies: bichir is placed at the deepest position in ray-finned fishes, and the bootstrap probabilities are 98.1% and 95.1% in concatenated alignment analysis and in totalml analysis, respectively, as shown in Figure 3 and Table 3. The alternative hypothesis that bichir and chondrosteans form a monophyletic group was not supported; its log-likelihood difference from the ML tree is 37.4 ± 18.5 (P-value = 0.02) and 33.3 ± 19.5 (P-value = 0.04) in concatenated alignment analysis and totalml analysis, respectively.

Chimaeras and other cartilaginous fishes form a monophyletic group
Some paleontologists have pointed out the possibility that chimaeras were derived from placoderms independently from other cartilaginous fishes (e.g., [38]). To test this possibility, we have isolated the genes listed in Materials and Methods, except for ALDb, from a plownose chimaera, Callorhinchus callorhynchus, and have inferred its phylogenetic position based on the concatenated alignment of 2,431 amino acid residues. The resulting tree significantly supported the monophyly of cartilaginous fishes including chimaeras (as shown by the dash-dotted line in Figure 3) with a RELL bootstrap probability of 100%.
Mitochondrial data also support this relationship [3].

Conclusions
Molecular phylogenetic analyses of jawed vertebrates based on mitochondrial sequences often result in confusing inferences which are obviously inconsistent with generally accepted trees. To obtain a robust tree of jawed vertebrates, we have cloned and sequenced seven nuclear DNA-coded genes from thirteen vertebrate species and have carried out phylogenetic analyses including thirteen jawed vertebrates from eight major groups and an outgroup (a cyclostome and a lancelet) based on the maximum likelihood method. We have shown the following. (i) Cartilaginous fishes are basal to other jawed vertebrates. This is consistent with the generally accepted view, but is inconsistent with mitochondrial trees. (ii) Living holosteans, comprising bowfin and gar, form a monophyletic group which is the sister group to teleost fishes. This is consistent with a formerly prevalent view of vertebrate classification, but inconsistent with both the current morphology-based and mitochondrial sequence-based trees. (iii) The bichir is the basal ray-finned fish. (iv) Tetrapods and lungfish form a monophyletic cluster in the tree inferred from the concatenated alignment, which is consistent with the currently accepted view. It also remains possible that tetrapods are more closely related to ray-finned fishes than to lungfishes.
The present results are statistically solid and highly consistent with traditional views based on morphological and paleontological evidence. Compared with trees inferred from mitochondrial sequences, which often provide obviously bizarre phylogenies, these nuclear DNA-coded genes probably carry more accurate phylogenetic information. More intensive taxonomic sampling, particularly inclusion of coelacanth, would provide more solid inference for the origin of tetrapods and other phylogenetic problems currently discussed mainly based on mitochondrial sequences. An extended analysis including coelacanth sequences is in progress.

Isolation and sequencing of cDNAs
We have carried out a phylogenetic analysis of jawed vertebrates based on seven nuclear DNA-coded genes from six ray-finned fishes, three tetrapods, two lobe-finned fishes, three cartilaginous fishes and an outgroup (a cyclostome and a lancelet). For plownose chimaera, only six gene sequences excluding ALDb sequence were available for analysis. The names and abbreviations of proteins used in the present analysis are as follows: ALDa, ALDb and ALDc, fructose-bisphosphate aldolase A, B and C, respectively; G6PD, glucose-6-phosphate 1-dehydrogenase; GAG, a trifunctional protein with glycinamide ribonucleotide synthetase (GARS)-aminoimidazole ribonucleotide synthetase (AIRS)-glycinamide ribonucleotide formyltransferase (GART); PGK, phosphoglycerate kinase; TPI, triosephosphate isomerase.
The PCR products were separated in 1.5% agarose gel containing ethidium bromide. Products of expected size were isolated as gel slices, purified using a DNA purification kit (TOYOBO), and cloned into the pT7Blue vector (Novagen). Then, Escherichia coli strain DH5α (TOYOBO) was transformed with the ligated vector. More than three independent clones were isolated for each gene and sequenced by the dideoxy chain termination method using the BigDye Terminator Cycle Sequencing Ready Kit (Applied Biosystems) and ABI PRISM 377 and 3100 DNA sequencers (Applied Biosystems).
The 3' ends of cDNAs were amplified using the 3'RACE System for Rapid Amplification of cDNA Ends (GIBCO BRL). The amplified fragments were purified, subcloned and sequenced in the same way as above.

Sequence data
The following sequence data were taken from the DDBJ/EMBL/GenBank database: the seven gene sequences from human, mouse and Takifugu rubripes (fugu); ALD gene sequences from Eptatretus burgeri (inshore hagfish), Lethenteron japonicum (Japanese lamprey) and B. belcheri; TPI gene sequences from L. reissneri and B. belcheri. The DDBJ/EMBL/GenBank accession number of each sequence is shown in Table 5.

Phylogenetic tree inference
Multiple alignments of amino acid sequences were carried out by MAFFT [47], a multiple sequence alignment program recently developed by us, and manually inspected on the XCED sequence alignment editor.
Using the cyclostome and lancelet sequences as an outgroup, phylogenetic analyses have been carried out by GAMT [26], a genetic algorithm-based ML method, with the JTT-F model [48,49]. Heterogeneity of evolutionary rates among sites was modeled by a discrete Γ distribution [50] with the optimized shape parameter α for each protein. A limited number of candidate tree topologies were generated by the following procedure and subjected