Loss, duplication, and genus-level expansion of far genes in nematodes
We identified 586 FAR proteins from 58 nematode species by searching for the Gp-FAR-1 domain (pfam05823; Additional file 2: Table S1). The FAR domain was not found in the 5 species from Clade I but was present in species from Clades III, IV, and V. The median number of far genes in the 53 species from Clades III, IV, and V was 5, and the number varied at the genus level (Fig. 1 and Additional file 3: Table S2). In Clade III, the number of far genes ranged from 1 to 5 and was 3 in most species of Clade IIIc. In Clade IV, the number of far genes in plant-parasitic nematodes (1–7, Clade IVc) was significantly lower than in the entomopathogenic genus Steinernema (37–43, Clade IVa) and the parthenogenetic genus Strongyloides (16, Clade IVb) (Fig. 1 and Additional file 1: Fig. S1). In Clade V, the number of far genes varied not only among the free-living nematodes C. elegans (9, Clade Vb), D. coronatus (6, Clade Vb), and Pristionchus (21–23, Clade Va), but also among parasitic nematodes, from Angiostrongylus (3–4, Clade Vd) and Dictyocaulus viviparus (4, Clade Vd) to the expanded gene families of Ancylostoma (18–30, Clade Vc), Nippostrongylus brasiliensis (12, Clade Ve), and Haemonchus (12–19, Clade Ve) (Fig. 1 and Additional file 1: Fig. S1). Thus, gene loss and gene expansion account for the extensive variation in the number of far genes among nematode genera.
Low sequence identity and high diversity of FAR proteins at the genus level within phylum Nematoda
We compared the sequence identity of FAR proteins within the phylum Nematoda. The average sequence identity between the FAR domains of the 586 nematode proteins and the Gp-FAR-1 domain was 23.9% (Additional file 2: Table S1). The average sequence identities among the 55 FAR domains from Clade III, the 310 from Clade IV, and the 221 from Clade V were 29.1%, 17.1%, and 21.2%, respectively (Additional file 1: Fig. S2, S3, S4, and Additional file 2: Table S1). We named the orthologs in each species with serial numbers according to their sequence identity to Gp-FAR-1 and known nematode FAR-1s (Additional file 2: Table S1). We used OrthoMCL to infer the orthologous relationships of the 586 FAR domain-containing protein sequences from free-living and parasitic nematodes. OrthoMCL assigned the sequences to 18 groups, showing that FAR proteins have diverged across nematodes. We then constructed a maximum-likelihood (ML) tree of the FAR domains across nematodes. Expansions of FAR domains in species from the three clades led to the formation of several monophyletic groups (Additional file 1: Fig. S5), which complicate the phylogenetic tree. The tree illustrates the variation in gene number and the low sequence homology among FAR proteins, features that reflect the complex genetic relationships of FARs among nematodes. We therefore constructed separate phylogenetic trees of FARs for selected species from each of these nematode clades.
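The average identities reported above depend on how identity is counted for each pair of sequences. The text does not state the alignment tool or the denominator used, so the following is only a minimal sketch under one common assumption: identity is the share of matching residues among alignment columns where neither sequence has a gap, averaged over all pairs. The function names and the toy sequences are hypothetical and are not taken from the study.

```python
from itertools import combinations


def percent_identity(aligned_a, aligned_b):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # assumption: gapped columns are excluded from the denominator
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def mean_pairwise_identity(aligned_seqs):
    """Average percent identity over all unordered pairs of aligned sequences."""
    pairs = list(combinations(aligned_seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(percent_identity(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy alignment, for illustration only.
toy_alignment = ["AAAA", "AAAT", "AATT"]
print(round(mean_pairwise_identity(toy_alignment), 1))
```

Other conventions, such as dividing by the full alignment length or by the length of the shorter sequence, give different percentages, so the denominator should be stated alongside any reported average.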
Three isoforms of FAR in the ancestor of Clade III
FAR proteins that group into one cluster in the phylogenetic tree are expected to come from nematode species of the same subclade within Clade III. Indeed, the 55 orthologous FAR domain sequences from Clade III formed three clusters (Fig. 2). Most FAR-1 proteins grouped into cluster 1 and showed a sequence identity of 49.4%. In cluster 2, most FAR-2 proteins from Clade III grouped together, but some FARs from Clade IIIb clustered into an Ascaridomorpha lineage-specific branch. Cluster 3 contained most FAR-3 proteins from Clade IIIc, three FARs of Ascaridida in Clade IIIb, and FAR-5 of Enterobius vermicularis in Clade IIIa. Far genes from Clades IIIa and IIIc had similar numbers and lengths of introns, particularly in Spirurida (Additional file 1: Fig. S6). Far genes in Ascaridida had more and longer introns than those of other species in Clade III, suggesting that the far genes of Ascaridida separated from those of other Clade III species in their ancestor. Two FARs of the marine nematode A. simplex in Clade IIIb clustered together in the Ascaridomorpha lineage-specific branch and had an intron structure like that of far genes in Clades IIIa and IIIc, suggesting that A. simplex might have lost far genes during evolution. Thus, the far genes of Clade III might have separated into three clusters in the ancestor of Oxyuridae, Ascaridida, and Spirurida. Moreover, far genes were duplicated in the ancestor of Ascaridida.
Extensive expansion and divergence of FAR proteins in Clade IV
FAR proteins in the genera Steinernema (Clade IVa) and Strongyloides (Clade IVb) have expanded significantly, to at least 37 and 16 copies, respectively, whereas plant-parasitic nematodes of Clade IVc have 1 to 7. The ML tree of FARs in Clade IV formed three clusters. The expanded FARs appear as several Steinernema-specific and Strongyloides-specific groups in different clusters. In cluster 1, we observed 4 monophyletic groups from five Steinernema species and 11 monophyletic groups from four Strongyloides species (Fig. 3A and Additional file 1: Fig. S7). In cluster 2, we observed 9 monophyletic groups from five Steinernema species and 2 monophyletic groups from four Strongyloides species. Some FARs from Steinernema species expanded independently. In cluster 3, FARs from the five Steinernema species and FARs from the four Strongyloides species each formed one monophyletic group; moreover, FARs from Ste. monticolum and Ste. glaseri appeared to have expanded. Thus, FARs in the five Steinernema species and four Strongyloides species have expanded and formed at least 14 monophyletic groups. In addition, FARs from the plant-parasitic nematodes Meloidogyne, Globodera, and Ditylenchus destructor grouped only into cluster 1, whereas FARs from Bursaphelenchus underwent lineage-specific expansions and grouped into all three clusters.
FARs arose independently in entomopathogenic lineage of Steinernema
Steinernema spp. are entomopathogenic nematodes that can kill an insect host within 24–48 h [27, 28]. Steinernema species have more than 38 FARs with significant sequence divergence. The FAR gene family represents a dramatic case of genus-wide expansion in Steinernema genomes [20]. Some of the expanded far genes were arranged in tandem and had higher homology and a closer phylogenetic relationship (Fig. 3A and Additional file 1: Fig. S7 and S8). We also examined the synteny of Steinernema far genes in chromosomes or scaffolds (only scaffolds containing more than six genes were considered; information on the scaffolds/contigs encoding far genes is listed in Additional file 2: Table S1). The gene order in the syntenic block containing the far gene was highly conserved among Ste. carpocapsae, Ste. feltiae, and Ste. scapterisci (Fig. 3B). According to the data of Dillman AR et al., the expression of far genes across developmental stages followed two divergent patterns: some far genes were highly expressed in the infective L3 stage, whereas others were highly expressed in the L1 and young adult stages; far genes in Ste. feltiae showed low expression in the egg stage (Fig. 3C). We found no far gene in the insect-parasitic R. culicivorax from Clade I and three far genes in the entomopathogenic Heterorhabditis bacteriophora (PRJNA13977) [2], which may illustrate the independent evolution of far genes in the insect-parasitic Steinernema lineage.
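The idea of tandem far genes on a scaffold can be sketched as a search for runs of neighboring far genes in gene order. This is a minimal illustration, not the procedure used in the study: it assumes a simple annotation of (scaffold, start coordinate, gene id) tuples, treats two far genes as tandem when no other annotated gene lies between them, and keeps only scaffolds with more than six genes, mirroring the filter mentioned above. All gene and scaffold names are hypothetical.

```python
from collections import defaultdict


def tandem_runs(genes, is_target, min_scaffold_genes=7):
    """Return (scaffold, [gene ids]) for runs of two or more adjacent target genes.

    genes: iterable of (scaffold, start, gene_id) tuples.
    is_target: function that returns True for the gene family of interest.
    min_scaffold_genes: scaffolds with fewer annotated genes are skipped
        (7 corresponds to "more than six genes").
    """
    by_scaffold = defaultdict(list)
    for scaffold, start, gene_id in genes:
        by_scaffold[scaffold].append((start, gene_id))

    runs = []
    for scaffold, entries in by_scaffold.items():
        if len(entries) < min_scaffold_genes:
            continue
        entries.sort()  # order genes by start coordinate
        current = []
        for _, gene_id in entries:
            if is_target(gene_id):
                current.append(gene_id)
                continue
            if len(current) >= 2:
                runs.append((scaffold, current))
            current = []
        if len(current) >= 2:
            runs.append((scaffold, current))
    return runs


# Hypothetical toy annotation, for illustration only.
toy_genes = [
    ("scaffold_1", 100, "gene_a"),
    ("scaffold_1", 900, "gene_b"),
    ("scaffold_1", 1800, "far_x"),
    ("scaffold_1", 2600, "far_y"),
    ("scaffold_1", 3500, "gene_c"),
    ("scaffold_1", 4300, "gene_d"),
    ("scaffold_1", 5200, "far_z"),
    ("scaffold_1", 6100, "gene_e"),
    ("scaffold_2", 100, "far_p"),
    ("scaffold_2", 800, "far_q"),
    ("scaffold_2", 1500, "gene_f"),
]
runs = tandem_runs(toy_genes, is_target=lambda g: g.startswith("far_"))
print(runs)
```

In this toy input only far_x and far_y are reported: far_z has no neighboring far gene, and scaffold_2 is skipped because it carries too few genes.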
Tandem FAR-1 and FAR-2 are possibly related to Strongyloides development
Clade IVb contains the free-living Rhabditophanes sp. KR3021 from the family Alloionematidae and the parasitic Parastrongyloides trichosuri and Strongyloides from the family Strongyloididae. The 16 far genes in Strongyloides and the 19 in P. trichosuri, which resulted from gene expansion, formed three clusters (Fig. 3A and Additional file 1: Fig. S7). Some of the expanded Strongyloides far genes in cluster 1 are arranged in tandem and have high sequence homology and a close phylogenetic relationship (Fig. 3A and Additional file 1: Fig. S7 and S8). We also assessed gene synteny in the chromosomes or scaffolds containing far genes among Strongyloides, P. trichosuri, and Rhabditophanes sp. KR3021 (Additional file 2: Table S1). Gene order in the syntenic blocks was highly conserved between Strongyloides and P. trichosuri, but not between S. ratti and Rhabditophanes sp. KR3021 (Fig. 3D). Far genes had two exons in Strongyloididae and three exons in the free-living Rhabditophanes sp. KR3021 (Additional file 3: Table S2), suggesting that far genes lost an intron in the last common ancestor of Strongyloididae. These data indicate that the far genes of free-living Rhabditophanes and parasitic Strongyloididae diverged early.
RNA-seq data from Strongyloides spp. in public databases [29,30,31] enabled us to investigate the potential roles of far genes in nematode biology. Analysis of transcriptomic data from three Strongyloides species showed that far-1 and far-2 were coordinately expressed at higher levels than other far genes. Moreover, far expression was lower in iL3 than in other developmental stages in four Strongyloides species (Fig. 3E and Additional file 4: Table S3). Strongyloides spp. are female-only in the parasitic generation and dioecious in the free-living generation. The far genes were expressed at similar levels in free-living and parasitic females. Analysis of the somatic proteomes of free-living and parasitic females of Str. ratti showed that FAR-1 (original gene id: SRAE_2000289100) and FAR-2 (original gene id: SRAE_2000289500) were highly expressed in both free-living and parasitic stages [30]. In addition, FAR-1 and FAR-2 were detected in the excretory-secretory (ES) proteome of Str. ratti [30, 32], reflecting their importance in the host-nematode interaction. Thus, considering the high expression of far-1 and far-2 in free-living and parasitic females and the presence of their products in ES, we propose that at least FAR-1 and FAR-2 might be important in the development and parasitism of Strongyloides.
FAR represents the evolutionary dynamic of plant-parasitic nematodes
Orthologs of FAR have diverse phylogenetic relationships among pine wood, root-knot, stem, and cyst nematodes (Fig. 3A and Additional file 1: Fig. S7). In cluster 1, FARs from root-knot, pine wood, stem, and cyst nematodes clustered together, and other FARs from pine wood, stem, and cyst nematodes formed another group, whereas FARs from B. xylophilus underwent lineage-specific expansion and were distributed across three clusters (Fig. 3A and Additional file 1: Fig. S7). To elucidate the evolutionary relationships of FARs in plant-parasitic nematodes, we conducted comparative analyses of the genomes of seven divergent plant-parasitic nematodes: the root-knot nematodes M. graminicola, M. floridensis, M. arenaria, M. javanica, and M. enterolobii; the cyst nematode Heterodera glycines; and the pine wood nematode B. okinawaensis.
In phylogenetic analyses of the sequences, FARs from root-knot nematodes clustered together (Figs. 3A and 4 and Additional file 1: Fig. S7). The genomes of M. hapla, M. graminicola, and M. floridensis each contain one far gene, whereas the genomes of M. incognita, M. arenaria, M. javanica, and M. enterolobii encode two to four FARs. To determine whether the latter might have originated from gene duplication, we analyzed the reproduction mode and other genomic features. The reproduction mode of root-knot nematodes is complex and differs from that of other plant-parasitic nematodes. Some root-knot nematodes reproduce by facultative meiotic parthenogenesis (M. hapla, M. graminicola, and M. floridensis), while others reproduce by obligatory mitotic parthenogenesis (M. incognita, M. arenaria, M. javanica, and M. enterolobii), which leads to aneuploid and polyploid genomes [33]. The ratio of the far gene number in mitotic parthenogenetic species to that in the meiotic parthenogenetic M. hapla is approximately 2:1 or more than 3:1 (Fig. 4). Our previous study showed that the proportion of duplicated BUSCOs was higher in four mitotic parthenogenetic species (13.1–36.7%) than in three meiotic parthenogenetic species (0.4–3.0%). The proportion of BUSCOs present at a ratio of 2:1 or 3:1 relative to M. hapla reached 26–42% in the four mitotic parthenogenetic species, particularly in M. arenaria, but was less than 5% in two meiotic parthenogenetic species [34]. Thus, the multi-copy nature of far genes in mitotic parthenogenetic species is likely due to their genomic characteristics. Analysis of RNA-seq data indicated that far-1 and far-2 were expressed at relatively high levels across the developmental stages of M. incognita (Additional file 1: Fig. S9B and Additional file 4: Table S3).
Phylogenetic analysis showed that FARs in pine wood, stem, and cyst nematodes grouped into two clusters, with the duplicated FARs of Bursaphelenchus in a separate branch (Fig. 4). The far genes of Globodera rostochiensis and Globodera pallida had similar intron structures, which differed from those of the orthologs in H. glycines (Additional file 1: Fig. S9A). Globodera and Heterodera diverged over 30 million years ago, and the far genes in G. pallida and H. glycines might have undergone independent duplications during evolution. The genome of the pine wood nematode B. okinawaensis had seven far genes, consistent with the number in B. xylophilus. Phylogenetic analysis indicated that the lineage-specific far genes arose and duplicated in their last common ancestor (Figs. 3A and 4 and Additional file 1: Fig. S7). RNA-seq data indicated that far-1 and far-2 in B. xylophilus were expressed at relatively high levels across developmental stages and that far-1 was expressed at a higher level than far-2 (Additional file 1: Fig. S9B and Additional file 4: Table S3). Lineage-specific expansions and high expression of far-1 and far-2 in infective or parasitic stages may be advantageous to the parasitism of the pine wood nematode.
Comparison of FAR from plant-parasitic nematode and bacteria
A comprehensive homology search of whole-genome sequence data showed the presence of FAR domains in the bacteria Streptomyces, Kitasatospora sp., Bacillus subtilis, and Lysobacter. Sequence identity and phylogenetic analyses indicated that bacterial FAR domains had higher sequence identity to those of plant-parasitic nematodes than to those of other nematodes, especially to FAR-1 of plant-parasitic nematodes (Additional file 1: Fig. S10A). Bacterial far genes, however, have no introns (Additional file 1: Fig. S10B). We observed genome collinearity in the coding sequence (CDS) region of FAR domains between plant-parasitic nematodes and these bacteria. The gene spacing and orientation of FAR domains were conserved between them, which was not the case between the bacteria and other nematodes (Additional file 1: Fig. S11 and Fig. S12). There were extensive differences between far genes and other genes in GC content, gene combination, and codon usage bias. The GC content of bacterial genomes (66.5–72.1% in Lysobacter, Kitasatospora sp., and Streptomyces) was significantly higher than that of plant-parasitic nematode genomes (23.5–40.4%) (Additional file 5: Table S4). The average GC contents of all CDS and of FAR domains were 48.3% and 47.6% (P = 0.38) in plant-parasitic nematodes, 44.5% and 42.4% (P = 0.26) in other nematodes from different clades, and 72.3% and 59.2% (P = 0.0000006) in bacteria, respectively (Additional file 5: Table S4). Thus, the GC content of bacterial far genes differed significantly from that of the whole genomes. Because of this difference in GC content, we compared the codon usage frequency of bacterial far genes with that of other genes in the bacterial genomes. The ratio of each of the five codon indices (CAI, Fop, Nc, GC3s, and GC) of far genes to that of the whole genome was about 1 (Additional file 5: Table S5). Therefore, the codon indices of bacterial far genes were similar to those of the whole bacterial genomes.
Streptomyces, Kitasatospora sp., and Bacillus subtilis are endophytes, which are microbes that grow inside plant tissues without causing any harm to the host [35, 36]. Endophytes play an important role in improving the stress tolerance of the host because they can produce active materials, fix nitrogen, accelerate plant growth, and enhance the immune system and allelopathy of the host [37]. Lysobacter strains efficiently colonize the root surfaces of several plants, including spinach, tomato, Arabidopsis thaliana, and Amaranthus gangeticus [38]. Thus, plant-parasitic nematodes have coexisted for a long time with endophytes or root-colonizing Lysobacter species in the plant host. Bacteria frequently respond to selective pressures and adapt to new environments by acquiring new genetic traits from other species via genetic communication. Therefore, genetic communication of far genes might have occurred between bacteria and plant-parasitic nematodes.
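Two of the quantities compared above, GC content and GC3s, can be computed directly from a coding sequence. The sketch below is illustrative only and is not the pipeline used in the study. It assumes the standard genetic code and the common definition of GC3s as the GC fraction at third positions of codons that have synonymous alternatives, which excludes ATG (Met), TGG (Trp), and stop codons. The function names and the example sequence are hypothetical.

```python
def gc_content(seq):
    """Percent G+C among unambiguous bases (A, C, G, T) of a nucleotide sequence."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        return 0.0
    return 100.0 * sum(b in "GC" for b in bases) / len(bases)


# Codons without synonymous alternatives in the standard code, plus stop codons.
_NON_SYNONYMOUS_OR_STOP = {"ATG", "TGG", "TAA", "TAG", "TGA"}


def gc3s(cds):
    """Percent G or C at the third position of synonymous codons in a CDS."""
    cds = cds.upper()
    counted = 0
    gc_third = 0
    for i in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[i:i + 3]
        if codon in _NON_SYNONYMOUS_OR_STOP or any(b not in "ACGT" for b in codon):
            continue  # skip Met, Trp, stops, and codons with ambiguous bases
        counted += 1
        if codon[2] in "GC":
            gc_third += 1
    return 100.0 * gc_third / counted if counted else 0.0


# Hypothetical toy CDS, for illustration only.
toy_cds = "ATGGCCAAATGGTAA"
print(gc_content(toy_cds), gc3s(toy_cds))
```

Comparing these values for a far gene against the distribution over all CDS of the same genome is one way to see whether the gene's base composition stands out, which is the kind of contrast reported for the bacterial far genes.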
Duplication, genus-level expansion, and distinct ligand binding of FAR in Clade V nematodes
Phylogenetic analysis of the 221 FAR orthologs from free-living and parasitic nematodes in Clade V identified a Pristionchus-specific group. Of the 21–23 far genes in each Pristionchus genome, 17 were placed in this group (Additional file 1: Fig. S13). This suggests that the far genes of Pristionchus underwent lineage-specific duplications and that the duplicated orthologs have low sequence homology to the orthologs in other Clade V species. We further analyzed the phylogenetic relationships of the FAR orthologs in Clade V without Pristionchus. The FAR orthologs separated into three main clusters (Fig. 5). Among free-living nematodes, the nine FARs encoded in the genome of C. elegans were placed in all three clusters, with seven of them in cluster 1. The tandemly duplicated C. elegans far-1 and far-2 are located on chromosome III and have intron splice sites similar to those of C. elegans far-6. The tandem C. elegans far-3, -4, and -5, located on chromosome V, also have similar intron splice sites and clustered together in the phylogenetic analysis (Additional file 1: Fig. S14). C. elegans FAR-7 was placed in cluster 3, and the novel C. elegans FAR-9 was placed in cluster 2. Analysis of RNA-seq data showed that far-1 and far-2 of C. elegans were expressed at higher levels than the other far genes across developmental stages (Fig. 5 and Additional file 4: Table S3). In contrast, the 6 FARs encoded in the genome of D. coronatus were all placed in cluster 1, forming three branches. Thus, the FAR orthologs of the free-living Pristionchus, D. coronatus, and C. elegans diverged early and underwent independent duplications.
FARs are possibly important in parasitism of Strongylida
Members of Strongylida are a large group of animal-parasitic nematodes residing in the intestine, respiratory tract, blood vessels, and other sites of the host. In Strongylida, FARs have clearly expanded in intestinal parasitic nematodes from Clades Vc and Ve, including the hookworms Necator americanus (8 copies, Clade Vc) and Ancylostoma (18–30 copies, Clade Vc), the nodule worm Oesophagostomum dentatum (17 copies, Clade Vc), and the strongylids Nippostrongylus brasiliensis (12 copies, Clade Ve) and Haemonchus (12–19 copies, Clade Ve). These FARs were placed in three clusters and several subclusters. Approximately 44% of the expanded FARs clustered into a specific group within cluster 1 (Fig. 5). Gene locus analysis showed that some of the expanded far genes in A. ceylanicum, A. caninum, and H. contortus were arranged in tandem (Fig. 5 and Additional file 1: Fig. S15). Phylogenetic analysis indicated that FARs from intestinal parasitic nematodes in Clades Vc and Ve formed at least 6 monophyletic groups. Analysis of ES proteins indicated that the far genes of O. dentatum are transcribed at high levels in parasitic stages (L4 and adults) [39]. In our analysis, the H. contortus far genes had the highest expression in L3, L4, and adults. In addition, H. contortus far-1 and far-2 had stage-specific expression and were expressed at higher levels than other orthologs across developmental stages (Fig. 5 and Additional file 4: Table S3). Similarly, the N. americanus far-1 gene is known to be abundantly expressed across developmental stages [40]. In lungworms, the FAR orthologs of the murine Angiostrongylus and the bovine D. viviparus from Clade Vd were limited to 3–4. The FARs of the two Angiostrongylus species clustered together and formed three subclusters. In contrast, FARs from D. viviparus, under the superfamily Trichostrongyloidea, were placed in two branches within cluster 1. In A. cantonensis, far-1 had the highest expression in the parasitic stages (L4 and female) in the definitive rat host, and far-2 was highly expressed in L1 and L3, the larval stages in the intermediate snail host (Fig. 5 and Additional file 4: Table S3), while far-3 had low expression across developmental stages. In D. viviparus, far-1 had higher expression in juveniles and adults than in other developmental stages, while far-2 was highly expressed in all stages from eggs to adults. Thus, FARs in both lungworms had divergent sequences and gene expression patterns, with far-1 and far-2 expressed at higher levels than the other far genes across developmental stages.
FAR is a lipid-binding protein involved in the transport of fatty acids and retinol to modulate cell growth and proliferation. Subcellular localization analysis indicated that most FAR proteins (406/586) were secretory proteins containing a signal peptide (Additional file 2: Table S1). Recent studies of plant-parasitic nematodes indicated that secretory FAR-1s are localized in the hypodermis of nematodes [17, 18]. Functional C. elegans FAR proteins differ in their ability to bind fatty acids and retinol. C. elegans FAR-1 through -6 can bind fatty acids and retinol, but FAR-7 has weak binding capacity for 11-(5-dimethylaminonaphthalene-1-sulfonyl amino) undecanoic acid (DAUDA), retinol, and C18:4 [41, 42]. FAR-1 proteins of parasitic nematodes are known to be functional proteins that bind fatty acids and retinol (Additional file 6: Table S6) [21, 41,42,43,44,45]; however, information is lacking on the ligand-binding ability of other FARs with low sequence identity to FAR-1. We cloned the far-1 and far-3 genes of A. cantonensis to assess the ligand-binding ability of FARs with low sequence identity. In fluorescence-based ligand-binding assays, A. cantonensis FAR-1 bound the fluorescent fatty acid analog DAUDA and the naturally fluorescent retinol (Fig. 6C and Additional file 1: Fig. S16). The degree of blue shift in DAUDA fluorescence emission (from 550 nm in buffer to 525 nm) indicated that FAR-1 bound DAUDA in a highly apolar environment, as described for FAR-1 from other species [22, 46]. The preference of FAR-1 for fatty acids was investigated by adding fatty acids of different chain lengths in the DAUDA assay. DAUDA was displaced by fatty acids ranging from C12:0 to C22:6, especially the saturated C15:0 (Fig. 6A). The results suggested that A. cantonensis FAR-1 bound retinol (Fig. 6C), while A. cantonensis FAR-3 bound fatty acids and retinol only weakly (Fig. 6B and 6D), which is similar to the function of C. elegans FAR-7 [42]. Further structural analysis revealed that A. cantonensis FAR-1 and FAR-3 were α-helix-rich proteins that closely resembled FARs from other nematodes [41, 43]. They had typical binding pockets like that of N. americanus FAR-1. The cavity volume of A. cantonensis FAR-1 was 1437.6 Å³, which is smaller than the 2031.5 Å³ of N. americanus FAR-1 but considerably larger than the 836 Å³ of A. cantonensis FAR-3 (Fig. 6E). Thus, differences in sequence and protein structure might lead to differential ligand-binding properties of FAR proteins.
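The numbers reported above can be put on a common scale with a little arithmetic. The snippet below is only a worked calculation on the values stated in the text (the DAUDA emission peaks and the three cavity volumes); the variable and function names are hypothetical, and it makes no claim about how the measurements themselves were obtained.

```python
# DAUDA emission peaks reported in the text (nm).
dauda_peak_buffer_nm = 550.0
dauda_peak_bound_nm = 525.0
blue_shift_nm = dauda_peak_buffer_nm - dauda_peak_bound_nm

# Binding-cavity volumes reported in the text (cubic angstroms).
cavity_volume = {
    "A. cantonensis FAR-1": 1437.6,
    "N. americanus FAR-1": 2031.5,
    "A. cantonensis FAR-3": 836.0,
}


def volume_ratio(protein_a, protein_b):
    """Cavity volume of protein_a relative to protein_b."""
    return cavity_volume[protein_a] / cavity_volume[protein_b]


far1_vs_necator = volume_ratio("A. cantonensis FAR-1", "N. americanus FAR-1")
far1_vs_far3 = volume_ratio("A. cantonensis FAR-1", "A. cantonensis FAR-3")

print(f"DAUDA blue shift: {blue_shift_nm:.0f} nm")
print(f"FAR-1 cavity relative to N. americanus FAR-1: {far1_vs_necator:.2f}")
print(f"FAR-1 cavity relative to A. cantonensis FAR-3: {far1_vs_far3:.2f}")
```

On these values the A. cantonensis FAR-1 cavity is about 0.7 times the size of the N. americanus FAR-1 cavity and about 1.7 times the size of the A. cantonensis FAR-3 cavity, which matches the ordering described in the text.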