The draft genome of whitefly Bemisia tabaci MEAM1, a global crop pest, provides novel insights into virus transmission, host adaptation, and insecticide resistance

Background
The whitefly Bemisia tabaci (Hemiptera: Aleyrodidae) is among the 100 worst invasive species in the world. As one of the most important crop pests and virus vectors, B. tabaci causes substantial crop losses and poses a serious threat to global food security.

Results
We report the 615-Mb high-quality genome sequence of B. tabaci Middle East-Asia Minor 1 (MEAM1), the first genome sequence in the Aleyrodidae family, which contains 15,664 protein-coding genes. The B. tabaci genome is highly divergent from other sequenced hemipteran genomes, sharing no detectable synteny. A number of known detoxification gene families, including cytochrome P450s and UDP-glucuronosyltransferases, are significantly expanded in B. tabaci. Other expanded gene families, including cathepsins, large clusters of tandemly duplicated B. tabaci-specific genes, and phosphatidylethanolamine-binding proteins (PEBPs), were found to be associated with virus acquisition and transmission and/or insecticide resistance, likely contributing to the global invasiveness and efficient virus transmission capacity of B. tabaci. The presence of 142 horizontally transferred genes from bacteria or fungi in the B. tabaci genome, including genes encoding hopanoid/sterol synthesis and xenobiotic detoxification enzymes that are not present in other insects, offers novel insights into the unique biological adaptations of this insect such as polyphagy and insecticide resistance. Interestingly, two adjacent bacterial pantothenate biosynthesis genes, panB and panC, have been co-transferred into B. tabaci and fused into a single gene that has acquired introns during its evolution.

Conclusions
The B. tabaci genome contains numerous genetic novelties, including expansions in gene families associated with insecticide resistance, detoxification and virus transmission, as well as numerous horizontally transferred genes from bacteria and fungi. We believe these novelties likely have shaped B. tabaci as a highly invasive polyphagous crop pest and efficient vector of plant viruses. The genome serves as a reference for resolving the B. tabaci cryptic species complex, understanding fundamental biological novelties, and providing valuable genetic information to assist the development of novel strategies for controlling whiteflies and the viruses they transmit.

Electronic supplementary material
The online version of this article (doi:10.1186/s12915-016-0321-y) contains supplementary material, which is available to authorized users.

genes encoding odorant binding proteins and chemosensory proteins in B. tabaci, 9 and 18 respectively, is likewise reduced compared to most other examined insect genomes, but similar to that of other Hemiptera [2]. The low number of genes encoding ORs is not wholly unexpected: it is in line with the observation that the antenna of B. tabaci houses only a handful of olfactory sensilla [4], which, taken together with the genetic make-up, suggests a reduced role for olfaction in this taxon. More unexpected is the low number of GRs. Compared to specialized species, polyphagous insects typically show expanded GR repertoires, particularly among genes encoding receptors that sense bitter compounds, which presumably reflects a need to identify a wider range of plant-produced toxins [5]. The low number of GRs in N. lugens may accordingly be a result of its specialized lifestyle and hence a limited need to differentiate and detect a large variety of toxins.
Curiously, B. tabaci appears to have evolved a different strategy: rather than expanding its GR repertoire, it has increased the number of detoxification genes, possibly making the detection of plant toxins less important.

Immune components and responses
The whitefly immune system is expected to be critical for recognizing and degrading microbial pathogens while retaining beneficial endosymbionts. Orthologs of the key immune components of the TOLL, JAK-STAT, and JNK pathways were readily identified in the B. tabaci genome (Additional file 19). However, crucial components of the IMD pathway, such as IMD, dFADD, Dredd, and Relish, were not present. The IMD pathway mediates the humoral immune response against Gram-negative bacteria in Drosophila melanogaster [6]. This pathway is also incomplete in the genomes of other hemipteran insects, including A. pisum, R. prolixus, D. citri and N. lugens, but is intact in non-hemipteran insects, including Apis mellifera (honey bee), Anopheles gambiae (mosquito) and D. melanogaster (fruit fly) (Additional file 1: Figure S11).
In the B. tabaci genome, we found a single ortholog for the PGRP receptor family and five for the GNBP family, which are presumed to play central roles in recognition of bacteria and fungi.
In addition, four antimicrobial peptides (AMPs), including thaumatin and defensin, were detected in the B. tabaci genome. Thus, the B. tabaci genome has a reduced immune repertoire, as was found in the pea aphid genome [7], which may have facilitated the acquisition and maintenance of its microbial symbionts.
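The presence/absence reasoning used for the IMD pathway can be sketched as a simple set comparison. This is a minimal illustration, not the authors' annotation pipeline: the function names are hypothetical, the component list holds only the four genes named in the text (which says "such as", so the list is not exhaustive), and the empty hit list stands in for an ortholog search that found none of them.

```python
# IMD pathway components named in the text as absent from B. tabaci.
# Assumption: illustrative subset only; the full pathway has more members.
IMD_COMPONENTS_NAMED = ("IMD", "dFADD", "Dredd", "Relish")


def missing_components(required, detected):
    """Return the required components with no detected ortholog, in input order."""
    detected_set = set(detected)
    return [name for name in required if name not in detected_set]


def is_complete(required, detected):
    """A pathway counts as complete only if no required component is missing."""
    return not missing_components(required, detected)


# Hypothetical ortholog-search result for B. tabaci, restricted to the
# named IMD components: the text reports that none of them were found.
btabaci_imd_hits = []

missing = missing_components(IMD_COMPONENTS_NAMED, btabaci_imd_hits)
print("Missing IMD components:", ", ".join(missing))
print("IMD pathway complete:", is_complete(IMD_COMPONENTS_NAMED, btabaci_imd_hits))
```

The same two functions could be applied to other species or pathways by supplying a different required list and hit list; no per-species data beyond what the text states is assumed here.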

RNAi pathway
RNA interference (RNAi) is a conserved post-transcriptional gene silencing mechanism, found in a variety of eukaryotic organisms, in which short interfering double-stranded RNA (dsRNA) induces mRNA degradation. Gene silencing is induced by two partially overlapping pathways related to microRNA (miRNA) or small interfering RNA (siRNA) biogenesis. The B. tabaci genome possesses a single copy of most core genes in the miRNA pathway, including dicer-1, ago-1, drosha, exportin-5, loquacious and pasha (Additional file 20). Single copies of the core genes in the miRNA pathway were also observed in most other sequenced insect genomes [8,9]. A previous study found a single copy of each siRNA pathway gene in B. tabaci based on transcriptome data [10]. In the B. tabaci genome, we identified single copies of dicer-2 and R2D2 but two copies of ago-2 (Additional file 20). Interestingly, we identified 11 copies of RNA-dependent RNA polymerase (RdRP) genes in the B. tabaci genome, several of which are more similar to viral RdRPs than to those found in other insects (Additional file 1: Figure S12).
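The copy numbers reported in this section can be collected in a small table to show which RNAi genes depart from the single-copy pattern. The sketch below uses only the counts stated in the text; the dictionary layout and the `multi_copy` helper are hypothetical conveniences, not part of the original analysis.

```python
# Gene copy numbers in the B. tabaci genome, as stated in the text.
MIRNA_CORE = {
    "dicer-1": 1,
    "ago-1": 1,
    "drosha": 1,
    "exportin-5": 1,
    "loquacious": 1,
    "pasha": 1,
}
SIRNA_CORE = {"dicer-2": 1, "R2D2": 1, "ago-2": 2}
RDRP_COPIES = 11  # RdRP gene copies reported for B. tabaci


def multi_copy(copy_numbers):
    """Return the genes present in more than one copy."""
    return {gene: n for gene, n in copy_numbers.items() if n > 1}


for label, table in (("miRNA", MIRNA_CORE), ("siRNA", SIRNA_CORE)):
    extras = multi_copy(table)
    if extras:
        print(f"{label} pathway multi-copy genes: {extras}")
    else:
        print(f"{label} pathway: all listed genes single-copy")
print(f"RdRP copies: {RDRP_COPIES}")
```

Laid out this way, ago-2 is the only listed core gene that is duplicated, which is the point of contrast with the earlier transcriptome-based report of single copies for each siRNA pathway gene.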