In a host-parasite interaction, each partner can influence the other's evolution. Molecular signatures of these complex evolutionary processes can be detected in the genomes of both organisms involved in such associations. Indeed, genes encoding pathogenicity factors that directly counteract host defences, and host genes that counteract those factors, are expected to be subject to positive selection driven by an arms race between the two partners. Such coevolutionary processes have been well described in certain plant-pathogen interactions, where the host resistance genes and the corresponding avirulence genes in the pathogen show evidence of positive selection. In the Xanthomonas-pepper interaction, the Hrp pilus, a filamentous structure allowing bacteria to directly inject toxins into plant cells, also evolves under positive selection, thereby avoiding the plant defence surveillance system. Positive selection has also been detected in insect-pathogen interactions. For example, in Drosophila, RNA interference (RNAi) genes involved in antiviral defence are among the fastest-evolving genes in this insect. This rapid evolution is due to strong positive selection, illustrating that the host-pathogen arms race between RNA viruses and host antiviral RNAi genes is very active and significant in shaping RNAi function.
We are interested in characterizing the evolutionary processes underlying the host-parasite interactions between lepidopteran hosts and parasitoid wasps. In these systems, the endoparasitoid wasp larvae develop inside the lepidopteran host despite the hostile environment this habitat represents. One of the most original strategies developed by these wasps to overcome host defences is the injection of a symbiotic polydnavirus (PDV) at the same time as the wasp eggs [5–7]. PDVs are divided into two genera, ichnoviruses and bracoviruses, which are associated with tens of thousands of endoparasitoid wasp species belonging to two different families, Ichneumonidae and Braconidae. PDVs are found in these wasps as proviruses, which are transmitted vertically from one wasp generation to the next [9–13]. Proviruses are excised from the wasp genome in the female ovaries and, after replication, are injected into the host caterpillar as multiple double-stranded DNA circles packaged in capsids. The virus does not replicate in the host caterpillar, but viral gene expression and protein production are essential for the alterations to host immunity and development that allow the wasp larvae to develop successfully.
In this biological system, the virus plays key roles both in the mutualistic association with the wasp and in the parasitic association between the wasp and the caterpillar. PDVs are therefore likely to display molecular signatures that reflect constraints imposed by both the wasp and the host caterpillar. So far, however, reports have principally concentrated on the influence of wasp evolution on viral genomes. Braconid wasps carrying PDVs form a monophyletic lineage, suggesting a single association event between the wasp ancestor and the virus ancestor, followed by vertical transmission of the virus along wasp lineages. Accordingly, a phylogenetic study of Cotesia spp. and their associated viruses has shown codivergence between the two mutualists. Finally, recent data on the genome sequences of several PDVs have revealed that these viruses harbour a large number of eukaryotic genes, likely acquired from the wasp genomes. These genes form multigene families that are good candidates for involvement in the alteration of host caterpillar physiology [16–20]. Surprisingly, very few studies have focused on the potential influence of the host caterpillar on viral gene evolution, despite the strong selective pressure this habitat represents. In this paper, we report on the molecular evolution of a viral gene family, considering both wasp evolution and the selective pressure imposed by the caterpillar hosts.
Our model system is the interaction between the braconid wasp Cotesia congregata and its lepidopteran host, the tobacco hornworm, Manduca sexta. The PDV associated with C. congregata (CcBracovirus, CcBV) has been sequenced, revealing the presence of numerous genes possibly involved in host deregulation. Among these viral genes, one gene family encoding cystatins constitutes an interesting candidate system to study the influence of the host-parasitoid association at the viral molecular level. Cystatins are tight-binding, reversible inhibitors of papain-like cysteine proteases and are widespread in plants and animals. They are characterized by three conserved domains forming the site of interaction with C1 cysteine proteases: an N-terminal glycine, a glutamine-X-valine-X-glycine motif and a C-terminal proline-tryptophan amino acid pair [22, 23]. Cystatins and their target proteases have often been shown to be involved in host-parasite interactions, with cystatins playing the role of either defence molecules or virulence factors. For example, in parasitic nematodes, cystatins are thought to play a key role in controlling the host immune response [24–26]. Remarkably, plant cystatins acting as defence proteins have been shown to evolve under strong positive selection in response to cysteine proteases released by phytophagous insects. In this system, it has been suggested that plant cystatins and insect cysteine proteases are involved in a coevolutionary process.
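As an illustration of the three conserved cystatin features listed above, the following minimal Python sketch scans a protein sequence for them. It is not the method used in this study. The function name `find_cystatin_motifs`, the regular expressions, the rule of taking the glycine closest upstream of the central motif as the "N-terminal glycine", and the toy sequence are all assumptions made for illustration; the toy sequence is invented and is not a CcBV cystatin.

```python
import re

# Assumed patterns for two of the features named in the text.
QXVXG = re.compile(r"Q.V.G")  # glutamine-X-valine-X-glycine motif
PW = re.compile(r"PW")        # proline-tryptophan pair


def find_cystatin_motifs(seq):
    """Return 0-based start positions of the three conserved features.

    A value is None when the feature is not found. The N-terminal
    glycine is taken (as a simplifying assumption) to be the glycine
    closest upstream of the Q-X-V-X-G motif, and the P-W pair is only
    searched for downstream of that motif.
    """
    seq = seq.upper()
    result = {"n_term_gly": None, "qxvxg": None, "pw": None}

    core = QXVXG.search(seq)
    if core is None:
        return result
    result["qxvxg"] = core.start()

    # Closest glycine on the N-terminal side of the central motif.
    gly = seq.rfind("G", 0, core.start())
    if gly != -1:
        result["n_term_gly"] = gly

    # First P-W pair on the C-terminal side of the central motif.
    pair = PW.search(seq, core.end())
    if pair is not None:
        result["pw"] = pair.start()

    return result


# Invented toy sequence, used only to exercise the function.
toy = "MSGAPVDAKQVVAGTNYFIKPWEL"
print(find_cystatin_motifs(toy))
```

Such a scan only flags candidate positions; whether a given residue actually contributes to protease binding has to be assessed against a structural model.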
CcBV cystatins constitute the first description of cystatin genes in a virus and are organized in a multigene family composed of three genes present on the same circle [17, 20]. To date, there is no evidence of cystatin genes in Microplitis demolitor bracovirus (MdBV), which has been fully sequenced, and they have only been identified in one other PDV (GiBV), from the braconid wasp Glyptapanteles indiensis. Both genomic and physiological features of cystatins suggest that these viral proteins could play an important role in the host-parasite association. First, the genomic organization in a multigene family could be indicative of selective pressures acting on these genes. Indeed, Francino suggested that gene duplications that can lead to an increase in protein dosage are favoured by selective pressures. Second, cystatin genes are expressed rapidly and at an extremely high level during parasitism. This early and prolonged expression could be indicative of a role of cystatins in the early steps of host physiological disruption, as well as in the maintenance of this perturbed state. Finally, a recombinant viral cystatin (Cystatin 1) was shown to be a functional and specific cysteine protease inhibitor.
In this study, we tested for molecular signatures of positive selection acting on the viral cystatin gene family. We demonstrate strong and lineage-specific adaptive evolution acting on these genes. Using homology modelling and molecular dynamics (MD) simulation techniques, we obtained a predicted three-dimensional (3D) structure of CcBV cystatin 1. This model provides a framework to position the positively selected residues and reveals that they are situated at key sites that are important for the interaction with target proteases. This particular selection, which is probably imposed by host defences, emphasizes the potential role of cystatins as pathogenic factors and suggests that cystatins coevolve with host cysteine proteases.