### The model

Given the different magnitudes and directions of selection acting at the *Cyp6g1* locus in males and females, it is difficult to predict the invasibility of susceptible populations or how DDT-R frequencies will change in the absence of insecticide selection. Building on a previously published model [33], we modelled the frequency of DDT-R over time in *D. melanogaster* using selection estimates from published fitness determinants documented in the *Canton-S* background in the absence of DDT. Additionally, we considered the effect of including a period of selection with pesticide on allele trajectories, and then by removing DDT selection (as this mimics the current situation), asked if DDT-R could be retained in the absence of this strong source of selection. All aspects of the model were executed using MATLAB [34].

The model terms with default parameter values are outlined in Table 1. Given that there is a competitive mating disadvantage of DDT-R for *Canton-S* males [23], we need to calculate the probability that a mating male has a specific genotype. We do this using the parameter *m,* which represents the mating success of R males relative to S males. The proportion of fathers who carry each genotype is given by the following equations,

$$ \begin{array}{l}{y}_{RR}=\frac{m{x}_{RR}}{m\left({x}_{RR}+{x}_{RS}\right)+{x}_{SS}}\\ {}{y}_{RS}=\frac{m{x}_{RS}}{m\left({x}_{RR}+{x}_{RS}\right)+{x}_{SS}}\\ {}{y}_{SS}=1-\left({y}_{RR}+{y}_{RS}\right)\end{array} $$

(3)

where *R* represents the DDT-R allele and *S* the susceptible allele. Here we assumed that heterozygote males (*RS*) experience the same mating disadvantage (*m*) as homozygous DDT-R males (*RR*). This assumption is based on the dominant nature of the DDT-R allele with respect to both the resistance [22, 35] and female fitness [24] phenotypes. Male mating probabilities (*y*
_{
j
}) vary with population genotype frequency (*x*
_{
i
}) for different values of *m.* For *m* = 1 (i.e. no mating disadvantage), male mating genotype probabilities are equivalent to the genotype frequencies i.e. *y*
_{
j
} = *x*
_{
i
}. Provided there are both resistant and susceptible males in a population, as *m* decreases, the proportion of DDT-R fathers (*y*
_{
RR
} and *y*
_{
RS
}) will be biased downwards (*y*
_{
RR
} < *x*
_{
RR
} and *y*
_{
RS
} < *x*
_{
RS
}) and the proportion of DDT susceptible fathers (*y*
_{
SS
}) biased upward (*y*
_{
SS
} > *x*
_{
SS
}).

Now we can calculate the relative mating frequencies (denoted by *λ*) in our population using the DDT-R genotype frequency and male mating probabilities as follows,

$$ {\lambda}_{ij}={x}_i{y}_j $$

(4)

where the mating frequency subscripts are listed in the order female genotype, male genotype.

Next, DDT-R fitness effects (Additional file 1: Table S1) need to be incorporated into the model in order to predict the genotypic frequencies from one generation to the next. The relative numbers of each genotype eclosing in the next generation can then be calculated, taking into account the mating probabilities and fitness consequences as follows,

$$ \begin{array}{l}{n}_{RR}=FP\left({\lambda}_{RRRR}+\raisebox{1ex}{$1$}\!\left/ \!\raisebox{-1ex}{$2$}\right.{\lambda}_{RRRS}+\raisebox{1ex}{$1$}\!\left/ \!\raisebox{-1ex}{$2$}\right.{\lambda}_{RSRR}+\raisebox{1ex}{$1$}\!\left/ \!\raisebox{-1ex}{$4$}\right.{\lambda}_{RSRS}\right)\\ {}{n}_{RS}=FP\left({\lambda}_{RRSS}+\raisebox{1ex}{$1$}\!\left/ \!\raisebox{-1ex}{$2$}\right.{\lambda}_{RRRS}+\raisebox{1ex}{$1$}\!\left/ \!\raisebox{-1ex}{$2$}\right.{\lambda}_{RSRR}+\raisebox{1ex}{$1$}\!\left/ \!\raisebox{-1ex}{$2$}\right.{\lambda}_{RSRS}+\raisebox{1ex}{$1$}\!\left/ \!\raisebox{-1ex}{$2$}\right.{\lambda}_{RSSS}\right)+P\left({\lambda}_{SSRR}+\raisebox{1ex}{$1$}\!\left/ \!\raisebox{-1ex}{$2$}\right.{\lambda}_{SSRS}\right)\\ {}{n}_{SS}=F\left(\raisebox{1ex}{$1$}\!\left/ \!\raisebox{-1ex}{$4$}\right.{\lambda}_{RSRS}+\raisebox{1ex}{$1$}\!\left/ \!\raisebox{-1ex}{$2$}\right.{\lambda}_{RSSS}\right)+\raisebox{1ex}{$1$}\!\left/ \!\raisebox{-1ex}{$2$}\right.{\lambda}_{SSRS}+{\lambda}_{SSSS}\end{array} $$

(5)

where *F* = *f* × *e* × *l. f* is the relative fecundity of DDT-R females compared to susceptible females; *e* is the relative viability of eggs laid by DDT-R females; *l* is the relative viability of larvae of DDT-R females compared to susceptible females; and *P* is the relative pupal viability of DDT-R flies compared to susceptible flies. Thus, we effectively census the model population at the adult stage, with relative fitness accrued to males and females being a product of maternal and direct contributions of the *Cyp6g1* genotype as shown in Additional file 1: Table S1.

To obtain the frequency of the genotypes in the next generation we use the following recursions (which can be used via numerical simulation to predict genotype and allele frequencies at specific generations),

$$ {x}_i^{\prime }=\raisebox{1ex}{${n}_i$}\!\left/ \!\raisebox{-1ex}{${\Sigma}_i{n}_i$}\right. $$

(6)

Now we would like to examine the dynamics of the model, beginning with solving for frequency equilibria \( \left({\widehat{x}}_{RR},{\widehat{x}}_{RS},{\widehat{x}}_{SS}\right) \) by letting *x*′ = *x* for each genotype*.* Because the three genotype frequencies must necessarily sum to unity, this non-linear system is effectively a two-variable (*x*
_{
RR
}, *x*
_{
RS
}) model and is fully described by the first two genotype recursions. If we represent the functions *x*
^{′}_{
RR
}
and *x*
^{′}_{
RS
}
by *g*
_{
1
} and *g*
_{
2
}, respectively, then there are two conditions, namely \( {g}_1\left({\widehat{x}}_{RR},{\widehat{x}}_{RS}\right)={\widehat{x}}_{RR} \) and \( {g}_2\left({\widehat{x}}_{RR},{\widehat{x}}_{RS}\right)={\widehat{x}}_{RS} \) which must be satisfied simultaneously at any equilibrium. This was done to obtain an analytical equilibrium solution for DDT-R frequency, *x*
_{
R
} (see Additional file 1: equation (S1)).

All initial fitness parameter estimates were derived from previously conducted assays [23, 24]. The relative competitive male mating success, *m*, was derived as the number of mating trials won by resistant males divided by the number won by susceptible males. Relative fecundity, *f*, was derived by dividing the egg count of resistant females by that of susceptible females. The relative viability measures (*e, l, P*), were derived by dividing the resistant viability by the susceptible viability.

To simulate a prolonged period of pesticide selection, the model was initially run for 200 generations, starting at low DDT-R frequency (*x*
_{
RR
} = 0, *x*
_{
RS
} = 0.001) with all parameters set to default. This represents an initially susceptible population into which the DDT-R allele has been introduced at very low frequency and is allowed to go to an internal equilibrium, representing the situation prior to the use of DDT in the 1940s. After this initial phase a period of ‘DDT selection’ was added by introducing a viability advantage, *D* = 5, that is the mortality ratio of susceptible to resistant flies in the presence of DDT. This ratio is conservative compared to the DDT resistance ratios of Daborn et al. [35]. As the DDT resistance phenotype is dominant, this added viability advantage was assigned to both *RR* and *RS* flies. DDT selection was applied for 300 generations after which time *D* was set to zero and the model run until previous internal equilibrium was achieved.

### Empirical tests of the model in replicate experimental evolution populations

Our model gives specific predictions about the speed with which DDT-R alleles can invade a susceptible population and DDT-R frequency equilibria with the parameter settings employed. How well this describes changes in allele frequencies in real populations is uncertain. As a qualitative test of the model we set up replicate fly populations at known initial DDT-R frequencies and propagated them for five non-overlapping generations to examine DDT-R frequency trajectories over time.

*Canton-S* flies were supplied by Bloomington Stock Center in 2011 and were initially homozygous for the ancestral *Cyp6g1* allele (designated *Cyp6g1-M* by Schmidt et al. [36] and referred to as DDT-S herein) as confirmed by PCR [22]. For the purpose of introgression, we followed McCart et al. [24] in using *Hikone-R* flies (supplied by Bloomington Drosophila Stock Center at Indiana University, Indiana USA in 2011), which are homozygous for the most common resistance-associated *Cyp6g1* allele (designated *Cyp6g1-BA* in a previous study [36] and referred to herein as DDT-R) as confirmed using PCR [22].

DDT-R was introduced to the susceptible background by two replicate crosses each of 25 susceptible stock females × 25 *Hikone-R* males and the reciprocal 25 *Hikone-R* females × 25 susceptible stock males. The 50 flies for each replicate cross were placed in a 10 cm × 6 cm glass jar containing *Drosophila* Quick Mix Medium (Blades Biological Ltd, Edenbridge, Kent, UK), allowed to mate and oviposit for 72 hours and then moved on to a similarly prepared jar – each replicate was moved on twice to maximise offspring production. Immediately following removal of parental flies the inner surface of each jar was laced with DDT by pipetting 500 μl of 60 μg/ml DDT in acetone solution, and rolling until the acetone had fully evaporated. F1 larvae that survived and developed into adults were then backcrossed with the relevant susceptible stock as above. This backcrossing, combined with DDT selection, was carried out for five generations after which offspring were mated in individual pairs and allowed to lay eggs. The parents were then diagnosed for the presence of DDT-R alleles using PCR [22]. The offspring of homozygous DDT-R crosses were then used to found the corresponding DDT-R populations. All populations (DDT-R and DDT-S) were subsequently maintained in (side 30cm) population cages.

We then established eight low frequency (LF) populations (initial DDT-R allele frequency 10 %), two mid-frequency (MF) populations (initial DDT-R allele frequency 50 %) and two high frequency (HF) populations (initial DDT-R allele frequencies 90 %) as follows. Each population was started with two hundred three- to five-day old virgin flies at an even sex ratio with *Cyp6g1* genotypes at Hardy-Weinberg equilibrium frequencies (e.g. RR:RS:SS was 2:36:162 and 50:100:50 for LF and MF replicates, respectively). Populations were reared in vials (diameter 4.5 cm and height 12 cm) with adult flies left to mate and lay eggs for 72 hours, at which time the adults were removed, to limit larval density, and stored at −20 °C. Larvae were allowed to develop, pupate and eclose, and were collected as virgins for four days after initial eclosions. Eighty flies of each sex (n = 160) from the second and third day of eclosion were then haphazardly selected to act as parental flies for the next generation. Non-parental flies (i.e. offspring that were not members of the selected 160) were frozen. The process was repeated for four more generations, after which the populations were terminated. To determine the frequency of *Cyp6g1* genotypes at the end of this period about 50 individual 5th generation flies were analysed by PCR [3] for the presence/absence of the *Accord* LTR-inserted resistance allele.

We also used previously collected population-cage data [37]. The original aim of this population cage experiment was to determine if DDT-R conferred an overall pleiotropic fitness advantage at the population level. Two sets of population cages were established using either 50 RR 5-generation-backcrossed virgin females or males crossed to 50 SS males or 50 SS virgin females (RR × SS and SS × RR), respectively. For each set, three replicate cages were run for a total of six replicate populations (designated here as nHW populations). Flies were left to mate and lay eggs for 72 hours at which time the adults were removed to limit larval density. Following the emergence of the next generation, adult flies were collected for seven days and then used to found a new cage for the next generation. The populations were maintained in this manner for 10 generations. At each generation 80–120 adult offspring were taken from the transfer population to allow allele frequency estimation using PCR [22].