Three sets of D. melanogaster lines resulting from long-term directional selection for stress tolerance were employed in our experiments: (1) three desiccation-resistant lines established by selection over 48 generations; (2) three lines tolerant to severe hypoxic stress, generated through long-term experimental selection (more than 200 generations); and (3) three hyperoxia-tolerant lines. Details of the experimental scheme for hypoxia-tolerance selection were provided elsewhere [81, 82]. Particulars of the selection for hyperoxia tolerance are described by Zhao et al. [83]. Selection for desiccation tolerance was performed by DDA.
Selection for desiccation tolerance
Wild individuals of D. melanogaster (n = 120) were collected in March 2009 from Madhya Pradesh, Jabalpur, India (23°30’N; 80°01’E; alt. 393 m). Before the start of the selection experiment, a mass culture was maintained for five generations under standard laboratory conditions at low density (on yeast-cornmeal-agar medium at 21 °C and ~70 % relative humidity) to eliminate environmental effects. For laboratory selection, virgin flies were sexed under CO2 anesthesia at least 48 h prior to the experiment. Then, virgin flies (3–4 days old) were placed in groups of 25 into plastic vials containing 2 g of silica gel and covered with foam discs. Experiments were conducted for males and females separately. Flies were subjected to desiccation stress until mortality reached approximately the LT70–LT85 level. Control groups were established in the same manner, excluding water stress. In each generation, we examined approximately 1,000 virgin flies of each sex per replicate, of which at least 100 males and 100 females survived the LT70–85 cut-off to become the parents of the next generation. For each group (selection and control), survivors were randomly allocated into three sub-groups (three replicates). The same protocol was repeated for 48 generations (each successive generation was subjected to the same treatment), and then selection was relaxed for 8–10 generations before initiating the recombination tests. The control lines were not subjected to any treatment and were maintained at densities comparable to those of the selection lines on standard media. In the present study, we used three control and three desiccation-resistant lines for recombination tests. Average desiccation tolerance of the initial population was 14.8 h and 23.2 h (SD = 2.88 and 3.44) for males and females, respectively. After 48 generations of selection, these values increased to 25.3 h and 43.6 h for males and females, respectively, i.e. by 3.65 SDs and 5.93 SDs relative to the starting population.
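The response to selection quoted above is the shift in mean tolerance expressed in units of the starting population's standard deviation. A minimal sketch of that arithmetic follows; the function name `response_in_sd` is ours and purely illustrative.

```python
def response_in_sd(initial_mean, selected_mean, initial_sd):
    """Shift of the mean, in units of the starting population's SD."""
    return (selected_mean - initial_mean) / initial_sd

# Mean desiccation tolerance (hours) and SDs as reported in the text.
males = response_in_sd(14.8, 25.3, 2.88)
females = response_in_sd(23.2, 43.6, 3.44)

print(f"males: {males:.2f} SD, females: {females:.2f} SD")
```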
Hypoxia- and hyperoxia-tolerant lines
Selection for hypoxia/hyperoxia tolerance was initiated after crossing 27 isofemale D. melanogaster lines (kindly provided by Dr. Andrew Davis) that varied considerably in an acute anoxia test as well as in eclosion rates when cultured under hypoxic or hyperoxic conditions. Males and virgin females (n = 20) were collected and pooled from each isofemale line. This parental population was reared at room temperature on standard food medium. F1 embryos from the pooled population were separated and maintained in nine separate chambers, three each for the control, hypoxia-selection and hyperoxia-selection experiments. Trial experiments were run to determine the starting O2 concentrations for hypoxia- and hyperoxia-tolerance selection. We analyzed the feasibility and tolerance capacity of the F1 progeny of the parental cross to different O2 concentrations (i.e. 8 %, 6 % or 4 % O2 for hypoxia selection and 60 %, 70 %, 80 % and 90 % O2 for hyperoxia selection). In addition, the tolerance level of each parental line to hypoxia or hyperoxia was measured by testing survival of each individual line in the hypoxic or hyperoxic environments. Based on this pilot study, selection for hypoxia tolerance was started at 8 % O2 and for hyperoxia tolerance at 60 % O2. The low O2 concentration was gradually decreased by 1 % and the high O2 concentration was increased by 10 % every 3 to 5 generations to maintain the selection pressure. The population size was kept at around 2,000 flies in each generation. Eggs of the first egg laying for each generation were removed to limit genetic drift induced by the ‘early-bird’ effect. After seven generations of selection, hyperoxia tolerance was increased to 80 % O2, and after 13 generations the hypoxia tolerance in the hypoxia-selected flies reached 5 % O2, a level that is lethal for most of the control flies (Additional file 2: Figure S3).
The hyperoxia-selected flies broke through the lethal hyperoxic level (90 % O2) after 13 generations of selection, and the hypoxia-selected flies exhibited tolerance to a severe level of hypoxia (4 % O2, embryonic lethal to control flies) following 32 generations of selection. The lethality in these selection experiments was defined as the level of oxygen in which D. melanogaster cannot complete development and reproduce.
Genetic crosses
Virgin females (3 days post-eclosion) of each control and selection line (three replicate lines each for control and selection groups) were allowed to mate with males of marker stocks (Additional file 2: Figure S1). Four marker stocks were employed (Additional file 2: Figure S2): y cv v f for the X chromosome; net dp b pk cn for the 2L arm; cn kn c px sp for the 2R arm; and ru h th cu sr e for chromosome 3. F1 heterozygous virgin females were collected for each replicate line, and thereafter test-crossed with marker males. Because maternal age may also influence rf in D. melanogaster, we reduced this effect by allowing the 50- to 60-hour-old (post-eclosion) F1 virgin females to mate with marker males for approximately 48 hours. To obtain a sufficient number of flies per replicate for scoring recombination, each replicate line was divided into three sub-replicates before the start of the recombination experiments. In this panel, we scored recombination in nine sub-replicates of three replicate lines each for control and selection. In the desiccation experiment, we scored 1,050 individuals of each replicate line (or 350 individuals per sub-replicate), i.e. a total of 6,300 flies were counted for estimation of rf on the X chromosome. We scored 750 individuals of each replicate line (or 250 individuals per sub-replicate), i.e. 4,500 individuals each were scored for arms 2L and 2R and chromosome 3. A total of 19,800 flies were counted for estimation of rf in the desiccation-selection experiment. Similarly, 750 flies per line, or a total of 27,000 flies, were scored for rf in the hypoxia/hyperoxia experiments. In the three experiments, we scored a total of 46,800 individuals.
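The fly counts above can be cross-checked with a short tally. This is only a sketch: the decomposition of the 27,000 flies into nine lines (three control, three hypoxia-selected, three hyperoxia-selected) scored for four marker regions is our reading of the text, not a breakdown stated by the authors, and all variable names are illustrative.

```python
# Desiccation experiment: 3 control + 3 selected replicate lines.
desiccation_lines = 6
x_chromosome = 1050 * desiccation_lines        # 350 flies x 3 sub-replicates per line
per_other_region = 750 * desiccation_lines     # 250 flies x 3 sub-replicates per line
desiccation_total = x_chromosome + 3 * per_other_region  # X + 2L + 2R + chromosome 3

# Hypoxia/hyperoxia experiments (assumed breakdown: 9 lines x 4 marker regions).
oxygen_lines = 9
oxygen_total = 750 * oxygen_lines * 4

grand_total = desiccation_total + oxygen_total
print(x_chromosome, per_other_region, desiccation_total, oxygen_total, grand_total)
```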
Statistical analysis
For each pair of intervals and each of the three control or selection lines, ML analysis was performed to estimate the recombination frequencies \( r1_k \) and \( r2_k \) together with the coefficient of coincidence \( c_k \) (k = 1,2,3). For a pair of intervals, either adjacent or non-adjacent, the log-likelihood function had the following form:
$$ \log \left(L\left(r{1}_k,r{2}_k,{c}_k\right)\right)=\sum_{ij,k}{n}_{ij,k} \log \left({p}_{ij,k}\left(\kern0.10em r{1}_k,r{2}_k,{c}_k\right)\right) $$
where i, j ∈ {0, 1} define whether the recombination event occurred in the first or second interval, respectively (0: no recombination, 1: recombination), k denotes the replicate line, and \( p_{ij,k} \) and \( n_{ij,k} \) represent the probability and the observed number of individuals of the genotype class ij in replicate line k in the backcross progeny (within control or selection). The frequencies for the four genotype classes were defined as:
$$ \begin{array}{l}{p}_{11,k}=r{1}_kr{2}_k{c}_k,\\ {p}_{01,k}=r{2}_k\left(1-r{1}_k{c}_k\right),\\ {p}_{10,k}=r{1}_k\left(1-r{2}_k{c}_k\right),\\ {p}_{00,k}=1-r{1}_k-r{2}_k+r{1}_kr{2}_k{c}_k.\end{array} $$
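As an illustration of how the four class frequencies enter the log-likelihood for a single line, and of one way to maximize it numerically, the following sketch runs plain gradient descent on the negative mean log-likelihood. This sign convention, the fixed step size, the starting point and the example counts are all our assumptions, not details of the original analysis. Because four classes and three parameters make the model saturated, the ML estimates also follow directly from the observed class proportions, which gives a convenient check.

```python
import math

def class_probs(r1, r2, c):
    """Expected frequencies of the genotype classes 11, 10, 01, 00."""
    return (r1 * r2 * c,
            r1 * (1 - r2 * c),
            r2 * (1 - r1 * c),
            1 - r1 - r2 + r1 * r2 * c)

def log_likelihood(counts, r1, r2, c):
    """Sum of n_ij * log(p_ij) over the four classes of one line."""
    return sum(n * math.log(p) for n, p in zip(counts, class_probs(r1, r2, c)))

def closed_form_mle(counts):
    """ML estimates read off the observed proportions (saturated model)."""
    n11, n10, n01, _ = counts
    total = sum(counts)
    r1 = (n11 + n10) / total
    r2 = (n11 + n01) / total
    return r1, r2, (n11 / total) / (r1 * r2)

def fit_gradient_descent(counts, start=(0.1, 0.1, 1.0), step=0.02, n_iter=20000):
    """Minimize the negative mean log-likelihood; all three parameters
    are updated simultaneously. No bounds are enforced, so the start
    must keep every class probability positive."""
    total = sum(counts)
    freqs = [n / total for n in counts]
    r1, r2, c = start
    for _ in range(n_iter):
        w = [f / p for f, p in zip(freqs, class_probs(r1, r2, c))]
        g_r1 = -(w[0] * r2 * c + w[1] * (1 - r2 * c)
                 - w[2] * r2 * c - w[3] * (1 - r2 * c))
        g_r2 = -(w[0] * r1 * c - w[1] * r1 * c
                 + w[2] * (1 - r1 * c) - w[3] * (1 - r1 * c))
        g_c = -(r1 * r2) * (w[0] - w[1] - w[2] + w[3])
        r1, r2, c = r1 - step * g_r1, r2 - step * g_r2, c - step * g_c
    return r1, r2, c

# Hypothetical counts for classes 11, 10, 01, 00 in one line of 750 flies.
counts = (15, 135, 135, 465)
direct = closed_form_mle(counts)
fitted = fit_gradient_descent(counts)
print(direct, fitted, log_likelihood(counts, *fitted))
```

Plain gradient descent converges slowly in c because the likelihood is much flatter in that direction than in r1 and r2, which is why the sketch uses many iterations.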
The ML estimate \( \widehat{{\boldsymbol{\uptheta}}_k} \) of the vector \( {\boldsymbol{\uptheta}}_k = (r1_k, r2_k, c_k) \) for k = 1,2,3 was obtained by numerical optimization of the log-likelihood function \( \mathrm{L}\left({\boldsymbol{\uptheta}}_k\right) \), using the gradient-descent procedure in which all three parameters \( r1_k \), \( r2_k \) and \( c_k \) are evaluated simultaneously in every iteration:
$$ \begin{array}{l}r{1}_{n+1,k}=r{1}_{n,k}-{\alpha}_{n+1}\frac{\partial \mathrm{L}\left({\boldsymbol{\uptheta}}_k\right)}{\partial r{1}_k}\\ {}r{2}_{n+1,k}=r{2}_{n,k}-{\alpha}_{n+1}\frac{\partial \mathrm{L}\left({\boldsymbol{\uptheta}}_k\right)}{\partial r{2}_k}\\ {}{c}_{n+1,k}={c}_{n,k}-{\alpha}_{n+1}\frac{\partial \mathrm{L}\left({\boldsymbol{\uptheta}}_k\right)}{\partial {c}_k}\end{array} $$
where n refers to the iteration number, k to the line (within control or selection), and α to the step size. The variances of the estimated parameters \( r1_k \), \( r2_k \), \( c_k \) were calculated as the corresponding diagonal elements of the covariance matrix \( {\mathbf{V}}_k = {\mathbf{I}}^{-1}\left({\widehat{\boldsymbol{\uptheta}}}_k\right) = {\mathbf{I}}_k^{-1} \), where I is Fisher’s information matrix [54]. The estimates of the parameter vector Θ = (r1, r2, c) for the entire group (control or selection), together with the vector \( {\mathbf{V}}_{\varTheta} \) of their variances, were obtained as:
$$ \widehat{\boldsymbol{\Theta}}=\frac{\sum_i{\mathbf{I}}_i\widehat{{\boldsymbol{\uptheta}}_i}}{\sum_i{\mathbf{I}}_i}\kern0.24em \mathrm{and}\;{\mathbf{V}}_{\varTheta }={\left(\sum_i{\mathbf{I}}_i\right)}^{-1} $$
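The information-weighted pooling above can be sketched in pure Python. The sketch assumes that the expected Fisher information of the four-class multinomial model is used for each line, and it reads the matrix ratio as the inverse of the summed information matrices applied to the information-weighted sum of the per-line estimates. The helper names and the per-line values are hypothetical.

```python
def class_probs(r1, r2, c):
    """Expected frequencies of the genotype classes 11, 10, 01, 00."""
    return (r1 * r2 * c, r1 * (1 - r2 * c),
            r2 * (1 - r1 * c), 1 - r1 - r2 + r1 * r2 * c)

def class_gradients(r1, r2, c):
    """Derivatives of each class probability w.r.t. (r1, r2, c)."""
    return [(r2 * c, r1 * c, r1 * r2),
            (1 - r2 * c, -r1 * c, -r1 * r2),
            (-r2 * c, 1 - r1 * c, -r1 * r2),
            (r2 * c - 1, r1 * c - 1, r1 * r2)]

def fisher_information(theta, n_flies):
    """Expected information matrix of one line scored on n_flies."""
    probs, grads = class_probs(*theta), class_gradients(*theta)
    return [[n_flies * sum(g[a] * g[b] / p for g, p in zip(grads, probs))
             for b in range(3)] for a in range(3)]

def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def pool_lines(thetas, infos):
    """Pooled estimate and its covariance matrix (sum of I_i)^-1."""
    total = [[sum(info[a][b] for info in infos) for b in range(3)]
             for a in range(3)]
    weighted = [sum(info[a][b] * t[b] for info, t in zip(infos, thetas)
                    for b in range(3)) for a in range(3)]
    cov = invert(total)
    pooled = [sum(cov[a][b] * weighted[b] for b in range(3)) for a in range(3)]
    return pooled, cov

# Hypothetical per-line estimates (r1, r2, c), each from 750 flies.
thetas = [(0.20, 0.20, 0.5), (0.22, 0.18, 0.6), (0.19, 0.21, 0.4)]
infos = [fisher_information(t, 750) for t in thetas]
pooled, cov = pool_lines(thetas, infos)
print(pooled, [cov[a][a] for a in range(3)])
```

In this saturated model the variance of r1 obtained from the inverse information matrix reduces to the familiar binomial form r1(1 - r1)/N, which is a useful sanity check on the matrices.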
This approach enables tests of the heterogeneity of the lines within selection and control groups, across the entire set of selection and control lines, and between selection and control groups, with respect to the estimated parameters. To assess the heterogeneity of the \( \widehat{{\boldsymbol{\uptheta}}_k} \) estimates of all three parameters (\( r1_k \), \( r2_k \), \( c_k \)) in k lines, we can use the following statistic, which is asymptotically distributed as \( \chi^2 \) with 3(k-1) degrees of freedom:
$$ {X}_{3\left(k-1\right)}^2={\displaystyle \sum_m{\left(\widehat{\varTheta}-\widehat{{\boldsymbol{\uptheta}}_m}\right)}^T{I}_m}\left(\widehat{\varTheta}-\widehat{{\boldsymbol{\uptheta}}_m}\right) $$
To assess heterogeneity of a single parameter p in k lines, the following statistic, asymptotically distributed as \( \chi^2 \) with df = k-1, can be used:
$$ {X}_{k-1}^2 = {\displaystyle \sum_m}\frac{{\left(\widehat{\varTheta}-\widehat{{\boldsymbol{\uptheta}}_m}\ \right)}^2}{\upsigma_{pm}^2} $$
where \( \widehat{{\boldsymbol{\uptheta}}_k} \) is the ML estimate of \( {\boldsymbol{\uptheta}}_k \), \( \sigma_{pk}^2 \) is the squared standard error of parameter p in the kth line, and \( \widehat{\varTheta} \) is the weighted mean of \( \widehat{{\boldsymbol{\uptheta}}_k} \). Using this weighted likelihood approach, we can present the total heterogeneity of \( \widehat{{\boldsymbol{\uptheta}}_k} \) across all lines of control and selection groups as:
$$ {X^2}_{\mathrm{total}\ \left(\mathrm{control}+\mathrm{selection}\right)} = {X^2}_{\mathrm{within}\ \left(\mathrm{control}\right)}+{X^2}_{\mathrm{within}\ \left(\mathrm{selection}\right)}+{X^2}_{\mathrm{between}\ \left(\mathrm{control}\ \mathrm{v}\mathrm{s}.\ \mathrm{s}\mathrm{election}\right).} $$
Thus, the significance of the difference between selection and control lines can be tested using the statistic:
$$ {X^2}_{\mathrm{between}\ \left(\mathrm{control}\ \mathrm{v}\mathrm{s}.\ \mathrm{s}\mathrm{election}\right)} = {X^2}_{\mathrm{total}\ \left(\mathrm{control}+\mathrm{selection}\right)} - {X^2}_{\mathrm{within}\ \left(\mathrm{control}\right)} - {X^2}_{\mathrm{within}\ \left(\mathrm{selection}\right)} $$
which is distributed approximately as \( \chi^2 \) with df = 1 under H0 {no difference between the compared groups (selection vs. control) for the parameter p}.
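For a single parameter, the partition of heterogeneity described above can be sketched as follows. Inverse-variance weights are assumed, the estimates and standard errors are hypothetical values chosen for illustration, and the function names are ours. For df = 1 the upper-tail χ² probability equals erfc(√(x/2)), so only the standard library is needed.

```python
import math

def weighted_mean(estimates, std_errors):
    """Inverse-variance weighted mean of per-line estimates."""
    weights = [1.0 / se ** 2 for se in std_errors]
    return sum(w * e for w, e in zip(weights, estimates)) / sum(weights)

def heterogeneity(estimates, std_errors):
    """Sum of squared standardized deviations from the weighted mean."""
    mean = weighted_mean(estimates, std_errors)
    return sum(((mean - e) / se) ** 2 for e, se in zip(estimates, std_errors))

def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variable with df = 1."""
    return math.erfc(math.sqrt(max(x, 0.0) / 2.0))

def between_groups(control, selection):
    """X2_between = X2_total - X2_within(control) - X2_within(selection).
    Each argument is a pair (estimates, standard errors)."""
    total = heterogeneity(control[0] + selection[0], control[1] + selection[1])
    within = heterogeneity(*control) + heterogeneity(*selection)
    between = total - within
    return between, chi2_sf_df1(between)

# Hypothetical estimates of c and their standard errors in three lines per group.
control = ([0.50, 0.55, 0.45], [0.10, 0.10, 0.10])
selection = ([0.80, 0.85, 0.90], [0.10, 0.10, 0.10])
x2_between, p_value = between_groups(control, selection)
print(x2_between, p_value)
```

With two groups, the between-group term is algebraically the squared difference of the two weighted means divided by the sum of their variances, which is why it carries a single degree of freedom.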
The importance of using this approach in testing the differences in interference derives from the fact that heterogeneity of recombination rates within the sample (e.g. between replicate lines of the selection group), with positive co-variation of recombination rates in two intervals, may lead to upward-biased estimates of c and even to c > 1 [63]. Therefore, to reduce the danger of such outcomes while testing for significance between control and selection lines in each of the three experiments, we employed, wherever possible, the weighted ML estimates of recombination (Additional file 3) and interference (Additional files 4 and 5) parameters in the weighted likelihood approach, in addition to the standard ML approach (see below). However, where \( \widehat{\theta_c} \), the estimate of c, was zero in one or more of the three control or selection lines, its standard error was also zero, thereby outweighing the estimates of c from the other two lines and leading to a zero weighted average for the selection or control group. Thus, for all the data we also employed the standard and more direct ML approach, allowing each line, in both selection and control, to have its own \( r1_k \) and \( r2_k \). Namely, to test for significance of the differences in c values between selection and control, we performed a log-likelihood ratio test of H0 {one global c for all selected and control lines} versus H1 {two c’s, one for all selected lines and one for all control lines}:
H1: {\( {\Theta}_{\mathrm{control}} = ({r}_{1c}, {r}_{2c}, {c}_c) \), \( {\Theta}_{\mathrm{selection}} = ({r}_{1s}, {r}_{2s}, {c}_s) \)} vs. H0: {\( {\Theta}_{\mathrm{control}} = ({r}_{1c}, {r}_{2c}, {c}_g) \), \( {\Theta}_{\mathrm{selection}} = ({r}_{1s}, {r}_{2s}, {c}_g) \)}, where the vectors \( r_{1c} \) and \( r_{2c} \) represent the unknown rf values for the analyzed pair of intervals in the three control lines, \( r_{1s} \) and \( r_{2s} \) are the corresponding vectors of rf values for the three selection lines, \( c_c \) and \( c_s \) are the line-independent values of the coefficient of coincidence for the control and selection groups, and \( c_g \) is the global c under the H0 assumption that \( c_s = c_c \). Therefore, the H1 and H0 hypotheses are specified by 14 and 13 parameters, respectively, and the log-likelihood ratio statistic for H1 versus H0 is asymptotically distributed as \( \chi^2 \) with df = 1.
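The parameter count and the resulting test can be sketched as below. The two maximized log-likelihood values are hypothetical placeholders; obtaining them requires fitting both models to the scored progeny, which is not shown. The function names are ours.

```python
import math

def count_parameters(n_control_lines, n_selection_lines, n_c):
    """Two rf values (r1, r2) per line plus n_c coefficients of coincidence."""
    return 2 * (n_control_lines + n_selection_lines) + n_c

def likelihood_ratio_test(loglik_h0, loglik_h1):
    """Return (statistic, df, p-value) for H1 {c_c, c_s} vs. H0 {c_g}."""
    df = count_parameters(3, 3, 2) - count_parameters(3, 3, 1)
    statistic = 2.0 * (loglik_h1 - loglik_h0)
    # Upper tail of chi-square with df = 1; this shortcut holds for df = 1 only.
    p_value = math.erfc(math.sqrt(max(statistic, 0.0) / 2.0))
    return statistic, df, p_value

# Hypothetical maximized log-likelihoods under H0 and H1.
statistic, df, p_value = likelihood_ratio_test(-1000.0, -997.0)
print(statistic, df, p_value)
```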
The obtained P values (for two-tailed tests) were subjected to false discovery rate correction for multiple comparisons before presentation in tables, figures and text. For the false discovery rate correction, we used a total of 48 comparisons across the three experiments (16 intervals in each) for the recombination rates, and 189 comparisons for the interference estimates.
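The text does not name the false discovery rate procedure that was used. As an assumption for illustration only, the sketch below applies the widely used Benjamini-Hochberg step-up adjustment to a small set of hypothetical P values.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw P values (the study itself used 48 and 189 comparisons).
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
print(adjusted)
```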