- Research article
- Open Access
Evolution favors protein mutational robustness in sufficiently large populations
BMC Biologyvolume 5, Article number: 29 (2007)
An important question is whether evolution favors properties such as mutational robustness or evolvability that do not directly benefit any individual, but can influence the course of future evolution. Functionally similar proteins can differ substantially in their robustness to mutations and capacity to evolve new functions, but it has remained unclear whether any of these differences might be due to evolutionary selection for these properties.
Here we use laboratory experiments to demonstrate that evolution favors protein mutational robustness if the evolving population is sufficiently large. We neutrally evolve cytochrome P450 proteins under identical selection pressures and mutation rates in populations of different sizes, and show that proteins from the larger and thus more polymorphic population tend towards higher mutational robustness. Proteins from the larger population also evolve greater stability, a biophysical property that is known to enhance both mutational robustness and evolvability. The excess mutational robustness and stability is well described by mathematical theory, and can be quantitatively related to the way that the proteins occupy their neutral network.
Our work is the first experimental demonstration of the general tendency of evolution to favor mutational robustness and protein stability in highly polymorphic populations. We suggest that this phenomenon could contribute to the mutational robustness and evolvability of viruses and bacteria that exist in large populations.
Proteins are quite tolerant of mutations, allowing evolution to produce highly diverged sequences that fold to similar structures and perform conserved biochemical functions [1, 2]. However, proteins with nearly identical structures and functions can differ in their robustness to mutation [3–5], as well as in their capacity to acquire new functions . The fact that mutational robustness and evolvability can vary among the functionally equivalent proteins produced by natural sequence divergence makes these properties important hidden dimensions in evolution - direct selection for protein function is blind to them, yet they can play a crucial role in enabling future evolution. Whether the evolutionary process somehow promotes the acquisition of mutational robustness and evolvability therefore remains a major question [6–8].
Previous experiments have identified several specific evolutionary conditions that can affect mutational robustness. For example, genetic complementation decreases the mutational robustness of viruses , while high mutation rates favor mutational robustness in simulated digital organisms . However, theory  makes the much broader - and previously experimentally untested - prediction that extra mutational robustness will arise quite generally in sufficiently large populations. This prediction cannot be understood in the standard framework of Kimura's neutral theory , because one of the usual assumptions of the neutral theory is that mutational robustness is constant. (Although Takahata  treated the consequences of stochastically fluctuating neutrality on the molecular clock, he did not describe how mutational robustness might change systematically during evolution.) However, changes in mutational robustness can be described by envisioning evolution as occurring on neutral networks, or sets of functionally equivalent proteins that are connected by single mutational steps [14–17]. In a seminal theoretical analysis of evolution on neutral networks, van Nimwegen and coworkers  predicted that the extent of mutational robustness should depend on the degree of population polymorphism. Here, we briefly summarize their reasoning, as it motivates our experimental work. We also refer the reader to chapter 16 of Wagner , which contains an excellent explanation of the densely mathematical work of van Nimwegen and coworkers .
If an evolving population is mostly monomorphic, then each mutation is either lost or goes to fixation before another mutation occurs. The population is therefore usually clustered at a single genotype and rarely experiences mutations, meaning that selection does not distinguish between genotypes of different mutational robustness. The evolving population can be envisioned a single walker on the neutral network, and although the population is less likely to move to poorly-connected nodes of the neutral network, when it does reach such nodes it will tend to remain "stuck" at them for long periods of time (the population behaves as in the "blind ant" walk described in ). As a result, a mostly monomorphic population occupies all neutral network nodes with equal probability . By contrast, a highly polymorphic population is always spread across many nodes of the neutral network. When mutations occur, the members of the population at highly connected nodes have a better chance of surviving, causing them to be favored by evolution and increasing the average mutational robustness [11, 17–20]. Specifically, a highly polymorphic population occupies each node with a probability proportional to its eigenvector centrality [11, 17], a measure of how connected it is to other connected nodes (a variant of eigenvector centrality is used by Google's PageRank algorithm to rank a webpage's importance in the network of internet links ). Figure 1A illustrates how mostly monomorphic and highly polymorphic populations are predicted to occupy a neutral network. The preference of highly polymorphic populations for more connected neutral network nodes leads to an increase in the average mutational robustness, as a node's connectivity is proportional to its robustness to single mutations.
For proteins, this preference for excess mutational robustness in highly polymorphic populations can also be seen in the stabilities of the evolved proteins . The basic idea is that selection for protein function imposes a roughly threshold requirement on protein stability, with proteins able to perform their biochemical functions if, and only if, they are more stable than some minimal threshold. Extra stability beyond the threshold confers no direct benefit on a protein's function, but it does increase the protein's mutational robustness by allowing to tolerate a wider range of destabilizing mutations (as has been experimentally verified for three different enzymes [3–5]). The preference for protein mutational robustness in highly polymorphic populations is therefore predicted to be manifested by higher average stability of proteins evolving in such populations . Figure 1B illustrates how proteins from highly polymorphic populations are predicted to be more stable than their counterparts from mostly monomorphic populations. Note that the extent of polymorphism depends on the product of the mutation rate and population size, meaning that proteins from populations of different sizes are predicted to evolve to different levels of mutational robustness and stability even if they experience the same mutation rate.
Results and Discussion
Design of neutral evolution experiment
To test whether high population polymorphism drives an increase in mutational robustness and protein stability, we performed laboratory evolution experiments on cytochrome P450 proteins. The basic idea was to neutrally evolve P450s under a constant selection pressure in populations that were either monomorphic or highly polymorphic, and observe whether the proteins evolved to different levels of mutational robustness and stability. The evolution experiments started with a P450 BM3 heme domain that had been engineered to hydroxylate 12-p-nitrophenoxydodecanoic acid (12-pNCA) . We imposed the selection criterion that Escherichia coli cells expressing the P450 had to yield lysate with enough active enzyme to hydroxylate a specified amount of 12-pNCA in 40 min. This criterion roughly corresponds to the case in which an enzyme must catalyze a biochemically relevant reaction at some minimal level in order for its host to survive. Note that other properties such as stability and expression level can vary freely, provided that the criterion for total activity is met.
The properties of a neutrally evolving protein eventually "equilibrate," much as the properties of an isolated physical system under some macroscopic constraint tend towards the values that maximize the system's internal entropy. For proteins, this usually means that stability, expression, and activity drift towards their lowest tolerable values, as the vast majority of random sequences do not encode stable, well expressed enzymes (that is, natural selection must work against sequence entropy to maintain a functional protein) [22, 24]. The initial P450 had been engineered for maximal activity , meaning that it was not equilibrated to the more mild selection criterion of the experiments. We therefore neutrally evolved this initial P450 for 16 generations, introducing random mutations with error-prone PCR and retaining all mutants that met the selection criterion for total activity on 12-pNCA. The procedure used for this equilibration evolution was similar to that used for the polymorphic neutral evolution described below. As expected, expression, stability, and activity all dropped during the equilibration evolution. At the end of the equilibration evolution, we chose a single sequence as the parent for the neutral evolution experiments. The gene encoding this parent sequence contained 29 nucleotide mutations and 13 amino acid mutations relative to the initial P450 (Additional file 1).
We used this parent gene to begin three parallel sets of neutral evolution experiments, which we named "monomorphic," "polymorphic," and "unselected" (Figure 2). The monomorphic experiments capture the case where the population moves as a single entity, the polymorphic experiment captures the case where the population spreads across many sequences, and the unselected experiments show how the gene evolves in the absence of selection for protein function. In all experiments, at each generation we used error-prone PCR to introduce an average of 1.4 nucleotide mutations per P450 gene (Table 1). The mutant genes were ligated into a plasmid and transformed into E. coli , and transformants were selected using the plasmid's antibiotic resistance marker. For the unselected case, we randomly picked one of the mutants, recovered the mutant gene with a plasmid mini-prep, and used this mutant as the template for the next generation of error-prone PCR. We performed four independent replicates of unselected evolution, evolving each for 12 generations.
For the monomorphic and polymorphic populations, we imposed the selection criterion that the P450s hydroxylate 12-pNCA with at least 75% of the total activity of the original parent gene. We expressed the P450s in E. coli, and then assayed the cell lysates for activity in a high-throughput 96-well plate format. The total amount of product produced by 80 μl of clarified lysate in 40 min was compared to the median of four control wells containing the original parent P450 to determine if the mutant met the selection criterion. The only difference between the monomorphic and polymorphic experiments was the size of the evolving populations. In the monomorphic limit, each mutation is either lost or goes to fixation before the next occurs. We enforced this evolutionary dynamic by holding the population size to a single protein sequence. At each generation, we assayed a single mutant. If this mutant met the selection criterion, then it was carried over to the next generation, corresponding to a neutral mutation going to fixation. If the mutant failed the selection criterion, then the population stayed at the previous sequence for the next generation, corresponding to a mutation lost to selection. The fact that we retained the previous sequence when a nonfunctional mutant was screened is critical, as it made the protocol correspond to the case of a mostly monomorphic population where the genotype is unchanged if a nonfunctional mutant is produced (if instead a functional variant was selected at each generation, the protocol would then correspond to the "myopic ant" walk of , and would no longer reproduce the behavior of a mostly monomorphic population). If all of the mutants assayed had zero or one mutations, then this protocol would correspond exactly to the "blind ant" walk of  or the N μ ≪ 1 equations of . However, in order to achieve appreciable sequence evolution on a laboratory time scale, we used a mutation rate that sometimes produced multiple mutations in a generation. We mathematically describe this situation in the Appendix; here we simply note that it is possible to think of each generation as introducing a single mutational event rather than a single mutation. We performed 22 independent replicates of monomorphic evolution, evolving each for 25 generations.
In the polymorphic limit, the population spreads across many sequences. To implement this experimentally, we assayed 435 mutants at each generation. The selection criterion was used to classify each mutant as functional or nonfunctional. In neutral evolution, all functional mutants reproduce with equal probability. We therefore pooled equal volumes of stationary-phase cultures of each functional mutant and recovered the pooled genes with a mini-prep. The polymorphic evolution experiment therefore approaches the equations of [11, 22], again with the exception that a sequence might undergo multiple mutations at a single generation. We give the equations describing this situation in the Appendix. The mutational robustness of a sufficiently large population is expected to evolve deterministically [11, 22], so we only performed a single polymorphic replicate (evidence that the experiment was near the deterministic regime is seen below from the fact that the mutational robustness was roughly constant). Because mutations accumulate more rapidly in the polymorphic experiments than the monomorphic ones, we evolved the polymorphic population for 15 generations rather than 25.
Mutations and mutational robustness
Figure 3 shows how mutations accumulated during the course of the neutral evolution experiments (full data are given in Table 2 and Additional file 2). As the unselected protein populations evolve without constraint, mutations accumulate at the same rate at which they are introduced by error-prone PCR, 1.4 nucleotide mutations per generation. Because selection eliminates mutations that disrupt P450 activity, mutations accumulate more slowly in the monomorphic and polymorphic populations. Mutations accumulate more rapidly in the polymorphic population than in the monomorphic populations. This difference in rates is predicted by the equations in the Appendix to be a consequence of the fact that the polymorphic population is more mutationally robust, and so can tolerate more of the possible mutations.
To test directly whether the polymorphic population evolves higher average mutational robustness, we measured the fraction of 435 random mutants that met the selection criterion. Figure 4 shows that the polymorphic population neutrally evolved to a markedly higher mutational robustness than the monomorphic populations, with 50 ± 2% of the final polymorphic population mutants continuing to function versus 39 ± 2% for the final monomorphic populations (Chi-square P-value of 10-3 that these values are the same). The only difference between the two types of populations was their size, so evolution has clearly favored mutational robustness in the larger and thus more polymorphic population. This finding represents the first experimental support for the prediction that highly polymorphic populations evolve excess mutational robustness .
Figure 4 also indicates that the experiments have proceeded for a sufficient number of generations for the mutational robustness to have equilibrated to around its average value. Such equilibration is important because the populations all started from a single parent sequence, and so will take some number of generations to lose their "memory" of this starting sequence. Once this memory is lost, the mutational robustness should remain relatively constant around its average value, as appears to be the case in Figure 4. This figure also supports the notion that the polymorphic population is sufficiently large to be relatively well described by the deterministic equations given in the Appendix, as the fluctuations in its mutational robustness are small relative to the overall difference compared to the monomorphic populations.
Theory predicts that the excess mutational robustness of a highly polymorphic protein population comes from increased protein stability . Because the P450 variants unfold irreversibly, an equilibrium thermodynamic stability ΔG f cannot be measured. We therefore determined stability to irreversible thermal and chemical denaturation, two highly correlated measures of P450 stability that have previously been shown to contribute to mutational robustness  (see Additional files 3, 4, and 5). Figure 5 shows that proteins from the polymorphic population were in fact more stable than their counterparts from the monomorphic population (statistical tests showing that this difference is significant are given in the figure legend). We also observed that proteins in the polymorphic population tended to accumulate to higher levels in E. coli (Figure 5). Elevated expression could be a byproduct of increased stability, or it could independently increase mutational robustness by allowing the proteins to better tolerate mutations that decrease codon adaptation or reduce folding efficiency. Changes in P450 catalytic efficiency did not appear to be a major mechanism for the observed differences in mutational robustness, as we did not see any evidence of systematic differences between the polymorphic and monomorphic populations in the number of 12-pNCA turnovers per enzyme (see the detailed analysis in  and in the Methods section of the present article). However, it is certainly possible that additional unrecognized biophysical factors contributed to the excess mutational robustness of the polymorphic population, although no such factors were immediately obvious.
Interpretation in terms of the P450 neutral network
The higher mutational robustness of the polymorphic population is due to the fact that it occupies the P450 gene neutral network differently than the monomorphic populations. Measurements from the evolution experiments can therefore be used to infer basic properties of the underlying neutral network of P450 genes, as originally noted by van Nimwegen and coworkers . In the Appendix, we derive approximations for the normalized principal eigenvalue 〈ν〉∞ and the normalized average connectivity 〈ν〉 o of the neutral network, where in both cases the normalization is obtained by dividing by the network coordination number. We obtain 〈ν〉∞ = 0.51 and 〈ν〉 o = 0.35 for the P450 gene neutral network. Our ability to consistently estimate these two parameters from four different experimental measurements supports the idea that the theory that we elaborate in the Appendix appropriately describes the experiments. The difference between 〈ν〉∞ and 〈ν〉 o is a measure of the extent to which some P450 neutral network nodes have more connections than others. We note that 〈ν〉∞ is approximately equal to the exponential decline parameter for the asymptotic decline in the fraction of functional mutants with increasing numbers of random nucleotide mutations [3, 27, 28] (see Appendix). Previous studies looking at this exponential decline have reported 〈ν〉∞ = 0.7 for subtilisin , 〈ν〉∞ = 0.7 for 3-methyladenine DNA glycosylase , and 〈ν〉∞ = 0.7 - 0.8 for TEM1 β-lactamase . These comparisons suggest that P450 has a sparser neutral network (smaller 〈ν〉∞) than these other proteins. We suspect, however, that these earlier studies (one of which is our own) overestimate 〈ν〉∞ due to insufficient equilibration of the starting sequence. We believe that the approach of the current work is more accurate for determining 〈ν〉∞ because the measurements are made after many mutations have equilibrated the initial sequence. This approach could be used in future experiments to compare the neutral network connectivities of proteins from different families.
We have demonstrated that neutral evolution favors more mutationally robust proteins when the evolving population is highly polymorphic. Strikingly, the excess mutational robustness is due only to population polymorphism, and so will arise in any population of sufficiently large size. Our work is the first experimental demonstration of this phenomenon, which is predicted to occur quite generally in neutrally evolving proteins and nucleic acids . Furthermore, we were able to identify one of the biophysical factors underlying the increase in mutational robustness by showing that proteins from the highly polymorphic population are more stable. We recognize, however, that evolution in a biological context will be more complex. In our experiments, fitness was considered as the P450's ability to be expressed in active form by bacteria grown to saturation in an environment with plentiful nutrients. Biological fitness, however, depends on numerous additional and subtle effects such as the metabolic costs of synthesis or the burdens imposed by misfolded molecules. Some mutations that are neutral in the experiments might therefore have deleterious effects in a biological setting . The experiments nonetheless capture the overriding constraint that proteins retain their biochemical functions. Our success in quantitatively explaining the results supports the notion that important aspects of protein evolution can be described simply in terms of mutational effects on stability [22, 29].
An obvious question is whether evolution in nature favors mutational robustness by the process we have demonstrated. Whether natural populations will evolve excess mutational robustness in their proteins depends on whether they are sufficiently polymorphic, which will be the case if the product of their effective population size N and per protein per generation mutation rate μ is much greater than one [11, 12]. Accurately estimating N μ, which is closely related to the widely used parameter θ in population genetics, for natural populations is difficult [30, 31] (note that as mutational robustness is a protein-wide property, the relevant mutation rate is per protein, which is ≈ 102 to 103 larger than the per codon mutation rate). For humans and other multicellular organisms, N μ is probably too small  for their proteins to neutrally evolve mutational robustness. But estimates [32, 33] place N μ ≈ 10 to 100 for typical-length proteins in bacteria, and it is probably much higher for many viruses [34, 35]. It therefore is likely that many viral and some bacterial proteins have evolved extra mutational robustness. It is important to note that this type of mutational robustness is due to changes in the internal properties (such as stability) of the proteins, and is limited by the "entropic force" caused by the constant rain of destabilizing mutations [22, 24] rather than by any direct organismal fitness cost of maintaining the mutational robustness. By contrast, some other mechanisms of mutational robustness (such as gene redundancy) impose direct organismal fitness costs, and so will not necessarily be favored in large populations .
The fact that evolution favors protein mutational robustness in sufficiently large populations might also contribute to adaptive evolution. Experiments have shown that extra stability increases a protein's evolvability by allowing it to tolerate a wider range of functionally beneficial but destabilizing mutations . A similar phenomenon seems to occur in natural evolution, where functionally neutral but stabilizing mutations can play a key role in adaptive evolution by counterbalancing the destabilizing effects of other functionally beneficial mutations . Viruses and perhaps bacteria might thus benefit from large population sizes and high mutation rates that drive an increase in the mutational robustness and stability of their proteins, which in turn enhances the capacity of these proteins to rapidly change their sequences and evolve new functions.
Equilibration evolution of the P450 protein
We began with a 21B3 P450 peroxygenase that had been engineered for highly efficient hydroxylation of 12-pNCA  (see Additional file 6). This P450 was not well equilibrated to the constant selection criterion that we planned to impose, because it had substantially higher total activity. We therefore neutrally evolved it for 16 generations in order to create P450s that were better equilibrated to the selection criterion. We evolved two parallel populations, which we named R1 and R2. The procedure was exactly identical to that described below for the polymorphic evolution with the following exceptions:
• Starting sequence: the starting sequence for the equilibration evolution was the 21B3 sequence.
• Population size: each of the two equilibration evolution populations had a size of 174 sequences rather than the 435 used for the polymorphic evolution.
• Selection criterion: the sequences were required to have at least 75% of the total activity of the 21B3 P450.
• Mutation rate: the mutation rate for the equilibration evolution was much higher than for the polymorphic evolution. The error-prone PCR protocol used 200 μM manganese chloride (MnCl2), rather than the 25 μM used for the polymorphic evolution. We estimate that this error-prone PCR protocol introduced ≈ 4 nucleotide mutations per P450 gene at each generation during the equilibration evolution.
We performed 16 generations of equilibration evolution, and then randomly selected 23 functional mutants from each of the R1 and R2 populations (see Additional file 7). We picked one of these mutants, R1-11, for use as the parent for the neutral evolution experiments.
Detailed protocol for evolution experiments
We began with the R1-11 P450 BM3 heme domain variant (see Additional file 1) cloned into the pCWori  plasmid with a 5' BamH1 and 3' EcoR1 site as described in . The cloning primers were pCWori_for (5'-GAAACAGGATCCATCGATGCTTAGGAGGTCAT-3') and pCWori_rev_clone (5'-GCTCATGTTTGACAGCTTATCATCG-3'). We used error-prone PCR to generate mutants, taking great care to make the error-prone PCR protocol repeatable by using a relatively small number of thermal cycles. This was both to control the mutation rate by ensuring that the reaction did not saturate the reagents, which would cause the number of doublings to become sensitive to the initial template concentration, and to avoid the PCR-based recombination events that can occur during with the last few thermal cycles of PCR reactions [38, 39]. The PCR reactions were 100 μl in volume, and contained ≈ 13 ng of plasmid template (corresponding to ≈ 3 ng of template gene), 7 mM MgCl2, 1 × Applied BioSystems PCR Buffer II without MgCl2, 25 μM MnCl2, 0.5 μM pCWori_for primer, 0.5 μM pCWori_rev primer, 200 μM of dATP and dGTP, 500 μm of dTTP and dCTP, and 5 units of Applied Biosystems AmpliTaq polymerase. The reactions were run on the BLOCK setting of a MJ Research PCR machine with a program of 95°C for 2 min, then 15 cycles of (95°C for 30 s, 57°C for 30 s, 72°C for 90 s), and then cooling to 4°C. This protocol yielded roughly 1-1.5 μg of product gene (as quantified by gel electrophoresis versus a known standard), for a PCR efficiency of ≈ 0.5. Sequencing the unselected populations at the end of the experiment indicated that this protocol introduced an average of 1.4 ± 0.2 nucleotide mutations, with the nucleotide error-spectrum shown in Table 1. Because the number of PCR doublings is large compared the average mutation rate, the distribution of mutations among sequences should be well-described by the Poisson distribution [40, 41].
The mutant genes from the error-prone PCR were purified over a ZymoResearch DNA clean and concentrator column, and digested at 37°C with EcoR1 and BamH1. The digested genes were then purified from an agarose gel with ZymoResearch DNA gel extraction columns, and ligated into pCWori plasmid that had been digested with BamH1 and EcoR1 and dephosphorylated. The ligations were transformed into electro-competent catalase-free Escherichia coli  (the catalase is removed because it breaks down the hydrogen peroxide utilized by the P450 peroxygenase), plated on Luria Broth (LB) plates containing 100 μg/ml of ampicillin to select for the plasmid's antibiotic resistance marker, and grown at 37°C. Transformation of a control ligation reaction without any digested gene yielded at least 100-fold fewer colonies, indicating that the rate of plasmid self-ligation was less than one percent.
Individual mutant colonies from the plates were picked into 96-well 2 ml deep-well plates containing 400 μl of LB supplemented with 100 μg/ml ampicillin. Each plate contained four parental control wells with cells carrying the parent R1-11 gene, four null control wells with cells carrying the pCWori plasmid without a P450 gene, and a non-inoculated well to check for contamination. For the polymorphic population, we picked five such plates with all 87 other wells containing different mutants for a total population size of 5 × 87 = 435 mutants. For the 22 monomorphic populations (we began with 24 populations but two had to be discarded due to contamination), we picked a single colony for growth and screening. For the unselected populations we picked a single colony for growth without screening. The LB deep-well plates were grown for 16-20 h at 30°C, 210 rpm, and 80% relative humidity in a Kuhner humidified shaker. To express the P450 mutants, we prepared 2 ml deep well plates containing 400 μl per well of terrific broth (TB) supplemented with 200 μM isopropyl β-D-thiogalactoside (IPTG), 100 μg/ml ampicillin, and 500 μM of d-aminolevulinic acid. We used a pipetting robot inoculated these TB plates with 100 μl from the LB plates. We stored the LB deep-well plates at 4°C, and grew the TB deep-well plates in the humidified shaker at 30°C, 210 rpm, and 80% relative humidity for 22-24 h. After this growth, the cells were harvested by centrifuging the TB deep-well plates at 4000 × g for 5 min and discarding of the liquid. The cell pellets were flash-frozen in liquid nitrogen to aid in cell lysis.
To lyse the cells for the assays, we resuspended the cell pellets in 300 μl of 100 mM [4-(2-hydroxyethyl)-1-piperazinepropanesulfonic acid] (EPPS) (pH 8.2) with 0.5 mg/ml lysozyme and 4 units/ml of deoxyribonuclease by pipetting 40 times with the pipetting robot. We then incubated the plates at 37°C for 30 min, again resuspended with the pipetting robot, and put back at 37°C for an addition 30 min. We then pelleted the cell debris by centrifugation at 6000 × g for 5 min at 4°C. The pipetting robot was used to dispense 80 μl of the clarified lysate into 96-well microtiter plates (Rainin). We prepared a 6 × stock of 1.5 mM 12-pNCA in 36% dimethyl sulfoxide (DMSO) and the EPPS buffer (the 12-pNCA was stored in the DMSO solution and combined with the buffer immediately before use). We used a multichannel pipette to add 20 μl of this substrate stock to each well of the microtiter plate. We briefly mixed the plates using the "shake" setting of a 96-well plate spectrophotometer, and read an absorbance baseline at 398 nm. We then immediately added 20 μl of a freshly prepared solution of 24 mM hydrogen peroxide in the EPPS buffer to initiate the reaction, and mixed again. The final reaction conditions were therefore the EPPS buffer with 6% DMSO, 4 mM hydrogen peroxide, and 250 μM 12-pNCA. After 40 min we quantified the amount of enzymatic product by the increase in absorbance at 398 nm. This absorbance increase is due to the 4-nitrophenolate molecule released after the P450 hydroxylates the twelfth carbon of the 12-pNCA molecule . To score the mutants as functional or nonfunctional, we compared their gain in absorbance minus the median null control reading to that of the median parental control reading minus the median null control reading. All mutants that had at least 75% of the parental gain were scored as functional, all other mutants were scored as nonfunctional.
We used the information from these assays to select the parents for the next generation. For the unselected population we did not require the mutants to be functional, so the selected mutant was used to start a 4 ml culture of LB with 100 μg/ml ampicillin, and the plasmid DNA was harvested with a mini-prep. This plasmid DNA was used as the template for the next round of error-prone PCR. Therefore, after the first generation the four unselected replicates diverged into four separate error-prone PCR reactions. These unselected replicates were evolved for a total of 12 generations, and were sequenced at every third generation.
For the polymorphic population, all mutants that were functional contributed an equal amount of plasmid DNA as template for the next generation. In order to do this, we collected 50 μl of the culture from the LB deep-well plate for each mutant that was scored as functional. All of these LB aliquots were pooled, and then the plasmid DNA was collected with a mini-prep. The pool of plasmid DNA was used as template for the next generation's error-prone PCR reactions. We performed 15 generations of evolution for this polymorphic population. Note that at each generation we are assaying 435 mutants as part of the evolutionary procedure, so this provides information on mutational robustness. At every third generation, we also selected a random sample of functional mutants for sequencing. After 15 generations, we randomly selected 22 mutants for stability measurements and sequencing analysis. The random selections were made from all functional mutants with the Python computer language random number generator.
For the monomorphic populations, at each generation we assayed just a single mutant. If that mutant was nonfunctional, then at that generation the population stayed at its original sequence. In that case, for the next generation we simply picked a new mutant from the previous generation's plate of transformed mutants. If the mutant we screened was functional, then that mutant represented the new population. We therefore grew a 4 ml LB culture with 100 μg/ml of ampicillin, and collected the plasmid DNA with a miniprep. That plasmid DNA was then used as the template for the next generation's error-prone PCR reaction. We thus had 22 (originally 24, two were subsequently contaminated) independent monomorphic populations that were being evolved in parallel. Each was evolved for 25 generations, and at the end of these 25 generations we measured the stability of the final sequence of each population. Each time an assayed mutant was functional, we sequenced the new P450 gene. We also measured the average mutational robustness of the monomorphic populations at every fifth generation. To do this, we did a pooled mini-prep of equal volumes of LB cultures of all 22 replicates to obtain a equal mix of plasmid DNA. We then performed error-prone PCR on this mix, and assayed 435 mutants to measure the fraction functional (see Additional file 2).
Test for recombination during error-prone PCR
During the polymorphic population evolution, we performed error-prone PCR on a mix of different plasmids. It is common for PCR on mixed templates to lead to recombination events during the reaction [38, 39]. We attempted to reduce this recombination by using a small number of thermal cycles. However, in order to test for recombination, we analyzed the sequences of the final 22 selected members of the polymorphic population. There are a variety of statistical tests to detect recombination in a set of sequences. A comparison of these tests by Posada  found that the Max-Chi2 method developed by John Maynard Smith  performs well. A publicly available implementation of this method  is at . We used this implementation to analyze the 22 final polymorphic sequences, and the resulting P-value was 0.29 after 100 random permutations, indicating that there is not significant recombination.
Measurement of P450 stabilities
We measured the stabilities to both irreversible thermal and irreversible urea denaturation of the final (generation 25) member of each monomorphic population, as well as of the 22 randomly selected members of the polymorphic population. As discussed in the supplementary information of , cytochrome P450 BM3 heme domains (and indeed most P450s) denature irreversibly, forcing us to use resistance to irreversible denaturation to quantify protein stability. The first stability measure is the T 50, defined as the temperature at which half of the protein is denatured after a 10 min incubation. The second stability measure is the [urea]50, defined as the urea concentration at which half of the protein denatures after a 4 h room-temperature incubation. Each set of measurements (those of T 50 and [urea]50) was performed on all of the mutants in the same day, and each mutant was treated identically. Therefore, it is possible to make accurate comparisons of the relative values of the measurements within the data set. However, the absolute values of the T 50 and [urea]50 values might be less accurate. Therefore, care should be taken in comparing the absolute value of these measurements to those of other studies (such as ).
Both the T 50 and [urea]50 measurements were performed in clarified cell lysate. The protein was expressed using catalase-free E. coli  containing the encoding gene on the IPTG inducible pCWori  plasmid. We used freshly streaked cells to inoculate 2 ml cultures of LB supplemented with 100 μg/ml of ampicillin, and grew these starter cultures overnight with shaking at 37°C. We then used 0.5 ml from these starter cultures to inoculate 1 litre flasks containing 200 ml of TB supplemented with 100 μg/ml of ampicillin. The TB cultures were grown at 30°C and 210 rpm until they reached an optical density at 600 nm of ≈ 0.9, at which point IPTG and d-aminolevulinic acid were added to a final concentration of 0.5 mM each. The cultures were grown for an additional 19 h, then the cells were harvested by pelletting 50 ml aliquots at 5,500 g and 4°C for 10 min, and stored at -20°C. To obtain clarified lysate, each pellet was resuspended in 8 ml of 100 mM EPPS (pH 8.2) and lysed by sonication, while being kept on ice. The cell debris was pelleted by centrifugation at 8,000 g and 4°C for 10 min, and the clarified lysate was decanted and kept on ice.
For the T 50 measurements, 125 μl of clarified lysate from a single mutant was added to all 12 wells in a row of a 96-well hard-shell thin-wall microplate (MJ Research). The plate was heated for 10 minutes using the gradient method of an Eppendorf Mastercycler gradient PCR machine, with the gradient set at either 33°C-45°C or 46°C-58°C (each mutant was exposed to both of these gradients), the machine on the BLOCK setting, and the heated lid set to 75°C with the lid WAIT option. The plate was then cooled to 4°C, removed from the PCR machine, and centrifuged at 5,500 g and 4°C for 5 min to pellet any debris. A pipetting robot was used to dispense 80 μl of the supernatent into a 96-well microtiter plate (Rainin), and the amount of remaining properly folded P450 was quantified from the carbon monoxide difference spectrum as described below. The T 50 values were determined by fitting sigmoidal curves the percent of remaining protein (see Additional file 3). Our ability to accurately compare T 50 values within the data set requires that each well in a given column of the gradient PCR machine be at the same temperature. We used a thermocouple to measure the temperature of the wells with the machine lid open, and confirmed that the wells were within a few tenths of a degree of the same temperature. Further evidence for the consistency of our T 50 values comes from the fact that two independent measurements of the T 50 for our R1-11 parent yielded values that differed by only 0.1°C. However, the absolute values of the measured temperatures are less accurate. Thermocouple measurements indicated that, with the machine lid open, the wells were ≈ 1°C cooler than the indicated temperature. We were unable to ascertain the temperatures with the heated lid closed, but based on comparisons water bath measurements, the temperatures with the lid closed slightly exceeded the indicated temperatures.
For the [urea]50 measurements, 125 μl of the clarified lysate from a single mutant was added to all 12 wells in a row of a 96-well microtiter plate. A pipetting robot was then used to add and mix 125 μl of a 2× solution of urea in 100 mM EPPS (pH 8.2) so that each subsequent column had a higher concentration of urea, and so that the final urea concentrations were those shown in Additional file 4. The plates were left on the bench at room temperature for 4 h, and the amount of remaining properly folded P450 was quantified from the carbon monoxide difference spectrum as described below. The [urea]50 values were determined by fitting sigmoidal curves to the percent of remaining protein. Evidence for the consistency of the [urea]50 measurements comes from the fact that two independent measurements of the [urea]50 for our R1-11 parent yielded values that differed by only 0.01 M. In addition, the [urea]50 and T 50 values are highly correlated (see Additional file 5), indicating that they provide consistent measures of stability.
For both the T 50 and [urea]50 measurements, the folded P450 was quantified from the carbon monoxide difference spectrum . The microtiter plates containing the P450 samples were first used to read blank spectra at 450 and 490 nm using a Tecan Safire 2 plate reader. The plates were then incubated for 10 min in an airtight oven with carbon monoxide. The plates were removed form the oven and 10 μl of 0.1 M sodium hydrosulfite in 1.3 M potassium phosphate (pH 8.0) was immediately added to each well. After 5-10 min, spectra were again read at 450 and 490 nm. The amount of P450 is proportional to the increase in the signal at 450 nm after this procedure minus the change in the signal at 490 nm.
Comparison of enzymatic substrate turnovers
Another possible source of difference between the P450s from the polymorphic and monomorphic populations is their catalytic efficiencies, measured as the total number of 12-pNCA substrate turnovers per enzyme. It was not possible to directly extract accurate values for enzymatic turnovers from the high-throughput screening procedures used in this study, because the neutral evolution selection criterion was set at a point where the assay readings were just beginning to saturate the linear range (i.e., this criterion was at a point where doubling the enzyme concentration led to a less than two-fold increase in the assay reading). However, we have recently completed a study that determined accurate per enzyme turnover values for most of the final neutrally evolved P450s from the polymorphic and monomorphic populations by constructing careful standard curves to ensure that values were taken from the fully linear range. This study analyzed the P450s on the substrate of 12-pNCA as well as a variety of "promiscuous" substrates; the paper is currently submitted and is publicly available as a preprint . This study measured the 12-pNCA turnovers for 18 of the final polymorphic P450 variants, and 16 of the final monomorphic P450 variants. The mean and standard deviations for the P450s from these two populations were 307 ± 88 and 385 ± 120 turnovers per enzyme, respectively, with experimental errors of about 10% (see  for full data). Based on these measurements, there do not appear to be substantial differences between the populations in the per enzyme turnovers on 12-pNCA.
A.1 Mathematical background
The first purpose of this appendix is to provide mathematical equations that describe the experiments. The second is to show how four measurements from the experiments can be used to calculate two quantities that describe the topology of the underlying protein neutral network. We will derive two equations for both quantitites, each in terms of a different measurement. The fact that the four equations will be seen to yield consistent results provides evidence for the accuracy of the following calculations.
Our calculations are based on a view of neutral protein evolution as a process constrained by a stability threshold, a view that we originally introduced to explain experimental protein mutagenesis results . The calculations closely parallel our earlier work , which is in turn based on a general theoretical treatment of evolution on neutral networks by van Nimwegen and coworkers . These calculations will probably be most thoroughly understood by first reading those works. The primary difference between the current calculations and  is that previously we assumed that the per generation per protein mutation rate μ was ≪ 1, so that at each generation a protein was either unmutated (with probability 1 - μ) or experienced a single mutation (with probability μ). In contrast, here we allow the mutation rate to be arbitrarily large, so that a protein can experience multiple mutations in a single generation (in this sense the calculations resemble the generalization by Wilke  of ). Specifically, let f m be the probability that a protein experiences m mutations in a single generation. Here we derive results for arbitrary f m , and then approximations relevant to the form of f m in the experiments. In the limiting case of small mutation rate (where f 0 = 1 - μ, f 1 = μ, and f m = 0 for m > 1), the calculations here reduce to those in . Proteins evolving in nature typically experience very low mutation rates, so  probably offers the best description of natural protein evolution. The calculations presented here are designed to specifically treat the evolutionary dynamics of the experiments.
A protein's thermodynamic stability is described by its free energy of folding, ΔG f , with more negative values indicating more stable proteins. As described in several previous papers [3, 5, 22], we assume that selection requires a protein to fold with some minimal stability , so that a protein adequately folds if and only if . The amount of extra stability a protein possesses relative to the stability threshold is given by ; note that all folded proteins will have . We further assume that as long as , selection is indifferent to the exact amount of extra stability that a protein possesses (see  for a discussion of the limitations of this assumption). We conceptually divide the continuous variable of protein stability into small discrete bins of width b. Specifically, a protein is in bin i if it has between (1 - i) b and -ib, where i = 1, 2, .... Mutating a protein changes its stability by an amount ΔΔG (defined as the stability of the mutant protein minus the stability of the initial protein), and so can move it to a new stability bin. In , we defined a matrix W with elements W ij giving the transition probabilities that a single mutation changes a protein's stability from bin j to bin i. We noted that W could be computed from the distribution of ΔΔG values for all single mutations, and argued that W remains fairly constant during neutral evolution as the distribution of ΔΔG values remains relatively unchanged. However, we emphasize that (as discussed in detail in ) the constancy of the ΔΔG distribution remains an assumption, albeit one that has now been shown to be quite accurate for lattice proteins [3, 22, 47] and provide a consistent theoretical explanation for a growing body of experimental results (the current work, and ).
Since we are allowing for larger mutation rates, and we must consider the possibility that a protein's stability might change due to multiple mutations at a single generation. Therefore, we make a more general definition of W ij,m as the probability that m random mutations to a protein in stability bin j change its stability to bin i, and let W m be the matrix with elements W ij,m . Note that W m only describes mutations that cause transitions from one folded protein to another, as the stability bins i = 1, 2, ... all correspond to folded proteins. As before , we assume that W m is roughly constant during evolution, meaning that the distribution of ΔΔG values for multiple mutations is roughly constant during neutral evolution. Note that if m = 1, then W m is just the matrix W that can be computed from the distribution of single-mutant ΔΔG values . We will now use the matrices W m to calculate the following characteristics of a population that has evolved to equilibrium: the distribution of stabilities, the average number of mutations 〈m〉 T accumulated after T generations, and the average fraction 〈〉 of stably folded proteins in the population. We then introduce a few approximations (that should be quite accurate for the experimental work in this paper) that greatly simplify these calculations. Finally, we relate the calculations to properties of the underlying protein neutral network.
As described generally by van Nimwegen and coworkers , the evolutionary dynamics depend on whether the evolving population tends to be monomorphic or highly polymorphic. When the per sequence per generation mutation rate μ is ≪ 1, whether the population is mostly monomorphic or highly polymorphic is determined by the product of the population size N and μ: when N μ ≪1 the population is mostly monomorphic, and when N μ ≫ 1 the population is highly polymorphic. However, with multiple mutations per generation, N μ is no longer an appropriate parameter to distinguish between mono- and polymorphism, because if the population size is sufficiently small the population can still be monomorphic even if there are multiple mutations per generation. Specifically, in one set of experiments we constrained the population to be monomorphic (by maintaining a population size of one), but still allowed the single protein in this population to experience more than one mutation at a generation. So we instead denote the populations as either monomorphic or polymorphic. We indicate quantities calculated for the monomorphic population by the subscript M (i.e. 〈〉 M ) and those calculated for the polymorphic population by the subscript P (i.e. 〈〉 P ).
A.2 Monomorphic limit
In the limit of a completely monomorphic population, all of the proteins are in a single stability bin. Let p i (t) be the probability that the population is in stability bin i at time t, and let p (t) be the column vector with elements p i (t). At each generation there is a probability f 0 that there is no mutation that becomes fixed in the population, a probability of that the population experiences a mutational event (that could be a single mutation or several simultaneous mutations) that moves it into bin i, and a probability that the population is in bin i and experiences one or more mutations that move it to another bin of stably folded proteins. Define to be the fraction of m-mutants of a protein in bin i that still fold, and let V m be the matrix with diagonal elements given by V ii, m = ν i, m and all other elements zero. The time evolution of p is:
where I is the identity matrix. Note that mutations that destabilize a protein beyond the stability threshold are immediately lost to natural selection, and so leave the population in its original stability bin. This describes the experiments for the monomorphic populations, where we retain the parental sequence if the single mutant we generate is nonfunctional. Equation 1 here corresponds to Equation (1) of , and the blind ant random walk described by van Nimwegen and coworkers .
Equation 1 describes a Markov process with a non-negative, irreducible, and acyclic transition matrix, and so p approaches a unique stationary distribution (equilibrium value) of p M given by the eigenvector equation:
Once p has reached equilibrium, the average fraction of proteins that still stably fold at each generation is:
where e = (1, ..., 1) is the unit row vector.
To calculate 〈m〉 T, M , the average number of mutations accumulated after T generations once the population has equilibrated, we note that at each generation there is a probability of that a randomly chosen protein is in bin j, experiences m mutations, and still stably folds. The average number of mutations accumulated in a single generation is simply the average of m weighted over this probability. So summing over all values of m and j, we see that:
This equation corresponds to Equation 6 of , which was derived using an embedded Markov process formalism. Here we have foregone this formalism for the more intuitive argument presented above, as we do not attempt to calculate higher moments of the number of mutations.
A.3 Polymorphic limit
In the limit when the population is highly polymorphic, at each generation there are sequences in many different stability bins. In this case, we describe the distribution of stabilities by the column vector x (t), with element x i (t) giving the fraction of proteins in stability bin i at time t. At generation t, the fraction of mutants that continue to fold is:
Therefore, in order to maintain a constant population size, each remaining protein must produce an average of α t = 〈〉 t -1 offspring. The population therefore evolves according to:
After the population evolves for a sufficiently long period of time, x will approach an equilibrium value of x P . At this equilibrium, the average fraction of mutants that fold at each generation is:
and the equilibrium reproduction rate is α = 〈〉 P -1. Therefore:
Equations 7 and 8 can be combined to show that x P and 〈〉 P can be calculated from the eigenvector equation:
with (〈〉 P - f 0) the principal eigenvalue of the nonnegative and irreducible matrix . Equation 9 corresponds to Equation 14 of , Equation 6 of the work by van Nimwegen and coworkers , and Equation 13 of the work by Wilke .
We now calculate 〈m〉 T, P , the average number of mutations accumulated in T generations after the population has equilibrated. At equilibrium, there is a probability of that a protein is in bin j, experiences m mutations, and still stably folds. Subsequently, all of these folded proteins produce an average of α offspring. The average number of mutations accumulated in a single generation is simply the average of m weighted over this probability, and then multiplied by the average reproduction rate. So summing over all values of m and j, we obtain:
A.4 Approximations for polymorphic limit
We can dramatically simplify the results from the previous sections with several reasonable approximations. The first approximation is that the ΔΔG values for random mutations are roughly additive, and is supported by a number of experimental studies of the thermodynamic effects of mutations [48–50]. We have previously shown that this approximation can be used to accurately describe experimental protein mutagenesis data with a simple stability threshold model . Under this approximation, the distribution of net ΔΔG values for multiple mutations can be computed from the distribution of ΔΔG values for single mutations by performing convolutions of the single-mutation ΔΔG distribution , meaning that W m for arbitrary m can be computed solely from the distribution of ΔΔG values for single mutations. However, to simplify the equations from previous sections, we need to express W m for arbitrary m only in terms of W (recall that W = W 1). As W only contains information about stability transitions from folded proteins to other folded proteins, if we make the second approximation that a protein that is destabilized beyond the minimal stability threshold by one mutation is not re-stabilized to a folded protein by a subsequent mutation, then W m = W m. This approximation that unfolded proteins are not re-stabilized should be quite accurate as stabilizing mutations tend to be relatively rare and small in magnitude [51–54] (this is the underlying idea behind the Markov chain approximation that was shown to be highly accurate for lattice proteins ). To summarize, if ΔΔG values are roughly additive and stabilizing mutations are rare, we have the approximation:
W m ≈ W m.
Simplifying the equations of the previous sections also requires assigning a specific functional form to f m , the probability that a sequence undergoes m mutations. Here we assume that mutations are Poisson distributed among sequences, so that:
where is the average number of mutations per protein per generation. When the mutations are introduced by error-prone PCR, the Poisson distribution is an excellent approximation to the true theoretical distribution of mutations created by error-prone PCR [40, 41] provided that μ is much less than the number of PCR doublings, as is the case in all of the experiments in the current work.
This equation makes clear that x P is the principal eigenvector of the matrix , therefore x P must also be the principal eigenvector of W. Now in our earlier work , we defined the principal eigenvector of W as x ∞, called the corresponding eigenvalue 〈ν〉∞, and showed that this eigenvalue is shown the average fraction of single mutations that are neutral in a population that is evolving with N μ ≫ 1 and μ ≪ 1. Therefore, with the approximation of Equation 11, x P and x ∞ are equal, and are both defined by the same eigenvector equation:
〈ν〉∞ x P = W x P = W x ∞ = 〈ν〉∞ x ∞.
Combining Equations 13 and 14, we have:
Equation 15 can be solved to yield:
Similarly, we can simplify Equation 10 to:
Solving this equation for 〈ν〉∞ yields:
A.5 Approximations for monomorphic limit
We now simplify the equations for the monomorphic limit. This requires several further approximations. We begin by approximating that the stability probability distribution p M given by Equation 2 by the distribution p o defined in  as satisfying
0 = (W - V)p o .
The basic rationale behind approximating p M with p o is that Equation 2 can be viewed as a perturbation to Equation 19 . Essentially, p o is an eigenvector of the matrix W - V while p M is the corresponding eigenvector of the matrix . The latter matrix can be viewed as a perturbation to the first, as the sum is small. This smallness is due to the fact that W mtends to zero with large m, causing V m to tend towards the identity matrix. In addition, the μ m/m! terms tend to zero with large m. Therefore, the terms in the summation are all simply either a perturbation to W - V or involve subtracting terms that are fractions of the identity matrix. The perturbations lead to bounded changes in the eigenvectors , while the identity matrix terms do not change the eigenvectors. Below we give a more rigorous justification of the assumption that p M is approximately equal to p o .
We need one additional approximation to make further progress. Both Equations 3 and 4 contain terms of the form W m p o , and even if we use Equation 11 to rewrite these terms as W m p o , there are no further clear simplifications. However, any probability vector that is multiplied repeatedly by W and normalized will eventually converge to x ∞ = x P (as this is the principal eigenvector of W). We make the approximation that this convergence is sufficiently rapid to be essentially complete after a single multiplication. This approximation is supported by both protein mutagenesis studies [3, 27, 28] that indicate that proteins rapidly converge to an exponential decline in the fraction folded (indicating the stability distribution has equilibrated, as discussed below, and by lattice protein studies showing the same [3, 47]. Therefore, we make the approximation that eW m p o = 〈ν〉 o eW m-1 x ∞ = 〈ν〉 o 〈ν〉∞ m-1 where 〈ν〉 o = eWp o has the same definition as in , where it was defined as the average fraction of functional single mutants of a population evolving with μ ≪ 1 and N μ ≪ 1.
We use these approximations to simplify Equation 3 to:
Solving this equation for 〈ν〉 o , we find:
We now use the approximations to simplify Equation 4 to:
Solving this equation for 〈ν〉 o yields:
To recap, we now have equations to calculate 〈ν〉∞ and 〈ν〉 o from experimentally measurable quantities. Equations 16 and 18 allow us to calculate 〈ν〉∞ from 〈〉 P and 〈m〉 T, P , respectively. Given this calculated value of 〈ν〉∞, Equations 21 and 23 then allow us to calculate 〈ν〉 o from 〈〉 M and 〈m〉 T, M , respectively. The fact that we have two equations each for 〈ν〉∞ and 〈ν〉 o allows us to assess the self-consistency of the approach.
A.6 Interpretation in terms of neutral networks
Throughout the preceding calculations, we have referred to 〈ν〉∞ and 〈ν〉 o as we defined them in : namely, as the average neutrality of protein populations evolving with μ ≪ 1 and N μ either ≫ 1 or ≪ 1, respectively. However, van Nimwegen and coworkers  have shown that they can also be interpreted in terms of the underlying neutral network. In the experiments we make mutations at the nucleotide (rather than amino acid) level, so each point in our sequence space corresponds to a different gene. Every gene that yields an amount of protein sufficient to hydroxylate the twelfth carbon of 12-p-nitrophenoxydodecanoic acid with at least 75% of the total activity conferred by the original R1-11 parent gene represents a node on this neutral network. We note that in the experiments (and also usually in natural evolution), the edges on the neutral network are not all completely equivalent or fully undirected, as some mutations are more likely to occur than others (for example, error-prone PCR with Taq polymerase is more likely to cause an A → G mutation than an A → C mutation). In the analysis that follows, we ignore this complication and assume all neutral network edges are equivalent.
In an extremely insightful analysis, van Nimwegen and coworkers  have shown that important characteristics of a neutral network can be inferred from evolutionary quantities. Specifically, they have shown that if a population is evolving with μ ≪ 1 and N μ ≫ 1, then the average neutrality (that we have denoted by 〈ν〉∞) is equal to the principal eigenvalue of the adjacency matrix of the neutral network, normalized by the network coordination number (number of possible connections per node). In addition, they pointed out that a population evolving with μ ≪ 1 and N μ ≪ 1 moves like a blind ant random walk, meaning that the average neutrality (that we have denoted by 〈ν〉 o ) is equal to the average connectivity of a neutral network node divided by the network coordination number. In our P450 experiments, we have measured the values needed to estimate 〈ν〉∞ and 〈ν〉 o using Equations 16, 18, 21, and 23. Using the final values listed in Table 2, 〈〉 P = 0.50 and 〈〉 M = 0.39. Taking the final nucleotide mutation values from Table 2, 〈m〉 T, P /T = 0.69 and 〈m〉 T, M /T = 0.31. The average mutation rate, computed from the unselected population, is μ = 1.40. So using Equation 16, 〈ν〉∞ = 0.53, while using Equation 18, 〈ν〉∞ = 0.49. The consistency of these two values supports the idea that the calculations above accurately describe the evolutionary process. Taking the average value of these two measurement as 〈ν〉∞ = 0.51, we can then use Equations 21 and 23 to calculate 〈ν〉 o . We calculate values of 0.28 and 0.43, respectively. These estimates differ by more than those for 〈ν〉∞, perhaps because additional approximations have gone into the derivation of the relevant equations (in addition, we have made no attempt to carry out the rather complex propagation of the sampling errors of Table 2). However, the values are still in a similar range. Taking the average of these two values, we estimate that 〈ν〉 o = 0.35. So overall, we predict that each functional P450 gene should have an average fraction of 0.35 of its sequence nearest neighbors also encoding a functional gene, for an average of about 1,500 neighbor genes. We predict that the principal eigenvalue of the neutral network adjacency matrix is 0.51 × 3L. The fact that principal eigenvalue exceeds the average connectivity indicates that the neutral network is not a regular graph, but instead has some nodes more highly connected than others.
The value for 〈ν〉∞ calculated above can also be related to measurements from protein mutagenesis experiments. A number of studies [3, 27, 28] have observed that the probability that a protein remains functional after m mutations falls off exponentially with the number of mutations. In fact, the decline is not always exponential for the first few mutations if the starting protein has especially high or low stability  or activity , but will still converge to this exponential form after a few mutations [3, 47, 57]. The stability threshold model can be used to relate this decline to 〈ν〉∞, as is performed indirectly in the Markov chain approximation of . Here we make that connection explicit. The initial protein has a stability that falls into some stability bin i. Therefore, its stability can be described by the column vector y 0 , which has element i equal to one and all other elements equal to zero. Now imagine constructing all single mutants of this protein. The fraction of these single mutants that still fold is just eWy 0 , and the distribution of stabilities among the single mutants is y 1 = Wy 0 (note that the elements of y 1 no longer sum to one). Similarly, after m mutations, the fraction of mutants that still fold is eW m y 0 , and the distribution of stabilities among the m-mutants is y m = W m y 0 . With the approximation of Equation 11, y m = W m y 0 . This makes it clear that y m will converge to a vector proportional to x ∞, the principal eigenvector of W. Once this convergence is complete, each new mutation simply reduces the fraction of mutants that fold by a factor of 〈ν〉∞, the principal eigenvalue of W (and the spectral radius of the neutral network normalized by the coordination number). Therefore, what we have called 〈ν〉∞ in the present work and  is equal to what is called x in , q in , and 〈ν〉 in . The major difficulty that is faced in extracting 〈ν〉∞ by the method of those three studies [3, 27, 28] is that it is not possible to directly assay mutants with m mutations, but instead only possible to assay a set of mutants with a distribution of m. All three studies use different (and valid) methods to account for this distribution, but this accounting is still difficult because most of the functional mutants come from the low m end of the distribution. This makes it difficult to get ascertain values for the fraction functional after large numbers of mutations, as most of the functional mutants in the set come from sequences with few mutations. For this reason, we believe the current method of measuring 〈ν〉∞ is more accurate. A second caution about comparing values of 〈ν〉∞ from different studies is that its value depends on the nucleotide error-spectrum of the experiment, as different mutagenesis methods create different distributions of nucleotide and amino acid mutation types.
We also briefly mention how we arrived at an estimate of 〈ν〉∞ for 3-methyladenine DNA glycosylase from the data of . This paper reports that a fraction x = 0.34 of amino acid mutations inactivate the protein. We would like to determine the fraction 〈ν〉∞ of nucleotide mutations that do not inactivate the protein. Roughly 75% of random mutations to a gene will be synonymous. Therefore, m amino acid mutations should cause about 4m/3 nucleotide mutations. The study of  measures that after m mutations, a fraction (1 - x)mof the mutants are functional. That means that 〈ν〉∞ 4m/3 fraction should be functional. Equating these expressions yields . So using x = 0.34, we arrive at 〈ν〉∞ = 0.73.
A.7 Detailed justification for approximating pM by po
Here we provide a detailed justification for the approximation that p M is about equal to p o . In the monomorphic limit, the time evolution of p is given by Equation 1, and the stationary distribution p M is given by Equation 2. We assume the approximations of Equations 11 and 12 and show that we can approximate p M by p o , where p o is given by Equation 19. To justify this approximation, we insert p o into the righthand side of Equation 1 and ask to what extent p o is left unaltered by the dynamics. If p o is found to be stationary to good approximation then, by uniqueness of the stationary distribution of an ergodic process, p o would be a good approximation to p M .
We therefore suppose that at some time t the distribution is given by p o and compute, using Equation 1, the change in p o after one generation. The new distribution at time t + 1 is given by:
Using (V - W) p o = 0, and taking components of the above equation, we obtain:
Thus p o would be an approximately stationary distribution of the dynamics if . We now proceed to show that this will be the case in most situations of interest by deriving upper and lower bounds on the second term of the righthand side of Equation 25.
Consider first the term (W m p o ) i , which can be written as:
where we have used Wp o = Vp o in the second equality. We now note that ν k ≤ ν max for all k, where ν max is the maximum neutrality, maximized over all bins. This leads to the successive inequalities:
yielding the upper bound:
In an identical manner, we obtain the lower bound:
where ν min is the smallest neutrality, minimized over all bins. Note that both inequalities above become exact equalities when all bins have the same neutrality ν, which could be interpreted as either ν min or ν max.
Having obtained inequality constraints on (W m p o ) i , we now consider the term (V m p o ) i , which can be written as:
that yields an identical upper bound to that on (W m p o ) i , namely:
It should again be noted that both the above inequalities become exact equalities when all bins have a common neutrality ν.
where the inequality above becomes an exact equality when all bins have the same neutrality. However, in this limit, the righthand side of the above equation vanishes, and therefore the second term of Equation 25 is identically zero in this case, giving the result that p M is exactly equal to p o when all bins have the same neutrality, even if μ is arbitrarily large.
We now carry out the sum over m to obtain an upper bound on the second term of Equation 25 in the more general and realistic case of unequal neutrality bins. Using Equation 34 and the specific Poisson form of f m , we obtain an upper bound on the fractional change in p 0i in one generation:
The above bound vanishes for small μ, is an increasing function of ν max - ν min, and is typically much smaller than 1. An extreme estimate of the size of the fractional change can be made when ν max = 1 and ν min = 0. In this case, using μ = 1.4 (the value in our experiments), the above inequality simplifies to:
Noting that ν i < 1, the fractional change in p 0i is therefore reasonably controlled even in the most extreme case. For realistic situations, the fractional change in p 0i is expected to be much lower, thus justifying the use of p o as the stationary distribution of the dynamics of Equation 1.
Zuckerkandl E, Pauling L: Evolutionary divergence and convergence in proteins. Evolving genes and proteins. 1965, New York, NY: Academic Press, 97-166.
Lesk AM, Chothia C: How different amino acid sequences determine similar protein structures: The structure and evolutionary dynamics of the globins. J Mol Biol. 1980, 136: 225-270. 10.1016/0022-2836(80)90373-3.
Bloom JD, Silberg JJ, Wilke CO, Drummond DA, Adami C, Arnold FH: Thermodynamic prediction of protein neutrality. Proc Natl Acad Sci USA. 2005, 102: 606-611. 10.1073/pnas.0406744102.
Besenmatter W, Kast P, Hilvert D: Relative tolerance of mesostable and thermostable protein homologs to extensive mutation. Proteins. 2007, 66: 500-506. 10.1002/prot.21227.
Bloom JD, Labthavikul ST, Otey CR, Arnold FH: Protein stability promotes evolvability. Proc Natl Acad Sci USA. 2006, 103: 5869-5874. 10.1073/pnas.0510098103.
Lenski RE, Barrick JE, Ofria C: Balancing robustness and evolvability. PLoS Biology. 2006, 4: e428-10.1371/journal.pbio.0040428.
Sniegowski PD, Murphy HA: Evolvability. Current Biology. 2006, 16: R831-R834. 10.1016/j.cub.2006.08.080.
Wagner A: Robustness and Evolvability in Living Systems. 2005, Princeton, New Jersey: Princeton University Press
Montville R, Froissart R, Remold SK, Tenaillon O, Turner PE: Evolution of mutational robustness in an RNA virus. PLoS Biol. 2005, 3: e381-10.1371/journal.pbio.0030381.
Wilke CO, Wang JL, Ofria C, Lenski RE, Adami C: Evolution of digital organisms at high mutation rates leads to survival of the flattest. Nature. 2001, 412: 331-333. 10.1038/35085569.
van Nimwegen E, Crutchfield JP, Huynen M: Neutral evolution of mutational robustness. Proc Natl Acad Sci USA. 1999, 96: 9716-9720. 10.1073/pnas.96.17.9716.
Kimura M: The Neutral Theory of Molecular Evolution. 1983, Cambridge, U.K.: Cambridge Univ. Press
Takahata N: On the overdispersed molecular clock. Genetics. 1987, 116: 169-179.
Smith JM: Natural selection and the concept of a protein space. Nature. 1970, 225: 563-564. 10.1038/225563a0.
Lipman DJ, Wilbur WJ: Modeling neutrality and selective evolution of protein folding. Proc R Soc London Ser B. 1991, 245: 7-11. 10.1098/rspb.1991.0081.
Huynen MA, Stadler PF, Fontana W: Smoothness within ruggedness: the role of neutrality in adaptation. Proc Natl Acad Sci USA. 1996, 93: 397-401. 10.1073/pnas.93.1.397.
Bornberg-Bauer E, Chan HS: Modeling evolutionary landscapes: mutational stability, topology, and superfunnels in sequence space. Proc Natl Acad Sci USA. 1999, 96: 10689-10694. 10.1073/pnas.96.19.10689.
Wilke CO: Adaptive evolution on neutral networks. Bull Math Biol. 2001, 63: 715-730. 10.1006/bulm.2001.0244.
Taverna DM, Goldstein RA: Why are proteins so robust to site mutations?. J Mol Biol. 2002, 315: 479-484. 10.1006/jmbi.2001.5226.
Xia Y, Levitt M: Funnel-like organization in sequence space determines the distributions of protein stability and folding rate preferred by evolution. Proteins. 2004, 55: 107-114. 10.1002/prot.10563.
Brin S, Page L: The anatomy of a large-scale hypertextual Web search engine. Computer Networks and ISDN Systems. 1998, 30 (1-7): 107-117. 10.1016/S0169-7552(98)00110-X.
Bloom JD, Raval A, Wilke CO: Thermodynamics of neutral protein evolution. Genetics. 2007, 175: 255-266. 10.1534/genetics.106.061754.
Cirino PC, Arnold FH: A self-sufficient peroxide-driven hydroxylation biocatalyst. Angew Chem Int Ed. 2003, 42: 3299-3301. 10.1002/anie.200351434.
Taverna DM, Goldstein RA: Why are proteins marginally stable?. Proteins. 2002, 46: 105-109. 10.1002/prot.10016.
Barnes HJ, Arlotto MP, Waterman MR: Expression and enzymatic activity of recombinant cy-tochrome P450 17 α-hydroxylase in Escherichia coli. Proc Natl Acad Sci USA. 1991, 88: 5597-5601. 10.1073/pnas.88.13.5597.
Bloom JD, Romero PA, Lu Z, Arnold FH: Neutral genetic drift can alter promiscuous protein functions, potentially aiding functional evolution. Biol Direct. 2007, 2: 17-10.1186/1745-6150-2-17.
Shafikhani S, Siegel RA, Ferrari E, Schellenberger V: Generation of large libraries of random mutants in Bacillus subtilis by PCR-based plasmid multimerization. BioTechniques. 1997, 23: 304-310.
Guo HH, Choe J, Loeb LA: Protein tolerance to random amino acid change. Proc Natl Acad Sci USA. 2004, 101: 9205-9210. 10.1073/pnas.0403255101.
DePristo MA, Weinreich DM, Hartl DL: Missense meanderings in sequence space: a biophysical view of protein evolution. Nat Rev Genetics. 2005, 6: 678-687. 10.1038/nrg1672.
Plotkin JB, Dushoff J, Desai MM, Fraser HB: Codon usage and selection in proteins. J Mol Evol. 2006, 63: 635-653. 10.1007/s00239-005-0233-x.
Berg OG: Selection intensity for codon bias and the effective population size of Escherichia coli. Genetics. 1996, 142: 1379-1382.
Lynch M, Conery JS: The origins of genome complexity. Science. 2003, 302: 1401-1404. 10.1126/science.1089370.
Hartl DL, Moriyama EN, Sawyer SA: Selection intensity for codon bias. Genetics. 1994, 138: 227-234.
Moya A, Holmes EC, Gonzalez-Candelas F: The population genetics and evolutionary epidemiology of RNA viruses. Nature Reviews Microbiology. 2004, 2: 279-288. 10.1038/nrmicro863.
Vignuzzi M, Stone JK, Arnold JJ, Cameron CE, Andino R: Quasispecies diversity determines pathogenesis through cooperative interactions in a viral population. Nature. 2006, 439: 344-348. 10.1038/nature04388.
Krakauer DC, Plotkin JB: Redundancy, antiredundancy, and the robustness of genomes. Proc Natl Acad Sci USA. 2002, 99: 1405-1409. 10.1073/pnas.032668599.
Wang X, Minasov G, Shoichet BK: Evolution of an antibiotic resistance enzyme constrained by stability and activity trade-offs. J Mol Biol. 2002, 320: 85-95. 10.1016/S0022-2836(02)00400-X.
Meyerhans A, Vartanian JP, Wain-Hobson S: DNA recombination during PCR. Nucleic Acids Res. 1990, 18: 1687-1691. 10.1093/nar/18.7.1687.
Kanagawa T: Bias and artifacts in multitemplate polymerase chain reactions (PCR). Journal of Bioscience and Bioengineering. 2003, 96: 317-323.
Sun F: The polymerase chain reaction and branching processes. J Comput Biol. 1995, 2: 63-86.
Drummond DA, Iverson BL, Georgiou F, Arnold FH: Why high-error-rate random mutagenesis libraries are highly enriched in functional and improved proteins. J Mol Biol. 2005, 350: 806-816. 10.1016/j.jmb.2005.05.023.
Posada D: Evaluation of methods for detecting recombination from DNA sequences: empirical data. Mol Biol Evol. 2002, 19: 708-717.
Smith JM: Analyzing the mosaic structure of genes. J Mol Evol. 1992, 34: 126-129.
Piganeau G, Gardner M, Eyre-Walker A: A broad survey of recombination in animal mitochondria. Mol Biol Evol. 2004, 21: 2319-2325. 10.1093/molbev/msh244.
RecombiTEST. [http://www.lifesci.sussex.ac.uk/CSE/test/maxchi.php ]
Otey CR: High-throughput carbon monoxide binding assay for cytochromes P450. Directed enzyme evolution: screening and selection methods, of Methods in Molecular Biology. Edited by: Arnold FH, Georgiou G. 2003, Humana press, 230: 137-139.
Wilke CO, Bloom JD, Drummond DA, Raval A: Predicting the tolerance of proteins to random amino acid substitution. Biophysical J. 2005, 89: 3714-3720. 10.1529/biophysj.105.062125.
Wells JA: Additivity of mutational effects in proteins. Biochemistry. 1990, 29: 8509-8517. 10.1021/bi00489a001.
Zhang XJ, Baase WA, Shoichet BK, Wilson KP, Matthews BW: Enhancement of protein stability by the combination of point mutations in T4 lysozyme is additive. Protein Eng. 1995, 8: 1017-1022. 10.1093/protein/8.10.1017.
Serrano L, Day AG, Fersht AR: Step-wise mutation of barnase to binase: a procedure for engineering increased stability of proteins and an experimental analysis of the evolution of protein stability. J Mol Biol. 1993, 233: 305-312. 10.1006/jmbi.1993.1508.
Godoy-Ruiz R, Perez-Jimenez R, Ibarra-Molero B, Sanchez-Ruiz JM: Relation between protein stability, evolution and structure as probed by carboxylic acid mutations. J Mol Biol. 2004, 336: 313-318. 10.1016/j.jmb.2003.12.048.
Pakula AA, Sauer RT: Genetic analysis of protein stability and function. Annu Rev Genet. 1989, 23: 289-310. 10.1146/annurev.ge.23.120189.001445.
Matthews BW: Structural and genetic analysis of protein stability. Annu Rev Biochem. 1993, 62: 139-160. 10.1146/annurev.bi.62.070193.001035.
Bava KA, Gromiha MM, Uedaira H, Kitajimi K, Sarai A: ProTherm, version 4.0: thermodynamic database for proteins and mutants. Nucleic Acids Res. 2004, 32: D120-D121. 10.1093/nar/gkh082.
Franklin JN: Matrix Theory. 1968, Englewood Cliffs, N.J.: Prentice-Hall, Inc
Bershtein S, Segal M, Bekerman R, Tokuriki N, Tawfik DS: Robustness-epistasis link shapes the fitness landscape of a randomly drifting protein. Nature. 2006, 444: 929-932. 10.1038/nature05385.
Bloom JD, Arnold FH, Wilke CO: Breaking proteins with mutations: threads and thresholds in evolution. Molecular Systems Biology. 2007, 3: 76-10.1038/msb4100119.
We thank Claus O Wilke for helpful advice and comments. JDB is supported by a HHMI predoctoral fellowship. ZL and DC were supported by Summer Undergraduate Research Fellowships from the California Institute of Technology. AR is supported by NSF grants CCF 0523643 and FIBR 0527023.
JDB and FHA designed the project and wrote the paper. JDB and ZL performed the bulk of the experiments; OSV assisted with the experiments. JDB and DC analyzed the data. JDB and AR performed the theoretical work. All authors read and approved the final manuscript.