Evolution favors protein mutational robustness in sufficiently large populations

Background An important question is whether evolution favors properties such as mutational robustness or evolvability that do not directly benefit any individual but can influence the course of future evolution. Functionally similar proteins can differ substantially in their robustness to mutations and capacity to evolve new functions, but it has remained unclear whether any of these differences might be due to evolutionary selection for these properties. Results Here, we use laboratory experiments to demonstrate that evolution favors protein mutational robustness if the evolving population is sufficiently large. We neutrally evolve cytochrome P450 proteins under identical selection pressures and mutation rates in populations of different sizes, and show that proteins from the larger and thus more polymorphic population tend towards higher mutational robustness. Proteins from the larger population also evolve greater stability, a biophysical property that is known to enhance both mutational robustness and evolvability. The excess mutational robustness and stability is well described by mathematical theory, and can be quantitatively related to the way that the proteins occupy their neutral network. Conclusion Our work is the first experimental demonstration of the general tendency of evolution to favor mutational robustness and protein stability in highly polymorphic populations. We suggest that this phenomenon could contribute to the mutational robustness and evolvability of viruses and bacteria that exist in large populations.


Background
Proteins are quite tolerant of mutations, allowing evolution to produce highly diverged sequences that fold to similar structures and perform conserved biochemical functions [1,2].However, proteins with nearly identical structures and functions may differ in their robustness to mutation [3][4][5], as well as in their capacity to acquire new functions [5].The fact that mutational robustness and evolvability can vary among the functionally equivalent proteins produced by natural sequence divergence makes these properties important hidden dimensions in evolution -direct selection for protein function is blind to them, yet they can play a crucial role in enabling future evolution.Whether the evolutionary process somehow promotes the acquisition of mutational robustness and evolvability therefore remains a major question [6][7][8].
Previous experiments have identified several specific evolutionary conditions that can affect mutational robustness.For example, genetic complementation decreases the mutational robustness of viruses [9], while high mutation rates favor mutational robustness in simulated digital organisms [10].However, theory [11] makes the much broader -and heretofore experimentally untested -prediction that extra mutational robustness will arise quite generally in sufficiently large populations.This prediction cannot be understood in the standard framework of Kimura's neutral theory [12], since one of the usual assumptions of the neutral theory is that mutational robustness is constant.(Although Takahata [13] treated the consequences of stochastically fluctuating neutrality on the molecular clock, he did not describe how mutational robustness might change systematically during evolution.)However, changes in mutational robustness can be described by envisioning evolution as occurring on neutral networks, or sets of functionally equivalent proteins that are connected by single mutational steps [14][15][16][17].In a seminal theoretical analysis of evolution on neutral networks, van Nimwegen and coworkers [11] predicted that the extent of mutational robustness should depend on the degree of population polymorphism.Here we briefly summarize their reasoning, since it motivates our experimental work.We also refer the reader to chapter 16 of [8], which contains an excellent explanation of the densely mathematical work of van Nimwegen and coworkers [11].
If an evolving population is mostly monomorphic, then each mutation is either lost or goes to fixation before another mutation occurs.The population is therefore usually clustered at a single genotype and rarely experiences mutations, meaning that selection does not distinguish between genotypes of different mutational robustness.All nodes of the neutral network are thus equivalent and will be occupied by the population with equal probability [11].On the other hand, a highly polymorphic population is always spread across many nodes of the neutral network.When mutations occur, the members of the population at highly connected nodes have a better chance of surviving, causing them to be favored by evolution and increasing the average mutational robustness [11,[17][18][19][20]. Specifically, a highly polymorphic population occupies each node with a probability proportional to its eigenvector centrality [11,17], a measure of how connected it is to other connected nodes (a variant of eigenvector centrality is used by Google's PageRank algorithm to rank a webpage's importance in the network of internet links [21]).Figure 1A illustrates how mostly monomorphic and highly polymorphic populations are predicted to occupy a neutral network.For proteins, changes in neutral network occupancy should be manifested by changes in thermodynamic stability [22], with proteins from highly polymorphic populations predicted to be more stable than their counterparts from mostly monomorphic populations (Figure 1B).Note that the extent of polymorphism depends on the product of the mutation rate and population size, meaning that protein populations of different sizes are predicted to evolve to different levels of mutational robustness and stability even if they experience the same mutation rate.

Results and Discussion
Design of neutral evolution experiment To test whether high population polymorphism drives an increase in mutational robustness and protein stability, we performed laboratory evolution experiments on cytochrome P450 proteins.The basic idea was to neutrally evolve P450s under a constant selection pressure in populations that were either monomorphic or highly polymorphic, and observe whether the proteins evolved to different levels of mutational robustness and stability.The evolution experiments started with a P450 BM3 heme domain that had been engineered to hydroxylate 12-pnitrophenoxydodecanoic acid   [23].We imposed the selection criterion that Escherichia coli cells expressing the P450 had to yield lysate with enough active enzyme to hydroxylate a specified amount of 12-pNCA in 40 minutes.This criterion roughly corresponds to the case in which an enzyme must catalyze a biochemically relevant reaction at some minimal level in order for its host to survive.Note that other properties such as stability and expression level can vary freely, provided that the criterion for total activity is met.
The properties of a neutrally evolving protein eventually "equilibrate," much as the properties of an isolated physical system under some macroscopic constraint tend towards the values that maximize the system's internal entropy.For proteins, this usually means that stability, expression, and activity drift towards their lowest tolerable values, since the vast majority of random sequences do not encode stable, wellexpressed enzymes (that is, natural selection must work against sequence entropy to maintain a functional protein) [22,24].The initial P450 had been engineered for maximal activity [23], meaning that it was not equilibrated to the more mild selection criterion of the experiments.We therefore neutrally evolved this initial P450 for 16 generations, introducing random mutations with error-prone PCR and retaining all mutants that met the selection criterion for total activity on 12-pNCA.The procedure used for this equilibration evolution was similar to that for the polymorphic neutral evolution described below.As expected, expression, stability, and activity all dropped during the equilibration evolution.At the end of the equilibration evolution, we chose a single sequence as the parent for the neutral evolution experiments.The gene encoding this parent sequence contained 29 nucleotide mutations and 13 amino acid mutations relative to the initial P450 (Additional File 1).
We used this parent gene to begin three parallel sets of neutral evolution experiments, which we named "monomorphic," "polymorphic," and "unselected" (Figure 2).
The monomorphic experiments capture the case where the population moves as a single entity, the polymorphic experiment captures the case where the population spreads across many sequences, and the unselected experiments show how the gene evolves in the absence of selection for protein function.In all experiments, at each generation we used error-prone PCR to introduce an average of 1.4 nucleotide mutations per P450 gene (Table 1).The mutant genes were ligated into a plasmid and transformed into E. coli [25], and transformants were selected using the plasmid's antibiotic resistance marker.For the unselected case, we randomly picked one of the mutants, recovered the mutant gene with a plasmid mini-prep, and used this mutant as the template for the next generation of errorprone PCR.We performed four independent replicates of unselected evolution, evolving each for 12 generations.
For the monomorphic and polymorphic populations, we imposed the selection criterion that the P450s hydroxylate 12-pNCA with at least 75% of the total activity of the original parent gene.We expressed the P450s in E. coli, and then assayed the cell lysates for activity in a high-throughput 96-well plate format.The total amount of product produced by 80 µl of clarified lysate in 40 minutes was compared to the median of four control wells containing the original parent P450 to determine if the mutant met the selection criterion.The only difference between the monomorphic and polymorphic experiments was the size of the evolving populations.In the monomorphic limit, each mutation is either lost or goes to fixation before the next occurs.We enforced this evolutionary dynamic by holding the population size to a single protein sequence, similar to the "blind ant" random walk of [11].At each generation, we assayed a single mutant.If this mutant met the selection criterion, then it was carried over to the next generation, corresponding to a neutral mutation going to fixation.If the mutant failed the selection criterion, then the population stayed at the previous sequence for the next generation, corresponding to a mutation lost to selection.If all of the mutants assayed had zero or one mutations, then this protocol would correspond exactly to the equations of [11,22].However, in order to achieve appreciable sequence evolution on a laboratory time scale, we used a mutation rate that sometimes produced multiple mutations in a generation.We mathematically describe this situation in the Mathematical Appendix; here we simply note that it is possible to think of each generation as introducing a single mutational event rather than a single mutation.We performed 22 independent replicates of monomorphic evolution, evolving each for 25 generations.
In the polymorphic limit, the population spreads across many sequences.To implement this experimentally, we assayed 435 mutants at each generation.The selection criterion was used to classify each mutant as functional or nonfunctional.In neutral evolution, all functional mutants reproduce with equal probability.We therefore pooled equal volumes of stationary-phase cultures of each functional mutant and recovered the pooled genes with a miniprep.The polymorphic evolution experiment therefore approaches the equations of [11,22], again with the exception that a sequence may undergo multiple mutations at a single generation.We give the equations describing this situation in the Mathematical Appendix.Since the population evolves deterministically in the polymorphic limit [11,22], a single replicate was performed.Because mutations accumulate more rapidly in the polymorphic experiments than the monomorphic ones, we evolved the polymorphic population for 15 generations rather than 25.

Mutations and mutational robustness
Figure 3 shows how mutations accumulated during the course of the neutral evolution experiments (full data are in Table 2 and Additional File 2).Since the unselected protein populations evolve without constraint, mutations accumulate at the same rate at which they are introduced by error-prone PCR, 1.4 nucleotide mutations per generation.Because selection eliminates mutations that disrupt P450 activity, mutations accumulate more slowly in the monomorphic and polymorphic populations.Mutations accumulate more rapidly in the poly-morphic population than in the monomorphic populations.This difference in rates is predicted by the equations in the Mathematical Appendix to be a consequence of the fact that the polymorphic population is more mutationally robust, and so can tolerate more of the possible mutations.
To test directly whether the polymorphic population evolves higher average mutational robustness, we measured the fraction of 435 random mutants that met the selection criterion.Figure 4 shows that the polymorphic population neutrally evolved to a markedly higher mutational robustness than the monomorphic populations, with 50 ± 2% of the final polymorphic population mutants continuing to function versus 39 ± 2% for the final monomorphic populations (Chi-square P -value of 10 −3 that these values are significantly different).The only difference between the two types of populations was their size, so evolution has clearly favored mutational robustness in the larger and thus more polymorphic population.This finding represents the first experimental support for the prediction that highly polymorphic populations evolve excess mutational robustness [11].
Theory predicts that the excess mutational robustness of a highly polymorphic protein population comes from increased protein stability [22].Because the P450 variants unfold irreversibly, an equilibrium thermodynamic stability ∆G f cannot be measured.We therefore determined stability to irreversible thermal and chemical denaturation, two highly correlated measures of P450 stability that have previously been shown to contribute to mutational robustness [5] (see Additional Files 3, 4, and 5). Figure 5 shows that proteins from the polymorphic population were in fact more stable than their counterparts from the monomorphic population.We also observed that proteins in the polymorphic population tended to accumulate to higher levels in E. coli (Figure 5).Elevated expression could be a byproduct of increased stability, or it could independently increase mutational robustness by allowing the proteins to better tolerate mutations that decrease codon adaptation or reduce folding efficiency.It is possible that additional unrecognized biophysical factors also contributed to the excess mutational robustness of the polymorphic population, but no such factors were immediately obvious.
Interpretation in terms of the P450 neutral network The higher mutational robustness of the polymorphic population is due to the fact that it occupies the P450 gene neutral network differently than the monomorphic populations.Measurements from the evolution experiments can therefore be used to infer basic properties of the underlying neutral network of P450 genes, as originally noted by van Nimwegen and coworkers [11].In the Mathematical Appendix, we derive approximations for the normalized principal eigenvalue ν ∞ and the normalized average connectivity ν o of the neutral network, where in both cases the normalization is obtained by dividing by the network coordination number.We obtain ν ∞ = 0.51 and ν o = 0.35 for the P450 gene neutral network.Our ability to consistently estimate these two parameters from four different experimental measurements supports the idea that the theory that we elaborate in the Mathematical Appendix appropriately describes the experiments.The difference between ν ∞ and ν o is a measure of the extent to which some P450 neutral network nodes have more connections than others.We note that ν ∞ is approximately equal to the exponential decline parameter for the asymptotic decline in the fraction of functional mutants with increasing numbers of random nu-cleotide mutations [3,26,27] (see Mathematical Appendix).Previous studies looking at this exponential decline have reported ν ∞ = 0.7 for subtilisin [26], ν ∞ = 0.7 for 3-methyladenine DNA glycosylase [27], and ν ∞ = 0.7 -0.8 for TEM1 β-lactamase [3].These comparisons suggest that P450 has a sparser neutral network (smaller ν ∞ ) than these other proteins.We suspect, however, that these earlier studies (one of which is our own) overestimate ν ∞ due to insufficient equilibration of the starting sequence.We believe that the approach of the current work is more accurate for determining ν ∞ because the measurements are made after many mutations have equilibrated the initial sequence.This approach could be used in future experiments to compare the neutral network connectivities of proteins from different families.

Conclusions
We have demonstrated that neutral evolution favors more mutationally robust proteins when the evolving population is highly polymorphic.Strikingly, the excess mutational robustness is due only to population polymorphism, and so will arise in any population of sufficiently large size.Our work is the first experimental demonstration of this phenomenon, which is predicted to occur quite generally in neutrally evolving proteins and nucleic acids [11].Furthermore, we were able to identify one of the biophysical factors underlying the increase in mutational robustness by showing that proteins from the highly polymorphic population are more stable.We recognize that evolution in a biological context will be more complex.In our experiments, fitness was the P450's ability to be expressed in active form by bacteria grown to saturation in an environment with plentiful nutrients.Biological fitness, however, depends on numerous additional and subtle effects such as the metabolic costs of synthesis or the burdens imposed by misfolded molecules.Some mutations that are neutral in the experiments may therefore have deleterious effects in a biological setting [28].The experiments nonetheless capture the overriding constraint that proteins retain their biochemical functions.Our success in quantitatively explaining the results supports the notion that important aspects of protein evolution can be described simply in terms of mutational effects on stability [22,28].
An obvious question is whether evolution in nature favors mutational robustness by the process we have demonstrated.Whether natural populations will neutrally evolve mutational robustness depends on whether they are sufficiently polymorphic, which will be the case if the product of their effective population size N and per protein per generation mutation rate µ is much greater than one [11,12].Accurately estimating N µ, which is closely related to the widely used parameter θ in population genetics, for natural populations is difficult [29,30] (note that since mutational robustness is a protein-wide property, the relevant mutation rate is per protein, which is ≈ 10 2 to 10 3 larger than the per codon mutation rate).For humans and other multicellular organisms, N µ is probably too small [31] for their proteins to neutrally evolve mutational robustness.But estimates [31,32] place N µ ≈ 10 to 100 for typical-length proteins in bacteria, and it is probably much higher for many viruses [33,34].It is therefore likely that many viral and some bacterial proteins have neutrally evolved extra mutational robustness.
The neutral evolution of protein mutational robustness may also contribute to adaptive evolution.Experiments have shown that extra stability increases a protein's evolvability by allowing it to tolerate a wider range of functionally beneficial but destabilizing mutations [5].A similar phenomenon seems to occur in natural evolution, where functionally neutral but stabilizing mutations can play a key role in adaptive evolution by counterbalancing the destabilizing effects of other functionally beneficial mutations [35].Viruses and perhaps bacteria may thus benefit from large population sizes and high mutation rates that drive an increase in the mutational robustness and stability of their proteins, which in turn enhances the capacity of these proteins to rapidly change their sequences and evolve new functions.

Methods Equilibration evolution of the P450 protein
We began with a 21B3 P450 peroxygenase that had been engineered for highly efficient hydroxylation of 12-pNCA [23] (sequence shown in Additional File 6).This P450 was not well equilibrated to the constant selection criterion that we planned to impose, since it had substantially higher total activity.We therefore neutrally evolved it for 16 generations in order to create P450s that were better equilibrated to the selection criterion.We evolved two parallel populations, which we named R1 and R2.The procedure was exactly identical to that described below for the polymorphic evolution with the following exceptions: • Starting sequence: the starting sequence for the equilibration evolution was the 21B3 sequence.
• Population size: each of the two equilibration evolution populations had a size of 174 sequences rather than the 435 used for the polymorphic evolution.
• Selection criterion: the sequences were required to have at least 75% of the total activity of the 21B3 P450.
• Mutation rate: the mutation rate for the equilibration evolution was much higher than for the polymorphic evolution.The error-prone PCR protocol used 200 µM manganese chloride (MnCl 2 ), rather than the 25 µM used for the polymorphic evolution.We estimate that this error-prone PCR protocol introduced ≈ 4 nucleotide mutations per P450 gene at each generation during the equilibration evolution.
We performed 16 generations of equilibration evolution, and then randomly selected 23 functional mutants from each of the R1 and R2 populations (Additional File 7).We picked one of these mutants, R1-11, for use as the parent for the neutral evolution experiments.

3').
We used error-prone PCR to generate mutants, taking great care to make the errorprone PCR protocol repeatable by using a relatively small number of thermal cycles.This was both to control the mutation rate by ensuring that the reaction did not saturate the reagents (which would cause the number of doublings to become sensitive to the initial template concentration), and to avoid the PCR-based recombination events which can occur during with the last few thermal cycles of PCR reactions [36,37].The PCR reactions were 100 µl in volume, and contained ≈ 13 ng of plasmid template (corresponding to ≈ 3 ng of template gene), 7 mM magnesium chloride MgCl 2 , 1 × Applied BioSystems PCR Buffer II without MgCl 2 , 25 µM MnCl 2 , 0.5 µM pCWori for primer, 0.5 µM pCWori rev primer, 200 µM of dATP and dGTP, 500 µm of dTTP and dCTP, and 5 units of Applied Biosystems AmpliTaq polymerase.The reactions were run on the BLOCK setting of a MJ Research PCR machine with a program of 95 o C for 2 minutes, then 15 cycles of (95 o C for 30 seconds, 57 o C for 30 seconds, 72 o C for 90 seconds), and then cooling to 4 o C.This protocol yielded roughly 1-1.5 µg of product gene (as quantified by gel electrophoresis versus a known standard), for a PCR efficiency of ≈ 0.5.Sequencing the unselected populations at the end of the experiment indicated that this protocol introduced an average of 1.4 ± 0.2 nucleotide mutations, with the nucleotide error-spectrum shown in Table 1.Because the number of PCR doublings is large compared the average mutation rate, the distribution of mutations among sequences should be well-described by the Poisson distribution [38,39].
The mutant genes from the error-prone PCR were purified over a ZymoResearch DNA clean and concentrator column, and digested at 37 o C with EcoR1 and BamH1.The digested genes were then purified from an agarose gel with ZymoResearch DNA gel extraction columns, and ligated into pCWori plasmid that had been digested with BamH1 and EcoR1 and dephosphorylated.The ligations were transformed into electro-competent catalase-free Escherichia coli [25] (the catalase is removed because it breaks down the hydrogen peroxide utilized by the P450 peroxygenase), plated on Luria Broth (LB) plates containing 100 µg/ml of ampicillin to select for the plasmid's antibiotic resistance marker, and grown at 37 o C. Transformation of a control ligation reaction without any digested gene yielded at least 100-fold fewer colonies, indicating that the rate of plasmid self-ligation was less than one percent.
Individual mutant colonies from the plates were picked into 96-well 2 ml deep-well plates containing 400 µl of LB supplemented with 100 µg/ml ampicillin.Each plate contained four parental control wells with cells carrying the parent R1-11 gene, four null control wells with cells carrying the pCWori plasmid without a P450 gene, and a non-inoculated well to check for contamination.For the polymorphic population, we picked five such plates with all 87 other wells containing different mutants for a total population size of 5 × 87 = 435 mutants.For the 22 monomorphic populations (we began with 24 populations but two had to be discarded due to contamination), we picked a single colony for growth and screening.For the unselected populations we picked a single colony for growth without screening.The LB deep-well plates were grown for 16-20 hours at 30 o C, 210 revolutions per minute (rpm), and 80% relative humidity in a Kuhner humidified shaker.To express the P450 mutants, we prepared 2 ml deep well plates containing 400 µl per well of terrific broth (TB) supplemented with 200 µM isopropyl β-D-thiogalactoside (IPTG), 100 µg/ml ampicillin, and 500 µM of δ-aminolevulinic acid.We used a pipetting robot inoculated these TB plates with 100 µl from the LB plates.We stored the LB deep-well plates at 4 o C, and grew the TB deep-well plates in the humidified shaker at 30 o C, 210 rpm, and 80% relative humidity for 22-24 hours.After this growth, the cells were harvested by centrifuging the TB deep-well plates at 4000×g for 5 minutes and discarding the liquid.The cell pellets were flash-frozen in liquid nitrogen to aid in cell lysis.
To lyse the cells for the assays, we resuspended the cell pellets in 300 µl of 100 mM [4-(2-hydroxyethyl)-1-piperazinepropanesulfonic acid] (EPPS) (pH 8.2) with 0.5 mg/ml lysozyme and 4 units/ml of deoxyribonuclease by pipetting 40 times with the pipetting robot.We then incubated the plates at 37 o C for 30 minutes, again resuspended with the pipetting robot, and put back at 37 o C for an addition 30 minutes.We then pelleted the cell debris by centrifugation at 6000×g for 5 minutes at 4 o C. The pipetting robot was used to dispense 80 µl of the clarified lysate into 96-well microtiter plates (Rainin).We prepared a 6× stock of 1.5 mM 12-pNCA in 36% dimethyl sulfoxide (DMSO) and the EPPS buffer (the 12-pNCA was stored in the DMSO solution and combined with the buffer immediately before use).We used a multichannel pipette to add 20 µl of this substrate stock to each well of the microtiter plate.We briefly mixed the plates with "shake" setting of a 96-well plate spectrophotometer, and read an absorbance baseline at 398 nm.We then immediately added 20 µl of a freshly prepared solution of 24 mM hydrogen peroxide in the EPPS buffer to initiate the reaction, and mixed again.The final reaction conditions were therefore the EPPS buffer with 6% DMSO, 4 mM hydrogen peroxide, and 250 µM 12-pNCA.After 40 minutes we quantified the amount of enzymatic product by the increase in absorbance at 398 nm.This absorbance increase is due to the 4-nitrophenolate molecule released after the P450 hydroxylates the twelfth carbon of the 12-pNCA molecule [23].To score the mutants as functional or nonfunctional, we compared their gain in absorbance minus the median null control reading to that of the median parental control reading minus the median null control reading.All mutants that had at least 75% of the parental gain were scored as functional, all other mutants were scored as nonfunctional.We used the information from these assays to select the parents for the next generation.For the unselected population we did not require the mutants to be functional, so the selected mutant was used to start a 4 ml culture of LB with 100 µg/ml ampicillin, and the plasmid DNA was harvested with a mini-prep.This plasmid DNA was used as the template for the next round of error-prone PCR.Therefore, after the first gen-eration the four unselected replicates diverged into four separate error-prone PCR reactions.These unselected replicates were evolved for a total of twelve generations, and were sequenced at every third generation.
For the polymorphic population, all mutants that were functional contributed an equal amount of plasmid DNA as template for the next generation.In order to do this, we collected 50 µl of the culture from the LB deep-well plate for each mutant that was scored as functional.All of these LB aliquots were pooled, and then the plasmid DNA was collected with a mini-prep.The pool of plasmid DNA was used as template for the next generation's errorprone PCR reactions.We performed 15 generations of evolution for this polymorphic population.Note that at each generation we are assaying 435 mutants as part of the evolutionary procedure, so this provides information on mutational robustness.At every third generation, we also selected a random sample of functional mutants for sequencing.After 15 generations, we randomly selected 22 mutants for stability measurements and sequencing analysis.The random selections were made from all functional mutants with the Python computer language random number generator.
For the monomorphic populations, at each generation we assayed just a single mutant.If that mutant was nonfunctional, then at that generation the population stayed at its original sequence.In that case, for the next generation we simply picked a new mutant from the previous generation's plate of transformed mutants.If the mutant we screened was functional, then that mutant represented the new population.We therefore grew a 4 ml LB culture with 100 µg/ml of ampicillin, and collected the plasmid DNA with a miniprep.That plasmid DNA was then used as the template for the next genera-tion's error-prone PCR reaction.We thus had 22 (actually 24 before 2 were contaminated) independent monomorphic populations that were being evolved in parallel.Each was evolved for 25 generations, and at the end of these 25 generations we measured the stability of the final sequence of each population.Each time an assayed mutant was functional, we sequenced the new P450 gene.We also measured the average mutational robustness of the monomorphic populations at every fifth generation.To do this, we did a pooled mini-prep of equal volumes of LB cultures of all 22 replicates to obtain a equal mix of plasmid DNA.We then performed error-prone PCR on this mix, and assayed 435 mutants to measure the fraction functional.Full neutral evolution data are in Additional File 2.

Test for recombination during error-prone PCR
During the polymorphic population evolution, we performed error-prone PCR on a mix of different plasmids.It is common for PCR on mixed templates to lead to recombination events during the reaction [36,37].We attempted to reduce this recombination by using a small number of thermal cycles.However, in order to test for recombination, we analyzed the sequences of the final 22 selected members of the polymorphic population.There are a variety of statistical tests to detect recombination in a set of sequences.A comparison of these tests by Posada [40] found that the Max-Chi 2 method developed by John Maynard Smith [41] performs well.A publicly available implementation of this method [42] is at http://www.lifesci.sussex.ac.uk/CSE/test/maxchi.php.We used this implementation to analyze the 22 final polymorphic sequences, and the resulting P -value was 0.29 after 100 random permutations, indicating that there is not significant recombination.

Measurement of P450 stabilities
We measured the stabilities to both irreversible thermal and irreversible urea denaturation of the final (generation 25) member of each monomorphic population, as well as of the 22 randomly selected members of the polymorphic population.As discussed in the Supplementary Information of [5], cytochrome P450 BM3 heme domains (and indeed most P450s) denature irreversibly, forcing us to use resistance to irreversible denaturation to quantify protein stability.The first stability measure is the T 50 , defined as the temperature at which half of the protein is denatured after a 10 minute incubation.The second stability measure is the [urea] 50 , defined as the urea concentration at which half of the protein denatures after a 4 hour room-temperature incubation.Each set of measurements (those of T 50 and [urea] 50 ) was performed on all of the mutants in the same day, and each mutant was treated identically.Therefore, it is possible to make accurate comparisons of the relative values of the measurements within the data set.However, the absolute values of the T 50 and [urea] 50 values may be less accurate.Therefore, care should be taken in comparing the absolute value of these measurements to those of other studies (such as [5]).
Both the T 50 and [urea] 50 measurements were performed in clarified cell lysate.The protein was expressed using catalase-free E. coli [25] containing the encoding gene on the IPTG inducible pCWori [25] plasmid.We used freshly streaked cells to inoculate 2 ml cultures of LB supplemented with 100 µg/ml of ampicillin, and grew these starter cultures overnight with shaking at 37 o C. We then used 0.5 ml from these starter cultures to inoculate 1 L flasks containing 200 ml of TB supplemented with 100 µg/ml of ampicillin.The TB cultures were grown at 30 o C and 210 rpm until they reached an optical density at 600 nm of ≈0.9, at which point IPTG and δ-aminolevulinic acid were added to a final concentration of 0.5 mM each.The cultures were grown for an additional 19 hours, then the cells were harvested by pelletting 50 ml aliquots at 5,500 g and 4 o C for 10 min, and stored at -20 o C. To obtain clarified lysate, each pellet was resuspended in 8 ml of 100 mM EPPS (pH 8.2) and lysed by sonication, while being kept on ice.The cell debris was pelleted by centrifugation at 8,000 g and 4 o C for 10 minutes, and the clarified lysate was decanted and kept on ice.
For the T 50 measurements, 125 µl of clarified lysate from a single mutant was added to all 12 wells in a row of a 96-well hard-shell thin-wall microplate (MJ Research).The plate was heated for 10 minutes using the gradient method of an Eppendorf Mastercycler gradient PCR machine, with the gradient set at either 33 o C-45 o C or 46 o C-58 o C (each mutant was exposed to both of these gradients), the machine on the BLOCK setting, and the heated lid set to 75 o C with the lid WAIT option.The plate was then cooled to 4 o C, removed from the PCR machine, and centrifuged at 5,500 g and 4 o C for 5 minutes to pellet any debris.A pipetting robot was used to dispense 80 µl of the supernatent into a 96-well microtiter plate (Rainin), and the amount of remaining properly folded P450 was quantified from the carbon monoxide difference spectrum as described below.The T 50 values were determined by fitting sigmoidal curves the percent of remaining protein as shown in Additional File 3. Our ability to accurately compare T 50 values within the data set requires that each well in a given column of the gradient PCR machine be at the same temperature.We used a thermocouple to measure the temperature of the wells with the machine lid open, and confirmed that the wells were within a few tenths of a degree of the same temperature.Further evidence for the consistency of our T 50 values comes from the fact that two independent measurements of the T 50 for our R1-11 parent yielded values that differed by only 0.1 o C.However, the absolute values of the measured temperatures are less accurate.Thermocouple measurements indicated that, with the machine lid open, the wells were ≈ 1 o C cooler than the indicated temperature.We were unable to ascertain the temperatures with the heated lid closed, but based on comparisons water bath measurements, the temperatures with the lid closed slightly exceeded the indicated temperatures.
For the [urea] 50 measurements, 125 µl of the clarified lysate from a single mutant was added to all 12 wells in a row of a 96-well microtiter plate.A pipetting robot was then used to add and mix 125 µl of a 2X solution of urea in 100 mM EPPS (pH 8.2) so that each subsequent column had a higher concentration of urea, and so that the final urea concentrations were those shown in Additional File 4. The plates were left on the bench at room temperature for 4 hours, and the amount of remaining properly folded P450 was quantified from the carbon monoxide difference spectrum as described below.The [urea] 50 values were determined by fitting sigmoidal curves to the percent of remaining protein.Evidence for the consistency of the [urea] 50 measurements comes from the fact that two independent measurements of the [urea] 50 for our R1-11 parent yielded values that differed by only 0.01 M. In addition, the [urea] 50 and T 50 values are highly correlated (Additional File 5), indicating that they provide consistent measures of stability.
For both the T 50 and [urea] 50 measurements, the folded P450 was quantified from the carbon monoxide difference spectrum [43].The microtiter plates containing the P450 samples were first used to read blank spectra at 450 and 490 nm using a Tecan Safire 2 plate reader.The plates were then incubated for 10 minutes in an airtight oven with carbon monoxide.The plates were removed form the oven and 10 µl of 0.1 M sodium hydrosulfite in 1.3 M potassium phosphate (pH 8.0) was immediately added to each well.After 5-10 minutes, spectra were again read at 450 and 490 nm.The amount of P450 is proportional to the increase in the signal at 450 nm after this procedure minus the change in the signal at 490 nm.

A.3 Polymorphic limit 15
A.4 Approximations for polymorphic limit 16 A.5 Approximations for monomorphic limit 18 A.6 Interpretation in terms of neutral networks 20 A.7 Detailed justification for approximating p M by p o 22

A.1 Mathematical background
The first purpose of this appendix is to provide mathematical equations that describe the experiments.The second is to show how four measurements from the experiments can be used to calculate two quantities that describe the topology of the underlying protein neutral network.We will derive two equations for both quantitites, each in terms of a different measurement.The fact that the four equations will be seen to yield consistent results provides evidence for the accuracy of the following calculations.
Our calculations are based on a view of neutral protein evolution as a process constrained by a stability threshold, a view that we originally introduced to explain experimental protein mutagenesis results [3].The calculations closely parallel our earlier work [22], which is in turn based on a general theoretical treatment of evolution on neutral networks by van Nimwegen and coworkers [11].These calculations will probably be most thoroughly understood by first reading those works.The primary difference between the current calculations and [22] is that previously we assumed that the per generation per protein mutation rate µ was ≪ 1, so that at each generation a protein was either unmutated (with probability 1 − µ) or experienced a single mutation (with probability µ).In contrast, here we allow the mutation rate to be arbitrarily large, so that a protein may experience multiple mutations in a single generation (in this sense the calculations resemble the generalization by Wilke [18] of [11]).Specifically, let f m be the probability that a protein experiences m mutations in a single generation.Here we derive results for arbitrary f m , and then approximations relevant to the form of f m in the experiments.In the limiting case of small mutation rate (where f 0 = 1 − µ, f 1 = µ, and f m = 0 for m > 1), the calculations here reduce to those in [22].Proteins evolving in nature typically experience very low mutation rates, so [22] probably offers the best description of natural protein evolution.The calculations presented here are designed to specifically treat the evolutionary dynamics of the experiments.A protein's thermodynamic stability is described by its free energy of folding, ∆G f , with more negative values indicating more stable proteins.As described in several previous papers [3,5,22], we assume that selection requires a protein to fold with some minimal stability ∆G min f , so that a protein adequately folds if and only if ∆G f ≤ ∆G min f .The amount of extra stability a protein possesses relative to the stability threshold is given by ∆G extra f = ∆G f − ∆G min f ; note that all folded proteins will have ∆G extra f ≤ 0. We further assume that as long as ∆G extra f ≤ 0, selection is indifferent to the exact amount of extra stability that a protein possesses (see [22] for a discussion of the limitations of this assumption).We conceptually divide the continuous variable of protein stability into small discrete bins of width b.Specifically, a protein is in bin i if it has ∆G extra f between (1 − i) b and −ib, where i = 1, 2, . ... Mutating a protein changes its stability by an amount ∆∆G (defined as the stability of the mutant protein minus the stability of the initial protein), and so may move it to a new stability bin.In [22], we defined a matrix W with elements W ij giving the transition probabilities that a single mutation changes a protein's stability from bin j to bin i.We noted that W could be computed from the distribution of ∆∆G values for all single mutations, and argued that W remains fairly constant during neutral evolution since the distribution of ∆∆G values remains relatively unchanged.However, we emphasize that (as discussed in detail in [22]) the constancy of the ∆∆G distribution remains an assumption, albeit one that has now been shown to be quite accurate for lattice proteins [3,22,44] and provide a consistent theoretical explanation for a growing body of experimental results (the current work as well as [3]).
Since we are allowing for larger mutation rates, and we must consider the possibility that a protein's stability might change due to multiple mutations at a single generation.Therefore, we make a more general definition of W ij,m as the probability that m random mutations to a protein in stability bin j change its stability to bin i, and let W m be the matrix with elements W ij,m .Note that W m only describes mutations that cause transitions from one folded protein to another, since the stability bins i = 1, 2, . . .all correspond to folded proteins.As before [22], we assume that W m is roughly constant during evolution, meaning that the distribution of ∆∆G values for multiple mutations is roughly constant during neutral evolution.Note that if m = 1, then W m is just the matrix W that can be computed from the distribution of single-mutant ∆∆G values [22].We will now use the matrices W m to calculate the following characteristics of a population that has evolved to equilibrium: the distribution of stabilities, the average number of mutations m T accumulated after T generations, and the average fraction F of stably folded proteins in the population.We then introduce a few approximations (that should be quite accurate for the experimental work in this paper) that greatly simplify these calculations.Finally, we relate the calculations to properties of the underlying protein neutral network.
As described generally by van Nimwegen and coworkers [11], the evolutionary dynamics depend on whether the evolving population tends to be monomorphic or highly polymorphic.When the per sequence per generation mutation rate µ is ≪ 1, whether the population is mostly monomorphic or highly polymorphic is determined by the product of the population size N and µ: when N µ ≪ 1 the population is mostly monomorphic, and when N µ ≫ 1 the population is highly polymorphic.However, with multiple mutations per generation, N µ is no longer an appropriate parameter to distinguish between mono-and polymorphism, since if the population size is sufficiently small the population can still be monomorphic even if there are multiple mutations per generation.Specifically, in one set of experiments we constrained the population to be monomorphic (by maintaining a population size of one), but still allowed the single protein in this population to experience more than one mutation at a generation.So we instead denote the populations as either monomorphic or polymorphic.We indicate quantities calculated for the monomorphic population by the subscript M (i.e.F M ) and those calculated for the polymorphic population by the subscript P (i.e.F P ).

A.2 Monomorphic limit
In the limit of a completely monomorphic population, all of the proteins are in a single stability bin.Let p i (t) be the probability that the population is in stability bin i at time t, and let p (t) be the column vector with elements p i (t).At each generation there is a probability f 0 that there is no mutation that becomes fixed in the population, a probability of ∞ m=1 f m j W ij,m p j that the population experiences a mutational event (which could be a single mutation or several simultaneous mutations) that moves it into bin i, and a probability ∞ m=1 f m p i j W ji,m that the population is in bin i and experiences one or more mutations that move it to another bin of stably folded proteins.Define ν i,m = j W ji,m to be the fraction of m-mutants of a protein in bin i that still fold, and let V m be the matrix with diagonal elements given by V ii,m = ν i,m and all other elements zero.The time evolution of p is where I is the identity matrix.Note that mutations that destabilize a protein beyond the stability threshold are immediately lost to natural selection, and so leave the population in its original stability bin.This describes the experiments for the monomorphic populations, where we retain the parental sequence if the single mutant we generate is nonfunctional.Equation 1 corresponds to Equation (1) of [22], and the blind ant random walk described by van Nimwegen and coworkers [11].
Equation 1 describes a Markov process with a non-negative, irreducible, and acyclic transition matrix, and so p approaches a unique stationary distribution (equilibrium value) of p M given by the eigenvector equation ( Once p has reached equilibrium, the average fraction of proteins that still stably fold at each generation is where e = (1, . . ., 1) is the unit row vector.
To calculate m T,M , the average number of mutations accumulated after T generations once the population has equilibrated, we note that at each generation there is a probability of f m p j i W ij,m that a randomly chosen protein is in bin j, experiences m mutations, and still stably folds.The average number of mutations accumulated in a single generation is simply the average of m weighted over this probability.So summing over all values of m and j, we see that This equation corresponds to Equation ( 6) of [22], which was derived using an embedded Markov process formalism.Here we have foregone this formalism for the more intuitive argument presented above, since we do not attempt to calculate higher moments of the number of mutations.

A.3 Polymorphic limit
In the limit when the population is highly polymorphic, at each generation there are sequences in many different stability bins.In this case, we describe the distribution of stabilities by the column vector x (t), with element x i (t) giving the fraction of proteins in stability bin i at time t.At generation t, the fraction of mutants that continue to fold is Therefore, in order to maintain a constant population size, each remaining protein must produce an average of α t = F t −1 offspring.The population therefore evolves according to After the population evolves for a sufficiently long period of time, x will approach an equilibrium value of x P .At this equilibrium, the average fraction of mutants that fold at each generation is and the equilibrium reproduction rate is α = F P −1 .Therefore, Equations 7 and 8 can be combined to show that x P and F P can be calculated from the eigenvector equation with ( F P − f 0 ) the principal eigenvalue of the nonnegative and irreducible matrix Equation 9 corresponds to Equation ( 14) of [22], Equation ( 6) of the work by van Nimwegen and coworkers [11], and Equation ( 13) of the work by Wilke [18].
We now calculate m T,P , the average number of mutations accumulated in T generations after the population has equilibrated.At equilibrium, there is a probability of f m x j i W ij,m that a protein is in bin j, experiences m mutations, and still stably folds.Subsequently, all of these folded proteins produce an average of α offspring.The average number of mutations accumulated in a single generation is simply the average of m weighted over this probability, and then multiplied by the average reproduction rate.So summing over all values of m and j, we obtain This equation is the counterpart of Equation ( 18) of [22], where we have again foregone the embedded Markov process formalism for a more intuitive derivation.

A.4 Approximations for polymorphic limit
We can dramatically simplify the results from the previous sections with several reasonable approximations.The first approximation is that the ∆∆G values for random mutations are roughly additive, and is supported by a number of experimental studies of the thermodynamic effects of mutations [45][46][47].We have previously shown that this approximation can be used to accurately describe experimental protein mutagenesis data with a simple stability threshold model [3].Under this approximation, the distribution of net ∆∆G values for multiple mutations can be computed from the distribution of ∆∆G values for single mutations by performing convolutions of the singlemutation ∆∆G distribution [3], meaning that W m for arbitrary m can be computed solely from the distribution of ∆∆G values for single mutations.However, to simplify the equations from previous sections, we need to express W m for arbitrary m only in terms of W (recall that W = W 1 ).Since W only contains information about stability transitions from folded proteins to other folded proteins, if we make the second approximation that a protein that is destabilized beyond the minimal stability threshold by one mutation is not re-stabilized to a folded protein by a subsequent mutation, then W m = W m .This approximation that unfolded proteins are not re-stabilized should be quite accurate since stabilizing mutations tend to be relatively rare and small in magnitude [48][49][50][51] (this is the underlying idea behind the Markov chain approximation that was shown to be highly accurate for lattice proteins [44]).To summarize, if ∆∆G values are roughly additive and stabilizing mutations are rare, we have the approximation Simplifying the equations of the previous sections also requires assigning a specific functional form to f m , the probability that a sequence undergoes m mutations.Here we assume that mutations are Poisson distributed among sequences, so that where µ = ∞ m=0 mf m is the average number of mutations per protein per generation.When the mutations are introduced by error-prone PCR, the Poisson distribution is an excellent approximation to the true theoretical distribution of mutations created by error-prone PCR [38,39] provided that µ is much less than the number of PCR doublings, as is the case in all of the experiments in the current work.
We now use the approximations of Equations 11 and 12 to simplify the results given above for the highly polymorphic limit.We begin by using these approximations to rewrite Equation 9as This equation makes clear that x P is the principal eigenvector of the matrix m! W m , therefore x P must also be the principal eigenvector of W. Now in our earlier work [22], we defined the principal eigenvector of W as x ∞ , called the corresponding eigenvalue ν ∞ , and showed that this eigenvalue is shown the average fraction of single mutations that are neutral in a population that is evolving with N µ ≫ 1 and µ ≪ 1.Therefore, with the approximation of Equation 11, x P and x ∞ are equal, and are both defined by the same eigenvector equation, Combining Equations 13 and 14 we have, Equation 15 can be solved to yield Similarly, we can simplify Equation 10, Solving this equation for ν ∞ yields A.5 Approximations for monomorphic limit We now simplify the equations for the monomorphic limit.This requires several further approximations.We begin by approximating that the stability probability distribution p M given by Equation 2 by the distribution p o defined in [22] as satisfying The basic rationale behind approximating p M with p o is that Equation 2 can be viewed as a perturbation to Equation 19 [52].Essentially, p o is an eigenvector of the matrix W − V while p M is the corresponding eigenvector of the matrix ).The latter matrix can be viewed as a perturbation to the first, since the sum ) is small.This smallness is due to the fact that W m tends to zero with large m, causing V m to tend towards the identity matrix.In addition, the µ m /m! terms tend to zero with large m.Therefore, the terms in the summation are all simply either a perturbation to W − V or involve subtracting terms that are fractions of the identity matrix.The perturbations lead to bounded changes in the eigenvectors [52], while the identity matrix terms do not change the eigenvectors.Below we give a more rigorous justification of the assumption that p M is approximately equal to p o .
We need one additional approximation to make further progress.Both Equations 3 and 4 contain terms of the form W m p o , and even if we use Equation 11to rewrite these terms as W m p o , there are no further clear simplifications.However, any probability vector that is multiplied repeatedly by W and normalized will eventually converge to x ∞ = x P (since this is the principal eigenvector of W).We make the approximation that this convergence is sufficiently rapid to be essentially complete after a single multiplication.This approximation is supported by both protein mutagenesis studies [3,26,27] that indicate that proteins rapidly converge to an exponential decline in the fraction folded (indicating the stability distribution has equilibrated, as discussed below, and by lattice protein studies showing the same [3,44].Therefore, we make the approximation that where ν o = eWp o has the same definition as in [22], where it was defined as the average fraction of functional single mutants of a population evolving with µ ≪ 1 and N µ ≪ 1.
We use these approximations to simplify Equation 3as Solving this equation for ν o , we find We now use the approximations to simplify Equation 4as Solving this equation for ν o yields To recap, we now have equations to calculate ν ∞ and ν o from experimentally measurable quantities.Equations 16 and 18 allow us to calculate ν ∞ from F P and m T,P , respectively.Given this calculated value of ν ∞ , Equations 21 and 23 then allow us to calculate ν o from F M and m T,M , respectively.The fact that we have two equations each for ν ∞ and ν o allows us to assess the self-consistency of the approach.

A.6 Interpretation in terms of neutral networks
Throughout the preceding calculations, we have referred to ν ∞ and ν o as we defined them in [22]: namely, as the average neutrality of protein populations evolving with µ ≪ 1 and N µ either ≫ 1 or ≪ 1, respectively.However, van Nimwegen and coworkers [11] have shown that they can also be interpreted in terms of the underlying neutral network.In the experiments we make mutations at the nucleotide (rather than amino acid) level, so each point in our sequence space corresponds to a different gene.Every gene that yields an amount of protein sufficient to hydroxylate the twelfth carbon of 12-p-nitrophenoxydodecanoic acid with at least 75% of the total activity conferred by the original R1-11 parent gene represents a node on this neutral network.We note that in the experiments (and also usually in natural evolution), the edges on the neutral network are not all completely equivalent or fully undirected, since some mutations are more likely to occur than others (for example, error-prone PCR with Taq polymerase is more likely to cause an A→G mutation than an A→C mutation).In the analysis that follows, we ignore this complication and assume all neutral network edges are equivalent.
In an extremely insightful analysis, van Nimwegen and coworkers [11] have shown that important characteristics of a neutral network can be inferred from evolutionary quantities.Specifically, they have shown that if a population is evolving with µ ≪ 1 and N µ ≫ 1, then the average neutrality (which we have denoted by ν ∞ ) is equal to the principal eigenvalue of the adjacency matrix of the neutral network, normalized by the network coordination number (number of possible connections per node).In addition, they pointed out that a population evolving with µ ≪ 1 and N µ ≪ 1 moves like a blind ant random walk, meaning that the average neutrality (which we have denoted by ν o ) is equal to the average connectivity of a neutral network node divided by the network coordination number.In our P450 experiments, we have measured the values needed to estimate ν ∞ and ν o using Equations 16, 18, 21, and 23.Using the final values listed in Table 2, F P = 0.50 and F M = 0.39.Taking the final nucleotide mutation values from Table 2, m T,P /T = 0.69 and m T,M /T = 0.31.The average mutation rate, computed from the unselected population, is µ = 1.40.So using Equation 16, ν ∞ = 0.53, while using Equation 18, ν ∞ = 0.49.The consistency of these two values supports the idea that the calculations above accurately describe the evolutionary process.Taking the average value of these two measurement as ν ∞ = 0.51, we can then use Equations 21 and 23 to calculate ν o .We calculate values of 0.28 and 0.43, respectively.These estimates differ by more than those for ν ∞ , perhaps because additional approximations have gone into the derivation of the relevant equations (in addition, we have made no attempt to carry out the rather complex propagation of the sampling errors of Table 2).However, the values are still in a similar range.Taking the average of these two values, we estimate that ν o = 0.35.So overall, we predict that each functional P450 gene should have an average fraction of 0.35 of its sequence nearest neighbors also encoding a functional gene, for an average of about 1,500 neighbor genes.We predict that the principal eigenvalue of the neutral network adjacency matrix is 0.51 ×3L.The fact that principal eigenvalue exceeds the average connectivity indicates that the neutral network is not a regular graph, but instead has some nodes more highly connected than others.
The value for ν ∞ calculated above can also be related to measurements from protein mutagenesis experiments.A number of studies [3,26,27] have observed that the probability that a protein remains functional after m mutations falls off exponentially with the number of mutations.In fact, the decline is not always exponential for the first few mutations if the starting protein has especially high or low stability [3] or activity [53], but will still converge to this exponential form after a few mutations [3,44,54].The stability threshold model can be used to relate this decline to ν ∞ , as is done indirectly in the Markov chain approximation of [44].Here we make that connection explicit.The initial protein has a stability that falls into some stability bin i.Therefore, its stability can be described by the column vector y 0 , which has element i equal to one and all other elements equal to zero.Now imagine constructing all single mutants of this protein.The fraction of these single mutants that still fold is just eWy 0 , and the distribution of stabilities among the single mutants is y 1 = Wy 0 (note that the elements of y 1 no longer sum to one).Similarly, after m mutations, the fraction of mutants that still fold is eW m y 0 , and the distribution of stabilities among the m-mutants is y m = W m y 0 .With the approximation of Equation 11, y m = W m y 0 .This makes it clear that y m will converge to a vector proportional to x ∞ , the principal eigenvector of W. Once this convergence is complete, each new mutation simply reduces the fraction of mutants that fold by a factor of ν ∞ , the principal eigenvalue of W (and the spectral radius of the neutral network normalized by the coordination number).Therefore, what we have called ν ∞ in the present work and [22] is equal to what is called x in [27], q in [26], and ν in [3].The major difficulty that is faced in extracting ν ∞ by the method of those three studies [3,26,27] is that it is not possible to directly assay mutants with m mutations, but instead only possible to assay a set of mutants with a distribution of m.All three studies use different (and valid) methods to account for this distribution, but this accounting is still difficult because most of the functional mutants come from the low m end of the distribution.This makes it hard to get accurate values for the fraction functional after large numbers of mutations, since most of the functional mutants in the set come from sequences with few mutations.For this reason, we believe the current method of measuring ν ∞ is more accurate.A second caution about comparing values of ν ∞ from different studies is that its value depends on the nucleotide error-spectrum of the experiment, since different mutagenesis methods create different distributions of nucleotide and amino acid mutation types.
We also briefly mention how we arrived at an estimate of ν ∞ for 3-methyladenine DNA glycosylase from the data of [27].This paper reports that a fraction x = 0.34 of amino acid mutations inactivate the protein.We would like to determine the fraction ν ∞ of nucleotide mutations that do not inactivate the protein.Roughly 75% of random mutations to a gene will be synonymous.Therefore, m amino acid mutations should cause about 4m/3 nucleotide mutations.The study of [27] measures that after m mutations, a fraction (1 − x) m of the mutants are functional.That means that ν ∞ 4m/3 fraction should be functional.Equating these expressions yields ν ∞ = exp 3 4 log (1 − x) .So using x = 0.34, we arrive at ν ∞ = 0.73.

A.7 Detailed justification for approximating p M by p o
Here we provide a detailed justification for the approximation that p M is about equal to p o .In the monomorphic limit, the time evolution of p is given by Equation 1, and the stationary distribution p M is given by Equation 2. We assume the approximations of Equations We therefore suppose that at some time t the distribution is given by p o and compute, using Equation 1, the change in p o after one generation.The new distribution at time t + 1 is given by Using (V − W) p o = 0, and taking components of the above equation, we obtain Thus p o would be an approximately stationary distribution of the dynamics if . We now proceed to show that this will be the case in most situations of interest by deriving upper and lower bounds on the second term of the right hand side of Equation 25.
Consider first the term (W m p o ) i , which can be written as where we have used Wp o = Vp o in the second equality.We now note that ν k ≤ ν max for all k, where ν max is the maximum neutrality, maximized over all bins.This leads to the successive inequalities yielding the upper bound In an identical manner, we obtain the lower bound where ν min is the smallest neutrality, minimized over all bins.Note that both inequalities above become exact equalities when all bins have the same neutrality ν, which could be interpreted as either ν min or ν max .
Having obtained inequality constraints on (W m p o ) i , we now consider the term (V m p o ) i , which can be written as which yields an identical upper bound to that on (W m p o ) i , namely, It should again be noted that both the above inequalities become exact equalities when all bins have a common neutrality ν.
We are now in a position to estimate bounds on the magnitude of the second term of Equation 25.Using the four inequalities of Equations 28, 29, 31, and 32 above, we have where the inequality above becomes an exact equality when all bins have the same neutrality.However, in this limit, the right hand side of the above equation vanishes, and therefore the second term of Equation 25 is identically zero in this case, giving the result that p M is exactly equal to p o when all bins have the same neutrality, even if µ is arbitrarily large.
We now carry out the sum over m to obtain an upper bound on the second term of Equation 25 in the more general and realistic case of unequal neutrality bins.Using Equation 34 and the specific Poisson form of f m , we obtain an upper bound on the fractional change in p 0i in one generation: The above bound vanishes for small µ, is an increasing function of ν max − ν min , and is typically much smaller than 1.An extreme estimate of the size of the fractional change can be made when ν max = 1 and ν min = 0.In this case, using µ = 1.4 (the value in our experiments), the above inequality simplifies to Noting that ν i < 1, the fractional change in p 0i is therefore reasonably controlled even in the most extreme case.For realistic situations, the fractional change in p 0i is expected to be much lower, thus justifying the use of p o as the stationary distribution of the dynamics of Equation 1.  (A) Theory predicts that a mostly monomorphic population is equally likely to occupy any node of its neutral network, while a highly polymorphic population will prefer more connected nodes [11].Node sizes are drawn proportional to the occupation probabilities.(B) Proteins evolving in a highly polymorphic population are predicted to be more stable than their counterparts in a mostly monomorphic population [22].The histograms illustrate the distributions of stabilities for the two cases.The increased stability is a biophysical manifestation of excess mutational robustness, since more stable proteins are more mutationally robust [3][4][5].The P450s from the polymorphic population neutrally evolved higher stability and expression levels than their counterparts from the monomorphic populations.The histograms show the distributions for the final protein from all monomorphic replicates and for the same number of randomly chosen proteins from the final polymorphic population.The plots show (left to right) the temperature at which half the protein irreversibly denatured after 10 minutes (T 50 ), the urea concentration at which half the protein denatured after 4 hours ([urea] 50 ), and the expression level relative to that of the original parental P450.The means are significantly different, with unequal variance t-test P -values of 0.02, 0.005, and 0.04, respectively.
Spectrum of nucleotide mutations introduced by the error-prone PCR procedure used in the neutral evolution experiments.The spectrum was determined by sequencing the four final (generation 12) sequences from the unselected population, since in these sequences the mutations accumulate without constraint.As has been previously noted for error-prone PCR with Taq polymerase [3,5,26], the nucleotide error spectrum is biased towards certain types of mutations. Figures

Figure 1 -
Figure 1 -Theoretical views of the evolution of protein mutational robustness and stability.

Figure 2 -
Figure 2 -Outline of the neutral evolution experimental procedure.

Figure 4 -
Figure4-The polymorphic population neutrally evolved a higher average mutational robustness than the monomorphic populations.The fraction functional was determined by assaying 435 mutants (average of 1.5 nucleotide mutations per gene).Error bars show binomial standard error.For the monomorphic population, numbers are the average over all replicates.

Figure 5 -
Figure 5 -The more mutationally robust proteins are more stable.
11 and 12 and show that we can approximate p M by p o , where p o is given by Equation 19.To justify this approximation, we insert p o into the right hand side of Equation 1 and ask to what extent p o is left unaltered by the dynamics.If p o is found to be stationary to good approximation then, by uniqueness of the stationary distribution of an ergodic process, p o would be a good approximation to p M .

Table 2 -
Neutral evolution robustness and mutation data.Each row is for a different generation, T .Entries of NA indicate that no measurement was made.The m nt and m aa are the average number of nucleotide mutations and nonsynonymous mutations, respectively.Numbers in parentheses are total counts over the total samples.Subscripts indicate the population type: U for unselected, P for polymorphic, and M for monomorphic.For the unselected and monomorphic populations, numbers represent averages of all replicates.For the polymorphic population, numbers are for a random sample of functional mutants.F P and F M are the fraction of functional mutants out of 435 assayed.