The protein production rate fluctuates quasi-periodically
To measure the effect of the cell cycle on protein expression, we first determined the protein production rate, quantified as the time derivative of the total cellular fluorescence (Methods). Taking the data for all cells with a completed cell cycle (n = 393) over all cell cycle phases, the protein expression rate displayed a total noise intensity (defined as the standard deviation divided by the mean) of 0.48 [17]. When plotted against cell cycle phase ϕ (where 0 is cell birth and 1 is cell division) and averaged over all cells (Fig. 1a), the production rate was approximately constant in the first half of the cycle, after which it rose to about twice its initial value by the end of the cycle (Fig. 1b, Additional file 2: Figure S1). An initially constant rate followed by a two-fold increase is consistent with the known chromosome replication pattern for the observed mean growth rate (0.6 dbl/h): a single chromosome copy in the first part of the cell cycle, followed by replication in the second part, which produces two copies [29]. Each chromosome copy then yields a fixed expression rate. This is not unreasonable, as other components required for expression, such as RNA polymerases and ribosomes, also double over the course of the cell cycle. At faster growth, replication occurs throughout the cell cycle for multiple nested chromosome copies [30]. Consistently, we found that the production rate was not initially flat, but instead rose continuously throughout the cell cycle when cells were grown on a different medium that supported a higher mean growth rate of 1.8 dbl/h (Additional file 2: Figure S2). The total increase remained two-fold, in agreement with an expected doubling of the number of gene copies. Overall, these data indicate that the mean protein expression rate is likely proportional to the gene copy number and hence doubles during chromosome replication. This variation is more continuous at high growth rate because of the nested replication and overall higher gene copy numbers.
Deterministic cell cycle variations contribute to expression noise
To quantify the contribution of the mean cell cycle fluctuations (Fig. 1b) to protein production noise, we split the single-cell production rate (which is distinct from the protein concentration) p(ϕ, x) into the population-averaged rate \( \overline{p_c}\left(\phi \right) \) and individual deviations δp(ϕ, x), which together capture all cell-to-cell variability (Fig. 1a,b): \( p\left(\phi, \mathbf{x}\right)=\overline{p_c}\left(\phi \right)+\delta p\left(\phi, \mathbf{x}\right) \). Here, ϕ denotes the cell cycle phase and x all other causes of cell-to-cell variability; c refers to cell cycle dependence, which here is redundant because it is implied by the ϕ dependence but is used for notational consistency. \( \overline{p_c}\left(\phi \right) \) can be estimated by the curve in Fig. 1b, and subtracted from individual traces to obtain an estimate for δp(ϕ, x). The noise intensity caused by the deterministic cell cycle fluctuation \( \overline{p_c}\left(\phi \right) \) is 0.26, which was obtained by considering the phase ϕ as a random variable and then calculating the variance of the trace. The noise of the individual expression traces δp(ϕ, x), averaged over all cells and ϕ, was 0.42 (Additional file 2: Figure S3a). These values are consistent with a scenario in which the population mean trace \( \overline{p_c}\left(\phi \right) \) and the deviation traces δp(ϕ, x) are independent, so that their variances (squared noise) can be added up: \( 0.48^2 \approx 0.26^2 + 0.42^2 \). This population-average cell cycle contribution to production rate noise does not include the cell cycle stochasticity of individual cells, which we consider below.
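The split into a population-averaged trace and individual deviations can be illustrated with a minimal sketch. It assumes that all single-cell traces have been resampled onto a common phase grid; the function name `decompose_noise` and the synthetic traces are hypothetical and are not the measured data.

```python
import numpy as np

def decompose_noise(traces):
    """Split production-rate noise into deterministic and residual parts.

    traces: array of shape (n_cells, n_phase_bins), one row per cell,
    sampled on a common cell cycle phase grid (an assumption of this sketch).
    Returns noise intensities (standard deviation / grand mean).
    """
    traces = np.asarray(traces, dtype=float)
    grand_mean = traces.mean()
    mean_trace = traces.mean(axis=0)      # estimate of the population-averaged rate
    deviations = traces - mean_trace      # estimate of delta p(phi, x)
    total = traces.std() / grand_mean
    deterministic = mean_trace.std() / grand_mean
    residual = deviations.std() / grand_mean
    return total, deterministic, residual

# Synthetic traces for illustration only (hypothetical shape and noise level).
rng = np.random.default_rng(0)
phi = np.linspace(0.0, 1.0, 50)
mean_rate = np.where(phi < 0.5, 1.0, 2.0 * phi)   # flat, then rising two-fold
traces = mean_rate + rng.normal(0.0, 0.5, size=(393, phi.size))

total, deterministic, residual = decompose_noise(traces)
print(f"total^2 = {total**2:.4f}")
print(f"deterministic^2 + residual^2 = {deterministic**2 + residual**2:.4f}")
```

Because the deviations are defined relative to the per-phase mean, the squared noise intensities in this sketch add up by construction; for the measured values quoted above the agreement is approximate.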
Concentration fluctuations are buffered by dilution
Fluctuating production rates can cause noise in the protein concentration. To determine the latter, we quantified the mean fluorescence per unit area of the cell. The noise intensity of 0.15 (0.10 for fast growth), which was obtained by taking the data of all cells and at all cell cycle phases, was consistent with previous reports [1]. After ordering by cell cycle phase and averaging (Fig. 1c), the concentration also showed systematic variations (Fig. 1d, Additional file 2: Figure S1): it increased slightly right after cell birth, then decreased and finally rose again. The amplitude of these variations was 4 % of the mean. This low value (Additional file 2: Figure S3b) and the initial increase seemed inconsistent with the large amplitude of variations in production rate caused by the cell cycle, as well as with the initially constant value of production rate (Fig. 1b) [25].
To get a more intuitive understanding of these differences, we formulated a minimal cell cycle model based on the measured cell cycle dependency of production rate (Fig. 1b). The concentration cannot be determined by simply integrating the production rate, as this would ignore dilution due to volume growth. To quantify the volume growth, we determined for each cell its length and its dependence on the cell cycle phase (Fig. 1e, Methods) [28]. The population mean cell length \( \overline{L}\left(\phi \right) \) was well described by an exponential function (Fig. 1f) [31–33], and not by bi-linear or linear functions (Additional file 2: Figure S4), as suggested previously [34–37]. Therefore, an exponential function for cell size was used as input for the minimal model (Fig. 2b). With a mean protein production \( \overline{p}\left(\phi \right) \) at phase ϕ (Fig. 2a), the concentration \( \overline{E}\left(\phi \right) \) can then be written as: \( \overline{E}\left(\phi \right)=\left(\overline{F_0}+{\displaystyle {\int}_0^{\phi }}\overline{p}\left({\phi}^{\hbox{'}}\right)d{\phi}^{\hbox{'}}\right)/\overline{L}\left(\phi \right) \), where \( \overline{F_0} \) is the total amount of protein at cell birth.
By design, \( \overline{E}\left(\phi \right) \) (Fig. 2c) reproduced the measured data (Fig. 1d) and provided an explanation for the observed functional form, including the counterintuitive increase in concentration at the beginning of the cell cycle, before duplication occurs. As \( \overline{E}\left(\phi \right) \) is periodic, we know that increases (dilution rate smaller than expression rate) are balanced by decreases (dilution rate larger than expression rate). In cases where duplication occurs late, the expression rate is essentially constant because there is only one gene copy, while the dilution rate changes. Thus, dilution is then comparatively weak at the beginning of the cell cycle, resulting in an increasing concentration, while dilution is comparatively strong further into the cell cycle, resulting in decreasing concentrations. This rationale also explains why concentration fluctuations are small: the functional form of the total fluorescence (as a function of the cell cycle phase) is almost identical to that of the volume (Fig. 2b).
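The reasoning above can be sketched numerically. The following is a minimal illustration, not the fitted model: it assumes a production rate that is flat in the first half of the cycle and then rises linearly to two-fold, a cell length that doubles exponentially over the cycle, and a periodic steady state in which the total protein amount doubles between birth and division. All variable names are hypothetical.

```python
import numpy as np

phi = np.linspace(0.0, 1.0, 1001)

# Assumed mean production rate: constant, then a linear rise to two-fold.
p_mean = np.where(phi < 0.5, 1.0, 2.0 * phi)

# Assumed exponential growth: length doubles between birth and division.
length = 2.0 ** phi

# Cumulative integral of the production rate (trapezoidal rule).
dphi = phi[1] - phi[0]
produced = np.concatenate(
    ([0.0], np.cumsum(0.5 * (p_mean[1:] + p_mean[:-1]) * dphi))
)

# Periodicity: the protein amount at division is twice the amount at birth,
# so the amount at birth equals the amount produced during one cycle.
f0 = produced[-1]

# Concentration: total protein divided by cell length.
concentration = (f0 + produced) / length

rel_amplitude = (concentration.max() - concentration.min()) / concentration.mean()
print(f"phase of maximum: {phi[concentration.argmax()]:.2f}")
print(f"phase of minimum: {phi[concentration.argmin()]:.2f}")
print(f"relative peak-to-peak amplitude: {rel_amplitude:.3f}")
```

Under these assumed shapes the concentration first rises, then dips and recovers, and its relative amplitude stays at a few percent even though the production rate doubles.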
Stochastic replication timing contributes to expression noise
The single-cell data also suggested that stochasticity in replication timing is a source of protein production noise, which is supported by previous studies [23, 38] (Fig. 1a, thin lines). In other words, δp(ϕ, x) would be the sum of fluctuations caused by cell cycle stochasticity, \( \delta p_c\left(\phi, \nu \right) \), and of fluctuations \( \delta p_{nc}\left(\mathbf{x}\right) \) unrelated to the cell cycle (Fig. 3a). Here, ν is the cell cycle phase at which the gene of interest is replicated, and ν varies from cell to cell. Thus, the sum of \( \delta p_c\left(\phi, \nu \right) \) and the population average \( \overline{p_c}\left(\phi \right) \) yields all the fluctuations \( p_c\left(\phi, \nu \right) \) caused by the cell cycle. To determine the stochastic contribution of the cell cycle to the expression noise, one needs to quantify \( \delta p_c\left(\phi, \nu \right) \). However, it is not trivial to distinguish \( \delta p_c\left(\phi, \nu \right) \) from the other stochastic, non-cell cycle variations in the experimental single-cell traces.
To overcome this problem, we started with \( p_c\left(\phi, \nu \right) \) and followed a variance decomposition approach using the law of total variance [39, 40]. The variance of the full cell cycle fluctuations can be decomposed as follows:
$$ Var\left({p}_c\left(\phi, \nu \right)\right)=\left\langle Var\left({p}_c\left(\phi, \nu \right)\Big|\phi \right)\right\rangle +Var\left(\left\langle {p}_c\left(\phi, \nu \right)\Big|\phi \right\rangle \right) $$
(1)
Here, angular brackets denote averaging, and the notations Var(… |ϕ) and 〈 … |ϕ〉 indicate, respectively, the variance and the average for a given phase ϕ (conditioned on ϕ). In the second term, the brackets thus indicate an averaging over the stochastic variable ν, which yields \( \overline{p_c}\left(\phi \right) \). Next, the variance is taken. This variance was in fact calculated previously, and found to be \( (0.26)^2 \) (Fig. 1b). Thus, the second term indicates the deterministic contribution to the cell cycle-induced noise.
In the first term, the variance of \( p_c\left(\phi, \nu \right) \) is determined conditionally on ϕ, and then averaged. This term thus denotes the stochastic contribution to the cell cycle-induced noise. The data do not directly provide an estimate of this variance, because the cell cycle-induced noise and noise from other sources are confounded in the measured single-cell traces of the production rate (Fig. 1a). Indeed, in these traces, other noise sources, such as metabolism [28] and fluctuating transcription factors [1], are substantial and can mask the quick two-fold increase expected from gene replication events. However, in a subset of traces, the two-fold increase was clear (Fig. 3b,c, Methods). Fitting each of these traces with a step function (Additional file 2: Figure S5) provided a distribution of the step moment, ν. We obtained a wide distribution for ν, with a mean of 0.64 and a standard deviation of 0.17 (Fig. 3c). To check whether this distribution was consistent with the full dataset, we compared the average of the fitted step functions to the average of all measured traces (\( \overline{p_c}\left(\phi \right) \), Fig. 1b), and found that they were similar (Fig. 3d). These findings suggested that gene duplication events with stochastic timing in individual cells underlie the smooth shape of the population-average production rate (Fig. 1b).
The distribution of ν (Fig. 3c) now allowed us to estimate the first term in eq. (1), by first determining the variance of the step functions at fixed phase, and then averaging over all phases (Additional file 2: Figure S6a). We obtained a value of \( (0.23)^2 \) for this stochastic contribution of the cell cycle to expression noise, which is comparable in magnitude to the deterministic contribution denoted by the second term (\( (0.26)^2 \), Additional file 2: Table S1). Thus, variability in initiation timing contributes substantially to the cell cycle-induced noise. The deterministic and stochastic contributions together (\( p_c\left(\phi, \nu \right) \)) thus caused a variance of \( (0.23)^2 + (0.26)^2 = (0.35)^2 \), which is about half (52 %) of the protein production variance (Fig. 5b, Additional file 2: Table S1).
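The decomposition in eq. (1) can be sketched for idealized step-like traces. This is an illustration under stated assumptions rather than the analysis of the fitted traces: the replication phase ν is drawn from a Gaussian with the reported mean (0.64) and standard deviation (0.17), clipped to the interval [0, 1], and the production rate is taken to be exactly 1 before and 2 after replication. The Gaussian shape, the clipping and all variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

phi = np.linspace(0.0, 1.0, 201)   # cell cycle phase grid
# Assumed distribution of the replication phase nu (Gaussian, clipped to [0, 1]).
nu = np.clip(rng.normal(0.64, 0.17, size=5000), 0.0, 1.0)

# Idealized step-like production rate: 1 before replication, 2 after.
# Rows are simulated cells, columns are phases.
p_c = np.where(phi[None, :] < nu[:, None], 1.0, 2.0)

grand_mean = p_c.mean()

# First term of eq. (1): variance at fixed phase, averaged over phases.
stochastic_var = p_c.var(axis=0).mean()
# Second term of eq. (1): variance over phases of the per-phase average.
deterministic_var = p_c.mean(axis=0).var()
# Left-hand side of eq. (1): variance over all cells and phases.
total_var = p_c.var()

stochastic_noise = np.sqrt(stochastic_var) / grand_mean
deterministic_noise = np.sqrt(deterministic_var) / grand_mean
total_noise = np.sqrt(total_var) / grand_mean

print(f"stochastic noise:    {stochastic_noise:.2f}")
print(f"deterministic noise: {deterministic_noise:.2f}")
print(f"total cell cycle noise: {total_noise:.2f}")
```

In this sketch the two terms sum exactly to the total variance, and the stochastic and deterministic contributions come out with comparable magnitude, in line with the reasoning above; the exact numbers depend on the assumed distribution of ν.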
To estimate how the protein concentration noise is affected by the cell cycle, we computed the concentration traces resulting from the step-like production rate functions (Additional file 2: Figure S6a). For each \( p_c\left(\phi, \nu \right) \) of the set (Fig. 3c), the corresponding concentration curve was computed, considering that proteins are diluted due to volume growth (Additional file 2: Figure S6b). We found that the quasi-periodic concentration fluctuations caused by the cell cycle (which include deterministic and stochastic components) contributed less than 1.5 % to the variance in protein concentration (Additional file 2: Figure S6b and Fig. 5b). Note that one can distinguish contributions from the population-average trend (Fig. 1d) and the stochastic deviations around it due to variability in replication timing (less than 1 % contribution each, Additional file 2: Table S1).
Location on the chromosome affects expression fluctuations
Chromosome replication is initiated at the origin of replication (oriC), from which two replication forks then progress simultaneously and bi-directionally along the two strands of DNA [41]. This raises two expectations: first, genes located on opposite sides but at the same distance from oriC should be duplicated at the same time and thus show the same cell cycle dependence of protein production and concentration. Second, if one gene is located upstream of the other, its increase in protein production should occur earlier. To test the first prediction, we investigated a cfp gene positioned symmetrically to the yfp gene studied so far, on the opposite strand at the same distance from oriC (Methods, Fig. 4a inset). We indeed found that both reporters displayed a similar dependence of production rate and concentration on cell cycle phase (Fig. 4a,b, Additional file 2: Figure S1).
To change the position, we studied a gfp gene under \( P_{lac} \) control closer to oriC than yfp or cfp (Methods, Fig. 4e). As expected from the earlier replication, the GFP production rate indeed increased earlier than the previous YFP signal (Fig. 4e). It started comparatively low, then increased more than two-fold and subsequently decreased again to end at twice the initial rate (Fig. 4e). The cause of the high fold-change and decrease is unknown, but changes in chromosome structure or transient improvement in competition for RNA polymerases for this promoter (two binding sites at the two replicated genes) could play a role. As predicted by the model (Fig. 4c,d), the dip in GFP concentration occurred earlier and the initial increase disappeared (Fig. 4f). The magnitude of fluctuations remained at around 4 %. Overall, these data show that gene position on the chromosome affects cell cycle-related noise.