Division in Escherichia coli is triggered by a size-sensing rather than a timing mechanism

Background
Many organisms coordinate cell growth and division through size control mechanisms: cells must reach a critical size to trigger a cell cycle event. Bacterial division is often assumed to be controlled in this way, but experimental evidence to support this assumption is still lacking. Theoretical arguments show that size control is required to maintain size homeostasis in the case of exponential growth of individual cells. Nevertheless, if the growth law deviates slightly from exponential for very small cells, homeostasis can be maintained with a simple ‘timer’ triggering division. Therefore, deciding whether division control in bacteria relies on a ‘timer’ or ‘sizer’ mechanism requires quantitative comparisons between models and data.
Results
The timer and sizer hypotheses find a natural expression in models based on partial differential equations. Here we test these models with recent data on single-cell growth of Escherichia coli. We demonstrate that a size-independent timer mechanism for division control, though theoretically possible, is quantitatively incompatible with the data and extremely sensitive to slight variations in the growth law. In contrast, a sizer model is robust and fits the data well. In addition, we tested the effect of variability in individual growth rates and noise in septum positioning and found that size control is robust to this phenotypic noise.
Conclusions
Confrontations between cell cycle models and data usually suffer from a lack of high-quality data and suitable statistical estimation techniques. Here we overcome these limitations by using high-precision measurements of tens of thousands of single bacterial cells combined with recent statistical inference methods to estimate the division rate within the models. We therefore provide the first precise quantitative assessment of different cell cycle models.

http://www.biomedcentral.com/1741-7007/12/17
In particular, on the basis of recent single-cell analysis, the team headed by N. Kleckner proposed that replication initiation is more tightly linked to the time elapsed since birth than to cell mass [13,14]. In addition, the extent to which initiation timing affects division timing is unclear. In particular, variations in initiation timing are known to lead to compensatory changes in the duration of chromosome replication (see [15-17] and references therein). These studies argue against a size control model based on replication initiation. Another model postulates that size control acts directly on septum formation [18,19]. Nevertheless, the nature of the signals triggering the formation of the septal ring and its subsequent constriction is still unknown [17,20] and no molecular mechanism is known to sense cell size and transmit the information to the division machinery in bacteria.
Besides the work of Donachie, the assumption of size control in bacteria originates from a theoretical argument stating that such a control is necessary in exponentially growing cells to ensure cell size homeostasis, i.e. to maintain a constant size distribution through successive cycles. The growth of bacterial populations has long been mathematically described using partial differential equation (PDE) models. These models rely on hypotheses on division control: the division rate of a cell, i.e. the instantaneous probability of its dividing, can be assumed to depend either on cell age (i.e. the time elapsed since birth) or cell size. In the classical 'sizer' model, the division rate depends on size and not on age whereas in the 'timer' model it depends on age and not on size. Mathematical analysis of these models sheds light on the role of size control in cell size homeostasis. In particular, it has been suggested that for exponentially growing cells, a timer mechanism cannot ensure a stable size distribution [21,22]. Nevertheless, this unrealistic behavior of the timer mechanism is based on a biologically meaningless assumption, namely the exponential growth of cells of infinitely small or large size [23,24]. Cells of size zero or infinity do not exist and particularly small or large cells are likely to exhibit abnormal growth behavior. In conclusion, the mathematical arguments that were previously developed are insufficient to rule out a size-independent, timer model of bacterial division: quantitative comparisons between models and data are needed.
In the present study, we test whether age (i.e. the time elapsed since birth) or size is a determinant of cell division in E. coli. To do so, we analyzed two datasets derived from two major single-cell experimental studies on E. coli growth, performed by Stewart et al. [25] and Wang et al. [26]. Our analysis is based on division rate estimation by state-of-the-art nonparametric inference methods that we recently developed [27,28]. The two datasets correspond to different experimental setups and image analysis methods but lead to similar conclusions. We show that even though a model with a simple timer triggering division is sufficient to maintain cell size homeostasis, such a model is not compatible with the data. In addition, our analysis of the timer model shows that this model is very sensitive to hypotheses regarding the growth law of rare cells of very small or large size. This lack of robustness argues against a timer mechanism for division control in E. coli as well as in other exponentially growing organisms.
In contrast, a model where cell size determines the probability of division is in good agreement with experimental data. Unlike the timer model, this sizer model is robust to slight modifications of the growth law of individual cells. In addition, our analysis reveals that the sizer model is very robust to phenotypic variability in individual growth rates or noise in septum positioning.

Description of the data
Age and size distribution of the bacterial population
The results reported in this study were obtained from the analysis of two different datasets, obtained through microscopic time-lapse imaging of single E. coli cells growing in a rich medium, by Stewart et al. [25] and Wang et al. [26]. Stewart et al. followed single E. coli cells growing into microcolonies on LB-agarose pads at 30°C. The length of each cell in the microcolony was measured every 2 min. Wang et al. grew cells in LB (Luria Bertani) medium at 37°C in a microfluidic setup [26] and the length of the cells was measured every minute. Due to the microfluidic device structure, at each division only one daughter cell could be followed (data $s_i$: sparse tree), in contrast to the experiment of Stewart et al. where all the individuals of a genealogical tree were followed (data $f_i$: full tree). It is worth noting that the different structures of the data $f_i$ and $s_i$ lead to different PDE models, and the statistical analysis was adapted to each situation (see below and Additional file 1). From each dataset ($f_i$ and $s_i$) we extracted the results of three experiments (experiments $f_1$, $f_2$ and $f_3$ and $s_1$, $s_2$ and $s_3$). Each experiment $f_i$ corresponds to the growth of approximately six microcolonies of up to approximately 600 cells and each experiment $s_i$ to the growth of bacteria in 100 microchannels for approximately 40 generations.
Given the accuracy of image analysis, we do not take into account variations of cell width within the population, which are negligible compared to cell-cycle-induced length variations. Thus, in the present study we do not distinguish between length, volume and mass and use the term cell size as a catch-all descriptor. Cell age and cell size distributions of a representative experiment from each dataset are shown in Figure 1. These distributions are estimated from the age and size measurements of every cell at every time step of a given experiment $f_i$ or $s_i$, by using a simple kernel density estimation method (kernel estimation is closely related to histogram construction but gives smooth estimates of distributions, as shown in Figure 1, for instance; for details see the Methods and Additional file 1). As expected for the different data structures (full tree $f_i$ or sparse tree $s_i$) and different experimental conditions, the distributions for the two datasets are not identical. The age distribution is decreasing with a maximum for age zero and the size distribution is wide and positively skewed, in agreement with previous results using various bacterial models [29-31].

Age-structured (timer) and size-structured (sizer) models
The timer and sizer hypotheses are easily expressed in mathematical terms: two different PDE models are commonly used to describe bacterial growth, using a division rate (i.e. the instantaneous probability of division) depending either on cell age or cell size. In the age-structured model (Age Model) the division rate $B_a$ is a function only of the age a of the cell. The density n(t, a) of cells of age a at time t is given as a solution to the McKendrick-Von Foerster equation (see [32] and references therein):
$$\frac{\partial}{\partial t} n(t,a) + \frac{\partial}{\partial a} n(t,a) = -B_a(a)\, n(t,a) \qquad (1)$$
with the boundary condition
$$n(t, a=0) = 2 \int_0^{\infty} B_a(a)\, n(t,a)\, da.$$
In this model, a cell of age a at time t has the probability $B_a(a)\,dt$ of dividing between time t and t + dt.
In the size-structured model (Size Model), the division rate $B_s$ is a function only of the size x of the cell. Assuming that the size of a single cell grows with a rate v(x), the density n(t, x) of cells of size x at time t is given as a solution to the size-structured cell division equation [32]:
$$\frac{\partial}{\partial t} n(t,x) + \frac{\partial}{\partial x}\big[v(x)\, n(t,x)\big] = -B_s(x)\, n(t,x) + 4 B_s(2x)\, n(t,2x) \qquad (2)$$
In the Size Model, a cell of size x at time t has the probability $B_s(x)\,dt$ of dividing between time t and t + dt. This model is related to the so-called sloppy size control model [33] describing division in S. pombe.
For simplicity, we focused here on a population evolving along a full genealogical tree, accounting for $f_i$ data. For data $s_i$ observed along a single line of descendants, an appropriate modification is made to Equations (1) and (2) (see Additional file 1: Supplementary Text).

Testing the Age Model (timer) and the Size Model (sizer) with experimental data
In this study we tested the hypothesis of an age-dependent versus size-dependent division rate by comparing the ability of the Age Model and Size Model to describe experimental data. The PDEs given by Equations (1) and (2) can be embedded into a two-dimensional age-and-size-structured equation (Age & Size Model), describing the temporal evolution of the density n(t, a, x) of cells of age a and size x at time t, with a division rate $B_{a,s}$ a priori depending on both age and size:
$$\frac{\partial}{\partial t} n(t,a,x) + \frac{\partial}{\partial a} n(t,a,x) + \frac{\partial}{\partial x}\big[v(x)\, n(t,a,x)\big] = -B_{a,s}(a,x)\, n(t,a,x) \qquad (3)$$
with the boundary condition
$$n(t, a=0, x) = 4 \int_0^{\infty} B_{a,s}(a, 2x)\, n(t,a,2x)\, da.$$
In this augmented setting, the Age Model governed by the PDE (1) and the Size Model governed by (2) are restrictions to the hypotheses of an age-dependent or size-dependent division rate, respectively ($B_{a,s} = B_a$ or $B_{a,s} = B_s$).
The density n(t, a, x) of cells having age a and size x at a large time t can be approximated as $n(t, a, x) \approx e^{\lambda t} N(a, x)$, where the coefficient $\lambda > 0$ is called the Malthus coefficient and N(a, x) is the stable age-size distribution. This regime is rapidly reached and time can then be eliminated from Equations (1), (2) and (3), which are thus transformed into equations governing the stable distribution N(a, x). Importantly, in the timer model (i.e. $B_{a,s} = B_a$), the existence of this stable distribution requires that growth is sub-exponential around zero and infinity [23,24].
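As a worked step, the elimination of time can be written out explicitly. The sketch below assumes that the Age & Size Model has the standard transport form (ageing at unit speed, growth at speed v(x), loss by division at rate $B_{a,s}$) and substitutes the ansatz $n(t,a,x) = e^{\lambda t} N(a,x)$; it is an illustration of the reasoning, not a reproduction of the derivation in Additional file 1.

```latex
% Assumed form of the Age & Size Model:
%   \partial_t n + \partial_a n + \partial_x (v(x) n) = -B_{a,s}(a,x) n
\begin{align*}
\partial_t\!\left(e^{\lambda t} N\right)
  + \partial_a\!\left(e^{\lambda t} N\right)
  + \partial_x\!\left(v(x)\, e^{\lambda t} N\right)
  &= -B_{a,s}(a,x)\, e^{\lambda t} N \\
\lambda\, e^{\lambda t} N
  + e^{\lambda t}\, \partial_a N
  + e^{\lambda t}\, \partial_x\!\left(v(x)\, N\right)
  &= -B_{a,s}(a,x)\, e^{\lambda t} N \\
\partial_a N(a,x) + \partial_x\!\left(v(x)\, N(a,x)\right)
  &= -\left(\lambda + B_{a,s}(a,x)\right) N(a,x)
\end{align*}
```

After dividing by $e^{\lambda t}$, the Malthus coefficient appears alongside the division rate as an additional uniform loss term, which accounts for the dilution of each age-size class by population growth.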
We estimate the division rate $B_a$ of the Age Model using the age measurements of every cell at every time step. Likewise, we estimate the division rate $B_s$ of the Size Model using the size measurements of every cell at every time step. Our estimation procedure is based on mathematical methods we recently developed. Importantly, our estimation procedure does not impose any particular restrictions on the form of the division rate function B, so that any biologically realistic function can be estimated (see Additional file 1: Section 4 and Figure S6). In Additional file 1: Figures S1 and S2, we show the size-dependent and age-dependent division rates $B_s(x)$ and $B_a(a)$ estimated from the experimental data. Once the division rate has been estimated, the stable age and size distribution N(a, x) can be reconstructed through simulation of the Age & Size Model (using the experimentally measured growth rate; for details see the Methods).
We measure the goodness-of-fit of a model (timer or sizer) by estimating the distance D between two distributions: the age-size distribution obtained through simulations of the model with the estimated division rate (as explained above), and the experimental age-size distribution. Therefore, a small distance D indicates a good fit of the model to the experimental data. To estimate this distance we use a classical metric, which measures the average of the squared difference between the two distributions. As an example, the distance between two bivariate Gaussian distributions with the same mean and a standard deviation difference of 10% is 17%, and a 25% difference in standard deviation leads to a 50% distance between the distributions. The experimental age-size distribution is estimated from the age and size measurements of every cell at every time step of a given experiment $f_i$ or $s_i$, thanks to a simple kernel density estimation method.
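A squared-difference distance of this kind can be sketched as follows. This is a minimal illustration, assuming a relative L2 distance normalized by the norm of the reference distribution; the exact normalization used for D is not specified in this section, so the values produced here need not match the percentages quoted above. The function names (`gaussian2d`, `make_grid`, `relative_l2_distance`) are hypothetical.

```python
import math

def gaussian2d(a, x, mean_a, mean_x, sd):
    """Isotropic bivariate Gaussian density evaluated at (a, x)."""
    r2 = (a - mean_a) ** 2 + (x - mean_x) ** 2
    return math.exp(-r2 / (2.0 * sd ** 2)) / (2.0 * math.pi * sd ** 2)

def make_grid(sd, n=121, half_width=8.0):
    """Sample a centred Gaussian of standard deviation sd on a regular n x n grid."""
    step = 2.0 * half_width / (n - 1)
    pts = [-half_width + i * step for i in range(n)]
    return [[gaussian2d(a, x, 0.0, 0.0, sd) for x in pts] for a in pts]

def relative_l2_distance(model, data):
    """Relative L2 distance between two densities sampled on the same regular grid.

    Assumption: the distance is normalized by the L2 norm of `data`. The grid
    cell area cancels in the ratio, so plain sums are sufficient.
    """
    num = sum((m - d) ** 2
              for row_m, row_d in zip(model, data)
              for m, d in zip(row_m, row_d))
    den = sum(d ** 2 for row_d in data for d in row_d)
    return math.sqrt(num / den)

reference = make_grid(1.0)
d_same = relative_l2_distance(reference, reference)      # identical distributions
d_10 = relative_l2_distance(make_grid(1.10), reference)  # 10% wider
d_25 = relative_l2_distance(make_grid(1.25), reference)  # 25% wider
print(d_same, d_10, d_25)
```

The distance is zero for identical distributions and grows with the mismatch in standard deviation, which is the property the goodness-of-fit comparison relies on.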

Analysis of single-cell growth
As mentioned above, to avoid unrealistic asymptotic behavior of the Age Model and ensure the existence of a stable size distribution, assumptions have to be made on the growth of very small and large cells, which cannot be exactly exponential. To set realistic assumptions, we first studied the growth of individual cells. As expected, we found that during growth, cell diameter is roughly constant (see inset in Figure 2A). Figure 2A shows cell length as a function of time for a representative cell, suggesting that growth is exponential rather than linear, in agreement with previous studies [25,26,34-36]. To test this hypothesis further, we performed linear and exponential fits of cell length for each single cell. We then calculated in each case the $R^2$ coefficient of determination, which is classically used to measure how well a regression curve approximates the data (a perfect fit would give $R^2 = 1$ and lower values indicate a poorer fit). The inset of Figure 2B shows the distribution of the $R^2$ coefficient for all single cells for exponential (red) and linear (green) regressions, demonstrating that the exponential growth model fits the data very well and outperforms the linear growth model. We then investigated whether the growth of cells of particularly small or large size is exponential. If growth is exponential, the increase in length between each measurement should be proportional to the length. Therefore, we averaged the length increase of cells of similar size and tested whether the proportionality was respected for all sizes. As shown in Figure 2B, growth is exponential around the mean cell size but the behavior of very small or large cells may deviate from exponential growth.
We therefore determined two size thresholds $x_{\min}$ and $x_{\max}$ below and above which the growth law may not be exponential (e.g. for the experiment $f_1$ shown in Figure 2B, we defined $x_{\min} = 2.3$ µm and $x_{\max} = 5.3$ µm).
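The comparison of linear and exponential fits for a single cell can be illustrated with a short sketch. It assumes that the exponential fit is obtained by ordinary least squares on the logarithm of length, which is one common choice; the fitting procedure actually used is not detailed in this section. The synthetic cell (sampling every 2 min, initial length 2.5 µm, rate 0.0274 per min) and the function names are hypothetical.

```python
import math

def linear_fit(ts, ys):
    """Ordinary least squares fit y = slope * t + intercept."""
    n = len(ts)
    mean_t = sum(ts) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in ts)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_t

def r_squared(ys, predicted):
    """Coefficient of determination of predictions against observations."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

def compare_fits(ts, lengths):
    """Return (R^2 of linear fit, R^2 of exponential fit, fitted exponential rate)."""
    slope, intercept = linear_fit(ts, lengths)
    linear_pred = [slope * t + intercept for t in ts]
    # Exponential model: fit log-length linearly, then score on the original scale.
    rate, log_l0 = linear_fit(ts, [math.log(length) for length in lengths])
    exp_pred = [math.exp(log_l0 + rate * t) for t in ts]
    return r_squared(lengths, linear_pred), r_squared(lengths, exp_pred), rate

# Synthetic, noise-free cell used only for illustration.
times = [2.0 * k for k in range(13)]
lengths = [2.5 * math.exp(0.0274 * t) for t in times]
r2_linear, r2_exponential, fitted_rate = compare_fits(times, lengths)
print(r2_linear, r2_exponential, fitted_rate)
```

Applying `compare_fits` to every cell yields one pair of $R^2$ values per cell, from which the two distributions compared in the text can be built.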

The age-size joint distribution of E. coli corresponds to a size-dependent division rate
We used both the Age Model and Size Model to fit the experimental age-size distributions, following the approach described above. The growth law below $x_{\min}$ and above $x_{\max}$ is unknown. Therefore, to test the Age Model, growth was assumed to be exponential between $x_{\min}$ and $x_{\max}$ and we tested several growth functions v(x) for $x < x_{\min}$ and $x > x_{\max}$, such as constant (i.e. linear growth) and polynomial functions. Figure 3 shows the best fit we could obtain. Comparing the experimental data $f_1$ shown in Figure 3A (Figure 3B for $s_1$ data) with the reconstructed distribution shown in Figure 3C (Figure 3D for $s_1$ data) we can see that the Age Model fails to reconstruct the experimental age-size distribution and produces a distribution with a different shape. In particular, its localization along the y-axis is very different. For instance, for $f_1$ data (panels A and C), the red area corresponding to the maximum of the experimental distribution is around 2.4 on the y-axis whereas the maximum of the fitted distribution is around 3.9. The y-axis corresponds to cell size. The size distribution produced by the Age Model is thus very different from the size distribution of the experimental data (experimental and fitted size distributions are shown in Additional file 1: Figure S9).
As an additional analysis to strengthen our conclusion, we calculated the correlation between the age at division and the size at birth using the experimental data. If division is triggered by a timer mechanism, these two variables should not be correlated, whereas we found a significant correlation of −0.5 both for $s_i$ and $f_i$ data ($P < 10^{-16}$; see Additional file 1: Figure S7).
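The logic of this correlation test can be illustrated with an idealized simulation. The sketch below is a toy example under stated assumptions, not a model of the data: birth sizes are drawn independently and uniformly between 2 and 3 µm, cells grow exponentially, and division is triggered either at a size drawn around 5 µm (idealized sizer) or at an age drawn around 25 min (idealized timer). All numerical values and function names are hypothetical.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def birth_size_vs_division_age(mode, n_cells=5000, rate=0.0274, seed=0):
    """Correlation between size at birth and age at division in a toy model."""
    rng = random.Random(seed)
    birth_sizes, division_ages = [], []
    for _ in range(n_cells):
        birth = rng.uniform(2.0, 3.0)
        if mode == "sizer":
            # Division size is independent of birth size; age follows from growth.
            division_size = rng.gauss(5.0, 0.3)
            age = math.log(division_size / birth) / rate
        else:
            # Timer: age at division is independent of size.
            age = rng.gauss(25.0, 3.0)
        birth_sizes.append(birth)
        division_ages.append(age)
    return pearson(birth_sizes, division_ages)

r_sizer = birth_size_vs_division_age("sizer")
r_timer = birth_size_vs_division_age("timer")
print(r_sizer, r_timer)
```

In this idealized setting the timer gives a correlation close to zero, while the sizer gives a clearly negative one because cells born small need more time to reach the division size. Real data are noisier, so the magnitude of the correlation is expected to be smaller than in this toy sizer.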
We used various growth functions for $x < x_{\min}$ and $x > x_{\max}$ but a satisfactory fit could not be obtained with the Age Model. In addition, we found that the results of the Age Model are very sensitive to the assumptions made for the growth law of rare cells of very small and large size (see Additional file 1: Figure S3). This ultrasensitivity to hypotheses regarding rare cells makes the timer model unrealistic for any exponentially growing organism.
In contrast, the Size Model is in good agreement with the data (Figure 3: A compared to E and B compared to F) and allows a satisfactory reconstruction of the age-size structure of the population. The shape of the experimental and fitted distributions as well as their localization along the y-axis and x-axis are similar (size distributions and age distributions, i.e. projections onto the y-axis and x-axis, are shown in Additional file 1: Figure S8).
The quantitative measure of goodness-of-fit defined above is consistent with the visual comparison of the distributions: for the Size Model the distance D between the model and the data ranges from 17% to 20% for $f_i$ data (16% to 26% for $s_i$ data) whereas for the Age Model it ranges from 51% to 93% for $f_i$ data (45% to 125% for $s_i$ data).
The experimental data have a limited precision. In particular, the division time is difficult to determine precisely by image analysis and the resolution is limited by the time step of image acquisition (for $s_i$ and $f_i$ data, the time step represents respectively 5% and 8% of the average division time). By performing stochastic simulations of the Size Model (detailed in Additional file 1: Section 6), we evaluated the effect of measurement noise on the goodness of fit of the Size Model. We found that noise of 10% in the determination of the division time leads to a distance D around 14%, which is of the order of the value obtained with our experimental data. We conclude that the Size Model fits the experimental data well. Moreover, we found that in contrast to the Age Model, the Size Model is robust with respect to the mathematical assumptions for the growth law for small and large sizes: the distance D changes by less than 5%.

Size control is robust to phenotypic noise
Noise in the biochemical processes underlying growth and division, such as that created by stochastic gene expression, may perturb the control of size and affect the distribution of cell size. We therefore investigated the robustness of size control to such phenotypic noise. The Size Model describes the growth of a population of cells with variable age and size at division. Nevertheless, it does not take into account potential variability in individual growth rate or the difference in size at birth between two sister cells, i.e. the variability in septum positioning. To account for these sources of noise, we derived two PDE models, which are revised Size Models with either growth rate or septum positioning variability (see Additional file 1: Supplementary Text) and ran these models with different levels of variability.

Variability in individual growth rate has a negligible effect on the size distribution
For each single cell, a growth rate can be defined as the rate of the exponential increase of cell length with time [25,26]. By doing so, we obtained the distribution of the growth rate for the bacterial population (Additional file 1: Figure S4A). In our dataset this distribution is statistically compatible with a Gaussian distribution with a coefficient of variation of approximately 8% (standard deviation/mean = 0.08).
We recently extended the Size Model to describe the growth of a population with single-cell growth rate variability (the equation is given in Additional file 1: Section 5) [28]. We simulated this extended Size Model using the growth rate distribution of $f_i$ data. The resulting size distribution is virtually identical to the one obtained without growth rate variability (Figure 4A, red and blue lines). Therefore, the naturally occurring variability in individual growth rate does not significantly perturb the size control. To investigate the effect of growth rate variability further, we simulated the model with various levels of noise, using truncated Gaussian growth rate distributions with coefficients of variation from 5 to 60%. We found that to obtain a 10% change in size distribution, a 30% coefficient of variation is necessary, which would represent an extremely high level of noise (Figure 4A, inset).
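Truncated Gaussian growth rate distributions of this kind can be generated by simple rejection sampling. The sketch below assumes truncation at zero (negative growth rates are rejected); the truncation bounds actually used are not stated in this section, and the mean of 0.0274 per min and the function name `sample_growth_rates` are illustrative choices.

```python
import random
import statistics

def sample_growth_rates(mean, cv, n, seed=0):
    """Draw n growth rates from a Gaussian truncated to positive values.

    Assumption: truncation at zero, implemented by rejecting non-positive draws.
    """
    rng = random.Random(seed)
    sd = cv * mean
    rates = []
    while len(rates) < n:
        rate = rng.gauss(mean, sd)
        if rate > 0.0:
            rates.append(rate)
    return rates

def realized_cv(rates):
    """Coefficient of variation (standard deviation / mean) of a sample."""
    return statistics.pstdev(rates) / statistics.mean(rates)

for nominal_cv in (0.05, 0.08, 0.30, 0.60):
    rates = sample_growth_rates(0.0274, nominal_cv, 10000)
    print(nominal_cv, round(realized_cv(rates), 3))
```

For small nominal coefficients of variation the truncation is almost never active, whereas for large ones it removes the lower tail and the realized coefficient of variation falls below the nominal value, which is worth keeping in mind when comparing noise levels.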

Variability in septum positioning has a negligible effect on size distribution
The cells divide into two daughter cells of almost identical length. Nevertheless, a slight asymmetry can arise as an effect of noise during septum positioning. We found a 4% variation in the position of the septum (Additional file 1: Figure S4B), which is in agreement with previous measurements [35,37-39]. To test the robustness of size control to noise in septum positioning, we extended the Size Model to allow for different sizes of the two sister cells at birth (the equation is given in Additional file 1: Section 5). We ran this model using the empirical variability in septum positioning (shown in Additional file 1: Figure S4B) and compared the resulting size distribution to the one obtained by simulations without variability. As shown in Figure 4B (comparing the red and blue lines), the effect of natural noise in septum positioning is negligible. We also ran the model with higher levels of noise in septum positioning and found that a three times higher (12%) coefficient of variation is necessary to obtain a 10% change in size distribution (Figure 4B inset, and Additional file 1: Figure S5).

Conclusions
In the present study, we present statistical evidence to support the hypothesis that a size-dependent division rate can be used to reconstruct the experimental age-size distribution of E. coli. In contrast, this distribution cannot be generated by a timer model where the division rate depends solely on age. Even though the timer model can maintain cell size homeostasis, it is quantitatively incompatible with the observed size distribution. Our analysis of two different datasets shows the robustness of our conclusions to changes in experimental setup and image analysis methods. Our results therefore confirm the hypothesis of size control of division in E. coli. In addition, our analysis of the timer model shows that it is very sensitive to mathematical assumptions for the growth law of very rare cells of abnormal size, suggesting that this model is unrealistic for any exponentially growing organisms.
Noise in biochemical processes, in particular gene expression, can have a significant effect on the precision of biological circuits. In particular, it can generate a substantial variability in the cell cycle [5]. Therefore we investigated in bacteria the robustness of size control to noise in the single-cell growth rate and septum positioning, using appropriate extensions of the Size Model. We found that variability of the order of what we estimated from E. coli data does not significantly perturb the distribution of cell size. Therefore, in a natural population exhibiting phenotypic noise, the control of cell size is robust to fluctuations in septum positioning and individual growth rates. From a modeling perspective, this demonstrates that the simple Size Model is appropriate for describing a natural bacterial population showing phenotypic diversity.
Our approach is based on comparisons between PDE models and single-cell data for the cell cycle. Such comparisons were attempted a few decades ago using data from yeasts (e.g. [21,33]). Nevertheless, these interesting studies were hampered by the scarcity and poor quality of single-cell data as well as the lack of appropriate statistical procedures to estimate the division rate within the models. In contrast, we used high-precision measurements of tens of thousands of cells in combination with modern statistical inference methods, which allowed us to assess quantitatively the adequacy of different models. We think this approach could prove successful in studying other aspects of the cell cycle, such as the coordination between replication and division or the molecular mechanisms underlying size control of division. Several different mechanisms involved in division control in bacteria have already been unraveled, in particular MinCD inhibition and nucleoid occlusion [40-42]. We believe that a better understanding of the relative roles played by MinCD inhibition and nucleoid occlusion in division control can be gained by analyzing the age-size distributions of minCD and nucleoid occlusion mutants. We are therefore currently performing time-lapse microscopy experiments to record the growth of such mutants.

Data analysis
The data of Stewart et al. contain the results of several experiments performed on different days, each of them recording the simultaneous growth of several microcolonies of the MG1655 E. coli strain on LB-agar pads at 30°C, with a generation time of approximately 26 min [25]. The first 150 min of growth were discarded to limit the effects of non-steady-state growth (cells undergo a slight plating stress when put on microscopy slides and it takes several generations to recover a stable growth rate). For the dataset obtained by Wang et al., the MG1655 E. coli strain was grown in LB at 37°C in a microfluidic device with a doubling time of approximately 20 min. To avoid any effect of replicative aging such as described in [26], we only kept the first 50 generations of growth. In addition, the first ten generations were discarded to ensure steady-state growth. Both datasets were generated by analyzing fluorescent images (the bacteria express the Yellow Fluorescent Protein) using two different software systems. For $s_i$ data, cell segmentation was based on the localization of brightness minima along the channel direction (see [26]). In the same spirit, for $f_i$ data, local minima of fluorescence intensity were used to outline the cells, followed by an erosion and dilation step to separate adjacent cells (see [25]). To measure its length, a cell was approximated by a rectangle with the same second moments of pixel intensity and location distribution (for curved cells the measurement was done manually).
For both datasets we extracted data from three experiments done on different days. We did not pool the data together to avoid statistical biases arising from day-to-day differences in experimental conditions. Each analysis was performed in parallel on the data corresponding to each experiment.

Numerical simulations and estimation procedures
All the estimation procedures and simulations were performed using MATLAB. Experimental age-size distributions, such as those shown in Figure 3A,B, were estimated from the size and age measurements of every cell at every time step using the MATLAB kde2D function, which estimates the bivariate kernel density. This estimation was performed on a regular grid composed of $2^7$ equally spaced points on $[0, A_{\max}]$ and $2^7$ equally spaced points on $[0, X_{\max}]$, where $A_{\max}$ is the maximal cell age in the data and $X_{\max}$ the maximal cell size (for instance $A_{\max} = 60$ min and $X_{\max} = 10$ µm for the experiment $f_1$, as shown in Figure 3A). To estimate the size-dependent division rate $B_s$ for each experiment, the distribution of size at division was first estimated for the cell size grid $[0, X_{\max}]$ using the ksdensity function. This estimated distribution was then used to estimate $B_s$ for the size grid using Equation (20) (for $s_i$ data) or (22) (for $f_i$ data) of Additional file 1. The age-size distributions corresponding to the Size Model (Figure 3E,F) were produced by running the Age & Size Model (Equation (3) in the main text) using the estimated division rate $B_s$ and an exponential growth function ($v(x) = vx$) with a rate $v$ directly estimated from the data as the average of single-cell growth rates in the population (e.g. $v = 0.0274$ min$^{-1}$ for the $f_1$ experiment and $v = 0.0317$ min$^{-1}$ for $s_1$). For the Age & Size Model, we discretized the equation along the grid $[0, A_{\max}]$ and $[0, X_{\max}]$, using an upwind finite volume method described in detail in [43]. The time step dt was set from the grid resolution ($2^7$ points), $\max(v(x))$, $X_{\max}$ and $A_{\max}$, with a safety factor of 0.9, so as to meet the Courant-Friedrichs-Lewy (CFL) stability criterion. We simulated n(t, a, x) iteratively until the age-size distribution reached stability ($|n(t + dt, a, x) - n(t, a, x)| < 10^{-8}$). To eliminate the Malthusian parameter, the solution n(t, a, x) was renormalized at each time step (for details see [43]).
The age-dependent division rate $B_a$ for each experiment was estimated for the cell age grid $[0, A_{\max}]$ using Equations (14) and (16) of Additional file 1. Using this estimated division rate, the age-size distributions corresponding to the Age Model (Figure 3C,D) were produced by running the Age & Size Model. As explained in the main text, we used various growth functions for small and large cells (i.e. for $x < x_{\min}$ and $x > x_{\max}$; between $x_{\min}$ and $x_{\max}$ growth is exponential with the same rate as for the Size Model). For instance, for the fit of the experiment $f_1$ shown in Figure 3C, for x < 2.3 µm and x > 5.3 µm, $v(x) = \max(p(x), 0)$, with $p(x) = -0.0033x^3 + 0.036x^2 - 0.094x + 0.13$. Likewise, for the fit of the experiment $s_1$ shown in Figure 3D, for x < 3.5 µm and x > 7.2 µm, $v(x) = \max(p(x), 0)$, with $p(x) = -0.0036x^3 + 0.063x^2 - 0.33x + 0.67$. For each dataset the polynomial p(x) was chosen as an interpolation of the function giving the length increase as a function of length (shown in Figure 2B for $f_1$ data).
Simulations of the extended size models with variability in growth rates or septum positioning (Equations (23) and (24) in Additional file 1) were performed as for the Age & Size Model, with an upwind finite volume scheme. To simulate Equation (23), we used a grid composed of $2^7$ equally spaced points on $[0, X_{\max}]$ and 100 equally spaced points on $[0.9v_{\min}, 1.1v_{\max}]$, where $v_{\min}$ and $v_{\max}$ are the minimal and maximal individual growth rates in the data.
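The general idea of an upwind finite volume scheme with renormalization at each time step can be sketched on a simplified, one-dimensional size-structured model. The code below is an illustration under several assumptions and is not the scheme of [43]: it tracks size only (no age variable), assumes division into two equal daughters, uses a hypothetical division rate `division_rate`, and chooses the time step from a generic sufficient bound that keeps the explicit update non-negative rather than from the time step formula used in this study.

```python
def division_rate(x):
    """Hypothetical size-dependent division rate (per min), zero below 3 um."""
    return 0.3 * max(0.0, x - 3.0)

def simulate_size_model(n_cells=128, x_max=10.0, growth=0.0274, n_steps=3000):
    """Explicit upwind finite-volume iteration of a size-structured model."""
    dx = x_max / n_cells
    centers = [(i + 0.5) * dx for i in range(n_cells)]
    b = [division_rate(x) for x in centers]
    # Growth speed v(x) = growth * x evaluated at the right face of each grid cell.
    v_face = [growth * (i + 1) * dx for i in range(n_cells)]
    # CFL-type bound (assumption): guarantees 1 - dt * (v / dx + B) > 0 everywhere.
    dt = 0.9 / (max(v_face) / dx + max(b))
    # Arbitrary initial density: uniform between 1 and 5 um.
    n = [1.0 if 1.0 <= x <= 5.0 else 0.0 for x in centers]
    for _ in range(n_steps):
        new = [0.0] * n_cells
        for i in range(n_cells):
            # Upwind fluxes: growth moves cells towards larger sizes.
            inflow = v_face[i - 1] * n[i - 1] if i > 0 else 0.0
            outflow = v_face[i] * n[i]
            new[i] += n[i] - dt / dx * (outflow - inflow) - dt * b[i] * n[i]
            # Each dividing cell yields two daughters of half its size,
            # which fall into grid cell i // 2.
            new[i // 2] += 2.0 * dt * b[i] * n[i]
        # Renormalize to unit mass, which removes the exponential population growth.
        total = sum(new) * dx
        n = [value / total for value in new]
    return centers, n, dx, dt

centers, density, dx, dt = simulate_size_model()
mean_size = sum(x * value for x, value in zip(centers, density)) * dx
print(dt, mean_size)
```

The renormalization plays the role described above for eliminating the Malthusian parameter: the shape of the distribution is iterated while its total mass is kept equal to one. A fixed number of steps is used here for simplicity; a stopping rule based on the change between successive iterates could be substituted.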