Patterns of periodontitis progression across clinical groups
The data presented here were obtained from a prospective multi-center clinical study to identify periodontitis biomarkers (i.e., gum disease) progression described elsewhere [13, 15]. A total of 415 participants were examined every 2 months for 12 months in the absence of periodontal treatment to Additional File 5, Table S1monitor periodontal disease progression, based on clinical attachment level (CAL) measurements. Then, participants received periodontal therapy (scaling and root planning, SRP) and were followed for six more months. At each visit (baseline, 2, 4, 6, 8, 10, and 12 months, and 6 months post-treatment visits), participants received clinical examination and provided subgingival (i.e., below the gumline) microbial samples from the same specific teeth. At the end of the study, three clinical behaviors were observed on the teeth examined: stability, disease progression, and fluctuation (cycles of disease progression/regression) [13, 15].
Our approach is detailed in Additional File 1, Fig. S1a. Fifteen individuals were selected for the present study. Three teeth were chosen in each of them, each representing one of the three clinical groups (3 teeth/participant, 45 teeth total). For each tooth, we analyzed eight samples, representing each of the time points in the longitudinal study (baseline, 2, 4, 6, 8, 10, and 12 months, and 6 months post-treatment), for a total of 360 microbial samples included in this study. Complete details of the study outline are presented elsewhere [13, 15]. To assess whether a sample size of 15 subjects per group likely affords adequate statistical power, we calculated effect size measured as omega-squared (ω2) described in Kelly et al. [16] using Jaccard distance. This method has been specifically designed to estimate sample size for microbiome analysis. We found that with a power of 90%, the stable group has a ω2 of 0.019, the progressing group 0.042, and the fluctuating group 0.022; all smaller than the ω2 of 0.08 that Kelly et al. found in the ten subjects per group.
Time-series modeling and forecasting confirmed the validity of the three clinical trajectories selected to study. We employed the Dickey-Fuller test to determine whether the time series were stationary or non-stationary, using CAL in our predicted trajectories: stable (no change in CAL), progressing (an increase of CAL with time), and fluctuating (up and down changes in CAL), illustrated in Additional File 1, Fig. S1b. Both progressing and fluctuating were non-stationary. Autoregressive Integrated Moving Average (ARIMA) modeling [17], a widely used approach to stationary time-series forecasting, was used for stable samples, whereas differencing [17] was employed for progressing and fluctuating samples. We performed forecasting on CAL profiles as a proxy for disease progression with time for the ΔCAL in stable, progressing, and fluctuating samples (Additional File 2, Fig. S2). Forecasting results indicate that, without intervention, the three patterns previously defined in our samples followed the predicted classification (Additional File 2, Fig. S2b,d,f). The forecasted values for the next 12 months flatten in the stable samples (Additional File 2, Fig. S2b), the ΔCAL (difference of CAL values between time points) grows exponentially in the case of the progressing sites (Additional File 2, Fig. S2d), and it follows a zig-zagging trajectory in the fluctuating sites (Additional File 2, Fig. S2f). As expected, after periodontal treatment, CAL values were reduced in all groups (Additional File 2, Fig. S2a,c,e).
Patterns of community structure during periodontitis progression across clinical groups
We profiled prokaryotic composition for the 16S rRNA gene datasets and eukaryotic composition for ITS1 and ITS2 genes using Kraken2 and Bracken programs with a custom 16S rRNA+ITS database [18, 19]. The profiles of community composition in the three kinds of samples showed a similar composition of the most abundant taxa, although some species’ relative abundance was different (Fig. 1a). For instance, Fusobacterium nucleatum was more abundant in the fluctuating samples than in stable and progressing samples, whereas Streptococcus sp. oral taxon 064 and oral taxon 058 were more abundant in the stable and progressing samples than in the fluctuating (Fig. 1a). Progressing and stable samples also showed specific differences. Lactobacillus panis was consistently more abundant in stable samples, whereas Actinomyces naeslundii was consistently more abundant in progressing samples (Fig. 1a). Also, we assessed α-diversity in the globality of samples under the three different conditions. Shannon diversity, which accounts for both abundance and evenness of the species present, was significantly higher in the fluctuating samples, followed by progressing samples and stable samples with the lower overall α-diversity (Fig. 1b). Similar results were obtained for other α-diversity indexes such as richness and Fisher α-diversity (Additional File 3, Fig. S3a). Shannon diversity values were consistently higher in the progressing and fluctuating sites than in stable sites for the study’s whole period (Fig. 1c and Additional File 3, Fig. S3b). Remarkably, fluctuating samples were the ones showing higher α-diversity, statistically significantly higher than the other two groups, and at months 8, 10, and 12 (Fig. 1c). After treatment, all conditions showed a stabilization in α-diversity (Fig. 1c).
Shannon diversity’s rate of change was higher during disease progression for most of the sampled points, while the rate of change of Shannon diversity was consistently lower in the stable sites (Fig. 1d). Thus, the periodontal treatment seems to stabilize the rate of diversity change in the progressing sites and lower it in the other two conditions (Fig. 1d).
Three different β-diversity metrics—the Jaccard index (for membership), Bray–Curtis (B.C.) dissimilarity (for abundance), and weighted-Unifrac (for phylogenetic relatedness) were used to assess bacterial communities differences. β-diversity multivariate tests yielded significant results for the progressing and fluctuating sites when compared to the stable samples, whereas progressing and fluctuating site samples were not significantly different (Additional File 4, Fig. S4).
Taxa associated with one of the analyzed conditions at different times were identified using the linear discriminant analysis effect size (LEfSe) [20]. Taxa with effect size (LDA) scores higher than 2 could be considered biomarkers of the different conditions. The most specific associated taxa samples were the fluctuating samples with Fusobacteria and Flavobacteria frequently associated with them (Fig. 2). Stable samples showed Firmicutes and, most specifically, the family Streptococcaceae as the most abundant phylogenetic unit, while in the case of progressing sites after the sixth-month Actinobacteria (family Actinomycetaceae) was the most abundant phylogenetic unit (Fig. 2). Surprisingly, in the samples of month 18, the biomarkers of the different communities seemed to diverge instead of converging due to periodontal treatment (Fig. 2).
Temporal network analysis and network cartography
Microbial abundances are not independent, and traditional statistical metrics (e.g., correlation) for detecting OTU-OTU relationships can lead to spurious results. Moreover, microbial sequencing-based studies typically measure hundreds of OTUs on only tens to hundreds of samples; thus, inference of OTU-OTU association networks is severely underpowered. We used SPIEC-EASI to reconstruct ecological networks. This statistical method addresses both of these issues [21]. We first performed temporal network analysis and generated dynamic networks as described in the vignette of the tsna R package [22]. We then assigned roles to the different species in the ecological network, as described by Guimerà and Amaral [23]. These authors demonstrated that nodes could be classified into universal roles according to their pattern of intra- and inter-module connections. They are thus yielding a “cartographic representation” of complex networks [23]. Within-module degree z measures how “well-connected” a particular node (bacterial species) is to other nodes in the same module. High values of z indicate high within-module connectivity and vice versa. The participation coefficient (P) defines how the node is positioned in its module and with respect to other modules. The participation coefficient is close to 1 if its links are uniformly distributed among all the modules and 0 if all its links are within its module [23]. Based on z and P, nodes in a network can be classified as hubs if z ≥ 2.5 and non-hubs if z < 2.5. In all three conditions, most nodes were classified as peripheral or ultra-peripheral, that is non-hub nodes with most links within their modules (0.05 < P ≤ 0.62) [23] (Fig. 3a–c). In stable sample networks, Actinomyces gerencseriae, Anaeroglobus geminatus, Desulfomicrobium orale, Peptostreptococcaceae bacterium oral taxon 369, Polyporales sp., Treponema sp. oral taxon G76, acted as hub connector; that is, hubs with many links to most of the other modules (0.30 < P ≤ 0.75). In the progressing sites, Actinomyces sp. oral taxon 171, Actinomyces sp. oral taxon 448, Aspergillus caesiellus, Candida albicans, Candida quercitrusa, and Cardiobacterium valvarum were hub connectors (Fig. 3a, b, Additional File 5, Table S1). Finally, in fluctuating sites, Actinomyces sp. oral taxon 169, Actinomyces sp. oral taxon 171, Actinomyces sp. oral taxon 175, and Aspergillus flavus acted as hub connectors (Fig. 3c, Additional File 5, Table S1).
We then performed a temporal analysis of two network centralities: degree and betweenness centrality. The degree of a node (or a species) refers to the number of links to other interacting partners in the network, while the betweenness of taxa is a measure of taxa’s control in the network. Stable samples showed an increase in degree centrality throughout the study (Fig. 3d) while progressing sites showed a high degree of centrality consistently until month 10 when there was a steep decrease. Interestingly, fluctuating sites showed low-degree centrality compared to the other two conditions until the tenth month when, as in the progressing sites, a steep increase occurred (Fig. 3d). High betweenness centrality implies that a corresponding node has more influence in the network and vice versa. In betweenness centrality, progressing and fluctuating sites showed a similar pattern to the one observed for degree centrality, low values in fluctuating sites, and higher in progressing sites (Fig. 3e). Stable samples showed, in general, sharper oscillations, with a high peak between 2 and 6 months (Fig. 3e). In all three cases, there was a significant increase in-betweenness after treatment, special pronounce in the case of the progressing sites (Fig. 3e). Finally, we measured the number of edges formed at different times in the different temporal networks. Progressing sites showed the most distinct profiles with a high peak at month 4 and a lower number of edges formed at month 8 (Fig. 3f). At months 4 and 6, the stable networks showed a significantly lower formation of nodes than progressing and fluctuating sites (Fig. 3f).
Temporal community dynamics and dominance structure vary with clinical progression
To better understand the dynamics in microbial species composition within and across sample types, we first visualize the degree to which the most abundant species reorder over time and the effect of periodontal treatment on this rearrangement of species. There were substantial shifts in species dominance in all communities over the sampling period. Rank clocks highlight that there has been high reordering in the relative abundance of the dominant species in the fluctuating and progressing microbiomes, but to a less extent in the stable communities (Fig. 4a).
The temporal measure of species reordering measured as Mean Rank Shifts (MRS) showed large oscillations during the study period. MRS progression describes relative changes in species rank abundances and indicates the degree of species reordering between two time points. Calculating mean rank shifts highlights that communities’ stability diverged from the beginning of the study, with changes an increase of MRS at the beginning of the study in progressing sites and a decrease during the same period (2 to 6 months) in stable and fluctuating sites (Fig. 4b). At month 8, there was a peak in all the groups, after which all of them showed a steep decrease, especially pronounced in the stable samples (Fig. 4b). Interestingly, periodontal treatment resulted in an MRS increase in all groups (Fig. 4b).
We also assessed the rate and pattern of variability within a community, which indicates whether species reordering over time results in a directional change; that is, sample distance increases over time. Differences in species composition were characterized by Euclidean distances, calculated on pairwise communities across the entire time series. The regression line’s slope indicates the rate and direction of compositional change in the community [24]. Communities converge if sample distances decrease over time, whereas if sample distance increases over time, the communities are undergoing directional change [24]. The stable samples showed a more significant directional change through time than the progressing and fluctuating samples (Fig. 4c). The progressing and fluctuating communities were unstable with a negative slope and undergoing convergence, with the fluctuating samples showing a more negative value of the slope (Fig. 4c, d).
Defined sub-communities followed marked temporal fluctuations
Next, we considered the degree to which different species clustered in their abundance profiles during the study. Using an infinite Gaussian process mixture model [25], we found that in the case of the stable samples, there were 12 clusters of behavior, while in the progressing and fluctuating groups, there were 11 and 13, respectively (Additional File 6, Fig. S5 and Additional File 7, Table S2). One cluster had the most species in all three conditions, more than 200 (cluster 2). More importantly, these clusters shared a large proportion of species. One hundred eighty-three species were common to all three clusters 2 (Additional File 6, Fig. S5d).
Furthermore, although the composition of these clusters was similar, their behavior was very different. For example, cluster 2 from fluctuating samples and the other two groups followed opposite trajectories at the beginning of the study, but after month 8, while stable samples showed a decrease in the abundance of cluster 2, progressing sites maintained a high proportion of those bacteria (Additional File 6, Fig. S5e). On the other hand, cluster 2 of the fluctuating samples followed a completely different profile. While in high proportion up to month 6, it decreased in proportion at the end of the study (Additional File 6, Fig. S5e).
One unexpected result was obtained when we plotted all clusters on the same graph, and temporal fluctuations of the different clusters were observed, in what looks like a succession of different communities, with peaks and valleys of abundances (Fig. 5a–c). What is more, in some cases, two or more clusters shared temporal peaks but behaved differently at other times. Thus, the periodontal treatment seemed to work by selecting specific clusters while reducing others in abundance (Fig. 5a–c). In particular, cluster 2 was relatively less abundant in progressing and fluctuating samples but no stable samples (Fig. 5a–c).
Progressing samples are characterized by high synchrony, where different species respond similarly through time
One key question in the relationship between species diversity and stability is how the community’s components affect the whole community’s aggregate properties’ stability. Unstable species populations may still maintain stable communities, which is a time series, is reflected by a pattern in which species negatively covary whereas total community stability remained relatively stable. The previous section showed that different communities followed a succession in time, with some clusters following similar or opposite profiles (Fig. 5a–c).
To assess community stability, we used Tilman’s method to aggregate species abundances within replicate and time and utilize these values to calculate community stability as the temporal mean divided by the temporal standard deviation [26]. As expected, the stable communities reported higher stability values, as defined by Tilman [26], than progressing communities (Fig. 5d). Surprisingly, fluctuating sites were only slightly less stable than the stable communities (Fig. 5d).
The variance ratio (V.R.) was one of the first metrics to characterize species covariance patterns [28]. It was used in an early synthesis paper of species covariance in long time series [29]. The metric compares the community’s variance as a whole relative to the sum of the individual population variances. If species vary independently, then the variance ratio will be close to 1. A VR < 1 indicates predominately negative species covariance, whereas a V.R. > 1 indicates that species generally positively covary. Our results show that in the species of the stable samples, covary predominantly negatively (V.R. < 1), whereas in the progressing and fluctuating samples, they do positively (V.R.> 1) (Fig. 4e). A significant criticism of the variance ratio is that it is sensitive to species richness, which is of particular concern when the metric is used to compare communities with different species richness levels. Other alternative metrics that quantify species asynchrony have been developed in part to respond to this issue. We measured synchrony using the Loreau and Mazancourt metric that compares the variance of aggregated species abundances with the summed variances of individual species [27]. This metric ranges from 0 (perfect asynchrony between species) to 1 (perfect synchrony). Stable and fluctuating samples presented a lower synchrony level in the communities than the progressing (Fig. 5f).
Progressing and fluctuating sites follow a neutral model of phylogenetic recruitment of new species
Knowing when and why new species are recruited into microbial communities has significant implications in understanding the dynamics of health and progression and implications in devising strategies to managing the microbiome to restore a healthy status. We used a mathematical model developed by Darcy et al. that describes the order in which new species are detected in microbial communities over time within a phylogenetic framework [30]. The model estimates the degree to which the recruitment of new species is more or less likely when a close relative has been previously recruited. The model estimates an empirical dispersion parameter D, which quantifies the degree to which first-time species detections are phylogenetically related. If D ≠ 0, then species are preferentially added if they have relatively low (D < 0) or relatively high (D > 0) phylogenetic distance to the resident community, yielding accumulations of total phylodiversity that are relatively slow (D < 0) or relatively fast (D > 0) compared with the neutral model (D = 0) [30]. Figure 6a shows the phylogenetic accumulation of the three datasets. New species with a previously detected close relative contribute little phylodiversity and cause slow phylodiversity accumulation (blue). New species that do not have a close relative contribute more phylodiversity and cause faster accumulation (green). Stable samples showed a relatively faster accumulation of phylogenetic diversity than the neutral model (green fraction is reduced with time). In contrast, the progressing and fluctuating samples followed the neutral model of phylogenetic accumulation (green and blue fractions do not change with time) (Fig. 6a). Figure 6b shows the estimates for D’s empirical value, which is the value at which ΔPD = 0. Figure 6c shows the distribution of D estimates where dots within violins are means. While stable samples show mean estimates of D much higher than 0, new species are more likely to be recruited if they are phylogenetically distant (overdispersed) from previously recruited species, progressing and fluctuating sites followed a neutral model, D ≤ 0.
The “individual null” test results confirm the previous conclusions; progressing and fluctuating sites followed a neutral model of phylogenetic recruitment of new species (Fig. 6d). This “individual null” accounts for differences in total species relative abundance across a time series, while the simulation above only considers the presence-absence of species.