Single-cell transcriptome profiling of postnatal cerebellar cells
We have previously used the Math1-GFP and Dcx-DsRed reporter mouse lines to study the cell division modes of developing GNs [22]. As shown in Fig. 1b, the GFP+ high cells represented Math1-expressing GNPs, while red fluorescent-labeled cells corresponded to the Dcx-expressing differentiating GNs. In this study, to focus on GN development, we performed scRNA-seq for both Math1-GFP+ and Dcx-DsRed+ cells isolated from postnatal cerebellum tissues of transgenic mice. Fluorescent-labeled cells were isolated using fluorescent-activated cell sorting (FACS) from the early postnatal cerebellum at postnatal day 7 and 11 (P7 and P11), as GN cells undergo massive proliferation and differentiation during the first 2 weeks after birth [1, 3].
After quality control including doublet removal, we obtained a total of 24,919 FACS-purified cells. Unsupervised clustering using t-distributed stochastic neighbor embedding (t-SNE) revealed thirteen clusters (Fig. 1c). Based on the known annotations of marker genes from the literature [16, 19, 23], we divided these cell populations into eight cell lineages, including GNP/GN cells (expressing Math1, Barhl1 and Pax6), GABAergic progenitors (GABAergic pro) (expressing Ptf1a, Ascl1, and Lhx5), GABAergic interneurons (GABAergic int) (expressing Pax2 and Lbx1), astrocytes (expressing Fabp7, Slc1a3, Aldh1l1, and Aqp4), oligodendrocytes (expressing Olig1, Olig2, and Pdgfra), fibroblasts (expressing Col1a1), microglia (expressing Aif1, Cd68, and Tmem119), and ciliated cells (expressing Dynlrb2 and Meig1) (Fig. 1d). We found that the proportion of GNPs/GNs in Math1-GFP+ sorted cells was significantly larger than in Dcx-DsRed+ sorted cells (92% in Math1-GFP+ cells, 77% in Dcx-DsRed+ cells, P < 0.0001, Pearson’s chi-square test; Fig. 1e, f), in line with the fact that Dcx is broadly expressed in both GNs and other migrating neurons [24].
Considering that the biased FACS-based sorting strategy may result in the loss of some cell populations/states of developing GN cells, we also collected whole cerebellar cells from wild-type (WT) mice for scRNA-seq (Additional file 1: Figure S1a). We obtained 20,367 cells from two WT samples, which were then integrated with cells isolated from Math1-GFP+ and Dcx-DsRed+ reporter mice (45,286 cells) (Additional file 1: Figure S1b). This analysis showed two additional cell types, Purkinje cells (PCs, expressing Calb1 and Car8) and unipolar brush cells (UBCs, expressing Eomes) in WT cerebellar cells (Additional file 1: Figure S1f). Additionally, astrocytes (expressing Fabp7, Slc1a3, Aldh1l1, and Aqp4) were also mainly derived from unsorted samples (Additional file 1: Figure S1d and S1e). Notably, there was no unique granule neuron assigned to WT samples, suggesting that our FACS strategy did not result in the loss of any GN subsets (Additional file 1: Figure S1d and S1e). Nevertheless, probably due to a high proportion of granule neurons in developing cerebella, the lineage-tracing strategy in our study did not show an obvious advantage over WT samples in enriching granule neurons for single-cell sequencing.
Identifying distinct states associated with postnatal GN development
The progenitor/differentiation states of GNs have been characterized by many previous studies, which suggested that there are at least two major cell subsets of GN cells referred as GNPs and GNs that differ in cell cycle activities and spatial locations [25]. To obtain a better understanding of the cellular states associated with developing postnatal GN cells, we computationally isolated and re-clustered GNPs/GNs of Math1-GFP+ and Dcx-DsRed+ cells, which revealed nine sub-clusters (Fig. 2a). Five clusters (cluster 2, 3, 4, 5, and 6) highly expressed the progenitor cell marker Math1 (Fig. 2c), while the remaining clusters (cluster 0, 1, 7, and 8) showed upregulated expression of Dcx (Fig. 2c), which corresponds to differentiating/differentiated cell states. GNPs have been shown to possess high proliferation potential. In keeping with this notion, most Math1-expressing cells showed markedly unregulated cell cycle activities (Fig. 2d). In contrast, Dcx-expressing cells showed downregulated cell cycle activities and an upregulated gene set that was previously used to highlight the differentiating/differentiated GNs (Rbfox3, Grin2b, and Neurod1) [23, 26] (Fig. 2d). As expected, Dcx-expressing cells were largely derived from Dcx-DsRed+ sorted samples (Fig. 2b). In contrast, Math1-GFP+ sorted strategy seems not to be able to enrich GNPs. We speculate that this is due to our FACS threshold that was set for collecting all Math1-expressing cells but not for Math1high GNPs (Additional file 1: Figure S1c).
To gain a global understanding of the undifferentiated/differentiated states of GNs, we next applied Moran’s I test implemented in Monocle 3 [27, 28] to dissect the underlying transcriptional modules. This analysis identified seven modules differentially expressed within and among Louvain component clusters in Monocle 3 (Fig. 2e, Y-axis). All GNPs/GNs were then scored with representative genes of each module (Fig. 2e, X-axis). By hierarchical clustering and subsequent gene ontology (GO) enrichment analysis, these seven modules were further grouped into four main modules (Additional file 4: Table S3) that may correspond to four distinct states of GN cells (Fig. 2e, Additional file 1: Figure S2a). Module A contained many genes correlated with nuclear division and cell cycle (Additional file 4: Table S3), corresponding to the dividing Math1-expressing GNPs (Fig. 2e). Module B was highly expressed in a small population of Math1-expressing GNPs with downregulated cell cycle activities, likely corresponding to the non-dividing GNPs (Fig. 2e). In contrast, modules C and D characterized by gene ontology as associated with neurogenesis and neuron differentiation were highly upregulated in Dcx-expressing non-cycling cells, indicating that there were likely two subsets of differentiated GNs (Fig. 2e). We next scored individual cells according to these four modules, which assigned GNPs/GNs into dividing GNPs (corresponding to module A), non-dividing GNPs (corresponding to module B), GNs I (corresponding to module C), and GNs II (corresponding to module D) (Fig. 2f, g).
We next assessed the transcriptional signature of these four states of GNs. As shown in Fig. 2 h, both dividing and non-dividing GNPs highly expressed progenitor-associated signature genes (such as Math1, Barhl1, Hey1, and Hes6), but differed in cell cycle-related gene expression. In addition to the well-known marker genes, some novel marker genes were associated with the GNP state. For example, Srebf1 and Tead2, which were both introduced as critical regulators of cerebral neuron development [29, 30], were highly expressed in the progenitor state of GN. To validate this finding, we then checked their expression patterns with in situ hybridization (ISH) from the Allen Developing Mouse Brain Atlas [31]. As shown in Fig. 2i, Srebf1 and Tead2 are both expressed in the outer EGL, similar to Math1 expression.
We next turned our attention to the differentiated GNs and found that GNs I highly expressed Vim, Ebf3, Apc2, Sox4, Nhlh2, and Nhlh1 (Fig. 2h). Among these markers, Vim has been widely linked with epithelial-mesenchymal transition and cell migration [32, 33] and Nhlh1 is a TF linked to the onset of granule cell differentiation in migrating granule cell precursors [34]. In addition, Nhlh2 was previously found in the premigratory zone of the EGL where GNPs undergo the initial stages of neuronal differentiation [35] and Apc2-deficient neurons could impair the migration of GNs [36]. These findings indicated that GNs I likely represented a group of differentiating/migrating GNs. In line with our expectations, ISH analysis revealed that the GNs I marker Nhlh1 was strongly expressed in the inner EGL (Fig. 2i). Other markers detected in GNs I, such as Sox4 and Ebf3, were also expressed in the inner EGL (Fig. 2i). GNs II highly expressed markers that have been reported to correspond to GN differentiation such as Grin2b, Calb2, and Cntn1 [23] and a novel marker such as Car10. The expressions of Grin2b, Calb2, and Car10 were detected within the IGL region (Fig. 2i), suggesting that GNs II represented a population of more committed GNs. Together these findings show that the delicate cell states of developing GNs and associated signature genes were highlighted at single-cell resolution, revealing four distinct states of postnatal developing GNs. Whether some unreported molecules in developing GNs such as Srebf1, Tead2, Vim, Sox4, Ebf3, and Car10 play a vital role in regulating the undifferentiated/differentiated states of GNs should be explored in future studies.
TF regulatory networks underlying cell states of GNs
Cell phenotypes/states are governed by core regulatory TFs that interact with cis-regulatory elements, namely regulons, for directing transcriptional programs [37]. We next utilized SCENIC [37] to identify the master TF networks associated with the four cell states in the GN cells. This result revealed that the four states of GN cells were constructed by distinct regulons (Fig. 3a). For example, as expected, the dividing GNPs are driven by many cell cycle-related regulons, including E2f2 and Ezh2 (Fig. 3b). Of interest, the non-dividing state of GNPs was orchestrated by a unique set of regulons, such as Zeb1 and Hey1 (Fig. 3b). The subsequent ISH examination confirmed that Zeb1 and Hey1 were expressed in the outer EGL (Fig. 3c). GNs I was highlighted by the regulon of Neurod1, which has been previously demonstrated to terminate GNP proliferation and thereby promote migration and differentiation [38]. In addition, we found that Eomes, the specific marker of UBCs [39], was also upregulated in GNs I (Fig. 3b). In ISH, Neurod1-positive cells were enriched in the inner EGL and molecular layer/Purkinje cell layer (ML/PCL), validating the migrating state of GNs I (Fig. 3c). GNs II showed significant upregulation of the En1 regulon, which was strongly expressed in the inner EGL at E17.5 in the previous report [40]. In comparison, En1 has dynamic expression patterns throughout cerebellar development. Examination of En1 at P4 revealed that most En1-positive cells were located in the IGL (Fig. 3c). Notably, the GNs II preferentially activated many TF regulons that have not been characterized in GN development, including Bmyc, Bcl3, and Esrrg (Fig. 3b). Taken together, SCENIC analysis supported that postnatal GNs displayed four distinct states.
Identifying cell populations in the cerebellum with ST
Although ISH data provided evidence for the spatial location of the identified GN subsets, we next used ST to capture more comprehensive spatial information of these cell types. ST was performed on brain sections from two C57BL mice at P7 (Additional file 1: Figure S3a), which generated a total of 473 individual spots on the H&E-stained slice of the developing cerebellum at P7. We then performed the dimensionality reduction and clustering analysis on the spatial (spot) gene expression profiles and identified nine sub-clusters (Fig. 4a). Based on the expression of the cell-type markers, these clusters primarily represented layer-enriched neuron populations present across the EGL, ML/PCL, and IGL to WM areas (Fig. 4b). For example, spots of cluster 2 and 6 highly expressed GNP markers such as Mki67, Math1, and Barhl1, corresponding to the EGL. Spots of clusters 5, 7, and 8 with strong expressions of PC marker genes (such as Calb1, Pcp4, and Car8) mapped to the ML/PCL. Of note, here we merged ML and PCL together because of the limited resolution of ST spot (55–100 μm). Next, we mapped the clustered spots back to their original coordinates in the cerebellar section and found that t-SNE clusters were projected into different histological annotations, supporting the ability to identify spatial regions based on ST gene expression (Fig. 4c–d, Additional file 1: Figure S3b–S3c).
To further reveal the cell types of mouse cerebellum at P7 in the spatiotemporal overview, we next applied intersection analysis [41] to compute the overlap between each pair of cell-type-specific and region-specific gene sets and performed a hypergeometric test to assess whether their overlap is significantly enrichment (P < 0.05) or depletion (P > 0.05) than expected by chance (for example see Fig. 4f; see “Methods”). The enrichment of overlap between the cell type and region is shown as a heatmap in Fig. 4e. We found that GNP and GN-specific genes in the postnatal cerebellum of our scRNA-seq data overlapped with the set of genes specific to EGL and IGL identified by ST, respectively (Fig. 4e; P < 0.001). For GABAergic lineage cells, GABA progenitors and interneurons were enriched in WM as previously reported [42] (Fig. 4e; P < 0.001), while ML/PCL were indeed enriched with PCs (Fig. 4e; P < 10−10). As expected, the WM region was also enriched with glial lineage cells [43, 44] like astrocytes and oligodendrocytes (Fig. 4e; P < 10−10). In addition, astrocytes were also enriched in ML/PCL and IGL (Fig. 4e; P < 0.05). Similar to what has been reported [44], it might be that there are three types of astrocytes in the murine cerebellar cortex, Bergmann glia in the PCL, fibrous astrocytes in WM, and protoplasmic astrocytes in IGL, leading to such enrichment. However, it seems that ST is unable to map the EGL location of astrocytes in developing cerebellum, as several studies have reported that there is a small population of scattered Nestin-expressing astrocyte progenitors within the EGL in addition to other locations (ML/PCL, IGL and WM) [45, 46]. It is possible that the high enrichment of granule cell in EGL attenuate the phenotype of astrocyte progenitor. In addition, the region of WM was also enriched with microglia, in line with the previous study [47] (Fig. 4e; P < 10−10). Fibroblasts were detected in both EGL (Fig. 4e; P < 10−10) and WM (Fig. 4e; P < 0.05), and ciliated cells were enriched in WM (Fig. 4e; P < 0.001). Taken together, nearly all cell lineages of cerebellum could be accurately positioned in ST, supplying a full characterization of the developing cerebellum at P7. By mapping cell types to ST, it makes sure that we could subset the regions corresponding to GN cells precisely for further study.
Identification and mapping of GN cell-type subpopulations across cerebellar regions
The EGL exclusively produces GN progenitors, which then migrate inward through the ML/PCL into IGL during early postnatal life. We further investigated subpopulations within postnatal GNs in the ST data and paid attention to the regions of EGL, ML/PCL, and IGL and discarded 60 spots defined as WM, which is mainly enriched for glial lineage cells (Fig. 5a, c). As each ST spot contains approximately 30 cells that reflect a mixed expression signature, we next filtered signature genes associated with other cell types, especially PCs within PCL and other cell types including glial cells and GABAergic lineage that may localize in ML/PCL during development (Additional file 1: Figure S4b). This step generated a total of 127 genes that are specifically associated with the GN identity (Additional file 1: Figure S4b, Additional file 6: Table S5). Using this gene set, we scored each spot (Fig. 5b, e) and found that GNPs including non-dividing GNPs and dividing GNPs were enriched in the EGL (P < 0.01; two-sided unpaired Wilcoxon test), while GNs I were enriched in the ML/PCL (P < 0.01), suggesting their migrating role during development. GNs II that are thought to be mature GNs were enriched in the IGL (P < 0.0001). Correlation analysis also supported the correspondence between the four states of GN cells and location of the three layers (Fig. 5f).
Next, we checked the expression levels of certain cell-type markers in the ST H&E image. As shown in Fig. 5g, markers of GNPs such as Math1 and Mki67 were expressed in the EGL, while Mvd, one of the genes specifically in GNs I, was expressed in the ML/PCL and the IGL region expressed Cntn1, which was similar to the outcome of ISH in Fig. 2h. Collectively, by mapping across the four cell types of scRNA-seq and the cerebellum tissue of ST data, the cell states of postnatal GN cells were described spatially and in line with previous ISH data. However, ISH shows a limitation in its dependency on available in situ probes. Therefore, the overview of cell states of GNs by ST enabled us to gain a better understanding of the expression patterns of GN cell states. Combined ISH and ST analyses confirmed that GN cells at P7 were described thoroughly as a sequential commitment from GNPs in the EGL, migrating GNs in the ML/PCL to the finally mature granule cells in the IGL, as the model shows in Fig. 5d.
Developmental trajectories within GN lineage cells
We next sought to illuminate the lineage relationship of postnatal granule cells by performing cell trajectory analysis using scVelo [48], a new version of RNA velocity analysis [49], which applies a likelihood-based dynamical model to solve the full gene-wise transcriptional dynamics. We performed this analysis for each individual sample and observed a consistent cell differentiation trend from GNPs to differentiated GNs (Fig. 6a, Additional file 1: S5a–S5d). For example, in the P7 Math1-positive sample, RNA velocity predicted non-dividing GNPs as the root cells that direct cellular differentiation by two different paths (Fig. 6a, c). A substantial number of non-dividing GNPs showed differentiation tendency towards dividing GNP phase (represented by cell #1, Fig. 6a, b) and subsequently migrating GNs (represented by cell #3, Fig. 6a, b) as well as the differentiated GNs (represented by cell #4, Fig. 6a, b) in P7 Math1-positive sample. In contrast, there were a few non-dividing GNPs that directly differentiate into migrating GNs without undergoing the transit-amplifying state in P7 Math1-positive sample (represented by cell #2, Fig. 6a, b).
Besides modeling cellular dynamics, scVelo analysis also enabled us to investigate the transcriptional dynamics of variable genes during cellular state progression. We focused on the TF genes that may act as driver genes underlying cell state transition. Visualizing the ratio of unspliced to spliced mRNA abundance showed a tightly controlled expression of cell state-associated TFs (Fig. 6d). For example, the expressions of non-dividing GNP markers Barhl1 and Tead2 (Fig. 6e) were downregulated when cells enter the cycling progenitor state. In contrast, Neurod1 and Scrt2 that play a modulatory role in cortical neurogenesis and neuronal migration [50] switched at GNs I (Fig. 6e). For GNs II, Neurod2, which is associated with neural differentiation, started to increase in expression, and Esrrg switched in line with the SCENIC results (Fig. 3b). Overall, the pseudotime ordering identified by RNA velocity revealed a sequential commitment from non-dividing GNPs to dividing GNPs, migrating granule cells and the finally mature granule cells.
Relationship between tumor cell identity and developmental GN cell origins
Previous studies have indicated that SHH-MB may mainly originate from early postnatal GN cells [16, 51]. We next asked whether MB tumor cells exhibited similar states resembling developing GNs. To this end, we performed scRNA-seq on MB developed after 50 weeks in Patched+/− mice. After quality control including doublet removal, we obtained a total of 18,372 cells from MB that developed in three mice (MB-1, MB-2, and MB-3) (Fig. 7a). Based on the known annotation of marker genes from the literature, we grouped cells of nine clusters into four cell types: GNP-like tumor cells (expressing Math1 and Grin2b), microglia (expressing Aif1 and Cd68), T cells (expressing Cd3d and Cd3e), and endothelial cells (expressing Cldn5 and Cdh5) (Additional file 1: Figure S6a). We next sought to distinguish malignant cells from non-malignant cells by inferring copy-number variations (CNVs) from single-cell transcriptome profiles as described by many previous studies [52,53,54]. As expected, this analysis showed that GNP-like tumor cells in all three MB samples had remarkable CNVs compared with normal granule neurons at P7 (Fig. 7b), confirming their malignant identity.
Previous single-cell analyses have shown that SHH-MB phenotypically resembles developing GNs [16]. In line with this finding, k-Nearest Neighbor (kNN) analysis [55] predicted that the three tumors showed a transcriptional similarity to the lineage of GN cells rather than other neural cell lineages (Fig. 7c), supporting the GN origin of the three tumors. Furthermore, we compared the transcriptional similarity between tumor cells and developing cell types using Scanorama [56], which integrated tumor cell and developmental neural lineage cell-type datasets after batch effect correlation. This analysis also showed that tumor cells were transcriptionally closer to the lineage of GN cells (Additional file 1: Figure S7). Based on this observation, we next examined whether tumor cell states recapitulate the proliferation/differentiation states of normal developing GNs. For this purpose, we scored each tumor cell using the signatures of the four states of developing cells (Fig. 7d). The results revealed that the intra-tumoral states of each tumor highly resembled GN states in normal development. Further examination of signature genes associated with these four cell states supported this observation (Additional file 1: Figure S6b).
To get an unbiased understanding of intra-tumor cell states, we next applied non-negative matrix factorization (NMF) [57] to extract the underlying transcriptional programs specific to each tumor. This analysis revealed four meta-programs (A, B, C, and D) that were shared across tumors (Fig. 7e). Meta-program A was characterized by markers of the cell cycle, such as Top2a and Mki67 (Additional file 7: Table S6), indicating the presence of proliferating subpopulations of tumor cells in all tumors. Meta-program B contained GNP-associated genes as mentioned above, such as Math1 and Zeb1, likely corresponding to undifferentiated progenitors in each tumor. Meta-program C was defined by many markers of differentiated GNs including Neurod1 and Nhlh1, probably reflecting the differentiation states of tumor cells. In addition, we identified a cancer metabolism-related meta-program (meta-program D), highlighted by Shmt2 [58] and Phgdh [59] (Additional file 1: Figure S6c). To relate these meta-programs with cellular states of developing GNs, we compared their similarity by calculating Jaccard index using associated signature genes. As expected, the cell cycle meta-program A was highly similar to the dividing GNPs, while the progenitor-like meta-program B was similar to non-dividing GNPs. The differentiation-related meta-program C showed similarity to both migrating GNs and mature GNs (Fig. 7f). Moreover, the cancer metabolism meta-program D did not show a clear similarity with any states of developing GNs, suggesting that upregulation of the metabolism-associated signature may be tumor-specific. Nevertheless, our single-cell analysis demonstrated that MB has similar states with developing GNs, which provides a new angle to understand the cellular states of tumor cells.
Delineating intra-tumoral cellular trajectories
To further understand the intra-tumoral cellular states in our MB samples, we used scVelo to visualize the cell-cell transition trends (Fig. 8a). By computing cell-to-cell transition probabilities between different states, we found that both non-dividing GNP-like and dividing GNP-like tumor cells could serve as tumorigenic cells (root cells) in all three tumors (Fig. 8a). This is different from the normal states, in which most root cells were predicated to be non-dividing GNPs (Fig. 8a). Subsequent quantification analysis of transition probabilities supported that dividing GNP-like tumor cells exhibit higher potential to give rise to non-dividing GNP-like tumor cells rather than differentiated progeny cells especially in MB-1 and MB-2, compared with GNPs in normal developing cerebellum (P < 0.0001, Pearson’s chi-square test; Fig. 8b). This finding generally supported the notion that tumor progenitor-like cells displayed more self-renewal potential that impeded the differentiation process to a certain extent, which is consistent with a previous study [23].
To understand the discrepancy between normal development and tumorigenesis, we next performed gene set variation analysis (GSVA) [60] on both transformed GN and normal GN (Additional file 1: Figure S8a). As expected, we found that the SHH signaling pathway was overactivated in tumor cells especially in the progenitor-like tumor cells. In addition, other tumor-related pathways such as cell cycle, metabolism, stem cell, hypoxia, and TGF-β pathways were also upregulated in the tumor samples compared with Math1-positive mice at P7 and P11. Of these tumor-specific pathways, some pathways such as the tRNA aminoacylation pathway, STAT5 pathway, and ALKBH8 pathway, have been reported to contribute to cancer progression in various cancers [61,62,63]. We next assessed the prognostic value of these pathways and found a total of eight pathways that can predict a significantly worse outcome in the overall survival of human MB, including cell cycle, hypoxia, stem cell, tRNA aminoacylation, STAT5, and ALKBH8 pathways (Additional file 1: Figure S8b).
Finally, we considered the clinical implications of the distinct GN-like tumor populations. We therefore used signatures of four tumor cell states to score 405 human SHH-MB bulk samples. We found that patients with higher dividing GNP-like signals, whose tumors presumably contain more dividing GNP-like tumor cells, had a significant poor prognosis (P = 0.0205; log-rank test; Fig. 8c). This data suggested that dividing GNP-like tumor cells plays a direct role in the development of tumorigenic progression. An unbiased comparison of normal dividing GNP cells and dividing GNP-like tumor cells revealed 332 genes (Additional file 8: Table S7) that were preferentially expressed in dividing GNP-like tumor cells, including progenitor-related genes (such as Hes1, Atoh1, Zeb1, and Mycn) and cell cycle genes (such as Cdkn2a, Cdkn1a, Pou4f1, and Ube2b) (Fig. 8d). In contrast, normal dividing GNPs upregulated expression of genes related to neuron migration (such as Neurod1 and Dcx) and neuron differentiation (such as Neurod2, Tubb2b, and Pax6), as confirmed by GO enrichment analysis (Fig. 8e). Therefore, these detailed analyses of intra-tumoral states highlighted the disparity between transformed GNs and normal developing GNs, although they shared similar undifferentiated/differentiated states. Dividing GNP-like tumor cells significantly upregulate cell cycle and progenitor-related gene programs, which to some extent inhibited the maturation of GNs.