Social isolation-induced epigenetic and transcriptional changes in Drosophila dopaminergic neurons

Epigenetic mechanisms play fundamental roles in brain function and behavior and stressors such as social isolation can alter animal behavior via epigenetic mechanisms. However, due to cellular heterogeneity, identifying cell-type-specific epigenetic changes in the brain is challenging. Here we report first use of a modified INTACT method in behavioral epigenetics of Drosophila: a method we call mini-INTACT. Using ChIP-seq on mini-INTACT purified dopaminergic nuclei, we identified epigenetic signatures in socially-isolated and socially-enriched Drosophila males. Social experience altered the epigenetic landscape in clusters of genes involved in transcription and neural function. Some of these alterations were predicted by expression changes of four transcription factors and the prevalence of their binding sites in several clusters. These transcription factors were previously identified as activity-regulated genes and their knockdown in dopaminergic neurons reduced the effects of social experience on sleep. Our work enables the use of Drosophila as a model for cell-type-specific behavioral epigenetics.

. 176 In summary, when averaged over entire gene bodies there are small but statistically significant 177 changes in histone marks, but when examined in islands detected by SICER there are much larger changes, 178 often restricted to regions such as the first exon of a gene (for activating mark H3K4me3). Daytime activity is significantly higher in SH flies as compared to GH flies (Ganguly-Fitzgerald et 207 al., 2006), suggesting that metabolic activity might be higher in SH flies. It is also known that 208 mitochondrially-encoded genes are upregulated in waking flies (Cirelli and Tononi, 1998). Consistent with 209 these observations, in our RNA-seq dataset we found that of 15 known mitochondrially encoded genes, 14 210 were higher in SH than in GH flies (p=0.0005, binomial test). 211 In summary, transcript levels of many genes expressed in dopaminergic neurons were changed by 212 social housing conditions, including those of many epigenetic reader and writer genes. 213

Social experience alters epigenetic landscape 214
To understand how social experience affects the epigenetic landscape of dopaminergic neurons, we 215 focused on epigenetic changes observed in the top 40% of genes by mRNA expression levels ("expressed 216 genes"). We clustered the z-score normalized differences between GH and SH flies for all 6 epigenetic marks 217 and for mRNA, and performed k-means clustering as in Shen et al. The first five clusters are enriched for house-keeping functions, and include mitochondrial, ribosomal, 223 and proteasome genes. However, the last three clusters (6-8, containing genes with higher expression) are 224 enriched in neural and regulatory functions.
genes. As noted above, HAT genes and several epigenetic regulators encode differentially expressed 227 mRNAs; but as can be seen in the left-hand column (mRNA z-score), mRNA level changes are 228 heterogeneous, as some genes in this cluster are upregulated in GH males (red) and others in SH males (blue). 229 This is interesting considering that the k-means analysis grouped genes in this cluster not primarily by the 230 direction of their mRNA change with regard to housing condition, but by their epigenetic mark changes; this 231 cluster is enriched for readers and writers of epigenetic marks. 232 The seventh cluster is enriched for genes regulating neural function (some of which are members of 233 the MAPK or WNT signaling pathways), transcription factors, and glycolysis genes. In this cluster there is a 234 pair of marks that show strong, anti-correlated changes: Heterochromatin Protein 1 (HP1)-associated 235 H3K9me3 (higher in GH than SH) and the Polycomb repressive complex 2 (PRC2)-associated H3K27me3 236 mark (higher in SH than GH). 237 The two inhibitory marks also change in opposite directions in the final (highest expression) cluster 238 8, but in this cluster the directions of change are reversed. H3K9me3 in cluster 8 is higher in SH than GH 239 males and H3K27me3 is higher in GH than SH males. This cluster is enriched in neural function genes, 240 including those involved in male mating behavior, learning and memory, synaptic, neuropeptide and 241 serotonin signaling, as well as ion channels and transcription regulation genes. Genes of this cluster have on 242 average higher expression in SH flies than in GH (p<10 -15 , t=-15.06, df=1361). 243 In summary, there are clusters of genes whose epigenetic marks and mRNA levels respond to social 244 experience in similar ways within each cluster, but quite differently between clusters. This suggests that 245 different regulatory programs may be acting in each cluster. We use this putative division of genes into 246 epigenetically distinct clusters to try to determine what the regulatory program might be in the next section.

Social enrichment induces activity-regulated genes in dopaminergic neurons 248
We used the Centrimo tool (Bailey and Machanick, 2012) to search for transcription factors (TFs) 249 whose binding sites might be enriched (occur more often than chance) in promoter-proximal regions of genes 250 expressed in dopaminergic neurons. The 8 epigenetic clusters discussed in the previous section provided us 251 with groups of genes that had similar regulatory programs (as evidenced by their epigenetic and 252 transcriptional response to housing conditions). We used Centrimo to search for TFs whose binding sites 253 were enriched in genes of each cluster relative to a control group of the same number of genes randomly 254 chosen from other clusters. The promoter-proximal region (±500 bp from TSS) was scanned. We found a 255 group of 24 TF motifs that were enriched in one or more of the clusters' promoter regions. These correspond 256 to 14 TF genes, as in many cases multiple binding motifs are documented for one TF in the motif databases 257 used by Centrimo (Supplement Data 2). 258 We further filtered the TFs under investigation by two criteria: 1) the TF had to be in the expressed 259 gene set and 2) the TF had to show at least a 33.3% change in transcript levels in response to housing 260 conditions. Five TFs met our criteria: Hr38 (Hormone receptor-like in 38), Sr (Stripe), CrebA, Cbt (Cabut), 261 and Pho (Pleiohomeotic). Interestingly, the genes encoding four of these TFs (Hr38, sr, CrebA and cbt) are 262 orthologs of vertebrate immediate early genes (Hu et al., 2011). The expression of these genes was higher in 263 GH males than in SH males. We hypothesized, consistent with another study (Ganguly-Fitzgerald et al., 264 2006), that being in the GH environment constitutes an enrichment of stimuli for male flies. In a recent study 265 (Chen et al., 2016) the Rosbash group thermogenetically stimulated dopaminergic neurons by expressing 266 dTRPA1 using the TH-GAL4 driver and measured changes in mRNA levels after 60 minutes. Genes with 267 large transcriptional upregulation due to neuronal stimulation were called "Activity Related Genes" (ARGs). between GH and SH males in our dataset; we found a significant positive correlation (r=0.41, p=0.003).
Interestingly, changes in the levels of some histone marks observed between GH and SH males also correlated 272 significantly with Chen et al.'s changes in expression of ARGs upon neuronal stimulation: H3K9me3 (r=-273 0.35, p=0.01), H3K27me3 (r=0.46, p=0.0009) and H3K4me3 (r=0.32, p=0.026). This result suggests that 274 genes in dopaminergic neurons responding to short-term direct neural stimulation also respond epigenetically 275 and transcriptionally to the long-term presumed behavioral stimulation of dopaminergic neuron due to 276 interaction among GH flies over the course of four days. 277 Four transcription factors were among the genes that showed the largest upregulation in response to 278 direct neuronal stimulation: Hr38, sr, CrebA, and Cbt (Chen et al., 2016). All four of these TFs were also 279 upregulated in our RNA-seq data in GH as compared to SH males ( Figure 4A), suggesting that they might 280 regulate transcriptional responses of other genes in response to group housing. Interestingly, a recent study 281 of gene expression in the Drosophila midbrain found that transcription of these ARGs was correlated across 282 many types of neurons (Croset et al., 2018) under normal conditions -that is, there appears to be a common 283 regulatory program across neural cell types for these genes. The epigenetic effects of social housing on marks 284 were more highly correlated (by 2.4 times) among these ARGs than among all genes (t=2.336, df=20, 285 p=0.03). Of the 10 genes found by Croset et al.,9 were also present in our top 40% expressed genes in 286 dopaminergic neurons (Figure 4, B). These 9 ARGs had GH/SH fold changes ranging from 1.44 to 2.11 287 (mean 1.70; p=0.004, binomial test; Supplement Table 3). Similarly, genes with log fold change above 1.5 288 in Figure  In summary, several ARGs expressed in dopaminergic neurons respond similarly to 4 days of social 293 housing and 60 minutes of thermogenetic stimulation. We report below the effects of these ARG transcription factors on downstream targets using both bioinformatic analysis and by manipulating levels of these ARG 295 TFs in dopaminergic neurons and measuring the effect on sleep. 296

ARGs predict transcriptional changes due to social experience 297
It has been suggested that ARGs may in some cases act as transcriptional repressors to fine tune 298 responses to neuronal stimulation (Croset et al., 2018). To test if the factor encoded by the ARG cbt is acting 299 as a transcriptional repressor, we compared a published dataset for cbt (Bartok et al., 2015) with our data. In 300 the Bartok et al. study, genome-wide transcriptional responses were measured upon overexpression and 301 knockdown of cbt in adult male fly heads. We used mRNA expression from this study to define two sets of 302 genes: 'repressed by Cbt' and 'activated by Cbt'. The 'repressed by Cbt' set contains genes whose expression 303 is increased upon cbt knockdown and decreased upon cbt overexpression (Supplement Table 4). Conversely, 304 the 'activated by Cbt' set contains genes whose expression is decreased upon cbt knockdown and increased 305 upon cbt overexpression (Supplement Table 5). cbt is up-regulated by 94% in dopaminergic neurons of GH 306 males compared to SH males. Hence, if Cbt indeed acts as a transcriptional repressor in dopaminergic 307 neurons, we should see downregulation of genes repressed by Cbt and/or upregulation of genes activated by 308 Cbt in GH males. To test this, we compared gene expression between the two datasets using the top 40% of 309 expressed genes in dopaminergic neurons. Consistent with our hypothesis, we found reduced expression of 310 genes repressed by Cbt in GH males compared to SH males (two-sided Student t: t=-3.31, df=1143, 311 p=0.0001). Genes activated by Cbt were upregulated in GH males, although this effect was not statistically 312 significant (t=1.30, df=269, p=0.196). Thus, Cbt appears to act primarily as a transcriptional repressor in 313 dopaminergic neurons in response to social stimulation: it is higher in GH than in SH neurons, and genes 314 repressed by it in heads (Bartok et al., 2015) are lower in dopaminergic neurons in GH compared to SH genes. For each mark, the difference between the two gene sets was significant at p values ranging from 10 -318 12 to 10 -32 . The activating marks H3K4me3, H3K36me3, and H3K9-14ac, and the repressive mark 319 H3K27me3 were higher in GH males in 'repressed by Cbt' genes than in 'activated by Cbt' genes. By 320 contrast, the marks H3K27ac and H3K9me3 were higher in SH males in the 'repressed by Cbt' genes than 321 in the 'activated by Cbt' genes ( Figure 4C). Interestingly, genes in the 'repressed by Cbt' group were over-322 represented in our eighth k-means cluster ( Figure 3) containing genes involved in neuronal function (odds 323 ratio 1.7:1, chi-squared 95.9, df=1, p=1.8*10 -22 ). We present a hypothesis for this unusual pattern of mark 324 changes in the Discussion. We did the above multilinear regressions for several functional sets of genes that were (a) enriched 336 in the three clusters containing genes expressed at medium or high levels (clusters 6, 7 and 8, Figure 3), (b) 337 involved in epigenetic regulation or neural function, and (c) relevant to male fly behavior. Since GO analysis 338 is ineffective in functionally classifying small sets of genes, we manually categorized these genes in each MAPK signaling, and certain epigenetic genes ( Table 2). The writers and erasers of marks were grouped by 342 whether their marks tend to activate or repress gene transcription. 343 Interestingly, significant amounts of mRNA change between GH and SH flies were explained by TF 344 binding sites in these functional groups, as shown in Table 2. The table shows the r and p-values for the  345 regressions, and which TF motifs were significantly different. Hr38, cbt, CrebA, and sr putative binding sites 346 each show a significant connection to mRNA change in one or more of the 6 functional gene groups. Of note, 347 in eight out of nine functional groups where ARG-TFs motifs were significantly different, the direction of 348 the effect of Hr38, cbt, and CrebA binding sites was negative -that is, the more potential binding sites the 349 TF had in a gene, the lower the difference in mRNA between GH and SH flies was. This is consistent with 350 the putative role of ARGs as transcriptional repressors for some genes. By contrast, pho, known primarily as 351 a transcriptional repressor (Brown et al., 1998), showed a consistent positive effect on fold change between 352 GH and SH flies. 353 In summary, changes in the numbers of a few putative transcription factor binding sites were 354 sufficient to predict mRNA changes due to housing in ten functionally relevant gene groups with r values 355 ranging from 0.25 to 0.86 (Table 2). The influence of ARGs on differential transcription in GH and SH males 356 in biologically relevant gene groups led us to hypothesize that ARGs might also affect phenotypes known to 357 vary with housing conditions. 358 359  Table 6). 551 552 Clustering: Gene Ontology (GO) analysis was done using two web tools: DAVID (Huang et al., 2009) and 553

Regulation of social isolation-induced behavior by ARGs
GOrilla (Eden et al., 2009). For mRNA differential expression analysis, genes in the top 40% of expression 554 level were used as the background lists for both tools, and genes with FCROS significant differential 555 expression (FDR=0.2) were analyzed. For analysis of clusters (see below) genes belonging to each cluster 556 were compared to the appropriate background list (top 40% genes for 8-clusters). K-means clustering 557 (Hartigan and Wong, 1979) was done using the kmeans package in R. To understand the impact of social 558 isolation on epigenetics of genes expressed in dopaminergic neurons, a dataset of the top 40% of genes by 559 mRNA TPM expression (5,372 genes) was constructed containing normalized differences between group-560 housed and isolated flies for mRNA and for the 6 epigenetic marks. Tests using an information criterion 561 approach (BIC) were used to determine the optimal numbers of clusters, which was k=8 for the 5,372-gene 562 dataset. K-means clustering is a stochastic process that may yield very different results each time it is run if 563 there is no strong pattern in the data. To determine robustness of the gene assignments to clusters, we re-ran 564 clustering with random seed changes to create N cluster assignments. We then compared each cluster 565 assignment to every other (1035 = N*(N-1)/2 comparisons for 8-cluster assignments). In each comparison, 566 we calculated the percent overlap of a cluster in assignment i with clusters in assignment j, and reported the 567 maximum percent overlap for that cluster. We therefore generated 8,280=8*1,035 comparisons. Figure 3-568 Figure Supplement 1 shows a histogram of cluster overlap percentages. For eight clusters, the median percent 569 overlap of a cluster in one assignment to its best match in a second assignment was 94%, and was >99%, 570 72% of the time. Thus, we concluded that cluster identity is fairly stable, in spite of the randomness inherent 571 in the k-means algorithm.
Cluster functional enrichment was determined using the DAVID 6.8 functional annotation tool 573 genes") were tested for motif enrichment using Centrimo compared to a control set of sequences from an 584 equal number of randomly selected genes not in the cluster ("control genes"). We report a motif as "enriched" 585 if the Centrimo's adjusted p-value was < 1 x 10 -10 . 586 To quantify the number of potential binding sites of each enriched motif in each gene, we used FIMO version 587 4 (Grant et al., 2011) with default parameters. Log fold changes in mRNA levels between group housed and 588 single housed treatments were the dependent variable in multilinear regressions in which numbers of each 589 enriched TF motifs were used as dependent variables. The "lm" program from R was used; non-significant 590 dependent variables were removed in a step-wise manner using "stepAIC" (least significant first) until only 591 significant variables remained; the results of these regressions are reported with multilinear r (square root of 592