Novel diversity identified by sequence similarity networks. Novel diversity in the dataset was defined as every Louvain community (LC) at the 85% similarity threshold that consisted exclusively of BioMarKs sequences and whose sequences were, on average, less than 95% identical to any reference sequence in the cultured ciliate database. The first eight columns display the composition of the respective LC with regard to sampling sites and habitats. The colors of the circles indicate the habitat in which the sequences were detected: blue represents subsurface, green represents DCM, and orange represents sediment samples. Multicolored circles denote sequences found in more than one habitat at the same sampling site. Taxonomy is given to the species level of the closest cultured reference where possible. Whenever more than one closest reference was assigned to an LC, the last common taxonomic level (at least class level) is given. All V4 sequences incorporated into the listed LCs are publicly available at Figshare; separate FASTA files have been deposited for DNA and cDNA data. DCM, deep chlorophyll maximum; LC, Louvain community.
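
The selection criterion stated in this caption can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The function name `is_novel_lc`, the source label `"BioMarKs"` as a string tag, and the input layout (one pair per sequence, holding its origin and its percent identity to the closest cultured reference) are hypothetical. The sketch also assumes that "on average" refers to the arithmetic mean of each sequence's best-hit identity.

```python
from statistics import mean


def is_novel_lc(members, identity_cutoff=95.0):
    """Return True if a Louvain community (LC) counts as novel diversity.

    members: list of (source, best_identity) pairs, one per sequence, where
    source is a hypothetical origin tag and best_identity is the percent
    identity to the closest cultured reference (assumed input layout).
    """
    if not members:
        # An empty community cannot satisfy either condition.
        return False
    # Condition 1: the LC consists exclusively of BioMarKs sequences.
    only_biomarks = all(source == "BioMarKs" for source, _ in members)
    # Condition 2: mean best-hit identity is below the cutoff (assumed reading
    # of "on average less than 95% identical").
    mean_identity = mean(identity for _, identity in members)
    return only_biomarks and mean_identity < identity_cutoff
```

Under these assumptions, an LC containing even a single reference sequence is excluded regardless of its mean identity, which matches the "exclusively" wording of the definition.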