Table 1 Assembly and annotation metrics of the Amoebophrya A25, A120, and AT5 genomes, of the Symbiodiniaceae Breviolum minutum (Bmin), Fugacium kawagutii (Fkav), and Symbiodinium microadriaticum (Smic), and of Perkinsus marinus (Pmar)

From: Rapid protein evolution, organellar reductions, and invasive intronic elements in the marine aerobic parasite dinoflagellate Amoebophrya spp

| Metric | A25 | A120 | AT5 | Fkav | Bmin | Smic | Pmar |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Assembly | | | | | | | |
| Number of scaffolds | 557 | 50 | 2351 | 30,040 | 21,899 | 9695 | 17,897 |
| Cumulative size (Mb) | 116 | 115.5 | 87.7 | 935 | 609 | 808 | 87 |
| Scaffold N50 / L50 | 1.082 Mb / 35 | 9.243 Mb / 5 | 83.9 kb / 298 | 381 kb / 772 | 125 kb / 1448 | 574 kb / 420 | 158 kb / 124 |
| Scaffold N90 / L90 | 423 kb / 106 | 1.464 Mb / 18 | 19.6 kb / 1095 | 109 kb / 2477 | 31 kb / 5103 | 146 kb / 1442 | 1.2 kb / 9284 |
| Scaffold max. size | 3.013 Mb | 16.512 Mb | 537 kb | 1.914 Mb | 811 kb | 3.145 Mb | 1.8 Mb |
| %N | 2.27 | 1.41 | 2.25 | 3.4 | 0.9 | 7.7 | 0.64 |
| %GC | 47.8 | 51.2 | 55.92 | 45.5 | 43.5 | 50.5 | 47.4 |
| Genes | | | | | | | |
| Number | 28,091 | 26,441 | 19,925 | 31,520 | 32,803 | 29,728 | 23,654 |
| Density (genes/Mb) | 247.78 | 232.18 | 227.2 | 39.4 | 68.78 | 60.8 | 273.1 |
| Average length (bp) | 2965 | 3482 | 2782 | 8836 | 10,069 | 9281 | 1581 |
| Median length (bp) | 1890 | 2442 | 1803 | 2039 | 7899 | 7255 | 1038 |
| Exons | | | | | | | |
| Number | 117,411 | 121,327 | 67,639 | 150,118 | 985,369 | 1,072,528 | 133,410 |
| Average length (bp) | 475 | 541 | 578 | 256 | 99 | 109 | 177 |
| Median length (bp) | 235 | 265 | 319 | 81 | 53 | 51 | 112 |
| Longest (bp) | 79,744 | 44,016 | 14,772 | 11,064 | 14,818 | 13,755 | 16,293 |
| Average number of exons / gene | 4.18 | 4.59 | 3.39 | 4.07 | 20.96 | 21.8 | 5.64 |
| % GC | 51.9% | 56.3% | 54.7% | 52.7% | 50.8% | 56.9% | 50.95% |
| Introns | | | | | | | |
| Number | 81,610 | 90,882 | 47,714 | 113,268 | 938,355 | 1,023,342 | 109,756 |
| % of spliced genes | 69.8% | 66.9% | 71.3% | 64.1% | 95.4% | 98.6% | 72.4% |
| Average length (bp) | 345 | 335 | 337 | 893 | 517 | 505 | 124 |
| Median length (bp) | 208 | 247 | 228 | 501 | 297 | 231 | 49 |
| Longest (bp) | 90,415 | 35,152 | 3556 | 9977 | 88,176 | 177,825 | 11,034 |
| % GC | 44% | 46.5% | 49.4% | 44.5% | 41.8% | 47.1% | 43.4% |
| % of introns with GT-AG splice sites | 34.02% | 30.41% | 99.98% | 65.38% | 48.23% | 26% | 99.3% |
| % of introns with GC/GA-AG splice sites | 0.45% | 2.95% | 0.02% | 25.30% | 51.77% | 73.95% | 0.7% |
| % of introns with other splice sites | 65.53% | 66.64% | 0% | 9.32% | 0% | 0.05% | 0% |
| CDS | | | | | | | |
| Average coding size (bp) | 1337 | 1773 | 1962 | 1041 | 1916 | 2375 | 4839 |
| Genome coverage of coding bases | 32.4% | 40.6% | 44.6% | 4.1% | 13.1% | 14.4% | 26.4% |
| Gene families | | | | | | | |
| Number of genes belonging to families (% in brackets) | 7074 (25.2) | 7428 (28.1) | ND | 20,374 (55.3) | 25,809 (61.5) | 32,796 (66.8) | 18,258 (77.2) |
| Average number of genes in a family | 3.5 | 3.6 | ND | 6.7 | 5.9 | 7 | ND |
| Maximum number of genes in a family | 171 | 157 | ND | 889 | 703 | 831 | ND |
| Annotation | | | | | | | |
| Number of proteins with at least one significant match | 8360 | 8690 | 4366 | 29,720 | 13,813 | 5538 | ND |
| Number of proteins with KO assignment | 5774 (21%) | 5983 (23%) | 2018 | 14,926 (40%) | 10,954 (65%) | 3008 (54%) | ND |
| Number of proteins with BRITE assignment | 5774 | 5856 | | 14,764 | 10,755 | 2960 | ND |
| Number of proteins with an IPR domain | 8444 | 9054 | 7404 | 16,895 | 13,541 | 4059 | ND |
| Number of proteins with UniProt matches (% in brackets) | 9101 (32.4) | 9404 (35.6) | ND | ND | ND | ND | ND |
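
The scaffold N50 / L50 and N90 / L90 rows of the table are contiguity statistics. As a minimal sketch, assuming the conventional definition (Nx is the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least x% of the assembly, and Lx is the number of scaffolds in that set), they could be computed as below. The function name `nx_lx` and the toy lengths are hypothetical and do not come from the article, which may have used a different tool or convention.

```python
def nx_lx(lengths, fraction=0.5):
    """Return (Nx, Lx) for a list of scaffold lengths.

    Assumes the conventional definition: sort scaffolds from longest to
    shortest and accumulate lengths until `fraction` of the total
    assembly size is reached. Nx is the length of the last scaffold
    added; Lx is how many scaffolds were needed.
    """
    if not lengths:
        raise ValueError("at least one scaffold length is required")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")

    ordered = sorted(lengths, reverse=True)
    threshold = fraction * sum(ordered)

    running_total = 0
    for count, length in enumerate(ordered, start=1):
        running_total += length
        if running_total >= threshold:
            return length, count

    # Only reachable through floating-point rounding when fraction is 1.0.
    return ordered[-1], len(ordered)


# Toy scaffold lengths (hypothetical, not taken from the table).
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
n50, l50 = nx_lx(toy_lengths, 0.5)
n90, l90 = nx_lx(toy_lengths, 0.9)
print(f"N50 = {n50}, L50 = {l50}")
print(f"N90 = {n90}, L90 = {l90}")
```

Under this definition a larger N50 with a smaller L50 indicates a more contiguous assembly, which is how the A120 values (9.243 Mb / 5) compare with the AT5 values (83.9 kb / 298) in the table.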