Table 1 Gene annotations in Gencode, Ensembl, RefSeq, and CHESS

From: Open questions: How many genes do we have?

| | Gencodeᵃ | Ensemblᵇ | RefSeqᶜ | CHESSᵈ |
| --- | --- | --- | --- | --- |
| Protein-coding genes | 19,901 | 20,376 | 20,345 | 21,306 |
| lncRNA genes | 15,779 | 14,720 | 17,712 | 18,484 |
| Antisense RNA | 5501 | | 28 | 2694 |
| Miscellaneous RNA | 2213 | 2222 | 13,899 | 4347 |
| Pseudogenes | 14,723 | 1740 | 15,952 | |
| Total transcripts | 203,835 | 203,903 | 154,484 | 323,827 |
  1. Note that despite the many differences shown for Gencode and Ensembl, Gencode is created by merging the Havana manual annotation and the Ensembl automated annotation, and the releases coincide (https://www.gencodegenes.org/faq.html)
  2. ᵃ Gencode statistics for version 28 from www.gencodegenes.org/stats/current.html as of July 12, 2018
  3. ᵇ Ensembl statistics for version 92.38, which corresponds to Gencode v28, from ensembl.org/Homo_sapiens/Info/Annotation as of July 12, 2018
  4. ᶜ RefSeq statistics for release 108 from www.ncbi.nlm.nih.gov/genome/annotation_euk/Homo_sapiens/108/ as of July 12, 2018
  5. ᵈ CHESS statistics for version 2.0 from ccb.jhu.edu/chess as of July 12, 2018. CHESS does not currently include pseudogenes