Table 1 Gene annotations in Gencode, Ensembl, RefSeq, and CHESS

From: Open questions: How many genes do we have?

 

|  | Gencode^a | Ensembl^b | RefSeq^c | CHESS^d |
| --- | --- | --- | --- | --- |
| Protein-coding genes | 19,901 | 20,376 | 20,345 | 21,306 |
| lncRNA genes | 15,779 | 14,720 | 17,712 | 18,484 |
| Antisense RNA | 5501 |  | 28 | 2694 |
| Miscellaneous RNA | 2213 | 2222 | 13,899 | 4347 |
| Pseudogenes | 14,723 | 1740 | 15,952 |  |
| Total transcripts | 203,835 | 203,903 | 154,484 | 323,827 |

  1. Note that despite the many differences shown for Gencode and Ensembl, Gencode is created by merging the Havana manual annotation and the Ensembl automated annotation, and the releases coincide (https://www.gencodegenes.org/faq.html)
  2. ^a Gencode statistics for version 28 from www.gencodegenes.org/stats/current.html as of July 12, 2018
  3. ^b Ensembl statistics for version 92.38, which corresponds to Gencode v28, from ensembl.org/Homo_sapiens/Info/Annotation as of July 12, 2018
  4. ^c RefSeq statistics for release 108 from www.ncbi.nlm.nih.gov/genome/annotation_euk/Homo_sapiens/108/ as of July 12, 2018
  5. ^d CHESS statistics for version 2.0 from ccb.jhu.edu/chess as of July 12, 2018. CHESS does not currently include pseudogenes
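
The differences between the four annotations in Table 1 can be checked with a few lines of arithmetic. The following is a minimal sketch rather than part of the original article: the counts are transcribed from the table, blank cells are assumed to mean "not reported" and are stored as `None` rather than zero, and the names `TABLE_1`, `spread`, and `difference` are illustrative only.

```python
# Counts transcribed from Table 1. None marks a cell that is blank in the
# table (assumption: blank means "not reported", not zero).
TABLE_1 = {
    "Protein-coding genes": {"Gencode": 19901, "Ensembl": 20376, "RefSeq": 20345, "CHESS": 21306},
    "lncRNA genes": {"Gencode": 15779, "Ensembl": 14720, "RefSeq": 17712, "CHESS": 18484},
    "Antisense RNA": {"Gencode": 5501, "Ensembl": None, "RefSeq": 28, "CHESS": 2694},
    "Miscellaneous RNA": {"Gencode": 2213, "Ensembl": 2222, "RefSeq": 13899, "CHESS": 4347},
    "Pseudogenes": {"Gencode": 14723, "Ensembl": 1740, "RefSeq": 15952, "CHESS": None},
    "Total transcripts": {"Gencode": 203835, "Ensembl": 203903, "RefSeq": 154484, "CHESS": 323827},
}


def spread(category):
    """Return (min, max, max - min) over the databases that report a count."""
    counts = [v for v in TABLE_1[category].values() if v is not None]
    return min(counts), max(counts), max(counts) - min(counts)


def difference(category, first, second):
    """Return first - second for one category, or None if either cell is blank."""
    a = TABLE_1[category][first]
    b = TABLE_1[category][second]
    if a is None or b is None:
        return None
    return a - b


if __name__ == "__main__":
    for category in TABLE_1:
        low, high, gap = spread(category)
        print(f"{category}: min={low}, max={high}, spread={gap}")
    print("Gencode - Ensembl pseudogenes:",
          difference("Pseudogenes", "Gencode", "Ensembl"))
```

Keeping blank cells as `None` avoids treating a missing category (for example, pseudogenes in CHESS) as a count of zero when comparing databases.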