RNAcentral is a new resource that organises data for non-protein-coding RNA genes.

RNAcentral will offer integrated access to a comprehensive and up-to-date set of RNA sequences. These sequences will be provided by a collaborating group of expert databases and supplemented by sequences from the archives of the International Nucleotide Sequence Database Collaboration (INSDC). The vision for RNAcentral is presented in the paper by Bateman et al., 2011.

The RNAcentral portal (available at this URL by mid-2014) will provide access to the data via sequence and metadata search, link together the expert resources, and provide a defined identifier space for individual molecules. In a second phase of development, these reference data will be used to annotate genome sequences from across the taxonomic space.

List of RNAcentral Expert Databases

ENA: comprehensive record of the world's nucleotide sequencing information
Comparative RNA Web (CRW) Site: comparative sequence and structure information for ribosomal, intron, and other RNAs
Genomic tRNA Database: tRNA predictions in genomes
HGNC: HUGO Gene Nomenclature Committee
lncRNAdb: annotations of eukaryotic long non-coding RNAs
miRBase: microRNA sequences and annotation
MODOMICS: RNA modification data
NONCODE: integrative annotation of long noncoding RNAs
NPInter: experimentally determined functional interactions between ncRNAs and proteins, mRNAs or genomic DNA
piRNABank: comprehensive resource on Piwi-interacting RNAs
RefSeq: comprehensive, integrated, non-redundant, well-annotated set of reference sequences
Rfam: collection of RNA families, each represented by multiple sequence alignments, consensus secondary structures and covariance models
Ribosomal Database Project: ribosome-related data and services
RNApathwaysDB: RNA maturation and decay pathways
SILVA: quality checked and aligned ribosomal RNA sequences
snoRNA Database: predicted snoRNA genes
SRPDB: aligned, annotated and phylogenetically ordered sequences related to structure and function of SRP
tmRNA Website: tmRNA sequence data
VEGA: high quality manual annotation of vertebrate finished genome sequence


21,318 sequences from the tmRNA Website have been imported into RNAcentral.

November 27, 2013

RNAcentral is pleased to announce the import of 21,388 human lncRNAs from the VEGA (Vertebrate Genome Annotation) expert database.

The set of VEGA lncRNA records can be viewed at http://www.ebi.ac.uk/ena/data/view/HG491497-HG512884, and the entire RNAcentral set is available at any time from http://www.ebi.ac.uk/ena/data/search?query=RNAcentral.

October 24, 2013

RNAcentral is pleased to announce the import of 3,661 transcriptome assembly contigs representing mature miRNA sequences from miRBase. The accessions are as follows:

These can be found by running a text search for 'miRBase' from the ENA search page.

August 22, 2013

On 8th July RNAcentral announced the first import of new data from SRPDB. A total of 855 SRPDB entries have been imported as Third Party Annotations (TPAs) using the new "RNAcentral" and "TPA:specialist_db" keywords. The new accessions are HG322958 to HG323812.

July 8, 2013
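
The accession ranges quoted in the news items above can be checked against the announced entry counts with a little arithmetic. The following is a minimal sketch, assuming that accessions within a range share a letter prefix and are numbered consecutively; the function name `accession_range_size` is hypothetical and not part of any RNAcentral or ENA tool.

```python
import re


def accession_range_size(first: str, last: str) -> int:
    """Count the accessions in an inclusive range such as HG322958 to HG323812.

    Assumes both accessions share a letter prefix and that the numeric
    parts are consecutive within the range.
    """
    pattern = re.compile(r"([A-Z]+)(\d+)")
    start, end = pattern.fullmatch(first), pattern.fullmatch(last)
    if not start or not end or start.group(1) != end.group(1):
        raise ValueError("accessions must share the same letter prefix")
    return int(end.group(2)) - int(start.group(2)) + 1


print(accession_range_size("HG322958", "HG323812"))  # 855 (SRPDB import)
print(accession_range_size("HG491497", "HG512884"))  # 21388 (VEGA import)
```

Both results agree with the counts given in the SRPDB and VEGA announcements.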

RNAcentral Non-coding Product

Current version: r118

The Non-coding product is the main feed source of sequence data for the RNAcentral database.


The Non-coding product contains the nucleotide sequences of non-coding features annotated in the EMBL nucleotide sequence database, in the same way that the CDS product contains the sequences of coding features. Features with the following feature names are included in this product:

  1. rRNA
  2. tRNA
  3. tmRNA
  4. precursor_RNA
  5. ncRNA
  6. misc_RNA

Data are distributed in a flat-file format similar to that of the EMBL database, but with each entry representing a single feature and containing only that feature's sequence.
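
A file with one feature per entry can be read with a small line-based parser. The sketch below assumes that the standard EMBL conventions carry over to this product (a two-letter line code followed by the value, an `SQ` line that starts the sequence block, and `//` as the entry terminator); the exact layout of the Non-coding product is not specified here. The function name `parse_entries`, the sample record and its identifier `EXAMPLE0001` are hypothetical and for illustration only.

```python
from typing import Dict, Iterator, List


def parse_entries(text: str) -> Iterator[Dict[str, object]]:
    """Yield one dict per entry from EMBL-style flat-file text."""
    fields: Dict[str, List[str]] = {}
    residues: List[str] = []
    in_sequence = False
    for line in text.splitlines():
        if line.startswith("//"):
            # "//" closes an entry (assumed, as in standard EMBL files).
            if fields or residues:
                keywords = " ".join(fields.get("KW", [])).rstrip(".")
                yield {
                    "id": " ".join(fields.get("ID", [])),
                    "description": " ".join(fields.get("DE", [])),
                    "keywords": [k.strip() for k in keywords.split(";") if k.strip()],
                    "sequence": "".join(residues),
                }
            fields, residues, in_sequence = {}, [], False
        elif in_sequence:
            # Sequence lines hold residue blocks plus a trailing position count.
            residues.append("".join(ch for ch in line if ch.isalpha()))
        elif line.startswith("SQ"):
            in_sequence = True
        elif len(line) > 5:
            # Two-letter line code, then the value from column 6 onwards.
            fields.setdefault(line[:2], []).append(line[5:].strip())


# Hypothetical record, invented for this example only.
SAMPLE = "\n".join([
    "ID   EXAMPLE0001; SV 1; linear; transcribed RNA; STD; PRO; 12 BP.",
    "XX",
    "DE   Escherichia coli tmRNA",
    "XX",
    "KW   RNAcentral; TPA:specialist_db.",
    "XX",
    "SQ   Sequence 12 BP;",
    "     ggggctgatt ct                                                            12",
    "//",
])

for entry in parse_entries(SAMPLE):
    print(entry["description"], entry["keywords"], len(entry["sequence"]))
```

Because each entry carries a single feature, the parser does not need to interpret feature locations; the sequence block is already the feature's own sequence.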

Each entry's description comprises the organism name followed by the product or, if no product is given, the gene. All keywords from the parent entry that match a standard set are included, with the addition of the “RNAcentral” keyword for entries imported from expert databases. All citations from the parent entry that fall within the range of the feature are included. If the parent entry has citations without a range, these are also included, as are database cross-references to expert databases.
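
The description rule above (organism name, then the product, falling back to the gene) can be sketched as a small function. This is an illustration under stated assumptions: the single-space separator and the behaviour when neither product nor gene is present are guesses, and `build_description` is a hypothetical name rather than part of the actual pipeline.

```python
from typing import Optional


def build_description(organism: str,
                      product: Optional[str] = None,
                      gene: Optional[str] = None) -> str:
    """Compose a description: organism, then product or (failing that) gene."""
    # Prefer the product; use the gene only when no product is given.
    label = product or gene
    # Assumption: fall back to the organism name alone if both are missing.
    return f"{organism} {label}" if label else organism


print(build_description("Homo sapiens", product="5S ribosomal RNA"))
print(build_description("Homo sapiens", gene="RNA5S1"))
```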

The number of sequences of various classes in the latest release:

RNA class r118 release
rRNA 6,005,856
tRNA 1,276,670
tmRNA 22,229
misc_RNA 1,441,954
precursor_RNA 9,335
ncRNA 160,796
Total 8,916,840
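
As a quick consistency check, the per-class counts for release r118 can be summed and compared with the stated total. The snippet below simply copies the figures from the table; the variable names are illustrative.

```python
# Sequence counts per RNA class, copied from the r118 table.
counts = {
    "rRNA": 6_005_856,
    "tRNA": 1_276_670,
    "tmRNA": 22_229,
    "misc_RNA": 1_441_954,
    "precursor_RNA": 9_335,
    "ncRNA": 160_796,
}

total = sum(counts.values())
print(total)  # 8916840, matching the Total row

# Share of each class in the release, largest first.
for rna_class, n in sorted(counts.items(), key=lambda item: -item[1]):
    print(f"{rna_class:15s}{n / total:7.2%}")
```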

The development of RNAcentral is being coordinated by the European Bioinformatics Institute, and is funded by the United Kingdom Biotechnology and Biological Sciences Research Council grant BB/J019321/1.