RNAcentral is an open public resource that offers integrated access to a comprehensive and up-to-date set of ncRNA sequences.
RNAcentral assigns identifiers to distinct ncRNA sequences and automatically updates links between sequences and identifiers maintained by expert databases.
Each sequence in RNAcentral is assigned a Unique RNA Sequence identifier (URS). These identifiers are stable and are not expected to change.
The identifiers have the following format: URS + sequentially assigned hexadecimal number, and can be parsed using this regular expression: /URS[0-9A-F]{10}/.
Example identifiers: URS0000000001, URS00000478B7.
Species-specific identifiers also include an NCBI taxonomy identifier (taxid), for example: URS00000478B7_9606 or URS00000478B7/9606.
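The identifier formats described above can be checked with a short Python sketch. The URS part of the pattern comes from the regular expression given in the text; the optional taxid suffix (separated by "_" or "/") is an extension inferred from the two example formats, and the function name parse_urs is hypothetical.

```python
import re

# URS part taken from the regular expression in the text; the optional
# "_taxid" or "/taxid" suffix is an assumption based on the examples.
URS_PATTERN = re.compile(r"(URS[0-9A-F]{10})(?:[_/](\d+))?")


def parse_urs(identifier):
    """Return (urs, taxid) for a valid identifier, or None otherwise.

    taxid is an int, or None for identifiers without a species suffix.
    """
    match = URS_PATTERN.fullmatch(identifier)
    if match is None:
        return None
    urs, taxid = match.groups()
    return (urs, int(taxid) if taxid is not None else None)
```

For example, parse_urs("URS00000478B7_9606") separates the sequence identifier from the taxid, while a string that does not follow the format yields None.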
To find an RNAcentral identifier for a single sequence, one can use RNAcentral sequence search.
For a large number of sequences, one can:
use an example script that works with the RNAcentral API;
download a mapping file from the RNAcentral FTP site with correspondences between md5 values and RNAcentral ids;
download a mapping file with correspondences between external database identifiers and RNAcentral ids.
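As a rough illustration of the md5-based option, the Python sketch below computes an md5 value for a sequence and looks it up in an already downloaded mapping. Several details are assumptions rather than documented facts: that the md5 is computed on the uppercase sequence with U replaced by T, and that each line of the mapping holds an RNAcentral id and an md5 value separated by whitespace. Check the README that accompanies the mapping file before relying on either. The function names are hypothetical.

```python
import hashlib


def sequence_md5(sequence):
    """md5 hex digest of a sequence.

    Assumption: the sequence is normalized to uppercase DNA (U -> T)
    before hashing; verify this against the mapping file's README.
    """
    normalized = sequence.strip().upper().replace("U", "T")
    return hashlib.md5(normalized.encode("ascii")).hexdigest()


def load_md5_mapping(lines):
    """Build {md5: urs} from lines of the assumed form 'URS<whitespace>md5'."""
    mapping = {}
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        urs, md5 = fields
        mapping[md5] = urs
    return mapping


def lookup_ids(sequences, mapping):
    """Map each input sequence to its RNAcentral id, or None if not found."""
    return {seq: mapping.get(sequence_md5(seq)) for seq in sequences}
```

In practice, load_md5_mapping could be given an open file handle for the downloaded mapping, since iterating over a text file yields its lines.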
RNAcentral does not include:
sequences shorter than 10 nucleotides;
sequences with more than 10% of unknown characters (Ns).
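Assuming the two criteria above describe sequences that are left out of RNAcentral, they can be checked before submission with a small Python sketch. The function name and the parameter defaults are hypothetical; the thresholds (10 nucleotides, 10% Ns) come from the text.

```python
def fails_criteria(sequence, min_length=10, max_n_fraction=0.10):
    """Return True if a sequence is too short or contains too many Ns.

    Thresholds follow the text: shorter than 10 nucleotides, or more
    than 10% unknown characters (N).
    """
    seq = sequence.strip().upper()
    if not seq or len(seq) < min_length:
        return True
    # Strictly "more than" the allowed fraction, as stated in the text.
    return seq.count("N") / len(seq) > max_n_fraction
```

A sequence of exactly 10 nucleotides with a single N sits on the boundary (10% Ns) and passes, because the criterion is "more than 10%".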
Once an ncRNA sequence is submitted to one of the INSDC databases (ENA, GenBank, or DDBJ), it will automatically appear in a subsequent RNAcentral release.
If you run an ncRNA database and would like to join the RNAcentral Consortium, please get in touch.
The RNAcentral data is updated every 3 months, while the user interface and website functionality are updated continuously.
The content on this website is licensed under a Creative Commons Zero license, which means that you can use the data in any way and for any purpose.
Explore all RNAcentral training materials to find information about the project as well as exercises, tips, a quiz, and more.