RNAcentral text search supports advanced queries using the following syntax:
Use double quotes ("") to search for exact matches.
Example: "hsa-mir-21" will find only hsa-mir-21 and not hsa-mir-212.
A wildcard character (*) can match any number of characters. Wildcards are added automatically to all search terms that are not enclosed in double quotes.
Example: a search for HOTAIR (no double quotes) will find both the HOTAIR and HOTAIRM1 genes, whereas a search for "HOTAIR" (with double quotes) will find only HOTAIR.
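The difference between quoted and unquoted terms can be illustrated with a small local simulation. This is only a sketch and not the search engine's implementation: the helper `local_match` is hypothetical, it assumes the automatic wildcard is appended to the end of the term, and it compares case-sensitively for simplicity.

```python
from fnmatch import fnmatchcase


def local_match(term: str, candidate: str) -> bool:
    """Hypothetical local stand-in for the matching rule described above.

    Quoted terms must match exactly; unquoted terms are treated as if a
    trailing '*' wildcard had been added (an assumption of this sketch).
    """
    if len(term) >= 2 and term.startswith('"') and term.endswith('"'):
        return candidate == term[1:-1]
    return fnmatchcase(candidate, term + "*")


genes = ["HOTAIR", "HOTAIRM1"]
unquoted_hits = [g for g in genes if local_match("HOTAIR", g)]
quoted_hits = [g for g in genes if local_match('"HOTAIR"', g)]
print(unquoted_hits)  # both genes match the unquoted term
print(quoted_hits)    # only the exact name matches the quoted term
```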
Search can be restricted to specific fields using the field_name:"field value" syntax. Please note that "field value" must be enclosed in double quotes.
Field | Examples |
---|---|
expert database | expert_db:"tmrna website" , expert_db:"mirbase" (to see the available values, search for RNA and look at the "Expert databases" facet) |
NCBI taxonomic identifier | taxonomy:"9606" where 9606 is the NCBI taxonomy id for Homo sapiens |
taxonomy string | tax_string:"primates" (searches within a taxonomic group) |
scientific species name | species:"Mus musculus" |
common species name | common_name:"mouse" |
RNA type | rna_type:"pirna" or so_rna_type_name:"pirna" (the latter is classified using Sequence Ontology) |
gene | gene:"hotair" |
organelle | organelle:"mitochondrion" , organelle:"plastid" |
description | description:"16S" |
length | length:"1500" , length:[9000 to 10000] (supports range queries) |
publication title | pub_title:"Danish population" |
author | author:"Girard A." |
PubMed id | pubmed:"17881443" |
Digital Object Identifier | doi:"10.1093/nar/19.22.6328" |
MD5 | md5:"020711a90d35bb197e29e085595dd52e" (MD5 hash of the uppercase DNA sequence corresponding to the RNAcentral sequence) |
interacting proteins | interacting_protein:"ENSG00000277791" |
interacting rna | interacting_rna:"ENSG00000235652" |
evidence for interaction | evidence_for_interaction:"ago-ip" |
secondary structure | has_secondary_structure:"True" |
conserved structure | has_conserved_structure:"True" |
GO annotation | has_go_annotations:"True" |
mapped vs aligned | has_genomic_coordinates:"True" |
any interacting protein | has_interacting_proteins:"True" |
any interacting rna | has_interacting_rnas:"True" |
publications | has_lit_scan:"True" |
AI generated summary | has_litsumm:"True" |
RNA editing events | has_editing_event:"True" |
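When queries are assembled programmatically, the field_name:"field value" syntax from the table can be produced by a small helper. The following is a minimal sketch. The function names `field_query` and `length_range` are hypothetical and not part of any RNAcentral library, and rejecting values that contain a double quote is a simplification of this sketch, since the escaping rules are not described here.

```python
def field_query(field: str, value: str) -> str:
    """Return a field-restricted term in the form field:"value"."""
    if '"' in value:
        # Escaping rules are not covered by this sketch, so refuse such values.
        raise ValueError("value must not contain double quotes")
    return f'{field}:"{value}"'


def length_range(low: int, high: int) -> str:
    """Return a length range term, following the table's length example."""
    if low > high:
        raise ValueError("low must not exceed high")
    return f"length:[{low} to {high}]"


print(field_query("species", "Mus musculus"))
print(field_query("taxonomy", "9606"))
print(length_range(9000, 10000))
```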
and (default)
Multiple search terms separated by white spaces are combined using AND, so a query like Homo sapiens is treated as Homo AND sapiens, and only entries having both terms will be found.
or (to indicate equivalence)
Example: rna_type:"pirna" or rna_type:"mirna"
not (to indicate exclusion)
Example: expert_db:"lncrnadb" not expert_db:"rfam"
Use parentheses to group and nest logical terms.
Example: (expert_db:"mirbase" OR expert_db:"lncrnadb") NOT expert_db:"rfam"
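The grouping rule can also be applied when building queries in code. This is a minimal sketch with hypothetical helper names (`any_of`, `exclude`). It only concatenates strings and reproduces the grouped example above.

```python
def any_of(*terms: str) -> str:
    """Join terms with OR and wrap them in parentheses so they act as one group."""
    if not terms:
        raise ValueError("at least one term is required")
    return "(" + " OR ".join(terms) + ")"


def exclude(query: str, term: str) -> str:
    """Append a NOT clause to an existing query."""
    return f"{query} NOT {term}"


grouped = exclude(
    any_of('expert_db:"mirbase"', 'expert_db:"lncrnadb"'),
    'expert_db:"rfam"',
)
print(grouped)
```

Wrapping the OR group in parentheses before appending NOT keeps the exclusion applied to the whole group rather than to its last term only.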
Make sure your spelling is correct.
Example: misspelled terms like Esherichia (missing "c") won't find any results.
Use full species names.
Example: use Escherichia coli and not E. coli as your search terms.
The RNAcentral text search now supports exporting any number of search results. For exporting large amounts of data, see also the public Postgres database and the RNAcentral FTP Archive.
Our latest article describes different ways of accessing the data.
RNAcentral is powered by the EBI search, which provides a publicly available REST interface for querying the data.
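As a rough illustration of the REST interface, the sketch below builds a request URL for a text search query without sending it. The base URL, the `rnacentral` domain name, and the `query`, `format`, `size`, and `fields` parameters are assumptions based on the general EBI Search REST conventions and should be checked against the EBI Search documentation. The helper `build_search_url` is hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumed endpoint for the RNAcentral domain of EBI Search; verify before use.
EBI_SEARCH_URL = "https://www.ebi.ac.uk/ebisearch/ws/rest/rnacentral"


def build_search_url(query: str, size: int = 10, fields=("description",)) -> str:
    """Build (but do not send) a search URL; urlencode percent-encodes the query."""
    params = {
        "query": query,
        "format": "json",            # assumed parameter: response format
        "size": size,                # assumed parameter: number of results
        "fields": ",".join(fields),  # assumed parameter: fields to return
    }
    return f"{EBI_SEARCH_URL}?{urlencode(params)}"


url = build_search_url('rna_type:"pirna" AND taxonomy:"9606"', size=5)
print(url)

# Decoding the query string recovers the original search text.
decoded = parse_qs(urlsplit(url).query)
print(decoded["query"][0])
```

The resulting URL can be fetched with any HTTP client, for example `urllib.request.urlopen(url)`, once the endpoint and parameters have been confirmed.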