Rfam 8.1 :: Help
Guide to using the Rfam database and web server
Rfam is the RNA version of the Pfam protein families database. Rfam is a collection of multiple sequence alignments and covariance models covering many common non-coding RNA families.
In conjunction with the Infernal software package, Rfam covariance models (CMs) can be used to search genomes or other DNA sequence databases for homologs to known structural RNA families. However, Infernal remains too slow for routine genome annotation use, unless you have a large cluster, or are willing to write some sequence-based search filters to put in front of an Infernal search (or both). Thus the Rfam web servers are not very effective at searching your sequence for you; Janelia does not implement search at all, and Sanger uses a stringent BLAST-based filter that sacrifices much of the power of CM-based RNA search methods in order to achieve reasonable speed. People who have used Rfam in conjunction with Infernal to annotate sequences for RNA homologies have generally downloaded the entire Rfam database and run Infernal locally.
At this time, one main use of Rfam is as a source of RNA multiple alignments with consensus secondary structure annotation in a consistent format. This has been useful for people systematically training or testing RNA secondary structure prediction software, for example.
The browse page allows you to find your RNA family of interest. Each entry in the browsing table leads to a family page, which shows annotation on family structure and function, links to our multiple sequence alignments, and links to the literature and other databases.
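For readers who want to use Rfam alignments as training or test data, the following is a minimal sketch of reading one alignment and its consensus secondary structure. It assumes the alignment is in Stockholm format with a "#=GC SS_cons" line, as in the Rfam flat files; the function names parse_stockholm and base_pairs are illustrative and are not part of Rfam or Infernal.

```python
# Closing bracket -> matching opening bracket for consensus structure strings.
CLOSE_TO_OPEN = {">": "<", ")": "(", "]": "[", "}": "{"}


def parse_stockholm(text):
    """Return (sequences, ss_cons) for a single Stockholm-format alignment.

    Interleaved blocks are concatenated per sequence name. Markup lines other
    than "#=GC SS_cons" are ignored in this sketch.
    """
    seqs = {}
    ss_chunks = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line == "//":
            continue
        if line.startswith("#=GC"):
            parts = line.split()
            if len(parts) == 3 and parts[1] == "SS_cons":
                ss_chunks.append(parts[2])
            continue
        if line.startswith("#"):
            continue  # header, #=GF, #=GS, #=GR lines
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # not a "name sequence" line
        name, chunk = parts
        seqs[name] = seqs.get(name, "") + chunk.strip()
    return seqs, "".join(ss_chunks)


def base_pairs(ss_cons):
    """Return sorted (i, j) column pairs (0-based) from a bracket string.

    Raises IndexError if a closing bracket has no matching opening bracket.
    """
    stacks = {opener: [] for opener in CLOSE_TO_OPEN.values()}
    pairs = []
    for i, ch in enumerate(ss_cons):
        if ch in stacks:
            stacks[ch].append(i)
        elif ch in CLOSE_TO_OPEN:
            pairs.append((stacks[CLOSE_TO_OPEN[ch]].pop(), i))
    return sorted(pairs)
```

Calling parse_stockholm on the text of a seed alignment yields a dictionary of aligned sequences plus the consensus structure string; base_pairs then gives the paired alignment columns, which is typically what a structure prediction benchmark compares against.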
The sequence search page [at the Sanger Centre] allows you to search a nucleotide sequence against the Rfam model library. Any hits to Rfam families will be returned with start and end coordinates, model start/ends, and a score for each hit. However, because the site uses a stringent BLASTN prefilter for speed reasons, sensitivity relative to a full CM search is often poor.
S. Griffiths-Jones, S. Moxon, M. Marshall, A. Khanna, S. R. Eddy, A. Bateman, "Rfam: Annotating Non-Coding RNAs in Complete Genomes", Nucleic Acids Research 33:D121-D124, 2005.
Rfam is produced by the Rfam Consortium, a collaboration between researchers at the Wellcome Trust Sanger Institute near Cambridge, UK, the University of Manchester in Manchester, UK, and HHMI Janelia Farm near Washington, DC. Until recently, Rfam was led by Sam Griffiths-Jones, who has since moved to Manchester. The Sanger Institute is recruiting a new project leader.
The Janelia Farm mirror site is maintained by Sean Eddy.
Rfam is in a continuous state of flux, but the following access methods will stay stable. Rfam accession numbers are stable identifiers; Rfam names are not necessarily stable.
To get the full alignment, use "type=full" in place of "type=seed".
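As a small illustration of switching between the two alignment types in a script, the sketch below rewrites the "type" query parameter of an alignment URL. Only the "type=seed" and "type=full" values come from the text above; the example URL, its "acc" parameter, and the function name with_alignment_type are hypothetical placeholders, not the actual Rfam access URLs.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_alignment_type(url, kind):
    """Return url with its "type" query parameter set to "seed" or "full"."""
    if kind not in ("seed", "full"):
        raise ValueError("kind must be 'seed' or 'full'")
    parts = urlsplit(url)
    # Keep every other parameter, drop any existing "type", then append ours.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "type"
    ]
    query.append(("type", kind))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical URL, used only to show the parameter swap.
seed_url = "http://example.org/alignment?acc=RF00001&type=seed"
full_url = with_alignment_type(seed_url, "full")
```

Because family names may change between releases, a script like this would normally identify the family by its accession number rather than by name.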
There are a number of high quality specialised RNA databases. Links to some of these are listed below. The RNA World has a more comprehensive list. Rfam contains alignments and annotation which derive from some of these sources. In such cases the individual source is referenced on the family page. Please credit the primary source if you make use of any data that we have repackaged.
We would like to thank William Mifsud for providing annotation for a number of Rfam families.