About RabDB & Rabifier

RabDB2 is a database containing information on the Rab family of small GTPases in 244 fully sequenced eukaryotic genomes. The content has been generated using the Rabifier2 classification pipeline. If you have any questions, comments or suggestions about RabDB or the Rabifier, don't hesitate to contact us at jsurkont (at) igc.gulbenkian.pt or jleal (at) igc.gulbenkian.pt.

Rabifier2 is a major update of the pipeline originally developed by Yoan Diekmann to identify Rab GTPases and classify them into subfamilies based on the protein sequence.

  • Improvements to the Rabifier pipeline.
    • Redesigned and rewritten the Rabifier codebase:
      • The pipeline is now written purely in Python, whereas Rabifier1 also depended on R.
      • Both Python 2 and Python 3 are supported.
      • The source code is freely available, allowing for local installation. The package can be easily installed from the source or from PyPI.
      • HMMER3 is used for similarity search instead of BLAST.
      • Sequences are now classified into subfamilies by comparing sequence scores against a model of each subfamily (instead of against empirical distributions); the result of this comparison is subsequently used as input for the naive Bayes classifier.
    • Updated the third-party software:
      • HMMER 3.1b1
      • BLAST 2.2.30
      • MEME 4.10.2
      • CD-HIT 4.6.4
      • PRANK v.150803
      • MAFFT 7.221
    • Reference datasets have been updated. The changes include updates to some of the underpopulated subfamilies (AtRabA3, AtRabA6, AtRabD1, AtRabF1, CeRabY6, DmRab9F, DmRabX4, DmRabX5, DmRabX6, EhRabA, Rab17, Rab31, Rab41, Rab42), based on genomes from the Ensembl database.
  • Improvements to RabDB.
    • Improvements to the sequence data:
      • RabDB is now based on sequence data from Ensembl, which allows regular updates as the list of sequenced genomes grows.
      • RabDB predictions are cross-linked to Ensembl, enabling easy browsing of protein and evolutionary information.
    • Improvements to the web tool interface:
      • RabDB now allows browsing Rabs at different levels of taxonomic organization.
      • Given the major speed improvement of the Rabifier, RabDB now allows annotation of up to 500 user-submitted sequences.
      • The RabDB-Rabifier user interface now allows classification parameters to be modified.
      • A detailed classification summary for each sequence is now available.
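
The score-based subfamily classification mentioned in the list above can be illustrated with a minimal sketch. This is not the Rabifier implementation: it assumes that each query sequence has one score per subfamily model, that those scores are modelled with Gaussian class-conditional distributions, and that the subfamily names, means, standard deviations and priors shown are hypothetical toy values.

```python
import math


def gaussian_log_pdf(x, mean, std):
    """Log density of a normal distribution N(mean, std^2) evaluated at x."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def classify(scores, params, priors):
    """Gaussian naive Bayes over per-model scores (illustrative only).

    scores: {model_name: score} for one query sequence
    params: {subfamily: {model_name: (mean, std)}} estimated from reference data
    priors: {subfamily: prior probability}
    Returns the most probable subfamily and the posterior of every subfamily.
    """
    log_post = {}
    for subfamily, feature_params in params.items():
        total = math.log(priors[subfamily])
        # Naive Bayes assumption: scores are independent given the subfamily,
        # so their log-likelihoods simply add up.
        for model_name, (mean, std) in feature_params.items():
            total += gaussian_log_pdf(scores[model_name], mean, std)
        log_post[subfamily] = total

    # Normalise in log space (log-sum-exp) to avoid numerical underflow.
    peak = max(log_post.values())
    norm = peak + math.log(sum(math.exp(v - peak) for v in log_post.values()))
    posteriors = {name: math.exp(v - norm) for name, v in log_post.items()}
    best = max(posteriors, key=posteriors.get)
    return best, posteriors


# Hypothetical parameters: expected score of a member of each subfamily
# against each subfamily model (toy numbers, not taken from Rabifier).
params = {
    "Rab1": {"Rab1": (300.0, 30.0), "Rab5": (120.0, 30.0)},
    "Rab5": {"Rab1": (120.0, 30.0), "Rab5": (300.0, 30.0)},
}
priors = {"Rab1": 0.5, "Rab5": 0.5}

# Hypothetical scores of one query sequence against the two models.
query_scores = {"Rab1": 280.0, "Rab5": 130.0}
best, posteriors = classify(query_scores, params, priors)
print(best, posteriors)
```

Working in log space keeps the computation stable when many subfamily models contribute a score each, since the product of many small densities would otherwise underflow.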

Reference: If Rabifier is useful in your research, please cite:

Availability

The source code of the Rabifier classification pipeline is publicly available at GitHub. Rabifier is also distributed as a Python package at PyPI.
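
As a small illustration of working with the PyPI package, the sketch below checks whether a distribution is installed in the current Python environment. It assumes the distribution is named `rabifier` on PyPI (for example, installed with `pip install rabifier`); the helper name `installed_version` is hypothetical, and the code requires Python 3.8 or later.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# "rabifier" is the assumed PyPI distribution name.
version = installed_version("rabifier")
if version is None:
    print("rabifier is not installed in this environment")
else:
    print("rabifier version:", version)
```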

The Rabifier pipeline uses a manually curated set of Rab and non-Rab sequences to identify Rabs and classify them into subfamilies. These data can be downloaded from GitHub.

Rabifier is an ongoing project. If you wish to contribute, visit the project's website or send us your feedback or comments.

The original version of RabDB is available at rabdb.org/legacy.

Creators

Jaroslaw Surkont, Yoan Diekmann, Jose Pereira Leal