gammaBOriS (Gammaproteobacteria) & gammaBOriTax



gammaBOriS is a tool that identifies origin of replication (oriC) sequences in chromosomes of Gammaproteobacteria. In contrast to many other tools with this aim, it is not based on chromosome-wide nucleotide disparities but on the motif composition of oriC sequences. It can therefore run not only on complete chromosomes but also on fragmentary ones. More details can be found in the paper cited below.

The gammaBOriS web server identifies oriC sequences in a file uploaded on the main tab of this website. Once gammaBOriS has finished, a link is presented from which the results can be downloaded. This can take between ten minutes and several hours, depending on the web server workload. The output of gammaBOriS consists of two FASTA-formatted text files: one contains the sequence fragments that contain oriC, the other contains the fragments for which the classifier abstained from classification. As an example, we provide the Escherichia coli genome (RefSeq ID NC_000913.3) as sample input and the corresponding output at the bottom of the "Prediction" tab. We collect the IP addresses of users of the web server in order to be able to prevent DDoS attacks on our servers, and for no other purpose.
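Because both result files are plain FASTA, they can be inspected with a few lines of code. The following is a minimal Python sketch, not part of gammaBOriS itself; it assumes only that each record starts with a ">" header line followed by one or more sequence lines. The function names and the file names in the usage note are hypothetical.

```python
import sys
from typing import Dict, Iterable


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into a {header: sequence} dictionary."""
    records: Dict[str, str] = {}
    header = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]  # drop the leading ">"
            records[header] = ""
        elif header is not None:
            records[header] += line  # join wrapped sequence lines
    return records


def summarize(path: str) -> None:
    """Print the number of fragments in a FASTA file and each fragment's length."""
    with open(path) as handle:
        records = read_fasta(handle)
    print(f"{path}: {len(records)} fragment(s)")
    for header, seq in records.items():
        print(f"  {header}\t{len(seq)} bp")


if __name__ == "__main__":
    # File paths are taken from the command line; nothing is assumed
    # about how gammaBOriS names its output files.
    for fasta_path in sys.argv[1:]:
        summarize(fasta_path)
```

Saved as, say, summarize_fasta.py, it could be called with both downloaded files as arguments, for example `python summarize_fasta.py oriC_fragments.fasta abstained_fragments.fasta` (placeholder file names).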

gammaBOriS can also be run from the command line. This requires R, the R packages Biostrings and stringr, gkmpredict from the lsgkm package (available here), and a Unix-based operating system. For gammaBOriS to run, gkmpredict must be executable and placed in the same folder as the seed file and the ls-gkm model file. The offline version of gammaBOriS can be downloaded here.
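The folder layout described above can be verified before starting a run. The following Python sketch is an illustration only and is not shipped with gammaBOriS. The function name check_setup is hypothetical, and the seed and model file names are passed in as parameters because the names used by the offline version are not stated here.

```python
import os
from typing import List


def check_setup(folder: str, seed_file: str, model_file: str) -> List[str]:
    """Return a list of problems with the folder layout (empty if none found)."""
    problems: List[str] = []

    # gkmpredict must sit in the folder and carry the executable bit.
    gkmpredict = os.path.join(folder, "gkmpredict")
    if not os.path.isfile(gkmpredict):
        problems.append(f"gkmpredict not found in {folder}")
    elif not os.access(gkmpredict, os.X_OK):
        problems.append("gkmpredict is not executable (try: chmod +x gkmpredict)")

    # The seed file and the ls-gkm model file must be in the same folder.
    for name in (seed_file, model_file):
        if not os.path.isfile(os.path.join(folder, name)):
            problems.append(f"{name} not found in {folder}")

    return problems
```

A call such as `check_setup(".", "seed.fasta", "model.txt")` (placeholder file names) returns an empty list when the three files are in place. It does not check that R, Biostrings, or stringr are installed.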

BOriS DB is a database of oriC sequences gathered using BOriS. It currently contains around 26,000 sequences identified in Gammaproteobacteria using gammaBOriS, which makes it the largest publicly available dataset of oriC sequences, both for Gammaproteobacteria and for Bacteria in general. BOriS DB will be updated when new instances of BOriS are published. It can be downloaded in full or in user-defined parts.

If you use gammaBOriS or BOriS DB, please cite: T. Sperlea, L. Muth, R. Martin, C. Weigel, T. Waldminghaus and D. Heider. gammaBOriS: Identification and Taxonomic Classification of Origins of Replication in Gammaproteobacteria using Motif-based Machine Learning. Sci Rep 10, 6727 (2020). doi:10.1038/s41598-020-63424-7.

For any questions or comments, please get in touch: theodor.sperlea@staff.uni-marburg.de.