Services: Genes and genomes

Name of service	Description	Tag	Related links*	Key Collection
Ocean Gene Atlas	The Ocean Gene Atlas service provides data mining access to three complementary data objects: gene sequence catalogs (ENA), sample environmental context (PANGAEA), and gene abundance estimates in samples (computed by mapping raw sequence reads onto gene catalogs). User queries consist of either a sequence (nucleic or protein) or a hidden Markov model derived from a multiple sequence alignment. Homologs of the user query in the gene catalogs are identified using standard sequence similarity search tools (e.g. BLAST or HMMER), and their read-based estimated abundances are displayed in interactive world maps and ecological plots. A phylogenetic tree is also inferred to situate the user query within its context of marine environmental homologs as well as known homologs from reference sequences.	Genes and genomes	bio.tools
OMA	OMA identifies orthologs among 2000 genomes from all domains of life. Other distinctive characteristics are the high quality of its inferences, its feature-rich web interface, and a frequent update schedule of two releases per year.	Evolution and phylogeny Genes and genomes	bio.tools FAIRsharing
ORCAE	Online collaborative genome annotation resource offering a range of tools and information to validate and correct gene annotations.	Genes and genomes	bio.tools
Orphadata	Orphadata provides the scientific community with comprehensive, quality data sets related to rare diseases and orphan drugs from the Orphanet knowledge base, in reusable formats.	Genes and genomes	bio.tools FAIRsharing	CDD
Orphanet	Orphanet is the reference resource for information and data on rare diseases and orphan drugs. Orphanet derives from its knowledge base an ontology of rare diseases, information on rare diseases and data on rare diseases.	Genes and genomes	bio.tools FAIRsharing
OrthoDB	OrthoDB is a comprehensive catalog of evolutionary and functional annotations of orthologs, covering over 22 million genes from over 5000 species of animals, fungi, plants, archaea, bacteria, and viruses.	Evolution and phylogeny Genes and genomes	bio.tools FAIRsharing
ParameciumDB	ParameciumDB is a community model organism database for the ciliate Paramecium. The web site gives access to genomes of many Paramecium species and their annotations. ParameciumDB also integrates genome-wide datasets (DNA-seq, RNA-seq, ChIP-seq) provided by the community. This portal is used to query, retrieve, visualize and compare the most up-to-date public data.	Genes and genomes	bio.tools
PatSearch	Searches user-submitted sequences for any combination of position weight matrices (PWMs), primary sequence patterns and structural motifs.	Genes and genomes	bio.tools
PHI-base	A catalogue of experimentally-verified pathogenicity, virulence and effector genes involved in the infection of animal, plant, fungal and/or insect hosts.	Genes and genomes	bio.tools FAIRsharing
PhyML	PhyML is software that estimates maximum likelihood phylogenies from alignments of nucleotide or amino acid sequences. The main strength of PhyML lies in its large number of substitution models, coupled with various options for searching the space of phylogenetic tree topologies, ranging from very fast and efficient methods to slower but generally more accurate approaches. PhyML was designed to process moderate to large data sets. In theory, alignments of up to 4,000 sequences, each up to 2,000,000 characters long, can be processed. PhyML can process data sets made of multiple genes and fit sophisticated substitution models with heterogeneous components across partition elements.	Evolution and phylogeny Genes and genomes	bio.tools
PiCnIc	Pipeline for Cancer Inference: a pipeline to extract ensemble-level progression models from cross-sectional sequenced cancer genomes.	Genes and genomes	bio.tools
PIPPA	Web-interface and database providing tools for the management of different plant phenotyping platforms, and the analysis of images and data.	Genes and genomes	bio.tools
PlantsDB	Provides a data and information resource for individual plant species.	Genes and genomes	bio.tools
PLAZA	Plant-oriented online resource for comparative, evolutionary and functional genomics.	Genes and genomes	bio.tools FAIRsharing	RIR
PredictSNP	Tool for predicting disease-related mutations in proteins. Version 2 (PredictSNP 2), which predicts disease-related mutations within the human genome, has been available since 2016.	Genes and genomes	bio.tools
Primer3	Primer3 is a program for designing PCR primers and oligos.	Genes and genomes	bio.tools
RAP	RNA-Seq Analysis Pipeline: a cloud computing web application implementing a complete and modular RNA-Seq analysis workflow.	Genes and genomes	bio.tools
ReadXplorer	Explores and evaluates NGS data, using a modular programming structure that makes plugins easy to add.	Genes and genomes	bio.tools
REDIdb	A database annotating organellar RNA editing processes in their biological context.	Genes and genomes	bio.tools
REDIportal	A database of RNA editing events in humans from RNA-Seq and DNA-Seq data.	Genes and genomes	bio.tools
REDItools	Python scripts for studying RNA editing at genomic scale using next-generation sequencing data.	Genes and genomes	bio.tools
RepeatExplorer	Set of tools and a web server for complex characterization of repetitive DNA based on next-generation sequencing read data.	Genes and genomes	bio.tools
REPET	The REPET package integrates bioinformatics pipelines dedicated to detecting, annotating and analysing transposable elements (TEs) in genomic sequences. The main pipelines are (i) TEdenovo, which searches for interspersed repeats, builds consensus sequences and classifies them according to TE features, and (ii) TEannot, which mines a genome with a library of TE sequences, for instance the one produced by the TEdenovo pipeline, to provide TE annotations exported as GFF3 files.	Genes and genomes Molecular and cellular structures	bio.tools
Rfam	The Rfam database is a collection of RNA families, each represented by multiple sequence alignments, consensus secondary structures and covariance models (CMs).	Genes and genomes	bio.tools FAIRsharing TeSS
RNA Galaxy Workbench	Providing access to many NGS and RNA tools, visualisations, interactive environments (e.g. IPython) as well as various utilities, reference genomes and data libraries.	Genes and genomes
RNA-seq end-to-end workflow	End-to-end gene-level RNA-Seq differential expression workflow using Bioconductor packages. Starting from FASTQ files, reads are aligned to the reference genome and a count matrix, which tallies the number of RNA-seq reads/fragments within each gene for each sample, is prepared. The workflow then covers exploratory data analysis (EDA) for quality assessment and exploration of the relationship between samples, differential gene expression analysis, and visual exploration of the results.	Genes and genomes	TeSS
Roddy	Roddy is a framework for large-scale NGS processing pipelines at petabyte scale. It is used for the management of workflows in the Pan-Cancer Analysis of Whole Genomes (PCAWG) project.	Genes and genomes	bio.tools
rPredictor	Web tool for prediction of rRNA secondary structures.	Genes and genomes Molecular and cellular structures	bio.tools
SalmoBase	A comprehensive data resource for salmonid species based on different omics data.	Genes and genomes Proteins and proteomes	bio.tools
SARS-CoV-2 DB	A database of high-quality, curated and freely accessible SARS-CoV-2 genomic and contextual resources.	Genes and genomes Proteins and proteomes