Services: data resources

Name of service Tag Related links* Key Collection
MESBL GC-MS metabolite peak database

The MESBL GC-MS metabolite peak database is a standardized library of more than 900 metabolite peaks from MS-reconstructed gas chromatograms. It integrates the in-house standard compound and peak library of the FORTH/ICE-HT Metabolic Engineering and Systems Biology Laboratory, appropriately filtered peak information from the GOLM database, and information from the Human Metabolome database.

MÉTA Programme

Website encompassing vegetation heritage data of Hungary, including distribution maps of habitat types and of plant invasion of habitat types.

Metabolic Atlas

Open-source infrastructure service for research and engineering of metabolism in model organisms and humans.

MetaboLights

MetaboLights is a database for metabolomics experiments and derived information. 

EDD
MGnify

Formerly called EBI Metagenomics, MGnify is an automated pipeline for the analysis and archiving of metagenomic data.

CDD
MicroScope

MicroScope is an integrated Web platform for the annotation and exploration of microbial gene functions through genomic, pangenomic and metabolic comparative analysis. It supports submissions of newly assembled genomes and metagenomes, and also provides analysis services for RNA-seq data. The user interface of MicroScope enables collaborative work in a rich comparative context to improve community-based curation efforts.

MINT

The Molecular INTeraction Database of protein-protein interactions curated from peer-reviewed papers. An ELIXIR Core Data Resource and founding member of the IMEx Consortium.

MirGeneDB

A manually curated database of animal small non-coding RNAs

MitoZoa

MITOchondrial genome database of metaZOAns, a resource for comparative analyses of metazoan mitochondrial genomes at the sequence and genomic levels.

MobiDB

A database of protein disorder and mobility annotations, designed to centralize resources for annotations of intrinsic protein disorder and its function. Part of the InterPro consortium, an ELIXIR Core Data Resource.

Model Archive

The archive for structural models that are not based on experimental data. It complements the PDB archive for experimental structures and PDB-Dev for integrative structures. Any type of macromolecular structure that would otherwise be suitable for the PDB but whose coordinates are not based on experimental data can be deposited in ModelArchive. This includes single chains or complexes consisting of proteins, RNA, DNA, or carbohydrates, including small molecules bound to them.

EDD
NEOF

The NERC Environmental Omics Facility (NEOF) will enable environmental researchers in the UK to access the full range of omics supporting technology.

neXtProt

An innovative knowledge platform dedicated to human proteins. It contains a wealth of data on all the human proteins that are produced by the 20,000 protein-coding genes found in the human genome.

Nextstrain

Nextstrain is an open-source project to harness the scientific and public health potential of pathogen genome data. We provide a continually updated view of publicly available data alongside powerful analytic and visualization tools for use by the community. Our goal is to aid epidemiological understanding and improve outbreak response. This resource supports COVID-19 / SARS-CoV-2 research.

Norine

Norine is the only resource dedicated to nonribosomal peptides (NRPs). It contains a database complemented with analysis tools (structure comparison, inference of monomer structure from chemical structure, mass spectrum analysis, NRP annotation for submission to the database). NRPs are secondary metabolites produced by bacteria and fungi and display a diverse spectrum of biological activity.

Ocean Gene Atlas

The Ocean Gene Atlas service provides data mining access to three complementary data objects: gene sequence catalogs (ENA), sample environmental context (PANGAEA), and gene abundance estimates in samples (computed by mapping raw sequence reads onto gene catalogs). User queries are composed of either a sequence (nucleic or protein) or a hidden Markov model derived from a multiple sequence alignment. Homologs of the user query in the gene catalogs are identified using standard sequence similarity search tools (e.g. BLAST or HMMER), and their read-based estimated abundances are displayed in interactive world maps and ecological plots. A phylogenetic tree is also inferred in order to situate the user query within its context of marine environmental homologs as well as known homologs from reference sequences.

OLIDA

Curated database of oligogenic diseases and the genetic variants causing these diseases. It is the successor of DIDA, a similar database for digenic diseases.

OMA

OMA identifies orthologs among 2000 genomes from all domains of life. Other distinctive characteristics are the high quality of its inferences, the feature-rich web interface, and frequent update schedule of two releases per year. 

OmniPath

A database of causal protein-protein interactions, transcriptional and post-transcriptional regulation, enzyme-PTM interactions, protein complexes, annotations (function, disease roles, expression, localization, structure) and intercellular communication. Integrates data from more than 100 resources.

Orphadata

Orphadata provides the scientific community with comprehensive, quality data sets related to rare diseases and orphan drugs from the Orphanet knowledge base, in reusable formats.

CDD
Orphanet

Orphanet is the reference resource for information and data on rare diseases and orphan drugs. Orphanet derives from its knowledge base an ontology of rare diseases, information on rare diseases and data on rare diseases.

OrthoDB

OrthoDB is a comprehensive catalog of evolutionary and functional annotations of orthologs, covering over 22 million genes from over 5000 species of animals, fungi, plants, archaea, bacteria, and viruses. 

ORVAL

ORVAL (Oligogenic Resource for Variant Analysis) is the first web bioinformatics platform for the exploration of predicted candidate disease-causing variant combinations, aiming to aid in uncovering the causes of oligogenic diseases (i.e. diseases caused by variants in a small number of genes).

PANGAEA

A service for publishing, archiving and re-using data.

ParameciumDB

ParameciumDB is a community model organism database for the ciliate Paramecium. The web site gives access to genomes of many Paramecium species and their annotations. ParameciumDB also integrates genome-wide datasets (DNA-seq, RNA-seq, ChIP-seq) provided by the community. This portal is used to query, retrieve, visualize and compare the most up-to-date public data.

PED

The Protein Ensemble Database (PED) is an open access database for the deposition of structural ensembles, including intrinsically disordered proteins (IDPs). Manually curated data of structural ensembles measured with nuclear magnetic resonance spectroscopy, small-angle X-ray scattering and fluorescence resonance energy transfer are annotated in PED.

Pfam

Pfam is a large collection of protein families, each represented by multiple sequence alignments and hidden Markov models (HMMs). 

PHI-base

A catalogue of experimentally-verified pathogenicity, virulence and effector genes involved in the infection of animal, plant, fungal and/or insect hosts.

Plant Experimental Assay Ontology (PEAO)

A comprehensive ontology for the plant domain that is useful for data integration and for querying heterogeneous data.

PlantsDB

Provides a data and information resource for individual plant species.