Integrated Data Sources

| # | Data Source Name | Category | Description | Web source |
|---|---|---|---|---|
| 1 | BioGRID | Interaction | Protein, Genetic and Chemical Interactions | BioGRID |
| 2 | BioMart | Gene | Gene ID mappings retrieved from BioMart using pybiomart | BioMart |
| 3 | Borealis - list2 | Gene | Conservation scores from the Canadian research data repository | Borealis: The Canadian Dataverse Repository |
| 4 | cBioPortal | Cancer Genomics | Open-source resource for interactive exploration of multidimensional cancer genomics data sets | cBioPortal |
| 5 | Cell Ontology | Terminology | An ontology of cell types | Cell Ontology |
| 6 | Cellosaurus | Terminology | A knowledge resource on cell lines | Cellosaurus |
| 7 | ChEBI | Drug/Compound | IDs and accession numbers only | Chemical Entities of Biological Interest (ChEBI) |
| 8 | ChEMBL | Drug/Compound | A manually curated database of bioactive molecules with drug-like properties | ChEMBL Database |
| 9 | ClinVar | Genetics | Aggregated information about genomic variation and its relationship to human health | ClinVar |
| 10 | Compartments | Gene | Protein subcellular localization from manually curated literature, high-throughput screens, automatic text mining, and sequence-based prediction methods | COMPARTMENTS |
| 11 | Complex Portal | Interaction | A manually curated, encyclopaedic resource of macromolecular complexes from a number of key model organisms (only human data is included) | Complex Portal |
| 12 | dbSNP | Genetics | Human single nucleotide variations, microsatellites, and small-scale insertions and deletions, along with publication, population frequency, molecular consequence, and genomic and RefSeq mapping information for both common variations and clinical mutations | dbSNP |
| 13 | DepMap | Expression | Cancer Dependency Map data | The Cancer Dependency Map Project at Broad Institute |
| 14 | DisGeNET | Gene | All gene-disease associations in DisGeNET | https://www.disgenet.org/downloads |
| 15 | DrugBank | Drug/Compound | Drug information | DrugBank |
| 16 | DrugCentral | Drug/Compound | Drug information | Drug Central |
| 17 | EFO | Terminology | Systematic description of experimental variables available in EBI databases | EFO - The Experimental Factor Ontology < EMBL-EBI |
| 18 | Expasy - Enzyme | Protein | The nomenclature of enzymes | Expasy - ENZYME |
| 19 | FDA - UNII | Clinical | FDA's Global Substance Registration System enables an efficient and accurate exchange of information on substances through their Unique Ingredient Identifiers (UNIIs), which can be generated at any time in the regulatory life cycle | FDA's Global Substance Registration System |
| 20 | GENCODE | Gene | GENCODE annotation for mouse | GENCODE - Mouse Release M32 |
| 21 | GeneRIF | Gene | Functional annotation of genes | About Gene RIF - Gene - NCBI |
| 22 | Gene Ontology | Pathway | Information on the functions of genes | Gene Ontology |
| 23 | GSEA - MSigDB | Pathway | Curated gene sets (C2) from MSigDB | Human MSigDB Collections |
| 24 | GTEx Portal | Expression | Gene expression data from samples collected from 54 non-diseased tissue sites across nearly 1000 individuals | GTEx Portal |
| 25 | HomoloGene | Gene | An automated system for constructing putative homology groups from the complete gene sets of a wide range of eukaryotic species | Home - HomoloGene - NCBI |
| 26 | IntAct | Interaction | Molecular interaction data | IntAct |
| 27 | InterPro | Protein | Classification of protein families and prediction of domains and important sites | InterPro |
| 28 | MeSH | Terminology | Medical Subject Headings | Home - MeSH - NCBI |
| 29 | miRWalk | Chemical Biology | miRNA-target interactions | miRWalk |
| 30 | MyGene | Gene | Gene annotation data | MyGene.info |
| 31 | NA (Paper) | Chemical Biology | Data from the paper: 'Reimagining high-throughput profiling of reactive cysteines for cell-based screening of large electrophile libraries' | Reimagining high-throughput profiling of reactive cysteines for cell-based screening of large electrophile libraries - Nature Biotechnology |
| 32 | NA (Paper) | Chemical Biology | Data from the paper: 'hUbiquitome: a database of experimentally verified ubiquitination cascades in humans' | hUbiquitome: a database of experimentally verified ubiquitination cascades in humans |
| 33 | NA (Paper) | Chemical Biology | Data from the paper: 'The PROTACtable genome' | The PROTACtable genome - Nature Reviews Drug Discovery |
| 34 | NCBI - Gene | Gene | Human gene ID, name, symbol and synonyms from NCBI Gene | Home - Gene - NCBI |
| 35 | NCBI - Orthologs | Gene | gene_orthologs table from NCBI Gene | How are orthologs calculated? - NCBI |
| 36 | NCBI - Taxonomy | Gene | A curated classification and nomenclature for all of the organisms in the public sequence databases | Home - Taxonomy - NCBI |
| 37 | PDB - Ligand Expo | Protein | Chemical and structural information about small molecules within the structure entries of the Protein Data Bank | Ligand Expo |
| 38 | PDBbind-CN | Protein | A comprehensive collection of experimentally measured binding affinity data for all biomolecular complexes deposited in the Protein Data Bank | PDBbind |
| 39 | Protein antibody tags | Protein | Bayer-internal curated protein antibody tags for CITE-seq data | NA |
| 40 | Reactome | Pathway | ChEBI to Reactome mapping | Download - Reactome Pathway Database |
| 41 | Reactome | Pathway | Signaling and metabolic molecules and their relations organized into biological pathways and processes | https://reactome.org/ |
| 42 | SAbDab | Protein | All the antibody structures available in the PDB, annotated and presented in a consistent fashion | SAbDab: The Structural Antibody Database |
| 43 | Scannet | NA | NA | NA |
| 44 | Selleckchem | Drug/Compound | Compound data (SMILES ...) | Selleck Chemicals |
| 45 | SIFTS | Protein | Structure Integration with Function, Taxonomy and Sequence (SIFTS) is a project in the PDBe-KB resource for residue-level mapping between UniProt and PDB entries | SIFTS < PDBe < EMBL-EBI |
| 46 | STRING | Genetics | Protein interactions | STRING |
| 47 | SWISS-MODEL | Protein | Annotated 3D protein structure models generated by the SWISS-MODEL homology-modelling pipeline | SWISS-MODEL Repository |
| 48 | TCGA - The Cancer Genome Atlas Program | Expression | Molecular characterization data from over 20,000 primary cancer and matched normal samples spanning 33 cancer types | The Cancer Genome Atlas Program (TCGA) - NCI |
| 49 | The Human Protein Atlas | Expression | Expression profiles of genes in human tissues at both the mRNA and protein level | The Human Protein Atlas |
| 50 | Uberon | Terminology | Cross-species anatomy ontology | Uber-anatomy ontology |
| 51 | UbiNet | Drug/Compound | Updated, validated, and abundant E3-substrate interactions; detailed classification of human E3 ligases | UbiNet 2.0 |
| 52 | UMLS | Terminology | Medical and biomedical vocabularies and standards | UMLS Knowledge Sources: File Downloads |
| 53 | UniProt | Protein/Gene | Orthologous protein data | UniProt |
| 54 | UniProt | Protein | Protein to protein family mapping | UniProt |
| 55 | UniProt - OrthoDB | Protein | Database of orthologous groups | OrthoDB - Cross-referenced databases - UniProt |
| 56 | UniProt - Subcellular location | Protein | Location of proteins within cells | Subcellular location - UniProt help |
| 57 | UniProt - Swiss-Prot | Protein/Gene | Curated protein data | UniProt |
| 58 | UniProt - TrEMBL | Protein | Non-curated, computationally analyzed records | UniProt |
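
For programmatic use, a catalog like the one above could be held as a list of typed records and grouped by category. The following is only a minimal sketch under stated assumptions, not the project's actual code: the `DataSource` class, its field names, and the `group_by_category` helper are hypothetical, and only a handful of rows from the table are included for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSource:
    """One catalog row (hypothetical structure, for illustration only)."""
    number: int
    name: str
    category: str
    description: str


# A few rows copied from the table; the full catalog lists 58 entries.
SOURCES = [
    DataSource(1, "BioGRID", "Interaction", "Protein, Genetic and Chemical Interactions"),
    DataSource(9, "ClinVar", "Genetics", "Genomic variation and its relationship to human health"),
    DataSource(12, "dbSNP", "Genetics", "Human single nucleotide variations and small indels"),
    DataSource(26, "IntAct", "Interaction", "Molecular interaction data"),
    DataSource(28, "MeSH", "Terminology", "Medical Subject Headings"),
]


def group_by_category(sources):
    """Return a dict mapping each category to the sorted names of its sources."""
    grouped = defaultdict(list)
    for source in sources:
        grouped[source.category].append(source.name)
    return {category: sorted(names) for category, names in grouped.items()}


if __name__ == "__main__":
    # Print one line per category, listing the sources that belong to it.
    for category, names in sorted(group_by_category(SOURCES).items()):
        print(f"{category}: {', '.join(names)}")
```

Keeping the category as a plain string mirrors the table as written; a stricter design might use an enumeration so that entries such as "NA" are flagged when the catalog is loaded.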