Scientific Drug Data

Comprehensive drug knowledge for data science, drug discovery, and machine learning.


Join thousands of researchers who trust DrugBank data, daily.

Since DrugBank’s inception 13 years ago, researchers around the world have relied on DrugBank to push forward the fields of bioinformatics, in silico drug discovery and drug repurposing, and artificial intelligence for drug discovery.

With more than 7,000 DrugBank citations, DrugBank is one of the most widely used scientific databases in the world.

Product Overview

Drugs From Discovery To Approval

Access data including early preliminary findings, investigational drugs, and approved or withdrawn drugs.

Data Points

We provide 200 different data fields, including chemical structures, physicochemical properties, drug targets, pharmacological and toxicological data, clinical trials, patents, and drug indications.

Text & Structured Detail Information

Our data provides context for researchers, as well as structured details which capture the information in a format suitable for bioinformatics, data science, and machine learning applications.


Our robust help center features interactive tutorials, reference documentation, and a cross-referenced glossary. You can also set up a phone call with our product support team.


All DrugBank datasets are available as JSON, XML or SQL and include daily updates.


Since inception, DrugBank has been used by tens of thousands of academic researchers and has been featured in more than 7,000 scientific publications.

Available Datasets

DrugBank offers a variety of datasets available in multiple formats, with daily updates and access to phone support. Our datasets include drug ingredients, pharmacology, drug products, drug product concepts, patents, investigational and experimental drugs, drug targets, drug metabolism, structured drug indications, and structured drug-drug interactions.

You can access summaries for more than 3,000 clinical drugs and 7,500 pre-clinical drugs. Information about each drug includes names, synonyms, a unique identifier, detailed descriptions, the chemical structure, and the chemical structures of formulations and salts.
We provide overviews of clinical trial status and approval status for drugs, including max phase, approval types (OTC, prescription), approval dates and generic availability. The status is available for FDA, Health Canada, and EMA.
Access chemical structures and protein sequences for pre-clinical drugs and for every drug approved by the FDA and Health Canada. The structure dataset also includes structures of formulations, salts, and drug metabolites. Chemical structures are available in multiple formats, including SDF, SMILES, and InChI. Protein sequences are available in FASTA and include UniProt and GenBank identifiers.
The Drug Product dataset provides over 130,000 drug product listings, covering all products approved by the FDA, Health Canada, and EMA, including withdrawn drug products. Each listing includes the brand or generic name, dosage, dose form, route of administration, codes, labeller, approval status, and marketing start and end dates. The drug ingredients are linked with the drug dataset, and all products are linked with approved indications, which include links to MedDRA and ICD10.
Our pharmacology dataset includes detailed descriptions of the mechanism of action, metabolism, absorption, distribution, elimination and pharmacokinetic and pharmacodynamic parameters such as half-life, clearance, and LD50.
Drugs are categorized to support searching, filtering, and comparison. The chemical taxonomy organizes drugs based on features of the chemical structure. Drug categories organize drugs based on class of drug, effect, affected organ systems, and targeted proteins, and are linked with MeSH identifiers when available. All drugs are also linked with ATC drug classifications.
The Therapeutic Categories are based on the FDA's “Established Pharmacologic Class” (EPC), which is a pharmacologic class associated with an approved indication of an active moiety that the FDA has determined to be scientifically valid and clinically meaningful. We have extended the EPC categories to cover other jurisdictions as well.
The product concept dataset provides more than 240,000 unique identifiers that each describe a distinct set of product attributes including brand, route, ingredients and dosage. Each product concept represents a set of drug products that matches the attributes. The product concepts are organized in a hierarchical structure that enables easy navigation, and the comparison of products at different levels of similarity (even across regulatory jurisdictions). When available, product concepts are mapped to RxNorm concepts to allow for easy integration.
You can access robust clinical trial information including trial ID, title, phase, start and stop dates, trial description, conditions (indications), intervention groups, intervention descriptions, countries, and PubMed references for the trial. Trial information has been parsed, normalized, and integrated directly with DrugBank datasets.
We provide you with over 5,600 drug patents, including the patent ID, grant date, and either the historical expiry date or the estimated future expiry date.
We include data on SNP Mediated Adverse Drug Reactions and SNP Mediated Pharmacological Effects, including a description of the effect, affected drugs, references, SNP IDs, allele name, gene identifier, and affected genotype, as well as coverage of predicted markers for some pre-clinical drugs. In addition, the structured indication dataset provides detailed information on genetic variants that are part of the approved indication.
There are over 21,000 drug-protein interactions, covering targets, enzymes, carriers, and transporters. This includes annotations describing the pharmacological action, and the type of interaction (antagonist, agonist, substrate, inhibitor, inducer). Drug targets also include the protein and gene identifiers and sequences and are associated with over 13,000 unique references.
DrugBank offers robust drug metabolism data, including metabolism reactions associated with 110 enzymes. The data includes the type of reaction, the enzymes and other proteins involved and the name and structure of the drug metabolites.
We have manually extracted over 10,000 drug indications from FDA drug labels and scientific publications, covering every FDA- and Health Canada-approved indication as well as common off-label indications. Each indication includes a text description, severity, type of indication, and associated ICD10 and MedDRA identifiers.
We provide a comprehensive list of contraindications and black box warnings manually extracted from FDA drug labels and scientific publications. Each listing includes details about the contraindication, the name of the condition associated with the contraindication, and associated ICD10 or MedDRA identifiers.
The adverse effects dataset is collected from clinical trial data, drug labels, and post-market reporting, and includes incidence rates when available. Each listing includes the name of the condition, synonyms for the condition, and associated ICD10 or MedDRA identifiers.
We provide over 1.3 million accurate, up-to-date drug-drug interactions covering all FDA- and Health Canada-approved drugs, sourced from drug labels and references.
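
As noted above, protein sequences are available in FASTA. For researchers planning a pipeline, the following is a minimal sketch of reading FASTA-formatted text into (header, sequence) pairs using only the Python standard library. It is an illustration, not DrugBank's own tooling: the function name `parse_fasta`, the sample headers, and the sample sequences are all hypothetical, and the actual header layout of a DrugBank export is not assumed here.

```python
from typing import Iterator, Tuple


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header = None
    chunks = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record begins; emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            # Sequence lines may wrap; collect and join them later.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Hypothetical sample: headers and sequences are made up for illustration.
SAMPLE = """>example_target_1 hypothetical header
MKTAYIAKQR
QISFVKSHFS
>example_target_2 hypothetical header
GSHMLEDPVE
"""

records = dict(parse_fasta(SAMPLE))
# Map the first token of each header to its sequence length.
lengths = {header.split()[0]: len(seq) for header, seq in records.items()}
print(lengths)
```

Because the parser yields records one at a time, a large sequence file can be read and filtered without building every record in memory first; collecting into a `dict` as shown is only convenient for small samples.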

Contact Sales

Our products and services can be tailored to your company’s needs. Contact us today to request a demo, and talk about what solution is right for you.