mmtfPyspark.datasets.drugBankDataset module

drugBankDataset.py

This module provides access to DrugBank datasets containing drug structure and drug target information. These datasets contain identifiers and names for integration with other data resources.

get_dataset(url, username=None, password=None)

Downloads a DrugBank dataset

Parameters:

url : str

DrugBank dataset download link

username : str, optional

DrugBank username (default: None)

password : str, optional

DrugBank password (default: None)

Returns:

dataset

DrugBank dataset
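As a rough illustration of what such a download involves, the sketch below fetches a URL with optional credentials and parses a zipped CSV payload into rows. It is not the module's implementation: the use of HTTP Basic authentication, the zip-with-one-CSV layout, and the helper names fetch_bytes and rows_from_zipped_csv are all assumptions made for this example.

```python
import base64
import csv
import io
import urllib.request
import zipfile


def fetch_bytes(url, username=None, password=None):
    """Fetch raw bytes from url.

    Assumption: when credentials are given, the server accepts HTTP Basic auth.
    """
    request = urllib.request.Request(url)
    if username is not None and password is not None:
        raw = f"{username}:{password}".encode("utf-8")
        token = base64.b64encode(raw).decode("ascii")
        request.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(request) as response:
        return response.read()


def rows_from_zipped_csv(payload):
    """Parse the first member of a zip archive as CSV with a header row.

    Assumption: the archive holds a single UTF-8 encoded CSV file.
    """
    with zipfile.ZipFile(io.BytesIO(payload)) as archive:
        first_member = archive.namelist()[0]
        text = archive.read(first_member).decode("utf-8")
    return list(csv.DictReader(io.StringIO(text)))
```

The two steps are kept separate so that the parsing can be exercised on an in-memory archive without network access or a DrugBank account.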

Downloads drug structure external links and identifiers from DrugBank. Either all or subsets of data can be downloaded by specifying the DrugGroup:

ALL, APPROVED, EXPERIMENTAL, NUTRACEUTICAL, ILLICIT, WITHDRAWN, INVESTIGATIONAL.

The structure external links datasets include drug structure information in the form of InChI/InChI Key/SMILES as well as identifiers for other drug-structure resources (such as ChEBI, ChEMBL, ChemSpider, BindingDB, etc.). Each dataset also includes the PubChem Compound ID (CID) and the PubChem Substance ID (SID) for the given DrugBank record.

These DrugBank datasets are released under a Creative Commons Attribution-NonCommercial 4.0 International License. They can be used freely in your non-commercial application or project. A DrugBank user account and authentication are required to download these datasets.

Parameters:

drugGroup : str

specific dataset to be downloaded; must be in the pre-defined DRUG_GROUP list

username : str

DrugBank username

password : str

DrugBank password

Returns:

dataset

DrugBank Dataset

References

External Drug Links:
https://www.drugbank.ca/releases/latest#external-links

Examples

Get dataset of external links and identifiers of approved drugs:

>>> username = "<your DrugBank username>"
>>> password = "<your DrugBank password>"
>>> drugLinks = get_drug_links("APPROVED", username, password)
>>> drugLinks.show()
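Because the group argument must match one of the pre-defined names, it can help to check it before attempting a download. The following is a minimal sketch of such a check; the helper name normalize_drug_group is hypothetical, and the tuple simply mirrors the group names listed in this section rather than being imported from the module.

```python
# Group names as listed in this section (copied here, not imported from the module).
DRUG_GROUP = (
    "ALL",
    "APPROVED",
    "EXPERIMENTAL",
    "NUTRACEUTICAL",
    "ILLICIT",
    "WITHDRAWN",
    "INVESTIGATIONAL",
)


def normalize_drug_group(drug_group):
    """Return the upper-cased group name, or raise ValueError if it is unknown."""
    candidate = drug_group.strip().upper()
    if candidate not in DRUG_GROUP:
        allowed = ", ".join(DRUG_GROUP)
        raise ValueError(f"unknown drug group {drug_group!r}; expected one of: {allowed}")
    return candidate
```

Failing early with the list of allowed values avoids spending a request on a name that cannot match.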

Downloads drug target external links and identifiers from DrugBank. Either all or subsets of data can be downloaded by specifying the DrugGroup:

ALL, APPROVED, EXPERIMENTAL, NUTRACEUTICAL, ILLICIT, WITHDRAWN, INVESTIGATIONAL,
or the DrugType:
SMALL_MOLECULE, BIOTECH.

The drug target external links datasets include drug name, drug type (small molecule, biotech), UniProtID and UniProtName.

These DrugBank datasets are released under the Creative Commons Attribution-NonCommercial 4.0 International License. They can be used freely in your non-commercial application or project. A DrugBank user account and authentication are required to download these datasets.

Parameters:

drug : str

specific dataset to be downloaded; must be in either the DrugGroup list or the DrugType list

username : str

DrugBank username

password : str

DrugBank password

Returns:

dataset

DrugBank Dataset

References

Target Drug-UniProt:
https://www.drugbank.ca/releases/latest#external-links

Examples

Get dataset of drug target external links and identifiers of all drugs in DrugBank:

>>> username = "<your DrugBank username>"
>>> password = "<your DrugBank password>"
>>> drugTargetLinks = get_drug_target_links("ALL",
...                                         username,
...                                         password)
>>> drugTargetLinks.show()
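Since this function accepts either a DrugGroup or a DrugType value through a single argument, a caller may want to know which kind of selector a given string is. The sketch below shows one way to tell them apart; selector_kind is a hypothetical helper, and both tuples are copied from the names in this section, not taken from the module's source.

```python
# Selector names as listed in this section (copied here, not imported from the module).
DRUG_GROUP = (
    "ALL",
    "APPROVED",
    "EXPERIMENTAL",
    "NUTRACEUTICAL",
    "ILLICIT",
    "WITHDRAWN",
    "INVESTIGATIONAL",
)
DRUG_TYPE = ("SMALL_MOLECULE", "BIOTECH")


def selector_kind(drug):
    """Classify a selector string as 'DrugGroup' or 'DrugType'."""
    if drug in DRUG_GROUP:
        return "DrugGroup"
    if drug in DRUG_TYPE:
        return "DrugType"
    raise ValueError(f"{drug!r} is neither a DrugGroup nor a DrugType")
```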

Downloads the DrugBank Open Data dataset with drug structure external links and identifiers. See DrugBank.

This dataset contains drug common names, synonyms, CAS numbers, and Standard InChIKeys.

The DrugBank Open Data dataset is a public domain dataset that can be used freely in your application or project (including commercial use). It is released under a Creative Commons CC0 International License.

Returns:

dataset

DrugBank Dataset

References

Open Data dataset. https://www.drugbank.ca/releases/latest#open-data

Examples

Get DrugBank open dataset:

>>> openDrugLinks = DrugBankDataset.get_open_drug_links()
>>> openDrugLinks.show()
+----------+--------------------+-----------+--------------------+
|DrugBankID|          Commonname|        CAS|    StandardInChIKey|
+----------+--------------------+-----------+--------------------+
|   DB00006|         Bivalirudin|128270-60-0|OIRCOABEOLEUMC-GE...|
|   DB00014|           Goserelin| 65807-02-5|BLCLNMBMMGCOAS-UR...|
+----------+--------------------+-----------+--------------------+
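The identifier columns are what make these datasets useful for integration with other resources. As a plain-Python sketch of that idea, the example below joins the two sample rows shown above to a second table on the CAS number. The second table and its "note" field are hypothetical placeholders, the truncated StandardInChIKey values are left out rather than guessed, and join_on is a helper written for this example only.

```python
# Rows copied from the sample output above; the truncated
# StandardInChIKey column is omitted rather than completed.
open_links = [
    {"DrugBankID": "DB00006", "Commonname": "Bivalirudin", "CAS": "128270-60-0"},
    {"DrugBankID": "DB00014", "Commonname": "Goserelin", "CAS": "65807-02-5"},
]

# Hypothetical second resource keyed by CAS number (placeholder values).
other_resource = [
    {"CAS": "65807-02-5", "note": "example annotation"},
]


def join_on(left, right, key):
    """Inner-join two lists of dicts on a shared key."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    joined = []
    for row in left:
        for match in index.get(row[key], []):
            joined.append({**row, **match})
    return joined


merged = join_on(open_links, other_resource, "CAS")
```

With a Spark dataset the same integration would normally be expressed as a join on the shared identifier column; the list-of-dicts version here only illustrates the logic.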