ChEMBL is one of the big public databases. It has lots of useful data.
If you are comfortable with Python, pychembldb is a good tool.
I like that Python package.
I also found another ChEMBL package, “chembl_webresource_client”.
The package can be installed with pip.
iwatobipen$ sudo pip install chembl_webresource_client
Yeah, it's easy. OK, let’s get data from Python!
For example, the following code gets data about inhibitors of the tyrosine kinase c-Met.
The ChEMBL assay ID used here is CHEMBL1003887.
from chembl_webresource_client import *
assay = AssayResource()
comp = CompoundResource()
target = TargetResource()
bio_act = assay.bioactivities("CHEMBL1003887")
print bio_act[0]
{u'activity_comment': u'Unspecified',
 u'assay_chemblid': u'CHEMBL1003887',
 u'assay_description': u'Inhibition of MET',
 u'assay_type': u'B',
 u'bioactivity_type': u'IC50',
 u'ingredient_cmpd_chemblid': u'CHEMBL509101',
 u'name_in_reference': u'2',
 u'operator': u'=',
 u'organism': u'Homo sapiens',
 u'parent_cmpd_chemblid': u'CHEMBL509101',
 u'reference': u'J. Med. Chem., (2008) 51:17:5330',
 u'target_chemblid': u'CHEMBL3717',
 u'target_confidence': 8,
 u'target_name': u'Hepatocyte growth factor receptor',
 u'units': u'nM',
 u'value': u'1.8'}
Each result comes back as a Python dict. If you use pandas, the list of dicts can be converted to a DataFrame.
import pandas as pd
df = pd.DataFrame(bio_act)
In [15]: df.head()
Out[15]:
  activity_comment assay_chemblid  assay_description assay_type  \
0      Unspecified  CHEMBL1003887  Inhibition of MET          B
1      Unspecified  CHEMBL1003887  Inhibition of MET          B
2      Unspecified  CHEMBL1003887  Inhibition of MET          B
3      Unspecified  CHEMBL1003887  Inhibition of MET          B
4      Unspecified  CHEMBL1003887  Inhibition of MET          B

  bioactivity_type ingredient_cmpd_chemblid name_in_reference operator  \
0             IC50             CHEMBL509101                 2        =
1             IC50             CHEMBL459876                49        =
2             IC50             CHEMBL459875                 4        =
3             IC50             CHEMBL508403                47        =
4             IC50             CHEMBL451789                40        =

       organism parent_cmpd_chemblid                         reference  \
0  Homo sapiens         CHEMBL509101  J. Med. Chem., (2008) 51:17:5330
1  Homo sapiens         CHEMBL459876  J. Med. Chem., (2008) 51:17:5330
2  Homo sapiens         CHEMBL459875  J. Med. Chem., (2008) 51:17:5330
3  Homo sapiens         CHEMBL508403  J. Med. Chem., (2008) 51:17:5330
4  Homo sapiens         CHEMBL451789  J. Med. Chem., (2008) 51:17:5330

  target_chemblid  target_confidence                        target_name units  \
0      CHEMBL3717                  8  Hepatocyte growth factor receptor    nM
1      CHEMBL3717                  8  Hepatocyte growth factor receptor    nM
2      CHEMBL3717                  8  Hepatocyte growth factor receptor    nM
3      CHEMBL3717                  8  Hepatocyte growth factor receptor    nM
4      CHEMBL3717                  8  Hepatocyte growth factor receptor    nM

  value
0   1.8
1   1.8
2   1.3
3   470
4  1400
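One thing to watch: the values come back as strings (u'1.8'), so they need converting before you can sort or filter on them. Here is a minimal sketch of that step. To keep it runnable offline, it rebuilds a small list of dicts from the rows shown above with only three of the columns; the 10 nM cutoff is just an arbitrary threshold I picked for illustration.

```python
import pandas as pd

# Sample rows copied from the output above (only the columns used here).
bio_act = [
    {"ingredient_cmpd_chemblid": "CHEMBL509101", "units": "nM", "value": "1.8"},
    {"ingredient_cmpd_chemblid": "CHEMBL459876", "units": "nM", "value": "1.8"},
    {"ingredient_cmpd_chemblid": "CHEMBL459875", "units": "nM", "value": "1.3"},
    {"ingredient_cmpd_chemblid": "CHEMBL508403", "units": "nM", "value": "470"},
    {"ingredient_cmpd_chemblid": "CHEMBL451789", "units": "nM", "value": "1400"},
]

df = pd.DataFrame(bio_act)

# Values arrive as strings; convert them, turning anything unparsable into NaN.
df["value"] = pd.to_numeric(df["value"], errors="coerce")

# Keep compounds with IC50 below the (arbitrary) 10 nM cutoff, most potent first.
potent = df[df["value"] < 10].sort_values("value")
print(potent[["ingredient_cmpd_chemblid", "value", "units"]])
```

With real data it is probably worth checking the units and operator columns too before comparing values across rows.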
It is also easy to get compound data.
c = comp.get(df.ingredient_cmpd_chemblid[0])
print c
{u'smiles': u'Fc1ccc(cc1)N2C=CC=C(C(=O)Nc3ccc(Oc4ccnc5[nH]ccc45)c(F)c3)C2=O',
 u'chemblId': u'CHEMBL509101',
 u'passesRuleOfThree': u'No',
 u'molecularWeight': 458.42,
 u'molecularFormula': u'C25H16F2N4O3',
 u'acdLogp': 1.92,
 u'stdInChiKey': u'OBSFXHDOLBYWRJ-UHFFFAOYSA-N',
 u'acdLogd': 1.91,
 u'knownDrug': u'No',
 u'medChemFriendly': u'No',
 u'rotatableBonds': 5,
 u'acdAcidicPka': 10.72,
 u'alogp': 3.52,
 u'numRo5Violations': 0,
 u'species': u'NEUTRAL',
 u'acdBasicPka': 5.46}
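If you want properties for several compounds at once, one option is to loop over the IDs and collect a few fields into a DataFrame. Below is a rough sketch of that idea. The names compounds_to_frame and fake_get are my own, made up for illustration; fake_get is a stand-in for comp.get so the code runs without network access, and the record it returns is a subset of the one printed above.

```python
import pandas as pd

def compounds_to_frame(chembl_ids, fetch,
                       fields=("chemblId", "smiles", "molecularWeight", "alogp")):
    """Fetch one record per unique ID and keep only the selected fields.

    `fetch` is any callable that maps a ChEMBL ID to a dict
    (hypothetical helper, not part of the client library).
    """
    rows = []
    for cid in dict.fromkeys(chembl_ids):  # drop duplicate IDs, keep order
        record = fetch(cid)
        rows.append({f: record.get(f) for f in fields})
    return pd.DataFrame(rows, columns=list(fields))

# Offline stand-in for comp.get, holding the single record shown above.
known = {
    "CHEMBL509101": {
        "chemblId": "CHEMBL509101",
        "smiles": "Fc1ccc(cc1)N2C=CC=C(C(=O)Nc3ccc(Oc4ccnc5[nH]ccc45)c(F)c3)C2=O",
        "molecularWeight": 458.42,
        "alogp": 3.52,
    }
}

def fake_get(cid):
    return known[cid]

frame = compounds_to_frame(["CHEMBL509101", "CHEMBL509101"], fake_get)
print(frame)
```

With the real client you would pass comp.get instead of fake_get, assuming it returns a dict per ID as in the example above.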
Easy to use. ;-)