It was just memorandum for my self.
PostgreSql can save sql result as csv using COPY( sql statement ) TO ‘filelocation/filename’
And ‘\i file.sql’ command can execute SQL from file.
https://www.postgresql.org/docs/9.1/static/app-psql.html
COPY( SELECT AC.DOC_ID, AC.MOLREGNO, ST.MOLREGNO, ST.CANONICAL_SMILES, CP.QED_WEIGHTED, CP.ALOGP, CP.MOLECULAR_SPECIES, AC.STANDARD_TYPE, AC.STANDARD_RELATION, AC.STANDARD_VALUE, AC.STANDARD_UNITS, TD.PREF_NAME, BS.SITE_NAME, PCS.PROTEIN_CLASS_SYNONYM, PCS.SYN_TYPE FROM ASSAYS AS A, ACTIVITIES AS AC, COMPOUND_STRUCTURES AS ST, COMPOUND_PROPERTIES AS CP, TARGET_DICTIONARY AS TD, BINDING_SITES AS BS, PROTEIN_CLASS_SYNONYMS AS PCS, COMPONENT_CLASS AS CC, COMPONENT_SEQUENCES AS CS, TARGET_COMPONENTS AS TC WHERE A.ASSAY_ID=AC.ASSAY_ID AND ST.MOLREGNO=AC.MOLREGNO AND TD.TID = A.TID AND TD.TID = BS.TID AND TD.TID = TC.TID AND ST.MOLREGNO = CP.MOLREGNO AND TC.COMPONENT_ID = CS.COMPONENT_ID AND CC.COMPONENT_ID = CS.COMPONENT_ID AND CC.PROTEIN_CLASS_ID = PCS.PROTEIN_CLASS_ID AND CONFIDENCE_SCORE >= 9 AND ASSAY_TYPE='B' AND AC.STANDARD_RELATION='=' AND AC.STANDARD_TYPE = 'IC50' AND AC.STANDARD_UNITS='nM' AND TD.TARGET_TYPE='SINGLE PROTEIN' AND CP.MW_FREEBASE <= 600 ) TO '/Users/iwatobipen/chembldata/chembl.csv' DELIMITER ',';
I’m totally new to extracting data from Chembl using sql. I downloaded chembl sql database. However, I’m not sure how to point to the database and use your script to extract. Could you please direct me how to do that? My main goal is to obtain chembl data for target prediction but I wanted to start from your script to learn how to do these extractions first. Thank you!
Hi, thanks for your comment. To run the script, you should install postgreSQL and create ChEMBL DB on it.
You can get ChEMBL DB from here. https://chembl.gitbook.io/chembl-interface-documentation/downloads
After installation, run the PostgreSQL on your PC, then login. Then you can run the script from console.
\i hogehoge.sql is command of postgreSQL console which executes script written in hogehoge.sql.
Pls let me know you have questions or troubles.
Also if you use ubuntu following URL will be help for you or if you use MacOX, you can install postgreSQL with homebrew.
https://www.digitalocean.com/community/tutorials/how-to-install-and-use-postgresql-on-ubuntu-18-04
Thanks,
Thank you for your help! I’ll do that and let you know if I run into any issues.
So I installed postgresql on linuxdownload the chembl postgresql database from the link you gave (i have chembl_25_postgresql.dmp downloaded)
in chembl postgresql installation instructions it asks to do 2 things:
I created a database name like this:
sudo -i -u postgres
psql
create database chembl_25
But I’m not sure what to give for HOST, PORT, USERNAME arguments in this commands
$> pg_restore -h HOST -p PORT -U USERNAME -d chembl_25 chembl_25_postgresql.dmp
Sorry these must be really basic sql database setup questions!
Thank you!
Hi Sue_chem,
If you want to use ChEMBLDB in you PC only, you don’t pass port and host option. FYI, postgresql uses port 5432.
If I could understand your situation correctly, now you have installed postgres SQL and got Chembldb.
Now you created postgresql user.
Following code is an example.
Then changed pg_hba.conf.
Please check following URL.
https://stackoverflow.com/questions/18664074/getting-error-peer-authentication-failed-for-user-postgres-when-trying-to-ge
Now ready, run the pg_restore.
OK check db.
Thank you so much for the detailed instructions! The linux environment I used is on a AWS instance. I then also tried to follow your instructions on my mac. Installed postgresql using brew and downloaded the chembl postgresql database.However systemctl wasn’t recognized on the mac. And seems like I can’t even get it on my mac. So I again went back to the linux instance on the AWS. Gave the host (ec2 id), port 5432 but got errors. I went back and checked on AWS /usr/lib/postgresql/9.3/bin/initdb is not even there. That folder has many other files but not initdb you used. I’m not sure what would be easier, try on Mac or the AWS linux instance. Sorry this is a way too complicated question to ask.
Hi, thank you for your prompt reply.
I’m sorry that I have no experience the installation on AWS. And systemctl is not available on MacOS. My understanding is correctly, you couldn’t find initdb. How about try to updatedb command from shell and search initdb?
https://stackoverflow.com/questions/44104241/postgresql-error-initdb-command-not-found
Here is an example for OSX postgresql installation.
https://www.codementor.io/engineerapart/getting-started-with-postgresql-on-mac-osx-are8jcopb
And there is another option I think. ChEBLDB is provided not only postgresql dump file and also sqlite3 dumpfile. If you use sqlite3 dumpfile, you don’t need to setup DB server.
Here is an example http://www.dalkescientific.com/writings/diary/archive/2017/03/24/chembl_bioactivity_data.html.
Also sqlite3 is available from python.
I hope this will help for you.
Thanks,
Thank you so much for all your suggestions! I appreciate it very very much! It’s more than I ever expected. I’m going to take all your suggestions and try to get it to work. So many things to try. Thank you!