Can a machine learn important features from SMILES?

Today I found a challenging article on arXiv. It describes SMILES2Vec. https://arxiv.org/pdf/1712.02034.pdf As you know, word2vec is a very attractive and major application in the ML area, and SMILES2Vec follows the same concept. It converts SMILES to vectors and learns which characters are important. The authors use “black box” models for building the model. I am not sure about “black… Continue reading “Can a machine learn important features from SMILES?”
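
As a rough illustration of the "SMILES to vector" idea in the excerpt above, the sketch below maps each SMILES character to an integer index and pads the result to a fixed length, which is the kind of input an embedding layer could consume. This is not the SMILES2Vec implementation: the function names, the purely character-level vocabulary (so `Cl` would be split into `C` and `l`), and the padding scheme are all assumptions made for illustration.

```python
from typing import Dict, List

PAD = 0  # index reserved for padding (an assumption of this sketch)


def build_vocab(smiles_list: List[str]) -> Dict[str, int]:
    """Assign an integer index (starting at 1) to every character seen."""
    chars = sorted({ch for smi in smiles_list for ch in smi})
    return {ch: i + 1 for i, ch in enumerate(chars)}


def encode(smiles: str, vocab: Dict[str, int], max_len: int) -> List[int]:
    """Convert a SMILES string to a fixed-length list of character indices."""
    ids = [vocab[ch] for ch in smiles[:max_len]]  # truncate long strings
    return ids + [PAD] * (max_len - len(ids))     # right-pad short strings


smiles_list = ["CCO", "c1ccccc1", "CC(=O)O"]
vocab = build_vocab(smiles_list)
encoded = [encode(smi, vocab, max_len=10) for smi in smiles_list]
print(encoded[0])
```

A model would then learn a vector for each index, so that the importance of individual characters can be inspected afterwards.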

QED calculation on RDKit 2017.09 #RDKit

QED (quantitative estimate of drug-likeness) is one of the drug-likeness scores, reported by the Hopkins group. https://www.ncbi.nlm.nih.gov/pubmed/22270643 The authors provided a QED calculator for Pipeline Pilot, so QED could not be calculated without Pipeline Pilot. But now we can calculate QED by using RDKit! RDKit 2017.09 implements a QED descriptor. Seems good, let’s use the… Continue reading “QED calculation on RDKit 2017.09 #RDKit”
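
A minimal sketch of what such a calculation can look like, assuming RDKit 2017.09 or later is installed. `QED.qed` is the function in `rdkit.Chem.QED`; the helper name `qed_score` and the aspirin SMILES are illustrative choices and are not taken from the original post.

```python
from rdkit import Chem
from rdkit.Chem import QED


def qed_score(smiles: str) -> float:
    """Return the QED value of the molecule described by a SMILES string."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:  # RDKit returns None when the SMILES cannot be parsed
        raise ValueError(f"Could not parse SMILES: {smiles!r}")
    return QED.qed(mol)


aspirin = "CC(=O)Oc1ccccc1C(=O)O"  # example input chosen for this sketch
score = qed_score(aspirin)
print(f"QED = {score:.3f}")
```

QED values lie between 0 and 1, with higher values indicating more drug-like molecules.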

Create MMPDB ( matched molecular pair )!

Matched molecular pair analysis is a very common method for medicinal chemists to analyze SAR. There are lots of publications about it and applications in this area. I often use rdkit/Contrib/mmpa to make my own MMP dataset. The origin of the algorithm is described at the following URL. https://www.ncbi.nlm.nih.gov/pubmed/20121045 Yesterday, good news was announced by @RDKit_org. It is… Continue reading “Create MMPDB ( matched molecular pair )!”

3d conformer fingerprint calculation using RDKit # RDKit

Recently, an attractive article was published in an ACS journal. The article describes how to calculate a 3D-structure-based fingerprint and compares it with some fingerprints that are well known in this area. The new method, called “E3FP”, is an algorithm to calculate a 3D conformer fingerprint like the Extended Connectivity Fingerprint (ECFP). E3FP encodes information not only on atoms that are connected but also… Continue reading “3d conformer fingerprint calculation using RDKit # RDKit”

Installing TensorFlow on Mac OS X with GPU support

Yesterday, I tried to install tensorflow-gpu on my Mac. My PC is a MacBook Pro (Retina, 15-inch, Mid 2014). The PC has an NVIDIA GPU. The OS is Sierra. Details are described at the following URL. https://www.tensorflow.org/install/install_mac I installed tensorflow directly by using the pip command. Almost done, but not finished yet. To finish the installation, I need to disable… Continue reading “Installing TensorFlow on Mac OS X with GPU support”

Open drug discovery toolkit for python

Recently, there are lots of Python libraries for chemoinformatics and machine learning. One of my favorites is RDKit. ;-) This area is still active. Today I tried a new library named “ODDT”, the Open Drug Discovery Toolkit. The reference URL is https://jcheminf.springeropen.com/articles/10.1186/s13321-015-0078-2. ODDT is well documented at http://oddt.readthedocs.io/en/latest/index.html?highlight=InteractionFingerprint. ⭐️ ODDT implements shape and electronic similarities!! I… Continue reading “Open drug discovery toolkit for python”

Graph convolution classification with deepchem

I posted about graph convolution regression using deepchem. And today, I tried graph convolution classification using deepchem. The code is almost the same as for the regression model. The only difference is to use dc.models.MultitaskGraphClassifier instead of dc.models.MultitaskGraphRegressor. I got sample (JAK3 inhibitor) data from ChEMBL and tried to make a model. At first I used pandas… Continue reading “Graph convolution classification with deepchem”

Graph convolution regression with deepchem

Some days ago, I posted a blog entry about deepchem. I am still playing with deepchem. Today I tried to use the graph convolution regression model. Deepchem provides a graph convolution regressor. Cool. I used solubility data provided by AstraZeneca. https://www.ebi.ac.uk/chembl/assay/inspect/CHEMBL3301364 My test code follows. It is almost the same as deepchem’s example code. The CSVLoader method is very useful because it can… Continue reading “Graph convolution regression with deepchem”

integration of spotfire and pdb viewer

Some years ago, I heard a presentation at JCUP about implementing a PDB viewer in Spotfire. It was really impressive for me because Spotfire cannot handle PDB files. You know, Spotfire is one of the popular tools for data visualization. I like the tool. Recently I found a unique library for Spotfire named ‘JSViz’. The… Continue reading “integration of spotfire and pdb viewer”

how to get molecular graph features

Belatedly, I have become interested in deepchem, an open-source deep learning toolkit for drug discovery. Deepchem supports many features for chemoinformatics, and one of the interesting features is the calculation of molecular graphs. It is more primitive than a hashed fingerprint. I tried to calculate it. Currently the toolkit supports only Linux, so I installed deepchem… Continue reading “how to get molecular graph features”

Target prediction using local ChEMBL

Yesterday, I posted about target prediction using the ChEMBL DB web API. If I want to predict targets for many molecules, it will take a long time. So I changed the code to use a local ChEMBL DB. I used SQLAlchemy because the library is powerful and flexible enough to use with any RDB. The test code follows. The sample code needs a SMILES string… Continue reading “Target prediction using local ChEMBL”

Target prediction using ChEMBL

You know, there are some publicly available databases in the chemoinformatics area. ChEMBL DB is one of the useful ones. George Papadatos introduced a useful tool for target prediction using ChEMBL. He provided a ChEMBL target prediction model via an FTP server! So everyone can use the model. I used the model and tried to… Continue reading “Target prediction using ChEMBL”

Draw molecule with atom index in RDKit

I found an interesting topic on the rdkit-discuss mailing list: how to draw a molecule with atom indices. Greg, the developer of RDKit, answered with a tip for doing it: use molAtomMapNumber. https://sourceforge.net/p/rdkit/mailman/message/31663468/ I didn’t know that! I tried it on my PC. RDKit can draw molecules easily using IPythonConsole. Test with a kinase inhibitor. Draw the molecule. https://github.com/iwatobipen/chemo_info/blob/master/rdkit_notebook/drawmol_with%2Bidx.ipynb
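
The tip can be sketched roughly as follows, assuming a recent RDKit. `Atom.SetAtomMapNum` stores the atom map number (the `molAtomMapNumber` property mentioned above), which RDKit shows next to each atom when the molecule is drawn. The helper name `mol_with_atom_index` and the phenol SMILES are illustrative only; the kinase inhibitor from the notebook is not reproduced here.

```python
from rdkit import Chem


def mol_with_atom_index(mol):
    """Return a copy of mol whose atom map numbers equal the atom indices."""
    labeled = Chem.Mol(mol)  # copy, so the input molecule stays unchanged
    for atom in labeled.GetAtoms():
        # A map number of 0 means "no map number", so atom 0 carries no label.
        atom.SetAtomMapNum(atom.GetIdx())
    return labeled


mol = Chem.MolFromSmiles("c1ccccc1O")  # phenol, a placeholder molecule
labeled = mol_with_atom_index(mol)
print(Chem.MolToSmiles(labeled))  # the SMILES now carries the map numbers
```

In a Jupyter notebook, importing `IPythonConsole` from `rdkit.Chem.Draw` and evaluating `labeled` in a cell should render the structure with the indices shown.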

Convert rdkit molecule object to igraph graph object.

Molecules are often handled as graphs in chemoinformatics. There are some libraries for graph analysis in Python. Today, I wrote a sample script that converts a molecule to a graph. I used python-igraph and RDKit. RDKit has a method to get the adjacency matrix from a molecule, so I used that method. The code follows. Now test it. Seems… Continue reading “Convert rdkit molecule object to igraph graph object.”

Convert chemical file format.

Recently I learned about the KCF file format. The format represents molecules as a graph structure, and it is used in KEGG. KCF uses atom labels and orientation, and bond information. RDKit and Open Babel cannot convert SDF to KCF. For anyone who wants to convert SDF to KCF, the KEGG site provides an API for file format conversion. Example… Continue reading “Convert chemical file format.”

Build regression model in Keras

I introduced Keras at mishimasyk#9. My presentation was about how to build a classification model in Keras. A participant asked me how to build a regression model in Keras. I could not answer his question. After syk#9, I searched the Keras API and found a good method. Keras has a scikit-learn API, and the API can build a regression model. ;-) … Continue reading “Build regression model in Keras”

Cool web based data analytical platform

Yesterday, I enjoyed mishima.syk#9. ;-) I hope all participants also enjoyed the meeting. BTW, I found a cool platform for data analysis named “Superset”. https://github.com/airbnb/superset You can see a cool review in README.md. If you want to install Superset, it is very easy. For example, on macOS, just use pip! Now you can access localhost:8088. … Continue reading “Cool web based data analytical platform”

GLARE algorithm using RDKit with python3

The current version of RDKit has many tools, and I am interested in the GLARE algorithm. https://github.com/rdkit/rdkit/blob/master/Contrib/Glare/glare.py This algorithm is used to generate good-quality libraries from a large set of reagents. The key point of the method is to precalculate reagent properties and sum the values for each product. So it does not need to calculate product properties on… Continue reading “GLARE algorithm using RDKit with python3”
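
The additive-property idea described above can be sketched in plain Python as follows. This shows only the "precalculate per reagent, then sum per product" step, not the GLARE pruning procedure or the RDKit Contrib script. All reagent names, property values, the scaffold offset, and the function names are hypothetical.

```python
from itertools import product
from typing import Dict, List, Tuple

# Hypothetical precomputed property values (e.g. molecular weight) per reagent.
amines = {"amine_a": 45.0, "amine_b": 87.0}
acids = {"acid_a": 60.0, "acid_b": 122.0, "acid_c": 180.0}
SCAFFOLD_MW = 150.0  # hypothetical constant contribution of the scaffold


def estimate_products(reagent_sets: List[Dict[str, float]],
                      offset: float = 0.0) -> Dict[Tuple[str, ...], float]:
    """Estimate each product's property as the sum of its reagents' values."""
    estimates = {}
    for combo in product(*(rs.items() for rs in reagent_sets)):
        names = tuple(name for name, _ in combo)
        estimates[names] = offset + sum(value for _, value in combo)
    return estimates


def filter_products(estimates: Dict[Tuple[str, ...], float],
                    upper: float) -> List[Tuple[str, ...]]:
    """Keep the reagent combinations whose estimate is at most upper."""
    return [names for names, value in estimates.items() if value <= upper]


estimates = estimate_products([amines, acids], offset=SCAFFOLD_MW)
good = filter_products(estimates, upper=350.0)
print(len(estimates), len(good))
```

Because each estimate is a plain sum of stored values, no product structure has to be built or parsed to apply a property filter.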