Make DrugCentral ER diagram with python #chemoinfo

Recently I learned about a useful database, “DrugCentral”. From its About page: “DrugCentral provides information on active ingredients chemical entities, pharmaceutical products, drug mode of action, indications, pharmacologic action. We monitor FDA, EMA, and PMDA for new drug approval on regular basis to ensure currency of the resource.” By using the site, users can search a lot of information on the web…


Make MMP network and send to cytoscape #chemoinfo

Recently I have been using Cytoscape in my laboratory. As you know, Cytoscape is a nice tool for network visualization. I often prepare data with Python and import it into Cytoscape. The workflow is not so bad, but I have been thinking it would be nice if Python could communicate with Cytoscape directly. Fortunately, Cytoscape has a REST plugin called…

inter and intra reaction handling in RDKit #RDKit

RDKit can handle reactions. Enumerating many molecules from a reaction template and building blocks is useful for library generation. Recently I had a question about how to handle intramolecular reactions with RDKit, such as macrocyclization. For the amidation reaction, which is often used in drug synthesis, the reaction SMARTS is below. ‘[C:1][C:2](=[O:6])[O:3].[N:4][C:5]>>[C:1][C:2](=[O:6])[N:4][C:5]’…
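As a minimal sketch of the intermolecular case only, the amidation SMARTS quoted above can be run on two building blocks like this. Acetic acid and methylamine are my own example inputs, not taken from the post, and the intramolecular handling the post goes on to discuss is not shown here.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Reaction SMARTS quoted in the excerpt: acid + amine -> amide.
AMIDATION = "[C:1][C:2](=[O:6])[O:3].[N:4][C:5]>>[C:1][C:2](=[O:6])[N:4][C:5]"
rxn = AllChem.ReactionFromSmarts(AMIDATION)

# Example building blocks (assumption: chosen only for illustration).
acid = Chem.MolFromSmiles("CC(=O)O")
amine = Chem.MolFromSmiles("NC")

# RunReactants returns a tuple of product tuples, one per template match.
unique_products = set()
for product_set in rxn.RunReactants((acid, amine)):
    product = product_set[0]
    Chem.SanitizeMol(product)  # products come back unsanitized
    unique_products.add(Chem.MolToSmiles(product))

print(unique_products)
```

Collecting canonical SMILES in a set removes duplicates that appear when a template matches the same reactant in more than one way. Note that the unrestricted [O:3] in this template would also match an ester oxygen, so a stricter query may be needed for real libraries.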

RDKit 2018.03.01 release! #rdkit

Dear RDKitters, good news: a new version of RDKit has been released! You can find details in the original repository. There are many improvements and bug fixes in the release. I appreciate the developers’ work! Recent versions of RDKit have lots of 3D descriptors, such as PMI and NPR. And the new version has a new function, “ComputePrincipalAxesAndMoments”, that…

Install indigo tool kit to OSX and make python wrapper #Indigo #chemoinfo

I am not familiar with the Indigo toolkit; I have only used it via KNIME. Indigo provides a Python wrapper, so if I can build Indigo and its Python wrapper from source, I can do every task in Python alone (for me, at least). That sounds nice. So I tried to install Indigo from source. It…

Multi-armed Bandit problem

I am interested in reinforcement learning, but it is difficult for me. @_@ I tried to implement a very simple and famous problem called the ‘multi-armed bandit’. (Image from Wikipedia.) The multi-armed bandit problem is a problem in which a fixed limited set of resources must be allocated between competing (alternative) choices in a way that maximizes their…
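To make the problem concrete, here is a small sketch of one common strategy, epsilon-greedy, on Bernoulli arms. The excerpt does not say which strategy the post uses, so the strategy, the function name run_epsilon_greedy and the win probabilities are all my own assumptions for illustration.

```python
import random


def run_epsilon_greedy(true_probs, n_steps=5000, epsilon=0.1, seed=0):
    """Play Bernoulli arms with an epsilon-greedy policy (illustrative only)."""
    rng = random.Random(seed)
    n_arms = len(true_probs)
    counts = [0] * n_arms    # how many times each arm was pulled
    values = [0.0] * n_arms  # running mean reward of each arm
    total_reward = 0

    for _ in range(n_steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: pick a random arm
        else:
            arm = max(range(n_arms), key=lambda a: values[a])  # exploit

        reward = 1 if rng.random() < true_probs[arm] else 0
        counts[arm] += 1
        # Incremental update of the mean reward for the pulled arm.
        values[arm] += (reward - values[arm]) / counts[arm]
        total_reward += reward

    return counts, values, total_reward


# Hypothetical win probabilities for three arms.
counts, values, total_reward = run_epsilon_greedy([0.2, 0.5, 0.8])
print(counts, [round(v, 3) for v in values], total_reward)
```

The epsilon parameter sets the trade-off: a larger value spends more pulls exploring, a smaller value commits earlier to the arm that currently looks best.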

Edge Attention-based Multi-Relational GCN #pytorch #RDKit #DeepLearning

In chemoinformatics, molecules are represented as graphs, with atoms as nodes and bonds as edges. In the ML area, graph convolution is attracting a great deal of attention, I think. Today I would like to introduce a new approach proposed by Chao Shang’s group. They developed Edge Attention-based Multi-Relational Graph Convolutional Networks. URL…

Chemical space visualization and clustering with HDBSCAN and RDKit #RDKit

I caught the flu last Monday, so I am staying home and resting this week… :-( I have a fever and hope I will get better soon. BTW, I recently found a new dimensionality reduction technique called Uniform Manifold Approximation and Projection (UMAP). It was also a topic on my Twitter timeline. URL links of the original paper…

mol encoder with Pytorch

Variational Autoencoder (VAE) is a unique method used for learning latent representations. A VAE encodes a discrete vector into a continuous vector in latent space. There are lots of examples on GitHub. In 2016, Alán Aspuru-Guzik reported a new de novo design method using a VAE. The approach represents molecules as SMILES, and the SMILES strings are converted…

Get 3D feature descriptors from PDB file

If you are interested in drug discovery, chemoinformatics, deep learning or MD, I think you may want to read the article below. KDEEP is a predictor that uses deep learning (CNN) for affinity prediction. Through the article, I found a new Python library named HTMD (High-Throughput Molecular Dynamics). Really I am not good…

mol2vec analogy of word2vec #RDKit

Last year a researcher at BioMed X published the following article, and recently the article was published by ACS. The concept of mol2vec is the same as word2vec. Word2vec converts words to vectors using a large corpus and has shown success in NLP. Mol2vec converts molecules to vectors using ECFP information. Fortunately, the Mol2vec source…

Simple way for making SMILES file #RDKit

To convert SDF to SMILES, I write code like the following. With this approach, writing SMILES strings together with properties requires fetching each property with GetProp(“some prop”). If I need several properties, my code tends to get long. Greg, who is a developer of RDKit, advised me to use SmilesWriter. ;) I have…
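A rough sketch of the writer-based approach, using RDKit’s Chem.SmilesWriter, might look like this. The molecules, the property name “batch” and its values are placeholders I made up, and I write to an in-memory buffer instead of reading a real SDF so the example is self-contained.

```python
import io

from rdkit import Chem

# Placeholder molecules and a made-up "batch" property (illustration only).
records = [("CCO", "ethanol", "A"), ("c1ccccc1", "benzene", "B")]
mols = []
for smiles, name, batch in records:
    mol = Chem.MolFromSmiles(smiles)
    mol.SetProp("_Name", name)   # written to the Name column
    mol.SetProp("batch", batch)  # extra property to export
    mols.append(mol)

buffer = io.StringIO()
writer = Chem.SmilesWriter(buffer, delimiter="\t", includeHeader=True)
writer.SetProps(["batch"])  # list the properties once instead of calling GetProp
for mol in mols:
    writer.write(mol)
writer.close()  # flush before reading the buffer

text = buffer.getvalue()
print(text)
```

With molecules read from an SDF via Chem.SDMolSupplier, the same loop would apply; only the source of the molecules changes.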

API for opentargets

Associations between drug targets and diseases are important information for drug discovery. There are lots of databases that provide this information, I think. I like Python. ;-) So I am interested in the following article. Opentargets is “a data integration and visualization platform that provides evidence about the association of known and potential…

Build QSAR model with pytorch and rdkit #RDKit

There are many deep learning frameworks for Python, for example Chainer, Keras, Theano, TensorFlow and PyTorch. I have tried Keras, Chainer and TensorFlow for QSAR modeling, and this time I tried to build a QSAR model using PyTorch and RDKit. As you know, PyTorch builds dynamic neural networks in a “Define-by-Run” style, like Chainer. I used solubility data that is provided from…

Ultra fast clustering script with RDKit #RDKit

Some years ago, I came across a very useful tool for molecular clustering. Bayon is an ultra-fast clustering tool. The author wrote not only a Japanese tutorial but also an English one. The tool is easy to use, but to use bayon in chemoinformatics, the user needs to prepare the data. I wrote a simple script that converts SMILES to the bayon input format and…

Latent semantic analysis with python

Yesterday, I learned about gensim. Gensim is a free Python library for topic modeling. The library seems easy to use and implements lots of methods such as doc2vec and word2vec. First, I tried the basic tutorial for doc2vec and similarity queries. The code follows; it is almost the same as on the official site. In this tutorial, I found…

Calculate USRCAT with RDKit #RDKit

Some years ago, I posted about USRCAT. USRCAT is a shape-based method like ROCS, and it works very fast. The code was freely available, but to use it, users needed to install it. As you know, though, the new version of RDKit implements this function! That is good news, isn’t it? I tried…
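As a hedged sketch of how the RDKit implementation can be called: USRCAT needs 3D coordinates, so each molecule is embedded first. The two molecules, the helper name usrcat_descriptor and the random seed are my own choices for illustration, and a single embedded conformer is only a rough stand-in for a proper conformer search.

```python
from rdkit import Chem
from rdkit.Chem import AllChem, rdMolDescriptors


def usrcat_descriptor(smiles, seed=42):
    """Embed one 3D conformer and return its USRCAT descriptor (hypothetical helper)."""
    mol = Chem.AddHs(Chem.MolFromSmiles(smiles))
    if AllChem.EmbedMolecule(mol, randomSeed=seed) != 0:
        raise ValueError(f"3D embedding failed for {smiles}")
    return rdMolDescriptors.GetUSRCAT(mol)


# Example molecules (assumption: chosen only for illustration).
aspirin = usrcat_descriptor("CC(=O)Oc1ccccc1C(=O)O")
salicylic_acid = usrcat_descriptor("O=C(O)c1ccccc1O")

# Similarity score in (0, 1]; identical descriptors give 1.0.
score = rdMolDescriptors.GetUSRScore(aspirin, salicylic_acid)
print(len(aspirin), round(score, 3))
```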

Draw high quality molecular image in RDKit #rdkit

Recently, I wanted to draw high quality molecule images using RDKit. The PNG images from older versions of RDKit are not good enough for me. I found a solution on the RDKit discuss list. The discussion recommended installing cairocffi, so I installed cairocffi via conda. But… the result is still not good enough for me. (This is the case in my Mac environment. Linux…

QED calculation on RDKit 2017.09 #RDKit

QED (quantitative estimate of drug-likeness) is a drug-likeness score reported by the Hopkins group. The authors provided a QED calculator for Pipeline Pilot, so QED could not be calculated without Pipeline Pilot. But now we can calculate QED using RDKit! The QED descriptor was implemented in RDKit 2017.09. Seems good, let’s use the…
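A minimal sketch of the calculation with the rdkit.Chem.QED module follows. Aspirin is my own example input, not one taken from the post.

```python
from rdkit import Chem
from rdkit.Chem import QED

# Example molecule (assumption: aspirin, chosen only for illustration).
mol = Chem.MolFromSmiles("CC(=O)Oc1ccccc1C(=O)O")

# Overall drug-likeness score between 0 and 1.
score = QED.qed(mol)

# The underlying properties (MW, ALOGP, HBA, HBD, PSA, ROTB, AROM, ALERTS).
props = QED.properties(mol)

print(round(score, 3))
print(props)
```

Looking at QED.properties alongside the score helps to see which property pulls a molecule’s score down.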