Think about de novo molecule generation #memo #journal #RDKit #CReM

Recently there have been many publications about de novo molecular generators, most of which use deep learning. One problem with this approach is that the generated molecules are not systematic, so it is difficult to synthesize them with parallel chemistry. I think that is why chemists sometimes dislike the proposals generated by these methods. Rule or Rxn or MMP based molecule …

Use RDKit from Nim language #RDKit #Nim #memo

Recently I bought a book titled ‘Nim in Action’ and started to learn the Nim language. I had never touched a language like Nim, which has to be compiled before the code can run. I was interested in Nim because many documents say it is fast and efficient, and its grammar looks like Python. And also, rdkit binding …

Python package for Ensemble learning #Chemoinformatics #Scikit learn

Ensemble learning is a machine learning technique. I wrote a post about blending before. URL is below. https://iwatobipen.wordpress.com/2018/11/11/ensemble-learning-with-scikit-learn-and-xgboost-machine-learning/ I implemented the code by myself at that time. Ensemble learning sometimes outperforms a single model, so it is worth trying. Fortunately, now we can use ensemble learning very easily by using a …

Small molecule MD with openMM #MD #Openforcefield

I updated openforcefield from ver 0.5 to ver 0.6. The SMIRNOFF force field has also been updated, so I tried the new version of OpenFF. At first, I calculated partial charges with the semi-empirical method ‘AM1-BCC’. AmberTools is used for the calculation, and it is easy. Once it finished, I checked the result. Nitrogen has the most negative charge and neighbor …

Useful package for descriptor calculation #chemoinformatics #rdkit

Descriptor calculation is an important task in chemoinformatics. I often use rdkit to do it. Today I found a very useful package for descriptor calculation named descriptastorus. URL is below. https://github.com/bp-kelley/descriptastorus It is very easy to install the package. Just run the following command. After that, I could use the package. By using the …

Calculate solvent effect in Psi4 #psi4 #quantumchemistry

Recently I have been using not only chemoinformatics tools but also quantum chemistry tools; my favorite is Psi4. Psi4 has many options and plug-ins for quantum calculations. Most calculations are set up in vacuum, but that is not what happens in reality. So it is important to consider the solvent around the molecules. Can psi4 perform calculations with solvent effects? Yes! PCMSolver is plugin …

Psikit update/Draw ESP, HOMO LUMO #RDKit #Chemoinformatics #quantumchemistry

I just updated psikit, which is a package for quantum-chemoinformatics ;) It can be installed from conda / pypi :) I added and updated functions for rendering molecular properties. The current version of psikit can draw not only frontier orbitals but also ESP and the dual descriptor. The dual descriptor is calculated by psi4. What is dual descriptor? …

Enumerate partial heteroaromatic rings in a molecule #RDKit #Chemoinformatics

I posted about hetero shuffling before. It worked well but was redundant. There is nice code in the RDKit UGM2017 material. URL is below. https://github.com/rdkit/UGM_2017/blob/master/Notebooks/Cole-Enumerate-Heterocycles.ipynb The code defines the transformations by hard coding and seems nice. In a real project, we sometimes would like to enumerate against a partial substructure rather than the whole structure. I thought how to …

Current version Openforcefield supports rdkit #RDKit #Openforcefield #chemoinformatics

I posted about openforcefield (OpenFF) before. As you know, the old version of openff supported only the OpenEye toolkit, but the current version supports RDKit too. It is worth knowing that we can use openff with an open source toolkit. I really appreciate the developers’ work! It is great. Today I use the package and ipymol which can control pymol in …

Adjust bond length and align molecule to scaffold #RDKit #chemoinformatics

GenerateDepictionMatching2DStructure is useful for aligning an RDKit mol object to a given scaffold. But a few days ago I talked with my colleague about the fact that the function cannot adjust bond lengths when the scaffold is read from an SDF and its bond lengths differ from RDKit’s default settings. I found useful information about rdMolTransforms in rdkit-discuss. rdMolTransforms.TransformConformer method …
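The bond-length mismatch described above comes down to a uniform scaling of the scaffold coordinates. The following is only a minimal pure-Python sketch of that idea, not the code from the post: the helper names `mean_bond_length` and `rescale_coords`, the toy scaffold, and the target length of 1.5 (which I assume matches RDKit's default 2D depiction bond length) are all assumptions for illustration. In RDKit the same scaling could presumably be written as a 4x4 matrix and applied with rdMolTransforms.TransformConformer.

```python
import math


def mean_bond_length(coords, bonds):
    """Average distance between the bonded atom pairs listed in `bonds`."""
    return sum(math.dist(coords[i], coords[j]) for i, j in bonds) / len(bonds)


def rescale_coords(coords, bonds, target=1.5):
    """Scale 2D coords about their centroid so the mean bond length equals `target`.

    `target=1.5` is an assumed default; check the value used by your toolkit.
    """
    factor = target / mean_bond_length(coords, bonds)
    n = len(coords)
    cx = sum(x for x, _ in coords) / n
    cy = sum(y for _, y in coords) / n
    return [((x - cx) * factor + cx, (y - cy) * factor + cy) for x, y in coords]


# Hypothetical scaffold: three atoms in a chain drawn with bond length 1.0.
scaffold_coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
scaffold_bonds = [(0, 1), (1, 2)]

scaled = rescale_coords(scaffold_coords, scaffold_bonds)
print(mean_bond_length(scaled, scaffold_bonds))
```

Scaling about the centroid keeps the scaffold in place, so only the bond lengths change before the alignment step.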

Open Source Lilly’s Chemoinformatics Package

In 2012, Lilly researchers published the Lilly MedChem Rules in J. Med. Chem. and disclosed their code on GitHub. Since the publication, the rules have been used in many papers and chemoinformatics applications. The open source tool made a big impact on chemoinformatics. Several hours ago I found an interesting tweet from @jcheminf. They reported an algorithm of …

Make interactive chemical space plot in jupyter notebook #cheminformatics #Altair

I often use seaborn for data visualization. With the library, users can make beautiful visualizations. BTW, today I tried another library that can make interactive plots in jupyter notebook. The name of the library is ‘altair’. https://altair-viz.github.io/index.html The library can be installed from pip or conda, and the package is based on Vega and Vega-Lite. Vega …

Build stacking Classification QSAR model with mlxtend #chemoinformatics #mlxtend #RDKit

I posted about the ML method named ‘blending’ a few days ago, and a reader recommended that I try “mlxtend”. When I was learning about ensemble learning packages in Python I had found it but never used it. So I tried the library to build a model. Mlxtend is easy to install and good document is provided …

Vote Vote Vote #chemoinformatics

A few days ago, I posted about an ensemble classification method named ‘blending’. The method is not implemented in scikit-learn, so I am implementing the function now. By the way, scikit-learn has an ensemble classification method named ‘VotingClassifier’. https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.VotingClassifier.html#sklearn.ensemble.VotingClassifier The following explanation is from the sklearn documentation. The idea behind the VotingClassifier is to combine conceptually different machine learning classifiers and …
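To make the voting idea concrete, here is a minimal pure-Python sketch of hard (majority) voting over the label predictions of several classifiers. It is not scikit-learn's implementation; the function name `hard_vote` and the toy prediction lists are made up for illustration.

```python
from collections import Counter


def hard_vote(predictions):
    """Return the majority label per sample.

    `predictions` is a list with one label list per classifier; all lists
    must have the same length. With an even number of voters a tie can
    occur, and Counter then returns the label it counted first.
    """
    voted = []
    for labels in zip(*predictions):
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted


# Hypothetical predictions from three different classifiers on four samples.
preds_a = [1, 0, 1, 1]
preds_b = [1, 1, 0, 1]
preds_c = [0, 0, 1, 1]

majority = hard_vote([preds_a, preds_b, preds_c])
print(majority)
```

Each sample ends up with the label that at least two of the three classifiers agree on, which is why an odd number of voters is convenient for binary problems.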

Visualize pharmacophore in RDKit #RDKit

RDKit has a pharmacophore feature assignment function. The function can retrieve molecular features based on predefined pharmacophore (ph4core) definitions. And the RDKit IPythonConsole can draw molecules in an ipython notebook. Today I tried to visualize pharmacophores in a notebook. The code is very simple. First, load the feature definitions. Then calculate the pharmacophore and compute 2D coordinates. Next I defined a drawing function. To highlight …

Generate possible list of SMILES with RDKit #RDKit

In computer vision, data augmentation is often used to obtain a large data set. On the other hand, canonical SMILES representations are used in the chemoinformatics area. At last year's RDKit UGM, Dr. Esben proposed a new approach for RNNs with SMILES. He expanded 602 training molecules to almost 8000 molecules with different smiles representation …

Tracking progress of machine learning #MachineLearning

Machine learning requires optimizing hyperparameters. For example, scikit-learn provides a grid search method. And as you know, there are several packages for this, such as hyperopt and GPyOpt. How do you manage the models you have built? It is difficult for me. Recently I have become interested in mlflow. MLflow is an …
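As a rough illustration of why run tracking matters, the following pure-Python sketch performs an exhaustive grid search and keeps a record of every run. It is neither scikit-learn's grid search nor MLflow's API; `grid_search`, `toy_score`, and the parameter names are hypothetical stand-ins, and in practice the score would come from cross-validation.

```python
from itertools import product


def grid_search(score_fn, param_grid):
    """Score every parameter combination and return (best_run, all_runs)."""
    names = list(param_grid)
    runs = []
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        # Keep every run, not only the best one, so results can be compared later.
        runs.append({"params": params, "score": score_fn(**params)})
    best = max(runs, key=lambda run: run["score"])
    return best, runs


def toy_score(depth, lr):
    """Hypothetical stand-in for a validation score; peaks at depth=3, lr=0.1."""
    return -((depth - 3) ** 2) - (lr - 0.1) ** 2


best, runs = grid_search(toy_score, {"depth": [1, 3, 5], "lr": [0.01, 0.1, 1.0]})
print(best["params"], len(runs))
```

Even this toy grid produces nine runs; the list of parameter and score records is the kind of bookkeeping that a tracking tool takes over once the number of models grows.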

Read maestro format file from RDKit

I think RDKit users know that Schrodinger contributes to RDKit. https://www.schrodinger.com/news/schr%C3%B6dinger-contributes-rdkit Schrodinger provides many computational tools for drug discovery, not only GUI tools but also a Python API. Many of the tools can be called from Python and also from RDKit. And, vice versa, RDKit can read Maestro files. It is as easy as reading SD files. I am …

Run rdkit and deep learning on Google Colab! #RDKit

If you cannot use a GPU on your PC, it is worth knowing that you can use a GPU and/or TPU on Google Colab. Currently you can use Google Colab for free. So, I would like to use rdkit on Google Colab and run deep learning on the app. Today I tried it. At first …

Make predictive models with small data and visualize it #Chemoinformatics

I enjoyed the chemoinformatics conference held in Kumamoto this week. On the first day of the conference, I heard a very interesting lecture. It was a very basic data handling and visualization tutorial, but useful for newcomers to chemoinformatics. I wanted to reproduce the code example, so I tried it. First, visualize training data. It …