Update RDKit and use new contrib #RDKit #Chemoinformatics

As chemoinformaticians know ;) a new version of RDKit was released recently. I appreciate the great work of the developers! One interesting point for me is that Free-Wilson (FW) analysis was added to Contrib. FW analysis is a traditional chemoinformatics approach, but I think it’s a really MedChem-friendly one. You can get lots of information…


New tool for automatic docking simulation with python #chemoinformatics #CADD #rdkit #vina

After my surgery, I ran 16 km today for the first time ;) The pace was slow, but I felt that I’m getting well. So I had a nice weekend… Ah, let’s get back to the chemoinformatics topic. Previously, I posted about how to run a docking study with AutoDock Vina via Python. I think it’s interesting because I…

CLI based FreeWilson Analysis #chemoinformatics #RDKit

I love @wpwalters’s great blog, ‘Practical Cheminformatics’. The blog is worth reading because it provides not only useful chemoinformatics knowledge but also code. One of my favorite posts is about Free-Wilson (FW) analysis. http://practicalcheminformatics.blogspot.com/2018/05/free-wilson-analysis.html BTW, if the SAR has an additive trend, FW is a useful tool for finding the best combination…
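To make the “additive trend” idea concrete, here is a minimal sketch of a Free-Wilson style fit using plain NumPy least squares. It is not the code from the linked post or from the RDKit Contrib; the R-group labels and pIC50 values are made-up toy data, and `encode` is a hypothetical helper.

```python
import itertools
import numpy as np

# Hypothetical toy SAR table: (R1 group, R2 group, pIC50).
# The values are invented and perfectly additive on purpose.
compounds = [
    ("H", "H", 5.0),
    ("H", "Cl", 5.5),
    ("H", "OMe", 5.8),
    ("Me", "H", 6.0),
    ("Me", "Cl", 6.5),
    ("F", "H", 5.3),
    ("F", "OMe", 6.1),
]

# "H" is the reference group at each position; every other group gets a column.
r1_groups = sorted({c[0] for c in compounds} - {"H"})
r2_groups = sorted({c[1] for c in compounds} - {"H"})


def encode(r1, r2):
    """Indicator row: intercept, then one flag per non-reference R group."""
    row = [1.0]
    row += [1.0 if r1 == g else 0.0 for g in r1_groups]
    row += [1.0 if r2 == g else 0.0 for g in r2_groups]
    return row


X = np.array([encode(r1, r2) for r1, r2, _ in compounds])
y = np.array([activity for _, _, activity in compounds])

# Least-squares fit: each coefficient is one R group's contribution.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

# Predict every R1/R2 combination that has not been made yet.
made = {(r1, r2) for r1, r2, _ in compounds}
all_pairs = itertools.product(["H"] + r1_groups, ["H"] + r2_groups)
candidates = [pair for pair in all_pairs if pair not in made]
predictions = {pair: float(np.dot(encode(*pair), coef)) for pair in candidates}
best = max(predictions, key=predictions.get)

for pair, value in sorted(predictions.items(), key=lambda kv: -kv[1]):
    print(pair, round(value, 2))
```

With real data the SAR is rarely perfectly additive, so the predictions for unmade combinations should be read as a ranking hint rather than exact values.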

Attentive FP with PyG #RDKit #PyG #pytorch_geometric #Chemoinformatics

As you know, PyG is a useful package for graph-based neural networks, just like DGL-LifeSci. Fortunately, recent versions of PyG are easy to install because conda is supported. So to install PyG, users don’t need to install related packages such as pytorch_scatter, pytorch-cluster, etc. And PyG has lots of predefined…

Run autodock-vina from ODDT #oddt #chemoinformatics #SBDD

I posted about the AutoDock Vina Python bindings. They’re useful for Python users because you can call AutoDock Vina from Python and run a docking study in your Python interpreter, Jupyter notebook, or script. I knew that ODDT (Open Drug Discovery Toolkit) supports virtual screening with Vina and has lots of useful methods for drug discovery…

Plot calibration curve with scikit-learn 1.0 #chemoinformatics #scikit-learn #memo

Recently scikit-learn ver. 1.0 (nightly build) was released. I often use sklearn for my tasks, so I would like to use the new version ;) The current stable version is 0.24, so I installed 1.0.rc2 via pip. Here are the release highlights and notes. https://scikit-learn.org/dev/auto_examples/release_highlights/plot_release_highlights_1_0_0.html https://scikit-learn.org/dev/whats_new/v1.0.html#changes-1-0 Ver. 1.0 adds a CalibrationDisplay class, which makes it easy to draw a calibration-curve plot. So I…
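As a rough illustration of what a calibration curve contains, the sketch below computes the points with `sklearn.calibration.calibration_curve` on synthetic data; the labels and probabilities are generated at random and are not from the post. The commented-out lines show how the same inputs could be passed to `CalibrationDisplay.from_predictions`, assuming scikit-learn 1.0 or later and matplotlib are installed.

```python
import numpy as np
from sklearn.calibration import calibration_curve

# Synthetic, well-calibrated predictions: each label is 1 with
# probability equal to the predicted probability (made-up data).
rng = np.random.default_rng(0)
y_prob = rng.uniform(0.0, 1.0, size=2000)
y_true = (rng.uniform(0.0, 1.0, size=2000) < y_prob).astype(int)

# Fraction of positives vs. mean predicted probability in each bin.
prob_true, prob_pred = calibration_curve(y_true, y_prob, n_bins=5)

for observed, predicted in zip(prob_true, prob_pred):
    print(f"mean predicted = {predicted:.2f}, observed fraction = {observed:.2f}")

# Plotting the same curve (assumes scikit-learn >= 1.0 and matplotlib):
# from sklearn.calibration import CalibrationDisplay
# CalibrationDisplay.from_predictions(y_true, y_prob, n_bins=5)
```

For a well-calibrated model the two columns stay close to each other, which is the diagonal line in the plot.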

Try to use exmol to explain why the model predicts it #chemoinfomratics #RDKit #exmol

One of the difficult points of ML predictive models for chemoinformatics tasks is explainability: why does the model assign these molecules to that class? Especially when we use a nonlinear model such as SVM, RF, or NN, this problem matters a lot in discussions with chemists, because chemists would like to know why…

Convert bit-vector to comma separated strings #memo #chemoinformatics #RDKit

Today’s post will be very short :) I had an MRI scan of my knee at the hospital today. It has been almost 6 months since my surgery, and the result was very good. So I can run just as I did before the surgery. I’m not young, but I would like to…
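The conversion named in the title can be sketched in pure Python as below. This is not the code from the post: a plain list of 0/1 values stands in for an RDKit bit vector, and `bits_to_csv` / `csv_to_bits` are hypothetical helper names.

```python
def bits_to_csv(bits):
    """Join an iterable of 0/1 values into a comma-separated string."""
    return ",".join(str(int(bit)) for bit in bits)


def csv_to_bits(text):
    """Parse a comma-separated string of 0/1 values back into a list of ints."""
    return [int(token) for token in text.split(",")]


# A plain list stands in for a fingerprint here (assumption for illustration).
fingerprint = [1, 0, 0, 1, 1, 0]
as_text = bits_to_csv(fingerprint)
print(as_text)
print(csv_to_bits(as_text) == fingerprint)

# With an RDKit ExplicitBitVect `fp`, `fp.ToBitString()` returns a "0"/"1"
# string, so ",".join(fp.ToBitString()) should give the same kind of output.
```

A comma-separated form like this is handy when the bits need to go into a CSV column or another tool that expects text.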

Probabilistic Random Forest approach to predict experimental value #RDKit #chemoinformatics #machine_learning

To build a predictive model, input values (X) and target values (y) are required. But in drug discovery, the target value always has experimental error. So any experimental value (target value) may carry uncertainty, which makes it difficult to build a predictive model. Recently, Ola Engkvist’s group at AZ published an interesting article in the Journal of Cheminformatics. …

Run FMCS from C++ #RDKit #Chemoinformatics

I often write code in Python because it has lots of useful packages, documentation, and a community. Python is also the first programming language I learned, so I’ve never written chemoinformatics code without it. But I’m interested in coding in C++ / Rust because they run very fast. Today, I tried to write code with…

Cross docking study with python #Vina #Pymol #RDKit

I hope readers are doing well and having a nice weekend. Due to the COVID-19 pandemic, our lives have changed dramatically. I would like to go camping with my family when the pandemic is over. Last month, I wrote a post about self docking (how to prepare input files and run Vina from Python) with the vina-python API. Today I…

Self docking study workflow with vina #chemoinformatics #vina #RDKit #pdb-tools

I posted about how to run Vina from Python. But in the previous post I split the receptor and ligand with the PyMOL GUI. Hmm… that’s not an automated process. So I tried to write code for fully automated self docking with Vina. It works only with very limited options and cases, but it’ll be a first step toward virtual screening…

Memo from ACS medicinal chemistry letters #memo #journal

Recently I spend most of my working time at my desk because I’m a member of the chemoinformatics team. When I worked at the bench as a chemist, I often used not only LC-MS but also TLC to check reaction progress. A UV lamp is a common tool for visualizing spots if the reagents have UV-active substituents such as phenyl…

Run docking study from python #chemoinformatics #vina #RDKit

Docking is one of the popular approaches in computer-aided drug design. There are lots of applications for running docking, not only commercial software but also open source. AutoDock Vina is one of the popular OSS tools for docking studies, and it was updated recently. The publication is below. https://pubs.acs.org/doi/full/10.1021/acs.jcim.1c00203 Vina has Python bindings and it…

Get environment SMILES around cutting points #chemoinformatics #memo #RDKit

This week I’m on summer vacation, but I can’t travel due to the COVID-19 pandemic and heavy rain. It’s a really unusual summer vacation. I hope everyone stays safe. BTW, I often use R-group decomposition and matched molecular pairs, and these methods generate many fragment SMILES which have [*] at the attachment points. And I would like…

An experiment with MorganFP #RDKit #chemoinformatics

The Tokyo Olympics opening ceremony is coming. I hope the Olympics will be safe and secure. Today I tested comparing fingerprints from a molecule and from its fragments, because it would be useful if I could construct the identical fingerprint from fragments. The following code is a very simple example using 1,3- and 1,2-disubstituted benzene. I calculated the fingerprint of each molecule…

Data analysis of MMP “from OR to Fluorine” #memo #journal

Controlling lipophilicity is a key strategy in drug design. Medicinal chemists often struggle with ADMET issues due to their compounds’ lipophilicity. Sometimes an ADMET issue can be improved by reducing LogD, but that causes a loss of potency. The matched molecular pair approach is used to find bioisosteric replacements, meaning substructure replacements that keep potency but change…

Extract macro cyclic compounds from ChEMBLDB with rdkit_cartridge #chemoinformatics #RDKit

As you know, RDKit has a really useful PostgreSQL cartridge. So it’s easy to run a substructure or similarity search for a query structure against a PostgreSQL ChEMBLDB. But I didn’t know how to retrieve only macrocyclic compounds from ChEMBLDB with the cartridge. I searched the RDKit mailing list and got a good answer! Thanks to the community. This is the…

Depict molecules which are aligned by user defined scaffold #RDKit #Chemoinformatics #Basics

I’m happy that I could start running again after my knee surgery ;) Chemoinformatics & running are a nice combination for my life. BTW, if you are a medicinal chemist, I think the 2D coordinates of the molecules in your SAR table are really important. For example, molecules that are not aligned to your favorite orientation of the scaffold are not…