Make pandas dataframe with r-group information #memo

I often forget things, so the same topics sometimes appear on my blog more than once. Sometimes a post is updated because a package version changed or for some other reason. I posted very similar code previously, but I am posting it again so that I remember the procedure. It’s just a memo. PandasTools in RDKit makes it easy to …

Integration razi and pychembldb #RDKit #Chemoinformatics #razi #sqlalchemy #ChEMBL

As you know, SQLAlchemy is a very useful ORM for Python. I love the package, and I think chemoinformaticians are also familiar with ChEMBLDB. There are two very useful packages for these people: razi and pychembldb. Razi is a chemical cartridge for PostgreSQL with RDKit functionality, and pychembldb is a Python wrapper of ChEMBLDB …

Rendering molecular image tooltips on Bokeh #RDKit #memo #visualization

Recently there are many plotting tools for Python! I can’t follow everything… I mainly use seaborn and matplotlib. These tools are nice for rendering beautiful charts, but if I want an interactive plot, I need to switch plotting tools. So I started to learn other tools, and today I used Bokeh. (Readers already know Bokeh, …

Conformal prediction with python and rdkit_2 #RDKit #QSAR #Conformal_prediction

I posted about conformal prediction with Python and RDKit some days ago. After that, I got very informative advice from @kjelljorner. Thanks a lot! His advice, in reply to @iwatobipen, was: “I can recommend the cross conformal prediction or bootstrapped conformal prediction (also in nonconformist) to avoid having to put aside data for calibration.” …

Replace core with DeLinker #RDKit #Chemoinformatics #DeepLearning

In FBDD projects, the fragment linking strategy is easy to understand, but I think actually linking two fragments is difficult in the real world. There are many tools for linking fragments virtually. These tools can be applied not only to FBDD but also to scaffold hopping etc. There are …

Compare the view point of different QSAR models #RDKit #visualize #chemoinformatics

Some days ago, I posted about how to display SVG images horizontally and about NGBoost for QSAR problems. It worked well, and I found that different models showed different performance. So my question is which parts of a molecule each model considers important for molecular properties. Fortunately, RDKit has the GetSimilarityMapForModel method, which can render the probe molecule with the model’s …

Ultra fast similarity search with GPU #RDKit #chemoinformatics #postgresql-rdkit

Recently chemoinformaticians need to tackle huge numbers of molecules, for example searching for similar molecules in a database of millions of compounds. Last year, Schrödinger, a computational chemistry software company, released a useful module for fast compound search named gpusimilarity. You can get details of the module from Schrödinger’s GitHub repository. The URL is below. https://github.com/schrodinger/gpusimilarity/blob/master/gpusimilarity_rdkit_presentation.pdf The algorithm is implemented in …
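For readers who have not done a similarity search before, here is a minimal sketch of the basic idea: a brute-force Tanimoto search on the CPU. It is not the gpusimilarity algorithm or its API. The function names, the toy database, and the 8-bit fingerprints are all made up for illustration, and real fingerprints would come from a toolkit such as RDKit.

```python
def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints stored as Python ints (bit sets)."""
    common = bin(a & b).count("1")   # bits set in both fingerprints
    total = bin(a | b).count("1")    # bits set in at least one fingerprint
    return common / total if total else 0.0


def search(query, database, threshold=0.5):
    """Return (name, similarity) pairs at or above threshold, best hit first."""
    hits = [(name, tanimoto(query, fp)) for name, fp in database.items()]
    hits = [hit for hit in hits if hit[1] >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical 8-bit fingerprints, only for this example.
toy_db = {
    "mol_a": 0b10110110,
    "mol_b": 0b10110100,
    "mol_c": 0b01001001,
}
query_fp = 0b10110110

results = search(query_fp, toy_db, threshold=0.5)
for name, score in results:
    print(name, round(score, 2))
```

This loop compares the query with every fingerprint in turn, so the cost grows linearly with the database size. That is why faster implementations matter when the database holds millions of compounds.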

One liner command tool for LillyMedChemRules #Chemoinformatics #memo

Many substructure filter files are available these days, and the Lilly MedChem Rules are one of the useful and famous filters. The implementation works very fast and provides reasonable results. However, it returns the results as multiple files, so the user needs to merge the files after filtering. So I wrote a small script to filter the molecules …
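The merge step mentioned above can be sketched in plain Python as follows. This is only an illustration and not the script from the post: the function name `merge_results` and the file names in the usage example are hypothetical, and the real output file names of the Lilly MedChem Rules implementation are not assumed here.

```python
from pathlib import Path


def merge_results(paths, out_path):
    """Concatenate several result files into one tab-separated file.

    Each output row is the original line followed by the name of the
    file it came from. Returns the number of rows written.
    """
    n_rows = 0
    with open(out_path, "w") as out:
        for path in paths:
            path = Path(path)
            with open(path) as src:
                for line in src:
                    line = line.rstrip("\n")
                    if not line:
                        continue  # skip blank lines
                    out.write(f"{line}\t{path.name}\n")
                    n_rows += 1
    return n_rows
```

For example, `merge_results(["passed.smi", "rejected.smi"], "merged.tsv")` would write one file in which every molecule keeps a label telling which result file it came from (the file names here are placeholders).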

Draw scaffold tree as network with molecular image #RDKit #Cytoscape

I previously posted about the new scaffold tree function implemented in RDKit 2020.03. In that post, I showed an example of drawing a scaffold tree with networkx. It could draw the scaffold tree as a network, but molecular structures were not shown on the nodes. For chemists, structure images are important, so I tried …

Test new method of rdkit:2020

Now a beta version of RDKit is available from Anaconda! So I would like to try it. However, I would like to test it without contaminating my current environment, so I tried the new version of RDKit with Docker. Fortunately, RDKit can be installed via conda, so I made a Dockerfile based on miniconda3. The following Dockerfile uses continuumio/miniconda3. By …

Use ORBKIT for rendering MO #orbkit #rdkit #psikit #quantum_chemistry

As you know, there are many packages for quantum chemistry, not only commercial software but also free tools, and each one has its own output format. So users need to understand how to use each of them, which I think is a very time-consuming step. ORBKIT is a modular Python toolbox for cross-platform post-processing of quantum chemical …

New molecular fingerprint for chemoinformatics #map4 #RDKit #memo #chemoinformatics

Molecular fingerprints (FPs) are very important for chemoinformatics because they are used for building many predictive models, not only for ADMET but also for biological activities. As you know, ECFP (Morgan fingerprint) is one of the gold-standard FPs in chemoinformatics because it shows stable performance on a wide range of problems. Since ECFP was reported, many new fingerprint algorithms …

Benchmarking platform for generative models. #RDKit #Chemoinformatics #DeepLearning #guacamol

Yesterday I posted about a benchmarking platform named ‘moses’ and found that it worked on test data. Then I got a comment from @Mufei Li, a developer of DGL, suggesting that I try GuacaMol. I had checked GuacaMol before but had not tried it. So I installed GuacaMol and used it. From the original repo: GuacaMol is an open source Python …

Benchmarking platform for generative models. #RDKit #Chemoinformatics #DeepLearning #moses

There are lots of publications about molecular generators. Each publication implements novel algorithms, so we need a tool for comparing these models to see which is better for us. I often use PCA and t-SNE for chemical space visualization and calculate scores such as QED, SA/SC score, and molecular properties. However, I need unified metrics. So …

Example usage of psi4-openmm-interface #Psi4 #OpenMM #RDKit

Molecular dynamics and quantum chemistry are important tools for CADD. I am interested in these topics, and OpenMM and Psi4 are nice tools for handling MD and QM. Today I tried psi4-openmm-interface, which allows passing molecular systems between the two programs. I reviewed the test script and found that the package passes the molecule …

Example code of DGL for chemoinformatics task #DGL #chemoinformatics #RDKit #memo

There are many publications about graph-based approaches in chemoinformatics. I can’t cover all of them, but I am still interested in this area. I think pytorch_geometric (PyG) and Deep Graph Library (DGL) are very attractive and useful packages for chemoinformaticians. I wrote some posts about DGL and PyG. Recent DGL is more chemoinformatics friendly, so …

Use pytorch for QSAR model building more simply like scikit-learn #pytorch #chemoinformatics #RDKit

I often use PyTorch as my deep learning framework. I like PyTorch because it is very flexible and many recent articles use it for their implementations. But to build and train a model, I need to define the training method myself. So it would be nice if I could train a PyTorch model just by calling fit, like scikit-learn …

Make molecule mesh data #RDKit #chemoinformatics #meshlab

I am interested in building predictive models with 3D compound information. PyTorch3D and Open3D seem like attractive packages to me. However, to use them, I need to convert molecular information to 3D data such as point clouds. At first I tried Open Babel, because recent versions of Open Babel can convert molecules from …

Use RDKit from NIM language #RDKit #Nim #memo

Recently I bought a book titled ‘Nim in Action’ and started to learn the Nim language. I had never touched a language like Nim, which has to be compiled before the code can run. I was interested in Nim because many documents say it is fast and efficient, and its grammar looks like Python’s. Also, an RDKit binding …

Scaffold growing with RNN #RDKit #Pytorch #Chemoinformatics

My favorite molecular generator is REINVENT, a SMILES-based RNN generator, because it is very flexible and easy to modify. Recently the same group at AstraZeneca published a new version of REINVENT, titled ‘SMILES-Based Deep Generative Scaffold Decorator for De-Novo Drug Design’. It seems very exciting to me, because there are many molecular …