Make pandas dataframe with r-group information #memo

I often forget things, so the same topics sometimes appear on my blog more than once. Sometimes a post is updated because a package version changed, or for some other reason. I posted very similar code previously, but I am posting it again so that I remember the procedure myself. It’s just a memo. PandasTools of RDKit makes it easy to …

Integration razi and pychembldb #RDKit #Chemoinformatics #razi #sqlalchemy #ChEMBL

As you know, SQLAlchemy is a very useful ORM for Python. I love the package, and I think chemoinformaticians are also familiar with ChEMBLDB. There are two very useful packages for these people: one is razi and the other is pychembldb. Razi is a chemical cartridge for PostgreSQL with RDKit functionality, and pychembldb is a Python wrapper of ChEMBLDB …

Rendering molecular image tooltips on Bokeh #RDKit #memo #visualization

Recently there are many plotting tools for Python! I can’t follow all of them… I mainly use seaborn and matplotlib. These tools are nice for rendering beautiful charts, but when I want an interactive plot I need to switch plotting tools. So I started learning another tool, and today I used Bokeh. (Readers already know Bokeh, …

Conformal prediction with python and rdkit_2 #RDKit #QSAR #Conformal_prediction

I posted about conformal prediction with Python and RDKit some days ago. After that I got very informative advice from @kjelljorner. Thanks a lot! His advice is below. Kjell Jorner (@kjelljorner), replying to @iwatobipen: “I can recommend the cross conformal prediction or bootstrapped conformal prediction (also in nonconformist) to avoid having to put aside data for calibration.” …

Conformal prediction with python and rdkit #RDKit #QSAR #Conformal_prediction

Recently Greg shared a nice webinar about conformal prediction on YouTube. He introduced the basic concept of conformal prediction and gave a demonstration with an excellent KNIME workflow. I recommend checking the site. ;) Conformal prediction is not a new method. It can estimate the confidence of predicted values. A traditional predictive model can predict a probability, or class of …
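As a rough illustration of how conformal prediction attaches a confidence statement to a predicted value, here is a minimal sketch of split (inductive) conformal regression in plain Python. It is not the code from the post: the toy model, the function names (`conformal_halfwidth`, `predict_interval`) and the calibration data are all made up for illustration. A real workflow would use a trained QSAR model and a dedicated package such as nonconformist.

```python
import math


def conformal_halfwidth(cal_true, cal_pred, alpha=0.1):
    """Half-width of a (1 - alpha) split conformal prediction interval.

    The nonconformity score is the absolute residual on a held-out
    calibration set (an assumption of this sketch; other scores exist).
    """
    scores = sorted(abs(t - p) for t, p in zip(cal_true, cal_pred))
    n = len(scores)
    # Rank of the calibration score that gives finite-sample coverage.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: interval is unbounded.
        return math.inf
    return scores[k - 1]


def predict_interval(point_pred, halfwidth):
    """Symmetric interval around a point prediction."""
    return (point_pred - halfwidth, point_pred + halfwidth)


def toy_model(x):
    """Hypothetical stand-in for a trained regression model."""
    return 2.0 * x


# Hypothetical calibration data, not used for fitting the model.
cal_x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
cal_y = [0.1, 1.8, 4.3, 5.6, 8.5, 9.4, 12.7, 13.2, 16.9]
cal_pred = [toy_model(x) for x in cal_x]

q = conformal_halfwidth(cal_y, cal_pred, alpha=0.25)
low, high = predict_interval(toy_model(10.0), q)
print(f"75% prediction interval for x=10: [{low:.2f}, {high:.2f}]")
```

The half-width comes only from the calibration residuals, so the same recipe works with any underlying model. The price is that some data must be set aside for calibration.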

Think about de novo molecule generation #memo #journal #RDKit #CReM

Recently there have been many publications about de novo molecular generators, which mainly use deep learning. One problem with the approach is that the generated molecules are not systematic, so it is difficult to synthesize them with parallel chemistry. So I think chemists sometimes dislike the proposals generated by the method. Rule or Rxn or MMP based molecule …

Replace core with DeLinker #RDKit #Chemoinformatics #DeepLearning

In FBDD projects, the fragment linking strategy is very easy to understand, but I think it is difficult to link two fragments in the real world. There are many tools for linking fragments virtually. These tools can be applied not only to FBDD but also to scaffold hopping etc. There are …

Compare the view point of different QSAR models #RDKit #visualize #chemoinformatics

Some days ago, I posted about how to display SVG images horizontally and about NGBoost for QSAR problems. It worked well, and I found that different models showed different performance. So my question is which parts of a molecule each model considers important for molecular properties. Fortunately RDKit has the GetSimilarityMapForModel method, which can render the probe molecule with the model’s …

Predict probabilistic distribution with NGBoost #NGBoost #RDKit #QSAR #Chemoinformatics

Recently a novel gradient boosting method was published by Andrew Ng’s group. It is interesting that NGBoost can calculate not only a probability but also a full probability distribution. It is useful for QSAR because we would like to know not only the predicted value/class but also the uncertainty of the prediction. Fortunately NGBoost is available from Python! It can be …

Draw molecules as SVG in horizontal layout #Drawing #RDKit #memo

As you know, Greg posted cool code about the new drawing options of RDKit 2020.03. You can read the details at the following URL. It’s really cool! The new version of RDKit can render molecules with many options in high quality. In the post, molecules are rendered as SVG images, one molecule per cell. I …

Ultra fast similarity search with GPU #RDKit #chemoinformatics #postgresql-rdkit

Recently chemoinformaticians need to tackle huge numbers of molecules, for example searching for similar molecules in a database of millions of compounds. Last year Schrodinger, which is a computational chemistry software company, released a useful module for fast compound search named gpusimilarity. You can get details of the module from Schrodinger’s GitHub repository. The URL is below. The algorithm is implemented in …
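Similarity search of this kind commonly means ranking database molecules by the Tanimoto similarity of their bit fingerprints. The sketch below is a plain-Python, CPU-only illustration of that brute-force idea, assuming Tanimoto is the metric. It is not gpusimilarity's code or API: fingerprints are represented as Python integers used as bitsets, and the function names and the tiny database are hypothetical.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity of two fingerprints stored as integer bitsets."""
    common = bin(fp_a & fp_b).count("1")  # bits set in both
    total = bin(fp_a | fp_b).count("1")   # bits set in either
    return common / total if total else 0.0


def top_k_similar(query_fp, database, k=2):
    """Brute-force search: score every entry, return the k best."""
    scored = [(tanimoto(query_fp, fp), name) for name, fp in database.items()]
    # Highest similarity first; ties broken by name for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(name, score) for score, name in scored[:k]]


# Hypothetical 8-bit fingerprints; real ones have hundreds or thousands of bits.
database = {
    "mol_a": 0b10110110,
    "mol_b": 0b10110000,
    "mol_c": 0b01001001,
    "mol_d": 0b10110111,
}
query = 0b10110110

hits = top_k_similar(query, database, k=2)
for name, score in hits:
    print(f"{name}: {score:.3f}")
```

Every comparison is independent of the others, which is why this kind of scan maps well onto a GPU.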

Draw scaffold tree as network with molecular image #RDKit #Cytoscape

I posted before about the new scaffold tree function implemented in RDKit 2020.03. In my previous post, I showed an example of drawing a scaffold tree with networkx. It could draw the scaffold tree as a network, but molecular structures were not shown on the nodes. For chemists, the structure image is important, so I tried …

Test new method of rdkit:2020

Now the beta version of RDKit is available from Anaconda! So I would like to try it. However, I would like to test it without contaminating my current environment, so I tried the new version of RDKit with Docker. Fortunately RDKit can be installed via conda, so I made a Dockerfile based on miniconda3. The following Dockerfile uses continuumio/miniconda3. By …

Use ORBKIT for rendering MO #orbkit #rdkit #psikit #quantum_chemistry

As you know, there are many packages for quantum chemistry, not only commercial software but also free tools, and each program has its own output format. So users need to understand how to use each of them, but I think that is a very time-consuming step. ORBKIT is a modular Python toolbox for cross-platform post-processing of quantum chemical …

New molecular fingerprint for chemoinformatics #map4 #RDKit #memo #chemoinformatics

A molecular fingerprint (FP) is very important for chemoinformatics because it is used for building many predictive models, not only for ADMET but also for biological activities. As you know, ECFP (Morgan fingerprint) is one of the gold-standard FPs of chemoinformatics because it shows stable performance on a wide range of problems. Since ECFP was reported, many new fingerprint algorithms …

Benchmarking platform for generative models. #RDKit #Chemoinformatics #DeepLearning #guacamol

Yesterday I posted about a benchmarking platform named ‘moses’ and found that it worked for test data. Then I got a comment from @Mufei Li, a developer of DGL, suggesting that I try guacamol. I had checked guacamol before but hadn’t tried it, so I installed guacamol and used it. From the original repo: GuacaMol is an open source Python …

Benchmarking platform for generative models. #RDKit #Chemoinformatics #DeepLearning #moses

There are lots of publications about molecular generators. Each publication implements novel algorithms, so we need a tool for comparing these models to see which is better for us. I often use PCA and tSNE for chemical space visualization and calculate some scores such as QED, SA/SC Score and molecular properties. However, I need unified metrics. So …

Rendering molecular orbital on Jupyter notebook #psikit #py3dmol #rdkit #memo

@fmkz___ and I (@iwatobipen) are developing psikit, which is a thin wrapper of Psi4 and RDKit. I hope the package integrates quantum chemistry (Psi4) and chemoinformatics (RDKit). By using psikit, users can make molecular orbital data very conveniently. Rendering MOs is useful for understanding molecular electrostatic shape and nature, but sometimes it is difficult …

Example usage of psi4-openmm-interface #Psi4 #OpenMM #RDKit

Molecular dynamics and quantum chemistry are important tools for CADD. I am interested in these topics, and OpenMM and Psi4 are nice tools for handling MD and QM. Today I tried psi4-openmm-interface, which allows passing of molecular systems between the two programs. I reviewed the test script and found that the package passes the molecule …

Example code of DGL for chemoinformatics task #DGL #chemoinformatics #RDKit #memo

There are many publications about graph-based approaches in the chemoinformatics area. I can’t cover all of them, but I am still interested in this area. I think pytorch_geometric (PyG) and Deep Graph Library (DGL) are very attractive and useful packages for chemoinformaticians. I wrote some posts about DGL and PyG. Recent DGL is more chemoinformatics friendly so …