Useful Python package for QSAR-related tasks for chemoinformaticians #chemoinformatics #oloren-ai #RDKit

When I posted my memo about open science, @OlorenAI introduced a Python package named Oloren ChemEngine (OCE). I often use chemprop or an internally built system for QSAR tasks. ChemProp is one of my favorite packages because it is easy to use and includes a web application framework for users. I’ve never used OCE, so I tried…

Useful package for plotting chemical space rapidly #chemoinformatics #memo

Visualizing chemical space is an important task for chemoinformaticians, and there are lots of ways to represent chemical space. One common approach is PCA, and recently t-SNE and UMAP are also used. I wrote template code for plotting these data in my own work but didn’t write it as a package. Today I found a useful package…

Define a function after the request #Flask #memo #python

I love Flask and Django for making web apps and often use Flask for web app development. Sometimes the app serves files after getting a user request. In this case, static files generated by the app are stored in the static folder, and the folder accumulates lots of files. So I would…

Compare shape and electrostatic similarity of molecules #RDKit #espsim #python

There are lots of ways to define molecular similarity, for example fingerprint-based, descriptor-based, graph-based, shape-based, etc. In the 2D world, circular fingerprint-based similarity is used in many cases. However, 3D-based similarity approaches are also useful for drug design. As you know, OpenEye provides useful software named ‘ROCS’. ROCS…
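To make the fingerprint-based case concrete, here is a minimal sketch of the Tanimoto coefficient, which is the usual score for comparing circular fingerprints. It is not the code from the post: the function name `tanimoto` and the example bit indices are made up for illustration, and fingerprints are represented simply as sets of "on" bit positions rather than toolkit objects.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient between two fingerprints given as on-bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        # Two empty fingerprints: define the similarity as 0.0 by convention here.
        return 0.0
    return len(a & b) / len(union)


# Hypothetical on-bit indices for two molecules (not real fingerprints).
fp_a = {1, 5, 9, 42}
fp_b = {1, 9, 42, 77, 128}

sim = tanimoto(fp_a, fp_b)
print(f"Tanimoto similarity: {sim:.2f}")
```

Shape and electrostatic similarity need 3D conformers and an alignment step, so they cannot be reduced to a set operation like this one.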

Convert bit-vector to comma separated strings #memo #chemoinformatics #RDKit

Today’s post will be very short :) I had an MRI scan of my knee at the hospital today. It has been almost 6 months since my surgery, and the result was very good. So I will be able to run the same as before the surgery. I’m not young but I would like to…
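The conversion named in the title can be sketched in a few lines of plain Python. This is only an illustration, not the code from the post: the helper names `bits_to_csv` and `csv_to_bits` are hypothetical, and the input is assumed to be a string of '0'/'1' characters (with RDKit, such a string would typically come from a fingerprint's `ToBitString()` method).

```python
def bits_to_csv(bits):
    """Turn an iterable of 0/1 values (or '0'/'1' characters) into '0,1,0,...'."""
    return ",".join(str(int(b)) for b in bits)


def csv_to_bits(text):
    """Inverse helper: parse '0,1,0,...' back into a list of ints."""
    return [int(token) for token in text.split(",")] if text else []


# Hypothetical 10-bit fingerprint written as a bit string.
bit_string = "0100100001"

csv_text = bits_to_csv(bit_string)
round_trip = csv_to_bits(csv_text)
print(csv_text)
```

A comma-separated form like this is convenient when each bit should end up in its own column of a CSV file or data frame.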

Self docking study workflow with vina #chemoinformatics #vina #RDKit #pdb-tools

I posted about how to run vina from Python. But in the previous post I split the receptor and ligand with the PyMOL GUI. Hmm… that’s not an automated process. So I tried to write code for fully automated self-docking with vina. It works only with very limited options and cases, but it’ll be a first step for virtual screening…

Run docking study from python #chemoinformatics #vina #RDKit

Docking is one of the popular approaches for computer-aided drug design. There are lots of applications for running docking, not only commercial software but also open source. AutoDock Vina is one of the popular OSS tools for docking studies, and it was updated recently. The publication is below. https://pubs.acs.org/doi/full/10.1021/acs.jcim.1c00203 Vina has a Python binding and it…

Get environment SMILES around cutting points #chemoinformatics #memo #RDKit

This week I’m on summer vacation but can’t travel due to the COVID-19 pandemic and heavy rain. It’s a really unusual summer vacation. I hope everyone stays safe. BTW, I often use R-group decomposition and matched molecular pairs, and these methods generate many fragment SMILES which have [*] at attachment points. And I would like…

Embed molecular editor into Streamlit app #streamlit #chemoinformatics #RDKit

I wrote some posts about combining chemoinformatics and Streamlit. One was a predictive model application which used rdkit and scikit-learn. When I tweeted that, Jan Jansen (who is a great quantum chemist, and I met him at the RDKit UGM!!!) commented that it would be useful if a molecular drawer could be used in the app ;)…

Useful ML tool for chemoinformatics #chemoinformatics #RDKit #Machine learning

Yesterday, I moved my main PC from Ubuntu 18.04 to 20.04 LTS. Now it works well, and I’m building a new (clean) env for my coding. Today I would like to share a useful package for machine learning named PyCaret. A brief introduction of PyCaret, quoted from the original site, is below. PyCaret is an open-source, low-code machine learning library in Python that automates…

Difference between sanitized mol and unsanitized mol #memo #rdkit

I posted about fast compound search with rdkit, and in that post I used the pattern fingerprint. Today I checked the behavior of the fingerprint. The pattern fingerprint can be calculated for molecules which are not sanitized. However, the fingerprint is different from the one calculated from a sanitized mol. Here is a simple example. The output…

Relationship between dihedral angle and atomic charge #psi4 #RDKit #psikit

Recently the psikit repository got a PR about RESP charge calculation. Thanks for the PR. And I have a question about the relationship between compound conformation and partial charge. Fortunately, psikit already has an example of a torsion scan; thanks to @fmkz___ for sharing useful code. The example code is here. The following code is the same as the example code linked above…

Substructure search with SMARTS query against ChEMBLDB #rdkit #razi #pychembldb

Recently I often use razi for structure search because it is very easy to integrate into many workflows written in Python. Today I would like to show how to perform substructure search with a SMARTS query in ChEMBL, because I’m modifying pychembldb to integrate razi and enable structure search in pychembldb. To perform substructure search with…

Optimize ML model with optuna and visualize the result with MLFlow #informatics #machine learning

As you know, Optuna is a very useful and powerful package for machine learning. I often use the package in my own tasks. MLflow is also a useful package, and I posted about it before. MLflow has many functions for visualizing experiment results and managing models. https://iwatobipen.wordpress.com/2018/11/14/tracking-progress-of-machine-learning-machinelearning/ I think it will be useful if models can be…

Conformal prediction with python and rdkit_2 #RDKit #QSAR #Conformal_prediction

I posted about conformal prediction with Python and rdkit some days ago. After that I got very informative advice from @kjelljorner. Thanks a lot! His advice was as follows. Kjell Jorner (@kjelljorner), replying to @iwatobipen: “I can recommend the cross conformal prediction or bootstrapped conformal prediction (also in nonconformist) to avoid having to put aside data for calibration.”
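To show what "putting aside data for calibration" means, here is a minimal sketch of split (inductive) conformal regression, where a held-out calibration set turns point predictions into intervals. It is not the code from the post and does not use nonconformist: the function name, the calibration values, and the test prediction are all made up, and the cross and bootstrapped variants recommended above reuse training folds instead of a separate calibration set.

```python
import math


def conformal_quantile(residuals, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration residual."""
    n = len(residuals)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points for this alpha: the interval is unbounded.
        return float("inf")
    return sorted(residuals)[rank - 1]


# Hypothetical calibration set: true values and the model's predictions for them.
cal_true = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
cal_pred = [1.2, 1.9, 3.5, 3.6, 5.1, 6.3, 6.8, 8.4, 9.0]

# Nonconformity score: absolute prediction error on the calibration data.
residuals = [abs(t - p) for t, p in zip(cal_true, cal_pred)]
q = conformal_quantile(residuals, alpha=0.2)

# Prediction interval for a new (hypothetical) point prediction at 80% confidence.
new_pred = 5.5
lower, upper = new_pred - q, new_pred + q
print(f"80% interval: [{lower:.2f}, {upper:.2f}]")
```

The cost of this scheme is visible in the data layout: the nine calibration compounds cannot be used for training, which is the drawback the advice above addresses.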

Compare the viewpoints of different QSAR models #RDKit #visualize #chemoinformatics

Some days ago, I posted about how to display SVG images horizontally and about ngboost for QSAR problems. It worked well, and I found that different models showed different performance. So my question is which parts of a molecule each model considers important for molecular properties. Fortunately, rdkit has the GetSimilarityMapForModel method, which can render the probe molecule with the model’s…

Draw molecules as SVG in horizontal layout #Drawing #RDKit #memo

As you know, Greg posted cool code about the new drawing options in rdkit 2020.03. You can read the details at the following URL: http://rdkit.blogspot.com/2020/04/new-drawing-options-in-202003-release.html It’s really cool! The new version of rdkit can render molecules with many options in high quality. In the post, molecules are rendered as SVG images, one molecule per cell. I…

One liner command tool for LillyMedChemRules #Chemoinformatics #memo

Many substructure files are available these days, and LillyMedChemRules is one of the useful and famous filters. It works very fast and provides reasonable results. However, the implementation returns the results as multiple files, so the user needs to merge the files after filtering. So I wrote a small script to filter the molecules…
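The merging step can be sketched with the standard library alone. This is not the script from the post and it does not run the filter itself: the function `merge_files`, the `part_*.smi` file names, and the SMILES lines are placeholders for whatever files the filter run leaves behind.

```python
import tempfile
from pathlib import Path


def merge_files(paths, merged_path):
    """Concatenate text files line by line, tagging each row with its source file."""
    count = 0
    with open(merged_path, "w") as out:
        for path in paths:
            source = Path(path).name
            with open(path) as handle:
                for line in handle:
                    line = line.strip()
                    if line:  # skip blank lines
                        out.write(f"{line}\t{source}\n")
                        count += 1
    return count


# Demo with throwaway files standing in for the filter's output files.
with tempfile.TemporaryDirectory() as tmp:
    tmp_dir = Path(tmp)
    (tmp_dir / "part_0.smi").write_text("CCO mol1\nc1ccccc1 mol2\n")
    (tmp_dir / "part_1.smi").write_text("CC(=O)O mol3\n")
    merged = tmp_dir / "merged.tsv"
    n_rows = merge_files(sorted(tmp_dir.glob("part_*.smi")), merged)
    merged_lines = merged.read_text().splitlines()

print(n_rows, "rows merged")
```

Keeping the source file name as an extra column preserves the information about which output each molecule came from after the files are combined.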

Cut molecule to ring and linker with RDKit #RDKit #chemoinformatics #memo

Sometimes chemists analyze a molecule as small parts such as core, linker and substituents. RDKit has functions for molecular decomposition: RECAP, BRICS and rdMMPA. They are useful, but they can’t directly extract the core and linker from a molecule. I was interested in how to do it and tried it. In the following code, the core is defined as rings in…

New trial of AttentiveFP with new atom feature #DGL #RDKit #Chemoinformatics

Recently I posted an example of AttentiveFP, and I found that atom weights don’t directly reflect functional groups. I also got a useful suggestion via a comment from a DGL developer! And I wonder how it would work to use functional group features to train the model. But how can I detect functional groups in the molecule? Because…