chemoinformatics – Is life worth living?

Add hydrogen with user defined pH from python #openbabel #cheminformatics

As many cheminformaticians know, Open Babel is one of the best-known and most useful cheminformatics packages, alongside RDKit. Open Babel provides not only a CLI but also APIs for several programming languages, including Python of course ;). Open Babel can protonate a molecule at a user-defined pH. This function is not available in the current version of RDKit.

Try to use LBDD package #lig3dlens #datamol #rdkit

Now we can use folding AI such as AlphaFold to predict the 3D structures of target proteins, and of course homology modeling is also used for this. But these structures are snapshots in the apo form. So ligand-based drug design (LBDD) and ligand-based virtual screening (LBVS) are still important strategies for drug discovery. Lots of LBVS…

Generate new molecules from fragments with Diffusion model #cheminformatics #rdkit #difflinker #memo

Designing a linked molecule from fragments is one of the important tasks in drug design, for example in FBDD, scaffold hopping (e.g. replacing a core), and PROTAC molecule design. As readers know, there are lots of solutions for this; for example, BROOD is one of the well-known commercial packages for fragment replacement. I can’t use commercial package…

Try to use new version of REINVENT #cheminformatics #memo #rdkit

As many cheminformaticians know (I expect…), REINVENT, which is developed by the AZ team, is one of the most useful and famous AI-based compound generators in the cheminformatics field. The new version, REINVENT 4, is still active; recently its version was bumped and some useful code was added to the repository. The DL framework is moved…

Edit atom indices of RDKit Mol object #memo #cheminformatics

Atom indices are unique numbers for each atom, and RDKit adds the index when a mol object is generated. RDKit makes mol objects from SMILES, InChI, molblock, and lots of other formats. To make a canonical representation of molecules, the atom indices are always assigned automatically by the same rules. The indices are assigned regardless of scaffold. So if…
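As a small illustration of editing atom indices, here is a sketch using RDKit's `Chem.RenumberAtoms`. The molecule and the new order are toy choices, and the full post may take a different approach.

```python
from rdkit import Chem

mol = Chem.MolFromSmiles("OCC")  # ethanol; atom indices follow input order: O=0, C=1, C=2

# new_order[i] is the ORIGINAL index of the atom that becomes atom i.
new_order = [2, 1, 0]
renumbered = Chem.RenumberAtoms(mol, new_order)

symbols = [atom.GetSymbol() for atom in renumbered.GetAtoms()]
print(symbols)

# The molecule itself is unchanged, so the canonical SMILES stays the same.
same_molecule = Chem.MolToSmiles(mol) == Chem.MolToSmiles(renumbered)
```

Only the bookkeeping changes here: the atoms, bonds, and canonical SMILES are the same before and after renumbering.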

Current status of DMTA cycle in AZ #memo #DDT #publication #AI/ML

March is the end of the fiscal year at most Japanese companies. I have spent lots of time on paperwork these days… ;P As many readers know, the DMTA cycle is key to the drug discovery/optimization process, and lots of computational and high-throughput experimental approaches have recently become available for it. I think AZ is…

Pocket-aware structure generation #DiffDec #cheminformatics

Diffusion models are one of the hot areas of generative modeling, not only in computer vision but also in cheminformatics. Diffusion models are interesting because they generate objects from noise. BTW, de novo compound design using target protein structure information is a really attractive but difficult approach in drug design. There are some approaches to conduct…

Update rdkit/shape-it #RDKit #shape-it #cheminformatics

Today I tried to build rdkit/shape-it because I could not build shape-it with the current version of RDKit, so I struggled with the error messages to fix the issue. (I’m not so good at C++ ;P) There are two issues in the current code. 1. The C++ version in CMakeLists.txt is old, so I changed it from c++14 to…

Run conformer generation in parallel #RDKit #Cheminformatics #lig3dlens

There are useful open-source packages these days not only for SBDD but also for LBDD. VSFlow is one such useful package for LBDD, and I used it in my cheminformatics tutorial before. Recently I found another interesting LBDD package named lig3dlens, which is developed by researchers at Healx, an AI drug discovery company.
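One simple way to run conformer generation in parallel is RDKit's own multithreading, sketched below. This is not necessarily how lig3dlens or the full post does it; the molecule, conformer count, and seed are illustrative assumptions.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Toy input (ibuprofen); hydrogens are added before 3D embedding.
mol = Chem.AddHs(Chem.MolFromSmiles("CC(C)Cc1ccc(cc1)C(C)C(=O)O"))

# numThreads=0 asks RDKit to use all available cores for the embedding.
conf_ids = list(
    AllChem.EmbedMultipleConfs(mol, numConfs=10, randomSeed=42, numThreads=0)
)

print(len(conf_ids), "conformers generated")
```

For many molecules, an alternative design is to distribute whole molecules across worker processes instead of threading within one molecule.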

New ML package for cheminformatics #cheminformatics #QSAR #ML

I introduced scikit-mol in a blog post before. The package integrates scikit-learn and RDKit. It’s easy to use because users can build QSAR models through scikit-learn’s API. I like the package. Recently I found another useful cheminformatics package named ‘molflux’, which is developed by researchers at Exscientia, a well-known AI drug discovery company. molflux…

New type of python notebook #marimo #cheminformatics #RDKit

JupyterLab, Jupyter Notebook, Streamlit, and other packages are useful for data science because they can analyze and visualize data step by step. I like Streamlit and Dash for making simple web apps. Some days ago I found a new and cool package named marimo. From the documentation, marimo is an open-source reactive notebook for Python: reproducible,…

ChEMBL multitask prediction with python requests #memo #chembl #cheminfo

ChEMBL provides a multitask prediction model in its GitHub repo and has shared a useful blog post: https://chembl.blogspot.com/2019/05/multi-task-neural-network-on-chembl.html By using the code, we can get a predicted target list for given molecules. And the prediction can run in Python, C++, JS, and KNIME! The ChEMBL team provides not only source code but also predicted results when we search for a compound in the ChEMBL DB.

Useful Python package for filtering molecules #RDKit #Python #memo

I wrote a blog post on how to use the Lilly filter from REINVENT4. In that post, I built the Lilly filter from source code. After I posted it, Hadrien introduced me to a package for using the Lilly filter that can be called from Python. Thanks! The package, named medchem, is provided from datamol-io’s repository. I was interested in the package, so I installed…

Try to use new version of compound generator REINVENT4 #reinvent #RDKit #cheminfo

Recently, lots of deep learning based compound generators, sometimes called AI compound generators, have been reported. One useful package, and my favorite, is REINVENT, developed by AstraZeneca’s team. The original Reinvent repository is archived, so I thought that development was completed or they took into the project…

Register new molecules with lwreg from web app #RDKit #lwreg #cheminformatics

Now there are RDKit-based chemical cartridges for sqlite3 and PostgreSQL, so it’s useful for developing cheminfo web apps with these databases. To use these databases with a web app, we need to define the database schema first. Also, many cheminformaticians handle lots of molecules. So it’s useful to register their idea or handled…

Build new version of RDKit from source #RDKit #memo

I enjoyed RDKit UGM 2023 and found that the new version of RDKit has lots of cool new functions. RDKit 20230901 isn’t released yet, but a beta version was released recently. The beta version can’t be installed from conda now, so I tried to build and install rdkit 2023091b from source. RDKit provides Install.md for the building of…

Visualize result of molshap #RDKit #chemoinformatics #memo

Finding the best combination of substituents is a very important task in compound optimization. Recently there are lots of methods to predict the combination. Deep learning and other complex methods are difficult for chemists to understand. So I like Free-Wilson analysis: it’s simple and easy to understand because FW analysis uses linear regression.
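To show why Free-Wilson analysis is easy to read, here is a toy sketch with NumPy. The compounds and activity values are made up for illustration (they are not data from the post), and plain least squares stands in for whatever regression the full post uses.

```python
import numpy as np

# Hypothetical toy data: (R1 substituent, R2 substituent, activity).
compounds = [
    ("H", "H", 5.0),
    ("Cl", "H", 6.0),
    ("H", "OMe", 5.5),
    ("Cl", "OMe", 6.5),
]

# Indicator matrix: intercept (the all-H parent), R1=Cl, R2=OMe.
X = np.array(
    [[1.0, r1 == "Cl", r2 == "OMe"] for r1, r2, _ in compounds], dtype=float
)
y = np.array([activity for _, _, activity in compounds])

# Ordinary least squares: each coefficient is one substituent's contribution.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

for name, value in zip(["parent", "R1=Cl", "R2=OMe"], coef):
    print(f"{name}: {value:.2f}")
```

Each fitted coefficient reads directly as the activity contribution of one substituent relative to the parent compound, which is what makes the model easy for chemists to interpret.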

Some updates of rdkit_cli #RDKit #chemoinformatics #typer

I enjoyed a user group meeting last week. I had lots of useful discussions there; thanks to all presenters and participants ;) BTW, I’m still enjoying making an RDKit-based CLI tool, and I added some routine tasks to the code. Of course, some functions are already implemented in other tools such as Open Babel etc.

Develop rdkit-cli tool with typer #RDKit #chemoinformatics #typer

Recently I’m learning not only coding but also no-code tools such as KNIME. It’s important to catch up on a wide range of informatics IMHO…. BTW, as you know, RDKit has lots of useful tools, and there are many useful packages that depend on RDKit. It’s because RDKit is growing very rapidly. Today I learned useful package…

Make learning process with human in the loop #optuna #memo #python #rdkit

Many cheminformaticians have checked Greg’s great blog post, which describes how to draw molecules in various ways. https://greglandrum.github.io/rdkit-blog/posts/2023-05-26-drawing-options-explained.html Rendering molecules doesn’t directly contribute to drug design, but it’s really important for us because medicinal chemists have their own preferences for drawing molecules, such as font size, highlight color, etc. It’s too difficult to parameterize…