RDKit – Is life worth living?

Try to use LBDD package #lig3dlens #datamol #rdkit

We can now use folding AI such as AlphaFold to predict the 3D structure of target proteins; of course, homology modeling is also used for this. But these structures are snapshots and apo forms. So ligand-based drug design (LBDD) and ligand-based virtual screening (LBVS) are still important strategies for drug discovery. Lots of LBVS…

Generate new molecules from fragments with Diffusion model #cheminformatics #rdkit #difflinker #memo

Designing linked molecules from fragments is one of the important tasks in drug design, such as FBDD, scaffold hopping (e.g. replacing a core) and PROTAC molecule design. As readers know, there are lots of solutions for this; for example, BROOD is one of the famous commercial packages for fragment replacement. I can’t use commercial package…

Try to use new version of REINVENT #cheminformatics #memo #rdkit

As many cheminformaticians know (I expect…), REINVENT, which is developed by the AZ team, is one of the useful and famous AI-based compound generators in the cheminformatics field. The new version, REINVENT 4, is still active; recently its version was bumped and some useful code was added to the repository. The DL framework is moved…

Edit atom indices of RDKit Mol object #memo #cheminformatics

Atom indices are unique numbers for each atom, and RDKit adds the index when a mol object is generated. RDKit makes mol objects from SMILES, InChI, molblock and lots of other formats. To make a canonical representation of molecules, the atom indices are always assigned automatically by the same rules. The indices are assigned regardless of scaffold. So if…
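As a small illustration of the idea above, here is a minimal sketch of reading and reordering atom indices with RDKit. It is not necessarily the approach taken in the full post; the ethanol SMILES and the `new_order` list are just examples chosen here. For a SMILES input, RDKit numbers atoms in the order they are written, and `Chem.RenumberAtoms` returns a copy with the requested order.

```python
from rdkit import Chem

# Ethanol written oxygen-first, so the oxygen gets index 0
mol = Chem.MolFromSmiles("OCC")
original = [(atom.GetIdx(), atom.GetSymbol()) for atom in mol.GetAtoms()]
print(original)

# new_order[i] is the OLD index of the atom that should become index i
# (this particular order is an arbitrary example: reverse the atoms)
new_order = [2, 1, 0]
renumbered = Chem.RenumberAtoms(mol, new_order)
symbols = [atom.GetSymbol() for atom in renumbered.GetAtoms()]
print(symbols)

# The molecule itself is unchanged, so the canonical SMILES is identical
same_molecule = Chem.MolToSmiles(mol) == Chem.MolToSmiles(renumbered)
print(same_molecule)
```

Only the bookkeeping changes here; bonds and chemistry stay the same, which is why the canonical SMILES of both objects match.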

Current status of DMTA cycle in AZ #memo #DDT #publication #AI/ML

March is the end of the fiscal year at most Japanese companies. I have spent lots of time on paperwork these days… ;P As many readers know, the DMTA cycle is the key to the drug discovery/optimization process, and lots of computational and high-throughput experimental approaches have recently become available in the process. I think AZ is…

Pocket-aware structure generation #DiffDec #cheminformatics

Diffusion models are one of the hot areas of generative models, not only in computer vision but also in cheminformatics. Diffusion models are interesting because they generate objects from noise. BTW, de novo compound design with target protein structure information is a really attractive but difficult approach in drug design. There are some approaches to conduct…

Update rdkit/shape-it #RDKit #shape-it #cheminformatics

Today I tried to build rdkit/shape-it because I could not build shape-it with the current version of rdkit. So I struggled with the error messages to fix the issue. (I’m not so good at C++ ;P) There are two issues in the current code. 1. The version of C++ in the CMakeLists.txt is old, so I changed it from c++14 to…

Visualize feature importance with marimo #cheminformatics #RDKit #marimo

I recently posted about marimo, a new generation of notebook. It is cool and makes it easy to build an interactive analysis environment with Python. I’m interested in the package and am thinking about how to use it in chemoinformatics tasks. In QSAR tasks, chemoinformaticians are often asked for the reason behind a model’s prediction. So XAI (explainable AI) is an attractive…

Run conformer generation in parallel #RDKit #Cheminformatics #lig3dlens

These days there are useful open source packages not only for SBDD but also for LBDD. VSFlow is one of the useful packages for LBDD, and I used it in my cheminformatics tutorial before. Recently I found another interesting package for LBDD named lig3dlens, which is developed by researchers at Healx, an AI drug discovery company. …
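For readers who want a feel for the topic in the title, here is a minimal sketch of multi-threaded conformer generation using RDKit's own threading options. This is an assumption about one simple way to do it, not a description of how lig3dlens works internally; the molecule, the seed and the number of conformers are arbitrary choices made for this example.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Example molecule (ethyl benzoate); hydrogens are needed for sensible 3D geometry
mol = Chem.AddHs(Chem.MolFromSmiles("CCOC(=O)c1ccccc1"))

# ETKDGv3 embedding parameters
params = AllChem.ETKDGv3()
params.randomSeed = 42   # fixed seed so runs are reproducible
params.numThreads = 0    # 0 asks RDKit to use all available cores

# Embed several conformers; the work is spread over the threads
conf_ids = list(AllChem.EmbedMultipleConfs(mol, 10, params))

# Force-field optimization of all conformers can also run multi-threaded;
# each result is a (not_converged_flag, energy) pair
results = AllChem.MMFFOptimizeMoleculeConfs(mol, numThreads=0)

print(len(conf_ids), "conformers generated")
```

This parallelizes over the conformers of one molecule. For a large library, spreading the molecules over worker processes is another common option.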

New ML package for cheminformatics #cheminformatics #QSAR #ML

I introduced scikit-mol in a blog post before. The package integrates scikit-learn and rdkit. It’s easy to use because users can build QSAR models with scikit-learn’s API. I like the package. Recently I found another useful package for cheminformatics named ‘molflux’, which is developed by researchers at Exscientia, a famous AI drug discovery pharma. molflux…

Try to use mmpdb v3.1 #RDKit #MMPDB #cheminformatics

As many RDKitters know, Andrew released a new version of MMPDB! Recent versions of MMPDB have lots of useful methods; one of them, the generate method, can generate new molecules from given SMILES. The method can generate not only all possible molecules from MMPDB but also constrained molecules with the options ‘--query’ and ‘--constant’. The ‘--query’ option…

Useful Python package for filtering molecules #RDKit #Python #memo

I wrote a blog post on how to use the Lilly filter from REINVENT4. In that post, I built the Lilly filter from source code. After I posted it, Hadrien introduced me to a package for using the Lilly filter which can be called from Python. Thanks! The package, named medchem, is provided from datamol-io’s repository. I was interested in the package so I installed…

Use Lilly MedChem rules for scoring of REINVENT4 #cheminformatics #reinvent4 #rdkit

I’m now reading the code of the new version of REINVENT4. The package supports not only JSON but also TOML format config files, which makes it easy for users to set the config. I also found that this version seems to make it easy to add new user-defined scoring functions. A long time ago, I wrote a blog post about Lilly MedChem Rules. …

Try to use new version of compound generator REINVENT4 #reinvent #RDKit #cheminfo

Recently lots of deep learning based compound generators, which are called AI compound generators in some cases, have been reported. One of the useful ones, and my favorite package, is REINVENT, developed by AstraZeneca’s team. The original repository of REINVENT is archived, so I thought that the development was completed or they took into the project…

Useful ML package for cheminformatics #RDKit #cheminformatics #ML

As many readers know, scikit-learn is one of the useful Python packages for cheminformatics. However, to use scikit-learn in cheminformatics tasks, users need to prepare data with other packages because scikit-learn doesn’t support chemical data handling. So wouldn’t it be nice if you could use chemical data through the scikit-learn API? I think so. Fortunately there is…

Register new molecules with lwreg from web app #RDKit #lwreg #cheminformatics

There are now RDKit-based chemical cartridges for sqlite3 and PostgreSQL, so it’s useful for developing cheminfo web apps with these databases. To use these databases with a web app, we need to define the database schema first. Also, many cheminformaticians handle lots of molecules. So it’s useful to register their idea or handled…

Compare speed of cheminformatics task between python and CPP #RDKit #cheminformatics

I’m not so good at C++ coding, but RDKit supports not only Python but also C++, and C++ often works faster than Python. I’m interested in comparing the performance of cheminformatics tasks in both languages. So today I tried to compare the performance. First, here is Python code which generates a set of…

Build new version of RDKit from source #RDKit #memo

I enjoyed RDKit UGM 2023 and found that the new version of rdkit has lots of cool new functions. RDKit 20230901 isn’t released yet, but a beta version was released recently. The beta version can’t be installed from conda now. So I tried to build and install rdkit 2023091b from source. RDKit provides Install.md for the building of…

New Knime node for cheminformatics #RDKit #Knime

I posted an example before of how to develop an original Knime node with Python. It seems quite useful because coders can develop new Knime nodes speedily. As I wrote in my last blog post, I participated in RDKit UGM 2023 and joined the Hackathon on the last day. I joined the Knime node development team at the Hackathon and developed…

Visualize result of molshap #RDKit #chemoinformatics #memo

Finding the best combination of substituents is a very important task in compound optimization. Recently there are lots of methods to predict the combination. Deep learning and other complex methods are difficult for chemists to understand. So I like Free-Wilson analysis: it’s simple and easy to understand because FW analysis uses linear regression. …
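To show why Free-Wilson analysis is easy to read, here is a minimal sketch of the linear regression behind it, using only NumPy. It does not use molshap, and the compound series and activity values below are invented toy numbers for illustration only. Hydrogen is taken as the reference substituent at both positions, so each coefficient reads as the contribution of a substituent relative to H.

```python
import numpy as np

# Hypothetical toy series: (R1, R2, activity). The values are made up
# and purely additive, so the regression can recover them exactly.
compounds = [
    ("H", "H", 5.0),
    ("Cl", "H", 5.8),
    ("H", "OMe", 5.5),
    ("Cl", "OMe", 6.3),
    ("Me", "H", 5.2),
    ("Me", "OMe", 5.7),
]

# Non-reference substituents at each position (H is the reference)
r1_levels = ["Cl", "Me"]
r2_levels = ["OMe"]
names = ["baseline"] + [f"R1={s}" for s in r1_levels] + [f"R2={s}" for s in r2_levels]

# Design matrix: intercept column plus one indicator column per substituent
X = np.array([
    [1.0]
    + [float(r1 == s) for s in r1_levels]
    + [float(r2 == s) for s in r2_levels]
    for r1, r2, _ in compounds
])
y = np.array([activity for _, _, activity in compounds])

# Ordinary least squares fit
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
contributions = dict(zip(names, coef))

for name, value in contributions.items():
    print(f"{name}: {value:.2f}")
```

Each fitted coefficient is the activity change attributed to one substituent, which is what makes the model easy to discuss with chemists.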