Visualize atom weights of AttentiveFP #DGL #RDKit #Chemoinformatics

Yesterday, I posted an example of DGL (almost the same as the original example code).

With it, I could build a regression model on my own dataset. Fortunately, the DGL developers also provide code for visualizing the atom weights of a trained model.

This means that after building a model with AttentiveFP, you can visualize the atom weights of a given molecule, i.e. how much each atom contributes to the predicted target value.

I have seen many examples of the approach but had never tried it myself, so I gave it a try today.

The following code is the same as yesterday's.

%matplotlib inline 
import matplotlib.pyplot as plt
import os
from rdkit import Chem
from rdkit import RDPaths

import dgl
import numpy as np
import random
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader
from torch.utils.data import Dataset
from dgl import model_zoo

from dgl.data.chem.utils import mol_to_complete_graph, mol_to_bigraph

from dgl.data.chem.utils import atom_type_one_hot
from dgl.data.chem.utils import atom_degree_one_hot
from dgl.data.chem.utils import atom_formal_charge
from dgl.data.chem.utils import atom_num_radical_electrons
from dgl.data.chem.utils import atom_hybridization_one_hot
from dgl.data.chem.utils import atom_total_num_H_one_hot
from dgl.data.chem.utils import one_hot_encoding
from dgl.data.chem import CanonicalAtomFeaturizer
from dgl.data.chem import CanonicalBondFeaturizer
from dgl.data.chem import ConcatFeaturizer
from dgl.data.chem import BaseAtomFeaturizer
from dgl.data.chem import BaseBondFeaturizer

from dgl.data.chem import one_hot_encoding
from dgl.data.utils import split_dataset

from functools import partial
from sklearn.metrics import roc_auc_score
def chirality(atom):
    try:
        return one_hot_encoding(atom.GetProp('_CIPCode'), ['R', 'S']) + \
               [atom.HasProp('_ChiralityPossible')]
    except KeyError:
        return [False, False] + [atom.HasProp('_ChiralityPossible')]
    
def collate_molgraphs(data):
    """Batching a list of datapoints for dataloader.
    Parameters
    ----------
    data : list of 3-tuples or 4-tuples.
        Each tuple is for a single datapoint, consisting of
        a SMILES, a DGLGraph, all-task labels and optionally
        a binary mask indicating the existence of labels.
    Returns
    -------
    smiles : list
        List of smiles
    bg : BatchedDGLGraph
        Batched DGLGraphs
    labels : Tensor of dtype float32 and shape (B, T)
        Batched datapoint labels. B is len(data) and
        T is the number of total tasks.
    masks : Tensor of dtype float32 and shape (B, T)
        Batched datapoint binary mask, indicating the
        existence of labels. If binary masks are not
        provided, return a tensor with ones.
    """
    assert len(data[0]) in [3, 4], \
        'Expect the tuple to be of length 3 or 4, got {:d}'.format(len(data[0]))
    if len(data[0]) == 3:
        smiles, graphs, labels = map(list, zip(*data))
        masks = None
    else:
        smiles, graphs, labels, masks = map(list, zip(*data))

    bg = dgl.batch(graphs)
    bg.set_n_initializer(dgl.init.zero_initializer)
    bg.set_e_initializer(dgl.init.zero_initializer)
    labels = torch.stack(labels, dim=0)
    
    if masks is None:
        masks = torch.ones(labels.shape)
    else:
        masks = torch.stack(masks, dim=0)
    return smiles, bg, labels, masks

atom_featurizer = BaseAtomFeaturizer(
                 {'hv': ConcatFeaturizer([
                  partial(atom_type_one_hot, allowable_set=[
                          'B', 'C', 'N', 'O', 'F', 'Si', 'P', 'S', 'Cl', 'As', 'Se', 'Br', 'Te', 'I', 'At'],
                    encode_unknown=True),
                  partial(atom_degree_one_hot, allowable_set=list(range(6))),
                  atom_formal_charge, atom_num_radical_electrons,
                  partial(atom_hybridization_one_hot, encode_unknown=True),
                  lambda atom: [0], # A placeholder for aromatic information,
                    atom_total_num_H_one_hot, chirality
                 ],
                )})
bond_featurizer = BaseBondFeaturizer({
                                     'he': lambda bond: [0 for _ in range(10)]
    })

train=os.path.join(RDPaths.RDDocsDir, 'Book/data/solubility.train.sdf')
test=os.path.join(RDPaths.RDDocsDir, 'Book/data/solubility.test.sdf')

train_mols = Chem.SDMolSupplier(train)
train_smi = [Chem.MolToSmiles(m) for m in train_mols]
train_sol = torch.tensor([float(mol.GetProp('SOL')) for mol in train_mols]).reshape(-1,1)

test_mols = Chem.SDMolSupplier(test)
test_smi = [Chem.MolToSmiles(m) for m in test_mols]
test_sol = torch.tensor([float(mol.GetProp('SOL')) for mol in test_mols]).reshape(-1,1)

train_graph = [mol_to_bigraph(mol,
                              atom_featurizer=atom_featurizer,
                              bond_featurizer=bond_featurizer) for mol in train_mols]

test_graph = [mol_to_bigraph(mol,
                             atom_featurizer=atom_featurizer,
                             bond_featurizer=bond_featurizer) for mol in test_mols]

def run_a_train_epoch(n_epochs, epoch, model, data_loader, loss_criterion, optimizer):
    model.train()
    losses = []

    for batch_id, batch_data in enumerate(data_loader):
        smiles, bg, labels, masks = batch_data
        if torch.cuda.is_available():
            bg.to(torch.device('cuda:0'))
            labels = labels.to('cuda:0')
            masks = masks.to('cuda:0')

        prediction = model(bg, bg.ndata['hv'], bg.edata['he'])
        # zero out the loss of datapoints without labels before averaging
        loss = (loss_criterion(prediction, labels) * (masks != 0).float()).mean()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        losses.append(loss.data.item())

    total_score = np.mean(losses)
    print('epoch {:d}/{:d}, training {:.4f}'.format(epoch + 1, n_epochs, total_score))
    return total_score

model = model_zoo.chem.AttentiveFP(node_feat_size=39,
                                  edge_feat_size=10,
                                  num_layers=2,
                                  num_timesteps=2,
                                  graph_feat_size=200,
                                  output_size=1,
                                  dropout=0.2)
model = model.to('cuda:0')

train_loader = DataLoader(dataset=list(zip(train_smi, train_graph, train_sol)), batch_size=128, collate_fn=collate_molgraphs)
test_loader = DataLoader(dataset=list(zip(test_smi, test_graph, test_sol)), batch_size=128, collate_fn=collate_molgraphs)

loss_fn = nn.MSELoss(reduction='none')
optimizer = torch.optim.Adam(model.parameters(), lr=10 ** (-2.5), weight_decay=10 ** (-5.0),)
n_epochs = 100
epochs = []
scores = []
for e in range(n_epochs):
    score = run_a_train_epoch(n_epochs, e, model, train_loader, loss_fn, optimizer)
    epochs.append(e)
    scores.append(score)
model.eval()

OK, I built the predictive model. (Of course, the trained model can be saved with torch.save(model.state_dict(), PATH).)
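
For example, a minimal sketch of saving and reloading the trained weights (the file name here is just a placeholder):

# save the trained parameters
torch.save(model.state_dict(), 'attentivefp_sol.pth')
# later: rebuild the model with the same hyperparameters, then reload
model.load_state_dict(torch.load('attentivefp_sol.pth'))
model.eval()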

Let's visualize molecules with their atom weights!

At first, import packages for molecule visualization.

import copy
from rdkit.Chem import rdDepictor
from rdkit.Chem.Draw import rdMolDraw2D
from IPython.display import SVG
from IPython.display import display
import matplotlib
import matplotlib.cm as cm

Then define the visualization function. The following code is borrowed from the original repository; thanks a lot. The DGL model has a get_node_weight option, which returns the node weights of the graph. Since the model was built with num_timesteps=2 (two GRU readout steps), timestep must be 0 or 1; in the code below I used 0.

def drawmol(idx, dataset, timestep):
    smiles, graph, _ = dataset[idx]
    print(smiles)
    bg = dgl.batch([graph])
    atom_feats, bond_feats = bg.ndata['hv'], bg.edata['he']
    if torch.cuda.is_available():
        print('use cuda')
        bg.to(torch.device('cuda:0'))
        atom_feats = atom_feats.to('cuda:0')
        bond_feats = bond_feats.to('cuda:0')
    
    _, atom_weights = model(bg, atom_feats, bond_feats, get_node_weight=True)
    assert timestep < len(atom_weights), 'Unexpected id for the readout round'
    atom_weights = atom_weights[timestep]
    min_value = torch.min(atom_weights)
    max_value = torch.max(atom_weights)
    atom_weights = (atom_weights - min_value) / (max_value - min_value)
    
    norm = matplotlib.colors.Normalize(vmin=0, vmax=1.28)
    cmap = cm.get_cmap('bwr')
    plt_colors = cm.ScalarMappable(norm=norm, cmap=cmap)
    atom_colors = {i: plt_colors.to_rgba(atom_weights[i].data.item()) for i in range(bg.number_of_nodes())}

    mol = Chem.MolFromSmiles(smiles)
    rdDepictor.Compute2DCoords(mol)
    drawer = rdMolDraw2D.MolDraw2DSVG(280, 280)
    drawer.SetFontSize(1)
    op = drawer.drawOptions()
    
    mol = rdMolDraw2D.PrepareMolForDrawing(mol)
    drawer.DrawMolecule(mol, highlightAtoms=range(bg.number_of_nodes()),
                             highlightBonds=[],
                             highlightAtomColors=atom_colors)
    drawer.FinishDrawing()
    svg = drawer.GetDrawingText()
    svg = svg.replace('svg:', '')
    if torch.cuda.is_available():
        atom_weights = atom_weights.to('cpu')
    return (Chem.MolFromSmiles(smiles), atom_weights.data.numpy(), svg)

Draw the test dataset molecules. The model predicts solubility, and the color indicates each atom's contribution: red means a positive effect on solubility and blue a negative one.

target = test_loader.dataset
for i in range(len(target)):
    mol, aw, svg = drawmol(i, target, 0)
    display(SVG(svg))

Personally, I think hydroxyl groups should have a positive effect on solubility, but the model shows that this is not always the case. Hmm, is something wrong with my code? Or do I need to think more about the details of the model?
I would like to try more prediction tasks and write helper code that makes model building and molecular visualization with DGL's AttentiveFP more convenient.

Today’s whole code is uploaded below.
https://gist.github.com/iwatobipen/72a2d9dd616322f1f20469a152f2bb58

Any comments or suggestions will be highly appreciated. ;)

Molecular property regression with Attentive FP #RDKit #Chemoinformatics #DGL #DeepGraphLibrary

Recently, molecular-graph-based deep learning has been a hot area in chemoinformatics.
Some months ago, Xiong et al. published a new graph-based QSAR model named 'Attentive FP' in JMC.

As its name suggests, Attentive FP uses an attention mechanism in its architecture.

The authors disclosed their code, and fortunately recent versions of DGL also implement Attentive FP!
The DGL repository provides an example of molecular property regression with Attentive FP. However, it is hard to see from the example how to apply the model to your own dataset.
So I updated DGL and tried Attentive FP myself. In the following code I used the solubility data that ships with RDKit for my practice.

First, import several packages for deep learning. DGL has many functions for chemoinformatics tasks, so users don't need to implement the molecule-to-graph conversion routines themselves.

%matplotlib inline 
import matplotlib.pyplot as plt
import os
from rdkit import Chem
from rdkit import RDPaths

import dgl
import numpy as np
import random
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader
from torch.utils.data import Dataset
from dgl import model_zoo

from dgl.data.chem.utils import mol_to_complete_graph, mol_to_bigraph

from dgl.data.chem.utils import atom_type_one_hot
from dgl.data.chem.utils import atom_degree_one_hot
from dgl.data.chem.utils import atom_formal_charge
from dgl.data.chem.utils import atom_num_radical_electrons
from dgl.data.chem.utils import atom_hybridization_one_hot
from dgl.data.chem.utils import atom_total_num_H_one_hot
from dgl.data.chem.utils import one_hot_encoding
from dgl.data.chem import CanonicalAtomFeaturizer
from dgl.data.chem import CanonicalBondFeaturizer
from dgl.data.chem import ConcatFeaturizer
from dgl.data.chem import BaseAtomFeaturizer
from dgl.data.chem import BaseBondFeaturizer

from dgl.data.chem import one_hot_encoding
from dgl.data.utils import split_dataset

from functools import partial
from sklearn.metrics import roc_auc_score

Then I defined some helper functions for the task. Almost all of the code is borrowed from the original dgl/example. Thanks for sharing the nice code!

def chirality(atom):
    try:
        return one_hot_encoding(atom.GetProp('_CIPCode'), ['R', 'S']) + \
               [atom.HasProp('_ChiralityPossible')]
    except KeyError:
        return [False, False] + [atom.HasProp('_ChiralityPossible')]
    
def collate_molgraphs(data):
    """Batching a list of datapoints for dataloader.
    Parameters
    ----------
    data : list of 3-tuples or 4-tuples.
        Each tuple is for a single datapoint, consisting of
        a SMILES, a DGLGraph, all-task labels and optionally
        a binary mask indicating the existence of labels.
    Returns
    -------
    smiles : list
        List of smiles
    bg : BatchedDGLGraph
        Batched DGLGraphs
    labels : Tensor of dtype float32 and shape (B, T)
        Batched datapoint labels. B is len(data) and
        T is the number of total tasks.
    masks : Tensor of dtype float32 and shape (B, T)
        Batched datapoint binary mask, indicating the
        existence of labels. If binary masks are not
        provided, return a tensor with ones.
    """
    assert len(data[0]) in [3, 4], \
        'Expect the tuple to be of length 3 or 4, got {:d}'.format(len(data[0]))
    if len(data[0]) == 3:
        smiles, graphs, labels = map(list, zip(*data))
        masks = None
    else:
        smiles, graphs, labels, masks = map(list, zip(*data))

    bg = dgl.batch(graphs)
    bg.set_n_initializer(dgl.init.zero_initializer)
    bg.set_e_initializer(dgl.init.zero_initializer)
    labels = torch.stack(labels, dim=0)
    
    if masks is None:
        masks = torch.ones(labels.shape)
    else:
        masks = torch.stack(masks, dim=0)
    return smiles, bg, labels, masks


def run_a_train_epoch(n_epochs, epoch, model, data_loader, loss_criterion, optimizer):
    model.train()
    losses = []

    for batch_id, batch_data in enumerate(data_loader):
        smiles, bg, labels, masks = batch_data
        if torch.cuda.is_available():
            bg.to(torch.device('cuda:0'))
            labels = labels.to('cuda:0')
            masks = masks.to('cuda:0')

        prediction = model(bg, bg.ndata['hv'], bg.edata['he'])
        # zero out the loss of datapoints without labels before averaging
        loss = (loss_criterion(prediction, labels) * (masks != 0).float()).mean()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        losses.append(loss.data.item())

    total_score = np.mean(losses)
    print('epoch {:d}/{:d}, training {:.4f}'.format(epoch + 1, n_epochs, total_score))
    return total_score

After that, I defined the atom and bond featurizer functions. Their settings are the same as in the original repository, but it is easy to modify the featurizers.

atom_featurizer = BaseAtomFeaturizer(
                 {'hv': ConcatFeaturizer([
                  partial(atom_type_one_hot, allowable_set=[
                          'B', 'C', 'N', 'O', 'F', 'Si', 'P', 'S', 'Cl', 'As', 'Se', 'Br', 'Te', 'I', 'At'],
                    encode_unknown=True),
                  partial(atom_degree_one_hot, allowable_set=list(range(6))),
                  atom_formal_charge, atom_num_radical_electrons,
                  partial(atom_hybridization_one_hot, encode_unknown=True),
                  lambda atom: [0], # A placeholder for aromatic information,
                    atom_total_num_H_one_hot, chirality
                 ],
                )})
bond_featurizer = BaseBondFeaturizer({
                                     'he': lambda bond: [0 for _ in range(10)]
    })
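
As a quick sanity check (a sketch, assuming featurizer instances are callable on an RDKit mol in this version of dgl.data.chem), the concatenated atom features should be 39-dimensional, matching the node_feat_size=39 passed to the model below; the bond featurizer emits a 10-dimensional zero vector per bond, matching edge_feat_size=10:

mol = Chem.MolFromSmiles('CCO')
feats = atom_featurizer(mol)['hv']
print(feats.shape)  # expected: torch.Size([3, 39]) -> 3 heavy atoms, 39 features each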

If you would like to use the same featurizers as DeepChem, you can use CanonicalAtomFeaturizer and CanonicalBondFeaturizer.
https://docs.dgl.ai/en/latest/generated/dgl.data.chem.CanonicalAtomFeaturizer.html
https://docs.dgl.ai/en/latest/generated/dgl.data.chem.CanonicalBondFeaturizer.html
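
For example (a sketch; I'm assuming the atom_data_field / bond_data_field keyword names from the linked docs):

canonical_atom_featurizer = CanonicalAtomFeaturizer(atom_data_field='hv')
canonical_bond_featurizer = CanonicalBondFeaturizer(bond_data_field='he')
# note: the canonical features have different lengths, so node_feat_size and
# edge_feat_size of the model would have to be changed accordingly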

DGL seems friendly for chemoinformaticians, I think.

OK, let's load the dataset. The mol_to_bigraph method with featurizers converts an RDKit mol object into a graph object. There is also a smiles_to_bigraph method that converts SMILES directly into a graph! Cool ;)
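
For example (a sketch; I'm assuming smiles_to_bigraph accepts the same featurizer keyword arguments as mol_to_bigraph):

from dgl.data.chem.utils import smiles_to_bigraph

g = smiles_to_bigraph('c1ccccc1O',  # phenol
                      atom_featurizer=atom_featurizer,
                      bond_featurizer=bond_featurizer)
print(g.number_of_nodes())  # 7 heavy atoms -> 7 nodes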

train=os.path.join(RDPaths.RDDocsDir, 'Book/data/solubility.train.sdf')
test=os.path.join(RDPaths.RDDocsDir, 'Book/data/solubility.test.sdf')

train_mols = Chem.SDMolSupplier(train)
train_smi =[Chem.MolToSmiles(m) for m in train_mols]
train_sol = torch.tensor([float(mol.GetProp('SOL')) for mol in train_mols]).reshape(-1,1)

test_mols =  Chem.SDMolSupplier(test)
test_smi = [Chem.MolToSmiles(m) for m in test_mols]
test_sol = torch.tensor([float(mol.GetProp('SOL')) for mol in test_mols]).reshape(-1,1)


train_graph =[mol_to_bigraph(mol,
                           atom_featurizer=atom_featurizer, 
                           bond_featurizer=bond_featurizer) for mol in train_mols]

test_graph =[mol_to_bigraph(mol,
                           atom_featurizer=atom_featurizer, 
                           bond_featurizer=bond_featurizer) for mol in test_mols]

The AttentiveFP model is provided in model_zoo. Then define the dataloaders for training and test.

model = model_zoo.chem.AttentiveFP(node_feat_size=39,
                                  edge_feat_size=10,
                                  num_layers=2,
                                  num_timesteps=2,
                                  graph_feat_size=200,
                                  output_size=1,
                                  dropout=0.2)
model = model.to('cuda:0')

train_loader = DataLoader(dataset=list(zip(train_smi, train_graph, train_sol)), batch_size=128, collate_fn=collate_molgraphs)
test_loader = DataLoader(dataset=list(zip(test_smi, test_graph, test_sol)), batch_size=128, collate_fn=collate_molgraphs)

DataLoader is a native PyTorch class. It generates an iterator over batches of the dataset; a quick look at one batch is sketched below.
Now we're almost there! Let's go to the learning process.
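
As noted, here is what one batch from the loader looks like (a quick sketch):

smiles, bg, labels, masks = next(iter(train_loader))
print(len(smiles))               # up to 128 SMILES strings
print(bg.number_of_nodes())      # total number of atoms in the batched graph
print(labels.shape, masks.shape) # both torch.Size([batch_size, 1])

With that confirmed, here is the training loop: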

loss_fn = nn.MSELoss(reduction='none')
optimizer = torch.optim.Adam(model.parameters(), lr=10 ** (-2.5), weight_decay=10 ** (-5.0),)
n_epochs = 100
epochs = []
scores = []
for e in range(n_epochs):
    score = run_a_train_epoch(n_epochs, e, model, train_loader, loss_fn, optimizer)
    epochs.append(e)
    scores.append(score)

>>>output is below.
epoch 1/100, training 8.8096
----snip---
epoch 98/100, training 0.3706
epoch 99/100, training 0.3915
epoch 100/100, training 0.3003

Then plot the learning curve:

plt.plot(epochs, scores)

It seems that the learning process went well ;)

OK let’s validate the model!

model.eval()
all_pred = []
for test_data in test_loader:
    smi_lst, bg, labels, masks = test_data
    if torch.cuda.is_available():
        bg.to(torch.device('cuda:0'))
        labels = labels.to('cuda:0')
        masks = masks.to('cuda:0')
    pred = model(bg, bg.ndata['hv'], bg.edata['he'])
    all_pred.append(pred.data.cpu().numpy())
res = np.vstack(all_pred)
plt.clf()
plt.scatter(res, test_sol)
plt.xlabel('pred')
plt.ylabel('exp')
from sklearn.metrics import r2_score
print(r2_score(test_sol, res))
> 0.9098691301661277

Let’s compare to RandomForest.

from sklearn.ensemble import RandomForestRegressor
from rdkit import Chem
from rdkit.Chem import AllChem
train_fp = [AllChem.GetMorganFingerprintAsBitVect(mol,2) for mol in train_mols]
test_fp = [AllChem.GetMorganFingerprintAsBitVect(mol,2) for mol in test_mols]
# make RF regressor and train it
rfr = RandomForestRegressor()
# flatten the (N, 1) label tensor into a 1-D array for scikit-learn
rfr.fit(train_fp, train_sol.numpy().ravel())

Check the performance.

rfr_pred = rfr.predict(test_fp)
r2_score(test_sol, rfr_pred)
plt.clf()
plt.scatter(rfr_pred, test_sol)

The AttentiveFP model showed high performance for solubility prediction in this case (though my random forest code is not optimized). The DGL example code is very useful for DGL beginners, but it is hard to apply to your own dataset directly, so I needed to rewrite the code for my data.

Anyway, I would like to buy the DGL developers a beer. DGL is a very nice package for chemoinformatics and for 'RDKitds'. RDKitds is a new nickname for RDKit users, proposed at the RDKit UGM 2019 ;)
Today’s code is below.

my whole code.

Thanks,

Deoxofluorination of carboxylic acid #organic_chemistry

Now I have moved from the medchem team to the comp chem team, but I still have an interest in organic chemistry.

Today I found an interesting article in JOC.

The article describes a method for synthesizing trifluoromethyl building blocks (BBs) with SF4.
https://pubs.acs.org/doi/10.1021/acs.joc.9b02596

This elegant work was published by researchers at Enamine.

As you know, Enamine has a huge number of unique building blocks for drug discovery. Several years ago I had the opportunity to visit Enamine. It was well organized, with many highly skilled researchers.

OK, back to the article.

The authors found suitable reaction conditions for converting aliphatic carboxylic acids into trifluoromethyl groups.

Importantly, the reaction proceeds under mild conditions (55 °C), preserves the substrate's stereochemistry, and tolerates a wide range of functional groups with good yields.

The key point is the addition of a small amount of water. In the previous method, liquid HF was used as the reaction solvent. HF is highly toxic, so it is not a suitable solvent for routine lab work, I think.

Of course, SF4 is also a toxic reagent, and handling it requires a safety facility.

However, Scheme 4 shows the wide applicability of the reaction conditions.

And from the experimental section, the reaction can be conducted on gram scale.

This is very useful for medicinal chemistry, I think. New building block synthesis is very exciting work for me.

RDKit based CATS Descriptor #RDKit #Chemoinformatics

Chemically Advanced Template Search (CATS) was developed by Prof. Gisbert Schneider. The CATS descriptor is a fuzzy, ligand-pharmacophore-based descriptor, so it is suitable for scaffold hopping in virtual screening. Last week, I attended his lecture and became interested in the descriptor again. Fortunately, I could find some implementations of the CATS2D descriptor in GitHub repos.

Arthuc's work (the original work is by Rajarshi Guha) is nice because the code is written in Python and RDKit is used as the chemistry engine.

I modified that work and made a package for CATS2D descriptor calculation with RDKit.
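
If you want to try it, the package should be installable directly from the GitHub repository linked at the end of this post (a sketch, assuming a standard pip-installable layout):

pip install git+https://github.com/iwatobipen/CATS2D.git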

Let's see an example of a distance calculation. At first, import the packages for the calculation.

from rdkit import Chem
from rdkit.Chem import DataStructs
from rdkit.Chem import AllChem
from scipy.spatial.distance import euclidean
from cats2d.rd_cats2d import CATS2D
import numpy as np

from rdkit.Chem.Draw import IPythonConsole
from rdkit.Chem import Draw

Then load two test molecules. These molecules are well-known drugs (sildenafil and vardenafil).

mol1 = Chem.MolFromSmiles('CCCC1=NN(C2=C1N=C(NC2=O)C3=C(C=CC(=C3)S(=O)(=O)N4CCN(CC4)C)OCC)C')
mol2 = Chem.MolFromSmiles('CCCC1=NC(=C2N1N=C(NC2=O)C3=C(C=CC(=C3)S(=O)(=O)N4CCN(CC4)CC)OCC)C')

OK, let's calculate the distance between these molecules with Morgan FP and CATS2D descriptors. The CATS2D descriptor is not a bit fingerprint, so I used the Euclidean distance for it. Surprisingly, the Tanimoto distance (1.0 - Tanimoto similarity) is rather high even though these molecules look similar.

fp1 = AllChem.GetMorganFingerprintAsBitVect(mol1, 2)
fp2 = AllChem.GetMorganFingerprintAsBitVect(mol2, 2)
dist = 1.0 - DataStructs.TanimotoSimilarity(fp1, fp2)
print(dist)
> 0.41025641025641024

On the other hand, the CATS2D-based distance is 0.0. This indicates that the two molecules are almost the same in terms of their pharmacophore features.

cats = CATS2D()
cats1 = cats.getCATs2D(mol1)
cats2 = cats.getCATs2D(mol2)
euclidean(cats1, cats2)
> 0.0

The package can also provide per-atom pharmacophore information ('A' acceptor, 'D' donor, 'L' lipophilic, etc.).

print(cats.getPcoreGroups(mol1))
>
['', ['L'], '', '', ['A'], '', '', '', ['A'], '', ['D'], '', ['A'], '', '', '', '', '', '', '', ['A'], ['A'], ['A'], '', '', ['A'], '', '', '', ['A'], '', '', '']

print(cats.getPcoreGroups(mol2))
>
['', ['L'], '', '', ['A'], '', '', '', ['A'], '', ['D'], '', ['A'], '', '', '', '', '', '', '', ['A'], ['A'], ['A'], '', '', ['A'], '', '', '', '', ['A'], '', '', '']

Scaffold hopping is a very useful strategy in drug discovery, not only for improving compound properties but also for expanding IP space.

I would like to improve the package further; it is still under development.

Any comments and/or suggestions are greatly appreciated.

My code can be found at the following URL.
https://github.com/iwatobipen/CATS2D

Thanks for developing and sharing the CATS2D descriptor implementation!

Python package for Ensemble learning #Chemoinformatics #Scikit learn

Ensemble learning is a machine learning technique. I wrote a post about blending before; the URL is below.
https://iwatobipen.wordpress.com/2018/11/11/ensemble-learning-with-scikit-learn-and-xgboost-machine-learning/
At that time I implemented the code by myself.

Ensemble learning can sometimes outperform a single model, so the method is worth trying. Fortunately, we can now do ensemble learning very easily with a Python package named 'ML-Ens'. Installation is very easy: just use pip, the common way for pythonistas I think ;)
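
For example (the PyPI package name is mlens, matching the imports below):

pip install mlens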

After installing the package, users can build and train an ensemble learning model in a few lines. I would like to introduce two examples: one is the stacking method and the other is the blending method. OK, let's go to the code.

At first, load the dataset and make the input features. I used Morgan fingerprints as the input data.

from rdkit import Chem
from rdkit.Chem import DataStructs
from rdkit.Chem import AllChem
from rdkit import RDPaths
from rdkit.Chem.Draw import IPythonConsole
from rdkit.Chem import Draw
from rdkit.Chem import PandasTools
import numpy as np
import pandas as pd
import os
from IPython.display import HTML

traindf = PandasTools.LoadSDF(os.path.join(RDPaths.RDDocsDir, 'Book/data/solubility.train.sdf'))
testdf = PandasTools.LoadSDF(os.path.join(RDPaths.RDDocsDir, 'Book/data/solubility.test.sdf'))
# Check data
HTML(traindf.head(2).to_html())

cls2lab = {'(A) low':0, '(B) medium':1, '(C) high':2}

def fp2np(fp):
    arr = np.zeros((0,))
    DataStructs.ConvertToNumpyArray(fp, arr)
    return arr
trainfp = [AllChem.GetMorganFingerprintAsBitVect(m, 2) for m in traindf.ROMol]
testfp =  [AllChem.GetMorganFingerprintAsBitVect(m, 2) for m in testdf.ROMol]
trainX = np.array([fp2np(fp) for fp in trainfp])
testX = np.array([fp2np(fp) for fp in testfp])
trainY = np.array([cls2lab[i] for i in traindf.SOL_classification.to_list()])
testY =  np.array([cls2lab[i] for i in testdf.SOL_classification.to_list()])

Then import several packages for ensemble learning. SuperLearner is the class for stacking and BlendEnsemble is the class for blending.

Making an ensemble model is easy: just use the add method to add layers, and finally call the add_meta method to add the final prediction layer.

from mlens.ensemble import SuperLearner
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier
from sklearn.metrics import r2_score, accuracy_score
from sklearn.svm import SVR, SVC

# For stacking
ensemble = SuperLearner(scorer=accuracy_score, random_state=794, verbose=2)
ensemble.add([RandomForestClassifier(n_estimators=100, random_state=794), SVC(gamma='auto', C=1000)])
ensemble.add_meta(LogisticRegression(solver='lbfgs', multi_class='auto'))

ensemble.fit(trainX, trainY)
pred = ensemble.predict(testX)
accuracy_score(testY, pred)

# Blending
from mlens.ensemble import BlendEnsemble
ensemble2 = BlendEnsemble(scorer=accuracy_score, test_size=0.2, verbose=2)
ensemble2.add([RandomForestClassifier(n_estimators=794, random_state=794),
                 SVC(gamma='auto')])
ensemble2.add_meta(LogisticRegression(solver='lbfgs', multi_class='auto'))
ensemble2.fit(trainX, trainY)
pred_b = ensemble2.predict(testX)
accuracy_score(pred_b, testY)

More models can also be added with the add method. I uploaded the whole code to my gist. After calling fit, it is easy to access the results via the data attribute.
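
For example, a quick sketch using the fitted stacking ensemble from above (I'm assuming the data attribute as described in the mlens docs):

print(ensemble.data)  # per-learner cross-validated scores and fit/predict times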

code example

Unfortunately, the ensemble models described in this post don't outperform a single random forest model, but mlens is a nice tool for ensemble learning, and there is still room to improve model performance: the choice of models, hyperparameters, etc.

The original documentation gives more information. Please follow the link if you are interested.
http://ml-ensemble.com/docs/

Quantum Chemistry data of drug bank #QCportal #Quantum_Chemistry

I'm still learning QCArchive. I previously posted about qcportal with a reaction dataset, and today I tried to retrieve DrugBank data from qcportal. QCPortal provides not only calculated numeric data but also a 3D mol view using py3Dmol.

OK, let's go to the code. The get_molecules method retrieves a wealth of data from the QCPortal web server.

import qcportal as ptl
client = ptl.FractalClient()
ds = client.get_collection("Dataset", "COMP6 DrugBank")
mols = ds.get_molecules()
mols.shape
> (13379, 1)

What kinds of data are in the dataset? It's easy to check; just call a few methods.

ds.list_values().reset_index()['method'].unique()
> array(['ωB97x', 'b3lyp', 'b3lyp-d3m(bj)', 'hf', 'pbe', 'pbe-d3m(bj)',
       'svwn', 'wb97m', 'wb97m-d3(bj)'], dtype=object)
ds.list_values().reset_index()['basis'].unique()
> array(['6-31g*', 'def2-tzvp'], dtype=object)

ds.list_values()

This dataset contains data not only from Psi4 but also from Gaussian!

I retrieved the data for method='ωB97x'.

val = ds.get_values(method='ωB97x')
val.columns
> Index(['CM5 Charges', 'Hirshfeld Charges', 'Energy', 'Gradient',
       'Hirshfeld Dipole', 'Spin Density'],
      dtype='object')

I extracted the energies from the data and visualized a molecule.

energy = val['Energy']
mols['molecule'][0].show()
energy[0]
> -636107.9519541461

Py3Dmol works very well. I could get the QC energy of a DrugBank molecule and render the molecule as a 3D object.

It is very cool!

My whole code is uploaded at the following URL.

Have a nice weekend! ;)

https://nbviewer.jupyter.org/github/iwatobipen/playground/blob/master/drug_bank.ipynb

Open data source of Quantum chemistry! #qcportal #rdkit #cheminformatics #quantum_chemistry

At the RDKit UGM 2019, I became interested in QCArchive. QCArchive is the MolSSI quantum chemistry archive. It provides useful data and Python packages.

By using one of those packages, named qcportal, we can access a huge quantum chemistry data source. This is valuable because QC calculations are useful but computationally expensive. QC data is helpful for drug design and machine learning (e.g. building machine-learning-based force fields).

I tried the package. At first I installed qcportal via conda in my env. That wasn't a good choice, because I couldn't install the newest version of the package, and the old version of qcportal causes errors. So I installed it via pip instead, and it worked fine.
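
So the installation is simply:

pip install qcportal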

The following code is almost the same as in the original documentation, but I ran it for my own memorandum. At first, import the packages and make a client object. I used the data source from MolSSI.

from rdkit import Chem
from rdkit.Chem import Draw
from rdkit.Chem.Draw import IPythonConsole
import qcportal as ptl
client = ptl.FractalClient()

Then I checked the list of torsion drive datasets. Many datasets are available.

client.list_collections("TorsionDriveDataset")

>Fragment Stability Benchmark	None
>OpenFF Fragmenter Phenyl Benchmark	Phenyl substituent torsional barrier heights.
>OpenFF Full TorsionDrive Benchmark 1	None
>OpenFF Group1 Torsions	None
>OpenFF Primary TorsionDrive Benchmark 1	None
>OpenFF Substituted Phenyl Set 1	None
>Pfizer Discrepancy Torsion Dataset 1	None
>SMIRNOFF Coverage Torsion Set 1	None
>TorsionDrive Paper	None

ds = client.get_collection("TorsionDriveDataset", "OpenFF Fragmenter Phenyl Benchmark")
ds.df.head()

>c1c[cH:1][c:2](cc1)[C:3](=[O:4])O
>c1[cH:1][c:2](cnc1)[C:3](=[O:4])O
>[cH:1]1cncc[c:2]1[C:3](=[O:4])O
>[cH:1]1cc(nc[c:2]1[C:3](=[O:4])O)[O-]
>Cc1c[cH:1][c:2](cn1)[C:3](=[O:4])O

OK, I succeeded in loading the data. Let's visualize some of the completed entries. RDKit is a very useful package for drawing molecules!!!!!

complete_data = ds.status(["b3lyp-d3"], collapse=False, status="COMPLETE")
Draw.MolsToGridImage([Chem.MolFromSmiles(complete_data['B3LYP-D3'].index[i]) for i in range(10)],
                    molsPerRow=5)

Finally, visualize the torsion energies!

ds.visualize([complete_data['B3LYP-D3'].index[i] for i in range(10)],"B3LYP-D3", units="kJ / mol")

The purple line (4th structure) has the highest torsion energy at -90 and 90 degrees.
The molecule is 5-hydroxynicotinic acid. The hydroxyl group is located para to the carboxylic acid group, so the conjugation effect makes its relative energy higher than that of the other structures.

The package is useful not only as a QC data source but also for visualization and analysis of molecules.

I uploaded today’s code on my gist.