Make pandas dataframe with r-group information #memo

I often forget many things, so the same topics will sometimes appear on my blog more than once. Sometimes a post is updated because a package version changed or for some other reason. I posted very similar code previously, but I am posting it again to remember the procedure for myself. It's just a memo. PandasTools of RDKit makes it easy to…

Does the fastest computer find drugs more efficiently?

Recently I read the good news that the latest supercomputer developed in Japan, named ‘Fugaku’, has the world’s fastest computing speed. The URL of The Japan Times article is below. https://www.japantimes.co.jp/news/2020/06/23/national/fugaku-supercomputer-ranked-fastest/ These days computational chemists need to handle a huge amount of virtual compounds and data (medicinal chemists too ;)). So a lot of computing resources are required to drug…

Optimize ML model with optuna and visualize the result with MLFlow #informatics #machine learning

As you know, Optuna is a very useful and powerful package for machine learning. I often use it in my own tasks. MLflow is also a useful package, and I have posted about it before. MLflow has many functions for visualizing experiment results and managing models. https://iwatobipen.wordpress.com/2018/11/14/tracking-progress-of-machine-learning-machinelearning/ I think it will be useful if models can be…

Simple chemistry generates novel compounds! #memo #paper #chemoinformatics

If you are a chemist, amide formation seems like one of the easy reactions. It’s really simple, but can the reaction make novel compounds? Recently I found a very interesting article on ChemRxiv written by researchers at AZ: ‘Can “easy” chemistry produce complex, diverse and novel molecules?’ https://chemrxiv.org/articles/Can_Easy_Chemistry_Produce_Complex_Diverse_and_Novel_Molecules_/12563231 The authors used an in-house ELN dataset for the analysis and compared compounds which were made…

Make simple web API with FastAPI, pychembldb and razi #chemoinformatics #RDKit #ChEMBLDB

As readers know, ChEMBL is not only a useful database but also an open one. Recently I have been playing with ChEMBL, its Python wrapper named pychembldb, and a cartridge named razi. I recommend that readers who are interested in chemoinformatics install these very useful Python packages. ;) BTW, I think it would be useful if I could use these…

Bioisostere of Aryl amine #memo

As medicinal chemists know, aryl amine building blocks are often used in drug design. The N-linked biaryl motif can be found in many kinase inhibitors, but aryl amines sometimes cause toxicity issues, so the structure is recognized as a structural alert. So there is a lot of research on replacing aryl amines with other structures. Here is a…

Integration razi and pychembldb #RDKit #Chemoinformatics #razi #sqlalchemy #ChEMBL

As you know, SQLAlchemy is a very useful ORM for Python. I love the package, and I think chemoinformaticians are also familiar with ChEMBLDB. There are very useful packages for these people: one is razi and the other is pychembldb. Razi is a chemical cartridge for PostgreSQL with RDKit functionality, and pychembldb is a Python wrapper of ChEMBLDB…

Rendering molecular image tooltips on Bokeh #RDKit #memo #visualization

Recently there are many plotting tools for Python! I can’t follow everything… I mainly use seaborn and matplotlib. These tools are nice for rendering beautiful charts, but if I would like to make an interactive plot, I need to switch plotting tools. So I started to learn other tools, and today I used Bokeh. (Readers already know Bokeh,…

Target prediction by conformal prediction with ChEMBL data #chemoinformatics #docker #memo

Recently I have been learning conformal prediction, and today I used the ready-to-use model for target prediction, which was trained on chembl_24 (a little bit old) with LightGBM. The ChEMBL team provides a nice package. You can get the model as a Docker image. The URL is below. https://github.com/chembl/of_conformal And the original article is here. https://jcheminf.biomedcentral.com/track/pdf/10.1186/s13321-018-0325-4 To use the image in your local…

Communicate ChEMBL27 with rdkit postgres cartridge and sqlalchemy #RDKit #ChEMBL #Postgres #razi

As you know, ChEMBL 27 was released recently, thanks to the great effort of EBI ;) Fortunately ChEMBL provides dump files in common DB formats, and RDKit has a PostgreSQL cartridge. This means that you can search compounds with RDKit functionality in PostgreSQL. BTW, to handle the database, SQLAlchemy, which is an OR mapper, is very useful. So is it…

Conformal prediction with python and rdkit_2 #RDKit #QSAR #Conformal_prediction

I posted about conformal prediction with Python and RDKit some days ago. After that I got very informative advice from @kjelljorner. Thanks a lot! His advice was as follows: “I can recommend the cross conformal prediction or bootstrapped conformal prediction (also in nonconformist) to avoid having to put aside data for calibration.”

Conformal prediction with python and rdkit #RDKit #QSAR #Conformal_prediction

Recently Greg shared a nice webinar about conformal prediction on YouTube. https://www.youtube.com/watch?v=_ZVuEWEfwuw He introduced the basic concept of conformal prediction and gave a demonstration with an excellent KNIME workflow. I recommend checking it out. ;) Conformal prediction is not a new method. It can estimate the confidence of predicted values. Traditional predictive models can predict a probability, or class of…
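As a rough illustration of how conformal prediction attaches a confidence level to a prediction, here is a minimal sketch of inductive (split) conformal regression in plain Python. It is not the code from the post or from the webinar. The toy data, the least-squares line used as the underlying model, and the helper names `fit_line` and `conformal_half_width` are all assumptions made up for this example. In practice a package such as nonconformist would wrap a real QSAR model.

```python
import math
import random


def fit_line(xs, ys):
    """Least-squares line; a hypothetical stand-in for any regression model."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def conformal_half_width(model, cal_xs, cal_ys, significance):
    """Half-width of the prediction interval from calibration residuals."""
    # Nonconformity score: absolute error on the held-out calibration set.
    scores = sorted(abs(y - model(x)) for x, y in zip(cal_xs, cal_ys))
    n = len(scores)
    # Finite-sample corrected rank: ceil((n + 1) * (1 - significance)).
    rank = math.ceil((n + 1) * (1 - significance))
    if rank > n:
        return math.inf  # too few calibration points for this confidence level
    return scores[rank - 1]


# Toy data (made up for illustration): y = 2x + Gaussian noise.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(600)]
ys = [2 * x + rng.gauss(0, 1) for x in xs]

# Split into a proper training set, a calibration set, and a test set.
train_xs, train_ys = xs[:200], ys[:200]
cal_xs, cal_ys = xs[200:400], ys[200:400]
test_xs, test_ys = xs[400:], ys[400:]

model = fit_line(train_xs, train_ys)
half_width = conformal_half_width(model, cal_xs, cal_ys, significance=0.1)

# Fraction of test points whose true value falls inside prediction +/- half_width.
covered = sum(abs(y - model(x)) <= half_width for x, y in zip(test_xs, test_ys))
coverage = covered / len(test_xs)
print(f"interval: prediction +/- {half_width:.2f}, coverage: {coverage:.2f}")
```

With a significance of 0.1, the interval is built so that roughly 90% of new points should fall inside it, provided the calibration and test data come from the same distribution. The calibration set is the price of this guarantee: those points cannot be used for training.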

Deep learning based reaction mapper #rdkit #deeplearning #AAM #chemoinformatics

Here is a great article about atom-atom mapping (AAM) with deep learning! https://chemrxiv.org/articles/Unsupervised_Attention-Guided_Atom-Mapping/12298559 The authors use an attention method and train a model for reaction mapping. AAM is an important technology, but there are few tools to do it. The authors shared their code! Thanks. I am interested in the code, so I installed it and used the package. Following code…

Think about de novo molecule generation #memo #journal #RDKit #CReM

Recently there are many publications about de novo molecular generators, which mainly use deep learning. One problem of the approach is that the generated molecules are not systematic, so it’s difficult to synthesize them with parallel chemistry. So I think chemists sometimes dislike the proposals generated by the method. Rule or Rxn or MMP based molecule…

Molecules drawing code memo (Highlight Functional groups) #memo

Here is an example of drawing molecules with FG highlighting. An improved version of the code can be found in the last part of the post. RDKit has an fdef file for common functional groups, so we can use it to highlight atoms in the same way as pharmacophores. Today’s example is just for my memo, because I often forget…

Replace core with DeLinker #RDKit #Chemoinformatics #DeepLearning

In FBDD projects, the fragment linking strategy is very easy to understand, but I think it is difficult to link two fragments in the real world. There are many tools for linking fragments virtually. These tools can be applied not only to FBDD but also to scaffold hopping etc. There are…

Compare the view point of different QSAR models #RDKit #visualize #chemoinformatics

Some days ago, I posted about how to display SVG images horizontally and about NGBoost for QSAR problems. It worked well, and I found that different models showed different performance. So my question is which parts of a molecule each model regards as important for molecular properties. Fortunately RDKit has the GetSimilarityMapForModel method, which can render the probe molecule with model’s…

Try to use target DB #OSS #targetDB #memo

I posted the article about CADD for drug discovery today. In the article, the author introduced many useful CADD tools, and targetDB is one of them. Readers who would like to install it can use pip or conda. https://github.com/sdecesco/targetDB I installed targetDB with pip because conda caused an error during the installation. After installing the package…

Computer supports drug discovery #memo #journal

This week is called ‘Golden Week’ in Japan, which is a collection of national holidays. Because of the fight against the new coronavirus, I spend most of my time at home and around my home with my family. This morning I read a miniperspective in JMC. The title was ‘Computational Chemistry on a Budget:…

Predict probabilistic distribution with NGBoost #NGBoost #RDKit #QSAR #Chemoinformatics

Recently a novel gradient boosting method was published by Andrew Ng’s group. It is interesting that NGBoost can calculate not only a probability but also a probability distribution. It is useful for QSAR because we would like to know not only the predicted value/class but also the uncertainty of the prediction. Fortunately NGBoost is available from Python! It can be…