Visualize feature importance with marimo #cheminformatics #RDKit #marimo

I recently posted about marimo, a new generation of notebook. It is cool and makes it easy to build an interactive analysis environment with Python. I’m interested in the package and am thinking about how to use it in chemoinformatics tasks. In QSAR tasks, chemoinformaticians are often asked to explain the reasons behind a model’s predictions. So XAI (explainable AI) is an attractive…

New ML package for cheminformatics #cheminformatics #QSAR #ML

I introduced scikit-mol in a previous blog post. The package integrates scikit-learn and RDKit. It’s easy to use because users can build QSAR models through scikit-learn’s API. I like the package. Recently I found another useful package for cheminformatics named ‘molflux’, which is developed by researchers at Exscientia, a famous AI drug discovery company. molflux…

Useful python package for QSAR related tasks of chemoinformatician #chemoinformatics #oloren-ai #RDKit

When I posted my memo about open science, @OlorenAI introduced a Python package named Oloren ChemEngine (OCE). I often use chemprop or an internally built system for QSAR tasks. Chemprop is one of my favorite packages because it is easy to use and includes a web application framework for users. I’ve never used OCE, so I tried…

Conformal prediction with python and rdkit #RDKit #QSAR #Conformal_prediction

Recently Greg shared a nice webinar about conformal prediction on YouTube. https://www.youtube.com/watch?v=_ZVuEWEfwuw He introduced the basic concept of conformal prediction and gave a demonstration with an excellent KNIME workflow. I recommend checking it out. ;) Conformal prediction is not a new method. It can estimate the confidence of predicted values. A traditional predictive model can predict a probability, or class of…

Compare the view point of different QSAR models #RDKit #visualize #chemoinformatics

Some days ago, I posted about how to display SVG images horizontally and about NGBoost for QSAR problems. It worked well, and I found that different models showed different performance. So my question is which parts of a molecule each model considers important for molecular properties. Fortunately RDKit has the GetSimilarityMapForModel method, which can render the probe molecule with the model’s…

Predict probabilistic distribution with NGBoost #NGBoost #RDKit #QSAR #Chemoinformatics

Recently a novel gradient boosting method was published by Andrew Ng’s group. It is interesting that NGBoost can calculate not only a probability but also a probability distribution. This is useful for QSAR because we would like to know not only the predicted value or class but also the uncertainty of the prediction. Fortunately NGBoost is available from Python! It can be…

Build stacking Classification QSAR model with mlxtend #chemoinformatics #mlxtend #RDKit

I posted about the ML method named ‘blending’ some days ago, and a reader recommended that I try “mlxtend”. I had found it when I was learning about ensemble learning packages in Python, but I had never used it. So I tried using the library to build a model. Mlxtend is easy to install and good documentation is provided…