Probabilistic Random Forest approach to predict experimental value #RDKit #chemoinformatics #machine_learning

To build a predictive model, input values (X) and target values (y) are required. In drug discovery, however, target values always carry experimental error. Any experimental value (target value) is therefore uncertain, which makes it difficult to build a predictive model. Recently, Ola Engkvist's group at AZ published an interesting article in the Journal of Cheminformatics.
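As a rough illustration of how experimental error can be carried into a model's target values, the sketch below turns a measured value into the probability that the true value lies above a classification threshold. It assumes Gaussian measurement error; the function name `prob_above_threshold`, the threshold of 5.0, the standard deviation of 0.3, and the example measurements are all hypothetical and are not taken from the post or the article.

```python
import math


def prob_above_threshold(measured, threshold, sd):
    """Probability that the true value exceeds `threshold`, assuming the
    measurement error is Gaussian with standard deviation `sd`."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    z = (measured - threshold) / sd
    # Normal CDF written with the error function from the standard library.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical activity measurements; threshold and sd are made-up values.
measurements = [4.8, 5.0, 5.3, 6.5]
soft_labels = [prob_above_threshold(m, threshold=5.0, sd=0.3) for m in measurements]

for m, p in zip(measurements, soft_labels):
    print(f"measured={m:.1f}  P(active)={p:.3f}")
```

Values close to the threshold get a label near 0.5 rather than a hard 0 or 1, so a learner that accepts probabilistic labels can treat them as less certain than values far from the threshold.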

Make original sklearn classifier-2 #sklearn #chemoinfo

After posting ‘Make original sklearn classifier’, I got comments from my followers @yamasaKit_-san and @kzfm-san. (Thanks!) So I checked the diversity of the models with principal component analysis (PCA). The example is almost the same as yesterday's, but the last part is a little different. The last part of my code is below. Extract feature importances from L1 layer classifiers and mono-random…

Ensemble learning with scikit-learn and XGBoost #machine learning

I often post about deep learning, but today I would like to post about ensemble learning. There are many documents that describe ensemble learning, and I found the following one very informative: Kaggle Ensembling Guide. I was interested in one of the methods, named ‘blending’. According to the guide above, the merit of ‘blending’…

Make predictive models with small data and visualize it #Chemoinformatics

I enjoyed the chemoinformatics conference held in Kumamoto this week. On the first day of the conference, I heard a very interesting lecture. It was a very basic data handling and visualization tutorial, but useful for newcomers to chemoinformatics. I wanted to reproduce the code example, so I tried it. First, visualize the training data. It…