Vote Vote Vote #chemoinformatics

Some days ago, I posted about an ensemble classification method named ‘blending’. The method is not implemented in scikit-learn, so I am implementing the function now. By the way, scikit-learn has an ensemble classification method named ‘VotingClassifier’. The following explanation is from the sklearn documentation: The idea behind the VotingClassifier is to combine conceptually different machine learning classifiers and …
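As a rough illustration of the voting idea, here is a minimal pure-Python sketch of majority ("hard") voting over the label predictions of several classifiers. This is not scikit-learn's implementation; the function name `majority_vote` and the toy predictions are made up for this example.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine label predictions from several classifiers by majority vote.

    predictions: list of per-classifier label lists, all of the same length.
    (Hypothetical helper for illustration, not part of scikit-learn.)
    """
    n_samples = len(predictions[0])
    voted = []
    for i in range(n_samples):
        # Count the labels that each classifier predicted for sample i.
        votes = Counter(p[i] for p in predictions)
        voted.append(votes.most_common(1)[0][0])
    return voted


# Toy predictions from three imaginary classifiers on four samples.
clf_a = [0, 1, 1, 0]
clf_b = [0, 1, 0, 0]
clf_c = [1, 1, 1, 0]

ensemble = majority_vote([clf_a, clf_b, clf_c])
print(ensemble)
```

In scikit-learn itself, the same behaviour corresponds to `VotingClassifier(..., voting='hard')`, while `voting='soft'` averages the predicted class probabilities instead.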


Visualize pharmacophore in RDKit #RDKit

RDKit has a pharmacophore feature assignment function. The function can retrieve molecular features based on a pre-defined pharmacophore (ph4core) definition. And the RDKit IPythonConsole can draw molecules in an IPython notebook. Today I tried to visualize the ph4core in a notebook. The code is very simple. First, load the feature definition. Then calculate the pharmacophore features and compute 2D coordinates. Next I defined a drawing function. To highlight …
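The first three steps above can be sketched roughly as follows. This is only a minimal example, assuming RDKit is installed and using the `BaseFeatures.fdef` file shipped with RDKit; the phenol test molecule is my own choice, and the drawing part is left out.

```python
import os

from rdkit import Chem, RDConfig
from rdkit.Chem import AllChem, ChemicalFeatures

# 1. Load the feature definition that ships with RDKit.
fdef_path = os.path.join(RDConfig.RDDataDir, "BaseFeatures.fdef")
factory = ChemicalFeatures.BuildFeatureFactory(fdef_path)

# 2. Calculate pharmacophore features for a molecule
#    (phenol is used here only as a small example).
mol = Chem.MolFromSmiles("Oc1ccccc1")
feats = factory.GetFeaturesForMol(mol)

# 3. Compute 2D coordinates so the molecule can be drawn later.
AllChem.Compute2DCoords(mol)

# Each feature knows its family and the atoms it covers.
families = set()
for feat in feats:
    families.add(feat.GetFamily())
    print(feat.GetFamily(), feat.GetAtomIds())
```

The atom indices returned by `GetAtomIds()` are what a drawing function would pass on as the atoms to highlight.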

Applicability Domain on Deep Neural Networks #JCIM #chemoinformatics

I read an interesting article in JCIM: “Dissecting Machine-Learning Prediction of Molecular Activity: Is an Applicability Domain Needed for Quantitative Structure−Activity Relationship Models Based on Deep Neural Networks?” The URL is below. The advantage of DNNs is feature extraction, and there are many articles that use DNNs for molecular activity prediction. BTW, is it true that …

Generate possible list of SMILES with RDKit #RDKit

In computer vision, data augmentation is often used to get a larger data set. In chemoinformatics, on the other hand, canonical SMILES representations are typically used. At last year's RDKit UGM, Dr. Esben proposed a new approach for RNNs with SMILES. He expanded 602 training molecules to almost 8000 molecules with different SMILES representation …
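One simple way to get several SMILES strings for the same molecule is shown below. This is a minimal sketch, assuming a reasonably recent RDKit in which `Chem.MolToSmiles` accepts `doRandom=True`; it is not necessarily the procedure used in the UGM talk, and the helper name `enumerate_smiles` and the aspirin example are mine.

```python
from rdkit import Chem


def enumerate_smiles(smiles, n_tries=50):
    """Return a sorted list of distinct random SMILES for one molecule.

    Hypothetical helper for illustration: it asks RDKit for a randomized
    (non-canonical) SMILES n_tries times and keeps the unique strings.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles}")
    variants = set()
    for _ in range(n_tries):
        variants.add(Chem.MolToSmiles(mol, doRandom=True))
    return sorted(variants)


aspirin = "CC(=O)Oc1ccccc1C(=O)O"
variants = enumerate_smiles(aspirin)
canonical = Chem.MolToSmiles(Chem.MolFromSmiles(aspirin))

for smi in variants[:5]:
    print(smi)
```

Every generated string should parse back to the same molecule, so canonicalizing a variant gives the canonical SMILES again.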

Tracking progress of machine learning #MachineLearning

To do machine learning, you need to optimize hyperparameters. For example, scikit-learn provides a grid search method. And you know there are several packages to do that, such as hyperopt or GPyOpt. How do you manage the models you have built? It is difficult for me. Recently I have been interested in mlflow. MLflow is an …
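To make the bookkeeping problem concrete, here is a minimal pure-Python sketch of a grid search that records every trial. It does not use scikit-learn or MLflow; `toy_score` is a made-up stand-in for training and evaluating a model, and the parameter names are only examples.

```python
from itertools import product


def toy_score(max_depth, learning_rate):
    """Made-up objective standing in for a real model's validation score."""
    return -((max_depth - 4) ** 2) - 100 * (learning_rate - 0.1) ** 2


param_grid = {
    "max_depth": [2, 4, 6],
    "learning_rate": [0.01, 0.1, 0.3],
}

names = list(param_grid)
runs = []  # one record per trial: the parameters and the resulting score
for values in product(*(param_grid[name] for name in names)):
    params = dict(zip(names, values))
    runs.append({"params": params, "score": toy_score(**params)})

# Pick the trial with the highest score.
best = max(runs, key=lambda run: run["score"])
print(len(runs), "runs, best:", best)
```

Even this small grid produces nine records, which is why a tracking tool becomes attractive once real models and metrics are involved.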

Ensemble learning with scikit-learn and XGBoost #machine learning

I often post about deep learning topics. But today I would like to post about ensemble learning. There are lots of documents that describe ensemble learning, and I think the following one is very informative: Kaggle Ensembling Guide. I am interested in one of the methods, named ‘blending’. According to the URL above, the merit of ‘blending’ …

Visualize important features of machine learning #RDKit

As you know, RDKit 2018.09.1 has very exciting methods named ‘DrawMorganBit’ and ‘DrawMorganBits’. They can render the bit information of a fingerprint. This is described in the following blog post. And if you can read Japanese, excellent posts are also available. What I want to do in this blog post is that …

Convert rdkit mol object to schrodinger’s mol object #RDKit #Chemoinformatics

I posted a memo about how to read the Maestro file format from RDKit. It means that RDKitters can use the “mae” format from RDKit. ;-) BTW, Schrodinger’s site provides a Python API. I would like to know how to communicate between RDKit and the Schrodinger Python API. I read the API documentation during a lunch break and tested …

Read maestro format file from RDKit

I think RDKitters know that Schrodinger contributes to RDKit. Schrodinger provides many computational tools for drug discovery, not only GUI tools but also a Python API. Many tools can be called from Python and also from RDKit. And vice versa, RDKit can read Maestro files. It is as easy as reading SDFiles. I am …

Run rdkit and deep learning on Google Colab! #RDKit

If you cannot use a GPU on your PC, it is worth knowing that you can use a GPU and/or TPU on Google Colab. Right now you can use Google Colab for free. So, I would like to use RDKit on Google Colab and run deep learning there. Today I tried it. At first …