Make QSAR models and predict activity using pandas_ml and RDKit

At mishima.syk, y_sama introduced us to pandas_ml. I know I’m late to this, but the module interested me, so I used pandas_ml for QSAR. pandas_ml is a Python library that integrates pandas, scikit-learn, xgboost, and seaborn. To use it, I first installed the xgboost Python binding and then installed pandas_ml with pip. I tried to build… Continue reading “Make QSAR models and predict activity using pandas_ml and RDKit”

Try to use casperjs

CasperJS is a navigation scripting and testing utility for PhantomJS and SlimerJS, written in JavaScript. PhantomJS and SlimerJS are headless browsers. Some years ago, I used Selenium for web scraping because it has a Python binding and is easy to use. Today, I gave CasperJS a try. Installation is very easy: just use Homebrew (for Mac… Continue reading “Try to use casperjs”

New index for compound prioritization

Some days ago, I found a report about compound prioritization in single-concentration screening. In the early stage of a drug discovery project, we need to screen many compounds. Single-concentration assays are often used because of their throughput. In this case, percent inhibition or a related readout is used. Medicinal chemists need to prioritize compounds using these datasets. In… Continue reading “New index for compound prioritization”

RandomForest Classification on Hadoop.

Belatedly, I’ve become interested in Hadoop. I felt it was difficult for me to handle Hadoop (I’m not good at data science…), but some days ago I found a very attractive library named ‘Hivemall’. The following description is from its GitHub page. Hivemall is a scalable machine learning library that runs on Apache Hive. Hivemall is designed to… Continue reading “RandomForest Classification on Hadoop.”