Try to use CasperJS

CasperJS is a navigation scripting and testing utility for PhantomJS and SlimerJS, written in JavaScript.
As you may know, PhantomJS and SlimerJS are headless browsers.
Some years ago, I used Selenium for web scraping because Selenium has Python bindings and is easy to use.
Today, I tried CasperJS.
Installation is very easy. Just use Homebrew (for Mac users) or npm (PhantomJS needs to be installed first). ;-)
I wrote simple code that searches patents in Google Patents and echoes each link.
First, create a casper object. Then add each following action like ‘casper.then( function() { /* your function */ } );’.
The fill function is useful for form input: when its third argument is true it also submits the form, so the user does not need a separate command to push the button.
The following code accesses Google Patents and searches for patents about JAK3.
Then it echoes the URLs.

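var casper = require( 'casper' ).create();

// Runs inside the page: collect the href of every result link.
function getLinks() {
    var links = [];
    var list = document.querySelectorAll( 'article > a' );

    for ( var i = 0; i < list.length; i++ ) {
        var a = list[i];
        links.push( a.href );
    }
    return links;
}

casper.start().viewport( 1600, 1000 );

// The URL is empty in this listing; put the Google Patents URL here.
casper.thenOpen( '', function() {
    this.echo( this.getTitle() );
    this.capture( 'top.png' );
});

casper.then( function() {
    // The third argument (true) submits the form after filling it.
    this.fill( "form", { q : "JAK3" }, true );
});

// Give the result page time to load, then take a screenshot.
casper.wait( 5000, function() {
    this.capture( 'res.png' );
});

casper.then( function() {
    var links = this.evaluate( getLinks );
    this.echo( links.length + 'patents found' );
    for ( var i = 0; i < links.length; i++ ) {
        this.echo( links[i] );
    }
});

casper.run();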

To run the code, just type casperjs yourscript.js.

 iwatobipen$ casperjs googlepat.js 
Google Patents
10patents found

It works fine, and I got the following screenshot.
CasperJS has more functions for scraping. I’ll read the API documentation as soon as possible.



Use rdkitjs in a local application

Electron is a tool to build cross-platform desktop apps with JavaScript, HTML, and CSS.
Users can develop their own app just like a web app. I think it’s interesting, because the developed application runs in the local environment.

Today I tried Electron. My code (index.js) is almost the same as the quick start.
I installed browserify, rdkitjs, jquery and highcharts using the ‘npm install’ command. (highcharts is not used in the following code.)

To start, first type…

npm init

Then I got package.json.

{
  "name": "electron_test",
  "version": "1.0.0",
  "description": "",
  "main": "index.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "author": "iwatobipen",
  "license": "GPL",
  "dependencies": {
    "highcharts": "^4.2.5",
    "jquery": "^3.0.0",
    "rdkit": "^0.1.1"
  },
  "devDependencies": {
    "electron-prebuilt": "^1.2.2"
  }
}

Then I made index.js (same as the quick start).

const electron = require('electron');
const $ = require( 'jquery' );
const Highcharts = require('highcharts');
// Module to control application life.
const {app} = electron;
// Module to create native browser window.
const {BrowserWindow} = electron;

// Keep a global reference of the window object, if you don't, the window will
// be closed automatically when the JavaScript object is garbage collected.
let win;

function createWindow() {
  // Create the browser window.
  win = new BrowserWindow({width: 800, height: 600, 'node-integration': false});

  // and load the index.html of the app.
  win.loadURL(`file://${__dirname}/index.html`);

  // Open the DevTools.
  win.webContents.openDevTools();

  // Emitted when the window is closed.
  win.on('closed', () => {
    // Dereference the window object, usually you would store windows
    // in an array if your app supports multi windows, this is the time
    // when you should delete the corresponding element.
    win = null;
  });
}

// This method will be called when Electron has finished
// initialization and is ready to create browser windows.
// Some APIs can only be used after this event occurs.
app.on('ready', createWindow);

// Quit when all windows are closed.
app.on('window-all-closed', () => {
  // On OS X it is common for applications and their menu bar
  // to stay active until the user quits explicitly with Cmd + Q
  if (process.platform !== 'darwin') {
    app.quit();
  }
});

app.on('activate', () => {
  // On OS X it's common to re-create a window in the app when the
  // dock icon is clicked and there are no other windows open.
  if (win === null) {
    createWindow();
  }
});

// In this file you can include the rest of your app's specific main process
// code. You can also put them in separate files and require them here.

Next, I wrote index.html.
This code takes a SMILES string as input and draws the molecule as an SVG image.
It’s very simple.
Before writing the code I installed browserify, so the code can load libraries with require( ‘package name’ ).
BTW, I wanted to use jQuery, but the following code couldn’t call jQuery. So I wrote document.getElementById … instead of $( ‘#hogehoge’ ) … ;-(

<!DOCTYPE html>
<html>
  <head>
    <meta charset="UTF-8">
    <title>RDKIT js with electron</title>
    <!-- jQuery is not loaded first run. Why???  -->
    <script> window.$ = window.jQuery = require( 'jquery' );</script>
    <script> var Highcharts = require( 'highcharts' );</script>
    <script> var RDKit = require( 'rdkit' );</script>
    <script>
      var drawmol = function() {
        var smi = document.getElementById('smi').value;
        console.log( smi );
        var mol = RDKit.Molecule.fromSmiles( smi );
        var svg = mol.Drawing2D();
        // Strip the 'svg:' namespace prefix so the markup can be embedded inline.
        svg = svg.split('svg:').join('');
        document.getElementById("drawer").innerHTML = svg;
      };
    </script>
  </head>
  <body>
    <h1>Smiles parser</h1>
    We are using node <script>document.write(process.versions.node)</script>,
    Chrome <script>document.write(process.versions.chrome)</script>,
    and Electron <script>document.write(process.versions.electron)</script>.<br>
    <p>This is simple smiles converter using RDKit js.</p>
    <p>Input smiles to text box and push the button.</p>
    <input id='smi' type='text' ><br>
    <button type='button' onclick='drawmol()' >draw</button><br>

    <div id='drawer'></div>
  </body>
</html>

Next, run Electron.

electron .

Then the app starts. I input a SMILES string, pushed the button, and got the following image.

It seems to work well.
Electron-packager can build a local app for multiple platforms.
For convenience, just type
$ electron-packager . --all
I got the packaged apps, and the code runs in the local environment.
I uploaded the code to my repo.
Maybe the zip archive will run without installing any further JS libraries.

Electron can turn something written like a web app into a desktop application. It’s amazing to me.

Visualize chemical space using RDKit, scikit-learn and Highcharts

I often use principal component analysis (PCA) to visualize chemical space. PCA is useful for describing chemical diversity. I wondered if I could project newly designed molecules onto the chemical space of a reference set.
I think that scikit-learn and RDKit are suitable for that. Recently I have often used seaborn for visualization, but today I used Highcharts, because Highcharts can handle data interactively in a web app.
Flask was used as the web-app framework, and RDKit was used for fingerprint calculation.

My first example is the following. All functions and data were embedded in ‘’.
Structures.sdf is data from DrugBank.
The following code does these steps:
1st- Calculate fingerprints of the reference molecules and the test molecules.
2nd- Do PCA on the reference molecules.
3rd- Project the test molecules onto the chemical space of the reference molecules.
4th- Convert each molecule to SVG text. (It is nice work of RDKit!)
5th- Pass the data (PC1, PC2, SVG) to highcharts.js.
Converting molecules to SVG is important for showing the structures in the tooltip.

from flask import Flask, render_template
app = Flask( __name__ )

from rdkit import Chem
from rdkit.Chem import PandasTools
from rdkit.Chem.Draw import MolDraw2DSVG
from rdkit.Chem.Draw import rdMolDraw2D
from rdkit.Chem import rdDepictor, Descriptors, AllChem, DataStructs
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
import pickle
# structures from DrugBank (first 500 molecules); skip entries RDKit could not parse
drugs = [ mol for mol in Chem.SDMolSupplier( "structures.sdf" ) if mol is not None ][:500]
test = [ mol for mol in Chem.SDMolSupplier( "testset.sdf" ) if mol is not None ]

def calc_fp_arr( mols ):
    fplist = []
    for mol in mols:
        arr = np.zeros( (1,) )
        fp = AllChem.GetMorganFingerprintAsBitVect( mol, 2 )
        DataStructs.ConvertToNumpyArray( fp, arr )
        fplist.append( arr )
    return np.asarray( fplist )

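def getsvgtext( mol ):
    d2d = rdMolDraw2D.MolDraw2DSVG(200,200)
    d2d.DrawMolecule( mol )
    # FinishDrawing closes the SVG document before the text is read
    d2d.FinishDrawing()
    svg = d2d.GetDrawingText()
    # drop the 'svg:' namespace prefix so browsers render the markup inline
    return svg.replace( "svg:", "" )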

drugfparr = calc_fp_arr( drugs )
testfparr = calc_fp_arr( test )

# do PCA on the reference molecules only, then save the fitted model
pca = PCA( n_components=2 )
pca.fit( drugfparr )
with open( 'drugpca.pkl', 'wb' ) as f:
    pickle.dump( pca, f )

drugsX = pca.transform( drugfparr )
data1 = [ { 'x' : drugsX[i][0], 'y':drugsX[i][1], 'svg': getsvgtext( drugs[i] ) } for i in range(len(drugsX)) ]
testX = pca.transform( testfparr )
data2 = [ {  'x': testX[i][0], 'y':testX[i][1], 'svg': getsvgtext( test[i] ) } for i in range(len(testX)) ]

@app.route( '/' )
@app.route( '/chart' )
def chart():
    return render_template( 'chart.html', data1 = data1, data2 = data2 )

if __name__ == '__main__':
    app.debug = True
    app.run()
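
The important point of steps 2 and 3 is that the test molecules are centred with the mean of the reference set and projected onto the axes of the reference set; nothing is re-fitted on the test data. The following is only a minimal sketch of that idea in plain Python, using power iteration instead of scikit-learn. The function names (column_mean, fit_axes, project) and the tiny 3-column data set are assumptions made for illustration; they are not part of the app. Note that the sign of each axis is arbitrary.

```python
import math

def column_mean(rows):
    """Mean of every column (one value per descriptor / fingerprint bit)."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def fit_axes(reference, k=2, iters=200):
    """Return (mean, axes): the reference mean and its top-k principal axes."""
    mean = column_mean(reference)
    centered = [[v - m for v, m in zip(row, mean)] for row in reference]
    dim, n = len(mean), len(centered)
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(dim)]
           for i in range(dim)]
    axes = []
    for c in range(k):
        vec = [1.0 / (i + c + 1) for i in range(dim)]  # fixed start vector
        for _ in range(iters):  # power iteration towards the largest eigenvector
            nxt = [sum(cov[i][j] * vec[j] for j in range(dim)) for i in range(dim)]
            norm = math.sqrt(sum(v * v for v in nxt))
            if norm == 0.0:
                break
            vec = [v / norm for v in nxt]
        lam = sum(vec[i] * cov[i][j] * vec[j] for i in range(dim) for j in range(dim))
        # deflation: remove the variance already explained by this axis
        cov = [[cov[i][j] - lam * vec[i] * vec[j] for j in range(dim)]
               for i in range(dim)]
        axes.append(vec)
    return mean, axes

def project(rows, mean, axes):
    """Centre rows with the reference mean and project them onto the reference axes."""
    return [[sum((v - m) * a for v, m, a in zip(row, mean, axis)) for axis in axes]
            for row in rows]

# made-up reference set: most variance along column 0, then column 1
reference = [[-2, 1, 0], [-1, -1, 0], [0, 0, 0], [1, -1, 0], [2, 1, 0]]
new_points = [[3, 2, 5]]

mean, axes = fit_axes(reference)
coords = project(new_points, mean, axes)
print(coords)
```

The third column of the new point does not affect the result, because the reference set has no variance in that direction. This is exactly what happens when a test molecule has features that the reference space does not describe.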

Next, I wrote the template ‘chart.html’.
It’s important to load jQuery first; when Highcharts was loaded first, the following code did not run.
I embedded the SVG in the tooltip, so useHTML is set to true.
The other options are almost the default settings.
Highcharts can access attributes of each data point like ‘this.point.hogehoge’.
So, I used this.point.svg to get the SVG text from the dataset.

<!DOCTYPE html>
<html>
  <head>
    <title> test </title>
    <script type='text/javascript' src="{{ url_for('static', filename='jquery-2.2.4.min.js') }}"></script>
    <script type='text/javascript' src="{{ url_for( 'static', filename='highcharts/js/highcharts.js' ) }}"></script>
    <script type='text/javascript' src="{{ url_for( 'static', filename='highcharts/js/modules/exporting.js' ) }}"></script>
    <script type='text/javascript'>
    $(function () {
      $('#container').highcharts({
        chart : {
          type : 'scatter',
          zoomType : 'xy'
        },
        title : {
          text : 'chemical space mapping'
        },
        xAxis : {
          title : { text : 'PC1' },
          gridLineWidth : 2
        },
        yAxis : {
          title : { text : 'PC2' }
        },
        tooltip : {
          useHTML : true,
          // show the SVG stored on each point
          formatter : function(){
            return this.point.svg;
          }
        },
        series : [{
          name : 'drugs',
          color : 'rgba( 223, 83, 83, .3 )',
          data : {{ data1|safe }}
        }, {
          name : 'testmol',
          color : 'rgba( 119, 152, 191, .8 )',
          data : {{ data2|safe }}
        }]
      });
    });
    </script>
  </head>
  <body>
    <p> scatter plot </p><br>
    <div id="container" style="width:500px; height:500px;"></div>
  </body>
</html>
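
Each Highcharts point is just an object with x, y and one extra attribute (svg) that the tooltip formatter reads. As a small sketch of that data structure, the points can be built and serialised with the json module, which gives text that is also a valid JavaScript literal. The helper name to_points and the sample values are assumptions for illustration only; the app itself passes the Python lists directly to the template.

```python
import json

def to_points(coords, svgs):
    """Build one Highcharts point per molecule: x, y plus the SVG for the tooltip."""
    return [{"x": float(x), "y": float(y), "svg": svg}
            for (x, y), svg in zip(coords, svgs)]

# made-up PC1/PC2 values and placeholder SVG text
coords = [(0.12, -1.5), (2.0, 0.25)]
svgs = ["<svg><!-- mol 1 --></svg>", "<svg><!-- mol 2 --></svg>"]

points = to_points(coords, svgs)
payload = json.dumps(points)  # a JSON array can be pasted into the series 'data' option
print(payload)
```

Because every point carries its own svg attribute, the formatter only has to return this.point.svg.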


Then run the code.


I got an interactive scatter plot.
Easy to zoom!!
It works fine. I pushed all the code to my GitHub repo.

Interesting web app named ‘ChemTreeMap’ using RDKit.

I like web apps because the user does not need client software to use them.
I often use Cytoscape to visualise molecular networks. The network view is very informative.
Yesterday, I found a cool web app that uses RDKit.
The URL is the following.
The app is an open source application for visualizing molecular networks.
If the user can use Docker, installation is very easy. ;-)
I’m a Mac user, so I started boot2docker first.
Then I ran the following commands.

iwatobipen$ boot2docker start
iwatobipen$ docker pull ajing/chemtreemap

Wait several minutes……
Then run the server.

iwatobipen$ docker run -t -i -p 8000:8000 ajing/chemtreemap /bin/bash
root@6969be03007e:/# cd examples/
root@6969be03007e:/examples# python

Access the server from the Mac.
http://<docker's ip>:8000/dist/#/aff

To get the Docker IP address, just run the command ‘boot2docker ip’.

I got the following interactive view.

It worked fine. The app seems lightweight and colourful.
The app uses RDKit for chemical structure handling, which means it is flexible to develop further.
I wish the app had a structure search function.

Docker is very useful for sharing nice technology with everyone.
I’m looking forward to the major release of the native Docker app for Mac.

SAR visualization with RDKit ver2.

Some days ago I posted SARviz test code using RDKit.
It was a command-line tool. So I think it is easy to apply to SDF files etc., but it is not familiar to chemists,
because most chemists don’t like the black command screen… ;-)
As the next step, I tried to make a very simple web app.
I used Flask as the web framework and JSME as the molecular editor. (Point!! I want to make the app using only open source tools!!!)
The structure of the app folder is the following.
– The chemolib folder stores the library that calculates molecular properties and makes the image.
– The dataprep folder stores the sklearn predictive model in pkl format.
– The static folder stores static files: js and tempfig (figures generated by the sarviz function). => The static and templates folders are needed to run the app (Flask manner).

├── chemolib
│   ├──
│   ├── __pycache__
│   │   ├── __init__.cpython-34.pyc
│   │   └── sarviz.cpython-34.pyc
│   └──
├── dataprep
│   └── svcmodel.pkl
├── static
│   ├── js
│   │   ├── jquery-2.2.2.min.js
│   │   └── jsme
│   │       ├── 0C2FB4F99F888620E6F186FB989A1E5F.cache.js
| ~~~~~~~~~~~~~~cut too long ~~~~~~~~~~~~~~~
│   └── tempfig
│       └── activemol_dummy.png
└── templates
    ├── query.html
    ├── result.html
    ├── template.html
    └── top.html

The following script is the main code of this app.
But it is not so complicated.

# this is test app for SARviz.
# Author iwatobipen
# Licence FREE and please enjoy chemoinformatics

from flask import Flask
from flask import render_template, url_for
from flask_bootstrap import Bootstrap
from flask_wtf import Form
from wtforms import StringField
from wtforms.validators import DataRequired
from chemolib import sarviz
import pickle
from rdkit import Chem

class Smiles( Form ):
    smi = StringField( "- mol to smiles", validators=[ DataRequired() ] )

def create_app():
    app = Flask( __name__ )
    Bootstrap( app )
    return app

app = create_app()

@app.route( "/top/" )
def top():
    return render_template( "top.html" )

@app.route( "/predict/", methods = ( "GET", "POST" ) )
def predict():
    actcls = { -1: "non-active", 1: "active" }
    form = Smiles( csrf_enabled=False )
    if form.validate_on_submit():
        smi = form.smi.data
        try:
            mol = Chem.MolFromSmiles( smi )
        except Exception:
            mol = None
        if mol is None:
            # fall back to benzene when the input cannot be parsed
            mol = Chem.MolFromSmiles( "c1ccccc1" )
        # get molwt, mollogp, tpsa
        molprop = sarviz.molprop_calc( mol )
        # predict active / nonactive as integer and save image.
        res, fname = sarviz.mapperfunc( mol )
        res = int( res[0] )
        return render_template( "result.html", res = actcls[res], fname=fname, molprop = molprop )
    else:
        return render_template( "query.html", form = form, fname="dummy.png" )

if __name__ == "__main__":
    app.run( debug = True )

One of the key modules is the following.
I used uuid to avoid file name clashes.
I added a function to calculate molecular properties.
The properties are shown to the user in table format.

# this is the core function of SAR visualization.

from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem
from rdkit.Chem import Descriptors
from rdkit.Chem.Draw import SimilarityMaps
from sklearn.svm import SVC
from matplotlib import cm
import numpy as np
import pickle
import uuid

modelf = open( "./dataprep/svcmodel.pkl", "rb" )
model = pickle.load( modelf )

def calc_fp_arr( mol ):
    fp = AllChem.GetMorganFingerprintAsBitVect( mol, 2 )
    arr = np.zeros((1,))
    DataStructs.ConvertToNumpyArray( fp, arr )
    return arr

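def getProba( fp, probabilityfunc ):
    # predict_proba returns one row per sample and one column per class;
    # [0][1] is the probability of the second class for the first sample.
    return probabilityfunc( fp )[0][1]
# save the figure under a random, unique name to avoid browser caching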
def mapperfunc( mol ):
    fig, weight = SimilarityMaps.GetSimilarityMapForModel( mol,
                                                            lambda x:getProba( x, model.predict_proba ),
                                                            colorMap=cm.bwr )
    # scikit-learn expects a 2D array with one row per sample
    cls_result = model.predict( calc_fp_arr(mol).reshape( 1, -1 ) )
    fname = uuid.uuid1().hex+".png"
    fig.savefig( "static/tempfig/"+fname, bbox_inches = "tight" )
    return cls_result, fname

def molprop_calc( mol ):
    mw = round( Descriptors.MolWt( mol ), 2 )
    mollogp = round( Descriptors.MolLogP( mol ), 2 )
    tpsa = round( Descriptors.TPSA( mol ), 2  )
    return [ mw, mollogp, tpsa ]
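
The file naming used in mapperfunc can be tried on its own. This is a small sketch with only the standard library; the helper name unique_png_name is an assumption for illustration. Each call returns a new name, so a figure from an earlier prediction is never overwritten or served from the browser cache under the same URL.

```python
import uuid

def unique_png_name():
    """Return a file name made of 32 hex characters plus '.png'; new on every call."""
    return uuid.uuid1().hex + ".png"

first = unique_png_name()
second = unique_png_name()
print(first, second)
```

One consequence of this design is that the tempfig folder keeps growing, so old figures need to be cleaned up from time to time.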

Finally I made the templates. It was a difficult task for me, because I’m not good at HTML… ;-/
The following code shows only result.html. I uploaded all the code to my GitHub repo.
I used Flask-Bootstrap for a good look.

{% extends "template.html" %}

{% block header %}
{% endblock %}

{% block content %}
<div class="container">
<div class="row">
<div class="col-md-5">
      <img src="{{ url_for( 'static', filename='tempfig/'+fname) }}" height="300" width="300" alt="IMAGE"><br></div>
<div class="col-md-3">
<table class="table table-hover">
<tr><th> Param</th><th> Val</th></tr>
<tr><td>MolWt</td><td>{{ molprop[0] }}</td></tr>
<tr><td>MolLogP</td><td>{{ molprop[1] }}</td></tr>
<tr><td>TPSA</td><td>{{ molprop[2] }}</td></tr>
<tr><td>Prediction</td><td>{{ res }}</td></tr>
<tr><td colspan="2"><a href="/predict/"> predict again </a></td></tr>
</table>
</div>
</div>
</div>
{% endblock %}

OK, all done.
Let’s run the app.
From the terminal, type the run command.
Then access localhost:5000/predict.
The user gets the query input page.

I embedded a JavaScript function to get the SMILES from JSME. When the user moves the mouse over the button, the SMILES is fetched automatically.

Then push the button; the prediction starts and the image is generated.

The accuracy of this app depends on the performance of the sklearn model.
I’ll talk about this point at mishima.syk8. ;-)
For readers who are interested in mishima.syk, I recommend checking the following URL.
The event will be held on 28 Mar. 2016.

My code was uploaded to my repo.

Admin portal for MongoDB

I use MongoDB to build an MMP-DB.
MongoDB is a flexible NoSQL database program.
I also use PostgreSQL. PostgreSQL has pgAdmin, which provides an admin portal for administrators.
It’s useful for me.
I wondered if I could use an admin portal for MongoDB too.
I searched the web, and I found one.
mongo-express is just the kind of app I wanted.
mongo-express can be installed from npm.
So it’s easy to use.
First, install it from npm using the following command.

iwatobipen$ npm install mongo-express

Then copy config.default.js to config.js
and run app.js.
(Before running the script, I started the mongo server.)

iwatobipen$ mongod --dbpath=./data/db/
iwatobipen$ cp python3env/node_modules/mongo-express/config.default.js python3env/node_modules/mongo-express/config.js
iwatobipen$ cd python3env/node_modules/mongo-express/ && node app.js
Mongo Express server listening on port 8081 at
Database connected!

Now I accessed localhost:8081 and got the admin portal view.
The user can download data in JSON format.
Works fine♫ ;-)

Draw Matched Molecular Pairs as SVG.

Some days ago, I posted about drawing molecules as SVG using RDKit.
It works fine.
So I tried to draw MMPs as SVG.
My plan is …
1. Generate MMPs using RDKit.
2. Store the MMP data in MongoDB.
3. Provide the MMP data to the user using Flask.
4. Draw the structures on the fly using Ajax.
OK, let’s start!
Step 1 is skipped. If readers want to know how to generate MMPs using RDKit, I recommend checking
Step 2: use MongoDB.
I used MongoDB through pymongo. (I installed it with the command ‘conda install pymongo’.)
Data is stored in MongoDB in a JSON-like format.
Like this..
> db.molcollection.findOne()
{
    "_id" : ObjectId("55e84d6b4058aa5e3eade147"),
    "idl" : "2963575",
    "transform" : "[*:1]c1cccc([*:2])c1>>[*:1]c1ccc([*:2])cc1",
    "moll" : "Cc1cccn2cc(-c3cccc(S(=O)(=O)N4CCCCC4)c3)nc12",
    "context" : "[*:1]c1cn2cccc(C)c2n1.[*:2]S(=O)(=O)N1CCCCC1",
    "pairid" : "2963575>>1156028",
    "molr" : "Cc1cccn2cc(-c3ccc(S(=O)(=O)N4CCCCC4)cc3)nc12",
    "idr" : "1156028"
}
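
The document layout above can be written down as a plain Python dict before it goes into the database. This is only a sketch of the record structure; the helper name make_mmp_record is hypothetical, and the database insert itself is left out so that the example needs no MongoDB server. The one derived field is pairid, which joins the two molecule IDs with ">>".

```python
def make_mmp_record(idl, moll, idr, molr, transform, context):
    """Build one matched-pair document with the same keys as the record shown above."""
    return {
        "idl": idl,
        "moll": moll,
        "idr": idr,
        "molr": molr,
        "transform": transform,
        "context": context,
        "pairid": idl + ">>" + idr,  # left ID, then right ID
    }

record = make_mmp_record(
    idl="2963575",
    moll="Cc1cccn2cc(-c3cccc(S(=O)(=O)N4CCCCC4)c3)nc12",
    idr="1156028",
    molr="Cc1cccn2cc(-c3ccc(S(=O)(=O)N4CCCCC4)cc3)nc12",
    transform="[*:1]c1cccc([*:2])c1>>[*:1]c1ccc([*:2])cc1",
    context="[*:1]c1cn2cccc(C)c2n1.[*:2]S(=O)(=O)N1CCCCC1",
)
print(record["pairid"])
```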
OK, let’s make the app.
I used some JavaScript libraries for convenience.
jQuery and FooTable are very cool and useful. The Flask script is very simple.
The root page provides the top page, and the drawmol function returns the MMP as SVG.
I used a simple trick to draw the molecules: the transform data is used to highlight the part of the structure that is transformed.

The script is the following code.

from flask import Flask, render_template, request, url_for, jsonify
from flask_bootstrap import Bootstrap
from moduleset import chemistry
import pandas as pd
import numpy as np
import json
import pymongo
from pymongo import MongoClient
import urllib

db = MongoClient().test
collection = db.molcollection
app = Flask( __name__ )
dataset = collection.find()
dataset = [ data for data in dataset]

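# root page: table of all pairs
@app.route( "/" )
def top():
    return render_template("top.html", data=dataset)

# return both molecules of a pair as SVG; called by Ajax from top.html
@app.route( "/_draw/<pairid>" )
def drawmol(pairid):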
    pairid = urllib.parse.unquote(pairid)

    res = collection.find_one({ "pairid" : pairid })
    moll = res['moll']
    molr = res['molr']
    trans = res['transform']
    core = res["context"]
    trans = trans.replace("[*:1]","[*]")
    trans = trans.replace("[*:2]","[*]")
    trans = trans.replace("[*:3]","[*]")

    transl, transr = trans.split(">>")

    svgl = chemistry.depictmol( moll, transl )
    svgr = chemistry.depictmol( molr, transr )

    return jsonify( ml=svgl, mr=svgr )

if __name__ == '__main__':
    app.run( debug = True )
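
The string handling inside drawmol can be checked without RDKit or Flask. The sketch below does the same thing as the three replace calls, but with one regular expression, so it also covers attachment points with higher numbers. The names ATTACHMENT and split_transform are assumptions for illustration.

```python
import re

# numbered attachment points such as [*:1], [*:2], ...
ATTACHMENT = re.compile(r"\[\*:\d+\]")

def split_transform(transform):
    """Split 'left>>right' and turn numbered attachment points into plain [*]."""
    left, right = transform.split(">>")
    return ATTACHMENT.sub("[*]", left), ATTACHMENT.sub("[*]", right)

left, right = split_transform("[*:1]c1cccc([*:2])c1>>[*:1]c1ccc([*:2])cc1")
print(left)   # fragment to highlight in the left molecule
print(right)  # fragment to highlight in the right molecule
```

Each half is then usable as a substructure query for the corresponding molecule of the pair.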

I embedded the JavaScript directly in the top.html file.
FooTable is very easy to use and it generates a cool table view!
I used jQuery to show the molecules when the user moves the mouse over the pair ID.

<html>
<head>
	<title>test web page</title>
	<link href={{ url_for("static", filename="js/FooTable-2/css/footable.core.min.css")}} rel="stylesheet" type="text/css" />
	<link href={{ url_for("static", filename="js/FooTable-2/css/footable.standalone.min.css")}} rel="stylesheet" type="text/css" />
	<script type="text/javascript" src={{url_for("static", filename="js/jquery-2.1.4.min.js")}}></script>
	<script type="text/javascript" src={{url_for("static", filename="js/FooTable-2/js/footable.js")}}></script>
	<script type="text/javascript" src={{url_for("static", filename="js/FooTable-2/js/footable.sort.js")}}></script>
	<script type="text/javascript" src={{url_for("static", filename="js/FooTable-2/js/footable.paginate.js")}}></script>
	<script type="text/javascript">
	$(function () {
		// fetch and show both molecules when the mouse enters a pair ID cell
		$('.pair_id').mouseover(function () {
			var url = "/_draw/"+this.textContent;
			$.ajax({
				url: url,
				contentType:"application/json; charset=utf-8",
				success: function(json){
					$('#moll').html(json.ml);
					$('#molr').html(json.mr);
				}
			});
		});
	});
	</script>
</head>
<body>
<div style="float:left;width:400px">
	<table class="footable" data-page-size="5">
	<thead>
	<tr><th data-type="numeric">IDL</th><th data-type="numeric">IDR</th><th>PairID</th></tr>
	</thead>
	<tbody>
	{% for row in data %}
	<tr><td>{{row['idl']}}</td><td>{{row['idr']}}</td><td class='pair_id'>{{row['pairid']}}</td></tr>
	{% endfor %}
	</tbody>
	<tfoot>
		<tr>
			<td colspan="5">
				<div class="pagination pagination-centered"></div>
			</td>
		</tr>
	</tfoot>
	</table>
</div>
<div style="float:left;" >
<table name="pair_table" border=1>
<tr style="height:200px"><td style="width:300px;" id="moll">MOL</td><td style="width:300px;" id="molr">MOR</td></tr>
</table>
</div>
<script type="text/javascript">
	$(function () {
		$('.footable').footable();
	});
</script>
</body>
</html>
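
The pair ID contains ">>", which is percent-encoded when it travels inside a URL. That is why drawmol calls urllib.parse.unquote before the database lookup. A small round trip with the standard library shows what happens to the ID; the variable names are only for illustration.

```python
from urllib.parse import quote, unquote

pairid = "2963575>>1156028"

# what the ID looks like inside the request path
encoded = quote(pairid, safe="")
url = "/_draw/" + encoded

# what the server gets back after decoding
decoded = unquote(encoded)
print(url)
print(decoded)
```

Decoding a string that has already been decoded leaves it unchanged here, because the ID contains no percent sign.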

Finally, I wrote a script to draw the molecules, like the following.

from rdkit import Chem
from rdkit.Chem import rdDepictor, AllChem
from rdkit.Chem.Draw import rdMolDraw2D

def depictmol( smiles, core ):
    mol = Chem.MolFromSmiles( smiles )
    # atoms that match the transformed fragment will be highlighted
    atoms = mol.GetSubstructMatch( Chem.MolFromSmarts( core ) )
    try:
        Chem.Kekulize( mol )
    except Exception:
        # keep the aromatic form if kekulization fails
        pass
    rdDepictor.Compute2DCoords( mol )
    drawer = rdMolDraw2D.MolDraw2DSVG( 300, 200 )
    drawer.DrawMolecule( mol, highlightAtoms=atoms )
    # FinishDrawing closes the SVG document before the text is read
    drawer.FinishDrawing()
    svg = drawer.GetDrawingText().replace( 'svg:', '' )
    return svg

Run the script from the terminal.
It works fine!

I uploaded the code snippet to my repo.

I want to align the cores of the paired molecules, but I can not do it yet.