Plot Chemical space with d3js based library #RDKit #Chemoinformatics

I recently posted about making an interactive plot in a Jupyter notebook.
https://iwatobipen.wordpress.com/2018/12/09/make-interactive-chemical-space-plot-in-jupyter-notebook-cheminformatics-altair/
I used Altair for that. Today I used a package based on d3.js and matplotlib to make a scatter plot.
mpld3 is another tool for making interactive plots with Python.
https://mpld3.github.io/index.html

The original site provides many examples, and it seems easy to make many kinds of plots with tooltips. ;) So I wanted to check whether the module can show an SVG image as a tooltip.
According to the documentation on the original site, the mpld3 project is compatible with Python 2.6-2.7 and 3.3-3.4. But I wanted to run my code on Python 3.6, so I installed mpld3 into a Python 3.6 environment.
The current version of mpld3 raised an error there. Following an answer on Stack Overflow, I installed a bug-fixed version from a GitHub repo.


python -m pip install --user "git+https://github.com/javadba/mpld3@display_fix"

Now everything is ready. Let’s write some code! To run the code in a Jupyter notebook, calling mpld3.enable_notebook() is recommended.

%matplotlib inline
import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import mpld3
from rdkit import Chem
from rdkit.Chem import RDConfig
from rdkit.Chem import rdFingerprintGenerator
from rdkit.Chem import DataStructs
from sklearn.decomposition import PCA
from rdkit.Chem.Draw import IPythonConsole
from rdkit.Chem import Draw
from rdkit.Chem import AllChem
from rdkit.Chem.Draw import rdDepictor
from rdkit.Chem.Draw import rdMolDraw2D
from mpld3 import plugins
mpld3.enable_notebook()

Then perform PCA on the Morgan fingerprints. This is not so difficult. I also made a moltosvg function for the tooltips, borrowing the function from an RDKit blog post. I used the SVG strings as tooltips.

def fp2arr(fp):
    arr = np.zeros((1,))
    DataStructs.ConvertToNumpyArray(fp,arr)
    return arr

# Original code is described in the rdkit blog post.
# http://rdkit.blogspot.com/2015/02/new-drawing-code.html

def moltosvg(mol,molSize=(450,150),kekulize=True):
    mc = Chem.Mol(mol.ToBinary())
    if kekulize:
        try:
            Chem.Kekulize(mc)
        except Exception:  # kekulization failed; fall back to a fresh, unkekulized copy
            mc = Chem.Mol(mol.ToBinary())
    if not mc.GetNumConformers():
        rdDepictor.Compute2DCoords(mc)
    drawer = rdMolDraw2D.MolDraw2DSVG(molSize[0],molSize[1])
    drawer.DrawMolecule(mc)
    drawer.FinishDrawing()
    svg = drawer.GetDrawingText()
    return svg.replace('svg:','')

fpgen = rdFingerprintGenerator.GetMorganGenerator(2)
mols = [m for m in Chem.SDMolSupplier(os.path.join(RDConfig.RDDocsDir,'Book/data/cdk2.sdf'))]
for m in mols:
    AllChem.Compute2DCoords(m)
fps = [fpgen.GetFingerprint(m) for m in mols]
X = np.asarray([fp2arr(fp) for fp in fps])

pca = PCA(n_components=3)
res = pca.fit_transform(X)
# make set of SVG
svgs = [moltosvg(m) for m in mols]
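
For readers curious about what the PCA step does to the fingerprint matrix, here is a minimal NumPy sketch of the same idea, using SVD on the mean-centered matrix. The function name `pca_svd` and the random toy matrix are illustrative assumptions and are not part of the original notebook; scikit-learn's `PCA` may return axes with flipped signs compared with this sketch.

```python
import numpy as np

def pca_svd(X, n_components=2):
    """Project the rows of X onto the top principal components via SVD.

    Returns (scores, explained_variance_ratio).
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)  # center each column (one column per fingerprint bit)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]  # coordinates of each row
    var = S ** 2  # proportional to the variance along each component
    ratio = var[:n_components] / var.sum()
    return scores, ratio

# Hypothetical toy data standing in for the fingerprint bit vectors
rng = np.random.default_rng(0)
toy_X = rng.integers(0, 2, size=(20, 64))
toy_scores, toy_ratio = pca_svd(toy_X, n_components=3)
print(toy_scores.shape, toy_ratio)
```

The ratio shows how much of the total variance each plotted axis keeps, which is worth checking before reading too much into a 2D chemical space plot.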

Finally, make the scatter plot. This can be done in almost the same way as with plain matplotlib. With mpld3, users do not need to write JavaScript to make the tooltips. Of course, you can write your own JavaScript to implement more complex features.

fig, ax = plt.subplots()
ax.set_xlabel('PCA1')
ax.set_ylabel('PCA2')
ax.set_title('Viz chemical space!')
points = ax.scatter(res[:,0], res[:,1])
# This is key point for making tooltip!
tooltip = plugins.PointHTMLTooltip(points, svgs)
plugins.connect(fig, tooltip)
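
Because the tooltip labels are plain HTML strings, one per point, the SVG can be wrapped together with other text before it is handed to the plugin. The following is only a sketch: `make_label` is a hypothetical helper, and the inline CSS is an arbitrary choice.

```python
from html import escape

def make_label(svg, title=None):
    """Wrap an SVG string (and an optional title) in a small HTML snippet."""
    parts = ['<div style="background-color:white; border:1px solid #999;">']
    if title is not None:
        # Escape the title so characters like < or & do not break the HTML
        parts.append('<div>{}</div>'.format(escape(str(title))))
    parts.append(svg)
    parts.append('</div>')
    return ''.join(parts)

# Hypothetical stand-in for one of the SVG strings made by moltosvg
demo_svg = '<svg width="10" height="10"></svg>'
demo_labels = [make_label(demo_svg, title='mol <1>')]
print(demo_labels[0])
```

With the variables from this post, it could be used as `labels = [make_label(s, m.GetProp('_Name')) for s, m in zip(svgs, mols)]` and then `plugins.PointHTMLTooltip(points, labels)`, assuming each molecule in the SD file has a `_Name` property.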

After running the code, I got an interactive plot that shows chemical structures as tooltips. ;)
I would like to build a framework for a chemical space visualizer.

Scatter plot

The whole code can be viewed on nbviewer.
https://nbviewer.jupyter.org/github/iwatobipen/playground/blob/master/tooltip_test.ipynb

2 thoughts on “Plot Chemical space with d3js based library #RDKit #Chemoinformatics”

  1. Hi! Thank you so much for your blog posts. I enjoy them a lot and I learn a lot from them.
    I want to do something similar with a scatter plot, so I ran your program in a Jupyter notebook. However, the interactive part did not work, and I received an error (TypeError: Object of type ndarray is not JSON serializable). I played around with it a bit, but did not get your result. Do you happen to know what the problem is?
    Thank you so much in advance and all the best to Japan!

    • Hi Greta,
      Thanks for the comment.
      It’s difficult to answer the question because I can’t check your whole code and error message.
      But the error seems to say that NumPy array data was passed somewhere and could not be serialized to JSON.
      Did you pass the SVG strings as a list?
      In my environment the code worked well. If you can share more details of your code and the error, I’ll check it.
      I would also like to share the following URL.
      https://discuss.mxnet.io/t/make-ndarray-json-serializable/1627
      Thanks,
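
As general background on this error, and not a confirmed diagnosis of the problem above: Python's `json` module cannot encode NumPy arrays or most NumPy scalar types, so this TypeError appears whenever one of them reaches `json.dumps`. A minimal sketch of the usual workaround is a custom encoder. `NumpyEncoder` and `payload` are hypothetical names, and whether mpld3 needs anything like this depends on the installed version.

```python
import json
import numpy as np

class NumpyEncoder(json.JSONEncoder):
    """JSON encoder that converts NumPy arrays and scalars to plain Python objects."""

    def default(self, obj):
        if isinstance(obj, np.ndarray):
            return obj.tolist()  # nested Python lists
        if isinstance(obj, np.generic):
            return obj.item()  # plain Python int, float, bool, ...
        return super().default(obj)

# Hypothetical payload mixing an array and a NumPy integer
payload = {"xy": np.array([[0.5, 1.5], [2.0, 3.0]]), "n": np.int64(2)}
encoded = json.dumps(payload, cls=NumpyEncoder)
print(encoded)
```

Calling `.tolist()` on the arrays before they are passed to the plotting code is another simple option.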
