As you know, Jupyter Notebook is a very useful tool for data scientists. It lets you analyze scientific data with a nice view, and there are lots of packages for data visualization. I often use matplotlib and seaborn for my tasks. However, a few days ago I found an interesting package named Panel, which is a high-level app and dashboarding library for Python. I posted about another package, Dash, before, but I had never used Panel. So I tried using Panel with RDKit.
Panel can be installed via conda from the pyviz channel, or via pip. I installed Panel using conda.
After installing Panel, I tested it.
First, I tried to visualize RDKit mol objects. Import the packages and load the molecules.
import numpy as np
import pandas as pd
import os
import panel as pn
from rdkit import Chem
from rdkit import RDPaths
from rdkit.Chem.Draw import IPythonConsole
from rdkit.Chem import Draw
from rdkit.Chem import AllChem
import matplotlib.pyplot as plt
plt.style.use('ggplot')

sdf = os.path.join(RDPaths.RDDocsDir, 'Book/data/cdk2.sdf')
mols = [m for m in Chem.SDMolSupplier(sdf)]
for m in mols:
    AllChem.Compute2DCoords(m)
Then I made an IntSlider to set the index of the molecule to render. The following example uses the depends decorator to make an interactive view.
pn.extension()
# The slider value is used as a list index, so the last valid value is len(mols) - 1
slider = pn.widgets.IntSlider(start=0, end=len(mols) - 1, step=1, value=0)

@pn.depends(slider.param.value)
def callback(value):
    return Draw.MolToImage(mols[value])

row = pn.Column(slider, callback)
row
Now the molecule is rendered together with the slider widget.
It seemed to work well, so next I added a range slider to draw several molecules. The code is almost the same as above.
pn.extension()
rangeslider = pn.widgets.IntRangeSlider(start=0, end=len(mols), step=1)

@pn.depends(rangeslider.param.value)
def callback(value):
    # value is a (start, end) tuple, so unpack it before slicing the list
    start, end = value
    return Draw.MolsToGridImage(mols[start:end], molsPerRow=5)

pn.Column(rangeslider, callback)
IntRangeSlider returns a tuple of the user-selected (start, end) values, so I could select the range of molecules to render in the notebook.
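Because the range slider hands the callback a (start, end) tuple rather than a single integer, the tuple has to be unpacked before it can slice a list. Here is a minimal sketch of just that step, using a plain list of strings as a stand-in for the molecules. The helper name `select_range` is hypothetical and not part of Panel or RDKit.

```python
def select_range(items, value):
    """Return the part of items selected by a (start, end) tuple."""
    start, end = value  # the range slider value arrives as a 2-tuple
    return items[start:end]


# Stand-in data: strings instead of RDKit mol objects
names = ["mol0", "mol1", "mol2", "mol3", "mol4"]
selected = select_range(names, (1, 3))
print(selected)
```

When start equals end the result is an empty list, so a callback may want to handle that case before drawing a grid image.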
In the next example, I tried to make an interactive scatter plot of molecular properties.
To do so, I calculated molecular descriptors with the rdkit Descriptors module.
from rdkit.Chem import Descriptors
from collections import defaultdict

dlist = Descriptors._descList
desc_dec = defaultdict(list)
for mol in mols:
    for k, f in dlist:
        desc_dec[k].append(f(mol))
df = pd.DataFrame(desc_dec)
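Descriptors._descList is a list of (name, function) pairs, so the nested loop collects one list of values per descriptor name. The same pattern can be sketched without RDKit. The two descriptor functions below are made-up stand-ins that work on plain strings, purely for illustration, and `build_columns` is a hypothetical helper name.

```python
from collections import defaultdict


# Hypothetical stand-ins for RDKit descriptor functions
def length(s):
    return len(s)


def carbon_count(s):
    return s.count("C")


# Same shape as Descriptors._descList: a list of (name, function) pairs
desc_list = [("Length", length), ("CarbonCount", carbon_count)]


def build_columns(items, desc_list):
    """Apply every descriptor to every item, collecting one list per name."""
    columns = defaultdict(list)
    for item in items:
        for name, func in desc_list:
            columns[name].append(func(item))
    return dict(columns)


table = build_columns(["CCO", "c1ccccc1", "CC(=O)O"], desc_list)
print(table)
```

Passing a dict like this to pd.DataFrame gives one row per item and one column per descriptor.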
The following example uses Select widgets to choose the x and y axes and a FloatSlider to set the alpha of the scatter plot.
from matplotlib.figure import Figure
# FigureCanvasBase is defined in matplotlib.backend_bases
from matplotlib.backend_bases import FigureCanvasBase

columns = df.columns.to_list()
x = pn.widgets.Select(options=columns, name='x_', value='MolWt')
y = pn.widgets.Select(options=columns, name='y_', value='qed')
alpha = pn.widgets.FloatSlider(name='alpha', value=0.5)

@pn.depends(x.param.value, y.param.value, alpha.param.value)
def get_plot(x_v, y_v, a):
    with plt.xkcd():
        fig = Figure()
        ax = fig.subplots()
        FigureCanvasBase(fig)
        ax.set_xlabel(x_v)
        ax.set_ylabel(y_v)
        ax.scatter(df[x_v], df[y_v], c='blue', alpha=a)
        #fig = df.plot.scatter(x_v, x_v)
    return fig

pn.Column(pn.Row(x, y, alpha), get_plot)
I updated matplotlib to version 3.1.3. matplotlib can draw plots in xkcd style. ;) It is easy to do: just create the plot inside a with plt.xkcd(): block.
Now the x axis, y axis and alpha of the scatter plot can be selected interactively. I would like to know how to get the index of each point and its values. If I could do that, I could render the molecule when I mouse over a point.
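One part of that idea can be sketched independently of Panel: given a cursor position in data coordinates, find the index of the nearest point. This is only an assumption about how the lookup could work, not something wired up to a plot. The function name `nearest_index` and the sample values are hypothetical, and how to obtain the cursor position depends on the plotting backend.

```python
def nearest_index(xs, ys, x, y):
    """Return the index of the point closest to (x, y).

    Each axis is scaled by its own range so that a descriptor with large
    values (such as molecular weight) does not dominate the distance.
    Assumes xs and ys are non-empty and have the same length.
    """
    x_span = (max(xs) - min(xs)) or 1.0  # avoid dividing by zero
    y_span = (max(ys) - min(ys)) or 1.0
    best, best_d = None, float("inf")
    for i, (px, py) in enumerate(zip(xs, ys)):
        d = ((px - x) / x_span) ** 2 + ((py - y) / y_span) ** 2
        if d < best_d:
            best, best_d = i, d
    return best


# Made-up sample values standing in for two descriptor columns
mol_wt = [180.2, 250.3, 410.5]
qed = [0.55, 0.80, 0.35]
idx = nearest_index(mol_wt, qed, 260.0, 0.75)
print(idx)
```

The returned index could then be used as mols[idx] to draw the matching molecule.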
I uploaded today's code to my repo.
The interactive plots can't be viewed on the GitHub repo or nbviewer, so if you are interested in the package, I recommend trying it on your own PC. ;)