Visualising a dataset using seaborn

I often use scatter plots to analyse the relationship between two variables.
There are lots of tools for visualising datasets.
This time, I tried seaborn.
Seaborn is a Python library based on matplotlib that makes it easy to create cool visualisations.

In the following snippet, I read sample data from the RDKit install folder and plot MolWt vs LogP using jointplot.

Jointplot shows not only the relationship between two variables but also the distribution of each one.
To draw one, you only need to type ‘sns.jointplot’! 😉
This library is also compatible with Pandas.
Seaborn seems to be a young library (the most recent version is 0.6), but I think it is very useful and cool.
Here is a sample. This is a jointplot using a hexbin plot.
[Image: screenshot of the hexbin jointplot]

I pushed the snippet to GitHub.

# this is not a plain .py file
# code exported from an ipynb notebook
# using matplotlib inline mode

get_ipython().magic(u'matplotlib inline')
from rdkit import Chem
from rdkit.Chem import PandasTools
from rdkit.Chem import Descriptors
from rdkit import rdBase
from rdkit import RDConfig
import seaborn as sns
import os
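# sample SDF file that ships with the RDKit docs
datapath = os.path.join(RDConfig.RDDocsDir,"Book/data/cdk2.sdf")
# load the molecules into a DataFrame (molecule objects are in the ROMol column)
moldf = PandasTools.LoadSDF(datapath)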
def molwt(mol):
    mw = Descriptors.MolWt(mol)
    return mw
def logp(mol):
    logp = Descriptors.MolLogP(mol)
    return logp
moldf['MW'] = moldf.ROMol.apply(molwt)
moldf['LOGP'] = moldf.ROMol.apply(logp)
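from scipy.stats import kendalltau
x = moldf.MW
y = moldf.LOGP
# hexbin plot of the joint distribution
sns.jointplot(x,y, kind='hex', stat_func=kendalltau)
# kernel density estimate of the joint distribution
sns.jointplot(x,y, kind='kde', stat_func=kendalltau)
# residuals of a linear regression of y on x
sns.jointplot(x,y, kind='resid', stat_func=kendalltau)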
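
The stat_func=kendalltau argument above asks jointplot to annotate the plot with Kendall's tau, a rank correlation. To show what that number means, here is a minimal pure-Python sketch of the simpler tau-a variant. The function name kendall_tau_a is my own for this illustration. scipy's kendalltau computes tau-b by default, which gives the same value when there are no ties. This may also be handy on newer seaborn releases, where jointplot no longer accepts stat_func.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / number of pairs.

    Tied pairs count as neither concordant nor discordant.
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1  # both values move in the same direction
        elif sign < 0:
            discordant += 1  # values move in opposite directions
    return (concordant - discordant) / (n * (n - 1) / 2)


# Toy values, not taken from the cdk2 data.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
```

A value near 1 means the two descriptors rank the molecules in almost the same order, a value near -1 means the opposite order, and a value near 0 means no monotonic trend.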
