Tautomer canonicalisation in RDKit

I enjoyed the CompChem soft user meeting yesterday and today. The presentations were very impressive.
I had some nice discussions too… maybe.
I talked about RDKit with a computational chemist from another company.
He asked me how I handle tautomers in RDKit.
I didn't have a clear answer at the time, but I have since found a nice solution.
The 'MolVS' library can canonicalise tautomers using RDKit, and it's easy to use. 😉
MolVS is available on PyPI (https://pypi.python.org/pypi/MolVS), so you can install it with pip.
OK, let's write some simple code.
First, install molvs.

$ pip install molvs

Next, I wrote a simple example.

# coding in IPython notebook 😉
from rdkit import Chem
from rdkit.Chem import Draw
from rdkit.Chem.Draw import IPythonConsole
# load molvs library
from molvs import tautomer
from molvs import standardize
from molvs import standardize_smiles

I chose 2-hydroxypyridine as an example. t1 is its 2-pyridone tautomer and t2 is the hydroxy form.

t1 = Chem.MolFromSmiles('C1=CC=CNC(=O)1')
t2 = Chem.MolFromSmiles('c1ccc(O)nc1')

The enum instance (a tautomer.TautomerEnumerator) generates tautomers from a single molecule.
enum.enumerate(mol) returns a list of molecules, one for each tautomer that MolVS finds with its transform rules.

enum = tautomer.TautomerEnumerator()
ts2 = enum.enumerate(t2)

The canon instance (a tautomer.TautomerCanonicalizer) returns the canonical tautomer of a molecule.
canon.canonicalize(t1) and canon.canonicalize(t2) return the same structure.

canon = tautomer.TautomerCanonicalizer()


MolVS is easy to use.
It also offers more functions. More details are described on the following site.

I uploaded the code to my GitHub repo.

