I enjoyed the CompChem software user meeting yesterday and today. The presentations were very impressive.
I had some nice discussions... maybe.
I talked to a computational chemist from another company about RDKit.
He asked me how I handle tautomers in RDKit.
I didn't have a clear answer at the time, but I have since found a nice solution.
The MolVS library can canonicalize tautomers using RDKit, and it's easy to use. ;-)
MolVS is available on PyPI: https://pypi.python.org/pypi/MolVS
So you can install it using pip. OK, let's write some simple code.
First, install MolVS.
$ pip install molvs
Next, I wrote a simple example.
# coding in IPython notebook ;-)
from rdkit import Chem
from rdkit.Chem import Draw
from rdkit.Chem.Draw import IPythonConsole
# load molvs library
from molvs import tautomer
from molvs import standardize
from molvs import standardize_smiles
I selected 2-hydroxypyridine as an example, written once as its 2-pyridone tautomer (t1) and once as the hydroxy form (t2).
t1 = Chem.MolFromSmiles('C1=CC=CNC(=O)1')
t2 = Chem.MolFromSmiles('c1ccc(O)nc1')
The enum instance (a TautomerEnumerator) generates tautomers from a single molecule.
enum.enumerate(mol) returns a list of molecules containing all the tautomers MolVS can generate.
enum = tautomer.TautomerEnumerator()
ts = enum.enumerate(t1)
ts2 = enum.enumerate(t2)
The canon instance (a TautomerCanonicalizer) returns the canonical tautomer of a molecule.
canon.canonicalize(t1) and canon.canonicalize(t2) return the same structure.
canon = tautomer.TautomerCanonicalizer()
canon.canonicalize(t1)
canon.canonicalize(t2)
MolVS is easy to use.
It also offers more functions than I showed here. More details are described on the following site.
http://molvs.readthedocs.org/en/latest/
I uploaded the code to my GitHub repo.
repo