Convert function in RDKit

SDF files become large when they hold many molecules, so I often convert SDF to SMILES.
I usually use the MolToSmiles function for that, but newer versions of RDKit have a convenient function for converting the file format.
Here is a sample snippet.

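from rdkit import Chem
from rdkit.Chem.ChemUtils import SDFToCSV

# read the molecules from the SDF file
suppl = Chem.SDMolSupplier('cdk2.sdf')
# convert SDF to CSV (a SMILES column plus the SD properties);
# the with block closes out.csv so that everything is flushed to disk
with open('out.csv', 'w', newline='') as f:
    SDFToCSV.Convert(suppl, f)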
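
For comparison, here is a rough sketch of the manual MolToSmiles approach I mentioned above. The function names `sdf_to_smiles_rows` and `write_smiles_csv` are my own, not part of RDKit, and I assume the molecule name is stored in the `_Name` property (the title line of each SDF record). It writes only the SMILES and the name, not the other SD properties.

```python
import csv

from rdkit import Chem


def sdf_to_smiles_rows(mols, name_prop="_Name"):
    """Yield (smiles, name) for every molecule that RDKit could parse.

    `mols` is any iterable of RDKit molecules, e.g. a Chem.SDMolSupplier.
    """
    for mol in mols:
        if mol is None:  # suppliers yield None for records they cannot parse
            continue
        name = mol.GetProp(name_prop) if mol.HasProp(name_prop) else ""
        yield Chem.MolToSmiles(mol), name


def write_smiles_csv(mols, out_file):
    """Write a header row and one 'SMILES,Name' row per molecule to out_file."""
    writer = csv.writer(out_file)
    writer.writerow(["SMILES", "Name"])
    for row in sdf_to_smiles_rows(mols):
        writer.writerow(row)
```

To use it on the same input, open an output file and call `write_smiles_csv(Chem.SDMolSupplier('cdk2.sdf'), f)`.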

Now I have the out.csv file. Let's check it.

iwatobipen$ head -n 10 out.csv
CC(C)C(=O)COc1nc(N)nc2[nH]cnc12,ZINC03814457,1,CORINA 3.44 0027  09.01.2008,1,-78.6454,0.000213629,1
Nc1nc(OCC2CCCO2)c2nc[nH]c2n1,ZINC03814459,2,CORINA 3.44 0027  09.01.2008,1,-67.4705,9.48919e-05,1
Nc1nc(OCC2CCC(=O)N2)c2nc[nH]c2n1,ZINC03814460,2,CORINA 3.44 0027  09.01.2008,1,-89.4303,5.17485e-05,1
Nc1nc(OCC2CCCCC2)c2nc[nH]c2n1,ZINC00023543,3,CORINA 3.44 0027  09.01.2008,1,-70.2463,6.35949e-05,1
Nc1nc(OCC2CC=CCC2)c2nc[nH]c2n1,ZINC03814458,3,CORINA 3.44 0027  09.01.2008,1,-72.9091,6.51479e-05,1
Cn1cnc2c(NCc3ccccc3)nc(NCCO)nc21,ZINC01641925,3,CORINA 3.44 0027  09.01.2008,1,-42.2404,0.000120409,1
CCC(CO)Nc1nc(NCc2ccccc2)c2ncn(C(C)C)c2n1,ZINC01649340,3,CORINA 3.44 0027  09.01.2008,1,-33.4734,7.14544e-05,1
COc1ccc(CNc2nc(N(CCO)CCO)nc3c2ncn3C(C)C)cc1,ZINC01487345,3,CORINA 3.44 0027  09.01.2008,1,-23.1357,8.18592e-05,1
Nc1nc(N)c(N=O)c(OCC2CCCCC2)n1,ZINC03814479,4,CORINA 3.44 0027  09.01.2008,1,-112.542,8.83166e-05,1
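
If you want to pick the SMILES and the IDs back out of the CSV, the standard csv module is enough. This is a minimal sketch: `read_smiles_and_ids` and its `has_header` flag are names I made up, and I assume the first column is the SMILES and the second is the ID, as in the listing above.

```python
import csv
import io


def read_smiles_and_ids(lines, has_header=False):
    """Return (smiles, identifier) pairs from CSV rows laid out like the listing above.

    Assumes column 0 holds the SMILES and column 1 the identifier.
    """
    reader = csv.reader(lines)
    if has_header:
        next(reader, None)  # discard the row of column names
    return [(row[0], row[1]) for row in reader if len(row) >= 2]


# two rows copied from the listing above
sample = io.StringIO(
    "CC(C)C(=O)COc1nc(N)nc2[nH]cnc12,ZINC03814457,1,CORINA 3.44 0027  09.01.2008,1,-78.6454,0.000213629,1\n"
    "Nc1nc(OCC2CCCO2)c2nc[nH]c2n1,ZINC03814459,2,CORINA 3.44 0027  09.01.2008,1,-67.4705,9.48919e-05,1\n"
)
pairs = read_smiles_and_ids(sample)
print(pairs)
```

For the real file, pass the open file object instead of the sample, and set `has_header=True` if its first row holds column names.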