Tag: jython

Call CDK from Python

A few days ago, I wanted to use molecular signature descriptors, but RDKit can't calculate them.
Reference:
http://pubs.acs.org/doi/abs/10.1021/ci020345w

The descriptor is implemented in CDK.
Hmm, CDK... a Java library, and I'm not good at Java :-/.
I found some old but useful information in CDK news and on Noel's blog!
The URL is:
https://www.redbrick.dcu.ie/~noel/CDKJython.html
Jython is one solution, so I tested it.
I installed Jython using Homebrew and downloaded cdk-1.4.19.jar. 😉
Then I added the cdk-*.jar file to CLASSPATH (cdk-1.5.*.jar did not work for me).
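
Before starting Jython, it can help to confirm that the jar really is on CLASSPATH. The following is only a small sketch: the helper name `find_cdk_jars` is made up for illustration, and it assumes the jar keeps its default `cdk-*.jar` file name.

```python
import fnmatch
import os


def find_cdk_jars(classpath):
    """Return the CLASSPATH entries whose file name looks like cdk-*.jar."""
    # CLASSPATH entries are separated by os.pathsep (':' on macOS/Linux).
    entries = [e for e in classpath.split(os.pathsep) if e]
    return [e for e in entries
            if fnmatch.fnmatch(os.path.basename(e), 'cdk-*.jar')]


if __name__ == '__main__':
    jars = find_cdk_jars(os.environ.get('CLASSPATH', ''))
    print(jars if jars else 'No cdk-*.jar found on CLASSPATH')
```

If the script reports that no jar was found, the CDK imports in Jython will most likely fail with an ImportError.
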
Ready!
Let’s start.

from org.openscience.cdk.io.iterator import IteratingMDLReader
from org.openscience.cdk import DefaultChemObjectBuilder

# Read every molecule in the SD file into a list
f = open('cdk_2.sdf', 'r')
mols = []
for mol in IteratingMDLReader(f, DefaultChemObjectBuilder.getInstance()):
    mols.append(mol)
"""
len(mols)
I got 47
"""

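A quick way to cross-check the number of molecules read is to count the `$$$$` record delimiters in the SD file directly. This is a plain-Python sketch, independent of CDK; `count_sdf_records` is a hypothetical helper name, and it assumes a well-formed SD file in which every record ends with a `$$$$` line.

```python
def count_sdf_records(lines):
    """Count records in SD-file text by counting '$$$$' delimiter lines."""
    return sum(1 for line in lines if line.strip() == '$$$$')


def count_sdf_file(path):
    """Open an SD file and count its records."""
    with open(path) as handle:
        return count_sdf_records(handle)
```

Calling `count_sdf_file('cdk_2.sdf')` should give the same number as `len(mols)`, provided the reader parsed every record.
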
Next, calculate the molecular signatures (MolSig).

from org.openscience.cdk import signature

# AtomSignature(atomIndex, molecule): the first argument (2) is the index of the root atom
for i, m in enumerate(mols):
    print str(i), signature.AtomSignature(2, m)

0 [C]([C]([C]([H][H][H])[C]([C]([H][H][O]([C]([C](=[C]([N]([C][H])[N])[N](=[C]([H])))=[N]([C](=[N][N]([H][H]))))))=[O])[H])[H][H][H])
1 [C]([C](=[C]([N][O]([C]([C]([C]([C]([C][H][H])[H][H])[H][O]([C]([H][H])))[H][H])))[N](=[C]))[N]([C]([H])[H])=[N]([C](=[N][N]([H][H]))))
2 [C]([C](=[C]([N][O]([C]([C]([C]([C]([C][H][H])[H][H])[H][N]([C](=[O])[H]))[H][H])))[N](=[C]))[N]([C]([H])[H])=[N]([C](=[N][N]([H][H]))))
3 [C]([C](=[C]([N][O]([C]([C]([C]([C]([C]([H][H])[H][H])[H][H])[C]([C]([C][H][H])[H][H])[H])[H][H])))[N](=[C]))[N]([C]([H])[H])=[N]([C](=[N][N]([H][H]))))
4 [C]([C](=[C]([N][O]([C]([C]([C]([C]([C]([H])[H][H])[H][H])[C]([C](=[C][H])[H][H])[H])[H][H])))[N](=[C]))[N]([C]([H])[H])=[N]([C](=[N][N]([H][H]))))
5 [C]([H][N]([C]([H][H][H])[C]([C]=[N]([C](=[N][N]([C]([C]([H][H][O]([H]))[H][H])[H])))))=[N]([C](=[C]([N][N]([C]([C]([C](=[C]([C]([H])[H])[H])=[C]([C](=[C][H])[H]))[H][H])[H])))))
...
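
To get a feel for how strings like the ones above are put together, here is a toy sketch in plain Python. It is not CDK's algorithm: it ignores bond orders (the `=` marks in the output above), does no special handling of rings, and only sorts the child branches alphabetically. The function name `atom_signature` and the `atoms` / `bonds` input format are made up for illustration.

```python
def atom_signature(atoms, bonds, root, height):
    """Build a simplified tree signature rooted at atom index `root`.

    atoms  -- list of element symbols, e.g. ['C', 'O', 'H']
    bonds  -- list of (i, j) atom index pairs
    height -- how many bonds away from the root to expand
    """
    neighbors = dict((i, []) for i in range(len(atoms)))
    for i, j in bonds:
        neighbors[i].append(j)
        neighbors[j].append(i)

    def expand(atom, parent, depth):
        label = '[%s]' % atoms[atom]
        if depth == 0:
            return label
        # Expand every neighbor except the atom we came from,
        # then sort the branches so the string is order-independent.
        children = sorted(expand(n, atom, depth - 1)
                          for n in neighbors[atom] if n != parent)
        if not children:
            return label
        return label + '(' + ''.join(children) + ')'

    return expand(root, None, height)


# Methanol: C0 bonded to O1 and three H; O1 bonded to H5
atoms = ['C', 'O', 'H', 'H', 'H', 'H']
bonds = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 5)]
print(atom_signature(atoms, bonds, 0, 2))  # [C]([H][H][H][O]([H]))
```

The nesting of parentheses mirrors the distance from the root atom, which is the same idea behind the much longer CDK strings shown above.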

It worked.
In the end, though, I used ECFP instead of MolSig for some reasons. 😉