Create matched molecular series with RDKit.

Recent versions of RDKit include a new module for matched molecular pair analysis (MMPA) named rdMMPA.
The module has a FragmentMol function that returns the fragments used for MMP.
The function lets you set the maximum number of cuts and also the cutting rules.
That means RDKit gives chemoinformaticians a lot of flexibility.
I am currently trying to develop a web app for MMPA and matched molecular series (MMS).
So I tested the new function.
The following code was written in Python 3 (where dictionary objects do not have a has_key method).

from rdkit import Chem
from rdkit.Chem import rdMMPA
# I used the sample file cdk2.sdf that ships with RDKit.
# SDMolSupplier yields None for records it cannot parse, so skip those.
mols = [ mol for mol in Chem.SDMolSupplier("cdk2.sdf") if mol is not None ]
# Generate the fragment list using the rdMMPA module.
# Conditions: 1) single cut, 2) get the results as SMILES.
fragmentlist = [ rdMMPA.FragmentMol( mol, maxCuts=1, resultsAsMols=False ) for mol in mols ]

Now I have fragmentlist.
Let's check it.

In [43]: fragmentlist[1]
Out[43]:(('', 'C1COC(C1)CO[*:1].Nc1nc(c2nc[nH]c2n1)[*:1]'),
 ('', 'N[*:1].c1nc2c(nc(nc2[nH]1)[*:1])OCC1CCCO1'),
 ('', 'C1COC(C1)C[*:1].Nc1nc(O[*:1])c2nc[nH]c2n1'),
 ('', 'C1COC(C1)[*:1].Nc1nc(OC[*:1])c2nc[nH]c2n1'))

OK!
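
As a quick check of the tuple layout without needing the SD file, here is a minimal sketch on a single molecule. The input SMILES is a hypothetical example chosen only for illustration; it is not part of the cdk2 set. With maxCuts=1 and resultsAsMols=False, each result should be a pair whose first element is an empty string and whose second element holds the two fragments joined by a dot, as in the output above.

```python
from rdkit import Chem
from rdkit.Chem import rdMMPA

# Hypothetical input molecule (ethoxybenzene), used only to illustrate the output shape.
mol = Chem.MolFromSmiles("c1ccccc1OCC")

# maxCuts=1 limits the fragmentation to single cuts;
# resultsAsMols=False returns SMILES strings instead of Mol objects.
single_cuts = rdMMPA.FragmentMol(mol, maxCuts=1, resultsAsMols=False)

for core, chains in single_cuts:
    # For a single cut the first element is empty and the second holds both fragments.
    print(repr(core), chains.split("."))
```

Raising maxCuts is how I would expect to get double and triple cuts, but I only tested single cuts here.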

Next, I built the MMS(?) as a Python dictionary object.

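fragdict = dict()
for fragments in fragmentlist:
    for fragment in fragments:
        # For a single cut, fragment is ('', 'fragA.fragB').
        # The order of the two halves is not chosen by size,
        # so "chain" and "core" are only labels for the first and second half.
        parts = fragment[1].split('.')
        chain, core = parts[0], parts[1]
        # setdefault creates the list on first use, so no if/else is needed.
        fragdict.setdefault( core, [] ).append( chain )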

The results were as follows.

In [70]:
# Print the entries that have 3 or more fragments.
for k,v in fragdict.items():
    if len(v) >= 3:
        print(k, v)
        print( "="*20 )


OC[*:1] ['CC(C)C(Nc1nc(Nc2ccc(C(=O)[O-])c(Cl)c2)c2ncn(c2n1)C(C)C)[*:1]', 'CC(C)C(Nc1nc(Nc2cccc(Cl)c2)c2ncn(c2n1)C(C)C)[*:1]', 'CCC(Nc1nc(NCc2ccccc2)c2ncn(c2n1)C(C)C)[*:1]', 'COc1ccc(cc1)CNc1nc(nc2c1ncn2C(C)C)N(CCO)C[*:1]', 'Cn1cnc2c(nc(nc21)NC[*:1])NCc1ccccc1']
====================
Nc1nc(OC[*:1])c2nc[nH]c2n1 ['C1=CCC(CC1)[*:1]', 'C1CCC(CC1)[*:1]', 'C1COC(C1)[*:1]', 'CC(C)C(=O)[*:1]']
====================
c1ccc(cc1)C[*:1] ['CCC(CO)Nc1nc(N[*:1])c2ncn(c2n1)C(C)C', 'Cn1cnc2c(nc(nc21)NCCO)N[*:1]', 'O=C(c1ccccc1)c1cnc2n[nH]cc2c1O[*:1]', '[NH3+]C1CCC(CC1)Nc1nc(N[*:1])c2ncn(c2n1)C1CCCC1']
====================
C[*:1] ['CC(C(=O)COc1nc(N)nc2[nH]cnc12)[*:1]', 'CC(C)C(=O)Nc1ncc(SCc2ncc(C[*:1])o2)s1', 'CC(C)C(CO)Nc1nc(Nc2ccc(C(=O)[O-])c(Cl)c2)c2ncn(c2n1)C(C)[*:1]', 'CC(C)C(CO)Nc1nc(Nc2cccc(Cl)c2)c2ncn(c2n1)C(C)[*:1]', 'CC(C)n1cnc2c(nc(nc21)N(CCO)CCO)NCc1ccc(cc1)O[*:1]', 'CC(C)n1cnc2c(nc(nc21)NC(CO)C(C)[*:1])Nc1ccc(C(=O)[O-])c(Cl)c1', 'CC(C)n1cnc2c(nc(nc21)NC(CO)C(C)[*:1])Nc1cccc(Cl)c1', 'CC(C)n1cnc2c(nc(nc21)NC(CO)C[*:1])NCc1ccccc1', 'CCC(CO)Nc1nc(NCc2ccccc2)c2ncn(c2n1)C(C)[*:1]', 'CCc1cnc(CSc2cnc(NC(=O)C(C)[*:1])s2)o1', 'CN(C)NC(=O)Nc1cccc2-c3n[nH]c(-c4ccc(cc4)O[*:1])c3C(=O)c12', 'COc1ccc(cc1)-c1[nH]nc2-c3cccc(NC(=O)NN(C)[*:1])c3C(=O)c21', 'COc1ccc(cc1)CNc1nc(nc2c1ncn2C(C)[*:1])N(CCO)CCO']
====================
Cl[*:1] ['CC(C)C(CO)Nc1nc(Nc2ccc(C(=O)[O-])c(c2)[*:1])c2ncn(c2n1)C(C)C', 'CC(C)C(CO)Nc1nc(Nc2cccc(c2)[*:1])c2ncn(c2n1)C(C)C', 'C[NH+]1CCC(c2c(O)cc(O)c3c(=O)cc(oc23)-c2ccccc2[*:1])C(O)C1']
====================
c1ccc(cc1)[*:1] ['CCC(CO)Nc1nc(NC[*:1])c2ncn(c2n1)C(C)C', 'Cc1nc2ccccn2c1-c1ccnc(n1)N[*:1]', 'Cn1cnc2c(nc(nc21)NCCO)NC[*:1]', 'O=C(c1ccccc1)c1cnc2n[nH]cc2c1OC[*:1]', 'O=C(c1cnc2n[nH]cc2c1OCc1ccccc1)[*:1]', '[NH3+]C1CCC(CC1)Nc1nc(NC[*:1])c2ncn(c2n1)C1CCCC1']
====================
c1ccc(cc1)CN[*:1] ['CCC(CO)Nc1nc(c2ncn(c2n1)C(C)C)[*:1]', 'Cn1cnc2c(nc(nc21)NCCO)[*:1]', '[NH3+]C1CCC(CC1)Nc1nc(c2ncn(c2n1)C1CCCC1)[*:1]']
====================
F[*:1] ['CCCCOc1c(cnc2[nH]ncc12)C(=O)c1c(F)cc(Br)cc1[*:1]', 'COc1cc(-c2ccc[nH]2)c2C(=O)Nc3ccc(c1c32)[*:1]', 'Cc1ccc(c(c1)Nc1ccnc(n1)Nc1ccc(cc1)S(N)(=O)=O)[*:1]']
====================
N[*:1] ['C1=CCC(CC1)COc1nc(nc2[nH]cnc12)[*:1]', 'CC(C)C(=O)COc1nc(nc2[nH]cnc12)[*:1]', 'NC(=O)c1ccc(cc1)Nc1nc(OCC2CCCCC2)c(N=O)c(n1)[*:1]']
====================
O=[N+]([O-])[*:1] ['COc1cc[nH]c1/C=C1\\C(=O)Nc2ccc(c(c21)N1CCCC(C1)C(N)=O)[*:1]', 'COc1cc[nH]c1/C=C1\\C(=O)Nc2ccc(cc21)[*:1]', 'NS(=O)(=O)c1ccc(cc1)Nc1cc([nH]n1)-c1ccc(cc1)[*:1]']
====================

It seems to work fine.
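
One thing visible in the output is that the dictionary key is sometimes the small piece (for example C[*:1]) and sometimes the large scaffold, because the key is simply the second half of the SMILES pair. A possible variant, sketched below, indexes every pair in both directions so that any shared piece can act as the key. The function name build_series is hypothetical. The first tuple is taken from the fragmentlist[1] output above; the second is reconstructed from the printed results, so treat it as illustrative input.

```python
from collections import defaultdict

# Single-cut results in the ('', 'fragA.fragB') form shown earlier.
fragments = [
    ("", "C1COC(C1)[*:1].Nc1nc(OC[*:1])c2nc[nH]c2n1"),
    ("", "C1CCC(CC1)[*:1].Nc1nc(OC[*:1])c2nc[nH]c2n1"),
]

def build_series(fragments):
    """Index each single-cut pair under both of its halves (hypothetical helper)."""
    series = defaultdict(set)
    for _core, chains in fragments:
        first, second = chains.split(".")
        series[first].add(second)
        series[second].add(first)
    return series

series = build_series(fragments)

# Keep only the pieces that are shared by at least two partners.
shared = {key: sorted(partners) for key, partners in series.items() if len(partners) >= 2}
for key, partners in shared.items():
    print(key, partners)
```

Using sets avoids counting the same partner twice; filtering by the size of each half would be a separate step that I have not done here.
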
This is a very simple example. Next, I'll build MMS with assay data to make medchem data analysis easier.
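
A rough idea of what that next step could look like is sketched below. Everything in it is an assumption: the compound ids, the rank_series helper, and the activity numbers are placeholders made up for illustration, not measured data from the cdk2 set. The sketch only shows how a series could be ordered once each substituent is linked to a compound and an assay value.

```python
# Hypothetical series: one shared piece mapped to (substituent, compound id) pairs.
series = {
    "Nc1nc(OC[*:1])c2nc[nH]c2n1": [
        ("C1CCC(CC1)[*:1]", "cpd_A"),
        ("C1COC(C1)[*:1]", "cpd_B"),
        ("CC(C)C(=O)[*:1]", "cpd_C"),
    ]
}

# Placeholder activity values (made up for illustration, NOT real assay results).
activity = {"cpd_A": 6.0, "cpd_B": 5.0, "cpd_C": 7.0}

def rank_series(series, activity):
    """Order each series from highest to lowest activity, skipping compounds without data."""
    ranked = {}
    for core, members in series.items():
        with_data = [(chain, cid, activity[cid]) for chain, cid in members if cid in activity]
        ranked[core] = sorted(with_data, key=lambda member: member[2], reverse=True)
    return ranked

ranked = rank_series(series, activity)
for core, members in ranked.items():
    print(core, [cid for _chain, cid, _value in members])
```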

Reference:
https://www.nextmovesoftware.com/matsy.html
I think MATSY is interesting and will feel familiar to medicinal chemists.
