PAINS filter using RDKit

PAINS stands for “Pan Assay Interference Compounds”. These are sets of substructures that appear as frequent hitters in many HTS campaigns.
Recently, new PAINS structures were reported in JMCL.

Lots of tools implement a PAINS substructure filter, for example KNIME, StarDrop, pp, etc.

I think a PAINS filter is useful for avoiding false positives and other problems.

Fortunately, these substructures have been provided as SMARTS strings. 😉
So, I wrote a PAINS filter using RDKit.
The strategy is very simple.
First, build a substructure dictionary from a text file of SMARTS.
Then define the filter function.
This function checks whether the molecule has any PAINS functional groups in its structure.
If the molecule has a reactive or toxic substructure, the function adds a molecular property whose name is the description and whose value is the SMARTS string.
That’s all.
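
As a rough sketch of the first step, assuming each line of the text file holds a SMARTS string and a description separated by whitespace, the dictionary could be built like this. The `load_patterns` name and the two sample lines are made up for illustration and are not taken from the real PAINS list.

```python
import io


def load_patterns(handle):
    """Build a {SMARTS: description} dictionary from an open text file."""
    patterns = {}
    for line in handle:
        fields = line.split()
        # Skip blank or malformed lines that lack a description column.
        if len(fields) < 2:
            continue
        patterns[fields[0]] = fields[1]
    return patterns


# Hypothetical file content for illustration, not real PAINS entries.
sample = io.StringIO(
    "c-[#7]=[#7]-c example_azo\n"
    "\n"
    "[#6]-[#16;X2]-[#16;X2]-[#6] example_disulfide\n"
)
patterns = load_patterns(sample)
print(patterns)
```

Skipping short lines means a stray blank line in the file does not raise an IndexError.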

I tested the code on some molecules and it seems to work fine.
I uploaded the code and the SMARTS list to my GitHub repo.
https://github.com/iwatobipen/rdkit_pains
It is also available in the IPython notebook viewer.
http://nbviewer.ipython.org/github/iwatobipen/rdkit_pains/blob/master/testcode.ipynb
But if you can use KNIME, pp, or any other tool, I recommend using your own tools. 😉
Do you have any PAINFUL compounds in your HTS deck?

# -*- coding: utf-8 -*-

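from rdkit import Chem

# Read "SMARTS description" pairs, one per line, and close the file afterwards.
with open("pains.txt", "r") as inf:
    sub_strct = [ line.rstrip().split(" ") for line in inf if line.strip() ]
smarts = [ line[0] for line in sub_strct]
desc = [ line[1] for line in sub_strct]
# Map each SMARTS string to its description.
dic = dict(zip(smarts, desc))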

def pains_filt(mol):
    """
    >>> mol = Chem.MolFromSmiles("c1ccccc1N=Nc1ccccc1")
    >>> checkmol = pains_filt( mol )
    >>> props = [ prop for prop in checkmol.GetPropNames() ]
    >>> props[0]
    'azo_A(324)'
    
    """

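    # Check every PAINS pattern against the molecule.
    for k,v in dic.items():
        subs = Chem.MolFromSmarts( k )
        # MolFromSmarts returns None for SMARTS it cannot parse; skip those.
        if subs is not None and mol.HasSubstructMatch( subs ):
            # Property name is the description, value is the matching SMARTS.
            mol.SetProp(v,k)
    return mol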


if __name__ == "__main__":
    import doctest
    doctest.testmod()
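
To apply the same matching idea to several molecules at once, a minimal sketch might look like the following. The single azo pattern, its `example_azo` label, and the `flag_smiles` helper are illustrative assumptions, not entries from the real PAINS list or part of the code above.

```python
from rdkit import Chem

# Hypothetical one-entry pattern table (description -> SMARTS), for illustration only.
EXAMPLE_PATTERNS = {"example_azo": "c-[#7]=[#7]-c"}

# Compile each SMARTS once instead of once per molecule.
COMPILED = {name: Chem.MolFromSmarts(s) for name, s in EXAMPLE_PATTERNS.items()}


def flag_smiles(smiles_list):
    """Return {SMILES: [descriptions of matching patterns]} for parsable SMILES."""
    flagged = {}
    for smi in smiles_list:
        mol = Chem.MolFromSmiles(smi)
        if mol is None:
            # RDKit could not parse this SMILES; leave it out of the result.
            continue
        flagged[smi] = [
            name
            for name, patt in COMPILED.items()
            if patt is not None and mol.HasSubstructMatch(patt)
        ]
    return flagged


result = flag_smiles(["c1ccccc1N=Nc1ccccc1", "CCO"])
for smi, hits in result.items():
    print(smi, hits)
```

Compiling the patterns up front avoids re-parsing every SMARTS for each molecule, which matters when the pattern list and the compound deck are both large.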

The PAINS substructures as SMARTS are available at the following link.
http://blog.rguha.net/?p=850
