PAINS filter using RDKit

PAINS stands for “Pan Assay Interference Compounds”. These are sets of substructures that appear as frequent hitters in many HTS campaigns.
Recently, new PAINS structures were reported in JMCL.

Many tools implement a PAINS substructure filter, for example KNIME, StarDrop, Pipeline Pilot (pp), and others.

I think a PAINS filter is useful for avoiding false positives and other problems.

Fortunately, these substructures have been provided as SMARTS strings. 😉
So, I wrote a PAINS filter using RDKit.
The strategy is very simple.
First, build a substructure dictionary from a text file of SMARTS.
Then define the filter function.
This function checks whether a molecule has any PAINS functional groups in its structure.
If the molecule contains a reactive or toxic substructure, the match is recorded as a molecular property, with the description as the property name and the SMARTS string as its value.
That’s all.
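
As a rough illustration of the first step, here is a minimal sketch that parses lines of the form "SMARTS description" into a dictionary. The helper name parse_pains_lines and the two sample lines are assumptions made up for this example; they are not the contents of the real pains.txt.

```python
def parse_pains_lines(lines):
    """Map each SMARTS string to its description, skipping malformed lines."""
    table = {}
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # blank line or a line without a description
        smarts, description = parts[0], parts[1]
        table[smarts] = description
    return table


# Hypothetical sample input standing in for the lines of a SMARTS file.
sample = [
    "[#6]-[#7]=[#7]-[#6] azo_example\n",
    "\n",
    "[#6]-[#6](=[#8])-[#6](=[#8])-[#6] diketone_example\n",
]
pains_table = parse_pains_lines(sample)
print(pains_table)
```

Splitting on any whitespace means the sketch tolerates tabs and repeated spaces, and blank lines are simply ignored.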

I tested the code on some molecules and it seems to work fine.
I uploaded the code and the SMARTS list to my GitHub repo.
https://github.com/iwatobipen/rdkit_pains
The IPython notebook is also available in the notebook viewer.
http://nbviewer.ipython.org/github/iwatobipen/rdkit_pains/blob/master/testcode.ipynb
But if you can use KNIME, pp, or any other tool, I recommend using your own tools. 😉
Do you have any PAINFUL compounds in your HTS deck?

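# -*- coding: utf-8 -*-

from rdkit import Chem

# Each line of pains.txt is expected to hold "SMARTS description".
# Compile every SMARTS once at import time instead of on every call.
patterns = []
with open("pains.txt", "r") as inf:
    for line in inf:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        smarts, desc = fields[0], fields[1]
        subs = Chem.MolFromSmarts(smarts)
        if subs is not None:  # skip SMARTS that RDKit cannot parse
            patterns.append((smarts, desc, subs))


def pains_filt(mol):
    """Tag mol with one property per matched PAINS pattern.

    The property name is the pattern description, the value is its SMARTS.

    >>> mol = Chem.MolFromSmiles("c1ccccc1N=Nc1ccccc1")
    >>> checkmol = pains_filt(mol)
    >>> props = list(checkmol.GetPropNames())
    >>> props[0]
    'azo_A(324)'
    """
    for smarts, desc, subs in patterns:
        if mol.HasSubstructMatch(subs):
            mol.SetProp(desc, smarts)
    return mol


if __name__ == "__main__":
    import doctest
    doctest.testmod()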
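
To show how such a filter might be applied to a small deck, here is a self-contained sketch that flags molecules from a list of SMILES. The single pattern named azo_example and the helper matched_names are hypothetical stand-ins for illustration; a real run would load the full PAINS list instead.

```python
from rdkit import Chem

# Hypothetical one-entry pattern table (description -> SMARTS).
EXAMPLE_PATTERNS = {"azo_example": "[#6]-[#7]=[#7]-[#6]"}

# Compile the SMARTS once up front.
compiled = {name: Chem.MolFromSmarts(s) for name, s in EXAMPLE_PATTERNS.items()}


def matched_names(mol):
    """Return the descriptions of all patterns found in mol."""
    return [
        name
        for name, patt in compiled.items()
        if patt is not None and mol.HasSubstructMatch(patt)
    ]


smiles_list = ["c1ccccc1N=Nc1ccccc1", "CCO", "c1ccccc1"]
flagged = {}
for smi in smiles_list:
    mol = Chem.MolFromSmiles(smi)
    if mol is None:
        continue  # unparsable SMILES
    hits = matched_names(mol)
    if hits:
        flagged[smi] = hits

print(flagged)
```

Collecting the hits in a plain dictionary keeps the input molecules untouched, which can be handy when the same molecules go on to other filters.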

The PAINS substructures are available as SMARTS at the following link.
http://blog.rguha.net/?p=850
