Many tools implement a PAINS substructure filter, for example KNIME, StarDrop, and Pipeline Pilot (PP).
I think a PAINS filter is useful for avoiding false positives and other problems.
Fortunately, these substructures have been provided as SMARTS strings. ;-)
So I wrote a PAINS filter using RDKit.
The strategy is very simple.
First, build a substructure dictionary from a text file of SMARTS.
Then define the filter function.
This function checks whether the molecule has any PAINS functional groups in its structure.
If the molecule has a reactive or toxic substructure, the function adds it as a molecular property, with the description as the property name and the SMARTS string as the value.
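As a minimal sketch of the first step above, the dictionary could be built roughly like this. It assumes each line of the text file holds a SMARTS pattern followed by a description, separated by whitespace. The helper name `load_smarts_dict` and the two sample entries are made up for illustration and are not taken from the real PAINS list; an in-memory string stands in for the file so the snippet runs without RDKit.

```python
import io

# Hypothetical stand-in for the SMARTS text file: "<SMARTS> <description>" per line.
# These two entries are toy examples, not real PAINS definitions.
SAMPLE = """\
[#6]-[#7]=[#7]-[#6] toy_azo
[#6]-[#6](=O)-[#6](=O)-[#6] toy_diketone

"""


def load_smarts_dict(handle):
    """Map each SMARTS string to its description, skipping blank lines."""
    smarts_to_desc = {}
    for line in handle:
        fields = line.split(None, 1)  # split on the first run of whitespace only
        if len(fields) != 2:
            continue  # blank or malformed line
        smarts, desc = fields
        smarts_to_desc[smarts] = desc.strip()
    return smarts_to_desc


dic = load_smarts_dict(io.StringIO(SAMPLE))
print(dic)
```

With a real file, an open file handle can be passed in place of the `io.StringIO` object. Skipping lines that do not split into two fields keeps a trailing blank line from raising an error.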
I tested the code using some molecules and it seems to work fine.
I uploaded the code and the SMARTS list to my GitHub repo.
It can also be viewed in the IPython notebook viewer.
But if you can use KNIME, Pipeline Pilot, or any other tool, I recommend using your own tools. ;-)
Do you have any PAINFUL compounds in your HTS deck?
# -*- coding: utf-8 -*-
from rdkit import Chem

# Each line of pains.txt holds a SMARTS pattern and its description,
# separated by a space.
with open("pains.txt", "r") as inf:
    sub_strct = [line.rstrip().split(" ") for line in inf if line.strip()]

smarts = [line[0] for line in sub_strct]
desc = [line[1] for line in sub_strct]
dic = dict(zip(smarts, desc))


def pains_filt(mol):
    """
    >>> mol = Chem.MolFromSmiles("c1ccccc1N=Nc1ccccc1")
    >>> checkmol = pains_filt(mol)
    >>> props = [prop for prop in checkmol.GetPropNames()]
    >>> props
    ['azo_A(324)']
    """
    for k, v in dic.items():
        subs = Chem.MolFromSmarts(k)
        # Skip SMARTS that RDKit could not parse.
        if subs is not None:
            if mol.HasSubstructMatch(subs):
                # Property name is the description, value is the SMARTS.
                mol.SetProp(v, k)
    return mol


if __name__ == "__main__":
    import doctest
    doctest.testmod()
The PAINS substructures as SMARTS can be obtained from the following link.