Chemically Advanced Template Search (CATS) was developed by Prof. Gisbert Schneider. The CATS descriptor is a ligand-based, fuzzy pharmacophore descriptor, so it is well suited to scaffold hopping in virtual screening. Last week I attended his lecture and became interested in the descriptor again. Fortunately, I found some implementations of the CATS2D descriptor on GitHub.
Arthuc’s work (the original work is by Rajarshi Guha) is nice because the code is written in Python and uses RDKit as the chemistry engine.
I modified that work and made a package for CATS2D descriptor calculation with RDKit.
Let’s look at an example of a distance calculation. First, import the packages needed for the calculation.
from rdkit import Chem
from rdkit.Chem import DataStructs
from rdkit.Chem import AllChem
from scipy.spatial.distance import euclidean
from cats2d.rd_cats2d import CATS2D
import numpy as np
from rdkit.Chem.Draw import IPythonConsole
from rdkit.Chem import Draw
Then load two test molecules. Both are well-known drugs.
mol1 = Chem.MolFromSmiles('CCCC1=NN(C2=C1N=C(NC2=O)C3=C(C=CC(=C3)S(=O)(=O)N4CCN(CC4)C)OCC)C')
mol2 = Chem.MolFromSmiles('CCCC1=NC(=C2N1N=C(NC2=O)C3=C(C=CC(=C3)S(=O)(=O)N4CCN(CC4)CC)OCC)C')
OK, let’s calculate the distance between these molecules with the Morgan FP and with the CATS2D descriptor. The CATS2D descriptor is not a bit fingerprint, so I used the Euclidean distance for it. Surprisingly, the Tanimoto distance (1.0 - Tanimoto similarity) from the Morgan FP is fairly large (about 0.41) even though these molecules look similar.
fp1 = AllChem.GetMorganFingerprintAsBitVect(mol1, 2)
fp2 = AllChem.GetMorganFingerprintAsBitVect(mol2, 2)
dist = 1.0 - DataStructs.TanimotoSimilarity(fp1, fp2)
print(dist)
> 0.41025641025641024
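To make the Tanimoto distance used above concrete, here is a minimal pure-Python sketch that computes it from sets of on-bit indices. The helper names `tanimoto_similarity` and `tanimoto_distance` and the toy bit sets are my own for illustration; they are not part of RDKit, and the treatment of two empty fingerprints is just a convention chosen here.

```python
def tanimoto_similarity(bits_a, bits_b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        # Convention chosen for this sketch: two empty fingerprints are identical.
        return 1.0
    return len(a & b) / len(union)


def tanimoto_distance(bits_a, bits_b):
    """Distance as used in this post: 1.0 - Tanimoto similarity."""
    return 1.0 - tanimoto_similarity(bits_a, bits_b)


# Toy on-bit sets (made up for illustration, not taken from the molecules above).
fp_a = {1, 5, 9, 12, 20}
fp_b = {1, 5, 9, 13, 21}
print(tanimoto_distance(fp_a, fp_b))
```

Because every differing substructure bit enlarges the union without enlarging the intersection, a small change in a ring system can move the distance a lot, which is roughly what happens with the two drugs here.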
On the other hand, the CATS2D descriptor-based distance is 0.0. This indicates that the two molecules are the same in terms of their pharmacophore features as encoded by CATS2D.
cats = CATS2D()
cats1 = cats.getCATs2D(mol1)
cats2 = cats.getCATs2D(mol2)
euclidean(cats1, cats2)
> 0.0
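To give a feel for why a pharmacophore-pair descriptor can ignore scaffold details, here is a simplified, pure-Python sketch of a CATS-like count vector. It is not the code of the package used above: `cats_like_counts`, `topological_distances`, and the toy four-atom chain are hypothetical names and data for illustration, I assume the usual five CATS types (D, A, P, N, L) and topological distances of 0 to 9 bonds, and any scaling the real descriptor may apply is omitted.

```python
import math
from collections import deque
from itertools import combinations_with_replacement

TYPES = ["D", "A", "P", "N", "L"]  # assumed pharmacophore types
MAX_DIST = 9                       # assumed maximum distance in bonds (bins 0..9)
PAIRS = list(combinations_with_replacement(TYPES, 2))  # 15 unordered type pairs
PAIR_INDEX = {pair: i for i, pair in enumerate(PAIRS)}
RANK = {t: i for i, t in enumerate(TYPES)}


def topological_distances(adjacency, start):
    """Breadth-first search: shortest path length in bonds from start to each atom."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        for nbr in adjacency[atom]:
            if nbr not in dist:
                dist[nbr] = dist[atom] + 1
                queue.append(nbr)
    return dist


def cats_like_counts(adjacency, labels):
    """Count pharmacophore type pairs per topological distance (raw counts only)."""
    vec = [0] * (len(PAIRS) * (MAX_DIST + 1))
    n = len(labels)
    for i in range(n):
        dist = topological_distances(adjacency, i)
        for j in range(i, n):
            d = dist.get(j)
            if d is None or d > MAX_DIST:
                continue
            for ti in labels[i]:
                for tj in labels[j]:
                    if i == j and RANK[ti] > RANK[tj]:
                        continue  # count each type pair on the same atom once
                    pair = tuple(sorted((ti, tj), key=RANK.get))
                    vec[PAIR_INDEX[pair] * (MAX_DIST + 1) + d] += 1
    return vec


# Toy linear chain of four atoms: 0-1-2-3 (made up for illustration).
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
vec_a = cats_like_counts(chain, [["A"], [], ["L"], ["A"]])
vec_b = cats_like_counts(chain, [["A"], ["L"], [], ["A"]])  # same chain, numbered from the other end
print(len(vec_a), sum(vec_a), math.dist(vec_a, vec_b))
```

The vector depends only on which feature types sit how many bonds apart, so atoms that carry no pharmacophore type contribute nothing. That is the property that makes this kind of descriptor attractive for scaffold hopping.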
The package can also report the pharmacophore type assigned to each atom.
print(cats.getPcoreGroups(mol1))
> ['', ['L'], '', '', ['A'], '', '', '', ['A'], '', ['D'], '', ['A'], '', '', '', '', '', '', '', ['A'], ['A'], ['A'], '', '', ['A'], '', '', '', ['A'], '', '', '']
print(cats.getPcoreGroups(mol2))
> ['', ['L'], '', '', ['A'], '', '', '', ['A'], '', ['D'], '', ['A'], '', '', '', '', '', '', '', ['A'], ['A'], ['A'], '', '', ['A'], '', '', '', '', ['A'], '', '', '']
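The per-atom lists are a bit hard to compare by eye, so here is a small sketch that tallies the pharmacophore types in each one. The two lists are copied from the output above; the helper `count_types` is a hypothetical name of mine and not part of the package.

```python
from collections import Counter

# Per-atom pharmacophore labels, copied from the getPcoreGroups output above.
pcore1 = ['', ['L'], '', '', ['A'], '', '', '', ['A'], '', ['D'], '', ['A'],
          '', '', '', '', '', '', '', ['A'], ['A'], ['A'], '', '', ['A'],
          '', '', '', ['A'], '', '', '']
pcore2 = ['', ['L'], '', '', ['A'], '', '', '', ['A'], '', ['D'], '', ['A'],
          '', '', '', '', '', '', '', ['A'], ['A'], ['A'], '', '', ['A'],
          '', '', '', '', ['A'], '', '', '']


def count_types(groups):
    """Tally pharmacophore types; atoms with no type ('' here) contribute nothing."""
    return Counter(t for atom in groups for t in atom)


print(len(pcore1), count_types(pcore1))
print(len(pcore2), count_types(pcore2))
```

Note that this only counts how often each type occurs. It says nothing about the topological distances between the typed atoms, which the CATS2D vector also encodes.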
Scaffold hopping is a very useful strategy in drug discovery, not only for improving compound properties but also for expanding IP space.
I would like to keep improving the package because it is still under development.
Any comments and/or suggestions are greatly appreciated.
My code can be found at the following URL.
Thanks for developing and sharing the CATS2D descriptor implementation!