Get environment SMILES around cutting points #chemoinformatics #memo #RDKit

I often use R-group decomposition and matched molecular pairs, and these methods generate many fragment SMILES that have [*] at the attachment points. I would like to get the SMILES of the environment around those attachment points. How can I do it?

I found a solution. RDKit's FindAtomEnvironmentOfRadiusN method finds the bonds within a certain radius of an atom in a molecule. So I use this method together with the mmpdblib.environment module, which finds the centers, that is, the attachment point atoms [*] of the SMILES.
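To make "radius" concrete, here is a minimal sketch of the idea in plain Python, without RDKit. It is only an illustration and not RDKit's actual implementation: the toy graph, the function name bonds_within_radius and the bond numbering are all made up for this example. A bond is kept when it can be reached within `radius` bond steps from the start atom. RDKit itself may behave differently in edge cases, for example when the requested radius is larger than the molecule.

```python
from collections import deque


def bonds_within_radius(bonds, start, radius):
    """Return sorted indices of bonds reachable within `radius` steps of `start`.

    `bonds` is a list of (atom_a, atom_b) pairs; a bond's index is its
    position in that list. Hypothetical helper, for illustration only.
    """
    # Build an adjacency map: atom -> list of (neighbor atom, bond index)
    neighbors = {}
    for idx, (a, b) in enumerate(bonds):
        neighbors.setdefault(a, []).append((b, idx))
        neighbors.setdefault(b, []).append((a, idx))

    dist = {start: 0}      # number of bond steps from the start atom
    found = set()          # indices of bonds inside the environment
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        if dist[atom] >= radius:
            continue       # bonds leaving this atom would exceed the radius
        for nbr, idx in neighbors.get(atom, []):
            found.add(idx)
            if nbr not in dist:
                dist[nbr] = dist[atom] + 1
                queue.append(nbr)
    return sorted(found)


# Toy graph: atom 0 plays the role of [*]; chain 0-1-2-3 with a branch 1-4
toy_bonds = [(0, 1), (1, 2), (2, 3), (1, 4)]
for r in range(4):
    print(r, bonds_within_radius(toy_bonds, 0, r))
```

Each increase of the radius adds the next shell of bonds around the start atom, which is the behaviour I rely on when growing the environment around an attachment point.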

To run the code below, mmpdb needs to be installed, for example with pip.

OK, let’s write some code. Here is an example. First, I define a core of tofacitinib.

# The following code was written in a Jupyter notebook
from rdkit import Chem
from rdkit.Chem.Draw import IPythonConsole
from rdkit.Chem import Draw
from IPython import display
tofacitinib = Chem.MolFromSmiles('CC1CCN(CC1N(C)C2=NC=NC3=C2C=CN3)C(=O)CC#N')
core = Chem.MolFromSmiles('[*:1]N(C)C2=NC=NC3=C2C=CN3')
const_smi = Chem.MolToSmiles(core)

Now the core is defined, and [*:1] is the center of the environment.

The next code shows how to get the environment SMILES (envsmi) for a given radius. To display mol objects inside a for loop, I used the IPython.display.display method.

from mmpdblib import environment
def get_envsmi(constant_smi, radius):
    centers = environment.find_centers(constant_smi)
    res = []
    for atom_id in centers.atom_ids:
        env = Chem.FindAtomEnvironmentOfRadiusN(centers.mol, radius, atom_id)
        submol = Chem.PathToSubmol(centers.mol, env)
        smi = Chem.MolToSmiles(submol)
        # collect the environment SMILES of each center
        res.append(smi)
    return ".".join(res)

for radi in range(7):
    print(f'RADIUS {radi}')
    envmol = Chem.MolFromSmiles(get_envsmi(const_smi, radi), sanitize=False)
    display.display(envmol)
[Figures: depictions of the environment for radius = 1 to radius = 6]

As you can see, the get_envsmi function works well: it returns the environment around the center atom for the given radius.

FindAtomEnvironmentOfRadiusN is a useful method for getting information about the surroundings of an atom.
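If mmpdb is not available, the attachment points can probably also be located with RDKit alone, because a [*] atom has atomic number 0. The sketch below is my own assumption about an equivalent approach, not what mmpdblib does internally, and the function name get_envsmi_rdkit is hypothetical. It skips any center for which FindAtomEnvironmentOfRadiusN returns no bonds.

```python
from rdkit import Chem


def get_envsmi_rdkit(constant_smi, radius):
    """Environment SMILES around every [*] atom, using RDKit only.

    Hypothetical helper: dummy atoms are found via atomic number 0.
    """
    mol = Chem.MolFromSmiles(constant_smi)
    res = []
    for atom in mol.GetAtoms():
        if atom.GetAtomicNum() != 0:
            continue  # not an attachment point
        env = Chem.FindAtomEnvironmentOfRadiusN(mol, radius, atom.GetIdx())
        if len(env) == 0:
            continue  # no bonds found for this radius
        submol = Chem.PathToSubmol(mol, env)
        res.append(Chem.MolToSmiles(submol))
    return ".".join(res)


for radius in (1, 2, 3):
    print(radius, get_envsmi_rdkit('[*:1]N(C)C2=NC=NC3=C2C=CN3', radius))
```

As in the code above, the resulting SMILES are fragments, so they should be parsed with sanitize=False when converting them back to mol objects.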


