Get environment SMILES around cutting points #chemoinformatics #memo #RDKit

This week I’m on summer vacation, but I can’t travel because of the COVID-19 pandemic and heavy rain. It’s a really unusual summer vacation. I hope everyone stays safe.

BTW, I often use R-group decomposition and matched molecular pairs, and these methods generate many fragment SMILES that have [*] at the attachment points. I would like to get the SMILES of the environment around those attachment points. How can I do it?
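To make the starting point concrete, here is a minimal sketch of how such fragment SMILES can arise, using RDKit's rdMMPA.FragmentMol. Toluene is only an illustrative input chosen for this sketch, and the exact strings returned may differ between RDKit versions.

```python
from rdkit import Chem
from rdkit.Chem import rdMMPA

# Toluene is used purely as a small illustrative input (an assumption of this sketch).
mol = Chem.MolFromSmiles('Cc1ccccc1')

# Single-cut fragmentation; resultsAsMols=False returns SMILES strings
# instead of mol objects.
frags = rdMMPA.FragmentMol(mol, maxCuts=1, resultsAsMols=False)

for core_smi, chains_smi in frags:
    # The attachment points show up as dummy atoms ([*]) in the fragment SMILES.
    print(repr(core_smi), chains_smi)
```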

I found a solution. RDKit's FindAtomEnvironmentOfRadiusN method finds the bonds within a certain radius of an atom in a molecule. So I use this method together with the find_centers function of mmpdblib.environment, which gives the centers, i.e. the attachment point atoms [*] of the SMILES.
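Before bringing in mmpdb, here is a small RDKit-only sketch of what FindAtomEnvironmentOfRadiusN returns. The molecule 'CCCCO' and the choice of atom 0 are just assumptions for illustration.

```python
from rdkit import Chem

# A simple chain; atom 0 is the first carbon (illustrative choice).
mol = Chem.MolFromSmiles('CCCCO')

# Indices of the bonds that lie within 2 bonds of atom 0.
env = Chem.FindAtomEnvironmentOfRadiusN(mol, 2, 0)
bond_ids = sorted(env)
print(bond_ids)

# PathToSubmol turns the bond path into a sub-molecule; atomMap records
# which parent atom index maps to which atom index in the sub-molecule.
atom_map = {}
submol = Chem.PathToSubmol(mol, env, atomMap=atom_map)
print(submol.GetNumAtoms(), atom_map)
```

The key point is that the method returns bond indices, not atoms, so a second step such as PathToSubmol is needed to get something that can be written as SMILES.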

To run the code below, install mmpdb with pip (pip install mmpdb).

OK, let’s write some code. Here is an example. First, I define the core of tofacitinib.

# The following code was written in a Jupyter notebook
from rdkit import Chem
from rdkit.Chem.Draw import IPythonConsole
from rdkit.Chem import Draw
from IPython import display
IPythonConsole.drawOptions.comicMode=True
IPythonConsole.drawOptions.minFontSize=8
tofacitinib = Chem.MolFromSmiles('CC1CCN(CC1N(C)C2=NC=NC3=C2C=CN3)C(=O)CC#N')
tofacitinib
core = Chem.MolFromSmiles('[*:1]N(C)C2=NC=NC3=C2C=CN3')
const_smi = Chem.MolToSmiles(core)
core

Now the core is defined, with [*:1] as the center of the environment.

The next code shows how to get the environment SMILES (envsmi) for a given radius. To display mol objects inside a for loop, I used the IPython.display.display function.

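from mmpdblib import environment
def get_envsmi(constant_smi, radius):
    # Locate the attachment point atoms ([*]) in the constant part
    centers = environment.find_centers(constant_smi)
    res = []
    for atom_id in centers.atom_ids:
        # Bond indices within `radius` bonds of the attachment point
        env = Chem.FindAtomEnvironmentOfRadiusN(centers.mol, radius, atom_id)
        # Build a sub-molecule from those bonds and write it as SMILES
        submol = Chem.PathToSubmol(centers.mol, env)
        smi = Chem.MolToSmiles(submol)
        res.append(smi)
    # One environment per attachment point, joined as dot-separated SMILES
    return ".".join(res)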

for radi in range(7):
    print(f'RADIUS {radi}')
    envmol = Chem.MolFromSmiles(get_envsmi(const_smi, radi), sanitize=False)
    display.display(envmol)
[Figures: environment substructures drawn for radius = 1, 2, 3, 4, 5 and 6]

As you can see, the get_envsmi function works well for getting the environments around the center atoms with a given radius.
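If mmpdb is not at hand, roughly the same idea can be sketched with RDKit alone. The helper name get_envsmi_rdkit is hypothetical, and it assumes that every attachment point is a dummy atom (atomic number 0); this may not match exactly how mmpdb picks its centers.

```python
from rdkit import Chem

def get_envsmi_rdkit(constant_smi, radius):
    """Hypothetical RDKit-only variant: treat every dummy atom as a center."""
    mol = Chem.MolFromSmiles(constant_smi)
    res = []
    for atom in mol.GetAtoms():
        # Assumption: attachment points are dummy atoms ([*]), atomic number 0.
        if atom.GetAtomicNum() != 0:
            continue
        env = Chem.FindAtomEnvironmentOfRadiusN(mol, radius, atom.GetIdx())
        submol = Chem.PathToSubmol(mol, env)
        res.append(Chem.MolToSmiles(submol))
    return ".".join(res)

envsmi = get_envsmi_rdkit('[*:1]N(C)C2=NC=NC3=C2C=CN3', 2)
# The environment is only a partial structure, so skip sanitization when parsing.
envmol = Chem.MolFromSmiles(envsmi, sanitize=False)
print(envsmi, envmol.GetNumAtoms())
```

Because the environment SMILES describes a cut-out piece of the molecule (for example part of an aromatic ring), it is parsed with sanitize=False, the same as in the loop above.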

FindAtomEnvironmentOfRadiusN is a useful method for getting information about the surroundings of an atom.

Published by iwatobipen

I'm a medicinal chemist at a mid-size pharmaceutical company. I love chemoinformatics, coding, organic synthesis, and my family.
