RDKit can handle reactions. Enumerating many molecules from a reaction template and a set of building blocks is useful for library generation.
Recently I had a question about how to handle intramolecular reactions, such as macrocyclization, with RDKit.
For an amidation reaction, which is often used in drug synthesis, the reaction SMARTS is shown below.
'[C:1][C:2](=[O:6])[O:3].[N:4][C:5]>>[C:1][C:2](=[O:6])[N:4][C:5]'
In RDKit this query describes an intermolecular reaction, A + B => C, so it cannot be applied to an intramolecular reaction such as "OC(=O)CCCCN => C(=O)1CCCCN1".
In RDKit, an intramolecular reaction query is written by enclosing the reactants in parentheses.
You can find this in the documentation.
http://www.rdkit.org/docs/RDKit_Book.html
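As a quick check of the parentheses rule, here is a minimal sketch that applies both forms of the amidation query to the amino acid mentioned above. The constant and variable names are my own choices for illustration, and I sanitize each product before writing SMILES as a precaution.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Two reactant templates: A + B => C
AMIDATION = '[C:1][C:2](=[O:6])[O:3].[N:4][C:5]>>[C:1][C:2](=[O:6])[N:4][C:5]'
# Same transform, but the parentheses group both patterns into ONE template
AMIDATION_INTRA = '([C:1][C:2](=[O:6])[O:3].[N:4][C:5])>>[C:1][C:2](=[O:6])[N:4][C:5]'

inter_rxn = AllChem.ReactionFromSmarts(AMIDATION)
intra_rxn = AllChem.ReactionFromSmarts(AMIDATION_INTRA)

n_inter = inter_rxn.GetNumReactantTemplates()
n_intra = intra_rxn.GetNumReactantTemplates()

# Run the intramolecular query on a single molecule that carries both groups
amino_acid = Chem.MolFromSmiles('OC(=O)CCCCN')
lactam_smiles = set()
for products in intra_rxn.RunReactants((amino_acid,)):
    product = products[0]
    Chem.SanitizeMol(product)  # recompute valences/ring info on the raw product
    lactam_smiles.add(Chem.MolToSmiles(product))

# Canonical SMILES of the lactam written in the text above
expected = Chem.MolToSmiles(Chem.MolFromSmiles('C(=O)1CCCCN1'))
print(n_inter, n_intra, lactam_smiles, expected)
```

The template counts show why the plain query needs two mol objects while the parenthesized one takes a single mol object.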
By the way, how should my code distinguish between intra- and intermolecular reactions? 🤔
I propose a simple solution: treat multiple SMILES as one mol object, perform the reaction, and then separate the result.
OK, my English is difficult to understand, so let’s go to the code.
In this blog post, I wrote an example for amidation.
from rdkit import Chem
from rdkit.Chem import rdChemReactions
from rdkit.Chem import AllChem
from rdkit.Chem.Draw import IPythonConsole
from rdkit.Chem import Draw
IPythonConsole.ipython_useSVG = True

# define intra and inter molecular reaction
intra_rxn = AllChem.ReactionFromSmarts('([C:1][C:2](=[O:6])[O:3].[N:4][C:5])>>[C:1][C:2](=[O:6])[N:4][C:5]')
inter_rxn = AllChem.ReactionFromSmarts('[C:1][C:2](=[O:6])[O:3].[N:4][C:5]>>[C:1][C:2](=[O:6])[N:4][C:5]')

# basic acid / amine
acid = Chem.MolFromSmiles('CC(=O)O')
amine = Chem.MolFromSmiles('NC')

# intramolecular
aminoacid = Chem.MolFromSmiles('N(C)CCC(O)CC(=O)O')

# two molecules in one mol object!
combmol = Chem.MolFromSmiles("CC(=O)O.N1CCC(C)1")

The A + B => C case is shown below.
inter_rxn.RunReactants([acid, amine])[0][0]
In this case the reaction looks good. Next, the intramolecular case is shown below.

# An intramolecular reaction cannot be represented with the intermolecular reaction query
inter_rxn.RunReactant(aminoacid,0)[0][0]

But the intramolecular SMIRKS query works fine.

# the intramolecular query works fine
intra_rxn.RunReactant(aminoacid,0)[0][0]
Next, run the intramolecular reaction object on a single mol object that contains two molecules.

# the paired mol object also works, but the two reactant molecules are handled as one object
intra_rxn.RunReactant(combmol,0)[0][0]

Finally, the combined molecules are separated with the sepMol function. The function converts the molecule to SMILES, splits it on ‘.’, and then converts each SMILES back to a molecule.
def sepMol(mol):
    smi_list = Chem.MolToSmiles(mol).split('.')
    mols = [Chem.MolFromSmiles(smi) for smi in smi_list]
    return mols
ms = sepMol(combmol)
print(len(ms))
# out 2
ms.append(intra_rxn.RunReactant(combmol,0)[0][0])
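The steps above can be folded into one small helper. This is only a sketch under my own assumptions: `run_on_combined` is a hypothetical name, I merge the mol objects with `Chem.CombineMols` instead of writing a dotted SMILES, and I sanitize both the combined molecule and each product to be safe.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

intra_rxn = AllChem.ReactionFromSmarts(
    '([C:1][C:2](=[O:6])[O:3].[N:4][C:5])>>[C:1][C:2](=[O:6])[N:4][C:5]')


def run_on_combined(rxn, mols):
    """Hypothetical helper: merge mols into one Mol, run a one-template
    reaction on it, and return the unique product SMILES (sorted)."""
    combined = mols[0]
    for mol in mols[1:]:
        combined = Chem.CombineMols(combined, mol)
    Chem.SanitizeMol(combined)  # refresh ring info etc. on the merged object
    smiles = set()
    for products in rxn.RunReactants((combined,)):
        product = products[0]
        Chem.SanitizeMol(product)
        smiles.add(Chem.MolToSmiles(product))
    return sorted(smiles)


acid = Chem.MolFromSmiles('CC(=O)O')
amine = Chem.MolFromSmiles('NC')

# intermolecular case: two molecules handled as one object
inter_products = run_on_combined(intra_rxn, [acid, amine])
# intramolecular case: one molecule carrying both groups
intra_products = run_on_combined(intra_rxn, [Chem.MolFromSmiles('OC(=O)CCCCN')])
print(inter_products, intra_products)
```

One thing to keep in mind: because everything is a single object, the query can also match an acid and an amine inside the same building block, so the product list may contain cyclized molecules you did not intend.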
I have only tried this approach with amidation, and I do not know whether it is efficient.
Any comments or advice are appreciated.
All of my code can be viewed at the following URLs.
https://nbviewer.jupyter.org/github/iwatobipen/chemo_info/blob/master/rdkit_notebook/reaction_rdkit.ipynb
https://github.com/iwatobipen/chemo_info/tree/master/rdkit_notebook
Hi,
There is a rogue variable in cell 7 that is used before it is defined: “ps2”.
On the specific problem of separating unbound fragments inside a Mol object, I would have used GetMolFrags (http://rdkit.org/Python_Docs/rdkit.Chem.rdmolops-module.html#GetMolFrags) and then copied the selected atoms into a new Mol. Is there a particular reason not to do that? Did you test the efficiency?
Best,
Hi,
Thank you for your comment! Nice catch! ps2 was not defined, and I have fixed my code.
There is no particular reason why I did not use the GetMolFrags method for this problem. I have never used the method before. I’ll try GetMolFrags. :-D
I appreciate your kind suggestion.
Best,
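For reference, the GetMolFrags idea from this thread might look like the sketch below. It assumes the `asMols=True` option is acceptable here (it returns each fragment as its own Mol, so no manual atom copying is needed) and compares the result with the SMILES-splitting approach from the post. I have not measured which one is faster.

```python
from rdkit import Chem

combmol = Chem.MolFromSmiles('CC(=O)O.N1CCC(C)1')

# Each disconnected fragment comes back as a separate Mol object
frags = Chem.GetMolFrags(combmol, asMols=True)
frag_smiles = sorted(Chem.MolToSmiles(m) for m in frags)

# Same separation done by splitting the SMILES on '.', as sepMol does
split_smiles = sorted(
    Chem.MolToSmiles(Chem.MolFromSmiles(s))
    for s in Chem.MolToSmiles(combmol).split('.'))

print(len(frags), frag_smiles, split_smiles)
```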