In FBDD projects, the fragment linking strategy is easy to understand in principle, but I think actually linking two fragments is difficult in the real world. There are many tools for linking fragments virtually. These tools can be applied not only to FBDD but also to scaffold hopping and similar tasks.
Few examples of de novo fragment linking with deep learning have been reported compared with de novo compound (SMILES) generation.
Recently an interesting package was reported in JCIM. The article is open, so readers can get the PDF from the ACS site.
The title is ‘Deep Generative Models for 3D Linker Design’ and the URL is below. https://pubs.acs.org/doi/10.1021/acs.jcim.9b01120
The authors developed a Python package named DeLinker, which links two RDKit mol objects with a deep generative model.
https://github.com/oxpig/DeLinker
The package is available for Python 3.6 with TensorFlow 1.10. I wanted to test DeLinker in my development environment (Python 3.7), so I modified the code and tried it.
My environment was Python 3.7, tensorflow-gpu 1.14, rdkit 2020.03.1.
To use the environment described above, I changed DeLinker_test.py. The API changed between TensorFlow 1.10 and TensorFlow 1.14, so GRUCell should be called from tf.compat.v1.nn.rnn_cell.
    #cell = tf.contrib.rnn.GRUCell(new_h_dim)
    cell = tf.compat.v1.nn.rnn_cell.GRUCell(new_h_dim)
    #cell = tf.nn.rnn_cell.DropoutWrapper(cell,
    #                                     state_keep_prob=self.placeholders['graph_state_keep_prob'])
    cell = tf.compat.v1.nn.rnn_cell.DropoutWrapper(cell,
                                                   state_keep_prob=self.placeholders['graph_state_keep_prob'])
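If the same script has to run on more than one TensorFlow 1.x release, the lookup could be done once with a fallback instead of commenting lines in and out. The sketch below is only an illustration: `resolve_first` is a hypothetical helper of my own, not part of DeLinker or TensorFlow, and the two dotted paths are simply the locations used in the snippet above.

```python
from functools import reduce


def resolve_first(root, dotted_paths):
    """Return the first attribute found under `root` among `dotted_paths`.

    Hypothetical helper for illustration only (not a DeLinker/TensorFlow API).
    """
    for path in dotted_paths:
        try:
            # Walk "a.b.c" one attribute at a time, starting from root.
            return reduce(getattr, path.split("."), root)
        except AttributeError:
            continue  # this location is missing, try the next one
    raise AttributeError("none of %r found" % (dotted_paths,))


# Usage sketch (assumes `import tensorflow as tf` was done beforehand):
# GRUCell = resolve_first(tf, ["compat.v1.nn.rnn_cell.GRUCell",
#                              "contrib.rnn.GRUCell"])
# cell = GRUCell(new_h_dim)
```

The paths are tried in order, so the newer location is preferred and the older one is used only when the first lookup fails.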
After changing that part, I modified the example notebook from a fragment linking task to a core replacement task.
The following code was run in a Jupyter notebook in the example folder. Let’s test it.
First, import the packages.
    import sys
    sys.path.append("../")
    sys.path.append("../analysis/")
    from rdkit import Chem
    from rdkit.Chem import AllChem
    from rdkit.Chem import Draw
    from rdkit.Chem.Draw import IPythonConsole
    from rdkit.Chem.Draw import MolDrawing, DrawingOptions
    from rdkit.Chem import MolStandardize
    import numpy as np
    from itertools import product
    from joblib import Parallel, delayed
    import re
    from collections import defaultdict
    from IPython.display import clear_output
    IPythonConsole.ipython_useSVG = True
    from DeLinker_test import DenseGGNNChemModel
    import frag_utils
    import rdkit_conf_parallel
    from data.prepare_data import read_file, preprocess
    import example_utils
    import rdkit
    AllChem.SetPreferCoordGen(True)
    rdkit.__version__
    > '2020.03.1'
Then add some basic settings and read the molecule from SMILES. The following code generates a 3D conformer for core replacement, because DeLinker generates linkers that keep the angle between the fragment linking points and their exit vectors. So I need to generate 3D coordinates first. Fortunately, RDKit makes this very easy. After generating the conformer, I removed the core structure. Now I have side chains with a 3D conformation and attachment points marked as *.
    # How many cores for multiprocessing
    n_cores = 4
    # Whether to use GPU for generating molecules with DeLinker
    use_gpu = True

    vemurafenib = Chem.MolFromSmiles('CCCS(=O)(=O)Nc1ccc(F)c(c1F)C(=O)c2c[nH]c3c2cc(cn3)c4ccc(Cl)cc4')
    core = Chem.MolFromSmiles('c12c(cc[NH]2)cccn1')
    Draw.MolsToGridImage([vemurafenib, core])

    tempmol = Chem.AddHs(vemurafenib)
    AllChem.EmbedMolecule(tempmol)
    vemurafenib_3d = Chem.RemoveHs(tempmol)
    vemurafenib_3d

    sidechains = Chem.ReplaceCore(vemurafenib_3d, core)

OK, let’s get some query-related data.
    # Get distance and angle between fragments
    dist, ang = frag_utils.compute_distance_and_angle(sidechains, "", Chem.MolToSmiles(sidechains))
    Chem.MolToSmiles(sidechains), dist, ang
    >('[1*]C(=O)c1c(F)ccc(NS(=O)(=O)CCC)c1F.[2*]c1ccc(Cl)cc1',
    > 5.6219243402884125,
    > 1.4035194537415576)
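To make the two numbers more concrete, here is a rough, dependency-free illustration of what a distance and an angle between two exit vectors mean. It is not the implementation in `frag_utils`: the function name, the argument layout, and the choice to measure the distance between the two anchor atoms are my own assumptions, and this sketch returns the angle in radians.

```python
import math


def exit_vector_geometry(anchor_a, dummy_a, anchor_b, dummy_b):
    """Distance between two anchors and angle between their exit vectors.

    Each argument is an (x, y, z) tuple. An exit vector points from the
    anchor atom of a fragment to its dummy atom (*).
    Hypothetical helper; NOT the frag_utils implementation.
    """
    vec_a = [d - a for a, d in zip(anchor_a, dummy_a)]
    vec_b = [d - a for a, d in zip(anchor_b, dummy_b)]
    norm_a = math.sqrt(sum(c * c for c in vec_a))
    norm_b = math.sqrt(sum(c * c for c in vec_b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("anchor and dummy atom must not coincide")
    cos_angle = sum(x * y for x, y in zip(vec_a, vec_b)) / (norm_a * norm_b)
    # Clamp to [-1, 1] to guard against floating-point round-off.
    cos_angle = max(-1.0, min(1.0, cos_angle))
    distance = math.sqrt(sum((q - p) ** 2 for p, q in zip(anchor_a, anchor_b)))
    return distance, math.acos(cos_angle)
```

The coordinates would come from the 3D conformer, which is why the conformer has to be generated before the core is removed.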
In my example code, the file names and path settings are the same as in the original example code, so be careful if you want to follow along: the code will overwrite the original example output.
    # Write data to file
    data_path = "./fragments_test_data.txt"
    with open(data_path, 'w') as f:
        f.write("%s %s %s" % (Chem.MolToSmiles(sidechains), dist, ang))
    raw_data = read_file(data_path)
    preprocess(raw_data, "zinc", "fragments_test", True)
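Since the file names match the original example, one way to avoid clobbering its files is to refuse to overwrite an existing one. The sketch below is a minimal illustration using only the standard library. `write_query_line` is a hypothetical helper of mine; the only thing taken from the code above is the space-separated `SMILES distance angle` line format.

```python
def write_query_line(path, smiles, dist, ang, overwrite=False):
    """Write one 'SMILES distance angle' line to `path`.

    Hypothetical helper: with overwrite=False an existing file is protected.
    """
    # Mode "x" creates the file and raises FileExistsError if it already exists.
    mode = "w" if overwrite else "x"
    with open(path, mode) as handle:
        handle.write("%s %s %s" % (smiles, dist, ang))
    return path
```

Passing `overwrite=True` restores the behaviour of the plain `open(data_path, 'w')` call.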
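Almost there. Load the model and call train(). As I understand it, because "generation" is set to true in the config, this call generates molecules with the pretrained model rather than training it.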
    import os
    if not use_gpu:
        os.environ['CUDA_VISIBLE_DEVICES'] = '-1'

    # Arguments for DeLinker
    args = defaultdict(None)
    args['--dataset'] = 'zinc'
    args['--config'] = '{"generation": true, \
                         "batch_size": 1, \
                         "number_of_generation_per_valid": 50, \
                         "min_atoms": 6, "max_atoms": 15, \
                         "train_file": "molecules_fragments_test.json", \
                         "valid_file": "molecules_fragments_test.json", \
                         "output_name": "DeLinker_example_generation.smi"}'
    args['--freeze-graph-model'] = False
    args['--restore'] = '../models/pretrained_DeLinker_model.pickle'

    # Setup model and generate molecules
    model = DenseGGNNChemModel(args)
    model.train()
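The `--config` value is a JSON string, so it could also be built from a dictionary with `json.dumps` instead of a hand-escaped literal. The sketch below reproduces the same keys and values as above; only the variable names `config` and `config_json` are my own.

```python
import json

# Same settings as the hand-written JSON string above.
config = {
    "generation": True,
    "batch_size": 1,
    "number_of_generation_per_valid": 50,
    "min_atoms": 6,
    "max_atoms": 15,
    "train_file": "molecules_fragments_test.json",
    "valid_file": "molecules_fragments_test.json",
    "output_name": "DeLinker_example_generation.smi",
}

# json.dumps turns Python True into JSON true and takes care of quoting.
config_json = json.dumps(config)
```

`config_json` could then be assigned to `args['--config']`. This makes it easier to change one setting, such as `max_atoms`, without touching the quoting.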
Let’s read the generated molecules and visualize them.
    # Load molecules
    generated_smiles = frag_utils.read_triples_file("./DeLinker_example_generation.smi")

    in_mols = [smi[1] for smi in generated_smiles]
    frag_mols = [smi[0] for smi in generated_smiles]
    gen_mols = [smi[2] for smi in generated_smiles]

    du = Chem.MolFromSmiles('*')
    clean_frags = [Chem.MolToSmiles(Chem.RemoveHs(AllChem.ReplaceSubstructs(Chem.MolFromSmiles(smi),du,Chem.MolFromSmiles('[H]'),True)[0])) for smi in frag_mols]

    clear_output(wait=True)
    print("Done")

    # Check valid
    results = []
    for in_mol, frag_mol, gen_mol, clean_frag in zip(in_mols, frag_mols, gen_mols, clean_frags):
        if len(Chem.MolFromSmiles(gen_mol).GetSubstructMatch(Chem.MolFromSmiles(clean_frag)))>0:
            results.append([in_mol, frag_mol, gen_mol, clean_frag])

    print("Number of generated SMILES: \t%d" % len(generated_smiles))
    print("Number of valid SMILES: \t%d" % len(results))
    print("%% Valid: \t\t\t%.2f%%" % (len(results)/len(generated_smiles)*100))
    > Number of generated SMILES: 500
    > Number of valid SMILES: 495
    > % Valid: 99.00%

    from rdkit.Chem import Draw
    from IPython.display import display
    gemols = []
    for res in results[:100]:
        gemols.append(Chem.MolFromSmiles(res[2]))
        im = Draw.MolsToGridImage([vemurafenib]+[Chem.MolFromSmiles(s) for s in res[1:3]], molsPerRow=4)
        display(im)
    Draw.MolsToGridImage(gemols[:30], molsPerRow=4)
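The percentage above divides by the number of generated SMILES, which would fail if the output file were empty. Below is a small, dependency-free sketch of the same bookkeeping. `summarize_generation` is a hypothetical helper of mine. The validity test is passed in as a function so that the RDKit substructure check above can be reused. The uniqueness count is an extra I added, and it compares plain strings rather than canonical SMILES.

```python
def summarize_generation(triples, is_valid):
    """Count generated, valid, and unique valid molecules.

    triples  : list of (fragments, input_molecule, generated) SMILES tuples,
               in the same order as the triples read above
    is_valid : callable taking one triple and returning True or False
    """
    valid = [t for t in triples if is_valid(t)]
    n_total = len(triples)
    # Avoid ZeroDivisionError when nothing was generated.
    pct_valid = 100.0 * len(valid) / n_total if n_total else 0.0
    return {
        "generated": n_total,
        "valid": len(valid),
        # String-level uniqueness only; an assumption, not canonicalised.
        "unique_valid": len({t[2] for t in valid}),
        "pct_valid": pct_valid,
    }
```

A large gap between `valid` and `unique_valid` would suggest that the model keeps proposing the same few linkers.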
The model generated new molecules with high validity. That looks nice. Images of the generated molecules are below.

The original molecule, vemurafenib, has a bicyclic core (azaindole), but the generated molecules have aliphatic or monocyclic linkers. I’m not sure whether this is due to the training set. I would like to check the training data later.
Anyway, DeLinker works for the fragment linking task. In practice, we need to filter the generated molecules with other in silico tools such as docking, MD, etc.
It’s an interesting tool; however, there are already many in silico tools for fragment linking and scaffold replacement, for example from MOE, OE, Schrödinger, etc. So the question is how to differentiate it from them. IMHO an open-source package offers a lot of flexibility. I’m looking forward to further research with the package.
Today’s code was uploaded to my gist. Thanks to the DeLinker developers!