Recently there have been many publications on de novo molecular generators, most of which use deep learning. One problem with this approach is that the generated molecules are not systematic, so it is difficult to synthesize them with parallel chemistry. I think this is why chemists sometimes dislike the proposals generated by these methods.
Rule-based, reaction (Rxn)-based or matched molecular pair (MMP)-based molecule generation is another approach. It relies on more chemist-friendly rules. These methods are not new, but they are useful, and related approaches are still being reported.
A few days ago I found a new article in J. Cheminform. The title was ‘CReM: chemically reasonable mutation framework for structure generation’. URL is below.
The author proposes a new workflow for mutating molecular structures, and it looks like an MMP approach. It breaks molecules down into fragments together with their local context (radius 1 to 5) to build a database of interchangeable fragments, stored as key-value structures much like MMP. The data is then used by the ‘MUTATE’, ‘GROW’ and ‘LINK’ operations to generate new structures.
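To make the key-value idea concrete, here is a toy sketch in plain Python. It is only my illustration, not CReM's actual database schema: the context labels, the fragment strings and the helper names `build_db` and `replacements` are all made up. The point is simply that fragments observed in the same context at the same radius are treated as interchangeable.

```python
from collections import defaultdict

# Toy records: (context label, radius, fragment).
# The labels and fragments are hypothetical placeholders, not real CReM data.
RECORDS = [
    ("ctx_aromatic_C", 1, "[*:1]OC"),
    ("ctx_aromatic_C", 1, "[*:1]N"),
    ("ctx_aromatic_C", 1, "[*:1]Cl"),
    ("ctx_amide_N", 1, "[*:1]C"),
    ("ctx_anisole_like", 3, "[*:1]OC"),
    ("ctx_anisole_like", 3, "[*:1]OCC"),
]


def build_db(records):
    """Group fragments by (context, radius): the key-value structure."""
    db = defaultdict(set)
    for context, radius, fragment in records:
        db[(context, radius)].add(fragment)
    return db


def replacements(db, context, radius, fragment):
    """Return fragments seen in the same context and radius, minus the input."""
    return sorted(db.get((context, radius), set()) - {fragment})


db = build_db(RECORDS)
print(replacements(db, "ctx_aromatic_C", 1, "[*:1]OC"))
print(replacements(db, "ctx_anisole_like", 3, "[*:1]OC"))
```

In this toy data the larger radius gives a more specific key, so fewer fragments share it and fewer replacements come back.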
I felt that the article is very similar to the following ACS article reported by Kawai et al.
They proposed a similar approach to molecular generation using a fragment database.
Comparing these approaches, I think the main difference is that CReM lets you set the context radius. This setting affects the character of the generated molecules.
In Fig. 4 and Fig. 5 of the J. Cheminform. article, the author shows the properties of molecules generated with different radii. For example, the novelty and diversity scores decrease when a large radius (5) is used. It means that compounds with more similar context are generated with that setting.
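If you want to check diversity on your own generated sets, one common measure is the mean pairwise Tanimoto distance between fingerprints. The sketch below is my assumption of a reasonable metric, not necessarily the one used in the article. To keep it dependency-free, each fingerprint is just a Python set of on-bit indices (with rdkit you could get these from a Morgan fingerprint), and the function names are my own.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    if not a and not b:
        return 1.0  # two empty fingerprints are treated as identical
    return len(a & b) / len(a | b)


def mean_pairwise_distance(fps):
    """Average (1 - Tanimoto) over all unordered pairs; 0.0 for fewer than 2."""
    pairs = list(combinations(fps, 2))
    if not pairs:
        return 0.0
    return sum(1.0 - tanimoto(a, b) for a, b in pairs) / len(pairs)


# Hypothetical on-bit sets standing in for real fingerprints.
similar_set = [{1, 2, 3}, {1, 2, 4}, {1, 2, 5}]
mixed_set = [{1, 2, 3}, {1, 2, 4}, {5, 6}]

print(mean_pairwise_distance(similar_set))
print(mean_pairwise_distance(mixed_set))
```

A higher value means the set is more diverse, so comparing this number for sets generated at radius 1 and radius 5 is a quick sanity check of the trend described above.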
As you know, the CReM author has released the implementation, so let’s use it. It is easy to install because crem has very few dependencies, just rdkit and guacamol (optional). First I installed CReM with pip and downloaded a ready-to-use DB.
$ git clone https://github.com/DrrDom/crem.git
$ cd crem
$ pip install .
# Get the data set. Thanks for providing the db!
$ wget http://www.qsar4u.com/files/cremdb/replacements02_sc2.5.db.gz
$ gzip -d replacements02_sc2.5.db.gz
I also uploaded example code to my gist.
With this dataset, structure generation took a few minutes. After generating the molecules, rdkit can render them with its drawing functions.
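For readers who cannot open the gist, here is a minimal sketch of the same idea. It is not the gist code itself. It assumes crem and rdkit are installed and that the DB downloaded above sits in the current directory. `generate_analogs` and `draw_grid` are helper names I made up, and the input SMILES is an arbitrary example. `mutate_mol` from `crem.crem` and `Draw.MolsToGridImage` from rdkit are the real entry points.

```python
import os
from itertools import islice


def generate_analogs(smiles, db_path, radius=3, limit=20):
    """Return up to `limit` SMILES generated by CReM's MUTATE operation."""
    if not os.path.isfile(db_path):
        raise FileNotFoundError(f"fragment DB not found: {db_path}")
    # Imported here so the missing-DB check above works without rdkit/crem.
    from rdkit import Chem
    from crem.crem import mutate_mol

    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles}")
    # mutate_mol yields SMILES strings by default; radius is the context radius.
    return list(islice(mutate_mol(mol, db_name=db_path, radius=radius), limit))


def draw_grid(smiles_list, per_row=5):
    """Render SMILES as a grid image with rdkit, skipping unparsable ones."""
    from rdkit import Chem
    from rdkit.Chem import Draw

    mols = [m for m in map(Chem.MolFromSmiles, smiles_list) if m is not None]
    return Draw.MolsToGridImage(mols, molsPerRow=per_row)


if __name__ == "__main__":
    db = "replacements02_sc2.5.db"  # file name from the download step above
    if os.path.isfile(db):
        for r in (1, 3):
            analogs = generate_analogs("c1cc(OC)ccc1C", db, radius=r)
            print(f"radius={r}: {len(analogs)} structures")
    else:
        print("DB file not found, skipping generation")
```

In a Jupyter notebook, calling `draw_grid(analogs)` as the last line of a cell should display the structures. The `limit` argument is only there to keep the run short while trying things out.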
It seems that radius=1 generates a more diverse compound set. CReM is easy to use for molecular generation.
OK, now we can use both deep learning-based and rule-based structure generators. Each method has pros and cons. As the author says, CReM can generate chemically reasonable structures but can’t generate new rings that are not in the fragment DB.
Which is the better proposal for medicinal chemists: a new structure built from known fragments, or a new structure with novel fragments?
It depends on the situation, but novel fragments require new chemistry or many wet experiments. AI-driven drug discovery can’t replace all wet experiments with dry experiments. Deciding which molecules to make first and which to make next, that is, experimental design, is key for many projects.
Have a nice weekend. ;)