Particle Swarm Optimization for molecular design #RDKit #Chemoinformatics

I attended the RDKit UGM last week, and I think it was well worth going. At the meeting I picked up useful information about de novo molecular design. You can find the slide deck at the following URL.

They used Particle Swarm Optimization (PSO) for de novo molecular design. PSO is a very simple method for parameter optimization. Details are described on Wikipedia.

The PSO algorithm reminds me of Q-learning. PSO tries to improve the objective function by updating each particle's velocity, and therefore its position, at every iteration.
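To make the velocity update concrete, here is a minimal sketch of textbook PSO in plain Python. It is not the mso implementation: the function name pso_minimize, the coefficient values, and the sphere test function are my own assumptions for illustration, and this sketch minimizes a score whereas mso works with a fitness to improve.

```python
import random


def pso_minimize(objective, dim, bounds, num_particles=30, iterations=100,
                 inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Textbook PSO sketch (hypothetical helper, not part of mso)."""
    rng = random.Random(seed)
    low, high = bounds
    # Random starting positions, zero starting velocities.
    positions = [[rng.uniform(low, high) for _ in range(dim)]
                 for _ in range(num_particles)]
    velocities = [[0.0] * dim for _ in range(num_particles)]
    personal_best = [p[:] for p in positions]
    personal_best_score = [objective(p) for p in positions]
    best_idx = min(range(num_particles), key=personal_best_score.__getitem__)
    global_best = personal_best[best_idx][:]
    global_best_score = personal_best_score[best_idx]

    for _ in range(iterations):
        for i in range(num_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity is pulled toward the particle's own best position
                # and toward the best position found by the whole swarm.
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + c_personal * r1 * (personal_best[i][d] - positions[i][d])
                    + c_global * r2 * (global_best[d] - positions[i][d])
                )
                # Move the particle and keep it inside the search bounds.
                new_pos = positions[i][d] + velocities[i][d]
                positions[i][d] = min(high, max(low, new_pos))
            score = objective(positions[i])
            if score < personal_best_score[i]:
                personal_best[i] = positions[i][:]
                personal_best_score[i] = score
                if score < global_best_score:
                    global_best = positions[i][:]
                    global_best_score = score
    return global_best, global_best_score


def sphere(x):
    """Toy objective with its minimum (0.0) at the origin."""
    return sum(v * v for v in x)


best_position, best_score = pso_minimize(sphere, dim=3, bounds=(-5.0, 5.0))
print(best_position, best_score)
```

In molecular design the position would be a point in the latent space rather than three toy coordinates, and the objective would be a molecular score.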

Fortunately, a PSO implementation for molecular generation is available on GitHub. ;)

So I installed mso from GitHub and tried it out.

First I installed cddd, because mso depends on it. cddd encodes molecules into a latent space and decodes latent vectors back into molecules. The cddd README requires tensorflow==1.10, but it also worked for me with tensorflow==1.13. Then I installed mso. Both libraries can be installed from GitHub.

$ git clone
$ cd cddd
$ pip install -e .
$ cd ../
$ git clone
$ cd mso
$ pip install -e .

After installing the packages I tried mso. For convenience, I used the pretrained model provided by the cddd repo. I downloaded the default model from the following URL, stored it in the ./cddd/data folder, and unzipped it. URL:

Now everything is ready, so let’s try it.

MSO optimizes particle positions against objective functions. The package ships with many of them, such as QED, substructure, and alert functions.
Several scoring functions can also be combined.
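As a rough illustration of what combining scoring functions can mean, the sketch below takes a weighted average of per-objective scores that are each assumed to lie between 0 and 1. I am not claiming this is how mso combines scores internally; combine_scores and the example values are hypothetical.

```python
def combine_scores(scores, weights):
    """Weighted average of per-objective scores (hypothetical helper).

    Each score is assumed to be already scaled to the range 0..1.
    """
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(s * w for s, w in zip(scores, weights)) / total_weight


# Illustrative values only: a QED-like score and a substructure hit,
# weighted equally.
combined = combine_scores([0.8, 1.0], [1.0, 1.0])
print(combined)
```

Dividing by the total weight keeps the combined value in the same 0..1 range, which makes thresholds easier to reason about.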

The following example is simple. First, I used the substructure and QED functions. The substructure function returns 1 if the generated molecule contains a user-defined structure. This is useful because RNN-based generators often generate molecules randomly, which makes it difficult to keep a specific substructure.

from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Chem.Draw import IPythonConsole
from rdkit.Chem import Draw

from mso.optimizer import BasePSOptimizer
from mso.objectives.scoring import ScoringFunction
from mso.objectives.mol_functions import qed_score
from mso.objectives.mol_functions import sa_score
from mso.objectives.mol_functions import substructure_match_score
from functools import partial
from cddd.inference import InferenceModel

sub_match = partial(substructure_match_score, query=Chem.MolFromSmiles('c1ccncc1'))

init_smiles = "c1c(C)cccc1" 
scoring_functions = [ScoringFunction(func=qed_score, name="qed", is_mol_func=True), ScoringFunction(func=sub_match, name='subst', is_mol_func=True)]

The substructure_match_score function requires several arguments, including the substructure query. To use it, I froze the query argument with functools.partial and then passed the functions to the ScoringFunction class.
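For readers who have not used functools.partial, here is a small stand-alone sketch of the same pattern. The function toy_match_score is a hypothetical stand-in that only checks for a substring; it is not real substructure matching and is not part of mso or RDKit.

```python
from functools import partial


def toy_match_score(candidate, query):
    """Hypothetical scorer: 1.0 if the query text occurs in the candidate.

    This is plain string matching for illustration, NOT chemical
    substructure matching.
    """
    return 1.0 if query in candidate else 0.0


# Freeze the keyword argument `query`, leaving a one-argument function
# that an optimizer could call with each candidate.
has_token = partial(toy_match_score, query="ncc")

print(has_token("c1ccncc1"))
print(has_token("c1ccccc1"))
```

The frozen function keeps the original available as has_token.func and the fixed arguments in has_token.keywords, which is handy when debugging a list of scoring functions.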

infermodel = InferenceModel()
opt = BasePSOptimizer.from_query(

Then I defined the optimizer instance. num_part is the number of particles, that is, the number of molecules. Finally, I ran the optimization.

res =
res0 = res[0]

res0 holds the best-scoring SMILES along with the other generated SMILES. I retrieved them and drew the molecules. Fitness is a normalized score.

mols = []
for idx, smi in enumerate(res0.smiles):
        mol = Chem.MolFromSmiles(smi)
        if mol != None and[idx] > 0.7:

OK, the generated molecules have the user-defined substructure. However, they all look like similar compounds. Does that depend on the training set?

The opt instance keeps its fitness history as a pandas DataFrame. Fortunately, RDKit has PandasTools. ;)

from IPython.display import HTML
from rdkit.Chem import PandasTools
PandasTools.AddMoleculeColumnToFrame(opt.best_fitness_history, smilesCol='smiles')

It’s very easy. The MSO optimizer does not use a GPU, so the process runs very fast if you have many CPUs.

MSO seems to be a useful package for molecular generation. Users can also define their own objective functions. I will explore it in more depth.

Today’s code (msotest.ipynb) is uploaded as a gist.


Published by iwatobipen

I'm a medicinal chemist at a mid-size pharmaceutical company. I love chemoinformatics, coding, organic synthesis, and my family.
