Compound Generator with Graph Networks, GraphINVENT take2 #chemoinformatics #RDKit #PyTorch

I posted about a graph-based compound generator named ‘GraphINVENT’ a few days ago.

Fortunately, I got a response from the author and received useful information about their model.

The hyperparameters of the model are very important and difficult to optimize, but I found a suitable learning rate for training on the GDB-13 small set. I changed lr from 1e-3 to 5e-2 (the original repository has since been changed to the same value) and built the model again. Training with the same molecules but a different learning rate and more epochs (from 100 to 400), I could generate the following molecules.

molecules from 10 epochs.

A model trained for only 10 epochs generates many invalid molecules (Xe) and... not very interesting molecules.
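To put a number on "many invalid molecules", a quick validity check with RDKit can help. This is only a minimal sketch: it assumes the generated structures are available as SMILES strings, and the `fraction_valid` helper and the example list are hypothetical, not part of GraphINVENT.

```python
from rdkit import Chem
from rdkit import RDLogger

# Silence RDKit parse warnings so invalid SMILES do not flood the output.
RDLogger.DisableLog("rdApp.*")


def fraction_valid(smiles_list):
    """Return the fraction of SMILES that RDKit can parse and sanitize."""
    if not smiles_list:
        return 0.0
    # Chem.MolFromSmiles returns None when parsing or sanitization fails.
    n_valid = sum(1 for smi in smiles_list if Chem.MolFromSmiles(smi) is not None)
    return n_valid / len(smiles_list)


# Hypothetical generated output: two valid molecules and two broken ones
# (an unclosed ring and a carbon with too many bonds).
generated = ["c1ccccc1", "CCO", "C1CC", "C(C)(C)(C)(C)(C)C"]
print(f"valid fraction: {fraction_valid(generated):.2f}")
```

Running the same check on the output of each epoch would show how validity changes as training proceeds.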

molecules from 100 epochs.
molecules from 400 epochs

Hmm… After 400 epochs of training, the molecules seem not so bad to me. Most of the molecules have suitable ring sizes.
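The ring size impression can also be checked by counting ring sizes with RDKit instead of judging by eye. This is a rough sketch, again assuming the generated molecules are given as SMILES. `ring_size_counts` and the example molecules are hypothetical names and data for illustration.

```python
from collections import Counter

from rdkit import Chem


def ring_size_counts(smiles_list):
    """Count how often each ring size occurs across the parsable molecules."""
    counts = Counter()
    for smi in smiles_list:
        mol = Chem.MolFromSmiles(smi)
        if mol is None:
            # Skip structures RDKit cannot parse.
            continue
        # AtomRings() yields one tuple of atom indices per ring.
        for ring in mol.GetRingInfo().AtomRings():
            counts[len(ring)] += 1
    return counts


# Hypothetical examples: benzene, cyclopropane, cyclopentane, ethanol.
generated = ["c1ccccc1", "C1CC1", "C1CCCC1", "CCO"]
counts = ring_size_counts(generated)
for size in sorted(counts):
    print(f"ring size {size}: {counts[size]}")
```

A large share of 3- or 4-membered rings, or of very large rings, would be a hint that the model has not yet learned reasonable ring systems.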

In summary, it is difficult to optimize the hyperparameters, but doing so really improves compound quality. Does this mean the loss function used for model optimization still has room for improvement? I think so, because the model uses KL divergence to compare distributions of molecules. But generator performance should be evaluated not only on the compound distribution but also on other metrics such as druglikeness (for drug-like molecule generation). For me, it was worth learning that hyperparameters are very important for the generative model. ;)
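As a reminder of what KL divergence measures, here is a minimal sketch in plain Python for two discrete distributions. It is not GraphINVENT's loss implementation. The function `kl_divergence` and the two histograms are assumptions made up for illustration (for example, counts of some property in the training set versus the generated set).

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """Compute KL(P || Q) for two discrete distributions of equal length.

    Inputs may be raw counts; both are normalized to sum to 1 first.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    p_sum, q_sum = sum(p), sum(q)
    p = [x / p_sum for x in p]
    q = [x / q_sum for x in q]
    # Terms with p_i == 0 contribute nothing; eps guards against log(p/0).
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical histograms over the same bins.
training_hist = [10, 30, 40, 20]
generated_hist = [25, 25, 25, 25]
print(f"KL(training || generated) = {kl_divergence(training_hist, generated_hist):.4f}")
```

The value is zero only when the two distributions match, and it says nothing about whether the individual molecules are drug-like, which is why extra metrics are useful.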


Published by iwatobipen

I'm a medicinal chemist at a mid-size pharmaceutical company. I love chemoinfo, coding, organic synthesis, and my family.
