Compound Generator with Graph Networks, GraphINVENT take2 #chemoinformatics #RDKit #PyTorch

I posted about a graph-based compound generator named ‘GraphINVENT’ some days ago.

https://wordpress.com/block-editor/post/iwatobipen.wordpress.com/3450

Fortunately, I got a response from the author, along with useful information about their model.

The hyperparameters of the model are very important and difficult to optimize, but I was able to find a suitable learning rate for training on the GDB-13 small set. I changed lr from 1e-3 to 5e-2 (the original repository has since been changed to the same value) and tried building the model again. Training with the same molecules but a different learning rate and more epochs (from 100 to 400), I could generate the following molecules.

Molecules from 10 epochs.

The model trained for only 10 epochs generates many invalid molecules (Xe) and… not very interesting molecules.

Molecules from 100 epochs.

Molecules from 400 epochs.

Hmm… After 400 epochs of training, the molecules seem not so bad to me. Most of the molecules have suitable ring sizes.
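The ring-size impression can be checked roughly by counting ring sizes with RDKit. This is only a minimal sketch: the helper name `ring_size_counts` and the example SMILES are placeholders I made up for illustration, not molecules generated by GraphINVENT.

```python
from collections import Counter

from rdkit import Chem


def ring_size_counts(smiles_list):
    """Count how many rings of each size appear in a list of SMILES."""
    counts = Counter()
    for smi in smiles_list:
        mol = Chem.MolFromSmiles(smi)
        if mol is None:
            # RDKit could not parse this SMILES, so skip it.
            continue
        # AtomRings() returns one tuple of atom indices per ring.
        for ring in mol.GetRingInfo().AtomRings():
            counts[len(ring)] += 1
    return counts


# Placeholder molecules (assumption): cyclopropane, benzene, cyclohexane, ethanol.
example_smiles = ["C1CC1", "c1ccccc1", "C1CCCCC1", "CCO"]
ring_counts = ring_size_counts(example_smiles)
print(sorted(ring_counts.items()))
```

Running the same function on molecules sampled at different epochs would give a quick way to compare how often unusual ring sizes appear.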

In summary, it is difficult to optimize the hyperparameters, but doing so really improves compound quality. Does this mean that the loss function used for model optimization still has room for improvement? I think so, because the model uses KL divergence to compare distributions of molecules. But generator performance should be evaluated not only on compound distribution but also on other drug-likeness metrics (for drug-like molecule generation). It was worth learning that hyperparameters are very important for the generative model. ;)
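To make the KL divergence idea concrete, here is a small pure-Python sketch for two discrete distributions, for example normalized histograms of some molecular property. It is not the GraphINVENT loss implementation; the function name `kl_divergence`, the `eps` floor, and the numbers are my own assumptions for illustration.

```python
import math


def kl_divergence(p, q, eps=1e-10):
    """KL(P || Q) for two discrete distributions given as sequences of weights."""
    total_p = sum(p)
    total_q = sum(q)
    kl = 0.0
    for p_i, q_i in zip(p, q):
        p_i = p_i / total_p  # normalize so each distribution sums to 1
        q_i = q_i / total_q
        if p_i > 0:
            # Floor q_i at eps (an assumption) to avoid division by zero.
            kl += p_i * math.log(p_i / max(q_i, eps))
    return kl


# Made-up histograms: "training set" vs. "generated set".
train_hist = [0.5, 0.5]
generated_hist = [0.9, 0.1]
kl_value = kl_divergence(train_hist, generated_hist)
print(round(kl_value, 4))
```

The value is zero only when the two distributions match, and it says nothing about properties that were left out of the histograms, which is why additional drug-likeness metrics are still useful.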

Published by iwatobipen

I'm a medicinal chemist at a mid-size pharmaceutical company. I love chemoinfo, coding, organic synthesis, and my family.