Generate molecules from molecular formula #Chemoinformatics #memo #jcheminf

Most chemoinformaticians will think that C6H6 means benzene, whose SMILES string is ‘c1ccccc1’.

However, how many possible structures do you think can be generated from the molecular formula C6H6?


Yes, it’s an interesting but difficult question.

Recently I read an interesting article published in the Journal of Cheminformatics. The title is ‘Surge: a fast open-source chemical graph generator’.

The authors developed a fast chemical graph generator that generates molecules from a molecular formula.

To generate chemical graphs from a formula, several steps are required: generating candidate graphs, checking automorphisms so that equivalent structures are not produced twice, and assigning bond multiplicities.

In the case of C6H6, over 200 molecules are generated with surge!!!
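
To get a feeling for why one small formula gives so many structures, it helps to count the degrees of unsaturation (rings plus multiple bonds). The sketch below is my own back-of-the-envelope helper, not something surge provides; the function names are hypothetical, and it only handles C, H, N and halogens with their usual valences.

```python
import re


def parse_formula(formula):
    """Parse a simple formula such as 'C6H6' into {'C': 6, 'H': 6}."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def degrees_of_unsaturation(formula):
    """Rings plus pi bonds, using (2C + 2 + N - H - X) / 2.

    Assumes usual valences (C=4, N=3, H and halogens=1); O and S do not
    change the result and are ignored.
    """
    c = parse_formula(formula)
    halogens = sum(c.get(x, 0) for x in ("F", "Cl", "Br", "I"))
    return (2 * c.get("C", 0) + 2 + c.get("N", 0) - c.get("H", 0) - halogens) // 2


print(degrees_of_unsaturation("C6H6"))
```

For C6H6 this gives four, so every generated structure has to distribute four rings and/or multiple bonds over six carbons, and there are many ways to do that besides the benzene ring.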

Fortunately, a binary version of surge is provided at the following URL.

So I tried it. First, I downloaded the program from the URL above and generated molecules from the formula C6H6.

$ ./surge-linux-v1.0 -o hoge.sdf C6H6

Then hoge.sdf was generated, and I checked the generated structures.
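
A quick sanity check before opening the file in a toolkit is to count how many records it contains. This is a minimal sketch, assuming the output follows the usual SD file convention where each record ends with a line containing only `$$$$`; the function name is my own.

```python
def count_sdf_records(lines):
    """Count SD file records by counting the '$$$$' delimiter lines.

    `lines` can be any iterable of text lines, for example an open file.
    """
    return sum(1 for line in lines if line.strip() == "$$$$")


# Usage (assumes hoge.sdf exists in the current directory):
# with open("hoge.sdf") as fh:
#     print(count_sdf_records(fh))
```

Because the function only needs an iterable of lines, it reads the file as a stream and does not load everything into memory, which matters when a larger formula produces a very large output.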


As described in the article, surge has a limitation. The current version doesn’t perform a Hückel aromaticity test. This means surge will generate duplicate structures for the Kekulé versions of aromatic rings.

However, it is a fast and interesting tool for molecular generation. BTW, in the drug discovery field it’s difficult to filter the generated molecules down to those with the desired compound properties.
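
One possible workaround is to remove such duplicates afterwards by comparing canonical SMILES from a toolkit that does perceive aromaticity. The sketch below is my own post-processing idea, not part of surge. It assumes RDKit is installed; `unique_by_key` and `rdkit_canonical_smiles` are hypothetical helper names, and RDKit is imported lazily so the generic part works without it.

```python
def unique_by_key(items, key):
    """Keep the first item for each distinct key; skip items whose key is None."""
    seen = {}
    for item in items:
        k = key(item)
        if k is not None and k not in seen:
            seen[k] = item
    return list(seen.values())


def rdkit_canonical_smiles(smiles):
    """Return RDKit's canonical SMILES, or None if the input cannot be parsed.

    RDKit perceives aromaticity while parsing, so different Kekule forms of
    the same aromatic ring map to the same string.
    """
    from rdkit import Chem  # lazy import: only needed when this key is used

    mol = Chem.MolFromSmiles(smiles)
    return Chem.MolToSmiles(mol) if mol is not None else None


# Usage (requires RDKit): two Kekule forms of o-xylene collapse to one entry.
# kekule_forms = ["CC1=C(C)C=CC=C1", "CC1=CC=CC=C1C"]
# print(unique_by_key(kekule_forms, key=rdkit_canonical_smiles))
```

Benzene itself is not affected, because its two Kekulé forms are the same graph by symmetry. The duplicates show up for substituted rings like the o-xylene example in the usage comment.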


Published by iwatobipen

I'm a medicinal chemist at a mid-size pharmaceutical company. I love chemoinformatics, coding, organic synthesis, and my family.
