Make SMILES with atomic information #RDKit #chemoinformatics

SMILES is widely used in chemoinformatics because it is compact and easy to handle in tasks such as compound generation. However, a SMILES string can't carry much atom-level information beyond chirality, charge, and atom-mapping numbers.

ChemAxon developed an extended SMILES format named CXSMILES. The details are described in ChemAxon's documentation.

Recent versions of RDKit can handle this kind of molecular representation: Chem.MolToCXSmiles writes CXSMILES, and Chem.MolFromSmiles can read not only basic SMILES but also CXSMILES.
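To make the two functions concrete, here is a minimal sketch of a round trip. The molecule (ethanol) and the property name "note" are my own placeholders for illustration; only string properties set with SetProp are used.

```python
from rdkit import Chem

# Placeholder molecule and property name, chosen only for illustration.
mol = Chem.MolFromSmiles("CCO")
for atom in mol.GetAtoms():
    # Tag every atom with a simple string property, e.g. "C0", "C1", "O2".
    atom.SetProp("note", f"{atom.GetSymbol()}{atom.GetIdx()}")

plain = Chem.MolToSmiles(mol)   # basic SMILES: atom properties are dropped
cx = Chem.MolToCXSmiles(mol)    # CXSMILES: properties go into the |...| extension

# MolFromSmiles reads both kinds of string.
mol_plain = Chem.MolFromSmiles(plain)
mol_cx = Chem.MolFromSmiles(cx)

has_note_plain = [a.HasProp("note") for a in mol_plain.GetAtoms()]
notes_cx = sorted(a.GetProp("note") for a in mol_cx.GetAtoms())

print(plain)
print(cx)
print(has_note_plain)
print(notes_cx)
```

The atom order in the output string may differ from the input order, so it is safer to compare the properties themselves than to rely on atom indices after the round trip.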

The example in this post compares default SMILES and CXSMILES. I defined a simple molecule and calculated atomic charges with extended Hückel theory, which is implemented in RDKit. Then I added the atom index and the atomic charge to each atom as properties.

After that, the molecule is converted to a basic SMILES and a CXSMILES, and a molecule is rebuilt from each of these strings.

The molecule rebuilt from the CXSMILES keeps the atomic properties that were defined before the conversion, while the one rebuilt from the basic SMILES does not. It's interesting to me that CXSMILES can keep so much information in a single string.
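The workflow described above can be sketched roughly as follows. This is my reconstruction under a few assumptions, not necessarily the exact code from the original notebook: phenol is a placeholder molecule, the property names "atom_idx" and "EHT_charge" are made up for illustration, charges are rounded to three decimals and stored as strings, and only heavy-atom charges are kept. It also assumes an RDKit build that includes the rdEHTTools module.

```python
from rdkit import Chem
from rdkit.Chem import AllChem, rdEHTTools

# Placeholder molecule (phenol); not taken from the original post.
mol = Chem.MolFromSmiles("Oc1ccccc1")

# The extended Hueckel calculation needs explicit hydrogens and 3D
# coordinates, so it is run on a separate copy with hydrogens added.
mol_h = Chem.AddHs(mol)
AllChem.EmbedMolecule(mol_h, randomSeed=42)
passed, eht = rdEHTTools.RunMol(mol_h)
charges = eht.GetAtomicCharges()

# AddHs appends hydrogens after the heavy atoms, so the heavy-atom
# indices of mol and mol_h match.
for atom in mol.GetAtoms():
    idx = atom.GetIdx()
    atom.SetProp("atom_idx", str(idx))
    atom.SetProp("EHT_charge", f"{charges[idx]:.3f}")

smi = Chem.MolToSmiles(mol)
cxsmi = Chem.MolToCXSmiles(mol)

# Rebuild a molecule from each string.
mol_from_smi = Chem.MolFromSmiles(smi)
mol_from_cx = Chem.MolFromSmiles(cxsmi)


def atom_props(m):
    """Collect (atom_idx, EHT_charge) pairs for atoms that carry them."""
    return sorted(
        (a.GetProp("atom_idx"), a.GetProp("EHT_charge"))
        for a in m.GetAtoms()
        if a.HasProp("atom_idx")
    )


print(smi)
print(cxsmi)
print(atom_props(mol_from_smi))  # properties are lost with basic SMILES
print(atom_props(mol_from_cx))   # properties survive the CXSMILES round trip
```

Storing the original index as a property is handy here because the atoms can be reordered when the SMILES is written, so the index in the rebuilt molecule is not guaranteed to match the original one.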


Have a nice weekend :)


Published by iwatobipen

I'm a medicinal chemist at a mid-sized pharmaceutical company. I love chemoinformatics, coding, organic synthesis, and my family.
