Convert bit-vector to comma separated strings #memo #chemoinformatics #RDKit

Today’s post will be very short :)

I had an MRI scan of my knee at the hospital today. It has been almost 6 months since my surgery, and the result was very good, so I should be able to run just as I did before the surgery. I'm not young, but I would like to take on a full marathon again, so I've started running again! Keeping a balance between mental and physical health is very important because I spend most of my working time at my PC.

BTW, I often build and use predictive models with Python packages such as RDKit, scikit-learn, PyTorch, LightGBM, etc. In most cases intermediate files aren't required because every task can be done within the Python environment. But in some cases I need to write intermediate files to communicate with external applications.

An RDKit fingerprint object has a ToBitString method, which converts the fingerprint to a string of '0' and '1' characters. Joining those characters with commas is the easiest way I found to convert a fingerprint to a comma-separated bit string.

Here is an example.

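from rdkit import Chem
from rdkit.Chem import rdFingerprintGenerator

# Morgan fingerprint generator: radius 3, folded to 1024 bits
gen = rdFingerprintGenerator.GetMorganGenerator(radius=3, fpSize=1024)

def smi2fpstring(smi, fp_gen):
    """Return 'canonical SMILES,bit0,bit1,...' for one SMILES string."""
    mol = Chem.MolFromSmiles(smi)
    if mol is None:  # MolFromSmiles returns None when parsing fails
        raise ValueError(f'Could not parse SMILES: {smi}')
    fp = fp_gen.GetFingerprint(mol)
    # ToBitString() gives a string of '0'/'1' characters; join them with commas
    bs = ','.join(fp.ToBitString())
    return f'{Chem.MolToSmiles(mol)},{bs}'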

smi2fpstring('c1ccccc1', gen)

>'c1ccccc1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
...snip...
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0'
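On the receiving side, a line in this format can be split back into the SMILES and the bits with plain Python. The snippet below is only a rough sketch: `parse_fp_line` is a hypothetical helper name, the 8-bit input line is a made-up toy value rather than real RDKit output, and it assumes the SMILES field itself contains no commas.

```python
def parse_fp_line(line):
    """Split 'SMILES,b0,b1,...' into (smiles, number of bits, on-bit indices)."""
    # First field is the SMILES; everything after it is one bit per field
    smiles, *bits = line.strip().split(",")
    if any(b not in ("0", "1") for b in bits):
        raise ValueError("Expected only '0' or '1' after the SMILES field")
    on_bits = [i for i, b in enumerate(bits) if b == "1"]
    return smiles, len(bits), on_bits

# Toy 8-bit line for illustration only (not a real fingerprint)
line = "c1ccccc1,0,0,1,0,1,0,0,0"
smiles, n_bits, on_bits = parse_fp_line(line)
print(smiles, n_bits, on_bits)
```

Keeping only the on-bit indices is handy when the external application prefers a sparse representation.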

rdFingerprintGenerator can generate other types of fingerprints as well (for example RDKit, atom pair and topological torsion fingerprints); it's not limited to Morgan fingerprints.
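As a small sketch of that point, the same conversion can be run with several generators from rdFingerprintGenerator. The dictionary keys, the helper name `fp_to_csv_bits`, the example molecule and the choice of 1024 bits for every generator are my own assumptions for illustration.

```python
from rdkit import Chem
from rdkit.Chem import rdFingerprintGenerator

FP_SIZE = 1024  # assumed size, chosen to match the Morgan example

# One generator per fingerprint type; the keys are arbitrary labels
generators = {
    "morgan": rdFingerprintGenerator.GetMorganGenerator(radius=3, fpSize=FP_SIZE),
    "rdkit": rdFingerprintGenerator.GetRDKitFPGenerator(fpSize=FP_SIZE),
    "atompair": rdFingerprintGenerator.GetAtomPairGenerator(fpSize=FP_SIZE),
    "torsion": rdFingerprintGenerator.GetTopologicalTorsionGenerator(fpSize=FP_SIZE),
}

def fp_to_csv_bits(smi, fp_gen):
    """Return the fingerprint of one SMILES as a comma-separated bit string."""
    mol = Chem.MolFromSmiles(smi)
    if mol is None:
        raise ValueError(f"Could not parse SMILES: {smi}")
    return ",".join(fp_gen.GetFingerprint(mol).ToBitString())

# Ethyl benzoate, used only as an example molecule
bit_strings = {name: fp_to_csv_bits("CCOC(=O)c1ccccc1", g) for name, g in generators.items()}
for name, bits in bit_strings.items():
    print(name, len(bits.split(",")))
```

Because every generator returns the same kind of bit vector from GetFingerprint, the conversion function does not need to know which fingerprint type it was given.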

With this approach, I can make a CSV file that contains fingerprint information for chemoinformatics tasks.
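Here is a minimal sketch of how such a CSV file could be written with the standard csv module. The function name `write_fp_csv`, the `smiles` / `bit_0`, `bit_1`, ... header names and the decision to silently skip unparsable SMILES are assumptions of mine, not a fixed format. An in-memory buffer stands in for a real file so that the example is self-contained.

```python
import csv
import io

from rdkit import Chem
from rdkit.Chem import rdFingerprintGenerator

def write_fp_csv(smiles_list, fp_gen, handle):
    """Write one row per molecule: canonical SMILES followed by one column per bit."""
    writer = csv.writer(handle)
    header_written = False
    for smi in smiles_list:
        mol = Chem.MolFromSmiles(smi)
        if mol is None:
            continue  # assumption: skip SMILES that RDKit cannot parse
        fp = fp_gen.GetFingerprint(mol)
        if not header_written:
            # Hypothetical column names: smiles, bit_0, bit_1, ...
            writer.writerow(["smiles"] + [f"bit_{i}" for i in range(fp.GetNumBits())])
            header_written = True
        # list() splits the bit string into one '0'/'1' character per column
        writer.writerow([Chem.MolToSmiles(mol)] + list(fp.ToBitString()))

gen = rdFingerprintGenerator.GetMorganGenerator(radius=3, fpSize=1024)
buffer = io.StringIO()  # use open('fps.csv', 'w', newline='') for a real file
write_fp_csv(["c1ccccc1", "CCO", "not_a_smiles"], gen, buffer)
rows = buffer.getvalue().splitlines()
print(len(rows), rows[0][:20])
```

Writing a header row makes the file easier to load in tools that expect named columns, but it can be dropped if the external application wants raw rows only.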

Original publication is here!

https://chemrxiv.org/engage/chemrxiv/article-details/6115baf04cb4797dc42df605

Published by iwatobipen

I'm a medicinal chemist at a mid-size pharmaceutical company. I love chemoinformatics, coding, organic synthesis, and my family.
