Convert fingerprint to numpy array and convert numpy array to fingerprint #RDKit #memorandum

This is just a memo for myself.
RDKit has a ConvertToNumpyArray function for converting an RDKit fingerprint to a NumPy array, but there is no direct function for converting a NumPy array back to an RDKit fingerprint.
However, RDKit does have a CreateFromBitString function, which builds a bit vector from a string of '0' and '1' characters.

So I tried converting a NumPy array to an RDKit fingerprint with that function.

import numpy as np
from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Chem import DataStructs
mol = Chem.MolFromSmiles('C1CCCOC1')
fp = AllChem.GetMorganFingerprintAsBitVect(mol, 2, nBits=512)
arr = np.zeros((0,), dtype=np.int8)
arr
> array([], dtype=int8)

Then convert the fingerprint to a NumPy array.

DataStructs.ConvertToNumpyArray(fp,arr)
arr
> array([0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
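
As far as I can tell, the zero-length array works because ConvertToNumpyArray resizes the destination array in place to one entry per bit. Here is a small sketch to check that. The 8-bit fingerprint is hand-made for illustration only, not a Morgan fingerprint.

```python
import numpy as np
from rdkit import DataStructs

# Hand-made 8-bit fingerprint, used only for illustration.
fp = DataStructs.CreateFromBitString("01000110")

# Start from an empty placeholder; the conversion resizes it in place.
arr = np.zeros((0,), dtype=np.int8)
DataStructs.ConvertToNumpyArray(fp, arr)

print(arr.shape)                           # one entry per bit
print(int(arr.sum()), fp.GetNumOnBits())   # both count the on bits
```

The sum of the array and GetNumOnBits() should agree, which is a quick sanity check after converting.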

Next, convert the NumPy array back to an RDKit fingerprint.

bitstring="".join(arr.astype(str))
fp2 = DataStructs.cDataStructs.CreateFromBitString(bitstring)
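
The bit string route above is what I used. Another way that should give the same result is to create an empty ExplicitBitVect and switch on the bits by index. This is only a sketch: numpy_to_bitvect is a helper name I made up (it is not part of RDKit), and it assumes a 1-D array where every nonzero entry means "bit on".

```python
import numpy as np
from rdkit import DataStructs


def numpy_to_bitvect(arr):
    """Build an ExplicitBitVect from a 1-D array (hypothetical helper)."""
    arr = np.asarray(arr)
    bv = DataStructs.ExplicitBitVect(int(arr.size))
    # tolist() yields plain Python ints, which the RDKit wrapper expects.
    on_bits = np.flatnonzero(arr).tolist()
    bv.SetBitsFromList(on_bits)
    return bv


# Small example array with bits 2, 4 and 11 switched on.
example = np.zeros(16, dtype=np.int8)
example[[2, 4, 11]] = 1
bv = numpy_to_bitvect(example)
print(bv.GetNumBits(), list(bv.GetOnBits()))
```

This avoids building a long string, which may matter when converting many fingerprints, though I have not measured the difference.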

Finally, check that both fingerprints have the same on bits.

list(fp.GetOnBits())
> [2, 4, 11, 28, 144, 225, 381, 414, 438]

list(fp2.GetOnBits())
> [2, 4, 11, 28, 144, 225, 381, 414, 438]
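
Comparing the on-bit lists by eye is fine for a memo, but the check can also be done in code. The sketch below repeats the round trip on a small hand-made 16-bit fingerprint (an assumption for illustration, not the Morgan fingerprint above) and compares the two bit vectors with == and with TanimotoSimilarity.

```python
import numpy as np
from rdkit import DataStructs

# Hand-made 16-bit fingerprint with bits 2, 4 and 11 on (illustrative only).
fp = DataStructs.CreateFromBitString("0010100000010000")

# Fingerprint -> NumPy array -> bit string -> fingerprint.
arr = np.zeros((0,), dtype=np.int8)
DataStructs.ConvertToNumpyArray(fp, arr)
fp2 = DataStructs.CreateFromBitString("".join(arr.astype(str)))

same_bits = fp == fp2                                  # bit vectors compare by content
similarity = DataStructs.TanimotoSimilarity(fp, fp2)   # identical vectors score 1.0
print(same_bits, similarity)
```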

RDKit has many functions. I am happy to find new methods that I have not used before. ;)
