Install mdfptools #Chemoinformatics #RDKit #memo

I heard many exciting presentations at the virtual meeting ‘3rd RSC-BMCS / RSC-CICAG Artificial Intelligence in Chemistry’, and I became interested in MDFP (molecular dynamics fingerprint).

MDFP can capture molecular flexibility, solvent accessible surface area (SASA), etc. These are important parameters for permeability, P-gp recognition and other molecular properties. In the following article, a combination of ECFP and MDFP showed good performance in a P-gp substrate classification task.
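To make the idea concrete, here is a minimal sketch of how an MDFP-like vector can be built, assuming (as I understand the method) that each property tracked along the MD trajectory is summarized by its mean, standard deviation and median. The function names and the numbers below are made up for illustration; they are not mdfptools output and not real simulation data.

```python
from statistics import mean, median, pstdev


def summarize(series):
    """Reduce one per-frame property series to [mean, std, median]."""
    return [mean(series), pstdev(series), median(series)]


def mdfp_like_vector(properties):
    """Concatenate the summaries of all properties in a fixed (sorted) order."""
    vector = []
    for name in sorted(properties):
        vector.extend(summarize(properties[name]))
    return vector


# Toy per-frame values (hypothetical, not from a real trajectory).
trajectory = {
    "sasa": [5.0, 5.2, 5.4, 5.2],      # solvent accessible surface area
    "rgyr": [0.30, 0.32, 0.31, 0.33],  # radius of gyration
}

fp = mdfp_like_vector(trajectory)
print(len(fp))  # three numbers per property
```

The point is only that a whole trajectory is compressed into a short fixed-length vector, which can then be fed to a model in the same way as ECFP.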

So I would like to test MDFP. Fortunately the code is open source and available from GitHub. Some packages should be installed before using mdfptools.

$ git clone
$ cd mlddec
$ pip install -e .
$ cd ../
$ git clone
$ cd mdfptools
$ pip install -e .

mlddec is a package which predicts atomic partial charges with a machine learning method. More details are described here.

Of course, RDKit is required!!!! ;) (or the OpenEye toolkit)
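A quick way to check that the installation worked is to ask Python whether the packages can be found. This is just a sketch; I am assuming the import names are the same as the repository names (`mlddec`, `mdfptools`), and `missing_packages` is a helper name I made up.

```python
import importlib.util


def missing_packages(names):
    """Return the names from the list that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed import names; an empty list means everything was found.
print(missing_packages(["rdkit", "mlddec", "mdfptools"]))
```

Anything that is printed in the list still needs to be installed in the current environment.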

For testing, I tried to use the solubility data in the RDKit install directory, but the MD calculation is too heavy for my PC… So I pasted an example which gets an MDFP from one molecule.

With mdfptools, the required parameters can be calculated from SMILES, but stereochemistry should be assigned before the calculation, so I defined a simple (but not good) function to generate 3D information for the given molecules.
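A minimal sketch of that kind of helper, using only standard RDKit calls, might look like the following. It is not necessarily the same as the function in my gist; the name `smiles_to_3d` and the fixed random seed are assumptions for illustration. Note that an unspecified stereocenter simply gets whatever configuration the embedding happens to produce, which is one reason why this approach is not good for real work.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def smiles_to_3d(smiles, seed=42):
    """Build one 3D conformer from SMILES and assign stereo from its geometry."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles}")
    mol = Chem.AddHs(mol)  # explicit hydrogens are needed for a sensible 3D structure
    if AllChem.EmbedMolecule(mol, randomSeed=seed) == -1:
        raise RuntimeError(f"3D embedding failed for: {smiles}")
    AllChem.MMFFOptimizeMolecule(mol)      # quick force-field clean-up
    Chem.AssignStereochemistryFrom3D(mol)  # set chiral tags from the coordinates
    return mol


# Lactic acid written without stereo information.
mol = smiles_to_3d("CC(O)C(=O)O")
print(Chem.MolToSmiles(Chem.RemoveHs(mol)))  # SMILES now carries a stereo label
```

For real projects it would be better to enumerate the stereoisomers or to start from molecules whose stereochemistry is already defined.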

OK, let’s go to the example code. It is a little bit long to write on the page, so I uploaded the code to my gist.


To apply MDFP to real projects, there is a lot of room for improvement in the 3D conformation generation task. More GPU and CPU power is also required to handle lots of molecules.

In the example described in this post, the MD simulation was performed with OpenMM, but mdfptools can use not only OpenMM but also GROMACS as the MD engine.

The original repository shows some examples and scripts for getting MDFP with GROMACS.

The calculation cost is high compared to calculating ECFP etc., but this approach seems very interesting and useful for improving the performance of many predictive models.


Published by iwatobipen

I'm a medicinal chemist at a mid-size pharmaceutical company. I love chemoinfo, coding, organic synthesis, and my family.
