One-liner command tool for LillyMedChemRules #Chemoinformatics #memo

Many substructure filters are available these days, and LillyMedChem Rules is one of the most useful and best known. It runs very fast and gives reasonable results.

However, the implementation writes its results to multiple files, so the user needs to merge them after filtering.

So I wrote a small script that runs the filter on the molecules and merges all of the generated files.

I uploaded the code to my repository. ;)
https://github.com/iwatobipen/lillymcrules

To use the code, you first need to install LillyMedChemRules. Then set FILTERPATH in lillymcf.py to the directory where you installed the filter.

Then install my code with pip.

$ git clone https://github.com/iwatobipen/lillymcrules.git
$ cd lillymcrules
# edit lillymcf.py
$ pip install .

Now the script is ready to use.

After installing the code, the ‘lillyfilter’ command will be available.

Using the script is very simple. Let me show you….

tests$ ls
example_molecules.smi
tests$ lillyfilter example_molecules.smi 
Done
tests$ ls
bad0.smi  bad1.smi  bad2.smi  bad3.smi  example_molecules.smi  marged.csv  ok0.log  ok1.log  ok2.log  ok3.log  ok.smi

The script calls the Lilly_Medchem_Rules.rb script from the original code, so most of the output files are the same. In addition, marged.csv is generated.

This file contains all of the molecules written out by the filter, both passed and rejected.
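The merge step can be pictured roughly as follows. This is only a sketch and not the code in lillymcf.py: the function name merge_filter_outputs is hypothetical, and I am assuming that each line of ok.smi and bad*.smi is whitespace-separated as "SMILES ID", optionally followed by a reason text in the bad files.

```python
import csv
from pathlib import Path


def merge_filter_outputs(workdir, merged_name="marged.csv"):
    """Merge ok.smi and bad*.smi found in workdir into one CSV file.

    Hypothetical helper: assumes lines look like 'SMILES ID [reason text]'.
    """
    workdir = Path(workdir)
    rows = []

    # Molecules that passed every rule get the status 'passed'.
    ok_file = workdir / "ok.smi"
    if ok_file.exists():
        for line in ok_file.read_text().splitlines():
            parts = line.split(maxsplit=2)
            if len(parts) >= 2:
                rows.append([parts[0], parts[1], "passed", ""])

    # Rejected molecules: the file stem (bad0, bad1, ...) is used as the status.
    for bad_file in sorted(workdir.glob("bad*.smi")):
        for line in bad_file.read_text().splitlines():
            parts = line.split(maxsplit=2)
            if len(parts) >= 2:
                reason = parts[2] if len(parts) == 3 else ""
                rows.append([parts[0], parts[1], bad_file.stem, reason])

    # Write everything into a single header-less CSV file.
    out_path = workdir / merged_name
    with open(out_path, "w", newline="") as fh:
        csv.writer(fh).writerows(rows)
    return out_path
```

Calling merge_filter_outputs("tests") after a filter run would then write tests/marged.csv, under the assumptions above.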

tests$ head -n 5 marged.csv 
ClC1=CC=CC=C1C=CC1=NC2=CC=CC3=C2C(=CC=C3)N1C,PBCHM1719983,passed,
S(C)C(=N(=O)C)CC,PBCHM6386587,passed,
ON(N(CCNCC)CC)N=O,PBCHM4516,passed,
S(C)CCC(=O)C(=CO)O,PBCHM562,passed,
ClCCN(CN1NC(=O)C=CC1=O)CCCl,PBCHM260812,passed,

tests$ tail -n5 marged.csv 
C1C2=C[C+](CC(=CC1=CC=CC=C2)C=C)C=CC,PBCHM2751984,bad0,TP1 no_interesting_atoms
FC(F)(C(F)(F)C(F)(F)F)C(F)(F)F,PBCHM9638,bad0,TP1 no_interesting_atoms
OC(CO)C=O,PBCHM751,bad0,TP1 not_enough_atoms
C(CCC)C1=CC(=C1)C,PBCHM11832205,bad0,TP1 no_interesting_atoms
S(=O)(=O)(O)OC1C(OC(=O)CC(C)C)C(OC2CC3(C)C4C5(CC(CC4)C(=C)C5O)CCC3C(C2)C(=O)O)OC(C1OS(=O)(=O)O)CO,PBCHM2255,bad0,TP1 too_many_atoms

Each row has the SMILES, the molecule ID, the filter result and a description, so the file is ready to use with RDKit and pandas.
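As a small illustration, the file can be read back like this. It is a sketch using only the standard library; the column names in COLUMNS and the function name read_merged are my own choices, because marged.csv has no header row. The sample rows are copied from the output above.

```python
import csv
import io
from collections import Counter

# Column names are an assumption; marged.csv itself has no header line.
COLUMNS = ["smiles", "molid", "status", "description"]


def read_merged(handle):
    """Parse marged.csv-style rows into a list of dicts."""
    return [dict(zip(COLUMNS, row)) for row in csv.reader(handle) if row]


# A few rows from the example output, used here instead of the real file.
sample = io.StringIO(
    "ON(N(CCNCC)CC)N=O,PBCHM4516,passed,\n"
    "OC(CO)C=O,PBCHM751,bad0,TP1 not_enough_atoms\n"
    "FC(F)(C(F)(F)C(F)(F)F)C(F)(F)F,PBCHM9638,bad0,TP1 no_interesting_atoms\n"
)
records = read_merged(sample)

# Count molecules per filter result and collect the SMILES that passed.
status_counts = Counter(r["status"] for r in records)
passed_smiles = [r["smiles"] for r in records if r["status"] == "passed"]
```

For the real file, open("marged.csv", newline="") can be passed to read_merged in place of the sample. With pandas, pd.read_csv("marged.csv", header=None, names=COLUMNS) should give the same table as a DataFrame, and the passed SMILES can then be handed to RDKit.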

With the click package, it is easy to develop command line tools. ;)

Any comments, suggestions, or pull requests will be greatly appreciated.

Published by iwatobipen

I'm a medicinal chemist at a mid-size pharmaceutical company. I love chemoinformatics, coding, organic synthesis, and my family.
