CLI tool for making ssslib #chemoinformatics #rdkit

As many RDKit users know, rdSubstructLibrary is one of the coolest tools for conducting substructure searches. Greg Landrum showed how to use it in his great blog post.

I love the method because it makes substructure searching very fast, so I wanted to make a CLI tool for building a substructure library database.

To do that, I used click, which is a useful package for making CLI tools.
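For readers who have not used click, here is a minimal sketch of what a two-argument command looks like. The function name `make_lib`, its argument names, and the `--verbose` flag are hypothetical and are not taken from the rdsss repository; the body only echoes what it would do.

```python
import click
from click.testing import CliRunner


@click.command()
@click.argument("sdf_path", type=click.Path())
@click.argument("out_path", type=click.Path())
@click.option("--verbose", is_flag=True, help="Print an extra line when finished.")
def make_lib(sdf_path, out_path, verbose):
    """Hypothetical command: build a library from SDF_PATH and save it to OUT_PATH."""
    # A real command would read the SD file and pickle the library here.
    click.echo(f"would read {sdf_path} and write {out_path}")
    if verbose:
        click.echo("done")


# CliRunner invokes the command in-process, which is handy for a quick check.
result = CliRunner().invoke(make_lib, ["cdk2.sdf.gz", "cdk2.sslib.pkl"])
print(result.output)
```

A function like this can be exposed as a shell command through a console_scripts entry point in the package metadata, which is one common way to get commands after `pip install -e .`.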

Here is an example of how to use the code.

$ gh repo clone iwatobipen/rdsss
$ cd rdsss
$ pip install -e .

After installing the package, three commands will be available:
1. make_rdssslib, which makes an sslib from an sdf.gz file
2. update_rdssslib, which updates an sslib with a new sdf.gz file
3. run_rdsss, which runs SSS with a given SMARTS query

The example is shown below.

# make ssslib from sdf.gz
$ make_rdssslib cdk2.sdf.gz cdk2.sslib.pkl

# search with ssslib from CLI
$ run_rdsss 'c1ccccc1' cdk2.sslib.pkl

After running run_rdsss, a hits.csv file is generated.

$ cat hits.csv

The CSV file contains the SMILES and the _Name property of each hit.
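To show roughly what happens behind a build-and-search workflow like this, here is a small sketch that uses rdSubstructLibrary directly. It is not the code from the repository: the toy molecules stand in for the records of an SD file, the names are kept in a separate list (an assumption made here because the trusted-SMILES holder stores only SMILES), and the CSV column names are hypothetical.

```python
import csv
import io

from rdkit import Chem
from rdkit.Chem import rdSubstructLibrary

# Toy input standing in for an SD file; with a real file, a supplier such as
# Chem.ForwardSDMolSupplier(gzip.open(path)) could feed the same loop.
records = [("c1ccccc1O", "phenol"), ("CCO", "ethanol"), ("c1ccncc1", "pyridine")]

mol_holder = rdSubstructLibrary.CachedTrustedSmilesMolHolder()
fp_holder = rdSubstructLibrary.PatternHolder()
library = rdSubstructLibrary.SubstructLibrary(mol_holder, fp_holder)

names = []  # names[i] belongs to the molecule at library index i
for smiles, name in records:
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        continue  # skip records that fail to parse
    library.AddMol(mol)
    names.append(name)

# Search with a SMARTS query, as run_rdsss does from the command line.
query = Chem.MolFromSmarts("c1ccccc1")
hit_indices = sorted(library.GetMatches(query))

# Collect (SMILES, name) for each hit and write them as CSV text.
rows = [(Chem.MolToSmiles(library.GetMol(i)), names[i]) for i in hit_indices]
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["smiles", "name"])  # hypothetical header
writer.writerows(rows)
print(buffer.getvalue())
```

Note that GetMatches returns at most 1000 hits by default; its maxResults argument raises that limit for large libraries.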

The whole process can be done from the CLI with this code. However, to handle a large sslib, I think users should run SSS from the interpreter, because loading the sslib from disk (I/O) will be the bottleneck of the code.
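The point about I/O can be sketched as follows: in an interpreter session the pickled library is loaded once and then reused for many queries, whereas every CLI call has to load it again. This is only an illustration with a tiny stand-in library written to a temporary file; the file name and the query set are made up for the example.

```python
import os
import pickle
import tempfile

from rdkit import Chem
from rdkit.Chem import rdSubstructLibrary

# Build and pickle a tiny stand-in library (a real one would come from an SD file).
library = rdSubstructLibrary.SubstructLibrary(
    rdSubstructLibrary.CachedTrustedSmilesMolHolder(),
    rdSubstructLibrary.PatternHolder(),
)
for smiles in ("c1ccccc1O", "CCO", "c1ccncc1", "OC(=O)c1ccccc1"):
    library.AddMol(Chem.MolFromSmiles(smiles))

lib_path = os.path.join(tempfile.mkdtemp(), "toy.sslib.pkl")
with open(lib_path, "wb") as fh:
    pickle.dump(library, fh)

# Pay the loading cost once per session...
with open(lib_path, "rb") as fh:
    loaded = pickle.load(fh)

# ...then run as many queries as needed against the in-memory library.
queries = {"benzene": "c1ccccc1", "carbonyl": "C=O", "pyridine": "c1ccncc1"}
hit_counts = {
    label: loaded.CountMatches(Chem.MolFromSmarts(smarts))
    for label, smarts in queries.items()
}
print(hit_counts)
```

With a library of millions of molecules, the load step dominates the run time of a single query, so keeping `loaded` alive between searches is where the saving comes from.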

This code is still under development. Any advice or suggestions will be greatly appreciated.


Published by iwatobipen

I'm a medicinal chemist at a mid-sized pharmaceutical company. I love cheminformatics, coding, organic synthesis, and my family.
