Virtual screening (VS) is an important task in drug discovery projects. There are many approaches, for example fingerprint-based, substructure-based, and shape-based screening. All of the approaches listed above are used not only in structure-based drug design (SBDD) but also in ligand-based drug design (LBDD).
There are also many applications for these tasks. Until now I wrote my own scripts and used them, but recently I found a nice package for VS named VSFlow, which is developed by Paul Czodrowski’s group.
It looked interesting, so I tried it. First, I created a conda environment and installed the package.
$ gh repo clone czodrowskilab/VSFlow
$ cd VSFlow
$ conda env create --quiet --force --file environment.yml
$ conda activate vsflow
$ pip install .
After running these commands, the vsflow command was available.
Next, I prepared a database. A database can be built from any kind of dataset, but I used the default set. The ‘-d pdb’ option prepares a database from the SMILES provided by Ligand Expo.
$ time vsflow preparedb -d pdb -o pdb_ligs -np 6
**************************
VV VV SSSSSSS VSFlow
VV VV SSS SS Virtual Screening
VV VV SSSS Workflow
VV VV SSSS
VVVV SS SSS
VV SSSSSSS
**************************
Start: 06/21/2022, 21:22:37
Running in parallel mode on 6 threads
Downloading database pdb ...
Finished downloading database
Generating database file ...
Finished in 168 seconds
real 2m48.611s
user 0m6.480s
sys 0m1.619s
This produced ‘pdb_ligs.vsdb’, which is pickled data for VSFlow. Next, I tried the substructure and fingerprint similarity searches, using SMILES as the query. The substructure search finished in about a second.
$ vsflow substructure -smi 'c1ccnnc1' -d pdb_ligs.vsdb -o smi_sub_pdb.sdf
**************************
VV VV SSSSSSS VSFlow
VV VV SSS SS Virtual Screening
VV VV SSSS Workflow
VV VV SSSS
VVVV SS SSS
VV SSSSSSS
**************************
Start: 06/21/2022, 21:28:50
Running in single core mode
Loading database pdb_ligs.vsdb ...
Reading query ...
Finished substructure search in 0.83285 seconds
Generating output file(s) ...
313 matches found
Finished: 06/21/2022, 21:28:51
Finished in 0.91936 seconds
The substructure search (SSS) hit compounds are shown below.
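To make the idea concrete, here is a minimal sketch of the same kind of substructure match done directly with RDKit. It is not VSFlow's internal code; the function name `substructure_hits` and the tiny four-molecule library are made up for illustration, and only the pyridazine query comes from the command above.

```python
from rdkit import Chem


def substructure_hits(query_smiles, library_smiles):
    """Return the library SMILES that contain the query as a substructure."""
    query = Chem.MolFromSmiles(query_smiles)
    if query is None:
        raise ValueError(f"could not parse query: {query_smiles}")
    hits = []
    for smi in library_smiles:
        mol = Chem.MolFromSmiles(smi)
        if mol is None:
            # Skip entries RDKit cannot parse instead of failing the whole run.
            continue
        if mol.HasSubstructMatch(query):
            hits.append(smi)
    return hits


# Hypothetical toy library (not taken from the PDB ligand set).
library = [
    "c1ccnnc1",   # pyridazine itself
    "Cc1ccnnc1",  # 4-methylpyridazine
    "c1ccncc1",   # pyridine: only one ring nitrogen
    "c1ccccc1",   # benzene
]
print(substructure_hits("c1ccnnc1", library))
```

Only the two pyridazine-containing entries should be reported, because the query needs two adjacent aromatic nitrogens in a six-membered ring.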

The following example is a similarity search, for which I also generated similarity maps as a PDF.
$ vsflow fpsim -d pdb_ligs.vsdb -smi "CC1CCN(C(=O)CC#N)CC1N(C)c1ncnc2[nH]ccc12" -o sim.sdf --pdf --simmap
**************************
VV VV SSSSSSS VSFlow
VV VV SSS SS Virtual Screening
VV VV SSSS Workflow
VV VV SSSS
VVVV SS SSS
VV SSSSSSS
**************************
Start: 06/21/2022, 22:06:41
Running in single core mode
Loading database pdb_ligs.vsdb ...
Reading query input ...
Calculating fingerprints ...
Finished fingerprint generation in 6.04996 seconds
Calculating similarities ...
Finished calculating similarities in 0.08398 seconds
Writing 10 molecules to output file(s)
Generating output file(s) ...
Generating PDF file(s) ...
Calculating similarity maps for 10 matches ...
Finished: 06/21/2022, 22:06:56
Finished in 14.63942 seconds
A similarity map is a nice way to visualize the similarity between the query (tofacitinib) and the hit compounds. This example used FCFP4 as the fingerprint, but users can choose other RDKit-supported fingerprints such as ECFP, RDKit, atom pair, etc.
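For readers who want to see what such a fingerprint similarity ranking looks like in code, below is a minimal RDKit sketch. It assumes a feature-based Morgan fingerprint with radius 2 as a stand-in for FCFP4 and Tanimoto as the similarity measure; the helper names, the 2048-bit size, and the small library are my own illustrative choices and not necessarily what VSFlow does internally.

```python
from rdkit import Chem, DataStructs
from rdkit.Chem import rdFingerprintGenerator

# Feature-based Morgan fingerprint, radius 2 (an FCFP4-like fingerprint).
# The 2048-bit size is an assumption made for this sketch.
_FCFP4_LIKE = rdFingerprintGenerator.GetMorganGenerator(
    radius=2,
    fpSize=2048,
    atomInvariantsGenerator=rdFingerprintGenerator.GetMorganFeatureAtomInvGen(),
)


def fingerprint(smiles):
    """Parse a SMILES string and return its FCFP4-like bit vector."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles}")
    return _FCFP4_LIKE.GetFingerprint(mol)


def rank_by_tanimoto(query_smiles, library_smiles, top_n=10):
    """Return up to top_n (similarity, SMILES) pairs, most similar first."""
    query_fp = fingerprint(query_smiles)
    scored = [
        (DataStructs.TanimotoSimilarity(query_fp, fingerprint(smi)), smi)
        for smi in library_smiles
    ]
    scored.sort(reverse=True)
    return scored[:top_n]


tofacitinib = "CC1CCN(C(=O)CC#N)CC1N(C)c1ncnc2[nH]ccc12"
# Hypothetical toy library (not taken from the PDB ligand set).
library = [tofacitinib, "c1ncnc2[nH]ccc12", "c1ccccc1"]
for score, smi in rank_by_tanimoto(tofacitinib, library):
    print(f"{score:.3f}  {smi}")
```

The query scores 1.0 against itself, which is a quick sanity check that the fingerprint and similarity settings are consistent.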

The final example is shape similarity. For this, the vsdb must contain 3D structure information, so I downloaded 3D data from Ligand Expo and built a vsdb from it.
The data link is below.
http://ligand-expo.rcsb.org/dictionaries/Components-pub.sdf.gz
Then I ran the shape similarity search. It took a long time (about 3.5 hours on 6 threads) compared to commercial packages such as ROCS, but it generated nice output.
$ vsflow shape -smi "CC1CCN(C(=O)CC#N)CC1N(C)c1ncnc2[nH]ccc12" -d pdb_ligs3d.vsdb -o shapesmi -np 6 --pymol
**************************
VV VV SSSSSSS VSFlow
VV VV SSS SS Virtual Screening
VV VV SSSS Workflow
VV VV SSSS
VVVV SS SSS
VV SSSSSSS
**************************
Start: 06/21/2022, 22:43:11
Running in parallel mode on 6 threads
Reading database ...
Reading query ...
Performing shape screening ...
Generating 3D conformer(s) for 1 query molecule(s)
Generating PyMOl file ...
Finished: 06/22/2022, 02:16:54
Finished in 12822.98383 seconds
Here is an example output of the shape similarity screening. The query molecule is shown in green. As you can see, vsflow retrieved molecules that have a similar 3D shape.
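As a rough illustration of what a shape comparison involves, the sketch below embeds a single 3D conformer per molecule with RDKit, aligns the candidate onto the query with Open3DAlign, and reports a shape Tanimoto score. This is an assumption-laden toy version, not necessarily VSFlow's procedure: a real screen would use multiple conformers per molecule, and the helper names, the fixed random seed, and the second molecule are hypothetical.

```python
from rdkit import Chem
from rdkit.Chem import AllChem, rdMolAlign, rdShapeHelpers


def embed_3d(smiles, seed=42):
    """Build one 3D conformer (with hydrogens) for a SMILES string."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles}")
    mol = Chem.AddHs(mol)
    # EmbedMolecule returns the conformer ID, or -1 if embedding failed.
    if AllChem.EmbedMolecule(mol, randomSeed=seed) < 0:
        raise RuntimeError(f"3D embedding failed for: {smiles}")
    AllChem.MMFFOptimizeMolecule(mol)
    return mol


def shape_tanimoto(query, candidate):
    """Align candidate onto query in place and return the shape Tanimoto score."""
    alignment = rdMolAlign.GetO3A(candidate, query)
    alignment.Align()
    # RDKit reports a distance in [0, 1]; convert it to a similarity.
    return 1.0 - rdShapeHelpers.ShapeTanimotoDist(candidate, query)


query = embed_3d("CC1CCN(C(=O)CC#N)CC1N(C)c1ncnc2[nH]ccc12")  # tofacitinib
candidate = embed_3d("CN(C1CCCCC1)c1ncnc2[nH]ccc12")  # hypothetical analogue
print(f"shape Tanimoto: {shape_tanimoto(query, candidate):.3f}")
```

Because every database molecule needs conformer generation and an alignment step, this kind of search is much heavier than a fingerprint comparison, which is consistent with the long run time reported above.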

In summary, vsflow is a useful package for chemoinformatics.
More details are described in the ChemRxiv preprint and the repository’s wiki.
https://chemrxiv.org/engage/chemrxiv/article-details/628c60215d9485a206cc8ecc
Hi there,
I converted the files from SDF to PDB (generating the conformers), and the result is in vsdb format.
Is there a way I can split the vsdb into individual PDB files that are ready to dock?