Reinforcement learning with docking score #RDKit #reinvent #chemoinformatics

Previously I posted about an automated docking system called DockStream. It supports many compound-protein docking programs, and the recent version of REINVENT supports DockStream as a scoring method.

According to the DockStream documentation, REINVENT v3.0 supports DockStream, but it didn't work due to an issue in reinvent-scoring, which manages REINVENT's scoring functions, so at that time I modified the source code to use DockStream as a scoring function.

Fortunately, the bug is fixed in the recent version of REINVENT (v3.1), so I tried the new version of REINVENT with DockStream.

At first, I set up the environment for the new version of REINVENT.

$ gh repo clone MolecularAI/Reinvent
$ cd Reinvent
$ conda env create -f reinvent.yml

After making the env, activate reinvent.v3.0 and write a configuration file for reinforcement learning.

It's worth knowing that the JSON format of the REINVENT configuration changed slightly in v3.0, so users should update not only Reinvent but also ReinventCommunity, which provides the example code.

For the following example, I used the same files as in the previous post.

The configuration for reinforcement learning is below.

# reinvent config named 'RL_config.json'
{
    "logging": {
        "job_id": "demo",
        "job_name": "Reinforcement learning demo",
        "logging_frequency": 10,
        "logging_path": "/home/iwatobipen/Desktop/REINVENT_RL_DockStream/progress.log",
        "recipient": "local",
        "result_folder": "/home/iwatobipen/Desktop/REINVENT_RL_DockStream/results",
        "sender": "http://0.0.0.1"
    },
    "parameters": {
        "diversity_filter": {
            "minscore": 0.4,
            "minsimilarity": 0.4,
            "name": "NoFilter",
            "nbmax": 25
        },
        "inception": {
            "memory_size": 100,
            "sample_size": 10,
            "smiles": []
        },
        "reinforcement_learning": {
            "agent": "/home/iwatobipen/dev/reinvent-3/ReinventCommunity/notebooks/models/random.prior.new",
            "batch_size": 128,
            "learning_rate": 0.0001,
            "margin_threshold": 50,
            "n_steps": 20,
            "prior": "/home/iwatobipen/dev/reinvent-3/ReinventCommunity/notebooks/models/random.prior.new",
            "reset": 0,
            "reset_score_cutoff": 0.5,
            "sigma": 128
        },
        "scoring_function": {
            "name": "custom_sum",
            "parallel": false,
            "parameters": [
                {
                    "component_type": "dockstream",
                    "model_path": "",
                    "name": "dockstream",
                    "smiles": [],
                    "specific_parameters": {
                        "configuration_path": "/home/iwatobipen/Desktop/AutoDock_Vina_demo/ADV_docking.json",
                        "docker_script_path": "/home/iwatobipen/dev/DockStream/docker.py",
                        "environment_path": "/home/iwatobipen/miniconda3/envs/DockStream/bin/python",
                        "transformation": {
                            "transformation_type": "reverse_sigmoid",                        
                            "high":-8,
                            "k":0.25,
                            "low":-12
                        }
                    },
                    "weight": 1
                }
            ]
        }
    },
    "run_type": "reinforcement_learning",
    "version": 3
}
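The reverse_sigmoid transformation above maps raw Vina scores (where more negative is better) onto [0, 1] rewards. A minimal sketch of that transformation, based on my reading of reinvent-scoring (check the source for the exact implementation); with the low/high/k values from the config it reproduces the dockstream/raw_dockstream pairs seen in the run log:

```python
# Reverse sigmoid that turns a raw docking score into a [0, 1] reward:
# scores at or below `low` approach 1, scores at or above `high` approach 0,
# and `k` controls the steepness around the midpoint of [low, high].
def reverse_sigmoid(value, low, high, k):
    return 1.0 / (1.0 + 10.0 ** (k * (value - (high + low) / 2) * 10 / (high - low)))

# With the parameters from the config (low=-12, high=-8, k=0.25), this
# matches the logged pairs, e.g. raw -9.2 -> 0.2403 and raw -10.1 -> 0.5359:
print(round(reverse_sigmoid(-9.2, -12, -8, 0.25), 4))   # 0.2403
print(round(reverse_sigmoid(-10.1, -12, -8, 0.25), 4))  # 0.5359
```

Because the sigmoid is "reversed", a stronger (more negative) Vina score yields a reward closer to 1, which is what the agent is trained to maximize.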
# dockstream config
{
  "docking": {
    "header": {
      "logging": {
        "logfile": "/home/iwatobipen/Desktop/AutoDock_Vina_demo/ADV_docking.log"
      }
    },
    "ligand_preparation": {
      "embedding_pools": [
        {
          "pool_id": "RDkit",
          "type": "RDkit",
          "parameters": {
            "prefix_execution": ""
          },
          "input": {
            "standardize_smiles": false,
            #"type": "smi",
            #"input_path": "/home/iwatobipen/dev/DockStreamCommunity/notebooks/../data/1UYD/ligands_smiles.txt"
            "type": "console"
          },
          "output": {
            "conformer_path": "/home/iwatobipen/Desktop/AutoDock_Vina_demo/ADV_embedded_ligands.sdf",
            "format": "sdf"
          }
        }
      ]
    },
    "docking_runs": [
      {
        "backend": "AutoDockVina",
        "run_id": "AutoDockVina",
        "input_pools": [
          "RDkit"
        ],
        "parameters": {
          "binary_location": "/home/iwatobipen/src/autodock_vina_1_1_2_linux_x86/bin",
          "parallelization": {
            "number_cores": 8
          },
          "seed": 42,
          "receptor_pdbqt_path": [
            "/home/iwatobipen/Desktop/AutoDock_Vina_demo/ADV_receptor.pdbqt"
          ],
          "number_poses": 2,
          "search_space": {
            "--center_x": 3.3,
            "--center_y": 11.5,
            "--center_z": 24.8,
            "--size_x": 15,
            "--size_y": 10,
            "--size_z": 10
          }
        },
        "output": {
          "poses": {
            "poses_path": "/home/iwatobipen/Desktop/AutoDock_Vina_demo/ADV_ligands_docked.sdf"
          },
          "scores": {
            "scores_path": "/home/iwatobipen/Desktop/AutoDock_Vina_demo/ADV_scores.csv"
          }
        }
      }
    ]
  }
}
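The search_space block in the config defines the Vina docking box (center and size in angstroms). A common way to choose these numbers is to take the bounding box of a reference ligand from the crystal structure and add some padding. A sketch of that calculation; the coordinates below are made up for illustration (the real values for 1UYD would come from the co-crystallized ligand):

```python
# Sketch: derive a Vina search box (center/size) from reference-ligand
# atom coordinates, padding each dimension so the box is not too tight.
def vina_box(coords, padding=5.0):
    """Return (center, size) of an axis-aligned box around coords."""
    xs, ys, zs = zip(*coords)
    center = tuple(round((max(a) + min(a)) / 2, 1) for a in (xs, ys, zs))
    size = tuple(round(max(a) - min(a) + padding, 1) for a in (xs, ys, zs))
    return center, size

# Illustrative atom positions only, not the actual 1UYD ligand:
ligand_atoms = [(1.0, 9.0, 22.0), (5.0, 13.0, 27.0), (3.0, 11.0, 24.0)]
center, size = vina_box(ligand_atoms)
print(center)  # (3.0, 11.0, 24.5)
print(size)    # (9.0, 9.0, 10.0)
```

The resulting tuples would fill the --center_x/y/z and --size_x/y/z fields of the search_space block.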

Now we're ready to run. Let's do it.

$ conda activate reinvent.v3.1 (or reinvent.v3.0)
$ python ~/dev/reinvent-3/Reinvent/input.py RL_config.json 
19:05:24: local_reinforcement_logger.log_message +32: INFO     starting an RL run
{'transformation_type': 'reverse_sigmoid', 'high': -8, 'k': 0.25, 'low': -12}
19:10:55: local_reinforcement_logger.timestep_report +41: INFO     
 Step 0   Fraction valid SMILES: 98.4   Score: 0.0322   Time elapsed: 331   Time left: 6620.0
  Agent     Prior     Target     Score     SMILES
-31.28    -31.28     -0.53      0.24      N1(S(c2ccccc2C(OC)=O)(=O)=O)CCCC(c2cc(F)ccc2)=N1
-40.00    -40.00    -15.47      0.19      C1C2CC(C)(C)Cc3c2c(ccc3)C2C1C1C(C(C)=C)CCC1(C)N2
-31.13    -31.13     -3.62      0.21      Clc1cccc(Cl)c1Cc1sc2n(c(-c3ccccc3)nn2)c1NC1CCCCC1
-35.44    -35.44     -4.68      0.24      c1c2cccc(COc3cnc(OC)cc3C)c2[nH]c1CC
-27.90    -27.90     10.06      0.30      c1ccc2c(n(CC(=O)N3Cc4c(cccc4)CC3)c(CCC(O)=O)c2-c2ccccc2)c1
-19.80    -19.80      4.74      0.19      FC(c1ccc(C(Nc2cc(CN3CCC(C(NC(C)(C)C)=O)CC3)ccc2)=O)cn1)(F)F
-34.63    -34.63     20.23      0.43      c12cccnc1ccc(C(=O)c1c(=O)c3cc(C)ccc3n(CCCC(=O)O)c1N)c2
-19.83    -19.83     18.14      0.30      c1(OC)c(O)c(OC)cc(C=Nn2c(COc3ccccc3)n[nH]c2=S)c1
-26.73    -26.73     41.87      0.54      c1(Nc2ccccc2C(=O)OCC)ccc(Cl)cc1C
-22.11    -22.11     12.13      0.27      C1(C)C(=Nc2ccc(Cl)c(Cl)c2)C(=O)c2c(cccc2)C1=O
dockstream   raw_dockstream
0.24025307595729828   -9.199999809265137   
0.19168232381343842   -9.0   
0.21497325599193573   -9.100000381469727   
0.24025307595729828   -9.199999809265137   
0.29661500453948975   -9.399999618530273   
0.19168232381343842   -9.0   
0.42853689193725586   -9.800000190734863   
0.29661500453948975   -9.399999618530273   
0.5359159111976624   -10.100000381469727   
0.26749271154403687   -9.300000190734863   

'''snip'''

{'transformation_type': 'reverse_sigmoid', 'high': -8, 'k': 0.25, 'low': -12}
21:09:22: local_reinforcement_logger.timestep_report +41: INFO     
 Step 19   Fraction valid SMILES: 97.7   Score: 0.0322   Time elapsed: 7438   Time left: 371.9
  Agent     Prior     Target     Score     SMILES
-31.16    -31.47     11.29      0.24      c1(CN2CCCCC2c2[nH]ncc2N)nc(C2CC2)on1
-17.78    -18.03     16.08      0.19      c1c2c(cc(OC)c1OC)-c1n(c(=O)[nH]c(=Nc3ccccc3)c1)CC2
-26.52    -26.43     11.84      0.21      S1CCN(S(=O)(=O)c2ccc(C(Nc3cc4c(cc3)CCC4)=O)cc2)CC1
-28.64    -28.58     14.18      0.24      O=C1CC2(CCN(CCc3c4nc(OC)ccc4ncc3F)CC2)C(=O)N1Cc1ccccc1
-58.15    -58.14     -5.34      0.30      CC12OC3C=CC45C(=C)C(C6(C(OC15C)C=C(C)CC6=O)O4)CC23
-33.17    -33.08      1.04      0.19      C1CN(c2c(C(N)c3ccccc3)c3ccccc3nc2-c2ccccc2)CCN1Cc1ccccc1
-29.99    -29.85     46.43      0.43      C(C1CCCN(c2n(CCO)nc(C)c2C#N)C1)(=O)N1CCCC1
-27.59    -27.54     25.26      0.30      N1C(=O)CC(CCCC)(CO)c2cc(-c3ccc(Cl)cc3)ccc21
-16.89    -16.86     78.53      0.54      c1c(C(c2ccccc2)=CCCN2CCCC(C(O)=O)C2)cccc1
-20.66    -20.57     27.04      0.27      c1(Cl)ccc(Cl)cc1NN=C(C#N)C(=O)c1noc(C(C)(C)C)c1
dockstream   raw_dockstream
0.24025307595729828   -9.199999809265137   
0.19168232381343842   -9.0   
0.21497325599193573   -9.100000381469727   
0.24025307595729828   -9.199999809265137   
0.29661500453948975   -9.399999618530273   
0.19168232381343842   -9.0   
0.42853689193725586   -9.800000190734863   
0.29661500453948975   -9.399999618530273   
0.5359159111976624   -10.100000381469727   
0.26749271154403687   -9.300000190734863   


It works well ;) After running the training, it's easy to show the docking poses using PyMOL. The following image came from the receptor (1UYD) and the SDF generated by REINVENT.
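Besides viewing the poses, the docked SDF can also be inspected programmatically. A dependency-free sketch that pulls SD data fields out of an SDF text; the sample record and the "docking_score" tag name are illustrative assumptions (check which tags DockStream actually writes into ADV_ligands_docked.sdf):

```python
# Sketch: read SD data fields ("> <tag>" blocks) from SDF text.
def sdf_tags(sdf_text):
    """Yield one {tag: value} dict per molecule record ($$$$-delimited)."""
    for record in sdf_text.split("$$$$"):
        lines = record.splitlines()
        tags = {}
        for i, line in enumerate(lines[:-1]):
            stripped = line.strip()
            if stripped.startswith("> <") and stripped.endswith(">"):
                tags[stripped[3:-1]] = lines[i + 1].strip()
        if tags:
            yield tags

# Minimal fake SDF record; "docking_score" is a hypothetical tag name.
sample = """mol_0
  header
  comment
> <docking_score>
-10.1

$$$$
"""
scores = [float(t["docking_score"]) for t in sdf_tags(sample)]
print(scores)  # [-10.1]
```

With the real output SDF, this kind of loop makes it easy to rank the generated poses by score before loading the best ones into PyMOL.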

I know there is still room for improvement in the docking scores, but it's interesting that the compound generator learns structures from 1D SMILES strings yet can optimize the 3D structures of molecules.

Yeah, it depends on which tool is used for conformer generation ;)

Handling 3D information in chemoinformatics is still a difficult area for me.

Published by iwatobipen

I'm a medicinal chemist at a mid-size pharmaceutical company. I love chemoinformatics, coding, organic synthesis, and my family.

5 thoughts on “Reinforcement learning with docking score #RDKit #reinvent #chemoinformatics”

  1. Thank you so much for your post.
    Would you like to share your TensorBoard figures? I used your config file and ran REINVENT 3.0. After 20 steps, I couldn't see the score improve. The file e0019_ADV_scores.csv is identical to ADV_scores.csv. I didn't get any error messages. I'm not sure if I'm doing it right. Many thanks.

    1. Hi, thanks for pointing that out; there was a mistake in my post.
      When using DockStream from REINVENT, the input type in the DockStream config should be “console”, because the compounds are generated by the deep neural net and Vina should dock these SMILES.
      I fixed my post.
      Thanks,

      1. Thank you for your reply. After you changed the input type, what was your output in the terminal? I got totally different outputs in the terminal, and it took very long to train (about one day). But from your post, it only took about 3 hours for 20 epochs.
