Use RDKit from Rust v2 #RDKit #Rust

I enjoyed the 18th mishima.syk meeting last weekend. I think the community is really cool and worth joining to catch the cutting edge of chemo/bioinformatics ;) Feel free to participate and present if you are interested in the meeting.

In the meeting, yamasakit_ introduced “Rust basics” with live coding! Fortunately, his presentation material is available from the mishima.syk 18 repo (written in Japanese)!

I’m interested in Rust and previously posted about how to integrate RDKit and Rust. In that post, rdkitcffi was required to use RDKit functionality from Rust. I felt that the process was a little annoying.

Recently, Xavier Lange developed rdkit-sys and registered it on crates.io. With this package, you don’t need to use rdkitcffi for your Rust code development.

OK let’s write code!

First, I made an example project.

$ cargo new rdkrust
$ cd rdkrust

Then, add rdkit-sys = "0.1.10" to Cargo.toml.

[package]
name = "rdkrust"
version = "0.1.0"
edition = "2021"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
rdkit-sys = "0.1.10"

Next, I wrote main.rs. The following code converts molecules from an SDF file to SMILES.
The line mol_opt_list.into_iter().filter_map(|m| m).collect(); does the same thing as mols = [mol for mol in mols if mol is not None] in Python.

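use std::env;
use rdkit_sys::molecule::Molecule;
use rdkit_sys::molecule::read_sdfile;

fn main() {
    // The first command-line argument is the path to the SDF file.
    let args: Vec<String> = env::args().collect();
    let sd_filename = &args[1];
    println!("filename is {}", sd_filename);
    // Each record is an Option, so unreadable molecules come back as None.
    let mol_opt_list: Vec<Option<Molecule>> = read_sdfile(sd_filename);
    // Drop the None entries and keep only the molecules that were read.
    let mut mol_list: Vec<Molecule> = mol_opt_list.into_iter().filter_map(|m| m).collect();
    // Remove hydrogens in place before writing SMILES.
    mol_list.iter_mut().for_each(|m| m.remove_all_hs());
    for m in mol_list {
        let smi = m.get_smiles("");
        println!("{}", smi);
    }
    println!("Done");
}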
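
The filter_map(|m| m) idiom used in main.rs can be tried on its own without RDKit. The following is only a minimal sketch with standard-library types: the helper names keep_some and keep_some_flatten are made up for illustration, and plain strings stand in for molecules. It also shows that Iterator::flatten gives the same result, because Option implements IntoIterator.

```rust
/// Hypothetical helper: keeps only the `Some` values,
/// like `[m for m in mols if m is not None]` in Python.
fn keep_some<T>(items: Vec<Option<T>>) -> Vec<T> {
    items.into_iter().filter_map(|m| m).collect()
}

/// Hypothetical helper: same result via `Iterator::flatten`,
/// which works because `Option` implements `IntoIterator`.
fn keep_some_flatten<T>(items: Vec<Option<T>>) -> Vec<T> {
    items.into_iter().flatten().collect()
}

fn main() {
    // Stand-in data: strings play the role of parsed molecules and
    // `None` plays the role of a record that could not be read.
    let parsed = vec![Some("CCO"), None, Some("c1ccccc1")];

    let kept = keep_some(parsed.clone());
    println!("{:?}", kept);

    // Both variants drop the `None` entry and keep the original order.
    assert_eq!(kept, keep_some_flatten(parsed));
}
```

Either form works in the program above; filter_map is the more general tool when the closure also transforms each item.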

Then compile the code! The process will take a few minutes.

$ cargo build

After the build, the rdkrust binary is in the target/debug folder. Let’s check it.

$ target/debug/rdkrust cdk2.sdf
filename is cdk2.sdf
CC(C)C(=O)COc1nc(N)nc2[nH]cnc12
Nc1nc(OCC2CCCO2)c2nc[nH]c2n1
Nc1nc(OCC2CCC(=O)N2)c2nc[nH]c2n1
Nc1nc(OCC2CCCCC2)c2nc[nH]c2n1
Nc1nc(OCC2CC=CCC2)c2nc[nH]c2n1
Cn1cnc2c(NCc3ccccc3)nc(NCCO)nc21
CCC(CO)Nc1nc(NCc2ccccc2)c2ncn(C(C)C)c2n1
COc1ccc(CNc2nc(N(CCO)CCO)nc3c2ncn3C(C)C)cc1
Nc1nc(N)c(N=O)c(OCC2CCCCC2)n1
COc1ccc2c(c1)/C(=C/c1cnc[nH]1)C(=O)N2
COc1cc[nH]c1/C=C1\C(=O)Nc2ccc([N+](=O)[O-])cc21
CCc1cnc(CSc2cnc(NC(=O)C(C)C)s2)o1
CCCCOc1c(C(=O)c2nccs2)cnc2[nH]ncc12
COc1cc(-c2ccc[nH]2)c2c3c(ccc(F)c13)NC2=O
[NH3+]CCSc1cc(-c2ccc[nH]2)c2c3c(ccc(F)c13)NC2=O
NC(=O)Nc1cccc2c1C(=O)c1c-2n[nH]c1-c1cccs1
COc1ccc2c(c1)/C(=C/c1[nH]cc3c1CCOC3=O)C(=O)N2
CNS(=O)(=O)c1ccc2c(c1)/C(=C/c1[nH]cc3c1CCNC3=O)C(=O)N2
CC(=O)Nc1cccc2c1C(=O)c1c-2n[nH]c1-c1ccncc1
COc1ccc(-c2[nH]nc3c2C(=O)c2c(NC(N)=O)cccc2-3)cc1
COc1ccc(-c2[nH]nc3c2C(=O)c2c(NC(=O)NN(C)C)cccc2-3)cc1
Cc1nc2ccccn2c1-c1ccnc(Nc2ccccc2)n1
Cc1nc2ccccn2c1-c1ccnc(Nc2ccc(OCC(O)C[NH+](C)C)cc2)n1
O=C(Nc1ccccn1)Nc1cccc2c1C1CCCN1C2=O
NS(=O)(=O)c1ccc(N/N=C2\C(=O)Nc3ccc(Br)cc32)cc1
CNS(=O)(=O)c1ccc(N/C=C2\C(=O)Nc3ccccc32)cc1
N=C(N)NS(=O)(=O)c1ccc(N/C=C2\C(=O)Nc3ccccc32)cc1
O=C1Nc2ccc(S(=O)(=O)O)cc2/C1=C1/Nc2ccccc2C1=O
c1ccc(Nc2nc(OCC3CCCCC3)c3nc[nH]c3n2)cc1
NS(=O)(=O)c1ccc(Nc2nc(OCC3CCCCC3)c3nc[nH]c3n2)cc1
O=C(c1ccccc1)c1cnc2n[nH]cc2c1OCc1ccccc1
NS(=O)(=O)c1ccc(Nc2cc(-c3ccc([N+](=O)[O-])cc3)[nH]n2)cc1
CCCCOc1c(C(=O)c2c(F)cc(Br)cc2F)cnc2[nH]ncc12
Cc1ccc(F)c(Nc2ccnc(Nc3ccc(S(N)(=O)=O)cc3)n2)c1
CC(C)C(CO)Nc1nc(Nc2cccc(Cl)c2)c2ncn(C(C)C)c2n1
CC(C)C(CO)Nc1nc(Nc2ccc(C(=O)[O-])c(Cl)c2)c2ncn(C(C)C)c2n1
[NH3+]C1CCC(Nc2nc(NCC3CC3)c3ncn(C4CCCC4)c3n2)CC1
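
One caveat when running the command: main.rs indexes &args[1] directly, so calling rdkrust without a filename panics. The following is a minimal sketch of more defensive argument handling using only the standard library. The helper name sd_filename_from and the usage message are my own assumptions and are not part of rdkit-sys.

```rust
use std::env;

/// Hypothetical helper: returns the first argument after the program name,
/// or a usage message if no filename was given.
fn sd_filename_from<I: Iterator<Item = String>>(mut args: I) -> Result<String, String> {
    // The first item is normally the program name; fall back to a fixed name.
    let program = args.next().unwrap_or_else(|| String::from("rdkrust"));
    args.next()
        .ok_or_else(|| format!("usage: {} <file.sdf>", program))
}

fn main() {
    match sd_filename_from(env::args()) {
        Ok(name) => println!("filename is {}", name),
        // A real tool would probably also exit with a non-zero status here.
        Err(usage) => eprintln!("{}", usage),
    }
}
```

The returned filename could then be passed to read_sdfile in place of &args[1].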

It seems to work well. Because of my limited C++ knowledge, I can’t implement MCS in rdkitcffi. I think it would be great to call the findMCS function from Rust, because finding an MCS is computationally expensive and I would like to run the search faster. Please let me know if any reader can implement it in rdkitcffi ;)

Thanks to Xavier Lange for developing a cool package for Rust!

Published by iwatobipen

I'm a medicinal chemist at a mid-size pharmaceutical company. I love chemoinfo, coding, organic synthesis, and my family.
