Call RDKit from Rust with conda env ver2 #RDKit #RDKit-sys #Rust #Chemoinformatics

To use RDKit from Rust, I introduced rdkit-sys before. Fortunately, recent versions of rdkit-sys support linking against RDKit from a conda env. It's worth using a conda env to build rdkit-sys because users don't need to build RDKit from source.

The following code is almost the same as in my previous post, but I would like to share it.

First, I cloned rdkit from rdkit-rs.

$ gh repo clone rdkit-rs/rdkit
$ cd rdkit

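Then I edited Cargo.toml. I modified the dependencies section as shown below, adding the features=["dynamic-linking-from-conda"] option to rdkit-sys. After that, I added the conda env's lib directory to LD_LIBRARY_PATH so that the RDKit shared libraries can be found.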

[package]
name = "rdkit"
version = "0.2.11"
edition = "2021"
authors = ["Xavier Lange <xrlange@gmail.com>"]
license = "MIT"
description = "High level RDKit functionality for rust"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
bitvec = "1"
cxx = "1"
log = "0.4"
#iwatobipen modified
rdkit-sys = {version="0.2.14", features=["dynamic-linking-from-conda"]}
flate2 = "1"

[dev-dependencies]
env_logger = "0.9.0"

## from bash
(base) $ conda activate chemo_py310
(chemo_py310)$ export LD_LIBRARY_PATH=$CONDA_PREFIX/lib:$LD_LIBRARY_PATH
(chemo_py310)$ cargo test # all tests will pass
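
If the tests fail to find the RDKit shared libraries, it can help to check the environment first. The following is only a minimal sketch using the Rust standard library: it checks whether $CONDA_PREFIX/lib appears as an entry in LD_LIBRARY_PATH. The helper name path_list_contains is my own and is not part of rdkit or rdkit-sys.

```rust
use std::env;
use std::path::{Path, PathBuf};

/// Hypothetical helper: returns true if `dir` is one of the entries
/// in a PATH-style list (entries are separated by ':' on Linux).
fn path_list_contains(path_list: &str, dir: &Path) -> bool {
    env::split_paths(path_list).any(|entry| entry.as_path() == dir)
}

fn main() {
    // CONDA_PREFIX is set by `conda activate`.
    let prefix = match env::var_os("CONDA_PREFIX") {
        Some(p) => p,
        None => {
            eprintln!("CONDA_PREFIX is not set; activate the conda env first");
            return;
        }
    };
    let conda_lib = PathBuf::from(prefix).join("lib");
    let ld_path = env::var("LD_LIBRARY_PATH").unwrap_or_default();

    if path_list_contains(&ld_path, &conda_lib) {
        println!("OK: {} is on LD_LIBRARY_PATH", conda_lib.display());
    } else {
        println!("Missing: add {} to LD_LIBRARY_PATH", conda_lib.display());
    }
}
```

Note that on Linux the entries must be separated by a colon. With a semicolon, the shell treats the rest of the line as a separate command, so the old value is never appended.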

Then I wrote a small program, rust_rdkit_v3.

$ cargo new rust_rdkit_v3
$ cd rust_rdkit_v3
$ vim src/main.rs
use std::env;

fn main() {
    // The first command-line argument is the path to a gzipped SD file.
    let args: Vec<String> = env::args().collect();
    let sdgz_filename = &args[1];

    println!("filename is {}", sdgz_filename);
    // Read the molecules from the gzipped SD file.
    let mol_block_iter =
        rdkit::MolBlockIter::from_gz_file(sdgz_filename, false, false, false).unwrap();
    let mol_list = mol_block_iter.collect::<Vec<_>>();
    for m in mol_list {
        // Convert each molecule to SMILES and print it.
        let smi = m.unwrap().as_smile();
        println!("{}", smi)
    }
    println!("Done");
}
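
The program above indexes args[1] directly, so it panics when it is run without a filename. As a small sketch of one way to handle that (standard library only, no rdkit calls), the argument lookup could be moved into a helper that returns a usage message instead. The function name input_path and the wording of the usage message are my own choices for illustration.

```rust
use std::env;

/// Hypothetical helper: returns the first argument after the program name,
/// or a usage message if no argument was given.
fn input_path<I: Iterator<Item = String>>(mut args: I) -> Result<String, String> {
    // The first item is the program name; fall back to the crate name if absent.
    let program = args.next().unwrap_or_else(|| "rust_rdkit_v3".to_string());
    args.next()
        .ok_or_else(|| format!("usage: {} <file.sdf.gz>", program))
}

fn main() {
    match input_path(env::args()) {
        // In the real program, the path would be passed on to the SD file reader.
        Ok(path) => println!("filename is {}", path),
        Err(usage) => eprintln!("{}", usage),
    }
}
```

A real command-line tool would probably also exit with a non-zero status in the error branch.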

$ vim Cargo.toml
[package]
name = "rust_rdkit_v3"
version = "0.1.0"
edition = "2021"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
#rdkit = "0.2.11"
rdkit={path="../rdkit"}

After writing the code, build it.

$ cargo build --release

Now I can run the code.

$ ./target/release/rust_rdkit_v3 cdk2.sdf.gz
filename is cdk2.sdf.gz
[H]C1=NC2=C(N=C(N([H])[H])N=C2OC([H])([H])C(=O)C([H])(C([H])([H])[H])C([H])([H])[H])N1[H]
[H]C1=NC2=C(OC([H])([H])C3([H])OC([H])([H])C([H])([H])C3([H])[H])N=C(N([H])[H])N=C2N1[H]
[H]C1=NC2=C(OC([H])([H])C3([H])N([H])C(=O)C([H])([H])C3([H])[H])N=C(N([H])[H])N=C2N1[H]
[H]C1=NC2=C(OC([H])([H])C3([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])C3([H])[H])N=C(N([H])[H])N=C2N1[H]
[H]C1=NC2=C(OC([H])([H])C3([H])C([H])([H])C([H])=C([H])C([H])([H])C3([H])[H])N=C(N([H])[H])N=C2N1[H]
[H]OC([H])([H])C([H])([H])N([H])C1=NC(N([H])C([H])([H])C2=C([H])C([H])=C([H])C([H])=C2[H])=C2N=C([H])N(C([H])([H])[H])C2=N1
....

In summary, the current rdkit-sys supports conda envs, which makes it easy to call RDKit from Rust.

Thanks!

Published by iwatobipen

I'm a medicinal chemist at a mid-size pharmaceutical company. I love chemoinformatics, coding, organic synthesis, and my family.
