Use RDKit from Rust #RDKit #rdkitcffi #Rust

Rust has been getting popular lately, and I'm interested in it. The hurdle for me in moving from Python to another language is that I want to keep using RDKit in my coding environment for chemoinformatics tasks ;)

As many chemoinformaticians know, RDKit recently started providing a C Foreign Function Interface (CFFI). Many languages can call C functions, and Rust is one of them. This means that we can use a minimal set of RDKit functions from Rust. Sounds great, doesn't it?

I also found a really cool project on GitHub named ‘rdkitcffi’, which is an RDKit wrapper for Rust. https://github.com/chrissly31415/rdkitcffi

So I tried to use the crate ;)

First, install Rust and the other required packages, then clone rdkitcffi.

# https://www.rust-lang.org/tools/install
$ curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

# my os is ubuntu 20.04 LTS 
$ sudo apt-get install build-essential
$ sudo apt-get install libclang-dev
$ gh repo clone chrissly31415/rdkitcffi

The build.rs in the original code defines the shared library directory as a relative path. To use the package as a library from another project, I changed it to an absolute path and added the ‘rdkitcffi_linux/linux-64/’ directory to the LD_LIBRARY_PATH environment variable.

// rdkitcffi/build.rs

// before: relative path
let shared_lib_dir = "./lib/rdkitcffi_linux/linux-64/";

// after: absolute path
let shared_lib_dir = "/home/user/hogehoge/lib/rdkitcffi_linux/linux-64/";
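A hard-coded absolute path only works on one machine. As an alternative, here is a minimal sketch of a build script that derives the absolute directory from the crate root. It assumes that the library directory sits under the crate root as in the relative path above, and that the directory is passed to the linker with the standard `cargo:rustc-link-search` directive; the real build.rs of rdkitcffi may use `shared_lib_dir` differently. The helper name `absolute_lib_dir` is made up for this example.

```rust
// build.rs sketch: derive an absolute library directory instead of hard-coding one.
use std::env;
use std::path::{Path, PathBuf};

/// Join the crate root with the relative library directory.
/// (Hypothetical helper, not part of rdkitcffi.)
fn absolute_lib_dir(crate_root: &Path, relative: &str) -> PathBuf {
    crate_root.join(relative)
}

fn main() {
    // Cargo sets CARGO_MANIFEST_DIR when it runs a build script; fall back to
    // the current directory so this sketch also runs on its own.
    let crate_root = env::var("CARGO_MANIFEST_DIR")
        .map(PathBuf::from)
        .unwrap_or_else(|_| env::current_dir().expect("cannot read current directory"));

    let shared_lib_dir = absolute_lib_dir(&crate_root, "lib/rdkitcffi_linux/linux-64");

    // Tell Cargo where the linker should look for native libraries.
    println!("cargo:rustc-link-search=native={}", shared_lib_dir.display());
}
```

This only covers link time. The shared library still has to be found at run time, so the LD_LIBRARY_PATH setting is still needed.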

Almost there. Let's make a new project and add rdkitcffi to the dependencies in Cargo.toml. rdkitcffi is not published on crates.io, so a local crate is used.

$ cargo new rdkrust
$ cat rdkrust/Cargo.toml

[package]
name = "rdkrust"
version = "0.1.0"
edition = "2021"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
rdkitcffi = {path="/home/hogehoge/rusttest/rdkitcffi"} 

Then let’s write src/main.rs to calculate compound descriptors from an SDF file.

// ./src/main.rs
use std::env;
use rdkitcffi::{Molecule, read_sdfile};

fn main() {
    // The first command-line argument is the path to the SDF file.
    let sd_filename = env::args().nth(1).expect("usage: rdkrust <file.sdf>");
    println!("filename is {}", sd_filename);
    // read_sdfile returns one Option<Molecule> per record.
    let mol_opt_list: Vec<Option<Molecule>> = read_sdfile(&sd_filename);
    // Keep only the Some(...) entries.
    let mut mol_list: Vec<Molecule> = mol_opt_list.into_iter().flatten().collect();
    mol_list.iter_mut().for_each(|m| m.remove_all_hs());
    for m in mol_list {
        // The descriptors are printed as one JSON string per molecule.
        let desc = m.get_descriptors();
        println!("{}", desc);
    }
    println!("Done");
}
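The step that turns `Vec<Option<Molecule>>` into `Vec<Molecule>` may look odd to people coming from Python, so here is a small standalone sketch of the same pattern. `Record` is a hypothetical stand-in type, not the rdkitcffi `Molecule`, and I am assuming that a `None` entry means a record that could not be read. `flatten()` on an iterator of `Option` values does the same thing as `filter_map(|m| m)`.

```rust
// Hypothetical stand-in for a parsed molecule; not the rdkitcffi type.
#[derive(Debug, PartialEq)]
struct Record {
    name: String,
}

/// Keep only the records that parsed successfully, dropping every `None`.
fn keep_parsed(records: Vec<Option<Record>>) -> Vec<Record> {
    records.into_iter().flatten().collect()
}

fn main() {
    let parsed = vec![
        Some(Record { name: "mol1".to_string() }),
        None, // assumed to represent a record that failed to parse
        Some(Record { name: "mol2".to_string() }),
    ];
    let kept = keep_parsed(parsed);
    for record in &kept {
        println!("{}", record.name);
    }
}
```

Dropping the `None` entries silently is convenient for a quick script. If you need to know which records failed, count them before filtering.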

OK let’s build the code.

$ cargo build
# build rdkitcffi wrapper and sample script

After the build, the rdkrust command is generated under target/debug. It calculates molecular descriptors and prints them as shown below.

rust_rdkit$ time ./target/debug/rdkrust cdk2.sdf 
filename is cdk2.sdf
{"exactmw":235.10692,"amw":235.247,"lipinskiHBA":7.0,"lipinskiHBD":3.0,"NumRotatableBonds":4.0,"NumHBD":2.0,"NumHBA":6.0,"NumHeavyAtoms":17.0,"NumAtoms":30.0,"NumHeteroatoms":7.0,"NumAmideBonds":0.0,"FractionCSP3":0.4,"NumRings":2.0,"NumAromaticRings":2.0,"NumAliphaticRings":0.0,"NumSaturatedRings":0.0,"NumHeterocycles":2.0,"NumAromaticHeterocycles":2.0,"NumSaturatedHeterocycles":0.0,"NumAliphaticHeterocycles":0.0,"NumSpiroAtoms":0.0,"NumBridgeheadAtoms":0.0,"NumAtomStereoCenters":0.0,"NumUnspecifiedAtomStereoCenters":0.0,"labuteASA":97.42084,"tpsa":106.78,"CrippenClogP":0.53899,"CrippenMR":61.4361,"chi0v":9.59729,"chi1v":5.19743,"chi2v":2.25934,"chi3v":2.25934,"chi4v":1.23253,"chi0n":9.59729,"chi1n":5.19743,"chi2n":2.25934,"chi3n":2.25934,"chi4n":1.23253,"hallKierAlpha":-2.17999,"kappa1":11.3097,"kappa2":4.36167,"kappa3":2.32463,"Phi":2.90171}
---snip---

real	0m0.085s
user	0m0.077s
sys	0m0.004s
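Each output line is a flat JSON object, so picking out a few descriptors takes only a little code. The following is a minimal sketch that uses only the standard library. It assumes the flat `"name":number` format shown above (no nesting, no string values), and the function name `parse_flat_descriptors` is made up for this example. For anything more serious, a proper JSON crate such as serde_json would be a safer choice.

```rust
use std::collections::HashMap;

/// Parse a flat JSON object of the form {"name":number,...} into a map.
/// Assumes no nesting, no string values, and no commas or colons inside keys.
/// (Hypothetical helper, not part of rdkitcffi.)
fn parse_flat_descriptors(json: &str) -> HashMap<String, f64> {
    let inner = json.trim().trim_start_matches('{').trim_end_matches('}');
    let mut map = HashMap::new();
    for pair in inner.split(',') {
        if let Some((key, value)) = pair.split_once(':') {
            let key = key.trim().trim_matches('"');
            // Entries whose value is not a number are skipped.
            if let Ok(number) = value.trim().parse::<f64>() {
                map.insert(key.to_string(), number);
            }
        }
    }
    map
}

fn main() {
    // Shortened copy of the output line shown above.
    let line = r#"{"exactmw":235.10692,"amw":235.247,"NumHBD":2.0,"NumHBA":6.0,"tpsa":106.78}"#;
    let desc = parse_flat_descriptors(line);
    for name in ["exactmw", "NumHBD", "NumHBA", "tpsa"] {
        if let Some(value) = desc.get(name) {
            println!("{}\t{}", name, value);
        }
    }
}
```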

rdkitcffi supports not only descriptor calculation but also other RDKit functions. Rust also has lots of useful crates that play roles similar to Python's pandas, plotly, scikit-learn, etc.

I would like to write more code with Rust.

Today’s code is uploaded to my GitHub repo. Thanks for reading.

https://github.com/iwatobipen/rust_rdkit/tree/main

Published by iwatobipen

I'm a medicinal chemist at a mid-size pharmaceutical company. I love chemoinfo, coding, organic synthesis, and my family.
