I recently bought the book ‘Nim in Action’ and started learning the Nim language. I had never used a language like Nim, which has to be compiled before the code can run.
I was interested in Nim because many documents say it is fast and efficient, and its syntax looks like Python. Also, an RDKit binding is available! The binding is still under development, but it is good news for chemoinformaticians. Thanks to Axel Pahl for his great work!
rdkit-nim already offers some chemoinformatics functions, so let’s try it.
First I installed Nim. rdkit-nim supports Nim version 1.1.1 or higher, which was not a stable release at the time of writing, so I installed choosenim and switched to the devel version.
$ curl https://nim-lang.org/choosenim/init.sh -sSf | sh
# Add path to bashrc
# export PATH=/home/username/.nimble/bin:$PATH
# Then switch to devel from stable version.
$ choosenim devel
$ nim -v
Nim Compiler Version 1.1.1 [Linux: amd64]
Compiled at 2020-02-05
Copyright (c) 2006-2019 by Andreas Rumpf
Next, install rdkit_nim. The steps are described in the README.md of the original repository.
# Add rdkit path to bashrc
# export RDKIT_NIM_CONDA=/home/username/anaconda3/envs/myenv
$ git clone https://github.com/apahl/rdkit_nim.git
$ cd rdkit_nim
$ nimble install
$ nimble test
Executing task test in /home/username/dev/nim_dev/rdkit_nim/rdkit.nimble
[mol.nim] passed.
[qed.nim] passed.
[sss.nim] passed.
All tests passed.
OK, all tests passed. ;-)
Let’s build a simple example that calculates the QED of some given molecules. The code is below.
# calc_qed.nim
import rdkit / [mol, qed]

let smiles_list = [
  "CC1CCN(CC1N(C)C2=NC=NC3=C2C=CN3)C(=O)CC#N",
  "CC(C1=C(C=CC(=C1Cl)F)Cl)OC2=C(N=CC(=C2)C3=CN(N=C3)C4CCNCC4)N"
]

for smi in smiles_list:
  echo(smi)
  let m = molFromSmiles(smi)
  let a = qedDefault(m)
  echo("QED")
  echo(a)
rdkit-nim currently supports the mol, descriptors, sss and qed modules. The code above imports the rdkit/mol and rdkit/qed modules.
Before building the code, a NimScript file (calc_qed.nims) is required.
# calc_qed.nims
import os # `/`

let
  fileName = "calc_qed"
  condaPath = getEnv("RDKIT_NIM_CONDA")

task build, "Building default cpp target...":
  switch("verbosity", "0")
  # switch("hint[Conf]", "off")
  switch("hints", "off")
  switch("out", "test2/bin" / toExe(fileName))
  switch("run")
  switch("passL", "-lstdc++")
  switch("passL", "-L" & condaPath & "/lib")
  switch("passL", "-lRDKitGraphMol")
  switch("passL", "-lRDKitDescriptors")
  switch("passL", "-lRDKitSmilesParse")
  switch("passL", "-lRDKitChemTransforms")
  switch("passL", "-lRDKitSubstructMatch")
  switch("cincludes", condaPath & "/include/rdkit")
  switch("cincludes", condaPath & "/include")
  setCommand "cpp"
Almost there. Let’s compile the code. Because the build task sets switch("run"), the program also runs right after compilation.
$ nim build calc_qed.nim
Hint: used config file '/home/username/.choosenim/toolchains/nim-#devel/config/nim.cfg' [Conf]
Hint: used config file '/home/username/.choosenim/toolchains/nim-#devel/config/config.nims' [Conf]
Hint: used config file '/home/username/dev/nim_dev/rdkit_nim/test2/calc_qed.nims' [Conf]
CC1CCN(CC1N(C)C2=NC=NC3=C2C=CN3)C(=O)CC#N
QED
0.9284511020269033
CC(C1=C(C=CC(=C1Cl)F)Cl)OC2=C(N=CC(=C2)C3=CN(N=C3)C4CCNCC4)N
QED
0.5330554587335657
After that I had the compiled file, test2/bin/calc_qed.
Let’s run it directly.
$ ./test2/bin/calc_qed
CC1CCN(CC1N(C)C2=NC=NC3=C2C=CN3)C(=O)CC#N
QED
0.9284511020269033
CC(C1=C(C=CC(=C1Cl)F)Cl)OC2=C(N=CC(=C2)C3=CN(N=C3)C4CCNCC4)N
QED
0.5330554587335657
Works fine!
In this code I embedded the molecules as SMILES strings, but if the code loaded the SMILES from a text file, the target molecules could be changed without recompiling. That would be a more practical application.
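As a rough idea of how the file loading could look, here is a minimal sketch that uses only the Nim standard library. The file name read_smiles.nim, the proc name readSmiles and the input format (one SMILES per line, blank lines and lines starting with # ignored) are my own assumptions, not part of rdkit-nim. I have not wired it up to the QED calculation here; the loop body only echoes each string.

```nim
# read_smiles.nim (hypothetical file name)
import os, strutils

proc readSmiles(path: string): seq[string] =
  ## Assumed input format: one SMILES per line.
  ## Blank lines and lines starting with '#' are skipped.
  for line in lines(path):
    let smi = line.strip()
    if smi.len > 0 and not smi.startsWith("#"):
      result.add(smi)

when isMainModule:
  # The path to the SMILES file is taken from the first command line argument.
  if paramCount() >= 1:
    for smi in readSmiles(paramStr(1)):
      # In calc_qed.nim, molFromSmiles(smi) and qedDefault(m) would go here.
      echo(smi)
```

With something like this, the loop over smiles_list in calc_qed.nim could iterate over readSmiles(paramStr(1)) instead of the hard-coded array.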
I’m still new to Nim, so I would like to learn more and use the package to solve my chemoinformatics tasks!
The original repo provides many examples, and users can learn a lot from them. Please check it out!