Current version Openforcefield supports rdkit #RDKit #Openforcefield #chemoinformatics
I have posted about openforcefield (OpenFF) before. Older versions of OpenFF supported only the OpenEye toolkit, but the current version supports RDKit too. It is worth knowing that OpenFF can now be used with an open source toolkit. I really appreciate the developers’ work! It is great. Today I used the package together with ipymol, which can control PyMOL in…
Category Archives: programming
Comparison of rdChemReactions and EditableMol #RDKit #chemoinformatics
This year I moved from the MedChem team to the CompChem team, and now I need to learn SBDD. Today I struggled with a mol object that has 3D information. I wanted to replace a hydrogen attached to an aromatic carbon with other atoms. I thought it would be easy with the rdChemReactions method. But I found that it…
Molecular encoder/decoder (VAE) with python3.x (not new topic) #rdkit #chemoinformatics
The first day of the 10-day holiday is rainy, and my kid and I will go to a dodgeball tournament. Two years ago, I tried to modify keras-molecule, which is code for a molecular encoder/decoder. The code was written for Python 2.x, so I wanted to run it on Python 3.6. I stopped…
Visualize Molecular Orbital with pymol and psikit #RDKit #psi4 #psikit #pymol
I posted about how to visualize MOs with VMD, using data generated by psikit. I used VMD because psi4 provides a visualization function for cube files and I could not find an equivalent method for PyMOL. PyMOL is more familiar to me than VMD, so I wanted to do the same thing in PyMOL. And today I could…
Adjustment bond length and align molecule to scaffold #RDKit #chemoinformatics
GenerateDepictionMatching2DStructure is useful for aligning an RDKit mol object to a given scaffold. But some days ago, in a discussion with my colleague, it came up that the function cannot adjust bond lengths when the scaffold is read from an SDF and its bond lengths differ from RDKit’s default settings. I found useful information on rdkit-discuss about rdMolTransforms. The rdMolTransforms.TransformConformer method…
Handling chemoinformatics data with pandas #RDKit #chemoinformatics
I often use pandas for data analysis. RDKit provides a useful module named PandasTools, which can load an SDF and return the data as a pandas DataFrame. With a DataFrame, there is no need to write for loops. I found an interesting piece of information in the RDKit issues: a package named pyjanitor. The package wraps pandas and provides useful…
New version of openforcefield supports RDKit! #RDKit #chemoinformatics #openforcefield
This morning I found great news on my Twitter timeline: openforcefield supports RDKit! Older versions of openforcefield required the OpenEye toolkit, which needs a commercial license for industry users. I was interested in the package but could not install it, because I am not in academia and our company does not have an OpenEye toolkit license now. https://open-forcefield-toolkit.readthedocs.io/en/latest/releasehistory.html I can’t wait to…
Make Graph convolution model with geometric deep learning extension library for PyTorch #RDKit #chemoinformatics #pytorch
In chemoinformatics, QSAR using the molecular graph as input is a very hot topic. I think deepchem and chainer-chemistry are examples of major implementations. I am also interested in building graph-based QSAR models. Recently I have been using PyTorch for my deep learning tasks, so I would like to build a model with PyTorch. Fortunately…
Generate possible heteroaromatic cores from query molecule #RDKit #chemoinformatics
Hetero shuffling is an approach that replaces atoms of a scaffold to generate new molecules with the atom-replaced scaffold. For example, with benzene as the core, shuffled cores would be pyridine, pyrimidine, etc. The approach is often used in medicinal chemistry projects to improve ADMET properties and biological activities, and also in substance patent claim strategy. Native…
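The idea of shuffling a benzene core can be sketched without any toolkit. The snippet below is only a rough illustration in plain Python, not the code from the post (which is cut off here): it treats a six-membered ring as a string of C and N positions and removes duplicates that differ only by rotation or reflection. The function names `canonical_ring` and `shuffle_six_ring` are hypothetical, and the output strings are ring patterns, not valid SMILES.

```python
from itertools import product


def canonical_ring(pattern):
    """Return the smallest rotation/reflection of a cyclic pattern.

    Two patterns describe the same ring if one can be rotated or
    mirrored into the other, so we keep one representative per ring.
    """
    n = len(pattern)
    variants = []
    for seq in (pattern, pattern[::-1]):
        for shift in range(n):
            variants.append(seq[shift:] + seq[:shift])
    return min(variants)


def shuffle_six_ring(max_nitrogens=3):
    """Enumerate unique C/N patterns of a six-membered ring (assumed helper)."""
    unique = set()
    for pattern in product("CN", repeat=6):
        if pattern.count("N") <= max_nitrogens:
            unique.add(canonical_ring(pattern))
    return sorted("".join(p) for p in unique)


# Up to two nitrogens: benzene, pyridine, and the three diazine patterns.
cores = shuffle_six_ring(max_nitrogens=2)
for core in cores:
    print(core)
```

With at most two nitrogens this gives five patterns, which correspond to benzene, pyridine, pyridazine, pyrimidine and pyrazine. A real implementation would work on molecule objects and check valence and aromaticity, which this sketch ignores.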
Draw molecular network on Jupyter notebook with rdkit and cytoscape.js-2 #RDKit #cytoscape
Yesterday I posted about cyjupyter and got a comment asking how to render the compound name on each node. I think it is possible with the context attribute in the style settings. Let’s try. The code is almost the same as yesterday’s. Networkx makes it easy to build a network with many attributes. I set SMILES as the name attribute. OK, let’s draw…
Draw molecular network on Jupyter notebook with rdkit and cytoscape.js #RDKit #cytoscape
I use Cytoscape and cytoscape.js when I want to draw a molecular network. A molecular network can be built from similarity, matched molecular pairs, etc. Cytoscape has a plugin named ‘chemviz’ for drawing chemical structures, but there is no such plugin for cytoscape.js. So a function is needed that draws chemical structures as…
Visualize HOMO LUMO with psi4 #RDKit #psi4 #psikit
The thin psi4 wrapper named psikit can now generate cube files that contain frontier orbital information. After calling getMOview, I wanted to check the orbital shapes. Psi4 provides the vmd_cube.py script, which generates a cool view in VMD. To run the script on Python 3, line 332 needs to be changed from ‘for k,v in options.iteritems():’…
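The excerpt is cut off before the replacement line, so the following is only a general note and not the exact edit from the post: in Python 3, dictionaries no longer have `iteritems()`, and `items()` is the usual substitute. The `options` dictionary and its keys below are made up for illustration and are not taken from vmd_cube.py.

```python
# Hypothetical stand-in for the script's options dictionary.
options = {"isovalue": 0.05, "opacity": 1.0}

# Python 2 style (raises AttributeError on Python 3):
#     for k, v in options.iteritems():
# Python 3 style:
for k, v in options.items():
    print(k, v)
```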
Calculate atomic charges with psikit #RDKit #psi4
Recently I implemented a new function in psikit for atomic charge calculation. Now users can get Mulliken charges, RESP charges and Lowdin charges very easily. A quick example follows. First, import the libraries. I used methanol and imidazole for the demo. psikit can generate a 3D structure from SMILES and optimize the conformer with RDKit. Then…
Make interactive chemical space plot in jupyter notebook #cheminformatics #Altair
I often use seaborn for data visualization. With it, users can make beautiful visualizations. BTW, today I tried another library that can make interactive plots in Jupyter notebook. The library is named ‘altair’. https://altair-viz.github.io/index.html The library can be installed with pip or conda, and it is based on Vega and Vega-Lite. Vega…
Build stacking Classification QSAR model with mlxtend #chemoinformatics #mlxtend #RDKit
I posted about the ML method named ‘blending’ some days ago, and a reader recommended that I try “mlxtend”. I had found it when I was learning about ensemble learning packages in Python, but had never used it. So I tried the library to build a model. Mlxtend is easy to install and good documentation is provided…
Vote Vote Vote #chemoinformatics
Some days ago, I posted about the ensemble classification method named ‘blending’. The method is not implemented in scikit-learn, so I am implementing the function now. By the way, scikit-learn has an ensemble classification method named ‘VotingClassifier’. https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.VotingClassifier.html#sklearn.ensemble.VotingClassifier The following explanation is from the sklearn documentation: The idea behind the VotingClassifier is to combine conceptually different machine learning classifiers and…
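To give a feel for what voting means, here is a small sketch of majority (“hard”) voting in plain Python. It is not scikit-learn’s implementation and does not use the VotingClassifier API; the function name `majority_vote` and the toy predictions are assumptions for illustration only.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine label predictions from several classifiers by majority vote.

    predictions: list of label lists, one list per classifier,
    all of the same length (one label per sample).
    """
    voted = []
    for labels in zip(*predictions):
        # Pick the most frequent label among the classifiers for this sample.
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted


# Toy predictions from three hypothetical classifiers on four samples.
clf_a = [1, 0, 1, 1]
clf_b = [1, 1, 0, 1]
clf_c = [0, 0, 1, 1]

ensemble = majority_vote([clf_a, clf_b, clf_c])
print(ensemble)
```

With an odd number of binary classifiers there are no ties; handling ties or weighting classifiers would need extra rules that this sketch leaves out.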
Visualize pharmacophore in RDKit #RDKit
RDKit has a pharmacophore feature assignment function. The function can retrieve molecular features based on pre-defined pharmacophore (ph4core) definitions. And the RDKit IPythonConsole can draw molecules in an IPython notebook. Today I tried to visualize the ph4core in a notebook. The code is very simple. First, load the feature definitions. Then calculate the pharmacophore and compute 2D coords. Next I defined a drawing function. To highlight…
Generate possible list of SMILES with RDKit #RDKit
In computer vision, data augmentation is often used to obtain a large data set. In chemoinformatics, on the other hand, canonical SMILES representations are used. At last year’s RDKit UGM, Dr. Esben proposed a new approach for RNNs with SMILES. He expanded 602 training molecules to almost 8000 molecules with different SMILES representations…
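One way to get several SMILES strings for the same molecule is RDKit’s `doRandom` option of `Chem.MolToSmiles`. The snippet below is a minimal sketch assuming RDKit is installed. It is not necessarily the method used in the post or in the talk, and the helper name `enumerate_smiles` and the example molecule are my own choices.

```python
from rdkit import Chem


def enumerate_smiles(smiles, n_tries=100):
    """Collect distinct randomized SMILES for one molecule (assumed helper)."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Could not parse SMILES: {smiles}")
    variants = set()
    for _ in range(n_tries):
        # doRandom=True randomizes the atom traversal order of the output.
        variants.add(Chem.MolToSmiles(mol, canonical=False, doRandom=True))
    return sorted(variants)


# Benzoic acid as a small example molecule.
variants = enumerate_smiles("c1ccccc1C(=O)O")
for smi in variants:
    print(smi)
```

The output is random, so the number of distinct strings changes from run to run, but every string should parse back to the same molecule.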
Tracking progress of machine learning #MachineLearning
Machine learning requires optimizing hyperparameters. For example, scikit-learn provides a grid search method, and there are several other packages for this, such as hyperopt or GPyOpt. How do you manage the models you have built? It is difficult for me. Recently I have become interested in MLflow. MLflow is an…
Ensemble learning with scikit-learn and XGBoost #machine learning
I often post about deep learning topics, but today I would like to post about ensemble learning. There are lots of documents that describe ensemble learning, and I find the following one very informative: Kaggle Ensembling Guide. I am interested in one of the methods, named ‘blending’. According to the above URL, the merit of ‘blending’…