Coloring molecule with RESP charge on pymol #Pymol

To visualize 3D structure of molecule, PyMol is nice tool. What I would like to write on the post is how to visualize calculated RESP charge on pymol ;)
One idea is embed calculated RESP charge to b_factor of molecule pdb. PDB file can make easily from rdkit mol object.
A problem for me is how to edit b_factor of PDB file. I found good tool to solve it today, python package ‘BioPandas’! It can install from pypi or anaconda.
By using biopandas, user can handle pdb file like pandas dataframe.

At first I would like to show examples of biopandas. If you often working with PDB you feel it is very useful I think.
Following code is an example for loading PDB and retrieve protein and ligand data as pandas dataframe.

from biopandas.pdb import PandasPdb
pbdobj = PandasPdb('1atp.pdb')
#Check Atom data
#Check ligand data

Now I can access pdb file as pandas data frame, it means that it is easy to edit and analyze pdb data! More examples are described in official site. So I move to next.

Following example is …
Calculate RESP charge with Psikit and then make pdf file from rdkit mol object and finally replace the b_factor to calculated RESP charge.
At frist load packages.

import os
import sys
from biopandas.pdb import PandasPdb
import pandas as pd
sys.path.append('path for /psikit')
from psikit import Psikit

Then define the function which make pdb file.

def make_pdb_with_Resp(smi, fname=None):
    pk = Psikit()
    charge = pk.resp_charge
    RESP = charge['Restrained Electrostatic Potential Charges']
    if fname == None:
        fname = 'mol'
        fname = fname
    Chem.MolToPDBFile(pk.mol, '{}.pdb'.format(fname))
    pdbobj = PandasPdb().read_pdb('{}.pdb'.format(fname))
   # Following step is editing state, you can see the approach like a pandas method ;)
    pdbobj.df['HETATM']['b_factor'] = RESP

Run the function! After running the code, I could get two files named aceticacid.pdb and tetrazole.pdb

It can load from pymol.


Now I would like to color the molecule with edited b_factor. So I use spectrum command on pymol command line.

spectrum b, blue_red, minimum=-1., maximum=1.
Acetic acid with RESP charge

Acetic acid has negative charge on two oxygen atoms. And carbonyl carbon has the most positive charge.

Tetrazole with RESP charge

The result of tetrazole is strange for me because the nitrogen at position 2 which has hydrogen shows positive. And the nitrogen at position 1 is the most negative. I investigated two protonated state of the tetrazole, 1H-tetrazole and 2H-tetrazole.
But both results shows the nitrogen atom which has hydrogen has positive charge than other nitrogen atoms. I found that BioPandas is useful. But I could not get solution for the last problem. Hmm…

tensorboard embeddings + RDKit #RDKit

Mainly I use Keras for deep learning. Because Keras is easy to use and easy to understand for me.
Keras has callback function to call tensorboard. But It has difficulties in use tensorboard embeddings.
You know, tensorboard embeddings is unique function to visualize future of word vectors.
I want to use tensorboard embeddings for visualization of chemical space.
Basic introduction of embeddings is described in following URL.

I referred following URL and changed some lines.

Following code will read SDF and calculate Fingerprints and perform PCA or t-SNE.
And the results can view via tensorboard. Fortunately, RDKit has MolsToGridImage function. The function is useful to make spriteimage for embeddings !!!

mport numpy as np
import pandas as pd
import sys
import argparse
import os
from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Chem import DataStructs
from rdkit.Chem import Draw

import tensorflow as tf
from tensorflow.contrib.tensorboard.plugins import projector

FLAGS = None

def getFpArr( mols, nBits = 1024 ):
    fps = [ AllChem.GetMorganFingerprintAsBitVect( mol, 2, nBits=nBits ) for mol in mols ]
    X = []
    for fp in fps:
        arr = np.zeros( (1,) )
        DataStructs.ConvertToNumpyArray( fp, arr )
        X.append( arr )
    return np.array( X )

def getResponse( mols, prop="ACTIVITY" ):
    Y = []
    for mol in mols:
        act = mol.GetProp( prop )
        act = 9. - np.log10( float( act ) )
        if act >= 6:
            Y.append(np.asarray( [1,0] ))
            Y.append(np.asarray( [0,1] ))
    return np.asarray( Y )

def generate_embeddings():
    sdf = Chem.SDMolSupplier( FLAGS.sdf )
    X = getFpArr( [ mol for mol in sdf ]  )
    sess = tf.InteractiveSession()
    with tf.device( '/cpu:0' ):
        embedding = tf.Variable( tf.stack( X[:], axis=0 ), trainable=False, name='embedding' )
    saver = tf.train.Saver()
    writer = tf.summary.FileWriter( FLAGS.log_dir+'/projector', sess.graph )
    config = projector.ProjectorConfig()
    embed = config.embeddings.add()
    embed.tensor_name = 'embedding:0'
    embed.metadata_path = os.path.join( FLAGS.log_dir + '/projector/metadata.tsv' )
    embed.sprite.image_path = os.path.join( FLAGS.data_dir + '/mols.png' )
    embed.sprite.single_image_dim.extend( [100, 100] )
    projector.visualize_embeddings( writer, config ) sess, os.path.join(FLAGS.log_dir, 'projector/amodel.ckpt'), global_step=len(X) )
def generate_metadata_file():
    sdf = Chem.SDMolSupplier( FLAGS.sdf )
    Y = getResponse( [ mol for mol in sdf ])
    def save_metadata( file ):
        with open( file, 'w' ) as f:
            for i in range( Y.shape[0] ):
                c = np.nonzero( Y[i] )[0][0]
                f.write( '{}\t{}\n'.format( i, c ))
    save_metadata( FLAGS.log_dir + '/projector/metadata.tsv' )

def main(_):
    if tf.gfile.Exists( FLAGS.log_dir+'/projector' ):
        tf.gfile.DeleteRecursively( FLAGS.log_dir+'/projector' )
        tf.gfile.MkDir( FLAGS.log_dir + '/projector' )
    tf.gfile.MakeDirs( FLAGS.log_dir + '/projector' )

if __name__=='__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument( '--sdf', type=str )
    parser.add_argument( '--log_dir', type=str, default='/Users/iwatobipen/develop/py35env/testfolder/tensorflowtest/mollog' )
    parser.add_argument( '--data_dir', type=str, default='/Users/iwatobipen/develop/py35env/testfolder/tensorflowtest/mollog')

    FLAGS, unparsed = parser.parse_known_args()
    sdf = [ mol for mol in Chem.SDMolSupplier( FLAGS.sdf ) ]
    im = Draw.MolsToGridImage( sdf, molsPerRow=10, subImgSize=( 100, 100 )) os.path.join( FLAGS.data_dir + '/mols.png' )) main=main, argv=[sys.argv[0]] + unparsed )

To run the code.

$ python --sdf your.sdf

Then launch tensorboard and access localhost:6006.

$ tensorboard --logdir your_log_dir

Then, I could get following image.
This image is results of PCA, but also it can perform t-SNE analysis.

I pushed my code to my repo.
Tensorflow has cool function ;-).

Handle pymol via CUI.

I often use Pymol to visualize PDB files.
Recently I want to merge some PDB files in one Pymol session file from CUI.
Because I run the task as batch. So I searched API document and tried it.
At first I need launch pymol in silent mode ( no GUI ).
And then load pdb files.
Next I set color of each object by b factor as spectrum.
Finally save object as pymol session file and closed pymol.
Every thing worked well.

Following code and sample files are pushed my repo.


import pymol
from pymol import cmd
cmd.spectrum('b', 'blue_white_red','1atp', 0, 100)
cmd.spectrum('b', 'yellow_cyan_blue','1atp2', 0, 100)
cmd.spectrum('b', 'green_magenta','1atp3', 0, 100)'somecolors.pse')

Compile LeapPython and Manipulate protein structure with Leap Motion.

I was interested in Kinect as input device, because Kinect can detect motion as input it feels feature. ;-)
And some days ago, I got new input device “Leap Motion”.
The Leap Motion controller is a small USB peripheral device which is designed to be placed on a physical desktop, facing upward. ( from wiki )
I want to use this device with PyMol!
Let’s try it.
At first to use Leap Motion from python, user need to get Leapmotion SDK. My environment is OSX, so I got SDK v2.3 from developer site.
Installation is little bit complicated for me, because I want to use LeapPython library from virtual environment but the library is linked with native python.
So, I build or change dynamic link following procedure.

In Python2.7, I changed link with following command.

iwatobipen$ cd /somepath/LeapSDK/lib
iwatobipen$ otool -L
	/Library/Frameworks/Python.framework/Versions/2.7/Python (compatibility version 2.7.0, current version 2.7.0)
	@loader_path/libLeap.dylib (compatibility version 0.7.0, current version 2.3.1)
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1197.1.1)
	/usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 120.0.0)
# change Python link from native to anaconda
iwatobipen$ install_name_tool -change /Library/Frameworks/Python.framework/Versions/2.7/Python /Users/iwatobipen/.pyenv/versions/anaconda-2.4.0/lib/libpython2.7.dylib

In Python3.5, I compiled using swig.
Newest swig will cause error, so I used swig version 3.0.3
I referred following url to do that.
OK Let’s go

# Copy Leap.h, LeapMath.h, Leap.i, and libLeap.dylib into one folder. And type following command.
# Generate LeapPython.cpp withswig -c++ -python -o LeapPython.cpp -interface LeapPython Leap.i
iwatobipen$ swig -c++ -python -o LeapPython.cpp -interface LeapPython Leap.i
# Compile and link 
clang++ -arch x86_64 -I/Users/iwatobipen/.pyenv/versions/anaconda-2.4.0/include/python2.7 LeapPython.cpp libLeap.dylib /Users/iwatobipen/.pyenv/versions/anaconda-2.4.0/lib/libpython2.7.dylib -shared -o

Now I got linked each python version in anaconda.

Unfortunately, I could not install pymol in python3.5. Following code was run in python2.7. I installed pymol by using conda install command.
Pymol Wiki provided sample code to use Leap Motion from pymol. I used the example. I added “path” which placed

import sys
import math
from pymol import cmd
sys.path.insert( 0, '/Users/iwatobipen/develop/py2env/LeapSDK/build/')

import Leap
from Leap import Matrix, Vector

class PymolListener(Leap.Listener):
    def __init__(self, *args, **kwargs):
        super(PymolListener, self).__init__(*args, **kwargs)

        self.prev_frame = None

        self.controller = Leap.Controller()

    def __del__(self):

        super(PymolListener, self).__del__()

    def update_view(self, frame):
        if not self.prev_frame:

        view = list(cmd.get_view())

        if frame.rotation_probability(self.prev_frame) > 0.1:
            m = frame.rotation_matrix(self.prev_frame)
            m *= Matrix(Vector(*view[0:3]),
            view[:9] = m.to_array_3x3()

        if frame.scale_probability(self.prev_frame) > 0.1:
            s = frame.scale_factor(self.prev_frame)
            delta_z = math.log(s) * 100.0
            view[11] += delta_z
            view[15] -= delta_z
            view[16] -= delta_z


    def on_frame(self, controller):
        frame = controller.frame()


        self.prev_frame = frame

listener = PymolListener()

Then run pymol and fetched sample pdb.

iwatobipen$ pymol
# from pymolconsole.
pymol> fetch 1atp
pymol> run

Result is …. It seems work well. Now, I am reading API reference to write code.

Call rdkit from pymol

PyMol is one of major tool for visualisation of protein, ligand, etc.
You know ‘Py’ means that pymol is written using python.
I want to embed pymol to web app but it’s difficult for me (technical reason..)
If anyone have advice to do it, please leave comment. ;-)

Next, I have one question that, pymol can call rdkit ?
If it possible, it will be good information for me.
Think drug design based 3D structure is key for design, and lots of tools for visualise 3D structures.
….But complicated tools is not familiar for medchem.(really ?)
I wrote simple test script to call rdkit from pymol.
Script is following.

 from rdkit import Chem
 from rdkit.Chem import AllChem
 from pymol import cmd
 #from __future__ import print_function
 mol = Chem.MolFromSmiles( 'CCCS(=O)(=O)Nc1ccc(F)c(c1F)C(=O)c2c[nH]c3c2cc(cn3    )c4ccc(Cl)cc4' )
 hmol = Chem.AddHs( mol )
 AllChem.EmbedMolecule( hmol )
 AllChem.MMFFOptimizeMolecule( hmol )    
 #print( hmol.GetNumConformers( ))
 pdbstring = Chem.MolToPDBBlock( hmol )
 cmd.read_pdbstr( pdbstring, 'vemrafenib' )

Next, run script via pymol

iwatobipen$ pymol 
 PyMOL(TM) Molecular Graphics System, Version
 Copyright (c) Schrodinger, LLC.
 All Rights Reserved.

Work fine.!
Screen Shot 2015-11-23 at 10.08.14 PM
I’ll research more details about pymol api.

I went to tobuzoo yesterday.
I and my family enjoyed.