tensorboard embeddings + RDKit #RDKit

Mainly I use Keras for deep learning. Because Keras is easy to use and easy to understand for me.
Keras has callback function to call tensorboard. But It has difficulties in use tensorboard embeddings.
You know, tensorboard embeddings is unique function to visualize future of word vectors.
I want to use tensorboard embeddings for visualization of chemical space.
Basic introduction of embeddings is described in following URL.

I referred following URL and changed some lines.

Following code will read SDF and calculate Fingerprints and perform PCA or t-SNE.
And the results can view via tensorboard. Fortunately, RDKit has MolsToGridImage function. The function is useful to make spriteimage for embeddings !!!

mport numpy as np
import pandas as pd
import sys
import argparse
import os
from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Chem import DataStructs
from rdkit.Chem import Draw

import tensorflow as tf
from tensorflow.contrib.tensorboard.plugins import projector

FLAGS = None

def getFpArr( mols, nBits = 1024 ):
    fps = [ AllChem.GetMorganFingerprintAsBitVect( mol, 2, nBits=nBits ) for mol in mols ]
    X = []
    for fp in fps:
        arr = np.zeros( (1,) )
        DataStructs.ConvertToNumpyArray( fp, arr )
        X.append( arr )
    return np.array( X )

def getResponse( mols, prop="ACTIVITY" ):
    Y = []
    for mol in mols:
        act = mol.GetProp( prop )
        act = 9. - np.log10( float( act ) )
        if act >= 6:
            Y.append(np.asarray( [1,0] ))
            Y.append(np.asarray( [0,1] ))
    return np.asarray( Y )

def generate_embeddings():
    sdf = Chem.SDMolSupplier( FLAGS.sdf )
    X = getFpArr( [ mol for mol in sdf ]  )
    sess = tf.InteractiveSession()
    with tf.device( '/cpu:0' ):
        embedding = tf.Variable( tf.stack( X[:], axis=0 ), trainable=False, name='embedding' )
    saver = tf.train.Saver()
    writer = tf.summary.FileWriter( FLAGS.log_dir+'/projector', sess.graph )
    config = projector.ProjectorConfig()
    embed = config.embeddings.add()
    embed.tensor_name = 'embedding:0'
    embed.metadata_path = os.path.join( FLAGS.log_dir + '/projector/metadata.tsv' )
    embed.sprite.image_path = os.path.join( FLAGS.data_dir + '/mols.png' )
    embed.sprite.single_image_dim.extend( [100, 100] )
    projector.visualize_embeddings( writer, config )
    saver.save( sess, os.path.join(FLAGS.log_dir, 'projector/amodel.ckpt'), global_step=len(X) )
def generate_metadata_file():
    sdf = Chem.SDMolSupplier( FLAGS.sdf )
    Y = getResponse( [ mol for mol in sdf ])
    def save_metadata( file ):
        with open( file, 'w' ) as f:
            for i in range( Y.shape[0] ):
                c = np.nonzero( Y[i] )[0][0]
                f.write( '{}\t{}\n'.format( i, c ))
    save_metadata( FLAGS.log_dir + '/projector/metadata.tsv' )

def main(_):
    if tf.gfile.Exists( FLAGS.log_dir+'/projector' ):
        tf.gfile.DeleteRecursively( FLAGS.log_dir+'/projector' )
        tf.gfile.MkDir( FLAGS.log_dir + '/projector' )
    tf.gfile.MakeDirs( FLAGS.log_dir + '/projector' )

if __name__=='__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument( '--sdf', type=str )
    parser.add_argument( '--log_dir', type=str, default='/Users/iwatobipen/develop/py35env/testfolder/tensorflowtest/mollog' )
    parser.add_argument( '--data_dir', type=str, default='/Users/iwatobipen/develop/py35env/testfolder/tensorflowtest/mollog')

    FLAGS, unparsed = parser.parse_known_args()
    sdf = [ mol for mol in Chem.SDMolSupplier( FLAGS.sdf ) ]
    im = Draw.MolsToGridImage( sdf, molsPerRow=10, subImgSize=( 100, 100 ))
    im.save( os.path.join( FLAGS.data_dir + '/mols.png' ))
    tf.app.run( main=main, argv=[sys.argv[0]] + unparsed )

To run the code.

$ python tensormolembedding.py --sdf your.sdf

Then launch tensorboard and access localhost:6006.

$ tensorboard --logdir your_log_dir

Then, I could get following image.
This image is results of PCA, but also it can perform t-SNE analysis.

I pushed my code to my repo.
Tensorflow has cool function ;-).


Handle pymol via CUI.

I often use Pymol to visualize PDB files.
Recently I want to merge some PDB files in one Pymol session file from CUI.
Because I run the task as batch. So I searched API document and tried it.
At first I need launch pymol in silent mode ( no GUI ).
And then load pdb files.
Next I set color of each object by b factor as spectrum.
Finally save object as pymol session file and closed pymol.
Every thing worked well.

Following code and sample files are pushed my repo.


import pymol
from pymol import cmd
cmd.spectrum('b', 'blue_white_red','1atp', 0, 100)
cmd.spectrum('b', 'yellow_cyan_blue','1atp2', 0, 100)
cmd.spectrum('b', 'green_magenta','1atp3', 0, 100)

Compile LeapPython and Manipulate protein structure with Leap Motion.

I was interested in Kinect as input device, because Kinect can detect motion as input it feels feature. 😉
And some days ago, I got new input device “Leap Motion”.
The Leap Motion controller is a small USB peripheral device which is designed to be placed on a physical desktop, facing upward. ( from wiki )
I want to use this device with PyMol!
Let’s try it.
At first to use Leap Motion from python, user need to get Leapmotion SDK. My environment is OSX, so I got SDK v2.3 from developer site.
Installation is little bit complicated for me, because I want to use LeapPython library from virtual environment but the library is linked with native python.
So, I build or change dynamic link following procedure.

In Python2.7, I changed link with following command.

iwatobipen$ cd /somepath/LeapSDK/lib
iwatobipen$ otool -L LeapPython.so
	/Library/Frameworks/Python.framework/Versions/2.7/Python (compatibility version 2.7.0, current version 2.7.0)
	@loader_path/libLeap.dylib (compatibility version 0.7.0, current version 2.3.1)
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1197.1.1)
	/usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 120.0.0)
# change Python link from native to anaconda
iwatobipen$ install_name_tool -change /Library/Frameworks/Python.framework/Versions/2.7/Python /Users/iwatobipen/.pyenv/versions/anaconda-2.4.0/lib/libpython2.7.dylib LeapPython.so

In Python3.5, I compiled LeapPython.so using swig.
Newest swig will cause error, so I used swig version 3.0.3
I referred following url to do that. https://support.leapmotion.com/hc/en-us/articles/223784048-Generating-a-Python-3-3-0-Wrapper-with-SWIG-2-0-9
OK Let’s go

# Copy Leap.h, LeapMath.h, Leap.i, and libLeap.dylib into one folder. And type following command.
# Generate LeapPython.cpp withswig -c++ -python -o LeapPython.cpp -interface LeapPython Leap.i
iwatobipen$ swig -c++ -python -o LeapPython.cpp -interface LeapPython Leap.i
# Compile and link 
clang++ -arch x86_64 -I/Users/iwatobipen/.pyenv/versions/anaconda-2.4.0/include/python2.7 LeapPython.cpp libLeap.dylib /Users/iwatobipen/.pyenv/versions/anaconda-2.4.0/lib/libpython2.7.dylib -shared -o LeapPython.so

Now I got LeapPython.so linked each python version in anaconda.

Unfortunately, I could not install pymol in python3.5. Following code was run in python2.7. I installed pymol by using conda install command.
Pymol Wiki provided sample code to use Leap Motion from pymol. I used the example. I added “path” which placed LeapPython.so.

import sys
import math
from pymol import cmd
sys.path.insert( 0, '/Users/iwatobipen/develop/py2env/LeapSDK/build/')

import Leap
from Leap import Matrix, Vector

class PymolListener(Leap.Listener):
    def __init__(self, *args, **kwargs):
        super(PymolListener, self).__init__(*args, **kwargs)

        self.prev_frame = None

        self.controller = Leap.Controller()

    def __del__(self):

        super(PymolListener, self).__del__()

    def update_view(self, frame):
        if not self.prev_frame:

        view = list(cmd.get_view())

        if frame.rotation_probability(self.prev_frame) > 0.1:
            m = frame.rotation_matrix(self.prev_frame)
            m *= Matrix(Vector(*view[0:3]),
            view[:9] = m.to_array_3x3()

        if frame.scale_probability(self.prev_frame) > 0.1:
            s = frame.scale_factor(self.prev_frame)
            delta_z = math.log(s) * 100.0
            view[11] += delta_z
            view[15] -= delta_z
            view[16] -= delta_z


    def on_frame(self, controller):
        frame = controller.frame()


        self.prev_frame = frame

listener = PymolListener()

Then run pymol and fetched sample pdb.

iwatobipen$ pymol
# from pymolconsole.
pymol> fetch 1atp
pymol> run test.py

Result is …. It seems work well. Now, I am reading API reference to write code.

Call rdkit from pymol

PyMol is one of major tool for visualisation of protein, ligand, etc.
You know ‘Py’ means that pymol is written using python.
I want to embed pymol to web app but it’s difficult for me (technical reason..)
If anyone have advice to do it, please leave comment. 😉

Next, I have one question that, pymol can call rdkit ?
If it possible, it will be good information for me.
Think drug design based 3D structure is key for design, and lots of tools for visualise 3D structures.
….But complicated tools is not familiar for medchem.(really ?)
I wrote simple test script to call rdkit from pymol.
Script is following.

 from rdkit import Chem
 from rdkit.Chem import AllChem
 from pymol import cmd
 #from __future__ import print_function
 mol = Chem.MolFromSmiles( 'CCCS(=O)(=O)Nc1ccc(F)c(c1F)C(=O)c2c[nH]c3c2cc(cn3    )c4ccc(Cl)cc4' )
 hmol = Chem.AddHs( mol )
 AllChem.EmbedMolecule( hmol )
 AllChem.MMFFOptimizeMolecule( hmol )    
 #print( hmol.GetNumConformers( ))
 pdbstring = Chem.MolToPDBBlock( hmol )
 cmd.read_pdbstr( pdbstring, 'vemrafenib' )

Next, run script via pymol

iwatobipen$ pymol pymolrdkit.py 
 PyMOL(TM) Molecular Graphics System, Version
 Copyright (c) Schrodinger, LLC.
 All Rights Reserved.

Work fine.!
Screen Shot 2015-11-23 at 10.08.14 PM
I’ll research more details about pymol api.

I went to tobuzoo yesterday.http://www.tobuzoo.com/
I and my family enjoyed.