Integration of Spotfire and PDB viewer

Some years ago at JCUP, I heard a presentation about implementing a PDB viewer in Spotfire. It really impressed me, because Spotfire cannot handle PDB files out of the box. As you know, Spotfire is one of the most popular tools for data visualization, and I like it.

Recently I found a unique library for Spotfire named ‘JSViz’. It is not a native library, but users can get it from the community site. With JSViz, Spotfire can communicate with JS libraries such as D3.js, Highcharts etc. 😉

Lots of examples are provided on the site.
I thought, “Hmm… If there is a PDB viewer written in JavaScript, I can implement a PDB viewer in Spotfire”.

So, I tried it.
First, I installed JSViz.
Then I wrote pdb_loader_script from a template, using pv.js as the PDB loader.
JSViz gets data from Spotfire as sfdata, which is in JSON format. Readers who need more details about the data structure should read the original documentation (or comment on this post).
My data format is as follows.
#, pdb_id, ligandname
1,1ATP, ANP

I passed pdb_id and ligandname to JSViz as sfdata.
My strategy is:
1. Build an external PDB supply server (a simple HTTP server written in Python).
2. Access the URL, get the PDB file from the server and render it (using JSViz).
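
The second step can be sketched in Python to show how the first row of sfdata turns into a request URL. The payload shape below is only inferred from the fields the script in this post reads (`data[i].items` and `data[i].hints.marked`); the sample payload and the helper name `pdb_url` are assumptions for illustration, not part of JSViz.

```python
import json

# Hypothetical sfdata payload, shaped after the fields the JSViz script reads.
SFDATA_JSON = """
{
  "columns": ["pdb_id", "ligandname"],
  "data": [
    {"items": ["1ATP", "ANP"], "hints": {"marked": false}}
  ]
}
"""

def pdb_url(sfdata, base="http://localhost:9000/"):
    """Return (URL of the PDB file, ligand name) for the first data row."""
    first_row = sfdata["data"][0]
    pdb_id, ligand_name = first_row["items"][0], first_row["items"][1]
    return base + pdb_id + ".pdb", ligand_name

sfdata = json.loads(SFDATA_JSON)
url, ligand = pdb_url(sfdata)
print(url, ligand)
```

The base URL matches the port used by the supply server in this post; any other host or port would work the same way.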

The following code is based on the JSViz sample code. It renders the protein as a cartoon and the ligand as balls and sticks.

/*
 Copyright (c) 2016 TIBCO Software Inc

 THIS SOFTWARE IS PROVIDED BY TIBCO SOFTWARE INC. ''AS IS'' AND ANY EXPRESS OR
 IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
 MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT
 SHALL TIBCO SOFTWARE BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
 EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
 SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
 OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
 SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/

//////////////////////////////////////////////////////////////////////////////
// #region Drawing Code

var pv = require("bio-pv");

//
//
// Main Drawing Method
//

function renderCore(sfdata)
{
    if (resizing) {
        return;
    }

    // Log entering renderCore
    log ( "Entering renderCore" );

    // Extract the columns
    var columns = sfdata.columns;
    // Extract the data array section
	var chartdata = sfdata.data;

    // count the marked rows in the data set, needed later for marking rendering logic
    var markedRows = 0;
    for (var i = 0; i < chartdata.length; i++) {
        if (chartdata[i].hints.marked) {
            markedRows = markedRows + 1;
        }
    }
    var width = window.innerWidth;
    var height = window.innerHeight;

    //
    // Replace the following code with actual Visualization code
    // This code just displays a summary of the data passed in to renderCore
    //
    //displayWelcomeMessage ( document.getElementById ( "viewer" ), sfdata );

    displaypdb(document.getElementById('js_chart'), chartdata);
    wait ( sfdata.wait, sfdata.static );
};

//
// #endregion Drawing Code
//////////////////////////////////////////////////////////////////////////////

//////////////////////////////////////////////////////////////////////////////
// #region Marking Code
//

//
// This method receives the marking mode and marking rectangle coordinates
// on mouse-up when drawing a marking rectangle
//
function markModel(markMode, rectangle)
{
	// Implementation of logic to call markIndices or markIndices2 goes here
}

//
// Legacy mark function 2014 HF2
//
function mark(event)
{
}

//
// #endregion Marking Code
//////////////////////////////////////////////////////////////////////////////

//////////////////////////////////////////////////////////////////////////////
// #region Resizing Code
//

var resizing = false;

window.onresize = function (event) {
    resizing = true;
    if ($("#js_chart")) {
    }
    resizing = false;
};

//
// #endregion Resizing Code
//////////////////////////////////////////////////////////////////////////////

//
// Fetch the PDB file for the first row from the local server and render
// the protein as a cartoon and the ligand as balls and sticks with pv.js.
//

function displaypdb( div, chartdata ){
    // Create the container element that pv.js renders into.
    div.innerHTML = "<div id='viewer'>pdb</div>";
    var options = {
      background: 'lightgrey',
      width: 800,
      height: 600,
      antialias: true,
      quality : 'medium'
    };
    // Insert the viewer under the DOM element with id 'viewer'.
    var viewer = pv.Viewer(document.getElementById('viewer'), options);
    // The first row carries pdb_id and ligandname, in that order.
    var pdb_id = chartdata[0].items[0];
    var ligand_name = chartdata[0].items[1];
    var url = 'http://localhost:9000/' + pdb_id + '.pdb';
    $.ajax( url )
    .done(function(data) {
        var structure = pv.io.pdb(data);
        // Select the ligand by residue name.
        var ligand = structure.select({rnames : [ ligand_name ]});
        viewer.ballsAndSticks('ligand', ligand);

        viewer.cartoon('protein', structure, { color : pv.color.ssSuccession() });
        viewer.centerOn(structure);
    });
};

Next, the following code is a simple HTTP server. Place the PDB files to be served in the same folder.

import os
import sys
import http.server
import socketserver
PORT = 9000
class HTTPRequestHandler(http.server.SimpleHTTPRequestHandler):
    def end_headers(self):
        self.send_header('Access-Control-Allow-Origin', '*')
        http.server.SimpleHTTPRequestHandler.end_headers(self)

def server(port):
    httpd = socketserver.TCPServer(('', port), HTTPRequestHandler)
    return httpd

if __name__ == "__main__":
    port = PORT
    httpd = server(port)
    try:
        httpd.serve_forever()
    except KeyboardInterrupt:
        print("\n...shutting down http server")
        httpd.shutdown()
        sys.exit()
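
To check that such a server really sends the `Access-Control-Allow-Origin` header (the browser-side request from JSViz needs it), a small self-test can serve one file from a temporary folder and fetch it back. This is only a sketch: the class name `CORSRequestHandler`, the helper `fetch_with_headers` and the dummy file content are assumptions for illustration, and it binds to a free port rather than 9000 so it does not collide with a running server.

```python
import http.server
import os
import tempfile
import threading
import urllib.request
from functools import partial

class CORSRequestHandler(http.server.SimpleHTTPRequestHandler):
    def end_headers(self):
        # Allow pages from any origin to read the served files.
        self.send_header("Access-Control-Allow-Origin", "*")
        super().end_headers()

    def log_message(self, format, *args):
        pass  # keep the demo quiet

def fetch_with_headers(filename, content):
    """Serve one file from a temporary folder and fetch it back over HTTP."""
    with tempfile.TemporaryDirectory() as folder:
        with open(os.path.join(folder, filename), "w") as fh:
            fh.write(content)
        handler = partial(CORSRequestHandler, directory=folder)
        # Port 0 asks the OS for any free port.
        httpd = http.server.HTTPServer(("127.0.0.1", 0), handler)
        thread = threading.Thread(target=httpd.serve_forever, daemon=True)
        thread.start()
        try:
            url = "http://127.0.0.1:%d/%s" % (httpd.server_address[1], filename)
            # Bypass any configured proxy for the loopback request.
            opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
            with opener.open(url) as response:
                body = response.read().decode()
                cors = response.headers["Access-Control-Allow-Origin"]
            return body, cors
        finally:
            httpd.shutdown()
            httpd.server_close()

body, cors = fetch_with_headers("1ATP.pdb", "HEADER    DUMMY STRUCTURE\n")
print(cors)
```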

This is a very brief introduction. JSViz can also receive user events, such as clicking a ligand or a residue.
It seems very interesting. But do I need to develop a new visualization in Spotfire? ;-p

ref
https://community.tibco.com/wiki/javascript-visualization-framework-jsviz-and-tibco-spotfire
https://biasmv.github.io/pv/

Get distance matrix via TIBCO Spotfire.

I often calculate molecular distance matrices or similarity matrices.
A distance matrix is useful for visualising molecular similarity, but sometimes it is a bother to calculate.
Today I wrote a data function that calculates the distance matrix for a set of molecules.
There is no native function to calculate it in Spotfire, and rcdk cannot be installed in TERR.
So I used RinR. With RinR, Spotfire can use functions from the local R environment.
I set the input parameter ‘smiles’ as a column and the output parameter ‘res’ as a table.
Then I registered the following data function on the TS server.

library( RinR )
res <- REvaluate(
         { library( rcdk );
           mols <- parse.smiles( smiles );
           fps <- lapply( mols, get.fingerprint, type='extended' );
           fp.sim <- fp.sim.matrix( fps, method='tanimoto' );
           dist <- 1 - fp.sim;
           dist.df <- data.frame( smiles, dist );
           res <- dist.df;
          }, data = 'smiles'
)

I got the distance matrix when I ran the function.
RinR is a very flexible way to extend the functionality of Spotfire.
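
For readers who want to see what the `1 - fp.sim` step amounts to, here is a minimal Python sketch of a Tanimoto distance matrix. It does not use rcdk; fingerprints are represented as plain sets of on-bit positions, and the three toy fingerprints are made up for illustration.

```python
def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit positions."""
    union = len(a | b)
    # Assumption: two empty fingerprints are treated as identical.
    return len(a & b) / union if union else 1.0

def distance_matrix(fingerprints):
    """Pairwise distance, defined as 1 - Tanimoto similarity."""
    return [[1.0 - tanimoto(a, b) for b in fingerprints] for a in fingerprints]

# Toy fingerprints (hypothetical on-bit positions, not real molecules).
fps = [{1, 2, 3, 4}, {1, 2, 3, 5}, {7, 8}]
dist = distance_matrix(fps)
for row in dist:
    print(row)
```

The first two fingerprints share 3 of 5 distinct bits, so their distance is 0.4; the third shares no bits with the others, so its distance to them is 1.0.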

Calculate PK param. using TIBCO Spotfire?

Some days ago, my colleague asked me how to calculate PK parameters in TIBCO Spotfire.
Hmm. That sounds difficult, because Spotfire is not a tool for analysing DMPK data.
But there are some packages on CRAN for DMPK, so I thought some PK parameters could be calculated using TERR.
Let’s do it. 😉
First, I installed the ‘PK’ package from source into TERR.

Installing the ‘PK’ package is easy.
Step 1: download “PK” from http://cran.r-project.org/web/packages/PK/index.html, and put the zip file into your TERR folder.
Step 2: launch the TERR command prompt, and type install.packages('PK').
That’s all!
If it succeeds, you can load the PK package with library('PK').

Next, register the data function in Spotfire.
# I wrote this on my MacBook (Spotfire is not installed on it), so there are no screenshots. #
The following function is very simple. It calculates AUC, MRT, etc. using non-compartmental estimation.
I named the function ‘pk_nan’.

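library('PK')
# Non-compartmental analysis for a serial sampling design ("ssd").
res <- nca(conc, time, n.tail=3, dose=0, method="z", conf.level=0.95, nsample=1000, design="ssd" )
# Return the estimates and confidence intervals as a data table.
result <- data.frame( res$CIs )
result['name'] <- row.names(result)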

Then set the input data …
conc => column( real )
time => column( real )
Also set the output data …
result => datatable # the user gets the result as a new data table.

Finally, run a test.
Prepare sample data in CSV or Excel format and load it into Spotfire.
e.g.
time 1, 1, 2, 2, 4, 4, 8, 8, 24, 24
conc 2790, 3280, 4980, 7550, 5500, 6650, 2250, 3220, 213, 636

Then insert the data function.
The new data table should contain the following data.
est
95% CI for the AUC to tlast using a z distribution 5.988600e+04
95% CI for the AUC to infinity using a z distribution 6.301882e+04
95% CI for the AUMC to infinity using a z distribution 4.918245e+05
95% CI for the Mean residence time using a z distribution 7.804406e+00
95% CI for the non-compartmental half-life using a z distribution 5.409602e+00
95% CI for the Clearance using a z distribution 0.000000e+00
95% CI for the Volume of distribution at steady state using a z distribution 0.000000e+00
….
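
As a sanity check on the first estimate, the AUC to tlast can be recomputed by hand. The sketch below averages the concentrations at each time point and applies the linear trapezoidal rule; this is my reading of how the estimate arises for this design, not a description of the PK package internals, and the function name `auc_to_tlast` is made up for illustration.

```python
from collections import defaultdict

def auc_to_tlast(times, concs):
    """Linear trapezoidal AUC over the mean concentration at each sampling time."""
    by_time = defaultdict(list)
    for t, c in zip(times, concs):
        by_time[t].append(c)
    ts = sorted(by_time)
    means = [sum(by_time[t]) / len(by_time[t]) for t in ts]
    # Sum the trapezoid areas between consecutive time points.
    return sum((t1 - t0) * (c0 + c1) / 2.0
               for t0, t1, c0, c1 in zip(ts, ts[1:], means, means[1:]))

time = [1, 1, 2, 2, 4, 4, 8, 8, 24, 24]
conc = [2790, 3280, 4980, 7550, 5500, 6650, 2250, 3220, 213, 636]
auc = auc_to_tlast(time, conc)
print(auc)  # 59886.0
```

For the sample data in this post the result is 59886.0, which agrees with the 5.988600e+04 reported for the AUC to tlast.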

Summary.
TIBCO Spotfire can calculate PK parameters indirectly using TERR.
I’m not in charge of DMPK, and I think WinNonlin or other tools are normally used for DMPK analysis.
So this usage is limited to testing.

Implementation of machine learning in Spotfire.

Today I wrote functions that predict a molecular property using e1071.
The following code is almost pure R, but it gets its data from Spotfire.
So users don’t need to think about R coding. They can build a model and predict data using only Spotfire.

First, I got sample data from the Bursi Mutagenicity Dataset.
I converted the SDF file to SMILES format using RDKit, because I calculate fingerprints with rcdk from Spotfire.

Let’s build a model and predict data.
First, upload the data containing SMILES and the AMES categorisation to the library.

Second, register the data function.
The function builds a model from the SMILES and categorisation data, saves the model in a temp folder and returns the test result.
“inTable” is the uploaded data mentioned above.
Of course, to do that, the user needs R with rcdk and e1071.
To use the following data function, the user needs to set input=>inTable(table), output=>outTable(table).

The code is as follows….

library( RinR )
outTable <- REvaluate({
                      library( rcdk );
                      library( e1071 );
                      inTable$CATEGORIATION <- as.factor(inTable$CATEGORIATION);
                      inTable$SMILES <- as.character(inTable$SMILES);
                      mols <- lapply(inTable$SMILES, parse.smiles);
                      cmp.fp <- vector("list", nrow(inTable));
                      for (i in 1: nrow(inTable)){
                                            cmp.fp[i] <- lapply(mols[[i]][1], get.fingerprint, type="circular")
                                            };
                      fp.matrix <- fp.to.matrix(cmp.fp);
                      cmp.fingerprint <- as.data.frame(fp.matrix);
                      dataset <- cbind(cmp.fingerprint, inTable$CATEGORIATION);
                      colnames(dataset)[1025] <- "RESPONSE";
                      train <- sample(dim(dataset)[1],3000);
                      test<-c( 1:nrow(dataset) )[-train];
                      train_data<-dataset[train,];
                      test_data<-dataset[test,];
                      model <- svm(RESPONSE ~., data=train_data);
                      #write.svm(model, svm.file = "c:/temp/svmdata.svm", scale.file = "c:/temp/svmdata.scale");
                      save(model,file="c:/temp/svmdata.svm");
                      res <- predict(model, test_data);
                      res_mat <- data.frame(res, test_data$RESPONSE);
                      colnames(res_mat)<-c("pred","experimental");
                      outTable <- data.frame(table(res_mat));
                     }, data="inTable")
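
The bookkeeping around the model (the random train/test split and the final `table(res_mat)` cross-tabulation) can be sketched in plain Python as follows. No SVM is trained here; the function names, the class labels and the tiny example lists are assumptions for illustration only.

```python
import random
from collections import Counter

def split_indices(n_rows, n_train, seed=0):
    """Randomly pick n_train row indices for training; the rest become the test set."""
    rng = random.Random(seed)
    train = sorted(rng.sample(range(n_rows), n_train))
    train_set = set(train)
    test = [i for i in range(n_rows) if i not in train_set]
    return train, test

def confusion_table(predicted, experimental):
    """Count each (predicted, experimental) pair, like table() on two columns in R."""
    return Counter(zip(predicted, experimental))

train, test = split_indices(n_rows=10, n_train=7)

# Hypothetical predictions and experimental labels for four test compounds.
pred = ["mutagen", "mutagen", "nonmutagen", "nonmutagen"]
expt = ["mutagen", "nonmutagen", "nonmutagen", "nonmutagen"]
table = confusion_table(pred, expt)
for (p, e), count in sorted(table.items()):
    print(p, e, count)
```

Each row of the printed table corresponds to one row of outTable: a predicted class, an experimental class and how often that combination occurred in the test set.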

Third, register another data function to predict the category from SMILES.
This data function takes the SMILES column you want to predict, and returns a column of predicted categories.
So the user needs to set input=>inCol (column) and output=>outCol(column).
The function loads the model stored in the temp folder and predicts the data.

library( RinR )
outCol <- REvaluate({
                      library( rcdk );
                      library( e1071 );
                      inCol <- as.character(inCol);
                      mols <- lapply( as.list(inCol), parse.smiles );
                      cmp.fp <- vector("list", length(inCol));
                      for (i in 1:length(inCol) ){
                                            cmp.fp[i] <- lapply(mols[[i]][1], get.fingerprint, type="circular")
                                            };
                      fp.matrix <- fp.to.matrix(cmp.fp);
                      cmp.fingerprint <- as.data.frame(fp.matrix);
                      load("c:/temp/svmdata.svm");
                      outCol <- predict( model , cmp.fingerprint );
                      }, data="inCol")

I think TERR and RinR are useful not only for comp chem but also for med chem, because users don’t need any coding to use a data function.

I uploaded the sample code to my GitHub. 😉
https://github.com/iwatobipen/R_QSARtest

ggplot with Spotfire

ggplot2 is a nice tool for visualising data in R.
TIBCO Spotfire does not have a density plot function.
So I implemented a density plot in Spotfire.
I did it using the RGraph function of RinR and ggplot2.

* This is a simple sample.
In the Spotfire data function, I wrote the following…


library(RinR);
outPNG <- RGraph(
             print(
                  ggplot(aes_string(x=idx),data=inTable)+geom_density()
                   ),
             packages="ggplot2",
             data=c("idx","inTable"), display=FALSE
       )

“idx” is the name of the column you want to plot.
In Spotfire’s input tab, I set “inTable” as a table (the data to visualise), and “idx” as the column name (for the density plot).
Then in the output tab, I set “outPNG” as a value.
Create a text area and insert a label property control for the binary property named “outPNG”.
All preparation is done.

Run the script.
The density plot will appear in the text area.
I can’t upload a screenshot, because I’m posting this from my personal PC. 😉
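
In place of a screenshot, here is a rough Python sketch of the curve a density plot draws: a Gaussian kernel density estimate with a rule-of-thumb bandwidth (Silverman's rule, as in R's `bw.nrd0`). It is not the ggplot2 code path itself, the sample values are made up, and the quartiles come from Python's `statistics.quantiles`, which may differ slightly from R's in edge cases.

```python
import math
import statistics

def nrd0_bandwidth(values):
    """Rule-of-thumb bandwidth: 0.9 * min(sd, IQR / 1.34) * n^(-1/5)."""
    sd = statistics.stdev(values)
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    spread = min(sd, (q3 - q1) / 1.34) or sd  # fall back to sd if the IQR is zero
    return 0.9 * spread * len(values) ** -0.2

def gaussian_kde(values, grid, bandwidth):
    """Evaluate the Gaussian kernel density estimate at every grid point."""
    norm = len(values) * bandwidth * math.sqrt(2.0 * math.pi)
    return [sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values) / norm
            for x in grid]

# Hypothetical values of the column to plot.
values = [4.9, 5.1, 5.4, 5.8, 6.0, 6.3, 6.7, 7.1]
bw = nrd0_bandwidth(values)

# Evaluate on a grid that extends four bandwidths beyond the data range.
lo, hi, n_points = min(values) - 4 * bw, max(values) + 4 * bw, 400
step = (hi - lo) / (n_points - 1)
grid = [lo + i * step for i in range(n_points)]
density = gaussian_kde(values, grid, bw)

# A density should integrate to roughly 1 over the grid.
area = sum(density) * step
print(round(bw, 3), round(area, 3))
```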

Use RinR

TIBCO Spotfire is a tool for data visualisation.
I think TERR, the “TIBCO Enterprise Runtime for R”, is cool. Together with RinR, it allows us to develop with open source R.
Development is possible not only on the server side but also on the client side.
So I can use my local R environment to make a data function.
If I use RinR, I don’t need to think about the server-side R environment.
I wrote sample code for machine learning using my local e1071.
To use RinR, the “REvaluate” function is needed for communication between local R and TERR.

* This is a simple sample.
It builds an SVM model from the input dataset and predicts the class for the same table.
Like this….

library(RinR);
output <- REvaluate({ library(e1071);
                     table1 <- inTable; # read data for training.
                     target <- factor( target ); # i.e. Species in iris-dataset.
                     table1$target <- target; # join data
                     model.svm <- svm( target ~., data = table1 ); # make classification model.
                     res <- predict( model.svm, inTable );
                     output <- data.frame( table1, res ); # return the result as another data table.
                   }
                    , data=c("inTable", "target") # arguments passed to the local R env
               )

In Spotfire’s input tab, I set “inTable” as a table (the data used for prediction), and “target” as a column (the data for classification).
Then in the output tab, I set “output” as a table.

All preparation is done.
Run the script.
I used the iris dataset for the test, and got the same result as with R.
I think it’s a useful feature, because the data function is stored in the library and any other user can use it without coding.
But developing such a function needs some tricky techniques, I think.
If anyone knows better practices, please advise me ;-).