Tag: pandas

Cytoscape and MMP

I previously posted an example of using cytoscape.js with molecular similarity.
Cytoscape.js is cool, but it does not handle large data sets well. I tried to draw a network with 1000 nodes and my PC froze…
So I went back to Cytoscape.
Today I built a network from matched molecular pair analysis (MMPA) data as an example.
If you have RDKit, generating matched molecular pairs (MMPs) is very easy.
First, I downloaded sample data from ChEMBL (JAK3 inhibitors) as a TSV file.
The RDKit scripts expect each input line to contain a SMILES string followed by an ID.
Let’s make it.
I used pandas to do it.

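import pandas as pd
# Load the tab-separated ChEMBL export; the first row holds the column names.
datatab = pd.read_table("bioactivity-14_7-19-38.txt", header=0)
# Check which columns are available.
datatab.columns
# Keep only the SMILES and compound ID columns, in that order.
smi = datatab[[u'CANONICAL_SMILES', u'CMPD_CHEMBLID']]
# Write one "SMILES,ID" line per compound, without header or index.
smi.to_csv("smiles.smi", header=False, index=False)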
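
To see what ends up in smiles.smi without the real ChEMBL export, here is a minimal sketch on a made-up three-row table. The IDs (mol1 to mol3) and the SMILES are invented for illustration, and dropping rows with a missing SMILES is my own assumption, not something the original script does.

```python
import pandas as pd

# Made-up stand-in for the ChEMBL export; only the two columns used in the post.
datatab = pd.DataFrame(
    {
        "CANONICAL_SMILES": ["c1ccccc1O", None, "CCN"],
        "CMPD_CHEMBLID": ["mol1", "mol2", "mol3"],
    }
)

# Assumption: rows without a SMILES string are skipped before fragmentation.
smi = datatab[["CANONICAL_SMILES", "CMPD_CHEMBLID"]].dropna(
    subset=["CANONICAL_SMILES"]
)

# With no path argument, to_csv returns the text instead of writing a file.
smi_text = smi.to_csv(header=False, index=False)
lines = smi_text.splitlines()
print(lines)
```

Each resulting line is the SMILES, a comma, then the ID, which is the same layout the script above writes to smiles.smi.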

Next, make the fragment file from the SMILES file.
The MMPA scripts are in $RDBASE/Contrib/mmp.

iwatobipen$ wc bioactivity-14_7-19-38.txt 
     829   51884  448065 bioactivity-14_7-19-38.txt # 829 mols in this file.
iwatobipen$ time python ../rfrag.py < smiles.smi > frag.smi

real	3m50.721s
user	3m49.754s
sys	0m0.388s

iwatobipen$ wc frag.smi 
   42031   42031 6558703 frag.smi # 42031 frags made!

Next, generate the pairs.

iwatobipen$ time python ../indexing.py < frag.smi > jak3_mmp.csv

real	0m31.788s
user	0m31.074s
sys	0m0.120s
iwatobipen$ wc jak3_mmp.csv 
    8466    8466 1729873 jak3_mmp.csv # Just done!

This produced the pair file jak3_mmp.csv.
Each line has the following fields: SMILES_OF_LEFT_MMP,SMILES_OF_RIGHT_MMP,ID_OF_LEFT_MMP,ID_OF_RIGHT_MMP,SMIRKS_OF_TRANSFORMATION,SMILES_OF_CONTEXT.
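
If you want to look at the pairs in pandas before going to Cytoscape, something like the following sketch may help. The column names come from the field list above. The two data rows are invented, and I am assuming the file has no header row and no commas inside the fields.

```python
import io
import pandas as pd

# Column names copied from the field list in the post.
MMP_COLUMNS = [
    "SMILES_OF_LEFT_MMP",
    "SMILES_OF_RIGHT_MMP",
    "ID_OF_LEFT_MMP",
    "ID_OF_RIGHT_MMP",
    "SMIRKS_OF_TRANSFORMATION",
    "SMILES_OF_CONTEXT",
]

# Two made-up rows standing in for jak3_mmp.csv (assumed to have no header row).
sample = io.StringIO(
    "Cc1ccccc1,Clc1ccccc1,mol1,mol2,[*:1]C>>[*:1]Cl,[*:1]c1ccccc1\n"
    "Cc1ccccc1,Fc1ccccc1,mol1,mol3,[*:1]C>>[*:1]F,[*:1]c1ccccc1\n"
)

# header=None tells pandas that the first line is data, not column names.
pairs = pd.read_csv(sample, header=None, names=MMP_COLUMNS)

# Count how often each transformation occurs.
transform_counts = pairs["SMIRKS_OF_TRANSFORMATION"].value_counts()
print(pairs[["ID_OF_LEFT_MMP", "ID_OF_RIGHT_MMP"]])
print(transform_counts)
```

For the real file, replace `sample` with the path "jak3_mmp.csv".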

Cytoscape can read this CSV file directly.
In Cytoscape, create a new network from the file, with SMILES_OF_LEFT_MMP as the source and SMILES_OF_RIGHT_MMP as the target.
Then set the layout to organic.
I used ChemViz to render the SMILES as structures.
[Image: Screen Shot 2014-09-15 at 10.09.58 PM]

[Image: Screen Shot 2014-09-15 at 10.11.08 PM]

It’s easy to add node attributes (e.g. activity, target, molecular properties) and build a custom view with VizMapper.
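
One possible way to prepare such node attributes is to build a small table keyed by SMILES, since the SMILES columns were used as source and target and so become the node names. This is only a sketch: the pair rows, the `activity` table and its `pIC50` values are all made up, and `mol_id` is a hypothetical column name.

```python
import pandas as pd

# Made-up pair rows (only the columns needed here) and made-up activity values.
pairs = pd.DataFrame(
    {
        "SMILES_OF_LEFT_MMP": ["Cc1ccccc1", "Cc1ccccc1"],
        "SMILES_OF_RIGHT_MMP": ["Clc1ccccc1", "Fc1ccccc1"],
        "ID_OF_LEFT_MMP": ["mol1", "mol1"],
        "ID_OF_RIGHT_MMP": ["mol2", "mol3"],
    }
)
activity = pd.DataFrame(
    {"mol_id": ["mol1", "mol2", "mol3"], "pIC50": [6.2, 7.1, 5.8]}
)

# Stack left and right molecules into one list of (smiles, mol_id) nodes.
left = pairs[["SMILES_OF_LEFT_MMP", "ID_OF_LEFT_MMP"]].rename(
    columns={"SMILES_OF_LEFT_MMP": "smiles", "ID_OF_LEFT_MMP": "mol_id"}
)
right = pairs[["SMILES_OF_RIGHT_MMP", "ID_OF_RIGHT_MMP"]].rename(
    columns={"SMILES_OF_RIGHT_MMP": "smiles", "ID_OF_RIGHT_MMP": "mol_id"}
)
nodes = pd.concat([left, right], ignore_index=True).drop_duplicates()

# Attach the attribute values; nodes without a value get NaN.
node_table = nodes.merge(activity, on="mol_id", how="left")

# CSV text with a header row; pass a file path to to_csv to write it to disk.
node_csv = node_table.to_csv(index=False)
print(node_csv)
```

The resulting CSV can then be imported as a node table, using the `smiles` column as the key.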
I uploaded the sample files to GitHub: https://github.com/iwatobipen/mmp_example
Be careful: this example is incomplete, because it ignores molecules that have no pair (singletons).
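
A rough way to find those singletons is to compare the IDs that went into the fragmentation with the IDs that appear in the pair file. The sketch below uses invented IDs; with the real data, `all_ids` would come from the second column of smiles.smi and `pairs` from jak3_mmp.csv.

```python
import pandas as pd

# Made-up data: four input molecules, of which only three appear in any pair.
all_ids = pd.Series(["mol1", "mol2", "mol3", "mol4"], name="mol_id")
pairs = pd.DataFrame(
    {
        "ID_OF_LEFT_MMP": ["mol1", "mol1"],
        "ID_OF_RIGHT_MMP": ["mol2", "mol3"],
    }
)

# Every ID that occurs on either side of a pair.
paired_ids = set(pairs["ID_OF_LEFT_MMP"]) | set(pairs["ID_OF_RIGHT_MMP"])

# Molecules that never occur in a pair are the singletons.
singletons = all_ids[~all_ids.isin(paired_ids)].tolist()
print(singletons)
```

These molecules could then be added to the network as unconnected nodes.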

sqlalchemy and PANDAS

pandas provides high-performance, easy-to-use data structures and data analysis tools for Python.
I love it.
SQLAlchemy is a Python SQL toolkit and object-relational mapper.
I’m not good at SQL… ;-( so SQLAlchemy is very helpful for me.
I wanted to create a PostgreSQL table from a pandas DataFrame.
I found a cool method: “DataFrame.to_sql”.
An example follows.
I used the iris data set as an example.
* Requires pandas 0.14.x and psycopg2 to connect to PostgreSQL.


from sqlalchemy import create_engine
import pandas as pd
from sklearn import datasets
iris = datasets.load_iris()
# make DataFrame
iris_data = pd.DataFrame(iris.data)
# create engine (fill in your own username, password and database name)
engine = create_engine("postgresql+psycopg2://<username>:<password>@localhost/<dbname>")
# set dataset column names
iris_data.columns = ["sep_length", "sepal_width", "petal_length", "petal_width"]
# write the DataFrame to the PostgreSQL database, replacing the table if it exists
iris_data.to_sql("iris_data", con=engine, if_exists="replace")

Just to_sql! No loop or commit command is needed.
Great!
Next, check the database.


iwatobipen$ psql -U postgres -d testdb
psql (9.3.4)
Type "help" for help.
testdb=> \z
                             Access privileges
 Schema |   Name    | Type  | Access privileges | Column access privileges 
--------+-----------+-------+-------------------+--------------------------
 public | iris_data | table |                   | 
(1 rows)

testdb=> select * from iris_data ;
 index | sep_length | sepal_width | petal_length | petal_width 
-------+------------+-------------+--------------+-------------
     0 |        5.1 |         3.5 |          1.4 |         0.2
     1 |        4.9 |           3 |          1.4 |         0.2
     2 |        4.7 |         3.2 |          1.3 |         0.2
     3 |        4.6 |         3.1 |          1.5 |         0.2
     4 |          5 |         3.6 |          1.4 |         0.2
     5 |        5.4 |         3.9 |          1.7 |         0.4
     6 |        4.6 |         3.4 |          1.4 |         0.3
     7 |          5 |         3.4 |          1.5 |         0.2
 ..............

Works fine! 😉
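
The round trip can also be tried without a PostgreSQL server. The sketch below swaps in an in-memory SQLite database through the standard sqlite3 module instead of the SQLAlchemy engine used above, so it is not the same setup as in the post. The three rows are the first rows of the query output above, typed in by hand.

```python
import sqlite3
import pandas as pd

# First three iris rows, typed in by hand so scikit-learn is not needed here.
iris_data = pd.DataFrame(
    [[5.1, 3.5, 1.4, 0.2], [4.9, 3.0, 1.4, 0.2], [4.7, 3.2, 1.3, 0.2]],
    columns=["sep_length", "sepal_width", "petal_length", "petal_width"],
)

# In-memory SQLite database (a stand-in for PostgreSQL, not the post's setup).
conn = sqlite3.connect(":memory:")

# "replace" drops and recreates the table; "append" adds rows to it.
iris_data.to_sql("iris_data", con=conn, if_exists="replace")
iris_data.to_sql("iris_data", con=conn, if_exists="append")

# Read the table back into a DataFrame to check what was stored.
stored = pd.read_sql("SELECT * FROM iris_data", conn)
conn.close()

print(stored.shape)
print(stored.columns.tolist())
```

The DataFrame index is stored as an extra "index" column, just like in the psql output; pass index=False to to_sql if you do not want it.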