Our family trip

I'm more of a homebody and don't usually travel far, but over this long weekend my family and the family of one of my kids' friends went out to the Odawara area.
Yesterday we went to a place called Odawara Wanpaku Land (http://www.city.odawara.kanagawa.jp/public-i/park/wanpaku/).
You only pay for parking, there is no admission fee, and there is plenty of playground equipment, so I think it could be a good spot for families who want to play actively.
But yesterday the weather was a little too good and it was hot, hot, hot... playing along with the kids, I felt like I was the one about to collapse.
They ran around in the heat all day, stayed in high spirits at the hotel, did fireworks, and then fell asleep in an instant.
Today was more of the same: pool in the morning, then the park, playing nonstop. As the kids grow and gain stamina, mine keeps draining away, which is tough.
I'd better do some running and strength training so I don't fall behind....

Attending Mishima.syk #10

I attended mishima.syk, which marked its memorable 10th meeting.
I gave a talk about a machine learning pipeline built with luigi. I hope I was able to provide some useful information.
This time the topics were wide-ranging: machine learning, bioinformatics, pipelines, an idol of the domestic drug discovery industry, AI (or was it 愛, "love"?), tasty okonomiyaki, and more.
Thank you very much to all the speakers, participants, and organizers. I had a really good time.
As always, the quality of the presentations was so high it scared me. Presenting something in a way that is both entertaining and nails the essence of the problem is only possible when you clearly see and understand that essence.
I don't usually work with sequencing or NGS, so MinION was completely new to me. At that size... the progress of science and technology is amazing.
Organic synthesis has similar trends, such as Flow Chemistry and Lab-on-a-chip; integration and acceleration seem to be key themes in research.
I heard many good stories both during the meeting and at the after-party.
I still have a huge amount to learn. I'll work hard starting tomorrow.
I'm already looking forward to the next meeting.

Platform-as-a-Service for Deep Learning

Yesterday, I enjoyed mishima.syk #10. I uploaded my presentation and code to the mishimasyk repo.
I gave a brief introduction to a PaaS for DL named ‘Floyd’. I think the service is interesting because it lets me run DL in the cloud with a GPU!

So, I'll describe a very simple example of getting started with DL on “FLOYD” 😉
First, make an account on the site.
Next, install the command line tools. Just type pip install -U floyd-cli!

# Install floyd-cli
$ pip install -U floyd-cli

As the third step, log in to floyd.

# from terminal
$ floyd login

Then a web browser will launch, and the page will show an authentication token. Copy and paste it.
Now we are ready to start!
Let's play with floyd.
The first example is iris dataset classification using sklearn.

from sklearn.datasets import load_iris
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report

# Load the iris dataset.
dataset = load_iris()
X = dataset.data
y = dataset.target

trainx, testx, trainy, testy = train_test_split( X, y, test_size=0.2, random_state=123 )

# Train an RBF-kernel SVM and a random forest classifier.
svc = SVC( kernel='rbf' )
svc.fit( trainx, trainy )

rfc = RandomForestClassifier()
rfc.fit( trainx, trainy )

# Predict on the held-out test set and report both models.
predsvc = svc.predict( testx )
predrf = rfc.predict( testx )

print( classification_report( testy, predsvc ))
print( classification_report( testy, predrf ))

Use the floyd run command to start the code after initializing the project.

$ mkdir test_pj
$ cd test_pj
$ floyd init
$ floyd run 'python svc_rf_test.py'
Creating project run. Total upload size: 168.9KiB
Syncing code ...
[================================] 174656/174656 - 00:00:02
Done
RUN ID                  NAME                     VERSION
----------------------  ---------------------  ---------
xxxxxxxx  iwatobipen/test_pj:10         10

To view logs enter:
    floyd logs xxxxxxxx

I could check the status via the web browser.

Next, run a DNN classification model.
It is a very, very simple example, not so deeeeeeeeeeeep.

import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers import Dense, Activation
from keras.utils import np_utils

dataset = load_iris()
X = dataset.data

xdim = 4
y = dataset.target
y = np_utils.to_categorical( y, 3 )
trainx, testx, trainy, testy = train_test_split( X, y, test_size=0.2,random_state = 123 )

# A minimal feed-forward net: 4 inputs -> 16 ReLU units -> 3-way softmax.
model = Sequential()
model.add( Dense( 16, input_dim = xdim  ) )
model.add( Activation( 'relu' ))
model.add( Dense( 3 ))
model.add( Activation( 'softmax' ))
model.compile( loss = 'categorical_crossentropy',
               optimizer = 'rmsprop',
               metrics = ['accuracy'])

hist = model.fit( trainx, trainy, epochs = 50, batch_size = 1 )
classes = model.predict( testx, batch_size = 1 )

print( [ np.argmax(i) for i in classes ] )
print( [ np.argmax(i) for i in testy ] )
loss, acc = model.evaluate( testx, testy )

print( "loss, acc ={0},{1}".format( loss, acc ))

Run the code in the same manner.

iwatobipen$ floyd run 'python dnn_test.py'
Creating project run. Total upload size: 168.9KiB
Syncing code ...
[================================] 174653/174653 - 00:00:02
Done
RUN ID                  NAME                     VERSION
----------------------  ---------------------  ---------
xxxxxxx  iwatobipen/test_pj:11         11

To view logs enter:
    floyd logs xxxxxxx

Check the log on the web site.

2017-07-09 01:51:37,703 INFO - Preparing to run TaskInstance <TaskInstance: iwatobipen/test_pj:11 (id: Uus7cp996732cBWdgt3nz3) (checksum: 144078ab50a63ea6276efee221669d13) (last update: 2017-07-09 01:51:37.694913) [queued]>
2017-07-09 01:51:37,723 INFO - Starting attempt 1 at 2017-07-09 01:51:37.708707
2017-07-09 01:51:38,378 INFO - adding pip install -r floyd_requirements
2017-07-09 01:51:38,394 INFO - Executing command in container: stdbuf -o0 sh command.sh
2017-07-09 01:51:38,394 INFO - Pulling Docker image: floydhub/tensorflow:1.1.0-py3_aws.4
2017-07-09 01:51:39,652 INFO - Starting container...
2017-07-09 01:51:39,849 INFO -
################################################################################

2017-07-09 01:51:39,849 INFO - Run Output:
2017-07-09 01:51:40,317 INFO - Requirement already satisfied: Pillow in /usr/local/lib/python3.5/site-packages (from -r floyd_requirements.txt (line 1))
2017-07-09 01:51:40,320 INFO - Requirement already satisfied: olefile in /usr/local/lib/python3.5/site-packages (from Pillow->-r floyd_requirements.txt (line 1))
2017-07-09 01:51:43,354 INFO - Epoch 1/50
2017-07-09 01:51:43,460 INFO - 1/120 [..............................] - ETA: 8s - loss: 0.8263 - acc: 0.0000e+00
 58/120 [=============>................] - ETA: 0s - loss: 1.5267 - acc: 0.6552
115/120 [===========================>..] - ETA: 0s - loss: 1.2341 - acc: 0.6522
120/120 [==============================] - 0s - loss: 1.2133 - acc: 0.6583
2017-07-09 01:51:43,461 INFO - Epoch 2/50
..........................
 57/120 [=============>................] - ETA: 0s - loss: 0.1135 - acc: 0.9649
115/120 [===========================>..] - ETA: 0s - loss: 0.1242 - acc: 0.9739
120/120 [==============================] - 0s - loss: 0.1270 - acc: 0.9750
2017-07-09 01:51:48,660 INFO - Epoch 50/50
2017-07-09 01:51:48,799 INFO - 1/120 [..............................] - ETA: 0s - loss: 0.0256 - acc: 1.0000
 57/120 [=============>................] - ETA: 0s - loss: 0.0911 - acc: 0.9825
114/120 [===========================>..] - ETA: 0s - loss: 0.1146 - acc: 0.9737
120/120 [==============================] - 0s - loss: 0.1161 - acc: 0.9750
2017-07-09 01:51:48,799 INFO - [1, 2, 2, 1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 2, 2, 2, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 2, 2, 0]
2017-07-09 01:51:48,800 INFO - [1, 2, 2, 1, 0, 2, 1, 0, 0, 1, 2, 0, 1, 2, 2, 2, 0, 0, 1, 0, 0, 2, 0, 2, 0, 0, 0, 2, 2, 0]
2017-07-09 01:51:48,800 INFO - 30/30 [==============================] - 0s
2017-07-09 01:51:48,800 INFO - loss, acc =0.23778462409973145,0.8666666746139526

The following software packages (in addition to many other common libraries) are available in all the environments:
h5py, iPython, Jupyter, matplotlib, numpy, OpenCV, Pandas, Pillow, scikit-learn, scipy, sklearn

Also, users can install additional packages from PyPI. ( Not anaconda … 😦 ) To do so, put a file named ‘floyd_requirements.txt’ in the project folder.
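For example, the run log above shows Pillow being installed this way; a minimal floyd_requirements.txt just lists one PyPI package name per line:

```
Pillow
```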

In summary, Floyd is a very interesting service. It makes it easy to set up a DL environment and use a GPU in the cloud.
I hope FLOYD will support anaconda, because I want to use chemoinformatics packages like RDKit, Open Babel, etc…

Dodgeball

I played dodgeball a lot as a kid. Recently my oldest son is into it. I had no idea there were official rules.
https://www.dodgeball.or.jp/%E3%83%89%E3%83%83%E3%82%B8%E3%83%9C%E3%83%BC%E3%83%AB%E3%81%A8%E3%81%AF/

And my oldest son recently joined a local dodgeball team. Teams that play seriously have roles like attacker and cutter; it's not just about hitting people with the ball, there is real strategy, which makes it interesting and intense to watch. If he keeps practicing and improving, he'll probably beat me.

Anyway,
today there was a tournament for teams that had won their preliminaries at the local elementary schools.
The gym was ridiculously hot, but everyone kept at it energetically until the end, from 8 a.m. to about 4 p.m., and then they went off to play in the park afterwards...

I'm probably already losing in stamina. No, I'm definitely losing...

Well, I hope he finds something he loves and throws himself into it.
Still, kids really do play at full throttle until their HP hits zero.

GET TID and PREF_NAME from ChEMBL

I wanted to retrieve the relationship between TID and PREF_NAME for a specific case from the ChEMBL DB.
The SQL query is as follows.

COPY (
    SELECT DISTINCT TID, PREF_NAME FROM ACTIVITIES
        JOIN ASSAYS USING (ASSAY_ID)
        JOIN TARGET_DICTIONARY USING (TID)
        WHERE STANDARD_TYPE = 'Ki'
        AND STANDARD_VALUE IS NOT NULL
        AND STANDARD_RELATION = '='
)
TO '/path/td.csv'
( FORMAT CSV )

I ran the SQL and got results like the following.

1,Maltase-glucoamylase
3,Phosphodiesterase 5A
6,Dihydrofolate reductase
7,Dihydrofolate reductase
8,Tyrosine-protein kinase ABL
9,Epidermal growth factor receptor erbB1
11,Thrombin
12,Plasminogen
13,Beta-lactamase TEM
14,Adenosine deaminase
15,Carbonic anhydrase II
19,Estrogen receptor alpha
21,Neuraminidase
23,Plasma kallikrein
24,HMG-CoA reductase
25,Glucocorticoid receptor
28,Thymidylate synthase
30,Aldehyde dehydrogenase
35,Insulin receptor
36,Progesterone receptor
41,Alcohol dehydrogenase alpha c
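The exported CSV has no header row, so if you load it with pandas you have to supply column names yourself (the names `tid` and `pref_name` below are just my choice). A small sketch, using a few rows from the output above in place of the real '/path/td.csv':

```python
import io

import pandas as pd

# The COPY ... ( FORMAT CSV ) export has no header row,
# so column names are supplied explicitly.
# Sample rows copied from the query output above.
csv_text = """1,Maltase-glucoamylase
3,Phosphodiesterase 5A
6,Dihydrofolate reductase
"""

df = pd.read_csv(io.StringIO(csv_text), names=["tid", "pref_name"])
print(df.shape)  # (3, 2)
```

With the real export, just pass the file path to read_csv instead of the StringIO object.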

🙂

 

Installing TensorFlow on Mac OS X with GPU support

Yesterday, I tried to install tensorflow-gpu on my Mac.
My PC is a MacBook Pro (Retina, 15-inch, Mid 2014), which has an NVIDIA GPU.
The OS is Sierra.
Details are described at the following URL.
https://www.tensorflow.org/install/install_mac

I installed tensorflow directly by using the pip command.

 $ pip install --upgrade tensorflow-gpu  # for Python 2.7 and GPU
 $ pip3 install --upgrade tensorflow-gpu # for Python 3.n and GPU

Almost done, but not finished yet.
To finish the installation, I needed to disable System Integrity Protection (SIP).
To do that, I followed these steps.

Restart the Mac.
Before OS X starts up, hold down Command-R and keep it held down until you see an Apple icon and a progress bar.
From the Utilities menu, select Terminal.
At the prompt, type exactly the following and then press Return: csrutil disable.
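As a sketch, the terminal side of the steps above looks like this (`csrutil status` is the standard way to confirm the current SIP state after rebooting):

```shell
# Inside the Recovery-mode Terminal (Utilities > Terminal):
csrutil disable

# After a normal reboot, confirm SIP is off from a regular terminal:
csrutil status
```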

I tested the following code.

import tensorflow as tf

# Creates a graph.
a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
c = tf.matmul(a, b)

# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))

# Runs the op.
print(sess.run(c))

And the results seem to show that tensorflow can use the GPU.

iwatobipen$ python testcode.py
2017-06-13 22:24:28.952288: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-13 22:24:28.952314: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-06-13 22:24:28.952319: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-13 22:24:28.952323: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2017-06-13 22:24:29.469570: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:865] OS X does not support NUMA - returning NUMA node zero
2017-06-13 22:24:29.470683: I tensorflow/core/common_runtime/gpu/gpu_device.cc:887] Found device 0 with properties:
name: GeForce GT 750M
major: 3 minor: 0 memoryClockRate (GHz) 0.9255
pciBusID 0000:01:00.0
Total memory: 2.00GiB
Free memory: 1.80GiB
2017-06-13 22:24:29.470713: I tensorflow/core/common_runtime/gpu/gpu_device.cc:908] DMA: 0
2017-06-13 22:24:29.470720: I tensorflow/core/common_runtime/gpu/gpu_device.cc:918] 0:   Y
2017-06-13 22:24:29.470731: I tensorflow/core/common_runtime/gpu/gpu_device.cc:977] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GT 750M, pci bus id: 0000:01:00.0)
Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: GeForce GT 750M, pci bus id: 0000:01:00.0
2017-06-13 22:24:29.490805: I tensorflow/core/common_runtime/direct_session.cc:257] Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: GeForce GT 750M, pci bus id: 0000:01:00.0

MatMul: (MatMul): /job:localhost/replica:0/task:0/gpu:0
2017-06-13 22:24:29.495363: I tensorflow/core/common_runtime/simple_placer.cc:841] MatMul: (MatMul)/job:localhost/replica:0/task:0/gpu:0
b: (Const): /job:localhost/replica:0/task:0/gpu:0
2017-06-13 22:24:29.495384: I tensorflow/core/common_runtime/simple_placer.cc:841] b: (Const)/job:localhost/replica:0/task:0/gpu:0
a: (Const): /job:localhost/replica:0/task:0/gpu:0
2017-06-13 22:24:29.495395: I tensorflow/core/common_runtime/simple_placer.cc:841] a: (Const)/job:localhost/replica:0/task:0/gpu:0
[[ 22.  28.]
 [ 49.  64.]]

ref URL
https://github.com/tensorflow/tensorflow/issues/3723