Graph convolution classification with deepchem

I previously posted about graph convolution regression using deepchem, and today I tried graph convolution classification using deepchem.
The code is almost the same as the regression model; the only difference is using dc.models.MultitaskGraphClassifier instead of dc.models.MultitaskGraphRegressor.
I got sample (JAK3 inhibitor) data from ChEMBL and tried to build a model.

First, I used pandas to convert the activity class (active, non-active) to a 0/1 label. It's easy to do.

import pandas as pd
df = pd.read_table('jak3_chembl.txt', header=0)
# pd.factorize returns a tuple of (integer codes, unique labels); take the codes
df['activity_class'] = pd.factorize( df.ACTIVITY_COMMENT )[0]

df.to_csv('./preprocessed_jak3.csv', index=False)
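For reference, pd.factorize returns a tuple of (codes, uniques), so a quick check looks like this (the printed values are illustrative, not my actual ChEMBL labels):

codes, uniques = pd.factorize( df.ACTIVITY_COMMENT )
print( uniques )    # the unique activity comments, e.g. active / non active
print( codes[:5] )  # integer class labels for the first five rows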

Next, I wrote the model and tested it.

import tensorflow as tf
import deepchem as dc
import numpy as np
import pandas as pd

graph_featurizer = dc.feat.graph_features.ConvMolFeaturizer()
loader = dc.data.data_loader.CSVLoader( tasks=['activity_class'], smiles_field="CANONICAL_SMILES", id_field="CMPD_CHEMBLID", featurizer=graph_featurizer )
dataset = loader.featurize( './preprocessed_jak3.csv' )

splitter = dc.splits.splitters.RandomSplitter()
trainset,testset = splitter.train_test_split( dataset )

hp = dc.molnet.preset_hyper_parameters
param = hp.hps[ 'graphconv' ]
print(param['batch_size'])
g = tf.Graph()
graph_model = dc.nn.SequentialGraph( 75 )
graph_model.add( dc.nn.GraphConv( int(param['n_filters']), 75, activation='relu' ))
graph_model.add( dc.nn.BatchNormalization( epsilon=1e-5, mode=1 ))
graph_model.add( dc.nn.GraphPool() )
graph_model.add( dc.nn.GraphConv( int(param['n_filters']), int(param['n_filters']), activation='relu' ))
graph_model.add( dc.nn.BatchNormalization( epsilon=1e-5, mode=1 ))
graph_model.add( dc.nn.GraphPool() )
graph_model.add( dc.nn.Dense( int(param['n_fully_connected_nodes']), int(param['n_filters']), activation='relu' ))
graph_model.add( dc.nn.BatchNormalization( epsilon=1e-5, mode=1 ))
graph_model.add( dc.nn.GraphGather( 10 , activation='tanh'))

with tf.Session() as sess:
    model_graphconv = dc.models.MultitaskGraphClassifier( graph_model,
                                                      1,
                                                      75,
                                                     batch_size=10,
                                                     learning_rate = param['learning_rate'],
                                                     optimizer_type = 'adam',
                                                     beta1=.9,beta2=.999)
    model_graphconv.fit( trainset, nb_epoch=50 )

train_scores = {}
#regression_metric = dc.metrics.Metric( dc.metrics.pearson_r2_score, np.mean )
classification_metric = dc.metrics.Metric( dc.metrics.roc_auc_score, np.mean )
train_scores['graphconvreg'] = model_graphconv.evaluate( trainset,[ classification_metric ]  )
p=model_graphconv.predict( testset )

for i in range( len( p ) ):
    print( p[i], testset.y[i] )

print(train_scores) 
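Note that the ROC AUC above is computed on the training set only. A minimal sketch to score the held-out test set as well, reusing the same classification_metric:

test_scores = model_graphconv.evaluate( testset, [ classification_metric ] )
print( test_scores )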

root@08d8f729f78b:/deepchem/pen_test# python graphconv_jak3.py

And the log output is….

Loading raw samples now.
shard_size: 8192
About to start loading CSV from ./preprocessed_jak3.csv
Loading shard 1 of size 8192.
Featurizing sample 0
TIMING: featurizing shard 0 took 2.023 s
TIMING: dataset construction took 3.830 s
Loading dataset from disk.
TIMING: dataset construction took 2.263 s
Loading dataset from disk.
TIMING: dataset construction took 1.147 s
Loading dataset from disk.
50
Training for 50 epochs
Starting epoch 0
On batch 0
...............
On batch 0
On batch 50
computed_metrics: [0.97176380945032259]
{'graphconvreg': {'mean-roc_auc_score': 0.97176380945032259}}

Not so bad.
The classification model gives a better result than the regression model.
All the code is pushed to my GitHub repository.
https://github.com/iwatobipen/deeplearning

Build regression model in Keras

I introduced Keras at mishimasyk#9, and my presentation was about how to build a classification model in Keras.
A participant asked me how to build a regression model in Keras, and I could not answer his question.
After syk#9, I searched the Keras API and found a good method.
Keras has a scikit-learn API, and the API can build regression models. 😉
https://keras.io/scikit-learn-api/
Example code follows.
The code builds a QSAR model.

import numpy as np
import pandas as pd
import sys

from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Chem import DataStructs

from sklearn.cross_validation import train_test_split
from sklearn.cross_validation import cross_val_score
from sklearn.cross_validation import KFold
from sklearn.metrics import mean_squared_error
from sklearn.metrics import r2_score

from keras.models import Sequential
from keras.layers import Activation, Dense, Dropout
from keras.wrappers.scikit_learn import KerasRegressor

def getFpArr( mols, nBits = 1024 ):
    fps = [ AllChem.GetMorganFingerprintAsBitVect( mol, 2, nBits=nBits ) for mol in mols ]
    X = []
    for fp in fps:
        arr = np.zeros( (1,) )
        DataStructs.ConvertToNumpyArray( fp, arr )
        X.append( arr )
    return X

def getResponse( mols, prop="ACTIVITY" ):
    Y = []
    for mol in mols:
        act = mol.GetProp( prop )
        # convert IC50 in nM to pIC50: 9 - log10(IC50[nM]) == -log10(IC50[M])
        act = 9. - np.log10( float( act ) )
        Y.append( act )
    return Y

def base_model():
    model = Sequential()
    model.add( Dense( input_dim=1024, output_dim = 100 ) )
    model.add( Activation( "relu" ) )
    model.add( Dense( 100 ) )
    model.add( Activation( "relu" ) )
    model.add( Dense( 1 ) )
    #model.add( Activation( 'relu' ) )
    model.compile( loss="mean_squared_error",  optimizer="adam" )
    return model


if __name__ == '__main__':
    filename = sys.argv[1]
    sdf = [ mol for mol in Chem.SDMolSupplier( filename ) if mol is not None ]  # skip unparsable records
    X = getFpArr( sdf )
    Y = getResponse( sdf )

    trainx, testx, trainy, testy = train_test_split( X, Y, test_size=0.2, random_state=0 )
    trainx, testx, trainy, testy = np.asarray( trainx ), np.asarray( testx ), np.asarray( trainy ), np.asarray( testy )
    estimator = KerasRegressor( build_fn = base_model,
                                nb_epoch=100,
                                batch_size=20,
                                 )
    estimator.fit( trainx, trainy )
    pred_y = estimator.predict( testx )
    r2 = r2_score( testy, pred_y )
    rmse = np.sqrt( mean_squared_error( testy, pred_y ) )  # mean_squared_error returns MSE; take the square root for RMSE
    print( "KERAS: R2 : {0:f}, RMSE : {1:f}".format( r2, rmse ) )

Run the code.
I used a ChEMBL dataset.

mishimasyk9 iwatobipen$ python keras_regression.py sdf/CHEMBL952131_EGFR.sdf 
Using Theano backend.
Epoch 1/100
102/102 [==============================] - 0s - loss: 62.2934     
........................   
Epoch 100/100
102/102 [==============================] - 0s - loss: 0.0123     
KERAS: R2 : 0.641975, RMSE : 0.578806

R2 is 0.64. It’s not so bad. 😉
I pushed the script to the syk#9 repository.
https://github.com/Mishima-syk/9/tree/master/iwatobipen

RemoteMonitor in keras

There are several packages for deep learning in python, and my favorite one is keras.
https://keras.io/
Today, I found a new function in keras.callbacks named RemoteMonitor.
The function provides real-time visualization of the learning process.
So, I wrote a very simple example using the IRIS dataset.
First, to use RemoteMonitor, I needed to clone the API server from the following URL.
https://github.com/fchollet/hualos
And run api.py.
Now, I can access localhost:9000 via a web browser.

iwatobipen$ git clone https://github.com/fchollet/hualos
iwatobipen$ cd hualos
iwatobipen$ python api.py

Then, perform machine learning.

from sklearn.datasets import load_iris
from sklearn.cross_validation import train_test_split
from keras.utils.np_utils import to_categorical
from keras.callbacks import RemoteMonitor
# prepare dataset
irisdata = load_iris()
X,Y = irisdata["data"], irisdata["target"]
train_X, test_X, train_Y, test_Y = train_test_split( X, Y, test_size=0.2, random_state=123 )
# make monitor with default settings.
monitor = RemoteMonitor()

from keras.models import Sequential
from keras.layers import Dense, Activation
# build model
model = Sequential()
model.add( Dense( output_dim=12, input_dim=4 ) )
model.add( Activation( "relu" ) )
model.add( Dense( output_dim=3 ) )
model.add( Activation( "softmax" ) )
model.compile( optimizer="sgd",
               loss="sparse_categorical_crossentropy",
               metrics = ["accuracy"] )

OK. Let’s learn!
To monitor the progress, I set [ monitor ] as the callbacks argument.

hist = model.fit( train_X, train_Y, nb_epoch=50, batch_size=20, callbacks=[ monitor ] )
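The held-out split made earlier is not used during fitting, so as a quick follow-up it can be scored (a minimal sketch, not in the original example):

loss, acc = model.evaluate( test_X, test_Y )
print( 'test loss:', loss, 'test acc:', acc )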

During the learning, localhost:9000 is automatically updated, and I can get the following image.
The page is made with Flask and d3.js. Cool !!!
keras_viz
Note that “api.py” will not work on python3.x.
If you want to run the code on python3.x, I recommend replacing the following line, because iteritems was renamed to items in python3.

        lines = ["%s: %s" % (v, k)
--                 for k, v in self.desc_map.iteritems() if k]
++                 for k, v in self.desc_map.items() if k]

FYI.
The 9th Mishima.syk will be held on 10th Dec. The topic is a TensorFlow and Keras hands-on.
https://connpass.com/event/42284/

Don’t miss it. 😉

Use convolution2D in QSAR.

Recently there have been lots of reports about deep learning to predict biological activity (QSAR).
I think almost all of these predictors use an MLP. I wondered whether I could get a good predictor with another method, like a 2D CNN.
So, I tried to build a 2D CNN QSAR model.
Fortunately, a 1024-bit fingerprint is easily converted to a 2D 32 x 32 bit image.
Let’s start.
First, I converted the molecules to 2D bit images. The data source is ChEMBL.

from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Chem import DataStructs
from sklearn.cross_validation import train_test_split
from rdkit.Chem import Draw
import pandas as pd
import numpy as np
import glob
import pickle
import matplotlib.pyplot as plt


df = pd.read_table('bioactivity-15_13-09-28.txt', header=0)
df_bind = df[ df.ASSAY_TYPE=="B" ]
# 'STANDARD_VALUE != None' does not drop missing values in pandas; use notnull()
df_bind = df_bind[ df_bind.STANDARD_VALUE.notnull() ]
df_bind = df_bind[ df_bind.STANDARD_VALUE >= 0 ]

rows = df_bind.shape[ 0 ]
mols = [ ]
act = [ ]
X = [ ]

def generate_fparr( mol ):
    arr = np.zeros( (1,) )
    fp = AllChem.GetMorganFingerprintAsBitVect( mol, 2, nBits = 1024, useFeatures = True )
    DataStructs.ConvertToNumpyArray( fp, arr )
    size = 32
    return arr.reshape( 1, size, size )

def save_fig( fparr, filepath, size=32 ):
    X, Y = np.meshgrid( range(size), range(size) )
    # flip vertically so the first bit is drawn at the top-left
    Z = fparr[0][::-1,:]
    plt.xlim( 0, 31 )
    plt.ylim( 0, 31 )
    plt.pcolor( X, Y, Z )
    plt.gray()
    plt.savefig( filepath )
    plt.close()

def act2bin( val ):
    if val > 10000:
        return 0
    else:
        return 1

for i in range( rows ):
    try:
        # df_bind keeps its original index after filtering, so rows that were
        # filtered out raise KeyError here and are skipped by the except clause
        smi = df_bind.CANONICAL_SMILES[i]
        mol = Chem.MolFromSmiles( smi )
        if mol is not None:
            mols.append( mol )
            act.append( act2bin( df_bind.STANDARD_VALUE[i] ) )
    except:
        pass

# save mols image dataset
for idx, mol in enumerate( mols ):
    X.append( generate_fparr( mol ) )
    if act[ idx ] == 1:
        save_fig( X[ idx ], "./posi/idx_{}.png".format( idx ) )
    elif act[ idx ] == 0:
        save_fig( X[ idx ], "./nega/idx_{}.png".format( idx ) )


X = np.asarray(X)
Y = np.asarray(act)

x_train, x_test, y_train, y_test = train_test_split( X,Y, test_size=0.2, random_state=123 )

f = open( 'fpimagedataset.pkl', 'wb' )
pickle.dump([ ( x_train,y_train ), ( x_test, y_test ) ], f)
f.close()

Now I have a 2D fingerprint dataset and 2D fingerprint molecular bit images. The following image is one of the positive bit images. Of course, I cannot tell from the image why the molecule was active.
idx_0

Next, I wrote predictor using Keras.

import numpy as np
from keras.models import Sequential
from keras.layers.core import Dense, Activation, Dropout, Flatten
from keras.layers.normalization import BatchNormalization
from keras.layers.convolutional import Convolution2D, MaxPooling2D
from keras.utils import np_utils
import pickle
import matplotlib
import matplotlib.pyplot as plt

( x_train, y_train ), ( x_test, y_test ) = pickle.load( open('fpimagedataset.pkl', 'rb') )

batch_size = 200
nb_classes = 2
nb_epoch = 100
nb_filters = 32
nb_pool = 2
nb_conv = 3
im_rows , im_cols = 32, 32
im_channels = 1

x_train = x_train.astype( "float32" )
x_test = x_test.astype( "float32" )

y_train = np_utils.to_categorical( y_train, nb_classes )
y_test = np_utils.to_categorical( y_test, nb_classes )

print( x_train.shape[0], 'train samples' )
print( x_test.shape[0], 'test samples' )


model = Sequential()
model.add( Convolution2D( nb_filters, nb_conv, nb_conv,
                            input_shape = ( im_channels, im_rows, im_cols ) ) )
model.add( BatchNormalization() )
model.add( Activation( 'relu' ) )

model.add( Convolution2D( nb_filters, nb_conv, nb_conv ) )
model.add( BatchNormalization() )
model.add( Activation( 'relu' ) )

model.add( Convolution2D( nb_filters, nb_conv, nb_conv ) )
model.add( BatchNormalization() )
model.add( Activation( 'relu' ) )
model.add( MaxPooling2D( pool_size=( nb_pool, nb_pool ) ) )
model.add( Flatten() )
model.add( Dense( 200 ) )
model.add( Activation( 'relu' ) )
model.add( Dropout( 0.5 ) )
model.add( Dense(nb_classes) )
model.add( Activation('softmax') )
model.compile( loss='categorical_crossentropy',
               optimizer='adadelta',
               metrics=['accuracy'],
               )

hist = model.fit( x_train, y_train,
                  batch_size = batch_size,
                  nb_epoch = nb_epoch,
                  verbose = 1,
                  validation_data = ( x_test, y_test ))


print( model.summary() )
score = model.evaluate( x_test, y_test, verbose=0 )
print( 'test loss: {0}, test acc: {1}'.format( score[0], score[1] ) )

loss = hist.history[ 'loss' ]
acc = hist.history[ 'acc' ]
val_loss = hist.history[ 'val_loss' ]
val_acc = hist.history[ 'val_acc' ]
plt.plot( range(len( loss )), loss, label='loss' )
plt.plot( range(len( val_loss )), val_loss, label='val_loss' )
plt.legend( loc='best' )
plt.xlabel( 'epoch' )
plt.ylabel( 'loss' )
plt.savefig( 'loss.png' )
plt.close()
plt.plot( range(len( acc )), acc, label='accuracy' )
plt.plot( range(len( val_acc )), val_acc, label='val_accuracy' )
plt.legend( loc='best' )
plt.xlabel( 'epoch' )
plt.ylabel( 'acc' )
plt.savefig( 'acc.png' )
plt.close()

Finally, I ran the code.
It took a long time to finish the calculation, and I got two images.
The accuracy on the training set increased with the number of epochs, but the accuracy on the test set did not.
The loss curves show the same trend. I think the predictor is overfitting. ;-(
I used dropout and normalisation, but I couldn’t avoid overfitting. Hmm……

loss

acc

Callback function of keras.

I’m still building QSAR models using deep learning, and I think I have a problem with overfitting. 🙂
The training error was decreasing, but the validation error was increasing with the number of epochs. :-/
It looks like overfitting, and I could not avoid it even when I used the dropout function.

I tried lots of learning conditions, but every attempt failed….. Finally, I concluded that training for too long does not have a good effect on QSAR models.
I thought about why the overfitting occurred:
1st) I could not optimise the learning conditions.
2nd) I did not have a large enough training dataset. But this problem is difficult to solve in an actual drug discovery project.
3rd) Training too long was not good, so early stopping is better.

Keras has some callback functions, and there is an early stopping function too!
So, I wrote some code.
First, make a dataset for hERG binary classification.


from __future__ import print_function
from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Chem import DataStructs
from sklearn.cross_validation import train_test_split
import pandas as pd
import numpy as np
import pickle

df = pd.read_table('bioactivity-15_13-09-28.txt', header=0)
df_bind = df[ df.ASSAY_TYPE=="B" ]
# filter out missing values; '!= None' does not work in pandas
df_bind = df_bind[ df_bind.STANDARD_VALUE.notnull() ]
df_bind = df_bind[ df_bind.STANDARD_VALUE >= 0 ]

rows = df_bind.shape[ 0 ]
mols = [ ]
act = [ ]
fps = []
def act2bin( val ):
    if val > 10000:
        return 0
    else:
        return 1

for i in range( rows ):
    try:
        smi = df_bind.CANONICAL_SMILES[i]
        mol = Chem.MolFromSmiles( smi )
        if mol != None:
            mols.append( mol )
            act.append( act2bin( df_bind.STANDARD_VALUE[i]) )
        else:
            pass
    except:
        pass
for mol in mols:
    arr = np.zeros( (1,) )
    fp = AllChem.GetMorganFingerprintAsBitVect( mol, 2, nBits=1024 )
    DataStructs.ConvertToNumpyArray( fp, arr )
    fps.append( arr )  # append the numpy array, not the raw bit vector

fps = np.array( fps, dtype = np.float32 )
act = np.array( act, dtype = np.int32 )

train_x, test_x, train_y, test_y = train_test_split( fps, act, test_size=0.3, random_state=5 )

f = open('dataset_fp1024.pkl', 'wb')
pickle.dump( [train_x, train_y, test_x, test_y ], f )
f.close()

Distribution of logIC50

ic50dist
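A minimal sketch of how such a histogram could be drawn from the df_bind frame above (an assumption on my part, not the exact code used for the figure):

import matplotlib.pyplot as plt
logic50 = np.log10( df_bind.STANDARD_VALUE.astype( np.float64 ) )
plt.hist( logic50, bins=30 )
plt.xlabel( 'logIC50' )
plt.savefig( 'ic50dist.png' )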


Then I wrote the model builder using keras.
The EarlyStopping class provides the early stopping function.
If you want to use it, it's easy: just set the callbacks option in fit().
In the following code, I set another callback function, ModelCheckpoint, to save the model parameters.
Finally, I got better results not only on the training set but also on the validation set.
Keras is an easy and good tool for deep learning.

import pickle
import matplotlib
import matplotlib.pyplot as plt

from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.optimizers import SGD, Adagrad, Adam, Adadelta
from keras.utils import np_utils
from keras.callbacks import EarlyStopping
from keras.callbacks import ModelCheckpoint

earlystopping = EarlyStopping( monitor='val_loss',
                             patience = 4,
                             verbose=0,
                             mode='auto' )

modelcheckpoint = ModelCheckpoint( './bestmodel.hdf5',
                                   monitor='val_loss',
                                   verbose=0,
                                   save_best_only=True )

nb_epoch = 250
f = open( 'dataset_fp1024.pkl', 'rb' )
train_x, train_y, test_x, test_y = pickle.load( f )
train_y = np_utils.to_categorical(train_y, 2)
test_y = np_utils.to_categorical(test_y, 2)

model = Sequential()
model.add( Dense( output_dim = 500, input_shape=(1024,) ) )
model.add( Activation( 'sigmoid' ) )
model.add( Dropout(0.2) )
model.add( Dense( output_dim = 100 ) )
model.add( Activation( 'sigmoid' ))
model.add( Dropout(0.2) )

model.add( Dense( output_dim = 20 ) )
model.add( Activation( 'sigmoid' ))
model.add( Dropout(0.2) )

model.add( Dense( 2 ) )
model.add( Activation( 'sigmoid' ) )

model.compile( optimizer=Adadelta(lr=0.95), loss='categorical_crossentropy' )

hist = model.fit( train_x, train_y,
                     nb_epoch=nb_epoch,
                     batch_size=50,
                     validation_split=0.1,
                     show_accuracy=True,
                     callbacks=[ earlystopping, modelcheckpoint ] )
score = model.evaluate( test_x, test_y, show_accuracy=True, verbose=1 )
print( ' testscore: ', score[0], ' testaccuracy: ', score[1] )
model.summary()

loss = hist.history[ 'loss' ]
val_loss = hist.history[ 'val_loss' ]
plt.plot( range( len( loss ) ), loss, label='loss' )
plt.plot( range( len( val_loss ) ), val_loss,  label = 'val_loss' )
plt.legend( loc='best', fontsize=10 )
plt.grid()
plt.xlabel( 'epoch' )
plt.ylabel( 'loss' )
plt.savefig( 'res_gpu.png' )
plt.close()
acc = hist.history[ 'acc' ]
val_acc = hist.history[ 'val_acc' ]
plt.plot( range( len( acc )), acc, label='accuracy' )
plt.plot( range( len( val_acc )), val_acc, label='val_accuracy' )
plt.legend( loc='best', fontsize=10 )
plt.xlabel( 'epoch' )
plt.ylabel( 'accuracy' )
plt.savefig( 'acc_gpu.png' )
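Because ModelCheckpoint with save_best_only=True keeps the weights from the best epoch, they can be loaded back before the final scoring; a minimal sketch (not in the original post):

model.load_weights( './bestmodel.hdf5' )
score = model.evaluate( test_x, test_y, show_accuracy=True, verbose=0 )
print( ' best model testscore: ', score[0], ' testaccuracy: ', score[1] )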

Accuracy…
acc_gpu
Loss….
res_gpu

New library for deep learning

Deep learning is an old but new technology in machine learning. I have been interested in this technology; however, it was difficult to optimise the many parameters.

I posted on my blog about a python library named ‘chainer’ before. Chainer is a flexible framework for NN.
And some days ago, I found a new library named ‘keras’.
http://keras.io/
The library uses theano or tensorflow, and the user is able to develop code more simply.
So, I wrote sample code to test the library.
First, I made a sample dataset from the ChEMBL hERG assay.
The input layer was calculated by rdkit as ECFP (Morgan fingerprint).
I wrote the following code in a python3.x env.

from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Chem import DataStructs
from sklearn.cross_validation import train_test_split
import pandas as pd
import numpy as np
import pickle

df = pd.read_table('bioactivity-15_13-09-28.txt', header=0)
df_bind = df[ df.ASSAY_TYPE=="B" ]
# filter out missing values; '!= None' does not work in pandas
df_bind = df_bind[ df_bind.STANDARD_VALUE.notnull() ]
rows = df_bind.shape[ 0 ]
mols = [ ]
act = [ ]
fps = []
def act2bin( val ):
    if val > 10000:
        return 0
    else:
        return 1

for i in range( rows ):
    try:
        smi = df_bind.CANONICAL_SMILES[i]
        mol = Chem.MolFromSmiles( smi )
        if mol != None:
            mols.append( mol )
            act.append( act2bin( df_bind.STANDARD_VALUE[i]) )
        else:
            pass
    except:
        pass
for mol in mols:
    arr = np.zeros( (1,) )
    fp = AllChem.GetMorganFingerprintAsBitVect( mol, 2, nBits=256 )
    DataStructs.ConvertToNumpyArray( fp, arr )
    fps.append( arr )  # append the numpy array, not the raw bit vector

fps = np.array( fps, dtype = np.float32 )
act = np.array( act, dtype = np.int32 )
#act = np.array(act, dtype=np.float32)
train_x, test_x, train_y, test_y = train_test_split( fps, act, test_size=0.3, random_state=455 )

f = open('dataset_fp256.pkl', 'wb')
pickle.dump( [train_x, train_y, test_x, test_y ], f )
f.close()
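A quick sanity check of the dataset size and class balance, which is relevant to the question at the end of this post (a minimal sketch, not in the original code):

print( fps.shape )           # (n_compounds, 256)
print( np.bincount( act ) )  # counts of class 0 (inactive) and class 1 (active)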

Then I made an NN model using keras. (keras is easily installed by pip.)
First, I needed to pick a model class; keras has two model classes, Sequential and Graph. I used the first one.
Then, define the model structure.
I think the manner is very easy to understand: each layer is set using ‘model.add(….)’.
Dropout is also handled as a layer.
Of course, keras has many activation functions: tanh, ReLU, sigmoid, etc.
After defining the model, the user needs to compile it.
That’s all!
The fit function builds the predictive model! It’s like scikit-learn ;-).
When the user sets the validation_split argument, keras automatically splits the data into train and validation sets. Cool!
Finally, to score the test data, model.evaluate() is used.
It is a very simple workflow.
Another interesting point is that, while the code runs, the user can check the progress of learning in the console.
Like this:

3500/4861 [====================>.........] - ETA: 0s - loss: 0.5465 - acc: 0.7146
4000/4861 [=======================>......] - ETA: 0s - loss: 0.5412 - acc: 0.7195
4500/4861 [==========================>...] - ETA: 0s - loss: 0.5358 - acc: 0.7231
4861/4861 [==============================] - 0s - loss: 0.5390 - acc: 0.7219 - val_loss: 0.5399 - val_acc: 0.7338
Epoch 97/500

 500/4861 [==>...........................] - ETA: 0s - loss: 0.5182 - acc: 0.7360
1000/4861 [=====>........................] - ETA: 0s - loss: 0.5208 - acc: 0.7240
1500/4861 [========>.....................] - ETA: 0s - loss: 0.5250 - acc: 0.7240
2000/4861 [===========>..................] - ETA: 0s - loss: 0.5311 - acc: 0.7240
2500/4861 [==============>...............] - ETA: 0s - loss: 0.5264 - acc: 0.7280
3000/4861 [=================>............] - ETA: 0s - loss: 0.5302 - acc: 0.7240
3500/4861 [====================>.........] - ETA: 0s - loss: 0.5374 - acc: 0.7191
4000/4861 [=======================>......] - ETA: 0s - loss: 0.5411 - acc: 0.7180
4500/4861 [==========================>...] - ETA: 0s - loss: 0.5396 - acc: 0.7202
4861/4861 [==============================] - 0s - loss: 0.5402 - acc: 0.7194 - val_loss: 0.5379 - val_acc: 0.7412
Epoch 98/500

And when a GPU is available, type a command like:
iwatobipen$ time THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python hogehoge.py
The training history is stored in the object returned by model.fit, and the user can access it later.
This is useful for model development.
However, I could not optimise the model yet. The model's response is very sensitive to the parameters.
(In my case, too many bits, the ReLU function, and too many layers did not work well.)
Is the training data too small to build a model?

If you have any ideas, I’d like to hear them.

import pickle
import matplotlib
import matplotlib.pyplot as plt

from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.optimizers import SGD, Adagrad
nb_epoch = 500
f = open( 'dataset_fp256.pkl', 'rb' )
trainx, trainy, testx, testy = pickle.load( f )

model = Sequential()
model.add( Dense( output_dim = 50, init='uniform', input_dim=256 ) )
model.add( Activation( 'tanh' ) )
#model.add(Dropout(0.5))
model.add( Dense( 1 ) )
model.add( Activation( 'sigmoid' ) )

model.compile( optimizer='adagrad', loss='binary_crossentropy' )

hist = model.fit( trainx, trainy ,
                     nb_epoch=nb_epoch,
                     batch_size=500,
                     validation_split=0.1,
                     show_accuracy=True)

score = model.evaluate( testx, testy, show_accuracy=True, verbose=1 )
print( ' testscore: ', score[0], ' testaccuracy: ', score[1] )

loss = hist.history[ 'loss' ]
val_loss = hist.history[ 'val_loss' ]
plt.plot( range( nb_epoch ), loss, label='loss' )
plt.plot( range( nb_epoch ), val_loss,  label = 'val_loss' )
plt.legend( loc='best', fontsize=10 )
plt.grid()
plt.xlabel( 'epoch' )
plt.ylabel( 'loss' )
plt.savefig( 'res.png' )
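To get per-compound predictions rather than an aggregate score, the Sequential model also provides predict and predict_classes; a minimal sketch (not in the original post):

proba = model.predict( testx )                # predicted probability of the active class
pred_class = model.predict_classes( testx )   # 0/1 labels, thresholded at 0.5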

res_gpu