Going to the dodgeball national tournament!

My kid is in the first grade of elementary school, and he is a member of the dodgeball team.
The team practices twice a week. This year the team won the regional championship, so they will go to the national tournament tomorrow!
He is not a regular player, so we will join the tournament as the cheering squad. ;-)
I hope all the team members do their best and enjoy it, and I wish he becomes a regular in the near future. Of course, he needs to practice more and enjoy the game.

By the way, Mie is far from here. We will get up at 3:00 AM.
It's going to be a long day. I’m so excited about the game.

Multi-armed Bandit problem

I am interested in reinforcement learning.
It is difficult for me. @_@
I tried to implement a very simple and famous problem called the ‘multi-armed bandit’.
Image from Wikipedia.

The multi-armed bandit problem is a problem in which a fixed limited set of resources must be allocated between competing (alternative) choices in a way that maximizes their expected gain, when each choice’s properties are only partially known at the time of allocation, and may become better understood as time passes or by allocating resources to the choice.

The following is sample code for the problem. I tried random, greedy, epsilon-greedy and UCB1.
The epsilon-greedy algorithm has a parameter called epsilon that controls how often it selects an arm at random instead of greedily.
The greedy algorithm is sometimes misled because it often falls into a local optimum.
My results indicated that epsilon-greedy was the best. ??? Did I do something wrong?
I need to check my code...

import numpy as np
import random
ARM_PROB = [ 0.2, 0.3, 0.4, 0.5 ]

class MultiArmedbandit(object):
    def __init__(self, n_try, k_arm = len(ARM_PROB)):
        self.n_try = n_try    # number of pulls in one experiment
        self.k_arm = k_arm    # number of arms
        self.counts = np.zeros(self.k_arm, dtype=np.float32)   # pulls per arm
        self.values = np.zeros(self.k_arm, dtype=np.float32)   # total reward per arm
        self.rewards = np.zeros(self.k_arm, dtype=np.float32)  # average reward per arm

    def update(self, action):
        # Pull the chosen arm and update its statistics.
        r = get_reward(action)
        self.counts[action] += 1
        self.values[action] += r
        self.rewards[action] = self.values[action]/self.counts[action]

def get_reward(action):
    # Bernoulli reward: 1 with probability ARM_PROB[action], otherwise 0.
    if random.random() < ARM_PROB[action]:
        reward = 1.
    else:
        reward = 0.
    return reward

def randomchoice(bandit):
    action = random.choice(range(bandit.k_arm))
    return action

def greedy(bandit):
    counts = bandit.counts
    values = bandit.values
    for i in range(bandit.k_arm):
        if counts[i] == 0:
            action = i
            return action
    average_reward = values / counts
    action = np.argmax(average_reward)
    return action

def epsilon_greedy(bandit, epsilon): # epsilon <= 1.
    # Explore with probability epsilon, otherwise exploit.
    if random.random() < epsilon:
        action = randomchoice(bandit)
    else:
        action = greedy(bandit)
    return action

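def ucb1(bandit):
    ucb1s = []
    n = np.sum(bandit.counts)  # total number of pulls so far
    for i in range(bandit.k_arm):
        ri = bandit.values[i]
        ni = bandit.counts[i]
        if ni == 0:
            # Pull every arm once before using the UCB score.
            return i
        # Average reward plus exploration bonus (UCB1 uses the natural log).
        ucb = ri/ni + np.sqrt(2 * np.log(n) / ni)
        ucb1s.append(ucb)
    action = np.argmax(ucb1s)
    return action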

The code above is saved as bandit_exp.py. The following script runs the experiment.

import bandit_exp
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

rand = bandit_exp.MultiArmedbandit(10000, 4)
g = bandit_exp.MultiArmedbandit(10000, 4)
eg = bandit_exp.MultiArmedbandit(10000, 4)
ucb = bandit_exp.MultiArmedbandit(10000, 4)

times = list(range(rand.n_try))
randdata = []
gdata = []
egdata = []
ucbdata = []
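for i in range(rand.n_try):
    # Each policy picks an arm for its own bandit (epsilon = 0.1, as in the plot label).
    rand.update(bandit_exp.randomchoice(rand))
    g.update(bandit_exp.greedy(g))
    eg.update(bandit_exp.epsilon_greedy(eg, 0.1))
    ucb.update(bandit_exp.ucb1(ucb))
    # Record the running average reward of each policy.
    randdata.append(np.sum(rand.values) / np.sum(rand.counts))
    gdata.append(np.sum(g.values) / np.sum(g.counts))
    egdata.append(np.sum(eg.values) / np.sum(eg.counts))
    ucbdata.append(np.sum(ucb.values) / np.sum(ucb.counts))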

df = pd.DataFrame()
df['time'] = times
df['random'] = randdata
df['greedy'] = gdata
df['epsilongreedy'] = egdata
df['ucbdata'] = ucbdata

l1 = plt.plot(df.time, df.random, label='random')
l2 = plt.plot(df.time, df.greedy, label='greedy')
l3 = plt.plot(df.time, df.epsilongreedy, label='e-greedy0.1')
l4 = plt.plot(df.time, df.ucbdata, label='ucb1')
plt.xlabel('time')
plt.ylabel('average reward')
plt.legend(loc='upper right')
plt.show()
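
A single run can be noisy, so one way to check the result is to average each policy over several random seeds. The following is only a minimal sketch under my own assumptions: it uses the standard library instead of NumPy, and the names pull, choose_greedy, choose_epsilon_greedy, choose_ucb1, run and compare are hypothetical helpers written for this check, not part of the code above. The arm probabilities and epsilon = 0.1 are the same as in the experiment.

```python
import math
import random

ARM_PROB = [0.2, 0.3, 0.4, 0.5]  # same arm probabilities as in the experiment


def pull(arm, rng):
    """Bernoulli reward: 1.0 with probability ARM_PROB[arm], else 0.0."""
    return 1.0 if rng.random() < ARM_PROB[arm] else 0.0


def choose_greedy(counts, sums, rng):
    # Try every arm once, then pick the best average reward.
    for arm, c in enumerate(counts):
        if c == 0:
            return arm
    means = [s / c for s, c in zip(sums, counts)]
    return means.index(max(means))


def choose_epsilon_greedy(counts, sums, rng, epsilon=0.1):
    # Explore with probability epsilon, otherwise act greedily.
    if rng.random() < epsilon:
        return rng.randrange(len(counts))
    return choose_greedy(counts, sums, rng)


def choose_ucb1(counts, sums, rng):
    # Try every arm once, then maximize mean + sqrt(2 ln n / n_i).
    for arm, c in enumerate(counts):
        if c == 0:
            return arm
    total = sum(counts)
    scores = [s / c + math.sqrt(2.0 * math.log(total) / c)
              for s, c in zip(sums, counts)]
    return scores.index(max(scores))


def run(policy, n_try, seed):
    """Play one bandit for n_try steps and return the average reward."""
    rng = random.Random(seed)
    counts = [0] * len(ARM_PROB)
    sums = [0.0] * len(ARM_PROB)
    for _ in range(n_try):
        arm = policy(counts, sums, rng)
        counts[arm] += 1
        sums[arm] += pull(arm, rng)
    return sum(sums) / n_try


def compare(n_try=2000, n_runs=20):
    """Average the final reward of each policy over several seeds."""
    policies = {"e-greedy0.1": choose_epsilon_greedy, "ucb1": choose_ucb1}
    return {name: sum(run(p, n_try, seed) for seed in range(n_runs)) / n_runs
            for name, p in policies.items()}


if __name__ == "__main__":
    for name, avg in compare().items():
        print(f"{name}: {avg:.3f}")
```

If epsilon-greedy still comes out ahead after averaging, that may not be a bug. The arm probabilities differ by only 0.1, and UCB1's exploration bonus shrinks slowly, so it can keep pulling the weaker arms for a long time before it settles on the best one.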

A new finding on the Chan-Lam coupling

The copper-catalyzed coupling of boronic acids with OH- or NH-containing compounds is called the Chan-Lam reaction. The reaction is often performed under mild conditions in an oxygen atmosphere.
Yes, I have used the reaction to synthesize my target compounds. But the yield was always moderate and depended on steric effects.
Today I found a useful article reported by researchers from Pfizer and the Scripps Research Institute.

They focused on the cyclopropylation of phenols and azaheterocycles, because the cyclopropyl group is a very important and unique substituent in medicinal chemistry but is sometimes difficult to introduce. So new methodologies are still required.

At first they optimized the reaction conditions for the O-cyclopropylation of phenols.
In Table 1 they found good conditions with a bidentate ligand, which gave a high yield (84%). Next they investigated the scope and limitations of the reaction.
It was interesting to me that the conditions gave high yields not only for para-substituted electron-rich phenols but also for ortho-substituted phenols.
The conditions also gave low to moderate yields for para- and meta-substituted electron-poor phenols.
The procedure is not so complicated. I would like to try the reaction if I have a chance. Organic synthesis and medicinal chemistry are fun and creative tasks, I think.

BTW, I think the introduction of the cyclopropane MMP feels rather sudden.

Edge Attention-based Multi-Relational GCN #pytorch #RDKit #DeepLearning

In the chemoinformatics area, molecules are represented as graphs, with atoms as nodes and bonds as edges. In the ML area, graph convolution is attracting a great deal of attention, I think. Today I would like to introduce a new approach proposed by Chao Shang’s group.
They developed Edge Attention-based Multi-Relational Graph Convolutional Networks (EAGCN).
The URL is below.


In this method, a molecule is encoded as adjacency matrices carrying bond information (atom pair type, bond order, aromaticity and so on) together with an atom feature matrix.
They apply graph convolution to the adjacency matrices and consider node interactions layer by layer, like the ECFP fingerprint.
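
To make the idea concrete for myself, here is a toy sketch with NumPy. It is not the authors' implementation. The molecule (heavy atoms of acetic acid), the function name edge_attention_layer, the one scalar weight per bond-order value, the added self-loop and the ReLU are all my own simplifying assumptions, and the weights are random instead of learned.

```python
import numpy as np

# Toy molecule: heavy atoms of acetic acid, C-C(=O)-O, indexed 0..3.
# bond_order[i, j] is 0 for "no bond", 1 for single, 2 for double.
bond_order = np.array([
    [0, 1, 0, 0],
    [1, 0, 2, 1],
    [0, 2, 0, 0],
    [0, 1, 0, 0],
])

# Atom feature matrix: one-hot element type, columns = (C, O).
atom_features = np.array([
    [1.0, 0.0],
    [1.0, 0.0],
    [0.0, 1.0],
    [0.0, 1.0],
])


def edge_attention_layer(bond_attr, h, edge_weight, w):
    """One simplified edge-weighted graph convolution step (assumed form).

    bond_attr   : (n, n) integer matrix of a categorical bond attribute
    h           : (n, f_in) node features
    edge_weight : (n_categories,) one scalar per attribute value
    w           : (f_in, f_out) feature transform
    """
    attention = edge_weight[bond_attr]                   # weight for each atom pair
    attention = np.where(bond_attr > 0, attention, 0.0)  # no bond, no message
    attention = attention + np.eye(len(h))               # self-loop (assumption)
    return np.maximum(attention @ h @ w, 0.0)            # ReLU


rng = np.random.default_rng(0)
edge_weight = rng.normal(size=3)   # stand-in for learned edge weights
w1 = rng.normal(size=(2, 3))
w2 = rng.normal(size=(3, 3))

h1 = edge_attention_layer(bond_order, atom_features, edge_weight, w1)  # 1-bond neighbours
h2 = edge_attention_layer(bond_order, h1, edge_weight, w2)             # 2-bond neighbours
print(h2.shape)
```

Stacking two layers lets each atom see atoms two bonds away, which is the layer-wise growth of the neighbourhood that resembles ECFP. A multi-relational version would repeat this for each bond attribute (bond order, aromaticity and so on) and combine the results.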

The authors evaluated the method on publicly available datasets such as HIV and Tox21. EAGCN outperformed other common methods such as SVM and RF.

They used RDKit to handle chemical data and PyTorch as the deep learning package.
RDKit is a state-of-the-art toolkit in chemoinformatics.