Make a multi-target predictive model for KNIME #RDKit #Knime #DeepLearning

I attended my kid's elementary school graduation ceremony today. It was a nice ceremony. …Time passes very fast ;) This season in Japan is very nice.
Beautiful cherry blossoms are starting to bloom. It's a good time to start anything ;-)

I would like to say thank you to my kid.

BTW, ChEMBL ver. 32 was recently released. As you know, the ChEMBL DB has lots of biological activity data for small molecules, so it's a really useful source for a multi-target predictive model.

Fortunately, Greg Landrum posted a nice blog about prediction with KNIME. The URL is below.
https://www.knime.com/blog/interactive-bioactivity-prediction-with-multitask-neural-networks

He uses ONNX to convert the PyTorch predictive model to a TensorFlow model because current KNIME doesn't support PyTorch. I was interested in the approach and wanted to run the workflow with ChEMBL 32.

To do that, I needed to build the model with ChEMBL 32. The original code required some modifications because the schema has changed.

After some trial and error, I could build the workflow, so I would like to share my experience.

At first, I made the dataset. The code is below. I got the data from the ChEMBL FTP site. The code originally came from Eloy's work.

http://chembl.blogspot.com/2019/05/multi-task-neural-network-on-chembl.html

#!/usr/bin/env python
# coding: utf-8
from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Chem import DataStructs
from rdkit.Chem import rdMolDescriptors
from sqlalchemy import create_engine
from sqlalchemy.sql import text
import pandas as pd
import numpy as np
import tables as tb
import json
from tables.atom import ObjectAtom

engine = create_engine('sqlite:///chembl_32/chembl_32_sqlite/chembl_32.db') 


qtext = """
SELECT
  activities.doc_id                    AS doc_id,
  activities.standard_value            AS standard_value,
  molecule_hierarchy.parent_molregno   AS molregno,
  compound_structures.canonical_smiles AS canonical_smiles,
  molecule_dictionary.chembl_id        AS chembl_id,
  target_dictionary.tid                AS tid,
  target_dictionary.chembl_id          AS target_chembl_id,
  protein_classification.pref_name          AS pref_name,
  protein_classification.short_name         AS short_name,
  protein_classification.protein_class_desc AS protein_class,
  protein_classification.class_level        AS class_level
FROM activities
  JOIN assays ON activities.assay_id = assays.assay_id
  JOIN target_dictionary ON assays.tid = target_dictionary.tid
  JOIN target_components ON target_dictionary.tid = target_components.tid
  JOIN component_class ON target_components.component_id = component_class.component_id
  JOIN protein_classification ON component_class.protein_class_id = protein_classification.protein_class_id
  JOIN molecule_dictionary ON activities.molregno = molecule_dictionary.molregno
  JOIN molecule_hierarchy ON molecule_dictionary.molregno = molecule_hierarchy.molregno
  JOIN compound_structures ON molecule_hierarchy.parent_molregno = compound_structures.molregno
WHERE activities.standard_units = 'nM' AND
      activities.standard_type IN ('EC50', 'IC50', 'Ki', 'Kd', 'XC50', 'AC50', 'Potency') AND
      activities.data_validity_comment IS NULL AND
      activities.standard_relation IN ('=', '<') AND
      activities.potential_duplicate = 0 AND assays.confidence_score >= 8 AND
      target_dictionary.target_type = 'SINGLE PROTEIN'"""

with engine.begin() as conn:
    res = conn.execute(text(qtext))
    df = pd.DataFrame(res.fetchall())



df.columns = res.keys()
df = df.where((pd.notnull(df)), None)



# check the unique protein class labels
cls_list = df["protein_class"].to_list()

uniq = list(set(cls_list))




df = df.sort_values(by=['standard_value', 'molregno', 'tid'], ascending=True)
df = df.drop_duplicates(subset=['molregno', 'tid'], keep='first')

df.to_csv('chembl_activity_data.csv', index=False)




def set_active(row):
    active = 0
    if row['standard_value'] <= 1000:
        active = 1
    if "ion channel" in row['protein_class']:
        if row['standard_value'] <= 10000:
            active = 1
    if "kinase" in row['protein_class']:
        if row['standard_value'] > 30:
            active = 0
    if "nuclear receptor" in row['protein_class']:
        if row['standard_value'] > 100:
            active = 0
    if "membrane receptor" in row['protein_class']:
        if row['standard_value'] > 100:
            active = 0
    return active

df['active'] = df.apply(lambda row: set_active(row), axis=1)

# get targets with at least 100 different active molecules
acts = df[df['active'] == 1].groupby(['target_chembl_id']).agg('count')
acts = acts[acts['molregno'] >= 100].reset_index()['target_chembl_id']

# get targets with at least 100 different inactive molecules
inacts = df[df['active'] == 0].groupby(['target_chembl_id']).agg('count')
inacts = inacts[inacts['molregno'] >= 100].reset_index()['target_chembl_id']

# get targets mentioned in at least two docs
docs = df.drop_duplicates(subset=['doc_id', 'target_chembl_id'])
docs = docs.groupby(['target_chembl_id']).agg('count')
docs = docs[docs['doc_id'] >= 2.0].reset_index()['target_chembl_id']



t_keep = set(acts).intersection(set(inacts)).intersection(set(docs))

# get data for filtered targets
activities = df[df['target_chembl_id'].isin(t_keep)]


ion = pd.unique(activities[activities['protein_class'].str.contains("ion channel",  na=False)]['tid']).shape[0]
kin = pd.unique(activities[activities['protein_class'].str.contains("kinase",  na=False)]['tid']).shape[0]
nuc = pd.unique(activities[activities['protein_class'].str.contains("nuclear receptor",  na=False)]['tid']).shape[0]
gpcr = pd.unique(activities[activities['protein_class'].str.contains("membrane receptor", na=False)]['tid']).shape[0]
print('Number of unique targets: ', len(t_keep))
print('  Ion channel: ', ion)
print('  Kinase: ', kin)
print('  Nuclear receptor: ',  nuc)
print('  GPCR: ', gpcr)
print('  Others: ', len(t_keep) - ion - kin - nuc - gpcr)


# save it to a file
activities.to_csv('chembl_activity_data_filtered.csv', index=False)




def gen_dict(group):
    return {tid: act  for tid, act in zip(group['target_chembl_id'], group['active'])}

print('MULTI TASK DATA PREP')
group = activities.groupby('chembl_id')
temp = pd.DataFrame(group.apply(gen_dict))
mt_df = pd.DataFrame(temp[0].tolist())
mt_df['chembl_id'] = temp.index
mt_df = mt_df.where((pd.notnull(mt_df)), -1)




structs = activities[['chembl_id', 'canonical_smiles']].drop_duplicates(subset='chembl_id')

print('GET MOL')
# drop mols not sanitizing on rdkit
def molchecker(smi):
    mol = Chem.MolFromSmiles(smi)
    if mol is None:
        return None
    else:
        return 1

#structs['romol'] = structs.apply(lambda row: Chem.MolFromSmiles(row['canonical_smiles']), axis=1)
structs['romol'] = structs.apply(lambda row: molchecker(row['canonical_smiles']), axis=1)
structs = structs.dropna()
del structs['romol']

# add the structures to the final df
mt_df = pd.merge(structs, mt_df, how='inner', on='chembl_id')


# save to csv
mt_df.to_csv('chembl_multi_task_data.csv', index=False)




FP_SIZE = 1024 
RADIUS = 2

def calc_fp(smiles, fp_size, radius):
    """
    calcs morgan fingerprints as a numpy array.
    """
    mol = Chem.MolFromSmiles(smiles, sanitize=False)
    mol.UpdatePropertyCache(False)
    Chem.GetSSSR(mol)
    fp = rdMolDescriptors.GetMorganFingerprintAsBitVect(mol, radius, nBits=fp_size)
    a = np.zeros((0,), dtype=np.float32)
    Chem.DataStructs.ConvertToNumpyArray(fp, a)
    return a

# calc fps
print('CALC FP')
descs = [calc_fp(smi, FP_SIZE, RADIUS) for smi in mt_df['canonical_smiles'].values]
descs = np.asarray(descs, dtype=np.float32)

# put all training data in a pytables file
print('SAVE DATA')
with tb.open_file('mt_data.h5', mode='w') as t_file:

    # set compression filter. It will make the file much smaller
    filters = tb.Filters(complib='blosc', complevel=5)

    # save chembl_ids
    tatom = ObjectAtom()
    cids = t_file.create_vlarray(t_file.root, 'chembl_ids', atom=tatom)
    for cid in mt_df['chembl_id'].values:
        cids.append(cid)

    # save fps
    fatom = tb.Atom.from_dtype(descs.dtype)
    fps = t_file.create_carray(t_file.root, 'fps', fatom, descs.shape, filters=filters)
    fps[:] = descs

    del mt_df['chembl_id']
    del mt_df['canonical_smiles']

    # save target chembl ids
    tcids = t_file.create_vlarray(t_file.root, 'target_chembl_ids', atom=tatom)
    for tcid in mt_df.columns.values:
        tcids.append(tcid)

    # save labels
    labs = t_file.create_carray(t_file.root, 'labels', fatom, mt_df.values.shape, filters=filters)
    labs[:] = mt_df.values
    
    # save task weights
    # each task loss will be weighted inversely proportional to its number of data points
    weights = []
    for col in mt_df.columns.values:
        c = mt_df[mt_df[col] >= 0.0].shape[0]
        weights.append(1 / c)
    weights = np.array(weights)
    ws = t_file.create_carray(t_file.root, 'weights', fatom, weights.shape)
    ws[:] = weights




with tb.open_file('mt_data.h5', mode='r') as t_file:
    print(t_file.root.chembl_ids.shape)
    print(t_file.root.target_chembl_ids.shape)
    print(t_file.root.fps.shape)
    print(t_file.root.labels.shape)
    print(t_file.root.weights.shape)
    
    # save targets to a json file
    with open('targets.json', 'w') as f:
        json.dump(t_file.root.target_chembl_ids[:], f)

After making the dataset, I built the model with PyTorch. The code is below.

import numpy as np
import torch
from torch import nn
import torch.nn.functional as F
import torch.utils.data as D
import tables as tb
from sklearn.metrics import (matthews_corrcoef, 
                             confusion_matrix, 
                             f1_score, 
                             roc_auc_score,
                             accuracy_score)


# set the device to GPU if available
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
MAIN_PATH = '.'
DATA_FILE = 'mt_data.h5'
MODEL_FILE = 'chembl_mt.model'
N_WORKERS = 8 # DataLoader workers; prefetch data in parallel so batches are ready for the model during training
BATCH_SIZE = 32 # https://twitter.com/ylecun/status/989610208497360896?lang=es
LR = 2 # Learning rate. Big value because of the way we are weighting the targets
N_EPOCHS = 2 # You should train longer!!!

class ChEMBLDataset(D.Dataset):
    
    def __init__(self, file_path):
        self.file_path = file_path
        with tb.open_file(self.file_path, mode='r') as t_file:
            self.length = t_file.root.fps.shape[0]
            self.n_targets = t_file.root.labels.shape[1]
        
    def __len__(self):
        return self.length
    
    def __getitem__(self, index):
        with tb.open_file(self.file_path, mode='r') as t_file:
            structure = t_file.root.fps[index]
            labels = t_file.root.labels[index]
        return structure, labels


dataset = ChEMBLDataset(f"{MAIN_PATH}/{DATA_FILE}")
validation_split = .2
random_seed= 42

dataset_size = len(dataset)
indices = list(range(dataset_size))
split = int(np.floor(validation_split * dataset_size))

np.random.seed(random_seed)
np.random.shuffle(indices)
train_indices, test_indices = indices[split:], indices[:split]

train_sampler = D.sampler.SubsetRandomSampler(train_indices)
test_sampler = D.sampler.SubsetRandomSampler(test_indices)

# dataloaders can prefetch the next batch if using n workers while
# the model is training
train_loader = torch.utils.data.DataLoader(dataset,
                                           batch_size=BATCH_SIZE,
                                           num_workers=N_WORKERS,
                                           sampler=train_sampler)

test_loader = torch.utils.data.DataLoader(dataset, 
                                          batch_size=BATCH_SIZE,
                                          num_workers=N_WORKERS,
                                          sampler=test_sampler)


class ChEMBLMultiTask(nn.Module):
    """
    Architecture borrowed from: https://arxiv.org/abs/1502.02072
    """
    def __init__(self, n_tasks):
        super(ChEMBLMultiTask, self).__init__()
        self.n_tasks = n_tasks
        self.fc1 = nn.Linear(1024, 2000)
        self.fc2 = nn.Linear(2000, 100)
        self.dropout = nn.Dropout(0.25)

        # add an independent output for each task in the output layer
        for n_m in range(self.n_tasks):
            self.add_module(f"y{n_m}o", nn.Linear(100, 1))
        
    def forward(self, x):
        h1 = self.dropout(F.relu(self.fc1(x)))
        h2 = F.relu(self.fc2(h1))
        out = [torch.sigmoid(getattr(self, f"y{n_m}o")(h2)) for n_m in range(self.n_tasks)]
        return out
    
# create the model, to GPU if available
model = ChEMBLMultiTask(dataset.n_targets).to(device)

# binary cross entropy
# each task loss is weighted inversely proportional to its number of datapoints, borrowed from:
# http://www.bioinf.at/publications/2014/NIPS2014a.pdf
with tb.open_file(f"{MAIN_PATH}/{DATA_FILE}", mode='r') as t_file:
    weights = torch.tensor(t_file.root.weights[:])
    weights = weights.to(device)

criterion = [nn.BCELoss(weight=w) for x, w in zip(range(dataset.n_targets), weights.float())]

# stochastic gradient descent as an optimiser
optimizer = torch.optim.SGD(model.parameters(), LR)

# model is by default in train mode. Training can be resumed after .eval() but needs to be set to .train() again
model.train()
for ep in range(N_EPOCHS):
    for i, (fps, labels) in enumerate(train_loader):
        # move it to GPU if available
        fps, labels = fps.to(device), labels.to(device)

        optimizer.zero_grad()
        outputs = model(fps)
        
        # calc the loss
        loss = torch.tensor(0.0).to(device)
        for j, crit in enumerate(criterion):
            # mask keeping labeled molecules for each task
            mask = labels[:, j] >= 0.0
            if len(labels[:, j][mask]) != 0:
                # the loss is the sum of each task/target loss.
                # there are labeled samples for this task, so we add its loss
                loss += crit(outputs[j][mask], labels[:, j][mask].view(-1, 1))

        loss.backward()
        optimizer.step()

        if (i+1) % 500 == 0:
            print(f"Epoch: [{ep+1}/{N_EPOCHS}], Step: [{i+1}/{len(train_indices)//BATCH_SIZE}], Loss: {loss.item()}")
    
y_trues = []
y_preds = []
y_preds_proba = []

# do not track history
with torch.no_grad():
    for fps, labels in test_loader:
        # move it to GPU if available
        fps, labels = fps.to(device), labels.to(device)
        # set model to eval, so will not use the dropout layer
        model.eval()
        outputs = model(fps)
        for j, out in enumerate(outputs):
            mask = labels[:, j] >= 0.0
            mask = mask.to(device)
            y_pred = torch.where(out[mask].to(device) > 0.5, torch.ones(1).to(device), torch.zeros(1).to(device)).view(1, -1)

            if y_pred.shape[1] > 0:
                for l in labels[:, j][mask].long().tolist():
                    y_trues.append(l)
                for p in y_pred.view(-1, 1).tolist():
                    y_preds.append(int(p[0]))
                for p in out[mask].view(-1, 1).tolist():
                    y_preds_proba.append(float(p[0]))

tn, fp, fn, tp = confusion_matrix(y_trues, y_preds).ravel()
sens = tp / (tp + fn)
spec = tn / (tn + fp)
prec = tp / (tp + fp)
f1 = f1_score(y_trues, y_preds)
acc = accuracy_score(y_trues, y_preds)
mcc = matthews_corrcoef(y_trues, y_preds)
auc = roc_auc_score(y_trues, y_preds_proba)

print(f"accuracy: {acc}, auc: {auc}, sens: {sens}, spec: {spec}, prec: {prec}, mcc: {mcc}, f1: {f1}")
print(f"Not bad for only {N_EPOCHS} epochs!")

torch.save(model.state_dict(), f"./{MODEL_FILE}")

Because I ran the code on my personal environment, I set the number of epochs to 2, but it's better to train for more epochs in production.

After the training, I converted the model to ONNX format. The code is…

import numpy as np
import torch
from torch import nn
import torch.nn.functional as F
import torch.utils.data as D
import tables as tb
from sklearn.metrics import (matthews_corrcoef, 
                             confusion_matrix, 
                             f1_score, 
                             roc_auc_score,
                             accuracy_score)
from torch import onnx

# set the device to GPU if available
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
MAIN_PATH = '.'
DATA_FILE = 'mt_data.h5'
MODEL_FILE = 'chembl_mt.model'
N_WORKERS = 8 # DataLoader workers; prefetch data in parallel so batches are ready for the model during training
BATCH_SIZE = 32 # https://twitter.com/ylecun/status/989610208497360896?lang=es
LR = 2 # Learning rate. Big value because of the way we are weighting the targets
N_EPOCHS = 2 # You should train longer!!!

class ChEMBLDataset(D.Dataset):
    
    def __init__(self, file_path):
        self.file_path = file_path
        with tb.open_file(self.file_path, mode='r') as t_file:
            self.length = t_file.root.fps.shape[0]
            self.n_targets = t_file.root.labels.shape[1]
        
    def __len__(self):
        return self.length
    
    def __getitem__(self, index):
        with tb.open_file(self.file_path, mode='r') as t_file:
            structure = t_file.root.fps[index]
            labels = t_file.root.labels[index]
        return structure, labels


class ChEMBLMultiTask(nn.Module):
    """
    Architecture borrowed from: https://arxiv.org/abs/1502.02072
    """
    def __init__(self, n_tasks):
        super(ChEMBLMultiTask, self).__init__()
        self.n_tasks = n_tasks
        self.fc1 = nn.Linear(1024, 2000)
        self.fc2 = nn.Linear(2000, 100)
        self.dropout = nn.Dropout(0.25)

        # add an independent output for each task in the output layer
        for n_m in range(self.n_tasks):
            self.add_module(f"y{n_m}o", nn.Linear(100, 1))
        
    def forward(self, x):
        h1 = self.dropout(F.relu(self.fc1(x)))
        h2 = F.relu(self.fc2(h1))
        out = [torch.sigmoid(getattr(self, f"y{n_m}o")(h2)) for n_m in range(self.n_tasks)]
        return out

dataset = ChEMBLDataset(f"{MAIN_PATH}/{DATA_FILE}")

# re-create the architecture and load the trained weights before exporting;
# otherwise the ONNX file would contain a randomly initialized model
model = ChEMBLMultiTask(dataset.n_targets).to(device)
model.load_state_dict(torch.load(f"{MAIN_PATH}/{MODEL_FILE}", map_location=device))
model.eval()

path = './model_onnx.onnx'
dummy = torch.tensor([[0.5 for _ in range(1024)]],
                     dtype=torch.float32).to(device)
onnx.export(model, dummy, path, input_names=['input_1'], output_names=['output'])
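
Before moving the file into KNIME, it's handy to sanity-check the exported model. Below is a minimal sketch (my addition, not part of the original post) that assumes onnxruntime is installed; it just feeds the same dummy fingerprint through the exported graph.

import numpy as np
import onnxruntime as ort

# load the exported model and run one dummy 1024-bit fingerprint through it
sess = ort.InferenceSession('./model_onnx.onnx')
dummy_fp = np.full((1, 1024), 0.5, dtype=np.float32)
# the graph has one sigmoid output per task
outs = sess.run(None, {'input_1': dummy_fp})
print(len(outs), outs[0])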

Now I had an ONNX model for ChEMBL target prediction, so I tried to run the workflow with the model. But it didn't work. So I tried to use a TensorFlow model directly in the KNIME workflow.

At first, I converted the ONNX model to a TensorFlow model with onnx-tf.

$ onnx-tf convert -i model_onnx.onnx -o model_tf
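
To quickly check the converted SavedModel from Python before wiring it into KNIME, a sketch like the one below can be used. This is my illustration, not part of the original workflow; it assumes onnx-tf wrote a SavedModel with a serving_default signature and kept the input_1 name from the ONNX export.

import numpy as np
import tensorflow as tf

# load the SavedModel written by onnx-tf and run a dummy fingerprint
loaded = tf.saved_model.load('model_tf')
infer = loaded.signatures['serving_default']
dummy_fp = tf.constant(np.full((1, 1024), 0.5, dtype=np.float32))
preds = infer(input_1=dummy_fp)
print(list(preds.keys())[:5])  # one sigmoid output per task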

Now it's ready to run the prediction ;)

I built the workflow (most of it came from Greg's great work!).

By running the WF, I could get a heat map of the target predictions.
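
For readers without KNIME, the heat-map view boils down to plotting a molecules-by-targets matrix of predicted probabilities. The sketch below is illustrative only (placeholder predictions, assumes matplotlib is installed); in the real WF this view is produced by KNIME nodes.

import json
import numpy as np
import matplotlib.pyplot as plt

# target ids saved by the data prep script
with open('targets.json') as f:
    target_ids = json.load(f)

# placeholder predictions: rows = query molecules, cols = targets
preds = np.random.rand(3, len(target_ids))
plt.imshow(preds, aspect='auto', cmap='viridis')
plt.xlabel('target index')
plt.ylabel('query molecule')
plt.colorbar(label='predicted probability')
plt.show()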

The deep learning environment for KNIME is shown below.


# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                       2_gnu    conda-forge
abseil-cpp                20210324.2           h9c3ff4c_0    conda-forge
absl-py                   1.4.0              pyhd8ed1ab_0    conda-forge
aiohttp                   3.8.3            py37h540881e_0    conda-forge
aiosignal                 1.3.1              pyhd8ed1ab_0    conda-forge
astunparse                1.6.3              pyhd8ed1ab_0    conda-forge
async-timeout             4.0.2              pyhd8ed1ab_0    conda-forge
asynctest                 0.13.0                     py_0    conda-forge
attrs                     22.2.0             pyh71513ae_0    conda-forge
blinker                   1.5                pyhd8ed1ab_0    conda-forge
brotlipy                  0.7.0           py37h540881e_1004    conda-forge
c-ares                    1.18.1               h7f98852_0    conda-forge
ca-certificates           2022.12.7            ha878542_0    conda-forge
cached-property           1.5.2                hd8ed1ab_1    conda-forge
cached_property           1.5.2              pyha770c72_1    conda-forge
cachetools                5.3.0              pyhd8ed1ab_0    conda-forge
certifi                   2022.12.7          pyhd8ed1ab_0    conda-forge
cffi                      1.15.1           py37h43b0acd_1    conda-forge
charset-normalizer        2.1.1              pyhd8ed1ab_0    conda-forge
click                     8.1.3            py37h89c1867_0    conda-forge
conda                     4.12.0           py37h89c1867_0    conda-forge
conda-package-handling    2.0.2              pyh38be061_0    conda-forge
conda-package-streaming   0.7.0              pyhd8ed1ab_1    conda-forge
cryptography              38.0.2           py37h38fbfac_1    conda-forge
cudatoolkit               11.8.0              h37601d7_11    conda-forge
cudnn                     8.4.1.50             hed8a83a_0    conda-forge
frozenlist                1.3.1            py37h540881e_0    conda-forge
gast                      0.5.3              pyhd8ed1ab_0    conda-forge
giflib                    5.2.1                h0b41bf4_3    conda-forge
google-auth               2.16.2             pyh1a96a4e_0    conda-forge
google-auth-oauthlib      0.4.6              pyhd8ed1ab_0    conda-forge
google-pasta              0.2.0              pyh8c360ce_0    conda-forge
grpc-cpp                  1.43.2               h9e046d8_3    conda-forge
grpcio                    1.43.0           py37hb27c1af_0    conda-forge
h5py                      3.7.0           nompi_py37hf1ce037_101    conda-forge
hdf5                      1.12.2          nompi_h2386368_101    conda-forge
icu                       70.1                 h27087fc_0    conda-forge
idna                      3.4                pyhd8ed1ab_0    conda-forge
importlib-metadata        4.11.4           py37h89c1867_0    conda-forge
jpeg                      9e                   h0b41bf4_3    conda-forge
keras                     2.8.0              pyhd8ed1ab_0    conda-forge
keras-preprocessing       1.1.2              pyhd8ed1ab_0    conda-forge
keyutils                  1.6.1                h166bdaf_0    conda-forge
krb5                      1.20.1               hf9c8cef_0    conda-forge
ld_impl_linux-64          2.40                 h41732ed_0    conda-forge
libaec                    1.0.6                hcb278e6_1    conda-forge
libblas                   3.9.0           16_linux64_openblas    conda-forge
libcblas                  3.9.0           16_linux64_openblas    conda-forge
libcurl                   7.87.0               h6312ad2_0    conda-forge
libedit                   3.1.20191231         he28a2e2_2    conda-forge
libev                     4.33                 h516909a_1    conda-forge
libffi                    3.4.2                h7f98852_5    conda-forge
libgcc-ng                 12.2.0              h65d4601_19    conda-forge
libgfortran-ng            12.2.0              h69a702a_19    conda-forge
libgfortran5              12.2.0              h337968e_19    conda-forge
libgomp                   12.2.0              h65d4601_19    conda-forge
liblapack                 3.9.0           16_linux64_openblas    conda-forge
libnghttp2                1.51.0               hdcd2b5c_0    conda-forge
libnsl                    2.0.0                h7f98852_0    conda-forge
libopenblas               0.3.21          pthreads_h78a6416_3    conda-forge
libpng                    1.6.39               h753d276_0    conda-forge
libprotobuf               3.19.4               h780b84a_0    conda-forge
libsolv                   0.7.23               h3eb15da_0    conda-forge
libsqlite                 3.40.0               h753d276_0    conda-forge
libssh2                   1.10.0               haa6b8db_3    conda-forge
libstdcxx-ng              12.2.0              h46fd767_19    conda-forge
libzlib                   1.2.13               h166bdaf_4    conda-forge
mamba                     0.1.2            py37h99015e2_0    conda-forge
markdown                  3.4.1              pyhd8ed1ab_0    conda-forge
markupsafe                2.1.1            py37h540881e_1    conda-forge
multidict                 6.0.2            py37h540881e_1    conda-forge
nccl                      2.14.3.1             h0800d71_0    conda-forge
ncurses                   6.3                  h27087fc_1    conda-forge
numpy                     1.21.6           py37h976b520_0    conda-forge
oauthlib                  3.2.2              pyhd8ed1ab_0    conda-forge
onnx                      1.13.1                   pypi_0    pypi
onnx-tf                   1.10.0                   pypi_0    pypi
openssl                   1.1.1t               h0b41bf4_0    conda-forge
opt_einsum                3.3.0              pyhd8ed1ab_1    conda-forge
packaging                 23.0                     pypi_0    pypi
pip                       23.0.1             pyhd8ed1ab_0    conda-forge
protobuf                  3.20.3                   pypi_0    pypi
pyasn1                    0.4.8                      py_0    conda-forge
pyasn1-modules            0.2.7                      py_0    conda-forge
pycosat                   0.6.4            py37h540881e_0    conda-forge
pycparser                 2.21               pyhd8ed1ab_0    conda-forge
pyjwt                     2.6.0              pyhd8ed1ab_0    conda-forge
pyopenssl                 23.0.0             pyhd8ed1ab_0    conda-forge
pysocks                   1.7.1            py37h89c1867_5    conda-forge
python                    3.7.12          hb7a2778_100_cpython    conda-forge
python-flatbuffers        23.1.21            pyhd8ed1ab_0    conda-forge
python_abi                3.7                     3_cp37m    conda-forge
pyu2f                     0.1.5              pyhd8ed1ab_0    conda-forge
pyyaml                    6.0                      pypi_0    pypi
re2                       2022.02.01           h9c3ff4c_0    conda-forge
readline                  8.1.2                h0f457ee_0    conda-forge
requests                  2.28.2             pyhd8ed1ab_0    conda-forge
requests-oauthlib         1.3.1              pyhd8ed1ab_0    conda-forge
rsa                       4.9                pyhd8ed1ab_0    conda-forge
ruamel_yaml               0.15.80         py37h540881e_1007    conda-forge
scipy                     1.7.3            py37hf2a6cf1_0    conda-forge
setuptools                67.6.0             pyhd8ed1ab_0    conda-forge
six                       1.16.0             pyh6c4a22f_0    conda-forge
snappy                    1.1.10               h9fff704_0    conda-forge
sqlite                    3.40.0               h4ff8645_0    conda-forge
tensorboard               2.8.0              pyhd8ed1ab_1    conda-forge
tensorboard-data-server   0.6.0            py37h38fbfac_2    conda-forge
tensorboard-plugin-wit    1.8.1              pyhd8ed1ab_0    conda-forge
tensorflow                2.8.0           cuda112py37h01c6645_0    conda-forge
tensorflow-addons         0.19.0                   pypi_0    pypi
tensorflow-base           2.8.0           cuda112py37hd7e45b3_0    conda-forge
tensorflow-estimator      2.8.0           cuda112py37h25bb9bc_0    conda-forge
tensorflow-gpu            2.8.0           cuda112py37h0bbbad9_0    conda-forge
termcolor                 2.2.0              pyhd8ed1ab_0    conda-forge
tk                        8.6.12               h27826a3_0    conda-forge
typeguard                 3.0.1                    pypi_0    pypi
typing-extensions         4.5.0                hd8ed1ab_0    conda-forge
typing_extensions         4.5.0              pyha770c72_0    conda-forge
urllib3                   1.26.15            pyhd8ed1ab_0    conda-forge
werkzeug                  2.2.3              pyhd8ed1ab_0    conda-forge
wheel                     0.40.0             pyhd8ed1ab_0    conda-forge
wrapt                     1.14.1           py37h540881e_0    conda-forge
xz                        5.2.6                h166bdaf_0    conda-forge
yaml                      0.2.5                h7f98852_2    conda-forge
yarl                      1.7.2            py37h540881e_2    conda-forge
zipp                      3.15.0             pyhd8ed1ab_0    conda-forge
zlib                      1.2.13               h166bdaf_4    conda-forge
zstandard                 0.18.0           py37h540881e_0    conda-forge

And I uploaded today's code to GitHub and the KNIME Hub.

https://github.com/iwatobipen/chembl_targetprediction

https://hub.knime.com/-/spaces/-/latest/~keV0drWZt67jVOx3/

You can modify the code and the KNIME workflow if you would like to. Let's enjoy chemoinformatics!
