I attended my kid’s elementary school graduation ceremony today. It was a nice ceremony. …Time passes so fast ;) This is a lovely season in Japan.
Beautiful cherry blossoms are starting to bloom. It’s a good time to start anything ;-)
I would like to say thank you to my kid.
BTW, ChEMBL ver. 32 was recently released. As you know, the ChEMBL DB has lots of biological activity data for small molecules, so it’s a really useful source for multi-target predictive models.
Fortunately, Greg Landrum posted a nice blog about making predictions with KNIME. The URL is below.
https://www.knime.com/blog/interactive-bioactivity-prediction-with-multitask-neural-networks
He uses ONNX to convert the PyTorch predictive model to a TensorFlow model because current KNIME doesn’t support PyTorch. I was interested in the approach and wanted to run the workflow with ChEMBL 32.
To do that, I needed to build the model with ChEMBL 32. The original code required some modification because the schema has changed.
After some trial and error, I could build the workflow, so I would like to share my experience.
At first, I made the dataset. The code is below. I got the data from the ChEMBL FTP site. The original code came from Eloy’s work.
http://chembl.blogspot.com/2019/05/multi-task-neural-network-on-chembl.html
#!/usr/bin/env python
# coding: utf-8
from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Chem import DataStructs
from rdkit.Chem import rdMolDescriptors
from sqlalchemy import create_engine
from sqlalchemy.sql import text
import pandas as pd
import numpy as np
import tables as tb
import json
from tables.atom import ObjectAtom
engine = create_engine('sqlite:///chembl_32/chembl_32_sqlite/chembl_32.db')
qtext = """
SELECT
    activities.doc_id AS doc_id,
    activities.standard_value AS standard_value,
    molecule_hierarchy.parent_molregno AS molregno,
    compound_structures.canonical_smiles AS canonical_smiles,
    molecule_dictionary.chembl_id AS chembl_id,
    target_dictionary.tid AS tid,
    target_dictionary.chembl_id AS target_chembl_id,
    protein_classification.pref_name AS pref_name,
    protein_classification.short_name AS short_name,
    protein_classification.PROTEIN_CLASS_DESC AS protein_class,
    protein_classification.class_level AS class_level
FROM activities
    JOIN assays ON activities.assay_id = assays.assay_id
    JOIN target_dictionary ON assays.tid = target_dictionary.tid
    JOIN target_components ON target_dictionary.tid = target_components.tid
    JOIN component_class ON target_components.component_id = component_class.component_id
    JOIN protein_classification ON component_class.protein_class_id = protein_classification.protein_class_id
    JOIN molecule_dictionary ON activities.molregno = molecule_dictionary.molregno
    JOIN molecule_hierarchy ON molecule_dictionary.molregno = molecule_hierarchy.molregno
    JOIN compound_structures ON molecule_hierarchy.parent_molregno = compound_structures.molregno
WHERE
    activities.standard_units = 'nM' AND
    activities.standard_type IN ('EC50', 'IC50', 'Ki', 'Kd', 'XC50', 'AC50', 'Potency') AND
    activities.data_validity_comment IS NULL AND
    activities.standard_relation IN ('=', '<') AND
    activities.potential_duplicate = 0 AND
    assays.confidence_score >= 8 AND
    target_dictionary.target_type = 'SINGLE PROTEIN'"""
with engine.begin() as conn:
    res = conn.execute(text(qtext))
    df = pd.DataFrame(res.fetchall())

df.columns = res.keys()
df = df.where((pd.notnull(df)), None)

cls_list = df["protein_class"].to_list()
uniq = list(set(cls_list))

df = df.sort_values(by=['standard_value', 'molregno', 'tid'], ascending=True)
df = df.drop_duplicates(subset=['molregno', 'tid'], keep='first')
df.to_csv('chembl_activity_data.csv', index=False)
def set_active(row):
    active = 0
    if row['standard_value'] <= 1000:
        active = 1
    if "ion channel" in row['protein_class']:
        if row['standard_value'] <= 10000:
            active = 1
    if "kinase" in row['protein_class']:
        if row['standard_value'] > 30:
            active = 0
    if "nuclear receptor" in row['protein_class']:
        if row['standard_value'] > 100:
            active = 0
    if "membrane receptor" in row['protein_class']:
        if row['standard_value'] > 100:
            active = 0
    return active
df['active'] = df.apply(lambda row: set_active(row), axis=1)
# get targets with at least 100 different active molecules
acts = df[df['active'] == 1].groupby(['target_chembl_id']).agg('count')
acts = acts[acts['molregno'] >= 100].reset_index()['target_chembl_id']
# get targets with at least 100 different inactive molecules
inacts = df[df['active'] == 0].groupby(['target_chembl_id']).agg('count')
inacts = inacts[inacts['molregno'] >= 100].reset_index()['target_chembl_id']
# get targets mentioned in at least two docs
docs = df.drop_duplicates(subset=['doc_id', 'target_chembl_id'])
docs = docs.groupby(['target_chembl_id']).agg('count')
docs = docs[docs['doc_id'] >= 2.0].reset_index()['target_chembl_id']
t_keep = set(acts).intersection(set(inacts)).intersection(set(docs))
# get data for filtered targets
activities = df[df['target_chembl_id'].isin(t_keep)]
ion = pd.unique(activities[activities['protein_class'].str.contains("ion channel", na=False)]['tid']).shape[0]
kin = pd.unique(activities[activities['protein_class'].str.contains("kinase", na=False)]['tid']).shape[0]
nuc = pd.unique(activities[activities['protein_class'].str.contains("nuclear receptor", na=False)]['tid']).shape[0]
gpcr = pd.unique(activities[activities['protein_class'].str.contains("membrane receptor", na=False)]['tid']).shape[0]
print('Number of unique targets: ', len(t_keep))
print(' Ion channel: ', ion)
print(' Kinase: ', kin)
print(' Nuclear receptor: ', nuc)
print(' GPCR: ', gpcr)
print(' Others: ', len(t_keep) - ion - kin - nuc - gpcr)
# save it to a file
activities.to_csv('chembl_activity_data_filtered.csv', index=False)
def gen_dict(group):
    return {tid: act for tid, act in zip(group['target_chembl_id'], group['active'])}
print('MULTI TASK DATA PREP')
group = activities.groupby('chembl_id')
temp = pd.DataFrame(group.apply(gen_dict))
mt_df = pd.DataFrame(temp[0].tolist())
mt_df['chembl_id'] = temp.index
mt_df = mt_df.where((pd.notnull(mt_df)), -1)
structs = activities[['chembl_id', 'canonical_smiles']].drop_duplicates(subset='chembl_id')
print('GET MOL')
# drop mols that can't be sanitized by rdkit
def molchecker(smi):
    mol = Chem.MolFromSmiles(smi)
    if mol is None:
        return None
    else:
        return 1

#structs['romol'] = structs.apply(lambda row: Chem.MolFromSmiles(row['canonical_smiles']), axis=1)
structs['romol'] = structs.apply(lambda row: molchecker(row['canonical_smiles']), axis=1)
structs = structs.dropna()
del structs['romol']
# add the structures to the final df
mt_df = pd.merge(structs, mt_df, how='inner', on='chembl_id')
# save to csv
mt_df.to_csv('chembl_multi_task_data.csv', index=False)
FP_SIZE = 1024
RADIUS = 2
def calc_fp(smiles, fp_size, radius):
    """
    calcs morgan fingerprints as a numpy array.
    """
    mol = Chem.MolFromSmiles(smiles, sanitize=False)
    mol.UpdatePropertyCache(False)
    Chem.GetSSSR(mol)
    fp = rdMolDescriptors.GetMorganFingerprintAsBitVect(mol, radius, nBits=fp_size)
    a = np.zeros((0,), dtype=np.float32)
    DataStructs.ConvertToNumpyArray(fp, a)
    return a
# calc fps
print('CALC FP')
descs = [calc_fp(smi, FP_SIZE, RADIUS) for smi in mt_df['canonical_smiles'].values]
descs = np.asarray(descs, dtype=np.float32)
# put all training data in a pytables file
print('SAVE DATA')
with tb.open_file('mt_data.h5', mode='w') as t_file:
    # set compression filter. It will make the file much smaller
    filters = tb.Filters(complib='blosc', complevel=5)
    # save chembl_ids
    tatom = ObjectAtom()
    cids = t_file.create_vlarray(t_file.root, 'chembl_ids', atom=tatom)
    for cid in mt_df['chembl_id'].values:
        cids.append(cid)
    # save fps
    fatom = tb.Atom.from_dtype(descs.dtype)
    fps = t_file.create_carray(t_file.root, 'fps', fatom, descs.shape, filters=filters)
    fps[:] = descs
    del mt_df['chembl_id']
    del mt_df['canonical_smiles']
    # save target chembl ids
    tcids = t_file.create_vlarray(t_file.root, 'target_chembl_ids', atom=tatom)
    for tcid in mt_df.columns.values:
        tcids.append(tcid)
    # save labels
    labs = t_file.create_carray(t_file.root, 'labels', fatom, mt_df.values.shape, filters=filters)
    labs[:] = mt_df.values
    # save task weights
    # each task loss will be weighted inversely proportional to its number of data points
    weights = []
    for col in mt_df.columns.values:
        c = mt_df[mt_df[col] >= 0.0].shape[0]
        weights.append(1 / c)
    weights = np.array(weights)
    ws = t_file.create_carray(t_file.root, 'weights', fatom, weights.shape)
    ws[:] = weights
with tb.open_file('mt_data.h5', mode='r') as t_file:
    print(t_file.root.chembl_ids.shape)
    print(t_file.root.target_chembl_ids.shape)
    print(t_file.root.fps.shape)
    print(t_file.root.labels.shape)
    print(t_file.root.weights.shape)

    # save targets to a json file
    with open('targets.json', 'w') as f:
        json.dump(t_file.root.target_chembl_ids[:], f)
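The order of target_chembl_ids defines the column order of the label matrix, and therefore the output order of the final model, so targets.json is what later maps each output back to a target. A minimal sanity check of the generated files looks like this:

import json
import tables as tb

with open('targets.json') as f:
    targets = json.load(f)

with tb.open_file('mt_data.h5', mode='r') as t_file:
    # the number of targets must match the label matrix width
    assert len(targets) == t_file.root.labels.shape[1]
print(f"{len(targets)} tasks")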
After making the dataset, I built the model with PyTorch. The code is below.
import numpy as np
import torch
from torch import nn
import torch.nn.functional as F
import torch.utils.data as D
import tables as tb
from sklearn.metrics import (matthews_corrcoef,
                             confusion_matrix,
                             f1_score,
                             roc_auc_score,
                             accuracy_score)
# set the device to GPU if available
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
MAIN_PATH = '.'
DATA_FILE = 'mt_data.h5'
MODEL_FILE = 'chembl_mt.model'
N_WORKERS = 8 # Dataloader workers, prefetch data in parallel to have it ready for the model after each batch train
BATCH_SIZE = 32 # https://twitter.com/ylecun/status/989610208497360896?lang=es
LR = 2 # Learning rate. Big value because of the way we are weighting the targets
N_EPOCHS = 2 # You should train longer!!!
class ChEMBLDataset(D.Dataset):
    def __init__(self, file_path):
        self.file_path = file_path
        with tb.open_file(self.file_path, mode='r') as t_file:
            self.length = t_file.root.fps.shape[0]
            self.n_targets = t_file.root.labels.shape[1]

    def __len__(self):
        return self.length

    def __getitem__(self, index):
        with tb.open_file(self.file_path, mode='r') as t_file:
            structure = t_file.root.fps[index]
            labels = t_file.root.labels[index]
        return structure, labels
dataset = ChEMBLDataset(f"{MAIN_PATH}/{DATA_FILE}")
validation_split = .2
random_seed= 42
dataset_size = len(dataset)
indices = list(range(dataset_size))
split = int(np.floor(validation_split * dataset_size))
np.random.seed(random_seed)
np.random.shuffle(indices)
train_indices, test_indices = indices[split:], indices[:split]
train_sampler = D.sampler.SubsetRandomSampler(train_indices)
test_sampler = D.sampler.SubsetRandomSampler(test_indices)
# dataloaders can prefetch the next batch if using n workers while
# the model is training
train_loader = torch.utils.data.DataLoader(dataset,
                                           batch_size=BATCH_SIZE,
                                           num_workers=N_WORKERS,
                                           sampler=train_sampler)
test_loader = torch.utils.data.DataLoader(dataset,
                                          batch_size=BATCH_SIZE,
                                          num_workers=N_WORKERS,
                                          sampler=test_sampler)
class ChEMBLMultiTask(nn.Module):
    """
    Architecture borrowed from: https://arxiv.org/abs/1502.02072
    """
    def __init__(self, n_tasks):
        super(ChEMBLMultiTask, self).__init__()
        self.n_tasks = n_tasks
        self.fc1 = nn.Linear(1024, 2000)
        self.fc2 = nn.Linear(2000, 100)
        self.dropout = nn.Dropout(0.25)
        # add an independent output for each task in the output layer
        for n_m in range(self.n_tasks):
            self.add_module(f"y{n_m}o", nn.Linear(100, 1))

    def forward(self, x):
        h1 = self.dropout(F.relu(self.fc1(x)))
        h2 = F.relu(self.fc2(h1))
        out = [torch.sigmoid(getattr(self, f"y{n_m}o")(h2)) for n_m in range(self.n_tasks)]
        return out
# create the model, to GPU if available
model = ChEMBLMultiTask(dataset.n_targets).to(device)
# binary cross entropy
# each task loss is weighted inversely proportional to its number of datapoints, borrowed from:
# http://www.bioinf.at/publications/2014/NIPS2014a.pdf
with tb.open_file(f"{MAIN_PATH}/{DATA_FILE}", mode='r') as t_file:
    weights = torch.tensor(t_file.root.weights[:])
    weights = weights.to(device)

criterion = [nn.BCELoss(weight=w) for x, w in zip(range(dataset.n_targets), weights.float())]
# stochastic gradient descent as an optimiser
optimizer = torch.optim.SGD(model.parameters(), LR)
# model is by default in train mode. Training can be resumed after .eval() but needs to be set to .train() again
model.train()
for ep in range(N_EPOCHS):
    for i, (fps, labels) in enumerate(train_loader):
        # move it to GPU if available
        fps, labels = fps.to(device), labels.to(device)
        optimizer.zero_grad()
        outputs = model(fps)
        # calc the loss
        loss = torch.tensor(0.0).to(device)
        for j, crit in enumerate(criterion):
            # mask keeping labeled molecules for each task
            mask = labels[:, j] >= 0.0
            if len(labels[:, j][mask]) != 0:
                # the loss is the sum of each task/target loss.
                # there are labeled samples for this task, so we add its loss
                loss += crit(outputs[j][mask], labels[:, j][mask].view(-1, 1))
        loss.backward()
        optimizer.step()
        if (i + 1) % 500 == 0:
            print(f"Epoch: [{ep+1}/{N_EPOCHS}], Step: [{i+1}/{len(train_indices)//BATCH_SIZE}], Loss: {loss.item()}")
y_trues = []
y_preds = []
y_preds_proba = []

# do not track history
with torch.no_grad():
    for fps, labels in test_loader:
        # move it to GPU if available
        fps, labels = fps.to(device), labels.to(device)
        # set model to eval, so it will not use the dropout layer
        model.eval()
        outputs = model(fps)
        for j, out in enumerate(outputs):
            mask = labels[:, j] >= 0.0
            mask = mask.to(device)
            y_pred = torch.where(out[mask].to(device) > 0.5,
                                 torch.ones(1).to(device),
                                 torch.zeros(1).to(device)).view(1, -1)
            if y_pred.shape[1] > 0:
                for l in labels[:, j][mask].long().tolist():
                    y_trues.append(l)
                for p in y_pred.view(-1, 1).tolist():
                    y_preds.append(int(p[0]))
                for p in out[mask].view(-1, 1).tolist():
                    y_preds_proba.append(float(p[0]))

tn, fp, fn, tp = confusion_matrix(y_trues, y_preds).ravel()
sens = tp / (tp + fn)
spec = tn / (tn + fp)
prec = tp / (tp + fp)
f1 = f1_score(y_trues, y_preds)
acc = accuracy_score(y_trues, y_preds)
mcc = matthews_corrcoef(y_trues, y_preds)
auc = roc_auc_score(y_trues, y_preds_proba)
print(f"accuracy: {acc}, auc: {auc}, sens: {sens}, spec: {spec}, prec: {prec}, mcc: {mcc}, f1: {f1}")
print(f"Not bad for only {N_EPOCHS} epochs!")

torch.save(model.state_dict(), f"./{MODEL_FILE}")
Because I ran the code in my personal environment, I set the number of epochs to 2, but it is better to train for more epochs in production.
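Before converting the model, it can also help to sanity-check it on a single molecule. Below is a minimal sketch, assuming ChEMBLMultiTask, calc_fp, and the targets list loaded from targets.json (see the check above) are in scope; the SMILES is just an arbitrary example (aspirin).

import torch

# rebuild the network and load the trained weights
model = ChEMBLMultiTask(len(targets))
model.load_state_dict(torch.load('chembl_mt.model', map_location='cpu'))
model.eval()

fp = calc_fp('CC(=O)Oc1ccccc1C(=O)O', 1024, 2)
with torch.no_grad():
    probs = model(torch.from_numpy(fp).view(1, -1))

# probs is a list with one (1, 1) tensor per task; pair them with the target ids
scored = sorted(zip(targets, (float(p) for p in probs)), key=lambda x: x[1], reverse=True)
print(scored[:5])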
After the training, I converted the model to the ONNX format. The code is…
import numpy as np
import torch
from torch import nn
import torch.nn.functional as F
import torch.utils.data as D
import tables as tb
from sklearn.metrics import (matthews_corrcoef,
                             confusion_matrix,
                             f1_score,
                             roc_auc_score,
                             accuracy_score)
from torch import onnx
# set the device to GPU if available
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
MAIN_PATH = '.'
DATA_FILE = 'mt_data.h5'
MODEL_FILE = 'chembl_mt.model'
N_WORKERS = 8 # Dataloader workers, prefetch data in parallel to have it ready for the model after each batch train
BATCH_SIZE = 32 # https://twitter.com/ylecun/status/989610208497360896?lang=es
LR = 2 # Learning rate. Big value because of the way we are weighting the targets
N_EPOCHS = 2 # You should train longer!!!
class ChEMBLDataset(D.Dataset):
    def __init__(self, file_path):
        self.file_path = file_path
        with tb.open_file(self.file_path, mode='r') as t_file:
            self.length = t_file.root.fps.shape[0]
            self.n_targets = t_file.root.labels.shape[1]

    def __len__(self):
        return self.length

    def __getitem__(self, index):
        with tb.open_file(self.file_path, mode='r') as t_file:
            structure = t_file.root.fps[index]
            labels = t_file.root.labels[index]
        return structure, labels
class ChEMBLMultiTask(nn.Module):
    """
    Architecture borrowed from: https://arxiv.org/abs/1502.02072
    """
    def __init__(self, n_tasks):
        super(ChEMBLMultiTask, self).__init__()
        self.n_tasks = n_tasks
        self.fc1 = nn.Linear(1024, 2000)
        self.fc2 = nn.Linear(2000, 100)
        self.dropout = nn.Dropout(0.25)
        # add an independent output for each task in the output layer
        for n_m in range(self.n_tasks):
            self.add_module(f"y{n_m}o", nn.Linear(100, 1))

    def forward(self, x):
        h1 = self.dropout(F.relu(self.fc1(x)))
        h2 = F.relu(self.fc2(h1))
        out = [torch.sigmoid(getattr(self, f"y{n_m}o")(h2)) for n_m in range(self.n_tasks)]
        return out
dataset = ChEMBLDataset(f"{MAIN_PATH}/{DATA_FILE}")
validation_split = .2
random_seed = 42
dataset_size = len(dataset)

model = ChEMBLMultiTask(dataset.n_targets).to(device)
# load the weights saved by the training script; without this the exported
# model would only contain its random initial weights
model.load_state_dict(torch.load(f"{MAIN_PATH}/{MODEL_FILE}", map_location=device))
model.eval()

path = './model_onnx.onnx'
dummy = torch.tensor([[0.5 for _ in range(1024)]],
                     dtype=torch.float32).to(device)
onnx.export(model, dummy, path, input_names=['input_1'], output_names=['output'])
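Before moving on, the exported file can be checked with onnxruntime. This is only a quick sketch; onnxruntime is not part of the environment listed later, so it would need to be installed separately.

import numpy as np
import onnxruntime as ort

# run the exported model on a random fingerprint-sized input
sess = ort.InferenceSession('model_onnx.onnx', providers=['CPUExecutionProvider'])
x = np.random.rand(1, 1024).astype(np.float32)
outs = sess.run(None, {'input_1': x})
# one (1, 1) probability array per task
print(len(outs), outs[0].shape)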
Now I had an ONNX model for ChEMBL target prediction, so I tried to run the workflow with that model, but it didn’t work. So I tried to use the TensorFlow model directly in the KNIME workflow.
At first, I converted onnx model to tensorflow model with onnx-tf.
$ onnx-tf convert -i model_onnx.onnx -o model_tf
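To check that the converted SavedModel loads and runs, something like the following should work. This is a sketch under the assumption that onnx-tf exposes the model through the standard serving_default signature, with the input named input_1 as set during export.

import numpy as np
import tensorflow as tf

m = tf.saved_model.load('model_tf')
infer = m.signatures['serving_default']
x = tf.constant(np.random.rand(1, 1024), dtype=tf.float32)
# returns a dict with one sigmoid output per task
print(infer(input_1=x))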
Now, ready to run the prediction ;)
I built the workflow (most of it came from Greg’s great work!).

By running the WF, I could get a heat map of the target predictions.
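For reference, a similar heat map can also be drawn outside KNIME with matplotlib. This is only an illustrative sketch: preds, mol_ids, and the array shapes are hypothetical placeholders standing in for the workflow’s real prediction matrix.

import numpy as np
import matplotlib.pyplot as plt

# placeholder data standing in for real predictions (molecules x targets)
preds = np.random.rand(5, 100)
mol_ids = [f"mol{i}" for i in range(5)]

fig, ax = plt.subplots(figsize=(12, 4))
im = ax.imshow(preds, aspect='auto', cmap='viridis', vmin=0, vmax=1)
ax.set_yticks(range(len(mol_ids)))
ax.set_yticklabels(mol_ids)
ax.set_xlabel('target index')
ax.set_ylabel('molecule')
fig.colorbar(im, ax=ax, label='predicted probability of activity')
plt.show()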

The deep learning environment for KNIME is below.
# Name Version Build Channel
_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 2_gnu conda-forge
abseil-cpp 20210324.2 h9c3ff4c_0 conda-forge
absl-py 1.4.0 pyhd8ed1ab_0 conda-forge
aiohttp 3.8.3 py37h540881e_0 conda-forge
aiosignal 1.3.1 pyhd8ed1ab_0 conda-forge
astunparse 1.6.3 pyhd8ed1ab_0 conda-forge
async-timeout 4.0.2 pyhd8ed1ab_0 conda-forge
asynctest 0.13.0 py_0 conda-forge
attrs 22.2.0 pyh71513ae_0 conda-forge
blinker 1.5 pyhd8ed1ab_0 conda-forge
brotlipy 0.7.0 py37h540881e_1004 conda-forge
c-ares 1.18.1 h7f98852_0 conda-forge
ca-certificates 2022.12.7 ha878542_0 conda-forge
cached-property 1.5.2 hd8ed1ab_1 conda-forge
cached_property 1.5.2 pyha770c72_1 conda-forge
cachetools 5.3.0 pyhd8ed1ab_0 conda-forge
certifi 2022.12.7 pyhd8ed1ab_0 conda-forge
cffi 1.15.1 py37h43b0acd_1 conda-forge
charset-normalizer 2.1.1 pyhd8ed1ab_0 conda-forge
click 8.1.3 py37h89c1867_0 conda-forge
conda 4.12.0 py37h89c1867_0 conda-forge
conda-package-handling 2.0.2 pyh38be061_0 conda-forge
conda-package-streaming 0.7.0 pyhd8ed1ab_1 conda-forge
cryptography 38.0.2 py37h38fbfac_1 conda-forge
cudatoolkit 11.8.0 h37601d7_11 conda-forge
cudnn 8.4.1.50 hed8a83a_0 conda-forge
frozenlist 1.3.1 py37h540881e_0 conda-forge
gast 0.5.3 pyhd8ed1ab_0 conda-forge
giflib 5.2.1 h0b41bf4_3 conda-forge
google-auth 2.16.2 pyh1a96a4e_0 conda-forge
google-auth-oauthlib 0.4.6 pyhd8ed1ab_0 conda-forge
google-pasta 0.2.0 pyh8c360ce_0 conda-forge
grpc-cpp 1.43.2 h9e046d8_3 conda-forge
grpcio 1.43.0 py37hb27c1af_0 conda-forge
h5py 3.7.0 nompi_py37hf1ce037_101 conda-forge
hdf5 1.12.2 nompi_h2386368_101 conda-forge
icu 70.1 h27087fc_0 conda-forge
idna 3.4 pyhd8ed1ab_0 conda-forge
importlib-metadata 4.11.4 py37h89c1867_0 conda-forge
jpeg 9e h0b41bf4_3 conda-forge
keras 2.8.0 pyhd8ed1ab_0 conda-forge
keras-preprocessing 1.1.2 pyhd8ed1ab_0 conda-forge
keyutils 1.6.1 h166bdaf_0 conda-forge
krb5 1.20.1 hf9c8cef_0 conda-forge
ld_impl_linux-64 2.40 h41732ed_0 conda-forge
libaec 1.0.6 hcb278e6_1 conda-forge
libblas 3.9.0 16_linux64_openblas conda-forge
libcblas 3.9.0 16_linux64_openblas conda-forge
libcurl 7.87.0 h6312ad2_0 conda-forge
libedit 3.1.20191231 he28a2e2_2 conda-forge
libev 4.33 h516909a_1 conda-forge
libffi 3.4.2 h7f98852_5 conda-forge
libgcc-ng 12.2.0 h65d4601_19 conda-forge
libgfortran-ng 12.2.0 h69a702a_19 conda-forge
libgfortran5 12.2.0 h337968e_19 conda-forge
libgomp 12.2.0 h65d4601_19 conda-forge
liblapack 3.9.0 16_linux64_openblas conda-forge
libnghttp2 1.51.0 hdcd2b5c_0 conda-forge
libnsl 2.0.0 h7f98852_0 conda-forge
libopenblas 0.3.21 pthreads_h78a6416_3 conda-forge
libpng 1.6.39 h753d276_0 conda-forge
libprotobuf 3.19.4 h780b84a_0 conda-forge
libsolv 0.7.23 h3eb15da_0 conda-forge
libsqlite 3.40.0 h753d276_0 conda-forge
libssh2 1.10.0 haa6b8db_3 conda-forge
libstdcxx-ng 12.2.0 h46fd767_19 conda-forge
libzlib 1.2.13 h166bdaf_4 conda-forge
mamba 0.1.2 py37h99015e2_0 conda-forge
markdown 3.4.1 pyhd8ed1ab_0 conda-forge
markupsafe 2.1.1 py37h540881e_1 conda-forge
multidict 6.0.2 py37h540881e_1 conda-forge
nccl 2.14.3.1 h0800d71_0 conda-forge
ncurses 6.3 h27087fc_1 conda-forge
numpy 1.21.6 py37h976b520_0 conda-forge
oauthlib 3.2.2 pyhd8ed1ab_0 conda-forge
onnx 1.13.1 pypi_0 pypi
onnx-tf 1.10.0 pypi_0 pypi
openssl 1.1.1t h0b41bf4_0 conda-forge
opt_einsum 3.3.0 pyhd8ed1ab_1 conda-forge
packaging 23.0 pypi_0 pypi
pip 23.0.1 pyhd8ed1ab_0 conda-forge
protobuf 3.20.3 pypi_0 pypi
pyasn1 0.4.8 py_0 conda-forge
pyasn1-modules 0.2.7 py_0 conda-forge
pycosat 0.6.4 py37h540881e_0 conda-forge
pycparser 2.21 pyhd8ed1ab_0 conda-forge
pyjwt 2.6.0 pyhd8ed1ab_0 conda-forge
pyopenssl 23.0.0 pyhd8ed1ab_0 conda-forge
pysocks 1.7.1 py37h89c1867_5 conda-forge
python 3.7.12 hb7a2778_100_cpython conda-forge
python-flatbuffers 23.1.21 pyhd8ed1ab_0 conda-forge
python_abi 3.7 3_cp37m conda-forge
pyu2f 0.1.5 pyhd8ed1ab_0 conda-forge
pyyaml 6.0 pypi_0 pypi
re2 2022.02.01 h9c3ff4c_0 conda-forge
readline 8.1.2 h0f457ee_0 conda-forge
requests 2.28.2 pyhd8ed1ab_0 conda-forge
requests-oauthlib 1.3.1 pyhd8ed1ab_0 conda-forge
rsa 4.9 pyhd8ed1ab_0 conda-forge
ruamel_yaml 0.15.80 py37h540881e_1007 conda-forge
scipy 1.7.3 py37hf2a6cf1_0 conda-forge
setuptools 67.6.0 pyhd8ed1ab_0 conda-forge
six 1.16.0 pyh6c4a22f_0 conda-forge
snappy 1.1.10 h9fff704_0 conda-forge
sqlite 3.40.0 h4ff8645_0 conda-forge
tensorboard 2.8.0 pyhd8ed1ab_1 conda-forge
tensorboard-data-server 0.6.0 py37h38fbfac_2 conda-forge
tensorboard-plugin-wit 1.8.1 pyhd8ed1ab_0 conda-forge
tensorflow 2.8.0 cuda112py37h01c6645_0 conda-forge
tensorflow-addons 0.19.0 pypi_0 pypi
tensorflow-base 2.8.0 cuda112py37hd7e45b3_0 conda-forge
tensorflow-estimator 2.8.0 cuda112py37h25bb9bc_0 conda-forge
tensorflow-gpu 2.8.0 cuda112py37h0bbbad9_0 conda-forge
termcolor 2.2.0 pyhd8ed1ab_0 conda-forge
tk 8.6.12 h27826a3_0 conda-forge
typeguard 3.0.1 pypi_0 pypi
typing-extensions 4.5.0 hd8ed1ab_0 conda-forge
typing_extensions 4.5.0 pyha770c72_0 conda-forge
urllib3 1.26.15 pyhd8ed1ab_0 conda-forge
werkzeug 2.2.3 pyhd8ed1ab_0 conda-forge
wheel 0.40.0 pyhd8ed1ab_0 conda-forge
wrapt 1.14.1 py37h540881e_0 conda-forge
xz 5.2.6 h166bdaf_0 conda-forge
yaml 0.2.5 h7f98852_2 conda-forge
yarl 1.7.2 py37h540881e_2 conda-forge
zipp 3.15.0 pyhd8ed1ab_0 conda-forge
zlib 1.2.13 h166bdaf_4 conda-forge
zstandard 0.18.0 py37h540881e_0 conda-forge
I uploaded today’s code to GitHub and the KNIME Hub.
https://github.com/iwatobipen/chembl_targetprediction
https://hub.knime.com/-/spaces/-/latest/~keV0drWZt67jVOx3/
You can modify the code and the KNIME workflow as you like. Let’s enjoy chemoinformatics!