Tag: drug discovery

PKPD in R.

As you know, understanding PK/PD is important in drug development.
I’m not in the DMPK department, but I think it’s better to know the basics of PK/PD theory.
There are several R packages for PK/PD analysis.
I found a cool library named “PKPDsim”, developed by ronkeizer.
The library integrates with Shiny, so users can run PK/PD simulations on the fly!
I tried the library today.
The following code is almost the same as the example in the documentation.
It simulates a one-compartment model with oral dosing.
pk_1cmt_oral is defined in the library’s source code.

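library(PKPDsim)
library(ggplot2)

p <- list(CL = 1, V = 10, KA = 0.5)
pk1 <- new_ode_model("pk_1cmt_oral")
# The dosing times were cut off in the original post; the values below are an assumption.
r1 <- new_regimen(amt = 100, times = c(0, 12, 24, 36))
dat <- sim_ode(ode = "pk1",
               par = p,
               regimen = r1)
plt <- ggplot(dat, aes(x = t, y = y)) + geom_line() + facet_wrap(~comp)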

This gave me the following plot, with one panel per compartment.
Panel 1 shows the elimination phase, and panel 2 shows the absorption phase.
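
As a cross-check on the simulation, the concentration curve of a one-compartment oral model also has a well-known closed form. The following is a minimal Python sketch, not PKPDsim's implementation: it assumes a single dose, first-order absorption and elimination, and a bioavailability `f` of 1 (an assumption; the post does not give one). The function names are my own, and the defaults reuse CL=1, V=10, KA=0.5 and the dose of 100 from above.

```python
import math


def conc_1cmt_oral(t, dose=100.0, cl=1.0, v=10.0, ka=0.5, f=1.0):
    """Concentration at time t after a single oral dose (closed-form solution)."""
    ke = cl / v  # first-order elimination rate constant
    if math.isclose(ka, ke):
        raise ValueError("this closed form requires ka != ke")
    scale = f * dose * ka / (v * (ka - ke))
    return scale * (math.exp(-ke * t) - math.exp(-ka * t))


def tmax_1cmt_oral(cl=1.0, v=10.0, ka=0.5):
    """Time of the peak concentration for a single oral dose."""
    ke = cl / v
    return math.log(ka / ke) / (ka - ke)


# Print a few points of the curve and the time of the peak.
for t in (0, 1, 2, 4, 8, 12, 24):
    print(t, round(conc_1cmt_oral(t), 3))
print("tmax:", round(tmax_1cmt_oral(), 2))
```

The curve rises while absorption dominates and falls once elimination takes over, which is the shape expected for the central compartment.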

Then I launched the Shiny app to simulate on the fly.

sim_ode_shiny(ode='pk1', par=p, regimen=r1)

The web browser then launched with the app.

I think it’s a good library for studying or simulating PK/PD.

Visualizing the process of lead optimization

Sometimes we set milestones to manage a portfolio and/or to check the progress of projects.
These data are reported in documents, PowerPoint slides and so on, so it is difficult to grasp the state of lead optimization (LO) in a timely manner.
Researchers at GSK published a solution for visualizing the LO process.
I was impressed by it.
Link is here.

They call it “LO telemetry”; it shows the time course of the total risk of the compounds.
Total risk is calculated from the potency against each target and from the ADME, Tox and physchem profiles.
Ideally, total risk decreases as the project progresses. But there are a lot of problems in a drug discovery project (at least for me! 😉 ).
Fig. 5 shows one example.
The figure shows the progress of lead optimization together with design entropy (chemical diversity).
Design entropy suddenly increases because of Tox problems. The physchem property risk also increases slightly.
To avoid a tox problem (adverse effect), chemists consider changing the chemical series or making a drastic change to the structure. That risks a loss of potency, but Fig. 5 shows that their strategy kept the pharmacological risk score low.
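
To get a feeling for how a diversity measure like this behaves, here is a minimal Python sketch. It is not the paper's definition of design entropy, which I have not reproduced here: it simply assumes that each compound made in a time window is labelled with a chemical series and computes the Shannon entropy of those labels. The function name and the series labels are hypothetical.

```python
import math
from collections import Counter


def design_entropy(series_labels):
    """Shannon entropy (in bits) of the chemical series made in one time window."""
    counts = Counter(series_labels)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Sum -p * log2(p) over the fraction p of compounds in each series.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical windows: a focused team, then a team exploring new series.
focused = ["series_A"] * 8
exploring = ["series_A", "series_B", "series_C", "series_D"] * 2
print(design_entropy(focused), design_entropy(exploring))
```

With this proxy, a team working on a single series scores zero, and the score jumps when the team starts making compounds from several new series at once, as it would after a tox finding.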

The paper reports that the LO project team can check the telemetry. It tells the team about the bottlenecks and progress of their project.
The system can also be used for portfolio management.
It is useful for decision making and for motivating the team.
In addition, the telemetry provides a vivid description of each project.
What do you think about metrics for lead optimization?

Passport for compound.

I was interested in the title:
“Compound Passport Service”
AZ made a passport for compounds to track compound rights.

The system manages the status of compounds, such as ownership, permissions and whether the structure has been shared.

I was really impressed with the concept and the system, because I think managing compound (and rights) logistics is a key factor in drug discovery.

I want to develop a seamless compound logistics system and a tracking system for medicinal chemistry…

How to visualize QSAR model.

I often discuss QSAR with other chemists.
Sometimes they tell me: “QSAR is a useful tool for drug discovery, but I don’t understand it, because with a QSAR model (i.e. ML) it is hard to understand why a compound is good.”
Hmm, I agree with that opinion.
SVM, NB, RF and so on are very useful, but these models are black boxes. So it is difficult to understand the effect of substructures on the models.
Jürgen Bajorath et al. took on the challenge of closing this gap and published an interesting paper in J. Chem. Inf. Model.

They wrote in the paper:

understanding why a compound has undesirable ADME characteristics is just as important as knowing that it (ADME prediction) does.

I like this phrase.

They developed a Python library named nbviz that depends on scikit-learn and matplotlib.
The library visualises the contribution of each feature in the fingerprint vector.
I think the key point of the method is that the authors used MACCS keys to build the model,
because MACCS keys are easy for chemists to understand.
I wrote a demo script using RDKit.
The sample data was downloaded from the following FTP site.
I added a Class property (I set the activity flag as “IC50_uM < 0.1 is active”).
First, I set the arguments 'names' and 'groups'.
Then I wrote a sample script like the following.

import sys

import numpy as np
from rdkit import Chem
from rdkit.Chem import MACCSkeys
from sklearn.naive_bayes import BernoulliNB

import maccskey  # module that provides the 'names' and 'groups' arguments
import nbviz


def calc_MACCS_fp(mol):
    # Convert the on bits of the MACCS fingerprint into a 0/1 vector of length 167.
    on_bits = list(MACCSkeys.GenMACCSKeys(mol).GetOnBits())
    mol_fp_vec = np.zeros(167)
    mol_fp_vec[on_bits] = 1
    return mol_fp_vec


def make_fp_array(mols):
    return [calc_MACCS_fp(mol) for mol in mols]


# Skip molecules that RDKit could not parse (SDMolSupplier yields None for them).
mols = [mol for mol in Chem.SDMolSupplier(sys.argv[1]) if mol is not None]
X = make_fp_array(mols)
Y = [float(mol.GetProp("Class")) for mol in mols]

# Train on everything except the first molecule, which is kept for the prediction plot.
model = BernoulliNB(alpha=0.1)
model.fit(X[1:], Y[1:])
conditional_probs = np.exp(model.feature_log_prob_)
prior = np.exp(model.class_log_prior_[1])
print('conditional feature prob', conditional_probs)
print('class prior', prior)
nbviz.visualize_model(conditional_probs, prior, names=maccskey.names, groups=maccskey.groups)

nbviz.visualize_prediction(X[0], conditional_probs, prior, names=maccskey.names, groups=maccskey.groups)

Let’s run the script!

modelviz iwatobipen$ python view_model_demo.py mol_viz_demo/cox2_test.sdf 

Two figures are then generated.
The red and blue colours of the circles indicate the positive / negative influence of each feature, and the distance indicates the log odds ratio.
The approach is useful for discussion, because the figures tell chemists why the model considers the substructures to be effective.
But it is hard for me to visualise each target….
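
The log odds ratio of each feature can also be computed by hand. The following is a minimal Python sketch of the standard Bernoulli naive Bayes arithmetic, not the library's own code. The function names and the example probabilities are hypothetical; `probs_active` and `probs_inactive` stand for P(bit = 1 | class), which is what `np.exp(model.feature_log_prob_)` holds for each class in scikit-learn.

```python
import math


def feature_log_odds(p_active, p_inactive, bit_on):
    """Log odds contribution of one fingerprint bit (active vs. inactive)."""
    if bit_on:
        return math.log(p_active / p_inactive)
    # An absent bit contributes through the complementary probabilities.
    return math.log((1.0 - p_active) / (1.0 - p_inactive))


def prediction_log_odds(fp, probs_active, probs_inactive, prior_active):
    """Total log odds for one fingerprint, plus the per-bit contributions."""
    contributions = [
        feature_log_odds(pa, pi, bit)
        for bit, pa, pi in zip(fp, probs_active, probs_inactive)
    ]
    prior_term = math.log(prior_active / (1.0 - prior_active))
    return prior_term + sum(contributions), contributions


# Hypothetical three-bit example: positive values favour the active class.
total, parts = prediction_log_odds(
    fp=[1, 0, 1],
    probs_active=[0.8, 0.5, 0.3],
    probs_inactive=[0.2, 0.5, 0.6],
    prior_active=0.5,
)
print(total, parts)
```

A positive contribution pushes the prediction towards active and a negative one towards inactive, which corresponds to the two colours in the figures.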



mishima.syk#6 etc.

Today I gave a presentation about rdkitjs at mishima.syk#6.
I wrote a very simple script using rdkitjs and d3js.
The script draws a scatter plot of the fraction of Csp3 against molecular weight, and when the user mouses over a circle, rdkitjs returns an image of the molecule.
I uploaded the script to GitHub.
I hope the participants enjoyed my presentation….

The presentation about FMO was very impressive. The method will give medicinal chemists more detail about ligand-target interactions.
The presentations about real docking, data management and ingress were also very interesting!

I also enjoyed the workshop that was held yesterday.
It was worthwhile for me to see how others think about SAR from a given dataset.
It seems to depend on the philosophy of each chemist. 😉
Thanks to all the participants and presenters.

The toolkit hands-on session, the real docking talk, the data management talk: the presentations are enjoyable and educational every time.


Think about SAR analysis.

I missed the chance to participate in the RDKit UGM because the tickets were sold out. ;-(
I’ll try next year….

SAR analysis is key to drug discovery.
MMPA is one of the major tools, and I like the method
because it makes it easy to check the effect of a substituent in a molecule.
But sometimes it is difficult to understand why a parameter changed.
I found an interesting way to analyse SAR using MMPs; it’s called ‘non-additive SAR’.
The link is below.

‘Non-additive’ means that the effect of adding a specific substituent at position A depends on the presence of another substituent at position B.
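
The definition above can be written as a small calculation over four compounds that form a cycle. This is only a sketch of the arithmetic under my own assumptions, not the code from the supporting information: the function name is hypothetical, activities are assumed to be on a log scale (e.g. pIC50), and the example values are made up.

```python
def nonadditivity(p_00, p_a0, p_0b, p_ab):
    """Difference between the effect of A with B present and with B absent.

    p_00: parent compound, p_a0: only substituent A,
    p_0b: only substituent B, p_ab: both substituents.
    """
    effect_of_a_without_b = p_a0 - p_00
    effect_of_a_with_b = p_ab - p_0b
    return effect_of_a_with_b - effect_of_a_without_b


# Additive case: A adds one log unit whether or not B is present.
print(nonadditivity(6.0, 7.0, 6.5, 7.5))
# Non-additive case: A helps much more when B is present.
print(nonadditivity(6.0, 6.1, 6.5, 8.0))
```

A value near zero means the two substituents behave additively. A large value has to be judged against the assay noise before it is treated as a real SAR feature.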

I sometimes run into this situation.
…Hmm, this part increases activity dramatically when the scaffold has this substituent.

So I think non-additive SAR is useful for medicinal chemistry.

The author describes some examples of non-additive SAR in drug discovery projects.
The source code is available in the supporting information.

I customised the code and applied it to my project.
The results seem interesting. 😉

review for kinetics

I found a nice review about the kinetics of drug binding and residence time.

To improve in vitro and in vivo potency, I sometimes try to obtain SKR (structure-kinetics relationships) for molecular design.
If residence time correlates only with lipophilicity or molecular weight, the information is not very useful,
because molecules that are too lipophilic or too heavy are not very drug-like.
So I’m thinking about the best way to use kinetic data.
This review gives some examples of how to use kinetic data in molecular design.
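
For reference, the basic quantities behind kinetic data are simple to compute. The following minimal Python sketch assumes a simple one-step binding model; the function names and rate constants are hypothetical and not taken from the review.

```python
import math


def residence_time(k_off):
    """Residence time (s) as the reciprocal of the dissociation rate constant (1/s)."""
    return 1.0 / k_off


def dissociation_half_life(k_off):
    """Half-life (s) of the drug-target complex."""
    return math.log(2) / k_off


def equilibrium_kd(k_on, k_off):
    """Equilibrium dissociation constant (M) from k_on (1/(M*s)) and k_off (1/s)."""
    return k_off / k_on


# Two hypothetical compounds with the same KD but different kinetics.
for name, k_on, k_off in [("fast", 1e7, 1e-1), ("slow", 1e5, 1e-3)]:
    print(name, equilibrium_kd(k_on, k_off), residence_time(k_off))
```

The two made-up compounds have the same KD but residence times that differ by a factor of 100, which is the kind of information an SAR table based on affinity alone would hide.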

I was interested in Table 1, because it lists a lot of examples that use SPR for kinetic data analysis.
Sometimes I think kinetic analysis by SPR is difficult because of the instability of the target protein or other factors, but there are lots of success stories. Hmm.

SKR is as attractive to me as SAR, but rational design of molecules using kinetic data is still a challenging area for me.