AI/Robot will be a good partner for humans! #AI #memo #journal

Today I read an exciting article reported by Leroy Cronin’s group. The title is ‘Intuition-Enabled Machine Learning Beats the Competition When Joint Human-Robot Teams Perform Inorganic Chemical Experiments’!

The article was first posted to ChemRxiv in Feb 2019.

The article describes experimental optimization for inorganic chemistry, not drug discovery, but I think it is worth reading.

In the drug discovery area, many AI-driven approaches have been reported. Almost all of them describe the architecture of AI-based molecular generation or property prediction and do not discuss the experiments that are conducted by researchers.

A machine learning based approach is not affected by bias or background knowledge, whereas a human knowledge based approach is often affected by the experimenter’s background and experience. So human intuition is very biased. The bias limits the search area of the experimental space, such as chemical space.

The authors tested the quality of the crystallization model in 4 situations: fully algorithm based, human based, random, and algorithm-human team based.

The result was very interesting for me. Fig 7 shows the relationship between the explored crystallization space and the number of experiments. Explored crystallization space means the experimental space of crystallization conditions for the target inorganic compound.

The random and human-only approaches cover a very narrow space even if lots of experiments are conducted. On the other hand, the algorithm and human-AI team based approaches cover a wide range of crystallization space. And the human-AI team approach shows the best result.

It means that the algorithm can design and search a wide range of the space but cannot select the best experiments using background knowledge, while humans can select good experimental conditions from their experience and background knowledge. So humans and AI can cooperate with each other, and that will produce good results.

Recently I have been working on AI drug discovery. It cannot be done by machine alone; it needs humans!

I think robots will not steal human jobs but will be good partners for humans. It is just my opinion.

Any comments are highly appreciated. ;)

The current version of Openforcefield supports RDKit #RDKit #Openforcefield #chemoinformatics

I posted about openforcefield (OpenFF) before. Older versions of openff supported only the OpenEye toolkit, but the current version supports RDKit too.

It is worth knowing that we can use openff with an open source toolkit. I really appreciate the developers’ work! It is great. Today I used the package together with ipymol, which can control pymol from an ipython session. It means that you can control pymol from a jupyter notebook! My example code is below.

import openforcefield as off
from rdkit import Chem
from rdkit.Chem import AllChem
from simtk import openmm, unit
from simtk.openmm import app
from openforcefield.topology import Topology
from openforcefield.topology import Molecule
from openforcefield.typing.engines.smirnoff import ForceField
from rdkit.Chem.Draw import IPythonConsole

Load the SMIRNOFF99Frosst force field.

ff = ForceField('test_forcefields/smirnoff99Frosst.offxml')

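Then define the get_energy function.

def get_energy(system, positions):
    # Build a throwaway context just to evaluate the potential energy.
    integrator = openmm.VerletIntegrator(1.0 * unit.femtoseconds)
    context = openmm.Context(system, integrator)
    # The positions must be set before the state is queried.
    context.setPositions(positions)
    state = context.getState(getEnergy=True)
    energy = state.getPotentialEnergy().in_units_of(unit.kilocalories_per_mole)
    return energy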

Make a sample mol object with rdkit and convert it to an openff mol object. It is easy, just call from_rdkit!

rdmol = Chem.MolFromSmiles('c1c(c2sccc2)c(c3c[nH]cc3)oc1')
rdmol = Chem.AddHs(rdmol)
ofmol = Molecule.from_rdkit(rdmol)

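Then generate a conformer with the openff function and get the topology.

# from_rdkit received a molecule without 3D coordinates, so build conformers here.
ofmol.generate_conformers()
topology = ofmol.to_topology()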

Next, create an OpenMM system with the generated topology and get the positions of one conformer.

org_system = ff.create_openmm_system(topology)
pos = ofmol.conformers[0]

First, calculate the energy with the default conformation.

get_energy(org_system, pos)
> Quantity(value=80.93719044789302, unit=kilocalorie/mole)
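
OpenMM works in kJ/mol internally, and in_units_of only rescales the number. As a quick sanity check on the value above, the conversion can be sketched in plain Python. This assumes the thermochemical calorie (1 kcal = 4.184 kJ), and the helper names are made up for illustration.

```python
# Plain-arithmetic sketch of the kcal/mol <-> kJ/mol conversion.
# Assumption: thermochemical calorie, 1 kcal = 4.184 kJ.
KJ_PER_KCAL = 4.184


def kcal_to_kj(value_kcal_per_mol):
    """Convert an energy from kcal/mol to kJ/mol."""
    return value_kcal_per_mol * KJ_PER_KCAL


def kj_to_kcal(value_kj_per_mol):
    """Convert an energy from kJ/mol to kcal/mol."""
    return value_kj_per_mol / KJ_PER_KCAL


energy_kcal = 80.93719044789302  # value printed by get_energy above
energy_kj = kcal_to_kj(energy_kcal)
print(f"{energy_kcal:.2f} kcal/mol = {energy_kj:.2f} kJ/mol")
```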

Next I tried to minimize the energy with an OpenMM method.

new_system = ff.create_openmm_system(topology)
new_energy = get_energy(new_system, pos)
from sys import stdout
def minimizeOpenMM(Topology, System, Positions):
    integrator = openmm.LangevinIntegrator(
                                        300.0 * unit.kelvin,
                                        1.0 / unit.picosecond,
                                        2.0 * unit.femtosecond)
    simulation = app.Simulation(Topology, System, integrator)
    # The reporter only prints if simulation.step() is called later.
    simulation.reporters.append(app.StateDataReporter(stdout, 1000, step=True, potentialEnergy=True, temperature=True))
    # The starting coordinates must be set before minimizing.
    simulation.context.setPositions(Positions)
    simulation.minimizeEnergy(tolerance=5.0E-9, maxIterations=2000)
    state = simulation.context.getState(getPositions=True, getEnergy=True)
    positions = state.getPositions(asNumpy=True)
    energy = state.getPotentialEnergy().in_units_of(unit.kilocalories_per_mole)
    # Strip the units and keep plain numbers in angstroms.
    positions = positions / unit.angstroms
    # Flatten [[x, y, z], ...] into [x, y, z, x, y, z, ...].
    coordlist = list()
    for atom_coords in positions:
        coordlist += [i for i in atom_coords]
    return coordlist, positions

cl, pos = minimizeOpenMM(topology, org_system, pos)

Now I have the minimized positions from calling minimizeOpenMM.
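
minimizeOpenMM hands back the coordinates twice: as an N x 3 array and as a flat list. Here is a small sketch of that flattening and its inverse, using plain nested lists in place of the NumPy array. The helper names and the sample numbers are hypothetical.

```python
from itertools import chain


def flatten_coords(positions):
    """[[x, y, z], ...] -> [x, y, z, x, y, z, ...]"""
    return list(chain.from_iterable(positions))


def unflatten_coords(coordlist):
    """[x, y, z, x, y, z, ...] -> [[x, y, z], ...]"""
    if len(coordlist) % 3 != 0:
        raise ValueError("coordinate list length must be a multiple of 3")
    return [list(coordlist[i:i + 3]) for i in range(0, len(coordlist), 3)]


# Made-up coordinates for three atoms, in angstroms.
sample = [[0.0, 0.0, 0.0], [1.1, 0.0, 0.0], [1.1, 1.4, 0.0]]
flat = flatten_coords(sample)
restored = unflatten_coords(flat)
```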

Next I updated the atom positions with the optimized geometry.

from rdkit.Chem import rdGeometry
# Embed once so that rdmol has a conformer whose coordinates can be overwritten.
AllChem.EmbedMolecule(rdmol, useExpTorsionAnglePrefs=True, useBasicKnowledge=True)

conf = rdmol.GetConformer()
for i in range(rdmol.GetNumAtoms()):
    # pos holds the minimized coordinates in angstroms, one row per atom.
    conf.SetAtomPosition(i, rdGeometry.Point3D(float(pos[i][0]), float(pos[i][1]), float(pos[i][2])))

I made two SDF files: one is the default conformer and the other is the minimized conformer.

OK, I use ipymol to communicate with pymol and visualize in the jupyter notebook.

from ipymol import viewer
viewer.start() # this method launches pymol


The result seems different from the original article.

Hmm why???

BTW, I think OpenFF is a very attractive package for chemoinformatics.

My code can be accessed below.

psikit updates #psi4 #RDKit

Recently @fmkz___ san updated psikit and uploaded nice example code. The URL is below.
It is a very nice example of MO rendering in pymol.

I implemented a cube file generator for molecular orbital rendering before. But the code had a bug and was not user friendly… So I fixed the bug and added a new function that can communicate with pymol and render the MO in pymol.

The current psikit generates the molecular geometry string with the ‘no_reorient’ and ‘no_com’ options. They are necessary to keep the original orientation of the query molecule. I did not know that. Thank you to the psi4 community for giving such a nice suggestion.
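
To make the two options concrete, here is a minimal sketch of how such a geometry string could be put together. It is not psikit's actual implementation; build_psi4_geometry is a hypothetical helper and the water coordinates are only illustrative.

```python
def build_psi4_geometry(symbols, coords, charge=0, multiplicity=1):
    """Build a psi4-style geometry string that keeps the input frame.

    no_reorient: do not rotate the molecule into a standard orientation.
    no_com:      do not shift the center of mass to the origin.
    """
    lines = [f"{charge} {multiplicity}"]
    for symbol, (x, y, z) in zip(symbols, coords):
        lines.append(f"{symbol} {x:.6f} {y:.6f} {z:.6f}")
    lines.append("no_reorient")
    lines.append("no_com")
    return "\n".join(lines)


# Illustrative water geometry in angstroms (not taken from the post).
water = build_psi4_geometry(
    ["O", "H", "H"],
    [(0.0, 0.0, 0.117), (0.0, 0.757, -0.467), (0.0, -0.757, -0.467)],
)
print(water)
```

A string like this is what would be handed to psi4.geometry(). Without the two keywords the atoms may come back in a different frame, and the cube files would no longer line up with the ligand pose in the receptor.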

The following code is an example of the new function named ‘view_on_pymol’. I added some additional code for drawing the MO on a ligand-receptor complex.

To run the code, I launch pymol in server mode: ‘$pymol -R’
With the -R option, I can communicate with pymol via xmlrpc.
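
A rough sketch of that communication, using only the Python standard library, might look like the code below. It assumes the server listens on the default address http://localhost:9123 and exposes a do method that runs one PyMOL command line. The file names, the residue name LIG and both helper functions are placeholders, not the code from this post.

```python
from xmlrpc.client import ServerProxy


def ligand_extraction_commands(pdb_path, ligand_resn, out_path):
    """Return PyMOL command lines that copy a ligand out and add hydrogens.

    'create' is used instead of 'select' so that the hydrogens added by
    h_add belong to the saved object.
    """
    return [
        f"load {pdb_path}, complex",
        f"create lig, complex and resn {ligand_resn}",
        "h_add lig",
        f"save {out_path}, lig",
    ]


def send_to_pymol(commands, url="http://localhost:9123"):
    """Send each command line to a PyMOL session started with `pymol -R`."""
    server = ServerProxy(url)
    for command in commands:
        server.do(command)


# Hypothetical file names and residue name, for illustration only.
commands = ligand_extraction_commands("complex.pdb", "LIG", "ligand.mol")
# send_to_pymol(commands)  # uncomment while `pymol -R` is running
```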

Next I load the target pdb, which is the p38-BIRB796 complex. Then I select the ligand, add hydrogens and save the ligand in mol format.

With xmlrpc it is easy to do these steps.

After extracting the ligand, I pass it to psikit and generate HOMO/LUMO cube files with the getMOview function.

Finally, calling view_on_pymol makes the ligand MO view in pymol.

Next, by loading the original pdb file in the same session, I can get the PDB-ligand complex with the MO.

example code

Now psikit integrates quantum chemistry (psi4), chemoinformatics (rdkit) and visualization (pymol), and all of them are open source. It seems nice I think.

Any suggestions and advice will be greatly appreciated.

Comparison of rdChemReactions and EditableMol #RDKit #chemoinformatics

This year I moved from the MedChem team to the CompChem team, and now I need to learn SBDD. Today I struggled with mol objects that have 3D information.

I wanted to replace a hydrogen attached to an aromatic carbon with some other atom. I thought it would be easy with the rdChemReactions method. But I found that it was not a good approach, because the RunReactants method generates products but cannot keep the 3D information of the reacted atoms.

It is very interesting and good information for me. Let’s see the example code.

The following code is a very simple example.

code example

First, I tried to replace the atom with the rdChemReactions method. It worked well, but the position of the fluorine atom was 0, 0, 0. It indicates that the method cannot keep the 3D information. I think it is reasonable because some chemical reactions dramatically change the molecular conformation.

The second example used EditableMol.

This approach could keep the 3D information. It means that the method just replaces the atom!

Of course the bond lengths of C-H and C-F are different, but I think the approach is more suitable for atom scanning.
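
The comparison above can be sketched roughly as follows. This is not the original code of this post. Benzene, the reaction SMARTS and the random seed are my own choices for illustration, and it needs RDKit installed.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Benzene with explicit hydrogens and one embedded 3D conformer.
mol = Chem.AddHs(Chem.MolFromSmiles("c1ccccc1"))
AllChem.EmbedMolecule(mol, randomSeed=42)

# Pick one hydrogen that sits on an aromatic carbon.
h_idx = next(
    atom.GetIdx()
    for atom in mol.GetAtoms()
    if atom.GetAtomicNum() == 1 and atom.GetNeighbors()[0].GetIsAromatic()
)
old_pos = mol.GetConformer().GetAtomPosition(h_idx)

# 1) Reaction route: the fluorine is a new atom that comes from the
#    product template, so it has no coordinates of its own to inherit.
rxn = AllChem.ReactionFromSmarts("[c:1][#1]>>[c:1]F")
product = rxn.RunReactants((mol,))[0][0]
if product.GetNumConformers():
    for atom in product.GetAtoms():
        if atom.GetAtomicNum() == 9:
            p = product.GetConformer().GetAtomPosition(atom.GetIdx())
            print("reaction route, F position:", p.x, p.y, p.z)

# 2) EditableMol route: swap the element in place and keep the conformer.
emol = Chem.EditableMol(mol)
emol.ReplaceAtom(h_idx, Chem.Atom(9))
fmol = emol.GetMol()
Chem.SanitizeMol(fmol)
new_pos = fmol.GetConformer().GetAtomPosition(h_idx)
print("EditableMol route, F position:", new_pos.x, new_pos.y, new_pos.z)
```

The fluorine from the EditableMol route sits exactly where the hydrogen was, so the C-F distance is still the old C-H distance until the geometry is relaxed.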

I am always surprised that RDKit has so many useful functions for drug design.

T-shaped skill, diversity is needed for innovation

Recently automation of chemistry at pharma is an attractive area, I think because pharma would like to improve productivity.

Today I read an interesting letter in ACS Med Chem Lett. The URL is below.

The article describes the approach of AbbVie. When I was a new hire, combinatorial chemistry was the major approach for library production and there were many solid-phase synthesis instruments. But we do not use them these days.

The author analyzed the reason. One interesting reason is that if a process is easy to automate, it is probably easy to perform manually. I couldn’t agree more. Lab automation is important for saving labor and letting researchers focus on more scientific problems, so we would like to automate the not so simple tasks.

BTW, the drug discovery process is very complex, so it is difficult to automate the flow. A wide range of talent is required to do it.

In the case of AbbVie, they have had electrical and mechanical engineers who possess hundreds of years of experience working in industries ranging from cell phones to aerospace!

For innovation, diversity is key.

In the article the author introduced their instrument for NMR sample preparation named SNAPP (source NMR Assay Plate Prep).

The system is well organized and well designed.

Now AbbVie is taking on the automation of photoredox reactions.

I think engineering chemistry requires a wide range of knowledge and skill sets. So a T-shaped skill set is important for pharma.

If you are interested, please read the article.

Golden Week, 10-day holiday

Golden Week is a series of four national holidays that take place within one week from the end of April to the beginning of May each year. This year I get a 10-day holiday. My kid belongs to a dodgeball team now and 5 of the days are practice games or tournament games…. Too busy ;) I could not get enough time for coding… On game days, he has to get up early, at 05:30 or 04:50 AM.

Of course I need to get up early too because I will referee.

Dodgeball is a very fast game; match time is only 5 min for one set. And most games are one-set matches except for the final.

The ball speed of an ace attacker is very fast. For a top level team, it will be 80 km/h or more.

My kid is only in the third grade of elementary school, but he catches the ball. And he really enjoys each game.

Practice is hard but he is growing more and more, except in his studies….

Anyway I am happy that he has found something he wants to do and enjoys. Tomorrow is the final tournament of the holiday and we will get up at 04:15 AM. I need to go to bed ASAP.