Do pharmas need more data to build better models? #memo #medchem

Are ‘AI’ and ‘Big Data’ the keys to success? Recently, many pharmaceutical companies have been trying to use AI to accelerate their drug discovery projects and reduce costs. However, these efforts sometimes fail because of a lack of training data. Large amounts of data are rarely available in early drug discovery projects, and it is still difficult to build accurate predictive models from small data sets.

Recently, an attractive project was reported in Nature Reviews Drug Discovery.
https://www.nature.com/articles/d41573-019-00120-w

The project, named MELLODDY (Machine Learning Ledger Orchestration for Drug Discovery), is an €18-million, 3-year IMI project.

According to the following URL, MELLODDY consists of 17 partners, who will share structure and activity data without disclosing IP.
https://www.janssen.com/emea/new-research-consortium-seeks-accelerate-drug-discovery-using-machine-learning-unlock-maximum

  • 10 pharmaceutical companies: Amgen, Astellas, AstraZeneca, Bayer, Boehringer Ingelheim, GSK, Janssen Pharmaceutica NV, Merck KGaA, Novartis, and Institut de Recherches Servier
  • 2 universities: KU Leuven, Budapesti Muszaki es Gazdasagtudomanyi Egyetem
  • 4 subject matter experts: Owkin, Substra Foundation, Loodse, Iktos
  • 1 large AI computing company: NVIDIA
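Sharing models without pooling raw data is usually done with some form of federated learning. As a minimal sketch of the idea (not MELLODDY's actual protocol, which is not described in this memo), here is toy federated averaging: each hypothetical partner fits a one-parameter linear model on its own private data, and only the model weights are exchanged and averaged.

```python
# Toy federated averaging (FedAvg) sketch.
# All partner data here is made up for illustration; each partner's
# private (x, y) pairs are drawn from the same relationship y = 3x.

def local_train(data, w, lr=0.01, epochs=20):
    """Gradient-descent update of slope w on one partner's private data."""
    for _ in range(epochs):
        for x, y in data:
            grad = 2 * (w * x - y) * x  # d/dw of (w*x - y)^2
            w -= lr * grad
    return w

# Three hypothetical partners, each holding its own private data set
partners = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(0.5, 1.5), (1.5, 4.5)],
    [(2.5, 7.5), (3.0, 9.0)],
]

w_global = 0.0
for _ in range(10):
    # Each partner trains locally, starting from the shared weights...
    local_ws = [local_train(data, w_global) for data in partners]
    # ...and only the weights are averaged; raw data never leaves a site.
    w_global = sum(local_ws) / len(local_ws)

print(f"learned slope: {w_global:.2f}")  # converges toward 3.0
```

The point is that the pooled signal reaches every partner's model while each structure–activity record stays behind its owner's firewall.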

The consortium uses Amazon Web Services for machine learning. I think it is a nice approach to data and model management, though it would be difficult for pharma companies in Japan (just my personal opinion)…

What about the chemical space of the combined compound data? Is it large enough to extend the applicability domain of the resulting models?
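One common way to ask that question is a distance-to-model applicability-domain check: a query compound is considered "in domain" if its nearest training-set neighbour is within a distance threshold. The sketch below uses made-up 2D descriptors and an arbitrary threshold (real workflows would use fingerprints and similarity metrics); it only illustrates how pooling data from several partners can widen the covered space.

```python
import math

def euclidean(a, b):
    """Plain Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def in_domain(query, training_set, threshold=1.0):
    """True if the query's nearest training neighbour is within threshold."""
    nearest = min(euclidean(query, t) for t in training_set)
    return nearest <= threshold

# Hypothetical descriptors: one partner's data vs. a pooled set
train_single = [(0.0, 0.0), (1.0, 0.5)]
train_pooled = [(0.0, 0.0), (1.0, 0.5), (3.0, 3.0), (4.0, 2.5)]

query = (3.2, 2.8)
print(in_domain(query, train_single))  # False: outside one partner's domain
print(in_domain(query, train_pooled))  # True: covered by the pooled data
```

Whether the real combined data behaves this way is exactly the open question; overlapping corporate collections might add depth rather than breadth of chemical space.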

I hope the project goes well.