Pavel Sidorov 1, Stefan Naulaerts 1,2, Jérémy Ariey-Bonnet 1, Eddy Pasquier 1 and Pedro J.
1 CRCM, INSERM, Cancer Research Center of Marseille, Institut Paoli-Calmettes, Aix-Marseille Univ, CNRS, Marseille, France
2 Department of Tumor Immunology, Institut de Duve, Bruxelles, Belgium

Drug combinations are of great interest for cancer treatment. Unfortunately, the discovery of synergistic combinations by purely experimental means is only feasible on small sets of drugs. In silico modeling methods can substantially widen this search by providing tools able to predict which of all possible combinations in a large compound library are synergistic. Here we investigate to what extent drug combination synergy can be predicted by exploiting the largest available dataset to date (NCI-ALMANAC, with over 290,000 synergy determinations). Each cell line is modeled using primarily two machine learning techniques, Random Forest (RF) and Extreme Gradient Boosting (XGBoost), on the datasets provided by NCI-ALMANAC. This large-scale predictive modeling study comprises more than 5,000 pair-wise drug combinations, 60 cell lines, 4 types of models, and 5 types of chemical features. The application of a powerful, yet uncommonly used, RF-specific technique for reliability prediction is also investigated. The evaluation of these models shows that it is possible to predict the synergy of unseen drug combinations with high accuracy (Pearson correlations between 0.43 and 0.86 depending on the considered cell line, with XGBoost providing slightly better predictions than RF). We have also found that restricting to the most reliable synergy predictions results in at least a 2-fold error decrease with respect to employing the best learning algorithm without any reliability estimation. Alkylating agents, tyrosine kinase inhibitors and topoisomerase inhibitors are the drugs whose synergy with other partner drugs is better predicted by the models. Despite its leading size, NCI-ALMANAC comprises an extremely small part of all conceivable combinations. Given their accuracy and reliability estimation, the developed models should drastically reduce the number of required in vitro tests by predicting in silico which of the considered combinations are likely to be synergistic.

Drug combinations are a well-established form of cancer treatment (Bayat Mokhtari et al., 2017). Administering more than one drug can provide many benefits: higher efficacy, lower toxicity, and at least delayed onset of acquired drug resistance (Sugahara et al., 2010; Holohan et al., 2013; Crystal et al., 2014). Serendipitous discovery in the clinic has been a traditional source of effective drug combinations (Zoli et al., 2001; Kurtz et al., 2015). Yet systematic large-scale efforts to identify them have only recently been pursued, with a growing number of preclinical experimental efforts to identify synergistic combinations (Zoli et al., 2001; Budman et al., 2012; Lieu et al., 2013; Kashif et al., 2015; Yu et al., 2015; Kischkel et al., 2017) being reported in the literature. The sheer number of available and possible drug-like molecules (Polishchuk et al., 2013) and an exponential number of their combinations, however, make the process of finding new therapeutic combinations by purely experimental means highly inefficient.

An efficient way of discovering molecules with previously unknown activity on a given target is using in silico prediction methods. Quantitative Structure-Activity Relationship (QSAR) models establish a mathematical relationship between the chemical structure of a molecule, encoded as a set of structural and/or physico-chemical features (descriptors), and its biological activity on a target. Such methods have been successfully used in a wide variety of pharmacology and drug design projects (Cherkasov et al., 2014), including cancer research (Chen et al., 2007; Mullen et al., 2011; Ali and Aittokallio, 2018). QSAR models are traditionally built using simple linear models (Sabet et al., 2010; Pick et al., 2011; Speck-Planche et al., 2011, 2012) to predict the activity of individual molecules against a molecular target. In the last 15 years, non-linear machine learning methods, such as Neural Network (NN) (González-Díaz et al., 2007), Support Vector Machine (SVM) (Doucet et al., 2007) or Random Forest (RF) (Singh et al., 2015), have also been employed to build QSAR models. More recently, QSAR modeling has also achieved accurate prediction of compound activity on non-molecular targets such as cancer cell lines (Kumar et al., 2014).
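The two evaluation ideas mentioned in the abstract, the Pearson correlation between predicted and measured synergy and the error reduction obtained by keeping only the most reliable predictions, can be illustrated with a minimal Python sketch. The text does not say which reliability technique was used, so the per-prediction uncertainty score here (for example, the spread of individual tree predictions in a Random Forest) is an assumption. The function names and all numbers are hypothetical and invented for illustration; they are not NCI-ALMANAC data or the authors' code.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def rmse(pred, obs):
    """Root mean squared error between predictions and observations."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(pred))


def most_reliable(pred, obs, spread, keep_fraction):
    """Keep the fraction of predictions with the smallest uncertainty score."""
    order = sorted(range(len(pred)), key=lambda i: spread[i])
    k = max(1, int(len(pred) * keep_fraction))
    kept = order[:k]
    return [pred[i] for i in kept], [obs[i] for i in kept]


# Toy values, invented for illustration only.
observed = [10.0, -5.0, 20.0, 0.0, 15.0, -10.0]
predicted = [9.0, -4.0, 12.0, 6.0, 14.0, -11.0]
# Assumed uncertainty score per prediction (e.g. std. dev. across trees).
spread = [1.0, 1.5, 8.0, 7.0, 0.5, 2.0]

r_all = pearson(predicted, observed)
rmse_all = rmse(predicted, observed)

# Keep only the half of the predictions with the lowest uncertainty.
top_pred, top_obs = most_reliable(predicted, observed, spread, keep_fraction=0.5)
rmse_top = rmse(top_pred, top_obs)

print(f"Pearson r (all predictions): {r_all:.2f}")
print(f"RMSE (all predictions):      {rmse_all:.2f}")
print(f"RMSE (most reliable half):   {rmse_top:.2f}")
```

In this toy example the uncertainty score happens to track the prediction error, so filtering on it lowers the error; whether that holds on real data depends on how well the chosen reliability estimate is calibrated.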