Development of a Machine Learning Model to Design Target-specific Ligands

dc.creatorMathew, Ezek
dc.creatorLiu, Jin
dc.creatorWang, Duen-Shian
dc.creatorLiu, Kevin
dc.creator.orcid0000-0001-8365-2984 (Wang, Duen-Shian)
dc.descriptionResearch Appreciation Day Award Winner - 2022 School of Biomedical Sciences, Department of Pharmacology & Neuroscience - 2nd Place
dc.description.abstractBackground: With the estimated cost of bringing a drug to market ranging from $314 million to $2.8 billion, drug discovery is a lengthy and expensive process. Additionally, completion of Phase 3 trials does not guarantee FDA approval. For most drugs, the probability of receiving FDA approval ranges from 9% to 14%, depending on the time period. Researchers have therefore turned to machine learning (ML) to decrease the burden of drug discovery for multiple targets. In the central nervous system (CNS), the metabotropic glutamate receptor subtype 2 (mGlu2) and metabotropic glutamate receptor subtype 3 (mGlu3) play various roles in normal physiology. Ligands of these receptors therefore hold potential for the treatment of various pathologies, such as Alzheimer's disease, schizophrenia, and other neurological disorders. Currently, no published literature describes a machine learning model capable of distinguishing drug ligands based on their affinity for mGlu2 or mGlu3. To fill this gap in knowledge, we designed a machine learning algorithm capable of making associations across the entire dataset and identifying patterns that the human eye cannot detect. Methods: We used a dataset of two-dimensional (2D) images of drug ligands belonging to two classes, mGlu2 or mGlu3. The images were resized, converted to grayscale, and then stored as numerical NumPy arrays with their associated labels. Convolutional Neural Network (CNN) and Functional API architectures were tested to determine the optimal model, with hyperparameter optimization performed throughout. Results: The CNN and Functional API both reached 100% accuracy within 20 epochs, successfully classifying ligands as mGlu2 or mGlu3 based on 2D structure alone. However, the Functional API reached 100% accuracy in under 5 epochs, converging faster than the CNN.
Conclusion: While the CNN is one of the most popular ML architectures for image classification, the Functional API can perform a similar role. As datasets expand, it may be beneficial to consider more efficient models, especially for image classification in drug discovery.
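The preprocessing described in the Methods (resize, convert to grayscale, store as NumPy arrays with labels) can be sketched as follows. This is a minimal illustration, not the authors' code: the 64 x 64 target size, the nearest-neighbour resize, the BT.601 luma weights, the scaling to [0, 1], and all function names are assumptions, since the abstract does not specify them.

```python
import numpy as np

# Hypothetical target size; the abstract does not state the resolution used.
TARGET_SIZE = (64, 64)


def resize_nearest(image, size):
    """Resize an H x W x C image to size (rows, cols) by nearest-neighbour sampling."""
    height, width = image.shape[:2]
    rows = np.arange(size[0]) * height // size[0]
    cols = np.arange(size[1]) * width // size[1]
    return image[rows[:, None], cols]


def to_grayscale(image):
    """Collapse RGB to a single channel using ITU-R BT.601 luma weights (an assumption)."""
    weights = np.array([0.299, 0.587, 0.114])
    return image[..., :3] @ weights


def build_dataset(images, labels, size=TARGET_SIZE):
    """Return (X, y): X has shape (n, rows, cols, 1) scaled to [0, 1]; y holds 0/1 class ids."""
    class_index = {"mGlu2": 0, "mGlu3": 1}
    # Resize first, then convert to grayscale, matching the order given in the Methods.
    processed = [to_grayscale(resize_nearest(img, size)) / 255.0 for img in images]
    X = np.stack(processed)[..., np.newaxis].astype(np.float32)
    y = np.array([class_index[label] for label in labels], dtype=np.int64)
    return X, y
```

Given a list of RGB image arrays and a parallel list of "mGlu2"/"mGlu3" labels, `build_dataset` returns arrays in the shape that image classifiers commonly expect as input. A production pipeline would more likely rely on an imaging library with interpolated resizing.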
dc.description.sponsorshipThis work is partially supported by a grant (#RP17301) from the Cancer Prevention and Research Institute of Texas
dc.titleDevelopment of a Machine Learning Model to Design Target-specific Ligands