Browsing by Author "Liu, Jin"

A Dynamic Approach to Targeting Acid-Sensing Ion Channels: Computational simulations reveal key residues in ASICs
(2016-03-23) Liu, Jin; De La Cruz, Daniel
This molecular dynamics (MD) research project uses virtual model building to elucidate the functions of key calcium-binding sites in acid-sensing ion channels (ASICs). These integral membrane proteins form neuronal proton-sensitive channels associated with pain and central nervous system diseases, and therefore represent novel therapeutic targets. ASIC1 and ASIC3 are two ASIC subtypes with highly conserved channel “pore” sequences, yet they play different roles in the development of hyperalgesia after inflammatory muscle injury. It has been proposed that removal of calcium allows the ASIC3 channel to open, but this is not the case for ASIC1. The objective of this project is to identify key residues responsible for the distinct gating mechanisms of ASIC1 and ASIC3 using MD simulations. Models were built with the CHARMM-GUI Membrane Builder from the RCSB PDB structure 4KNY, which allowed manipulation and examination of ASIC1’s amino acid sequence. Six simulation trajectories were carried out (300 ns cumulative, 50 ns per trajectory) with the NAMD simulation software through remote access to the TACC supercomputing center. Previous experimental work has shown that, unlike ASIC3, the ASIC1 channel cannot be opened by the removal of calcium. Despite the highly conserved channel sequence of ASICs, this characteristic difference between the two subtypes may be defined by one key residue: a glutamic acid at position 429 in ASIC3 versus a glycine in ASIC1. Introduction of the G429E mutation opens the ASIC1 channel. Consistent with experimental observation, analysis with the VMD visualization software revealed that the G429E mutant has a wider channel opening than the wild type (WT). We further identified that this opening is facilitated by the electrostatic interaction of glutamic acid 429 and asparagine 65 of lateral chains.
We identified a key residue responsible for the distinct gating mechanisms of ASIC1 and ASIC3. Located at the lipoprotein interface, this key “gating” region of the pore may prove useful for identifying novel pharmacological targets and for understanding the differences in channel gating between ASIC1 and ASIC3. Novel applications are sought for the selective targeting of ASIC channel subtypes, as well as for targeting ASICs within specific regions of the body.
Allostery: An Overview of Its History, Concepts, Methods, and Applications
(PLOS, 2016-06-02) Liu, Jin; Nussinov, Ruth
The concept of allostery has evolved in the past century. In this Editorial, we briefly overview the history of allostery, from the pre-allostery nomenclature era starting with the Bohr effect (1904) to the birth of allostery by Monod and Jacob (1961). We describe the evolution of the allostery concept, from a conformational change in a two-state model (1965, 1966) to dynamic allostery in the ensemble model (1999); from multi-subunit (1965) proteins to all proteins (2004). We highlight the current available methods to study allostery and their applications in studies of conformational mechanisms, disease, and allosteric drug discovery. We outline the challenges and future directions that we foresee. Altogether, this Editorial narrates the history of this fundamental concept in the life sciences, its significance, methodologies to detect and predict it, and its application in a broad range of living systems.
An Investigation of the Allosteric Effects of Agonist and Antagonist Ligands on Sigma-1 Receptor using MD Simulation and Machine Learning Methods
(2022) Kumari, Pratibha; Liu, Jin
Purpose: Allosteric regulation is the control of the activity of a protein or protein complex by the binding of a ligand or effector molecule, at a site topographically distinct from the active site of the protein. The sigma-1 receptor (Sig1R), a small-ligand operated transmembrane protein, has been implicated in various neural processes such as calcium signalling, cell survival and function, inflammation, and synaptogenesis. Many small molecules act as agonist or antagonist ligands to Sig1R based on their ability to recapitulate the phenotype of receptor overexpression or knockdown, respectively. Sig1R exists in multiple oligomeric states, and agonist and antagonist are found to have a different impact on the oligomeric form of the receptor. The crystal structure of human Sig1R reveals that both agonist and antagonist ligands share the same binding pocket. However, why agonists and antagonists have distinct activities while binding to the same pocket remains unknown. It is also not clear why binding to a pocket not at the oligomer interface could allosterically affect oligomer formation of Sig1R. Our objective is to gain a molecular-level understanding of how agonist and antagonist ligands allosterically modulate the oligomer interactions differently. Method: An atomistic molecular dynamics (MD) simulation study was employed to investigate how the interface of homotrimer human Sig1R bound to agonist ((+)-pentazocine) and antagonist (PD 144418) ligands are allosterically affected. Machine learning algorithms developed by our lab were used to identify the residues that are impacted allosterically. Results: A significant decrease in the interactions between the interface residues of protomer units in agonist bound Sig1R has been found. MM/GBSA and PCA analysis reveal lowered stability of agonist-bound trimer in simulations compared to an antagonist-bound structure. 
The coordinated actions between the pocket and interface residues depend substantially on the type of ligand present in the binding pocket. The residue response map obtained using machine learning algorithms shows that the properties of most of the interface residues (T141, H54, H55, G87, L111, H116, R119, A183, D188, S192, Q194, D195, and T198) are affected in different ways. Conclusion: Even though agonist and antagonist ligands bind at the same pocket, their ability to allosterically impact the interface residues differs significantly, which may lead to lower stability of high-molecular-weight oligomers in agonist-bound Sig1R. Our research shows the potential of combining MD and machine learning methods to identify the allosteric responses to different ligands binding at the same pocket of a protein.
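One simple way to quantify a loss of interface interactions such as the one reported above is to count residue pairs from neighbouring protomers that lie within a distance cutoff. The sketch below is an illustration only: the 8.0 Å cutoff, the function name `count_interface_contacts`, and the toy coordinates are assumptions, not values or code from the study.

```python
import math

def count_interface_contacts(protomer_a, protomer_b, cutoff=8.0):
    """Count residue pairs (one per protomer) closer than `cutoff` angstroms.

    Each protomer is a dict mapping a residue label to an (x, y, z)
    coordinate, e.g. a C-alpha position. The cutoff is an assumed value.
    """
    contacts = 0
    for xyz_a in protomer_a.values():
        for xyz_b in protomer_b.values():
            if math.dist(xyz_a, xyz_b) < cutoff:
                contacts += 1
    return contacts

# Hypothetical coordinates for illustration only (not from any structure).
chain_a = {"H54": (0.0, 0.0, 0.0), "R119": (5.0, 0.0, 0.0)}
chain_b_antagonist = {"D195": (3.0, 4.0, 0.0), "T198": (6.0, 1.0, 0.0)}
chain_b_agonist = {"D195": (9.0, 9.0, 0.0), "T198": (6.0, 1.0, 0.0)}

print(count_interface_contacts(chain_a, chain_b_antagonist))
print(count_interface_contacts(chain_a, chain_b_agonist))
```

In a real analysis the same count would be averaged over the frames of each trajectory, so that agonist-bound and antagonist-bound simulations can be compared on equal terms.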
Cas9-catalyzed DNA Cleavage Generates Staggered Ends: Evidence from Molecular Dynamics Simulations
(Springer Nature, 2016-11-22) Zuo, Zhicheng; Liu, Jin
The CRISPR-associated endonuclease Cas9 from Streptococcus pyogenes (spCas9) along with a single guide RNA (sgRNA) has emerged as a versatile toolbox for genome editing. Despite recent advances in the mechanism studies on spCas9-sgRNA-mediated double-stranded DNA (dsDNA) recognition and cleavage, it is still unclear how the catalytic Mg(2+) ions induce the conformation changes toward the catalytic active state. It also remains controversial whether Cas9 generates blunt-ended or staggered-ended breaks with overhangs in the DNA. To investigate these issues, here we performed the first all-atom molecular dynamics simulations of the spCas9-sgRNA-dsDNA system with and without Mg(2+) bound. The simulation results showed that binding of two Mg(2+) ions at the RuvC domain active site could lead to structurally and energetically favorable coordination ready for the non-target DNA strand cleavage. Importantly, we demonstrated with our simulations that Cas9-catalyzed DNA cleavage produces 1-bp staggered ends rather than generally assumed blunt ends.
Characterize the Pre-catalytic State of CRISPR/Cas9
(2020) Liu, Jin; Chen, Xiongping
Purpose: CRISPR-Cas9 has been widely used as a gene-editing tool, but its catalytic mechanism remains elusive. Cas9 has two catalytic domains, the HNH domain and the RuvC domain. In our previous study, we identified the pre-catalytic state of the HNH domain. The purpose of this study is to identify the pre-catalytic state of the RuvC domain, which may provide an understanding of the Cas9 catalytic mechanism and strategies for Cas9 engineering. Methods: We used molecular dynamics simulations to discover the pre-catalytic state of Cas9. The initial structure was obtained from our previous simulations, in which the HNH domain was in the pre-catalytic state. We placed two Mg2+ ions and the non-target DNA strand at the RuvC domain and performed molecular dynamics simulations to capture the pre-catalytic state of the RuvC domain. Results: Our molecular dynamics simulations revealed a pre-catalytic state of Cas9 in which both the target and non-target DNA strands are poised for cleavage by Cas9. Conclusions: In this study, we identified the first atomic-level structure of CRISPR/Cas9 with both catalytic domains in the pre-catalytic state.
Computational Design of Compact CRISPR-Cas Enzymes of Lachnospiraceae bacterium Cas12a Utilizing Bioinformatic Tools
(2023) Mashburn, Dominic; Arachchige, Vindi; Liu, Jin
Purpose: Nature has provided us with a popular genome editing tool known as the CRISPR (clustered regularly interspaced short palindromic repeats)-Cas system, which has shown promise in both plants and animals. The CRISPR-Cas system uses a guide RNA (gRNA) and specific proteins known as Cas proteins to carry out its function. A major limitation of the CRISPR/Cas system, and of any gene therapy, is how it is delivered within the organism. The most common "vehicle" for delivering gene therapies is adeno-associated viral vectors (AAVs), which have a maximum effective capacity of approximately 4.7 kb. The main issue is that most Cas enzymes, together with the other CRISPR components needed, exceed this maximum capacity. The most widely characterized CRISPR-Cas system is Cas9. However, Cas12a's unique ability to process its own crRNA arrays without requiring tracrRNA makes it a promising candidate as well. In other CRISPR-Cas systems, the RNA CRISPR components need to be synthesized and packaged into an AAV, whereas in the Cas12a family, some of these components are not needed. Lachnospiraceae bacterium Cas12a (LbCas12a) has increased activity compared to Cas12a enzymes from other species. To address the size issue described above, we used various bioinformatic tools to computationally design compact-size variants of LbCas12a with similar functionality and comparable efficiency. Methods: The best available crystal structure of LbCas12a was chosen from the Protein Data Bank (PDB). A structure reduction process was carried out using Yasara and UCSF ChimeraX. The intermediate steps of this process were verified using the homology-based modeling tool SWISS-MODEL and the AI-based modeling tool AlphaFold2 to ensure that the protein still folded similarly to the original structure.
Furthermore, the global and local structural features were analyzed, and the best candidate was subjected to molecular dynamics (MD) simulations along with gRNA and substrate DNA to determine its functional efficiency under realistic dynamic conditions and to compare it with the original structure. Results/Conclusions: A compact-size variant of LbCas12a was generated, which is 292 residues smaller than the original crystal structure. This man-made miniature protein contains all the regions that are needed for DNA cleavage activity. MD simulations confirm its stability in the presence of DNA and gRNA. Further validation and experimental testing of the designed protein are under investigation at this point of the study.
Computational Insights into Cas9 Conformational Activation and Specificity Enhancement
(2018-03-14) Liu, Jin; Zuo, Zhicheng
Over the past few years, biotechnology harnessing the microbial CRISPR/Cas systems has revolutionized the field of genome editing. The RNA-guided endonuclease Cas9 from Streptococcus pyogenes (SpCas9) can be programmed with a synthetic single guide RNA (sgRNA) to induce site-specific double-stranded DNA (dsDNA) cleavage. Despite recent progress in deciphering the structural and functional mechanisms of Cas9, knowledge of the catalytic state of the Cas9 HNH nuclease domain remains sparse, and it remains unclear how the catalytic Mg2+ affects the HNH domain conformational transition. A deeper understanding of Cas9 conformational activation and its mechanism of action is of fundamental importance for guiding the improvement of Cas9-mediated genome-editing specificity and efficiency. Herein we report a cross-validated catalytic state of the Cas9 HNH domain poised for cutting the target DNA strand, obtained by means of two distinct molecular dynamics (MD) simulation strategies. We note that the derived model is in good agreement with, and rationalized by, various available experiments. Moreover, we demonstrate the essential roles of Mg2+ in the formation and stability of the cleavage state. Importantly, our study suggests additional promising mutation sites on Cas9 that could be exploited for rationally engineering more Cas9 variants with enhanced specificity.
Design of Man-made Miniature CRISPR-Cas Proteins Using Computational and Artificial Intelligence Technologies
(2023) Jayasinghe-Arachchige, Vindi; Madugula, Sita Sirisha; Nammi, Bharani; Nukala, Nihitha; Wang, Shouyi; Liu, Jin
Purpose: The CRISPR/Cas system is a popular genome editing technique that uses a guide RNA and specific proteins known as Cas proteins for its function. A major challenge in harnessing CRISPR-Cas technology for applications in living organisms is the lack of an efficient delivery system. Because the available Cas proteins used in this tool are large, it is challenging to encapsulate the CRISPR components in a single delivery vehicle. To address this issue, we used computational and Artificial Intelligence (AI) tools to design compact-size Cas proteins that have a similar function and are more efficient than the available Cas proteins. Methods: The available crystal structures of the smallest CRISPR-Cas systems were utilized and further reduced. A novel method termed the "Blocks and Gaps approach" was employed to design new mini-Cas proteins 450-500 amino acids in length. The generated protein sequences (1 million) were subsequently passed through two machine learning-based classification models to filter out the non-Cas proteins. The resultant Cas protein sequences were used in homology-modeling-based (SWISS-MODEL) and AI-based (AlphaFold2) protein structure prediction methods to obtain their 3D structures. Further, the global and local structural features as well as the solubility of these proteins were analyzed, and top candidates were subjected to molecular dynamics (MD) simulations including substrate DNA and gRNA. Results/Conclusions: A library of man-made miniature Cas proteins was generated, and these proteins are less than half the size of widely used CRISPR-Cas proteins such as Cas9 or Cas12a. 50% of these were predicted as Cas proteins by both of the machine learning-based classification models used, 90% of them show 3D structures similar to those of their original counterparts, and 10% of these passed the final validations.
Experimental testing of the activity of these designed proteins has yet to be carried out at this point of the study.
Design of man-made miniature CRISPR-Cas systems using computational technologies
(2022) Arachchige, Vindi Mahesha Jayasinghe; Liu, Jin
Purpose: The CRISPR/Cas system, an RNA-guided targeted genome engineering platform, is one of the breakthroughs of the twenty-first century. Despite its many advances, some limitations must be overcome to improve this revolutionary technology. Among them, the large size of the available Cas proteins that are essential for the functioning of these tools limits their in vivo administration because of low delivery efficiency. To address this issue, we used computational chemistry tools to design smaller, compact-size Cas proteins that can be used as an alternative. Methods: The available crystal structures of CRISPR-Cas systems were utilized, and the reduction was carried out with Chimera, Yasara, and the SWISS-MODEL software while preserving the regions that are essential for the DNA binding and cleavage functions. Molecular dynamics (MD) simulations were performed to obtain stable conformations of the reduced structures. The minimized sequences were used to generate their structures with SWISS-MODEL. Results/Conclusions: Four stable man-made miniature Cas proteins were generated that are less than half the size of currently used CRISPR systems such as Cas9 or Cas12a. The sequence-based modeling studies using SWISS-MODEL showed that these reduced proteins fold similarly to their original counterparts. Experimental validation of their dsDNA cleavage activities remains to be carried out at this point of the study.
Design of mini Cas9 proteins using computational tools
(2023) Artiles, Maria; Jayasinghe Arachchige, Vindi; Liu, Jin
Purpose: Adeno-associated viral (AAV) vectors are routinely used for the delivery of CRISPR-Cas systems. These vectors can only package molecules up to ~4.7 kilobase pairs (kbp) in size. The most widely used Cas protein is Streptococcus pyogenes Cas9 (SpCas9), which comprises 1368 amino acid (aa) residues and is 4.3 kbp in size. Therefore, delivery of Cas9 and the guide RNA (gRNA) requires two separate vectors, which decreases the overall effectiveness of the system. In this study we used computational tools to facilitate the design of mini Neisseria meningitidis Cas9 (Nme1Cas9) nucleases, 900 aa in length or less, that can be packaged with their associated gRNA in a single vector. Nme1Cas9 is a promising system, given that in its wild-type form it is already 286 aa residues smaller than the widely used SpCas9, and it has shown promising effectiveness in mammalian cells. Additionally, Nme1Cas9 has a longer spacer-derived guide sequence than other orthologs and a longer protospacer adjacent motif (PAM) consensus, which reduces the propensity for off-target effects. For these reasons, Nme1Cas9 provides an ideal starting point for the development of an engineered mini Cas9 protein, allowing us to exploit its natural features and optimize them with computational tools including artificial intelligence (AI) and machine learning. Methods: We used the Protein Data Bank (PDB), UniProt, and ChEMBL databases to obtain the sequences and crystal structures of Cas orthologs and their associated gRNA and DNA sequences. Next, we identified the known DNA- and RNA-interacting residues of Nme1Cas9 from the available literature. Sequence alignments were performed with Clustal Omega. Structural visualizations and reductions were performed with ChimeraX and Yasara.
AlphaFold2 was used for 3D structure prediction, and molecular dynamics (MD) simulations were used to determine the stability of the designed proteins. Results/Conclusions: We generated a library of mini Nme1Cas9 sequences that are less than 900 aa in length. The AI-based modeling studies using AlphaFold2 showed that these mini Cas proteins fold similarly to their original counterparts. MD simulations confirm their stability in the presence of DNA and gRNA. Further validation and experimental testing of the designed proteins are under investigation at this point of the study. Keywords: CRISPR-Cas9, Cas9, mini Cas9, Cas9 orthologs, AI.
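The size argument in this abstract can be checked with back-of-the-envelope arithmetic. The sketch below estimates the coding length of a protein at 3 bp per residue plus one stop codon and subtracts it from the ~4.7 kbp AAV limit. It counts the protein-coding sequence only (no promoter, gRNA cassette, or other elements), so it is a lower bound and may differ from sizes quoted in the abstract. The function names are illustrative assumptions.

```python
AAV_CAPACITY_BP = 4700  # ~4.7 kbp packaging limit quoted in the abstract

def coding_length_bp(n_residues, include_stop=True):
    """Base pairs needed to encode a protein: 3 per residue, plus a stop codon."""
    return 3 * n_residues + (3 if include_stop else 0)

def remaining_capacity_bp(n_residues, capacity_bp=AAV_CAPACITY_BP):
    """Space left in the vector after the protein-coding sequence alone."""
    return capacity_bp - coding_length_bp(n_residues)

# Residue counts taken from the abstract (1368 aa, 286 aa smaller, 900 aa).
proteins = [("SpCas9", 1368), ("Nme1Cas9", 1368 - 286), ("mini Nme1Cas9", 900)]
for name, length in proteins:
    print(name, coding_length_bp(length), remaining_capacity_bp(length))
```

Under these assumptions a 900 aa design leaves roughly 2 kbp for the gRNA and regulatory elements, compared with well under 1 kbp for full-length SpCas9.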
Determining the binding site of Carisoprodol on GABAA receptor
(2020) Liu, Jin; Huang, Renqi; Claudio, Maria; Hayatshahi, Sayyed
Purpose: Carisoprodol (CSP) is prescribed to treat musculoskeletal pain. CSP exerts inhibitory action on GABAA receptors (GABAA Rs) at certain concentrations. However, its binding sites remain elusive. The purpose of this study is to determine the binding site for CSP's inhibitory action on GABAA Rs. Our electrophysiological studies have shown that CSP's inhibitory action is diminished by the alpha1 T261F mutation of the picrotoxin (PTX) binding site. Therefore, we hypothesize that CSP shares PTX's binding site at GABAA Rs. Methods: We docked CSP on wild-type alpha1beta2gamma2 and mutant alpha1(T261F)beta2gamma2 GABAA Rs using the Glide program. We further performed molecular dynamics (MD) simulations of wild-type and mutant GABAA Rs in unbound forms and in complex with PTX and CSP. Results: The docking reproduced the experimental pose of PTX and the effect of mutations on its binding, but could not predict the effect of the mutations on CSP binding. However, the MD simulations showed that the local channel conformation changes upon mutation and that, consequently, the binding of both ligands deteriorates significantly. We further used the observed receptor-ligand interactions of CSP to predict molecular changes that would improve its binding. Conclusions: We demonstrate that pocket dynamics must be considered to capture the changes that mutations potentially cause in GABAA Rs. The similar trend for CSP and PTX in the MD simulation results validates our hypothesis that the two molecules share the same binding pocket. These data provide further information on how CSP may interact with the receptors.
Development of a Machine Learning Model to Design Target-specific Ligands
(2022) Mathew, Ezek; Liu, Jin; Wang, Duen-Shian; Liu, Kevin
Background: As the estimated cost required to bring a drug to market ranges from $314 million to $2.8 billion, drug discovery is undoubtedly a lengthy and expensive process. Additionally, completion of Phase 3 trials does not guarantee FDA approval. For most drugs, the probability of receiving FDA approval ranges from 9% to 14%, depending on the time period. Therefore, researchers have turned to machine learning (ML) to decrease the burden of drug discovery for multiple targets. In the central nervous system (CNS), the metabotropic glutamate receptor subtype 2 (mGlu2) and metabotropic glutamate receptor subtype 3 (mGlu3) play various roles in normal physiology. Therefore, ligands of these receptors pose potential for the treatment of various pathologies, such as Alzheimer's disease, schizophrenia, and other neurological disorders. Currently, no literature exists referencing a machine learning model that is capable of distinguishing drug ligands based on their affinity to mGlu2 or mGlu3. To fill this gap in knowledge, we will design a machine learning algorithm capable of making associations across the entire data set, identifying patterns that the human eye cannot detect. Methods: We utilized a dataset which included two dimensional (2D) images of drug ligands belonging to two classes, mGlu2 or mGlu3. The images were resized, then converted into grayscale and subsequently processed as a numerical NumPy array with their associated labels. Convolutional Neural Network (CNN) and Functional API architecture were tested to determine the optimal model. Hyperparameter optimization occurred throughout this process. Results: The CNN and Functional API both reached 100% accuracy within 20 epochs, successfully classifying ligands as mGlu2 or mGlu3 based on 2D structure alone. However, the Functional API reached 100% accuracy in under 5 epochs, yielding superior performance when compared to the CNN. 
Conclusion: While the CNN is one of the most popular ML architectures for image classification, the Functional API can perform a similar role. As datasets expand, it may be beneficial to consider more efficient models, especially for image classification in the realm of drug discovery.
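The preprocessing described in the Methods (resize, convert to grayscale, store as a numeric array) can be sketched as follows. This is a dependency-free stand-in that uses nested lists rather than NumPy. The nearest-neighbour resize, the ITU-R BT.601 luminance weights, the [0, 1] scaling, and all function names are assumptions chosen for illustration, not the authors' pipeline.

```python
def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) pixels to luminance with ITU-R BT.601 weights."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_image]

def resize_nearest(image, new_h, new_w):
    """Nearest-neighbour resize of a 2D list of pixels."""
    old_h, old_w = len(image), len(image[0])
    return [[image[i * old_h // new_h][j * old_w // new_w]
             for j in range(new_w)]
            for i in range(new_h)]

def preprocess(rgb_image, size=(2, 2)):
    """Resize, convert to grayscale, and scale intensities to [0, 1]."""
    resized = resize_nearest(rgb_image, *size)
    gray = to_grayscale(resized)
    return [[value / 255.0 for value in row] for row in gray]

# Toy 4x4 "ligand image": white background with one black pixel.
image = [[(255, 255, 255)] * 4 for _ in range(4)]
image[0][0] = (0, 0, 0)

features = preprocess(image)
label = 0  # hypothetical encoding: 0 = mGlu2, 1 = mGlu3
print(features, label)
```

Each preprocessed image, paired with its class label, would then be fed to whichever classifier is being compared.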
Development of a mobile-app to enable communication between patients, pharmacists and physicians
(2020) Liu, Jin; White, Annesha; Srinivasan, Meenakshi
Purpose- Today, healthcare in the United States is a highly fragmented system, with failure of care-coordination estimated to cost the healthcare system between $27.2 billion and $78.2 billion. Fax- and phone-based communication between pharmacists and physicians often results in delays in patient care. We aim to develop "HealthKnect", a web-based integrated platform that facilitates communication to help patients, pharmacists and providers resolve delays and miscommunication. Methods- Market-research was conducted to identify available mobile applications focused on patient engagement with physicians and pharmacists. The desirable features of the proposed application were identified. A wire-frame was developed to design the interface of the application. The next step involves development of a functional web-based application with the help of a sub-contractor. Finally, the application will be pilot tested among various stakeholders to gather feedback and determine proof-of-concept. Results- The applications identified in the market-research fell into one of the following categories- (1) Tele-medicine and communication (2) E-prescription (3) Prescription discounts and (4) Patient EHR portals. There was a lack of integrated platforms cutting across health systems and EHR barriers where patient-initiated communication could take place between physicians and pharmacists. Conclusion- HealthKnect aims to eliminate the "silo-approach" to healthcare delivery, which leads to poor information flow due to a lack of care-coordination between physicians and pharmacists. The application aims to empower patients and include pharmacists in digital collaboration for patient-care. The functional application is in the process of being developed.
Identification of Family-Specific Features in Cas9 and Cas12 Proteins: A Machine Learning Approach Using Complete Protein Feature Spectrum
(Cold Spring Harbor Laboratory, 2024-02-08) Madugula, Sita S.; Pujar, Pranav; Bharani, Nammi; Wang, Shouyi; Jayasinghe-Arachchige, Vindi M.; Pham, Tyler; Mashburn, Dominic; Artilis, Maria; Liu, Jin
The recent development of CRISPR-Cas technology holds promise to correct gene-level defects for genetic diseases. The key element of the CRISPR-Cas system is the Cas protein, a nuclease that can edit the gene of interest assisted by guide RNA. However, these Cas proteins suffer from inherent limitations like large size, low cleavage efficiency, and off-target effects, hindering their widespread application as a gene editing tool. Therefore, there is a need to identify novel Cas proteins with improved editing properties, for which it is necessary to understand the underlying features governing the Cas families. In the current study, we aim to elucidate the unique protein attributes associated with Cas9 and Cas12 families and identify the features that distinguish each family from the other. Here, we built Random Forest (RF) binary classifiers to distinguish Cas12 and Cas9 proteins from non-Cas proteins, respectively, using the complete protein feature spectrum (13,495 features) encoding various physiochemical, topological, constitutional, and coevolutionary information of Cas proteins. Furthermore, we built multiclass RF classifiers differentiating Cas9, Cas12, and Non-Cas proteins. All the models were evaluated rigorously on the test and independent datasets. The Cas12 and Cas9 binary models achieved a high overall accuracy of 95% and 97% on their respective independent datasets, while the multiclass classifier achieved a high F1 score of 0.97. We observed that Quasi-sequence-order descriptors like Schneider-lag descriptors and Composition descriptors like charge, volume, and polarizability are essential for the Cas12 family. More interestingly, we discovered that Amino Acid Composition descriptors, especially the Tripeptide Composition (TPC) descriptors, are important for the Cas9 family. 
Four of the important descriptors identified for Cas9 classification are the tripeptides PWN, PYY, HHA, and DHI, which are conserved across all the Cas9 proteins and are located within different catalytically important domains of the Cas9 protein structure. Among these four tripeptides, DHI and HHA are well known to be involved in the DNA cleavage activity of the Cas9 protein. We therefore propose that the other two tripeptides, PWN and PYY, may also be essential for the Cas9 family. The important descriptors we identified enhance the understanding of the catalytic mechanisms of Cas9 and Cas12 proteins and provide valuable insights into the design of novel Cas systems with enhanced gene-editing properties.
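To make the Tripeptide Composition (TPC) descriptor concrete: it is the frequency of each overlapping three-residue window in a sequence, which gives up to 20^3 = 8000 features per protein. The sketch below computes only the non-zero entries. The function name and the toy peptide are assumptions for illustration, and the study's feature-extraction software may normalise differently.

```python
from collections import Counter

def tripeptide_composition(sequence):
    """Return the frequency of each overlapping tripeptide in `sequence`."""
    sequence = sequence.upper()
    total = len(sequence) - 2  # number of overlapping 3-residue windows
    if total <= 0:
        return {}
    counts = Counter(sequence[i:i + 3] for i in range(total))
    return {tripeptide: n / total for tripeptide, n in counts.items()}

toy_sequence = "MDHIPWNADHI"  # hypothetical peptide, not a real Cas9 fragment
tpc = tripeptide_composition(toy_sequence)
for motif in ("PWN", "PYY", "HHA", "DHI"):
    print(motif, round(tpc.get(motif, 0.0), 3))
```

A classifier such as a Random Forest would receive one such vector per protein, alongside the other descriptor families mentioned in the abstract.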
Identification of New Allosteric Modulators for the mGlu2 Receptor by using a Ligand-based Drug Discovery Approach
(2023) Nguyen, Trong; Kumari, Pratibha; Mathew, Ezek; Liu, Jin
Purpose: The human mGlu receptors are G protein-coupled receptors located within the central nervous system. These receptors normally bind glutamate, the primary excitatory neurotransmitter in the body, and can then help modulate the transmission of excitatory signals within the brain. These characteristics make the mGlu2 receptor a potential novel target for future drug development, particularly for the treatment of certain neurologic or neuropsychiatric disorders, such as schizophrenia or depression. However, most allosteric ligands bind non-selectively to both mGlu2 and mGlu3 receptors. A pharmacological tool that helps distinguish ligands specific to the mGlu2 and mGlu3 receptor subtypes will be pivotal in speeding up the drug discovery process. Our purpose in this study is to find novel potential allosteric modulators of the mGlu2 receptor, starting from already identified modulators, through a ligand-based drug design approach. Methods: The potential allosteric ligands for the mGlu2 receptor were obtained by performing similarity searches on the online databases ZINC and DrugBank. The original compounds used as the basis for the similarity searches came from a previously compiled list of the Top 39 ZINC mGlu2 ligands (from the Liu Lab). Once the ligands were downloaded, they were converted into the appropriate file formats for molecular docking. Due to time constraints, we docked only the compounds whose original ligands returned <10 results from similarity searching in ZINC. The selected ligands were then docked using AutoDock Vina and visualized using PyMOL. The Top 3 ligands were then determined based on their presence within the mGlu2 allosteric binding pocket and their predicted binding affinity for the receptor. Additionally, these ligands were analyzed using a previously developed machine learning model.
Specifically, the machine learning model predicted mGlu2 ligand likeness and binding affinity for each of the obtained ligands. Results: A total of 1507 allosteric ligands were obtained for the mGlu2 receptor through the similarity searches. The machine learning model classified 88.89% of the similar ligands as more likely to be mGlu2 ligands. Additionally, 83.50% of the ligands were predicted to have a high binding affinity for the mGlu2 receptor. A total of 46 compounds were docked to the mGlu2 receptor using AutoDock Vina, and their predicted binding affinities were obtained. The Top 3 similar ligands for the mGlu2 receptor, listed in order, exhibited binding affinities of -12.5 kcal/mol, -12.3 kcal/mol and -11.0 kcal/mol. Conclusion: We identified 1507 potential ligands for the mGlu2 receptor through similarity searches. Through further molecular docking of 46 of the similar ligands, we identified three specific allosteric ligands for the mGlu2 receptor that are comparable to or slightly better than their original counterparts. However, additional research and investigation are required to validate their potential efficacy. Future studies should analyze the specific protein-ligand interactions between the mGlu2 receptor and the three similar allosteric ligands, and then compare them with the interactions present in their original counterparts.
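Selecting the Top 3 ligands by predicted binding affinity amounts to sorting docking scores in ascending order, since a more negative value in kcal/mol indicates stronger predicted binding. In the sketch below, the ligand IDs and the two weaker scores are hypothetical; only the three best values are taken from the abstract. The additional check that a pose sits inside the allosteric pocket is not modelled here.

```python
def top_ligands(scores, n=3):
    """Return the n ligands with the most negative predicted binding affinity."""
    return sorted(scores.items(), key=lambda item: item[1])[:n]

# Hypothetical ligand IDs. The three best scores match the abstract;
# the remaining values are made up for illustration.
docking_scores = {
    "ligand_A": -11.0,
    "ligand_B": -9.4,
    "ligand_C": -12.5,
    "ligand_D": -8.7,
    "ligand_E": -12.3,
}

for name, score in top_ligands(docking_scores):
    print(f"{name}: {score} kcal/mol")
```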
Identification of Potential Positive Allosteric Modulators of Sigma-1 Receptor using Computational Molecular Docking and Virtual Screening
(2022) Olson, Zachary Gunnar; Kumari, Pratibha; Liu, Jin
Purpose: Coronaviruses (such as SARS-CoV-2) can achieve replication in host cells by activating pathways in the endoplasmic reticulum (ER), which causes ER stress. The mortality rate of elderly populations with COVID-19 infection is dramatically high, indicating that a timely response of cell stress response signaling pathways plays a vital role in the management and treatment of COVID-19. The sigma-1 receptor (Sig1R) is an important upstream modulator of ER stress, which regulates folding/degradation of proteins, Ca2+ homeostasis, ER stress responses, and cellular survival. Therefore, ligands enhancing Sig1R activities may improve the treatment of COVID-19 in elderly patients. Positive Allosteric Modulators (PAMs) can enhance protein activities by binding at an allosteric site. Several PAMs of Sig1R have been reported. However, the molecular basis of the interactions of PAMs with Sig1R is poorly understood. Further, little information is available yet about the allosteric binding sites in Sig1R. Our purpose in this research is to identify possible chemical scaffolds/compounds that can bind at the allosteric sites of Sig1R and selectively elicit the activity of Sig1R. Method: In this study, we assessed several known PAMs of Sig1R to investigate their binding affinity and the molecular basis of their interactions at three possible allosteric binding sites in Sig1R using the efficient docking suite Glide. In addition, we explored the ZINC and DrugBank databases to search for compounds/chemical scaffolds similar to known PAMs, which can be docked and engineered further to obtain a highly efficient PAM of Sig1R. Results: We found that methylphenylpiracetam, SKF38393, and SCH23390 show high affinity for the allosteric pockets. Further, by virtual screening of small drug-like compounds from the ZINC database in AutoDock Vina, we obtained a list of 1000 compounds for each allosteric pocket of Sig1R.
In the next step, we plan to continue refining our search by docking these compounds, together with the compounds obtained through the ligand-based search, in Glide to identify a promising set of compounds that bind efficiently at an allosteric site in Sig1R. Conclusion: Using molecular docking, we have found three compounds, methylphenylpiracetam, SKF38393, and SCH23390, that bind to Sig1R at the allosteric pockets with high binding affinities, and we have identified a list of 1000 compounds for each potential allosteric site, shedding light on the further development of selective PAMs of Sig1R.
Identification of Triple Negative Breast Cancer Epidemiological Risk Factors for African American Women
(2020) Del Valle, Michelle; Hayatshahi, Sayyed; Liu, Jin; Radler, Charlene; Morid, Mohammad; Calcagno, Alexa
Purpose: Triple Negative Breast Cancer (TNBC) is a basal breast cancer subtype lacking progesterone receptor, estrogen receptor, and HER2 gene expression. TNBC has a poor prognosis with limited treatment options, and it occurs at higher rates in African American women compared to White American women. The purpose of this project was to classify race-dependent socioeconomic risk factors and to initiate a guideline for TNBC screening in African American women. Methods: We used data from the Surveillance, Epidemiology, and End Results Program (SEER) of the National Cancer Institute. Our dataset included 35,976 female TNBC patients with 441 attributes documented in the United States. Top attributes for TNBC were specified for female African American patients by applying machine learning principles and optimization algorithms. Traditional statistical techniques were used to validate the key risk factors identified via machine learning. The top attributes identified represent socioeconomic, phenotypic, and epidemiologic factors. Results: The top factors related to African American TNBC patients, as determined by machine learning and descriptive statistics, included marital status at diagnosis, age at diagnosis, and insurance status. Specifically, single status, lack of private insurance, and younger age at diagnosis corresponded to worse prognosis and increased mortality. Conclusions: Based on the top attributes identified, African American women would benefit from screening for TNBC beginning in their mid-to-late 40s, with particular attention given to single and uninsured women. Such screening methods might correspond to earlier diagnosis, prevention, and treatment and decrease rates of incidence and mortality of TNBC in African American women.
Identifying Top Gene Contributors to Triple Negative Breast Cancer Health Disparities Among African American Women: A Machine Learning Approach
(2019-03-05) Liu, Jin; Hayatshahi, Hamed; Morid, Mohammad Amin; Green, Amyia; Fluker, Kenneth Jr.; Ahuactzin, Emilio; Radler, Charlene
Purpose: Triple Negative Breast Cancer (TNBC) is a breast cancer subtype that multiple studies have shown to be disproportionately prevalent among premenopausal African American women. The factors contributing to the TNBC health disparities remain unclear. Methods: Here, we developed a highly accurate, reproducible machine learning classification model that used patient gene expression values as predictor attributes to classify 100 TNBC patients as either African American or non-African American. Results: By using weighting methods and comparing classification performance at varying numbers of attributes, our study identified a subset of genes able to accurately classify TNBC patients by race. Intriguingly, the top genes of this subset are linked to diabetes, indicating that diabetes may be associated with the TNBC health disparities. Conclusions: Our study demonstrated factors contributing to the TNBC health disparities and provided a subset of genes that may be targetable for precision medicine development to address the disparity of TNBC among the African American female population.
Leveraging Graph Attention Mechanisms to Create an Explainable Multi-Function Machine Learning Model
(2024-03-21) Mathew, Ezek; Madugula, Sita Sirisha; Emmitte, Kyle; Liu, Jin
Purpose: Identifying target-specific ligands is a difficult task, especially in cases where receptors display high structural similarity. Such is the case for metabotropic glutamate receptor subtype 2 (mGlu2) and metabotropic glutamate receptor subtype 3 (mGlu3), which are prime targets for various neurological treatments. However, signal transduction through these two receptors often yields opposing physiological functions and differentially affects pathologies. Methods: Understanding the need to differentiate ligands based on their binding to mGlu2 and mGlu3, we employed a machine learning (ML) approach. The ML model performed three distinct tasks and leveraged transfer learning to inform each subsequent task. Task 1: Simple classification was performed, as the ML model predicted whether the ligands displayed selectivity for the mGlu2 or mGlu3 class. Task 2: Regression was performed, as the ML model estimated the IC50 values of individual input ligands. The classification weights from Task 1 were broadcast into the attention layers of the ML model for Task 2, serving as a starting point. Task 3: Classification was performed, as the ML model sought to determine whether a ligand displayed low or high potency for the target class. Classification weights and regression weights from the previous tasks were broadcast into the model. Results: The model yielded greater than 99% accuracy in the selectivity classification task, while also delivering satisfactory performance when predicting potency (72.80% error). The model yielded 83% accuracy in correctly identifying high-potency mGlu2 ligands as high-potency mGlu2 compounds. Meanwhile, the algorithm displayed 75% accuracy in correctly identifying high-potency mGlu3 ligands as high-potency mGlu3 compounds. Conclusions: This approach allows for prediction of multiple target properties using a single model.
With access to other high-quality datasets, this model has the potential to apply to other ligand classes of interest, posing great potential for drug repurposing studies.
Ligand Identification for the Orthosteric site of Sigma 1 Receptor using Computational Molecular Docking and Virtual Screening Methods
(2023) Olson, Zachary; Liu, Jin; Kumari, Pratibha
Purpose: The σ1 receptor (Sig-1R) is a ligand-operated membrane protein that resides in the mitochondria-associated membranes of the endoplasmic reticulum (ER). At the molecular level, Sig-1R has several important roles in cellular homeostasis, including Ca2+ regulation and helping chaperone the unfolded protein response. This ER stress has been found to be one of the factors leading to cytokine storm and clinical deterioration in patients with a coronavirus infection, leading to an interest in drugs that modulate the response of the Sig-1R for treatment of COVID-19. These receptors are found throughout the CNS as well as the periphery, explaining their wide range of effects throughout the body. At the organ level, studies conducted on the Sig-1R have implicated its involvement in neurodegenerative diseases such as Parkinson’s and Alzheimer’s diseases, cardiac diseases such as heart failure and cardiovascular disease (CVD), and major depressive disorder. This implication in a wide variety of disease states means it has large potential as a drug target. Our study's purpose is to identify novel potential drug candidates at the orthosteric binding site of Sig-1R with high binding affinity, specificity, and favorable PK parameters using a structure-based drug design approach. Method: Prior work in this lab found a list of the top 1000 orthosteric ligands by docking Sig-1R against libraries containing 9,270 small drug-like molecules using the TACC drug discovery tool. These libraries were extracted from the ZINC database. From this list of 1000 compounds, we selected the top 130 compounds (binding affinity of -11.0 kcal/mol or stronger) and re-docked them against the Sig-1R using the efficient docking suite Glide in Maestro. We also analyzed the pharmacokinetic/ADME parameters of these compounds using SwissADME, identifying possible candidates to use as a scaffold for designing a ligand with even stronger binding affinity for the orthosteric site of Sig-1R.
Furthermore, we docked the 130 compounds with the Dopamine Receptor D2 (D2R) to analyze their specificity for the Sig-1R. Results: Using extra-precision molecular docking in Glide, we found that molecule 106 (-12.88 kcal/mol), molecule 105 (-12.83 kcal/mol), and molecule 100 (-12.29 kcal/mol) all had very high binding affinity for the Sig-1R. Both molecules 105 and 100 had favorable PK parameters, as both were estimated to be BBB permeable and neither violated any criterion of Lipinski’s Rule of Five. Molecule 100 was also found to have relatively low binding affinity (-7.6 kcal/mol) for the D2R. Conclusion: Using our computational molecular docking methods, we have identified molecule 100 as a ligand with strong affinity and specificity for the Sig-1R, as well as favorable PK parameters. This could be a strong candidate to use as a chemical scaffold to develop a ligand with even stronger binding affinity for Sig-1R, which can eventually go on to in vitro assays to confirm activity.
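For readers unfamiliar with Lipinski's Rule of Five mentioned in the abstract above, the following minimal sketch counts violations of its four standard criteria (molecular weight at most 500 Da, logP at most 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors). It is not the authors' workflow, which used SwissADME; the `Descriptors` class, the function name, and the descriptor values are hypothetical and do not describe any molecule from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed molecular descriptors (hypothetical container)."""
    mol_weight: float        # molecular weight in Da
    log_p: float             # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def lipinski_violations(d: Descriptors) -> int:
    """Count how many of the four Rule of Five criteria are violated."""
    checks = (
        d.mol_weight > 500.0,
        d.log_p > 5.0,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    )
    return sum(checks)


# Illustrative values only; not taken from the study.
example = Descriptors(mol_weight=412.5, log_p=3.8, h_bond_donors=1, h_bond_acceptors=4)
print(lipinski_violations(example))
```

A compound with zero violations satisfies every criterion, which is the condition the abstract reports for molecules 105 and 100.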