Molecular Genetics

Permanent URI for this collection: https://hdl.handle.net/20.500.12503/32085

Recent Submissions

  • Item
    Computational Design of Compact CRISPR-Cas Enzymes of Lachnospiraceae bacterium Cas12a Utilizing Bioinformatic Tools
    (2023) Mashburn, Dominic; Arachchige, Vindi; Liu, Jin
    Purpose: The CRISPR (clustered regularly interspaced short palindromic repeats)-Cas system is a popular, naturally derived genome editing tool that has shown promise in both plants and animals. The CRISPR-Cas system uses a guide RNA (gRNA) and specific proteins known as Cas proteins to carry out its function. A major limitation of the CRISPR-Cas system, as of any gene therapy, is how it is delivered within the organism. The most common "vehicle" for delivering gene therapies is the adeno-associated viral vector (AAV), which has a maximum effective capacity of approximately 4.7 kb. The main issue is that most Cas enzymes, together with the other CRISPR components required, exceed this capacity. The most widely characterized CRISPR-Cas system is Cas9. However, Cas12a can process its own crRNA arrays without requiring a tracrRNA, which makes it a promising candidate as well. In other CRISPR-Cas systems, the RNA components need to be synthesized and packaged into an AAV, whereas in the Cas12a family some of these components are not needed. Lachnospiraceae bacterium Cas12a (LbCas12a) has higher activity than Cas12a enzymes from other species. To address the size issue, we used various bioinformatic tools to computationally design compact variants of LbCas12a with similar functionality and comparable efficiency. Methods: The best available crystal structure of LbCas12a was chosen from the Protein Data Bank (PDB). A structure reduction process was carried out using YASARA and UCSF ChimeraX. The intermediate steps of this process were verified using the homology-based modeling tool SWISS-MODEL and the AI-based modeling tool AlphaFold2 to ensure that the protein still folded similarly to the original structure. Furthermore, the global and local structural features were analyzed, and the best candidate was subjected to molecular dynamics (MD) simulations along with gRNA and substrate DNA to determine its functional efficiency under realistic dynamic conditions and to compare it with the original structure. Results/Conclusions: A compact variant of LbCas12a was generated that is 292 residues smaller than the original crystal structure. This man-made miniature protein contains all the regions needed for DNA cleavage activity. MD simulations confirm its stability in the presence of DNA and gRNA. Further validation and experimental testing of the designed protein are under way.
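    The size constraint described in this abstract comes down to simple arithmetic, sketched here in Python under stated assumptions: each residue costs three nucleotides plus one stop codon, the 4.7 kb capacity is the approximate figure quoted in the abstract, and the 1,228-residue full-length value is an assumed illustrative length that does not come from the abstract. Promoter, gRNA array, and other payload elements are ignored, so the result is only a rough lower bound on insert size.

```python
AAV_CAPACITY_BP = 4700  # approximate AAV capacity quoted in the abstract


def coding_length_bp(residues: int) -> int:
    """Nucleotides needed to encode a protein: 3 per residue plus a stop codon."""
    return 3 * residues + 3


def remaining_capacity_bp(residues: int, capacity_bp: int = AAV_CAPACITY_BP) -> int:
    """Space left in the vector for promoter, gRNA array, and other elements."""
    return capacity_bp - coding_length_bp(residues)


FULL_LENGTH = 1228           # ASSUMPTION: illustrative full-length residue count
COMPACT = FULL_LENGTH - 292  # 292-residue reduction reported in the abstract

for name, n in (("full-length", FULL_LENGTH), ("compact", COMPACT)):
    print(name, coding_length_bp(n), "bp coding,",
          remaining_capacity_bp(n), "bp left")
```

    Whatever the true starting length, removing 292 residues frees 876 bp of coding sequence for other vector elements.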
  • Item
    Machine Learning Based Classification of CRISPR-Cas Proteins Using Complete Protein Spectrum
    (2023) Madugula, Sita Sirisha; Arachchige, Vindi Mahesha Jayasinghe; Pham, Tyler; Nammi, Bharani; Wang, Shouyi; Liu, Jin
    Purpose: Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) and CRISPR-associated (Cas) proteins together form the CRISPR-Cas system. The CRISPR-Cas system typically forms the machinery of an adaptive defense mechanism in prokaryotes against foreign genetic elements such as phages and plasmids. The recent development of this mechanism into a gene editing technology holds promise for correcting gene-level defects in several genetic diseases. The key elements of the CRISPR-Cas system are the Cas proteins, which are nucleases able to edit a gene of interest. Different types of Cas proteins are involved in different CRISPR-Cas systems. Cas proteins, however, suffer from inherent limitations such as limited specificity and off-target effects, which limit their widespread application as gene editing tools. In the current study, a novel method was developed for classifying the Cas9 and Cas12 families. Existing classification tools have low overall accuracy and are usually built using only a few types of protein features. We also attempt to understand the protein features governing the Cas9 and Cas12 classes using a multitude of protein features. Method: We built Random Forest (RF) binary classifiers to classify Cas12 and Cas9 proteins, respectively, using the complete spectrum of protein features (13,495 features) encoding physicochemical, constitutional, and evolutionary information. Additionally, we built multiclass RF classifiers that differentiate between Cas9, Cas12, and non-Cas proteins. The performance of all models was evaluated using 5-fold cross-validation and six evaluation metrics: accuracy, precision, recall, F1-score, AUC score, and specificity. We also tested our models on the respective independent datasets, which were developed in-house from various public-domain databases. Results: The Cas12 and Cas9 models achieved high overall accuracies of 0.97 and 0.96 on their independent datasets, respectively, while the multiclass classifier achieved an F1 score of 1.0. We observed that amino acid composition, quasi-sequence-order, and composition-based protein features are particularly important for the Cas12 and Cas9 families of proteins. Conclusions: We successfully built classification models for the Cas12 and Cas9 protein families and identified the protein features that are unique to each family. These features enhance the understanding of the structure and functions of Cas9 and Cas12 proteins and provide valuable insights into plausible structural modifications of these proteins to achieve enhanced specificity and reduced off-target effects.
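    For readers unfamiliar with the six evaluation metrics named in this abstract, the following Python sketch shows how they are conventionally defined for a binary classifier. It is not the authors' code: the function names and the toy labels and scores are hypothetical, and the AUC is computed with the standard pairwise-ranking definition rather than with any particular library.

```python
def binary_metrics(y_true, y_pred):
    """Confusion-matrix metrics for labels in {0, 1}; 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = Cas protein, 0 = non-Cas protein.
labels = [1, 1, 1, 0, 0, 0]
predicted = [1, 1, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6]

print(binary_metrics(labels, predicted))
print("auc:", roc_auc(labels, scores))
```

    In a 5-fold cross-validation these metrics would be computed on each held-out fold and then averaged.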
  • Item
    Population-specific mtDNA indices of mitochondrial stress associated with Alzheimer’s disease in Mexican Americans and Non-Hispanic Whites
    (2023) Gorham, Isabelle; Reid, Danielle; Sun, Jie; Barber, Robert C.; Phillips, Nicole R.
    Purpose: Alzheimer’s disease (AD) is the most prevalent form of dementia and is one of America’s leading causes of death. Age is known to be the biggest risk factor for AD, and Mexican Americans are one of the fastest-aging populations in America. Mitochondrial stress and dysfunction are key players in the progression of AD and are also known to be affected by lifestyle and environmental exposures/stressors. Mitochondrial dysfunction can cause the extracellular release of mitochondrial DNA (mtDNA), which can be detected in the peripheral blood (i.e., plasma). The mtDNA copy number within the cell can also serve as an indicator of overall mitochondrial health, biogenesis, and/or mitophagy. This project aims to identify population-specific differences in mitochondrial stress and dysfunction detectable in the blood and to identify any relationship between AD risk factors and cognitive impairment. These data may help to further elucidate the role that mtDNA plays in population-specific Alzheimer’s disease pathogenesis. Methods: DNA was extracted from 200 µL of participant plasma and buffy coat using the Mag-Bind® Blood & Tissue DNA HDQ 96 kit (Omega Bio-tek) according to the manufacturer’s specifications. mtDNA and nuclear DNA (nuDNA) copy numbers were assessed through absolute quantitative PCR (qPCR) targeting the mitochondrial minor arc (MinArc) and the nuclear-encoded beta-2-microglobulin gene (B2M). Data were stratified by population and sample type, and linear regressions were performed to adjust for batch effects and to identify factors that may influence this phenotype of mitochondrial dysfunction. Results: Population-specific differences in factors contributing to the mtDNA phenotype were observed at the p < 0.05 level. In the Mexican American cohort, there was a significant relationship between cellular mtDNA:nuDNA ratio (quantified from buffy coat) and BMI, Clinical Dementia Rating Sum of Boxes score (CDRSum), and education. Further, there was a relationship between cell-free mtDNA copy number (quantified from participant plasma), education, and CDRSum. In the non-Hispanic white cohort, there was a significant relationship between cellular mtDNA:nuDNA ratio (from buffy coat) and both age and CDRSum. Age was associated with cell-free mtDNA in the non-Hispanic white cohort. Conclusions: The evidence indicates that there are population-based differences in which factors may be predictive of this blood-based phenotype of mitochondrial dysfunction. Mexican American populations seem to be more heavily influenced by environmental factors (BMI and education), whereas the non-Hispanic white population seems to be more heavily influenced by non-environmental factors (age). There also seems to be a relationship between these indicators of mitochondrial dysfunction and AD-related cognitive impairment (when measured through the CDR Sum of Boxes score).
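    Absolute qPCR converts a threshold cycle (Ct) into a copy number through a standard curve, and the cellular mtDNA:nuDNA ratio is then the quotient of the two estimates. The following Python sketch illustrates that calculation, assuming a log-linear standard curve of the form Ct = slope × log10(copies) + intercept. The function names, curve parameters, and Ct values are hypothetical and are not taken from the study.

```python
def copies_from_ct(ct: float, slope: float, intercept: float) -> float:
    """Invert a standard curve Ct = slope * log10(copies) + intercept."""
    return 10 ** ((ct - intercept) / slope)


def mt_nu_ratio(ct_mt, ct_nu, mt_curve, nu_curve):
    """Ratio of mtDNA copies (MinArc target) to nuDNA copies (B2M target).

    Each curve is a (slope, intercept) pair fitted from a dilution series.
    """
    mt_copies = copies_from_ct(ct_mt, *mt_curve)
    nu_copies = copies_from_ct(ct_nu, *nu_curve)
    return mt_copies / nu_copies


# ASSUMPTION: hypothetical standard curves and Ct values for one sample.
MINARC_CURVE = (-3.32, 38.0)  # (slope, intercept)
B2M_CURVE = (-3.32, 38.0)

ratio = mt_nu_ratio(ct_mt=21.4, ct_nu=28.04,
                    mt_curve=MINARC_CURVE, nu_curve=B2M_CURVE)
print(f"mtDNA:nuDNA ratio = {ratio:.1f}")
```

    A slope near -3.32 corresponds to roughly 100% amplification efficiency, which is why it is used for the illustrative curves here.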
  • Item
    Clinical significance of Annexin A2 overexpression in kidney renal clear cell carcinoma
    (2023) Joseph, Matthew; Chaudhary, Pankaj
    Purpose: Invasion and metastasis lead to poor prognosis and death in kidney renal clear cell carcinoma (KIRC) patients. In this study, we focus on the characterization of Annexin A2 (AnxA2), which plays an essential role in cell growth, angiogenesis, migration, invasion, and metastasis. Although the role of AnxA2 has been studied in many cancers, its function in KIRC is still unexplored. Therefore, we investigated AnxA2 expression in tumor tissues of KIRC patients to determine its association with disease characteristics. Methods: We utilized data from The Cancer Genome Atlas (TCGA) to observe AnxA2 gene expression in KIRC and its association with survival. Additionally, immunohistochemical (IHC) analysis was performed to examine AnxA2 expression in tumor tissues of KIRC patients. Results: In our analysis of TCGA data, AnxA2 mRNA expression was significantly higher in KIRC tumor tissues than in the adjacent noncancerous tissues. In addition, AnxA2 expression was significantly associated with higher tumor stage and grade. High expression of AnxA2 in KIRC patients was significantly correlated with decreased survival [hazard ratio (HR), 1.75; 95% confidence interval (CI), 1.29 - 2.36; p = 0.00023] as compared to low expression. In addition, our IHC staining suggests that AnxA2 was overexpressed in tumor tissues of KIRC patients compared with adjacent noncancerous tissues. Conclusion: AnxA2 is overexpressed in KIRC tumor tissues and has a direct relationship with the advanced clinicopathological variables and adverse prognosis of patients with KIRC.
  • Item
    Symptomatic Carrier Frequency of Familial Mediterranean Fever
    (2023) Nguyen, Linh
    Purpose: Familial Mediterranean Fever (FMF) is a genetic disorder characterized by recurrent episodes of fever and inflammation of the serous membranes, mainly affecting Mediterranean and Middle Eastern populations. Five founder mutations of the Mediterranean fever (MEFV) gene, M694V, M694I, V726A, M680I, and E148Q, account for the majority of FMF cases. The disease is considered an autosomal recessive disorder; however, carriers of one MEFV mutation have been reported to have FMF-like symptoms. The purpose of this review is to investigate the common symptoms of manifesting carriers and determine the symptomatic carrier frequencies for FMF. Methods: A comprehensive literature search was carried out utilizing three electronic databases (PubMed, ClinVar, and OMIM). The study included published case reports and cohort studies that evaluated FMF carriers. All the included studies underwent assessment, data extraction, and analysis. Results: Data and clinical presentations from 7 studies that met the inclusion criteria were identified. A range of symptomatic carrier frequencies was determined for different FMF-causing mutations, with M694V being the most common at 0.82%, followed by E148Q at 0.37%, V726A at 0.17%, and M680I at 0.16%. The most common symptoms found were abdominal pain, fever, chest pain, and arthritis, with arthritis being the most prevalent symptom among the carriers. None of the carriers developed amyloidosis, a serious complication associated with FMF. Conclusions: The results highlight the existence of a substantial group of FMF patients who possess only one MEFV mutation. These findings have important implications for medical practice and genetic counseling for FMF patients, especially those from classically affected populations. The results also suggest that detection of a single mutation in conjunction with clinical symptoms appears to be adequate for diagnosis and that colchicine treatment should be considered.