Browsing by Subject "Algorithms"


A Continuous Statistical Phasing Framework for the Analysis of Forensic Mitochondrial DNA Mixtures
(MDPI, 2021-01-20) Smart, Utpal; Cihlar, Jennifer Churchill; Mandape, Sammed N.; Muenzler, Melissa; King, Jonathan L.; Budowle, Bruce; Woerner, August E.
Despite the benefits of quantitative data generated by massively parallel sequencing, resolving mitotypes from mixtures occurring in certain ratios remains challenging. In this study, a bioinformatic mixture deconvolution method centered on population-based phasing was developed and validated. The method was first tested on 270 in silico two-person mixtures varying in mixture proportions. An assortment of external reference panels containing information on haplotypic variation (from similar and different haplogroups) was leveraged to assess the effect of panel composition on phasing accuracy. Building on these simulations, mitochondrial genomes from the Human Mitochondrial DataBase were sourced to populate the panels and key parameter values were identified by deconvolving an additional 7290 in silico two-person mixtures. Finally, employing an optimized reference panel and phasing parameters, the approach was validated with in vitro two-person mixtures with differing proportions. Deconvolution was most accurate when the haplotypes in the mixture were similar to haplotypes present in the reference panel and when the mixture ratios were neither highly imbalanced nor subequal (e.g., 4:1). Overall, errors in haplotype estimation were largely bounded by the accuracy of the mixture's genotype results. The proposed framework is the first available approach that automates the reconstruction of complete individual mitotypes from mixtures, even in ratios that have traditionally been considered problematic.
An epidemic model for non-first-order transmission kinetics
(PLOS, 2021-03-11) Mun, Eun-Young; Geng, Feng
Compartmental models in epidemiology characterize the spread of an infectious disease by formulating ordinary differential equations to quantify the rate of disease progression through subpopulations defined by the Susceptible-Infectious-Removed (SIR) scheme. The classic rate law central to the SIR compartmental models assumes that the rate of transmission is first order regarding the infectious agent. The current study demonstrates that this assumption does not always hold and provides a theoretical rationale for a more general rate law, inspired by mixed-order chemical reaction kinetics, leading to a modified mathematical model for non-first-order kinetics. Using observed data from 127 countries during the initial phase of the COVID-19 pandemic, we demonstrated that the modified epidemic model is more realistic than the classic, first-order-kinetics based model. We discuss two coefficients associated with the modified epidemic model: transmission rate constant k and transmission reaction order n. While k finds utility in evaluating the effectiveness of control measures due to its responsiveness to external factors, n is more closely related to the intrinsic properties of the epidemic agent, including reproductive ability. The rate law for the modified compartmental SIR model is generally applicable to mixed-kinetics disease transmission with heterogeneous transmission mechanisms. By analyzing early-stage epidemic data, this modified epidemic model may be instrumental in providing timely insight into a new epidemic and developing control measures at the beginning of an outbreak.
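The abstract above names the two coefficients, k and n, but does not state the rate law itself. As an illustration only, the sketch below assumes the simplest mixed-order form, in which the infection term is k·S·Iⁿ (with S and I expressed as population fractions) and removal is first order with a hypothetical rate `gamma`. The function name, the forward-Euler integrator, and all parameter values are assumptions made for this sketch and are not taken from the paper.

```python
def simulate_sir(k, n, gamma, s0, i0, days, dt=0.01):
    """Integrate an assumed mixed-order SIR model with forward Euler.

    Assumed rate law (not taken from the paper):
        dS/dt = -k * S * I**n
        dI/dt =  k * S * I**n - gamma * I
        dR/dt =  gamma * I
    S, I, R are population fractions; n = 1 recovers the classic SIR model.
    Returns one (S, I, R) tuple per day, including day 0.
    """
    s, i, r = s0, i0, 1.0 - s0 - i0
    steps_per_day = round(1 / dt)
    trajectory = [(s, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = k * s * (i ** n) * dt
            recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - recoveries
            r += recoveries
        trajectory.append((s, i, r))
    return trajectory


# Compare the classic first-order case with a hypothetical order n = 0.8.
classic = simulate_sir(k=0.3, n=1.0, gamma=0.1, s0=0.999, i0=0.001, days=50)
mixed = simulate_sir(k=0.3, n=0.8, gamma=0.1, s0=0.999, i0=0.001, days=50)
```

Because I is a fraction below 1, an order n < 1 makes Iⁿ larger than I, so under this assumed form the early outbreak grows faster than in the first-order model with the same k.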
Identification of novel alternative splicing biomarkers for breast cancer with LC/MS/MS and RNA-Seq
(BioMed Central Ltd., 2020-12-03) Zhang, Fan; Deng, Chris K.; Wang, Mu; Deng, Bin; Barber, Robert C.; Huang, Gang
Background: Alternative splicing isoforms have been reported as a new and robust class of diagnostic biomarkers. Over 95% of human genes are estimated to be alternatively spliced as a powerful means of producing functionally diverse proteins from a single gene. The emergence of next-generation sequencing technologies, especially RNA-Seq, provides novel insights into large-scale detection and analysis of alternative splicing at the transcriptional level. Advances in proteomic technologies, such as liquid chromatography coupled with tandem mass spectrometry (LC-MS/MS), have shown tremendous power for the parallel characterization of large numbers of proteins in biological samples. Although poor correspondence has generally been found in previous qualitative comparative analyses between proteomics and microarray data, significantly higher degrees of correlation have been observed at the exon level. Combining protein and RNA data by searching LC-MS/MS data against a customized protein database from RNA-Seq may produce a subset of alternatively spliced protein isoform candidates that have higher confidence. Results: We developed a bioinformatics workflow to discover alternative splicing biomarkers from LC-MS/MS using RNA-Seq. First, we retrieved high-confidence, novel alternative splicing biomarkers from the breast cancer RNA-Seq database. Then, we translated these sequences into in silico Isoform Junction Peptides and created a customized alternative splicing database for MS searching. Lastly, we ran the Open Mass spectrometry Search Algorithm against the customized alternative splicing database with breast cancer plasma proteome. Twenty-six alternative splicing biomarker peptides with one single intron event and one exon skipping event were identified. Further interpretation of biological pathways with our Integrated Pathway Analysis Database showed that these 26 peptides are associated with Cancer, Signaling, Metabolism, Regulation, Immune System and Hemostasis pathways, which are consistent with the 256 alternative splicing biomarkers from the RNA-Seq data. Conclusions: This paper presents a bioinformatics workflow for using RNA-Seq data to discover novel alternative splicing biomarkers from the breast cancer proteome. As a complement to the synthetic alternative splicing database technique for alternative splicing identification, this method combines the advantages of two platforms, mass spectrometry and next-generation sequencing, and can help identify potentially highly sample-specific alternative splicing isoform biomarkers at an early stage of cancer.
Precision DNA Mixture Interpretation with Single-Cell Profiling
(MDPI, 2021-10-20) Ge, Jianye; King, Jonathan L.; Smuts, Amy; Budowle, Bruce
Wet-lab based studies have exploited emerging single-cell technologies to address the challenges of interpreting forensic mixture evidence. However, little effort has been dedicated to developing a systematic approach to interpreting the single-cell profiles derived from the mixtures. This study is the first attempt to develop a comprehensive interpretation workflow in which single-cell profiles from mixtures are interpreted individually and holistically. In this approach, the genotypes from each cell are assessed, the number of contributors (NOC) of the single-cell profiles is estimated, a consensus profile of each contributor is developed, and finally the consensus profile(s) can be used for a DNA database search or compared with known profiles to determine their potential sources. The potential of this single-cell interpretation workflow was assessed by simulation with various mixture scenarios and empirical allele drop-out and drop-in rates, evaluating the accuracy of estimating the NOC, the accuracy of recovering the true alleles by consensus, and the capability of deconvolving mixtures with related contributors. The results support that single-cell based mixture interpretation can provide a precision that cannot be achieved with current standard CE-STR analyses. A new paradigm for mixture interpretation is available to enhance the interpretation of forensic genetic casework.
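The consensus step in the workflow above can be illustrated with a small sketch. The abstract does not say how the consensus is computed, so the following assumes a simple frequency rule: for cells already assigned to one contributor, an allele is kept if it appears in at least a chosen fraction of the cells typed at that locus. The function name `consensus_profile`, the `min_fraction` threshold, and the example genotypes are all hypothetical.

```python
from collections import Counter


def consensus_profile(cell_profiles, min_fraction=0.5):
    """Build a consensus genotype from single-cell profiles of one contributor.

    cell_profiles: list of dicts mapping locus name -> set of observed alleles.
    An allele is kept when it is seen in at least `min_fraction` of the cells
    that produced any result at that locus (an assumed rule, not the paper's).
    """
    loci = set().union(*(profile.keys() for profile in cell_profiles))
    consensus = {}
    for locus in sorted(loci):
        # Cells with no alleles at this locus are treated as locus drop-out.
        typed = [profile[locus] for profile in cell_profiles if profile.get(locus)]
        counts = Counter(allele for alleles in typed for allele in alleles)
        consensus[locus] = {
            allele for allele, count in counts.items()
            if typed and count / len(typed) >= min_fraction
        }
    return consensus


# Three hypothetical cells from one contributor: the second shows allele
# drop-out (16 missing) and the third shows allele drop-in (17 added).
cells = [
    {"D3S1358": {15, 16}},
    {"D3S1358": {15}},
    {"D3S1358": {15, 16, 17}},
]
profile = consensus_profile(cells)
```

With the default threshold, the drop-in allele seen in only one of three cells is excluded, while the allele that dropped out of one cell is still recovered from the other two.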