Browsing by Subject "supervised learning"
Item: Genetic profiling of skin microbiomes for forensic human identification (2017-12-01)
Schmedes, Sarah E.; Budowle, Bruce; Ge, Jianye; He, Johnny J.
The field of microbial forensics has expanded from a focus on biodefense and biocrime attribution to include various metagenomics and microbiome applications made possible by advancements in sequencing and bioinformatics technologies. Recent developments in metagenomics and microbiome research with application to the forensic sciences include post-mortem interval, body fluid identification, recent geolocation, and human identification. The primary goal of the dissertation described herein was to assess the feasibility of human identification from skin microbiomes using both shotgun metagenomic sequencing and targeted enrichment strategies. The main studies of this dissertation were conducted under the hypothesis that genes from stable, universal microbial species from the core skin microbiome can differentiate the skin microbiomes of individuals and be applied towards forensic human identification. The initial study describes the development of a tool, AutoCurE, used to identify errors in bacterial genome metadata from public databases and to curate the data for subsequent use in comparative genomic studies. This study highlights the types of inconsistencies and errors that may be present in public genome databases and describes the development of a curated local bacterial database for use in subsequent studies. This doctoral research presents the development of a novel approach for human identification using stable, universal clade-specific markers from skin microbiomes. Initially, publicly available shotgun metagenomic datasets generated from skin microbiome samples collected from 17 body sites from 12 individuals, sampled at three time points over a period of ~3 years, were mined to identify stable, universal microbial markers.
Supervised learning, specifically regularized multinomial logistic regression and 1-nearest-neighbor classification, was performed using the nucleotide diversities of clade-specific markers to predict the correct classification of skin microbiomes to their respective host individuals. Reduced subsets of markers were developed into a novel targeted metagenomics sequencing panel, the hidSkinPlex, to generate individual-specific skin microbiome profiles for human identification. Finally, the hidSkinPlex was evaluated on skin microbiome samples collected from eight individuals and three body sites, in triplicate, as a proof of concept that individuals can be differentiated with high accuracy. The hidSkinPlex, composed of 282 bacterial and 4 phage markers from 22 family-, genus-, species-, and subspecies-level clades, was used to correctly attribute skin microbiomes to their respective donors with up to 92%, 96%, and 100% accuracy using samples from the foot, manubrium, and hand, respectively. Additionally, skin microbiomes were classified with up to 97% accuracy when the body site was unknown, and body site origin could be predicted with up to 86% accuracy. The hidSkinPlex is the first targeted metagenomics sequencing panel and method designed specifically for skin microbiomes with the intent of forensic human identification applications.

Item: Improving Human Identification Using the Human Skin Microbiome (2021-12)
Sherier, Allison J.; Budowle, Bruce; Leudtke, Robert; Phillips, Nicole R.
There are times when the human DNA in biological evidence is of too low quality or quantity to provide enough information for human identification (HID). However, nucleic acids from the human skin microbiome are sources of genetic material that may be useful for HID. The studies in this dissertation test the hypothesis that specific single nucleotide polymorphisms (SNPs) of selected human skin microorganisms can be used to attribute an unknown microbiome sample to an individual.
The first study investigated how Wright's fixation index (FST) can be used to select potentially informative SNPs for HID. SNPs with high estimated FST were ascertained in three different ways to examine three distinct hypotheses. The hypotheses tested whether a high FST, increased taxonomic abundance, and/or a predetermined panel would be the most effective for HID. Classification accuracies ranged from 88% to 95%, and the method using the most taxa possible performed the best. The results support the viability of using genetic distance to select informative markers from the human skin microbiome for HID. The predetermined panel achieved only 88% accuracy, although it would be the most applicable of the tested methods for forensic casework. The second study focused on using FST estimates to select SNPs abundant in 51 individuals sampled at three body sites in triplicate for HID. The most common SNPs (present in ≥ 75% of the samples) that had FST estimates ≥ 0.1 were used with the least absolute shrinkage and selection operator (LASSO) to select a list of informative SNPs for HID. The final list (i.e., hidSkinPlex+) contains 365 SNPs and achieved a 95% classification accuracy on 459 samples. The hidSkinPlex+ lays the foundation for a targeted sequencing panel that can be used to further study the stability and specificity of human skin microorganism SNPs for HID applications.
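
The prefiltering step described for the second study (keeping SNPs present in ≥ 75% of samples with FST estimates ≥ 0.1) could be sketched roughly as follows. This is an illustrative sketch only, not the dissertation's code: the function names `fst` and `prefilter`, the input layout, the toy data, and the simple heterozygosity-based FST estimator with equal group weights are all assumptions, since the abstract does not state which estimator was used. The subsequent LASSO selection is not shown.

```python
from statistics import mean


def fst(group_freqs):
    """Heterozygosity-based F_ST for one biallelic SNP (assumed estimator).

    group_freqs: alternate-allele frequency in each group (here, each
    individual's samples), with all groups weighted equally.
    """
    p_bar = mean(group_freqs)
    h_total = 2 * p_bar * (1 - p_bar)  # expected heterozygosity, pooled
    if h_total == 0:
        return 0.0  # SNP is monomorphic overall, so it carries no signal
    h_within = mean(2 * p * (1 - p) for p in group_freqs)
    return (h_total - h_within) / h_total


def prefilter(snps, n_samples, min_prevalence=0.75, min_fst=0.1):
    """Keep SNPs that are both common and differentiated between groups.

    snps: dict mapping a SNP name to a dict with
      "present_in":  number of samples in which the SNP was observed
      "group_freqs": per-group alternate-allele frequencies
    (a hypothetical layout chosen for this example).
    """
    kept = []
    for name, info in snps.items():
        prevalence = info["present_in"] / n_samples
        if prevalence >= min_prevalence and fst(info["group_freqs"]) >= min_fst:
            kept.append(name)
    return kept


# Toy data (made up for illustration): 10 samples in total.
toy_snps = {
    "snp_a": {"present_in": 9, "group_freqs": [0.0, 1.0, 0.5]},  # common, differentiated
    "snp_b": {"present_in": 9, "group_freqs": [0.5, 0.5, 0.5]},  # common, no differentiation
    "snp_c": {"present_in": 5, "group_freqs": [0.0, 1.0]},       # differentiated but rare
}
candidates = prefilter(toy_snps, n_samples=10)
```

In this toy case only `snp_a` passes both thresholds. The surviving candidates would then be passed to a sparse classifier (LASSO in the abstract) to arrive at a final marker list.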