MaCHTools: Additional functionality for the imputation software MaCH

dc.contributor.advisor: Robert C. Barber
dc.contributor.committeeMember: Fan Zhang
dc.creator: Mitchel, Jeffrey S.
dc.date.accessioned: 2019-08-22T21:11:24Z
dc.date.available: 2019-08-22T21:11:24Z
dc.date.issued: 2016-12-01
dc.date.submitted: 2017-05-08T07:12:06-07:00
dc.description.abstract: Imputation of unknown genotypes is becoming a standard procedure in exploratory genetic association studies. Imputation is accomplished by comparing observed data from the study population to reference panels of individuals who are from a genetically similar population and genotyped at a dense set of polymorphic sites. Linkage disequilibrium within the reference panels is used to construct haplotypes and extrapolate allelic correlations in the test sample. Imputation has been shown to be accurate for the inference of genotypes at unobserved SNPs, as well as for quality control (QC) measures at genotyped locations. Imputing genotypes also allows cohorts that were genotyped on different platforms to be combined in a joint or meta-analysis. One of the most widely used imputation software packages is MaCH (http://csg.sph.umich.edu//abecasis/mach/). MaCH uses a powerful and accurate Markov chain-based algorithm; however, its usability is lacking. MaCHTools allows users to streamline their workflow with MaCH through input file specification, error checking, and QC measures. MaCHTools began as a series of Java scripts used to check input files and QC raw data as an initial step before imputing additional genotypes in MaCH. This set of scripts became invaluable to the genome-wide association study (GWAS) workflow, but the scripts were unpolished and ill-suited for public release to benefit the scientific community. This project aimed to bundle the scripts into a single executable program with a graphical user interface (GUI) that helps students and researchers streamline the GWAS workflow.
Additional functionality includes more efficient submission of jobs to compute clusters, compatibility with different Linux job handlers, the ability to switch easily between GWAS projects (including their genotype data and reference datasets), simpler specification of parameters and thresholds, and several other usability improvements. The GWAS workflow, which combines dataset preparation in MaCHTools with haplotype estimation and imputation in MaCH, was validated by replicating results from a published study of the genetic basis of Alzheimer’s endophenotypes in the Texas Alzheimer’s Research and Care Consortium. A similar analysis was then performed to determine the genetic basis of D, a latent variable that represents the dementing process.
dc.format.mimetype: application/pdf
dc.identifier.uri: https://hdl.handle.net/20.500.12503/29140
dc.language.iso: en
dc.provenance.legacyDownloads: 33
dc.subject: Medical Sciences
dc.subject: Medicine and Health Sciences
dc.subject: MaCHTools
dc.subject: imputation
dc.subject: Alzheimer's
dc.subject: bioinformatics
dc.title: MaCHTools: Additional functionality for the imputation software MaCH
dc.type: Dissertation
dc.type.material: text
thesis.degree.department: Graduate School of Biomedical Sciences
thesis.degree.discipline: Biomedical Sciences
thesis.degree.grantor: University of North Texas Health Science Center at Fort Worth
thesis.degree.name: Doctor of Philosophy

Files

Original bundle

Name: 2016_12_gsbs_Mitchel_Jeffrey_dissertation.pdf
Size: 5.79 MB
Format: Adobe Portable Document Format