SASD: The Synthetic Alternative Splicing Database for identifying novel isoform from proteomics

dc.contributor.author: Zhang, Fan
dc.contributor.author: Drabier, Renee
dc.date.accessioned: 2019-09-12T15:19:56Z
dc.date.available: 2019-09-12T15:19:56Z
dc.date.issued: 2013
dc.description.abstract: Background: Alternative splicing is an important and widespread mechanism for generating protein diversity and regulating protein expression. High-throughput identification and analysis of alternative splicing at the protein level has more advantages than at the mRNA level. The combination of an alternative splicing database and tandem mass spectrometry provides a powerful technique for identification, analysis and characterization of potential novel alternative splicing protein isoforms from proteomics. Therefore, based on the peptidomic database of human protein isoforms for proteomics experiments, our objective is to design a new alternative splicing database to 1) provide more coverage of genes, transcripts and alternative splicing, 2) focus exclusively on alternative splicing, and 3) perform context-specific alternative splicing analysis. Results: We used a three-step pipeline to create a synthetic alternative splicing database (SASD) to identify novel alternative splicing isoforms and interpret them in the context of pathway, disease, drug and organ specificity or custom gene set with maximum coverage and exclusive focus on alternative splicing. First, we extracted information on gene structures of all genes in the Ensembl Genes 71 database and incorporated the Integrated Pathway Analysis Database. Then, we compiled artificial splicing transcripts. Lastly, we translated the artificial transcripts into alternative splicing peptides. The SASD is a comprehensive database containing 56,630 genes (Ensembl gene IDs), 95,260 transcripts (Ensembl transcript IDs), and 11,919,779 alternative splicing peptides, and also covering about 1,956 pathways, 6,704 diseases, 5,615 drugs, and 52 organs. The database has a web-based user interface that allows users to search, display and download alternative splicing data related to a single gene/transcript/protein, custom gene set, pathway, disease, drug, or organ. Moreover, the quality of the database was validated by comparison with other known databases and two case studies: 1) in liver cancer and 2) in breast cancer. Conclusions: The SASD provides the scientific community with an efficient means to identify, analyze, and characterize novel Exon Skipping and Intron Retention protein isoforms from mass spectrometry and interpret them in the context of pathway, disease, drug and organ specificity or custom gene set with maximum coverage and exclusive focus on alternative splicing.
dc.identifier.citation: Zhang, F., & Drabier, R. (2013). SASD: the Synthetic Alternative Splicing Database for identifying novel isoform from proteomics. BMC Bioinformatics, 14(S14). doi: 10.1186/1471-2105-14-s14-s13
dc.identifier.uri: https://hdl.handle.net/20.500.12503/29672
dc.identifier.uri: https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-14-S14-S13
dc.subject.mesh: Alternative Splicing
dc.subject.mesh: Amino Acid Sequence
dc.subject.mesh: Base Sequence
dc.subject.mesh: Computational Biology, methods
dc.title: SASD: The Synthetic Alternative Splicing Database for identifying novel isoform from proteomics
dc.type: Article
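
The abstract names two of the pipeline's steps, compiling artificial splicing transcripts and translating them into peptides, without giving detail. The following is a minimal sketch of what those two steps might look like for the Exon Skipping and Intron Retention events mentioned in the Conclusions. It is not the authors' implementation. The toy sequences, the function names, the choice to skip exactly one internal exon or retain exactly one intron, and the frame-0 translation are all assumptions made for illustration.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Hypothetical toy gene: three exons and the two introns between them.
TOY_EXONS = ["ATGGCC", "AAAGGG", "TTTTAA"]
TOY_INTRONS = ["GTCCAG", "GTGGAG"]


def translate(seq):
    """Translate frame 0 of a DNA sequence, stopping at the first stop codon."""
    peptide = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        peptide.append(aa)
    return "".join(peptide)


def exon_skipping(exons):
    """Yield (label, transcript) pairs, each skipping one internal exon."""
    for i in range(1, len(exons) - 1):
        yield f"ES_skip_exon{i + 1}", "".join(exons[:i] + exons[i + 1:])


def intron_retention(exons, introns):
    """Yield (label, transcript) pairs, each retaining exactly one intron."""
    for i, intron in enumerate(introns):
        upstream = "".join(exons[:i + 1])
        downstream = "".join(exons[i + 1:])
        yield f"IR_keep_intron{i + 1}", upstream + intron + downstream


def synthetic_peptides(exons, introns):
    """Build artificial transcripts, then translate each into a peptide."""
    transcripts = list(exon_skipping(exons))
    transcripts += list(intron_retention(exons, introns))
    return {label: translate(seq) for label, seq in transcripts}


if __name__ == "__main__":
    for label, peptide in synthetic_peptides(TOY_EXONS, TOY_INTRONS).items():
        print(label, peptide)
```

In a proteomics search, peptides like these would be compared against tandem mass spectra; the sketch stops at peptide generation and leaves out gene-structure extraction from Ensembl and any pathway, disease, drug or organ annotation.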

Files

Original bundle

Name: SASD_ the Synthetic Alternative Splicing Database for identifying.pdf
Size: 795.19 KB
Format: Adobe Portable Document Format
Description: Main article

License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission
Description: