Proteogenomics 1: Database Creation
Abstract
Proteogenomics matches mass spectrometry (MS)-based proteomics data against genomics and transcriptomics data to identify peptides and to understand protein-level evidence of gene expression. In the first section of the tutorial, we will create a protein database (FASTA) from RNA-sequencing files (FASTQ) and then search the MS data against the resulting FASTA file to identify peptides corresponding to novel proteoforms. Then, we will assign genomic coordinates and annotations to these identified peptides and visualize the data to assess spectral quality and genomic localization.
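To give a sense of what a customized protein database entry contains, here is a minimal Python sketch, not the tutorial's actual method (the tutorial builds the database with Galaxy tools). It assumes a toy coding sequence and a single hypothetical nucleotide variant, translates both with the standard genetic code, and formats the results as FASTA records. The function names, sequence, and headers are illustrative assumptions.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered over T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate a coding sequence, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3].upper()]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def apply_snv(cds, pos, ref, alt):
    """Apply a single-nucleotide variant at a 0-based position (hypothetical helper)."""
    if cds[pos] != ref:
        raise ValueError(f"expected {ref} at position {pos}, found {cds[pos]}")
    return cds[:pos] + alt + cds[pos + 1:]


def fasta_entry(header, sequence, width=60):
    """Format one FASTA record with wrapped sequence lines."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return ">" + header + "\n" + "\n".join(lines) + "\n"


# Toy example: a made-up reference CDS and a made-up C>T variant at position 4.
reference_cds = "ATGGCCAAGTGA"
variant_cds = apply_snv(reference_cds, 4, "C", "T")

database = (
    fasta_entry("toy_reference", translate(reference_cds))
    + fasta_entry("toy_variant_c5C>T", translate(variant_cds))
)
print(database)
```

In a real workflow the variants, indels, and splice junctions would come from aligned RNA-seq reads instead of being hard-coded, and the known reference proteins would be merged into the same FASTA file before the database search.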
About This Material
This is a hands-on tutorial from the GTN, usable either for individual self-study or as teaching material in a classroom.
Questions this will address
- How can a customized protein database be created from RNA-seq data?
Learning Objectives
- Generate a customized protein sequence database with variants, indels, splice junctions, and known sequences.
Licence: Creative Commons Attribution 4.0 International
Keywords: Proteomics, proteogenomics
Target audience: Students
Resource type: e-learning
Version: 27
Status: Active
Prerequisites:
Introduction to Galaxy Analyses
Date modified: 2024-08-09
Date published: 2018-11-20
Contributors: James Johnson, Pratik Jagtap, Praveen Kumar, Ray Sajulga, Subina Mehta, Timothy J. Griffin
Scientific topics: Proteomics