e-learning

Proteogenomics 1: Database Creation

Abstract

Proteogenomics involves searching mass spectrometry (MS)-based proteomics data against genomics and transcriptomics data to identify peptides and to understand protein-level evidence of gene expression. In the first section of the tutorial, we will create a protein database (FASTA) from RNA-sequencing files (FASTQ) and then search the MS data against the resulting FASTA file to identify peptides corresponding to novel proteoforms. Then, we will assign genomic coordinates and annotations to the identified peptides and visualize the data for spectral quality and genomic localization.

About This Material

This is a hands-on tutorial from the GTN, usable either for individual self-study or as teaching material in a classroom.

Questions this will address

  • How do you create a customized protein database from RNA-seq data?

Learning Objectives

  • Generating a customized protein sequence database with variants, indels, splice junctions, and known sequences.

Licence: Creative Commons Attribution 4.0 International

Keywords: Proteomics, proteogenomics

Target audience: Students

Resource type: e-learning

Version: 27

Status: Active

Prerequisites:

Introduction to Galaxy Analyses

Date modified: 2024-08-09

Date published: 2018-11-20

Authors: James Johnson, Pratik Jagtap, Praveen Kumar, Ray Sajulga, Subina Mehta, Timothy J. Griffin

Contributors: James Johnson, Pratik Jagtap, Praveen Kumar, Ray Sajulga, Subina Mehta, Timothy J. Griffin

Scientific topics: Proteomics

