Genome Assembly and Annotation
Date: 10 - 12 February 2025
Genome assembly is the process of piecing together fragments of DNA to reconstruct the original genome. The genome provides crucial information for understanding genetic structure, function and variation. In recent years, long-read sequencing technologies have revolutionized genome assembly. Long reads can span repetitive sequences and structural variants, which simplifies assembly, reduces gaps and fragmentation, helps resolve repeats and detect structural variation, and improves haplotype phasing. During this course we will look at data generated using PacBio and Oxford Nanopore, and discuss the pros and cons of both sequencing technologies and the effect they might have on genome assembly. We will also look at the different tools available to generate assemblies, focussing on de novo genome assembly. Polishing with short or long reads and the addition of Hi-C sequencing can increase the completeness of the genomes. At the different steps of the assembly process we will look at the contiguity, completeness and correctness of the generated genomes, thereby evaluating the status of the genome. Once a genome has been assembled, the next step is annotation. Genome annotation involves identifying and mapping the locations of genes and other functional elements within the sequenced genome. We will look at the differences between prokaryote and eukaryote genomes and the tools available for annotation, and discuss steps to improve an annotation once the automatic annotation has been made.
Keywords: Hi-C, Long-read, Nanopore, PacBio, Short-read
Prerequisites:
- A basic understanding of molecular biology
- A working knowledge of the Linux Bash command line; our 1-day 'Linux for bioinformatics' course provides a suitable background
Learning objectives:
- Assemble genomes integrating Hi-C data
- Be able to assemble genomes
- Be able to assess the generated genomes
- Know how to annotate a genome
- Know the difference between Nanopore and PacBio data
Organizer: Edinburgh Genomics
Target audience: Academics, post-graduate students, and anyone looking to learn this essential bioinformatics skill.
Event types:
- Workshops and courses