Slides for Video4

Short Read Workshop
Day 5: Mapping and Visualization
Video 4
Introducing IGV
Alternate Options for Viewing Data
• IGV (Integrative Genomics Viewer)
• UCSC Genome Browser
https://www.broadinstitute.org/igv/home
IGV is a Java-based application designed to visualize many forms of genomic data
along a given reference genome
$ java -Xmx5g -jar IGV.jar
Getting Started on IGV
Choosing a genome, import custom genomes
Overview of layout, tracks, moving around, memory bar, etc.
What types of data can you view on IGV?
Quantitative Data (read pile up/coverage)
• .bam
• .bedgraph
• .wig
Qualitative Data (annotations)
• .bed
• .gff3
• .vcf
[Figure: IGV image showing both types, quantitative data (coverage) and qualitative data (gene annotations)]
Go to http://genome.ucsc.edu/FAQ/FAQformat.html for more information on file format and options
Quantitative Data
Sorted BAM
[Figure: IGV view of a sorted BAM file, with tracks labeled Coverage Track, Read Alignment (colored by strand), and Genes]
You must have an index of the sorted BAM file for visualization (see Video 1)
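Before loading a BAM file, it can help to confirm that an index file sits next to it. The sketch below is only an illustration, not part of IGV: the function name `find_bam_index` is hypothetical, and it assumes the index uses one of the two common naming conventions (`sample.bam.bai` or `sample.bai`) in the same directory as the BAM.

```python
from pathlib import Path
from typing import Optional


def find_bam_index(bam_path: str) -> Optional[Path]:
    """Return the path of an index file next to the BAM, or None if absent.

    Assumption: the index is named either <name>.bam.bai or <name>.bai
    and lives in the same directory as the BAM file.
    """
    bam = Path(bam_path)
    candidates = [
        bam.with_name(bam.name + ".bai"),  # sample.bam -> sample.bam.bai
        bam.with_suffix(".bai"),           # sample.bam -> sample.bai
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    # Hypothetical file name, used only for demonstration.
    index = find_bam_index("sample.sorted.bam")
    if index is None:
        print("No index found; index the sorted BAM first (see Video 1)")
    else:
        print(f"Found index: {index}")
```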
Quantitative Data
Bedgraph and Wig file visualization
[Figure: IGV tracks labeled Bedgraph (plus and minus strand), Wig file (plus strand), Wig file (minus strand), and Genes]
Quantitative Data
– .bam
– .bedgraph
– .wig
[Figure: IGV image showing all 3]
.bam is the binary version of SAM (Video 1 explains how to convert SAM to BAM using SAMTools)
.bedgraph is generated from a .bam file using BEDTools; it allows 2 data points per genomic position, negative or positive, integer or real numbers
.wig is similar to bedgraph, but holds a single data point per genomic position; it is recommended that wig files be converted to binary bigWig files for large data sets
Bedgraph file example:
chr1
chr1
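To make the bedgraph layout concrete, here is a minimal sketch that collapses per-base depth values into bedgraph-style records (chromosome, 0-based start, exclusive end, value). The function name `depth_to_bedgraph` and the depth values are made up for illustration; this is not the slide's example and not how BEDTools is implemented.

```python
from itertools import groupby
from typing import Iterable, List, Tuple

BedgraphRecord = Tuple[str, int, int, float]


def depth_to_bedgraph(chrom: str, depths: Iterable[float]) -> List[BedgraphRecord]:
    """Collapse per-base depths (index 0 = first base) into bedgraph intervals.

    Consecutive positions with the same value become one record.
    Runs with a value of zero are left out of the output.
    """
    records: List[BedgraphRecord] = []
    position = 0
    for value, run in groupby(depths):
        length = sum(1 for _ in run)
        if value != 0:
            records.append((chrom, position, position + length, value))
        position += length
    return records


if __name__ == "__main__":
    # Hypothetical depths; negative values stand in for minus-strand signal.
    for chrom, start, end, value in depth_to_bedgraph(
        "chr1", [0, 0, 3, 3, 3, 5, 0, -2, -2]
    ):
        print(f"{chrom}\t{start}\t{end}\t{value}")
```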
Quantitative Data
Read pile up (AKA coverage), i.e. how many reads are mapping to a specific location
Will look different depending on the type of sequencing you have performed
• RNA-seq will show pile up along exons and other transcribed regions
• Genome sequencing should show uniform coverage along the genome
We will discuss 3 different file types for storing and viewing quantitative data
• BAM
• Bedgraph
• Wig
Each type of file has a unique format, and its visualization on IGV varies slightly from the others
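As a rough illustration of what "read pile up" means, the sketch below counts how many reads cover each position of a small region. The function name `pileup` and the read coordinates are hypothetical; reads are assumed to be given as 0-based, half-open (start, end) pairs, and real aligners and viewers handle many details (gaps, clipping, strand) that are ignored here.

```python
from typing import List, Tuple


def pileup(reads: List[Tuple[int, int]], region_length: int) -> List[int]:
    """Return per-base read depth for a region of the given length.

    Each read is a 0-based, half-open (start, end) interval.
    Reads are clipped to the region boundaries.
    """
    # Difference array: +1 where a read starts, -1 just past where it ends.
    delta = [0] * (region_length + 1)
    for start, end in reads:
        start = max(start, 0)
        end = min(end, region_length)
        if start < end:
            delta[start] += 1
            delta[end] -= 1

    # A running sum of the differences gives the depth at each position.
    depth: List[int] = []
    running = 0
    for change in delta[:region_length]:
        running += change
        depth.append(running)
    return depth


if __name__ == "__main__":
    # Three made-up reads over an 8-base region.
    print(pileup([(0, 4), (2, 6), (3, 5)], 8))
```

With these three made-up reads, depth peaks where all of them overlap and drops to zero past the last read, which is the shape a coverage track displays.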
Qualitative Data
– .bed
– .gff
IGV image showing both
Virtually no difference in visualization, BUT gff3 allows mouse-over display of very useful data; this data display is limited in a bed file
The bed file is a simplified format; segments can be color coded using BED6 or BED8 format
BED file example:
chr1	2363	5279	Gene1	0	+
chr1	6739	8923	Gene2	0	-
Qualitative Data
[Figure: IGV tracks labeled BED and GFF]
GFF is a 9-column tab-delimited file
GFF file example:
chr1	SGD	gene	2363	5279	.	+	.	ID=Gene1
chr1	SGD	snRNA	6739	8923	.	-	.	ID=snRNA1
BED is tab-delimited, with a variable number of columns (3 to 12)
BED6 file example:
chr1	2362	5280	Gene1	0	+
chr1	6738	8924	Gene2	0	-
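The two formats also count positions differently: BED starts are 0-based with an exclusive end, while GFF coordinates are 1-based and inclusive. The sketch below shows one way to parse a BED6 line and rewrite it as a 9-column GFF3 line. The names `Bed6`, `parse_bed6` and `bed6_to_gff3`, the `source` and `feature_type` defaults, and the example record are all hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class Bed6(NamedTuple):
    chrom: str
    start: int   # 0-based, inclusive
    end: int     # 0-based, exclusive
    name: str
    score: str
    strand: str


def parse_bed6(line: str) -> Bed6:
    """Parse the first six tab-separated columns of a BED line."""
    chrom, start, end, name, score, strand = line.rstrip("\n").split("\t")[:6]
    return Bed6(chrom, int(start), int(end), name, score, strand)


def bed6_to_gff3(record: Bed6, source: str = "example",
                 feature_type: str = "gene") -> str:
    """Format a BED6 record as one 9-column GFF3 line.

    Only the start shifts by one: a 0-based exclusive end and a
    1-based inclusive end are the same number.
    """
    columns = [
        record.chrom,
        source,                   # column 2: source (placeholder value)
        feature_type,             # column 3: feature type (placeholder value)
        str(record.start + 1),    # column 4: 1-based start
        str(record.end),          # column 5: inclusive end
        ".",                      # column 6: score left undefined
        record.strand,            # column 7: strand
        ".",                      # column 8: phase, not applicable here
        f"ID={record.name}",      # column 9: attributes
    ]
    return "\t".join(columns)


if __name__ == "__main__":
    # Made-up record, not taken from the slide.
    print(bed6_to_gff3(parse_bed6("chr1\t99\t200\tGeneA\t0\t+")))
```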
BEDTools Suite
A set of useful scripts designed to conduct various tasks using BED files
bedtools.googlecode.com/files/BEDTools-User-Manual.pdf
http://bioinformatics.oxfordjournals.org/content/suppl/2010/01/27/btq033.DC1/bioinf-2009-1812-File003.pdf
BEDTools Suite
• intersectBed – find what overlaps between 2 files
• mergeBed – merge together any overlapping annotations
• genomeCoverageBed – convert bam to bedgraph
• coverageBed – get read counts for annotations
• closestBed – compare 2 files and find what is closest to an annotation
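To give a feel for what interval operations such as merging and intersecting do, here is a small pure-Python sketch of the underlying idea. It is a conceptual illustration only, not the BEDTools implementation: the names `merge_intervals` and `overlaps` are hypothetical, intervals are assumed to be (chrom, start, end) tuples in BED-style 0-based half-open coordinates, and intervals that touch end-to-start are merged as an assumption of this sketch.

```python
from typing import List, Tuple

Interval = Tuple[str, int, int]  # (chrom, start, end), 0-based half-open


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two intervals on the same chromosome share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Merge intervals that overlap or touch, chromosome by chromosome."""
    merged: List[Interval] = []
    # Sorting by (chrom, start, end) puts mergeable intervals next to each other.
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


if __name__ == "__main__":
    # Made-up annotations for demonstration.
    annotations = [
        ("chr1", 100, 200),
        ("chr1", 150, 300),
        ("chr1", 400, 500),
        ("chr2", 100, 200),
    ]
    print(merge_intervals(annotations))
    print(overlaps(("chr1", 100, 200), ("chr1", 150, 300)))
```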
Variant Call Format – VCF File
Displays genomic variant data, e.g. SNPs, indels
You must index a VCF file before loading it into IGV, using either of these options:
1.) $ igvtools index file.vcf
2.) IGV → File → Run IGVTools → Select Index Command
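To show how SNPs and indels differ in a VCF record, the sketch below labels each alternate allele by comparing the lengths of the REF and ALT columns (columns 4 and 5 of a tab-delimited VCF data line). The function name `classify_alleles` and the example lines are hypothetical, and the rule is a simplification: symbolic alleles such as `<DEL>` and header lines starting with `#` are not handled.

```python
from typing import List


def classify_alleles(vcf_line: str) -> List[str]:
    """Label each ALT allele of one VCF data line as SNP, indel, or other.

    Simplified rule:
      - REF and ALT both one base long      -> "SNP"
      - REF and ALT of different lengths    -> "indel"
      - anything else (e.g. multi-base substitution) -> "other"
    """
    fields = vcf_line.rstrip("\n").split("\t")
    ref = fields[3]                 # column 4: reference allele
    alts = fields[4].split(",")     # column 5: comma-separated alternate alleles
    kinds: List[str] = []
    for alt in alts:
        if len(ref) == 1 and len(alt) == 1:
            kinds.append("SNP")
        elif len(ref) != len(alt):
            kinds.append("indel")
        else:
            kinds.append("other")
    return kinds


if __name__ == "__main__":
    # Made-up records for demonstration.
    print(classify_alleles("chr1\t2400\t.\tA\tG\t50\tPASS\t."))
    print(classify_alleles("chr1\t2500\t.\tAT\tA\t50\tPASS\t."))
```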