UCSC Genome Browser introduction - Bioinformatics Unit


The UCSC Genome Browser
Introduction
Osvaldo Graña
CNIO Bioinformatics Unit
Materials prepared by
Mary Mangan, Ph.D.
www.openhelix.com
Version 21a_1012
UCSC Genome Browser Agenda
Introduction
Basic Searches
Understanding Displays
Get Details or Sequences
Sequence Searches (BLAT)
Exercises
UCSC Genome Browser: http://genome.ucsc.edu
Organization of Genomic Data
Reference genome sequence: base position number
Annotation Tracks:
chromosome band
gap locations
known genes
predicted genes
phenotype and disease
enhancer/promoter data
microarray/expression data
evolutionary conservation
SNPs and structural variation
repeated regions
more…
Links out to more data
A Sample of the UCSC Genome Browser
Figure labels: reference sequence, gene details, Annotation Tracks, comparisons, SNPs
The UCSC Homepage: http://genome.ucsc.edu
Figure labels: navigate (two navigation areas)
General information
Specific information: new features, current status, etc.
Gateway: Start Page for a Basic Search
Figure label: text/ID searches
Use this Gateway to search:
Gene names, symbols, IDs
Chromosome number: chr7, or region: chr11:1038475-1075482
Keywords: kinase, receptor
See lower part of page for help with format
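The chromosome and region formats listed above can be checked before pasting them into the Gateway. The following is a minimal sketch, not the browser's own parser: the regular expression, the `Region` type and the `parse_position` name are assumptions made for illustration, and the coordinates are treated as the 1-based, inclusive numbers shown in the position box.

```python
import re
from typing import NamedTuple, Optional


class Region(NamedTuple):
    chrom: str
    start: Optional[int]  # None when only a chromosome was given
    end: Optional[int]


# Assumed pattern: "chr7" or "chr11:1038475-1075482" (commas allowed in numbers).
_POSITION = re.compile(r"^(chr[0-9A-Za-z_]+)(?::([\d,]+)-([\d,]+))?$")


def parse_position(text: str) -> Region:
    """Split a Gateway-style position string into chromosome, start and end."""
    match = _POSITION.match(text.strip())
    if match is None:
        # Gene names and keywords (tp53, kinase) are text searches, not positions.
        raise ValueError(f"not a chromosome or region: {text!r}")
    chrom, start, end = match.groups()
    if start is None:
        return Region(chrom, None, None)
    start_i = int(start.replace(",", ""))
    end_i = int(end.replace(",", ""))
    if start_i > end_i:
        raise ValueError("start must not exceed end")
    return Region(chrom, start_i, end_i)


print(parse_position("chr11:1038475-1075482"))
```

Anything that raises `ValueError` here (a gene symbol, an ID, a keyword) would simply be entered as a text search instead.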
UCSC Genome Browser Gateway
Make your Gateway choices:
1. Select clade + genome = species: search 1 species at a time
2. Assembly: the official reference DNA sequence
3. Position: location in the genome to examine, or text search
4. Track search to find data types of interest (annotation tracks)
5. Configure: make fonts bigger + other display choices
Sample Search for Human TP53
Figure labels: select, uc002gij.2
Sample search: human, February 2009 assembly, tp53
Select from the results list; if the match is unique, the search goes straight to the viewer page
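The same sample search can also be written as a link. This is a sketch under stated assumptions: it relies on the `hgTracks` page accepting `db` and `position` parameters in the URL, and on `hg19` being the UCSC database name for the February 2009 human assembly; the `browser_url` helper is a made-up name for illustration.

```python
from urllib.parse import urlencode

# Assumed entry point of the genome viewer.
HGTRACKS = "https://genome.ucsc.edu/cgi-bin/hgTracks"


def browser_url(db: str, position: str) -> str:
    """Build a viewer link for one assembly and one position or search term."""
    return f"{HGTRACKS}?{urlencode({'db': db, 'position': position})}"


# Human, February 2009 assembly (assumed database name: hg19), search term tp53.
print(browser_url("hg19", "tp53"))
```

A region such as chr11:1038475-1075482 can be passed as `position` in the same way; `urlencode` takes care of escaping the colon.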
Overview of the Whole Genome Browser Page
Genome viewer (2009 Human Assembly)
Groups of data (Tracks):
Mapping and Sequencing Tracks
Phenotype and Disease Tracks
Genes and Gene Prediction Tracks (including sno/miRNA data)
mRNA and EST Tracks
Expression (such as microarray)
Regulation (including TFBS)
Comparative Genomics
•As a group
•Individual species
Variation and Repeats (including SNPs, copy number variation)
Track data may be updated
Default settings; tracks can now be dragged in viewer
Different Assemblies, Species, Tracks
Different assemblies and species may have different data tracks
The layout, software, and functions are the same
Sample Genome Viewer Image, TP53 Region
scale
base position
UCSC genes
RefSeq
mRNAs & ESTs
ENCODE
many species compared
single species compared
SNPs
repeats
Visual Cues on the Genome Browser
Tick marks: a single location (STS, SNP)
Gene diagram labels: 3' UTR, exon, 5' UTR
Intron and direction of transcription: <<< or >>>
Track colors may have meaning, for example in the UCSC Gene track:
•If there is a corresponding PDB entry = black
•If there is a corresponding reviewed/validated seq = dark blue
•If there is a non-RefSeq seq = lightest blue
Mammal cons.: the height of a blue bar indicates increased likelihood of conservation; red indicates a likelihood of faster-evolving regions
Alignment indications (Conservation pairs: “chain” or “net” style)
•Alignments = boxes, Gaps = lines
Options for Changing Images: Upper Section
Figure labels: walk; zoom; tweak position or do new search; right-click items; hold/drag mouse; drag (like Google Maps) to view section
Change your view or location with controls at the top
Use “base” to get right down to the nucleotides
Drag tracks up and down the viewer to re-arrange
Various select and focus options by clicking/dragging mouse
Annotation Track Display Options
Figure labels: links to info and/or filters and color key; change track view; enforce menu changes
Some data is ON or OFF by default
Menu links to info about the tracks: content, methods
You change the view with pulldown menus
After making changes, REFRESH to enforce the change
Basic Annotation Track Menus Defined

Hide: removes a track from view

Dense: all items collapsed into a single line

Squish: each item = separate line, but 50% height + packed

Pack: each item separate, but efficiently stacked (full height)

Full: each item on separate line (may need to zoom to fit)
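The five menu choices above can be modeled as a small enumeration. This is an illustrative sketch, not part of the browser: it assumes that a viewer link accepts `trackName=mode` pairs in its query string, and the track names `knownGene` and `rmsk` are assumed table names used only as examples.

```python
from enum import Enum
from typing import Dict
from urllib.parse import urlencode


class Visibility(Enum):
    HIDE = "hide"      # track removed from view
    DENSE = "dense"    # all items collapsed into a single line
    SQUISH = "squish"  # one line per item, half height, packed
    PACK = "pack"      # one line per item, full height, efficiently stacked
    FULL = "full"      # every item on its own line


def visibility_query(settings: Dict[str, Visibility]) -> str:
    """Turn {track name: display mode} into URL query parameters."""
    return urlencode({track: mode.value for track, mode in settings.items()})


# Assumed track names, for illustration only.
print(visibility_query({"knownGene": Visibility.PACK, "rmsk": Visibility.DENSE}))
```

Keeping the modes in an `Enum` means a typo such as "packed" fails when the settings are built rather than being silently ignored in a link.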
Tracks with Additional Options: Filters, more…
Figure labels: off, on, Supertrack
Some tracks have filters (ESTs shown; SNPs are another good example)
Some tracks may have undisplayed data (Yale TFBS; 2006)
Super-tracks may have multiple components, various settings
Mid-page Options to Change Settings
Figure labels: search for data types; resets, back to defaults; flip display to Genomic 5’3’; fit to browser window size; start from scratch
Search for data types
Reset to defaults
Configure options page
You control the views with numerous features
Cookies and Sessions
Your browser remembers where you were (cookies)
To clear your “cart” or parameters, click default tracks or reset
OR
Save your setup as a “Session” to store and share it
Requires login
Lifespan: 4 months
Click Any Viewer Object for More Details
Click the item: a new description web page opens
Many details and links to more data about TP53
Example: click your mouse anywhere on the TP53 line
Click Annotation Track Item for Description Pages
Description page sections: informative description, other resource links, links to sequences, genetic association studies, comparative toxicology, microarray data, mRNA secondary structure, protein domains/structure, orthologs in other species, Gene Ontology™ descriptions, mRNA descriptions, pathways, synonyms, gene model
Not all genes have this much detail.
Different annotation tracks carry different data.
Click a SNP to get SNP details
Get DNA, with Extended Case/Color Options
Use the View DNA link at the top
Plain or Extended options
Change colors, fonts, underline, etc.
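To make the case option concrete, here is a toy sketch of what toggling case for annotated items does to a sequence. It is not the browser's code: the `toggle_case` function, the example sequence and the interval are invented for illustration, and intervals are assumed to be 0-based and half-open.

```python
from typing import Iterable, Tuple


def toggle_case(seq: str, intervals: Iterable[Tuple[int, int]]) -> str:
    """Lowercase the whole sequence, then uppercase each (start, end) interval."""
    chars = list(seq.lower())
    for start, end in intervals:
        # Clamp each interval to the sequence so out-of-range items are harmless.
        for i in range(max(start, 0), min(end, len(chars))):
            chars[i] = chars[i].upper()
    return "".join(chars)


# Made-up sequence with one highlighted item covering positions 2 to 4.
print(toggle_case("ACGTACGTAC", [(2, 5)]))  # acGTAcgtac
```

Color, font and underline choices work on the same principle: each selected track marks a set of positions, and the formatting is applied only there.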
Get Sequence from Description Pages
Click an item, go to Sequence section of description page
Copy whole mRNA for next segment
Accessing the BLAT Tool
BLAT = BLAST-like Alignment Tool
Rapid searches by INDEXING the entire genome
Works best with high similarity matches
See documentation and publication for details: Kent, WJ. Genome Res. 2002. 12:656 and “Help”
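The idea of indexing the genome can be illustrated with a toy k-mer index. This is a deliberately simplified sketch and not BLAT's implementation: it assumes the genome is cut into non-overlapping k-mers, the query is scanned with overlapping k-mers, and only exact seed hits are reported; the function names and the tiny sequences are made up.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def build_index(genome: str, k: int) -> Dict[str, List[int]]:
    """Map each non-overlapping k-mer of the genome to its start positions."""
    index: Dict[str, List[int]] = defaultdict(list)
    for pos in range(0, len(genome) - k + 1, k):
        index[genome[pos:pos + k]].append(pos)
    return dict(index)


def find_seeds(query: str, index: Dict[str, List[int]], k: int) -> List[Tuple[int, int]]:
    """Return (query offset, genome position) pairs for exact k-mer hits."""
    hits = []
    for q in range(len(query) - k + 1):
        for g in index.get(query[q:q + k], ()):
            hits.append((q, g))
    return hits


index = build_index("ACGTACGGTTCA", k=4)       # built once, reused for every query
print(find_seeds("GGACGGTT", index, k=4))      # [(2, 4)]
```

Because the index is built once and each lookup is a dictionary access, a query never has to be compared against the whole genome. Exact seeds are also one reason such a search favors high-similarity matches.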
BLAT Tool Interface
Make choices
Paste one or more sequences (FASTA for more than one)
DNA limit 25000 bases
Protein limit 10000 aa
25 total sequences
Submit, or upload
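The limits on this slide can be checked before pasting sequences into the form. The sketch below is a hypothetical pre-flight check, not part of BLAT: it assumes the base and amino-acid limits apply to each sequence separately, and the `read_fasta` and `check_submission` names are invented for illustration.

```python
from typing import List, Tuple

DNA_LIMIT = 25_000      # bases, from the slide
PROTEIN_LIMIT = 10_000  # amino acids, from the slide
MAX_SEQUENCES = 25      # total sequences, from the slide


def read_fasta(text: str) -> List[Tuple[str, str]]:
    """Parse FASTA text into (name, sequence) pairs; bare sequence is allowed."""
    records: List[Tuple[str, str]] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            records.append((line[1:].strip() or f"seq{len(records) + 1}", ""))
        elif not records:
            records.append(("query", line))
        else:
            name, seq = records[-1]
            records[-1] = (name, seq + line)
    return records


def check_submission(text: str, is_protein: bool = False) -> List[str]:
    """Return a list of problems; an empty list means the limits are respected."""
    records = read_fasta(text)
    limit = PROTEIN_LIMIT if is_protein else DNA_LIMIT
    problems = []
    if len(records) > MAX_SEQUENCES:
        problems.append(f"{len(records)} sequences exceeds the limit of {MAX_SEQUENCES}")
    for name, seq in records:
        if len(seq) > limit:
            problems.append(f"{name}: length {len(seq)} exceeds the limit of {limit}")
    return problems


print(check_submission(">demo\nACGTACGT\n"))  # []
```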

BLAT Results with Hyperlinks
Results with demo sequences, settings default; sort = Query, Score
Figure labels: go to browser/viewer; go to alignment detail; sorting
Score is a count of matches: the higher the number, the better the match
Click browser to go to Genome Browser image location (next slide)
Click details to see the alignment to genomic sequence (2nd slide)
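The default "Query, Score" ordering can be reproduced on a list of hits. This is a sketch under assumptions: the score formula follows the calculation described in the UCSC BLAT FAQ as recalled here (matches plus half the repeat matches, minus mismatches and the number of inserts), which is somewhat more than a plain count of matches; the `Hit` fields and the example numbers are invented.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hit:
    query: str
    matches: int
    rep_matches: int   # matches that fall in repeats
    mismatches: int
    q_num_insert: int  # number of inserts in the query
    t_num_insert: int  # number of inserts in the genome (target)


def score(hit: Hit, is_protein: bool = False) -> int:
    """Assumed score: higher means a better match."""
    size_mul = 3 if is_protein else 1
    return (size_mul * (hit.matches + hit.rep_matches // 2)
            - size_mul * hit.mismatches
            - hit.q_num_insert
            - hit.t_num_insert)


def sort_hits(hits: List[Hit]) -> List[Hit]:
    """Order by query name, then by descending score (sort = Query, Score)."""
    return sorted(hits, key=lambda h: (h.query, -score(h)))


hits = [Hit("q2", 50, 0, 1, 0, 0), Hit("q1", 80, 0, 5, 1, 1), Hit("q1", 100, 10, 2, 1, 1)]
for hit in sort_hits(hits):
    print(hit.query, score(hit))
```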
BLAT Results: Browser Link
Figure label: query
After clicking browser in the BLAT results:
A new track line labeled Your Sequence from BLAT Search appears
A new menu also appears to adjust it
BLAT Results, Alignment Details
Your query
Genomic match, with color cues
Side by Side Alignment: yours, genomic
Notice:

The materials and slides offered are for non-commercial use
only. Reproduction, distribution and/or use for commercial
purposes is strictly prohibited.

Copyright 2012, OpenHelix, LLC

http://www.openhelix.com/ucsc