Day 1. General aspects for genetic map construction
DAY 1. GENERAL ASPECTS FOR GENETIC MAP CONSTRUCTION
SANGREA SHIM
INDEX
Day 1
General aspects for genetic map construction
Genetic polymorphism and recombination frequency
Genotyping using molecular marker
Map construction (phenotype, AFLP, RFLP)
Sequencing method
Next generation sequencing
Whole genome reference sequence
Resequencing for genotyping
Retrieving sequence polymorphism
Genetic map construction (SNP, InDel)
GENETIC POLYMORPHISM & RECOMBINATION FREQUENCY
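To make the slide title concrete: the recombination frequency between two markers is the fraction of scored individuals in which the two markers carry alleles from different parents. The sketch below is only an illustration in Python. The genotype strings are made-up data, the `a`/`b`/`-` coding is an assumption, and it assumes a population where every individual can be scored directly as parental or recombinant. The Kosambi mapping function then converts the frequency into centimorgans.

```python
import math


def recombination_frequency(marker1, marker2):
    """Fraction of scored individuals whose two markers carry different parental alleles.

    marker1, marker2: strings of 'a'/'b' codes, one character per individual.
    '-' is a missing call; individuals with a missing call are skipped.
    """
    scored = 0
    recombinant = 0
    for g1, g2 in zip(marker1, marker2):
        if g1 == "-" or g2 == "-":
            continue
        scored += 1
        if g1 != g2:
            recombinant += 1
    if scored == 0:
        raise ValueError("no individual has both markers scored")
    return recombinant / float(scored)


def kosambi_cm(r):
    """Kosambi map distance in centimorgans for a recombination frequency r < 0.5."""
    return 25.0 * math.log((1 + 2 * r) / (1 - 2 * r))


# Hypothetical genotypes of 10 individuals at two markers
m1 = "aabbaabbab"
m2 = "aabbaabbba"   # the last two individuals are recombinant

r = recombination_frequency(m1, m2)
print(r)
print(kosambi_cm(r))
```

Markers with a small recombination frequency are placed close together on the map; a frequency near 0.5 means the markers behave as unlinked.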
GENOTYPING USING MOLECULAR MARKER
An Integrated High-density Linkage Map of Soybean with RFLP, SSR, STS, and AFLP Markers Using A Single F2 Population
Xia et al. 2008
MAP CONSTRUCTION
An Integrated High-density Linkage Map of Soybean with RFLP, SSR, STS, and AFLP Markers Using A Single F2 Population
Xia et al. 2008
NEXT GENERATION SEQUENCING
Sequencing
Sanger’s dideoxy chain termination
Uses ddNTPs as chain terminators
Electrophoresis in a capillary gel
Reads dye colors one by one
Average read length 700~900 bp
Massively parallel sequencing platforms
So-called next generation sequencing (NGS) platforms
SOLiD (sequencing by ligation): reads of 50+35 or 50+50 bp; 1200~1300 million reads per run
Illumina (sequencing by synthesis): reads of 50~300 bp; ~3000 million reads per run
454 (pyrosequencing): reads of 700 bp; 1 million reads per run
NEXT GENERATION SEQUENCING
Sequencing technologies – the next generation
Metzker. Nature Reviews Genetics 2010
WHOLE GENOME REFERENCE SEQUENCE
Polymorphisms are discovered by comparison
A reference is required for the comparison
So a reference genome is essential
Build contigs from combinations of unique sequences using PE (paired-end) or small-insert MP (mate-pair) reads
Build scaffolds, which span less unique (i.e. repetitive) sequences, using large-insert MP library reads
Anchor the scaffolds using a genetic map
But a genetic map built from several types of molecular markers cannot be translated directly into sequence information
RESEQUENCING FOR GENOTYPING
Get polymorphisms! Treat each one as a marker or locus!
SNPs
Small InDels
Align raw reads at several-fold depth against the reference
Statistics
Lots of alignment software is available
BLAST, BLAT, BWA, the Bowtie series…
Aligners that use the BWT (Burrows-Wheeler transform) as their main algorithm are popular
Fast, efficient
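For a feel of what the BWT is, here is a naive Python sketch that builds the transform by sorting all rotations of a string, and inverts it again. This is only the textbook definition on a toy string. Real aligners such as BWA and Bowtie build a compressed index on top of the transform and do not work this way internally; the function names here are hypothetical.

```python
def bwt(text, terminator="$"):
    """Burrows-Wheeler transform: last column of the sorted rotation table (naive)."""
    if terminator in text:
        raise ValueError("terminator must not occur in the text")
    s = text + terminator
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last_column, terminator="$"):
    """Rebuild the original text by repeatedly prepending the last column and sorting."""
    n = len(last_column)
    table = [""] * n
    for _ in range(n):
        table = sorted(last_column[i] + table[i] for i in range(n))
    for row in table:
        if row.endswith(terminator):
            return row[:-1]
    raise ValueError("terminator not found in input")


transformed = bwt("banana")
print(transformed)
print(inverse_bwt(transformed))
```

The transform groups identical characters into runs (here `nn` and `aa`), which is what makes the index compact, and it is fully reversible, so no sequence information is lost.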
RESEQUENCING FLOW CHART
DNA/RNA -> NGS platform -> Raw read sequences
Quality trimming (SolexaQA)
Alignment (bwa, bowtie2) -> SAM
Selection (samtools) -> BAM
Sorting (samtools) -> Sorted BAM
Pileup (samtools, bcftools) -> VCF
Map construction (JoinMap4)
RETRIEVING SEQUENCE POLYMORPHISM
Bowtie2 and BWA only align the bulk reads to the reference sequence
The result is a SAM (Sequence Alignment/Map) file, or its binary form BAM (Binary Alignment/Map)
Several kinds of statistics or inference can be applied to retrieve polymorphisms (Picard, GATK)
The SAMtools package is used here to retrieve variants
The output file is in VCF (Variant Call Format)
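As a rough idea of what a VCF file holds, the Python sketch below reads a few VCF lines and pulls out the chromosome, position, alleles and per-sample genotypes. It relies on the fixed VCF columns (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO, FORMAT, then one column per sample). The example records and the sample names `RIL1`/`RIL2` are made up, and symbolic or missing ALT values are not handled.

```python
def parse_vcf_line(line):
    """Split one VCF data line into the fields needed for genotyping."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, _id, ref, alt = fields[:5]
    alts = alt.split(",")
    # SNP: the reference and every alternate allele are single bases
    is_snp = len(ref) == 1 and all(len(a) == 1 for a in alts)
    # Sample columns start at the 10th field; GT is the first sub-field
    genotypes = [sample.split(":")[0] for sample in fields[9:]]
    return {
        "chrom": chrom,
        "pos": int(pos),
        "ref": ref,
        "alts": alts,
        "kind": "SNP" if is_snp else "InDel",
        "genotypes": genotypes,
    }


def read_vcf(lines):
    """Yield parsed records, skipping header lines (starting with '#') and blanks."""
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        yield parse_vcf_line(line)


# Hypothetical VCF content for two samples
example = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tRIL1\tRIL2",
    "Gm01\t1234\t.\tA\tG\t50\tPASS\tDP=20\tGT:DP\t0/0:10\t1/1:10",
    "Gm01\t5678\t.\tAT\tA\t60\tPASS\tDP=18\tGT:DP\t0/1:9\t./.:0",
]

records = list(read_vcf(example))
for rec in records:
    print(rec["chrom"], rec["pos"], rec["kind"], rec["genotypes"])
```

In a genotype such as `0/1`, `0` is the reference allele and `1` the first alternate allele; `./.` is a missing call.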
GENETIC MAP CONSTRUCTION
Selection of a core set of RILs from Forrest x Williams 82 to develop a framework map in soybean
Wu et al. 2011
HURDLES ON THE ROAD TO GENETIC MAP
The output of variant calling is in VCF format
The JoinMap input file is in LOC format
Is there a converter between VCF and LOC?
Write a converter program and build a genetic map yourself
These are the final goals of this course
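The core of such a converter could look like the Python sketch below: each progeny genotype from the VCF is compared with the two parents and turned into a one-letter code. This is not the course's converter. The `a`/`b`/`h`/`-` codes, the locus name and the one-locus-per-line layout are assumptions for illustration and should be checked against the JoinMap manual, which also defines the header lines a real LOC file needs.

```python
def alleles_of(gt):
    """Split a VCF genotype such as '0/1' or '0|1' into its two allele indices."""
    return gt.replace("|", "/").split("/")


def homozygous_allele(gt):
    """Return the allele index if the genotype is homozygous and called, else None."""
    alleles = alleles_of(gt)
    if len(alleles) == 2 and alleles[0] == alleles[1] and alleles[0] != ".":
        return alleles[0]
    return None


def gt_to_code(gt, allele_a, allele_b):
    """Code one progeny genotype relative to parent A and parent B (assumed coding)."""
    alleles = alleles_of(gt)
    if len(alleles) != 2 or "." in alleles:
        return "-"                      # missing call
    if set(alleles) == {allele_a, allele_b}:
        return "h"                      # heterozygous for the two parental alleles
    if alleles[0] == alleles[1] == allele_a:
        return "a"                      # same as parent A
    if alleles[0] == alleles[1] == allele_b:
        return "b"                      # same as parent B
    return "-"                          # anything else is treated as missing


def locus_line(name, parent_a_gt, parent_b_gt, progeny_gts):
    """Return 'name code code ...', or None if the site is not informative."""
    allele_a = homozygous_allele(parent_a_gt)
    allele_b = homozygous_allele(parent_b_gt)
    if allele_a is None or allele_b is None or allele_a == allele_b:
        return None                     # parents must be homozygous and different
    codes = [gt_to_code(gt, allele_a, allele_b) for gt in progeny_gts]
    return name + " " + " ".join(codes)


# Hypothetical site: parent A is 0/0, parent B is 1/1, four progeny
line = locus_line("Gm01_1234", "0/0", "1/1", ["0/0", "1/1", "0/1", "./."])
print(line)
```

Sites where the parents do not differ carry no mapping information, so the sketch drops them instead of writing a locus line.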
TODAY’S PRACTICE
Connect to the remote computer
Get used to the Linux system
Get familiar with python2.7
THANK YOU
If you have a question, please ask me.
DAY 1.
PRACTICE - BASIC LINUX COMMAND
TAEYOUNG LEE
CONNECTING
The server is located on the Seoul National University campus
Connect to the server using the PuTTY SSH client
Download it at http://www.chiark.greenend.org.uk/~sgtatham/putty/download.html
CONNECTING
Run PuTTY
Enter the IP address (147.46.250.193) in the Host Name field and click Open
CONNECTING
ID : trainee
PW : bogor
You are now logged in to the server
You will see only white characters on a black background
BASIC COMMAND IN LINUX
ls
Listing files and directories
cd
Change directory
Practice) enter into /data2/python
BASIC COMMAND IN LINUX
mkdir
Make directory
Usage) mkdir dir_name
Practice) make a directory named after yourself
BASIC COMMAND IN LINUX
vi
Opens a text editor
Can also create a new text file
Usage) vi filename_to_edit
vi filename_to_make
Practice) make a text file named after yourself in your directory, write something and save it
Modes: insert (i), replace (R), Esc to return to command mode
:q quit, :w save, :wq save and quit, :q! quit without saving
BASIC COMMAND IN LINUX
mv
Moving files or directories
Rename files or directories
Usage) mv present_file_path file_path_to_move
Practice)
Move up to the parent directory
Command) cd ..
Make a text file with vi
Move the text file into your directory
Rename the text file
BASIC COMMAND IN LINUX
cp
Copying files or directories
Usage) cp file_path file_path_to_copy
cp can also rename the file while copying
To copy a directory, you have to use the -r option
cp -r dir_path dir_path_to_copy
Practice)
Make a directory inside your directory
Copy a file into it, once with renaming and once without
BASIC COMMAND IN LINUX
rm
Removing files or directories
Usage) rm file_name
To remove a directory, you have to use the -r option
rm -r dir_name
Practice)
Remove the directory and the file
BASIC COMMAND IN LINUX
less
Read-only text viewer
Works well for large text files
Usage) less file_name
Search function
/
Practice)
Open a large text file with vi and with less
/data2/python/Gmax_109_gene_exons.gff3
Use searching function
/Gm12
wget ftp://ftp.arabidopsis.org/home/tair/Sequences/whole_chromosomes/tair9_Assembly_gaps.gff
BASIC COMMAND IN LINUX
cat
Concatenate files
Print out files
Usage) cat file_name1 file_name2 …
Practice)
Print out file by cat
Print out file three times
BASIC COMMAND IN LINUX
grep
Prints the lines that contain a given word
Often used with cat
Usage) cat file_name | grep 'word'
'|' (pipe) passes the output of the command on its left to the command on its right
So this usage prints the file first, then keeps only the lines that contain the word
Various useful options
-v : invert the match (print the lines that do NOT contain the word)
-c : count the matching lines
'word1\|word2' : lines that contain word1 or word2
grep 'word1' | grep 'word2' : lines that contain both word1 and word2
Practice)
Grep 'Gm12' in /data2/python/Gmax_109_gene_exons.gff3
Grep 'Gm12' or 'Gm15' in the same file
Grep 'gene' and 'mRNA'
Count the lines that contain 'Gm12'
Exclude the lines that contain exon, CDS or mRNA (use -v)
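As a bridge to the Python part of the course, the same kinds of filtering can be sketched in Python with the standard `re` module. The `grep` function and the four sample lines are hypothetical, simplified stand-ins for the GFF file. Note that Python writes "or" as a plain `|` inside the pattern, not `\|`.

```python
import re


def grep(lines, pattern, invert=False):
    """Keep lines that match pattern; with invert=True keep the others (like grep -v)."""
    regex = re.compile(pattern)
    return [line for line in lines if bool(regex.search(line)) != invert]


# Made-up, simplified lines (chromosome, feature type, start, end)
lines = [
    "Gm12\tgene\t100\t900",
    "Gm12\tmRNA\t100\t900",
    "Gm15\texon\t50\t200",
    "Gm01\tCDS\t10\t80",
]

only_gm12 = grep(lines, "Gm12")                              # grep 'Gm12'
gm12_or_gm15 = grep(lines, "Gm12|Gm15")                      # grep 'Gm12\|Gm15'
gm12_and_gene = grep(grep(lines, "Gm12"), "gene")            # grep 'Gm12' | grep 'gene'
count_gm12 = len(only_gm12)                                  # grep -c 'Gm12'
without_parts = grep(lines, "exon|CDS|mRNA", invert=True)    # grep -v 'exon\|CDS\|mRNA'

print(count_gm12)
print(without_parts)
```

Chaining two `grep` calls mirrors the pipe: the second call only sees the lines that the first one kept.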
BASIC COMMAND IN LINUX
sort
Sorts the lines of a file
Often used with cat
Usage) cat file_name | sort
Various useful options
-k n : sort by column n
-u : sort and remove duplicate lines
-n : numeric sort
-r : reverse order
-t : set the field delimiter
Practice)
Sort /data2/python/Gmax_109_gene_exons.gff3 by start position (by column and numeric)
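The practice above (sort by one column, numerically) can be sketched in Python as follows. The function `sort_by_column` and the three GFF-like lines are made up for illustration. The start position is taken from the 4th tab-separated column, as in GFF3, and comment lines are not handled.

```python
def sort_by_column(lines, column, numeric=False, reverse=False, delimiter="\t"):
    """Sort lines by one column (1-based), as text or as integers."""
    def key(line):
        value = line.rstrip("\n").split(delimiter)[column - 1]
        return int(value) if numeric else value
    return sorted(lines, key=key, reverse=reverse)


# Made-up GFF3-like lines: seqid, source, type, start, end, score, strand, phase, attributes
lines = [
    "Gm01\texample\texon\t1200\t1500\t.\t+\t.\tID=e3",
    "Gm01\texample\texon\t90\t400\t.\t+\t.\tID=e1",
    "Gm01\texample\texon\t700\t950\t.\t+\t.\tID=e2",
]

by_start = sort_by_column(lines, 4, numeric=True)
as_text = sort_by_column(lines, 4)

print([line.split("\t")[3] for line in by_start])
print([line.split("\t")[3] for line in as_text])
```

The second result shows why the numeric option matters: compared as text, `1200` sorts before `700` and `90`.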
BASIC COMMAND IN LINUX
cut
Cuts columns out of a file
Often used with cat
Usage) cat file_name | cut -f n (n : integer)
Practice)
Retrieve chromosome, start position, end position in /data2/python_study/Gmax_109_gene_exons.gff3
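For comparison, column extraction can be sketched in Python like this. The function `cut` and the sample line are hypothetical. In a GFF3 file the chromosome, start and end are tab-separated columns 1, 4 and 5.

```python
def cut(lines, fields, delimiter="\t"):
    """Keep only the given columns (1-based) of each line, like cut -f."""
    result = []
    for line in lines:
        columns = line.rstrip("\n").split(delimiter)
        kept = [columns[i - 1] for i in fields if i <= len(columns)]
        result.append(delimiter.join(kept))
    return result


# Made-up GFF3-like line: seqid, source, type, start, end, score, strand, phase, attributes
lines = ["Gm12\texample\tgene\t100\t900\t.\t+\t.\tID=g1"]

positions = cut(lines, [1, 4, 5])
print(positions)
```

Column numbers that do not exist in a line are skipped rather than raising an error.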
BASIC COMMAND IN LINUX
>
Standard input/output vs. file input/output
Input and output go to the screen or to a file
> saves standard output to a file, overwriting any existing contents
cat file_name | grep 'word' > output_file
>>
>> also saves standard output to a file
But it only appends to the end!
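Python makes the same distinction through the file mode: `"w"` behaves like `>` and `"a"` like `>>`. A small sketch, writing to a throwaway temporary directory (the file name `output_file` is just an example):

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "output_file")

with open(path, "w") as handle:      # like  > output_file
    handle.write("first\n")

with open(path, "w") as handle:      # "w" again: the old contents are replaced
    handle.write("second\n")

with open(path, "a") as handle:      # like  >> output_file: added at the end
    handle.write("third\n")

with open(path) as handle:
    contents = handle.read()

print(contents)
```

After these three writes the file holds only `second` and `third`; `first` was lost when the file was opened with `"w"` a second time.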
HANDLE FILE
Fasta file
/data2/python/ap2.fa
Fastq file
/data2/python/example.fastq
Gff file
/data2/python/Gmax_109_gene_exons.gff3
Python file!
/data2/python/1stday.py
Make a new text file named new.txt
The file contains:
Gm01,1,23
Gm04,4,56
Gm03,6,78
Gm04,8,10
Copy new.txt to new.copy
Remove new.copy
Using cat, print the contents of new.txt
Using grep, print the lines of new.txt that contain Gm04
Using cut, print the first column of new.txt and save it as a file named new.txt.cut (new.txt is comma-separated, so cut needs -d ',')
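The whole exercise can also be walked through in Python, which may help when moving on to the Python practice. This is only one possible sketch, not the contents of 1stday.py. It works in a temporary directory so nothing on the server is touched.

```python
import os
import shutil
import tempfile

workdir = tempfile.mkdtemp()
new_txt = os.path.join(workdir, "new.txt")
new_copy = os.path.join(workdir, "new.copy")

# Make new.txt with the four lines from the exercise
with open(new_txt, "w") as handle:
    handle.write("Gm01,1,23\nGm04,4,56\nGm03,6,78\nGm04,8,10\n")

shutil.copy(new_txt, new_copy)       # cp new.txt new.copy
os.remove(new_copy)                  # rm new.copy

with open(new_txt) as handle:        # cat new.txt
    lines = handle.read().splitlines()
for line in lines:
    print(line)

gm04 = [line for line in lines if "Gm04" in line]      # grep 'Gm04'
first_column = [line.split(",")[0] for line in lines]  # cut -d ',' -f 1

with open(new_txt + ".cut", "w") as handle:            # > new.txt.cut
    handle.write("\n".join(first_column) + "\n")
```

Each step is marked with the shell command it stands in for, so the two versions can be compared line by line.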
THAT’S IT FOR TODAY
Q & A