WR_2009.09.18


Weekly Report
Start learning GPU
Ph.D. Student: Leo Lee
Date: Sep. 18, 2009
Outline
• References
• Courses study
• Development
• Work plan
References
• K-Means on commodity GPUs with CUDA
– http://portal.acm.org/citation.cfm?id=1579193.1579654&coll=Portal&dl=GUIDE&CFID=52122012&CFTOKEN=42909759
• Accelerating K-Means on the Graphics Processor via CUDA
– http://portal.acm.org/citation.cfm?id=1547557.1548166&coll=Portal&dl=GUIDE&CFID=53240258&CFTOKEN=63251930
• Fast Support Vector Machine Training and Classification on Graphics Processors
– http://portal.acm.org/citation.cfm?id=1390156.1390170&coll=Portal&dl=GUIDE&CFID=53246314&CFTOKEN=25986930
K-Means on commodity GPUs with CUDA
• Introduction:
– OpenMP has too much message communication overhead.
– GPU is becoming common.
– Compared with Shuai Che's work, this paper also moves the new-centroid
recalculation step onto the GPU, so the algorithm performs better.
• GPGPU
– The challenge in mapping a computing problem efficiently on a
GPU through CUDA is to store frequently used data items in the
fastest memory, while keeping as much of the data on the device
as possible.
– Application areas: digital investigation, physics simulation, molecular dynamics.
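
To make the memory-hierarchy point above concrete, here is a minimal sketch of my own (not code from the paper): a small, frequently reused array such as the centroid table is staged into fast on-chip shared memory by each thread block, while the large data set stays in global device memory. The kernel name and parameters are illustrative assumptions.

    __global__ void cache_centroids(const float *centroids_global, int k, int dim)
    {
        // Assumption: k*dim floats fit in shared memory (16 KB per multiprocessor on a GeForce 8800).
        extern __shared__ float centroids_s[];                  // fast on-chip memory, one copy per block
        for (int i = threadIdx.x; i < k * dim; i += blockDim.x)
            centroids_s[i] = centroids_global[i];                // cooperative copy by all threads of the block
        __syncthreads();                                         // every thread now reads the fast copy
        // ... computations that reuse the centroids would read centroids_s[] repeatedly here ...
    }

A possible launch, passing the dynamic shared-memory size: cache_centroids<<<blocks, threads, k * dim * sizeof(float)>>>(d_centroids, k, dim);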
K-Means on commodity GPUs with CUDA
• K-Means algorithm on GPU
– Data object assignment, two strategies:
• Centroid-oriented: used when the number of processors is small;
• Data-object-oriented: adopted in this paper.
– K centroids recalculation
• Massive conditional statements are not well suited to the stream-processor
model of GPUs;
• The host rearranges all data objects and counts the number of data
objects in each cluster.
– GPU-based K-Means
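
A minimal sketch (my reconstruction, not the paper's code) of the data-object-oriented assignment step: one thread per data object, each thread scanning all k centroids and writing out the index of the nearest one. The names assign_kernel, data, centroids, membership and the row-major layout are assumptions for illustration.

    __global__ void assign_kernel(const float *data, const float *centroids,
                                  int *membership, int n, int k, int dim)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;   // one thread handles one data object
        if (i >= n) return;

        int best = 0;
        float best_dist = 3.402823466e38f;               // FLT_MAX
        for (int c = 0; c < k; ++c) {                    // scan all k centroids
            float dist = 0.0f;
            for (int d = 0; d < dim; ++d) {
                float diff = data[i * dim + d] - centroids[c * dim + d];
                dist += diff * diff;                     // squared Euclidean distance
            }
            if (dist < best_dist) { best_dist = dist; best = c; }
        }
        membership[i] = best;                            // cluster assignment for object i
    }

A possible launch: assign_kernel<<<(n + 255) / 256, 256>>>(d_data, d_centroids, d_membership, n, k, dim); the new centroids are then recalculated from membership, as described above.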
K-Means on commodity GPUs with CUDA
• Performance analysis
K-Means on commodity GPUs with CUDA
• Pros and cons
– Describes a GPU-based K-Means algorithm and achieves a speedup of 10x;
– Does not include enough comparisons, especially with other GPU-based
algorithms.
Fast SVM Training and Classification on GPU
• Introduction
– SVM could be adapted to parallel computers.
– SVM is widely used.
– Training and classification are computationally
intensive.
Fast SVM Training and Classification on GPU
• C-SVM
– SVM Training
– SMO
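
For reference, the C-SVM training problem that SMO solves and the quantity it tracks, in standard textbook notation (my addition, not taken from the slide):

    % C-SVM dual problem solved during training
    \max_{\alpha}\; \sum_{i=1}^{n}\alpha_i - \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}\alpha_i\alpha_j y_i y_j K(x_i,x_j)
    \quad\text{s.t.}\quad 0 \le \alpha_i \le C,\qquad \sum_{i=1}^{n}\alpha_i y_i = 0

    % SMO updates two multipliers per iteration, chosen from the optimality terms
    f_i = \sum_{j=1}^{n}\alpha_j y_j K(x_j,x_i) - y_i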
Fast SVM Training and Classification on GPU
• SVM Classification
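
For reference, the classification step in its standard form (textbook notation, my addition): each test point z gets the label

    \hat{y}(z) = \operatorname{sign}\Big(\sum_{i \in SV} y_i \alpha_i K(x_i, z) + b\Big)

The kernel evaluations over all test points and support vectors can be batched into dense matrix products, which suits the GPU well.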
Fast SVM Training and Classification on GPU
• Graphics Processors
– General purpose;
– More aggressive memory subsystems;
– Peak performance is usually impossible to achieve, but the GPU still
delivers significant speedups;
– True round-to-nearest-even rounding on IEEE single-precision datatypes,
with double precision promised in the near future.
– Nvidia GeForce 8800 GTX
– CUDA
Fast SVM Training and Classification on GPU
• SVM Training Implementation
– Map-reduce structure: computing f is the map step; finding b and I is
the reduction step.
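
A hedged sketch of the reduction half of this map-reduce structure: a standard CUDA tree reduction that finds the minimum f value and its index within each block (flipping the comparison gives the maximum). It assumes 256 threads per block; kernel and variable names are my own illustration, not the paper's.

    __global__ void argmin_reduce(const float *f, int n, float *block_min, int *block_idx)
    {
        __shared__ float sval[256];
        __shared__ int   sidx[256];

        int tid = threadIdx.x;
        int i   = blockIdx.x * blockDim.x + tid;

        sval[tid] = (i < n) ? f[i] : 3.402823466e38f;    // pad with FLT_MAX
        sidx[tid] = (i < n) ? i    : -1;
        __syncthreads();

        for (int s = blockDim.x / 2; s > 0; s >>= 1) {   // tree reduction in shared memory
            if (tid < s && sval[tid + s] < sval[tid]) {
                sval[tid] = sval[tid + s];
                sidx[tid] = sidx[tid + s];
            }
            __syncthreads();
        }
        if (tid == 0) {                                  // one candidate per block
            block_min[blockIdx.x] = sval[0];
            block_idx[blockIdx.x] = sidx[0];
        }
    }

The per-block candidates in block_min/block_idx would then be combined by a second, smaller launch or on the host.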
Fast SVM Training and Classification on GPU
• Results, compared with LibSVM
Summary
• GPU-related paper outline
– The ** algorithm is useful and computationally intensive;
– GPU and CUDA are powerful;
– Implement the algorithm on GPU;
– Results, compared with the CPU-based algorithm and others'
GPU-based algorithms.
• New algorithms or better speedup.
– K-means is hot;
– K-NN, SVM, and Apriori have appeared.
– What is our focus?
Outline
• References
• Courses study
– Data mining, Security, CUDA Programming
• Development
• Work plan
CUDA Programming
• On-line class
– Introduction
– Basics
– Memory
– Threads
– Application performance
– Floating-point
Outline
• References
• Courses study
• Development
– Matrix multiplication; reading the K-Means and K-NN code (a matrix-multiply kernel sketch follows this outline).
• Work plan
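
As referenced in the Development item above, a minimal, untuned matrix-multiply kernel of the kind used as a first CUDA exercise. This is my own sketch with assumed names and a square n x n layout, not the code actually written.

    // Naive C = A * B for square n x n row-major matrices; one thread computes one element of C.
    __global__ void matmul_naive(const float *A, const float *B, float *C, int n)
    {
        int row = blockIdx.y * blockDim.y + threadIdx.y;
        int col = blockIdx.x * blockDim.x + threadIdx.x;
        if (row >= n || col >= n) return;

        float sum = 0.0f;
        for (int k = 0; k < n; ++k)
            sum += A[row * n + k] * B[k * n + col];      // dot product of row of A and column of B
        C[row * n + col] = sum;
    }

A possible launch: dim3 block(16, 16); dim3 grid((n + 15) / 16, (n + 15) / 16); matmul_naive<<<grid, block>>>(d_A, d_B, d_C, n); a tuned version would tile A and B through shared memory.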
Work plan
• Continue reading the papers.
• Read the code of K-Means and K-NN in detail.
• Data mining
– SVM and C4.5
• Thank you for listening.