Frequent itemset mining on graphics processors


Weekly Report
Start learning GPU
Ph.D. Student: Leo Lee
Date: Sep. 18, 2009
Outline
• References
• CUDA
• Work plan
References
Frequent itemset mining on graphics processors
• Introduction
– Two representative algorithms: Apriori and FP-growth;
• FP-growth was generally faster than Apriori;
• Apriori-Borgelt was slightly faster when the support was high;
– No prior work focuses on GPU acceleration of FIM algorithms.
– Challenge: the data structures are not aligned and the access patterns are irregular (pointer-chasing).
Frequent itemset mining on graphics processors
• Background and related work: GPGPU
– The parallel primitives [19] are a small set of common operations that exploit the architectural features of GPUs. We utilize the map, reduce, and prefix-sum primitives in our two FIM implementations.
– Improvement: memory optimizations (see the sketch below)
• Local memory optimization for temporal locality
• Coalesced access to device memory for spatial locality
• Built-in vector data types to reduce the number of memory accesses
– Difference
• We study the GPU acceleration of Apriori for FIM, which incurs much more complex control flows and memory accesses than performing database joins or maintaining quantiles over data streams.
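
As an illustration of the coalesced-access and vector-type optimizations listed above, here is a minimal CUDA sketch (my own, not code from the paper; the kernel name and data layout are assumptions): each thread reads one int4, so consecutive threads touch consecutive 16-byte segments and a warp's loads coalesce.

// Hypothetical illustration of coalesced, vectorized loads; not the paper's code.
// Each thread reads one int4 (16 bytes) per iteration, so the warp's accesses
// to device memory are contiguous and coalesce into wide transactions.
__global__ void count_bits_vec4(const int4* bitmap, int numVec4,
                                unsigned int* partialCounts)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    unsigned int count = 0;
    // Grid-stride loop keeps the access pattern coalesced for any input size.
    for (int i = tid; i < numVec4; i += gridDim.x * blockDim.x) {
        int4 v = bitmap[i];               // one 16-byte vector load
        count += __popc(v.x) + __popc(v.y) + __popc(v.z) + __popc(v.w);
    }
    partialCounts[tid] = count;           // summed afterwards with a reduce primitive
}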
Frequent itemset mining on graphics processors
• Implementation
Frequent itemset mining on graphics processors
• Implementation: Pure Bitmap Implementation (PBI)
Given m frequent (K-1)-itemsets and n items, checking whether one (K-1)-itemset is frequent requires accessing about (log m)*(n/128)*16 bytes of data, where log m is the cost of the binary search and (n/128)*16 is the size of one row (in bytes) in the bitmap of (K-1)-itemsets. Typically, if m = 10,000 and n = 10,000, we need to access about 16 KB just to check one (K-1)-subset. This problem in our pure bitmap-based solution triggers us to consider adopting another data structure in the Candidate Generation procedure in the presence of a large number of items.
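
A small host-side sketch of the arithmetic behind the 16 KB figure (illustration only; the variable names are mine):

#include <cmath>
#include <cstdio>

int main() {
    const double m = 10000.0;                        // number of frequent (K-1)-itemsets
    const double n = 10000.0;                        // number of items
    const double rowBytes = (n / 128.0) * 16.0;      // one bitmap row: n bits = n/8 bytes
    const double bytesPerCheck = std::log2(m) * rowBytes;  // binary search touches ~log2(m) rows
    std::printf("bytes per (K-1)-subset check: %.0f (about %.1f KB)\n",
                bytesPerCheck, bytesPerCheck / 1024.0);   // ~16,610 bytes, i.e. about 16 KB
    return 0;
}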
Frequent itemset mining on graphics processors
• Implementation: Trie-based Implementation (TBI)
The candidate generation based on trie traversal is implemented on the CPU. This decision is based on the fact that the trie is an irregular structure and is difficult to share among SIMD threads. Thus, we store the trie representing itemsets in CPU memory, and the bitmap representation of transactions in GPU device memory.
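
To make the division of labor concrete, here is a minimal CUDA sketch (my own illustration, not the authors' code; the layout of one row of transaction bits per item is an assumption) of how support counting over the transaction bitmap can run on the GPU while the trie stays on the CPU:

// Hypothetical support-counting kernel: the transaction bitmap stores, for each
// item, ceil(numTransactions/32) 32-bit words; a candidate's support is the
// popcount of the bitwise AND of its items' rows.
__global__ void support_count(const unsigned int* txBitmap,   // [numItems][wordsPerRow]
                              int wordsPerRow,
                              const int* candidateItems,      // item ids of one K-itemset candidate
                              int K,
                              unsigned int* partialCounts)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    unsigned int count = 0;
    for (int w = tid; w < wordsPerRow; w += gridDim.x * blockDim.x) {
        unsigned int bits = txBitmap[candidateItems[0] * wordsPerRow + w];
        for (int k = 1; k < K; ++k)
            bits &= txBitmap[candidateItems[k] * wordsPerRow + w];   // intersect item rows
        count += __popc(bits);   // transactions containing all K items in this word
    }
    partialCounts[tid] = count;  // summed afterwards with a reduce primitive
}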
Frequent itemset mining on graphics processors
• Implementation: TBI
Frequent itemset mining on graphics processors
• Experiments
Frequent itemset mining on graphics processors
• Results
Outline
• References
• CUDA
• Work plan
CUDA
• Review the code of K-means
– CPU: 1101 S (10 S)
– GPU: still needs debugging, no results yet
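
Since the GPU version is still being debugged, a minimal sketch of the usual GPU-side step of K-means may be useful for reference (my own illustration under assumed data layouts, not the code under review): one thread assigns one point to its nearest centroid.

// Illustrative assignment kernel for K-means; row-major points[numPoints][dim]
// and centroids[k][dim] are assumed. The centroid update is done elsewhere.
__global__ void assign_clusters(const float* points, const float* centroids,
                                int numPoints, int dim, int k, int* assignment)
{
    int p = blockIdx.x * blockDim.x + threadIdx.x;
    if (p >= numPoints) return;

    int best = 0;
    float bestDist = 3.402823e38f;            // ~FLT_MAX
    for (int c = 0; c < k; ++c) {
        float dist = 0.0f;
        for (int d = 0; d < dim; ++d) {
            float diff = points[p * dim + d] - centroids[c * dim + d];
            dist += diff * diff;              // squared Euclidean distance
        }
        if (dist < bestDist) { bestDist = dist; best = c; }
    }
    assignment[p] = best;
}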
Outline
• References
• CUDA
• Work plan
Work Plan
• Summarize this month's work
• Make a plan for next month
• Try to implement a data mining algorithm
• Homework
References
Key words                    | Google Scholar | ACM Portal
GPU decision tree            | 2,230          | 222
GPU k-means                  | 388            | 184
GPU SVM                      | 416            | 27
GPU Apriori                  | 1,980          | 11
GPU Expectation Maximization | 266            | 24
GPU PageRank                 | 4,260          | 5
GPU AdaBoost                 | 113            | 9
GPU k-nn                     | 314            | 20
GPU Naive Bayes              | 104            | 2 (false positives)
GPU CART                     | 1,040          | 3 (false positives)
• Thanks for listening