Review of CIKM 2009
Da Zhou
2009-11-08
Outline
• Flash-based Database
• Posters
• Demo
• Conclusion
Flash-based Database
• RS-Wrapper: Random Write Optimization for Solid State Drive
• A Flexible Simulation Environment for Flash-aware Algorithms
• Dynamic In-Page Logging for Flash-Aware B-Tree Index
– Gap-Joo Na (Sungkyunkwan University), Sang-Won Lee (Sungkyunkwan University), Bongki Moon (University of Arizona)
– Experiments
– People
Posters
Information Retrieval:
• Citation Analysis, Social Networks for IR
• Domain-specific IR
• Filtering
• Foundation of Information Retrieval
• IR Architectures, Scalability and Efficiency
• IR Evaluation
• Language Specific IR
• Machine Learning for IR
• Multimedia IR
• Semi-Structured Information Retrieval
• User Modeling for IR, Search Personalization
• Web Search, Advertising, Adversarial Advertising and Optimization
Knowledge Management:
• Classification and Clustering
• Data pre- and post-Processing
• Information Extraction
• Information Filtering and Recommender Systems
• Knowledge Synthesis and Visualization
• Large-scale statistical techniques
• Link and Graph Mining
• Semantic Techniques
• Temporal and Spatial data Mining
• Text Mining
Posters
• Acronym Extraction
– Training data: full name, context
– Existing method: context-based
– New idea: context-based + links between pages
– Dataset: web pages in intranet
• Review assignment
– Motivation: assigning papers purely by research topic leads to too many papers going to a few reviewers
– Greedy algorithm
– Integer linear programming algorithm
– Dataset: Gold standard data
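The greedy option for review assignment can be sketched as follows. This is a minimal illustration, not the poster's algorithm: the score table, the names `greedy_assign`, `reviewers_per_paper`, and `max_load`, and the sample data are all assumptions. It picks the best-matching (paper, reviewer) pairs first and skips any reviewer who has reached the load cap, which is one way to avoid piling papers on a few people.

```python
from collections import defaultdict

def greedy_assign(scores, reviewers_per_paper, max_load):
    """Greedily pick the highest-scoring (paper, reviewer) pairs under load limits.

    scores: dict mapping (paper, reviewer) -> topical match score (hypothetical input).
    """
    assigned = defaultdict(list)   # paper -> list of reviewers
    load = defaultdict(int)        # reviewer -> number of assigned papers
    # Visit pairs from best to worst match; ties are broken by name for determinism.
    for (paper, reviewer), _ in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0])):
        if len(assigned[paper]) < reviewers_per_paper and load[reviewer] < max_load:
            assigned[paper].append(reviewer)
            load[reviewer] += 1
    return dict(assigned), dict(load)

# Hypothetical scores: "alice" is the best topical match for every paper.
scores = {
    ("p1", "alice"): 0.9, ("p1", "bob"): 0.4,
    ("p2", "alice"): 0.8, ("p2", "bob"): 0.5,
    ("p3", "alice"): 0.7, ("p3", "bob"): 0.3,
}
assignment, load = greedy_assign(scores, reviewers_per_paper=1, max_load=2)
print(assignment, load)
```

With the cap set to 2, the third paper goes to the second reviewer even though the first one scores higher on topic. An integer linear programming formulation would optimize the total score under the same constraints instead of deciding pair by pair.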
Posters
• Construct Wikis
– Time consuming, Laborious
– Snippet extraction and selection
• Distance and influence for snippet
• What makes categories difficult to classify?
– Data itself
– The characteristics of data
Posters
• A word clustering approach for language model-based sentence retrieval in question answering systems
– Word-based model
– Class-based model
– N-gram model
– Add a variance into the formula
Posters
• A Collaborative Filtering Approach to Ad Recommendation using the Query-Ad Click Graph
– Example click graph (figure): queries "Discount shoes", "Cheap shoes", "Running shoes"; ads A.com, B.com, C.com
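A rough sketch of neighborhood-style collaborative filtering on a query-ad click graph, using the node labels from the slide. The click counts, the edges, the Jaccard similarity, and the function name `recommend_ads` are assumptions made for illustration; the slide shows only the queries and ads, and the poster's actual scoring is not given here.

```python
from collections import defaultdict

# Hypothetical click counts (query, ad) -> clicks; the slide gives only node labels.
clicks = {
    ("discount shoes", "A.com"): 5,
    ("discount shoes", "B.com"): 3,
    ("cheap shoes", "B.com"): 4,
    ("running shoes", "C.com"): 6,
}

def recommend_ads(query, clicks):
    """Score unseen ads for `query` via other queries that share clicked ads with it."""
    ads_of = defaultdict(dict)
    for (q, ad), n in clicks.items():
        ads_of[q][ad] = n
    seen = ads_of.get(query, {})
    scores = defaultdict(float)
    for other, other_ads in ads_of.items():
        if other == query:
            continue
        # Similarity = Jaccard overlap of the two queries' clicked-ad sets.
        shared = set(seen) & set(other_ads)
        union = set(seen) | set(other_ads)
        sim = len(shared) / len(union) if union else 0.0
        for ad, n in other_ads.items():
            if ad not in seen and sim > 0:
                scores[ad] += sim * n
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

result = recommend_ads("cheap shoes", clicks)
print(result)
```

Under these made-up clicks, "cheap shoes" shares B.com with "discount shoes", so A.com is suggested; "running shoes" shares no ad with it and contributes nothing.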
Posters
• URL Normalization for de-duplication of web pages
– URL 1: t1, t2, t3, t4
– URL 2: t1, t2, t3, t_title
– t4 = t_title
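One possible reading of the token example above is that two URLs are split into tokens and compared position by position, so that a single differing token (an id versus a title) can be spotted. The sketch below illustrates only that comparison step. The tokenization rule, the helper names `url_tokens` and `differing_positions`, and the sample URLs are assumptions, not the poster's method.

```python
from urllib.parse import urlsplit

def url_tokens(url):
    """Split a URL into host, path segments, and query key=value tokens."""
    parts = urlsplit(url)
    tokens = [parts.netloc.lower()]
    tokens += [seg for seg in parts.path.split("/") if seg]
    tokens += [kv for kv in parts.query.split("&") if kv]
    return tokens

def differing_positions(url_a, url_b):
    """Return token positions where two equally long token lists disagree, else None."""
    a, b = url_tokens(url_a), url_tokens(url_b)
    if len(a) != len(b):
        return None
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]

# Hypothetical URLs: the first three tokens match, the last is an id vs. a title slug.
u1 = "http://example.com/news/2009/12345"
u2 = "http://example.com/news/2009/cikm-review"
diff = differing_positions(u1, u2)
print(url_tokens(u1), url_tokens(u2), diff)
```

A pair that differs in exactly one position is only a duplicate candidate; deciding that the two tokens name the same page (t4 = t_title) needs extra evidence such as the page content.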
Demo
• OfCourse: Web Content Discovery, Classification and Information Extraction for Online Course Materials
• XQGen - An Algebra-based XPath Query Generator for MicroBenchmarking
Conclusion
• Limited improvement on old topics
– More factors are considered
– Minor improvement for existing method
– Applying existing method in wider applications
• DB is a small group
• DB is closer to systems than IR is
• IR is closer to daily life than DB