Mid-Term Review

CS505: Final Exam Review
Jinze Liu
Major Topics
• Before Mid-Term
– Security and Access Control
– Indexing
• After Mid-Term
– Transaction Management
• Locking, Concurrency Control, Logging and Recovery
– Query Processing & Optimization
– Introduction to Information Retrieval
Concurrency Control
• Basic Problem: Given a set of transactions and a
schedule of their actions, determine
– whether the schedule is conflict serializable
– which serializable schedule maximizes concurrency.
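A minimal sketch of the usual test for conflict serializability: build a precedence graph from the conflicting actions of a schedule and check it for cycles. The schedule encoding as `(transaction, operation, item)` tuples and the function names are assumptions made for illustration, not notation from the course.

```python
from itertools import combinations

def precedence_graph(schedule):
    """Return the set of edges (Ti, Tj) of the precedence graph.

    schedule: list of (txn, op, item) tuples in execution order,
    where op is 'r' or 'w' (hypothetical encoding).
    """
    edges = set()
    # combinations() keeps positional order, so `first` precedes `second`.
    for first, second in combinations(schedule, 2):
        t1, op1, x1 = first
        t2, op2, x2 = second
        # Two actions conflict when they come from different transactions,
        # touch the same item, and at least one of them is a write.
        if t1 != t2 and x1 == x2 and 'w' in (op1, op2):
            edges.add((t1, t2))
    return edges

def is_conflict_serializable(schedule):
    """A schedule is conflict serializable iff its precedence graph is acyclic."""
    edges = precedence_graph(schedule)
    nodes = {txn for txn, _, _ in schedule}
    # Repeatedly remove transactions with no incoming edge from the
    # remaining ones; if none can be removed, a cycle exists.
    while nodes:
        free = {n for n in nodes
                if not any(v == n and u in nodes for u, v in edges)}
        if not free:
            return False
        nodes -= free
    return True

interleaved = [('T1', 'r', 'A'), ('T2', 'w', 'A'), ('T1', 'w', 'A')]
serial = [('T1', 'r', 'A'), ('T1', 'w', 'A'), ('T2', 'r', 'A'), ('T2', 'w', 'A')]
print(is_conflict_serializable(interleaved))  # cycle T1 -> T2 -> T1
print(is_conflict_serializable(serial))       # only edge is T1 -> T2
```

The order in which transactions are removed from the graph gives one equivalent serial order.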
Locking
• How does the system enforce concurrency
control?
• Locking
– 1. Basic read and write locks
– 2. Increment locks
– 3. Tree-based locks
• Validation-based concurrency control
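As a rough illustration of how lock modes enforce concurrency control, the sketch below grants a lock only when it is compatible with every lock other transactions hold on the same item. The mode names `S` (shared/read), `X` (exclusive/write) and `I` (increment), the `LockTable` class, and the simplification that a refused request just returns `False` instead of waiting are all assumptions for illustration.

```python
# Pairs (held_mode, requested_mode) that may coexist on one item.
# Assumed compatibility: readers share with readers, increments commute
# with increments, and everything else conflicts.
COMPATIBLE = {('S', 'S'), ('I', 'I')}

class LockTable:
    """Hypothetical in-memory lock table: item -> list of (txn, mode)."""

    def __init__(self):
        self.held = {}

    def request(self, txn, item, mode):
        """Grant the lock and return True, or return False on conflict."""
        others = [m for t, m in self.held.get(item, []) if t != txn]
        if all((m, mode) in COMPATIBLE for m in others):
            self.held.setdefault(item, []).append((txn, mode))
            return True
        return False

    def release_all(self, txn):
        """Drop every lock held by txn (e.g. at commit or abort)."""
        for item in self.held:
            self.held[item] = [(t, m) for t, m in self.held[item] if t != txn]

locks = LockTable()
print(locks.request('T1', 'A', 'S'))  # first reader
print(locks.request('T2', 'A', 'S'))  # second reader shares the item
print(locks.request('T3', 'A', 'X'))  # writer conflicts with both readers
```

A real lock manager would queue the refused request and also handle upgrades and deadlocks; none of that is modeled here.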
Logging and Recovery
• Undo logs
– What’s the content of the log?
– How to recover from undo logs?
• Redo logs
– What’s the content of the log?
– How to recover from redo logs?
• Same questions for undo/redo logs.
• How to use checkpoints?
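The two recovery procedures can be contrasted in a small sketch. It assumes a simplified record format: `('START', T)`, `('COMMIT', T)`, and `('UPDATE', T, item, value)`, where `value` is the old value in an undo log and the new value in a redo log. The function names and the dict standing in for the on-disk database are hypothetical, and abort records and checkpoints are left out.

```python
def recover_undo(log, db):
    """Undo logging: scan backward, restoring old values written by
    transactions that never committed."""
    committed = {rec[1] for rec in log if rec[0] == 'COMMIT'}
    for rec in reversed(log):
        if rec[0] == 'UPDATE' and rec[1] not in committed:
            _, _txn, item, old_value = rec
            db[item] = old_value
    return db

def recover_redo(log, db):
    """Redo logging: scan forward, reapplying new values written by
    transactions that did commit."""
    committed = {rec[1] for rec in log if rec[0] == 'COMMIT'}
    for rec in log:
        if rec[0] == 'UPDATE' and rec[1] in committed:
            _, _txn, item, new_value = rec
            db[item] = new_value
    return db

# T1 committed, T2 did not. With an undo log the disk may already hold
# T2's write, so B is rolled back to its old value 5.
undo_log = [('START', 'T1'), ('UPDATE', 'T1', 'A', 8),
            ('START', 'T2'), ('UPDATE', 'T2', 'B', 5), ('COMMIT', 'T1')]
print(recover_undo(undo_log, {'A': 16, 'B': 10}))

# With a redo log the disk may be missing T1's write, so A is redone to 16.
redo_log = [('START', 'T1'), ('UPDATE', 'T1', 'A', 16),
            ('START', 'T2'), ('UPDATE', 'T2', 'B', 10), ('COMMIT', 'T1')]
print(recover_redo(redo_log, {'A': 8, 'B': 5}))
```

A checkpoint would limit how much of the log each scan has to read; this sketch always reads the whole log.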
Query Processing
• What are the most common queries that are
time consuming?
– Join
• What’s the basic algorithm to implement Join?
– Why is it time consuming?
• How to improve it?
– Sort-Merge Join
– Hash-based Join
– What’s their performance?
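To make the comparison concrete, here is an in-memory sketch of the three join strategies on lists of tuples. The function names and the `key_r` / `key_s` accessor parameters are assumptions for illustration; a real system works on disk blocks, so this shows only the logic, not I/O cost.

```python
from collections import defaultdict

def nested_loop_join(r, s, key_r, key_s):
    """Basic algorithm: compare every pair, i.e. len(r) * len(s) comparisons."""
    return [(a, b) for a in r for b in s if key_r(a) == key_s(b)]

def hash_join(r, s, key_r, key_s):
    """Build a hash table on r, then probe it once per tuple of s."""
    buckets = defaultdict(list)
    for a in r:
        buckets[key_r(a)].append(a)
    return [(a, b) for b in s for a in buckets.get(key_s(b), [])]

def sort_merge_join(r, s, key_r, key_s):
    """Sort both inputs on the join key, then merge them in one pass."""
    r, s = sorted(r, key=key_r), sorted(s, key=key_s)
    out, i, j = [], 0, 0
    while i < len(r) and j < len(s):
        kr, ks = key_r(r[i]), key_s(s[j])
        if kr < ks:
            i += 1
        elif kr > ks:
            j += 1
        else:
            # Find the run of s-tuples sharing this key ...
            j_end = j
            while j_end < len(s) and key_s(s[j_end]) == kr:
                j_end += 1
            # ... and pair it with every r-tuple that has the same key.
            while i < len(r) and key_r(r[i]) == kr:
                out.extend((r[i], b) for b in s[j:j_end])
                i += 1
            j = j_end
    return out

first = lambda t: t[0]
R = [(1, 'a'), (2, 'b'), (2, 'c')]
S = [(2, 'x'), (3, 'y'), (2, 'z')]
print(sorted(nested_loop_join(R, S, first, first)))
print(sorted(hash_join(R, S, first, first)) == sorted(sort_merge_join(R, S, first, first)))
```

All three return the same pairs; they differ in how many comparisons (and, on disk, how many block reads) they need to find them.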
Query Plan
• Basic Question: Given a query, what’s the most
efficient plan to execute it?
– How many equivalent plans are there?
• Query Rewrite
– What’s the plan with best performance?
• How do you estimate performance based on data
distribution?
– How to choose the best plan?
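One common textbook way to compare plans is to estimate the size of each intermediate result from simple statistics: the tuple count T(R) and the number of distinct values V(R, a) of the join attribute. The sketch below uses the estimate T(R)·T(S) / max(V(R, a), V(S, a)) for an equijoin on a. The statistics, relation names and function names are made up for illustration, and the estimate assumes uniformly distributed values.

```python
def estimate_join(t_r, v_r, t_s, v_s):
    """Estimated tuple count of an equijoin of R and S on attribute a.

    t_r, t_s: tuple counts T(R), T(S)
    v_r, v_s: distinct values of the join attribute, V(R, a), V(S, a)
    """
    return t_r * t_s / max(v_r, v_s)

# Hypothetical statistics for R(a, b), S(b, c), U(c, d).
T = {'R': 1000, 'S': 2000, 'U': 5000}
V = {('R', 'b'): 20, ('S', 'b'): 50, ('S', 'c'): 100, ('U', 'c'): 500}

# Two ways to start the three-way join: R with S on b, or S with U on c.
candidates = {
    '(R join S) join U': estimate_join(T['R'], V[('R', 'b')], T['S'], V[('S', 'b')]),
    'R join (S join U)': estimate_join(T['S'], V[('S', 'c')], T['U'], V[('U', 'c')]),
}

# Prefer the plan whose first join produces the smaller intermediate result.
best = min(candidates, key=candidates.get)
print(candidates)
print(best)
```

A real optimizer would also weigh the cost of the physical join algorithm and the available indexes, not just intermediate sizes.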
Information Retrieval
• What data structure stores the relationship
between documents and words?
– Dictionary and postings (inverted index).
• How to speed up word queries?
• How to tolerate errors in queries?
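A minimal sketch of these three ideas: a dictionary that maps each term to a sorted postings list, a linear merge to answer AND queries, and edit distance to map a misspelled query term to the closest dictionary term. The sample documents, the function names and the `max_dist` threshold are assumptions for illustration.

```python
from collections import defaultdict

def build_index(docs):
    """Map each term to a sorted list of the doc ids that contain it."""
    index = defaultdict(list)
    for doc_id in sorted(docs):
        for term in set(docs[doc_id].lower().split()):
            index[term].append(doc_id)
    return index

def intersect(p1, p2):
    """AND query: merge two sorted postings lists in linear time."""
    out, i, j = [], 0, 0
    while i < len(p1) and j < len(p2):
        if p1[i] == p2[j]:
            out.append(p1[i])
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            i += 1
        else:
            j += 1
    return out

def edit_distance(a, b):
    """Levenshtein distance by dynamic programming, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]

def correct(term, index, max_dist=1):
    """Return term if it is indexed, else the closest indexed term within
    max_dist edits, else None."""
    if term in index:
        return term
    dist, best = min(((edit_distance(term, t), t) for t in index),
                     default=(max_dist + 1, None))
    return best if dist <= max_dist else None

docs = {1: "new home sales", 2: "home prices rise", 3: "sales rise in july"}
index = build_index(docs)
print(intersect(index['home'], index['sales']))
print(correct('hime', index))
```

Scanning the whole dictionary for the nearest term is only workable for small vocabularies; larger systems typically narrow the candidates first, for example with a k-gram index.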
Final Exam
• Exam questions
– 5 big questions just like mid-term exam
– 1 extra credit question
• Exam
– Location: CB242
– Time: Dec 18th 3pm-5pm
– Just bring yourself and a pen or pencil
Exam Week Office Hours
• Time
– Monday, Wednesday, 11am – 1pm
• Location
– Hardymon building 237
Thank You
• Questions?
– Send email or drop by