Data consistency for Applications
using FroNtier/Squid
Luis Ramos, CERN
3D Meeting, January 2006
Agenda
1. Frontier Basics
2. Cache consistency issues with Frontier/Squid
3. Inconsistency Scenarios
4. Application Restrictions Summary
5. Conclusions
6. Appendix: Invalidation Mechanism
Frontier Basics
Frontier servlet generates query results as XML
documents from database queries submitted by clients
Frontier Client is a C/C++ API for sending requests to
the Frontier servlet
FrontierAccess (Frontier POOL “plug-in”) uses
Frontier Client to access Frontier servlets
In this context, Frontier is a web-based approach
for generic DB access
Example - Frontier servlet (HTTP)
QUERY:
http://pcitdb03.cern.ch:8080/Frontier/Frontier?type=frontier_request:1:DEFAULT&encoding=BLOB&p1=...(SQL query encoded in base64)
REPLY: an XML document containing the query result
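A minimal sketch of how a client might assemble such a request URL. It assumes that `p1` is plain URL-safe base64 of the SQL text; the encoding actually used by the Frontier client is not given in the slides and may differ. `build_frontier_url` is a hypothetical helper, not part of the Frontier API.

```python
import base64
from urllib.parse import urlencode


def build_frontier_url(servlet_url: str, sql: str) -> str:
    """Return a GET URL carrying the SQL text in the p1 parameter.

    Assumption: p1 is URL-safe base64 of the UTF-8 SQL text. The real
    Frontier client may encode the query differently.
    """
    p1 = base64.urlsafe_b64encode(sql.encode("utf-8")).decode("ascii")
    params = {
        "type": "frontier_request:1:DEFAULT",
        "encoding": "BLOB",
        "p1": p1,
    }
    # Keep ':' unescaped so the URL looks like the one on the slide.
    return servlet_url + "?" + urlencode(params, safe=":")


url = build_frontier_url(
    "http://pcitdb03.cern.ch:8080/Frontier/Frontier",
    "select * from tab1",
)
print(url)
```

Identical SQL text always yields an identical URL, and the URL is what an HTTP cache uses as its key.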
Squid
Squid cache servers are placed between clients
and the Frontier servlet
Squid caches query results (XML documents)
and serves them to clients that ask for exactly
the same query
Cache Consistency Problem
Squid caches database query results for a fixed
time (HTTP time to live, TTL)
The TTL is set by the Frontier server (7 days)
Cache invalidation is purely time-based
When the backend database changes, Squid keeps
serving stale data to clients until the TTL expires
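The effect of time-based invalidation can be illustrated with a toy model. This is only a sketch: `TtlCache` is a hypothetical stand-in for Squid, the "database" is a single Python value, and time is passed in explicitly so the example is deterministic.

```python
class TtlCache:
    """Toy stand-in for Squid: caches one reply per URL for a fixed TTL."""

    def __init__(self, ttl_seconds, origin):
        self.ttl = ttl_seconds
        self.origin = origin      # callable: url -> reply (the servlet)
        self.entries = {}         # url -> (reply, expiry time)

    def get(self, url, now):
        hit = self.entries.get(url)
        if hit is not None and now < hit[1]:
            return hit[0]         # cache hit: the origin is not consulted
        reply = self.origin(url)
        self.entries[url] = (reply, now + self.ttl)
        return reply


DAY = 24 * 3600
database = {"rows": 1}
cache = TtlCache(7 * DAY, origin=lambda url: database["rows"])

first = cache.get("/q1", now=0)          # miss: fetched from the database
database["rows"] = 2                     # backend database change
stale = cache.get("/q1", now=3 * DAY)    # hit: still the old reply
fresh = cache.get("/q1", now=8 * DAY)    # TTL expired: fetched again
print(first, stale, fresh)               # 1 1 2
```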
Cache Consistency Problem
If tables are created in the database, new queries that
refer to them cannot be in the cache yet, so there is
no problem
If tables are dropped, the cached results will be
wrong
BUT, if inserts or updates are made in existing
tables, cached data in Squids becomes stale!
Scenario: CREATE TABLE - OK
Cached query:
Select * from tab1, tab2 where …
Database change:
Create table tab3 (…);
New query:
Select * from tab1, tab3 where …
Query not cached, OK
Scenario: DROP TABLE - KO
Cached query:
Select * from tab1, tab2 where …
Database change:
Drop table tab1;
New query:
Select * from tab1, tab2 where …
If query is cached, KO: wrong result
Scenario: INSERT - KO
Cached query:
Select * from tab1, tab2 where …
Database change:
insert into tab1 values (…);
New query:
Select * from tab1, tab2 where …
If query is cached, KO: stale data
Scenario: UPDATE - KO
Cached query:
Select * from tab1, tab2 where …
Database change:
Update tab1 set … where …;
New query:
Select * from tab1, tab2 where …
If query is cached, KO: stale data
Scenario: new object OK
Cached query:
Select * from objs, attribs
where objs.ID = attribs.OBJ_ID and objs.ID = X
Database change:
insert into objs values (Y, …);
insert into attribs values (.., Y, …);
New query:
select * from objs, attribs
where objs.ID = attribs.OBJ_ID and objs.ID = Y
Query for object Y is not cached, OK
Queries on IDs of static objects: static cache is OK
Scenario: new attribute KO
Cached query:
Select * from objs, attribs
where objs.ID = attribs.OBJ_ID and objs.ID = X
Database change:
insert into attribs values (.., X, …);
New query:
select * from objs, attribs
where objs.ID = attribs.OBJ_ID and objs.ID = X
Query for object X might be cached, KO
Queries on IDs of non-static objects: static cache is KO
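The "new object" and "new attribute" scenarios can be reproduced with a small self-contained sketch. It uses an in-memory SQLite database instead of Oracle and a plain dictionary as a static cache; the column names (`val`) and the helper `cached_query` are assumptions made for illustration only.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    create table objs (ID text primary key);
    create table attribs (OBJ_ID text, val text);
    insert into objs values ('X');
    insert into attribs values ('X', 'a1');
""")

cache = {}  # query text -> rows; never invalidated (static cache)


def cached_query(sql):
    """Serve from the cache if the exact same query text was seen before."""
    if sql not in cache:
        cache[sql] = db.execute(sql).fetchall()
    return cache[sql]


QUERY = ("select objs.ID, attribs.val from objs, attribs "
         "where objs.ID = attribs.OBJ_ID and objs.ID = '{}' "
         "order by attribs.val")

cached_query(QUERY.format("X"))               # query for X is now cached

# New object Y: its query was never cached, so the result is correct.
db.execute("insert into objs values ('Y')")
db.execute("insert into attribs values ('Y', 'b1')")
new_object = cached_query(QUERY.format("Y"))

# New attribute for existing object X: the cached reply is now stale.
db.execute("insert into attribs values ('X', 'a2')")
from_cache = cached_query(QUERY.format("X"))
from_db = db.execute(QUERY.format("X")).fetchall()

print(new_object)   # [('Y', 'b1')]
print(from_cache)   # [('X', 'a1')]
print(from_db)      # [('X', 'a1'), ('X', 'a2')]
```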
Restrictions with static cache
Table drops can lead to wrong query results
Data updates can lead to wrong query results
Inserts need special care:
Queries on IDs of static objects are OK
Otherwise, KO
When inserting into “attribs”, a forced refresh is needed at
user application level for queries over “objs”
Will user applications respect these restrictions?
Present Status - problem
The POOL Frontier plug-in issues two types of
queries:
DB dictionary data (metadata) and user data
To avoid stale cached data, the plug-in does a
client-side cache refresh for metadata queries
Stale cached data may still appear in user data queries
Invalidation Mechanism
Build a cache content invalidation mechanism over
Squid/Frontier/OracleDB
A way to invalidate cached query results when the tables
they were read from are changed
Invalidation mechanism basic steps are:
Detect database changes
Detect which cache content is stale
Send invalidation messages to Squids
Purge cached content in Squids
Conclusions
Frontier alone does not guarantee data consistency
Applications must follow a set of rules to keep
data consistent (see “Restrictions with static cache”, slide 14)
An invalidation mechanism could be developed
Some ideas follow in the appendix
Appendix - Invalidation Steps
1. Database changes detection
2. Stale cached queries detection
3. Invalidation propagation to Squids
4. Purge cached content in Squids
1. Database changes detection
Options:
Database triggers
Data manipulation (DML) triggers can only be set up at table
level (not at database or schema level)
View ALL_TAB_MODIFICATIONS
This view is updated off-line, with up to 3 hours' delay between a table
update and its registration in ALL_TAB_MODIFICATIONS
Database auditing
AUDIT INSERT TABLE, UPDATE TABLE, DELETE TABLE
BY ACCESS
WHENEVER SUCCESSFUL;
Oracle LogMiner
More information available and less performance overhead than auditing
Not as simple as DB auditing and implies setup time overhead
1. Database changes detection
Database auditing
Simple to configure
Trigger over the table sys.aud$
Trigger fires a stored procedure to start the
invalidation procedure
2. Stale cached queries detection
How to find pages to invalidate in Squids given the name of a
modified table?
A mapping between tables and queries
Frontier servlet query strings could be modified to ease this mapping
Whenever the servlet receives a query, it must store the query
and the tables it references somewhere
When a table is modified, all queries that reference that table are
invalidated
Danger of invalidating objects that are still valid (over-invalidation)
Invalidation procedure can be tricky (invalidation rules)
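A minimal sketch of such a table-to-query mapping, which also shows the over-invalidation risk. `tables_in` and `QueryRegistry` are hypothetical names; the table extraction is a naive regular expression that only handles simple `select … from a, b where …` statements and is not a real SQL parser.

```python
import re
from collections import defaultdict


def tables_in(sql):
    """Naively extract table names from the FROM clause of a simple SELECT."""
    match = re.search(r"\bfrom\s+(.*?)(?:\bwhere\b|$)", sql, re.I | re.S)
    if not match:
        return set()
    return {part.strip().split()[0].lower()
            for part in match.group(1).split(",") if part.strip()}


class QueryRegistry:
    """Remembers which queries touched which tables."""

    def __init__(self):
        self.by_table = defaultdict(set)   # table name -> query strings

    def record(self, sql):
        for table in tables_in(sql):
            self.by_table[table].add(sql)

    def stale_queries(self, modified_table):
        return set(self.by_table.get(modified_table.lower(), ()))


registry = QueryRegistry()
q1 = "select * from objs, attribs where objs.ID = attribs.OBJ_ID and objs.ID = 1"
q2 = "select * from objs, attribs where objs.ID = attribs.OBJ_ID and objs.ID = 2"
q3 = "select * from tab1"
for q in (q1, q2, q3):
    registry.record(q)

# Suppose a row is inserted into attribs for object 2 only.
stale = registry.stale_queries("attribs")
print(sorted(stale))
```

Both `q1` and `q2` are reported as stale, although only `q2` is actually affected by the insert: table-level mapping invalidates more than strictly necessary.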
2. Stale cached queries detection
Logging queries, clients and tables affected
Two logging options:
Log module in Frontier servlet (as a servlet wrapper)
OR
Some script running over Apache logs
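The log-script option could look roughly like the sketch below. It assumes access-log lines in Common Log Format, the request URL layout shown in the earlier example, and plain URL-safe base64 in `p1`; all three are assumptions, and the sample log line is made up for illustration (it uses a documentation-only IP address).

```python
import base64
import re
from urllib.parse import urlsplit, parse_qs

# client address ... "GET <target> HTTP/x.y"
REQUEST = re.compile(r'^(\S+) .*?"GET (\S+) HTTP/[\d.]+"')


def parse_line(line):
    """Return (client, sql) for a logged Frontier request, or None."""
    match = REQUEST.match(line)
    if not match:
        return None
    client, target = match.groups()
    params = parse_qs(urlsplit(target).query)
    if "p1" not in params:
        return None
    # Assumption: p1 is URL-safe base64 of the SQL text.
    sql = base64.urlsafe_b64decode(params["p1"][0]).decode("utf-8")
    return client, sql


# Hypothetical log line, built here so that p1 is valid base64.
p1 = base64.urlsafe_b64encode(b"select * from tab1").decode("ascii")
line = (
    '192.0.2.10 - - [10/Jan/2006:10:00:00 +0100] '
    '"GET /Frontier/Frontier?type=frontier_request:1:DEFAULT'
    f'&encoding=BLOB&p1={p1} HTTP/1.0" 200 512'
)
print(parse_line(line))
```

The client address recovered this way is the host that contacted the server directly, which in a cache hierarchy is the top-level Squid rather than the end client.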
3. Invalidation propagation to Squids
Once we have a list of queries to invalidate, we
need to know:
Which caches requested the query?
Easy to register, except with hierarchical caches
Where are those caches?
Caches must be registered on the server
The cache hierarchy (topology) must also be registered
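With a registered topology, the set of caches to notify can be derived from the caches the server saw directly. The sketch below assumes the topology is stored as a parent-to-children mapping; `caches_to_notify` and the cache names are hypothetical.

```python
def caches_to_notify(requesters, children_of):
    """Return the requesting caches plus every cache below them.

    The server only sees the cache that contacted it directly, so any
    cache further down the hierarchy may also hold a copy.
    """
    pending = list(requesters)
    result = set()
    while pending:
        cache = pending.pop()
        if cache in result:
            continue
        result.add(cache)
        pending.extend(children_of.get(cache, ()))
    return result


# Hypothetical topology: one top-level Squid with two site Squids below it.
children_of = {
    "squid-top": ["squid-site-a", "squid-site-b"],
    "squid-site-a": ["squid-farm-a1"],
}
print(sorted(caches_to_notify(["squid-top"], children_of)))
```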
4. Purge cached content in Squids
Two options:
HTTP PURGE method
one object at a time
Squid purge tool
regular expressions for purging multiple objects with one
command
Performance tests could be done
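The first option, purging one object at a time over HTTP, can be sketched as follows. It assumes each Squid is configured to accept the PURGE method from the host running this code (Squid rejects PURGE unless its access rules allow it); `purge` and `purge_all` are hypothetical helpers, and no host names are taken from the slides.

```python
import http.client


def purge(squid_host, squid_port, url, timeout=5.0):
    """Ask one Squid to drop one cached URL; return the HTTP status code."""
    conn = http.client.HTTPConnection(squid_host, squid_port, timeout=timeout)
    try:
        # The full URL is sent as the request target, as for a proxy request.
        conn.request("PURGE", url)
        return conn.getresponse().status
    finally:
        conn.close()


def purge_all(squids, urls, send=purge):
    """Purge every URL from every Squid, one request per cached object.

    Returns a dict mapping (host, port, url) to the status code returned.
    """
    return {(host, port, url): send(host, port, url)
            for host, port in squids
            for url in urls}
```

The number of requests grows as caches times stale URLs, which is one reason the slide suggests performance tests before choosing between the two options.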
Questions