4. EXTERNAL INDEX - Information Builders
Download
Report
Transcript 4. EXTERNAL INDEX - Information Builders
XFOC Databases:
If You Build Them, They will Come
Noreen Redden
FOCWizard
June, 2008
Copyright 2007, Information Builders. Slide 1
Agenda
XFOC Databases
Sizing Considerations
Number of Instances / page
4 Times Bigger ?
Efficiencies
Multiple Partitions
Horizontal and Vertical
Intelligent Partitioning
MDI (Multi-Dimensional Index)
Why use it vs. regular External Index
Components
ACCESS File
Building/Adding to MDI
Copyright 2007, Information Builders. Slide 2
XFOC Databases
The Data
Warehouse
Copyright 2007, Information Builders. Slide 3
XFOC Databases – Sizing Considerations
General:
Pages contain instances of a single SEGMENT, specific
internal index, or values of text fields.
Page 1 contains ONLY the FDT
Page is 16K, with a maximum of 1K pages (maximum
size of partition is 16gig.
Segment instances contain:
Pointer (at beginning of page) to point to specific
instance on page
Pointer from instance to instance (chain, Parent to
child, Child to parent)
Data Storage
Copyright 2007, Information Builders. Slide 4
XFOC Databases – Sizing Considerations
Segment Instances – Data Storage
Alpha (An)
Variable Alpha (AnV)
Integer
Packed
Float Single Precision
Float Double Precision
Text
n bytes
n bytes + 2 for length
4 bytes
8 or 16 packed bytes
4
8
8 byte pointer on Seg
+ actual length on
text pages)
Copyright 2007, Information Builders. Slide 5
XFOC Databases – EX CALCFILE
Segment
Population
FDT
FOC 4K Pages
N/A
XFOC 16K
Pages
1
ORIGIN
1,000
5
2
COMP
5,000
46
13
CARREC
50,000
455
123
BODY
100,000
1,220
320
SPECS
100,000
2128
541
WARANT
40,000
488
128
EQUIP
75,000
915
240
COUNTRY
INDEX
1,000
9
3
Copyright 2007, Information Builders. Slide 6
XFOC Database - Efficiencies
For maximum efficiency, a chain of child instances should be
able to be accessed in a single I/O.
Use Unique Segments to isolate infrequently used fields.
Partition historical data to reduce I/O needed for UPDATE.
Build/rebuild internal indexes AFTER large updates/loads
Increase XFBINs to increase page re-utilization.
Use Partitions to reduce the need to REBUILD external
indexes/MDI’s
Copyright 2007, Information Builders. Slide 7
Vertical
Partitioning
Copyright 2007, Information Builders. Slide 8
Intelligent Partitioning
GGSALES
FIELDNAME
--------SEQ_NO
CATEGORY
PCD
PRODUCT
REGION
ST
CITY
STCD
DATE
UNITS
DOLLARS
BUDUNITS
BUDDOLLARS
ALIAS
----SEQ
E02
E03
E04
E05
E06
E07
E08
E09
E10
E11
E12
E13
FORMAT
-----I6
A11
A04
A16
A11
A02
A20
A05
I8YYMD
I08
I08
I08
I08
Copyright 2007, Information Builders. Slide 9
Intelligent Partitioning
USE
DYNAM ALLOC F GGSALES1 DA PMSNMR.GGSLS2K.FOCUS SHR REU
DYNAM ALLOC F GGSALES2 DA PMSNMR.GGSALES.FOCUS SHR REU
USE
-IF &MDATE.ENTER LOWEST DATE FOR REPORT. GT 19991231
-
GOTO NOT1;
GGSALES2 AS GGSALES
-NOT1
-IF &MDATE.ENTER HIGHEST DATE FOR REPORT. LT 20010101
-GOTO ENDIT
GGSALES1 AS GGSALES
-ENDIT
END
Copyright 2007, Information Builders. Slide 10
Intelligent Partitioning
ACCESS File Points to the FOCUS Files
FILENAME=GGSALES, SUFFIX=XFOC, ACCESS=GGSALES,$
SEGNAME=SALES01, SEGTYPE=S1
FIELD=SEQ_NO, ALIAS=SEQ, FORMAT=I6, TITLE='SEQUENCE#', $
FIELD=CATEGORY, ALIAS=E02, FORMAT=A11, INDEX=I, TITLE='Category', $
FIELD=PCD, ALIAS=E03, FORMAT=A04, INDEX=I, TITLE='Product ID', $
FIELD=PRODUCT, ALIAS=E04, FORMAT=A16, TITLE='Product',$
FIELD=REGION, ALIAS=E05, FORMAT=A11, INDEX=I, TITLE='Region', $
FIELD=ST, ALIAS=E06, FORMAT=A02, INDEX=I, TITLE='State', DESC='State',$
FIELD=CITY, ALIAS=E07, FORMAT=A20, TITLE='City', DESC='City',$
FIELD=STCD, ALIAS=E08, FORMAT=A05, INDEX=I, TITLE='Store ID', $
FIELD=DATE, ALIAS=E09, FORMAT=I8YYMD, TITLE='Date',
$
MASTER=GGSALES ,$
DATA = PMSNMR.GGSLS2K.FOCUS ,
WHERE = DATE FROM 20000101 TO 20091231,$
DATA = PMSNMR.GGSALES.FOCUS ,
WHERE = DATE FROM 19900101 TO 19991231,$
Copyright 2007, Information Builders. Slide 11
Tip with Partitions
MODIFY/MAINTAIN only use the first partition.
External indexes must be rebuilt if the file changes.
Keep 2 MFDs: GGSALES (with ACCESS) ; GGSALE2
MODIFY FILE GGSALE2
…
END
TABLE FILE GGSALE2
…
WHERE …
MORE
FILE GGSALES
END
Copyright 2007, Information Builders. Slide 12
XFOC Databases
Copyright 2007, Information Builders. Slide 13
MDI – Multi-Dimensional Index
Dimensions categorize data –
Sales may be categorized by REGION, BRANCH and
PRODUCT Type,
Products may be categorized by PRODUCT CATEGORY,
PRODUCT CODE, SIZE, COLOR
Employees may be categorized by Job code, Hire_date,
Branch, etc.
A multi-dimensional index uses dimensions to point to specific
facts.
MDI is a multi-field index to behave like a virtual “cube” of
values that intersect at measures of interest. The more
dimensions used, the better the retrieval performance.
MDIs may be used for Selection, JOIN, or reported from
directly.
Copyright 2007, Information Builders. Slide 14
MDI – Multi-Dimensional Index
Know Your Data, and queries to choose index dimensions:
Fields used frequently as filters (Multiple fields used as filters
within a single report/query should be in the same MDI.
Fields used as basis for vertical portioning
DEFINE fields
Fields with many unique values
Multiple MDIs may be created for separate subsets of data.
In a query, AUTOINDEX will choose the MDI to use based on
Segment most involved in the query
MDI with the most filtering conditions
Percent of index dimensions involved in the request from each MDI
Size of each MDI.
Copyright 2007, Information Builders. Slide 15
Intelligent Partitioning
GGSALES
FIELDNAME
--------SEQ_NO
CATEGORY
PCD
PRODUCT
REGION
ST
CITY
STCD
DATE
UNITS
DOLLARS
BUDUNITS
BUDDOLLARS
ALIAS
----SEQ
E02
E03
E04
E05
E06
E07
E08
E09
E10
E11
E12
E13
FORMAT
-----I6
A11
A04
A16
A11
A02
A20
A05
I8YYMD
I08
I08
I08
I08
Copyright 2007, Information Builders. Slide 16
ACCESS File Points to the MDIS
MASTER=GGSALES ,$
DATA = PMSNMR.GGSLS2K.FOCUS ,
WHERE = DATE FROM 20000101 TO 20091231,$
DATA = PMSNMR.GGSALES.FOCUS ,
WHERE = DATE FROM 19900101 TO 19991231,$
MDI=GGMDI, TARGET_OF=SALES01 ,$
DIM=CATEGORY ,MAXVALUE = 100,$
DIM=PCD
,MAXVALUE = 50, WITHIN=CATEGORY,$
DIM=REGION
,MAXVALUE = 20,$
DIM=ST
,MAXVALUE = 20, WITHIN=REGION ,$
DIM=CITY
,MAXVALUE = 20, WITHIN = ST,$
DIM=STCD
,MAXVALUE = 20, WITHIN=CITY ,$
DIM=DATE
,MAXVALUE = 240,$
MDIDATA= PMSNMR.GGMDI1.FOCUS ,$
MDIDATA= PMSNMR.GGMDI2.FOCUS ,$
Copyright 2007, Information Builders. Slide 17
ACCESS File Points to the MDIS
MDI=GGLOC, TARGET_OF=SALES01 ,$
DIM=REGION
,MAXVALUE = 20,$
DIM=ST
,MAXVALUE = 20, WITHIN=REGION ,$
DIM=CITY
,MAXVALUE = 20, WITHIN = ST,$
DIM=STCD
,MAXVALUE = 20, WITHIN=CITY ,$
DIM=DATE
,MAXVALUE = 240,$
MDIDATA= PMSNMR.GGLOC1.FOCUS ,$
MDIDATA= PMSNMR.GGLOC2.FOCUS ,$
Copyright 2007, Information Builders. Slide 18
Creating the MDI - REBUILD
Enter option
1. REBUILD
(Optimize the database structure)
2. REORG
(Alter the database structure)
3. INDEX
(Build/modify the database index)
4. EXTERNAL INDEX
(Build/modify an external index database)
5. CHECK
(Check the database structure)
6. TIMESTAMP
(Change the database timestamp)
7. DATE NEW
(Convert old date formats to smartdate formats)
8. MDINDEX
(Build/modify a multidimensional index)
MDINDEX
NEW / ADD ? NEW
ENTER MASTER FILE NAME
GGSALES
ENTER MD_INDEX LOCATION FILE NAME GGMDI
ANY RECORD SELECTION TESTS? (YES/NO) NO
NEW FILE PMSNMR.GGMDI1.FOCUS
ON 04/09/2008 AT
NEW FILE PMSNMR.GGMDI2.FOCUS
ON 04/09/2008 AT
Accumulating MDI Data ....
Copyright 2007, Information Builders. Slide 19
Adding a Partition - REBUILD
Enter option
1. REBUILD
(Optimize the database structure)
2. REORG
(Alter the database structure)
3. INDEX
(Build/modify the database index)
4. EXTERNAL INDEX
(Build/modify an external index database)
5. CHECK
(Check the database structure)
6. TIMESTAMP
(Change the database timestamp)
7. DATE NEW
(Convert old date formats to smartdate formats)
8. MDINDEX
(Build/modify a multidimensional index)
MDINDEX
NEW / ADD ? ADD
ENTER MASTER FILE NAME
GGSALES
ENTER MD_INDEX LOCATION FILE NAME GGMDI
ANY RECORD SELECTION TESTS? (YES/NO) NO
NEW FILE PMSNMR.GGMDI1.FOCUS
ON 04/09/2008 AT
NEW FILE PMSNMR.GGMDI2.FOCUS
ON 04/09/2008 AT
Accumulating MDI Data ....
Copyright 2007, Information Builders. Slide 20
Comparing
TABLE FILE GGSALES
PRINT * BY REGION BY ST
WHERE REGION EQ 'Northeast'
WHERE DATE EQ '19920501'
WHERE ST
EQ 'NY'
WHERE PRODUCT EQ 'Capuccino‘
END
270,000 RECORDS
AUTOINDEX
RECORDS
YES (REGION)
3
GGMDI
3
BASEIO
1682
41
FOCCPU
434
293
INDEXIO
0
71
Copyright 2007, Information Builders. Slide 21
MDI – Other Facilities
JOIN field1 [AND field2…] IN file1 TO ALL mdiname
IN file2 AS joinname
JOIN field1 [AND field2 …] IN file1 TO ALL dim1
[AND dim2 …] IN file2 AS joinname
Read Just the MDI
Must have MAXVALUES in the ACCESS.
Uses WITHIN in ACCESS to define hierarchies
SET MDIOENCODING = ON
TABLE FILE filename.mdiname
request
ON TABLE SET MDIENCODING ON
END
Copyright 2007, Information Builders. Slide 22
Working with New Structures:
Questions?
Copyright 2007, Information Builders. Slide 23