4. EXTERNAL INDEX - Information Builders

Download Report

Transcript 4. EXTERNAL INDEX - Information Builders

XFOC Databases:
If You Build Them, They will Come
Noreen Redden
FOCWizard
June, 2008
Copyright 2007, Information Builders. Slide 1
Agenda


XFOC Databases
 Sizing Considerations
 Number of Instances / page
 4 Times Bigger ?
 Efficiencies
 Multiple Partitions
 Horizontal and Vertical
 Intelligent Partitioning
MDI (Multi-Dimensional Index)
 Why use it vs. regular External Index
 Components
 ACCESS File
 Building/Adding to MDI
Copyright 2007, Information Builders. Slide 2
XFOC Databases
The Data
Warehouse
Copyright 2007, Information Builders. Slide 3
XFOC Databases – Sizing Considerations
 General:
 Pages contain instances of a single SEGMENT, specific


internal index, or values of text fields.
 Page 1 contains ONLY the FDT
Page is 16K, with a maximum of 1K pages (maximum
size of partition is 16gig.
Segment instances contain:
 Pointer (at beginning of page) to point to specific
instance on page
 Pointer from instance to instance (chain, Parent to
child, Child to parent)
 Data Storage
Copyright 2007, Information Builders. Slide 4
XFOC Databases – Sizing Considerations
 Segment Instances – Data Storage
 Alpha (An)
 Variable Alpha (AnV)
 Integer
 Packed
 Float Single Precision
 Float Double Precision
 Text
n bytes
n bytes + 2 for length
4 bytes
8 or 16 packed bytes
4
8
8 byte pointer on Seg
+ actual length on
text pages)
Copyright 2007, Information Builders. Slide 5
XFOC Databases – EX CALCFILE
Segment
Population
FDT
FOC 4K Pages
N/A
XFOC 16K
Pages
1
ORIGIN
1,000
5
2
COMP
5,000
46
13
CARREC
50,000
455
123
BODY
100,000
1,220
320
SPECS
100,000
2128
541
WARANT
40,000
488
128
EQUIP
75,000
915
240
COUNTRY
INDEX
1,000
9
3
Copyright 2007, Information Builders. Slide 6
XFOC Database - Efficiencies
 For maximum efficiency, a chain of child instances should be
able to be accessed in a single I/O.
 Use Unique Segments to isolate infrequently used fields.
 Partition historical data to reduce I/O needed for UPDATE.
 Build/rebuild internal indexes AFTER large updates/loads
 Increase XFBINs to increase page re-utilization.
 Use Partitions to reduce the need to REBUILD external
indexes/MDI’s
Copyright 2007, Information Builders. Slide 7
Vertical
Partitioning
Copyright 2007, Information Builders. Slide 8
Intelligent Partitioning
GGSALES
FIELDNAME
--------SEQ_NO
CATEGORY
PCD
PRODUCT
REGION
ST
CITY
STCD
DATE
UNITS
DOLLARS
BUDUNITS
BUDDOLLARS
ALIAS
----SEQ
E02
E03
E04
E05
E06
E07
E08
E09
E10
E11
E12
E13
FORMAT
-----I6
A11
A04
A16
A11
A02
A20
A05
I8YYMD
I08
I08
I08
I08
Copyright 2007, Information Builders. Slide 9
Intelligent Partitioning
USE
DYNAM ALLOC F GGSALES1 DA PMSNMR.GGSLS2K.FOCUS SHR REU
DYNAM ALLOC F GGSALES2 DA PMSNMR.GGSALES.FOCUS SHR REU
USE
-IF &MDATE.ENTER LOWEST DATE FOR REPORT. GT 19991231
-
GOTO NOT1;
GGSALES2 AS GGSALES
-NOT1
-IF &MDATE.ENTER HIGHEST DATE FOR REPORT. LT 20010101
-GOTO ENDIT
GGSALES1 AS GGSALES
-ENDIT
END
Copyright 2007, Information Builders. Slide 10
Intelligent Partitioning
ACCESS File Points to the FOCUS Files
FILENAME=GGSALES, SUFFIX=XFOC, ACCESS=GGSALES,$
SEGNAME=SALES01, SEGTYPE=S1
FIELD=SEQ_NO, ALIAS=SEQ, FORMAT=I6, TITLE='SEQUENCE#', $
FIELD=CATEGORY, ALIAS=E02, FORMAT=A11, INDEX=I, TITLE='Category', $
FIELD=PCD, ALIAS=E03, FORMAT=A04, INDEX=I, TITLE='Product ID', $
FIELD=PRODUCT, ALIAS=E04, FORMAT=A16, TITLE='Product',$
FIELD=REGION, ALIAS=E05, FORMAT=A11, INDEX=I, TITLE='Region', $
FIELD=ST, ALIAS=E06, FORMAT=A02, INDEX=I, TITLE='State', DESC='State',$
FIELD=CITY, ALIAS=E07, FORMAT=A20, TITLE='City', DESC='City',$
FIELD=STCD, ALIAS=E08, FORMAT=A05, INDEX=I, TITLE='Store ID', $
FIELD=DATE, ALIAS=E09, FORMAT=I8YYMD, TITLE='Date',
$
MASTER=GGSALES ,$
DATA = PMSNMR.GGSLS2K.FOCUS ,
WHERE = DATE FROM 20000101 TO 20091231,$
DATA = PMSNMR.GGSALES.FOCUS ,
WHERE = DATE FROM 19900101 TO 19991231,$
Copyright 2007, Information Builders. Slide 11
Tip with Partitions
 MODIFY/MAINTAIN only use the first partition.
 External indexes must be rebuilt if the file changes.
 Keep 2 MFDs: GGSALES (with ACCESS) ; GGSALE2
MODIFY FILE GGSALE2
…
END
TABLE FILE GGSALE2
…
WHERE …
MORE
FILE GGSALES
END
Copyright 2007, Information Builders. Slide 12
XFOC Databases
Copyright 2007, Information Builders. Slide 13
MDI – Multi-Dimensional Index
 Dimensions categorize data –
 Sales may be categorized by REGION, BRANCH and
PRODUCT Type,
 Products may be categorized by PRODUCT CATEGORY,
PRODUCT CODE, SIZE, COLOR
 Employees may be categorized by Job code, Hire_date,
Branch, etc.
 A multi-dimensional index uses dimensions to point to specific
facts.
 MDI is a multi-field index to behave like a virtual “cube” of
values that intersect at measures of interest. The more
dimensions used, the better the retrieval performance.
 MDIs may be used for Selection, JOIN, or reported from
directly.
Copyright 2007, Information Builders. Slide 14
MDI – Multi-Dimensional Index
 Know Your Data, and queries to choose index dimensions:
 Fields used frequently as filters (Multiple fields used as filters
within a single report/query should be in the same MDI.
 Fields used as basis for vertical portioning
 DEFINE fields
 Fields with many unique values
 Multiple MDIs may be created for separate subsets of data.
 In a query, AUTOINDEX will choose the MDI to use based on
 Segment most involved in the query
 MDI with the most filtering conditions
 Percent of index dimensions involved in the request from each MDI
 Size of each MDI.
Copyright 2007, Information Builders. Slide 15
Intelligent Partitioning
GGSALES
FIELDNAME
--------SEQ_NO
CATEGORY
PCD
PRODUCT
REGION
ST
CITY
STCD
DATE
UNITS
DOLLARS
BUDUNITS
BUDDOLLARS
ALIAS
----SEQ
E02
E03
E04
E05
E06
E07
E08
E09
E10
E11
E12
E13
FORMAT
-----I6
A11
A04
A16
A11
A02
A20
A05
I8YYMD
I08
I08
I08
I08
Copyright 2007, Information Builders. Slide 16
ACCESS File Points to the MDIS
MASTER=GGSALES ,$
DATA = PMSNMR.GGSLS2K.FOCUS ,
WHERE = DATE FROM 20000101 TO 20091231,$
DATA = PMSNMR.GGSALES.FOCUS ,
WHERE = DATE FROM 19900101 TO 19991231,$
MDI=GGMDI, TARGET_OF=SALES01 ,$
DIM=CATEGORY ,MAXVALUE = 100,$
DIM=PCD
,MAXVALUE = 50, WITHIN=CATEGORY,$
DIM=REGION
,MAXVALUE = 20,$
DIM=ST
,MAXVALUE = 20, WITHIN=REGION ,$
DIM=CITY
,MAXVALUE = 20, WITHIN = ST,$
DIM=STCD
,MAXVALUE = 20, WITHIN=CITY ,$
DIM=DATE
,MAXVALUE = 240,$
MDIDATA= PMSNMR.GGMDI1.FOCUS ,$
MDIDATA= PMSNMR.GGMDI2.FOCUS ,$
Copyright 2007, Information Builders. Slide 17
ACCESS File Points to the MDIS
MDI=GGLOC, TARGET_OF=SALES01 ,$
DIM=REGION
,MAXVALUE = 20,$
DIM=ST
,MAXVALUE = 20, WITHIN=REGION ,$
DIM=CITY
,MAXVALUE = 20, WITHIN = ST,$
DIM=STCD
,MAXVALUE = 20, WITHIN=CITY ,$
DIM=DATE
,MAXVALUE = 240,$
MDIDATA= PMSNMR.GGLOC1.FOCUS ,$
MDIDATA= PMSNMR.GGLOC2.FOCUS ,$
Copyright 2007, Information Builders. Slide 18
Creating the MDI - REBUILD
Enter option
1. REBUILD
(Optimize the database structure)
2. REORG
(Alter the database structure)
3. INDEX
(Build/modify the database index)
4. EXTERNAL INDEX
(Build/modify an external index database)
5. CHECK
(Check the database structure)
6. TIMESTAMP
(Change the database timestamp)
7. DATE NEW
(Convert old date formats to smartdate formats)
8. MDINDEX
(Build/modify a multidimensional index)
MDINDEX
NEW / ADD ? NEW
ENTER MASTER FILE NAME
GGSALES
ENTER MD_INDEX LOCATION FILE NAME GGMDI
ANY RECORD SELECTION TESTS? (YES/NO) NO
NEW FILE PMSNMR.GGMDI1.FOCUS
ON 04/09/2008 AT
NEW FILE PMSNMR.GGMDI2.FOCUS
ON 04/09/2008 AT
Accumulating MDI Data ....
Copyright 2007, Information Builders. Slide 19
Adding a Partition - REBUILD
Enter option
1. REBUILD
(Optimize the database structure)
2. REORG
(Alter the database structure)
3. INDEX
(Build/modify the database index)
4. EXTERNAL INDEX
(Build/modify an external index database)
5. CHECK
(Check the database structure)
6. TIMESTAMP
(Change the database timestamp)
7. DATE NEW
(Convert old date formats to smartdate formats)
8. MDINDEX
(Build/modify a multidimensional index)
MDINDEX
NEW / ADD ? ADD
ENTER MASTER FILE NAME
GGSALES
ENTER MD_INDEX LOCATION FILE NAME GGMDI
ANY RECORD SELECTION TESTS? (YES/NO) NO
NEW FILE PMSNMR.GGMDI1.FOCUS
ON 04/09/2008 AT
NEW FILE PMSNMR.GGMDI2.FOCUS
ON 04/09/2008 AT
Accumulating MDI Data ....
Copyright 2007, Information Builders. Slide 20
Comparing
TABLE FILE GGSALES
PRINT * BY REGION BY ST
WHERE REGION EQ 'Northeast'
WHERE DATE EQ '19920501'
WHERE ST
EQ 'NY'
WHERE PRODUCT EQ 'Capuccino‘
END
270,000 RECORDS
AUTOINDEX
RECORDS
YES (REGION)
3
GGMDI
3
BASEIO
1682
41
FOCCPU
434
293
INDEXIO
0
71
Copyright 2007, Information Builders. Slide 21
MDI – Other Facilities
 JOIN field1 [AND field2…] IN file1 TO ALL mdiname
IN file2 AS joinname
 JOIN field1 [AND field2 …] IN file1 TO ALL dim1
[AND dim2 …] IN file2 AS joinname
 Read Just the MDI
 Must have MAXVALUES in the ACCESS.
 Uses WITHIN in ACCESS to define hierarchies
 SET MDIOENCODING = ON
 TABLE FILE filename.mdiname
request
ON TABLE SET MDIENCODING ON
END
Copyright 2007, Information Builders. Slide 22
Working with New Structures:
Questions?
Copyright 2007, Information Builders. Slide 23