Data Administration

Download Report

Transcript Data Administration

Data Administration
Data Warehouse Implementation
9/25/01
DW Current Resources
Server - IBM RS6000 S7A with 4 engines, 4GB memory,
200GB disk space, running AIX 4.3.0.
Data – 100GB disk space used to house FRS, FES, HRS and
SIS Oracle extracts. This is essentially a “data dump” of the
IDMS data, and “views” still need to be created.
Staff –


DBA - Henry 80%, Song 20%, Short 10%, Chen 10%
DA - Kisil 70%, Cheesman 50%, Claunch 50%, Carter 100%
DW Tasks

DBA – Extract data, design Oracle DB, load data, and
production support (i.e. monitor system and DB performance,
enforce security, schedule backups, etc.)

Data Administration – User interface, develop requirements
document for all DW projects, evaluate data quality, create
DB views, develop specialized reports, test, train users, and
coordinate projects.

Both – Infrastructure design (with Systems staff), and tool
evaluation (ETL, OLAP and desktop reporting) with help from
the C/S group.
Where are we now?
Data Inventory
Element names, formats, and
definitions
FRS, FES, HRS and SIS
SAS Data Mining Software
Source Data
 IDMS
 Sybase
Data Staging Area
 Oracle
 Extract of FRS, FES,
HRS, and SIS
 No cleansing
Software purchased
Planning implementation
Where do we want to be in three
years?
Ad Hoc and
Operational
Reports
Ad Hoc Query Repository





Copy of Source Data
Operational
Daily Updates
All Elements
Minimum Number of Years of
Data
Tell me what
happened and
why?
Tell me what
happened?
OLAP Server
Operational Data
IDMS
Oracle
Sybase
Tell me what may
happen, or what is
interesting?
Metadata
SAS Data Mining
Server
Data Staging Area




External Data
Census Data, Benchmark,
Salary Surveys, Economic
Data
Extract Data
Transform Data
Quality Assurance
Create Metadata
Data Mart #1
Data Warehouse
 Cleansed
 Subset of Detail Data
 Subset of Summary
Data
 Multiple Years of Data
 Periodic Updates
 Strategic
Course Management
 Subset of DW
 Summarized in specific
manner
 Tactical
Data Mart #2
Resource Management
Data Mart #?
Give me information
to help me achieve
specific goals!
Components of a Decision Support System
What is a
Decision Support System
EIS
High Level Summarized Data
For Top Executives
(“Pre-programmed DASHBOARD”)
Data Mart
Data Warehouse
Operational Data Store
Addresses Specific
Subject Area
Collection Of Integrated Subject
Oriented Databases
(Historical)
Time-Current, Integrated
Databases
(Tactical-Power Users)
Covansys
How do we get there?
Educate users.
Develop detailed requirements documents,
including “Information Value Chains” for all
goals which the DW/DM is expected to
address.
Data Mart approach.
Phased implementation.
Additional resources.
Educate Users
Basics – “What is a Data Warehouse?” Create a
“single-source-of-truth.” “What it’s not!” (All the data,
with daily updates and online access.)
Change in culture – “Let’s make better decisions
based on objective analysis of data.”
Set realistic expectations - No silver bullet. It can help
you make better decisions, but you still have to be
responsible for implementing those decisions.
Focus on institutional goals – “What is it we need to
achieve? What metrics do we need to evaluate our
progress in attaining goals?”
Importance of business sponsors – Make timely
business decisions and support IT requests for additional
funds.
Course Management (I.V.C.)
Business Functions and Goals

Optimize course offerings to meet student need.
Improvement Opportunities




Increase number of high demand courses/sections
Increase maximum enrollment in sections
Eliminate or reduce frequency of low demand courses
Improve course meeting patterns and delivery mode
Performance Measures
 # and % decrease of students who do not get
any section of the course requested
 # and % decrease of low demand courses
 # and % increase in enrollment
 % usage of classroom capacity
 % decrease in length of time to graduate
 # and % increase in courses taught through
preferred mode
Business Questions






What are the characteristics of high/low demand courses?
What characteristics of the student are related to demand?
What courses can be eliminated?
Which courses should/can be moved to smaller/larger facilities?
What impact does the meeting time and location have on demand?
What improvements can be made with/without additional money?
Data Model
College Budgets
Degree Reqs.
Student
Defines
Facilities
Course Demand
Courses
Available Faculty
Enrollment
Economic Data
(American Management Systems, Inc.)
Data Mart/
Warehouse
Enrollment Management (I.V.C.)
Business Functions and Goals

Increase student enrollment
Improvement Opportunities
 Increase student retention
 Increase number of new students
 Increase number of transfers
Performance Measures
 # and % increase of returning, new and
transfer students by term
 # and % increase of applicants to college
eligible by KY county
 # and % increase of admits to applicants
 # and % increase of enrollments to admits
 Ratios and % increase for FT/PT, Res/Non-Resident
Business Questions





What are the characteristics of students who do/do not return?
What are the characteristics of students who apply, but do not enroll?
What are the characteristics of students who do/do not transfer?
What policies or practices could affect these student outcomes?
What are the external factors affecting our enrollment?
Data Model
KY Population Data
State Aid Programs Data
Financial Aid
Applicant
Student
Defines
Economic Data
Course Demand
(American Management Systems, Inc.)
Courses
Enrollment
Data Mart/
Warehouse
DW Subject Areas
Course Management DM – Subject Areas
Accounts (dollars for faculty and supplies)
Assets (facilities and equipment)
Benchmark Data (what and when courses
offered, degree requirements)
Courses (meeting pattern, teaching mode,
requirements)
Department/College (who owns the course,
degree offered and requirements)
Economic Data (salary and demand by
profession)
Faculty (distribution of effort, availability to teach)
Faculty Applicants (who and how many)
Population Data (who, where and what volume)
Staff (instructional support)
Students (classification, course demand)
Enrollment Management DM – Subject Areas
Accounts (dollars for services, financial aid)
Assets (facilities and equipment, housing)
Benchmark Data (enrollment patterns)
Courses (meeting pattern, teaching mode,
requirements, professor)
Department/College (who owns the course,
degree requirements, degrees offered)
Economic Data (general condition of economy,
hot professions)
Faculty (quality, student evaluations)
Population Data (who, where and what volume)
Staff (instructional support, student service area)
Students (demographics, course demand, GPA)
Student Prospects (demographics, contacts)
Student Applicants (demographics, contacts)
Project Schedule
Phase I, 2001-02
Infrastructure & Planning
Phase II, 2002-03
Creation of Additional Data Marts
Phase III, 2003-04
Creation of Additional Data Marts
Phase I – Infrastructure and Planning
(2001-2002)
1)
2)
3)
IDMS Data Dump to Oracle – 100% complete.
Create Views for “Data Dump” (Ad Hoc
Reporting Repository) - Target completion 3Q 01-02.
WebFOCUS Implementation – Target completion
01/01/02.
4)
Data Mining Tools for IR staff – Purchased, and
planning installation. Target installation completion 12/1/01.
5)
Establish Enterprise Standards for Key Data –
Analysis and recommendations are ongoing.
6)
Identify and Prioritize Data Mart Development
– Ongoing work with Data Stewards.
Phase I – Infrastructure and Planning
(2001-2002) (Continued)
7)
Select Desktop Reporting Tools – Preliminary
evaluation beginning by DA, DBA and C/S staff. Selected
tool(s) will be the IT standard and supported by our staff.
Target completion 1/1/02. Choices:





8)
Crystal Reports
Infomaker
SAS
WebFOCUS
Cognos
Select ETL Tools – Preliminary comparison, and funds
will be requested for 02-03. Choices:





Ascential
IBM
Informatica
Oracle
SAS
Phase I – Infrastructure and Planning
(2001-2002) (Continued)
Develop “basic” Data Marts and
Corresponding DW Functionality –
9)
•
•
•
•
CPE – Six years of data loaded in Oracle; SAS
testing underway. Target completion 11/1/01.
GASB – Requirements phase. Target project
completion 2/1/02.
Course Management DM – Requirements
phase. Target completion 4/1/02.
Enrollment Management DM – Preliminary
meetings. Data will be added incrementally,
with target project completion 1/1/03.
Phase II – Creation of Additional Data
Marts (2002-2003)
1)
2)
3)
4)
5)
6)
ETL Tools – Purchase and install by 11/1/02.
Create Metadata – Will need to be created for the existing
Data Marts, as well as the new. Target completion for
existing DM 2/1/03.
Create Additional Data Marts – Complete Enrollment DM
by 7/1/03, and start/complete two or more of the following:
Department Mgt., Diversity & Equity Initiatives, Grant
Management, Research Management and Resource
Management (accounts, personnel, etc.).
Evaluate OLAP Tools – Choices could include Cognos and
Essbase. Select tool by 1/1/03, and request funds for 0304.
Evaluate External Data Needs - Identify external data
needed for the Data Marts, locate source(s), estimate cost
and request funds for 03-04. Target completion 3/1/03.
Begin Preliminary Development of EIS -Executive
Information System.
Phase III – Creation of Additional Data
Marts (2003-2004)
1) OLAP Tools – Purchase and install by 11/1/03.
2) Create Additional Data Marts – Start/complete two
or more of the following: Department Mgt., Diversity
& Equity Initiatives, Grant Management, Research
Management and Resource Management.
3) Purchase and Load External Data - Target
completion 7/1/04.
4) Continue Development of EIS - Executive
Information System.
New DW Resources
Server (Existing) – $84,300 Additional 4 engines and 4GB
memory for Data Warehouse Server.
Servers (New) –Additional servers will need to be purchased for
the Ad Hoc Reporting Repository, the Data Staging Area, the
Data Mining and/or OLAP tools, and for one or more of the Data
Marts. Size, type and cost will vary depending on function. Goal
is for remaining servers to be Intel-based to minimize cost. Cost
unknown.
Storage – Additional disk space for DW server, and for other
servers using Enterprise Storage System. Exact amounts for
online storage and for archiving will be identified during the
project requirements phase. Cost unknown.
New Resources (Cont.)
Training – $21,200
 Infrastructure – “How do you design a DW environment?
 Staff Training (TDWI Seminar $7,600 for 2 FTE)
 Oracle DB Training ($7,200 for 4 FTE)
 Technical Training – Analysis and reporting tools. ($6,400 for 4
FTE)
 Onsite visits to other universities.
Consulting on Infrastructure Design – Covansys,
Gavroshe, Oracle, IBM, etc. Cost unknown.
Staff – $140,000 for 2 FTE


DBA – 1 FTE (Totally devoted to DW development.)
Reporting – 1 FTE (Totally devoted to development of the EIS
and/or institutional reports from DW environment.)
DW Critical Issues
Personnel Resources – Need to keep people focused on DW in
light of competing projects (i.e., Sybase/Omnis Conversion, WWTE, IPP,
IDMS maintenance.)
Training – IT and users.
Consulting – Assistance for infrastructure planning. This is new to us!
Culture Change – Focus on goals rather than data elements. Tell us
what, not how! Value in making decision based on data. Proof of
concept.
Requirements Gathering – Needs to be thorough, and heavily relies
on timely user decisions. Defining who the “users” are on any given
project could be political.
Budget/Funding – For additional hardware, software and people.
Business Sponsor – The Data Warehouse is not another IT
project. It is an enterprise initiative!
Data Administration
QUESTIONS?