Transcript Document

Introduction to
the Institutional Research Database
(IRDB) and Discoverer
1
OFFICE OF INSTITUTIONAL RESEARCH AND
ASSESSMENT
APRIL 27, 2011
Today’s Agenda
 Introductions
2
 Housekeeping
 Data Sources
 IRDB Structures
Tables/Fields
 Facts/Dimensions
 Discoverer
 Folders/Items
 Rows, Columns, Page Items
 Calculations
 Documentation

How are data moved from operational systems into a data
warehouse?
3
 Step 1. Snapshots are extracted from operational systems.
 Step 2. Extracted files are reformatted and cleaned.
 Step 3. Pre-processed files are loaded into staging tables and




metadata are loaded into lookup tables in an Oracle relational
database.
Step 4. Data in the staging tables are migrated to normalized tables.
Step 5. Summary tables and other high performance query
structures are created from the normalized tables and lookup tables.
Step 6. Semester-based fact and dimension tables are created from
the normalized tables and lookup tables.
Step 7. Longitudinal fact tables are created from the semesterbased fact and dimension tables.
Steps 1–3
Why are snapshots used to populate the IRDB?
4
Extraction
Transformation
ShowRegistration
File
COBOL
SPSS
Performance
File
SIMS
COBOL
Graduation
File
SKAT
Skills
Tests
Results
NCS
Pearsons
Post
Graduate
Surveys
Academic
Program
Inventory
Clean
Show-Reg
File
Database Link
SPSS
Clean
Performance
File
Clean
Graduation
File
Load
SQL*Loader
SHOW_FILE
SQL*Loader
PERF_FILE
SQL*Loader
GRAD_FILE
SKAT_FILE_02
SQL*Loader
SPSS
Survey
data with
SSN’s
Database View
SQL*Loader
VTEA_SURVEY_FILE_02
NYSED_PPROGRAM_LOOKUP
CAS
(freshman
admissions)
CUNY IRDB
Data Flow Diagram
Longitudinal
Cohorts
(denormalized
student-level
data)
ASTA
(transfer
admissions)
SKAT
(skills tests)
Migrate Data
into Oracle9i
Environment
(SQL*Loader)
SHOW
(enrollment)
Standardized
Files
PERF
(grades)
GRAD
(degrees)
Staging
Tables
Normalize
Data
(PL/SQL)
Operational
Data Store
(normalized
student-level
data)
PC Files
Migrate Data
into Oracle 9i
Environment
(SQL*Loader)
Oracle
Discoverer
Crosstabs
Extract Files
Ad-Hoc
Queries
Institutional
Researchers
Joins from
Multiple Tables
across
Multiple Terms
Oracle
Discoverer
Tables
Create Fact
and Dimension
Tables
(SQL)
Reformat
and Clean
Input Files
(SPSS)
SPSS
for
Windows
Data
Warehhouse
(denormalized
student-level
data)
Oracle
Discoverer
Crosstabs
Spread
sheets
Ad-Hoc
Queries
University
Administrators
Group by
Selected
Columns
(SQL)
Lookup
Tables
(metadata)
NCES
(job survey)
SFA
(financial aid)
Clearinghouse
(transfers to
non-CUNY
colleges)
Type or
Cut and Paste
Code Descriptions
from File Layouts
Summary
Tables
(denormalized
aggregate-level
data)
Oracle
Forms
Crystal Reports
and
Oracle Portal
CUNY Data Book
on Institutional
Research
Web Site
Special Reports
Public Users
Flash Enrollment
5
6
What are fact and dimension tables and how are they
related?
7
 A fact table is composed of numerical measures of business
performance. Examples of facts would be headcount, FTE’s,
and cumulative credits earned.
 Dimension tables contain items that describe or categorize the
items in the fact table. Examples of dimensions would be
gender, full-time/part-time status, and college of attendance.
 The fact table also contains foreign keys that can be used to
join it with the primary keys of the dimension tables. For
example, “Student ID”, “Term Enrolled Date”, and “College
ID” are used to join the table “History Facts” with the table
“History Major 1 Dim”.
 A central fact table with multiple dimension tables radiating
out from it is called a star schema.
What are the advantages of using a star schema?
8
 Creates a database design that improves performance.
 Parallels, in the database design, how the end users
usually think and use the data.
 Provides versatile and robust ad-hoc query capabilities.
 Provides an extensible design which supports changing
business requirements.
 Can be used with point-and-click tools such as Oracle
Discoverer 9iAs.
History Facts and Dimensions
9
The Joins between the Fact Table “History Facts” and its Dimension Tables
Are Defined by an OIRA Administrator in the Discoverer End-User Layer
10
How is a campus limited to viewing only the data of
its own students?
11
IRASI
Institutional
Research
Staten Island
IRASI.HISTORY_FACTS
# student_id
# term_enrolled_date
# college_id
IR.HISTORY_FACTS
# student_id
# term_enrolled_date
# college_id
IR.SEC_COLLEGE_07_MV
# sec_student_id
IR.USERID_LOOKUP
# userid
# college_id
# table_name
# table_grant
Users Select “Items” from a “Folder”
with a Mouse Rather than Writing and Executing SQL Code
12
Discoverer
13
 IRDB End-User Query Tool
 Currently accessed via Citrix
 Requires user id/password – domain log in (managed by CIS)
 Discoverer (account required – managed by OIRA)
 Set of Business Areas (linked fact and dimension tables)
 History Facts – Historical Enrollment Records
 Degree Facts - Historical Degree Records (through most recent complete
academic year)

Cohort Facts – Integration of Enrollment and Degree data in a
longitudinal structure for tracking cohorts over time

Special Business Area - mostly stand-alone tables for specific
analyses (e.g., PMP)
Users Arrange the Items as the Page-Breaks, Columns, and Rows
for a Desired Report
14
Accessing the IRDB Through Discoverer
15
 Navigate your web browser to https://ez.cuny.edu
 Log in with your LAN user id and password
 Click on the Discoverer icon in the list of available
applications via Citrix
 Install Java code as prompted upon first use of a given
computer (you may need an IT technician to install
programs on your computer)
 After Java installation, you will be prompted to log in to
Discoverer (user id and initial password established by
OIRA)
 Documentation available
Creating a New Workbook as a Crosstabs Report
with Discoverer 9iAS
16
The Derived Fact “Headcount” Reflects the Business Rules for
Excluding Some Students from Official Enrollment Statistics
17
Creating a Layout for the Crosstabs Report
18
Discoverer Estimates the Time Needed to Run a Query
19
A Crosstab Built from “History Facts” and
Two Related Dimension Tables
20
Creating Totals and Subtotals
21
Creating a New Workbook as a Table Report (or Extract)
using Discoverer
22
Creating a Layout for the Table Report
23
An Example of an Implicit Condition
24
Sorting the Rows Retrieved
25
A Table Report of Fall 2002 Graduates with the Original Dimension “Birth Date”,
the Computed Fact “Age”, and the Computed Dimension “Age Group 1”
26
With Discoverer, Table Reports Can Be Exported in a Variety
of Formats
27
Tracking Student Progress:
How Should Many-to-Many Relationships
between Fact Tables be Resolved?
28
History Facts
Degree Facts
Answer:
Create an Intersection Entity that Has Many-toOne Relationships with both Tables
29
History Facts
Intersection
Entity
Degree Facts
30
Selecting Three Different “Headcount” Facts
from the Table “Cohort Facts”
31
“Headcount” of Undergraduates Who Entered in Fall 1990
“Headcount” of Fall 1990 Entrants Who Returned in Fall 1991
“Headcount” of Fall 1990 Entrants Who Graduated by Summer 1996
32
Fall-to-Fall Retention of Fall 1990 Undergraduate Entrants
33
The “Headcount” Facts in the Table Cohort Facts” and the Foreign Keys that Join
it with the Table “Degree Facts” Can Be Used to Create a Graduation Rate Item
34
Six- Year Graduation Rates of Fall 1990 Undergraduate Entrants
35