rwebdb 27-August-2010
Many things to mention since the last post. Here’s a summary.
- Upgraded my workstation/server so it now uses Ubuntu 10.04 64-bit server. It’s kind of freaky without a graphical interface, but it’s growing on me.
- Upgraded Zorba, the XQuery processor, to version 1.4. This requires a build and an install, so it’s more involved than a binary download.
- Revised the procedures used to move IPEDS variables to a staging area and then into the warehouse. This involves several things:
- Decided to include all institutions for all years so as to allow greater flexibility in selecting a warehouse population as the project matures.
- Added a step prior to the move to staging. In this new step, IPEDS data headed for staging and the warehouse is written to temporary data files for processing. This greatly reduces file sizes and improves XQuery performance.
- Generalized the move to staging and the warehouse so that variable-specific XQuery programs no longer need to be written.
- Made the data verification process more robust.
- Improved the logic in the XML file used to translate IPEDS variables into warehouse variables.
- Added one new warehouse variable for level of institution (e.g., 2-year, 4-year). The warehouse now contains three variables: institution ID, control, and level.
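The temporary-file step described above can be illustrated with a minimal sketch. It is written in Python rather than XQuery, and it is not the project's actual code: the XML layout (`<ipeds>` wrapping `<row>` elements), the sample values, and the function name `extract_variables` are all assumptions made for illustration. The idea is simply that copying only the variables bound for staging yields a much smaller file for later queries to scan.

```python
import xml.etree.ElementTree as ET


def extract_variables(source_xml, keep):
    """Return a slimmer XML string holding only the `keep` children of each row."""
    root = ET.fromstring(source_xml)
    # Preserve the root tag and its attributes (e.g. the survey year).
    slim_root = ET.Element(root.tag, dict(root.attrib))
    for row in root:
        slim_row = ET.SubElement(slim_root, row.tag)
        for name in keep:
            child = row.find(name)
            if child is not None:
                ET.SubElement(slim_row, name).text = child.text
    return ET.tostring(slim_root, encoding="unicode")


# Hypothetical sample data; the element layout is an assumption.
SAMPLE = """<ipeds year="2008">
  <row><UNITID>100654</UNITID><INSTNM>Example A</INSTNM><CONTROL>1</CONTROL><ICLEVEL>1</ICLEVEL></row>
  <row><UNITID>100663</UNITID><INSTNM>Example B</INSTNM><CONTROL>2</CONTROL><ICLEVEL>2</ICLEVEL></row>
</ipeds>"""

slim = extract_variables(SAMPLE, ["UNITID", "CONTROL"])
print(slim)
```

With real IPEDS files carrying many variables per institution, the reduction would be far larger than in this two-row sample.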
Next steps involve adding a few more classificatory variables into the warehouse. Some of these will be recoded IPEDS variables. Others will be variables constructed to define populations of institutions for later use when we get to the web application development stage.
New
1. Programs used to select variables headed for staging and the warehouse.
dyn_create_stgPop_all_1year.xq
create_stgPop_1year.xq
2. Programs used to move IPEDS variables to staging.
create_stgPop.xq
create_stg2whPop.xq
3. Programs used to verify data moved to staging and the warehouse.
freq_1var_stgPop_by_year.xq
freq_1var_stg2whPop_by_year.xq
freq_1var_whPop_by_year.xq
4. Metadata XML file for variables added to the warehouse.
rVariables.xml
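
As a rough illustration of what a one-variable frequency check by year might look like, here is a small Python sketch. It is not a translation of the freq_1var_* programs: the record layout, the sample values, and the rule that per-year row totals should agree between stages are assumptions made for this example.

```python
from collections import Counter


def freq_by_year(records, variable):
    """Count each value of `variable` separately for every year."""
    table = {}
    for rec in records:
        table.setdefault(rec["year"], Counter())[rec[variable]] += 1
    return table


def totals_by_year(table):
    """Collapse a frequency table to the number of rows per year."""
    return {year: sum(counts.values()) for year, counts in table.items()}


# Hypothetical staging rows (raw codes) and warehouse rows (recoded labels).
staging = [
    {"year": 2008, "CONTROL": "1"},
    {"year": 2008, "CONTROL": "2"},
    {"year": 2009, "CONTROL": "1"},
]
warehouse = [
    {"year": 2008, "control": "public"},
    {"year": 2008, "control": "private"},
    {"year": 2009, "control": "public"},
]

stg_freq = freq_by_year(staging, "CONTROL")
wh_freq = freq_by_year(warehouse, "control")

# Assumed check: no rows were lost or duplicated in the move.
rows_preserved = totals_by_year(stg_freq) == totals_by_year(wh_freq)
print(stg_freq, wh_freq, rows_preserved)
```

Comparing totals catches dropped or duplicated rows. Comparing the individual value counts as well would also catch values that were recoded incorrectly.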
Revised
1. XML file used to recode IPEDS variables as warehouse variables.
recodes.xml
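
To make the role of a recode file concrete, here is a hedged Python sketch of a lookup driven by an XML mapping. The actual structure of recodes.xml is not shown in this post, so the element and attribute names (`variable`, `map`, `source`, `target`, `from`, `to`), the code-to-label pairs, and the fallback label are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping file; the real recodes.xml may look quite different.
RECODES_XML = """<recodes>
  <variable source="ICLEVEL" target="level">
    <map from="1" to="4-year"/>
    <map from="2" to="2-year"/>
  </variable>
</recodes>"""


def load_recodes(xml_text):
    """Build {source variable: (target variable, {raw code: label})}."""
    rules = {}
    for var in ET.fromstring(xml_text).findall("variable"):
        mapping = {m.get("from"): m.get("to") for m in var.findall("map")}
        rules[var.get("source")] = (var.get("target"), mapping)
    return rules


def recode(rules, source_name, value, default="unknown"):
    """Translate one raw value; unmapped codes fall back to `default`."""
    target, mapping = rules[source_name]
    return target, mapping.get(value, default)


rules = load_recodes(RECODES_XML)
result = recode(rules, "ICLEVEL", "1")
print(result)
```

Keeping the mapping in data rather than in code is what allows a single generic program to handle every variable: supporting a new warehouse variable means adding an entry to the XML file, not writing a new query.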
Obsoleted
gen_whPop.xq
rControl.xq
rControl_to_whPop.xq
rControl_freq.xq
verify_whPop_freq_1var_1year.xq
