rwebdb 08-October-2010

Enrollment data got added to the warehouse. Bulk loads, query performance issues, procedural changes for staging data, how to handle edits of source data, an impervious data anomaly, new documentation, new utility programs, an effort to improve eventual analysis with the time series, and on and on. It’s been an intense but productive two weeks.

rwebdb 03-September-2010

The benefit from metadata and generalized utility programs can be immense. But first you need to hit a critical point where the number of utility tools is sufficient to do most of what needs to happen even in novel data situations. This week I had my first taste of that critical point for this project. It was a good feeling.

rwebdb 27-August-2010

Many new items: an operating system, an XQuery upgrade, greater flexibility in the warehouse population, more generalized move procedures, a more robust data verification process, improved program performance, and a nifty new way to translate IPEDS variables into warehouse variables.

rwebdb – Warehouse Build Begins

This past week I wrote and assembled the XQuery programs and procedures needed to build a warehouse of time series data on college and university financial data. At present the warehouse contains only the two variables used as test cases. But I believe the generality present in the programs and procedures will carry us a good distance into the build.

rwebdb 29-July-2010

A milestone. We now have a working definition of the population of institutions to include in the data warehouse. The post includes links to 7 new XQuery programs and 3 links to other new material.

rwebdb 26-July-2010

Two more XQuery utility programs helpful when identifying IPEDS variables to include in the data mart. One of the programs illustrates the use of dynamic XQuery, which can be a particularly powerful way to collapse a multitude of specific queries into one generalized query.
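
To make the idea of collapsing many specific queries into one generalized query concrete, here is a minimal sketch. It is written in Python with the standard library's ElementTree rather than XQuery, so it is only an analogy to the programs in the post, not a copy of them. The sample XML, the element and attribute names, and the function find_variables are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample metadata; element and attribute names are
# illustrative assumptions, not the actual IPEDS or warehouse layout.
SAMPLE = """
<variables>
  <variable name="tuition01" survey="finance" year="2008"/>
  <variable name="tuition02" survey="finance" year="2009"/>
  <variable name="enroll01" survey="enrollment" year="2009"/>
</variables>
"""


def find_variables(root, **criteria):
    """Return variable names matching every attribute filter supplied.

    The path expression is assembled at run time from the filters,
    so one function stands in for many hand-written queries.
    """
    predicates = "".join(
        f"[@{attr}='{value}']" for attr, value in criteria.items()
    )
    path = f".//variable{predicates}"
    return [v.get("name") for v in root.findall(path)]


root = ET.fromstring(SAMPLE)

# Three different questions, one generalized query.
finance_names = find_variables(root, survey="finance")
names_2009 = find_variables(root, year="2009")
finance_2009 = find_variables(root, survey="finance", year="2009")
```

The trade-off is the usual one for dynamically built queries: a single function replaces a pile of near-identical ones, but the filter values end up inside the query text, so they should come from a trusted source.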

rwebdb 12-July-2010

Housekeeping post. Today I added an attribute to one of the XML files to include the name of the SPS file used as the source for the metadata. If you ever need to go back to the source file and check metadata values, it’s important to know what source file you actually used!

rwebdb: Good News

Yesterday the Delta Project released an impressive web-based tool that allows journalists, officials, policymakers, and the public to conveniently explore higher education finances in the United States. A report in Inside Higher Ed described it this way: “the new database’s groundbreaking feature is that – fasten your seatbelts – it allows for an analysis of the budget priorities of individual institutions.” Good news indeed.

rwebdb 07-July-2010

More metadata. This time on codes used by some variables and the descriptions of those codes. And yet another illustration that you don’t just mash up data files like IPEDS. There are no shortcuts.

rwebdb – 30-June-2010

On variable metadata and the need to search for inconsistencies and changes in variables over time. Building the scaffolding needed to design and populate the data warehouse.