The benefit from metadata and generalized utility programs can be immense. But first you need to hit a critical point where the number of utility tools is sufficient to do most of what needs to happen even in novel data situations. This week I had my first taste of that critical point for this project. It was a good feeling.
Many new items: an operating system, an XQuery upgrade, greater flexibility in populating the warehouse, more generalized move procedures, a more robust data verification process, improved program performance, and a nifty new way to translate IPEDS variables into warehouse variables.
Everything is supersized in this example of the benefits from open sharing of data.
Commentary | Gary Lewis, August 13, 2010 6:32 am
I revised the About page today. It now includes a description of what I’m trying to do with the rwebdb project. It also includes a list of principles that I find helpful when writing this blog and when imagining education tomorrow. I’ve reproduced the new About page in this blog post.
This past week I wrote and assembled the XQuery programs and procedures needed to build a warehouse of time series financial data for colleges and universities. At present the warehouse contains only the two variables used as test cases. But I believe the generality built into the programs and procedures will carry us a good distance into the build.
A milestone. We now have a working definition of the population of institutions to include in the data warehouse. The post includes links to 7 new XQuery programs and 3 other links to new material.
Two more XQuery utility programs helpful when identifying IPEDS variables to include in the data mart. One of the programs illustrates the use of dynamic XQuery, which can be a particularly powerful way to collapse a multitude of specific queries into one generalized query.
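To give a feel for the idea of collapsing many specific queries into one generalized query, here is a minimal sketch. It is not the rwebdb code: Python and the limited XPath support in `xml.etree.ElementTree` stand in for dynamic XQuery, and the catalog layout, the attribute names (`name`, `survey`, `year`), and the function `find_variables` are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical variable catalog; the element and attribute names are
# illustrative assumptions, not the actual IPEDS metadata layout.
CATALOG = ET.fromstring("""
<catalog>
  <variable name="A1" survey="finance" year="2008"/>
  <variable name="A2" survey="finance" year="2009"/>
  <variable name="B1" survey="enrollment" year="2009"/>
</catalog>
""")

def find_variables(root, **criteria):
    """Build one path expression at run time from whatever criteria are given.

    Each keyword argument becomes an attribute predicate, so a single
    function replaces a separate hand-written query per combination.
    Values containing quote characters are not handled in this sketch.
    """
    predicates = "".join(f"[@{key}='{value}']" for key, value in criteria.items())
    path = f".//variable{predicates}"
    return [element.get("name") for element in root.findall(path)]

finance_vars = find_variables(CATALOG, survey="finance")
finance_2009 = find_variables(CATALOG, survey="finance", year="2009")
```

The same function answers both questions; only the criteria passed in change, which is the appeal of building the query dynamically.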
On star schemas, dimension and fact variables, and iteration toward a warehouse design.
design, rwebdb | Gary Lewis, July 13, 2010 12:21 pm
Housekeeping post. Today I added an attribute to one of the XML files to include the name of the SPS file used as the source for the metadata. If you ever need to go back to the source file and check metadata values, it’s important to know what source file you actually used!
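As a rough illustration of this kind of housekeeping, the sketch below records a source file name as an attribute on the root element of an XML document. It uses Python's standard `xml.etree.ElementTree` rather than the project's own tools, and the element names, the attribute name `source`, and the file name `example.sps` are made up for the example.

```python
import xml.etree.ElementTree as ET

def tag_source(xml_text, source_name, attr="source"):
    """Return the XML with the source file name stored on the root element.

    The attribute name is a hypothetical choice; any name that does not
    clash with existing attributes would serve.
    """
    root = ET.fromstring(xml_text)
    root.set(attr, source_name)
    return ET.tostring(root, encoding="unicode")

# Hypothetical metadata document and SPS file name.
sample = '<variables><variable name="X1"/></variables>'
tagged = tag_source(sample, "example.sps")

# Later, the source can be read back when a metadata value needs checking.
recorded_source = ET.fromstring(tagged).get("source")
```

Keeping the provenance inside the file itself means it travels with the metadata and cannot drift out of sync with a separate note.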
On listening for tomorrow after filtering the institutional noise of today. Includes a threaded bookmark of the following:
1. Making good society.
2. The Cooperative Movement in Century 21.
3. The Foundation for P2P Alternatives.
4. A Gathering of Ideas.