rwebdb 29-May-2010
On triangulation and techniques to build metadata and confidence in the data.
An illustration of why data curation is necessary for even modestly complex data sets.
How many of the 20,000 institutions represented in the 25 years of IPEDS data appear throughout the time series? And how many appear infrequently? Answers to these questions make an initial design decision possible and provide an opportunity to illustrate the use of an SVG data visualization.
Dynamic query programs can significantly improve management of complex datasets. Here’s a second example that illustrates the benefits.
In a data warehouse, the design effort is all about ease of use. Forget elegant data structures; the warehouse must first and foremost be simple to use.
Data about the data (metadata) is essential for managing a data project of any complexity.
It begins … the long but absolutely essential task of living in data long enough to recognize both its warts and its beauty.
“Poor neighborhoods around the world embrace a surprising idea: incredibly low-priced private schools.”
Character encodings, Unicode, and how I tripped on smart quotes and curly apostrophes.
Today I got only an initial look at a number of IPEDS data files, but it was enough to suggest a strategy for building a small data warehouse on institutional costs and pricing.