rwebdb 29-May-2010

On triangulation and techniques to build metadata and confidence in the data.

rwebdb 27-May-2010

An illustration of why data curation is necessary for even modestly complex data sets.

rwebdb 25-May-2010

How many of the 20,000 institutions represented in the 25 years of IPEDS data appear throughout the time series? And how many appear infrequently? Answers to these questions make an initial design decision possible and provide an opportunity to illustrate the use of an SVG data visualization.
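
A minimal sketch of how the presence question could be answered, assuming the data has been reduced to (institution ID, year) pairs. The function names and sample records below are hypothetical and are not taken from the IPEDS files or from the post itself.

```python
from collections import Counter


def years_present(records):
    """Count the distinct years in which each institution ID appears."""
    unique_pairs = set(records)  # drop duplicate (id, year) pairs
    return Counter(inst_id for inst_id, _year in unique_pairs)


def presence_histogram(records):
    """Map 'number of years present' to 'number of institutions'."""
    return Counter(years_present(records).values())


# Made-up sample: three institutions over three survey years.
sample = [
    ("A", 1985), ("A", 1986), ("A", 1987),
    ("B", 1985), ("B", 1987),
    ("C", 1986),
    ("A", 1987),  # duplicate row, counted once
]

counts = years_present(sample)
total_years = len({year for _inst_id, year in sample})

# Institutions that appear in every year of the series.
always_present = sorted(i for i, n in counts.items() if n == total_years)
histogram = presence_histogram(sample)
```

The histogram is the kind of summary a visualization could be drawn from: one bar per "years present" value, with its height giving the number of institutions.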

rwebdb 20-May-2010

Dynamic query programs can significantly improve management of complex data sets. Here’s a second example that illustrates the benefits.

rwebdb 19-May-2010

In a data warehouse, the design effort is all about ease of use. Forget elegant data structures; the warehouse must first and foremost be simple to use.

rwebdb 14-May-2010

Data about the data (metadata) is essential for managing a data project of any complexity.

rwebdb 12-May-2010

It begins … the long but absolutely essential task of living in data long enough to recognize both its warts and its beauty.

Bookmarks 09-May-2010

“Poor neighborhoods around the world embrace a surprising idea: incredibly low-priced private schools.”

rwebdb 07-May-2010

Character encodings, Unicode, and how I tripped on smart quotes and curly apostrophes.
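
A minimal sketch of one common way curly apostrophes go wrong, assuming the text was written as UTF-8 and then read back as Windows-1252. The teaser does not say which encodings were actually involved, and the helper name `mojibake` is hypothetical.

```python
def mojibake(text, wrong_encoding="cp1252"):
    """Encode text as UTF-8, then decode the bytes with the wrong codec."""
    return text.encode("utf-8").decode(wrong_encoding)


curly = "it\u2019s"  # U+2019 RIGHT SINGLE QUOTATION MARK as the apostrophe

# The three UTF-8 bytes of U+2019 (E2 80 99) become three separate characters.
garbled = mojibake(curly)

# The damage is reversible if the wrong codec is known: undo the bad decode.
repaired = garbled.encode("cp1252").decode("utf-8")

# The opposite mistake: a Windows-1252 curly apostrophe is the single byte
# 0x92, which is not valid UTF-8 on its own, so a strict decode fails.
legacy_bytes = "it\u2019s".encode("cp1252")
try:
    legacy_bytes.decode("utf-8")
    strict_decode_failed = False
except UnicodeDecodeError:
    strict_decode_failed = True
```

Either direction of mismatch corrupts or rejects the same character, so it helps to record the encoding of each source file alongside the data.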

rwebdb 04-May-2010

Today I got only an initial look at a number of IPEDS data files, but it was sufficient to suggest a strategy for building a small data warehouse on institutional costs and pricing.