rwebdb – Looking Ahead

Sometimes in the early morning haze between sleep and waking, an interesting thought appears out of nowhere. So it didn’t come as a real surprise that a similar thing happened last week when I was laid low by a nasty E. coli infection and was struggling in the surreal world of hospitals.

In this case I realized that several recent articles each offered perspective on the rwebdb project and, specifically, on the part of the project about which I feel least confident. Current rwebdb work focuses on building a warehouse from IPEDS time series data. There will certainly be challenges, but I’m not much concerned about accomplishing this part of the project. What does concern me is how to use the warehouse to engage people around the question of university costs and pricing in the United States.

Three of the articles that bear obliquely on this post are listed below. Here’s my attempt at synthesis as it relates to rwebdb.

1. Several national education projects and think tanks already focus on rising higher education costs. But their work is meant to influence power and policy, and policymakers are not the audience for the rwebdb project. The audience is rather those people most affected by rising costs who simply want an answer to the question “why does college cost so much?”

2. Both David Eaves and danah boyd (see below) argue for an educational component to open data projects. Fair enough. I see the need to improve information literacy skills. But that must not be the main focus of the rwebdb project. I fear that making the project overtly educational will be the kiss of death. The issue of rising university costs is powerful. It grabs people in the guts. We cannot lose that impact by trying to teach people something. Let the learning happen naturally as a result of people exploring a topic of real interest to them. No scripted learning.

3. I’m drawn to the notion of stories and games as ways into the data. Both seem natural to humans. We each organize our world as interconnected stories. We each like to play. But beyond these rather vague features, I really am at a loss. Application development of this kind is not something I’ve ever done. I can provide a crude prototype, but making it sparkle will require help. More about this as the time approaches.
 
Nat Torkington. Truly Open Data. 01 March 2010.

I’ve been focused on getting people to release data. That’s the data analogue of tossing code over the wall, and we know it takes more than a tarball on an FTP server to get the benefits of open source. The same is true of data.

Open source discourages laziness (because everyone can see the corners you’ve cut), it can get bugs fixed or at least identified much faster (many eyes), it promotes collaboration, and it’s a great training ground for skills development. I see no reason why open data shouldn’t bring the same opportunities to data projects.

danah boyd. Transparency is Not Enough. 26 May 2010.
Transcript and video of a talk that boyd delivered at Gov 2.0 Expo. She makes the important point that access to information (i.e., transparency) is essential but not sufficient. Actually empowering people with information requires two additional things: that people possess the information literacy skills needed to interpret data, and that people understand the context surrounding the data so that meaningful interpretations occur and misinterpretations are minimized.

[M]aking information available alone is not the great democratizer. It must be coupled with enabling people to have the skills to interpret it.

David Eaves. Learning from Libraries: The Literacy Challenge of Open Data. 10 June 2010.

We need a data-literate citizenry, not just a small elite of hackers and policy wonks. And the best way to cultivate that broad-based literacy is not to release in small or measured quantities, but to flood us with data. To provide thousands of niches that will interest people in learning, playing and working with open data. But more than this we also need to think about cultivating communities where citizens can exchange ideas as well as involve educators to help provide support and increase people’s ability to move up the learning curve.