Final XQuery Use Case
For the finale in my run of testing XQuery in different data situations, I deliberately chose a case where XQuery would be slightly disadvantaged: moderately complex survey data, in the form of the October 2007 School Enrollment Supplement to the Current Population Survey (CPS). The CPS is sponsored jointly by the U.S. Census Bureau and the Bureau of Labor Statistics. It is the data source used to calculate those dreadful monthly unemployment rates we’ve seen in the current recession.
Instead of unemployment, I chose to examine whether family income influences students’ choices about enrollment in public versus private colleges and universities in the United States. Given the substantial gap in net price between public and private institutions, you would think that family finances might play an important role in the public vs private decision.
One caveat, as always. Throughout these use cases, I’ve been less interested in the research question than in how XQuery performs. I chose research questions that I thought would be interesting, but I was more concerned about the tool than the actual research.
In this post I won’t give away the storyline of the research. For that, please refer to the 4-page PDF that describes the project.
However, I will say once again that I enjoyed working with XQuery. There are several implementations of the W3C standards. I used Zorba.
This project did strain XQuery performance at times, but I was able to concoct workarounds that I can reuse if similar situations arise in the future.
Also, I’d like to highlight once again the very intricate nature of data analysis. In this age of transparency, big data, and semantic linked data, I’m occasionally uncomfortable that ease of use is the assumed norm and expectation. It has never been my experience that data analysis is easy or fast. It requires living in the data for considerable periods of time. It requires thought and play and more thought. It requires failure and more failure and finally small steps of success. I hope this will be evident in the PDF that describes this project.
I also tried again to provide links to all important programs I wrote as part of this project. It’s not exactly what some call open research, but it certainly is a step in that direction. If anyone feels motivated to duplicate what I did or, better yet, to modify and improve upon what I did, then I think you’ll find the source links useful. And I’d be happy to assist if you have questions.
With the conclusion of this project, I no longer feel the need to equivocate about XQuery. I will definitely use it.
