<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Educational Imaginations &#187; xquery</title>
	<atom:link href="http://garymlewis.com/instchg/tag/xquery/feed/" rel="self" type="application/rss+xml" />
	<link>http://garymlewis.com/instchg</link>
	<description></description>
	<lastBuildDate>Fri, 04 May 2012 11:08:58 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.2</generator>
		<item>
		<title>Final XQuery Use Case</title>
		<link>http://garymlewis.com/instchg/2009/12/09/final-xquery-use-case/</link>
		<comments>http://garymlewis.com/instchg/2009/12/09/final-xquery-use-case/#comments</comments>
		<pubDate>Wed, 09 Dec 2009 23:04:58 +0000</pubDate>
		<dc:creator>Gary Lewis</dc:creator>
				<category><![CDATA[Query Tools]]></category>
		<category><![CDATA[Research]]></category>
		<category><![CDATA[xquery]]></category>

		<guid isPermaLink="false">http://garymlewis.com/instchg/?p=1917</guid>
		<description><![CDATA[With the conclusion of this project, I no longer feel the need to equivocate about xquery. I will definitely use it.]]></description>
			<content:encoded><![CDATA[<p>For the finale in my run of testing xquery in different data situations, I deliberately chose a case where xquery would be slightly disadvantaged: analyzing moderately complex survey data, in the form of the October 2007 School Enrollment Supplement to the Current Population Survey. The CPS is sponsored jointly by the U.S. Census Bureau and the Bureau of Labor Statistics. It is the data source used to calculate those dreadful monthly unemployment rates we&#8217;ve seen in the current recession.</p>
<p>Instead of unemployment, I chose to examine whether family income influences students&#8217; choices about enrollment in public versus private colleges and universities in the United States. Given the substantial gap in net price between public and private institutions, you would think that family finances might play an important role in the public vs private decision.</p>
<p>One caveat, as always. Throughout these use cases, I&#8217;ve been less interested in the research question than in how xquery performs. I chose research questions that I thought would be interesting, but I was more concerned with the tool than with the actual research.</p>
<p>In this post I won&#8217;t give away the story-line about the research. For that, please refer to the <a href="http://garymlewis.com/instchg/public/xquery/cpsoct2007/final_cpsoct2007.pdf">4-page PDF</a> that describes the project.</p>
<p>However, I will say once again that I enjoyed working with xquery. There are several implementations of the W3C standards. I used <a href="http://www.zorba-xquery.com/">zorba</a>.</p>
<p>This project did place stress on xquery performance at times, but I was able to concoct workarounds that I&#8217;ll be able to reuse if similar situations arise.</p>
<p>Also, I&#8217;d like to highlight once again the very intricate nature of data analysis. I&#8217;m occasionally uncomfortable that, in this age of transparency, big data, and semantic linked data, ease of use is the assumed norm and expectation. It has never been my experience that data analysis is easy or fast. It requires living in the data for considerable periods of time. It requires thought and play and more thought. It requires failure and more failure and finally small steps of success. I hope this will be evident in the PDF that describes this project.</p>
<p>I also tried again to provide links to all the important programs I wrote as part of this project. It&#8217;s not exactly what some call open research, but it certainly is a step in that direction. If anyone feels motivated to duplicate what I did or, better yet, to modify and improve upon it, then I think you&#8217;ll find the source links useful. And I&#8217;d be happy to assist if you have questions.</p>
<p>With the conclusion of this project, I no longer feel the need to equivocate about xquery. I will definitely use it.</p>
]]></content:encoded>
			<wfw:commentRss>http://garymlewis.com/instchg/2009/12/09/final-xquery-use-case/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Another XQuery Use Case: Is Higher Education Countercyclical?</title>
		<link>http://garymlewis.com/instchg/2009/08/10/another-xquery-use-case-is-higher-education-countercyclical/</link>
		<comments>http://garymlewis.com/instchg/2009/08/10/another-xquery-use-case-is-higher-education-countercyclical/#comments</comments>
		<pubDate>Mon, 10 Aug 2009 17:52:12 +0000</pubDate>
		<dc:creator>Gary Lewis</dc:creator>
				<category><![CDATA[Query Tools]]></category>
		<category><![CDATA[Research]]></category>
		<category><![CDATA[web queries]]></category>
		<category><![CDATA[xquery]]></category>

		<guid isPermaLink="false">http://garymlewis.com/instchg/?p=1172</guid>
		<description><![CDATA[In this XQuery use case, I consider whether higher education in the U.S. is countercyclical to recessions.]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve completed another XQuery demonstration project (for earlier ones, see <a href="http://garymlewis.com/instchg/category/query-tools">XQuery</a>). In this version I used XQuery to assemble data to consider anew the conventional wisdom that higher education is countercyclical (i.e., enrollments grow during recessions).</p>
<p>Here&#8217;s a sample graph from the report. It shows the annual percentage change in enrollments at all U.S. degree-granting institutions of higher education from 1968 to 2007. The grayscale background shows unemployment rates and periods of recession. The report is available in this <a href="http://garymlewis.com/instchg/public/xquery/dt08_189/dt08_189.pdf">pdf</a>, which includes higher-resolution graphics.</p>
<p><img class="alignleft size-full wp-image-1177" title="dt08_main" src="http://garymlewis.com/instchg/wp-content/uploads/2009/08/dt08_main.png" alt="dt08_main" width="500" height="357" /></p>
<p>Here is an abbreviated version of the summary section from the report.</p>
<p>1. The new XQuery use cases included screen scraping enrollment data embedded in HTML, and pipelining enrollment and economic data through several staged transformations.<br />
2. The evidence that higher education in the United States is countercyclical appears weak based on the exploratory analysis done here.<br />
3. The evidence varies somewhat by institutional control and type, but mostly the general observation of a weak association between enrollment change and recession cycles holds true.<br />
4. The tools used in this project, XQuery and R, are wonderful research tools but are not suitable for more general use without wrappers that mute their complexity.<br />
5. The screen-scraping technique used in this project provides another access route to a vast amount of U.S. Department of Education data on the web.<br />
6. None of the data sources used in this project included semantic or linked data markup. Whether XQuery can be used successfully with RDFa or microformats seems worthy of investigation.</p>
<p>Here are links to various pieces of this project:<br />
a. <a href="http://garymlewis.com/instchg/public/xquery/dt08_189/dt08_189.pdf">Final report</a><br />
b. <a href="http://garymlewis.com/instchg/public/xquery/dt08_189/dt08_189_doc.xq">Documentation for XQuery source programs</a><br />
c. <a href="http://garymlewis.com/instchg/public/xquery/dt08_189/dt08_189_enrl.xq">XQuery to generate enrollment data</a><br />
d. <a href="http://garymlewis.com/instchg/public/xquery/dt08_189/dt08_189_econ.xq">XQuery to generate unemployment and recession data</a><br />
e. <a href="http://garymlewis.com/instchg/public/xquery/dt08_189/dt08_189_final.xq">XQuery to merge enrollment, unemployment, and recession data</a><br />
f. <a href="http://garymlewis.com/instchg/public/xquery/dt08_189/dt08_189_final.Rhistory">R analysis history</a></p>
<p>I remain quite satisfied with XQuery. </p>
]]></content:encoded>
			<wfw:commentRss>http://garymlewis.com/instchg/2009/08/10/another-xquery-use-case-is-higher-education-countercyclical/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>XQuery and an XML Database</title>
		<link>http://garymlewis.com/instchg/2009/07/01/xquery-and-an-xml-database/</link>
		<comments>http://garymlewis.com/instchg/2009/07/01/xquery-and-an-xml-database/#comments</comments>
		<pubDate>Wed, 01 Jul 2009 19:30:26 +0000</pubDate>
		<dc:creator>Gary Lewis</dc:creator>
				<category><![CDATA[Query Tools]]></category>
		<category><![CDATA[xquery]]></category>

		<guid isPermaLink="false">http://garymlewis.com/instchg/?p=1043</guid>
		<description><![CDATA[Recently I installed an XML database and duplicated one of the performance tests I'd done earlier with the Federal Reserve Economic Data (FRED). Response time dropped from 195 seconds to 2 seconds. <a href="http://garymlewis.com/instchg/2009/07/01/xquery-and-an-xml-database/">Read more</a>.]]></description>
			<content:encoded><![CDATA[<p>Recently I wrote about using <a href="http://garymlewis.com/instchg/2009/06/08/making-a-dent-in-a-steep-learning-curve/">XQuery with the Federal Reserve Economic Data (FRED)</a> API. I also complained a bit about sluggish performance (<a href="http://garymlewis.com/instchg/public/xquery/fred/fred_xquery.pdf">pdf</a>), but noted by way of explanation that I was using a creaky old server and that XQuery was processing XML files and not a database.</p>
<p>Since then I installed Oracle&#8217;s <a href="http://www.oracle.com/database/berkeley-db/xml/index.html">BerkeleyDB XML</a> database and duplicated one of the performance tests I&#8217;d done with the FRED data. Response time in the test dropped from 195 seconds to 2 seconds. Not bad; much more consistent with what I&#8217;d expect from SQL queries against a similarly sized relational database.</p>
<p>These are not precise performance tests by any means. For example, the XQuery processor I used with the original FRED data was <a href="http://www.zorba-xquery.com/index.php/about/">Zorba</a>, but <a href="http://xqilla.sourceforge.net/HomePage">XQilla</a> is the XQuery processor that comes bundled with BerkeleyDB XML. So, for certain, I changed at least two important factors in the tests, and unless you make controlled changes in factors it is pretty much impossible to attribute causation.</p>
<p>However, I rather doubt that the choice of XQuery processor had much to do with the performance improvement. Performance improved only after I created database indexes for the XML.</p>
<p>At this point I am still very much playing; just trying to use XQuery in a variety of environments to better understand what it can and cannot accomplish. If you&#8217;re a researcher, for example, it doesn&#8217;t much matter if it takes 195 seconds for a data integration step in an entire research process that may take days, weeks or longer. However, 2 seconds is much preferred if you&#8217;re doing ad hoc data mashups in a browser.</p>
<p>From what little I&#8217;ve seen of Oracle&#8217;s BerkeleyDB XML database, it&#8217;s an interesting product, available with either a commercial or a free open source license. It&#8217;s an embedded database accessed through a programmatic API and linked libraries. It is not a relational database, a database management system, or a database server. It does, however, offer a lightweight solution when performance and data persistence are critical. Disclaimer: I have no association with Oracle, BerkeleyDB, or BerkeleyDB XML.</p>
<p>I realize this blog post will interest only a very few people, but from time to time I may write similar posts that document my playing with XQuery. I continue to like what I see. Truthfully, I&#8217;m not certain where it&#8217;s headed, other than to say that the driver is free learning for everyone everywhere.</p>
]]></content:encoded>
			<wfw:commentRss>http://garymlewis.com/instchg/2009/07/01/xquery-and-an-xml-database/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

