You have reached the site of the relational XQuery compiler Pathfinder. At the core of this research project lies our desire to answer the question:
"How far can we push relational database technology to construct an efficient and scalable XQuery implementation?"
If you are interested in a copy of Pathfinder's SQL code generator and/or MonetDB/XQuery, please have a look at the Download page. If you are interested in the ideas and the technology behind Pathfinder, please have a look at the Technology and the Publications pages. If you just want a first overview, please read on.
Pathfinder Overview
Pathfinder is a re-targetable query compiler that turns XQuery expressions into table algebra queries. While Pathfinder is tightly coupled with MonetDB, we also provide a SQL code generator that allows any database to become a faithful XQuery processor.
The Approach
Pathfinder assumes a database that stores shredded XML documents, that is, documents that have been transformed into a relational encoding. Pathfinder compiles an incoming XQuery query into a relational query plan. The database evaluates the generated plan against the shredded XML documents and returns a table. A serializer consumes this table and transforms it into an XQuery result sequence. (In MonetDB/XQuery, automatic shredding and serialization, together with the tight integration of Pathfinder, lead to a runtime in which the relational approach is no longer visible to the user.)
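To make the idea of shredding more concrete, the following minimal sketch turns a small XML document into relational rows. It assumes a pre/size/level style encoding (preorder rank, number of descendants, depth) and covers element nodes only. The function name `shred` and the row layout are hypothetical choices for illustration and do not reflect Pathfinder's actual shredder or table schema.

```python
import xml.etree.ElementTree as ET


def shred(xml_text):
    """Return one (pre, size, level, name) row per element node.

    pre   - preorder rank of the element (its position in document order)
    size  - number of descendant elements below it
    level - depth in the tree, with the root at level 0
    name  - element tag name
    This row layout is an assumption made for this sketch.
    """
    root = ET.fromstring(xml_text)
    rows = []

    def visit(elem, level):
        pre = len(rows)            # next free preorder rank
        rows.append(None)          # reserve the slot before visiting children
        for child in elem:
            visit(child, level + 1)
        size = len(rows) - pre - 1  # everything appended since is a descendant
        rows[pre] = (pre, size, level, elem.tag)

    visit(root, 0)
    return rows


if __name__ == "__main__":
    for row in shred("<a><b><c/><d/></b><e/></a>"):
        print(row)
```

With such an encoding, the tree structure of the document is captured entirely by integer columns, which is what lets a relational engine evaluate tree navigation with ordinary comparisons.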
Motivation
We believe that relational databases are the most researched and best engineered query processing infrastructures available today. They are able to query very large volumes of data efficiently. By using a relational database as the runtime environment for an XQuery processor, we can carry more than 30 years of research over to the XQuery domain and build a processor that scales well with increasing input sizes. (For these benefits we are willing to pay the extra cost of shredding, serialization, and compilation.)
Download MonetDB/XQuery or the SQL Code Generator?
MonetDB/XQuery integrates the Pathfinder compiler into the MonetDB product and adds runtime extensions for offline and online shredding, serialization (with multiple serialization modes), efficient path step algorithms such as Staircase Join, and support for updates. MonetDB/XQuery inherits the scalability of its database back-end: you may feed XML documents beyond 1 GB in size into the system and still expect reasonable, interactive query response times (for example, with the XMark benchmark).
Pathfinder's SQL Code Generator is still limited in its functionality. It is the ideal playground for anyone interested in extending their favourite database with XML support. We have tested the generated SQL code on DB2 v9. B-tree indexes reliably speed up the range predicates that implement XPath's location step semantics. (The combination of Pathfinder and DB2 was even able to outperform DB2's built-in XQuery support on larger XML documents.) A performance analysis for other SQL backends such as PostgreSQL (and for other XML encodings) is currently underway.
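The following sketch illustrates how an XPath location step can reduce to a range predicate over a relational encoding. It is hand-written for illustration and is not output of Pathfinder's SQL code generator. It assumes a pre/size/level encoding, a hypothetical table `doc(pre, size, level, name)`, and SQLite (via Python's standard `sqlite3` module) standing in for a full SQL backend such as DB2.

```python
import sqlite3

# Hypothetical encoding of <a><b><c/><d/></b><e/></a>:
# pre = preorder rank, size = number of descendants, level = depth.
ROWS = [
    (0, 4, 0, "a"),
    (1, 2, 1, "b"),
    (2, 0, 2, "c"),
    (3, 0, 2, "d"),
    (4, 0, 1, "e"),
]

conn = sqlite3.connect(":memory:")
# Declaring pre as INTEGER PRIMARY KEY makes SQLite store the table
# as a B-tree keyed on pre.
conn.execute(
    "CREATE TABLE doc (pre INTEGER PRIMARY KEY, size INTEGER,"
    " level INTEGER, name TEXT)"
)
conn.executemany("INSERT INTO doc VALUES (?, ?, ?, ?)", ROWS)


def descendants(conn, context_name):
    """descendant axis: nodes d with c.pre < d.pre <= c.pre + c.size."""
    sql = """
        SELECT DISTINCT d.pre, d.name
        FROM doc AS c
        JOIN doc AS d
          ON d.pre > c.pre AND d.pre <= c.pre + c.size
        WHERE c.name = ?
        ORDER BY d.pre
    """
    return conn.execute(sql, (context_name,)).fetchall()


def children(conn, context_name):
    """child axis: the descendant range, restricted to the next level."""
    sql = """
        SELECT DISTINCT d.pre, d.name
        FROM doc AS c
        JOIN doc AS d
          ON d.pre > c.pre AND d.pre <= c.pre + c.size
         AND d.level = c.level + 1
        WHERE c.name = ?
        ORDER BY d.pre
    """
    return conn.execute(sql, (context_name,)).fetchall()


if __name__ == "__main__":
    print(descendants(conn, "b"))
    print(children(conn, "a"))
```

Because the step condition is a plain interval on `pre`, an ordered index on that column can serve it, and `ORDER BY d.pre` returns the result in document order.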
Extensibility of Pathfinder
Pathfinder is a re-targetable compiler that produces optimized algebra plans (in XML format). These plans feature ordinary table algebra operators (such as select, project, join, ...) and some XML-specific operators such as path joins and node access. The XML-specific operators are the ones that interact with the encoded XML document, and they keep the interface to that encoding abstract. This allows a new backend to choose the XML encoding that works best for it. Based on plans generated and optimized by Pathfinder, we have, for example, turned KX Systems' kdb+ into an efficient XQuery processor.