A new SQL query generator, Semmle’s .QL, was being demonstrated at the 2008 ACM SIGMOD conference in Vancouver two weeks ago.
Semmle is a start-up, headed by Oege de Moor who is on the faculty at Oxford University’s Magdalen College. Semmle’s .QL language, written in Java and callable by Java applications only (for the moment), is based on Datalog with support for type classing (entity-type hierarchies). Because .QL is based on Datalog, its syntax makes it quite easy to construct recursive queries over SQL databases – considerably simpler than with the rather clumsy (and error-prone) ANSI SQL syntax for RECURSIVE UNION. The .QL language uses the familiar FROM, WHERE, and SELECT keywords though their ordering is reversed: projection (SELECT) is last syntactically, matching the order of evaluation. There is also support for rudimentary aggregation functions. Here is a simple example query taken from the online Semmle .QL tutorial:
1 2 3 4 5 6 | from Package p, int loc where p.fromSource() and loc = sum(CompilationUnit cu | cu.getPackage() = p | cu.getNumberOfLines()) select p, loc |
In this example, the query sums the lines of code in each Java package. An aggregate function is of the form func( declarations | condition | selected value).
Here is a second example that illustrates the ease of constructing a recursive query. As in the first example above, the example is taken from the Semmle code analysis tool that serves as a free demo of .QL’s capabilities. First, here’s the definition of a masking predicate that determines if a private class name:
7 8 9 10 11 | predicate masks(Field masker, VisibleInstanceField maskee) { maskee.getName() = masker.getName() and masker.getDeclaringType().hasSupertype+(maskee.getDeclaringType()) } |
Note the “+” next to the hasSupertype() function call. This is how one specifies recursion in .QL; in the example, the query traverses all supertypes looking for potential name conflicts. The outer query becomes simply
12 13 14 | from Field f, VisibleInstanceField vif where masks(f,vif) select f, vif |
.QL is not a generalized object-relational mapping system, like Hibernate. At the moment, it is only a query language: it accesses a relational database by transforming the queries into SQL and issuing those SQL statements over a JDBC connection. What I have yet to figure out is the support in .QL to transform .QL language constructions into SQL ones. However, I am reasonably certain that .QL doesn’t transform its recursion into RECURSIVE UNION queries, but does the recursion internally. What is cool about Semmle’s .QL, however, is that .QL does include optimization technology: the optimization is not cost-based, as it is in SQL Anywhere’s query optimizer, but is based on a set of heuristics to find and eliminate common subexpressions. de Moor and his team from Semmle presented two papers at SIGMOD/PODS, both related to the optimization of Datalog queries [1,2] and that are applicable to the class of optimizations present in the .QL language system. It is the presence of these optimizations that make .QL significant over existing O-R mapping software such as Hibernate and iBatis.
[1] Oege de Moor, Damien Sereni, Pavel Avgustinov and Mathieu Verbaere (June 2008). Type Inference for Datalog and its Application to Query Optimisation. Proceedings, ACM Principles of Database Systems, Vancouver, BC, pp. 291-300.
[2] Damien Sereni, Pavel Avgustinov and Oege de Moor (June 2008). Adding Magic to an Optimising Datalog Compiler. In Proceedings, ACM SIGMOD Conference, Vancouver, BC, pp. 553-565.

Glenn Paulley is a Director of Engineering at Sybase iAnywhere.

1 response so far ↓
1 Using recursive queries with SQL Anywhere // Apr 20, 2009 at 3:01 am
[...] semantics are more commonly used in functional programming languages, including Datalog, upon which Semmle’s .QL query language is based. In ANSI standard SQL [1], fixpoint semantics are loosely defined as the situation [...]