Derby
  1. Derby
  2. DERBY-3946

Provide support for using the Derby parser to generate Abstract Syntax Trees

    Details

    • Type: Improvement Improvement
    • Status: Open
    • Priority: Major Major
    • Resolution: Unresolved
    • Affects Version/s: 10.5.1.1
    • Fix Version/s: None
    • Component/s: SQL
    • Urgency:
      Normal

      Description

      Users would like to be able to use the Derby parser to produce query trees without actually running the queries on Derby.

      1. ASTParser.java
        4 kB
        Rick Hillegas
      2. derby-3946-01-aa-standaloneParser.diff
        2 kB
        Rick Hillegas
      3. TreeWalker.java
        5 kB
        Rick Hillegas
      4. ASTParser.java
        4 kB
        Rick Hillegas
      5. ASTParser.java
        4 kB
        Rick Hillegas
      6. TreeWalker.java
        6 kB
        Rick Hillegas

        Issue Links

          Activity

          Hide
          Mamta A. Satoor added a comment -

          Hi Rick, as part of 10.10 bug triaging, I was wondering what was the status of this jira? Should it be left open because there is more work to be done?

          Show
          Mamta A. Satoor added a comment - Hi Rick, as part of 10.10 bug triaging, I was wondering what was the status of this jira? Should it be left open because there is more work to be done?
          Hide
          Rick Hillegas added a comment -

          Hi Ciu,

          Some responses follow:

          1) The printing is triggered by the call to QueryTreeNode.treePrint(). That method immediately defers its work to QueryTreeNode.printSubNodes(), which recurses through the nodes, printing as it goes.

          2) The *s are expanded by ResultColumnList.expandAllsAndNameColumns().

          Hope this helps,
          -Rick

          Show
          Rick Hillegas added a comment - Hi Ciu, Some responses follow: 1) The printing is triggered by the call to QueryTreeNode.treePrint(). That method immediately defers its work to QueryTreeNode.printSubNodes(), which recurses through the nodes, printing as it goes. 2) The *s are expanded by ResultColumnList.expandAllsAndNameColumns(). Hope this helps, -Rick
          Hide
          Derby Ciu added a comment -

          Hi Rick,

          I have two questions for you.

          1) I am trying to analyze which function call is responsible for printing the parse tree.

          2) What is the exact point in the code when a " * " gets replaced with the column names in a query like "Select * from tablename" where tablename is a table with different column names clm1,clm2,clm3 etc.

          If you could please help me with these asap.

          Thank you,
          Ciu

          Show
          Derby Ciu added a comment - Hi Rick, I have two questions for you. 1) I am trying to analyze which function call is responsible for printing the parse tree. 2) What is the exact point in the code when a " * " gets replaced with the column names in a query like "Select * from tablename" where tablename is a table with different column names clm1,clm2,clm3 etc. If you could please help me with these asap. Thank you, Ciu
          Hide
          Rick Hillegas added a comment -

          Hi Ciu,

          ASTParser simply defers its work to the treePrint() methods of the nodes themselves. The nodes are retrieved using a visitor pattern. To see how this works, take a look at the TreeWalker program attached to this JIRA. TreeWalker is another implementation of Visitor. Regards-Rick

          Show
          Rick Hillegas added a comment - Hi Ciu, ASTParser simply defers its work to the treePrint() methods of the nodes themselves. The nodes are retrieved using a visitor pattern. To see how this works, take a look at the TreeWalker program attached to this JIRA. TreeWalker is another implementation of Visitor. Regards-Rick
          Hide
          Derby Ciu added a comment -

          Could you please explain the output of ASTParser.java? How are each of the nodes retrieved?

          Show
          Derby Ciu added a comment - Could you please explain the output of ASTParser.java? How are each of the nodes retrieved?
          Hide
          Derby Ciu added a comment -

          Hi Rick,

          I am now able to get the parser output. Could you please elaborate and help me interpret this output? I just want to make sure this is going to work for all types of complex /nested/queryies from different databases. Could you please let me know if ASTParser.java depends on any other files in impl.sql.compile package apart from QuerytreeNode.java.

          Thanks,
          Ciu

          Show
          Derby Ciu added a comment - Hi Rick, I am now able to get the parser output. Could you please elaborate and help me interpret this output? I just want to make sure this is going to work for all types of complex /nested/queryies from different databases. Could you please let me know if ASTParser.java depends on any other files in impl.sql.compile package apart from QuerytreeNode.java. Thanks, Ciu
          Hide
          Derby Ciu added a comment -

          Hi Rick,

          I guess I am on the right track. Running the above query gives this -

          Parsing:
          select a from t, s where t.a = s.a

          org.apache.derby.impl.sql.compile.CursorNode@19fdbcb
          name: null
          updateMode: UNSPECIFIED (0)
          statementType: SELECT
          resultSet:
          org.apache.derby.impl.sql.compile.SelectNode@16fdac
          isDistinct: false
          groupByList: null
          orderByList: null
          resultSetNumber: 0
          referencedTableMap: null
          statementResultSet: false
          resultColumns:
          org.apache.derby.impl.sql.compile.ResultColumn@1682a53
          exposedName: A
          name: A
          tableName: null
          isDefaultColumn: false
          wasDefaultColumn: false
          isNameGenerated: false
          sourceTableName: null
          type: null
          columnDescriptor: null
          isGenerated: false
          isGeneratedForUnmatchedColumnInInsert: false
          isGroupingColumn: false
          isReferenced: false
          isRedundant: false
          virtualColumnId: 1
          resultSetNumber: -1
          dataTypeServices: null
          expression:
          org.apache.derby.impl.sql.compile.ColumnReference@1b2d7df
          columnName: A
          tableNumber: -1
          columnNumber: 0
          replacesAggregate: false
          tableName: null
          nestingLevel: -1
          sourceLevel: -1
          dataTypeServices: null
          fromList:
          org.apache.derby.impl.sql.compile.FromList@7a9224
          tableName: T
          tableDescriptor: null
          updateOrDelete: 0
          null
          existsBaseTable: false
          dependencyMap: null
          correlation Name: null
          null
          tableNumber -1
          level 0
          resultSetNumber: 0
          referencedTableMap: null
          statementResultSet: false
          ; tableName: S
          tableDescriptor: null
          updateOrDelete: 0
          null
          existsBaseTable: false
          dependencyMap: null
          correlation Name: null
          null
          tableNumber -1
          level 0
          resultSetNumber: 0
          referencedTableMap: null
          statementResultSet: false
          ;
          org.apache.derby.impl.sql.compile.FromBaseTable@110b640
          tableName: T
          tableDescriptor: null
          updateOrDelete: 0
          null
          existsBaseTable: false
          dependencyMap: null
          correlation Name: null
          null
          tableNumber -1
          level 0
          resultSetNumber: 0
          referencedTableMap: null
          statementResultSet: false

          org.apache.derby.impl.sql.compile.FromBaseTable@2d59a3
          tableName: S
          tableDescriptor: null
          updateOrDelete: 0
          null
          existsBaseTable: false
          dependencyMap: null
          correlation Name: null
          null
          tableNumber -1
          level 0
          resultSetNumber: 0
          referencedTableMap: null
          statementResultSet: false
          whereClause:
          org.apache.derby.impl.sql.compile.BinaryRelationalOperatorNode@158015a
          operator: =
          methodName: equals
          dataTypeServices: null
          leftOperand:
          org.apache.derby.impl.sql.compile.ColumnReference@c0c8b5
          columnName: A
          tableNumber: -1
          columnNumber: 0
          replacesAggregate: false
          tableName: T
          nestingLevel: -1
          sourceLevel: -1
          dataTypeServices: null
          rightOperand:
          org.apache.derby.impl.sql.compile.ColumnReference@b914b3
          columnName: A
          tableNumber: -1
          columnNumber: 0
          replacesAggregate: false
          tableName: S
          nestingLevel: -1
          sourceLevel: -1
          dataTypeServices: null
          preJoinFL: orderByList:

          I will get back to you.
          Thanks and regards,
          Ciu

          Show
          Derby Ciu added a comment - Hi Rick, I guess I am on the right track. Running the above query gives this - Parsing: select a from t, s where t.a = s.a org.apache.derby.impl.sql.compile.CursorNode@19fdbcb name: null updateMode: UNSPECIFIED (0) statementType: SELECT resultSet: org.apache.derby.impl.sql.compile.SelectNode@16fdac isDistinct: false groupByList: null orderByList: null resultSetNumber: 0 referencedTableMap: null statementResultSet: false resultColumns: org.apache.derby.impl.sql.compile.ResultColumn@1682a53 exposedName: A name: A tableName: null isDefaultColumn: false wasDefaultColumn: false isNameGenerated: false sourceTableName: null type: null columnDescriptor: null isGenerated: false isGeneratedForUnmatchedColumnInInsert: false isGroupingColumn: false isReferenced: false isRedundant: false virtualColumnId: 1 resultSetNumber: -1 dataTypeServices: null expression: org.apache.derby.impl.sql.compile.ColumnReference@1b2d7df columnName: A tableNumber: -1 columnNumber: 0 replacesAggregate: false tableName: null nestingLevel: -1 sourceLevel: -1 dataTypeServices: null fromList: org.apache.derby.impl.sql.compile.FromList@7a9224 tableName: T tableDescriptor: null updateOrDelete: 0 null existsBaseTable: false dependencyMap: null correlation Name: null null tableNumber -1 level 0 resultSetNumber: 0 referencedTableMap: null statementResultSet: false ; tableName: S tableDescriptor: null updateOrDelete: 0 null existsBaseTable: false dependencyMap: null correlation Name: null null tableNumber -1 level 0 resultSetNumber: 0 referencedTableMap: null statementResultSet: false ; org.apache.derby.impl.sql.compile.FromBaseTable@110b640 tableName: T tableDescriptor: null updateOrDelete: 0 null existsBaseTable: false dependencyMap: null correlation Name: null null tableNumber -1 level 0 resultSetNumber: 0 referencedTableMap: null statementResultSet: false org.apache.derby.impl.sql.compile.FromBaseTable@2d59a3 tableName: S tableDescriptor: null updateOrDelete: 0 null existsBaseTable: false dependencyMap: null correlation Name: null null tableNumber -1 level 0 resultSetNumber: 0 referencedTableMap: null statementResultSet: false whereClause: org.apache.derby.impl.sql.compile.BinaryRelationalOperatorNode@158015a operator: = methodName: equals dataTypeServices: null leftOperand: org.apache.derby.impl.sql.compile.ColumnReference@c0c8b5 columnName: A tableNumber: -1 columnNumber: 0 replacesAggregate: false tableName: T nestingLevel: -1 sourceLevel: -1 dataTypeServices: null rightOperand: org.apache.derby.impl.sql.compile.ColumnReference@b914b3 columnName: A tableNumber: -1 columnNumber: 0 replacesAggregate: false tableName: S nestingLevel: -1 sourceLevel: -1 dataTypeServices: null preJoinFL: orderByList: I will get back to you. Thanks and regards, Ciu
          Hide
          Rick Hillegas added a comment -

          Hi Ciu,

          Can you post the command line you used? What happens when you run the example shown above:

          java ASTParser "select a from t, s where t.a = s.a"

          Thanks,
          -Rick

          Show
          Rick Hillegas added a comment - Hi Ciu, Can you post the command line you used? What happens when you run the example shown above: java ASTParser "select a from t, s where t.a = s.a" Thanks, -Rick
          Hide
          Derby Ciu added a comment -

          Hi Rick,

          not able to run ASTParser.java. As you suggested, I have not applied the derby-3946-01-aa-standaloneParser.diff patch. But still, when I try placing ASTParser.java in sql.compile or sql.conn packages and run it, I get similar errors in both cases--

          Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 0
          at org.apache.derby.impl.sql.compile.ASTParser.execute(ASTParser.java:85)
          at org.apache.derby.impl.sql.compile.ASTParser.main(ASTParser.java:76)

          Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 0
          at org.apache.derby.impl.sql.conn.ASTParser.execute(ASTParser.java:86)
          at org.apache.derby.impl.sql.conn.ASTParser.main(ASTParser.java:77)

          Thanks,
          Ciu

          Show
          Derby Ciu added a comment - Hi Rick, not able to run ASTParser.java. As you suggested, I have not applied the derby-3946-01-aa-standaloneParser.diff patch. But still, when I try placing ASTParser.java in sql.compile or sql.conn packages and run it, I get similar errors in both cases-- Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 0 at org.apache.derby.impl.sql.compile.ASTParser.execute(ASTParser.java:85) at org.apache.derby.impl.sql.compile.ASTParser.main(ASTParser.java:76) Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 0 at org.apache.derby.impl.sql.conn.ASTParser.execute(ASTParser.java:86) at org.apache.derby.impl.sql.conn.ASTParser.main(ASTParser.java:77) Thanks, Ciu
          Hide
          Rick Hillegas added a comment -

          Hi Ciu,

          I am confused by the experiment you ran. Do not try to re-apply the derby-3946-01-aa-standaloneParser.diff patch. The patch was checked in a long time ago.

          Regards,
          -Rick

          Show
          Rick Hillegas added a comment - Hi Ciu, I am confused by the experiment you ran. Do not try to re-apply the derby-3946-01-aa-standaloneParser.diff patch. The patch was checked in a long time ago. Regards, -Rick
          Hide
          Derby Ciu added a comment -

          Hi Kristian,

          Thanks for the information.

          I checked out 10.5 version of Derby source code and downloaded ASTParser.java. Then I applied patch and built derby. But when I run ASTParser.java, I get this error:
          ArrayIndexOutofBound Exception in lines 76, 85 of ASTParser.java

          Am I doing something wrong?

          Please help,
          Thanks,
          Ciu

          Show
          Derby Ciu added a comment - Hi Kristian, Thanks for the information. I checked out 10.5 version of Derby source code and downloaded ASTParser.java. Then I applied patch and built derby. But when I run ASTParser.java, I get this error: ArrayIndexOutofBound Exception in lines 76, 85 of ASTParser.java Am I doing something wrong? Please help, Thanks, Ciu
          Hide
          Kristian Waagan added a comment -

          Hi Ciu,

          The releases 10.5.1 and 10.5.3 are points in time on the 10.5 branch. That is, if you check out the head of 10.5 you'll get 10.5.3 plus any fixes and changes made on the 10.5 branch after 10.5.3 was released.
          To get the code for the 10.5.1 or 10.5.3 releases you have to check out a specific revision on the 10.5 branch. For instance, 10.5.3 is revision 802917 (see http://db.apache.org/derby/derby_downloads.html ).

          Hope this helps,

          Show
          Kristian Waagan added a comment - Hi Ciu, The releases 10.5.1 and 10.5.3 are points in time on the 10.5 branch. That is, if you check out the head of 10.5 you'll get 10.5.3 plus any fixes and changes made on the 10.5 branch after 10.5.3 was released. To get the code for the 10.5.1 or 10.5.3 releases you have to check out a specific revision on the 10.5 branch. For instance, 10.5.3 is revision 802917 (see http://db.apache.org/derby/derby_downloads.html ). Hope this helps,
          Hide
          Derby Ciu added a comment -

          Hi Rick,

          I followed the steps in the following URL to get the derby source code using eclipse.
          http://wiki.apache.org/db-derby/BuildingDerby

          According to the instructions given in the second step, I added the new repository url (https://svn.apache.org/repos/asf/db/derby/code) in the SVN Repository. After that when I expand branches, I only get options to check out starting from 10.0 to 10.5 versions. Can you please let me know how to get 10.5.1 or 10.5.3 releases?

          Thanks,
          Ciu

          Show
          Derby Ciu added a comment - Hi Rick, I followed the steps in the following URL to get the derby source code using eclipse. http://wiki.apache.org/db-derby/BuildingDerby According to the instructions given in the second step, I added the new repository url ( https://svn.apache.org/repos/asf/db/derby/code ) in the SVN Repository. After that when I expand branches, I only get options to check out starting from 10.0 to 10.5 versions. Can you please let me know how to get 10.5.1 or 10.5.3 releases? Thanks, Ciu
          Hide
          Rick Hillegas added a comment -

          Hi Ciu,

          Sorry for the confusion. That patch was committed to the trunk on 2008-11-13 but it's easy for that commit comment to get lost in the rest of the commentary on this issue. The patch was built into the 10.5.1 and 10.5.3 releases. All you need to do is download ASTParser.java to your current directory, compile it, and run it as described above. Let me know if you have more issues or if something breaks. Thanks.

          Show
          Rick Hillegas added a comment - Hi Ciu, Sorry for the confusion. That patch was committed to the trunk on 2008-11-13 but it's easy for that commit comment to get lost in the rest of the commentary on this issue. The patch was built into the 10.5.1 and 10.5.3 releases. All you need to do is download ASTParser.java to your current directory, compile it, and run it as described above. Let me know if you have more issues or if something breaks. Thanks.
          Hide
          Derby Ciu added a comment -

          Hi Rick,

          I tried applying the patch derby-3946-01-aa-standaloneParser.java in Eclipse IDE, which says "The patch has segments that did not match. are you sure you want to apply?" . Am I supposed to review a patch w.r.t the local file system and manually merge any unmerged portions?
          Also, could you be specific about where should ASTParser.java be placed? I had placed it in org.apache.derby.impl.sql.conn package and upon running it, I get an ArrayIndexOutofBound Exception in lines 76, 85 of ASTParser.java
          Am I missing out any step while applying this patch?

          Thanks,
          Ciu

          Show
          Derby Ciu added a comment - Hi Rick, I tried applying the patch derby-3946-01-aa-standaloneParser.java in Eclipse IDE, which says "The patch has segments that did not match. are you sure you want to apply?" . Am I supposed to review a patch w.r.t the local file system and manually merge any unmerged portions? Also, could you be specific about where should ASTParser.java be placed? I had placed it in org.apache.derby.impl.sql.conn package and upon running it, I get an ArrayIndexOutofBound Exception in lines 76, 85 of ASTParser.java Am I missing out any step while applying this patch? Thanks, Ciu
          Hide
          Rick Hillegas added a comment -

          Attaching a new version of TreeWalker which implements the visitChildrenFirst() method which was recently added to the Visitor interface.

          Show
          Rick Hillegas added a comment - Attaching a new version of TreeWalker which implements the visitChildrenFirst() method which was recently added to the Visitor interface.
          Hide
          Rick Hillegas added a comment -

          Attaching a third rev of the ASTParser. This rev hard-codes the context id for LanguageConnectionContext. This lets you compile ASTParser against the debug jars--the ContextId class itself does not appear in the Derby jar files because the Derby build eliminates classes which merely contain constants.

          I have successfully compiled and run this ASTParser against the debug derby.jar that is part of the 10.5.3.0 distribution.

          Show
          Rick Hillegas added a comment - Attaching a third rev of the ASTParser. This rev hard-codes the context id for LanguageConnectionContext. This lets you compile ASTParser against the debug jars--the ContextId class itself does not appear in the Derby jar files because the Derby build eliminates classes which merely contain constants. I have successfully compiled and run this ASTParser against the debug derby.jar that is part of the 10.5.3.0 distribution.
          Hide
          Knut Anders Hatlen added a comment -

          Good idea to use an in-memory database. The storeless engine (DERBY-2164) is probably an even better fit, as this is exactly the kind of use it was intended for (it skips database creation entirely). The down-side is of course that the storeless classes have not been added to the production jars yet, so you would need to add the classes.storeless directory to the classpath for it to work.

          Show
          Knut Anders Hatlen added a comment - Good idea to use an in-memory database. The storeless engine ( DERBY-2164 ) is probably an even better fit, as this is exactly the kind of use it was intended for (it skips database creation entirely). The down-side is of course that the storeless classes have not been added to the production jars yet, so you would need to add the classes.storeless directory to the classpath for it to work.
          Hide
          Rick Hillegas added a comment -

          Attaching a version of the ASTParser which has been modified to use a transient in-memory database rather than creating an unnecessary on-disk database.

          Show
          Rick Hillegas added a comment - Attaching a version of the ASTParser which has been modified to use a transient in-memory database rather than creating an unnecessary on-disk database.
          Hide
          Rick Hillegas added a comment -

          Hi Myrna,

          I think there's more work to be done on this issue--I just haven't found a rainy day for that yet. If we say something about this issue in the release notes, then we'll need to warn users that the technique relies on non-public apis which can change between releases. Thanks.

          Show
          Rick Hillegas added a comment - Hi Myrna, I think there's more work to be done on this issue--I just haven't found a rainy day for that yet. If we say something about this issue in the release notes, then we'll need to warn users that the technique relies on non-public apis which can change between releases. Thanks.
          Hide
          Myrna van Lunteren added a comment -

          Can this be closed? If so, should I put it as a 'new feature' in the Release notes for 10.5?

          Show
          Myrna van Lunteren added a comment - Can this be closed? If so, should I put it as a 'new feature' in the Release notes for 10.5?
          Hide
          Rick Hillegas added a comment -

          Linking this issue to DERBY-791. The XmlTreeWalker attached to that issue can be used to print out a query tree in xml.

          Show
          Rick Hillegas added a comment - Linking this issue to DERBY-791 . The XmlTreeWalker attached to that issue can be used to print out a query tree in xml.
          Hide
          Christian Riedel added a comment -

          Hi Rick,

          thanks a lot for your examples ... that makes things a lot clearer. I think I got the idea now.

          Now, I have to try and write my own code around it to make it "easier" to use for us.

          This was really great support!

          Regards

          Christian

          Show
          Christian Riedel added a comment - Hi Rick, thanks a lot for your examples ... that makes things a lot clearer. I think I got the idea now. Now, I have to try and write my own code around it to make it "easier" to use for us. This was really great support! Regards Christian
          Hide
          Rick Hillegas added a comment -

          Hi Christian,

          I have posted a reply to a similar question on your original email thread. Hopefully you will be able to see that soon.

          1) It's useful to build the engine javadoc (ant -quiet javadoc) and browse the javadoc for the package org.apache.derby.impl.sql.compile. In particular, you will see that the AST nodes are the classes indented under QueryTreeNode in the tree view.

          2) The nodes themselves implement Visitable so you can write you own Visitor to explore the AST graph. Visitable has one method, accept(), and by looking at the implementations of that method, you will understand how the nodes snap together into a graph.

          I've attached a simple Visitor (TreeWalker), which shows you some classes in the graph. That may help explain the AST a bit more. When I run

          java TreeWalker "select a from t, s where t.a = s.a"

          I get the following graph:

          org.apache.derby.impl.sql.compile.CursorNode
          org.apache.derby.impl.sql.compile.SelectNode
          org.apache.derby.impl.sql.compile.ResultColumnList
          org.apache.derby.impl.sql.compile.ResultColumn
          org.apache.derby.impl.sql.compile.ColumnReference
          org.apache.derby.impl.sql.compile.FromBaseTable
          org.apache.derby.impl.sql.compile.FromBaseTable
          org.apache.derby.impl.sql.compile.BinaryRelationalOperatorNode
          org.apache.derby.impl.sql.compile.ColumnReference
          org.apache.derby.impl.sql.compile.ColumnReference

          Let me try to explain this tree a bit:

          The SelectNode has the following children:

          i) The columns in the SELECT list. The whole SELECT list is represented by a ResultColumnList and there is only one ResultColumn (representing "a") in that list.

          ii) The tables in the FROM list. There are two of these, each represented by its own FromBaseTable.

          iii) The WHERE clause. This is a BinaryRelationalOperatorNode (representing the "=" operator). This operator node has a left child and a right child, "t.a" and "s.a" respectively.

          So to answer your specific questions:

          Q) What tables are in the query?
          A) Look for FromBaseTables in the graph.

          Q) Which fields in the tables are accessed?
          A) The column references can appear in the SELECT list (the ResultColumnList) or in the WHERE clause (under the BinaryRelationalOperatorNode). Note, however, that the columns are not matched up to tables yet. This isn't done by the parser. That kind of name resolution happens during Derby's bind() phase and it requires metadata so that Derby knows the structure of the tables. Unless you create the tables in Derby, there will be no way to move forward to the bind() phase. If you don't provide this information to Derby, then you will have to write your own name-resolution phase.

          Q) What are the table aliases, if any?
          A) FromBaseTable.getExposedName() will give you the name of the table (or the alias name if you specified an alias)

          Hope this helps,
          -Rick

          Show
          Rick Hillegas added a comment - Hi Christian, I have posted a reply to a similar question on your original email thread. Hopefully you will be able to see that soon. 1) It's useful to build the engine javadoc (ant -quiet javadoc) and browse the javadoc for the package org.apache.derby.impl.sql.compile. In particular, you will see that the AST nodes are the classes indented under QueryTreeNode in the tree view. 2) The nodes themselves implement Visitable so you can write you own Visitor to explore the AST graph. Visitable has one method, accept(), and by looking at the implementations of that method, you will understand how the nodes snap together into a graph. I've attached a simple Visitor (TreeWalker), which shows you some classes in the graph. That may help explain the AST a bit more. When I run java TreeWalker "select a from t, s where t.a = s.a" I get the following graph: org.apache.derby.impl.sql.compile.CursorNode org.apache.derby.impl.sql.compile.SelectNode org.apache.derby.impl.sql.compile.ResultColumnList org.apache.derby.impl.sql.compile.ResultColumn org.apache.derby.impl.sql.compile.ColumnReference org.apache.derby.impl.sql.compile.FromBaseTable org.apache.derby.impl.sql.compile.FromBaseTable org.apache.derby.impl.sql.compile.BinaryRelationalOperatorNode org.apache.derby.impl.sql.compile.ColumnReference org.apache.derby.impl.sql.compile.ColumnReference Let me try to explain this tree a bit: The SelectNode has the following children: i) The columns in the SELECT list. The whole SELECT list is represented by a ResultColumnList and there is only one ResultColumn (representing "a") in that list. ii) The tables in the FROM list. There are two of these, each represented by its own FromBaseTable. iii) The WHERE clause. This is a BinaryRelationalOperatorNode (representing the "=" operator). This operator node has a left child and a right child, "t.a" and "s.a" respectively. So to answer your specific questions: Q) What tables are in the query? A) Look for FromBaseTables in the graph. Q) Which fields in the tables are accessed? A) The column references can appear in the SELECT list (the ResultColumnList) or in the WHERE clause (under the BinaryRelationalOperatorNode). Note, however, that the columns are not matched up to tables yet. This isn't done by the parser. That kind of name resolution happens during Derby's bind() phase and it requires metadata so that Derby knows the structure of the tables. Unless you create the tables in Derby, there will be no way to move forward to the bind() phase. If you don't provide this information to Derby, then you will have to write your own name-resolution phase. Q) What are the table aliases, if any? A) FromBaseTable.getExposedName() will give you the name of the table (or the alias name if you specified an alias) Hope this helps, -Rick
          Hide
          Christian Riedel added a comment -

          Rick,

          your patch has helped us a lot so far ... since I cannot post any messages to the list for some reason, I want to give my feedback here.

          getting hold of the QueryTreeNode is a big step in the right direction ... however for actually analyzing the tree it would be necessary to access the tree's elements. I.e. traverse the tree sequentially, getting information how many childnodes exist etc. The problem that I still see is that some of the classes I'd need for this are not visible so that I can use the from the outside.

          The QueryTreeNode in your example code seems to be an instance of CursorNode ... how can I go on from here?

          The things we'd like to know is: which tables are involved in the statement (also in inner selects) which fields from which tables are accessed, which aliases are given for the tables that are accessed etc.

          Could you give me a further hint?

          Thanks

          Christian

          Show
          Christian Riedel added a comment - Rick, your patch has helped us a lot so far ... since I cannot post any messages to the list for some reason, I want to give my feedback here. getting hold of the QueryTreeNode is a big step in the right direction ... however for actually analyzing the tree it would be necessary to access the tree's elements. I.e. traverse the tree sequentially, getting information how many childnodes exist etc. The problem that I still see is that some of the classes I'd need for this are not visible so that I can use the from the outside. The QueryTreeNode in your example code seems to be an instance of CursorNode ... how can I go on from here? The things we'd like to know is: which tables are involved in the statement (also in inner selects) which fields from which tables are accessed, which aliases are given for the tables that are accessed etc. Could you give me a further hint? Thanks Christian
          Hide
          Rick Hillegas added a comment -

          Additional commentary on this issue can be found in this email thread: http://www.nabble.com/Using-derby-to-parse-an-SQL-statement-td20461127.html#a20461127

          Show
          Rick Hillegas added a comment - Additional commentary on this issue can be found in this email thread: http://www.nabble.com/Using-derby-to-parse-an-SQL-statement-td20461127.html#a20461127
          Hide
          Rick Hillegas added a comment -

          Tests ran cleanly for me on derby-3946-01-aa-standaloneParser.diff. Committed at subversion revision 713721.

          Show
          Rick Hillegas added a comment - Tests ran cleanly for me on derby-3946-01-aa-standaloneParser.diff. Committed at subversion revision 713721.
          Hide
          Rick Hillegas added a comment -

          I agree with Knut's analysis. The parse-time constructor for NewInvocationNode contains a bind-time call to the data dictionary. I regard this as a bug. The bug could probably be fixed by faulting in the metadata only when it is needed rather than pro-actively reading the catalogs when the AST is created.

          I also agree with Knut that the devil is in the details. Since a standalone parser hasn't been tested widely, there are likely to be several cases where bind-time logic has leaked into the parsing phase. Again, as they come up, I think we should clean them up. It might be worth figuring out how we could record all of the SQL that is issued by our regression tests and then pump that SQL through the ASTParser logic. That could give us some confidence that we have found the central bulge of problems--though it would not be a systematic way to catch all of the outliers.

          Show
          Rick Hillegas added a comment - I agree with Knut's analysis. The parse-time constructor for NewInvocationNode contains a bind-time call to the data dictionary. I regard this as a bug. The bug could probably be fixed by faulting in the metadata only when it is needed rather than pro-actively reading the catalogs when the AST is created. I also agree with Knut that the devil is in the details. Since a standalone parser hasn't been tested widely, there are likely to be several cases where bind-time logic has leaked into the parsing phase. Again, as they come up, I think we should clean them up. It might be worth figuring out how we could record all of the SQL that is issued by our regression tests and then pump that SQL through the ASTParser logic. That could give us some confidence that we have found the central bulge of problems--though it would not be a systematic way to catch all of the outliers.
          Hide
          Knut Anders Hatlen added a comment -

          Dag, this is because the initialization of NewInvocationNode (invoked by the parser – see vtiTableConstruct() in sqlgrammar.jj) looks up the class name associated with the table function in the data dictionary. That sounds like the wrong place to do it, it should probably be moved from init() to bindExpression(). The actual verification that the class exists is (correctly) done in bindExpression(). I wouldn't be surprised if there's more misplaced code, since until now it hasn't really made any difference if some of the binding has been performed in the parser.

          Show
          Knut Anders Hatlen added a comment - Dag, this is because the initialization of NewInvocationNode (invoked by the parser – see vtiTableConstruct() in sqlgrammar.jj) looks up the class name associated with the table function in the data dictionary. That sounds like the wrong place to do it, it should probably be moved from init() to bindExpression(). The actual verification that the class exists is (correctly) done in bindExpression(). I wouldn't be surprised if there's more misplaced code, since until now it hasn't really made any difference if some of the binding has been performed in the parser.
          Hide
          Dag H. Wanvik added a comment -

          Tried the program for this example

          java ASTParser "select * from table(foo(4))t"

          but it seems the binding is done early here so it crashed:
          Exception in thread "main" java.sql.SQLSyntaxErrorException: 'APP'.FOO' does not identify a table function.
          Does this mean only a subset of the syntax may be used? If so, what parts?

          Show
          Dag H. Wanvik added a comment - Tried the program for this example java ASTParser "select * from table(foo(4))t" but it seems the binding is done early here so it crashed: Exception in thread "main" java.sql.SQLSyntaxErrorException: 'APP'.FOO' does not identify a table function. Does this mean only a subset of the syntax may be used? If so, what parts?
          Hide
          Rick Hillegas added a comment - - edited

          Attaching derby-3946-01-aa-standaloneParser.diff and ASTParser.java. Note that this patch has already been committed and is built into the 10.5 releases. To test-drive ASTParser, you need to do the following:

          A) Compile ASTParser

          B) Run ASTParser. You must use the Derby debug jars. The program takes one argument, which is a query string which you would like to parse. The program then retrieves the parsed query tree and prints it out. For example:

          java ASTParser "select a from t, s where t.a = s.a"

          Note that this technique for using the parser standalone involves invoking non-public APIs which are subject to change from release to release. Note, however, that these APIs have remained stable since Derby was open-sourced.

          The patch touched the following files. I checked in these changes after the tests passed.

          M java/engine/org/apache/derby/impl/sql/conn/GenericLanguageConnectionContext.java
          M java/engine/org/apache/derby/iapi/sql/conn/LanguageConnectionContext.java

          Adds some accessors to the compiler's state variable so that query trees can be stored and retrieved.

          M java/engine/org/apache/derby/impl/sql/GenericStatement.java

          Pokes the query tree into the state variable when the user sets the StopAfterParsing tracepoint.

          Show
          Rick Hillegas added a comment - - edited Attaching derby-3946-01-aa-standaloneParser.diff and ASTParser.java. Note that this patch has already been committed and is built into the 10.5 releases. To test-drive ASTParser, you need to do the following: A) Compile ASTParser B) Run ASTParser. You must use the Derby debug jars. The program takes one argument, which is a query string which you would like to parse. The program then retrieves the parsed query tree and prints it out. For example: java ASTParser "select a from t, s where t.a = s.a" Note that this technique for using the parser standalone involves invoking non-public APIs which are subject to change from release to release. Note, however, that these APIs have remained stable since Derby was open-sourced. The patch touched the following files. I checked in these changes after the tests passed. M java/engine/org/apache/derby/impl/sql/conn/GenericLanguageConnectionContext.java M java/engine/org/apache/derby/iapi/sql/conn/LanguageConnectionContext.java Adds some accessors to the compiler's state variable so that query trees can be stored and retrieved. M java/engine/org/apache/derby/impl/sql/GenericStatement.java Pokes the query tree into the state variable when the user sets the StopAfterParsing tracepoint.

            People

            • Assignee:
              Unassigned
              Reporter:
              Rick Hillegas
            • Votes:
              5 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

              • Created:
                Updated:

                Development