Details

    • Type: Improvement Improvement
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 10.11.0.0
    • Component/s: SQL
    • Urgency:
      Normal

      Description

      This piece of code once had a purpose in life. It was one of the double-joints which allowed cloudscape to ship with and without compiler support for the synchronization language. Synchronization has been removed. If we want to plug in optional language components, I think there are better ways to do this.

      The NodeFactory turned into a big, sprawling piece of code. At some point this code was slimmed down by telescoping all of its factory methods into a couple unwieldly, weakly-typed overloads backed by cumbersome logic in the actual node constructors. I would like to reintroduce strongly typed node constructors which the parser can call directly. This will make node generation easier to read and less brittle and it will get rid of the now useless NodeFactory class.

      1. nodefactory-31.zip
        105 kB
        Dag H. Wanvik
      2. nodefactory-31.status
        12 kB
        Dag H. Wanvik
      3. derby-673-1.diff.gz
        265 kB
        Dag H. Wanvik
      4. derby-673-1.status
        20 kB
        Dag H. Wanvik
      5. derby-673-2.diff.gz
        270 kB
        Dag H. Wanvik
      6. derby-673-2.status
        20 kB
        Dag H. Wanvik
      7. derby-673-3.diff.gz
        271 kB
        Dag H. Wanvik
      8. derby-673-3.status
        20 kB
        Dag H. Wanvik
      9. derby-673-fixcomments.diff
        2 kB
        Dag H. Wanvik
      10. derby-673-typesafe-lists-1.diff
        232 kB
        Dag H. Wanvik
      11. derby-673-typesafe-lists-1.status
        5 kB
        Dag H. Wanvik
      12. derby-673-typesafe-lists-2.status
        5 kB
        Dag H. Wanvik
      13. derby-673-typesafe-lists-2.diff.gz
        44 kB
        Dag H. Wanvik
      14. derby-673-more-typesafe-6.diff
        152 kB
        Dag H. Wanvik
      15. derby-673-more-typesafe-6.status
        5 kB
        Dag H. Wanvik
      16. derby-673-nuke-ctypes-enum.diff
        261 kB
        Dag H. Wanvik
      17. derby-673-nuke-ctypes-without-enum.diff
        264 kB
        Dag H. Wanvik
      18. derby-673-nuke-ctypes-enum.stat
        11 kB
        Dag H. Wanvik
      19. derby-673-nuke-ctypes-without-enum.status
        11 kB
        Dag H. Wanvik
      20. derby-673-nuke-ctypes-without-enum-2.diff
        275 kB
        Dag H. Wanvik
      21. derby-673-nuke-ctypes-without-enum-2.status
        11 kB
        Dag H. Wanvik
      22. derby-673-nuke-ctypes-without-enum-3.diff
        276 kB
        Dag H. Wanvik
      23. derby-673-nuke-ctypes-without-enum-3.status
        11 kB
        Dag H. Wanvik

        Issue Links

          Activity

          Hide
          Daniel John Debrunner added a comment -

          Would this change allow some rationalization of the type hierachy for nodes? Every "node" is a sub-class of QueryTreeNode, but it seems to me that several nodes are forced into the hierachy through the use of the NodeFactory. Any "node" that is really a data element would seem to fall into this. E.g. TableName, TableElementNode and its sub-classes. It's possible those nodes do not need to be sub-classes of QueryTreeNode.

          Show
          Daniel John Debrunner added a comment - Would this change allow some rationalization of the type hierachy for nodes? Every "node" is a sub-class of QueryTreeNode, but it seems to me that several nodes are forced into the hierachy through the use of the NodeFactory. Any "node" that is really a data element would seem to fall into this. E.g. TableName, TableElementNode and its sub-classes. It's possible those nodes do not need to be sub-classes of QueryTreeNode.
          Hide
          Rick Hillegas added a comment -

          This change should not preclude and I expect it would facilitate reorganizations of the type hierarchy for nodes in the abstract syntax tree. Having said that, I don't think I understand the motivation for making some nodes not descend from QueryTreeNode. QueryTreeNode is supposed to be the ancestor of all abstract syntax tree nodes. This is a useful abstraction because it regularizes compiler code. It removes a lot of special case logic from the handling of the compilation steps defined by QueryTreeNode: bind(), optimize(), and generate(). In addition, it forces an engineer, when adding a new AST node, to account for these steps. I can see some methods in QueryTreeNode which probably belong in its descendants. However, I think QueryTreeNode basically makes a lot of sense and the code would be a lot uglier without this abstraction.

          Show
          Rick Hillegas added a comment - This change should not preclude and I expect it would facilitate reorganizations of the type hierarchy for nodes in the abstract syntax tree. Having said that, I don't think I understand the motivation for making some nodes not descend from QueryTreeNode. QueryTreeNode is supposed to be the ancestor of all abstract syntax tree nodes. This is a useful abstraction because it regularizes compiler code. It removes a lot of special case logic from the handling of the compilation steps defined by QueryTreeNode: bind(), optimize(), and generate(). In addition, it forces an engineer, when adding a new AST node, to account for these steps. I can see some methods in QueryTreeNode which probably belong in its descendants. However, I think QueryTreeNode basically makes a lot of sense and the code would be a lot uglier without this abstraction.
          Hide
          Daniel John Debrunner added a comment -

          The rational is that some current nodes are not really QueryTreeNodes, but common data elements for real nodes. TableName is an example, which actually represents any two part name (e.g. view, table, function, procedure etc.). Some of the list nodes just contain lists of other nodes, it's not clear to me that all these correctly are QueryTreeNodes.

          Show
          Daniel John Debrunner added a comment - The rational is that some current nodes are not really QueryTreeNodes, but common data elements for real nodes. TableName is an example, which actually represents any two part name (e.g. view, table, function, procedure etc.). Some of the list nodes just contain lists of other nodes, it's not clear to me that all these correctly are QueryTreeNodes.
          Hide
          Rick Hillegas added a comment -

          TableName doesn't have a useful optimize() or generate() method but it does have a bind() method. It looks like a reasonable AST node (that is, QueryTreeNode) to me.

          Show
          Rick Hillegas added a comment - TableName doesn't have a useful optimize() or generate() method but it does have a bind() method. It looks like a reasonable AST node (that is, QueryTreeNode) to me.
          Hide
          Daniel John Debrunner added a comment -

          Maybe, but the bind() method doesn't actually bind the node, it just fetches the schema descriptor, which is already handled by other methods on QueryTreeNode. You could implement the functionality of the TableName in query trees, a two part name, without extending QueryTreeNode. Possibly it is never used as a QueryTreeNode and never uses any methods of its parent class.

          Show
          Daniel John Debrunner added a comment - Maybe, but the bind() method doesn't actually bind the node, it just fetches the schema descriptor, which is already handled by other methods on QueryTreeNode. You could implement the functionality of the TableName in query trees, a two part name, without extending QueryTreeNode. Possibly it is never used as a QueryTreeNode and never uses any methods of its parent class.
          Hide
          Daniel John Debrunner added a comment -

          See comments in DERBY-791 for a use of the node factory. It could be that a NodeFactory could still exist with strongly typed init methods instead of constructors.

          Show
          Daniel John Debrunner added a comment - See comments in DERBY-791 for a use of the node factory. It could be that a NodeFactory could still exist with strongly typed init methods instead of constructors.
          Hide
          Knut Anders Hatlen added a comment -

          Strongly typed constructors or init() methods, with or without the NodeFactory, would be particularly useful now that trunk can have Java 6 code with generics. If you try to pass a parameterized type to one of the init() methods, like a List<String> instance, you'll have to do an unchecked cast from Object to List<String> inside the init() method and the compiler will produce a warning.

          Show
          Knut Anders Hatlen added a comment - Strongly typed constructors or init() methods, with or without the NodeFactory, would be particularly useful now that trunk can have Java 6 code with generics. If you try to pass a parameterized type to one of the init() methods, like a List<String> instance, you'll have to do an unchecked cast from Object to List<String> inside the init() method and the compiler will produce a warning.
          Hide
          Dag H. Wanvik added a comment -

          Uploading a partial proof-of-concept patch for this; it is incremental by moving from using the node factory to constructors for a subset of the node types. When all node types have been converted, the node factory can be dismissed.

          I have not created new classes for those node types which share classes, e.g. the BinaryRelationalOperatorNode are used by six node types; =, !=, <, <=, >, >=. Since many old "init" methods were type overloaded, casting their arguments depending on the node type, the patch does create more methods, thus increasing foot print some, but not all that much I believe. I'll make some measurements later. Code clarity is improved, code reduction is ca 650 lines so far, and type safety is improved by stronger typing of constructor arguments .

          I have not (yet) made any attempts to uncouple classes from unnecessarily inheriting QueryTreeNode (as suggested by Dan).

          The regressions run ok with this.

          Show
          Dag H. Wanvik added a comment - Uploading a partial proof-of-concept patch for this; it is incremental by moving from using the node factory to constructors for a subset of the node types. When all node types have been converted, the node factory can be dismissed. I have not created new classes for those node types which share classes, e.g. the BinaryRelationalOperatorNode are used by six node types; =, !=, <, <=, >, >=. Since many old "init" methods were type overloaded, casting their arguments depending on the node type, the patch does create more methods, thus increasing foot print some, but not all that much I believe. I'll make some measurements later. Code clarity is improved, code reduction is ca 650 lines so far, and type safety is improved by stronger typing of constructor arguments . I have not (yet) made any attempts to uncouple classes from unnecessarily inheriting QueryTreeNode (as suggested by Dan). The regressions run ok with this.
          Hide
          Dag H. Wanvik added a comment - - edited

          Uploading derby-673-1 which removes the node factory and does some further cleanup in the compiler.

          a) Replaced the old init methods in "*Node.java" classes with constructors. Some logical node types have different "C_NodeType" values in their nodeType field after construction but still share the same node class. I have not attempted to increase the number of node classes to match logical == physical node classes this once. Actually one class was removed because it unused: "SQLBooleanConstantNode". "IsNode" is also currently unused but there is a JIRA to make use of it (DERBY-5973), so I left it in place.

          Boxed argument types were replaced by primitive types except in a few cases where instanceof was used on them to detect type overloading; this could be gotten rid of by adding more constructors.

          Since the constructor arguments are now strongly typed, a great many casts were removed in the process and readability is improved a lot.

          In some cases the old init procedures did computations before calling "super.init". Since the call to the corresponding super constructor needs to be the first code in a constructor, I sometimes had to introduce new private static methods to compute the correct arguments to send on to the super class constructor, e.g. "getTypeId" in "UserTypeConstantNode". I think there is some redundancy here that could be removed in a follow-up patch.

          All the non-abstract node classes (still) set their corresponding "C_NodeType" value; but in many (most?) cases the field is no longer needed. This could be improved by removing them altogether and introduced class constants where needed to differentiate between logical node type mapped to one class. This is already done halfheartedly to some extent, e.g. enum "Sign" in "IsNullNode".

          The old "tools/jar/DBMSnodes.properties" file could be removed altogether since the node classes are now added automatically due to dependencies that the compiler can see (no longer constructed by reflection).

          The old nodeFactory method "doJoinOrderOptimization" was moved to the OptimizerFactory now that the NodeFactory has gone.

          b) Added @Override tags to methods that override existing methods (not those that merely implement an interface)

          c) Removed unused imports and sorted import statements for ease of future maintenance by IDEs

          d) Renamed variables that shadowed class members

          e) Replaced usage of StringBuffer with StringBuilder

          f) Restricted public visibility to package private for all classes, methods and members in compile/impl unless they needed more visibility according to actual usage.

          g) Made List arguments to node classes use generics in those cases it was missing.

          e) Renamed some node types to make the the type mirror the node class correctly (there were only very few exceptions to that rule), e.g. LIKE_OPERATOR_NODE -> LIKE_ESCAPE_OPERATOR_NODE since the class is called LikeEscapeOperatorNode.

          f) Reduced scope of some variables: initialized to null values never used long before actual usage in code. By moving the declaration closed to usage, the unnecessary initialization could often be removed.

          g) Fixed some spelling errors in comments.

          h) Renamed some SQL-related constants (StoredFormatIds, TypeId) from "longint" " to "bigint" to reflect Derby SQL syntax.

          Overall the patch removes ca 5K bytes in the insane derby.jar file, and ca 5000 source lines.

          All regressions passed, but I'll be running more tests on more platforms since it is a big change.

          Reviewers: you need to do "svn remove" of the five files that went away before attempting to build Derby, cf. the status file.

          Things to look out for: missed opportunities to remove casts. The compiler doesn't help me detect superfluous ones

          I realize this is a big patch, and if anybody thinks I should break it up, or drop parts of it, I am willing to consider it. I didn't experience much in the way of errors during the conversion, though, so my confidence in the patch is pretty good. I did the changes incrementally over some 75 change/test cycles.

          Show
          Dag H. Wanvik added a comment - - edited Uploading derby-673-1 which removes the node factory and does some further cleanup in the compiler. a) Replaced the old init methods in "*Node.java" classes with constructors. Some logical node types have different "C_NodeType" values in their nodeType field after construction but still share the same node class. I have not attempted to increase the number of node classes to match logical == physical node classes this once. Actually one class was removed because it unused: "SQLBooleanConstantNode". "IsNode" is also currently unused but there is a JIRA to make use of it ( DERBY-5973 ), so I left it in place. Boxed argument types were replaced by primitive types except in a few cases where instanceof was used on them to detect type overloading; this could be gotten rid of by adding more constructors. Since the constructor arguments are now strongly typed, a great many casts were removed in the process and readability is improved a lot. In some cases the old init procedures did computations before calling "super.init". Since the call to the corresponding super constructor needs to be the first code in a constructor, I sometimes had to introduce new private static methods to compute the correct arguments to send on to the super class constructor, e.g. "getTypeId" in "UserTypeConstantNode". I think there is some redundancy here that could be removed in a follow-up patch. All the non-abstract node classes (still) set their corresponding "C_NodeType" value; but in many (most?) cases the field is no longer needed. This could be improved by removing them altogether and introduced class constants where needed to differentiate between logical node type mapped to one class. This is already done halfheartedly to some extent, e.g. enum "Sign" in "IsNullNode". The old "tools/jar/DBMSnodes.properties" file could be removed altogether since the node classes are now added automatically due to dependencies that the compiler can see (no longer constructed by reflection). The old nodeFactory method "doJoinOrderOptimization" was moved to the OptimizerFactory now that the NodeFactory has gone. b) Added @Override tags to methods that override existing methods (not those that merely implement an interface) c) Removed unused imports and sorted import statements for ease of future maintenance by IDEs d) Renamed variables that shadowed class members e) Replaced usage of StringBuffer with StringBuilder f) Restricted public visibility to package private for all classes, methods and members in compile/impl unless they needed more visibility according to actual usage. g) Made List arguments to node classes use generics in those cases it was missing. e) Renamed some node types to make the the type mirror the node class correctly (there were only very few exceptions to that rule), e.g. LIKE_OPERATOR_NODE -> LIKE_ESCAPE_OPERATOR_NODE since the class is called LikeEscapeOperatorNode. f) Reduced scope of some variables: initialized to null values never used long before actual usage in code. By moving the declaration closed to usage, the unnecessary initialization could often be removed. g) Fixed some spelling errors in comments. h) Renamed some SQL-related constants (StoredFormatIds, TypeId) from "longint" " to "bigint" to reflect Derby SQL syntax. Overall the patch removes ca 5K bytes in the insane derby.jar file, and ca 5000 source lines. All regressions passed, but I'll be running more tests on more platforms since it is a big change. Reviewers: you need to do "svn remove" of the five files that went away before attempting to build Derby, cf. the status file. Things to look out for: missed opportunities to remove casts. The compiler doesn't help me detect superfluous ones I realize this is a big patch, and if anybody thinks I should break it up, or drop parts of it, I am willing to consider it. I didn't experience much in the way of errors during the conversion, though, so my confidence in the patch is pretty good. I did the changes incrementally over some 75 change/test cycles.
          Hide
          Rick Hillegas added a comment -

          Thanks for this major overhaul, Dag. After this patch, what is the purpose of the type ids and names in C_NodeTypes and C_NodeNames? Thanks.

          Show
          Rick Hillegas added a comment - Thanks for this major overhaul, Dag. After this patch, what is the purpose of the type ids and names in C_NodeTypes and C_NodeNames? Thanks.
          Hide
          Knut Anders Hatlen added a comment -

          Thanks for the patch, Dag. I think this is a good improvement, as it makes a whole class of errors detectable at compile time.

          I did go through the patch, but because of the size of it, I must admit that I didn't study each line very carefully. But my impression was that the changes looked reasonable. So +1 from me.

          In some parts of the patch, it looked like the only changes were renaming of variables, like operator -> op, methodName -> mNam or methodNam, exposedName -> exposedNam, costEstimate -> costEst. I think I prefer the old names, but not so much that I think it should block the commit. No, wait... Maybe these were the variables you referred to in (d) in the patch description? OK, that's a good enough reason for renaming them, I would suppose.

          Show
          Knut Anders Hatlen added a comment - Thanks for the patch, Dag. I think this is a good improvement, as it makes a whole class of errors detectable at compile time. I did go through the patch, but because of the size of it, I must admit that I didn't study each line very carefully. But my impression was that the changes looked reasonable. So +1 from me. In some parts of the patch, it looked like the only changes were renaming of variables, like operator -> op, methodName -> mNam or methodNam, exposedName -> exposedNam, costEstimate -> costEst. I think I prefer the old names, but not so much that I think it should block the commit. No, wait... Maybe these were the variables you referred to in (d) in the patch description? OK, that's a good enough reason for renaming them, I would suppose.
          Hide
          Dag H. Wanvik added a comment -

          Rick; the C_NodeNames class is gone. Most of the C_NodeTypes constants are useless but in some cases they are used to differentiate logical node classes mapped onto one physical node class. That usage could be refactored into enums defined by that physical class, and would be good to do in a next step.

          Knut, you are right, the renamings were done to avoid the shadowing of class members. They could be removed I guess; shadowing isn't necessarily bad, but its probably a good idea to avoid them; that enables us to keep the IDE warnings enabled and so detect blunders due to new errors.

          I also found a way to enable warnings to detect now useless casts (-Xlint:cast) so I'll upload a revised version 2 of the patch shortly.

          Note: since classes in impl/sql/compilation do no always get compiled as part of the build.xml that logically is supposed to build those classes (they get sucked in for compilation in the iapi parts of the code I believe), I had to enable warnings in other build.xml files to see the warnings and even then I didn't see them all, so in the end I resorted to compile the classes by hand to get all the warnings More work to do there...

          Show
          Dag H. Wanvik added a comment - Rick; the C_NodeNames class is gone. Most of the C_NodeTypes constants are useless but in some cases they are used to differentiate logical node classes mapped onto one physical node class. That usage could be refactored into enums defined by that physical class, and would be good to do in a next step. Knut, you are right, the renamings were done to avoid the shadowing of class members. They could be removed I guess; shadowing isn't necessarily bad, but its probably a good idea to avoid them; that enables us to keep the IDE warnings enabled and so detect blunders due to new errors. I also found a way to enable warnings to detect now useless casts (-Xlint:cast) so I'll upload a revised version 2 of the patch shortly. Note: since classes in impl/sql/compilation do no always get compiled as part of the build.xml that logically is supposed to build those classes (they get sucked in for compilation in the iapi parts of the code I believe), I had to enable warnings in other build.xml files to see the warnings and even then I didn't see them all, so in the end I resorted to compile the classes by hand to get all the warnings More work to do there...
          Hide
          Dag H. Wanvik added a comment -

          Uploading version 2 with more unnecessary casts removed; plus some raw types converted to generics versions; regressions passed.

          Show
          Dag H. Wanvik added a comment - Uploading version 2 with more unnecessary casts removed; plus some raw types converted to generics versions; regressions passed.
          Hide
          Dag H. Wanvik added a comment -

          Attaching version 3 of the patch which adds some trivial merge changes to XMLOptTrace.java and TableName.java.

          Show
          Dag H. Wanvik added a comment - Attaching version 3 of the patch which adds some trivial merge changes to XMLOptTrace.java and TableName.java.
          Hide
          Dag H. Wanvik added a comment -

          Committed version 3 to trunk as 1495305.

          Show
          Dag H. Wanvik added a comment - Committed version 3 to trunk as 1495305.
          Hide
          Dag H. Wanvik added a comment -

          Uploading a little patch which fixes up some comments that were accidentally changed by the first patch; committed as svn 1495317.

          Show
          Dag H. Wanvik added a comment - Uploading a little patch which fixes up some comments that were accidentally changed by the first patch; committed as svn 1495317.
          Hide
          ASF subversion and git services added a comment -

          Commit 1495361 from Knut Anders Hatlen
          [ https://svn.apache.org/r1495361 ]

          DERBY-673: Fixed javadoc warnings.

          Show
          ASF subversion and git services added a comment - Commit 1495361 from Knut Anders Hatlen [ https://svn.apache.org/r1495361 ] DERBY-673 : Fixed javadoc warnings.
          Hide
          Dag H. Wanvik added a comment - - edited

          Attaching derby-673-typesafe-lists-1, which introduces generics to the lists based on QueryTreeNodeVector. I also let the latter implement the Iterable interface, which opens up for using Java 6 "foreach" syntax in many cases. The patch makes use of this too. Together, these changes enables many casts to be eliminated and code clarification in the compiler implementation.

          It also removes most -Xlint warnings from impl/sql/compile classes, so it should be ready to run with full lint. The one remaining I wasn't sure how to handle, but I believe
          it could be suppressed:

          ParseException.java:33: warning: [serial] serializable class ParseException has no definition of serialVersionUID
          class ParseException extends Exception {

          Diffstat summary;

          63 files changed, 854 insertions, 1236 deletions

          Regressions passed, ready for review.

          Show
          Dag H. Wanvik added a comment - - edited Attaching derby-673-typesafe-lists-1, which introduces generics to the lists based on QueryTreeNodeVector. I also let the latter implement the Iterable interface, which opens up for using Java 6 "foreach" syntax in many cases. The patch makes use of this too. Together, these changes enables many casts to be eliminated and code clarification in the compiler implementation. It also removes most -Xlint warnings from impl/sql/compile classes, so it should be ready to run with full lint. The one remaining I wasn't sure how to handle, but I believe it could be suppressed: ParseException.java:33: warning: [serial] serializable class ParseException has no definition of serialVersionUID class ParseException extends Exception { Diffstat summary; 63 files changed, 854 insertions , 1236 deletions Regressions passed, ready for review.
          Hide
          Knut Anders Hatlen added a comment -

          The patch looks like a good improvement. Some minor comments:

          • The first occurrence of "is" should be removed from the javadoc for CompilerContextImpl's constructor .
          • In FromList.addFromTable(), why did the call to addElement(fromTable) have to be qualified with "super"?
          • Perhaps add braces around the body of the first loop in FromList.returnsAtMostSingleRow().
          Show
          Knut Anders Hatlen added a comment - The patch looks like a good improvement. Some minor comments: The first occurrence of "is" should be removed from the javadoc for CompilerContextImpl's constructor . In FromList.addFromTable(), why did the call to addElement(fromTable) have to be qualified with "super"? Perhaps add braces around the body of the first loop in FromList.returnsAtMostSingleRow().
          Hide
          ASF subversion and git services added a comment -

          Commit 1497230 from Dag H. Wanvik
          [ https://svn.apache.org/r1497230 ]

          DERBY-673: Get rid of the NodeFactory

          Remove an erroneously re-introduced "public" keyword from method
          getParameterTypes. This removes a FindBugs warning about "exposing
          internal representation by returning reference to mutable object".

          Show
          ASF subversion and git services added a comment - Commit 1497230 from Dag H. Wanvik [ https://svn.apache.org/r1497230 ] DERBY-673 : Get rid of the NodeFactory Remove an erroneously re-introduced "public" keyword from method getParameterTypes. This removes a FindBugs warning about "exposing internal representation by returning reference to mutable object".
          Hide
          Dag H. Wanvik added a comment -

          Thanks, Knut. Uploading version derby-673-typesafe-lists-2 which addresses your comments. You are right, there was no reason to prefix with super there.

          Show
          Dag H. Wanvik added a comment - Thanks, Knut. Uploading version derby-673-typesafe-lists-2 which addresses your comments. You are right, there was no reason to prefix with super there.
          Hide
          ASF subversion and git services added a comment -

          Commit 1497644 from Dag H. Wanvik
          [ https://svn.apache.org/r1497644 ]

          DERBY-673: Get rid of the NodeFactory

          Patch derby-673-typesafe-lists-2, which introduces generics to the
          lists based on QueryTreeNodeVector. I also let the latter implement
          the Iterable interface, which opens up for using Java 6 "foreach"
          syntax in many cases. The patch makes use of this. Together, these
          changes enables many casts to be eliminated and code clarification in
          the compiler implementation.

          It also removes most -Xlint warnings from impl/sql/compile classes, so
          it should be ready to run with full lint.

          Diffstat summary:
          63 files changed, 854 insertions, 1236 deletions

          Show
          ASF subversion and git services added a comment - Commit 1497644 from Dag H. Wanvik [ https://svn.apache.org/r1497644 ] DERBY-673 : Get rid of the NodeFactory Patch derby-673-typesafe-lists-2, which introduces generics to the lists based on QueryTreeNodeVector. I also let the latter implement the Iterable interface, which opens up for using Java 6 "foreach" syntax in many cases. The patch makes use of this. Together, these changes enables many casts to be eliminated and code clarification in the compiler implementation. It also removes most -Xlint warnings from impl/sql/compile classes, so it should be ready to run with full lint. Diffstat summary: 63 files changed, 854 insertions , 1236 deletions
          Hide
          ASF subversion and git services added a comment -

          Commit 1497742 from Dag H. Wanvik
          [ https://svn.apache.org/r1497742 ]

          DERBY-673: Get rid of the NodeFactory

          Followup fix to patch derby-673-typesafe-lists-2. The patch introduced
          an issue causing ConcurrentModificationException. Roll back that change.

          Show
          ASF subversion and git services added a comment - Commit 1497742 from Dag H. Wanvik [ https://svn.apache.org/r1497742 ] DERBY-673 : Get rid of the NodeFactory Followup fix to patch derby-673-typesafe-lists-2. The patch introduced an issue causing ConcurrentModificationException. Roll back that change.
          Hide
          ASF subversion and git services added a comment -

          Commit 1497767 from Dag H. Wanvik
          [ https://svn.apache.org/r1497767 ]

          DERBY-673: Get rid of the NodeFactory

          Followup fix to patch derby-673-typesafe-lists-2. The patch introduced
          a bug in FromSubquery: wrong loop upper bound. Roll back that change.

          Show
          ASF subversion and git services added a comment - Commit 1497767 from Dag H. Wanvik [ https://svn.apache.org/r1497767 ] DERBY-673 : Get rid of the NodeFactory Followup fix to patch derby-673-typesafe-lists-2. The patch introduced a bug in FromSubquery: wrong loop upper bound. Roll back that change.
          Hide
          Dag H. Wanvik added a comment - - edited

          Attaching another patch, derby-673-more-typesafe-6. This cleans up most compiler warnings in the classes touched, including unnecessary casts due to changes introduced by this use, but also others.
          Some warnings were silenced using the @SuppressWarning tag (mostly use of old classes like Vector, Hashtable), in other cases the underlying problem was fixed by changing the code as required.

          It also introduces an explicit TriggerDescriptorList class extending a generified GenericDescriptorList<TriggerDescriptor>.

          I left warnings about missing serialVersionUID in serializable classes since I am not sure whether they can be safely suppressed; it would need further analysis.

          Regressions passed.

          Show
          Dag H. Wanvik added a comment - - edited Attaching another patch, derby-673-more-typesafe-6. This cleans up most compiler warnings in the classes touched, including unnecessary casts due to changes introduced by this use, but also others. Some warnings were silenced using the @SuppressWarning tag (mostly use of old classes like Vector, Hashtable), in other cases the underlying problem was fixed by changing the code as required. It also introduces an explicit TriggerDescriptorList class extending a generified GenericDescriptorList<TriggerDescriptor>. I left warnings about missing serialVersionUID in serializable classes since I am not sure whether they can be safely suppressed; it would need further analysis. Regressions passed.
          Hide
          ASF subversion and git services added a comment -

          Commit 1502965 from Dag H. Wanvik in branch 'code/trunk'
          [ https://svn.apache.org/r1502965 ]

          DERBY-673: Get rid of the NodeFactory

          Patch, derby-673-more-typesafe-6. This cleans up most compiler
          warnings in the classes touched, including unnecessary casts due to
          changes introduced by this use, but also others. Some warnings were
          silenced using the @SuppressWarning tag (mostly use of old classes
          like Vector, Hashtable), in other cases the underlying problem was
          fixed by changing the code as required.

          It also introduces an explicit TriggerDescriptorList class extending a
          generified GenericDescriptorList<TriggerDescriptor>.

          Some local variables shadowing others are renamed and in in case
          the two variables were merged into one.

          I left in place warnings about missing serialVersionUID in
          serializable classes since I am not sure whether they can be safely
          suppressed; it would need further analysis.

          Show
          ASF subversion and git services added a comment - Commit 1502965 from Dag H. Wanvik in branch 'code/trunk' [ https://svn.apache.org/r1502965 ] DERBY-673 : Get rid of the NodeFactory Patch, derby-673-more-typesafe-6. This cleans up most compiler warnings in the classes touched, including unnecessary casts due to changes introduced by this use, but also others. Some warnings were silenced using the @SuppressWarning tag (mostly use of old classes like Vector, Hashtable), in other cases the underlying problem was fixed by changing the code as required. It also introduces an explicit TriggerDescriptorList class extending a generified GenericDescriptorList<TriggerDescriptor>. Some local variables shadowing others are renamed and in in case the two variables were merged into one. I left in place warnings about missing serialVersionUID in serializable classes since I am not sure whether they can be safely suppressed; it would need further analysis.
          Hide
          Dag H. Wanvik added a comment -

          Uploading two alternate patches, derby-673-nuke-ctypes-without-enum and derby-673-nuke-ctypes-enum. Both remove the usage of the global
          nodeType integer (see QueryTreeNode) defined by C_NodeType.java. After the removal of the node factory, this global quantity whose use mimicked "instanceof" can be discarded, mostly. For some node which represent several logical node types, e.g. BinaryArithmeticOperatorNode which can represent for example both + and minus, a new quantity, "kind" is introduced to differentiate between the logical types of nodes.

          The one patch implements "kind" using enums, the other using static final ints. I first implemented with enums, but it turns out it adds a bit more weight than one would like. Sizes of the insane jars compiled using java 1.8 EA b94:

          derby.jar [before the patches]: 3017406
          derby.jar [iwith enum]: 3024918
          derby.jar [with static ints]: 3007835

          so I guess we'd better use the leaner version (it improves on footprint rather than increases it). Comments appreciated.
          Regressions ran OK.

          Show
          Dag H. Wanvik added a comment - Uploading two alternate patches, derby-673-nuke-ctypes-without-enum and derby-673-nuke-ctypes-enum. Both remove the usage of the global nodeType integer (see QueryTreeNode) defined by C_NodeType.java. After the removal of the node factory, this global quantity whose use mimicked "instanceof" can be discarded, mostly. For some node which represent several logical node types, e.g. BinaryArithmeticOperatorNode which can represent for example both + and minus, a new quantity, "kind" is introduced to differentiate between the logical types of nodes. The one patch implements "kind" using enums, the other using static final ints. I first implemented with enums, but it turns out it adds a bit more weight than one would like. Sizes of the insane jars compiled using java 1.8 EA b94: derby.jar [before the patches] : 3017406 derby.jar [iwith enum] : 3024918 derby.jar [with static ints] : 3007835 so I guess we'd better use the leaner version (it improves on footprint rather than increases it). Comments appreciated. Regressions ran OK.
          Hide
          Knut Anders Hatlen added a comment -

          Thanks for the patches, Dag.

          I think they both look like improvements. The enum variant doesn't seem to use any of the more advanced enum features, and I don't think the non-enum variant is significantly less readable or concise. So in this case I agree that the smaller footprint of the non-enum patch sounds more attractive.

          I would prefer that the isSameNodeType() method was overridden in the subclasses that need special handling, instead of stuffing the base method in ValueNode with knowledge about the specialized nodes.

          Otherwise, the changes look good to me.

          Show
          Knut Anders Hatlen added a comment - Thanks for the patches, Dag. I think they both look like improvements. The enum variant doesn't seem to use any of the more advanced enum features, and I don't think the non-enum variant is significantly less readable or concise. So in this case I agree that the smaller footprint of the non-enum patch sounds more attractive. I would prefer that the isSameNodeType() method was overridden in the subclasses that need special handling, instead of stuffing the base method in ValueNode with knowledge about the specialized nodes. Otherwise, the changes look good to me.
          Hide
          Dag H. Wanvik added a comment - - edited

          Thanks, Knut. I agree on the matter of isSameNodeType; I refrained from doing it to reduced # of methods, but I'll have a go and see if it comes cheaply enough.

          Show
          Dag H. Wanvik added a comment - - edited Thanks, Knut. I agree on the matter of isSameNodeType; I refrained from doing it to reduced # of methods, but I'll have a go and see if it comes cheaply enough.
          Hide
          Dag H. Wanvik added a comment -

          Attaching version 2 of the "nuke-ctypes-without-enums" patch. This refactors the isSameNodeType to move to logic of testing kinds into the relevant node classes.

          I introduced a default method ValueNode#isSameNodeKind which the classes that implement several kinds need to override to add the extra kind check. This is then called from the isEquivalent overrides as needed.

          Regressions passed.

          Show
          Dag H. Wanvik added a comment - Attaching version 2 of the "nuke-ctypes-without-enums" patch. This refactors the isSameNodeType to move to logic of testing kinds into the relevant node classes. I introduced a default method ValueNode#isSameNodeKind which the classes that implement several kinds need to override to add the extra kind check. This is then called from the isEquivalent overrides as needed. Regressions passed.
          Hide
          Knut Anders Hatlen added a comment -

          I see that not all classes that define "kind" override isSameNodeKind(). Is that intentional? (SpecialFunctionNode, ModifyColumnNode, TernaryOperatorNode.)

          Does isSameNodeKind() need to be declared in ValueNode? As far as I can see, the default implementation will never be called because of the type checks that all callers of isSameNodeKind() do first. And it looks like all callers of isSameNodeKind() cast the ValueNode to a more specific type, so maybe the isSameNodeKind() methods could be private instead?

          Show
          Knut Anders Hatlen added a comment - I see that not all classes that define "kind" override isSameNodeKind(). Is that intentional? (SpecialFunctionNode, ModifyColumnNode, TernaryOperatorNode.) Does isSameNodeKind() need to be declared in ValueNode? As far as I can see, the default implementation will never be called because of the type checks that all callers of isSameNodeKind() do first. And it looks like all callers of isSameNodeKind() cast the ValueNode to a more specific type, so maybe the isSameNodeKind() methods could be private instead?
          Hide
          Dag H. Wanvik added a comment -

          Thanks, Knut. Yes, it is intentional, if slightly confusing. isSameNodeKind is really only a helper for isEquivalent. For SpecialFunctioNode and TernaryOperatorNode isEquivalent tests on kind directly without a call to isSameNodeKind (because they have no derived classes). Perhaps it might look cleaner if we let them define isSameNodeKind also?

          ModifyColumnNode is not a ValueNode, so isEquivalent is not present (and hence no isSameNodeKind is needed either).

          There are some cases that use the default in ValueNode, e.g. BooleanConstantNode, UntypedNullConstantNode, other ConstantNode classes do have kinds. But I guess I could move it down to ConstantNode for that case. I still think it's nice to tie all the kind usages together, though, so if not via a default method in ValueNode, I guess I could make them implement a NodeKinds interface?

          Show
          Dag H. Wanvik added a comment - Thanks, Knut. Yes, it is intentional, if slightly confusing. isSameNodeKind is really only a helper for isEquivalent. For SpecialFunctioNode and TernaryOperatorNode isEquivalent tests on kind directly without a call to isSameNodeKind (because they have no derived classes). Perhaps it might look cleaner if we let them define isSameNodeKind also? ModifyColumnNode is not a ValueNode, so isEquivalent is not present (and hence no isSameNodeKind is needed either). There are some cases that use the default in ValueNode, e.g. BooleanConstantNode, UntypedNullConstantNode, other ConstantNode classes do have kinds. But I guess I could move it down to ConstantNode for that case. I still think it's nice to tie all the kind usages together, though, so if not via a default method in ValueNode, I guess I could make them implement a NodeKinds interface?
          Hide
          Knut Anders Hatlen added a comment -

          Maybe something like this would work, and make it clearer which nodes need to implement the isSameNodeKind() method:

          1) Keep ValueNode.isSameNodeType() as it is today (it seems like it's still useful as a shorthand, as the isEquivalent() methods in the patch perform the same check manually)

          2) Make UnaryOperatorNode.isEquivalent() and BinaryOperatorNode.isEquivalent() check other.kind == this.kind directly, since they cast the received object to the specific type anyway

          3) Introduce an isSameNodeKind() method in ConstantNode (and sub-classes as necessary) for use in ConstantNode.isEquivalent()

          I think that if isSameNodeKind() is defined in ValueNode, it is more likely that it will be used as a general utility method, so it should be overridden in all nodes that use kinds, for consistency. But it sounds unnecessary to implement this method in classes where we don't expect it to be called, so I'd prefer that we keep it local to the sub-tree (ConstantNode) that seems to need the ability to override the behaviour.

          If we give the method a prominent place, such as ValueNode or its own NodeKinds interface, I think we should give it better defined semantics. Currently, the base method has two possible outcomes (true or false), whereas the overrides have four (true, false, NullPointerException or ClassCastException).

          Show
          Knut Anders Hatlen added a comment - Maybe something like this would work, and make it clearer which nodes need to implement the isSameNodeKind() method: 1) Keep ValueNode.isSameNodeType() as it is today (it seems like it's still useful as a shorthand, as the isEquivalent() methods in the patch perform the same check manually) 2) Make UnaryOperatorNode.isEquivalent() and BinaryOperatorNode.isEquivalent() check other.kind == this.kind directly, since they cast the received object to the specific type anyway 3) Introduce an isSameNodeKind() method in ConstantNode (and sub-classes as necessary) for use in ConstantNode.isEquivalent() I think that if isSameNodeKind() is defined in ValueNode, it is more likely that it will be used as a general utility method, so it should be overridden in all nodes that use kinds, for consistency. But it sounds unnecessary to implement this method in classes where we don't expect it to be called, so I'd prefer that we keep it local to the sub-tree (ConstantNode) that seems to need the ability to override the behaviour. If we give the method a prominent place, such as ValueNode or its own NodeKinds interface, I think we should give it better defined semantics. Currently, the base method has two possible outcomes (true or false), whereas the overrides have four (true, false, NullPointerException or ClassCastException).
          Hide
          Dag H. Wanvik added a comment -

          Uploading patch #3. Discussed the approach with Knut on a back channel, and we agreed to keep the ValueNode#isSameNodeKind but give it more tasks: check that the other node isnt null and check that the two objects have the same class. Overrides will then first call super.isSameNodeKind. isEquivalent will all call isSameNodeKind only before proceeding to cast and check the kind fields.

          Tests ran ok.

          Show
          Dag H. Wanvik added a comment - Uploading patch #3. Discussed the approach with Knut on a back channel, and we agreed to keep the ValueNode#isSameNodeKind but give it more tasks: check that the other node isnt null and check that the two objects have the same class. Overrides will then first call super.isSameNodeKind. isEquivalent will all call isSameNodeKind only before proceeding to cast and check the kind fields. Tests ran ok.
          Hide
          Knut Anders Hatlen added a comment -

          Thanks, Dag. Patch #3 looks good to me. +1

          Show
          Knut Anders Hatlen added a comment - Thanks, Dag. Patch #3 looks good to me. +1
          Hide
          ASF subversion and git services added a comment -

          Commit 1506827 from Dag H. Wanvik in branch 'code/trunk'
          [ https://svn.apache.org/r1506827 ]

          DERBY-673: Get rid of the NodeFactory

          Patch derby-673-nuke-ctypes-without-enum-3.

          This patch removes the usage of the global nodeType integer (see
          QueryTreeNode) defined by C_NodeType.java. After the removal of the
          node factory, this global quantity whose use mimicked "instanceof" can
          be discarded, mostly. For some node which represent several logical
          node types, e.g. BinaryArithmeticOperatorNode which can represent for
          example both + and minus, a new quantity, "kind" is introduced to
          differentiate between the logical types of nodes.

          We introduce a default method ValueNode#isSameNodeKind which the
          classes that implement several kinds need to override to add the extra
          kind check. This is then called from the isEquivalent overrides as
          needed.

          JavaDoc for isSameNodeKind:

          /**

          • Some node classes represent several logical node types (to reduce
          • footprint), which we call kinds.
          • This means that implementations of {@link #isEquivalent()}
          • cannot always just use instanceof to check if the other node
          • represents the same kind. Hence this method needs to be
          • implemented by all node classes that represent several kinds.
          • It is only called from implementations of isEquivalent.
            *
          • @param other The other value node whose kind we want to compare with.
          • @return true if this and o represent the same
          • logical node type, i.e. kind.
            */
          Show
          ASF subversion and git services added a comment - Commit 1506827 from Dag H. Wanvik in branch 'code/trunk' [ https://svn.apache.org/r1506827 ] DERBY-673 : Get rid of the NodeFactory Patch derby-673-nuke-ctypes-without-enum-3. This patch removes the usage of the global nodeType integer (see QueryTreeNode) defined by C_NodeType.java. After the removal of the node factory, this global quantity whose use mimicked "instanceof" can be discarded, mostly. For some node which represent several logical node types, e.g. BinaryArithmeticOperatorNode which can represent for example both + and minus, a new quantity, "kind" is introduced to differentiate between the logical types of nodes. We introduce a default method ValueNode#isSameNodeKind which the classes that implement several kinds need to override to add the extra kind check. This is then called from the isEquivalent overrides as needed. JavaDoc for isSameNodeKind: /** Some node classes represent several logical node types (to reduce footprint), which we call kinds. This means that implementations of {@link #isEquivalent()} cannot always just use instanceof to check if the other node represents the same kind. Hence this method needs to be implemented by all node classes that represent several kinds. It is only called from implementations of isEquivalent. * @param other The other value node whose kind we want to compare with. @return true if this and o represent the same logical node type, i.e. kind. */
          Hide
          ASF subversion and git services added a comment -

          Commit 1507747 from Dag H. Wanvik in branch 'code/trunk'
          [ https://svn.apache.org/r1507747 ]

          DERBY-673: Get rid of the NodeFactory

          Javadoc fix.

          Show
          ASF subversion and git services added a comment - Commit 1507747 from Dag H. Wanvik in branch 'code/trunk' [ https://svn.apache.org/r1507747 ] DERBY-673 : Get rid of the NodeFactory Javadoc fix.

            People

            • Assignee:
              Dag H. Wanvik
              Reporter:
              Rick Hillegas
            • Votes:
              1 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development