Accumulo
  1. Accumulo
  2. ACCUMULO-1730

ColumnVisibility parse tree nodes do not have correct location offsets for AND and OR nodes

    Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Trivial Trivial
    • Resolution: Fixed
    • Affects Version/s: 1.4.0, 1.4.1, 1.4.2, 1.4.3, 1.4.4, 1.5.0
    • Fix Version/s: 1.6.0
    • Component/s: client
    • Labels:
      None

      Description

      Trying to do some transformations on visibility strings and running into issues working with the parse tree:

      Clojure 1.5.1
      user=> (import [org.apache.accumulo.core.security ColumnVisibility])
      org.apache.accumulo.core.security.ColumnVisibility
      user=> (def vis (ColumnVisibility. "(W)|(U|V)"))
      #'user/vis
      user=> (.getTermStart (first (.getChildren (.getParseTree vis))))
      1
      user=> (.getTermEnd (first (.getChildren (.getParseTree vis))))
      2
      user=> (.getTermStart (second (.getChildren (.getParseTree vis))))
      0
      user=> (.getTermEnd (second (.getChildren (.getParseTree vis))))
      8

      Shouldn't those last two be 5 and 8?

        Activity

        Hide
        John Vines added a comment -

        I believe the fourth would be 5 and 8 as it's an inorder tree traversal

        Show
        John Vines added a comment - I believe the fourth would be 5 and 8 as it's an inorder tree traversal
        Hide
        John Stoneham added a comment -

        It builds out the tree. There's only two children of the top node.

        Show
        John Stoneham added a comment - It builds out the tree. There's only two children of the top node.
        Hide
        Christopher Tubbs added a comment -

        Is this really a bug? If so, can you provide a unit test that highlights the bug?
        I suspect it's not so much a bug, as it is non-intuitive API semantics for internal parsing code, or a non-optimal implementation.

        Show
        Christopher Tubbs added a comment - Is this really a bug? If so, can you provide a unit test that highlights the bug? I suspect it's not so much a bug, as it is non-intuitive API semantics for internal parsing code, or a non-optimal implementation.
        Hide
        Christopher Tubbs added a comment -

        Also, I recommend Java pseudo-code to report bugs for a Java application You can be sure we know Java... but it's not necessarily safe to say we know Clojure (though, the above is reasonably easy to figure out).

        Show
        Christopher Tubbs added a comment - Also, I recommend Java pseudo-code to report bugs for a Java application You can be sure we know Java... but it's not necessarily safe to say we know Clojure (though, the above is reasonably easy to figure out).
        Hide
        Eric Newton added a comment -

        This code

        import org.apache.accumulo.core.security.ColumnVisibility;
        import org.apache.accumulo.core.security.ColumnVisibility.Node;
        
        
        public class Test {
          
          static void printTree(int indent, String v, Node node) {
            for (int i = 0; i < indent; i++)
              System.out.print(' ');
            System.out.println(node.getType() + " " + v.substring(node.getTermStart(), node.getTermEnd()) + " " + node.getTermStart() + " " + node.getTermEnd());
            for (Node child : node.getChildren()) {
              printTree(indent + 1, v, child);
            }
          }
          
          public static void main(String[] args) {
            String s = "(W)|(U|V)";
            ColumnVisibility v = new ColumnVisibility(s);
            System.out.println(s);
            printTree(0, s, v.getParseTree());
          }
          
        }
        

        prints this:

        (W)|(U|V)
        OR  0 0
         TERM W 1 2
         OR (W)|(U|V 0 8
          TERM U 5 6
          TERM V 7 8
        
        Show
        Eric Newton added a comment - This code import org.apache.accumulo.core.security.ColumnVisibility; import org.apache.accumulo.core.security.ColumnVisibility.Node; public class Test { static void printTree(int indent, String v, Node node) { for (int i = 0; i < indent; i++) System.out.print(' '); System.out.println(node.getType() + " " + v.substring(node.getTermStart(), node.getTermEnd()) + " " + node.getTermStart() + " " + node.getTermEnd()); for (Node child : node.getChildren()) { printTree(indent + 1, v, child); } } public static void main(String[] args) { String s = "(W)|(U|V)"; ColumnVisibility v = new ColumnVisibility(s); System.out.println(s); printTree(0, s, v.getParseTree()); } } prints this: (W)|(U|V) OR 0 0 TERM W 1 2 OR (W)|(U|V 0 8 TERM U 5 6 TERM V 7 8
        Hide
        ASF subversion and git services added a comment -

        Commit a25e3af8876f2118fb8370c3c4e3794ffc69989f in branch refs/heads/master from Eric Newton
        [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=a25e3af ]

        ACCUMULO-1730 set offsets for AND and OR parse nodes

        Show
        ASF subversion and git services added a comment - Commit a25e3af8876f2118fb8370c3c4e3794ffc69989f in branch refs/heads/master from Eric Newton [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=a25e3af ] ACCUMULO-1730 set offsets for AND and OR parse nodes
        Hide
        ASF subversion and git services added a comment -

        Commit c091b545a7f79efb3dc9b8b279cfb4465b476ecc in branch refs/heads/master from Eric Newton
        [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=c091b54 ]

        ACCUMULO-1730 set offsets for AND and OR parse nodes

        Show
        ASF subversion and git services added a comment - Commit c091b545a7f79efb3dc9b8b279cfb4465b476ecc in branch refs/heads/master from Eric Newton [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=c091b54 ] ACCUMULO-1730 set offsets for AND and OR parse nodes
        Hide
        John Stoneham added a comment -

        This mentions it was fixed in 1.4.5 and 1.5.1, but it appears that no commit was actually pushed to any of the backport branches.

        Appears that c091b54 was intended for 1.4.5, and a25e3af was intended for 1.5.1.

        I've got some further commits that fix the offsets for the AND and OR nodes in such a way as they are useful for substringing the actual expression. Will provide pull requests on GitHub.

        Show
        John Stoneham added a comment - This mentions it was fixed in 1.4.5 and 1.5.1, but it appears that no commit was actually pushed to any of the backport branches. Appears that c091b54 was intended for 1.4.5, and a25e3af was intended for 1.5.1. I've got some further commits that fix the offsets for the AND and OR nodes in such a way as they are useful for substringing the actual expression. Will provide pull requests on GitHub.
        Hide
        John Stoneham added a comment -

        Pull request for 1.4.5-SNAPSHOT at https://github.com/apache/accumulo/pull/1
        Pull request for 1.5.1-SNAPSHOT at https://github.com/apache/accumulo/pull/2

        Show
        John Stoneham added a comment - Pull request for 1.4.5-SNAPSHOT at https://github.com/apache/accumulo/pull/1 Pull request for 1.5.1-SNAPSHOT at https://github.com/apache/accumulo/pull/2
        Hide
        John Stoneham added a comment -

        And, sorry about the Clojure sample code. I already had it in the REPL so just pasted it.

        Show
        John Stoneham added a comment - And, sorry about the Clojure sample code. I already had it in the REPL so just pasted it.
        Hide
        Christopher Tubbs added a comment -

        Pull request 2 is insufficient as explained here: https://github.com/apache/accumulo/pull/2#issuecomment-25997371
        Additionally, I'm not sure these non-bugfix improvements to this important code are worth the risk of backporting. (See comment linked above).

        Show
        Christopher Tubbs added a comment - Pull request 2 is insufficient as explained here: https://github.com/apache/accumulo/pull/2#issuecomment-25997371 Additionally, I'm not sure these non-bugfix improvements to this important code are worth the risk of backporting. (See comment linked above).
        Hide
        Christopher Tubbs added a comment -

        Christopher Tubbs wrote on GitHub:

        Out of curiosity, why is this needed? I'm really hesitant to backport this change, since it's not really a bug, and because of the risks.

        John Stoneham wrote on GitHub:

        I was writing a MapReduce job to re-normalize the visibility strings in our database because the format we want to use has changed. This is easy enough on most visibility strings - parse them the old way, then re-write them out the new way. The issue I had was dealing with complex visibility strings with top-level ORs, such as (A&B)|(C&(D|E)). I wanted to use the parse tree to find the substrings of the top-level visibilities, since we can't safely just split on the | character. But the node offsets were wrong.

        I understand the risk of the backport, and making a call on whether or not to add it the 1.4 baseline based on that. I'm still not sure why it doesn't "count as a bug", though.

        (Since this is our issue tracker, I moved the conversation here.)

        I see. That's an interesting use case. I'd be reluctant to recommend relying on this code for that case. Originally, this code was written as a state machine. It was transformed to look more like a parse tree, to make it easier to fix some bugs at the time, but that cost us some efficiency. We addressed that loss by caching the results of visibility expression parsing in the visibility filter in the iterator stack with an LRUMap, but we've always considered this code to be "internal", and subject to change back to a much more efficient state machine or something else, as needed.

        The reason why it's technically an improvement rather than a bug, is because the behavior you're expecting is based on assumptions about the semantics of an inner class intended for internal use... assumptions that were incorrect. Until you (effectively) requested those narrower semantics by filing this JIRA issue, that class existed solely to evaluate visibility expressions, not to provide a parse tree for users. However, it can do both, and that's why Eric Newton added it as an improvement in 1.6.

        If you think this is a "must have", I can backport it to 1.4.5 and 1.5.1, but I'm unwilling to accept the maintenance costs/risks for doing so if it's a "might be nice". (Perhaps another commiter would be willing, though.) I did check your updated pull request, but I couldn't get it to merge cleanly. There might have been some other updates in the 1.4.5-SNAPSHOT and 1.5.1-SNAPSHOT branches that made the merge problematic.

        Show
        Christopher Tubbs added a comment - Christopher Tubbs wrote on GitHub : Out of curiosity, why is this needed? I'm really hesitant to backport this change, since it's not really a bug, and because of the risks. John Stoneham wrote on GitHub : I was writing a MapReduce job to re-normalize the visibility strings in our database because the format we want to use has changed. This is easy enough on most visibility strings - parse them the old way, then re-write them out the new way. The issue I had was dealing with complex visibility strings with top-level ORs, such as (A&B)|(C&(D|E)). I wanted to use the parse tree to find the substrings of the top-level visibilities, since we can't safely just split on the | character. But the node offsets were wrong. I understand the risk of the backport, and making a call on whether or not to add it the 1.4 baseline based on that. I'm still not sure why it doesn't "count as a bug", though. (Since this is our issue tracker, I moved the conversation here.) I see. That's an interesting use case. I'd be reluctant to recommend relying on this code for that case. Originally, this code was written as a state machine. It was transformed to look more like a parse tree, to make it easier to fix some bugs at the time, but that cost us some efficiency. We addressed that loss by caching the results of visibility expression parsing in the visibility filter in the iterator stack with an LRUMap, but we've always considered this code to be "internal", and subject to change back to a much more efficient state machine or something else, as needed. The reason why it's technically an improvement rather than a bug, is because the behavior you're expecting is based on assumptions about the semantics of an inner class intended for internal use... assumptions that were incorrect. Until you (effectively) requested those narrower semantics by filing this JIRA issue, that class existed solely to evaluate visibility expressions, not to provide a parse tree for users. However, it can do both, and that's why Eric Newton added it as an improvement in 1.6. If you think this is a "must have", I can backport it to 1.4.5 and 1.5.1, but I'm unwilling to accept the maintenance costs/risks for doing so if it's a "might be nice". (Perhaps another commiter would be willing, though.) I did check your updated pull request, but I couldn't get it to merge cleanly. There might have been some other updates in the 1.4.5-SNAPSHOT and 1.5.1-SNAPSHOT branches that made the merge problematic.
        Hide
        John Stoneham added a comment -

        Makes sense. I though of the ParseTree as a public API, but sounds like it wasn't originally intended to be that way.

        I don't need this immediately. I'll close the pull requests and open a new one against master. Thanks.

        Show
        John Stoneham added a comment - Makes sense. I though of the ParseTree as a public API, but sounds like it wasn't originally intended to be that way. I don't need this immediately. I'll close the pull requests and open a new one against master. Thanks.
        Hide
        Christopher Tubbs added a comment -

        I thought it was fixed in master.

        Show
        Christopher Tubbs added a comment - I thought it was fixed in master.
        Hide
        John Stoneham added a comment -

        Eric's commit is in master. It addresses the offsets for the terms themselves. My additional commits address the nodes representing the logical operators.

        Show
        John Stoneham added a comment - Eric's commit is in master. It addresses the offsets for the terms themselves. My additional commits address the nodes representing the logical operators.
        Hide
        Christopher Tubbs added a comment -

        Ah.

        Show
        Christopher Tubbs added a comment - Ah.
        Hide
        ASF subversion and git services added a comment -

        Commit 9ceb320e1742a7b0fbb3bd753d28453e9313f517 in branch refs/heads/master from John Stoneham
        [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=9ceb320 ]

        ACCUMULO-1730 Rename variable for clarity.

        Show
        ASF subversion and git services added a comment - Commit 9ceb320e1742a7b0fbb3bd753d28453e9313f517 in branch refs/heads/master from John Stoneham [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=9ceb320 ] ACCUMULO-1730 Rename variable for clarity.
        Hide
        ASF subversion and git services added a comment -

        Commit 33118cb721b9598af5cbe26cd4048ecb7189f0a0 in branch refs/heads/master from John Stoneham
        [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=33118cb ]

        ACCUMULO-1730 Correct ParseTree start/end markers for logical connector nodes.

        Show
        ASF subversion and git services added a comment - Commit 33118cb721b9598af5cbe26cd4048ecb7189f0a0 in branch refs/heads/master from John Stoneham [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=33118cb ] ACCUMULO-1730 Correct ParseTree start/end markers for logical connector nodes.
        Hide
        ASF subversion and git services added a comment -

        Commit 876c5ce5ee9a039d56c8e4c7ca04d0401a47cdac in branch refs/heads/master from John Stoneham
        [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=876c5ce ]

        ACCUMULO-1730 Expose ColumnVisibility normalize/stringify methods.

        Show
        ASF subversion and git services added a comment - Commit 876c5ce5ee9a039d56c8e4c7ca04d0401a47cdac in branch refs/heads/master from John Stoneham [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=876c5ce ] ACCUMULO-1730 Expose ColumnVisibility normalize/stringify methods.
        Hide
        ASF subversion and git services added a comment -

        Commit 611463972a52feab96156012c4a9d75c8df3d882 in branch refs/heads/master from Corey J. Nolet
        [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=6114639 ]

        ACCUMULO-1751 Fixing off by one error.
        ACCUMULO-1730 Missing ) in test

        Show
        ASF subversion and git services added a comment - Commit 611463972a52feab96156012c4a9d75c8df3d882 in branch refs/heads/master from Corey J. Nolet [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=6114639 ] ACCUMULO-1751 Fixing off by one error. ACCUMULO-1730 Missing ) in test
        Hide
        Corey J. Nolet added a comment -

        It looks like there's some test failures here.

        Show
        Corey J. Nolet added a comment - It looks like there's some test failures here.
        Hide
        John Stoneham added a comment -

        Which tests? I compiled and ran them.

        Show
        John Stoneham added a comment - Which tests? I compiled and ran them.
        Show
        Corey J. Nolet added a comment - https://builds.apache.org/job/Accumulo-Master-Hadoop-2/480/
        Hide
        John Stoneham added a comment -

        I hereby assign copyright of the code submitted for this issue to the ASF to be licensed as they see fit.

        Show
        John Stoneham added a comment - I hereby assign copyright of the code submitted for this issue to the ASF to be licensed as they see fit.
        Hide
        John Stoneham added a comment -

        Patch for lines lost in broken cherry-pick and push. Fixes tests and behavior.

        Show
        John Stoneham added a comment - Patch for lines lost in broken cherry-pick and push. Fixes tests and behavior.
        Hide
        John Stoneham added a comment -

        Sorry about that. I ran the tests after the cherry-pick but must have done something out of order or pushed the wrong commits. Attached patch 0001 fixes tests and behavior.

        Show
        John Stoneham added a comment - Sorry about that. I ran the tests after the cherry-pick but must have done something out of order or pushed the wrong commits. Attached patch 0001 fixes tests and behavior.
        Hide
        ASF subversion and git services added a comment -

        Commit 872b6db372bba953e3c435fcfcb1c64c0713ff49 in branch refs/heads/master from John Stoneham
        [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=872b6db ]

        ACCUMULO-1730 Reinsert changes inadvertently lost in cherry-pick

        Show
        ASF subversion and git services added a comment - Commit 872b6db372bba953e3c435fcfcb1c64c0713ff49 in branch refs/heads/master from John Stoneham [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=872b6db ] ACCUMULO-1730 Reinsert changes inadvertently lost in cherry-pick
        Hide
        Eric Newton added a comment -

        I don't know, either. I absolutely ran the test after doing the pull.

        Show
        Eric Newton added a comment - I don't know, either. I absolutely ran the test after doing the pull.
        Hide
        Eric Newton added a comment -

        For some reason the patch does not apply clean:

        $ git apply ~/Downloads/0001-ACCUMULO-1730-Reinsert-changes-inadvertently-lost-in.patch 
        error: patch failed: core/src/main/java/org/apache/accumulo/core/security/ColumnVisibility.java:302
        error: core/src/main/java/org/apache/accumulo/core/security/ColumnVisibility.java: patch does not apply
        error: patch failed: core/src/test/java/org/apache/accumulo/core/security/ColumnVisibilityTest.java:152
        error: core/src/test/java/org/apache/accumulo/core/security/ColumnVisibilityTest.java: patch does not apply
        
        Show
        Eric Newton added a comment - For some reason the patch does not apply clean: $ git apply ~/Downloads/0001-ACCUMULO-1730-Reinsert-changes-inadvertently-lost-in.patch error: patch failed: core/src/main/java/org/apache/accumulo/core/security/ColumnVisibility.java:302 error: core/src/main/java/org/apache/accumulo/core/security/ColumnVisibility.java: patch does not apply error: patch failed: core/src/test/java/org/apache/accumulo/core/security/ColumnVisibilityTest.java:152 error: core/src/test/java/org/apache/accumulo/core/security/ColumnVisibilityTest.java: patch does not apply
        Hide
        John Stoneham added a comment -

        Corey Nolet was having the same problem for a while. Then it worked. It's in now.

        Show
        John Stoneham added a comment - Corey Nolet was having the same problem for a while. Then it worked. It's in now.
        Hide
        John Stoneham added a comment -

        Perhaps you were trying to reapply on top of his apply?

        Show
        John Stoneham added a comment - Perhaps you were trying to reapply on top of his apply?
        Hide
        Eric Newton added a comment -

        John Stoneham is this ticket done?

        Show
        Eric Newton added a comment - John Stoneham is this ticket done?
        Hide
        John Stoneham added a comment -

        Fixed on master.

        Show
        John Stoneham added a comment - Fixed on master.

          People

          • Assignee:
            Eric Newton
            Reporter:
            John Stoneham
          • Votes:
            0 Vote for this issue
            Watchers:
            6 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development