Uploaded image for project: 'Calcite'
  1. Calcite
  2. CALCITE-688

splitCondition does not behave correctly when one side of the condition references columns from different inputs

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.3.0-incubating
    • Component/s: None
    • Labels:
      None

      Issue Links

        Activity

        Hide
        jcamachorodriguez Jesus Camacho Rodriguez added a comment -

        Julian Hyde, I have created a pull request. Could you take a look? Thanks

        Show
        jcamachorodriguez Jesus Camacho Rodriguez added a comment - Julian Hyde , I have created a pull request. Could you take a look? Thanks
        Hide
        julianhyde Julian Hyde added a comment -

        Looking at it now.

        Show
        julianhyde Julian Hyde added a comment - Looking at it now.
        Hide
        julianhyde Julian Hyde added a comment -

        I presume that you will want this in Hive 1.2. I don't think we can patch Calcite in time for that. So for now let's focus on reviewing the patch. We can get it checked in with a test case at our leisure. I presume that you can create a copy of the rule in Hive that uses a patched version of splitCondition.

        Do you have an example query that causes this?

        The logic would be easier to understand and maintain if you could express it in terms of set-theory operations on bit sets, rather than the low-level nextSetBit operations. You could create a bit set that represents all of the fields of a particular input, using ImmutableBitSet.range(int, int), then intersect it with the bit sets of the conditions.

        I ran the full test suite including integration tests, and the patch does not break anything. Therefore I think it is good for your purposes. But before I accept this patch I will need a test case, and refactoring in terms of set operations.

        Show
        julianhyde Julian Hyde added a comment - I presume that you will want this in Hive 1.2. I don't think we can patch Calcite in time for that. So for now let's focus on reviewing the patch. We can get it checked in with a test case at our leisure. I presume that you can create a copy of the rule in Hive that uses a patched version of splitCondition. Do you have an example query that causes this? The logic would be easier to understand and maintain if you could express it in terms of set-theory operations on bit sets, rather than the low-level nextSetBit operations. You could create a bit set that represents all of the fields of a particular input, using ImmutableBitSet.range(int, int), then intersect it with the bit sets of the conditions. I ran the full test suite including integration tests, and the patch does not break anything. Therefore I think it is good for your purposes. But before I accept this patch I will need a test case, and refactoring in terms of set operations.
        Hide
        jcamachorodriguez Jesus Camacho Rodriguez added a comment - - edited

        Julian Hyde, no worries, HIVE-10388 solved the problem in the Hive side. Once we get a new Calcite release, I will change that code to point to the fixed method in Calcite.

        About the fix in Calcite, I will rewrite it following your advice.

        The following query was the one that was causing the problem in Hive:

        SELECT src1.key, src3.value
        FROM src src1 JOIN src src2 ON (src1.key = src2.key) JOIN src src3 ON (src1.key + src2.key = src3.key);
        
        Show
        jcamachorodriguez Jesus Camacho Rodriguez added a comment - - edited Julian Hyde , no worries, HIVE-10388 solved the problem in the Hive side. Once we get a new Calcite release, I will change that code to point to the fixed method in Calcite. About the fix in Calcite, I will rewrite it following your advice. The following query was the one that was causing the problem in Hive: SELECT src1.key, src3.value FROM src src1 JOIN src src2 ON (src1.key = src2.key) JOIN src src3 ON (src1.key + src2.key = src3.key);
        Hide
        jcamachorodriguez Jesus Camacho Rodriguez added a comment -

        Julian Hyde, working back on this issue. Where should I add the test query?

        Show
        jcamachorodriguez Jesus Camacho Rodriguez added a comment - Julian Hyde , working back on this issue. Where should I add the test query?
        Hide
        jcamachorodriguez Jesus Camacho Rodriguez added a comment -

        I have updated the pull request.

        Show
        jcamachorodriguez Jesus Camacho Rodriguez added a comment - I have updated the pull request.
        Hide
        julianhyde Julian Hyde added a comment -

        New version of the pull request https://github.com/apache/incubator-calcite/pull/78 looks great.

        If you can reproduce this bug in pure SQL, the best place for a test case is probably join.oq, which can be run from JdbcTest.testRunJoin.

        Show
        julianhyde Julian Hyde added a comment - New version of the pull request https://github.com/apache/incubator-calcite/pull/78 looks great. If you can reproduce this bug in pure SQL, the best place for a test case is probably join.oq, which can be run from JdbcTest.testRunJoin.
        Show
        julianhyde Julian Hyde added a comment - Fixed in http://git-wip-us.apache.org/repos/asf/incubator-calcite/commit/53b4b096 .
        Hide
        julianhyde Julian Hyde added a comment -

        Resolved in release 1.3.0-incubating (2015-05-30).

        Show
        julianhyde Julian Hyde added a comment - Resolved in release 1.3.0-incubating (2015-05-30).

          People

          • Assignee:
            jcamachorodriguez Jesus Camacho Rodriguez
            Reporter:
            jcamachorodriguez Jesus Camacho Rodriguez
          • Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development