Uploaded image for project: 'Tajo'
  1. Tajo
  2. TAJO-541

Parsing Group by clause differently from Hive

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Cannot Reproduce
    • Affects Version/s: None
    • Fix Version/s: 0.8.0
    • Component/s: None
    • Labels:
      None

      Description

      testGroupby1() of TestHiveConverter.java failed if I changed the group by_1.sql as follows.

      select col0, col1, col2, col3, sum(col4) as total, avg(col5) from base group by col0, cube (col1, col2), col3 having total > 100
      

      Following is error log (look at "groups").

      2014-01-22 11:11:41,702 INFO  parser.TestHiveConverter (TestHiveConverter.java:compareJsonResult(63)) - ### Tajo Parse Result ### 
      {
        "all": false,
        "distinct": false,
        "targets": [
          {
            "child": {
              "name": "col0",
              "opType": "Column"
            },
            "opType": "Target"
          },
          {
            "child": {
              "name": "col1",
              "opType": "Column"
            },
            "opType": "Target"
          },
          {
            "child": {
              "name": "col2",
              "opType": "Column"
            },
            "opType": "Target"
          },
          {
            "child": {
              "name": "col3",
              "opType": "Column"
            },
            "opType": "Target"
          },
          {
            "alias": "total",
            "child": {
              "distinct": false,
              "signature": "sum",
              "params": [
                {
                  "name": "col4",
                  "opType": "Column"
                }
              ],
              "opType": "GeneralSetFunction"
            },
            "opType": "Target"
          },
          {
            "child": {
              "distinct": false,
              "signature": "avg",
              "params": [
                {
                  "name": "col5",
                  "opType": "Column"
                }
              ],
              "opType": "GeneralSetFunction"
            },
            "opType": "Target"
          }
        ],
        "child": {
          "qual": {
            "left": {
              "name": "total",
              "opType": "Column"
            },
            "right": {
              "valueType": "Unsigned_Integer",
              "value": "100",
              "opType": "Literal"
            },
            "opType": "GreaterThan"
          },
          "child": {
            "groups": [
              {
                "group_type": "OrdinaryGroup",
                "grouping_sets": [
                  {
                    "name": "col0",
                    "opType": "Column"
                  }
                ]
              },
              {
                "group_type": "Cube",
                "grouping_sets": [
                  {
                    "name": "col1",
                    "opType": "Column"
                  },
                  {
                    "name": "col2",
                    "opType": "Column"
                  }
                ]
              },
              {
                "group_type": "OrdinaryGroup",
                "grouping_sets": [
                  {
                    "name": "col3",
                    "opType": "Column"
                  }
                ]
              }
            ],
            "child": {
              "relations": [
                {
                  "tableName": "base",
                  "opType": "Relation"
                }
              ],
              "opType": "RelationList"
            },
            "opType": "Aggregation"
          },
          "opType": "Having"
        },
        "opType": "Projection"
      }
      2014-01-22 11:11:41,710 INFO  parser.TestHiveConverter (TestHiveConverter.java:compareJsonResult(64)) - ### Hive Parse Result ### 
      {
        "all": false,
        "distinct": false,
        "targets": [
          {
            "child": {
              "name": "col0",
              "opType": "Column"
            },
            "opType": "Target"
          },
          {
            "child": {
              "name": "col1",
              "opType": "Column"
            },
            "opType": "Target"
          },
          {
            "child": {
              "name": "col2",
              "opType": "Column"
            },
            "opType": "Target"
          },
          {
            "child": {
              "name": "col3",
              "opType": "Column"
            },
            "opType": "Target"
          },
          {
            "alias": "total",
            "child": {
              "distinct": false,
              "signature": "sum",
              "params": [
                {
                  "name": "col4",
                  "opType": "Column"
                }
              ],
              "opType": "GeneralSetFunction"
            },
            "opType": "Target"
          },
          {
            "child": {
              "distinct": false,
              "signature": "avg",
              "params": [
                {
                  "name": "col5",
                  "opType": "Column"
                }
              ],
              "opType": "GeneralSetFunction"
            },
            "opType": "Target"
          }
        ],
        "child": {
          "qual": {
            "left": {
              "name": "total",
              "opType": "Column"
            },
            "right": {
              "valueType": "Unsigned_Integer",
              "value": "100",
              "opType": "Literal"
            },
            "opType": "GreaterThan"
          },
          "child": {
            "groups": [
              {
                "group_type": "OrdinaryGroup",
                "grouping_sets": [
                  {
                    "name": "col0",
                    "opType": "Column"
                  },
                  {
                    "name": "col3",
                    "opType": "Column"
                  }
                ]
              },
              {
                "group_type": "Cube",
                "grouping_sets": [
                  {
                    "name": "col1",
                    "opType": "Column"
                  },
                  {
                    "name": "col2",
                    "opType": "Column"
                  }
                ]
              }
            ],
            "child": {
              "relations": [
                {
                  "tableName": "base",
                  "opType": "Relation"
                }
              ],
              "opType": "RelationList"
            },
            "opType": "Aggregation"
          },
          "opType": "Having"
        },
        "opType": "Projection"
      }
      

      It seems that Hive always merges the same group_type as one but Tajo does only when the same group_type appears sequentially,
      as following query passes testGroupby1() without problem.

      select col0, col1, col2, col3, sum(col4) as total, avg(col5) from base group by col0, col3, cube (col1, col2) having total > 100
      

        Issue Links

          Activity

          Hide
          blrunner Jaehwa Jung added a comment -

          I can't reproduce this issue. I think that it was resolved by TAJO-554.

          Show
          blrunner Jaehwa Jung added a comment - I can't reproduce this issue. I think that it was resolved by TAJO-554 .
          Hide
          sirpkt Keuntae Park added a comment -

          OK, I see

          Show
          sirpkt Keuntae Park added a comment - OK, I see
          Hide
          blrunner Jaehwa Jung added a comment -

          Hi, Keuntae.
          Thank you for your report.
          HiveConveter needs to update because of TAJO-501. So, I'm going to update HiveConverter and TestHiveConverter.

          Show
          blrunner Jaehwa Jung added a comment - Hi, Keuntae. Thank you for your report. HiveConveter needs to update because of TAJO-501 . So, I'm going to update HiveConverter and TestHiveConverter.

            People

            • Assignee:
              blrunner Jaehwa Jung
              Reporter:
              sirpkt Keuntae Park
            • Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development