HIVE-347

[hive] Lots of mappers due to a user error while specifying the partitioning column

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.3.0
    • Component/s: Query Processor
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      A common scenario is a table partitioned on a 'ds' column of type 'string' whose values follow the format 'yyyy-mm-dd'.

      However, the user may forget to quote the value when writing the query:

      select ... from T where ds = 2009-02-02

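      2009-02-02 is a valid integer expression (2009 minus 2 minus 2). Partition pruning therefore marks every partition as unknown, because comparing the string column with this number requires a conversion to double, and a 'yyyy-mm-dd' string converts to null.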

      If all partitions are unknown, in strict mode, we should throw an error.
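      The behaviour described above can be modelled with a small sketch. It is not Hive's pruning code: it assumes, as the description suggests, that the comparison is done on doubles and that a non-numeric string converts to null. The names to_double, eq and prune are hypothetical and exist only for this illustration.

```python
from typing import List, Optional, Tuple


def to_double(value: str) -> Optional[float]:
    """Model a string-to-double cast that yields None (SQL NULL) on failure."""
    try:
        return float(value)
    except ValueError:
        return None


def eq(left: Optional[float], right: Optional[float]) -> Optional[bool]:
    """SQL-style equality: a NULL operand makes the result unknown (None)."""
    if left is None or right is None:
        return None
    return left == right


def prune(partitions: List[str], literal: float,
          strict: bool = True) -> Tuple[List[str], List[str]]:
    """Classify partitions for the predicate `ds = literal`.

    Returns (confirmed, unknown). In strict mode, raise if every
    partition is unknown, which is the check proposed in this issue.
    """
    confirmed, unknown = [], []
    for ds in partitions:
        result = eq(to_double(ds), literal)
        if result is None:
            unknown.append(ds)
        elif result:
            confirmed.append(ds)
    if strict and partitions and len(unknown) == len(partitions):
        raise ValueError(
            "all partitions are unknown; check the type or quoting "
            "of the partition value"
        )
    return confirmed, unknown


# What `ds = 2009-02-02` compares against once the quotes are missing.
unquoted = 2009 - 2 - 2
```

      With strict=False every partition lands in the unknown list, so all of them would be scanned; with strict=True the same call raises instead of silently launching a mapper per partition.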

      1. hive.347.4.patch
        3 kB
        Namit Jain
      2. hive.347.3.patch
        3 kB
        Namit Jain
      3. hive.347.2.patch
        3 kB
        Namit Jain
      4. hive.347.1.patch
        3 kB
        Namit Jain

        Activity

        Ashish Thusoo added a comment -

        +1

        looks good to me.

        Namit Jain added a comment -

        There was a problem with the above patch. TRUE AND unknown is unknown, so the patch broke in the following case:

        select .. from T where partCol = value and col = 1;

        col = 1 will be null even for the 'value' partition, so the 'value' partition will also be unknown.

        The basic problem is that there is an implicit int to string conversion. I don't think this can or should be fixed.
        I will upload a new patch undoing the earlier patch.
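
        The three-valued logic behind this failure can be sketched as follows. This is an illustrative model only, not the patch itself: None stands for unknown (null), and and3 is a hypothetical helper name.

```python
from typing import Optional


def and3(a: Optional[bool], b: Optional[bool]) -> Optional[bool]:
    """SQL three-valued AND, with None standing for unknown."""
    if a is False or b is False:
        return False          # FALSE dominates, even over unknown
    if a is None or b is None:
        return None           # TRUE AND unknown is unknown
    return True


# At pruning time only partition columns have values, so `col = 1`
# cannot be evaluated and is unknown for every partition.
col_equals_1 = None

matching_partition = and3(True, col_equals_1)    # partCol = value holds
other_partition = and3(False, col_equals_1)      # partCol = value fails
```

        The matching partition evaluates to unknown rather than TRUE, so a check that treats unknown partitions as an error also rejects a correctly written query. Partitions that fail the partition predicate still evaluate to FALSE and are pruned.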

        Namit Jain added a comment -

        committed.

        Namit Jain added a comment -

        done

        Raghotham Murthy added a comment -

        +1 looks good.

        Can you change the comment to remove mention of 'date string' since that's not always the case?

        Prasad Chakka added a comment -

        The changes look good. I think the error message will confuse users, since they do see a partition predicate in their query. The error message should say something about an incorrect type or missing quotes.


          People

          • Assignee: Namit Jain
          • Reporter: Namit Jain