Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-7817

distinct/group by don't work on partition columns

    Details

    • Type: Bug
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: 0.14.0
    • Fix Version/s: None
    • Component/s: Query Processor
    • Labels:
      None

      Description

      suppose you have a table like this:

      CREATE TABLE page_view(
             viewTime INT,
             userid BIGINT,
              page_url STRING,
              referrer_url STRING,
              ip STRING COMMENT 'IP Address of the User')
      COMMENT 'This is the page view table'
      PARTITIONED BY(dt STRING, country STRING)
      CLUSTERED BY(userid) INTO 4 BUCKETS
      

      Then

      select distinct dt from page_view;
      select distinct dt, country from page_view;
      select dt, country from page_view group by dt, country;
      

      all fail with

      Query ID = ekoifman_20140820172626_b03ba819-c111-433f-a3fc-453c7d5a3e86
      Total jobs = 1
      Launching Job 1 out of 1
      Number of reduce tasks not specified. Estimated from input data size: 1
      In order to change the average load for a reducer (in bytes):
        set hive.exec.reducers.bytes.per.reducer=<number>
      In order to limit the maximum number of reducers:
        set hive.exec.reducers.max=<number>
      In order to set a constant number of reducers:
        set mapreduce.job.reduces=<number>
      Job running in-process (local Hadoop)
      Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 0
      2014-08-20 17:26:13,018 Stage-1 map = 0%,  reduce = 0%
      Ended Job = job_local165359429_0013 with errors
      Error during job, obtaining debugging information...
      FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
      MapReduce Jobs Launched: 
      Stage-Stage-1:  HDFS Read: 0 HDFS Write: 0 FAIL
      Total MapReduce CPU Time Spent: 0 msec
      

      but

      select dt, country, count(*) from page_view group by dt, country;
      

      works fine.

        Issue Links

          Activity

          Hide
          vikram.dixit Vikram Dixit K added a comment -

          Eugene Koifman This seems like a major outage or is this a new feature request? Has this worked in an earlier version of hive say 0.13? If so can you please mark this as a blocker for 0.14.

          Thanks
          Vikram.

          Show
          vikram.dixit Vikram Dixit K added a comment - Eugene Koifman This seems like a major outage or is this a new feature request? Has this worked in an earlier version of hive say 0.13? If so can you please mark this as a blocker for 0.14. Thanks Vikram.
          Hide
          ekoifman Eugene Koifman added a comment -

          I don't know if this ever worked. I ran into this by accident while trying to test something else in 0.14.

          Show
          ekoifman Eugene Koifman added a comment - I don't know if this ever worked. I ran into this by accident while trying to test something else in 0.14.
          Hide
          pxiong Pengcheng Xiong added a comment -

          by the way, i do not think hive-3108 is ever solved.

          Show
          pxiong Pengcheng Xiong added a comment - by the way, i do not think hive-3108 is ever solved.
          Hide
          kivismik Mikko Kivistö added a comment -

          I just verified this with both Hive 0.13 and Hive 0.14 (1.0) and the behaviour is the same, i.e. SELECT DISTINCT partitionColumn FROM table does not work.

          Show
          kivismik Mikko Kivistö added a comment - I just verified this with both Hive 0.13 and Hive 0.14 (1.0) and the behaviour is the same, i.e. SELECT DISTINCT partitionColumn FROM table does not work.

            People

            • Assignee:
              Unassigned
              Reporter:
              ekoifman Eugene Koifman
            • Votes:
              1 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

              • Created:
                Updated:

                Development