HIVE-3935

New line character in output when sequence file is used for storage and table is empty

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.9.0, 0.10.0
    • Fix Version/s: None
    • Component/s: Query Processor
    • Labels: None
    • Environment: CentOS 6.3

    Description

      When a "select distinct" query is issued on an empty table that uses a sequence file for storage, the result set contains an extra newline (0x0a) even though the table has no data. This output is not consistent with the result of the same command on Hive 0.7.1 and can cause workflows to fail because of an incorrect record count.

      Execution on Hive 0.9 and 0.10
      hive> create table hoge2(col1 string,col2 string) partitioned by (p_part string) stored as sequencefile;
      hive> describe hoge2;
      OK
      col1 string
      col2 string
      p_part string
      Time taken: 0.24 seconds
      hive> select distinct p_part from hoge2;
      Total MapReduce jobs = 1
      Launching Job 1 out of 1
      Number of reduce tasks not specified. Estimated from input data size: 1
      In order to change the average load for a reducer (in bytes):
      set hive.exec.reducers.bytes.per.reducer=<number>
      In order to limit the maximum number of reducers:
      set hive.exec.reducers.max=<number>
      In order to set a constant number of reducers:
      set mapred.reduce.tasks=<number>
      Starting Job = job_201301230112_0001, Tracking URL = http://testcluster2-1:50030/jobdetails.jsp?jobid=job_201301230112_0001
      Kill Command = /opt/mapr/hadoop/hadoop-0.20.2/bin/../bin/hadoop job -Dmapred.job.tracker=maprfs:/// -kill job_201301230112_0001
      Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
      2013-01-23 02:50:16,843 Stage-1 map = 0%, reduce = 0%
      2013-01-23 02:50:26,897 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.13 sec
      2013-01-23 02:50:27,905 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.13 sec
      2013-01-23 02:50:28,911 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.13 sec
      2013-01-23 02:50:29,919 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.13 sec
      2013-01-23 02:50:30,925 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.13 sec
      2013-01-23 02:50:31,933 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.13 sec
      2013-01-23 02:50:32,939 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.13 sec
      2013-01-23 02:50:33,945 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 1.8 sec
      MapReduce Total cumulative CPU time: 1 seconds 800 msec
      Ended Job = job_201301230112_0001
      MapReduce Jobs Launched:
      Job 0: Map: 1 Reduce: 1 Cumulative CPU: 1.8 sec MAPRFS Read: 327 MAPRFS Write: 71 SUCCESS
      Total MapReduce CPU Time Spent: 1 seconds 800 msec
      OK

      Time taken: 21.94 seconds

      Result on Hive 0.7.1
      hive> select count(distinct p_part) from hoge3;
      Total MapReduce jobs = 1
      Launching Job 1 out of 1
      Number of reduce tasks determined at compile time: 1
      In order to change the average load for a reducer (in bytes):
      set hive.exec.reducers.bytes.per.reducer=<number>
      In order to limit the maximum number of reducers:
      set hive.exec.reducers.max=<number>
      In order to set a constant number of reducers:
      set mapred.reduce.tasks=<number>
      Starting Job = job_201210261659_0019, Tracking URL = http://testcluster1-1:50030/jobdetails.jsp?jobid=job_201210261659_0019
      Kill Command = /opt/mapr/hadoop/hadoop-0.20.2/bin/../bin/hadoop job -Dmapred.job.tracker=maprfs:/// -kill job_201210261659_0019
      2013-01-23 21:42:01,787 Stage-1 map = 0%, reduce = 0%
      2013-01-23 21:42:07,815 Stage-1 map = 100%, reduce = 0%
      2013-01-23 21:42:12,835 Stage-1 map = 100%, reduce = 100%
      Ended Job = job_201210261659_0019
      OK
      0
      Time taken: 16.637 seconds

      The underlying Hadoop version is 1.0.3 for Hive 0.9 and 0.20.203 for Hive 0.7.
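
      To illustrate how the stray newline can produce a wrong record count, here is a minimal sketch of a defensive counter that a downstream workflow might apply to captured Hive CLI output. The function names and sample strings are hypothetical and not part of Hive or of the attached patch; the sketch also assumes that a completely empty line is never a legitimate row, which does not hold if a query can return a single empty-string column.

```python
def count_lines(cli_output: str) -> int:
    """Naive count, equivalent to `wc -l`: one record per newline character."""
    return cli_output.count("\n")


def count_records(cli_output: str) -> int:
    """Count result rows while skipping completely empty lines.

    Assumption: an empty line is never a real row in this workflow.
    """
    return sum(1 for line in cli_output.splitlines() if line != "")


if __name__ == "__main__":
    affected = "\n"  # lone 0x0a, as reported for Hive 0.9 and 0.10 on the empty table
    expected = ""    # what an empty result set should look like

    print(count_lines(affected), count_records(affected))  # prints "1 0"
    print(count_lines(expected), count_records(expected))  # prints "0 0"
```

      A guard like this only masks the symptom on the consumer side; the inconsistent output itself still needs to be fixed in Hive.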

      Attachments

        1. HIVE-3935-0.9.patch
          1 kB
          Aditya Kishore

          People

            Assignee: Unassigned
            Reporter: Abhinav Chawade (doodlegum)
            Votes: 0
            Watchers: 3