Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.0.0
    • Component/s: None

      Description

      Querying a compressed file with Hive on Tez returns the header and footer rows even when skip.header.line.count and skip.footer.line.count are set. This affects both SELECT * and SELECT COUNT(*):

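      # Create a sample CSV with one header line, two data rows, and one footer line.
      printf "offset,id,other\n9,\"20200315 X00 1356\",123\n17,\"20200315 X00 1357\",123\nrst,rst,rst" > data.csv
      # Make sure the target directories exist before uploading.
      hdfs dfs -mkdir -p /apps/hive/warehouse/bz2test/bz2tbl1/ /apps/hive/warehouse/bz2test/bz2tbl2/
      # Upload the uncompressed file, then the bzip2-compressed copy.
      hdfs dfs -put -f data.csv /apps/hive/warehouse/bz2test/bz2tbl1/
      bzip2 -f data.csv
      hdfs dfs -put -f data.csv.bz2 /apps/hive/warehouse/bz2test/bz2tbl2/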
      
      beeline -e "CREATE EXTERNAL TABLE default.bz2tst2 (
        sequence   int,
        id         string,
        other      string) 
      ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
      LOCATION '/apps/hive/warehouse/bz2test/bz2tbl2' 
      TBLPROPERTIES (
        'skip.header.line.count'='1',
        'skip.footer.line.count'='1');"
      
      beeline -e "
        SET hive.fetch.task.conversion = none;
        SELECT * FROM default.bz2tst2;"
      +-------------------+--------------------+----------------+
      | bz2tst2.sequence  |     bz2tst2.id     | bz2tst2.other  |
      +-------------------+--------------------+----------------+
      | offset            | id                 | other          |
      | 9                 | 20200315 X00 1356  | 123            |
      | 17                | 20200315 X00 1357  | 123            |
      | rst               | rst                | rst            |
      +-------------------+--------------------+----------------+
      

      PS: HIVE-22769 addressed the issue for Hive on LLAP.
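
For reference, the rows the query should return once the header and footer are honored can be sketched locally, without Hive. This is only an illustration. It assumes that skipping one header line and one footer line is equivalent to dropping the first and last line of the uncompressed file. The temporary file, the `sed` filter, and the variable names are not part of the original report.

```shell
# Recreate the sample file from the report in a temporary location.
tmp_csv=$(mktemp)
printf 'offset,id,other\n9,"20200315 X00 1356",123\n17,"20200315 X00 1357",123\nrst,rst,rst' > "$tmp_csv"

# Drop the first line (header) and the last line (footer), mirroring
# skip.header.line.count=1 and skip.footer.line.count=1.
expected_rows=$(sed '1d;$d' "$tmp_csv")

# Count the remaining data rows; this is what SELECT COUNT(*) should report.
expected_count=$(printf '%s\n' "$expected_rows" | grep -c .)

printf '%s\n' "$expected_rows"
echo "expected row count: $expected_count"

rm -f "$tmp_csv"
```

Under that assumption only the two data rows (sequence 9 and 17) remain, so a fixed SELECT * should show two rows and SELECT COUNT(*) should return 2, rather than the four rows shown above.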

              People

              • Assignee: Panagiotis Garefalakis (pgaref)
              • Reporter: Panagiotis Garefalakis (pgaref)
              • Votes: 0
              • Watchers: 3

                  Time Tracking

                  • Original Estimate: Not Specified
                  • Remaining Estimate: 0h
                  • Time Spent: 50m