  1. Hive
  2. HIVE-1138

Hive using lzo compression returns unexpected results.

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Not A Problem
    • Affects Version/s: 0.6.0
    • Fix Version/s: None
    • Component/s: Query Processor
    • Labels:
      None
    • Environment:

      hadoop 0.20.1, hive trunk 2010-02-03

      Description

      I have a tab-separated file that I loaded with "load data inpath". I then run:

      SET hive.exec.compress.output=true;
      SET mapred.output.compression.codec=com.hadoop.compression.lzo.LzoCodec;
      SET mapred.map.output.compression.codec=com.hadoop.compression.lzo.LzoCodec;
      select distinct login_cldr_id as cldr_id from chatsessions_load;

      Ended Job = job_201001151039_1641
      OK
      NULL
      NULL
      NULL
      Time taken: 49.06 seconds

      However, if I run the same query without the SET commands, I get this:
      Ended Job = job_201001151039_1642
      OK
      2283
      Time taken: 45.308 seconds

      Which is the correct result.

      When I do an "insert overwrite" into an RCFile table, the data is actually compressed correctly.
      When I disable compression and query this new table, the result is correct.
      When I enable compression, the result is wrong again.
      I see no errors in the logs.
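
      A minimal sketch for narrowing this down, assuming the three SET commands above are the only difference between the failing and the working run. It toggles output compression separately from the map-output codec to see which setting is associated with the NULL rows. The split into two runs is an illustration and was not part of the original test; each run is assumed to start in a fresh Hive session so that earlier SET values do not carry over.

      ```sql
      -- Run 1 (fresh session): set only the map-output codec, leave the final
      -- query output uncompressed. Note that in Hadoop 0.20 this codec is only
      -- used when mapred.compress.map.output is true.
      SET hive.exec.compress.output=false;
      SET mapred.map.output.compression.codec=com.hadoop.compression.lzo.LzoCodec;
      select distinct login_cldr_id as cldr_id from chatsessions_load;

      -- Run 2 (fresh session): compress only the final query output with LZO,
      -- without touching the map-output codec.
      SET hive.exec.compress.output=true;
      SET mapred.output.compression.codec=com.hadoop.compression.lzo.LzoCodec;
      select distinct login_cldr_id as cldr_id from chatsessions_load;
      ```

      If only the second run returns NULL, that would suggest the problem lies in how the compressed query output is read back, rather than in the map/reduce stages themselves.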

        Attachments

        1. test.csv
          0.0 kB
          Bennie Schut

            People

            • Assignee:
              bennies Bennie Schut
              Reporter:
              bennies Bennie Schut
            • Votes:
              0
              Watchers:
              3