  1. Hive
  2. HIVE-1138

Hive using LZO compression returns unexpected results.

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Not A Problem
    • Affects Version/s: 0.6.0
    • Fix Version/s: None
    • Component/s: Query Processor
    • Labels: None
    • Environment: hadoop 0.20.1, hive trunk 2010-02-03

    Description

      I have a tab-separated file that I loaded with "load data inpath". Then I run:

      SET hive.exec.compress.output=true;
      SET mapred.output.compression.codec=com.hadoop.compression.lzo.LzoCodec;
      SET mapred.map.output.compression.codec=com.hadoop.compression.lzo.LzoCodec;
      select distinct login_cldr_id as cldr_id from chatsessions_load;

      Ended Job = job_201001151039_1641
      OK
      NULL
      NULL
      NULL
      Time taken: 49.06 seconds

      However, if I run the same query without the SET commands I get this:
      Ended Job = job_201001151039_1642
      OK
      2283
      Time taken: 45.308 seconds

      Which is the correct result.

      When I do an "insert overwrite" into an RCFile table, the data is actually compressed correctly.
      When I disable compression and query this new table, the result is correct.
      When I enable compression, the result is wrong again.
      I see no errors in the logs.
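
      The RCFile steps above could be reproduced with something like the following sketch. The table name chatsessions_rc and the INT column type are assumptions made for illustration, since the report does not include the actual DDL; the settings and the source table come from the description.

```sql
-- Hypothetical target table; name and column type are assumed, not from the report.
CREATE TABLE chatsessions_rc (login_cldr_id INT) STORED AS RCFILE;

-- Write the data with LZO output compression enabled (settings from the report).
SET hive.exec.compress.output=true;
SET mapred.output.compression.codec=com.hadoop.compression.lzo.LzoCodec;
INSERT OVERWRITE TABLE chatsessions_rc
SELECT login_cldr_id FROM chatsessions_load;

-- Query with output compression disabled: reported to return the correct result.
SET hive.exec.compress.output=false;
SELECT DISTINCT login_cldr_id AS cldr_id FROM chatsessions_rc;

-- Query with output compression enabled: reported to return NULL rows.
SET hive.exec.compress.output=true;
SELECT DISTINCT login_cldr_id AS cldr_id FROM chatsessions_rc;
```

      Running the two SELECT statements back to back against the same table isolates the query output setting as the only difference between the correct and incorrect results.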

      Attachments

        1. test.csv
          0.0 kB
          Bennie Schut

          People

            Assignee: Bennie Schut (bennies)
            Reporter: Bennie Schut (bennies)
