  1. Hadoop Common
  2. HADOOP-4169

'compressed' keyword in DDL syntax misleading and does not compress


Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • None
    • 0.19.0
    • None
    • None
    • Reviewed

    Description

      Hive produces two types of data files: flat files and sequencefiles. The DDL syntax should reflect this. Currently the 'compressed' keyword selects the sequencefile format but does not actually compress the files, which is misleading. In addition, flat files can also be compressed.

      The proposal is to replace 'compressed' with 'sequencefile'. Compression should instead be controlled in the standard Hadoop way, through session options such as 'mapred.output.compress', which specify whether output is compressed (session options will also define the codec, etc.). The default file format and compression options can be specified in the conf file.
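
      A rough sketch of how the proposed split between file format and compression might look in a session. The table and column names are hypothetical. The 'STORED AS SEQUENCEFILE' form is the spelling used by later Hive releases and is assumed here, since the description only says that 'sequencefile' replaces 'compressed'. The codec and compression-type properties are standard Hadoop job settings, chosen for illustration; whether a given Hive version honors them directly is not stated in this issue.

      ```sql
      -- File format is chosen in the DDL (hypothetical table; assumed keyword placement).
      CREATE TABLE page_views (user_id STRING, url STRING)
      STORED AS SEQUENCEFILE;

      -- Compression is chosen separately, through session options.
      SET mapred.output.compress=true;
      -- Illustrative codec and compression type; not specified in the issue.
      SET mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;
      SET mapred.output.compression.type=BLOCK;
      ```

      With this separation, a flat-file table could be compressed through the same session options, and a sequencefile table could be left uncompressed by setting 'mapred.output.compress' to false.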

      Attachments

        1. 4169-1.txt
          14 kB
          Joydeep Sen Sarma


          People

            jsensarma Joydeep Sen Sarma
            jsensarma Joydeep Sen Sarma
