IMPALA-14: Files with .gz extension reported as 'not supported'

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Impala 0.3
    • Fix Version/s: Impala 0.4
    • Component/s: None
    • Labels: None

    Description

      See this block of code.

      HdfsCompression compressionType =
          HdfsCompression.fromFileName(fileDescriptor.getFilePath());
      fileDescriptor.setCompression(compressionType);
      if (compressionType == HdfsCompression.LZO_INDEX) {
        // Skip index files, these are read by the LZO scanner directly.
        continue;
      }

      HdfsStorageDescriptor sd = partition.getInputFormatDescriptor();
      if (compressionType == HdfsCompression.LZO) {
        if (sd.getFileFormat() != HdfsFileFormat.LZO_TEXT) {
          throw new RuntimeException(
              "Compressed file not supported without compression input format: " + p);
        }
      } else if (compressionType != HdfsCompression.NONE) {
        throw new RuntimeException("Compressed file not supported: " + p);
      } else if (sd.getFileFormat() == HdfsFileFormat.LZO_TEXT) {
        throw new RuntimeException("Expected file with .lzo suffix: " + p);
      }

      The compression type checks aren't correct: they shouldn't fail for .gz, and they arguably shouldn't be run for non-text formats.

      Workaround for now is to rename the files so that they no longer end in .gz, since the check is driven by the file name suffix.
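
      The two points raised above (do not reject .gz, and skip the suffix checks for non-text formats) could be sketched roughly as follows. This is only an illustration, not the fix that was committed: the Compression and FileFormat enums, the suffix mapping (including the ".index" suffix), and the check() helper are simplified stand-ins assumed for this example and are not Impala's actual classes.

```java
import java.util.Locale;

final class CompressionCheckSketch {
    // Simplified stand-in for HdfsCompression; the suffix mapping is an assumption.
    enum Compression {
        NONE, GZIP, LZO, LZO_INDEX;

        static Compression fromFileName(String path) {
            String lower = path.toLowerCase(Locale.ROOT);
            if (lower.endsWith(".index")) return LZO_INDEX;
            if (lower.endsWith(".lzo")) return LZO;
            if (lower.endsWith(".gz")) return GZIP;
            return NONE;
        }
    }

    // Simplified stand-in for HdfsFileFormat.
    enum FileFormat {
        TEXT, LZO_TEXT, SEQUENCE_FILE, RC_FILE;

        boolean isTextBased() {
            return this == TEXT || this == LZO_TEXT;
        }
    }

    /** Returns null if the file is accepted, otherwise an error message. */
    static String check(String path, FileFormat format) {
        // Non-text formats: the suffix says nothing reliable, so skip the checks.
        if (!format.isTextBased()) {
            return null;
        }
        Compression compression = Compression.fromFileName(path);
        // Index files are not data files; the caller would skip them.
        if (compression == Compression.LZO_INDEX) {
            return null;
        }
        if (compression == Compression.LZO && format != FileFormat.LZO_TEXT) {
            return "Compressed file not supported without compression input format: " + path;
        }
        if (compression != Compression.LZO && format == FileFormat.LZO_TEXT) {
            return "Expected file with .lzo suffix: " + path;
        }
        // NONE and GZIP are both accepted for plain text tables in this sketch.
        return null;
    }

    public static void main(String[] args) {
        System.out.println(check("/warehouse/t/part-0.gz", FileFormat.TEXT));
        System.out.println(check("/warehouse/t/part-0.lzo", FileFormat.TEXT));
        System.out.println(check("/warehouse/t/part-0.gz", FileFormat.SEQUENCE_FILE));
    }
}
```

      Returning a message instead of throwing keeps the sketch easy to exercise; the original code throws a RuntimeException at the same decision points.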

          People

            Assignee: Alexander Behm (alex.behm)
            Reporter: Henry Robinson (henryr)