Hive / HIVE-2065

RCFile issues

Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • rcfile

    Description

      Some potential issues with RCFile

      1. Remove the unneeded synchronized modifiers on the methods of RCFile. According to Yongqiang He, the class is not meant to be thread-safe (and it is not), so we might as well get rid of the confusing and performance-impacting lock acquisitions.

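      2. The record length is overstated for compressed files. IIUC, the key is compressed only after the record length has been written, so the record length counts the uncompressed key size rather than the number of key bytes actually written.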

            int keyLength = key.getSize();
            if (keyLength < 0) {
              throw new IOException("negative length keys not allowed: " + key);
            }
      
            out.writeInt(keyLength + valueLength); // total record length
            out.writeInt(keyLength); // key portion length
            if (!isCompressed()) {
              out.writeInt(keyLength);
              key.write(out); // key
            } else {
              keyCompressionBuffer.reset();
              keyDeflateFilter.resetState();
              key.write(keyDeflateOut);
              keyDeflateOut.flush();
              keyDeflateFilter.finish();
              int compressedKeyLen = keyCompressionBuffer.getLength();
              out.writeInt(compressedKeyLen);
              out.write(keyCompressionBuffer.getData(), 0, compressedKeyLen);
            }
      

      3. For SequenceFile compatibility, the field immediately following the record length should be the compressed key length, not the uncompressed key length.
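
      A rough sketch of the ordering that items 2 and 3 point at: compress the key first, then write a record length based on the bytes actually stored, followed directly by the compressed key length. This is only an illustration under assumptions. The class RecordLengthSketch and its methods are hypothetical and are not taken from RCFile or from the attached patches, java.util.zip.Deflater stands in for whatever codec the file uses, and ByteBuffer.putInt stands in for out.writeInt (both write big-endian ints). Where the uncompressed key length would be recorded is left out.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.zip.Deflater;

// Hypothetical illustration only; not RCFile code and not the attached patch.
final class RecordLengthSketch {

  // Compress the serialized key with plain DEFLATE (a stand-in for the file's codec).
  static byte[] compress(byte[] raw) {
    Deflater deflater = new Deflater();
    deflater.setInput(raw);
    deflater.finish();
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    byte[] chunk = new byte[256];
    while (!deflater.finished()) {
      int n = deflater.deflate(chunk);
      buffer.write(chunk, 0, n);
    }
    deflater.end();
    return buffer.toByteArray();
  }

  // Compress first, then write the lengths, so both length fields describe
  // the bytes that follow them. The value itself is not written here.
  static byte[] writeRecordHeaderAndKey(byte[] rawKey, int valueLength) {
    byte[] compressedKey = compress(rawKey);
    ByteBuffer out = ByteBuffer.allocate(8 + compressedKey.length);
    out.putInt(compressedKey.length + valueLength); // total record length
    out.putInt(compressedKey.length);               // key length as stored
    out.put(compressedKey);                         // compressed key bytes
    return out.array();
  }

  public static void main(String[] args) {
    byte[] rawKey = new byte[1024]; // highly compressible dummy key
    ByteBuffer in = ByteBuffer.wrap(writeRecordHeaderAndKey(rawKey, 10));
    System.out.println("record length:     " + in.getInt());
    System.out.println("stored key length: " + in.getInt());
    System.out.println("uncompressed key:  " + rawKey.length);
  }
}
```

      With this ordering, a reader that only understands the generic record length / key length framing can skip over a record without knowing how the key was compressed.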

      Attachments

        1. Slide1.png
          121 kB
          Krishna Kumar
        2. proposal.png
          152 kB
          Krishna Kumar
        3. HIVE.2065.patch.0.txt
          113 kB
          Krishna Kumar
        4. HIVE.2065.patch.1.txt
          90 kB
          Krishna Kumar

          People

            n_krishna_kumar Krishna Kumar
            Votes: 1
            Watchers: 4
