Hadoop Common / HADOOP-3460

SequenceFileAsBinaryOutputFormat

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.18.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed
    • Release Note: Created SequenceFileAsBinaryOutputFormat to write raw bytes as keys and values to a SequenceFile.

    Description

      Add an OutputFormat to write raw bytes as keys and values to a SequenceFile.

      In C++ Pipes, we're using SequenceFileAsBinaryInputFormat to read SequenceFiles.
      However, we currently don't have a way to write a SequenceFile efficiently without going through extra (de)serializations.

      I'd like to store the correct class names for the keys and values but use BytesWritable to write them,
      so that downstream Java or Pig code can read this SequenceFile.
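
      The idea above (declare the real key/value class names once, then pass already-serialized bytes straight through) can be sketched with the Java standard library alone. This is only an illustration under stated assumptions: RawRecordSketch, its length-prefixed layout, and the toy key/value encodings are hypothetical, and they are not the actual SequenceFile format or the code in the attached patches.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: a toy container, NOT the real SequenceFile layout.
final class RawRecordSketch {

    // Stand-ins for an upstream producer (e.g. a Pipes task) that hands us
    // keys and values which are already serialized.
    static byte[] encodeKey(String key) {
        return key.getBytes(StandardCharsets.UTF_8);
    }

    static byte[] encodeValue(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    // Write a 4-byte length followed by the bytes themselves.
    private static void putBytes(ByteBuffer out, byte[] bytes) {
        out.putInt(bytes.length);
        out.put(bytes);
    }

    // Read back one length-prefixed byte array.
    private static byte[] getBytes(ByteBuffer in) {
        byte[] bytes = new byte[in.getInt()];
        in.get(bytes);
        return bytes;
    }

    // Raw writer: records the declared class names once, then copies each
    // key and value through unchanged, with no deserialization step.
    static byte[] writeRaw(String keyClass, String valueClass,
                           byte[][] keys, byte[][] values) {
        byte[] kc = keyClass.getBytes(StandardCharsets.UTF_8);
        byte[] vc = valueClass.getBytes(StandardCharsets.UTF_8);
        int size = 4 + kc.length + 4 + vc.length + 4;
        for (int i = 0; i < keys.length; i++) {
            size += 4 + keys[i].length + 4 + values[i].length;
        }
        ByteBuffer out = ByteBuffer.allocate(size);
        putBytes(out, kc);
        putBytes(out, vc);
        out.putInt(keys.length);
        for (int i = 0; i < keys.length; i++) {
            putBytes(out, keys[i]);
            putBytes(out, values[i]);
        }
        return out.array();
    }

    // Typed reader: reads the declared class names from the header, then
    // decodes each record with the toy encodings used above.
    static String readTyped(byte[] file) {
        ByteBuffer in = ByteBuffer.wrap(file);
        String keyClass = new String(getBytes(in), StandardCharsets.UTF_8);
        String valueClass = new String(getBytes(in), StandardCharsets.UTF_8);
        StringBuilder sb = new StringBuilder(keyClass).append(',').append(valueClass);
        int count = in.getInt();
        for (int i = 0; i < count; i++) {
            String key = new String(getBytes(in), StandardCharsets.UTF_8);
            long value = ByteBuffer.wrap(getBytes(in)).getLong();
            sb.append(';').append(key).append('=').append(value);
        }
        return sb.toString();
    }
}
```

      The writer never interprets the record bytes; only the header tells a later typed reader how to decode them, which is the property the request asks for.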

      Attachments

        1. HADOOP-3460-part1.patch
          12 kB
          Koji Noguchi
        2. HADOOP-3460-part2.patch
          16 kB
          Koji Noguchi
        3. HADOOP-3460-part3.patch
          18 kB
          Koji Noguchi

          People

            Assignee: Koji Noguchi (knoguchi)
            Reporter: Koji Noguchi (knoguchi)
            Votes: 0
            Watchers: 3

            Dates

              Created:
              Updated:
              Resolved: