  1. Hadoop HDFS
  2. HDFS-5698 Use protobuf to serialize / deserialize FSImage
  3. HDFS-5722

Implement compression in the HTTP server of SNN / SBN instead of FSImage

Details

    • Sub-task
    • Status: Resolved
    • Major
    • Resolution: Invalid

    Description

      The current FSImage format supports compression: a field in the header specifies the codec used to compress the data in the image. The main motivation was to reduce the number of bytes transferred among the SNN, SBN, and NN.

      The main disadvantage, however, is that it requires the client to read the FSImage in strictly sequential order. This might not fit well with the new design of FSImage. For example, serializing the data in protobuf allows the client to quickly skip data that it does not understand. Compression built into the format, however, complicates the calculation of offsets and lengths. Recovering from a corrupted, compressed FSImage is also non-trivial, as off-the-shelf tools like bzip2recover are inapplicable.
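
      To illustrate why an uncompressed, length-delimited layout is easy to navigate, here is a minimal sketch. It assumes a hypothetical layout of (tag, length, payload) records; the tags, class name, and method names are illustrative only and are not the actual FSImage or protobuf format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class SkipUnknownSections {
    // Hypothetical section tags, chosen only for this example.
    static final int TAG_KNOWN = 1;
    static final int TAG_UNKNOWN = 99;

    // Writes one record as: 4-byte tag, 4-byte payload length, payload bytes.
    static void writeSection(DataOutputStream out, int tag, byte[] payload)
            throws IOException {
        out.writeInt(tag);
        out.writeInt(payload.length);
        out.write(payload);
    }

    // Returns the payload of the first known section, skipping any section
    // whose tag the reader does not understand by using its length field.
    static String readFirstKnown(byte[] image) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(image));
        while (in.available() > 0) {
            int tag = in.readInt();
            int length = in.readInt();
            if (tag == TAG_KNOWN) {
                byte[] payload = new byte[length];
                in.readFully(payload);
                return new String(payload, StandardCharsets.UTF_8);
            }
            // Unknown section: jump over it without interpreting the bytes.
            if (in.skipBytes(length) != length) {
                throw new IOException("Truncated section with tag " + tag);
            }
        }
        return null; // no known section found
    }

    // Builds a sample image: one unknown section followed by one known section.
    static byte[] buildSample() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        writeSection(out, TAG_UNKNOWN, new byte[] {1, 2, 3, 4, 5});
        writeSection(out, TAG_KNOWN, "inodes".getBytes(StandardCharsets.UTF_8));
        out.flush();
        return bytes.toByteArray();
    }
}
```

      If the whole stream were compressed, the length fields would describe uncompressed sizes, so the reader could not jump ahead without decompressing everything before the target.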

      This jira proposes to move the compression from the format of the FSImage to the transport layer, namely, the HTTP server of SNN / SBN. This design simplifies the format of FSImage, opens up the opportunity to quickly navigate through the FSImage, and eases the process of recovery. It also retains the benefit of reducing the number of bytes transferred across the wire, since the compression is applied at the transport layer.
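
      The proposal can be sketched roughly as follows, using only the JDK's java.util.zip classes. This is an assumption-laden illustration, not the Hadoop HTTP server code: the class and method names are hypothetical, and the Accept-Encoding check is deliberately simplified (it ignores quality values such as "gzip;q=0").

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Locale;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class TransportCompression {
    // Simplified check of the client's Accept-Encoding header value.
    static boolean clientAcceptsGzip(String acceptEncoding) {
        return acceptEncoding != null
                && acceptEncoding.toLowerCase(Locale.ROOT).contains("gzip");
    }

    // Server side: the image on disk stays uncompressed; gzip is applied
    // only to the bytes sent over the wire, and only if the client allows it.
    static byte[] encodeForWire(byte[] image, String acceptEncoding)
            throws IOException {
        if (!clientAcceptsGzip(acceptEncoding)) {
            return image.clone();
        }
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(wire)) {
            gzip.write(image);
        }
        return wire.toByteArray();
    }

    // Client side: undo the transport encoding to recover the plain image.
    static byte[] decodeFromWire(byte[] wire, boolean gzipped) throws IOException {
        if (!gzipped) {
            return wire.clone();
        }
        ByteArrayOutputStream image = new ByteArrayOutputStream();
        try (GZIPInputStream gzip =
                new GZIPInputStream(new ByteArrayInputStream(wire))) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = gzip.read(buffer)) != -1) {
                image.write(buffer, 0, n);
            }
        }
        return image.toByteArray();
    }
}
```

      In a real server the response would also carry a Content-Encoding: gzip header so that the receiver knows to decompress; the boolean parameter above stands in for that header.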

          People

            Assignee: Unassigned
            Reporter: Haohui Mai (wheat9)
            Votes: 0
            Watchers: 9