  1. Hadoop HDFS
  2. HDFS-6984

Serialize FileStatus via protobuf

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • 3.0.0-alpha1
    • 3.0.0-beta1
    • None
    • Hadoop Flags: Incompatible change, Reviewed
    • Release Note:
      FileStatus and FsPermission Writable serialization is deprecated and its implementation (incompatibly) replaced with protocol buffers. The FsPermissionProto record moved from hdfs.proto to acl.proto. HdfsFileStatus is now a subtype of FileStatus. FsPermissionExtension with its associated flags for ACLs, encryption, and erasure coding has been deprecated; users should query these attributes on the FileStatus object directly. The FsPermission instance in AclStatus no longer retains or reports these extended attributes (likely unused).

    Description

      FileStatus was a Writable in Hadoop 2 and earlier. Originally, we used its Writable methods to serialize it and send it over the wire. Since Hadoop 2, however, the protobuf record HdfsFileStatusProto carries this information instead. The protobuf form is preferable, since it allows us to add new fields in a backwards-compatible way. Another issue is that many subclasses of FileStatus don't override the Writable methods of the superclass, which breaks the interface contract that writing a status and reading it back should yield an object equal to the original.
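
The broken round-trip contract described above can be reproduced without Hadoop on the classpath. The following is a minimal sketch, not Hadoop code: BaseStatus and ExtendedStatus are hypothetical stand-ins for FileStatus and one of its subclasses, and the storagePolicy field is an invented example of subclass state. It assumes only that the subclass inherits write/readFields without overriding them.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

public class WritableRoundTrip {

  /** Hypothetical stand-in for a Writable base class such as FileStatus. */
  static class BaseStatus {
    String path = "";
    long length;

    BaseStatus() {}

    BaseStatus(String path, long length) {
      this.path = path;
      this.length = length;
    }

    // Writes only the fields the base class knows about.
    void write(DataOutput out) throws IOException {
      out.writeUTF(path);
      out.writeLong(length);
    }

    // Reads the same fields back, in the same order.
    void readFields(DataInput in) throws IOException {
      path = in.readUTF();
      length = in.readLong();
    }
  }

  /** Hypothetical subclass that adds state but does not override write/readFields. */
  static class ExtendedStatus extends BaseStatus {
    String storagePolicy = "";

    ExtendedStatus() {}

    ExtendedStatus(String path, long length, String storagePolicy) {
      super(path, length);
      this.storagePolicy = storagePolicy;
    }
  }

  /** Serializes src to bytes and reads those bytes back into dst. */
  static <T extends BaseStatus> T roundTrip(BaseStatus src, T dst) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    src.write(new DataOutputStream(bytes));
    dst.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
    return dst;
  }

  public static void main(String[] args) throws IOException {
    ExtendedStatus original = new ExtendedStatus("/data/part-0", 1024L, "COLD");
    ExtendedStatus copy = roundTrip(original, new ExtendedStatus());
    System.out.println("path preserved: " + original.path.equals(copy.path));
    System.out.println("storagePolicy preserved: "
        + original.storagePolicy.equals(copy.storagePolicy));
  }
}
```

Running main reports that the path survives the round trip while storagePolicy does not, because the inherited write method never emits the subclass field.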

      In Hadoop 3, we should just make FileStatus serialize itself via protobuf so that we don't have to deal with these issues. It's probably too late to do this in Hadoop 2, since user code may be relying on the existing FileStatus serialization there.
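
To illustrate why a tagged encoding tolerates new fields, here is a simplified sketch. It is not the protobuf wire format and does not use the protobuf library; it only assumes a (tag, length, payload) layout, which is enough to show an older reader skipping a field it does not know. The class name, tag constants, and field names are all hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class TaggedFields {
  static final int TAG_PATH = 1;
  static final int TAG_OWNER = 2;
  static final int TAG_EC_POLICY = 3; // hypothetical field added in a later version

  // Writes one field as: tag, payload length, payload bytes.
  static void writeField(DataOutputStream out, int tag, String value) throws IOException {
    byte[] payload = value.getBytes(StandardCharsets.UTF_8);
    out.writeInt(tag);
    out.writeInt(payload.length);
    out.write(payload);
  }

  /** A newer writer that emits all three fields. */
  static byte[] encodeV2(String path, String owner, String ecPolicy) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(bytes);
    writeField(out, TAG_PATH, path);
    writeField(out, TAG_OWNER, owner);
    writeField(out, TAG_EC_POLICY, ecPolicy);
    out.flush();
    return bytes.toByteArray();
  }

  /** An older reader that understands only TAG_PATH and TAG_OWNER. */
  static Map<Integer, String> decodeV1(byte[] data) throws IOException {
    Map<Integer, String> known = new HashMap<>();
    DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
    while (in.available() > 0) {
      int tag = in.readInt();
      int length = in.readInt();
      byte[] payload = new byte[length];
      in.readFully(payload); // the length prefix lets us consume unknown fields too
      if (tag == TAG_PATH || tag == TAG_OWNER) {
        known.put(tag, new String(payload, StandardCharsets.UTF_8));
      }
      // Any other tag is ignored; its payload has already been skipped.
    }
    return known;
  }

  public static void main(String[] args) throws IOException {
    byte[] data = encodeV2("/data/part-0", "hdfs", "RS-6-3");
    System.out.println(decodeV1(data));
  }
}
```

A fixed-order Writable stream has no tags or lengths, so a reader cannot step over bytes it does not understand; that is the compatibility gap the description points to.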

      Attachments

        1. HDFS-6984.015.patch
          104 kB
          Christopher Douglas
        2. HDFS-6984.014.patch
          103 kB
          Christopher Douglas
        3. HDFS-6984.013.patch
          103 kB
          Christopher Douglas
        4. HDFS-6984.012.patch
          102 kB
          Christopher Douglas
        5. HDFS-6984.011.patch
          97 kB
          Christopher Douglas
        6. HDFS-6984.010.patch
          97 kB
          Christopher Douglas
        7. HDFS-6984.009.patch
          93 kB
          Christopher Douglas
        8. HDFS-6984.008.patch
          87 kB
          Christopher Douglas
        9. HDFS-6984.007.patch
          83 kB
          Christopher Douglas
        10. HDFS-6984.006.patch
          83 kB
          Christopher Douglas
        11. HDFS-6984.005.patch
          24 kB
          Christopher Douglas
        12. HDFS-6984.004.patch
          21 kB
          Christopher Douglas
        13. HDFS-6984.nowritable.patch
          8 kB
          Andrew Wang
        14. HDFS-6984.003.patch
          20 kB
          Christopher Douglas
        15. HDFS-6984.002.patch
          7 kB
          Colin McCabe
        16. HDFS-6984.001.patch
          2 kB
          Colin McCabe

            People

              Assignee: Christopher Douglas (cdouglas)
              Reporter: Colin McCabe (cmccabe)
              Votes: 0
              Watchers: 18