  1. Hadoop Common
  2. HADOOP-7158

Reduce RPC packet size for homogeneous arrays, such as the array responses to listStatus() and getBlockLocations()

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.22.0
    • Fix Version/s: None
    • Component/s: io
    • Labels: None
    • Hadoop Flags: Incompatible change
    • Tags: Writable, homogeneous arrays, RPC

    Description

      While commenting on HADOOP-6949, which proposes a big improvement in the RPC wire format for arrays of primitives, Konstantin Shvachko said:
      "Can/should we extend this to arrays of non-primitive types? This should benefit return types for calls like listStatus() and getBlockLocations() on a large directory."

      The improvement for primitive arrays comes from not type-labeling every element in the array, so the array in question must be strictly homogeneous: every element must be exactly the declared component type, never a subtype of it. For instance, it could not be applied to heartbeat responses of DatanodeCommand[], whose elements are instances of DatanodeCommand subtypes, each of which must be type-labeled independently. However, as Konstantin points out, it could really help lengthy response arrays for things like listStatus() and getBlockLocations().
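
      To make the size argument concrete, here is a minimal sketch that uses only java.io, not Hadoop's actual serialization code. The class ArrayLabelingSketch, the element type Entry (a stand-in for something like a listStatus() result element), and both encode methods are hypothetical names invented for illustration. It compares writing the class name before every element against writing it once for the whole array.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Illustration only: hypothetical names, not Hadoop's wire format.
class ArrayLabelingSketch {

    /** Hypothetical element type; final, so an Entry[] cannot hold subtypes. */
    static final class Entry {
        final String path;
        final long length;

        Entry(String path, long length) {
            this.path = path;
            this.length = length;
        }

        void write(DataOutputStream out) throws IOException {
            out.writeUTF(path);
            out.writeLong(length);
        }
    }

    /** Writes the element count, then a class-name label before every element. */
    static byte[] encodeLabeledPerElement(Entry[] entries) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeInt(entries.length);
            for (Entry e : entries) {
                out.writeUTF(e.getClass().getName()); // label repeated per element
                e.write(out);
            }
            out.flush();
            return buf.toByteArray();
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    /** Writes the class-name label once, then the count and the bare elements. */
    static byte[] encodeLabeledOnce(Entry[] entries) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeUTF(Entry.class.getName()); // single label for the array
            out.writeInt(entries.length);
            for (Entry e : entries) {
                e.write(out);
            }
            out.flush();
            return buf.toByteArray();
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        Entry[] entries = new Entry[1000];
        for (int i = 0; i < entries.length; i++) {
            entries[i] = new Entry("/big/dir/file-" + i, 1024L * i);
        }
        System.out.println("label per element: " + encodeLabeledPerElement(entries).length + " bytes");
        System.out.println("label once:        " + encodeLabeledOnce(entries).length + " bytes");
    }
}
```

      In this sketch the saving is one label for each element after the first, so it grows linearly with the array length. The single-label form is only decodable because Entry is final; with a polymorphic array the reader could not know which subtype to instantiate for each element.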

      I will attach a prototype implementation to this Jira for discussion. However, since it can't be applied automatically to all arrays passing through RPC, I'll just provide a wrapper type. By using it, a caller asserts that the array is strictly homogeneous in the above sense.
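
      For discussion purposes, one possible shape for such a wrapper is sketched below. This is not the attached prototype: the class name StrictArrayWrapperSketch and its methods are hypothetical, and the sketch omits the serialization methods that a real version would need in order to travel over RPC. It shows only the homogeneity contract, checked eagerly at construction so that a mixed array fails on the sending side rather than being mis-decoded by the receiver.

```java
import java.util.Objects;

// Illustration only: hypothetical wrapper, not the prototype attached to this issue.
final class StrictArrayWrapperSketch<T> {
    private final Class<T> elementType;
    private final T[] elements;

    /**
     * Wraps an array whose elements must all be exactly elementType.
     * Nulls and subtype instances are rejected, because a single type
     * label for the whole array could not describe them.
     */
    StrictArrayWrapperSketch(Class<T> elementType, T[] elements) {
        this.elementType = Objects.requireNonNull(elementType, "elementType");
        this.elements = Objects.requireNonNull(elements, "elements");
        for (int i = 0; i < elements.length; i++) {
            T e = elements[i];
            if (e == null || e.getClass() != elementType) {
                throw new IllegalArgumentException(
                    "element " + i + " is not exactly " + elementType.getName());
            }
        }
    }

    Class<T> elementType() {
        return elementType;
    }

    T[] elements() {
        return elements;
    }
}
```

      Whether to validate eagerly like this or simply trust the caller's assertion is a design choice; the check costs one pass over the array but turns a silent wire-format error into an immediate exception.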

            People

              Assignee: Unassigned
              Reporter: Matthew Foley (mattf)
              Votes: 0
              Watchers: 4
