Hadoop HDFS / HDFS-1069

Write HDFS wire protocols in AVRO IDL

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      As part of the move to AVRO and wire compatibility, write all HDFS protocols in AVRO IDL.

          Activity

          Sanjay Radia added a comment -

          Linking as "relates to" rather than "subtask" or "blocks" because HADOOP-6659 proposes using AVRO IDL as a subsequent step in moving to AVRO.

          Sanjay Radia added a comment -

          There are two main parts here:

          1. The HDFS RPC protocols
          2. The Data node transfer protocols (these are currently not RPC).
            • I suggest that, as an initial step, we simply define the message headers of this protocol using AVRO IDL (i.e., I recommend that we skip the reflection-based approach here and write the headers directly in AVRO IDL; currently the headers are not even defined as a Java class, they are all "inline").
            • At a future point we would convert the data transfer protocols to RPC; this is non-trivial work.
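
          To illustrate what "inline" headers versus an explicitly defined header means, here is a minimal Java sketch. It is not the actual HDFS data transfer format: the class name ReadBlockHeader and its four fields are hypothetical, chosen only to show a header declared once as a type with its own read and write methods, which is the kind of definition an IDL would then take over.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical header for a block-read request. The fields are illustrative
// assumptions, not the real HDFS wire layout.
final class ReadBlockHeader {
    final short version;
    final long blockId;
    final long offset;
    final long length;

    ReadBlockHeader(short version, long blockId, long offset, long length) {
        this.version = version;
        this.blockId = blockId;
        this.offset = offset;
        this.length = length;
    }

    // The field order is declared in exactly one place for writing...
    void write(DataOutputStream out) throws IOException {
        out.writeShort(version);
        out.writeLong(blockId);
        out.writeLong(offset);
        out.writeLong(length);
    }

    // ...and one place for reading, instead of being scattered "inline".
    static ReadBlockHeader read(DataInputStream in) throws IOException {
        short version = in.readShort();
        long blockId = in.readLong();
        long offset = in.readLong();
        long length = in.readLong();
        return new ReadBlockHeader(version, blockId, offset, length);
    }

    // Convenience wrapper: serialize to a byte array.
    byte[] toBytes() {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            write(out);
            out.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Convenience wrapper: parse from a byte array.
    static ReadBlockHeader fromBytes(byte[] bytes) {
        try {
            return read(new DataInputStream(new ByteArrayInputStream(bytes)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ReadBlockHeader sent = new ReadBlockHeader((short) 1, 42L, 0L, 4096L);
        ReadBlockHeader received = ReadBlockHeader.fromBytes(sent.toBytes());
        System.out.println("blockId=" + received.blockId + " length=" + received.length);
    }
}
```

          With the header declared as a type, moving its definition into an IDL file becomes a mechanical step, since the set of fields and their order are already stated in one place.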
          Doug Cutting added a comment -

          FYI, here's the Avro version of the current trunk's ClientProtocol, generated by reflection.

          I used the following command line to generate this:

          ~/src/avro/trunk/lang/java/build/avro-tools-1.4.0-SNAPSHOT.jar induce build/classes:build/ivy/lib/Hadoop-Hdfs/common/hadoop-core-0.22.0-SNAPSHOT.jar org.apache.hadoop.hdfs.protocol.ClientProtocol > /tmp/ClientProtocol.avpr
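
          For readers unfamiliar with the reflection-based approach, the following Java sketch shows the general idea of deriving a protocol description from an interface. It uses only java.lang.reflect and is not the code behind the induce tool; the DemoProtocol interface and the describe helper are hypothetical names made up for this illustration.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

final class ProtocolSketch {
    // Hypothetical stand-in for an RPC interface such as ClientProtocol.
    interface DemoProtocol {
        long getBlockSize(String path);
        boolean delete(String path, boolean recursive);
    }

    // Lists each method of the interface as "returnType name(paramTypes)".
    // Methods are sorted by name because reflection does not guarantee order.
    static List<String> describe(Class<?> iface) {
        if (!iface.isInterface()) {
            throw new IllegalArgumentException(iface.getName() + " is not an interface");
        }
        Method[] methods = iface.getMethods();
        Arrays.sort(methods, Comparator.comparing(Method::getName));
        List<String> messages = new ArrayList<>();
        for (Method m : methods) {
            String params = Arrays.stream(m.getParameterTypes())
                    .map(Class::getSimpleName)
                    .collect(Collectors.joining(", "));
            messages.add(m.getReturnType().getSimpleName() + " " + m.getName() + "(" + params + ")");
        }
        return messages;
    }

    public static void main(String[] args) {
        for (String message : describe(DemoProtocol.class)) {
            System.out.println(message);
        }
    }
}
```

          A derived description like this follows whatever the Java interface happens to contain, whereas a protocol written by hand in an IDL is an explicit statement of the wire contract.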

          Todd Lipcon added a comment -

          Hey Sanjay,

          I've started looking into the DataXceiver in avro a bit. I think I agree with your assessment that we should first use the existing protocol but use Avro to define the packet headers, operations, and status codes. Doing the whole thing as RPC is going to be a lot trickier if we want to do it right - AVRO-406 discusses some of the inherent issues.

          Doug Cutting added a comment -

          The above would be more readable in GenAvro format than it is in json...

          http://hadoop.apache.org/avro/docs/1.3.1/genavro.html

          GenAvro still needs a little work. We should rename it to Avro IDL (AVRO-372), have it support file includes, and add a tool to convert from JSON back to IDL, not just from IDL to JSON...

          Allen Wittenauer added a comment -

          Closing as Won't Fix.

          We've already switched to protobuf.


            People

            • Assignee: Sanjay Radia
            • Reporter: Sanjay Radia
            • Votes: 0
            • Watchers: 16
