  1. Hadoop HDFS
  2. HDFS-417

Improvements to Hadoop Thrift bindings

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Won't Fix
    • Environment: Tested under Linux x86-64

    Description

      I have made the following changes to hadoopfs.thrift:

      1. Added namespaces for Python, Perl and C++.
      2. Renamed parameters and struct members to camelCase versions to keep them consistent (in particular FileStatus.{blockReplication,blockSize} vs FileStatus.{block_replication,blocksize}).

      3. Renamed ThriftHadoopFileSystem to FileSystem. From the perspective of a Perl/Python/C++ user, 1) it is already clear that we're using Thrift, and 2) the fact that we're dealing with Hadoop is already explicit in the namespace. The usage of generated code is more compact and (in my opinion) clearer:

        Perl:

        use HadoopFS;

        my $client = HadoopFS::FileSystemClient->new(..);

        instead of:

        my $client = HadoopFS::ThriftHadoopFileSystemClient->new(..);

        Python:

        from hadoopfs import FileSystem

        client = FileSystem.Client(..)

        instead of:

        from hadoopfs import ThriftHadoopFileSystem

        client = ThriftHadoopFileSystem.Client(..)

        (See also the attached diff [^scripts_hdfs_py.diff] for the
        new version of 'scripts/hdfs.py').

        C++:

        hadoopfs::FileSystemClient client(..);

        instead of:

        hadoopfs::ThriftHadoopFileSystemClient client(..);

      4. Renamed ThriftHandle to FileHandle: As in 3, it is clear that we're dealing with a Thrift object, and its purpose (to act as a handle for file operations) is clearer.
      5. Renamed ThriftIOException to IOException, to keep it simpler, and consistent with MalformedInputException.
      6. Added explicit version tags to fields of ThriftHandle/FileHandle, Pathname, MalformedInputException and ThriftIOException/IOException, to improve compatibility of existing clients with future versions of the interface which might add new fields to those objects (like stack traces for the exception types, for instance).
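
      To illustrate why explicit tags on fields help older clients, here is a minimal sketch. It assumes the tags are numeric field identifiers, as in Thrift IDL, and it is a toy model rather than Thrift's actual wire format. The tag numbers and the stackTrace field are hypothetical; only the idea (a reader skips tags it does not know) is the point.

```python
# Toy model of tagged fields. This is NOT Thrift's real wire format;
# the tag numbers and the "stackTrace" field are hypothetical.

# Fields known to an existing client: numeric tag -> field name.
OLD_IOEXCEPTION_FIELDS = {1: "message"}

# A later interface version adds a field under a new, unused tag.
NEW_IOEXCEPTION_FIELDS = {1: "message", 2: "stackTrace"}


def encode(fields, values):
    """Serialize a struct as a list of (tag, value) pairs."""
    return [(tag, values[name])
            for tag, name in sorted(fields.items())
            if name in values]


def decode(fields, wire):
    """Rebuild a struct, skipping any tag this reader does not know."""
    return {fields[tag]: value for tag, value in wire if tag in fields}


# A newer server sends both fields.
wire = encode(NEW_IOEXCEPTION_FIELDS,
              {"message": "No such file", "stackTrace": "at ..."})

# The old client still reads the message and ignores the unknown tag.
old_view = decode(OLD_IOEXCEPTION_FIELDS, wire)
new_view = decode(NEW_IOEXCEPTION_FIELDS, wire)
print(old_view)
print(new_view)
```

      Because the old reader looks fields up by tag rather than by position, adding tag 2 does not disturb how tag 1 is read.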

      Those changes are reflected in the attachment hadoopfs_thrift.diff.
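
      Existing client code that uses the old type names would need updating. The following is a rough sketch of a helper that rewrites the three renamed types in client source text. The function name migrate is hypothetical and not part of any attachment. It matches by identifier prefix so that generated names such as ThriftHadoopFileSystemClient are covered, which also means an unrelated identifier starting with one of the old names would be rewritten too. It does not handle the camelCase field renames.

```python
import re

# Old -> new type names, taken from the renames listed in the description.
RENAMES = {
    "ThriftHadoopFileSystem": "FileSystem",
    "ThriftHandle": "FileHandle",
    "ThriftIOException": "IOException",
}

# Match an old name only at the start of an identifier; no trailing
# boundary, so suffixed names like ...FileSystemClient are rewritten too.
_PATTERN = re.compile(
    r"\b(" + "|".join(sorted(RENAMES, key=len, reverse=True)) + r")"
)


def migrate(source: str) -> str:
    """Return source with the old Thrift type names replaced (hypothetical helper)."""
    return _PATTERN.sub(lambda m: RENAMES[m.group(1)], source)


old_client = (
    "from hadoopfs import ThriftHadoopFileSystem\n"
    "client = ThriftHadoopFileSystem.Client(..)"
)
print(migrate(old_client))
```

      The output should be reviewed by hand before use, since a plain text substitution cannot tell Thrift types apart from similarly named local identifiers.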

      Changes in generated Java, Python, Perl and C++ code are also attached in gen.diff. They were generated by a Thrift checkout from trunk
      (http://svn.apache.org/repos/asf/incubator/thrift/trunk/) as of revision
      719697, plus the following Perl-related patches:

      The Thrift jar file libthrift.jar built from that Thrift checkout is also attached, since it's needed to run the Java Thrift server.

      I have also added a new target to src/contrib/thriftfs/build.xml to build the Java bindings needed for org.apache.hadoop.thriftfs.HadoopThriftServer.java (see attachment build_xml.diff), and modified HadoopThriftServer.java to make use of the new bindings (see attachment HadoopThriftServer_java.diff).

      The jar file [^lib/hadoopthriftapi.jar] is also included, although it can be regenerated from the sources under 'gen-java' using the new 'compile-gen' Ant target.

      The whole changeset is also included as all.diff.

      Attachments

        1. all.diff (1.09 MB, Carlos Valiente)
        2. BlockManager.java (0.8 kB, Carlos Valiente)
        3. build_xml.diff (0.7 kB, Carlos Valiente)
        4. DefaultBlockManager.java (2 kB, Carlos Valiente)
        5. DFSBlockManager.java (3 kB, Carlos Valiente)
        6. gen.diff (1.06 MB, Carlos Valiente)
        7. HADOOP-4707.diff (3.02 MB, Carlos Valiente)
        8. HADOOP-4707.patch (2.30 MB, Carlos Valiente)
        9. HADOOP-4707.patch (2.30 MB, Carlos Valiente)
        10. hadoop-4707-31c331.patch.gz (257 kB, Todd Lipcon)
        11. HADOOP-4707-55c046a.txt (2.36 MB, Todd Lipcon)
        12. hadoop-4707-6bc958.txt (2.33 MB, Todd Lipcon)
        13. hadoop-4707-867f26.txt.gz (155 kB, Todd Lipcon)
        14. hadoopfs_thrift.diff (5 kB, Carlos Valiente)
        15. hadoopthriftapi.jar (124 kB, Carlos Valiente)
        16. HadoopThriftServer_java.diff (17 kB, Carlos Valiente)
        17. HadoopThriftServer.java (20 kB, Carlos Valiente)
        18. hdfs_py_venky.diff (1 kB, Carlos Valiente)
        19. hdfs.py (17 kB, Venky Iyer)
        20. libthrift.jar (91 kB, Todd Lipcon)
        21. libthrift.jar (168 kB, Carlos Valiente)
        22. libthrift.jar (91 kB, Carlos Valiente)
        23. libthrift.jar (72 kB, Carlos Valiente)

            People

              tlipcon Todd Lipcon
              carlos.valiente Carlos Valiente
              Votes: 1
              Watchers: 10