Uploaded image for project: 'Hama'
  1. Hama
  2. HAMA-815

Hama Pipes uses C++ templates

    XMLWordPrintableJSON

Details

    • New Feature
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • 0.6.3
    • 0.7.0
    • bsp core, pipes

    Description

      Extending Hama Pipes to use C++ templates

      Currently all messages are converted to strings before they are transferred over a socket communication between C++ and Java and vice versa.

      To take advantage of the binary socket communication we will serialize and deserialize basic types like int, long, float, double directly without converting to strings. This will minimize the risk of type conversation errors. Other types (except these basic types) are transferred as strings.

      It's also possible to create custom Writables and serialize and deserialize the object to string by overriding the following methods. (e.g., PipesVectorWritable and PipesKeyValueWritable)

      @Override public void readFields(DataInput in) throws IOException
      @Override public void write(DataOutput out) throws IOException
      

      Hama Streaming, which depends on Hama Pipes, is still using strings.

      The following methods change

      virtual void sendMessage(const string& peerName, const string& msg)
      virtual const string& getCurrentMessage()
      virtual void write(const string& key, const string& value)
      virtual bool readNext(string& key, string& value)

      to support C++ templates:

      virtual void sendMessage(const string& peer_name, const M& msg)
      virtual M getCurrentMessage()
      virtual void write(const K2& key, const V2& value)
      virtual bool readNext(K1& key, V1& value)

      Also SequenceFile functions uses templates:

      bool sequenceFileReadNext(int32_t file_id, K& key, V& value)
      bool sequenceFileAppend(int32_t file_id, const K& key, const V& value)

      And the native Partitioner supports it:

        template<class K1, class V1, class K2, class V2, class M>
        class Partitioner {
        public:
          virtual int partition(const K1& key, const V1& value, int32_t num_tasks) = 0;
          virtual ~Partitioner() {}
        };
      

      This will minimize type conversation errors and change the compilation procedure. Because of the nature of C++ templates, static libraries are not possible anymore. The compiler will substitute all templates at compile time.

      The compile command will look like:

      g++ -m64 -Ic++/src/main/native/utils/api \
            -Ic++/src/main/native/pipes/api \
            -Lc++/target/native \
            -lhadooputils -lpthread \
            PROGRAM.cc \
            -o PROGRAM \
            -g -Wall -O2
      

      Finally the job configuration supports the following properties:

        <property>
          <name>bsp.input.format.class&lt;/name>
          <value>org.apache.hama.bsp.KeyValueTextInputFormat</value>
        </property>
        <property>
          <name>bsp.input.key.class&lt;/name>
          <value>org.apache.hadoop.io.Text</value>
        </property>
        <property>
          <name>bsp.input.value.class&lt;/name>
          <value>org.apache.hadoop.io.Text</value>
        </property>
        <property>
          <name>bsp.output.format.class&lt;/name>
          <value>org.apache.hama.bsp.SequenceFileOutputFormat</value>
        </property>
        <property>
          <name>bsp.output.key.class&lt;/name>
          <value>org.apache.hadoop.io.Text</value>
        </property>
        <property>
          <name>bsp.output.value.class&lt;/name>
          <value>org.apache.hadoop.io.DoubleWritable</value>
        </property>
        <property>
          <name>bsp.message.class&lt;/name>
          <value>org.apache.hadoop.io.DoubleWritable</value>
        </property>
      

      Attachments

        1. HAMA-815.patch
          245 kB
          Martin Illecker

        Issue Links

          Activity

            People

              bafu Martin Illecker
              bafu Martin Illecker
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: