• Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.5.0
    • Fix Version/s: 0.6.0
    • Component/s: bsp core, messaging
    • Labels: None


      We can also add a streaming job type so that other languages can use Hama's BSP API.

      Basically, you fork a new process in the BSP method and give that process an input stream it can read from easily.
      The output stream of the child process is then read by the BSP task, which gives the child the following abilities:

      • get a received message
      • send a new message
      • sync
      • read a line from input
      • write to output
      • reset the input to reread
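
The forking step could be sketched as follows, assuming the child is started with `java.lang.ProcessBuilder` and both directions use a line-based text protocol. The class `StreamingChild` and the `ActionHandler` callback are hypothetical names for illustration, not part of Hama or of the attached patches.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.util.List;

/** Hypothetical helper: forks a child process and hands its output lines to a handler. */
public class StreamingChild {

  /** Hypothetical callback; returns the reply line for the child, or null for no reply. */
  public interface ActionHandler {
    String handle(String line) throws IOException;
  }

  /** Reads lines from the child until end of stream and writes back any reply. */
  public static void pump(BufferedReader fromChild, BufferedWriter toChild,
      ActionHandler handler) throws IOException {
    String line;
    while ((line = fromChild.readLine()) != null) {
      String reply = handler.handle(line);
      if (reply != null) {
        toChild.write(reply);
        toChild.newLine();
        toChild.flush(); // the child may be blocked waiting for this reply
      }
    }
  }

  /** Forks the command, runs the pump loop, and returns the child's exit code. */
  public static int run(List<String> command, ActionHandler handler)
      throws IOException, InterruptedException {
    Process child = new ProcessBuilder(command)
        .redirectError(ProcessBuilder.Redirect.INHERIT) // keep stderr from filling up
        .start();
    try (BufferedReader fromChild = new BufferedReader(
             new InputStreamReader(child.getInputStream(), StandardCharsets.UTF_8));
         BufferedWriter toChild = new BufferedWriter(
             new OutputStreamWriter(child.getOutputStream(), StandardCharsets.UTF_8))) {
      pump(fromChild, toChild, handler);
    }
    return child.waitFor();
  }
}
```

In this sketch the handler would be the place where a BSP task maps each action line onto the corresponding peer call (send, sync, read, write, and so on). Keeping `pump` separate from `run` lets the loop be exercised with in-memory readers and writers.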

      Those actions must have a constant prefix; for example, sending a message could look like this:

      %SEND_MESSAGE%=this is the message

      or sync:


      The logic behind it is that we can simply split each line in Java code at "=": the left-hand side is the action and the right-hand side is the value of this action.
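
A minimal sketch of that split, assuming one action per line. The class name `ActionLine` is hypothetical. It splits only at the first "=", on the assumption that a message value may itself contain "=", and treats a line without "=" as an action with an empty value.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;

/** Hypothetical parser for "ACTION=value" lines coming from the child process. */
public class ActionLine {

  /** Returns (action, value); the value is empty when the line has no '='. */
  public static Map.Entry<String, String> parse(String line) {
    int idx = line.indexOf('='); // first '=' only, so the value may contain '='
    if (idx < 0) {
      return new SimpleImmutableEntry<>(line, "");
    }
    return new SimpleImmutableEntry<>(line.substring(0, idx), line.substring(idx + 1));
  }
}
```

For the example above, `ActionLine.parse("%SEND_MESSAGE%=this is the message")` yields the action `%SEND_MESSAGE%` and the value `this is the message`.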

      Between the peers, the messages are Text. This has some overhead but is easier to implement, and the communication between the BSP task and the forked process is based on text/strings anyway.

      This time I do not advise copying the whole streaming code from Hadoop itself. However, the parts that repack the jar with the needed execution scripts and the option handling seem good to reuse.
      The input and output stream handling must be written from scratch because we want to take actions into account.


        1. HAMA-601_1.patch
          71 kB
          Thomas Jungblut
        2. HAMA-601_v2_final.patch
          69 kB
          Thomas Jungblut
        3. HAMA-601.patch
          77 kB
          Thomas Jungblut
        4. streaming_1.patch
          77 kB
          Thomas Jungblut

              Assignee: Thomas Jungblut (thomas.jungblut)
              Reporter: Thomas Jungblut (thomas.jungblut)