REEF (Retired) / REEF-429

Implement IMRU on Group Communications

Details

    • Type: New Feature
    • Status: In Progress
    • Priority: Major
    • Resolution: Unresolved
    • None
    • None
    • Component/s: REEF.NET
    • None

    Description

      Client-Side input

      The user of the IMRU API will have to provide:

      • A REEF Runtime Configuration. This will be used to submit the IMRU Driver.
      • Configurations for the IMRU components: Map, Reduce and Update function as well as the needed Codec configurations.
      • A Configuration that allows us to instantiate an instance of the (yet to be defined) IInputProviderDriver API.
      • A Configuration that allows us to instantiate an instance of the (yet to be defined) IOutputProviderDriver API.
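
      The four client-side inputs listed above can be pictured as a single bundle that is checked before submission. The following is a minimal Python sketch for illustration only, not the REEF.NET API: `Configuration`, `JobInput` and all field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional

# Hypothetical stand-in for a REEF Configuration object.
Configuration = Mapping[str, Any]


@dataclass(frozen=True)
class JobInput:
    """Everything the client hands over (all names are illustrative)."""
    runtime_configuration: Optional[Configuration]          # used to submit the IMRU Driver
    map_configuration: Optional[Configuration]
    reduce_configuration: Optional[Configuration]
    update_configuration: Optional[Configuration]
    codec_configuration: Optional[Configuration]
    input_provider_configuration: Optional[Configuration]   # for IInputProviderDriver
    output_provider_configuration: Optional[Configuration]  # for IOutputProviderDriver

    def validate(self) -> None:
        """Raise ValueError naming every input that was not supplied."""
        missing = [name for name, value in vars(self).items() if value is None]
        if missing:
            raise ValueError("missing client input: " + ", ".join(missing))
```

      Validating on the client keeps an incomplete job from reaching the Driver at all.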

      Driver side

      Constructor

      • Use IInputProviderDriver to get partition information:
          • Number of partitions
          • Partition ID
          • Partition locations (not immediately, but we will want this in the future)
      • Use that information to configure Group Communications with:
          • Data Broadcast and Reduce
          • Control Broadcast (this is to indicate to the Mappers when to close)
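
      As a rough illustration of the constructor steps above, the sketch below derives a group communication plan from the partition information. It is written in Python with hypothetical names (`Partition`, `GroupCommPlan`, `plan_group_communication`), and it assumes one mapper per partition plus a single update task, which the description does not state explicitly.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Partition:
    partition_id: str
    location: Optional[str] = None  # not needed yet, kept for future locality support


@dataclass(frozen=True)
class GroupCommPlan:
    num_tasks: int
    operators: Tuple[str, ...]


def plan_group_communication(partitions: List[Partition]) -> GroupCommPlan:
    """Size the communication group from the input partitions."""
    if not partitions:
        raise ValueError("the input provider reported no partitions")
    # Assumption: one MapTask per partition and exactly one UpdateTask.
    num_tasks = len(partitions) + 1
    return GroupCommPlan(
        num_tasks=num_tasks,
        operators=("DataBroadcast", "DataReduce", "ControlBroadcast"),
    )
```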

      OnNext(DriverStarted)

      • Request the Evaluators as per the data provider's information
      • Set a timeout for this request to be satisfied. If the request is not satisfied within that time, exit the Driver.
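
      One possible way to track the request and its timeout is sketched below in Python. `EvaluatorRequest` and its methods are hypothetical, and the clock is injectable only so the behaviour can be checked without waiting.

```python
import time
from typing import Callable


class EvaluatorRequest:
    """Tracks how many requested Evaluators have arrived before a deadline."""

    def __init__(self, requested: int, timeout_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._requested = requested
        self._allocated = 0
        self._clock = clock
        self._deadline = clock() + timeout_seconds

    def on_allocated(self) -> None:
        self._allocated += 1

    def is_satisfied(self) -> bool:
        return self._allocated >= self._requested

    def has_timed_out(self) -> bool:
        # Only an unsatisfied request can time out.
        return not self.is_satisfied() and self._clock() >= self._deadline
```

      A Driver built this way would poll `has_timed_out()` (or schedule a single check at the deadline) and exit when it returns true.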

      OnNext(AllocatedEvaluator)

      • For the UpdateFunction:
          • Assemble the Service Configurations for group communication and data output
          • Submit the UpdateTask
      • For the MapFunction:
          • Assemble the Service Configurations for group communication and data input
          • Submit the MapTask
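
      The assignment of tasks to newly allocated Evaluators could look roughly like the Python sketch below. The names are hypothetical, and giving the UpdateTask to the first Evaluator that arrives is an assumption made for the example, not something the description prescribes.

```python
from typing import Dict, List


class TaskAssigner:
    """Decides which task an allocated Evaluator should run."""

    def __init__(self, partition_ids: List[str]) -> None:
        self._pending_partitions = list(partition_ids)
        self._update_submitted = False

    def on_allocated_evaluator(self, evaluator_id: str) -> Dict[str, object]:
        # Assumption: the first Evaluator hosts the single UpdateTask.
        if not self._update_submitted:
            self._update_submitted = True
            return {
                "evaluator": evaluator_id,
                "task": "UpdateTask",
                "services": ["GroupCommunication", "DataOutput"],
            }
        if not self._pending_partitions:
            raise RuntimeError("more Evaluators allocated than tasks to run")
        return {
            "evaluator": evaluator_id,
            "task": "MapTask",
            "partition": self._pending_partitions.pop(0),
            "services": ["GroupCommunication", "DataInput"],
        }
```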

      OnNext(CompletedTask) and OnNext(CompletedEvaluator)

      • Keep track of all Tasks and make sure that they all exit cleanly
      • Set a timer when the first task completes, and fail if not all tasks have completed by the time it expires.
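
      The bookkeeping described above might be sketched as follows. This is illustrative Python with hypothetical names (`CompletionTracker`, `grace_seconds`); the clock is injected so the logic can be exercised deterministically.

```python
import time
from typing import Callable, Iterable, Optional


class CompletionTracker:
    """Starts a grace timer at the first completion and flags stragglers."""

    def __init__(self, task_ids: Iterable[str], grace_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._running = set(task_ids)
        self._grace_seconds = grace_seconds
        self._clock = clock
        self._deadline: Optional[float] = None

    def on_completed_task(self, task_id: str) -> None:
        self._running.discard(task_id)
        if self._deadline is None:
            # The timer starts when the first task completes.
            self._deadline = self._clock() + self._grace_seconds

    def all_done(self) -> bool:
        return not self._running

    def should_fail(self) -> bool:
        return (self._deadline is not None
                and not self.all_done()
                and self._clock() >= self._deadline)
```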

      OnNext(FailedTask) and OnNext(FailedEvaluator)

      • Fail the Driver (for now)

      UpdateTask

      Constructor

      • Establish and validate data output
      • Establish and validate group communications

      Call() main loop

      • Use the IUpdateFunction instance to determine whether there is a next iteration
          • If yes:
              • Send control message to the mappers, followed by the data message
          • If no:
              • Send the ending control message to the mappers
      • If the IUpdateFunction provided output, send it.
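
      The main loop above can be sketched as shown below. This is an illustrative Python version, not the REEF.NET implementation: `UpdateResult`, `ControlMessage`, the `initialize`/`update` methods and the callback parameters are all hypothetical, and the sketch assumes the reduced mapper results are fed back into the update function on each iteration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Callable, Optional


class ControlMessage(Enum):
    CONTINUE = 1
    STOP = 2


@dataclass
class UpdateResult:
    map_input: Optional[Any] = None  # None means there is no next iteration
    output: Optional[Any] = None     # None means nothing to write


def run_update_loop(update_function: Any,
                    control_broadcast: Callable[[ControlMessage], None],
                    data_broadcast: Callable[[Any], None],
                    reduce_receive: Callable[[], Any],
                    write_output: Callable[[Any], None]) -> None:
    result = update_function.initialize()
    while result.map_input is not None:
        # Next iteration: control message first, then the data message.
        control_broadcast(ControlMessage.CONTINUE)
        data_broadcast(result.map_input)
        if result.output is not None:
            write_output(result.output)
        # Assumption: the reduced mapper results drive the next update.
        result = update_function.update(reduce_receive())
    # No next iteration: tell the mappers to end, then emit any final output.
    control_broadcast(ControlMessage.STOP)
    if result.output is not None:
        write_output(result.output)
```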

      MapTask

      Constructor

      • Establish and validate data input
      • Establish and validate group communications

      Call() main loop

      • Wait for control message.
      • If there is another iteration, call the IMapFunction with the data received on the data broadcast.
      • If the computation is to end, exit the loop.
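
      A matching sketch of the mapper side is given below in Python. The names are hypothetical, and forwarding each map result to the reduce operator is an assumption inferred from the "Data Broadcast and Reduce" setup rather than something this section spells out.

```python
from enum import Enum
from typing import Any, Callable


class ControlMessage(Enum):
    CONTINUE = 1
    STOP = 2


def run_map_loop(map_function: Callable[[Any], Any],
                 receive_control: Callable[[], ControlMessage],
                 receive_data: Callable[[], Any],
                 send_to_reduce: Callable[[Any], None]) -> int:
    """Run until the STOP control message arrives; return the iteration count."""
    iterations = 0
    while True:
        # Block until the update task says whether to continue.
        if receive_control() is ControlMessage.STOP:
            return iterations
        # Assumption: each map result is sent on to the reduce operator.
        send_to_reduce(map_function(receive_data()))
        iterations += 1
```

      Sending the control message before the data message means a mapper never blocks on a data broadcast that will not arrive.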

        Issue Links

          1. Update IMRU APIs to enable writing of REEF IMRU driver and client (Sub-task, Resolved, Dhruv Mahajan)
          2. Introduce MapInputWithControlMessage (Sub-task, Resolved, Dhruv Mahajan)
          3. Implement MapTaskHost, UpdateTaskHost and ConfigurationManager required by IMRUDriver (Sub-task, Resolved, Dhruv Mahajan)
          4. Introduce IFileSystem in IUpdateFunction to write results (Sub-task, Resolved, Dhruv Mahajan)
          5. Implement IMRU Driver, Client, End to End MapperCount and BroadCast Reduce Examples (Sub-task, Closed, Dhruv Mahajan)
          6. Implement IMRU Client (Sub-task, Closed, Dhruv Mahajan)
          7. Introduce timeout for Evaluator request to be satisfied in IMRU (Sub-task, Open, Unassigned)
          8. Implement IMRU driver and client (Sub-task, Resolved, Dhruv Mahajan)
          9. Create empty REEF.IMRU.Examples project (Sub-task, Resolved, Dhruv Mahajan)
          10. Introduce Integer Array BroadcastReduce IMRU example code (Sub-task, Resolved, Dhruv Mahajan)
          11. Move MapperCount example to REEF.IMRU.Examples (Sub-task, Resolved, Dhruv Mahajan)
          12. Write the code to call IMRU Client for MapperCount and BroadcastReduce examples (Sub-task, Resolved, Dhruv Mahajan)
          13. Passing Mapper Specific configuration in IMRU (Sub-task, Resolved, Unassigned)
          14. Add Evaluator memory option to IMRU and its examples (Sub-task, Resolved, Dhruv Mahajan)
          15. Introduce example that actually uses IPerMapperConfigs in IMRU (Sub-task, Resolved, Joo Seong Jeong)
          16. Introduce retry logic for failed evaluators (Sub-task, Resolved, Dhruv Mahajan)
          17. Fix issues with InProcessIMRU (Sub-task, Resolved, Dhruv Mahajan)
          18. Make Named Parameter PerMapConfigGeneratorSet public (Sub-task, Resolved, Dhruv Mahajan)
          19. TcpPortProvider configuration is not passed correctly to evaluators (Sub-task, Resolved, Dhruv Mahajan)
          20. IParititon Configuration does not get passed to task from Root context in IMRU Driver (Sub-task, Resolved, Dhruv Mahajan)
          21. IMRU Map and Update task and driver have verbose setting on by default (Sub-task, Resolved, Dhruv Mahajan)
          22. IMRU Map and Update host tasks take a lot of internal memory (Sub-task, Resolved, Dhruv Mahajan)
          23. IMRUJobDefinition does not allow user to set number of cores (Sub-task, Resolved, Dhruv Mahajan)
          24. IIMRUClient does not need to have generic type (Sub-task, Resolved, Dhruv Mahajan)
          25. IIMRUClient design needs improvement (Sub-task, Open, Dhruv Mahajan)
          26. IIMRUClient does not return httpDriverEndPoint (Sub-task, Resolved, Dhruv Mahajan)

            People

              dkm2110 Dhruv Mahajan
              markus.weimer Markus Weimer
              Votes: 0
              Watchers: 1