Details

    • Umbrella
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • None
    • 0.1
    • None

    Description

      Task cloning is useful for things like straggler handling and skew handling.

      Some things to consider when implementing task cloning in Nemo:

      • In JobStateManager, support multiple TaskStates, and specify what they mean for the corresponding StageStates
      • In BlockManager, handle multiple BlockState transitions for the same block
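
      One possible way to read the two points above is sketched below. This is only an illustration under assumed rules: a task counts as complete once any of its attempts (the original or a clone) completes, a stage is complete once all of its tasks are, and the first attempt to commit a block wins. All class, enum, and method names are hypothetical and are not Nemo's actual JobStateManager or BlockManager APIs.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical names throughout; these are not Nemo's actual classes.
final class CloneAwareStates {
  enum AttemptState { READY, EXECUTING, COMPLETE, FAILED }
  enum StageState { INCOMPLETE, COMPLETE }

  // Assumed rule: a task is done as soon as any one attempt (original or clone) completes.
  static boolean isTaskComplete(List<AttemptState> attempts) {
    return attempts.contains(AttemptState.COMPLETE);
  }

  // Assumed rule: a stage is complete only when every one of its tasks is complete.
  static StageState stageState(Map<String, List<AttemptState>> attemptsByTask) {
    boolean allDone = attemptsByTask.values().stream()
        .allMatch(CloneAwareStates::isTaskComplete);
    return allDone ? StageState.COMPLETE : StageState.INCOMPLETE;
  }

  // Assumed rule for blocks: the first attempt to commit a block wins,
  // and later commits of the same block by other clones are rejected.
  static final class BlockCommitLog {
    private final Map<String, Integer> committedBy = new ConcurrentHashMap<>();

    // Returns true only for the first attempt that commits the given block.
    boolean tryCommit(String blockId, int attemptIndex) {
      return committedBy.putIfAbsent(blockId, attemptIndex) == null;
    }
  }
}
```

      Under these assumptions, a failed or still-running clone never moves a stage backwards as long as a sibling attempt has completed, and a slower clone that tries to commit an already committed block is told to discard its output. Other policies (for example, last writer wins) are equally possible.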

      By contrast, to handle stragglers, Google Cloud Dataflow relies more on task splitting, which divides the remaining work of a task and assigns it to multiple new tasks. This differs from traditional task cloning, which creates clones that do the same work as the original task.
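
      The difference between the two techniques can be illustrated with a small sketch. It assumes, purely for illustration, that a task's work is a half-open range of record indices and that the scheduler knows how far the original task has progressed. The names are hypothetical and describe neither Dataflow's nor Nemo's implementation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical model: a task's work is a half-open range [start, end) of record indices.
final class StragglerSketch {
  static final class Range {
    final long start;
    final long end;

    Range(long start, long end) {
      this.start = start;
      this.end = end;
    }
  }

  // Cloning: every clone receives the same full range as the original task.
  static List<Range> cloneTask(Range original, int copies) {
    return new ArrayList<>(Collections.nCopies(copies, original));
  }

  // Splitting: only the unprocessed remainder [processedUpTo, end) is divided
  // as evenly as possible among up to `parts` new tasks.
  static List<Range> splitRemaining(Range original, long processedUpTo, int parts) {
    if (parts <= 0 || processedUpTo < original.start || processedUpTo > original.end) {
      throw new IllegalArgumentException("invalid split request");
    }
    List<Range> result = new ArrayList<>();
    long remaining = original.end - processedUpTo;
    long base = remaining / parts;
    long extra = remaining % parts;
    long cursor = processedUpTo;
    for (int i = 0; i < parts; i++) {
      long size = base + (i < extra ? 1 : 0);
      if (size > 0) {
        result.add(new Range(cursor, cursor + size));
      }
      cursor += size;
    }
    return result;
  }
}
```

      For a task over [0, 100) that has processed 40 records, cloneTask with 2 copies yields two tasks that each redo [0, 100), whereas splitRemaining with 3 parts yields [40, 60), [60, 80), and [80, 100), so no work is duplicated.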

      https://cloud.google.com/blog/big-data/2016/05/no-shard-left-behind-dynamic-work-rebalancing-in-google-cloud-dataflow

      A recent paper on skew handling (EuroSys '18) also uses the task splitting technique.

      https://infoscience.epfl.ch/record/253574/files/hurricane.pdf

      Before jumping into implementing task 'cloning' or 'splitting', we may want to think about our priorities, and about whether we can design a more general, lower-level interface that can express both techniques.

            People

              Assignee: Unassigned
              Reporter: John Yang (johnyangk)