
Details

    • Type: Sub-task
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved

    Description

      To decrease job startup time we should implement worker pools.

      Worker pools should start BSPTask JVMs ahead of time, based on the configured task capacity.
      This should greatly improve cold-start time for jobs. However, the startup cost is quite low compared to the runtime of a long-running Hama task.
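
      A minimal sketch of the pooling idea, not a proposed implementation. The names WorkerPool, launcher, tryAcquire and release are hypothetical and are not part of Hama. The launcher stands in for whatever actually forks a BSPTask JVM, and the capacity is passed in directly instead of being read from a specific configuration key.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Supplier;

/** Hypothetical sketch: a fixed-size pool of pre-started workers. */
final class WorkerPool<W> {
    // Workers that are already started and currently idle.
    private final BlockingQueue<W> idle;

    /**
     * Eagerly starts `capacity` workers so that no job pays the launch cost later.
     * `launcher` is an assumption: it represents the code that forks a task JVM.
     */
    WorkerPool(int capacity, Supplier<W> launcher) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        idle = new ArrayBlockingQueue<>(capacity);
        for (int i = 0; i < capacity; i++) {
            idle.add(launcher.get());
        }
    }

    /** Hands out a warm worker, or returns null if all workers are busy. */
    W tryAcquire() {
        return idle.poll();
    }

    /** Returns a worker to the pool so the next task can reuse it. */
    void release(W worker) {
        if (!idle.offer(worker)) {
            throw new IllegalStateException("pool is already full");
        }
    }

    /** Number of workers currently idle. */
    int idleCount() {
        return idle.size();
    }
}
```

      In this sketch a job would call tryAcquire() for each task slot it needs and release() when the task finishes, so the launch cost is paid once per pool and not once per task.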

      The idea is from http://www.slideshare.net/hanborq/hanborq-optimizations-on-hadoop-mapreduce-20120216a (slide 4). Google Tenzing uses this approach, and I have read that the Gmail Priority Inbox jobs also use this kind of task reuse.

      This will be the first of a number of tasks to profile and improve the startup time of jobs and of the cluster. (An umbrella issue will follow.)

          People

            Assignee: Unassigned
            Reporter: Thomas Jungblut (thomas.jungblut)
            Votes: 0
            Watchers: 3

            Dates

              Created:
              Updated: