CRUNCH-294

Cost-based job planning

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.9.0, 0.8.2
    • Component/s: Core
    • Labels:
      None

      Description

      A bug report on the user list drove me to revisit some of the core planning logic, particularly around how we decide where to split up DoFns between two dependent MapReduce jobs.

      I found an old TODO about using the scale factor from a DoFn to decide where to split up the nodes between dependent GBKs, so I implemented a new version of the split algorithm. It takes advantage of the support we've propagated for multiple outputs on both the map and reduce sides of a job to do finer-grained splits, and it uses information from the scaleFactor calculations to make smarter split decisions.
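
      To make the idea concrete, here is a minimal sketch of choosing a split point from scale factors. It is not the code from the attached patches: the class name, the method `chooseSplit`, and the "smallest estimated intermediate size wins" rule are all assumptions made for illustration. Each DoFn between two GBKs is reduced to a single scale factor, and the cut is placed where the estimated amount of data written between the two jobs is smallest.

```java
/** Hypothetical illustration only; not the planner code from the CRUNCH-294 patches. */
final class SplitPointSketch {

  /**
   * scaleFactors[i] is the assumed scale factor of the i-th DoFn between two GBKs.
   * A cut point k (0..n) means the first k DoFns run on the reduce side of the
   * upstream job and the remaining ones run on the map side of the downstream job.
   * Returns the cut point with the smallest estimated intermediate size;
   * ties are resolved in favor of the earliest cut.
   */
  static int chooseSplit(double inputSize, float[] scaleFactors) {
    int best = 0;
    double bestSize = inputSize; // size written if we cut before any DoFn runs
    double size = inputSize;
    for (int i = 0; i < scaleFactors.length; i++) {
      size *= scaleFactors[i];   // estimated size after the i-th DoFn
      if (size < bestSize) {
        bestSize = size;
        best = i + 1;
      }
    }
    return best;
  }

  public static void main(String[] args) {
    // Estimated sizes at cut points 0..3 are roughly 1000, 2000, 200, 600.
    float[] chain = {2.0f, 0.1f, 3.0f};
    System.out.println(chooseSplit(1000.0, chain)); // prints 2
  }
}
```

      In this toy model the cut lands after the second DoFn, because that is where the least data would be written between the two jobs.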

      One high-level change along with this: I changed the default scaleFactor() value in DoFn to 0.99f, so that the planner slightly prefers writes that occur later in a pipeline flow.
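
      The effect of a default just below 1 can be sketched as follows. To stay self-contained, the example uses a stand-in class (`SketchFn`, hypothetical) rather than the real Crunch DoFn, and it models only the scaleFactor() hint. The subclass `HalvingFilterFn` and the helper `estimatedSizes` are likewise invented for illustration.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration only; SketchFn stands in for a DoFn and models just the size hint. */
final class ScaleFactorSketch {

  static class SketchFn {
    // Default hint taken from the description above: output is assumed
    // to be very slightly smaller than input.
    float scaleFactor() {
      return 0.99f;
    }
  }

  /** A function that expects to drop about half of its input overrides the hint. */
  static class HalvingFilterFn extends SketchFn {
    @Override
    float scaleFactor() {
      return 0.5f;
    }
  }

  /** Returns the estimated data size after each function in the chain. */
  static double[] estimatedSizes(double inputSize, List<SketchFn> chain) {
    double[] sizes = new double[chain.size()];
    double size = inputSize;
    for (int i = 0; i < chain.size(); i++) {
      size *= chain.get(i).scaleFactor();
      sizes[i] = size;
    }
    return sizes;
  }

  public static void main(String[] args) {
    List<SketchFn> chain =
        Arrays.asList(new SketchFn(), new HalvingFilterFn(), new SketchFn());
    for (double s : estimatedSizes(1000.0, chain)) {
      System.out.println(s);
    }
  }
}
```

      Because every default step multiplies the estimate by a value just under 1, each later point in a chain of default functions has a strictly smaller estimate than the one before it. A size-minimizing choice therefore leans toward the later point without overriding functions that report a much larger or smaller factor.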

        Attachments

        1. jobplan-lopsided.png
          91 kB
          Gabriel Reid
        2. jobplan-large_s2_s3.png
          87 kB
          Gabriel Reid
        3. jobplan-default-old.png
          87 kB
          Gabriel Reid
        4. jobplan-default-new.png
          86 kB
          Gabriel Reid
        5. CRUNCH-294e.patch
          18 kB
          Josh Wills
        6. CRUNCH-294d.patch
          18 kB
          Josh Wills
        7. CRUNCH-294c.patch
          20 kB
          Josh Wills
        8. CRUNCH-294b.patch
          22 kB
          Josh Wills
        9. CRUNCH-294.patch
          6 kB
          Josh Wills

            People

            • Assignee:
              jwills Josh Wills
              Reporter:
              jwills Josh Wills