  1. Crunch (Retired)
  2. CRUNCH-390

Planner is not adding dependencies between jobs when planning is done in more than one stage.

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.8.2
    • Fix Version/s: 0.10.0, 0.8.3
    • Component/s: Core
    • Labels: None

    Description

      The planner does the planning in multiple stages when it finds job dependencies on ReadableData. One example of this case is when using the BloomFilterJoinStrategy.
      While the generated plan dot file looks good, the planner actually does not add dependencies between jobs that are created in different planning stages.
      I have a pipeline that reads 3 input sources. It joins 2 of them using a bloom filter join strategy. Later on, it joins this with the output of a job coming from the third source path.
      If the jobs on the branch using the bloom filter finish before the one reading the third source, the executor attempts to start the 4th job, which is supposed to join everything, before the 3rd one finishes, resulting in an input path not found exception.
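
      The failure mode described above can be sketched without Crunch itself. The snippet below is a minimal illustration, not Crunch's planner or executor code: the `Job` class, the job names, and the assumption that the missing edge is the one from the 4th job to the 3rd are all hypothetical stand-ins. It shows that a readiness check based only on recorded dependencies lets the final join start while the third-source job is still running, unless the cross-stage edge is recorded.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

public class CrossStageDependencySketch {

    /** Hypothetical stand-in for a planned MapReduce job; not a Crunch class. */
    static final class Job {
        final String name;
        final Set<Job> dependencies = new LinkedHashSet<>();

        Job(String name) {
            this.name = name;
        }

        void dependsOn(Job other) {
            dependencies.add(other);
        }

        /** A job may start only once every job it depends on has finished. */
        boolean isReady(Set<Job> finished) {
            return finished.containsAll(dependencies);
        }
    }

    /**
     * Builds the four jobs from the report and returns whether job 4 looks
     * ready to start while job 3 is still running.
     */
    public static boolean job4ReadyWhileJob3Runs(boolean linkAcrossStages) {
        Job bloomBuild = new Job("job1-bloom-filter");
        Job bloomJoin = new Job("job2-bloom-join");
        Job thirdSource = new Job("job3-third-source");
        Job finalJoin = new Job("job4-final-join");

        // Dependencies on the bloom filter branch.
        bloomJoin.dependsOn(bloomBuild);
        finalJoin.dependsOn(bloomJoin);

        if (linkAcrossStages) {
            // Assumption: this is the edge that is lost between planning stages.
            finalJoin.dependsOn(thirdSource);
        }

        // The bloom filter branch finished first; job 3 has not finished yet.
        Set<Job> finished = new HashSet<>(Arrays.asList(bloomBuild, bloomJoin));
        return finalJoin.isReady(finished);
    }

    public static void main(String[] args) {
        System.out.println("edge missing: job 4 ready = " + job4ReadyWhileJob3Runs(false));
        System.out.println("edge present: job 4 ready = " + job4ReadyWhileJob3Runs(true));
    }
}
```

      With the edge missing, job 4 is reported as ready even though job 3 has not produced its output, which matches the input path error in the report. With the edge present, job 4 waits.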

          People

            Assignee: jwills Josh Wills
            Reporter: cmarius Ioan Marius Curelariu
            Votes: 0
            Watchers: 2

            Dates

              Created:
              Updated:
              Resolved: