SystemDS / SYSTEMDS-2950

Threads wait forever for a removed entry in the lineage cache

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: SystemDS 2.1
    • Component/s: None
    • Labels: None

    Description

      With parfor, we ensure reuse across threads. The first thread to reach an instruction puts a placeholder in the lineage cache and continues executing, while the other threads wait on that placeholder. When the instruction finishes, the first thread fills the placeholder with the result and wakes up all waiting threads, so that they can reuse the result and skip executing the instruction.

      However, in some cases (e.g. the output is a frame, the output is federated, or the output is larger than the cache), the first thread simply removes the placeholder after executing the instruction, so all the other sleeping threads wait forever. This leaves the parfor in a hung state.
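
      The placeholder protocol and the hang can be illustrated with a small, language-neutral sketch. This is not SystemDS code: the names `Entry`, `ToyLineageCache`, `probe_or_reserve`, `mark_removed`, and `run_task` are hypothetical, and the remedy shown (waking the waiters on removal so they fall back to computing the result themselves) is only one plausible way to avoid the hang, not necessarily the fix that was committed.

```python
import threading


class Entry:
    """A cache slot that starts out as an empty placeholder (hypothetical)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._resolved = False   # True once the slot is filled or removed
        self._removed = False
        self._value = None

    def fill(self, value):
        # Normal path: store the result and wake every waiting thread.
        with self._cond:
            self._value = value
            self._resolved = True
            self._cond.notify_all()

    def mark_removed(self):
        # Removal path: waiters must be woken here as well.
        # Skipping this notify_all() reproduces the hang described above.
        with self._cond:
            self._removed = True
            self._resolved = True
            self._cond.notify_all()

    def wait_for_value(self):
        """Block until the slot is resolved; return (hit, value)."""
        with self._cond:
            while not self._resolved:
                self._cond.wait()
            return (not self._removed, self._value)


class ToyLineageCache:
    """Minimal stand-in for a shared cache keyed by lineage (hypothetical)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._entries = {}

    def probe_or_reserve(self, key):
        """Return (entry, is_owner); the owner must fill or remove the entry."""
        with self._lock:
            entry = self._entries.get(key)
            if entry is not None:
                return entry, False
            entry = Entry()
            self._entries[key] = entry
            return entry, True

    def remove(self, key):
        with self._lock:
            entry = self._entries.pop(key, None)
        if entry is not None:
            entry.mark_removed()


def run_task(cache, key, compute, cacheable):
    """Execute one instruction with cross-thread reuse."""
    entry, is_owner = cache.probe_or_reserve(key)
    if is_owner:
        result = compute()
        if cacheable(result):
            entry.fill(result)
        else:
            cache.remove(key)   # e.g. frame, federated, or oversized output
        return result
    hit, value = entry.wait_for_value()
    # If the placeholder was removed, fall back to executing the instruction.
    return value if hit else compute()
```

      In this sketch a waiter that is woken by a removal simply recomputes the result, which gives up reuse for that instruction but keeps every thread making progress.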

          People

            Assignee: Arnab Phani
            Reporter: Arnab Phani
            Votes: 0
            Watchers: 1
