Apache NiFi / NIFI-8765

WAIT/NOTIFY processor problems

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.13.2
    • Fix Version/s: None
    • Component/s: Extensions
    • Labels:
    • Environment:
      Ubuntu 20.04 LTS
      SapMachine Java 11

      Description

      2021-07-09 - Please ignore this ticket for the moment.

       

      ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

      issue A

      WAIT/NOTIFY uses MapCache

      Using a MapCache is a problem because the notify messages will expire from the cache. Expiring notify messages from the cache is necessary, but the conditions for doing so are difficult to get right.

       

      DistributedMapCacheServer

          With a persistence directory: this doesn't work well. It causes problems under heavy load and is very likely to run out of memory (OOM). Even when a persistence directory is used, all data is kept in memory. If the server goes OOM, crashes, or is restarted, data loss is very likely because data is only persisted to disk every now and then.

          Without a persistence directory: data is lost at restart.
          The cache can't be cleared and keeps using Java heap until NiFi is restarted.
          Eviction options: "least recently used" or "first in, first out".
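To illustrate how the two eviction options listed above differ, here is a minimal sketch built on plain `java.util.LinkedHashMap`. It is not the DistributedMapCacheServer implementation; the class name `BoundedCache` and the capacity of 2 are made up for the example. With either policy, a notify entry can be evicted before the matching message reaches WAIT.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; this is NOT NiFi's cache server code.
final class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    // accessOrder = true  -> the least recently used entry is evicted first
    // accessOrder = false -> entries are evicted first in, first out
    BoundedCache(int capacity, boolean accessOrder) {
        super(16, 0.75f, accessOrder);
        this.capacity = capacity;
    }

    // Called by LinkedHashMap after each insert; returning true drops the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }

    public static void main(String[] args) {
        BoundedCache<String, String> lru = new BoundedCache<>(2, true);
        lru.put("a", "1");
        lru.put("b", "2");
        lru.get("a");       // touching "a" makes "b" the eldest entry
        lru.put("c", "3");  // evicts "b"
        System.out.println("LRU keeps  " + lru.keySet());

        BoundedCache<String, String> fifo = new BoundedCache<>(2, false);
        fifo.put("a", "1");
        fifo.put("b", "2");
        fifo.get("a");      // reads do not change insertion order
        fifo.put("c", "3"); // evicts "a"
        System.out.println("FIFO keeps " + fifo.keySet());
    }
}
```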

      Hazelcast

          In-memory only, so data is lost if a restart happens.
          The cache can't be cleared and keeps using Java heap until NiFi is restarted.
          Likely to go OOM under heavy load.
          Eviction is based on time.

      When messages go to the success relationship after the WAIT step, there should be a way to evict the corresponding notify entry from the cache.
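The behaviour requested here could look roughly like the sketch below: the release check and the eviction happen in one atomic step, so the entry does not linger until a size-based or time-based eviction removes it. This is an in-process stand-in using `ConcurrentHashMap`, not NiFi code; `SignalStore`, `notifySignal` and `tryRelease` are hypothetical names.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-process stand-in for the signal cache; not NiFi code.
final class SignalStore {
    private final ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();

    // NOTIFY side: increment the counter for a signal identifier.
    void notifySignal(String signalId) {
        counts.merge(signalId, 1, Integer::sum);
    }

    // WAIT side: if the target count is reached, release AND evict atomically.
    boolean tryRelease(String signalId, int targetCount) {
        boolean[] released = {false};
        counts.computeIfPresent(signalId, (id, count) -> {
            if (count >= targetCount) {
                released[0] = true;
                return null; // returning null removes the entry from the map
            }
            return count;    // not enough signals yet, keep the entry
        });
        return released[0];
    }

    int size() {
        return counts.size();
    }

    public static void main(String[] args) {
        SignalStore store = new SignalStore();
        store.notifySignal("uuid-1");
        System.out.println(store.tryRelease("uuid-1", 1)); // true
        System.out.println(store.size());                  // 0
        System.out.println(store.tryRelease("uuid-1", 1)); // false
    }
}
```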

      ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

      issue B 

       

      I have a WAIT processor and 2000 files queued up in the upstream connection.

      The problem is that the WAIT processor uses more than 100% CPU for as long as the messages are on the queue.
      Any idea what causes this?

       

      Release Signal Identifier = ${uuid}

      Target Signal Count = 1

      Signal Counter Name = No value set

      Wait Buffer Count = 1500

      Releasable FlowFile Count = 1

      Expiration Duration = 10 min

      Attribute Copy Mode = Keep original

      Wait Mode = Keep in the upstream connection

      Wait Penalty Duration = 5 min

       

      Number of Threads = 10

      ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

      issue C 

       

      When a notify message arrives, it should trigger release from the WAIT processor immediately.

      Instead, the WAIT processor only checks the cache at certain points:

          a - when messages first hit the WAIT processor, or after looping through the wait relationship

          b - when the "Wait Penalty Duration" of a message expires and the message is checked again; with 100 000 000 messages on the queue this will be problematic
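For comparison, a push-style release, where the notify itself triggers the waiting side, can be sketched as follows. This is only an in-process illustration using `CompletableFuture`; it says nothing about how this would work across a cache server or a cluster, and `PushRelease`, `onRelease` and `notifySignal` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of notify-driven release; not how the WAIT processor works.
final class PushRelease {
    private final ConcurrentHashMap<String, CompletableFuture<Void>> signals =
            new ConcurrentHashMap<>();

    private CompletableFuture<Void> signal(String id) {
        return signals.computeIfAbsent(id, k -> new CompletableFuture<>());
    }

    // WAIT side: register what should happen once the signal arrives.
    // If the signal has already arrived, the action runs immediately.
    void onRelease(String id, Runnable action) {
        signal(id).thenRun(action);
    }

    // NOTIFY side: completing the future runs the registered actions right away.
    // A real implementation would also remove completed entries from the map.
    void notifySignal(String id) {
        signal(id).complete(null);
    }

    static List<String> demo() {
        List<String> released = new ArrayList<>();
        PushRelease push = new PushRelease();
        push.onRelease("f1", () -> released.add("f1"));
        push.onRelease("f2", () -> released.add("f2"));
        push.notifySignal("f2"); // only "f2" is released; no scanning involved
        return released;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [f2]
    }
}
```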

      ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

       

       

      I'd planned to have more than 100 000 000 messages on the queue and then let them expire after 7 days. Some or all messages may be notified to proceed. Is the WAIT processor intended to handle such a scenario? More extensive documentation would be good.
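As a rough back-of-the-envelope check of the scale involved: assuming every queued message is re-checked against the cache once per Wait Penalty Duration (a simplification), and borrowing the 5 min value from the configuration in issue B, the cache would have to serve the following number of lookups per second. `RecheckRate` and `lookupsPerSecond` are names invented for this example.

```java
// Hypothetical estimate; assumes one cache lookup per message per penalty period.
final class RecheckRate {
    static long lookupsPerSecond(long queuedMessages, long penaltySeconds) {
        return queuedMessages / penaltySeconds;
    }

    public static void main(String[] args) {
        // 100 000 000 messages, Wait Penalty Duration = 5 min = 300 s
        System.out.println(lookupsPerSecond(100_000_000L, 5 * 60)); // 333333
    }
}
```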

      Are you supposed to run multiple threads on the WAIT/NOTIFY processors? It seems the MapCache runs into problems with ConcurrentModificationException. More extensive documentation would be good.
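The ticket has no stack trace, so the cause inside NiFi is unknown. The sketch below only shows the general Java failure mode behind this exception: a plain `HashMap` that is modified while it is being iterated fails fast, whereas `ConcurrentHashMap` iterators are weakly consistent and do not throw. `CmeDemo` is a hypothetical name.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// General illustration of ConcurrentModificationException; not a NiFi reproduction.
final class CmeDemo {
    // Removes entries while iterating the key set.
    // Returns true if the iteration failed with ConcurrentModificationException.
    static boolean removeWhileIterating(Map<String, Integer> map) {
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        try {
            for (String key : map.keySet()) {
                map.remove(key); // structural change during iteration
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(removeWhileIterating(new HashMap<>()));           // true
        System.out.println(removeWhileIterating(new ConcurrentHashMap<>())); // false
    }
}
```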

            People

            • Assignee: Unassigned
            • Reporter: tomten1970 Jul Tomten