Apache NiFi / NIFI-3273

MinimalLockingWriteAheadLog doesn't properly handle corrupted journals

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.2.0
    • Component/s: None
    • Labels: None

    Description

      If the system dies abruptly (for example, from a sudden power loss) while NiFi is running and before writes have been flushed, anything that was being written to disk can become corrupted. A ticket already exists for the provenance repository [1]. The content repo handles this automatically, since a content claim won't be valid if it hasn't been written out yet. The database repo is just a cache and is rebuilt anyway. The logs are handled by logback. The flow.xml.gz can be rolled back (manually) to the last archived copy.

      This ticket is for the MinimalLockingWriteAheadLog, which backs the FlowFile repo and local state. The problem was originally raised for MiNiFi [2], but it also affects NiFi.

      One possible solution is to restore transactions up to the corrupted transaction ID and then ignore the rest. This could cause state to become out of sync with the processed FlowFiles (if the FlowFile repo is restored but local state cannot be fully restored), but given the rarity of the event I think it is an appropriate risk to accept.
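
A minimal sketch of the "restore up to the corruption, then ignore the rest" idea. The record layout here (transaction ID, payload length, payload, CRC32 of the payload) is hypothetical and is not the actual MinimalLockingWriteAheadLog journal format; the class and method names are also invented for illustration.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.CRC32;

public class JournalRecoverySketch {

    // Upper bound on a payload; guards against allocating a huge array
    // when a corrupted length field is read. The value is an assumption.
    static final int MAX_PAYLOAD = 1 << 20;

    /** One recovered transaction (hypothetical shape). */
    public static final class Transaction {
        public final long id;
        public final byte[] payload;

        Transaction(long id, byte[] payload) {
            this.id = id;
            this.payload = payload;
        }
    }

    /** Appends one record: id, payload length, payload, CRC32 of the payload. */
    public static void writeRecord(DataOutputStream out, long id, byte[] payload) throws IOException {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        out.writeLong(id);
        out.writeInt(payload.length);
        out.write(payload);
        out.writeLong(crc.getValue());
    }

    /**
     * Replays records in order and stops at the first record that is
     * truncated or fails its checksum. Everything after that point is ignored.
     */
    public static List<Transaction> recover(InputStream journal) throws IOException {
        List<Transaction> recovered = new ArrayList<>();
        DataInputStream in = new DataInputStream(journal);
        while (true) {
            try {
                long id = in.readLong();
                int length = in.readInt();
                if (length < 0 || length > MAX_PAYLOAD) {
                    break; // implausible length: treat as corruption
                }
                byte[] payload = new byte[length];
                in.readFully(payload);
                long expected = in.readLong();

                CRC32 crc = new CRC32();
                crc.update(payload, 0, payload.length);
                if (crc.getValue() != expected) {
                    break; // checksum mismatch: stop restoring here
                }
                recovered.add(new Transaction(id, payload));
            } catch (EOFException e) {
                break; // clean end of journal, or a record cut off mid-write
            }
        }
        return recovered;
    }
}
```

In this sketch, the caller would apply the returned transactions and discard the remainder of the journal, which is where the state could fall out of sync as described above.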

      The workaround for the FlowFile repo is to set "nifi.flowfile.repository.always.sync" to "true", but there is currently no way to set "always sync" for the local state provider.
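
For context, a rough sketch of what an "always sync" setting generally implies at the file level: each write is forced to the storage device before the call returns, trading throughput for durability. This is not NiFi's implementation; the class name, method name, and `alwaysSync` parameter are hypothetical, and only the standard `java.nio` calls are real.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class AlwaysSyncSketch {

    /**
     * Appends data to the file. When alwaysSync is true, the data and file
     * metadata are forced to the storage device before the method returns.
     */
    public static void append(Path file, byte[] data, boolean alwaysSync) throws IOException {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.APPEND)) {
            ByteBuffer buffer = ByteBuffer.wrap(data);
            while (buffer.hasRemaining()) {
                channel.write(buffer); // may write fewer bytes than requested
            }
            if (alwaysSync) {
                channel.force(true); // flush content and metadata to disk
            }
        }
    }
}
```

Without the `force` call, the bytes may sit in operating system buffers after the write returns, which is the window in which an abrupt power loss can leave a partially written journal.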

      [1] https://issues.apache.org/jira/browse/NIFI-2890
      [2] https://community.hortonworks.com/questions/75280/why-does-my-minifi-flow-fail-to-run-when-turning-o.html

            People

              joewitt Joe Witt
              jpercivall Joe Percivall
              Votes: 0
              Watchers: 7