Apache NiFi / NIFI-6923

If FlowFile content is truncated in the Content Repository, NiFi does not throw Exception when reading the content

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.11.0
    • Component/s: Core Framework
    • Labels: None

    Description

      If the content of a FlowFile is truncated in the Content Repository, whenever NiFi attempts to read the content, it should throw a ContentNotFoundException because only part of the content was available. This is handled by the `FlowFileAccessInputStream` in the `ensureAllContentRead` method.

      However, in some cases this doesn't happen. To replicate, create the following flow:

      GenerateFlowFile -> MergeContent.

      In GenerateFlowFile, choose to use a batch size of 1000 FlowFiles, each 1 KB in size. Run the Processor once. Then, use vi to truncate a few bytes from the end of the file in the content repository. Then, run MergeContent. The processor should throw an Exception but doesn't.

      I think the problem is that the `FlowFileAccessInputStream.read(byte[])` calls `super.read(byte[])`. This, in turn, calls `FlowFileAccessInputStream.read(byte[], int, int)`, which increments `bytesConsumed`. This method then returns, as does the super call. Then, the `FlowFileAccessInputStream.read(byte[])` call increments `bytesConsumed` again.
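The double counting described above can be reproduced in isolation. The following is a minimal sketch, not NiFi's actual code: `DoubleCountingStream` and `DoubleCountDemo` are hypothetical stand-ins, and the sketch assumes the stream extends `FilterInputStream`, whose `read(byte[])` dispatches to the overridden `read(byte[], int, int)`.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical stand-in for FlowFileAccessInputStream; not NiFi's actual code.
class DoubleCountingStream extends FilterInputStream {
    long bytesConsumed = 0;

    DoubleCountingStream(InputStream in) {
        super(in);
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        int n = in.read(b, off, len);
        if (n > 0) {
            bytesConsumed += n; // first increment
        }
        return n;
    }

    @Override
    public int read(byte[] b) throws IOException {
        // FilterInputStream.read(byte[]) calls this.read(b, 0, b.length),
        // which has already counted the bytes.
        int n = super.read(b);
        if (n > 0) {
            bytesConsumed += n; // second increment for the same bytes
        }
        return n;
    }
}

class DoubleCountDemo {
    // Drains the content using read(byte[]) and returns the stream's counter.
    static long countViaReadArray(byte[] content) {
        try (DoubleCountingStream s =
                 new DoubleCountingStream(new ByteArrayInputStream(content))) {
            byte[] buf = new byte[256];
            while (s.read(buf) != -1) {
                // discard the data; only the counter matters here
            }
            return s.bytesConsumed;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // 1000 bytes are actually read, but the counter reports 2000.
        System.out.println("actual=1000 counted=" + countViaReadArray(new byte[1000]));
    }
}
```

If the completeness check compares `bytesConsumed` against the expected content length, as seems likely, an inflated counter would let a truncated stream pass: 990 bytes read from a 1000-byte claim would be counted as 1980.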

      Instead, when `read(byte[])` is called, the InputStream should delegate directly to `read(byte[], int, int)` rather than to `super.read(byte[])`, so that `bytesConsumed` is incremented only once per read.
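A rough sketch of the proposed fix, under stated assumptions: `CountingStream`, `TruncationDemo`, and the `expectedLength` parameter are hypothetical names, and `EOFException` stands in for NiFi's `ContentNotFoundException` so that the example needs only the JDK. The check at end of stream is modeled on the `ensureAllContentRead` method named above, not copied from it.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical sketch of the fix; not NiFi's actual implementation.
class CountingStream extends FilterInputStream {
    private final long expectedLength;
    private long bytesConsumed = 0;

    CountingStream(InputStream in, long expectedLength) {
        super(in);
        this.expectedLength = expectedLength;
    }

    @Override
    public int read() throws IOException {
        int value = in.read();
        if (value >= 0) {
            bytesConsumed++;
        } else {
            ensureAllContentRead();
        }
        return value;
    }

    @Override
    public int read(byte[] b) throws IOException {
        // Delegate directly so the bytes are counted exactly once.
        return read(b, 0, b.length);
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        int n = in.read(b, off, len);
        if (n > 0) {
            bytesConsumed += n;
        } else if (n == -1) {
            ensureAllContentRead();
        }
        return n;
    }

    // Fails if the underlying stream ended before the expected length was reached.
    private void ensureAllContentRead() throws EOFException {
        if (bytesConsumed < expectedLength) {
            throw new EOFException("Expected " + expectedLength
                + " bytes but stream ended after " + bytesConsumed);
        }
    }
}

class TruncationDemo {
    // Returns true if draining the stream reports truncated content.
    static boolean detectsTruncation(int actualLength, int expectedLength) {
        try (CountingStream s = new CountingStream(
                 new ByteArrayInputStream(new byte[actualLength]), expectedLength)) {
            byte[] buf = new byte[256];
            while (s.read(buf) != -1) {
                // discard the data
            }
            return false;
        } catch (EOFException e) {
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(detectsTruncation(990, 1000));  // truncated content
        System.out.println(detectsTruncation(1000, 1000)); // complete content
    }
}
```

With a single increment per read, content that is 10 bytes short of its expected length is reported as truncated, while complete content is read without error.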

            People

              Assignee: Dayakar M
              Reporter: Mark Payne (markap14)
              Votes: 0
              Watchers: 3

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 0.5h