Uploaded image for project: 'Apache NiFi'
  1. Apache NiFi
  2. NIFI-1701

StreamScanner in PutKafka doesn't handle byte values < -1 properly

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.7.0, 0.6.1
    • Component/s: None
    • Labels:
      None

      Issue Links

        Activity

        Hide
        githubbot ASF GitHub Bot added a comment -

        GitHub user olegz opened a pull request:

        https://github.com/apache/nifi/pull/314

        NIFI-1701 fixed StreamScanner, added more tests

        You can merge this pull request into a Git repository by running:

        $ git pull https://github.com/olegz/nifi NIFI-1701

        Alternatively you can review and apply these changes as the patch at:

        https://github.com/apache/nifi/pull/314.patch

        To close this pull request, make a commit to your master/trunk branch
        with (at least) the following in the commit message:

        This closes #314


        commit bab1881abe462a022c98e53d21ee6c2cce91f395
        Author: Oleg Zhurakousky <oleg@suitcase.io>
        Date: 2016-03-31T04:59:26Z

        NIFI-1701 fixed StreamScanner, added more tests


        Show
        githubbot ASF GitHub Bot added a comment - GitHub user olegz opened a pull request: https://github.com/apache/nifi/pull/314 NIFI-1701 fixed StreamScanner, added more tests You can merge this pull request into a Git repository by running: $ git pull https://github.com/olegz/nifi NIFI-1701 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/nifi/pull/314.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #314 commit bab1881abe462a022c98e53d21ee6c2cce91f395 Author: Oleg Zhurakousky <oleg@suitcase.io> Date: 2016-03-31T04:59:26Z NIFI-1701 fixed StreamScanner, added more tests
        Hide
        joewitt Joseph Witt added a comment -

        i think this warrants an 0.6.1 release. The impact of it is that certain bytes can cause the demarcation logic to stop too soon resulting in data loss in delivery to kafka.

        Show
        joewitt Joseph Witt added a comment - i think this warrants an 0.6.1 release. The impact of it is that certain bytes can cause the demarcation logic to stop too soon resulting in data loss in delivery to kafka.
        Hide
        joewitt Joseph Witt added a comment -

        left some comments on the PR itself.

        Show
        joewitt Joseph Witt added a comment - left some comments on the PR itself.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user joewitt commented on a diff in the pull request:

        https://github.com/apache/nifi/pull/314#discussion_r58008694

        — Diff: nifi-nar-bundles/nifi-kafka-bundle/nifi-kafka-processors/src/main/java/org/apache/nifi/processors/kafka/StreamScanner.java —
        @@ -55,29 +56,53 @@
        */
        boolean hasNext() {
        this.data = null;

        • if (!this.eos) {
          + int j = 0;
          + boolean moreData = true;
          + byte b;
          + while (this.data == null) {
          + this.expandBufferIfNecessary();
          try {
        • boolean keepReading = true;
        • while (keepReading) {
        • byte b = (byte) this.is.read();
        • if (b > -1) {
        • baos.write(b);
        • if (buffer.addAndCompare(b)) { - this.data = Arrays.copyOfRange(baos.getUnderlyingBuffer(), 0, baos.size() - delimiter.length); - keepReading = false; - }
        • } else {
        • this.data = baos.toByteArray();
        • keepReading = false;
        • this.eos = true;
          + b = (byte) this.is.read();
            • End diff –

        is.read() returns an int and -1 and represents end of stream. Once that has been verified can the byte value as represented in the range of 0 to 255. Will submit a patch containing a test that proves it is broken and will make the fix.

        Show
        githubbot ASF GitHub Bot added a comment - Github user joewitt commented on a diff in the pull request: https://github.com/apache/nifi/pull/314#discussion_r58008694 — Diff: nifi-nar-bundles/nifi-kafka-bundle/nifi-kafka-processors/src/main/java/org/apache/nifi/processors/kafka/StreamScanner.java — @@ -55,29 +56,53 @@ */ boolean hasNext() { this.data = null; if (!this.eos) { + int j = 0; + boolean moreData = true; + byte b; + while (this.data == null) { + this.expandBufferIfNecessary(); try { boolean keepReading = true; while (keepReading) { byte b = (byte) this.is.read(); if (b > -1) { baos.write(b); if (buffer.addAndCompare(b)) { - this.data = Arrays.copyOfRange(baos.getUnderlyingBuffer(), 0, baos.size() - delimiter.length); - keepReading = false; - } } else { this.data = baos.toByteArray(); keepReading = false; this.eos = true; + b = (byte) this.is.read(); End diff – is.read() returns an int and -1 and represents end of stream. Once that has been verified can the byte value as represented in the range of 0 to 255. Will submit a patch containing a test that proves it is broken and will make the fix.
        Hide
        joewitt Joseph Witt added a comment -

        Oleg Zhurakousky Have attached a patch which applies on top of your PR. It fixes the fact that when reading a byte from the input stream what is returned is an int and the value when -1 means end of stream whereas all other values mean there is a byte there and its value is represented between 0 and 255. The logic we had basically immediately cast to a byte then tested for -1. As a result if the byte stream itself contained a byte representing -1 we would improperly block/stop the stream. Created a unit test to show this as broken and then fixed.

        Show
        joewitt Joseph Witt added a comment - Oleg Zhurakousky Have attached a patch which applies on top of your PR. It fixes the fact that when reading a byte from the input stream what is returned is an int and the value when -1 means end of stream whereas all other values mean there is a byte there and its value is represented between 0 and 255. The logic we had basically immediately cast to a byte then tested for -1. As a result if the byte stream itself contained a byte representing -1 we would improperly block/stop the stream. Created a unit test to show this as broken and then fixed.
        Hide
        treardon Tim Reardon added a comment -

        This code bit my team yesterday. We moved to 0.6.0, and found that PutKafka was truncating messages. This morning we determined that messages containing multi-byte characters are being truncated at the first occurrence. We are not using a message delimiter. Needless to say, we have reverted to back to our 0.5.1 instance, which itself is using the 0.4.1 Kafka nar due to incompatibility with Kafka 0.8.x.

        Show
        treardon Tim Reardon added a comment - This code bit my team yesterday. We moved to 0.6.0, and found that PutKafka was truncating messages. This morning we determined that messages containing multi-byte characters are being truncated at the first occurrence. We are not using a message delimiter. Needless to say, we have reverted to back to our 0.5.1 instance, which itself is using the 0.4.1 Kafka nar due to incompatibility with Kafka 0.8.x.
        Hide
        joewitt Joseph Witt added a comment -

        Sorry about that Tim. We will initiate an 0.6.1 vote process asap. We have some very useful tests/conditions now and are running this live on highly complex flows where we originally found the problem. Far better now.

        Show
        joewitt Joseph Witt added a comment - Sorry about that Tim. We will initiate an 0.6.1 vote process asap. We have some very useful tests/conditions now and are running this live on highly complex flows where we originally found the problem. Far better now.
        Hide
        evega Edgardo Vega added a comment -

        I haven't seen a 0.6.1 vote. Is there something wrong with this patch? What is the hold up?

        Show
        evega Edgardo Vega added a comment - I haven't seen a 0.6.1 vote. Is there something wrong with this patch? What is the hold up?
        Hide
        joewitt Joseph Witt added a comment -

        Will have up hopefully today or tomorrow. Patch works very well.

        Show
        joewitt Joseph Witt added a comment - Will have up hopefully today or tomorrow. Patch works very well.
        Hide
        jira-bot ASF subversion and git services added a comment -

        Commit 25290cedc4f31c3bece3af34333b96d816dd9ec4 in nifi's branch refs/heads/master from Oleg Zhurakousky
        [ https://git-wip-us.apache.org/repos/asf?p=nifi.git;h=25290ce ]

        NIFI-1701 fixed StreamScanner, added more tests

        NIFI-1701 additional refactoring, clean up and more tests

        Show
        jira-bot ASF subversion and git services added a comment - Commit 25290cedc4f31c3bece3af34333b96d816dd9ec4 in nifi's branch refs/heads/master from Oleg Zhurakousky [ https://git-wip-us.apache.org/repos/asf?p=nifi.git;h=25290ce ] NIFI-1701 fixed StreamScanner, added more tests NIFI-1701 additional refactoring, clean up and more tests
        Hide
        jira-bot ASF subversion and git services added a comment -

        Commit 25290cedc4f31c3bece3af34333b96d816dd9ec4 in nifi's branch refs/heads/master from Oleg Zhurakousky
        [ https://git-wip-us.apache.org/repos/asf?p=nifi.git;h=25290ce ]

        NIFI-1701 fixed StreamScanner, added more tests

        NIFI-1701 additional refactoring, clean up and more tests

        Show
        jira-bot ASF subversion and git services added a comment - Commit 25290cedc4f31c3bece3af34333b96d816dd9ec4 in nifi's branch refs/heads/master from Oleg Zhurakousky [ https://git-wip-us.apache.org/repos/asf?p=nifi.git;h=25290ce ] NIFI-1701 fixed StreamScanner, added more tests NIFI-1701 additional refactoring, clean up and more tests
        Hide
        markap14 Mark Payne added a comment -

        Code review looks good. Have been running for a while on a cluster and have been seeing great results. Additionally, there are a lot of really good unit tests included here. Nicely done! +1. Merged to master.

        Show
        markap14 Mark Payne added a comment - Code review looks good. Have been running for a while on a cluster and have been seeing great results. Additionally, there are a lot of really good unit tests included here. Nicely done! +1. Merged to master.
        Hide
        jira-bot ASF subversion and git services added a comment -

        Commit d29d5a2a6d1fa19bdae90d7a3c467d9e0a98277c in nifi's branch refs/heads/support/nifi-0.6.x from Oleg Zhurakousky
        [ https://git-wip-us.apache.org/repos/asf?p=nifi.git;h=d29d5a2 ]

        NIFI-1701 fixed StreamScanner, added more tests

        NIFI-1701 additional refactoring, clean up and more tests

        Show
        jira-bot ASF subversion and git services added a comment - Commit d29d5a2a6d1fa19bdae90d7a3c467d9e0a98277c in nifi's branch refs/heads/support/nifi-0.6.x from Oleg Zhurakousky [ https://git-wip-us.apache.org/repos/asf?p=nifi.git;h=d29d5a2 ] NIFI-1701 fixed StreamScanner, added more tests NIFI-1701 additional refactoring, clean up and more tests
        Hide
        jira-bot ASF subversion and git services added a comment -

        Commit d29d5a2a6d1fa19bdae90d7a3c467d9e0a98277c in nifi's branch refs/heads/support/nifi-0.6.x from Oleg Zhurakousky
        [ https://git-wip-us.apache.org/repos/asf?p=nifi.git;h=d29d5a2 ]

        NIFI-1701 fixed StreamScanner, added more tests

        NIFI-1701 additional refactoring, clean up and more tests

        Show
        jira-bot ASF subversion and git services added a comment - Commit d29d5a2a6d1fa19bdae90d7a3c467d9e0a98277c in nifi's branch refs/heads/support/nifi-0.6.x from Oleg Zhurakousky [ https://git-wip-us.apache.org/repos/asf?p=nifi.git;h=d29d5a2 ] NIFI-1701 fixed StreamScanner, added more tests NIFI-1701 additional refactoring, clean up and more tests
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user olegz closed the pull request at:

        https://github.com/apache/nifi/pull/314

        Show
        githubbot ASF GitHub Bot added a comment - Github user olegz closed the pull request at: https://github.com/apache/nifi/pull/314
        Hide
        jira-bot ASF subversion and git services added a comment -

        Commit 15bab7dda858f6f20c73a7b729962bb7c1f130b9 in nifi's branch refs/heads/support/nifi-0.6.x from Oleg Zhurakousky
        [ https://git-wip-us.apache.org/repos/asf?p=nifi.git;h=15bab7d ]

        NIFI-1701 fixed merge conflicts

        Show
        jira-bot ASF subversion and git services added a comment - Commit 15bab7dda858f6f20c73a7b729962bb7c1f130b9 in nifi's branch refs/heads/support/nifi-0.6.x from Oleg Zhurakousky [ https://git-wip-us.apache.org/repos/asf?p=nifi.git;h=15bab7d ] NIFI-1701 fixed merge conflicts
        Hide
        jira-bot ASF subversion and git services added a comment -

        Commit b5e007213bfb8769842fce9f770a9890f2661d8c in nifi's branch refs/heads/support/nifi-0.6.x from Oleg Zhurakousky
        [ https://git-wip-us.apache.org/repos/asf?p=nifi.git;h=b5e0072 ]

        NIFI-1701 fixed StreamScanner, added more tests

        NIFI-1701 additional refactoring, clean up and more tests

        Show
        jira-bot ASF subversion and git services added a comment - Commit b5e007213bfb8769842fce9f770a9890f2661d8c in nifi's branch refs/heads/support/nifi-0.6.x from Oleg Zhurakousky [ https://git-wip-us.apache.org/repos/asf?p=nifi.git;h=b5e0072 ] NIFI-1701 fixed StreamScanner, added more tests NIFI-1701 additional refactoring, clean up and more tests
        Hide
        jira-bot ASF subversion and git services added a comment -

        Commit b5e007213bfb8769842fce9f770a9890f2661d8c in nifi's branch refs/heads/support/nifi-0.6.x from Oleg Zhurakousky
        [ https://git-wip-us.apache.org/repos/asf?p=nifi.git;h=b5e0072 ]

        NIFI-1701 fixed StreamScanner, added more tests

        NIFI-1701 additional refactoring, clean up and more tests

        Show
        jira-bot ASF subversion and git services added a comment - Commit b5e007213bfb8769842fce9f770a9890f2661d8c in nifi's branch refs/heads/support/nifi-0.6.x from Oleg Zhurakousky [ https://git-wip-us.apache.org/repos/asf?p=nifi.git;h=b5e0072 ] NIFI-1701 fixed StreamScanner, added more tests NIFI-1701 additional refactoring, clean up and more tests

          People

          • Assignee:
            ozhurakousky Oleg Zhurakousky
            Reporter:
            ozhurakousky Oleg Zhurakousky
          • Votes:
            0 Vote for this issue
            Watchers:
            7 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development