Flume
  1. Flume
  2. FLUME-1037

NETCAT handler theads terminate under stress test

    Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: v1.1.0
    • Fix Version/s: v1.2.0
    • Component/s: Sinks+Sources
    • Labels:
      None
    • Environment:

      [CentOS 6.2 64-bit]

      Description

      Steps:

      1. Use a props file such as the following:
      # a = agent
      # r = source
      # c = channel
      # k = sink
      a1.sources = r1
      a1.channels = c1
      a1.sinks = k1
      # ===SOURCES===
      a1.sources.r1.type = NETCAT
      a1.sources.r1.channels = c1
      a1.sources.r1.bind = localhost
      a1.sources.r1.port = 1473
      # ===CHANNELS===
      a1.channels.c1.type = MEMORY
      # ===SINKS===
      a1.sinks.k1.type = NULL
      a1.sinks.k1.channel = c1

      2. Set the FLUME_CONF_DIR to point to your conf dir
      [will@localhost flume-1.2.0-incubating-SNAPSHOT]$ export FLUME_CONF_DIR=/home/will/git/apache/flume/flume-1.2.0-incubating-SNAPSHOT/conf

      3. Create a flume-env.sh file
      [will@localhost flume-1.2.0-incubating-SNAPSHOT]$ cp conf/flume-env.sh.template conf/flume-env.sh

      4. Adjust the memory size within flume-env.sh (this file will be automatically sourced when calling bin/flume-ng, but only if you've specified the FLUME_CONF_DIR env var)
      (here, I went to the extreme and I set the min and max heap to 1GB. I also specified a YourKit profiler agent)
      Sample contents of flume-env.sh:
      export JAVA_OPTS="-Xms1024m -Xmx1024m -agentpath:/home/will/tools/yjp-10.0.6/bin/linux-x86-64/libyjpagent.so=tracing,noj2ee"

      5. Run the flume NG agent:
      bin/flume-ng node --conf conf --conf-file conf/a1.properties --name a1

      6. Open-up 10 terminal windows (on the same host) to connect to the netcat server port. Sent continuous output in each terminal. I chose to use the command:
      yes | nc localhost 1473
      The "yes" unix cmd will continuously output 'y' char, followed by newline char. If you use YourKit and go into the Threads view, you'll see that after a while (possibly need to wait up to 10 mins) after a netcat handler thread has continuously been alternating between Runnable and Blocked states (blocking due to org.apache.klog4j.Category.log(..), but that's beside the point), the netcat handler thread enters a continuous wait state for exactly 1 minute, and then terminates (while its associated 'yes | nc localhost 1473' command is still running).

      I haven't done further analysis. My first thought was a thread safety issue. Note that there are no property file reconfigurations done during this test – I leave the props file alone.

      I welcome your ideas/comments. I initially ran this test with the default -Xmx20m but it ran out of memory. For a future test I might lower the Xmx/Xms from 1GB to maybe 128MB.

        Activity

        Hide
        Hudson added a comment -

        Integrated in flume-trunk #148 (See https://builds.apache.org/job/flume-trunk/148/)
        FLUME-1037. Netcat handler threads terminate under stress test.

        (Mike Percy via Arvind Prabhakar) (Revision 1306692)

        Result = SUCCESS
        arvind : http://svn.apache.org/viewvc/?view=rev&rev=1306692
        Files :

        • /incubator/flume/trunk/flume-ng-core/src/main/java/org/apache/flume/event/EventHelper.java
        • /incubator/flume/trunk/flume-ng-core/src/main/java/org/apache/flume/source/NetcatSource.java
        • /incubator/flume/trunk/flume-ng-core/src/main/java/org/apache/flume/source/NetcatSourceConfigurationConstants.java
        • /incubator/flume/trunk/flume-ng-node/src/test/java/org/apache/flume/source/TestNetcatSource.java
        Show
        Hudson added a comment - Integrated in flume-trunk #148 (See https://builds.apache.org/job/flume-trunk/148/ ) FLUME-1037 . Netcat handler threads terminate under stress test. (Mike Percy via Arvind Prabhakar) (Revision 1306692) Result = SUCCESS arvind : http://svn.apache.org/viewvc/?view=rev&rev=1306692 Files : /incubator/flume/trunk/flume-ng-core/src/main/java/org/apache/flume/event/EventHelper.java /incubator/flume/trunk/flume-ng-core/src/main/java/org/apache/flume/source/NetcatSource.java /incubator/flume/trunk/flume-ng-core/src/main/java/org/apache/flume/source/NetcatSourceConfigurationConstants.java /incubator/flume/trunk/flume-ng-node/src/test/java/org/apache/flume/source/TestNetcatSource.java
        Hide
        Arvind Prabhakar added a comment -

        Patch committed. Thanks Mike!

        Show
        Arvind Prabhakar added a comment - Patch committed. Thanks Mike!
        Hide
        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/4497/#review6507
        -----------------------------------------------------------

        Ship it!

        +1

        • Arvind

        On 2012-03-29 01:25:26, Mike Percy wrote:

        -----------------------------------------------------------

        This is an automatically generated e-mail. To reply, visit:

        https://reviews.apache.org/r/4497/

        -----------------------------------------------------------

        (Updated 2012-03-29 01:25:26)

        Review request for Flume.

        Summary

        -------

        The NetcatSource does not create one event per newline. Instead, it creates one event per TCP session.

        I have addressed multiple issues with NetcatSource in this patch:

        1. Properly process a new event per newline.

        2. Enforce a maximum length per line, to ensure clients cannot run the server out of memory. (This is now configurable; the default is 512 characters.)

        3. Properly flush responses to the client.

        4. Increment counters for successful processing and failure.

        5. Use shutdownNow() when shutting down the server, otherwise an open client connection will cause the server to hang on shutdown.

        6. Made the inner classes of NetcatSource private; I believe making them public was unintentional.

        7. Corrected unit test so it now sends a newline with each request. Also now checks for "OK" response from server.

        8. Attempting to sneak in a minor fix for the EventHelper.dumpEvent() method into this patch, since I'm testing with LoggerSink and it's bugging me when it throws an exception on a zero-length body.

        This addresses bug FLUME-1037.

        https://issues.apache.org/jira/browse/FLUME-1037

        Diffs

        -----

        flume-ng-core/src/main/java/org/apache/flume/source/NetcatSource.java a841b0e

        flume-ng-core/src/main/java/org/apache/flume/event/EventHelper.java a326a70

        flume-ng-core/src/main/java/org/apache/flume/source/NetcatSourceConfigurationConstants.java PRE-CREATION

        flume-ng-node/src/test/java/org/apache/flume/source/TestNetcatSource.java fb2a960

        Diff: https://reviews.apache.org/r/4497/diff

        Testing

        -------

        Unit tests pass. Did a bunch of manual testing using the nc tool with a Netcat source and a Logger sink.

        Thanks,

        Mike

        Show
        jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4497/#review6507 ----------------------------------------------------------- Ship it! +1 Arvind On 2012-03-29 01:25:26, Mike Percy wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4497/ ----------------------------------------------------------- (Updated 2012-03-29 01:25:26) Review request for Flume. Summary ------- The NetcatSource does not create one event per newline. Instead, it creates one event per TCP session. I have addressed multiple issues with NetcatSource in this patch: 1. Properly process a new event per newline. 2. Enforce a maximum length per line, to ensure clients cannot run the server out of memory. (This is now configurable; the default is 512 characters.) 3. Properly flush responses to the client. 4. Increment counters for successful processing and failure. 5. Use shutdownNow() when shutting down the server, otherwise an open client connection will cause the server to hang on shutdown. 6. Made the inner classes of NetcatSource private; I believe making them public was unintentional. 7. Corrected unit test so it now sends a newline with each request. Also now checks for "OK" response from server. 8. Attempting to sneak in a minor fix for the EventHelper.dumpEvent() method into this patch, since I'm testing with LoggerSink and it's bugging me when it throws an exception on a zero-length body. This addresses bug FLUME-1037 . https://issues.apache.org/jira/browse/FLUME-1037 Diffs ----- flume-ng-core/src/main/java/org/apache/flume/source/NetcatSource.java a841b0e flume-ng-core/src/main/java/org/apache/flume/event/EventHelper.java a326a70 flume-ng-core/src/main/java/org/apache/flume/source/NetcatSourceConfigurationConstants.java PRE-CREATION flume-ng-node/src/test/java/org/apache/flume/source/TestNetcatSource.java fb2a960 Diff: https://reviews.apache.org/r/4497/diff Testing ------- Unit tests pass. Did a bunch of manual testing using the nc tool with a Netcat source and a Logger sink. Thanks, Mike
        Hide
        jiraposter@reviews.apache.org added a comment -

        On 2012-03-28 23:58:37, Hari Shreedharan wrote:

        > flume-ng-core/src/main/java/org/apache/flume/source/NetcatSource.java, line 438

        > <https://reviews.apache.org/r/4497/diff/2/?file=96931#file96931line438>

        >

        > Don't we run the risk of buffer overflow here? I think reader would try to dump data using a put operation. If the buffer is full, there could be a BufferOverflowException here, right?

        Mike Percy wrote:

        Nope... see http://docs.oracle.com/javase/6/docs/api/java/io/Reader.html#read%28java.nio.CharBuffer%29

        Hari Shreedharan wrote:

        It does say it is a put operation.A put operation can cause an overflow exception, maybe the reader somehow handles it. Otherwise looks good!

        Thanks for checking, Hari. Basically what it does it put as many characters as it can into the buffer, given what's available to take and the space available to put it in. It returns the number of characters actually buffered, which may be 0 if the buffer is full (position == limit).

        • Mike

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/4497/#review6498
        -----------------------------------------------------------

        On 2012-03-29 01:25:26, Mike Percy wrote:

        -----------------------------------------------------------

        This is an automatically generated e-mail. To reply, visit:

        https://reviews.apache.org/r/4497/

        -----------------------------------------------------------

        (Updated 2012-03-29 01:25:26)

        Review request for Flume.

        Summary

        -------

        The NetcatSource does not create one event per newline. Instead, it creates one event per TCP session.

        I have addressed multiple issues with NetcatSource in this patch:

        1. Properly process a new event per newline.

        2. Enforce a maximum length per line, to ensure clients cannot run the server out of memory. (This is now configurable; the default is 512 characters.)

        3. Properly flush responses to the client.

        4. Increment counters for successful processing and failure.

        5. Use shutdownNow() when shutting down the server, otherwise an open client connection will cause the server to hang on shutdown.

        6. Made the inner classes of NetcatSource private; I believe making them public was unintentional.

        7. Corrected unit test so it now sends a newline with each request. Also now checks for "OK" response from server.

        8. Attempting to sneak in a minor fix for the EventHelper.dumpEvent() method into this patch, since I'm testing with LoggerSink and it's bugging me when it throws an exception on a zero-length body.

        This addresses bug FLUME-1037.

        https://issues.apache.org/jira/browse/FLUME-1037

        Diffs

        -----

        flume-ng-core/src/main/java/org/apache/flume/source/NetcatSource.java a841b0e

        flume-ng-core/src/main/java/org/apache/flume/event/EventHelper.java a326a70

        flume-ng-core/src/main/java/org/apache/flume/source/NetcatSourceConfigurationConstants.java PRE-CREATION

        flume-ng-node/src/test/java/org/apache/flume/source/TestNetcatSource.java fb2a960

        Diff: https://reviews.apache.org/r/4497/diff

        Testing

        -------

        Unit tests pass. Did a bunch of manual testing using the nc tool with a Netcat source and a Logger sink.

        Thanks,

        Mike

        Show
        jiraposter@reviews.apache.org added a comment - On 2012-03-28 23:58:37, Hari Shreedharan wrote: > flume-ng-core/src/main/java/org/apache/flume/source/NetcatSource.java, line 438 > < https://reviews.apache.org/r/4497/diff/2/?file=96931#file96931line438 > > > Don't we run the risk of buffer overflow here? I think reader would try to dump data using a put operation. If the buffer is full, there could be a BufferOverflowException here, right? Mike Percy wrote: Nope... see http://docs.oracle.com/javase/6/docs/api/java/io/Reader.html#read%28java.nio.CharBuffer%29 Hari Shreedharan wrote: It does say it is a put operation.A put operation can cause an overflow exception, maybe the reader somehow handles it. Otherwise looks good! Thanks for checking, Hari. Basically what it does it put as many characters as it can into the buffer, given what's available to take and the space available to put it in. It returns the number of characters actually buffered, which may be 0 if the buffer is full (position == limit). Mike ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4497/#review6498 ----------------------------------------------------------- On 2012-03-29 01:25:26, Mike Percy wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4497/ ----------------------------------------------------------- (Updated 2012-03-29 01:25:26) Review request for Flume. Summary ------- The NetcatSource does not create one event per newline. Instead, it creates one event per TCP session. I have addressed multiple issues with NetcatSource in this patch: 1. Properly process a new event per newline. 2. Enforce a maximum length per line, to ensure clients cannot run the server out of memory. (This is now configurable; the default is 512 characters.) 3. Properly flush responses to the client. 4. Increment counters for successful processing and failure. 5. Use shutdownNow() when shutting down the server, otherwise an open client connection will cause the server to hang on shutdown. 6. Made the inner classes of NetcatSource private; I believe making them public was unintentional. 7. Corrected unit test so it now sends a newline with each request. Also now checks for "OK" response from server. 8. Attempting to sneak in a minor fix for the EventHelper.dumpEvent() method into this patch, since I'm testing with LoggerSink and it's bugging me when it throws an exception on a zero-length body. This addresses bug FLUME-1037 . https://issues.apache.org/jira/browse/FLUME-1037 Diffs ----- flume-ng-core/src/main/java/org/apache/flume/source/NetcatSource.java a841b0e flume-ng-core/src/main/java/org/apache/flume/event/EventHelper.java a326a70 flume-ng-core/src/main/java/org/apache/flume/source/NetcatSourceConfigurationConstants.java PRE-CREATION flume-ng-node/src/test/java/org/apache/flume/source/TestNetcatSource.java fb2a960 Diff: https://reviews.apache.org/r/4497/diff Testing ------- Unit tests pass. Did a bunch of manual testing using the nc tool with a Netcat source and a Logger sink. Thanks, Mike
        Hide
        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/4497/
        -----------------------------------------------------------

        (Updated 2012-03-29 01:25:26.818168)

        Review request for Flume.

        Changes
        -------

        Separated the config constants into their own class per Arvind.

        Summary
        -------

        The NetcatSource does not create one event per newline. Instead, it creates one event per TCP session.

        I have addressed multiple issues with NetcatSource in this patch:

        1. Properly process a new event per newline.
        2. Enforce a maximum length per line, to ensure clients cannot run the server out of memory. (This is now configurable; the default is 512 characters.)
        3. Properly flush responses to the client.
        4. Increment counters for successful processing and failure.
        5. Use shutdownNow() when shutting down the server, otherwise an open client connection will cause the server to hang on shutdown.
        6. Made the inner classes of NetcatSource private; I believe making them public was unintentional.
        7. Corrected unit test so it now sends a newline with each request. Also now checks for "OK" response from server.
        8. Attempting to sneak in a minor fix for the EventHelper.dumpEvent() method into this patch, since I'm testing with LoggerSink and it's bugging me when it throws an exception on a zero-length body.

        This addresses bug FLUME-1037.
        https://issues.apache.org/jira/browse/FLUME-1037

        Diffs (updated)


        flume-ng-core/src/main/java/org/apache/flume/source/NetcatSource.java a841b0e
        flume-ng-core/src/main/java/org/apache/flume/event/EventHelper.java a326a70
        flume-ng-core/src/main/java/org/apache/flume/source/NetcatSourceConfigurationConstants.java PRE-CREATION
        flume-ng-node/src/test/java/org/apache/flume/source/TestNetcatSource.java fb2a960

        Diff: https://reviews.apache.org/r/4497/diff

        Testing
        -------

        Unit tests pass. Did a bunch of manual testing using the nc tool with a Netcat source and a Logger sink.

        Thanks,

        Mike

        Show
        jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4497/ ----------------------------------------------------------- (Updated 2012-03-29 01:25:26.818168) Review request for Flume. Changes ------- Separated the config constants into their own class per Arvind. Summary ------- The NetcatSource does not create one event per newline. Instead, it creates one event per TCP session. I have addressed multiple issues with NetcatSource in this patch: 1. Properly process a new event per newline. 2. Enforce a maximum length per line, to ensure clients cannot run the server out of memory. (This is now configurable; the default is 512 characters.) 3. Properly flush responses to the client. 4. Increment counters for successful processing and failure. 5. Use shutdownNow() when shutting down the server, otherwise an open client connection will cause the server to hang on shutdown. 6. Made the inner classes of NetcatSource private; I believe making them public was unintentional. 7. Corrected unit test so it now sends a newline with each request. Also now checks for "OK" response from server. 8. Attempting to sneak in a minor fix for the EventHelper.dumpEvent() method into this patch, since I'm testing with LoggerSink and it's bugging me when it throws an exception on a zero-length body. This addresses bug FLUME-1037 . https://issues.apache.org/jira/browse/FLUME-1037 Diffs (updated) flume-ng-core/src/main/java/org/apache/flume/source/NetcatSource.java a841b0e flume-ng-core/src/main/java/org/apache/flume/event/EventHelper.java a326a70 flume-ng-core/src/main/java/org/apache/flume/source/NetcatSourceConfigurationConstants.java PRE-CREATION flume-ng-node/src/test/java/org/apache/flume/source/TestNetcatSource.java fb2a960 Diff: https://reviews.apache.org/r/4497/diff Testing ------- Unit tests pass. Did a bunch of manual testing using the nc tool with a Netcat source and a Logger sink. Thanks, Mike
        Hide
        jiraposter@reviews.apache.org added a comment -

        On 2012-03-28 23:58:37, Hari Shreedharan wrote:

        > flume-ng-core/src/main/java/org/apache/flume/source/NetcatSource.java, line 438

        > <https://reviews.apache.org/r/4497/diff/2/?file=96931#file96931line438>

        >

        > Don't we run the risk of buffer overflow here? I think reader would try to dump data using a put operation. If the buffer is full, there could be a BufferOverflowException here, right?

        Mike Percy wrote:

        Nope... see http://docs.oracle.com/javase/6/docs/api/java/io/Reader.html#read%28java.nio.CharBuffer%29

        It does say it is a put operation.A put operation can cause an overflow exception, maybe the reader somehow handles it. Otherwise looks good!

        • Hari

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/4497/#review6498
        -----------------------------------------------------------

        On 2012-03-28 23:16:36, Mike Percy wrote:

        -----------------------------------------------------------

        This is an automatically generated e-mail. To reply, visit:

        https://reviews.apache.org/r/4497/

        -----------------------------------------------------------

        (Updated 2012-03-28 23:16:36)

        Review request for Flume.

        Summary

        -------

        The NetcatSource does not create one event per newline. Instead, it creates one event per TCP session.

        I have addressed multiple issues with NetcatSource in this patch:

        1. Properly process a new event per newline.

        2. Enforce a maximum length per line, to ensure clients cannot run the server out of memory. (This is now configurable; the default is 512 characters.)

        3. Properly flush responses to the client.

        4. Increment counters for successful processing and failure.

        5. Use shutdownNow() when shutting down the server, otherwise an open client connection will cause the server to hang on shutdown.

        6. Made the inner classes of NetcatSource private; I believe making them public was unintentional.

        7. Corrected unit test so it now sends a newline with each request. Also now checks for "OK" response from server.

        8. Attempting to sneak in a minor fix for the EventHelper.dumpEvent() method into this patch, since I'm testing with LoggerSink and it's bugging me when it throws an exception on a zero-length body.

        This addresses bug FLUME-1037.

        https://issues.apache.org/jira/browse/FLUME-1037

        Diffs

        -----

        flume-ng-core/src/main/java/org/apache/flume/event/EventHelper.java a326a70

        flume-ng-core/src/main/java/org/apache/flume/source/NetcatSource.java a841b0e

        flume-ng-node/src/test/java/org/apache/flume/source/TestNetcatSource.java fb2a960

        Diff: https://reviews.apache.org/r/4497/diff

        Testing

        -------

        Unit tests pass. Did a bunch of manual testing using the nc tool with a Netcat source and a Logger sink.

        Thanks,

        Mike

        Show
        jiraposter@reviews.apache.org added a comment - On 2012-03-28 23:58:37, Hari Shreedharan wrote: > flume-ng-core/src/main/java/org/apache/flume/source/NetcatSource.java, line 438 > < https://reviews.apache.org/r/4497/diff/2/?file=96931#file96931line438 > > > Don't we run the risk of buffer overflow here? I think reader would try to dump data using a put operation. If the buffer is full, there could be a BufferOverflowException here, right? Mike Percy wrote: Nope... see http://docs.oracle.com/javase/6/docs/api/java/io/Reader.html#read%28java.nio.CharBuffer%29 It does say it is a put operation.A put operation can cause an overflow exception, maybe the reader somehow handles it. Otherwise looks good! Hari ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4497/#review6498 ----------------------------------------------------------- On 2012-03-28 23:16:36, Mike Percy wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4497/ ----------------------------------------------------------- (Updated 2012-03-28 23:16:36) Review request for Flume. Summary ------- The NetcatSource does not create one event per newline. Instead, it creates one event per TCP session. I have addressed multiple issues with NetcatSource in this patch: 1. Properly process a new event per newline. 2. Enforce a maximum length per line, to ensure clients cannot run the server out of memory. (This is now configurable; the default is 512 characters.) 3. Properly flush responses to the client. 4. Increment counters for successful processing and failure. 5. Use shutdownNow() when shutting down the server, otherwise an open client connection will cause the server to hang on shutdown. 6. Made the inner classes of NetcatSource private; I believe making them public was unintentional. 7. Corrected unit test so it now sends a newline with each request. Also now checks for "OK" response from server. 8. Attempting to sneak in a minor fix for the EventHelper.dumpEvent() method into this patch, since I'm testing with LoggerSink and it's bugging me when it throws an exception on a zero-length body. This addresses bug FLUME-1037 . https://issues.apache.org/jira/browse/FLUME-1037 Diffs ----- flume-ng-core/src/main/java/org/apache/flume/event/EventHelper.java a326a70 flume-ng-core/src/main/java/org/apache/flume/source/NetcatSource.java a841b0e flume-ng-node/src/test/java/org/apache/flume/source/TestNetcatSource.java fb2a960 Diff: https://reviews.apache.org/r/4497/diff Testing ------- Unit tests pass. Did a bunch of manual testing using the nc tool with a Netcat source and a Logger sink. Thanks, Mike
        Hide
        jiraposter@reviews.apache.org added a comment -

        On 2012-03-28 23:58:37, Hari Shreedharan wrote:

        > flume-ng-core/src/main/java/org/apache/flume/source/NetcatSource.java, line 438

        > <https://reviews.apache.org/r/4497/diff/2/?file=96931#file96931line438>

        >

        > Don't we run the risk of buffer overflow here? I think reader would try to dump data using a put operation. If the buffer is full, there could be a BufferOverflowException here, right?

        Nope... see http://docs.oracle.com/javase/6/docs/api/java/io/Reader.html#read%28java.nio.CharBuffer%29

        • Mike

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/4497/#review6498
        -----------------------------------------------------------

        On 2012-03-28 23:16:36, Mike Percy wrote:

        -----------------------------------------------------------

        This is an automatically generated e-mail. To reply, visit:

        https://reviews.apache.org/r/4497/

        -----------------------------------------------------------

        (Updated 2012-03-28 23:16:36)

        Review request for Flume.

        Summary

        -------

        The NetcatSource does not create one event per newline. Instead, it creates one event per TCP session.

        I have addressed multiple issues with NetcatSource in this patch:

        1. Properly process a new event per newline.

        2. Enforce a maximum length per line, to ensure clients cannot run the server out of memory. (This is now configurable; the default is 512 characters.)

        3. Properly flush responses to the client.

        4. Increment counters for successful processing and failure.

        5. Use shutdownNow() when shutting down the server, otherwise an open client connection will cause the server to hang on shutdown.

        6. Made the inner classes of NetcatSource private; I believe making them public was unintentional.

        7. Corrected unit test so it now sends a newline with each request. Also now checks for "OK" response from server.

        8. Attempting to sneak in a minor fix for the EventHelper.dumpEvent() method into this patch, since I'm testing with LoggerSink and it's bugging me when it throws an exception on a zero-length body.

        This addresses bug FLUME-1037.

        https://issues.apache.org/jira/browse/FLUME-1037

        Diffs

        -----

        flume-ng-core/src/main/java/org/apache/flume/event/EventHelper.java a326a70

        flume-ng-core/src/main/java/org/apache/flume/source/NetcatSource.java a841b0e

        flume-ng-node/src/test/java/org/apache/flume/source/TestNetcatSource.java fb2a960

        Diff: https://reviews.apache.org/r/4497/diff

        Testing

        -------

        Unit tests pass. Did a bunch of manual testing using the nc tool with a Netcat source and a Logger sink.

        Thanks,

        Mike

        Show
        jiraposter@reviews.apache.org added a comment - On 2012-03-28 23:58:37, Hari Shreedharan wrote: > flume-ng-core/src/main/java/org/apache/flume/source/NetcatSource.java, line 438 > < https://reviews.apache.org/r/4497/diff/2/?file=96931#file96931line438 > > > Don't we run the risk of buffer overflow here? I think reader would try to dump data using a put operation. If the buffer is full, there could be a BufferOverflowException here, right? Nope... see http://docs.oracle.com/javase/6/docs/api/java/io/Reader.html#read%28java.nio.CharBuffer%29 Mike ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4497/#review6498 ----------------------------------------------------------- On 2012-03-28 23:16:36, Mike Percy wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4497/ ----------------------------------------------------------- (Updated 2012-03-28 23:16:36) Review request for Flume. Summary ------- The NetcatSource does not create one event per newline. Instead, it creates one event per TCP session. I have addressed multiple issues with NetcatSource in this patch: 1. Properly process a new event per newline. 2. Enforce a maximum length per line, to ensure clients cannot run the server out of memory. (This is now configurable; the default is 512 characters.) 3. Properly flush responses to the client. 4. Increment counters for successful processing and failure. 5. Use shutdownNow() when shutting down the server, otherwise an open client connection will cause the server to hang on shutdown. 6. Made the inner classes of NetcatSource private; I believe making them public was unintentional. 7. Corrected unit test so it now sends a newline with each request. Also now checks for "OK" response from server. 8. Attempting to sneak in a minor fix for the EventHelper.dumpEvent() method into this patch, since I'm testing with LoggerSink and it's bugging me when it throws an exception on a zero-length body. This addresses bug FLUME-1037 . https://issues.apache.org/jira/browse/FLUME-1037 Diffs ----- flume-ng-core/src/main/java/org/apache/flume/event/EventHelper.java a326a70 flume-ng-core/src/main/java/org/apache/flume/source/NetcatSource.java a841b0e flume-ng-node/src/test/java/org/apache/flume/source/TestNetcatSource.java fb2a960 Diff: https://reviews.apache.org/r/4497/diff Testing ------- Unit tests pass. Did a bunch of manual testing using the nc tool with a Netcat source and a Logger sink. Thanks, Mike
        Hide
        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/4497/#review6498
        -----------------------------------------------------------

        flume-ng-core/src/main/java/org/apache/flume/source/NetcatSource.java
        <https://reviews.apache.org/r/4497/#comment14150>

        Don't we run the risk of buffer overflow here? I think reader would try to dump data using a put operation. If the buffer is full, there could be a BufferOverflowException here, right?

        • Hari

        On 2012-03-28 23:16:36, Mike Percy wrote:

        -----------------------------------------------------------

        This is an automatically generated e-mail. To reply, visit:

        https://reviews.apache.org/r/4497/

        -----------------------------------------------------------

        (Updated 2012-03-28 23:16:36)

        Review request for Flume.

        Summary

        -------

        The NetcatSource does not create one event per newline. Instead, it creates one event per TCP session.

        I have addressed multiple issues with NetcatSource in this patch:

        1. Properly process a new event per newline.

        2. Enforce a maximum length per line, to ensure clients cannot run the server out of memory. (This is now configurable; the default is 512 characters.)

        3. Properly flush responses to the client.

        4. Increment counters for successful processing and failure.

        5. Use shutdownNow() when shutting down the server, otherwise an open client connection will cause the server to hang on shutdown.

        6. Made the inner classes of NetcatSource private; I believe making them public was unintentional.

        7. Corrected unit test so it now sends a newline with each request. Also now checks for "OK" response from server.

        8. Attempting to sneak in a minor fix for the EventHelper.dumpEvent() method into this patch, since I'm testing with LoggerSink and it's bugging me when it throws an exception on a zero-length body.

        This addresses bug FLUME-1037.

        https://issues.apache.org/jira/browse/FLUME-1037

        Diffs

        -----

        flume-ng-core/src/main/java/org/apache/flume/event/EventHelper.java a326a70

        flume-ng-core/src/main/java/org/apache/flume/source/NetcatSource.java a841b0e

        flume-ng-node/src/test/java/org/apache/flume/source/TestNetcatSource.java fb2a960

        Diff: https://reviews.apache.org/r/4497/diff

        Testing

        -------

        Unit tests pass. Did a bunch of manual testing using the nc tool with a Netcat source and a Logger sink.

        Thanks,

        Mike

        Show
        jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4497/#review6498 ----------------------------------------------------------- flume-ng-core/src/main/java/org/apache/flume/source/NetcatSource.java < https://reviews.apache.org/r/4497/#comment14150 > Don't we run the risk of buffer overflow here? I think reader would try to dump data using a put operation. If the buffer is full, there could be a BufferOverflowException here, right? Hari On 2012-03-28 23:16:36, Mike Percy wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4497/ ----------------------------------------------------------- (Updated 2012-03-28 23:16:36) Review request for Flume. Summary ------- The NetcatSource does not create one event per newline. Instead, it creates one event per TCP session. I have addressed multiple issues with NetcatSource in this patch: 1. Properly process a new event per newline. 2. Enforce a maximum length per line, to ensure clients cannot run the server out of memory. (This is now configurable; the default is 512 characters.) 3. Properly flush responses to the client. 4. Increment counters for successful processing and failure. 5. Use shutdownNow() when shutting down the server, otherwise an open client connection will cause the server to hang on shutdown. 6. Made the inner classes of NetcatSource private; I believe making them public was unintentional. 7. Corrected unit test so it now sends a newline with each request. Also now checks for "OK" response from server. 8. Attempting to sneak in a minor fix for the EventHelper.dumpEvent() method into this patch, since I'm testing with LoggerSink and it's bugging me when it throws an exception on a zero-length body. This addresses bug FLUME-1037 . https://issues.apache.org/jira/browse/FLUME-1037 Diffs ----- flume-ng-core/src/main/java/org/apache/flume/event/EventHelper.java a326a70 flume-ng-core/src/main/java/org/apache/flume/source/NetcatSource.java a841b0e flume-ng-node/src/test/java/org/apache/flume/source/TestNetcatSource.java fb2a960 Diff: https://reviews.apache.org/r/4497/diff Testing ------- Unit tests pass. Did a bunch of manual testing using the nc tool with a Netcat source and a Logger sink. Thanks, Mike
        Hide
        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/4497/#review6497
        -----------------------------------------------------------

        Ship it!

        +1 Changes look good Mike.

        One request - please consider using constant fields to represent the configuration keys (bind, port, max-line-length etc), preferably declared in a separate class called NetcatSourceConfigurationConstants or similar.

        Please attach the patch the Jira.

        • Arvind

        On 2012-03-28 23:16:36, Mike Percy wrote:

        -----------------------------------------------------------

        This is an automatically generated e-mail. To reply, visit:

        https://reviews.apache.org/r/4497/

        -----------------------------------------------------------

        (Updated 2012-03-28 23:16:36)

        Review request for Flume.

        Summary

        -------

        The NetcatSource does not create one event per newline. Instead, it creates one event per TCP session.

        I have addressed multiple issues with NetcatSource in this patch:

        1. Properly process a new event per newline.

        2. Enforce a maximum length per line, to ensure clients cannot run the server out of memory. (This is now configurable; the default is 512 characters.)

        3. Properly flush responses to the client.

        4. Increment counters for successful processing and failure.

        5. Use shutdownNow() when shutting down the server, otherwise an open client connection will cause the server to hang on shutdown.

        6. Made the inner classes of NetcatSource private; I believe making them public was unintentional.

        7. Corrected unit test so it now sends a newline with each request. Also now checks for "OK" response from server.

        8. Attempting to sneak in a minor fix for the EventHelper.dumpEvent() method into this patch, since I'm testing with LoggerSink and it's bugging me when it throws an exception on a zero-length body.

        This addresses bug FLUME-1037.

        https://issues.apache.org/jira/browse/FLUME-1037

        Diffs

        -----

        flume-ng-core/src/main/java/org/apache/flume/event/EventHelper.java a326a70

        flume-ng-core/src/main/java/org/apache/flume/source/NetcatSource.java a841b0e

        flume-ng-node/src/test/java/org/apache/flume/source/TestNetcatSource.java fb2a960

        Diff: https://reviews.apache.org/r/4497/diff

        Testing

        -------

        Unit tests pass. Did a bunch of manual testing using the nc tool with a Netcat source and a Logger sink.

        Thanks,

        Mike

        Show
        jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4497/#review6497 ----------------------------------------------------------- Ship it! +1 Changes look good Mike. One request - please consider using constant fields to represent the configuration keys (bind, port, max-line-length etc), preferably declared in a separate class called NetcatSourceConfigurationConstants or similar. Please attach the patch the Jira. Arvind On 2012-03-28 23:16:36, Mike Percy wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4497/ ----------------------------------------------------------- (Updated 2012-03-28 23:16:36) Review request for Flume. Summary ------- The NetcatSource does not create one event per newline. Instead, it creates one event per TCP session. I have addressed multiple issues with NetcatSource in this patch: 1. Properly process a new event per newline. 2. Enforce a maximum length per line, to ensure clients cannot run the server out of memory. (This is now configurable; the default is 512 characters.) 3. Properly flush responses to the client. 4. Increment counters for successful processing and failure. 5. Use shutdownNow() when shutting down the server, otherwise an open client connection will cause the server to hang on shutdown. 6. Made the inner classes of NetcatSource private; I believe making them public was unintentional. 7. Corrected unit test so it now sends a newline with each request. Also now checks for "OK" response from server. 8. Attempting to sneak in a minor fix for the EventHelper.dumpEvent() method into this patch, since I'm testing with LoggerSink and it's bugging me when it throws an exception on a zero-length body. This addresses bug FLUME-1037 . https://issues.apache.org/jira/browse/FLUME-1037 Diffs ----- flume-ng-core/src/main/java/org/apache/flume/event/EventHelper.java a326a70 flume-ng-core/src/main/java/org/apache/flume/source/NetcatSource.java a841b0e flume-ng-node/src/test/java/org/apache/flume/source/TestNetcatSource.java fb2a960 Diff: https://reviews.apache.org/r/4497/diff Testing ------- Unit tests pass. Did a bunch of manual testing using the nc tool with a Netcat source and a Logger sink. Thanks, Mike
        Hide
        jiraposter@reviews.apache.org added a comment -

        On 2012-03-28 21:43:06, Hari Shreedharan wrote:

        > flume-ng-core/src/main/java/org/apache/flume/event/EventHelper.java, line 47

        > <https://reviews.apache.org/r/4497/diff/2/?file=96930#file96930line47>

        >

        > Maybe log this?

        Logging this isn't needed... this is not an exceptional case (just an empty body). This is just working around a bug in HexDump where it improperly handles a zero-length array.

        On 2012-03-28 21:43:06, Hari Shreedharan wrote:

        > flume-ng-core/src/main/java/org/apache/flume/source/NetcatSource.java, line 130

        > <https://reviews.apache.org/r/4497/diff/2/?file=96931#file96931line130>

        >

        > If there is no "port", context.getInteger() will return null here - causing autoboxing of a null value - ending in NullPointerException. Either port has to be an Integer or you must check for the null value before assignment.

        This comes after the line:
        Configurables.ensureRequiredNonNull(context, "bind", "port");

        So by the time it reaches this statement, "port" is guaranteed not to be null.

        On 2012-03-28 21:43:06, Hari Shreedharan wrote:

        > flume-ng-core/src/main/java/org/apache/flume/source/NetcatSource.java, line 315

        > <https://reviews.apache.org/r/4497/diff/2/?file=96931#file96931line315>

        >

        > Why would charsRead be 0, if the client sent a line longer than maxLineLength? Shouldn't it be > maxLineLength?

        No, the condition here is that we didn't read any additional data, the buffer is full, and we searched and didn't find any newlines. Therefore we conclude that the client sent a longer line than can fit into our buffer. I'll add this explanation in a comment and update the patch.

        On 2012-03-28 21:43:06, Hari Shreedharan wrote:

        > flume-ng-core/src/main/java/org/apache/flume/source/NetcatSource.java, line 328

        > <https://reviews.apache.org/r/4497/diff/2/?file=96931#file96931line328>

        >

        > nit: Shouldn't this go at the beginning of the loop? Otherwise doesn't the first iteration of the loop just do nothing? Since the buffer was just created, it is obviously empty, so the first run of the loop seems like a no-op.

        You are right; this ordering is a remnant from a work-in-progress incarnation of the implementation. I will reorder these steps to be more intuitive.

        • Mike

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/4497/#review6490
        -----------------------------------------------------------

        On 2012-03-27 09:19:51, Mike Percy wrote:

        -----------------------------------------------------------

        This is an automatically generated e-mail. To reply, visit:

        https://reviews.apache.org/r/4497/

        -----------------------------------------------------------

        (Updated 2012-03-27 09:19:51)

        Review request for Flume.

        Summary

        -------

        The NetcatSource does not create one event per newline. Instead, it creates one event per TCP session.

        I have addressed multiple issues with NetcatSource in this patch:

        1. Properly process a new event per newline.

        2. Enforce a maximum length per line, to ensure clients cannot run the server out of memory. (This is now configurable; the default is 512 characters.)

        3. Properly flush responses to the client.

        4. Increment counters for successful processing and failure.

        5. Use shutdownNow() when shutting down the server, otherwise an open client connection will cause the server to hang on shutdown.

        6. Made the inner classes of NetcatSource private; I believe making them public was unintentional.

        7. Corrected unit test so it now sends a newline with each request. Also now checks for "OK" response from server.

        8. Attempting to sneak in a minor fix for the EventHelper.dumpEvent() method into this patch, since I'm testing with LoggerSink and it's bugging me when it throws an exception on a zero-length body.

        This addresses bug FLUME-1037.

        https://issues.apache.org/jira/browse/FLUME-1037

        Diffs

        -----

        flume-ng-core/src/main/java/org/apache/flume/event/EventHelper.java a326a70

        flume-ng-core/src/main/java/org/apache/flume/source/NetcatSource.java a841b0e

        flume-ng-node/src/test/java/org/apache/flume/source/TestNetcatSource.java fb2a960

        Diff: https://reviews.apache.org/r/4497/diff

        Testing

        -------

        Unit tests pass. Did a bunch of manual testing using the nc tool with a Netcat source and a Logger sink.

        Thanks,

        Mike

        Show
        jiraposter@reviews.apache.org added a comment - On 2012-03-28 21:43:06, Hari Shreedharan wrote: > flume-ng-core/src/main/java/org/apache/flume/event/EventHelper.java, line 47 > < https://reviews.apache.org/r/4497/diff/2/?file=96930#file96930line47 > > > Maybe log this? Logging this isn't needed... this is not an exceptional case (just an empty body). This is just working around a bug in HexDump where it improperly handles a zero-length array. On 2012-03-28 21:43:06, Hari Shreedharan wrote: > flume-ng-core/src/main/java/org/apache/flume/source/NetcatSource.java, line 130 > < https://reviews.apache.org/r/4497/diff/2/?file=96931#file96931line130 > > > If there is no "port", context.getInteger() will return null here - causing autoboxing of a null value - ending in NullPointerException. Either port has to be an Integer or you must check for the null value before assignment. This comes after the line: Configurables.ensureRequiredNonNull(context, "bind", "port"); So by the time it reaches this statement, "port" is guaranteed not to be null. On 2012-03-28 21:43:06, Hari Shreedharan wrote: > flume-ng-core/src/main/java/org/apache/flume/source/NetcatSource.java, line 315 > < https://reviews.apache.org/r/4497/diff/2/?file=96931#file96931line315 > > > Why would charsRead be 0, if the client sent a line longer than maxLineLength? Shouldn't it be > maxLineLength? No, the condition here is that we didn't read any additional data, the buffer is full, and we searched and didn't find any newlines. Therefore we conclude that the client sent a longer line than can fit into our buffer. I'll add this explanation in a comment and update the patch. On 2012-03-28 21:43:06, Hari Shreedharan wrote: > flume-ng-core/src/main/java/org/apache/flume/source/NetcatSource.java, line 328 > < https://reviews.apache.org/r/4497/diff/2/?file=96931#file96931line328 > > > nit: Shouldn't this go at the beginning of the loop? Otherwise doesn't the first iteration of the loop just do nothing? Since the buffer was just created, it is obviously empty, so the first run of the loop seems like a no-op. You are right; this ordering is a remnant from a work-in-progress incarnation of the implementation. I will reorder these steps to be more intuitive. Mike ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4497/#review6490 ----------------------------------------------------------- On 2012-03-27 09:19:51, Mike Percy wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4497/ ----------------------------------------------------------- (Updated 2012-03-27 09:19:51) Review request for Flume. Summary ------- The NetcatSource does not create one event per newline. Instead, it creates one event per TCP session. I have addressed multiple issues with NetcatSource in this patch: 1. Properly process a new event per newline. 2. Enforce a maximum length per line, to ensure clients cannot run the server out of memory. (This is now configurable; the default is 512 characters.) 3. Properly flush responses to the client. 4. Increment counters for successful processing and failure. 5. Use shutdownNow() when shutting down the server, otherwise an open client connection will cause the server to hang on shutdown. 6. Made the inner classes of NetcatSource private; I believe making them public was unintentional. 7. Corrected unit test so it now sends a newline with each request. Also now checks for "OK" response from server. 8. Attempting to sneak in a minor fix for the EventHelper.dumpEvent() method into this patch, since I'm testing with LoggerSink and it's bugging me when it throws an exception on a zero-length body. This addresses bug FLUME-1037 . https://issues.apache.org/jira/browse/FLUME-1037 Diffs ----- flume-ng-core/src/main/java/org/apache/flume/event/EventHelper.java a326a70 flume-ng-core/src/main/java/org/apache/flume/source/NetcatSource.java a841b0e flume-ng-node/src/test/java/org/apache/flume/source/TestNetcatSource.java fb2a960 Diff: https://reviews.apache.org/r/4497/diff Testing ------- Unit tests pass. Did a bunch of manual testing using the nc tool with a Netcat source and a Logger sink. Thanks, Mike
        Hide
        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/4497/
        -----------------------------------------------------------

        (Updated 2012-03-28 23:16:36.439284)

        Review request for Flume.

        Changes
        -------

        Hari, thanks for looking at this. I've updated the patch based on the above.

        Summary
        -------

        The NetcatSource does not create one event per newline. Instead, it creates one event per TCP session.

        I have addressed multiple issues with NetcatSource in this patch:

        1. Properly process a new event per newline.
        2. Enforce a maximum length per line, to ensure clients cannot run the server out of memory. (This is now configurable; the default is 512 characters.)
        3. Properly flush responses to the client.
        4. Increment counters for successful processing and failure.
        5. Use shutdownNow() when shutting down the server, otherwise an open client connection will cause the server to hang on shutdown.
        6. Made the inner classes of NetcatSource private; I believe making them public was unintentional.
        7. Corrected unit test so it now sends a newline with each request. Also now checks for "OK" response from server.
        8. Attempting to sneak in a minor fix for the EventHelper.dumpEvent() method into this patch, since I'm testing with LoggerSink and it's bugging me when it throws an exception on a zero-length body.

        This addresses bug FLUME-1037.
        https://issues.apache.org/jira/browse/FLUME-1037

        Diffs (updated)


        flume-ng-core/src/main/java/org/apache/flume/event/EventHelper.java a326a70
        flume-ng-core/src/main/java/org/apache/flume/source/NetcatSource.java a841b0e
        flume-ng-node/src/test/java/org/apache/flume/source/TestNetcatSource.java fb2a960

        Diff: https://reviews.apache.org/r/4497/diff

        Testing
        -------

        Unit tests pass. Did a bunch of manual testing using the nc tool with a Netcat source and a Logger sink.

        Thanks,

        Mike

        Show
        jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4497/ ----------------------------------------------------------- (Updated 2012-03-28 23:16:36.439284) Review request for Flume. Changes ------- Hari, thanks for looking at this. I've updated the patch based on the above. Summary ------- The NetcatSource does not create one event per newline. Instead, it creates one event per TCP session. I have addressed multiple issues with NetcatSource in this patch: 1. Properly process a new event per newline. 2. Enforce a maximum length per line, to ensure clients cannot run the server out of memory. (This is now configurable; the default is 512 characters.) 3. Properly flush responses to the client. 4. Increment counters for successful processing and failure. 5. Use shutdownNow() when shutting down the server, otherwise an open client connection will cause the server to hang on shutdown. 6. Made the inner classes of NetcatSource private; I believe making them public was unintentional. 7. Corrected unit test so it now sends a newline with each request. Also now checks for "OK" response from server. 8. Attempting to sneak in a minor fix for the EventHelper.dumpEvent() method into this patch, since I'm testing with LoggerSink and it's bugging me when it throws an exception on a zero-length body. This addresses bug FLUME-1037 . https://issues.apache.org/jira/browse/FLUME-1037 Diffs (updated) flume-ng-core/src/main/java/org/apache/flume/event/EventHelper.java a326a70 flume-ng-core/src/main/java/org/apache/flume/source/NetcatSource.java a841b0e flume-ng-node/src/test/java/org/apache/flume/source/TestNetcatSource.java fb2a960 Diff: https://reviews.apache.org/r/4497/diff Testing ------- Unit tests pass. Did a bunch of manual testing using the nc tool with a Netcat source and a Logger sink. Thanks, Mike
        Hide
        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/4497/#review6490
        -----------------------------------------------------------

        Overall, looks good. Some minor feedback below:

        flume-ng-core/src/main/java/org/apache/flume/event/EventHelper.java
        <https://reviews.apache.org/r/4497/#comment14137>

        Maybe log this?

        flume-ng-core/src/main/java/org/apache/flume/source/NetcatSource.java
        <https://reviews.apache.org/r/4497/#comment14138>

        If there is no "port", context.getInteger() will return null here - causing autoboxing of a null value - ending in NullPointerException. Either port has to be an Integer or you must check for the null value before assignment.

        flume-ng-core/src/main/java/org/apache/flume/source/NetcatSource.java
        <https://reviews.apache.org/r/4497/#comment14144>

        Why would charsRead be 0, if the client sent a line longer than maxLineLength? Shouldn't it be > maxLineLength?

        flume-ng-core/src/main/java/org/apache/flume/source/NetcatSource.java
        <https://reviews.apache.org/r/4497/#comment14143>

        nit: Shouldn't this go at the beginning of the loop? Otherwise doesn't the first iteration of the loop just do nothing? Since the buffer was just created, it is obviously empty, so the first run of the loop seems like a no-op.

        • Hari

        On 2012-03-27 09:19:51, Mike Percy wrote:

        -----------------------------------------------------------

        This is an automatically generated e-mail. To reply, visit:

        https://reviews.apache.org/r/4497/

        -----------------------------------------------------------

        (Updated 2012-03-27 09:19:51)

        Review request for Flume.

        Summary

        -------

        The NetcatSource does not create one event per newline. Instead, it creates one event per TCP session.

        I have addressed multiple issues with NetcatSource in this patch:

        1. Properly process a new event per newline.

        2. Enforce a maximum length per line, to ensure clients cannot run the server out of memory. (This is now configurable; the default is 512 characters.)

        3. Properly flush responses to the client.

        4. Increment counters for successful processing and failure.

        5. Use shutdownNow() when shutting down the server, otherwise an open client connection will cause the server to hang on shutdown.

        6. Made the inner classes of NetcatSource private; I believe making them public was unintentional.

        7. Corrected unit test so it now sends a newline with each request. Also now checks for "OK" response from server.

        8. Attempting to sneak in a minor fix for the EventHelper.dumpEvent() method into this patch, since I'm testing with LoggerSink and it's bugging me when it throws an exception on a zero-length body.

        This addresses bug FLUME-1037.

        https://issues.apache.org/jira/browse/FLUME-1037

        Diffs

        -----

        flume-ng-core/src/main/java/org/apache/flume/event/EventHelper.java a326a70

        flume-ng-core/src/main/java/org/apache/flume/source/NetcatSource.java a841b0e

        flume-ng-node/src/test/java/org/apache/flume/source/TestNetcatSource.java fb2a960

        Diff: https://reviews.apache.org/r/4497/diff

        Testing

        -------

        Unit tests pass. Did a bunch of manual testing using the nc tool with a Netcat source and a Logger sink.

        Thanks,

        Mike

        Show
        jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4497/#review6490 ----------------------------------------------------------- Overall, looks good. Some minor feedback below: flume-ng-core/src/main/java/org/apache/flume/event/EventHelper.java < https://reviews.apache.org/r/4497/#comment14137 > Maybe log this? flume-ng-core/src/main/java/org/apache/flume/source/NetcatSource.java < https://reviews.apache.org/r/4497/#comment14138 > If there is no "port", context.getInteger() will return null here - causing autoboxing of a null value - ending in NullPointerException. Either port has to be an Integer or you must check for the null value before assignment. flume-ng-core/src/main/java/org/apache/flume/source/NetcatSource.java < https://reviews.apache.org/r/4497/#comment14144 > Why would charsRead be 0, if the client sent a line longer than maxLineLength? Shouldn't it be > maxLineLength? flume-ng-core/src/main/java/org/apache/flume/source/NetcatSource.java < https://reviews.apache.org/r/4497/#comment14143 > nit: Shouldn't this go at the beginning of the loop? Otherwise doesn't the first iteration of the loop just do nothing? Since the buffer was just created, it is obviously empty, so the first run of the loop seems like a no-op. Hari On 2012-03-27 09:19:51, Mike Percy wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4497/ ----------------------------------------------------------- (Updated 2012-03-27 09:19:51) Review request for Flume. Summary ------- The NetcatSource does not create one event per newline. Instead, it creates one event per TCP session. I have addressed multiple issues with NetcatSource in this patch: 1. Properly process a new event per newline. 2. Enforce a maximum length per line, to ensure clients cannot run the server out of memory. (This is now configurable; the default is 512 characters.) 3. Properly flush responses to the client. 4. Increment counters for successful processing and failure. 5. Use shutdownNow() when shutting down the server, otherwise an open client connection will cause the server to hang on shutdown. 6. Made the inner classes of NetcatSource private; I believe making them public was unintentional. 7. Corrected unit test so it now sends a newline with each request. Also now checks for "OK" response from server. 8. Attempting to sneak in a minor fix for the EventHelper.dumpEvent() method into this patch, since I'm testing with LoggerSink and it's bugging me when it throws an exception on a zero-length body. This addresses bug FLUME-1037 . https://issues.apache.org/jira/browse/FLUME-1037 Diffs ----- flume-ng-core/src/main/java/org/apache/flume/event/EventHelper.java a326a70 flume-ng-core/src/main/java/org/apache/flume/source/NetcatSource.java a841b0e flume-ng-node/src/test/java/org/apache/flume/source/TestNetcatSource.java fb2a960 Diff: https://reviews.apache.org/r/4497/diff Testing ------- Unit tests pass. Did a bunch of manual testing using the nc tool with a Netcat source and a Logger sink. Thanks, Mike
        Hide
        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/4497/
        -----------------------------------------------------------

        Review request for Flume.

        Summary
        -------

        The NetcatSource does not create one event per newline. Instead, it creates one event per TCP session.

        I have addressed multiple issues with NetcatSource in this patch:

        1. Properly process a new event per newline.
        2. Enforce a maximum length per line, to ensure clients cannot run the server out of memory. (This is now configurable; the default is 512 characters.)
        3. Properly flush responses to the client.
        4. Increment counters for successful processing and failure.
        5. Use shutdownNow() when shutting down the server, otherwise an open client connection will cause the server to hang on shutdown.
        6. Made the inner classes of NetcatSource private; I believe making them public was unintentional.
        7. Corrected unit test so it now sends a newline with each request. Also now checks for "OK" response from server.
        8. Attempting to sneak in a minor fix for the EventHelper.dumpEvent() method into this patch, since I'm testing with LoggerSink and it's bugging me when it throws an exception on a zero-length body.

        This addresses bug FLUME-1037.
        https://issues.apache.org/jira/browse/FLUME-1037

        Diffs


        flume-ng-core/src/main/java/org/apache/flume/event/EventHelper.java a326a70
        flume-ng-core/src/main/java/org/apache/flume/source/NetcatSource.java a841b0e
        flume-ng-node/src/test/java/org/apache/flume/source/TestNetcatSource.java fb2a960

        Diff: https://reviews.apache.org/r/4497/diff

        Testing
        -------

        Unit tests pass. Did a bunch of manual testing using the nc tool with a Netcat source and a Logger sink.

        Thanks,

        Mike

        Show
        jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4497/ ----------------------------------------------------------- Review request for Flume. Summary ------- The NetcatSource does not create one event per newline. Instead, it creates one event per TCP session. I have addressed multiple issues with NetcatSource in this patch: 1. Properly process a new event per newline. 2. Enforce a maximum length per line, to ensure clients cannot run the server out of memory. (This is now configurable; the default is 512 characters.) 3. Properly flush responses to the client. 4. Increment counters for successful processing and failure. 5. Use shutdownNow() when shutting down the server, otherwise an open client connection will cause the server to hang on shutdown. 6. Made the inner classes of NetcatSource private; I believe making them public was unintentional. 7. Corrected unit test so it now sends a newline with each request. Also now checks for "OK" response from server. 8. Attempting to sneak in a minor fix for the EventHelper.dumpEvent() method into this patch, since I'm testing with LoggerSink and it's bugging me when it throws an exception on a zero-length body. This addresses bug FLUME-1037 . https://issues.apache.org/jira/browse/FLUME-1037 Diffs flume-ng-core/src/main/java/org/apache/flume/event/EventHelper.java a326a70 flume-ng-core/src/main/java/org/apache/flume/source/NetcatSource.java a841b0e flume-ng-node/src/test/java/org/apache/flume/source/TestNetcatSource.java fb2a960 Diff: https://reviews.apache.org/r/4497/diff Testing ------- Unit tests pass. Did a bunch of manual testing using the nc tool with a Netcat source and a Logger sink. Thanks, Mike
        Hide
        Hari Shreedharan added a comment -

        I had seen this issue on the Mac nc client when I was testing the configuration system, and had brought it up a few days back, but didn't investigate it at that time

        Show
        Hari Shreedharan added a comment - I had seen this issue on the Mac nc client when I was testing the configuration system, and had brought it up a few days back, but didn't investigate it at that time
        Hide
        Will McQueen added a comment - - edited

        I now understand what's going on. NetcatSource is not creating one event per newline-separated text. Instead, it's creating one event per connection (the event body ends when the connection closes).

        The NetcatSource javadoc says:
        "A netcat-like source that listens on a given port and turns each line of text into an event....The expectation is that the supplied data is newline separated text. Each line of text is turned into a Flume event and sent via the connected channel."

        The handler code (in NetcatSource.NetcatSocketHandler.run) that reads from the socket is here:
        while (reader.read(buffer) != -1) {
        buffer.flip();

        logger.debug("read {} characters", buffer.remaining());

        counterGroup.addAndGet("characters.received",
        Long.valueOf(buffer.limit()));

        builder.append(buffer.array(), buffer.position(), buffer.length());
        }

        As you can see, the buffer (which is 512 bytes in size) is continuously used to read from the server port until the connection is EOL'd by the client. That means that the StringBuilder continues to grow without bound (assuming that you have a continuous input source such as with "yes | nc localhost 1473"). So no matter what you specify for -Xmx in flume-env.sh, you're guaranteed to eventually go Out Of Memory (which will sever the connection, and cause the source to restart due to the LifecyleSupervisor's AlwaysRestartPolicy, allowing a new connection over the same port).

        In case you're interested in my approach, I started with a very large heap size and I used YourKit Java Profiler (there are other profilers out there too like JProfiler) to take a snapshot of memory, identify the objects with the largest retained sizes (one of which turned out to be a single instance of class StringBuilder), then filtered-out "unreachable but not yet collected" objects, sorted the list by "Retained Size", and looked at top 3 hotspots: char[], Thread, and StringBuilder. There was only 1 object of class StringBuilder, and it had a retained size of over 500MB! So I drilled-down into that object (highlighting it and choosing "Selected Objects"), which showed the value of the StringBuilder object to be a single char[] of over 250 million 'y' characters. Going to <Paths from GC Roots> tab and clicking "start calculation" showed that the huge char[] was held by a local variable called "builder" in a thread named "netcat-handler-0"... which brought me to the while loop I pasted earlier.

        I initially used NULL for the sink, but if you replace that with LOGGER then you'll see that only a single event is output... and it's output only after the client closes the connection to the netcat source. I tried the LOGGER sink when Hari showed me that when he used netcat source on his Mac, only a single event was output to the logger after closing the netcat connection.

        Show
        Will McQueen added a comment - - edited I now understand what's going on. NetcatSource is not creating one event per newline-separated text. Instead, it's creating one event per connection (the event body ends when the connection closes). The NetcatSource javadoc says: "A netcat-like source that listens on a given port and turns each line of text into an event....The expectation is that the supplied data is newline separated text. Each line of text is turned into a Flume event and sent via the connected channel." The handler code (in NetcatSource.NetcatSocketHandler.run) that reads from the socket is here: while (reader.read(buffer) != -1) { buffer.flip(); logger.debug("read {} characters", buffer.remaining()); counterGroup.addAndGet("characters.received", Long.valueOf(buffer.limit())); builder.append(buffer.array(), buffer.position(), buffer.length()); } As you can see, the buffer (which is 512 bytes in size) is continuously used to read from the server port until the connection is EOL'd by the client. That means that the StringBuilder continues to grow without bound (assuming that you have a continuous input source such as with "yes | nc localhost 1473"). So no matter what you specify for -Xmx in flume-env.sh, you're guaranteed to eventually go Out Of Memory (which will sever the connection, and cause the source to restart due to the LifecyleSupervisor's AlwaysRestartPolicy, allowing a new connection over the same port). In case you're interested in my approach, I started with a very large heap size and I used YourKit Java Profiler (there are other profilers out there too like JProfiler) to take a snapshot of memory, identify the objects with the largest retained sizes (one of which turned out to be a single instance of class StringBuilder), then filtered-out "unreachable but not yet collected" objects, sorted the list by "Retained Size", and looked at top 3 hotspots: char[], Thread, and StringBuilder. There was only 1 object of class StringBuilder, and it had a retained size of over 500MB! So I drilled-down into that object (highlighting it and choosing "Selected Objects"), which showed the value of the StringBuilder object to be a single char[] of over 250 million 'y' characters. Going to <Paths from GC Roots> tab and clicking "start calculation" showed that the huge char[] was held by a local variable called "builder" in a thread named "netcat-handler-0"... which brought me to the while loop I pasted earlier. I initially used NULL for the sink, but if you replace that with LOGGER then you'll see that only a single event is output... and it's output only after the client closes the connection to the netcat source. I tried the LOGGER sink when Hari showed me that when he used netcat source on his Mac, only a single event was output to the logger after closing the netcat connection.

          People

          • Assignee:
            Mike Percy
            Reporter:
            Will McQueen
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development