Flume
  1. Flume
  2. FLUME-2233

MemoryChannel lock contention on every put due to bytesRemaining Semaphore

    Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: v1.5.0
    • Component/s: None
    • Labels:
      None

      Description

      This semaphore is checked every time there is a put (unlike the queueRemaining semaphore which is checked only on transaction commits), causing the channel to slow down, even when the user does not care about memory usage.

      We must add a new parameter to make sure that we look at bytesRemaining only when required which the user can disable if memory is not something they worry about because they know the channel size will be sufficiently small. By default, we will need to still check bytesRemaining to avoid breaking existing configurations.

      1. FLUME-2233.patch
        6 kB
        Hari Shreedharan
      2. FLUME-2233.patch
        10 kB
        Hari Shreedharan
      3. FLUME-2233.patch
        10 kB
        Hari Shreedharan
      4. FLUME-2233.patch
        8 kB
        Hari Shreedharan

        Issue Links

          Activity

          Hide
          Roshan Naik added a comment -

          whatever it is, it would be good to make it default.
          It would be nice to have most flume components require minimal (if not 0) explicit config tweaks to have them run in a performant manner.

          Show
          Roshan Naik added a comment - whatever it is, it would be good to make it default. It would be nice to have most flume components require minimal (if not 0) explicit config tweaks to have them run in a performant manner.
          Hide
          Hari Shreedharan added a comment -

          I agree, but the issue is that we have made releases which automatically check for the available memory size fields. Adding this param means that we need to ensure that older config behavior does not change. So we need to check bytesRemaining by default.

          Show
          Hari Shreedharan added a comment - I agree, but the issue is that we have made releases which automatically check for the available memory size fields. Adding this param means that we need to ensure that older config behavior does not change. So we need to check bytesRemaining by default.
          Hide
          Hari Shreedharan added a comment -

          Patch that implements the fix. Unfortunately, there is no easy way to test this fix.

          Show
          Hari Shreedharan added a comment - Patch that implements the fix. Unfortunately, there is no easy way to test this fix.
          Hide
          Roshan Naik added a comment -

          Rather than optionally disabling the semaphore, i think it is better to instead update the semaphore on commit. That will help the perf issue and also flume wont be running in a unsafe mode. It is difficult in practice for users to decide upfront whether or not this flag should be turned on in production.

          Show
          Roshan Naik added a comment - Rather than optionally disabling the semaphore, i think it is better to instead update the semaphore on commit. That will help the perf issue and also flume wont be running in a unsafe mode. It is difficult in practice for users to decide upfront whether or not this flag should be turned on in production.
          Hide
          Hari Shreedharan added a comment -

          That was my initial thought - to do what we do with the queueRemaining semaphore, but a long running transaction could essentially hold up heap space - and the heap space used by the channel is not just the committed transactions. Even though there is a performance penalty, this actually is more of the intended meaning of the original patch. Also, I have rarely seen this enabled in prod. Most users with memory channels generally tend to keep smaller channels and don't really hit memory issues a lot.

          Show
          Hari Shreedharan added a comment - That was my initial thought - to do what we do with the queueRemaining semaphore, but a long running transaction could essentially hold up heap space - and the heap space used by the channel is not just the committed transactions. Even though there is a performance penalty, this actually is more of the intended meaning of the original patch. Also, I have rarely seen this enabled in prod. Most users with memory channels generally tend to keep smaller channels and don't really hit memory issues a lot.
          Hide
          Hari Shreedharan added a comment -

          Also note that I am not married to this approach. If other committers feel strongly that we should check the semaphore on commit, I am ok with that too - though I think that might be a bit misguiding to users.

          Show
          Hari Shreedharan added a comment - Also note that I am not married to this approach. If other committers feel strongly that we should check the semaphore on commit, I am ok with that too - though I think that might be a bit misguiding to users.
          Hide
          Roshan Naik added a comment -

          ok. I think the byteCapacityBufferPercentage can mitigate/alleviate the heap consumption issue.

          Show
          Roshan Naik added a comment - ok. I think the byteCapacityBufferPercentage can mitigate/alleviate the heap consumption issue.
          Hide
          Hari Shreedharan added a comment -

          Note sure what you mean? Are you suggesting we move this to just the commit and ask that the users specify a lower byteCapacityBufferPercentage to take care of the events currently in the pipeline?

          Show
          Hari Shreedharan added a comment - Note sure what you mean? Are you suggesting we move this to just the commit and ask that the users specify a lower byteCapacityBufferPercentage to take care of the events currently in the pipeline?
          Hide
          Roshan Naik added a comment -

          Yes, kind of.. i was thinking byteCapacityBufferPercentage can be used to account for the uncommitted events also. Today it seems to be used to account for header size only.

          Whether users need to tweak the setting or not would depend on the situation. For many use cases the current default value may already be generous enough since the transaction batch size is typically not very large. Use cases where lots of headers are applied to events, very large batch sizes on source, lots of sources are hooked up to a single mem channel, or event body size is relatively large are situations where user may need to tweak byteCapacityBufferPercentage.

          Show
          Roshan Naik added a comment - Yes, kind of.. i was thinking byteCapacityBufferPercentage can be used to account for the uncommitted events also. Today it seems to be used to account for header size only. Whether users need to tweak the setting or not would depend on the situation. For many use cases the current default value may already be generous enough since the transaction batch size is typically not very large. Use cases where lots of headers are applied to events, very large batch sizes on source, lots of sources are hooked up to a single mem channel, or event body size is relatively large are situations where user may need to tweak byteCapacityBufferPercentage.
          Hide
          Hari Shreedharan added a comment -

          Moving the semaphore update to commit. This should improve the lock contention issue.

          Show
          Hari Shreedharan added a comment - Moving the semaphore update to commit. This should improve the lock contention issue.
          Hide
          Roshan Naik added a comment -

          could you please update the patch on RB ?

          Show
          Roshan Naik added a comment - could you please update the patch on RB ?
          Hide
          Hari Shreedharan added a comment -

          Done

          Show
          Hari Shreedharan added a comment - Done
          Hide
          Roshan Naik added a comment -

          +1.
          Thanks for the patch Hari.

          Show
          Roshan Naik added a comment - +1. Thanks for the patch Hari.
          Hide
          Hudson added a comment -

          FAILURE: Integrated in flume-trunk #522 (See https://builds.apache.org/job/flume-trunk/522/)
          FLUME-2233. MemoryChannel lock contention on every put due to bytesRemaining Semaphore (roshan: http://git-wip-us.apache.org/repos/asf/flume/repo?p=flume.git&a=commit&h=d3f5123c4d6cdbe4e5cca6e7e141e507bb1103a7)

          • flume-ng-core/src/main/java/org/apache/flume/channel/MemoryChannel.java
          • flume-ng-core/src/test/java/org/apache/flume/channel/TestMemoryChannel.java
          Show
          Hudson added a comment - FAILURE: Integrated in flume-trunk #522 (See https://builds.apache.org/job/flume-trunk/522/ ) FLUME-2233 . MemoryChannel lock contention on every put due to bytesRemaining Semaphore (roshan: http://git-wip-us.apache.org/repos/asf/flume/repo?p=flume.git&a=commit&h=d3f5123c4d6cdbe4e5cca6e7e141e507bb1103a7 ) flume-ng-core/src/main/java/org/apache/flume/channel/MemoryChannel.java flume-ng-core/src/test/java/org/apache/flume/channel/TestMemoryChannel.java

            People

            • Assignee:
              Hari Shreedharan
              Reporter:
              Hari Shreedharan
            • Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development