Details

    • Type: Sub-task Sub-task
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: v1.3.0
    • Fix Version/s: v1.4.0
    • Component/s: Configuration
    • Labels:
      None

      Description

      1) It's not currently possible to provide your own configuration source
      2) All sinks/sources/channels are reused, even across re-configurations which they are not used
      3) The flume-ng-node module is very complex and starts many threads which are not required. That is there are multiple life cycle supervisors.

      1. FLUME-1630-6.patch
        171 kB
        Brock Noland
      2. FLUME-1630-5.patch
        169 kB
        Brock Noland
      3. FLUME-1630-2.patch
        168 kB
        Brock Noland
      4. FLUME-1630-1.patch
        165 kB
        Brock Noland
      5. FLUME-1630-0.patch
        165 kB
        Brock Noland

        Issue Links

          Activity

          Hide
          Brock Noland added a comment -

          Patch from RB

          Show
          Brock Noland added a comment - Patch from RB
          Hide
          Brock Noland added a comment -

          latest

          Show
          Brock Noland added a comment - latest
          Hide
          Hari Shreedharan added a comment -

          Brock,

          A couple of concerns:

          • It looks like certain components are annotated to be "Reusable." I think we should change this to "NonReusable." All existing components except the ones like MemoryChannel should be tagged as "NonReusable."
            • Any custom components written to exploit the current code of being automatically "cached" needs to be able to function without any code changes.
            • Any change of behavior expected by custom components (without any code changes) would break backward compatibility and can only be done in a major release.
          • So any components that do not save state should explicitly state as such.
          • I suggest renaming "Reusable" to "Stateful"/"Stateless." or something like that.

          I have not reviewed the whole patch, but this is something I noticed.

          Show
          Hari Shreedharan added a comment - Brock, A couple of concerns: It looks like certain components are annotated to be "Reusable." I think we should change this to "NonReusable." All existing components except the ones like MemoryChannel should be tagged as "NonReusable." Any custom components written to exploit the current code of being automatically "cached" needs to be able to function without any code changes. Any change of behavior expected by custom components (without any code changes) would break backward compatibility and can only be done in a major release. So any components that do not save state should explicitly state as such. I suggest renaming "Reusable" to "Stateful"/"Stateless." or something like that. I have not reviewed the whole patch, but this is something I noticed.
          Hide
          Brock Noland added a comment - - edited

          First off I agree that backwards compatibility is very important. However, I believe this is an exceedingly rare case. In that I find it unlikely that there are custom components out there which require this behavior. Are we aware of any custom components exploiting this? It seems like this was a hack put in to support memory channel. As a developer, I would expect the new behavior as in the patch and it seems unlikely you would know about this functionality unless you looked at the default factories.

          Show
          Brock Noland added a comment - - edited First off I agree that backwards compatibility is very important. However, I believe this is an exceedingly rare case. In that I find it unlikely that there are custom components out there which require this behavior. Are we aware of any custom components exploiting this? It seems like this was a hack put in to support memory channel. As a developer, I would expect the new behavior as in the patch and it seems unlikely you would know about this functionality unless you looked at the default factories.
          Hide
          Brock Noland added a comment -

          I should have added, if there is a real probability that we will affect a custom component, I am for your changes. When writing the patch, I thought about the possible cases for which components would be affected and I came up with the following:

          1) Non-persistant channel
          2) Source/sink which non-persistently buffers data
          3) Source/Sink which has a bug in it about cleaning up resources

          #1 seems incredibly unlikely given MemoryChannel
          #2 is an anti-pattern and also seems unlikely since the channel is a buffer
          #3 is a bug and I don't think we should worry about breaking theoretical bugs

          That is how I came to the conclusion that only MemoryChannel would require this behavior off the bat.

          Show
          Brock Noland added a comment - I should have added, if there is a real probability that we will affect a custom component, I am for your changes. When writing the patch, I thought about the possible cases for which components would be affected and I came up with the following: 1) Non-persistant channel 2) Source/sink which non-persistently buffers data 3) Source/Sink which has a bug in it about cleaning up resources #1 seems incredibly unlikely given MemoryChannel #2 is an anti-pattern and also seems unlikely since the channel is a buffer #3 is a bug and I don't think we should worry about breaking theoretical bugs That is how I came to the conclusion that only MemoryChannel would require this behavior off the bat.
          Hide
          Hari Shreedharan added a comment -

          Brock,

          I agree that we need not worry about 2 or 3. 2 is ok even if the source/sink buffers data, since there is no guarantee that the data is preserved unless it is committed into a channel. So if a source/sink returns success without commit, that is just a violation of the semantics and we should not bother about it.

          I know people who have written proprietary custom components and have been using Flume in production since v1.1.0 (since many/most components were not available or were not stable at that time). They may have upgraded to newer versions, but I am not sure if they actually moved off the components they wrote at the time. If we make "Reusablity" explicit, we risk breaking these components in a minor release - which should not be the case (or we could move to hbase-style releases, which means maintaining multiple code lines).

          Show
          Hari Shreedharan added a comment - Brock, I agree that we need not worry about 2 or 3. 2 is ok even if the source/sink buffers data, since there is no guarantee that the data is preserved unless it is committed into a channel. So if a source/sink returns success without commit, that is just a violation of the semantics and we should not bother about it. I know people who have written proprietary custom components and have been using Flume in production since v1.1.0 (since many/most components were not available or were not stable at that time). They may have upgraded to newer versions, but I am not sure if they actually moved off the components they wrote at the time. If we make "Reusablity" explicit, we risk breaking these components in a minor release - which should not be the case (or we could move to hbase-style releases, which means maintaining multiple code lines).
          Hide
          Brock Noland added a comment - - edited

          Fair enough, reusability will be the default for Channels not implemented for sinks/sources.

          I am +1 on changing the name Reusable/NonReusable but I am -1 on Stateful and Stateless. The "State" term feels too close to the actual programming meaning of Stateless where as all the channels clearly have state. Perhaps Recycleable and Disposable? Or maybe just StatefulChannel StatelessChannel?

          Show
          Brock Noland added a comment - - edited Fair enough, reusability will be the default for Channels not implemented for sinks/sources. I am +1 on changing the name Reusable/NonReusable but I am -1 on Stateful and Stateless. The "State" term feels too close to the actual programming meaning of Stateless where as all the channels clearly have state. Perhaps Recycleable and Disposable? Or maybe just StatefulChannel StatelessChannel?
          Hide
          Hari Shreedharan added a comment -

          +1 on Recyleable/Disposable.

          Show
          Hari Shreedharan added a comment - +1 on Recyleable/Disposable.
          Hide
          Brock Noland added a comment -

          Updated patch from RB

          Show
          Brock Noland added a comment - Updated patch from RB
          Hide
          Hari Shreedharan added a comment -

          Brock,

          Just for ease of reviewing, how about creating a branch called FLUME-1502 and then once the whole thing is ready - merge it into Trunk?

          Show
          Hari Shreedharan added a comment - Brock, Just for ease of reviewing, how about creating a branch called FLUME-1502 and then once the whole thing is ready - merge it into Trunk?
          Show
          Brock Noland added a comment - OK, I created a branch here https://git-wip-us.apache.org/repos/asf?p=flume.git;a=shortlog;h=refs/heads/FLUME-1502
          Hide
          Brock Noland added a comment -

          This functionality was added the agent as part of the patch.

          Show
          Brock Noland added a comment - This functionality was added the agent as part of the patch.
          Hide
          Brock Noland added a comment -

          Last statement was when I linked this with another JIRA.

          Hari,

          I committed this to the FLUME-1502 branch. I think I was actually supposed to wait for a +1 on RB before doing so.

          Show
          Brock Noland added a comment - Last statement was when I linked this with another JIRA. Hari, I committed this to the FLUME-1502 branch. I think I was actually supposed to wait for a +1 on RB before doing so.
          Hide
          Hari Shreedharan added a comment -

          No, I think committing to branches other than trunk or a release branch(flume-1.3.0) - is ok. You aren't breaking anything.

          Show
          Hari Shreedharan added a comment - No, I think committing to branches other than trunk or a release branch(flume-1.3.0) - is ok. You aren't breaking anything.
          Hide
          Brock Noland added a comment -

          OK, how does the workflow work? Should patches which are in review be committed to the branch?

          Show
          Brock Noland added a comment - OK, how does the workflow work? Should patches which are in review be committed to the branch?
          Hide
          Brock Noland added a comment -

          Hari, does this mean you are +1 on this patch? If so can you please +1 the RB item so I can close this JIRA.

          Show
          Brock Noland added a comment - Hari, does this mean you are +1 on this patch? If so can you please +1 the RB item so I can close this JIRA.
          Hide
          Hari Shreedharan added a comment -

          Brock - I was at Strata the whole week, and have not got a chance to review. I will probably look at it next week - please push all your changes to the branch. I will review it, next week. I apologize for the delay.

          Show
          Hari Shreedharan added a comment - Brock - I was at Strata the whole week, and have not got a chance to review. I will probably look at it next week - please push all your changes to the branch. I will review it, next week. I apologize for the delay.
          Hide
          Mike Percy added a comment -

          Brock, the changes on the FLUME-1502 branch look good to me. I think someone else should look @ this also, but if you think it's ready then go ahead and rebase against trunk, update the patch against RB, and attach the latest patch to the JIRA. Thanks!

          Show
          Mike Percy added a comment - Brock, the changes on the FLUME-1502 branch look good to me. I think someone else should look @ this also, but if you think it's ready then go ahead and rebase against trunk, update the patch against RB, and attach the latest patch to the JIRA. Thanks!
          Hide
          Hari Shreedharan added a comment -

          Unfortunately my review is on rb, and rb seems to be down. I will publish it as soon as it is back up. Mostly minor, no major ones.

          Show
          Hari Shreedharan added a comment - Unfortunately my review is on rb, and rb seems to be down. I will publish it as soon as it is back up. Mostly minor, no major ones.
          Hide
          Brock Noland added a comment -

          Latest patch, rebased on trunk. Unless I misunderstand something, there is just one open question on RB.

          Show
          Brock Noland added a comment - Latest patch, rebased on trunk. Unless I misunderstand something, there is just one open question on RB.
          Hide
          Brock Noland added a comment -

          Latest patch

          Show
          Brock Noland added a comment - Latest patch
          Hide
          Hari Shreedharan added a comment -

          Patch committed, rev: 97ed09e6f8255ee99ebb27cd10ef11a90830db24. Great work Brock!Thanks!

          Show
          Hari Shreedharan added a comment - Patch committed, rev: 97ed09e6f8255ee99ebb27cd10ef11a90830db24. Great work Brock!Thanks!
          Hide
          Roshan Naik added a comment -

          An easy to understand description of this new architecture/functionality would be really nice.

          Show
          Roshan Naik added a comment - An easy to understand description of this new architecture/functionality would be really nice.
          Hide
          Brock Noland added a comment -

          Roshan,

          Note that this functionality is for Flume developers as opposed to user developers. The change doesn't really add new functionality (with one exception). The following is copied from the review:

          1) Abstract property file provider is changed to Abstract property provider. Two concrete implementations are provided, PropertyFileConfigurationProvider and PollingPropertyFileConfigurationProvider. There is an additional concrete implementation MemoryConfigurationProvider is in TestAbstractConfigurationProvider.

          2) Caching instances is removed from all factories. Instance caching is implemented in AbstractConfigurationProvider for channels if they have the Recyclable annotation. MemoryChannel has this annotation while the other channels have Disposable.

          3) A layer of supervisors is removed. The application class now starts and stops the components when handleConfigurationEvent is called. This is called on startup if PropertyFileConfigurationProvider is used or whenever the configuration file changes if PollingPropertyFileConfigurationProvider is used. PollingPropertyFileConfigurationProvider uses EventBus (guava) to trigger the re-configuration.

          and I would add:

          4) The Application class has a new command line option --no-reload-conf which means the agent will not try and re-load the configuration file when changed.

          5) This work was done to support FLUME-1502 which is meant to be the mechanism with which user developers would utilize these changes. That JIRA has a design document.

          Show
          Brock Noland added a comment - Roshan, Note that this functionality is for Flume developers as opposed to user developers. The change doesn't really add new functionality (with one exception). The following is copied from the review: 1) Abstract property file provider is changed to Abstract property provider. Two concrete implementations are provided, PropertyFileConfigurationProvider and PollingPropertyFileConfigurationProvider. There is an additional concrete implementation MemoryConfigurationProvider is in TestAbstractConfigurationProvider. 2) Caching instances is removed from all factories. Instance caching is implemented in AbstractConfigurationProvider for channels if they have the Recyclable annotation. MemoryChannel has this annotation while the other channels have Disposable. 3) A layer of supervisors is removed. The application class now starts and stops the components when handleConfigurationEvent is called. This is called on startup if PropertyFileConfigurationProvider is used or whenever the configuration file changes if PollingPropertyFileConfigurationProvider is used. PollingPropertyFileConfigurationProvider uses EventBus (guava) to trigger the re-configuration. and I would add: 4) The Application class has a new command line option --no-reload-conf which means the agent will not try and re-load the configuration file when changed. 5) This work was done to support FLUME-1502 which is meant to be the mechanism with which user developers would utilize these changes. That JIRA has a design document.
          Hide
          Hudson added a comment -

          Integrated in flume-trunk #335 (See https://builds.apache.org/job/flume-trunk/335/)
          FLUME-1630. Flume configuration code could be improved. (Revision 97ed09e6f8255ee99ebb27cd10ef11a90830db24)

          Result = SUCCESS
          hshreedharan : http://git-wip-us.apache.org/repos/asf/flume/repo?p=flume.git&a=commit&h=97ed09e6f8255ee99ebb27cd10ef11a90830db24
          Files :

          • flume-ng-node/src/test/java/org/apache/flume/node/TestAbstractLogicalNodeManager.java
          • flume-ng-core/src/main/java/org/apache/flume/annotations/Disposable.java
          • flume-ng-node/src/main/java/org/apache/flume/node/PropertiesFileConfigurationProvider.java
          • flume-ng-node/src/test/java/org/apache/flume/node/TestFlumeNode.java
          • flume-ng-core/src/main/java/org/apache/flume/SourceFactory.java
          • flume-ng-node/src/main/java/org/apache/flume/conf/file/SimpleNodeConfiguration.java
          • flume-ng-core/src/main/java/org/apache/flume/Constants.java
          • flume-ng-channels/flume-jdbc-channel/src/main/java/org/apache/flume/channel/jdbc/JdbcChannel.java
          • flume-ng-node/src/test/java/org/apache/flume/node/TestPollingPropertiesFileConfigurationProvider.java
          • flume-ng-core/src/main/java/org/apache/flume/annotations/Recyclable.java
          • flume-ng-node/src/main/java/org/apache/flume/node/NodeManager.java
          • flume-ng-node/src/main/java/org/apache/flume/node/ConfigurationProvider.java
          • flume-ng-core/src/test/java/org/apache/flume/source/TestDefaultSourceFactory.java
          • flume-ng-core/src/main/java/org/apache/flume/source/DefaultSourceFactory.java
          • flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/FileChannel.java
          • flume-ng-node/src/test/java/org/apache/flume/source/FlakeySequenceGeneratorSource.java
          • flume-ng-node/src/test/java/org/apache/flume/node/TestApplication.java
          • flume-ng-node/src/main/java/org/apache/flume/conf/properties/PropertiesFileConfigurationProvider.java
          • flume-ng-core/src/main/java/org/apache/flume/SinkFactory.java
          • flume-ng-configuration/src/main/java/org/apache/flume/conf/FlumeConfiguration.java
          • flume-ng-node/src/main/java/org/apache/flume/node/nodemanager/NodeConfigurationAware.java
          • flume-ng-node/src/main/java/org/apache/flume/node/nodemanager/AbstractLogicalNodeManager.java
          • flume-ng-node/src/test/java/org/apache/flume/node/TestDefaultLogicalNodeManager.java
          • flume-ng-node/src/test/java/org/apache/flume/node/TestAbstractConfigurationProvider.java
          • flume-ng-node/src/main/java/org/apache/flume/conf/file/AbstractFileConfigurationProvider.java
          • flume-ng-node/src/test/java/org/apache/flume/node/TestPropertiesFileConfigurationProvider.java
          • flume-ng-node/pom.xml
          • flume-ng-node/src/main/java/org/apache/flume/node/Application.java
          • flume-ng-core/src/main/java/org/apache/flume/ChannelFactory.java
          • flume-ng-node/src/main/java/org/apache/flume/node/SimpleMaterializedConfiguration.java
          • flume-ng-core/src/main/java/org/apache/flume/channel/MemoryChannel.java
          • flume-ng-channels/flume-recoverable-memory-channel/src/main/java/org/apache/flume/channel/recoverable/memory/RecoverableMemoryChannel.java
          • flume-ng-node/src/main/java/org/apache/flume/node/MaterializedConfiguration.java
          • flume-ng-node/src/main/java/org/apache/flume/node/NodeConfiguration.java
          • flume-ng-node/src/test/resources/flume-conf.properties
          • flume-ng-core/src/test/java/org/apache/flume/sink/TestDefaultSinkFactory.java
          • flume-ng-node/src/test/java/org/apache/flume/conf/properties/TestPropertiesFileConfigurationProvider.java
          • flume-ng-core/src/main/java/org/apache/flume/sink/DefaultSinkFactory.java
          • flume-ng-node/src/main/java/org/apache/flume/node/AbstractConfigurationProvider.java
          • flume-ng-node/src/main/java/org/apache/flume/node/PollingPropertiesFileConfigurationProvider.java
          • flume-ng-node/src/main/java/org/apache/flume/node/nodemanager/DefaultLogicalNodeManager.java
          • flume-ng-core/src/main/java/org/apache/flume/channel/DefaultChannelFactory.java
          • flume-ng-node/src/main/java/org/apache/flume/node/FlumeNode.java
          • flume-ng-node/src/test/java/org/apache/flume/node/TestFlumeNodeApplication.java
          Show
          Hudson added a comment - Integrated in flume-trunk #335 (See https://builds.apache.org/job/flume-trunk/335/ ) FLUME-1630 . Flume configuration code could be improved. (Revision 97ed09e6f8255ee99ebb27cd10ef11a90830db24) Result = SUCCESS hshreedharan : http://git-wip-us.apache.org/repos/asf/flume/repo?p=flume.git&a=commit&h=97ed09e6f8255ee99ebb27cd10ef11a90830db24 Files : flume-ng-node/src/test/java/org/apache/flume/node/TestAbstractLogicalNodeManager.java flume-ng-core/src/main/java/org/apache/flume/annotations/Disposable.java flume-ng-node/src/main/java/org/apache/flume/node/PropertiesFileConfigurationProvider.java flume-ng-node/src/test/java/org/apache/flume/node/TestFlumeNode.java flume-ng-core/src/main/java/org/apache/flume/SourceFactory.java flume-ng-node/src/main/java/org/apache/flume/conf/file/SimpleNodeConfiguration.java flume-ng-core/src/main/java/org/apache/flume/Constants.java flume-ng-channels/flume-jdbc-channel/src/main/java/org/apache/flume/channel/jdbc/JdbcChannel.java flume-ng-node/src/test/java/org/apache/flume/node/TestPollingPropertiesFileConfigurationProvider.java flume-ng-core/src/main/java/org/apache/flume/annotations/Recyclable.java flume-ng-node/src/main/java/org/apache/flume/node/NodeManager.java flume-ng-node/src/main/java/org/apache/flume/node/ConfigurationProvider.java flume-ng-core/src/test/java/org/apache/flume/source/TestDefaultSourceFactory.java flume-ng-core/src/main/java/org/apache/flume/source/DefaultSourceFactory.java flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/FileChannel.java flume-ng-node/src/test/java/org/apache/flume/source/FlakeySequenceGeneratorSource.java flume-ng-node/src/test/java/org/apache/flume/node/TestApplication.java flume-ng-node/src/main/java/org/apache/flume/conf/properties/PropertiesFileConfigurationProvider.java flume-ng-core/src/main/java/org/apache/flume/SinkFactory.java flume-ng-configuration/src/main/java/org/apache/flume/conf/FlumeConfiguration.java flume-ng-node/src/main/java/org/apache/flume/node/nodemanager/NodeConfigurationAware.java flume-ng-node/src/main/java/org/apache/flume/node/nodemanager/AbstractLogicalNodeManager.java flume-ng-node/src/test/java/org/apache/flume/node/TestDefaultLogicalNodeManager.java flume-ng-node/src/test/java/org/apache/flume/node/TestAbstractConfigurationProvider.java flume-ng-node/src/main/java/org/apache/flume/conf/file/AbstractFileConfigurationProvider.java flume-ng-node/src/test/java/org/apache/flume/node/TestPropertiesFileConfigurationProvider.java flume-ng-node/pom.xml flume-ng-node/src/main/java/org/apache/flume/node/Application.java flume-ng-core/src/main/java/org/apache/flume/ChannelFactory.java flume-ng-node/src/main/java/org/apache/flume/node/SimpleMaterializedConfiguration.java flume-ng-core/src/main/java/org/apache/flume/channel/MemoryChannel.java flume-ng-channels/flume-recoverable-memory-channel/src/main/java/org/apache/flume/channel/recoverable/memory/RecoverableMemoryChannel.java flume-ng-node/src/main/java/org/apache/flume/node/MaterializedConfiguration.java flume-ng-node/src/main/java/org/apache/flume/node/NodeConfiguration.java flume-ng-node/src/test/resources/flume-conf.properties flume-ng-core/src/test/java/org/apache/flume/sink/TestDefaultSinkFactory.java flume-ng-node/src/test/java/org/apache/flume/conf/properties/TestPropertiesFileConfigurationProvider.java flume-ng-core/src/main/java/org/apache/flume/sink/DefaultSinkFactory.java flume-ng-node/src/main/java/org/apache/flume/node/AbstractConfigurationProvider.java flume-ng-node/src/main/java/org/apache/flume/node/PollingPropertiesFileConfigurationProvider.java flume-ng-node/src/main/java/org/apache/flume/node/nodemanager/DefaultLogicalNodeManager.java flume-ng-core/src/main/java/org/apache/flume/channel/DefaultChannelFactory.java flume-ng-node/src/main/java/org/apache/flume/node/FlumeNode.java flume-ng-node/src/test/java/org/apache/flume/node/TestFlumeNodeApplication.java

            People

            • Assignee:
              Brock Noland
              Reporter:
              Brock Noland
            • Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development