Flume / FLUME-1502

Support for running simple configurations embedded in host process

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: v1.2.0
    • Fix Version/s: v1.4.0
    • Component/s: None
    • Labels: None

      Description

      Flume should provide a lightweight, embeddable node manager that can be started in-process. This would let users embed lightweight agents within a host process where necessary.

      1. embedded-agent-3.pdf
        111 kB
        Brock Noland
      2. FLUME-1502-3.patch
        65 kB
        Brock Noland
      3. FLUME-1502-1.patch
        65 kB
        Brock Noland
      4. FLUME-1502-0.patch
        65 kB
        Brock Noland
      5. embeeded-agent-2.pdf
        112 kB
        Brock Noland
      6. embeeded-agent-1.pdf
        110 kB
        Brock Noland

        Issue Links

          Activity

          Arvind Prabhakar created issue -
          Ralph Goers added a comment -

          I recently completed embedding Flume into Log4j 2 and can provide some of the pain points.

          1. Log4j 2 uses XML or JSON, while Flume cannot work without Properties. The solution I came up with was to use dynamically constructed properties.
          2. Flume's configuration cannot be separate from Log4j. If Flume were to detect a change in its configuration, it would shut everything down and restart. This would cause the application interacting with the source to fail. I solved this by requiring that Flume be configured as part of Log4j. If Log4j is reconfigured in such a way that the agent configuration changes, Log4j will start a new agent, connect that with the new FlumeAppender instance, and then cause Loggers to use the new configuration once it is fully instantiated. The previous agent will then be shut down.
          3. I ended up replacing ConfigurationProvider with FlumeConfigurationBuilder. It exposes a load method that accepts the agent name, the Properties, and the NodeManager and returns a NodeConfiguration.

          The source for this is at https://svn.apache.org/repos/asf/logging/log4j/log4j2/trunk/flume-ng/src/main/java/org/apache/logging/log4j/flume/appender/. In particular, FlumeEmbeddedManager.java and FlumeConfigurationBuilder.java do most of the work.
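
A minimal sketch of the "dynamically constructed properties" idea from item 1, assuming the host application has already parsed its own XML or JSON into nested maps. The class name FlumePropertiesSketch, the agent name a1, and the sink name s1 are hypothetical; only the `agent.kind.name.key` layout follows Flume's properties format. This is not the code in FlumeConfigurationBuilder.java.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

/** Builds Flume-style agent properties in memory instead of reading a .properties file. */
final class FlumePropertiesSketch {

    /**
     * Flattens per-component settings into "agent.kind.name.key = value" entries.
     * The component names passed in are illustrative, not taken from the issue.
     */
    static Properties build(String agent, String kind, Map<String, Map<String, String>> components) {
        Properties props = new Properties();
        // e.g. "a1.sinks = s1 s2" lists the active components of this kind
        props.setProperty(agent + "." + kind, String.join(" ", components.keySet()));
        for (Map.Entry<String, Map<String, String>> component : components.entrySet()) {
            String prefix = agent + "." + kind + "." + component.getKey() + ".";
            for (Map.Entry<String, String> setting : component.getValue().entrySet()) {
                props.setProperty(prefix + setting.getKey(), setting.getValue());
            }
        }
        return props;
    }

    public static void main(String[] args) {
        Map<String, String> sink = new LinkedHashMap<>();
        sink.put("type", "avro");
        sink.put("hostname", "192.168.10.101");
        sink.put("port", "8800");
        Map<String, Map<String, String>> sinks = new LinkedHashMap<>();
        sinks.put("s1", sink);
        build("a1", "sinks", sinks).forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

The resulting Properties object can then be handed to whatever configuration entry point the embedding code uses, so no file on disk is needed.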

          Brock Noland added a comment -

          Thanks Ralph! I'd like to take a look at this.

          Brock Noland made changes -
          Field Original Value New Value
          Assignee Brock Noland [ brocknoland ]
          Brock Noland made changes -
          Link This issue relates to FLUME-966 [ FLUME-966 ]
          Brock Noland added a comment -

          This is in some ways related to FLUME-966 but neither is exclusive.

          Brock Noland added a comment -

          Attached is Rev1 of the design document for the embedded agent. Please share any feedback you have via this JIRA.

          Brock Noland made changes -
          Attachment embeeded-agent-1.pdf [ 12551922 ]
          Ralph Goers added a comment - - edited

          I've reviewed the document and do have a few comments.
          1. The configuration approach seems fine and the Lifecycle events seem appropriate.
          2. The restrictions on not reconfiguring the agent are appropriate.
          3. Limiting support to not include the FileChannel is problematic. Without the FileChannel guaranteed delivery is lost as the application receives control back before the event makes it to a persistent destination. As such, the embedded agent is little more than a fancier version of the AsynchAppender.
          4. It isn't clear if the configuration is limited to a single sink. If it is, then the limitations of item 3 are even worse as the Agent can't even failover and events are guaranteed to be lost.

          Brock Noland added a comment -

          Ralph,

          Thank you very much for providing feedback so quickly.

          How about we change the proposal to allow multiple sinks? AvroSinks are fairly lightweight, so I'd prefer to allow them over embedding a more complex file channel. Would that alleviate your concerns?

          Brock

          Ralph Goers added a comment -

          Only partially. In my applications there are many events where the application must be sure that the audit event has reached a point where delivery is guaranteed before it can attempt to perform the action being audited. With the Memory Channel the event will be accepted and the application can continue even though the event may not actually ever be delivered and will be lost if the JVM goes down. My understanding is that currently the only channel that provides sufficient guarantees is the File Channel.

          Arvind Prabhakar added a comment -

          @Brock, thanks for the design document. On the point of File Channel, I do feel that it is important to have that support to ensure that we do not put excessive strain on memory for the host process, and that we do not lose events in the case of host process failure.

          Another point to consider is whether the source would be any different from a regular source when running in embedded mode. For example, does it make sense to have an embedded agent with a network source like Avro working on it? It may instead make sense to have no source support, but a direct pass-through for the client API that talks directly with the channel in question.

          Brock Noland added a comment - - edited

          Hi guys,

          I will add FileChannel as one of the two Channel choices of the embedded agent. I am not opposed to embedding a file channel, I just want this first iteration to be as simple as possible until we see how people are using the agent.

          In the design doc I state that users will have to use the RPCClient to talk to the source despite this being the same JVM. I did that because the RPC client is well tested and as such didn't require us to create an additional embedded-only source. Creating one would require more code because the embedded agent would have to hold a reference to the channel, whereas if we use the Avro source we re-use the same bootstrap logic we have today. As such, I'd prefer to require the use of the RPCClient, but I am not tied to this direction. We could of course put in some syntactic sugar so users don't have to create the RPCClient themselves. The embedded agent could take care of that for them and just expose a put() or putBatch() method.

          Mike Percy added a comment -

          Agreed that we will need File Channel. A common problem is an application using the client SDK library needing to buffer its own events. File Channel would alleviate that in a durable way.

          Another thing to consider is skipping the source altogether and allowing an interface to the Channel such that an application could open a Transaction, put() events on the channel, and commit()/close() the transaction. This would make the so-called embedded agent basically a glorified client. But that's the use case I think this is morphing into. In such a case I think we should consider disallowing take() calls, but that's a secondary point. Thoughts?

          Ralph Goers added a comment -

          Arvind, it would be hard for me to disagree with your point of view since it matches exactly with how the embedded agent is currently implemented in Log4j 2. The current incarnation allows any channels or sinks, but all channels are automatically connected to the source (which is really the appender itself).

          As for simplicity, my intention with Log4j 2 is to have the configuration end up looking like

          <Flume name="eventLogger" suppressExceptions="false" compress="true" embedded="true" dataDir="${sys:flumeDir}">
            <Agent host="192.168.10.101" port="8800"/>
            <Agent host="192.168.10.102" port="8800"/>
            <RFC5424Layout enterpriseNumber="18060" includeMDC="true" appName="MyApp"/>
          </Flume>

          and then maybe something to configure encryption. Although Log4j 2 will support configuration by Flume properties, my guess is that most people would prefer the default configuration. If they don't want the FileChannel it would be fairly simple to add channel="Memory|File" as an attribute.

          If you look at the embedded Appender you will see that it creates the FlumeEvent (using code common to both the embedded and non-embedded versions) and then does

          public void send(FlumeEvent event) {
              // Count the event as received before handing it to the channel(s).
              sourceCounter.incrementAppendReceivedCount();
              sourceCounter.incrementEventReceivedCount();
              try {
                  getChannelProcessor().processEvent(event);
              } catch (ChannelException ex) {
                  logger.warn("Unable to process event " + event, ex);
                  throw ex;
              }
              // Only reached when the channel accepted the event.
              sourceCounter.incrementAppendAcceptedCount();
              sourceCounter.incrementEventAcceptedCount();
          }

          in a class that extends AbstractSource. The only real challenge I had was in getting it into the Flume configuration so Flume could create the Source object and then obtaining a reference to the object so the Appender could call it. To do that I had to do

          SourceRunner runner = node.getConfiguration().getSourceRunners().get(SOURCE_NAME);
          if (runner == null || runner.getSource() == null) {
              throw new IllegalStateException("No Source has been created for Appender " + shortName);
          }
          source = (Log4jEventSource) runner.getSource();

          It would be much better if I could pass the Source class to the configuration processor and get back a Source instance when the embedded agent is started. In this case I would recommend that the source implement something like

          public interface EmbeddedSource {
              void send(FlumeEvent event);
          }

          OTOH, you could just provide an EmbeddedSource that implements send() as shown above.

          Mike Percy added a comment -

          Heh, Arvind, clearly I should read more slowly since I think what I said exactly matches what you said.

          Ralph, not sure why you are using a source in this case, other than that's all you could get access to, which is understandable. I believe what you really wanted all along was a channel connected to a sink, right? Also, worth noting that the File channel without multiple puts per transaction is dog slow, due to the fsync() call. So we definitely need to expose some type of batch interface.
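
To make the batching point concrete, here is a toy model that only counts simulated syncs. It assumes, as stated above, that a durable channel pays one fsync() per committed transaction regardless of how many events the transaction holds. SyncCountingChannel is a hypothetical class, not the File Channel, and no real disk I/O happens.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Toy durable channel: every commit costs one simulated fsync, however many events it holds. */
final class SyncCountingChannel {
    private final List<String> committed = new ArrayList<>();
    private int syncCount = 0;

    /** Commits the whole batch as one transaction, which means one simulated fsync. */
    void putBatch(List<String> events) {
        committed.addAll(events);
        syncCount++; // stands in for the fsync() issued on commit
    }

    /** A single put is a batch of one, so it pays the full sync cost by itself. */
    void put(String event) {
        putBatch(Collections.singletonList(event));
    }

    int syncCount() { return syncCount; }

    int size() { return committed.size(); }

    public static void main(String[] args) {
        SyncCountingChannel single = new SyncCountingChannel();
        SyncCountingChannel batched = new SyncCountingChannel();
        List<String> batch = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            single.put("event-" + i);
            batch.add("event-" + i);
            if (batch.size() == 100) { // flush every 100 events
                batched.putBatch(batch);
                batch.clear();
            }
        }
        System.out.println("single puts:  " + single.syncCount() + " syncs");
        System.out.println("batched puts: " + batched.syncCount() + " syncs");
    }
}
```

With 1000 events, the single-put path performs 1000 simulated syncs and the 100-event batches perform 10, which is why a batch interface matters for a sync-on-commit channel.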

          Ralph Goers added a comment -

          Yes, what is really required is the call to getChannelProcessor().processEvent(event). It didn't occur to me to see how to locate the ChannelProcessor(s) instead of a Source.

          As for the FileChannel being slow - our tests showed the non-embedded agent (i.e. Avro) takes just less than .1 seconds per event while the embedded agent takes between .015 and .001 seconds per event (interestingly, it got faster as more events were written). I would expect this is fine for a single container. That said, I have nothing against a simpler implementation of channel with guaranteed delivery.
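
For a sense of scale, the per-event latencies quoted above can be converted into throughput. This sketch assumes a single writer that handles one event at a time, which the comment does not state; ThroughputSketch is a hypothetical helper.

```java
/** Converts the per-event latencies quoted above into events per second. */
final class ThroughputSketch {

    /** Events per second for a single writer that handles one event at a time. */
    static double eventsPerSecond(double secondsPerEvent) {
        if (secondsPerEvent <= 0) {
            throw new IllegalArgumentException("latency must be positive");
        }
        return 1.0 / secondsPerEvent;
    }

    public static void main(String[] args) {
        double[] latencies = {0.1, 0.015, 0.001}; // seconds per event, from the comment above
        for (double latency : latencies) {
            System.out.printf("%.3f s/event -> %.1f events/s%n", latency, eventsPerSecond(latency));
        }
    }
}
```

Under that assumption, the non-embedded path works out to a little over 10 events per second, and the embedded path to somewhere between roughly 67 and 1000 events per second.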

          Brock Noland added a comment -

          After thinking about it last night, I am +1 on the pass-through source. It's stupid to use TCP/IP for intra-thread communication even if it makes the implementation simpler.

          However, I am -1 on exposing Transaction to clients. It's a fairly complex interface to use and is full of gotchas. For example, with FileChannel, if a thread fails to call rollback, data can be "delayed" until restart. I think we should only expose put and putBatch (or send and sendBatch) via the embedded source. This also further solidifies the one-channel-per-embedded-agent design, unless we put a channel identifier on the put method, which I think is a mistake since multiple embedded agents can be created.

          Additionally, I think the embedded agent will need to be re-configurable due to the MemoryChannel.
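
A minimal sketch of the put/putBatch idea, assuming the facade owns the whole transaction lifecycle so that callers never see begin, commit, or rollback. ToyChannel and EmbeddedFacadeSketch are hypothetical stand-ins written for this illustration; they are not the Flume Channel or Transaction interfaces, and this is not the patch attached to this issue.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for a transactional channel with a fixed capacity. */
final class ToyChannel {
    private final List<String> committed = new ArrayList<>();
    private final int capacity;
    private List<String> pending; // non-null only while a transaction is open

    ToyChannel(int capacity) { this.capacity = capacity; }

    void begin() { pending = new ArrayList<>(); }

    void put(String event) {
        if (committed.size() + pending.size() >= capacity) {
            throw new IllegalStateException("channel full");
        }
        pending.add(event);
    }

    void commit() { committed.addAll(pending); pending = null; }

    void rollback() { pending = null; } // discard everything put in this transaction

    List<String> committed() { return Collections.unmodifiableList(committed); }
}

/** Facade exposing only put and putBatch, so callers cannot forget a rollback. */
final class EmbeddedFacadeSketch {
    private final ToyChannel channel;

    EmbeddedFacadeSketch(ToyChannel channel) { this.channel = channel; }

    void put(String event) { putBatch(Collections.singletonList(event)); }

    /** Either every event in the batch is committed, or none is. */
    void putBatch(List<String> events) {
        channel.begin();
        try {
            for (String event : events) {
                channel.put(event);
            }
            channel.commit();
        } catch (RuntimeException ex) {
            channel.rollback(); // the facade always cleans up the failed transaction
            throw ex;
        }
    }
}
```

Because the rollback lives in one place, a failed batch leaves the channel exactly as it was, and the caller only has to handle the exception.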

          Ralph Goers added a comment -

          Reconfiguring is not a good idea, as the way Flume reconfigures is essentially a shutdown and restart. Doing this to an embedded agent will cause all kinds of problems. Since the application Flume is embedded in will likely be managing the configuration, delegate the reconfiguration handling to it.

          Brock Noland added a comment -

          I agree that re-configuring is not a good idea, but I don't see a way around it. In the current design, if someone wants to change the capacity of the memory channel, they will have to shut down the current agent and create a new one, losing all the data in the memory channel.

          Ralph Goers added a comment -

          That may be a case where you could dynamically change the memory channel setting and get away with it, but more often you can't. What you really want is to
          a) start the new agent.
          b) "drain" the old agent.
          c) shut down the old agent.

          The key is step b.

          Why would that not be valuable even in a standalone agent?
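
Steps a) to c) can be sketched as follows. ToyAgent and AgentSwapSketch are hypothetical classes written for this illustration, and drain() here simply moves buffered events to an in-memory list. The sketch is single-threaded: a put that is already in flight on the old agent while the swap happens is not handled.

```java
import java.util.ArrayDeque;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical agent: buffers events and delivers them when drained. */
final class ToyAgent {
    private final String name;
    private final Queue<String> buffer = new ArrayDeque<>();
    private final List<String> delivered;
    private boolean running;

    ToyAgent(String name, List<String> delivered) {
        this.name = name;
        this.delivered = delivered;
    }

    void start() { running = true; }

    void put(String event) {
        if (!running) {
            throw new IllegalStateException(name + " is stopped");
        }
        buffer.add(event);
    }

    /** Pushes everything still buffered to the destination. */
    void drain() {
        while (!buffer.isEmpty()) {
            delivered.add(buffer.poll());
        }
    }

    void stop() { running = false; }

    int buffered() { return buffer.size(); }
}

/** Swaps agents in the a) start, b) drain, c) shut down order described above. */
final class AgentSwapSketch {
    private final AtomicReference<ToyAgent> current = new AtomicReference<>();

    void put(String event) { current.get().put(event); }

    void reconfigure(ToyAgent replacement) {
        replacement.start();                           // a) start the new agent
        ToyAgent old = current.getAndSet(replacement); // new puts now reach the replacement
        if (old != null) {
            old.drain();                               // b) drain the old agent
            old.stop();                                // c) shut down the old agent
        }
    }
}
```

The first reconfigure() call installs the initial agent; later calls replace it without dropping what the previous agent had buffered.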

          Brock Noland added a comment -

          Sorry, I didn't mean dynamically change the memory channel setting. I meant shut down the channel and sink, create them again, and restart them, the same way that Flume works today. However, the way it works today is that the instances are cached, and as such you are re-configuring the same object; this is due to the memory channel. This changes in FLUME-1630 for sinks and sources.

          As opposed to introducing a new drain concept I think it's best to use the same mechanism we have used before which is clarified in FLUME-1630 through the use of annotations on the channel.

          Mike Percy added a comment -

          I agree that people would want some interface to trigger a reconfiguration. Not to confuse this with the specifics of the log4j2 integration, but in the case of log4j it's possible to trigger an app to re-read log4j.properties. It would likely be a desired feature of Flume to re-read a flume.conf file or something based on some trigger... maybe just an API hook, and the application decides how to trigger it, like a reconfigure() call.

          I think people may also want to have a flume.conf file controlling the configuration, but it may be difficult / confusing when trying to enforce different constraints on it, such as only one channel, etc. So I'd be OK with leaving that out of the initial implementation, as long as we provide room to grow into it.

          Mike Percy added a comment -

          Also, I think a load balancing avro sink needs to be possible. So we may need to support Sink groups of Avro sinks... or somehow leverage the load balancing RPC client?

          A use case where someone would want to update their Flume configuration would be if a downstream server goes down and they want to remove it from their config to avoid exceptions. Or they might want to add one to the list.
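
As an illustration of the load-balancing and "downstream server goes down" cases, here is a toy round-robin selector that skips hosts marked as down. RoundRobinSketch is hypothetical and is not Flume's sink group or load-balancing RPC client code; the host strings reuse the addresses from the Log4j 2 example earlier in this thread.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Round-robin selection over downstream hosts that skips hosts marked as down. */
final class RoundRobinSketch {
    private final List<String> hosts;
    private final Set<String> down = new HashSet<>();
    private int next = 0;

    RoundRobinSketch(List<String> hosts) { this.hosts = hosts; }

    void markDown(String host) { down.add(host); }

    void markUp(String host) { down.remove(host); }

    /** Returns the next live host, or throws if every host is down. */
    String select() {
        for (int tried = 0; tried < hosts.size(); tried++) {
            String candidate = hosts.get(next);
            next = (next + 1) % hosts.size();
            if (!down.contains(candidate)) {
                return candidate;
            }
        }
        throw new IllegalStateException("no downstream host available");
    }
}
```

For example, with hosts 192.168.10.101:8800 and 192.168.10.102:8800, calls to select() alternate between the two until one is marked down, after which every call returns the remaining host.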

          Brock Noland added a comment -

          Attached is the design document, updated based on the feedback received on this JIRA.

          Brock Noland made changes -
          Attachment embeeded-agent-2.pdf [ 12553199 ]
          Ralph Goers added a comment -

          I really don't know how you could make this any better.

          Mike Percy added a comment -

          Brock, +1 on this proposal. Sounds great to me!

          Hari Shreedharan added a comment -

          +1 from me too. Good work Brock!

          Brock Noland added a comment -

          Great! It'd be awesome if someone could give me a +1 or feedback on FLUME-1630 since I will be building a patch based on that. It's not absolutely necessary but if there is feedback it will require re-work here.

          Brock Noland added a comment -

          Just a quick update, I am really close to posting a patch. I could probably put it up now but I want to give it a last review with fresh eyes.

          Brock Noland made changes -
          Remote Link This issue links to "Review Board (Web Link)" [ 11504 ]
          Brock Noland added a comment -

          Patch rev 0 attached

          Brock Noland made changes -
          Attachment FLUME-1502-0.patch [ 12554222 ]
          Brock Noland made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Fix Version/s v1.4.0 [ 12323372 ]
          Mike Percy added a comment - - edited

          Hi Brock, I've been looking at this patch, and overall it looks great! The design doc says that AvroSource should be supported, but with the LocalSource I don't see the benefit to doing that.

          What do you think about removing AvroSource support from the embedded agent?

          Brock Noland added a comment -

          The use case I can see for keeping Avro Source is lets say you had an application server running 5 applications. You'd like them to all log to an agent. You could have the application server run a single embedded agent application that all the other applications could log to.

          The opposing viewpoint is that each application can simply run its own embedded agent, or, IMHO, that this is really a use case for a traditional local agent.

          Thoughts?

          Mike Percy added a comment -

          Personally, I think it's a use case for a traditional local agent, as you say. Or even a remote one. If we can get rid of the Avro source support, then that's one less thread pool to worry about, and basically it turns this thing into a super-client. Seems to me like that is what most people want with this.

          Brock Noland added a comment -

          I agree, let's strike AvroSource. This eliminates a good number of changes and an interface, which is always good.
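
          For illustration only, here is a rough sketch of what the "super-client" shape discussed above could look like from the host application's side: the only source is the in-process embedded one, events go through a channel, and Avro sinks forward them to collectors. The class `EmbeddedAgentConfigSketch`, the method `buildProperties`, and the collector host names are hypothetical and not part of the patch. The property keys and the `EmbeddedAgent` calls shown in comments follow the embedded agent example in the Flume Developer Guide as best recalled, so check them against the Flume version in use.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical helper (not part of Flume): builds the property map that an
 * embedded agent would be configured with once AvroSource support is dropped.
 */
final class EmbeddedAgentConfigSketch {

    /** Each collector is given as "host:port"; one Avro sink is created per collector. */
    static Map<String, String> buildProperties(String... collectors) {
        Map<String, String> props = new LinkedHashMap<>();
        // The in-process source is the only source; there is no network listener.
        props.put("source.type", "embedded");
        props.put("channel.type", "memory");
        props.put("channel.capacity", "200");

        StringBuilder sinkNames = new StringBuilder();
        for (int i = 0; i < collectors.length; i++) {
            String[] hostPort = collectors[i].split(":", 2);
            if (hostPort.length != 2) {
                throw new IllegalArgumentException("expected host:port, got " + collectors[i]);
            }
            String sink = "sink" + (i + 1);
            if (i > 0) {
                sinkNames.append(' ');
            }
            sinkNames.append(sink);
            props.put(sink + ".type", "avro");
            props.put(sink + ".hostname", hostPort[0]);
            props.put(sink + ".port", hostPort[1]);
        }
        props.put("sinks", sinkNames.toString());
        // Spread events across collectors when there is more than one sink.
        props.put("processor.type", collectors.length > 1 ? "load_balance" : "default");
        return props;
    }

    public static void main(String[] args) {
        Map<String, String> props =
            buildProperties("collector1.example.org:5564", "collector2.example.org:5565");
        props.forEach((key, value) -> System.out.println(key + " = " + value));

        // With flume-ng-embedded-agent on the classpath, the map would be used
        // roughly like this (left as comments so the sketch runs without Flume):
        //   EmbeddedAgent agent = new EmbeddedAgent("myagent");
        //   agent.configure(props);
        //   agent.start();
        //   agent.put(EventBuilder.withBody("hello".getBytes()));
        //   agent.stop();
    }
}
```

          The host process hands events to the agent directly through `put`, which is why no Avro source, and therefore no extra server thread pool, is needed inside the embedded agent.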

          Brock Noland added a comment -

          Latest patch from RB

          Brock Noland made changes -
          Attachment FLUME-1502-1.patch [ 12560265 ]
          Brock Noland added a comment -

          Latest patch

          Brock Noland made changes -
          Attachment FLUME-1502-3.patch [ 12560452 ]
          Mike Percy made changes -
          Issue Type Improvement [ 4 ] New Feature [ 2 ]
          Mike Percy added a comment -

          Patch committed. Thank you for the patch, Brock!

          Trunk rev: 822e120aececa479cf9f5178d836cdc55852e23b

          Mike Percy made changes -
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Resolution Fixed [ 1 ]
          Hudson added a comment -

          Integrated in flume-trunk #339 (See https://builds.apache.org/job/flume-trunk/339/)
          FLUME-1502. Support for running simple configurations embedded in host process. (Revision 822e120aececa479cf9f5178d836cdc55852e23b)

          Result = SUCCESS
          mpercy : http://git-wip-us.apache.org/repos/asf/flume/repo?p=flume.git&a=commit&h=822e120aececa479cf9f5178d836cdc55852e23b
          Files :

          • flume-ng-embedded-agent/src/test/java/org/apache/flume/agent/embedded/TestEmbeddedAgentEmbeddedSource.java
          • flume-ng-embedded-agent/src/main/java/org/apache/flume/agent/embedded/MaterializedConfigurationProvider.java
          • flume-ng-embedded-agent/src/test/java/org/apache/flume/agent/embedded/TestEmbeddedAgentState.java
          • flume-ng-configuration/src/main/java/org/apache/flume/conf/FlumeConfiguration.java
          • flume-ng-embedded-agent/src/test/resources/log4j.properties
          • flume-ng-doc/sphinx/FlumeDeveloperGuide.rst
          • pom.xml
          • flume-ng-embedded-agent/src/main/java/org/apache/flume/agent/embedded/EmbeddedAgentConfiguration.java
          • flume-ng-embedded-agent/src/test/java/org/apache/flume/agent/embedded/TestEmbeddedAgent.java
          • flume-ng-embedded-agent/pom.xml
          • flume-ng-node/src/main/java/org/apache/flume/node/MaterializedConfiguration.java
          • flume-ng-embedded-agent/src/main/java/org/apache/flume/agent/embedded/package-info.java
          • flume-ng-embedded-agent/src/test/java/org/apache/flume/agent/embedded/TestEmbeddedAgentConfiguration.java
          • flume-ng-embedded-agent/src/main/java/org/apache/flume/agent/embedded/MemoryConfigurationProvider.java
          • flume-ng-embedded-agent/src/main/java/org/apache/flume/agent/embedded/EmbeddedSource.java
          • flume-ng-embedded-agent/src/main/java/org/apache/flume/agent/embedded/EmbeddedAgent.java
          • flume-ng-core/src/main/java/org/apache/flume/sink/SinkProcessorFactory.java
          • flume-ng-dist/pom.xml
          Brock Noland added a comment -

          Thanks for committing this, Mike! Attached is the design document, which matches the implementation.

          Brock Noland made changes -
          Attachment embedded-agent-3.pdf [ 12560587 ]

            People

            • Assignee:
              Brock Noland
              Reporter:
              Arvind Prabhakar
            • Votes:
              0
              Watchers:
              8