Flume / FLUME-1492

Create integration test for file channel

    Details

    • Type: Test
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: v1.3.0
    • Fix Version/s: v1.3.0
    • Component/s: Test
    • Labels:
      None
    1. FLUME-1492.patch
      10 kB
      Will McQueen
    2. FLUME-1492v2.patch
      11 kB
      Will McQueen

      Activity

      Will McQueen created issue -
      Denny Ye added a comment -

      Hi Will, it sounds good. What's your plan for this suggestion? Can the cope of integration overlay Source output and Sink input?

      Will McQueen made changes -
      Field Original Value New Value
      Attachment FLUME-1492.patch [ 12541442 ]
      Will McQueen added a comment -

      Hi Denny,

      I uploaded a patch that contains the test. The plan is to send a fixed number of events in a known sequence, and then confirm that the sink has received all of them and in the same order. The rolling file sink can be used for this. I'm not sure I understand your second question: "Can the cope of integration overlay Source output and Sink input?". Could you please clarify?

      Cheers,
      Will
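
A minimal sketch of the verification step described in this comment, in Java. It assumes the sink writes one event body per line to a file. The class name `OrderedDeliveryCheck`, the `event-N` payloads, and the temp file standing in for the sink's output are hypothetical and are not taken from the attached patch.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of FLUME-1492.patch. */
final class OrderedDeliveryCheck {

  /** Builds the known sequence a source would send: "event-0", "event-1", ... */
  static List<String> expectedSequence(int count) {
    List<String> events = new ArrayList<>();
    for (int i = 0; i < count; i++) {
      events.add("event-" + i);
    }
    return events;
  }

  /**
   * Returns the index of the first position where the sink output differs
   * from the expected sequence, or -1 if the two lists match exactly.
   */
  static int firstMismatch(List<String> expected, List<String> actual) {
    int common = Math.min(expected.size(), actual.size());
    for (int i = 0; i < common; i++) {
      if (!expected.get(i).equals(actual.get(i))) {
        return i;
      }
    }
    // A length difference means events were lost or duplicated at the tail.
    return expected.size() == actual.size() ? -1 : common;
  }

  public static void main(String[] args) throws IOException {
    List<String> expected = expectedSequence(100);

    // Assumption: this temp file stands in for the file a file-based sink writes.
    Path sinkOutput = Files.createTempFile("sink-output", ".txt");
    try {
      Files.write(sinkOutput, expected, StandardCharsets.UTF_8);
      List<String> actual = Files.readAllLines(sinkOutput, StandardCharsets.UTF_8);
      int mismatch = firstMismatch(expected, actual);
      System.out.println(mismatch == -1
          ? "all events received in order"
          : "first mismatch at index " + mismatch);
    } finally {
      Files.deleteIfExists(sinkOutput);
    }
  }
}
```

Reporting the first mismatching index, rather than a plain pass/fail, makes it easier to tell a reordering from a lost or duplicated event when a run fails.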

      Will McQueen added a comment - https://reviews.apache.org/r/6684/
      Denny Ye added a comment -

      Sorry for my spelling mistake. The second question is "Can the scope of integration overlay Source output and Sink input?". I have reviewed your patch, so I know the answer is yes.

      You simulate the regular event flow. What I am interested in is a failure of the file channel or of the sink during a batch operation (writing a batch of events to the file channel, or consuming a batch of events into the HDFS sink). I would like to confirm one basic point: a batch operation is contained in a single transaction. Either all events are written to the file channel successfully, or all events are rolled back on failure. The rollback should delete every event that the transaction has already written to the file channel.

      Will McQueen added a comment -

      Hi Denny,

      If I understand correctly, you would like to see an integration test that confirms the transactional nature of batches used by a source or sink, in the context of a file channel. Is that right? So some test cases might be:

      Part 1: Testing the transactional nature of the source and file channel
      ******

      Part 1-1
      ========
      1. Configure the file channel with a capacity of 10 events, a source, and no sink (we're staging the file channel with events here).
      2. The source sends 9 events to the file channel.
      3. The source then sends a batch request to put 2 events into the FC.

      The expectations are:
      1. The batch request sent by the source (containing the 2 events) should fail. For this test, ensure that the source does not attempt to retry the request.
      2. The channel at this point should contain only 9 events.

      Part 1-2:
      ========
      1. Reconfigure the agent so that the source is removed, and a sink is added.
      2. The sink takes all events from the file channel.

      The expectations for the sink are:
      1. The sink receives only 9 events.
      2. Those events are the same events that were sent by the source (same payload, same headers).
      3. Those events arrive in the same order in which the source sent them to the file channel.
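
The expectations of Part 1-1 and Part 1-2 can be sketched with a toy in-memory model, in Java. `BoundedBatchChannel` and its methods are hypothetical stand-ins invented for this illustration; they are not Flume's file channel API, and the sketch only models the all-or-nothing behavior the test would assert.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** Toy bounded channel for illustration; NOT Flume's FileChannel. */
final class BoundedBatchChannel {
  private final int capacity;
  private final Deque<String> queue = new ArrayDeque<>();

  BoundedBatchChannel(int capacity) {
    this.capacity = capacity;
  }

  /** Puts the whole batch or nothing: a batch that would overflow is rejected. */
  void putBatch(List<String> batch) {
    if (queue.size() + batch.size() > capacity) {
      throw new IllegalStateException("batch of " + batch.size()
          + " does not fit, free slots: " + (capacity - queue.size()));
    }
    queue.addAll(batch);
  }

  /** Drains every event in FIFO order, as the sink would in Part 1-2. */
  List<String> takeAll() {
    List<String> taken = new ArrayList<>(queue);
    queue.clear();
    return taken;
  }

  int size() {
    return queue.size();
  }

  public static void main(String[] args) {
    BoundedBatchChannel channel = new BoundedBatchChannel(10);

    // Part 1-1, step 2: stage 9 events one at a time.
    List<String> staged = new ArrayList<>();
    for (int i = 0; i < 9; i++) {
      staged.add("event-" + i);
      channel.putBatch(Arrays.asList("event-" + i));
    }

    // Part 1-1, step 3: a batch of 2 must fail as a unit, with no retry.
    boolean rejected = false;
    try {
      channel.putBatch(Arrays.asList("event-9", "event-10"));
    } catch (IllegalStateException e) {
      rejected = true;
    }
    System.out.println("batch rejected: " + rejected);
    System.out.println("events in channel: " + channel.size());

    // Part 1-2: the sink should see exactly the staged events, in order.
    System.out.println("order preserved: " + channel.takeAll().equals(staged));
  }
}
```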

      Part 2: Testing the transactional nature of the file channel and sink
      ******
      Briefly, the test could be to set the FC's capacity to 10 events, stage the file channel with 2 events, then have the sink attempt to send those 2 events in a batch but fail. This should result in the file channel still containing those same 2 events, in the same order. One way to verify this might be to set up 2 sinks in a failover group, where the 2 events are first sent to sink#1 (higher priority), which should fail and cause sink#2 to receive those same events and put them into a file (e.g., using the FILE_ROLL sink) so that the number of events, event payload, and event ordering can be verified in the file that's written out by sink#2.
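
The failover idea in Part 2 can be sketched as follows, in Java. Everything here is a hypothetical toy model: `FailoverTakeSketch`, its nested `Channel`, and the `deliver` helper are invented for illustration and do not use Flume's sink processor or channel APIs. The sinks are modeled as plain `Consumer` callbacks, with the second one collecting events into a list in place of a file.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;
import java.util.function.Consumer;

/** Toy failover model for illustration; NOT Flume's sink group implementation. */
final class FailoverTakeSketch {

  /** Toy channel: taken events stay in flight until commit() or rollback(). */
  static final class Channel {
    private final Deque<String> queue = new ArrayDeque<>();
    private final Deque<String> inFlight = new ArrayDeque<>();

    void put(String event) {
      queue.addLast(event);
    }

    List<String> takeBatch(int max) {
      List<String> batch = new ArrayList<>();
      while (batch.size() < max && !queue.isEmpty()) {
        String event = queue.pollFirst();
        inFlight.addLast(event);
        batch.add(event);
      }
      return batch;
    }

    void commit() {
      inFlight.clear();
    }

    /** Returns in-flight events to the head of the queue in their original order. */
    void rollback() {
      while (!inFlight.isEmpty()) {
        queue.addFirst(inFlight.pollLast());
      }
    }

    int size() {
      return queue.size();
    }
  }

  /** Tries sinks in priority order; a failing sink rolls back so the next one sees the same batch. */
  static boolean deliver(Channel channel, int batchSize,
                         List<Consumer<List<String>>> sinksByPriority) {
    for (Consumer<List<String>> sink : sinksByPriority) {
      List<String> batch = channel.takeBatch(batchSize);
      try {
        sink.accept(batch);
        channel.commit();
        return true;
      } catch (RuntimeException e) {
        channel.rollback();
      }
    }
    return false;
  }

  public static void main(String[] args) {
    Channel channel = new Channel();
    channel.put("event-0");
    channel.put("event-1");

    List<String> written = new ArrayList<>(); // stands in for sink#2's output file
    Consumer<List<String>> sink1 = batch -> {
      throw new IllegalStateException("sink#1 is down");
    };
    Consumer<List<String>> sink2 = written::addAll;

    boolean delivered = deliver(channel, 2, Arrays.asList(sink1, sink2));
    System.out.println("delivered: " + delivered);
    System.out.println("written by sink#2: " + written);
    System.out.println("left in channel: " + channel.size());
  }
}
```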

      Please let me know if this is what you had in mind. If so, I can open a separate ticket for these tests.

      Cheers,
      Will

      Will McQueen made changes -
      Attachment FLUME-1492v2.patch [ 12541647 ]
      Will McQueen added a comment -

      Attached a new patch. It addresses Hari's concerns from ReviewBoard.

      Denny Ye added a comment -

      Hi Will, the test plan mentioned above might look like this:

      Part 1-1
      ========
      The source wants to write 10 events to the FC. A file failure occurs in the FC after the source has already written 5 events. A rollback should happen.

      Expected result:
      The 5 events that were already recorded in the file should be deleted.

      Part 1-2
      ========
      The sink is delivering events downstream, or to HDFS asynchronously. All events retrieved from the FC are put into 'takeList'. A downstream failure in the sink should cause a rollback from takeList.

      Expected result:
      No lost events and no duplicated events.
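
A minimal sketch of the two expectations in this comment, in Java. It assumes a simplified transaction that stages puts in a `putList` and tracks takes in a `takeList`; the class `StagedTransactionSketch` is hypothetical and is not Flume's file channel implementation. In particular, the sketch keeps uncommitted puts in memory, whereas the comment describes events that were already recorded in a file.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy transaction model for illustration; NOT Flume's implementation. */
final class StagedTransactionSketch {
  private final List<String> committed = new ArrayList<>(); // stand-in for durable channel contents
  private final List<String> putList = new ArrayList<>();   // puts staged by the open transaction
  private final List<String> takeList = new ArrayList<>();  // takes not yet committed

  void put(String event) {
    putList.add(event);
  }

  /** Takes the oldest committed event, or returns null if none is available. */
  String take() {
    if (committed.isEmpty()) {
      return null;
    }
    String event = committed.remove(0);
    takeList.add(event);
    return event;
  }

  /** Makes staged puts visible and finalizes pending takes. */
  void commit() {
    committed.addAll(putList);
    putList.clear();
    takeList.clear();
  }

  /** Discards staged puts and restores pending takes in their original order. */
  void rollback() {
    putList.clear();
    committed.addAll(0, takeList);
    takeList.clear();
  }

  int size() {
    return committed.size();
  }

  public static void main(String[] args) {
    StagedTransactionSketch channel = new StagedTransactionSketch();

    // Part 1-1: failure after 5 of 10 puts, so the whole transaction rolls back.
    for (int i = 0; i < 5; i++) {
      channel.put("event-" + i);
    }
    channel.rollback(); // simulated file failure before events 5..9 are written
    System.out.println("events after failed put: " + channel.size());

    // Part 1-2: a downstream failure after 2 takes must not lose or duplicate events.
    for (int i = 0; i < 3; i++) {
      channel.put("event-" + i);
    }
    channel.commit();
    channel.take();
    channel.take();
    channel.rollback(); // simulated downstream failure

    List<String> delivered = new ArrayList<>();
    for (String e = channel.take(); e != null; e = channel.take()) {
      delivered.add(e);
    }
    channel.commit();
    System.out.println("delivered after retry: " + delivered);
  }
}
```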

      Will McQueen made changes -
      Status Open [ 1 ] Patch Available [ 10002 ]
      Mike Percy added a comment -

      Let's take these tests one at a time. This initial patch provides a foundation for more coverage going forward. Certainly we want additional coverage on the File Channel integration tests.

      Denny, I don't want to lose your thoughts here - I agree that we need coverage on the mid-transaction failure case if we don't already have it. So let's take that discussion to a new JIRA with that scope.

      Mike Percy added a comment -

      Patch committed. Thanks Will!

      Mike Percy made changes -
      Status Patch Available [ 10002 ] Resolved [ 5 ]
      Resolution Fixed [ 1 ]
      Hudson added a comment -

      Integrated in flume-trunk #289 (See https://builds.apache.org/job/flume-trunk/289/)
      FLUME-1492. Create integration test for file channel. (Revision fc5d3f684861b49e54e7e81a7d393232d10d9e0a)

      Result = SUCCESS
      mpercy : http://git-wip-us.apache.org/repos/asf/flume/repo?p=flume.git;a=summary&a=commit&h=fc5d3f684861b49e54e7e81a7d393232d10d9e0a
      Files :

      • flume-ng-tests/pom.xml
      • flume-ng-tests/src/test/java/org/apache/flume/test/agent/TestFileChannel.java
      • flume-ng-tests/src/test/java/org/apache/flume/test/util/StagedInstall.java
      Transition                  Time In Source Status  Execution Times  Last Executer  Last Execution Date
      Open -> Patch Available     4d 17h 53m             1                Will McQueen   21/Aug/12 21:26
      Patch Available -> Resolved 1d 11h 55m             1                Mike Percy     23/Aug/12 09:21

        People

        • Assignee:
          Will McQueen
          Reporter:
          Will McQueen
        • Votes:
          0
          Watchers:
          5
