Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.8.0
    • Component/s: None
    • Labels: None
    • Attachments:
      1. kafka-571-v3.patch (242 kB, John Fung)
      2. zkclient-0.1.jar (61 kB, John Fung)
      3. kafka-perf-0.7.0.jar (52 kB, John Fung)
      4. kafka-0.7.0.jar (1.25 MB, John Fung)
      5. kafka-571-v2.patch (237 kB, John Fung)
      6. kafka-571-SimpleConsumerShell.patch (0.7 kB, Jun Rao)
      7. kafka-571-v1.patch (236 kB, John Fung)

        Activity

        John Fung created issue -
        John Fung made changes -
        Field Original Value New Value
        Fix Version/s 0.8 [ 12317244 ]
        John Fung made changes -
        Assignee John Fung [ jfung ]
        John Fung made changes -
        Summary "Add more failure cases to System Test" → "Add more test cases to System Test"
        John Fung added a comment - - edited

        1. Leader Hard Failure (kill -SIGKILL) : testcase_0151 ~ 0158

        2. Controller Controlled Failure (kill -SIGTERM) : testcase_0201 ~ 0208

        3. Follower Controlled Failure (kill -SIGTERM) : testcase_0251 ~ 0258

        4. Leader GC Pause (kill -SIGSTOP / -SIGCONT) : testcase_0301 ~ 0308
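        The four failure types above differ only in which broker role is targeted and which signal is sent. A minimal sketch of the signal side, assuming the target broker's process ID is already known, might look like the following. The names FAILURE_SIGNALS and inject_failure are hypothetical and are not taken from the patch.

```python
import os
import signal
import time

# Hypothetical mapping from failure type to the signal(s) listed in the test plan.
FAILURE_SIGNALS = {
    "hard_failure": [signal.SIGKILL],              # kill -SIGKILL
    "controlled_failure": [signal.SIGTERM],        # kill -SIGTERM
    "gc_pause": [signal.SIGSTOP, signal.SIGCONT],  # kill -SIGSTOP, then -SIGCONT
}


def inject_failure(pid, failure_type, pause_seconds=1.0):
    """Send the signal(s) for failure_type to pid and return their names.

    When more than one signal is configured (the simulated GC pause),
    wait pause_seconds between consecutive signals.
    """
    signals = FAILURE_SIGNALS[failure_type]
    for index, sig in enumerate(signals):
        if index > 0:
            time.sleep(pause_seconds)
        os.kill(pid, sig)
    return [sig.name for sig in signals]
```

        A process stopped with SIGSTOP cannot respond to anything until it receives SIGCONT, which is why that pair is a reasonable stand-in for a long GC pause.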

        John Fung made changes -
        Attachment kafka-571-v1.patch [ 12550038 ]
        John Fung added a comment -

        Uploaded kafka-571-v1.patch with the following:

        1. Leader Hard Failure (kill -SIGKILL) : testcase_0151 ~ 0158

        2. Controller Controlled Failure (kill -SIGTERM) : testcase_0201 ~ 0208

        3. Follower Controlled Failure (kill -SIGTERM) : testcase_0251 ~ 0258

        4. Leader GC Pause (kill -SIGSTOP / -SIGCONT) : testcase_0301 ~ 0308

        5. Minor fix for SimpleConsumerShell (KAFKA-576)

        6. Using SimpleConsumerShell to validate data loss in each replica per topic. If the data count doesn't match across all replicas, the test case fails.

        7. Checksum matching across all replicas is now validated by merging the individual log segment files:

        i. Sort and merge the *.log files of each topic-partition into a single log segment file:

        .../system_test/mirror_maker_testsuite/testcase_5002/logs/broker-4/kafka_server_4_logs
          • test_1-0
            • 00000000000000000000.index
            • 00000000000000000000.log
            • 00000000000000000020.index
            • 00000000000000000020.log
            • . . .
          • test_1-1
            • 00000000000000000000.index
            • 00000000000000000000.log
            • 00000000000000000020.index
            • 00000000000000000020.log
            • . . .

        ii. Compute the checksum of each merged log segment and map it to its corresponding broker-topic-partition key:

        {
          'kafka_server_1_logs:tests_1-0': 'd41d8cd98f00b204e9800998ecf8427e',
          'kafka_server_1_logs:tests_1-1': 'd41d8cd98f00b204e9800998ecf8427e',
          'kafka_server_1_logs:tests_2-0': 'd41d8cd98f00b204e9800998ecf8427e',
          'kafka_server_1_logs:tests_2-1': 'd41d8cd98f00b204e9800998ecf8427e',
          'kafka_server_2_logs:tests_1-0': 'd41d8cd98f00b204e9800998ecf8427e',
          'kafka_server_2_logs:tests_1-1': 'd41d8cd98f00b204e9800998ecf8427e',
          'kafka_server_2_logs:tests_2-0': 'd41d8cd98f00b204e9800998ecf8427e',
          'kafka_server_2_logs:tests_2-1': 'd41d8cd98f00b204e9800998ecf8427e'
        }

        iii. Group the checksums by topic-partition and compare them for a match:

        {
          'test_1-0' : ['d41d8cd98f00b204e9800998ecf8427e','d41d8cd98f00b204e9800998ecf8427e'],
          'test_1-1' : ['d41d8cd98f00b204e9800998ecf8427e','d41d8cd98f00b204e9800998ecf8427e'],
          'test_2-0' : ['d41d8cd98f00b204e9800998ecf8427e','d41d8cd98f00b204e9800998ecf8427e'],
          'test_2-1' : ['d41d8cd98f00b204e9800998ecf8427e','d41d8cd98f00b204e9800998ecf8427e']
        }

        8. Because minor data loss is expected in leader failure cases when Acks == 1 (KAFKA-573), a test case now passes if the data loss is under 1% in the Acks == 1 cases.
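
        The checksum validation in item 7 (steps i to iii) can be sketched roughly as below. This is an illustration under stated assumptions, not the code from the patch: the function names are hypothetical, MD5 is assumed because the sample values are 32 hex digits, and segments are merged in sorted filename order, which matches offset order because the names are zero-padded.

```python
import hashlib
import os
from collections import defaultdict


def merged_checksum(partition_dir):
    """Step i: MD5 over all *.log segments in partition_dir, read in sorted filename order."""
    digest = hashlib.md5()
    for name in sorted(os.listdir(partition_dir)):
        if name.endswith(".log"):
            with open(os.path.join(partition_dir, name), "rb") as segment:
                # Read in 1 MiB chunks so large segments are not loaded into memory at once.
                for chunk in iter(lambda: segment.read(1 << 20), b""):
                    digest.update(chunk)
    return digest.hexdigest()


def checksums_by_broker(broker_log_dirs):
    """Step ii: map 'broker_dir_name:topic-partition' to the merged checksum."""
    result = {}
    for broker_dir in broker_log_dirs:
        broker = os.path.basename(os.path.normpath(broker_dir))
        for topic_partition in sorted(os.listdir(broker_dir)):
            partition_dir = os.path.join(broker_dir, topic_partition)
            if os.path.isdir(partition_dir):
                result[broker + ":" + topic_partition] = merged_checksum(partition_dir)
    return result


def mismatched_partitions(broker_checksums):
    """Step iii: group checksums by topic-partition and return those whose replicas differ."""
    grouped = defaultdict(list)
    for key, checksum in broker_checksums.items():
        _, topic_partition = key.split(":", 1)
        grouped[topic_partition].append(checksum)
    return sorted(tp for tp, sums in grouped.items() if len(set(sums)) > 1)
```

        Only *.log files are hashed in this sketch; the *.index files are skipped. A test case would pass the checksum check when mismatched_partitions returns an empty list.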

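        The pass criterion in item 8 amounts to a percentage check on missing messages. The sketch below is one possible reading of it, with assumptions labeled: messages are assumed to carry unique IDs, the helper names are hypothetical, and zero loss is assumed to be required when Acks is anything other than 1 (the comment above only states the rule for Acks == 1).

```python
def data_loss_percent(produced_ids, consumed_ids):
    """Percentage of produced message IDs that never show up on the consumer side."""
    produced = set(produced_ids)
    if not produced:
        return 0.0
    missing = produced - set(consumed_ids)
    return 100.0 * len(missing) / len(produced)


def passes_data_loss_check(produced_ids, consumed_ids, acks, max_loss_percent=1.0):
    """Apply the 'under 1%' rule when acks == 1; otherwise require no loss (assumption)."""
    loss = data_loss_percent(produced_ids, consumed_ids)
    if acks == 1:
        return loss < max_loss_percent
    return loss == 0.0
```

        The comparison is strict because the comment says "under 1%", so a loss of exactly 1% fails in this reading.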
        John Fung made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        John Fung made changes -
        Link This issue contains KAFKA-576 [ KAFKA-576 ]
        Jun Rao added a comment -

        Thanks for the patch. Some comments:

        1. I patched SimpleConsumerShell in a better way (attached).
        2. Could we remove all producer.properties since they are not really being used?
        3. Could we merge controller_testsuite and follower_testsuite into replication_testsuite? The only difference among them is which entity the failure is introduced on. We can make replication_testsuite a bit more general to handle that.

        Jun Rao made changes -
        Attachment kafka-571-SimpleConsumerShell.patch [ 12550101 ]
        John Fung made changes -
        Attachment kafka-571-v2.patch [ 12550233 ]
        John Fung made changes -
        Attachment kafka-0.7.0.jar [ 12550234 ]
        Attachment kafka-perf-0.7.0.jar [ 12550235 ]
        Attachment zkclient-0.1.jar [ 12550236 ]
        John Fung added a comment -

        Thanks Jun for reviewing. Uploaded kafka-571-v2.patch with the following changes:

        1. Removed producer.properties from <testsuite>/config
        2. Merged controller_testsuite and follower_testsuite => replication_testsuite.
        3. migration_tool_testsuite is now available, with the Kafka 0.7 jars contained inside the testsuite directory.
        4. Attached the jars required for migration_tool_testsuite, which belong at the following paths:
        system_test/migration_tool_testsuite/0.7/lib/kafka-0.7.0.jar
        system_test/migration_tool_testsuite/0.7/lib/kafka-perf-0.7.0.jar
        system_test/migration_tool_testsuite/0.7/lib/zkclient-0.1.jar

        Jun Rao added a comment -

        Thanks for patch v2. Looks good. One more comment.

        20. In migration_tool_testsuite/config, some of the configs are for 0.7 and some others are for 0.8. Could we rename them with the associated version number? Note that some of the configs have changed between 0.7 and 0.8.

        Could you also rebase and include the SimpleConsumerShell change in the final patch?

        John Fung made changes -
        Attachment kafka-571-v3.patch [ 12550678 ]
        John Fung added a comment -

        Thanks Jun for reviewing. Uploaded kafka-571-v3.patch with the following changes:

        20. The properties files under migration_tool_testsuite/config have been cleaned up as described in #21 below.

        21. Removed producer.properties, consumer.properties, producer_performance.properties, and console_consumer.properties from <testsuite>/config, as they are not used.

        22. Since the KAFKA-576 patch has been checked in, kafka-571-v3.patch has been rebased on top of it.

        23. Please also check in the following libraries for the migration tool test case:
        system_test/migration_tool_testsuite/0.7/lib/kafka-0.7.0.jar
        system_test/migration_tool_testsuite/0.7/lib/kafka-perf-0.7.0.jar
        system_test/migration_tool_testsuite/0.7/lib/zkclient-0.1.jar

        Jun Rao added a comment -

        Thanks for the patch. +1. Committed to 0.8 with the 0.7 jars.

        Jun Rao made changes -
        Status Patch Available [ 10002 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        Jun Rao made changes -
        Status Resolved [ 5 ] Closed [ 6 ]
        Transition                    Time In Source Status   Execution Times   Last Executer   Last Execution Date
        Open → Patch Available        6d 21h 2m               1                 John Fung       19/Oct/12 20:10
        Patch Available → Resolved    5d 1h 48m               1                 Jun Rao         24/Oct/12 21:58
        Resolved → Closed             10s                     1                 Jun Rao         24/Oct/12 21:58

          People

          • Assignee: John Fung
          • Reporter: John Fung
          • Votes: 0
          • Watchers: 2
