Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.7.0, 0.8.0
    • Fix Version/s: 0.9.0
    • Component/s: test
    • Labels: None

      Description

      Evaluate the current integration test suite under samza-test to determine what it covers, what works, and what does not, and fix it where necessary.

      Attachments

      1. SAMZA-394-0.patch (55 kB, Navina Ramesh)
      2. SAMZA-394-1.patch (10 kB, Chris Riccomini)
      3. SAMZA-394-2.patch (55 kB, Chris Riccomini)
      4. SAMZA-394-3.patch (55 kB, Chris Riccomini)

          Activity

          criccomini Chris Riccomini added a comment -

          Let's disentangle this from the Zopkio migration. Let's get the patch from SAMZA-14 working, and commit it here. We can then open a follow-on ticket to migrate the test to Zopkio.

          navina Navina Ramesh added a comment -

          Chris Riccomini, please try this patch on your machine and on a Linux box to make sure everything works.

          criccomini Chris Riccomini added a comment -

          Attaching an update of Navina Ramesh's patch with a few tweaks to make the Python script work with --yarn-host, yarn-kill, and --kill-time.

          Other notes that aren't addressed in attached patch:

          1. Add note to README that this is meant to be used with hello-samza. DEPLOY_DIR is samza-hello-samza/deploy.
          2. Need to have paramiko installed (pip install paramiko).
          3. Running samza-test/src/main/python/samza_failure_testing.py without --help gives an index out of range error.
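The index-out-of-range error in note 3 is the kind of failure a guard on an empty argument list avoids. The following is a minimal sketch only, not the code in samza_failure_testing.py: the option names are copied from the example invocation in this comment, while `build_parser` and `parse_args` are hypothetical helper names and the use of argparse is an assumption.

```python
import argparse


def build_parser():
    # Option names mirror the example invocation in this ticket; the real
    # script may define them differently.
    parser = argparse.ArgumentParser(description="Failure-testing CLI sketch")
    parser.add_argument("--node-list")
    parser.add_argument("--kill-time", type=int)
    parser.add_argument("--kafka-dir")
    parser.add_argument("--kafka-host")
    parser.add_argument("--yarn-dir")
    parser.add_argument("--yarn-host")
    parser.add_argument("--kill-kafka", action="store_true")
    parser.add_argument("--kill-container", action="store_true")
    return parser


def parse_args(argv):
    """Return parsed options, or None after printing help when argv is empty."""
    parser = build_parser()
    if not argv:
        # Print usage instead of indexing into an empty argument list.
        parser.print_help()
        return None
    return parser.parse_args(argv)
```

A caller would pass `sys.argv[1:]` and exit early when the result is None.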

          An example kill execution is:

            python samza-test/src/main/python/samza_failure_testing.py \
              --node-list=/tmp/node_list.txt \
              --kill-time=60 \
              --kafka-dir=/Users/criccomi/Code/samza-hello-samza/deploy/kafka \
              --kafka-host=localhost \
              --yarn-dir=/Users/criccomi/Code/samza-hello-samza/deploy/yarn \
              --yarn-host=localhost \
              --kill-kafka \
              --kill-container
          

          We should add that to the README as well.

          criccomini Chris Riccomini added a comment -

          My SAMZA-394-1.patch didn't include my full diff. Attaching a second patch that includes everything.

          criccomini Chris Riccomini added a comment -

          Attaching Navina Ramesh's patch with the doc updates that I described above.

          +1'ing Navina's work with my tweak. Will merge and commit.

          criccomini Chris Riccomini added a comment -

          Merged and committed, thanks!

          closeuris Yan Fang added a comment -

          Not sure if it's appropriate to comment here, but I ran the integration tests tonight and test_samza_job failed every time.

          Error message:
          [FetchRequest(topic='samza-test-topic-output', partition=0, offset=0, max_bytes=4096)]
          

          Log:

          2015-02-26 20:44:26 performance_tests [INFO] Loaded 800000 messages.
          2015-02-26 20:44:29 performance_tests [INFO] Loaded 900000 messages.
          2015-02-26 20:44:32 util [INFO] Starting tests.kafka-read-write-performance
          2015-02-26 20:44:33 util [INFO] Awaiting tests.kafka-read-write-performance
          2015-02-26 20:45:02 zopkio.deployer [ERROR] Log file /tmp/samza2/deploy/kafka/log-cleaner.log does not exist on localhost
          2015-02-26 20:45:02 naarad.utils [WARNING] /tmp/samza2/scripts/naarad.cfg : file does not exist.
          2015-02-26 20:45:02 naarad.utils [WARNING] /tmp/samza2/scripts/naarad.cfg : file does not exist.
          2015-02-26 20:45:02 naarad.utils [WARNING] /tmp/samza2/scripts/naarad.cfg : file does not exist.
          2015-02-26 20:45:02 naarad.utils [WARNING] /tmp/samza2/scripts/naarad.cfg : file does not exist.
          2015-02-26 20:45:03 smoke_tests [INFO] Running validate_samza_job
          2015-02-26 20:47:04 kafka [ERROR] Unable to receive data from Kafka
          Traceback (most recent call last):
            File "/tmp/samza2/samza-integration-tests/lib/python2.7/site-packages/kafka/conn.py", line 93, in _read_bytes
              data = self._sock.recv(min(bytes_left, 4096))
          timeout: timed out
          2015-02-26 20:47:04 kafka [WARNING] Could not receive response to request [00000053000100000000003d000c6b61666b612d707974686f6effffffff000493e00000000100000001001773616d7a612d746573742d746f7069632d6f75747075740000000100000000000000000000000000001000] from server <KafkaConnection host=10.0.0.16 port=9092>: Kafka @ 10.0.0.16:9092 went away
          2015-02-26 20:47:04 performance_tests [INFO] Running validate_kafka_read_write_performance
          2015-02-26 20:47:31 zopkio.remote_host_helper [INFO] stopping resourcemanager
          2015-02-26 20:47:33 zopkio.remote_host_helper [INFO] Stopping zookeeper ... STOPPED
          2015-02-26 20:47:39 zopkio.remote_host_helper [INFO] stopping nodemanager
          nodemanager did not stop gracefully after 5 seconds: killing with kill -9
          2015-02-26 20:47:42 zopkio.test_runner [INFO] Execution of configuration: single execution complete
          2015-02-26 20:47:42 zopkio.test_runner [INFO] test_samza_job----failed
          2015-02-26 20:47:42 zopkio.test_runner [INFO] ["FailedPayloadsError: [FetchRequest(topic='samza-test-topic-output', partition=0, offset=0, max_bytes=4096)]\n"]
          
          criccomini Chris Riccomini added a comment -

          Looks like negate-numbers.properties has a checkpoint manager added to it by mistake, and it didn't define the system to use. This was done as part of SAMZA-394, and I missed it. I'm opening a new ticket.
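A small pre-flight check can catch this class of misconfiguration before a test run. The sketch below is illustrative only: it assumes the relevant keys are `task.checkpoint.factory` and `task.checkpoint.system` (verify against the configuration reference for your Samza version), it parses only simple `key=value` lines, and `parse_properties` and `checkpoint_config_problems` are hypothetical helper names.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # or ! comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def checkpoint_config_problems(props):
    """Report a checkpoint manager factory that has no system configured."""
    problems = []
    # Key names are assumptions; check them against your Samza version.
    has_factory = bool(props.get("task.checkpoint.factory"))
    has_system = bool(props.get("task.checkpoint.system"))
    if has_factory and not has_system:
        problems.append(
            "task.checkpoint.factory is set but task.checkpoint.system is missing"
        )
    return problems
```

Running `checkpoint_config_problems(parse_properties(open(path).read()))` over each job's properties file would flag a factory with no system before the job is deployed.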


            People

            • Assignee: Navina Ramesh (navina)
            • Reporter: David Chen (davidzchen)
            • Votes: 0
            • Watchers: 4
