Details

    • Type: Test
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.9.0
    • Fix Version/s: 0.9.0
    • Component/s: test
    • Labels:
      None

      Description

      We don't have a standardized test framework for integration tests. This task explores Zopkio as a potential test framework and prototypes a simple functional integration test for Samza.

      1. SAMZA-468-1.patch
        55 kB
        Chris Riccomini
      2. SAMZA-468-2.patch
        56 kB
        Chris Riccomini
      3. SAMZA-468-3.patch
        40 kB
        Chris Riccomini
      4. SAMZA-468-4.patch
        40 kB
        Chris Riccomini
      5. SAMZA-468-5.patch
        57 kB
        Chris Riccomini
      6. SAMZA-468-6.patch
        58 kB
        Chris Riccomini
      7. SAMZA-468-7.patch
        58 kB
        Chris Riccomini
      8. SAMZA-468-8.patch
        63 kB
        Chris Riccomini
      9. SAMZA-INT-TEST.patch
        70 kB
        Navina Ramesh
      10. zopkio1.png
        124 kB
        Chris Riccomini
      11. zopkio2.png
        173 kB
        Chris Riccomini
      12. zopkio3.png
        129 kB
        Chris Riccomini
      13. zopkio4.png
        149 kB
        Chris Riccomini
      14. zopkio5.png
        141 kB
        Chris Riccomini

          Activity

          navina Navina Ramesh added a comment -

          Includes a README file that explains how to install and use the integration test with Zopkio

          martinkl Martin Kleppmann added a comment -

          Intriguing, but…

          1. Download & Install Zopkio [ https://github.com/linkedin/Zopkio ]

          Link returns 404, and Google doesn't bring up anything either. What is this? Please keep in mind that as an Apache project, all development should happen in the open.

          criccomini Chris Riccomini added a comment -

          Martin, you're right. Sorry, we jumped the gun on this a bit. We're in the process of open sourcing the Zopkio library right now (it's basically just some Python scripts to help with distributed systems testing). Once there's something to see, we'll follow up.

          navina Navina Ramesh added a comment -

          Sorry about the confusion, Martin. I should have opened this bug after Zopkio was open-sourced.

          criccomini Chris Riccomini added a comment - - edited

          Navina and I would like to propose that we use Zopkio for Samza's integration test framework. Zopkio is a basic set of scripts that can be used to help manage the lifecycle of tests for distributed systems such as Samza, Kafka, etc. Distributed systems typically start with a few convenient scripts to manage the setup/teardown of their integration tests. Problems with this approach are:

          1. The scripts quickly evolve into a mess of system-specific spaghetti code.
          2. There is no code sharing between projects that are doing similar distributed systems testing (log aggregation, performance testing, test lifecycle management, deployment, reporting), so a lot of work is duplicated.

          Zopkio is a library that's meant to address these problems, and a few others. The library also has nice integration with Naarad, which can be used to gather performance data about all of the tests being run (including integration with SAR, GC logs, etc). Rather than write yet-another system-specific integration test suite, Navina and I took a stab at implementing Samza's integration tests using Zopkio.

          While it's true that Zopkio is still a young framework, I'm aware of at least one other distributed system that's working on using it, and a second that will likely convert in the near future. Even if no one else converts, it seems ideal to leverage a library that already exists, rather than write our own.

          Please have a look and provide feedback. See the attached screenshots (zopkio1.png through zopkio5.png) to get a feel for what we get.

          Attaching patch. RB at:

          https://reviews.apache.org/r/29342/

          • Added bin/integration-tests.sh script to run integration tests.
          • Wrote a deployment script, which is responsible for starting up YARN, ZooKeeper, and Kafka, and installing all Samza test jobs.
          • Wrote a Samza job deployer for YARN. It manages the install/start/stop/uninstall lifecycle of Samza jobs in a YARN grid. Meant to be used with yarn.package.path values that are file:/// URIs.
          • Wrote a simple smoke test suite that runs a single job (NegateNumbers).
          • Updated contribute/tests.md and README.md to document integration tests.
          • Moved tests from SAMZA-14 into samza-test directory.
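The install/start/stop/uninstall lifecycle described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `LocalJobDeployer` and its method signatures are hypothetical, they are not Zopkio's deployer API and not the code in the patch, and the sketch only records state where a real deployer would run scripts against YARN.

```python
import os
import shutil
import tempfile


class LocalJobDeployer(object):
    """Hypothetical deployer that walks a job through install/start/stop/uninstall."""

    def __init__(self, install_root):
        self.install_root = install_root
        self.running = set()

    def install(self, job_id, package_dir):
        # Copy the job package into its own directory under install_root.
        target = os.path.join(self.install_root, job_id)
        shutil.copytree(package_dir, target)
        return target

    def start(self, job_id):
        # A real deployer would launch the job on YARN here; this sketch only records state.
        if not os.path.isdir(os.path.join(self.install_root, job_id)):
            raise RuntimeError("job %s is not installed" % job_id)
        self.running.add(job_id)

    def stop(self, job_id):
        # A real deployer would kill the YARN application here.
        self.running.discard(job_id)

    def uninstall(self, job_id):
        # Stop first so a running job is never left behind without its files.
        self.stop(job_id)
        shutil.rmtree(os.path.join(self.install_root, job_id), ignore_errors=True)


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    package = tempfile.mkdtemp()
    deployer = LocalJobDeployer(root)
    deployer.install("negate-number", package)
    deployer.start("negate-number")
    deployer.uninstall("negate-number")
```

Keeping the four steps as separate methods lets a test suite install every job once during suite setup and then start and stop individual jobs per test.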

          Notes:

          • Please download and try this. I hit some SSH key turbulence when I initially began. I'd like to make sure the scripts are as hardened as possible.
          • SAMZA-14 tests are not yet migrated or executed.
          • Integration tests are currently executed locally on the machine from which bin/integration-tests.sh was executed.

          Follow on JIRAs:

          1. Migrate existing integration tests (stateful task, join jobs).
          2. Hook Naarad up, so we get performance metrics (GC, etc).
          3. Use a proper Python package instead of cp'ing the raw scripts in bin/integration-tests.sh.
          4. Support multi-node integration tests.
          criccomini Chris Riccomini added a comment -

          Attaching minor update that aggregates YARN RM/NM logs, as well as container logs.

          jghoman Jakob Homan added a comment -

          I'm aware of at least one other distributed system that's working on using it, and a second that will likely convert in the near future.

          What are those?

          criccomini Chris Riccomini added a comment -

          Jakob Homan, one is LI's media server, which is in the process of ramping up for open source. The other was Kafka. Kafka is obviously a stickier subject, but Jay Kreps was pushing Zopkio, and I spoke with Joel Koshy about whether we should leverage Kafka's integration testing framework for Kafka, and he recommended against it, in favor of Zopkio.

          I still maintain that, even if no one else uses Zopkio, it's still better than writing our own Samza-integrated one. A lot of the stuff that this framework does is super generic, and IMO, should be split out separately from Samza. We should just use it.

          criccomini Chris Riccomini added a comment -

          As an aside, it is definitely NOT my intention to try and link Samza to some LI-specific code, or something. The LI branding on the reports is mildly distasteful, and I've already opened a ticket to remove it. The project is really young, and I think that we should think of it as ours to contribute to, and make better.

          jghoman Jakob Homan added a comment -

          No worries; I doubt anyone will be against it, and I'm certainly not. Just curious.

          sriramsub Sriram Subramanian added a comment -

          +1 on using Zopkio. I have looked into it and have started using it for another project. Saves a lot of time and also should be easy to contribute back.

          criccomini Chris Riccomini added a comment -

          Attaching updated patch that fixes ZK version conflict between YARN and Kafka in samza-test.

          criccomini Chris Riccomini added a comment -

          Attaching updated patch, which upgrades to latest virtualenv, and does a safety check for pydistutils files that mess up virtualenv.
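A safety check of this kind might look like the sketch below. It assumes the check is simply "does a per-user distutils config file exist"; the helper name `find_distutils_overrides` is hypothetical and the patch's actual check may differ. The file names are the per-user config files that distutils reads (`.pydistutils.cfg` on POSIX, `pydistutils.cfg` on Windows).

```python
import os


def find_distutils_overrides(home_dir=None):
    """Return the per-user distutils config files that exist under home_dir."""
    home_dir = home_dir or os.path.expanduser("~")
    # distutils reads .pydistutils.cfg on POSIX and pydistutils.cfg on Windows.
    candidates = [".pydistutils.cfg", "pydistutils.cfg"]
    return [
        os.path.join(home_dir, name)
        for name in candidates
        if os.path.isfile(os.path.join(home_dir, name))
    ]


if __name__ == "__main__":
    found = find_distutils_overrides()
    if found:
        print("Found distutils overrides that may break virtualenv: %s" % ", ".join(found))
```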

          criccomini Chris Riccomini added a comment -

          Opened https://github.com/linkedin/Zopkio/issues/40 to address SSH difficulties.

          In the meantime, the easiest way to make SSH work is to add your public key to ~/.ssh/authorized_keys on the box(es) you're deploying to (localhost in this patch).
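The workaround above amounts to appending a public key to an authorized_keys file. A minimal sketch, assuming the key already exists on disk: `authorize_key` is a hypothetical helper, not part of Zopkio or the patch, and the paths are supplied by the caller.

```python
import os


def authorize_key(public_key_path, authorized_keys_path):
    """Append a public key to authorized_keys unless it is already listed."""
    with open(public_key_path) as f:
        key = f.read().strip()

    content = ""
    if os.path.exists(authorized_keys_path):
        with open(authorized_keys_path) as f:
            content = f.read()

    # Skip the write if the exact key line is already present.
    if key in [line.strip() for line in content.splitlines()]:
        return False

    with open(authorized_keys_path, "a") as f:
        # Avoid gluing the new key onto an unterminated last line.
        if content and not content.endswith("\n"):
            f.write("\n")
        f.write(key + "\n")
    # Restrict the file to the owner; sshd commonly rejects looser permissions.
    os.chmod(authorized_keys_path, 0o600)
    return True
```

For the localhost case, the call would be something like `authorize_key(os.path.expanduser("~/.ssh/id_rsa.pub"), os.path.expanduser("~/.ssh/authorized_keys"))`, where the `id_rsa.pub` file name is an assumption about the local key pair.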

          nickpan47 Yi Pan (Data Infrastructure) added a comment -

          +1 on all the other changes. I would vote to check in the current patch, leaving the SSH issue as a known issue and provide some documentation for the work-around mentioned above.
          Final resolution to the SSH issue would come from Zopkio fixes.

          closeuris Yan Fang added a comment -

          Tested on Mac, and it works. As Yi Pan said, we can leave the SSH issue as a known issue and provide some workaround documentation for it. Opening a ticket to track this issue may be helpful as well. Overall, +1 for the patch.

          criccomini Chris Riccomini added a comment -

          Attaching latest patch. RB also updated at:

          https://reviews.apache.org/r/29342/

          Changes address Yan Fang's comments. Added support for relative test paths. Fixed licensing.

          I noticed that the SSH work-around documented in the Zopkio issue does not seem to work for me on my Linux box. I'm going to investigate this before committing.

          criccomini Chris Riccomini added a comment -

          Attaching updated patch. Added docs on using SSH public key auth. Also added support for arbitrary Zopkio switches (--nopassword, etc). By default, Zopkio now prompts for a password and uses it to log in to remote machines.

          criccomini Chris Riccomini added a comment -

          Attaching updated patch with a one line fix from Navina Ramesh's feedback.

          closeuris Yan Fang added a comment -

          Looks good to me. BTW, how about the Linux issue you mentioned? Is it working now?

          criccomini Chris Riccomini added a comment -

          how about the Linux issue you mentioned? Is it working now?

          I haven't been able to get it to work, but I'm beginning to think that it's my Linux box's SSH configuration/ssh-agent. Still working through it, but I don't see it as a blocker, since the code has worked for several others.

          closeuris Yan Fang added a comment -

          OK. Tested on the Mac, and the latest patch works well. On my Linux VM, it does not work because of:

          Installing setuptools, pip...
            Complete output from command /tmp/samza-tests/sam...ion-tests/bin/python -c "import sys, pip; sys...d\"] + sys.argv[1:]))" setuptools pip:
            Traceback (most recent call last):
            File "<string>", line 1, in <module>
            File "/tmp/samza-tests/virtualenv-12.0.5/virtualenv_support/pip-6.0.6-py2.py3-none-any.whl/pip/__init__.py", line 15, in <module>
            File "/tmp/samza-tests/virtualenv-12.0.5/virtualenv_support/pip-6.0.6-py2.py3-none-any.whl/pip/vcs/mercurial.py", line 11, in <module>
            File "/tmp/samza-tests/virtualenv-12.0.5/virtualenv_support/pip-6.0.6-py2.py3-none-any.whl/pip/download.py", line 30, in <module>
            File "/tmp/samza-tests/virtualenv-12.0.5/virtualenv_support/pip-6.0.6-py2.py3-none-any.whl/pip/_vendor/__init__.py", line 81, in load_module
          ImportError: No module named 'pip._vendor.requests'
          ----------------------------------------
          ...Installing setuptools, pip...done.
          Traceback (most recent call last):
            File "virtualenv-12.0.5/virtualenv.py", line 2352, in <module>
              main()
            File "virtualenv-12.0.5/virtualenv.py", line 825, in main
              symlink=options.symlink)
            File "virtualenv-12.0.5/virtualenv.py", line 993, in create_environment
              install_wheel(to_install, py_executable, search_dirs)
            File "virtualenv-12.0.5/virtualenv.py", line 961, in install_wheel
              'PIP_NO_INDEX': '1'
            File "virtualenv-12.0.5/virtualenv.py", line 903, in call_subprocess
              % (cmd_desc, proc.returncode))
          OSError: Command /tmp/samza-tests/sam...ion-tests/bin/python -c "import sys, pip; sys...d\"] + sys.argv[1:]))" setuptools pip failed with error code 1
          
          criccomini Chris Riccomini added a comment - - edited

          Yan Fang, what version of `pip` are you using? Also, does just running `pip` with no arguments work?

          Also, what version of Python is installed?

          closeuris Yan Fang added a comment -

          OK, fixed this by using "sudo"... Now it seems I have the SSH problem as well:

           File "/tmp/samza-tests/samza-integration-tests/lib/python2.6/site-packages/zopkio/test_runner.py", line 107, in run
              self.deployment_module.setup_suite()
            File "/tmp/samza-tests/scripts/deployment.py", line 76, in setup_suite
              'hostname': host
            File "/tmp/samza-tests/samza-integration-tests/lib/python2.6/site-packages/zopkio/deployer.py", line 77, in deploy
              self.install(unique_id, configs)
            File "/tmp/samza-tests/samza-integration-tests/lib/python2.6/site-packages/zopkio/adhoc_deployer.py", line 100, in install
              with get_ssh_client(hostname, username=runtime.get_username(), password=runtime.get_password()) as ssh:
            File "/usr/lib64/python2.6/contextlib.py", line 16, in __enter__
              return self.gen.next()
            File "/tmp/samza-tests/samza-integration-tests/lib/python2.6/site-packages/zopkio/remote_host_helper.py", line 182, in get_ssh_client
              ssh.connect(hostname, username=username, password=password)
            File "/tmp/samza-tests/samza-integration-tests/lib/python2.6/site-packages/paramiko/client.py", line 307, in connect
              look_for_keys, gss_auth, gss_kex, gss_deleg_creds, gss_host)
            File "/tmp/samza-tests/samza-integration-tests/lib/python2.6/site-packages/paramiko/client.py", line 519, in _auth
              raise saved_exception
          AuthenticationException: Authentication failed.
          

          Is this the same as yours?

          navina Navina Ramesh added a comment -

          I tried it out on my Linux box. I had set up the auth keys. However, it prompted for a password anyway, so I am not sure which one it used. I did not get any auth exception.
          However, my tests failed because they could not connect to the Kafka broker.

          2015-01-07 14:33:11,132 kafka [ERROR] Unable to connect to kafka broker at :9092
          Traceback (most recent call last):
            File "/tmp/samza-tests/samza-integration-tests/lib/python2.6/site-packages/kafka/conn.py", line 192, in reinit
              self._sock = socket.create_connection((self.host, self.port), self.timeout)
            File "/usr/lib64/python2.6/socket.py", line 553, in create_connection
              for res in getaddrinfo(host, port, 0, SOCK_STREAM):
          gaierror: [Errno -2] Name or service not known
          2015-01-07 14:33:11,133 kafka [WARNING] Could not send request ['\x00\x00\x00,\x00\x03\x00\x00\x00\x00\x00\x03\x00\x0ckafka-python\x00\x00\x00\x01\x00\x10samza-test-topic'] to server :9092, trying next server: Kafka @ :9092 went away
          2015-01-07 14:33:11,138 kafka [WARNING] No partitions for samza-test-topic

          Digging in more to see why it didn't connect to the Kafka broker.
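The log above shows the broker address as `:9092`, i.e. with an empty host name. One way a test script could catch that before opening a socket is a small pre-flight check such as the sketch below. `parse_broker` is a hypothetical helper, not part of kafka-python, Zopkio, or the patch.

```python
def parse_broker(address):
    """Split 'host:port' and reject an empty host or a non-numeric port."""
    host, sep, port = address.rpartition(":")
    if not sep or not host:
        raise ValueError("broker address %r must look like host:port" % address)
    if not port.isdigit():
        raise ValueError("broker address %r has an invalid port" % address)
    return host, int(port)


if __name__ == "__main__":
    print(parse_broker("localhost:9092"))
    try:
        parse_broker(":9092")
    except ValueError as e:
        print("rejected: %s" % e)
```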

          criccomini Chris Riccomini added a comment -

          Attaching updated patch. I'm now able to run on both Mac OSX (Python 2.7) and Linux (Python 2.6) with public-key authentication.

          RB still at:

          https://reviews.apache.org/r/29342/

          Fixes since last patch:

          • Set vmem ratio to work properly on Linux. Containers were getting killed without this.
          • Added support for templated configs. In the future, we should use templated configs for everything, rather than sed'ing config files on remote machines.
          • Set YARN_CONF_DIR and HADOOP_CONF_DIR for run-job.sh and kill-yarn-job.sh.
          • Add .rstrip() to fix a Python 2.7 vs. Python 2.6 incompatibility.
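As an illustration of the templated-config idea (as opposed to sed'ing files in place), the sketch below renders a job config with Python's standard `string.Template`. The template text, the placeholder names, and `render_job_config` are assumptions made for this example; they are not the templates or code shipped in the patch.

```python
from string import Template

# Hypothetical template; real job configs contain many more keys.
JOB_CONFIG_TEMPLATE = Template(
    "job.name=${job_name}\n"
    "yarn.package.path=file://${package_path}\n"
)


def render_job_config(values):
    """Fill in the template; substitute() raises KeyError if a placeholder is missing."""
    return JOB_CONFIG_TEMPLATE.substitute(values)


if __name__ == "__main__":
    print(render_job_config({
        "job_name": "negate-number",
        "package_path": "/tmp/samza-tests/package.tgz",
    }))
```

Compared with sed, a template fails loudly when a value is missing instead of silently leaving the file unchanged.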
          criccomini Chris Riccomini added a comment -

          Things this patch will not address:

          1. Supporting execution on remote machines.
          2. Supporting execution on multiple machines.
          3. Using templated configs for YARN RM/NM, ZooKeeper, and Kafka.
          4. Enabling Naarad performance logging.

          We should open these as follow-on tickets.

          criccomini Chris Riccomini added a comment -

          Yan Fang, when you have a chance, could you have a look?

          We tested internally on 4 Mac OSX boxes, and 3 Linux boxes.

          closeuris Yan Fang added a comment -

          Yes, tested it. The previous authentication problem is gone, and Zopkio installed and ran successfully. The smoke test is not successful, and I think it's because I am using a VM or because of the python-dev version. Will keep playing with it when I have time.

          Now will give it +1.

          criccomini Chris Riccomini added a comment -

          Merged and committed. Thanks everyone!

          Yan Fang, if you determine that the Linux issues are not environment related, please open a follow-up JIRA, and we can go from there.

          I'm going to open JIRAs for future work.


            People

            • Assignee:
              navina Navina Ramesh
              Reporter:
              navina Navina Ramesh
            • Votes:
              0
              Watchers:
              7
