OODT-383

Workflow Manager Client - Add Connection Limit Option



    • Type: New Feature
    • Status: Resolved
    • Priority: Minor
    • Resolution: Won't Fix
    • None
    • 0.11
    • Component: workflow manager
    • None
    • Environment: CentOS 5/6


      When using the wmgr-client to run thousands of jobs, it is easy to overwhelm the XML-RPC connection pool to the workflow manager. I was using a simple Python script to submit 10K jobs; the workflow manager couldn't handle the jobs quickly enough, and many jobs were dropped as a result.

      One fix I implemented in my Python code was to use lsof to check the number of ESTABLISHED connections to the workflow manager. If the workflow manager had more than, say, 30 connections, my program would sleep and try submitting jobs again later.

      I would like to enhance the wmgr-client shell script with an option to limit the number of connections to the wmgr. By default, this limit would not be set.

      If the connection limit is reached, the wmgr-client would sleep for 10 seconds and then re-check the number of connections. This loop would continue until the number of connections dropped below the specified limit, at which point the wmgr-client would resume submitting jobs to the wmgr.
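      To make the proposed behaviour concrete, here is a minimal Python sketch of the sleep-and-re-check loop. The names wait_for_capacity, count_connections, limit and poll_seconds are hypothetical and not part of wmgr-client; the connection count is passed in as a callable so the loop does not depend on any particular way of measuring it.

```python
import time
from typing import Callable, Optional


def wait_for_capacity(
    count_connections: Callable[[], int],
    limit: Optional[int],
    poll_seconds: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Block until the connection count is below `limit`.

    Returns how many times the loop slept. A limit of None means
    "no limit set", which mirrors the proposed default.
    """
    if limit is None:
        return 0

    waits = 0
    # Limit reached: sleep, then re-check, until the count drops below it.
    while count_connections() >= limit:
        sleep(poll_seconds)
        waits += 1
    return waits
```

      A submitting script could call wait_for_capacity(...) before each job submission. Passing the sleep function as a parameter is only a convenience for testing the loop without real delays.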

      On my production server I was using lsof to count the connections to the wmgr. I am not sure we can rely on lsof being installed on all machines, so we might need a more universal method (maybe in Java).
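      One possible way to avoid the lsof dependency on Linux hosts (such as the CentOS machines this issue was filed against) is to read the kernel's TCP tables directly. The sketch below is an assumption, not something taken from OODT: it is Linux-only, so it is not the universal method asked for above, and the function names are hypothetical. It counts sockets whose local port is the wmgr port and whose state is ESTABLISHED (code 01 in /proc/net/tcp), i.e. the server side of each connection, so the number can differ from what lsof reports.

```python
from pathlib import Path

TCP_ESTABLISHED = "01"  # state code for ESTABLISHED in /proc/net/tcp


def count_established_on_port(proc_net_tcp_text: str, port: int) -> int:
    """Count ESTABLISHED sockets bound to local `port` in a /proc/net/tcp dump."""
    count = 0
    for line in proc_net_tcp_text.splitlines()[1:]:  # first line is the header
        fields = line.split()
        if len(fields) < 4:
            continue
        # fields[1] is "HEXADDR:HEXPORT" for the local end, fields[3] is the state.
        local_port = int(fields[1].rsplit(":", 1)[1], 16)
        if local_port == port and fields[3] == TCP_ESTABLISHED:
            count += 1
    return count


def established_connections_procfs(port: int = 9001) -> int:
    """Sum the counts from the IPv4 and IPv6 tables, skipping missing files."""
    total = 0
    for table in ("/proc/net/tcp", "/proc/net/tcp6"):
        path = Path(table)
        if path.exists():
            total += count_established_on_port(path.read_text(), port)
    return total
```

      Keeping the parsing separate from the file reads means the parser can be checked against a saved copy of the table without a running workflow manager.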

      Here is the lsof command I used, with some grep and wc sprinkled in:
      /usr/sbin/lsof -i :9001 | grep ESTABLISHED | wc

      This assumes you are running wmgr on localhost:9001 and that lsof is installed at /usr/sbin/lsof.
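      For reference, the same check could be done from Python roughly as follows. This is a sketch rather than the code in the attached script: the function names are hypothetical, the port and lsof path defaults are simply the ones assumed above, and it counts matching lines (the line count that wc reports) instead of shelling out to grep and wc.

```python
import subprocess


def count_established(lsof_output: str) -> int:
    """Count the lsof output lines that mention ESTABLISHED."""
    return sum(1 for line in lsof_output.splitlines() if "ESTABLISHED" in line)


def established_connections(port: int = 9001,
                            lsof_path: str = "/usr/sbin/lsof") -> int:
    """Run `lsof -i :<port>` and count its ESTABLISHED lines.

    Raises FileNotFoundError if lsof is not installed at lsof_path.
    """
    result = subprocess.run(
        [lsof_path, "-i", f":{port}"],
        capture_output=True,
        text=True,
        check=False,  # lsof exits non-zero when nothing matches
    )
    return count_established(result.stdout)
```

      Note that when the client and the wmgr run on the same host, each connection can appear twice in the lsof output (once per process), so any limit compared against this count would need to account for that.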

      Any other thoughts or ideas to work this out would be appreciated.


        1. modscag-v2-job-runner.py
          3 kB
          Cameron Goodale



            cgoodale Cameron Goodale