Aurora / AURORA-1533

Transient connection errors can leave client in irrecoverable state


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • None
    • 0.10.0
    • None
    • None

    Description

      During a cluster update, some of our schedulers returned an unknown error to connecting clients (relevant code). Long-running clients failed to recover from these errors because the client code assumed the connection was already established, even though the connection attempt had failed. Subsequent scheduling calls thus failed with the following exception:

      File  "venv/local/lib/python2.7/site-packages/apache/aurora/client/api/__init__.py"  in query_no_configs
        140.       raise self.ThriftInternalError(e.args[0])
      
      Exception Type: ThriftInternalError
      Exception Value: Error during thrift call getTasksWithoutConfigs to 
      testcluster: 'NoneType' object has no attribute 'getTasksWithoutConfigs'
      

      Background: We are using the Python client to dispatch calls to Aurora from within a long-running web service. The connection is kept open for as long as the web service is running.
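
      The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not the Aurora client code: FlakyTransport, FakeSchedulerClient, BuggyProxy and ResilientProxy are hypothetical names invented for illustration. It assumes the problem comes from recording the connection as established before the connect attempt has succeeded, so that a later call dereferences None. The second proxy shows one possible way to recover, by caching the client only after a successful connect.

```python
class TransientError(Exception):
    """Hypothetical stand-in for the unknown error returned by a scheduler."""


class FakeSchedulerClient(object):
    """Hypothetical stand-in for a connected Thrift client."""

    def getTasksWithoutConfigs(self, query):
        return "tasks for %s" % query


class FlakyTransport(object):
    """Hypothetical transport that fails the first `failures` connect attempts."""

    def __init__(self, failures):
        self.failures = failures

    def connect(self):
        if self.failures > 0:
            self.failures -= 1
            raise TransientError("unknown error from scheduler")
        return FakeSchedulerClient()


class BuggyProxy(object):
    """Assumed bug: the 'connected' flag is set before connect() succeeds."""

    def __init__(self, transport):
        self._transport = transport
        self._client = None
        self._connected = False

    def call(self, method, *args):
        if not self._connected:
            self._connected = True  # set too early; stays True if connect() raises
            self._client = self._transport.connect()
        # After a failed connect, self._client is still None here.
        return getattr(self._client, method)(*args)


class ResilientProxy(object):
    """One possible remedy: cache the client only after a successful connect."""

    def __init__(self, transport):
        self._transport = transport
        self._client = None

    def call(self, method, *args):
        if self._client is None:
            # If connect() raises, nothing is cached and the next call retries.
            self._client = self._transport.connect()
        try:
            return getattr(self._client, method)(*args)
        except TransientError:
            self._client = None  # drop the client so the next call reconnects
            raise
```

      With a transport that fails once, the first call on BuggyProxy raises TransientError and every later call raises AttributeError on None, matching the shape of the exception in the report. ResilientProxy raises TransientError on the first call and succeeds on the second.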


          People

            StephanErb Stephan Erb
            StephanErb Stephan Erb