Hadoop Distributed Data Store / HDDS-2225

SCM fails to start in most unsecure environments due to leftover secure config

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: docker
    • Target Version/s:

      Description

      The ozone-recon acceptance test and some others fail intermittently because the SCM container is not available. The cause is secure config left over in core-site.xml from an earlier test.

      Initially the config file is empty. Various test environments populate it with different settings. The problem happens when a test does not specify any config for core-site.xml, in which case the previous test's config file is retained.

      scm_1       | 2019-10-01 19:42:05 WARN  WebAppContext:531 - Failed startup of context o.e.j.w.WebAppContext@1cc680e{/,file:///tmp/jetty-0.0.0.0-9876-scm-_-any-1272594486261557815.dir/webapp/,UNAVAILABLE}{/scm}
      scm_1       | javax.servlet.ServletException: javax.servlet.ServletException: Keytab does not exist: /etc/security/keytabs/HTTP.keytab
      scm_1       | 	at org.apache.hadoop.security.authentication.server.KerberosAuthenticationHandler.init(KerberosAuthenticationHandler.java:188)
      ...
      scm_1       | 	at org.apache.hadoop.hdds.scm.server.StorageContainerManager.start(StorageContainerManager.java:791)
      ...
      scm_1       | Unable to initialize WebAppContext
      scm_1       | 2019-10-01 19:42:05 INFO  StorageContainerManagerStarter:51 - SHUTDOWN_MSG:
      scm_1       | /************************************************************
      scm_1       | SHUTDOWN_MSG: Shutting down StorageContainerManager at 8724df7131bb/192.168.128.6
      scm_1       | ************************************************************/
      

      The problem is intermittent because the order of test cases differs between runs. The earlier a secure test runs, the more tests are affected. If the secure tests run last, the issue does not occur.
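
      The retention and ordering effects described above can be modelled with a small simulation. This is only a sketch under stated assumptions, not the actual test harness or the committed fix: the function names (prepare_env_buggy, prepare_env_fixed, count_affected), the test names other than ozone-recon, and the idea of fixing it by always rewriting the file are hypothetical. hadoop.security.authentication is a real Hadoop property, used here only as a marker for "secure config".

```python
import os
import tempfile
import xml.etree.ElementTree as ET

CORE_SITE = "core-site.xml"
SECURE_KEY = "hadoop.security.authentication"
SECURE = {SECURE_KEY: "kerberos"}


def write_config(path, settings):
    """Write settings as a Hadoop-style <configuration> XML file."""
    root = ET.Element("configuration")
    for name, value in settings.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    ET.ElementTree(root).write(path)


def read_config(path):
    """Return the properties stored in a Hadoop-style XML file as a dict."""
    root = ET.parse(path).getroot()
    return {p.findtext("name"): p.findtext("value") for p in root.findall("property")}


def prepare_env_buggy(conf_dir, settings):
    """Hypothetical buggy setup: rewrite the file only if the test specifies config."""
    if settings:
        write_config(os.path.join(conf_dir, CORE_SITE), settings)


def prepare_env_fixed(conf_dir, settings):
    """Hypothetical fix: always rewrite the file, so no spec means an empty config."""
    write_config(os.path.join(conf_dir, CORE_SITE), settings)


def count_affected(prepare, test_order):
    """Count tests that specify no config but still see the secure setting."""
    with tempfile.TemporaryDirectory() as conf_dir:
        path = os.path.join(conf_dir, CORE_SITE)
        write_config(path, {})  # initially the config file is empty
        affected = 0
        for _name, settings in test_order:
            prepare(conf_dir, settings)
            if not settings and SECURE_KEY in read_config(path):
                affected += 1
        return affected


# Hypothetical test orders; only "ozone-recon" is named in this issue.
secure_first = [("secure", SECURE), ("ozone-recon", {}), ("other", {})]
secure_last = [("ozone-recon", {}), ("other", {}), ("secure", SECURE)]

if __name__ == "__main__":
    print(count_affected(prepare_env_buggy, secure_first))  # 2
    print(count_affected(prepare_env_buggy, secure_last))   # 0
    print(count_affected(prepare_env_fixed, secure_first))  # 0
```

      In this model, every test that runs after the secure one without its own core-site.xml settings inherits the Kerberos setting, which matches the observation that an earlier secure test affects more tests.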

              People

              • Assignee: Attila Doroszlai (adoroszlai)
              • Reporter: Attila Doroszlai (adoroszlai)

                  Time Tracking

                  • Original Estimate: Not Specified
                  • Remaining Estimate: 0h
                  • Time Spent: 2h 20m