Uploaded image for project: 'IMPALA'
  1. IMPALA
  2. IMPALA-4299

run-all.sh command fails because HiveServer2 service failed to start

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Impala 2.7.0
    • Fix Version/s: Impala 2.8.0
    • Component/s: Infrastructure
    • Labels:
      None

      Description

      After rebasing to the latest commit, HiveServer2 fails to start when run-all is executed.

      Exception from hive.log (the file is also attached):

      ...
      Caused by: ERROR XJ040: Failed to start database 'metastore_db' with class loader sun.misc.Launcher$AppClassLoader@60dbf04d, see the next exception for details.
        at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)            
        at org.apache.derby.impl.jdbc.SQLExceptionFactory.wrapArgsForTransportAcrossDRDA(Unknown Source)
        ... 84 more                                                                              
      Caused by: ERROR XSDB6: Another instance of Derby may have already booted the database /home/tbobrovytsky/Impala/metastore_db.
      ...
      

      This problem is most likely caused by the following recent commit. This problem does not happen on the commit before it.

      commit 75a857c0ceb6a58691ff45f0df8d88586edc8acd
      Author: Tim Armstrong <tarmstrong@cloudera.com>
      Date:   Sun Oct 9 23:06:40 2016 -0700
      
          IMPALA-4259: build Impala without any test cluster setup.
          
          The main outcome of this change is to avoid making unnecessary
          modification to the Impala or other source trees when we don't need the
          test cluster.
      
      1. hive.log
        1.92 MB
        Taras Bobrovytsky

        Activity

        Hide
        henryr Henry Robinson added a comment -

        Are we seeing it in any automated builds?

        Show
        henryr Henry Robinson added a comment - Are we seeing it in any automated builds?
        Hide
        alex.behm Alexander Behm added a comment -

        We are not, but both Taras and I are hitting it very reliably.

        Show
        alex.behm Alexander Behm added a comment - We are not, but both Taras and I are hitting it very reliably.
        Hide
        tarasbob Taras Bobrovytsky added a comment -

        No, strangely this happens only in my local environment. Alexander Behm also ran into this issue. I am running Ubuntu 14, Alex is running Ubuntu 12.

        Show
        tarasbob Taras Bobrovytsky added a comment - No, strangely this happens only in my local environment. Alexander Behm also ran into this issue. I am running Ubuntu 14, Alex is running Ubuntu 12.
        Hide
        henryr Henry Robinson added a comment -

        Any idea what the difference is between your environment and the test one?

        Show
        henryr Henry Robinson added a comment - Any idea what the difference is between your environment and the test one?
        Hide
        dknupp David Knupp added a comment -

        Strangely enough, I'm able to call run-all.sh without an issue.

        I definitely have the latest changes:

        dknupp@dknupp-desktop:0:~/Impala (asf-gerrit-master) $ git lg
        * 1a5c43e - (HEAD, asf-gerrit/master, asf-gerrit-master) IMPALA-3644 Make predicate order deterministic (3 hours ago) <Lars Volker>
        * db5de41 - (dknupp/asf-gerrit-master, apache/master) IMPALA-4188: Leopard: support external Docker volumes (17 hours ago) <Michael Brown>
        * 784716f - IMPALA-3971, IMPALA-3229: Bootstrap an Impala dev environment (19 hours ago) <Jim Apple>
        * 2a04b0e - IMPALA-3943: Address post-merge comments. (19 hours ago) <Alex Behm>
        * 47b8aa3 - IMPALA-4291: Reduce LLVM module's preparation time (22 hours ago) <Michael Ho>
        * c7fe438 - IMPALA-4231: fix codegen time regression (22 hours ago) <Tim Armstrong>
        * 89b41c6 - Match .clang-format more closely to actual practice. (25 hours ago) <Jim Apple>
        * 67a0451 - Bump Bzip2 version (28 hours ago) <Tim Armstrong>
        * 75a857c - (asf-gerrit/hadoop-next, apache/hadoop-next) IMPALA-4259: build Impala without any test cluster setup. (2 days ago) <Tim Armstrong>     <--- the commit in question.
        * 0b3efb1 - IMPALA-4289: Mark agg slots of NDV() functions as non-nullable. (2 days ago) <Alex Behm>
        * f2780b5 - IMPALA-4287: EE tests fail to run when KUDU_IS_SUPPORTED=false (2 days ago) <Matthew Jacobs>
        [...]
        

        ...yet run-all.sh succeeds.

        dknupp@dknupp-desktop:0:~/Impala (asf-gerrit-master) $ ./testdata/bin/run-all.sh
        Killing running services...
        Starting cluster services...
        Stopping kudu
        Stopping kms
        Stopping yarn
        Stopping hdfs
        Starting hdfs (Web UI - http://localhost:5070)
        Namenode started
        Starting yarn (Web UI - http://localhost:8088)
        Starting kms (Web UI - http://localhost:16000)
        Starting kudu (Web UI - http://localhost:8051)
        The cluster is running
         --> Starting HBase
        HBase start attempt: 1/5
        Killing any HBase processes possibly lingering from previous start attempts
        Stopping Hbase
        [...]
        Contents of HDFS root: [u'hbase', u'home', u'test-warehouse', u'tmp', u'user']
        Connecting to Zookeeper host(s).
        Success: <kazoo.client.KazooClient object at 0x7fc05b27e250>
        Waiting for HBase node: /hbase/master
        Success: /hbase/master
        Waiting for HBase node: /hbase/rs
        Success: /hbase/rs
        HBase startup scripts succeeded
         --> Starting Hive Server and Metastore Service
        Waiting for the Metastore at localhost:9083...
        Waiting for the Metastore at localhost:9083...
        Waiting for the Metastore at localhost:9083...
        Waiting for the Metastore at localhost:9083...
        Waiting for the Metastore at localhost:9083...
        Metastore service is up at localhost:9083.
        Waiting for HiveServer2 at localhost:11050...
        Could not connect to localhost:11050
        Waiting for HiveServer2 at localhost:11050...
        Could not connect to localhost:11050
        Waiting for HiveServer2 at localhost:11050...
        Could not connect to localhost:11050
        Waiting for HiveServer2 at localhost:11050...
        Could not connect to localhost:11050
        Waiting for HiveServer2 at localhost:11050...
        Could not connect to localhost:11050
        Waiting for HiveServer2 at localhost:11050...
        Could not connect to localhost:11050
        Waiting for HiveServer2 at localhost:11050...
        Could not connect to localhost:11050
        Waiting for HiveServer2 at localhost:11050...
        Could not connect to localhost:11050
        Waiting for HiveServer2 at localhost:11050...
        Could not connect to localhost:11050
        HiveServer2 service is up at localhost:11050.
         --> Starting the Sentry Policy Server
        dknupp@dknupp-desktop:0:~/Impala (asf-gerrit-master) $
        

        I did notice some differences in my hive.log compared the attached one. Specifically, look at the first and second lines of the failing log after starting HiveServer2. It immediately jumps to Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore:

        2016-10-14 16:38:11,340 INFO  server.HiveServer2 (HiveServer2.java:startHiveServer2(507)) - Starting HiveServer2
        2016-10-14 16:38:11,788 INFO  metastore.HiveMetaStore (HiveMetaStore.java:newRawStore(644)) - 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
        2016-10-14 16:38:11,804 INFO  metastore.ObjectStore (ObjectStore.java:initialize(333)) - ObjectStore, initialize called
        2016-10-14 16:38:12,271 ERROR Datastore.Schema (Log4JLogger.java:error(125)) - Failed initialising database.
        [...]
        

        This is the hive.log from my successful startup. There are many additional steps.

        2016-10-14 17:44:38,949 INFO  server.HiveServer2 (HiveServer2.java:startHiveServer2(507)) - Starting HiveServer2
        2016-10-14 17:44:39,013 WARN  conf.HiveConf (HiveConf.java:initialize(2762)) - HiveConf of name hive.access.conf.url does not exist
        2016-10-14 17:44:39,346 INFO  hive.metastore (HiveMetaStoreClient.java:open(386)) - Trying to connect to metastore with URI thrift://localhost:9083
        2016-10-14 17:44:39,358 INFO  hive.metastore (HiveMetaStoreClient.java:open(431)) - Opened a connection to metastore, current connections: 1
        2016-10-14 17:44:39,369 INFO  log.PerfLogger (PerfLogger.java:PerfLogBegin(127)) - <PERFLOG method=set_ugi from=org.apache.hadoop.hive.metastore.RetryingHMSHandler>
        2016-10-14 17:44:39,369 INFO  log.PerfLogger (PerfLogger.java:PerfLogEnd(168)) - </PERFLOG method=set_ugi start=1476492279369 end=1476492279369 duration=0 from=org.apache.hadoop.hive.metastore.RetryingHMSHandler threadId=2 retryCount=0 error=false>
        2016-10-14 17:44:39,370 INFO  hive.metastore (HiveMetaStoreClient.java:open(483)) - Connected to metastore.
        2016-10-14 17:44:39,437 INFO  log.PerfLogger (PerfLogger.java:PerfLogBegin(127)) - <PERFLOG method=get_all_functions from=org.apache.hadoop.hive.metastore.RetryingHMSHandler>
        2016-10-14 17:44:39,438 INFO  metastore.HiveMetaStore (HiveMetaStore.java:logInfo(801)) - 2: source:127.0.0.1 get_all_functions
        2016-10-14 17:44:39,439 INFO  HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(394)) - ugi=dknupp  ip=127.0.0.1    cmd=source:127.0.0.1 get_all_functions
        2016-10-14 17:44:39,439 INFO  metastore.HiveMetaStore (HiveMetaStore.java:newRawStore(644)) - 2: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
        2016-10-14 17:44:39,441 INFO  metastore.ObjectStore (ObjectStore.java:initialize(333)) - ObjectStore, initialize called
        2016-10-14 17:44:39,444 INFO  metastore.MetaStoreDirectSql (MetaStoreDirectSql.java:<init>(138)) - Using direct SQL, underlying DB is OTHER
        2016-10-14 17:44:39,444 INFO  metastore.ObjectStore (ObjectStore.java:setConf(316)) - Initialized ObjectStore
        [...]
        
        Show
        dknupp David Knupp added a comment - Strangely enough, I'm able to call run-all.sh without an issue. I definitely have the latest changes: dknupp@dknupp-desktop:0:~/Impala (asf-gerrit-master) $ git lg * 1a5c43e - (HEAD, asf-gerrit/master, asf-gerrit-master) IMPALA-3644 Make predicate order deterministic (3 hours ago) <Lars Volker> * db5de41 - (dknupp/asf-gerrit-master, apache/master) IMPALA-4188: Leopard: support external Docker volumes (17 hours ago) <Michael Brown> * 784716f - IMPALA-3971, IMPALA-3229: Bootstrap an Impala dev environment (19 hours ago) <Jim Apple> * 2a04b0e - IMPALA-3943: Address post-merge comments. (19 hours ago) <Alex Behm> * 47b8aa3 - IMPALA-4291: Reduce LLVM module's preparation time (22 hours ago) <Michael Ho> * c7fe438 - IMPALA-4231: fix codegen time regression (22 hours ago) <Tim Armstrong> * 89b41c6 - Match .clang-format more closely to actual practice. (25 hours ago) <Jim Apple> * 67a0451 - Bump Bzip2 version (28 hours ago) <Tim Armstrong> * 75a857c - (asf-gerrit/hadoop-next, apache/hadoop-next) IMPALA-4259: build Impala without any test cluster setup. (2 days ago) <Tim Armstrong> <--- the commit in question. * 0b3efb1 - IMPALA-4289: Mark agg slots of NDV() functions as non-nullable. (2 days ago) <Alex Behm> * f2780b5 - IMPALA-4287: EE tests fail to run when KUDU_IS_SUPPORTED=false (2 days ago) <Matthew Jacobs> [...] ...yet run-all.sh succeeds. dknupp@dknupp-desktop:0:~/Impala (asf-gerrit-master) $ ./testdata/bin/run-all.sh Killing running services... Starting cluster services... Stopping kudu Stopping kms Stopping yarn Stopping hdfs Starting hdfs (Web UI - http://localhost:5070) Namenode started Starting yarn (Web UI - http://localhost:8088) Starting kms (Web UI - http://localhost:16000) Starting kudu (Web UI - http://localhost:8051) The cluster is running --> Starting HBase HBase start attempt: 1/5 Killing any HBase processes possibly lingering from previous start attempts Stopping Hbase [...] Contents of HDFS root: [u'hbase', u'home', u'test-warehouse', u'tmp', u'user'] Connecting to Zookeeper host(s). Success: <kazoo.client.KazooClient object at 0x7fc05b27e250> Waiting for HBase node: /hbase/master Success: /hbase/master Waiting for HBase node: /hbase/rs Success: /hbase/rs HBase startup scripts succeeded --> Starting Hive Server and Metastore Service Waiting for the Metastore at localhost:9083... Waiting for the Metastore at localhost:9083... Waiting for the Metastore at localhost:9083... Waiting for the Metastore at localhost:9083... Waiting for the Metastore at localhost:9083... Metastore service is up at localhost:9083. Waiting for HiveServer2 at localhost:11050... Could not connect to localhost:11050 Waiting for HiveServer2 at localhost:11050... Could not connect to localhost:11050 Waiting for HiveServer2 at localhost:11050... Could not connect to localhost:11050 Waiting for HiveServer2 at localhost:11050... Could not connect to localhost:11050 Waiting for HiveServer2 at localhost:11050... Could not connect to localhost:11050 Waiting for HiveServer2 at localhost:11050... Could not connect to localhost:11050 Waiting for HiveServer2 at localhost:11050... Could not connect to localhost:11050 Waiting for HiveServer2 at localhost:11050... Could not connect to localhost:11050 Waiting for HiveServer2 at localhost:11050... Could not connect to localhost:11050 HiveServer2 service is up at localhost:11050. --> Starting the Sentry Policy Server dknupp@dknupp-desktop:0:~/Impala (asf-gerrit-master) $ I did notice some differences in my hive.log compared the attached one. Specifically, look at the first and second lines of the failing log after starting HiveServer2. It immediately jumps to Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore : 2016-10-14 16:38:11,340 INFO server.HiveServer2 (HiveServer2.java:startHiveServer2(507)) - Starting HiveServer2 2016-10-14 16:38:11,788 INFO metastore.HiveMetaStore (HiveMetaStore.java:newRawStore(644)) - 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore 2016-10-14 16:38:11,804 INFO metastore.ObjectStore (ObjectStore.java:initialize(333)) - ObjectStore, initialize called 2016-10-14 16:38:12,271 ERROR Datastore.Schema (Log4JLogger.java:error(125)) - Failed initialising database. [...] This is the hive.log from my successful startup. There are many additional steps. 2016-10-14 17:44:38,949 INFO server.HiveServer2 (HiveServer2.java:startHiveServer2(507)) - Starting HiveServer2 2016-10-14 17:44:39,013 WARN conf.HiveConf (HiveConf.java:initialize(2762)) - HiveConf of name hive.access.conf.url does not exist 2016-10-14 17:44:39,346 INFO hive.metastore (HiveMetaStoreClient.java:open(386)) - Trying to connect to metastore with URI thrift://localhost:9083 2016-10-14 17:44:39,358 INFO hive.metastore (HiveMetaStoreClient.java:open(431)) - Opened a connection to metastore, current connections: 1 2016-10-14 17:44:39,369 INFO log.PerfLogger (PerfLogger.java:PerfLogBegin(127)) - <PERFLOG method=set_ugi from=org.apache.hadoop.hive.metastore.RetryingHMSHandler> 2016-10-14 17:44:39,369 INFO log.PerfLogger (PerfLogger.java:PerfLogEnd(168)) - </PERFLOG method=set_ugi start=1476492279369 end=1476492279369 duration=0 from=org.apache.hadoop.hive.metastore.RetryingHMSHandler threadId=2 retryCount=0 error=false> 2016-10-14 17:44:39,370 INFO hive.metastore (HiveMetaStoreClient.java:open(483)) - Connected to metastore. 2016-10-14 17:44:39,437 INFO log.PerfLogger (PerfLogger.java:PerfLogBegin(127)) - <PERFLOG method=get_all_functions from=org.apache.hadoop.hive.metastore.RetryingHMSHandler> 2016-10-14 17:44:39,438 INFO metastore.HiveMetaStore (HiveMetaStore.java:logInfo(801)) - 2: source:127.0.0.1 get_all_functions 2016-10-14 17:44:39,439 INFO HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(394)) - ugi=dknupp ip=127.0.0.1 cmd=source:127.0.0.1 get_all_functions 2016-10-14 17:44:39,439 INFO metastore.HiveMetaStore (HiveMetaStore.java:newRawStore(644)) - 2: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore 2016-10-14 17:44:39,441 INFO metastore.ObjectStore (ObjectStore.java:initialize(333)) - ObjectStore, initialize called 2016-10-14 17:44:39,444 INFO metastore.MetaStoreDirectSql (MetaStoreDirectSql.java:<init>(138)) - Using direct SQL, underlying DB is OTHER 2016-10-14 17:44:39,444 INFO metastore.ObjectStore (ObjectStore.java:setConf(316)) - Initialized ObjectStore [...]
        Hide
        tarmstrong Tim Armstrong added a comment -

        I think you need to run bin/create-test-configuration.sh to recreate the test config files. They were probably deleted by running ./bin/clean.sh or buildall.sh without -noclean.

        Before my commit this script was run incidentally a lot of the time by buildall.sh (which causes other problems when you just want to build Impala and not set up a minicluster).

        Show
        tarmstrong Tim Armstrong added a comment - I think you need to run bin/create-test-configuration.sh to recreate the test config files. They were probably deleted by running ./bin/clean.sh or buildall.sh without -noclean. Before my commit this script was run incidentally a lot of the time by buildall.sh (which causes other problems when you just want to build Impala and not set up a minicluster).
        Hide
        tarmstrong Tim Armstrong added a comment -

        IMPALA-4299: add buildall.sh option to start test cluster

        A previous commit "IMPALA-4259: build Impala without any test
        cluster setup" altered some undocumented side-effects of
        buildall.sh.

        Previously the following commands reconfigured and restarted the test
        cluster. It worked because buildall.sh unconditionally regenerated
        the test cluster configs.

        ./buildall.sh -notests && ./testdata/bin/run-all.sh
        ./buildall.sh -noclean -notests && ./testdata/bin/run-all.sh

        Instead of restoring the old behaviour and continuing to encourage
        mixing use of low and high level scripts like testdata/bin/run-all.sh
        as part of the "standard" workflow, this commit adds another
        high-level option to buildall.sh, -start_minicluster, that
        accomplishes the high-level task of restarting a minicluster with
        fresh configs. The above commands can be replaced with:

        ./buildall.sh -notests -start_minicluster
        ./buildall.sh -notests -noclean -start_minicluster

        Change-Id: I0ab3461f8ff3de49b3f28a0dc22fa0a6d5569da5
        Reviewed-on: http://gerrit.cloudera.org:8080/4734
        Reviewed-by: Alex Behm <alex.behm@cloudera.com>
        Tested-by: Internal Jenkins

        Show
        tarmstrong Tim Armstrong added a comment - IMPALA-4299 : add buildall.sh option to start test cluster A previous commit " IMPALA-4259 : build Impala without any test cluster setup" altered some undocumented side-effects of buildall.sh. Previously the following commands reconfigured and restarted the test cluster. It worked because buildall.sh unconditionally regenerated the test cluster configs. ./buildall.sh -notests && ./testdata/bin/run-all.sh ./buildall.sh -noclean -notests && ./testdata/bin/run-all.sh Instead of restoring the old behaviour and continuing to encourage mixing use of low and high level scripts like testdata/bin/run-all.sh as part of the "standard" workflow, this commit adds another high-level option to buildall.sh, -start_minicluster, that accomplishes the high-level task of restarting a minicluster with fresh configs. The above commands can be replaced with: ./buildall.sh -notests -start_minicluster ./buildall.sh -notests -noclean -start_minicluster Change-Id: I0ab3461f8ff3de49b3f28a0dc22fa0a6d5569da5 Reviewed-on: http://gerrit.cloudera.org:8080/4734 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: Internal Jenkins

          People

          • Assignee:
            tarmstrong Tim Armstrong
            Reporter:
            tarasbob Taras Bobrovytsky
          • Votes:
            0 Vote for this issue
            Watchers:
            7 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development