IMPALA-4088

HDFS data nodes pick HTTP server ports at random, sometimes stealing HBase master's port


Details

    Description

      Michael, can you take a first look since you've dabbled with HBase startup before? It looks like HBase may not be in a good state. I will not cancel the build so you can log into the machine if necessary. Feel free to cancel the build if you have all the info you need.

      There are two issues here:

      • figure out why data loading did not succeed
      • the build should not hang even if there are errors in data loading
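For the second point, one possible approach is to run each long-running step under a hard timeout, so that a stuck data load fails the build instead of blocking it indefinitely. The sketch below is illustrative only: `run_step` and the exit code 124 are hypothetical choices, not part of the Impala build scripts.

```python
import subprocess
import sys


def run_step(cmd, timeout_s):
    """Run one build step (hypothetical helper).

    Returns the step's exit code, or 124 if it exceeded timeout_s seconds.
    """
    try:
        return subprocess.run(cmd, timeout=timeout_s).returncode
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the direct child before raising.
        # Processes spawned by that child are NOT cleaned up here.
        return 124


if __name__ == "__main__":
    # Example: a step that sleeps for 30 s is cut off after 2 s.
    code = run_step([sys.executable, "-c", "import time; time.sleep(30)"], 2)
    print("exit code:", code)
```

A real build wrapper would also need to terminate any grandchild processes (for example by starting the step in its own process group), which this sketch does not attempt.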

      This is where it hangs:

      ...
      00:35:24.252 [INFO] BUILD SUCCESS
      00:35:24.253 ------------------------------------------------------------------------
      00:35:24.253 
      00:35:24.260 ========================================================================
      00:35:24.260 Running mvn package
      00:35:24.262 Directory: /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/testdata
      00:35:24.262 ========================================================================
      00:35:32.241 [INFO] /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/testdata/src/main/java/com/cloudera/impala/datagenerator/HBaseTestDataRegionAssigment.java: Some input files use or override a deprecated API.
      00:35:32.242 [INFO] /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/testdata/src/main/java/com/cloudera/impala/datagenerator/HBaseTestDataRegionAssigment.java: Recompile with -Xlint:deprecation for details.
      00:35:32.242 [INFO] BUILD SUCCESS
      00:35:32.242 ------------------------------------------------------------------------
      00:35:32.243 
      00:35:32.249 /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/testdata/target
      00:35:34.029 SUCCESS, data generated into /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/testdata/target
      00:45:44.231 Loading HDFS data from snapshot: /data/jenkins/workspace/impala-umbrella-build-and-test/testdata/test-warehouse-SNAPSHOT/test-warehouse-cdh5-59-SNAPSHOT.tar.gz (logging to load-test-warehouse-snapshot.log)... OK
      00:46:24.762 Starting Impala cluster (logging to start-impala-cluster.log)... OK
      00:46:44.765 Setting up HDFS environment (logging to setup-hdfs-env.log)... OK
      00:46:44.765 Skipped loading the metadata.
      <does not proceed beyond here>
      

      From the Hive logs (this exception repeats very often):

      org.apache.hadoop.hbase.client.NoServerForRegionException: Unable to find region for 126 in functional_hbase.alltypes after 35 tries.
      	at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1329)
      	at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1199)
      	at org.apache.hadoop.hbase.client.AsyncProcess.submit(AsyncProcess.java:395)
      	at org.apache.hadoop.hbase.client.AsyncProcess.submit(AsyncProcess.java:344)
      	at org.apache.hadoop.hbase.client.BufferedMutatorImpl.backgroundFlushCommits(BufferedMutatorImpl.java:238)
      	at org.apache.hadoop.hbase.client.BufferedMutatorImpl.mutate(BufferedMutatorImpl.java:146)
      	at org.apache.hadoop.hbase.client.BufferedMutatorImpl.mutate(BufferedMutatorImpl.java:113)
      	at org.apache.hadoop.hbase.client.HTable.put(HTable.java:1084)
      	at org.apache.hadoop.hive.hbase.HiveHBaseTableOutputFormat$MyRecordWriter.write(HiveHBaseTableOutputFormat.java:146)
      	at org.apache.hadoop.hive.hbase.HiveHBaseTableOutputFormat$MyRecordWriter.write(HiveHBaseTableOutputFormat.java:117)
      	at org.apache.hadoop.hive.ql.io.HivePassThroughRecordWriter.write(HivePassThroughRecordWriter.java:40)
      	at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:697)
      	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
      	at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
      	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
      	at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
      	at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
      	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497)
      	at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170)
      	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
      	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
      	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
      	at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:270)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      	at java.lang.Thread.run(Thread.java:745)
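The issue title attributes the failure to a port collision. As a minimal sketch of that mechanism, assuming "pick ports at random" means binding to port 0 so that the OS assigns an ephemeral port, the following shows how a service that starts later can find its fixed port already taken. The helper names `bind_ephemeral` and `try_bind` are hypothetical, and the code uses plain sockets rather than HDFS or HBase.

```python
import errno
import socket


def bind_ephemeral():
    """Bind a listening socket to port 0; the OS picks a free port."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind(("127.0.0.1", 0))
    s.listen(1)
    return s


def try_bind(port):
    """Return True if `port` can be bound on localhost, False if it is in use."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.bind(("127.0.0.1", port))
        return True
    except OSError as e:
        if e.errno == errno.EADDRINUSE:
            return False
        raise
    finally:
        s.close()


if __name__ == "__main__":
    first = bind_ephemeral()              # stands in for the early starter
    port = first.getsockname()[1]         # whatever port the OS handed out
    print("OS-assigned port:", port)
    print("second service can bind:", try_bind(port))
    first.close()
```

If the OS-assigned port happens to equal the fixed port a later service expects, that service cannot bind, which matches the symptom described in the title.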
      

      Build:
      http://sandbox.jenkins.cloudera.com/job/impala-umbrella-build-and-test/4465/

      Attachments

        1. logs.tar.gz
          326 kB
          Alexander Behm

            People

              laszlog Laszlo Gaal
              alex.behm Alexander Behm