  1. IMPALA
  2. IMPALA-4088

HDFS data nodes pick HTTP server ports at random, sometimes stealing HBase master's port

    Details

      Description

      Michael, can you take a first look since you've dabbled with HBase startup before? It looks like HBase may not be in a good state. I will not cancel the build so you can log into the machine if necessary. Feel free to cancel the build if you have all the info you need.

      There are two issues here:

      • Figure out why data loading did not succeed.
      • Make sure the build does not hang even when data loading hits errors.
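
One possible way to keep a failing step from hanging the build is to run it under a hard time limit. The following is only a minimal sketch, not the actual Impala build scripts: `run_step`, the exit code 124, and the one-second limit are hypothetical choices made for illustration.

```python
import subprocess
import sys


def run_step(cmd, timeout_s):
    """Run one build step; return its exit code, or 124 if it timed out.

    The name run_step and the 124 convention are assumptions for this sketch.
    """
    try:
        return subprocess.run(cmd, timeout=timeout_s).returncode
    except subprocess.TimeoutExpired:
        # subprocess.run kills the direct child before raising TimeoutExpired.
        return 124


if __name__ == "__main__":
    # Hypothetical stand-in for a data-loading step that never finishes.
    hung_step = [sys.executable, "-c", "import time; time.sleep(30)"]
    print("exit code:", run_step(hung_step, timeout_s=1))
```

Note that only the direct child is killed on timeout; a step that spawns its own long-lived processes would need additional cleanup (for example, a process group), which this sketch does not attempt.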

      This is where it hangs:

      ...
      00:35:24.252 [INFO] BUILD SUCCESS
      00:35:24.253 ------------------------------------------------------------------------
      00:35:24.253 
      00:35:24.260 ========================================================================
      00:35:24.260 Running mvn package
      00:35:24.262 Directory: /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/testdata
      00:35:24.262 ========================================================================
      00:35:32.241 [INFO] /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/testdata/src/main/java/com/cloudera/impala/datagenerator/HBaseTestDataRegionAssigment.java: Some input files use or override a deprecated API.
      00:35:32.242 [INFO] /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/testdata/src/main/java/com/cloudera/impala/datagenerator/HBaseTestDataRegionAssigment.java: Recompile with -Xlint:deprecation for details.
      00:35:32.242 [INFO] BUILD SUCCESS
      00:35:32.242 ------------------------------------------------------------------------
      00:35:32.243 
      00:35:32.249 /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/testdata/target
      00:35:34.029 SUCCESS, data generated into /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/testdata/target
      00:45:44.231 Loading HDFS data from snapshot: /data/jenkins/workspace/impala-umbrella-build-and-test/testdata/test-warehouse-SNAPSHOT/test-warehouse-cdh5-59-SNAPSHOT.tar.gz (logging to load-test-warehouse-snapshot.log)... OK
      00:46:24.762 Starting Impala cluster (logging to start-impala-cluster.log)... OK
      00:46:44.765 Setting up HDFS environment (logging to setup-hdfs-env.log)... OK
      00:46:44.765 Skipped loading the metadata.
      <does not proceed beyond here>
      

      From the Hive logs (this exception repeats very often):

      org.apache.hadoop.hbase.client.NoServerForRegionException: Unable to find region for 126 in functional_hbase.alltypes after 35 tries.
      	at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1329)
      	at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1199)
      	at org.apache.hadoop.hbase.client.AsyncProcess.submit(AsyncProcess.java:395)
      	at org.apache.hadoop.hbase.client.AsyncProcess.submit(AsyncProcess.java:344)
      	at org.apache.hadoop.hbase.client.BufferedMutatorImpl.backgroundFlushCommits(BufferedMutatorImpl.java:238)
      	at org.apache.hadoop.hbase.client.BufferedMutatorImpl.mutate(BufferedMutatorImpl.java:146)
      	at org.apache.hadoop.hbase.client.BufferedMutatorImpl.mutate(BufferedMutatorImpl.java:113)
      	at org.apache.hadoop.hbase.client.HTable.put(HTable.java:1084)
      	at org.apache.hadoop.hive.hbase.HiveHBaseTableOutputFormat$MyRecordWriter.write(HiveHBaseTableOutputFormat.java:146)
      	at org.apache.hadoop.hive.hbase.HiveHBaseTableOutputFormat$MyRecordWriter.write(HiveHBaseTableOutputFormat.java:117)
      	at org.apache.hadoop.hive.ql.io.HivePassThroughRecordWriter.write(HivePassThroughRecordWriter.java:40)
      	at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:697)
      	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
      	at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
      	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
      	at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
      	at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
      	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497)
      	at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170)
      	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
      	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
      	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
      	at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:270)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      	at java.lang.Thread.run(Thread.java:745)
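
Since the issue title attributes the failure to another process taking the HBase master's port, a quick check before starting HBase is whether that port can still be bound. This is a rough sketch under assumptions: `port_is_free` is a hypothetical helper, and the port to test must come from the local minicluster configuration (no port number is given in this report).

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can currently bind host:port.

    Hypothetical helper for illustration. The result is only a snapshot:
    another process may still take the port right after this returns.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
        except OSError:
            # Typically "address already in use" when a listener holds the port.
            return False
        return True
```

Because the check is inherently racy, it is better suited to diagnosing a failed start than to preventing the collision; pinning the conflicting service to a fixed port avoids the race altogether.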
      

      Build:
      http://sandbox.jenkins.cloudera.com/job/impala-umbrella-build-and-test/4465/

        Attachments

        1. logs.tar.gz
          326 kB
          Alexander Behm

              People

              • Assignee:
                laszlog Laszlo Gaal
                Reporter:
                alex.behm Alexander Behm