HBase / HBASE-28187

NPE when flushing a non-existing column family

Details

    Description

      Flushing a column family that does not exist in the table causes a NullPointerException (NPE), reported as an error in both the shell and the HMaster logs.

      Reproduce

      Start an HBase 2.5.9 cluster, then run the following commands in the hbase shell on the HMaster node. The NPE reproduces deterministically.

      create 'table', {NAME => 'cf1', VERSIONS => 2, COMPRESSION => 'GZ', BLOOMFILTER => 'ROWCOL'}, {NAME => 'cf2', VERSIONS => 4, COMPRESSION => 'NONE', BLOOMFILTER => 'ROWCOL'}
      incr 'table', 'row1', 'cf1:cell', 2
      flush 'table', 'cf3'

      The shell outputs:

      hbase:006:0> create 'table', {NAME => 'cf1', VERSIONS => 2, COMPRESSION => 'GZ', BLOOMFILTER => 'ROWCOL'}, {NAME => 'cf2', VERSIONS => 4, COMPRESSION => 'NONE', BLOOMFILTER => 'ROWCOL'}
      Created table table
      Took 2.1238 seconds                                                                                                                                 
      => Hbase::Table - table
      hbase:007:0> 
      hbase:008:0> incr 'table', 'row1', 'cf1:cell', 2
      COUNTER VALUE = 2
      Took 0.0131 seconds                                                                                                                                 
      hbase:009:0> 
      hbase:010:0> flush 'table', 'cf3'
      ERROR: java.io.IOException: java.lang.NullPointerException
       at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:479)
       at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124)
       at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:102)
       at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:82)
      Caused by: org.apache.hadoop.hbase.errorhandling.ForeignException$ProxyThrowable: java.lang.NullPointerException
       at org.apache.hadoop.hbase.procedure.flush.RegionServerFlushTableProcedureManager$FlushTableSubprocedurePool.waitForOutstandingTasks(RegionServerFlushTableProcedureManager.java:274)
       at org.apache.hadoop.hbase.procedure.flush.FlushTableSubprocedure.flushRegions(FlushTableSubprocedure.java:115)
       at org.apache.hadoop.hbase.procedure.flush.FlushTableSubprocedure.acquireBarrier(FlushTableSubprocedure.java:126)
       at org.apache.hadoop.hbase.procedure.Subprocedure.call(Subprocedure.java:160)
       at org.apache.hadoop.hbase.procedure.Subprocedure.call(Subprocedure.java:46)
       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
       at java.lang.Thread.run(Thread.java:750)
      For usage try 'help "flush"'
      Took 12.1713 seconds                                                         

       

      According to the help text of the flush command (flush.rb), a user can flush a specific column family:

      Flush all regions in passed table or pass a region row to
      flush an individual region or a region server name whose format
      is 'host,port,startcode', to flush all its regions.
      You can also flush a single column family for all regions within a table,
      or for an specific region only.
      For example:
        hbase> flush 'TABLENAME'
        hbase> flush 'TABLENAME','FAMILYNAME' 

      In the case above, cf3 is invalid input (a non-existent column family). If a user tries to flush it, the expected behavior is:

      1. HBase rejects the operation.
      2. HBase returns a message saying the column family doesn't exist: "ERROR: Unknown CF...".

       

      In 2.6.0, the flush command gets stuck and then runs into an NPE:

      java.lang.NullPointerException: null
              at org.apache.hadoop.hbase.regionserver.HRegion.logFatLineOnFlush(HRegion.java:2724) ~[hbase-server-2.6.0.jar:2.6.0]
              at org.apache.hadoop.hbase.regionserver.HRegion.internalPrepareFlushCache(HRegion.java:2640) ~[hbase-server-2.6.0.jar:2.6.0]
              at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2587) ~[hbase-server-2.6.0.jar:2.6.0] 
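
      The trace above is consistent with a lookup by family name returning null and the result being dereferenced without a check. The following is only a minimal sketch of that failure pattern, not HBase code: it assumes (hypothetically) that per-family stores live in a map keyed by family name, and Store, storesFor and describeUnchecked are made-up names for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: models the failure pattern, not the actual HRegion internals.
class MissingStoreNpeSketch {

    // Hypothetical stand-in for a per-family store.
    static final class Store {
        final String family;

        Store(String family) {
            this.family = family;
        }

        String describe() {
            return "store for " + family;
        }
    }

    // Build one store per column family defined on the table.
    static Map<String, Store> storesFor(String... families) {
        Map<String, Store> stores = new HashMap<>();
        for (String family : families) {
            stores.put(family, new Store(family));
        }
        return stores;
    }

    // Unchecked lookup: Map.get returns null for an unknown family,
    // so the dereference below throws NullPointerException.
    static String describeUnchecked(Map<String, Store> stores, String family) {
        Store store = stores.get(family);
        return store.describe();
    }

    public static void main(String[] args) {
        Map<String, Store> stores = storesFor("cf1", "cf2");
        System.out.println(describeUnchecked(stores, "cf1"));
        try {
            describeUnchecked(stores, "cf3");
        } catch (NullPointerException e) {
            System.out.println("NPE for unknown family cf3");
        }
    }
}
```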

      Root Cause

      The flush path is missing a check that the target column family exists in the table.
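
      The kind of check described above could look roughly like the sketch below. It is an illustration under stated assumptions, not the actual patch: the table's families are modeled as a plain Set<String>, and FlushFamilyCheckSketch, UnknownFamilyException, checkFamilyExists and flush are hypothetical names. The point is only that the request is rejected with a clear message before any flush work starts.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Sketch only: validates the family name before "flushing".
class FlushFamilyCheckSketch {

    // Hypothetical exception type used to report an unknown column family.
    static final class UnknownFamilyException extends RuntimeException {
        UnknownFamilyException(String message) {
            super(message);
        }
    }

    // Reject the request up front if the family is not defined on the table.
    static void checkFamilyExists(String table, Set<String> tableFamilies, String family) {
        if (!tableFamilies.contains(family)) {
            throw new UnknownFamilyException(
                "Unknown CF '" + family + "' in table '" + table + "'");
        }
    }

    // Stand-in for the flush entry point: validate first, then do the work.
    static String flush(String table, Set<String> tableFamilies, String family) {
        checkFamilyExists(table, tableFamilies, family);
        return "flushed " + table + ":" + family;
    }

    public static void main(String[] args) {
        Set<String> families = new LinkedHashSet<>(Arrays.asList("cf1", "cf2"));
        System.out.println(flush("table", families, "cf1"));
        try {
            flush("table", families, "cf3");
        } catch (UnknownFamilyException e) {
            System.out.println("ERROR: " + e.getMessage());
        }
    }
}
```

      With the real HBase client, the membership test would presumably be done against the table descriptor (for example via TableDescriptor.hasColumnFamily) rather than a Set<String>; where exactly the check belongs (shell, master, or region server) is left open here.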

          People

            guluo guluo
            kehan5800 Ke Han