  1. HBase
  2. HBASE-20895

NPE in RpcServer#readAndProcess

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.3.2
    • Fix Version/s: 1.3.3, 1.2.7, 1.4.7
    • Component/s: rpc
    • Labels:
      None

      Description

      2018-07-10 16:25:55,005 DEBUG [.sfdc.net,port=60020] ipc.RpcServer - RpcServer.listener,port=60020: Caught exception while reading:
      java.lang.NullPointerException
              at org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1761)
              at org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:949)
              at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:730)
              at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:706)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
              at java.lang.Thread.run(Thread.java:745)
      

      This looks like it could be a use-after-close problem if there is concurrent access to a Connection.

      In process() we might store a null back to the 'data' field.

      Meanwhile, readAndProcess() may be blocked on a channel read. When the read returns, it goes on to use 'data'. If a null has been written back in the meantime, the result is an NPE.

      count = channelRead(channel, data);
      if (count >= 0 && data.remaining() == 0) { // <--- RpcServer.java:1761, the NPE site
        process();
      }

      Whether an NPE happens depends on the timing of the store back to 'data' in another thread relative to the use of 'data' in this thread, and on whether the JVM has optimized away a reload of 'data' (it is not declared volatile).
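
      The suspected sequence can be sketched deterministically. This is an illustration only, not HBase code: `Conn`, `duringRead`, and this simplified `channelRead` are hypothetical stand-ins, and the "other thread" is simulated by a hook that runs in the middle of the read so that the interleaving is repeatable.

```java
import java.nio.ByteBuffer;

public class UseAfterNullSketch {
    /** Hypothetical stand-in for RpcServer.Connection; only the 'data' field matters here. */
    static class Conn {
        ByteBuffer data = ByteBuffer.allocate(4); // not volatile, as noted in the report
        Runnable duringRead = () -> {};           // hook that simulates another thread

        // Simulated channel read: "fills" the buffer, then lets the other thread run.
        int channelRead(ByteBuffer buf) {
            int n = buf.remaining();
            buf.position(buf.limit());
            duringRead.run();
            return n;
        }

        // Stores a null back to 'data', as the report says process() might.
        void process() {
            data = null;
        }

        int readAndProcess() {
            int count = channelRead(data);
            // 'data' is read from the field again here; if it is now null, this throws.
            if (count >= 0 && data.remaining() == 0) {
                process();
            }
            return count;
        }
    }

    /** Returns true if the simulated interleaving produces a NullPointerException. */
    static boolean reproduces() {
        Conn c = new Conn();
        c.duringRead = c::process; // the "other thread" nulls 'data' during the read
        try {
            c.readAndProcess();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("NPE reproduced: " + reproduces());
    }
}
```

      In a real two-thread run the outcome is timing dependent, which is why the sketch forces the ordering instead of starting threads.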

      We should do a null check here just to be defensive. We should also look at whether concurrent access to the Connection is happening and whether it is intended.

      The above is just a theory. We should also look at other execution sequences that could lead to 'data' being null at this location. At a glance I didn't find one, but the store to 'data' happens behind conditionals, so it is possible.
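
      A minimal sketch of what such a defensive check might look like follows. It is not the attached patch: `Conn`, `duringRead`, the simplified `channelRead`, and the choice to return -1 when 'data' is found null are all assumptions made for illustration. The idea is to read the field once into a local after the channel read and test that local before dereferencing it.

```java
import java.nio.ByteBuffer;

public class GuardedReadSketch {
    /** Hypothetical stand-in for RpcServer.Connection. */
    static class Conn {
        ByteBuffer data = ByteBuffer.allocate(4);
        Runnable duringRead = () -> {}; // hook that simulates another thread
        boolean processed = false;

        // Simulated channel read: "fills" the buffer, then lets the other thread run.
        int channelRead(ByteBuffer buf) {
            int n = buf.remaining();
            buf.position(buf.limit());
            duringRead.run();
            return n;
        }

        void process() {
            processed = true;
            data = null; // stores a null back to 'data'
        }

        int readAndProcess() {
            ByteBuffer before = data;
            if (before == null) {
                return -1; // assumption: treat a missing buffer as a closed connection
            }
            int count = channelRead(before);
            ByteBuffer after = data; // single re-read of the field into a local
            if (after == null) {
                return -1; // someone else nulled 'data' during the read
            }
            if (count >= 0 && after.remaining() == 0) {
                process();
            }
            return count;
        }
    }

    /** Runs the guarded path with 'data' nulled during the read. */
    static int guardedWithRace() {
        Conn c = new Conn();
        c.duringRead = c::process;
        return c.readAndProcess();
    }

    /** Runs the guarded path with no interference; returns the byte count read. */
    static int guardedNormal() {
        Conn c = new Conn();
        int count = c.readAndProcess();
        return c.processed ? count : -2; // -2 would mean process() was never reached
    }

    public static void main(String[] args) {
        System.out.println("with race: " + guardedWithRace());
        System.out.println("normal: " + guardedNormal());
    }
}
```

      The check only avoids the NPE. It does not make concurrent access to the Connection safe, so the question of whether that access is happening and intended still stands.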

        Attachments

        1. HBASE-20895-branch-1.patch
          2 kB
          Andrew Kyle Purtell
        2. HBASE-20895-branch-1.patch
          1 kB
          Andrew Kyle Purtell

              People

              • Assignee:
                apurtell Andrew Kyle Purtell
                Reporter:
                apurtell Andrew Kyle Purtell
              • Votes:
                0
                Watchers:
                5
