  1. Apache Ozone
  2. HDDS-505 OzoneManager HA
  3. HDDS-1861

Fix TableCacheImpl cleanup logic

    Details

      Description

      Currently, cleanup iterates over epochEntries and removes the entries from both the cache and the epochEntries set.

       

      epochEntries is a TreeSet<>, which is not a concurrent data structure in Java. We may occasionally see failures when cleanup removes entries while another thread adds entries to the cache, so a concurrent set should be used here instead.
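
      A minimal sketch of the proposed direction, assuming a ConcurrentSkipListSet as the concurrent sorted set. EpochCacheSketch, EpochEntry, CacheValue and concurrentDemo are hypothetical names for illustration only; this is not the actual TableCacheImpl code or the committed fix.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.NavigableSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentSkipListSet;

// Hypothetical, simplified stand-in for TableCacheImpl (not the Ozone source).
class EpochCacheSketch {

  // Pairs a cache key with the epoch at which it was added; ordered by epoch, then key.
  static final class EpochEntry implements Comparable<EpochEntry> {
    final long epoch;
    final String key;

    EpochEntry(long epoch, String key) {
      this.epoch = epoch;
      this.key = key;
    }

    @Override
    public int compareTo(EpochEntry other) {
      int byEpoch = Long.compare(epoch, other.epoch);
      return byEpoch != 0 ? byEpoch : key.compareTo(other.key);
    }
  }

  // Cached value tagged with the epoch of the write that produced it.
  static final class CacheValue {
    final long epoch;
    final String value;

    CacheValue(long epoch, String value) {
      this.epoch = epoch;
      this.value = value;
    }
  }

  private final Map<String, CacheValue> cache = new ConcurrentHashMap<>();

  // Assumption: a concurrent sorted set replaces the plain TreeSet.
  private final NavigableSet<EpochEntry> epochEntries = new ConcurrentSkipListSet<>();

  void put(String key, String value, long epoch) {
    cache.put(key, new CacheValue(epoch, value));
    epochEntries.add(new EpochEntry(epoch, key));
  }

  // Removes every tracked entry whose epoch is <= the given epoch.
  void cleanup(long epoch) {
    Iterator<EpochEntry> it = epochEntries.iterator();
    while (it.hasNext()) {
      EpochEntry entry = it.next();
      if (entry.epoch > epoch) {
        break; // entries are sorted by epoch, so the rest are newer
      }
      // Evict the cached value only if it was not overwritten by a newer epoch.
      cache.computeIfPresent(entry.key, (k, v) -> v.epoch <= epoch ? null : v);
      it.remove();
    }
  }

  int size() {
    return cache.size();
  }

  int trackedEntries() {
    return epochEntries.size();
  }

  // One thread adds entries while the caller keeps cleaning up, as in the report.
  static int concurrentDemo(int n) {
    EpochCacheSketch c = new EpochCacheSketch();
    Thread writer = new Thread(() -> {
      for (int i = 1; i <= n; i++) {
        c.put("key" + i, "value" + i, i);
      }
    });
    writer.start();
    while (writer.isAlive()) {
      c.cleanup(n);
    }
    try {
      writer.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    c.cleanup(n); // final pass once all writes have finished
    return c.size() + c.trackedEntries();
  }

  public static void main(String[] args) {
    System.out.println("leftover entries: " + concurrentDemo(10_000));
  }
}
```

      ConcurrentSkipListSet keeps the sorted-by-epoch iteration that cleanup relies on, and its iterators are weakly consistent, so a concurrent add during cleanup should not corrupt the set the way an unsynchronized TreeSet can.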

       

      During cluster testing, the following exception was seen intermittently:
       

      019-07-25 15:28:41,087 WARN org.apache.hadoop.ipc.Server: IPC Server handler 5 on 9862, call Call#8974 Retry#0 org.apache.hadoop.ozone.om.protocol.OzoneManagerProtocol.submitRequest from 10.65.15.233:35222
      java.lang.NullPointerException
      at java.util.TreeMap.fixAfterInsertion(TreeMap.java:2295)
      at java.util.TreeMap.put(TreeMap.java:582)
      at java.util.TreeSet.add(TreeSet.java:255)
      at org.apache.hadoop.utils.db.cache.TableCacheImpl.put(TableCacheImpl.java:75)
      at org.apache.hadoop.utils.db.TypedTable.addCacheEntry(TypedTable.java:218)
      at org.apache.hadoop.ozone.om.request.key.OMKeyRequest.prepareCreateKeyResponse(OMKeyRequest.java:292)
      at org.apache.hadoop.ozone.om.request.key.OMKeyCreateRequest.validateAndUpdateCache(OMKeyCreateRequest.java:188)
      at org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:134)
      at org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java)
      at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
      at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
      at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
      at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
      at java.security.AccessController.doPrivileged(Native Method)
      at javax.security.auth.Subject.doAs(Subject.java:422)
      at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
      at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)

              People

              • Assignee:
                bharat Bharat Viswanadham
                Reporter:
                bharat Bharat Viswanadham
              • Votes:
                0
                Watchers:
                2

                  Time Tracking

                  Estimated:
                  Not Specified
                  Remaining:
                  0h
                  Logged:
                  0.5h