  1. Hadoop Distributed Data Store
  2. HDDS-850

ReadStateMachineData hits OverlappingFileLockException in ContainerStateMachine


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.4.0
    • Fix Version/s: 0.4.0
    • Component/s: Ozone Datanode
    • Labels: None

      Description

      2018-11-16 09:54:41,599 ERROR org.apache.ratis.server.impl.LogAppender: GrpcLogAppender(0813f1a9-61be-4cab-aa05-d5640f4a8341 -> c6ad906f-7e71-4bac-bde3-d22bc1aa8c7d) hit IOException while loading raft log
      org.apache.ratis.server.storage.RaftLogIOException: 0813f1a9-61be-4cab-aa05-d5640f4a8341: Failed readStateMachineData for (t:2, i:1), STATEMACHINELOGENTRY, client-7D19FB803B1E, cid=0
              at org.apache.ratis.server.storage.RaftLog$EntryWithData.getEntry(RaftLog.java:370)
              at org.apache.ratis.server.impl.LogAppender$LogEntryBuffer.getAppendRequest(LogAppender.java:167)
              at org.apache.ratis.server.impl.LogAppender.createRequest(LogAppender.java:216)
              at org.apache.ratis.grpc.server.GrpcLogAppender.appendLog(GrpcLogAppender.java:152)
              at org.apache.ratis.grpc.server.GrpcLogAppender.runAppenderImpl(GrpcLogAppender.java:96)
              at org.apache.ratis.server.impl.LogAppender.runAppender(LogAppender.java:100)
              at java.lang.Thread.run(Thread.java:745)
      Caused by: java.nio.channels.OverlappingFileLockException
              at sun.nio.ch.SharedFileLockTable.checkList(FileLockTable.java:255)
              at sun.nio.ch.SharedFileLockTable.add(FileLockTable.java:152)
              at sun.nio.ch.AsynchronousFileChannelImpl.addToFileLockTable(AsynchronousFileChannelImpl.java:178)
              at sun.nio.ch.SimpleAsynchronousFileChannelImpl.implLock(SimpleAsynchronousFileChannelImpl.java:185)
              at sun.nio.ch.AsynchronousFileChannelImpl.lock(AsynchronousFileChannelImpl.java:118)
              at org.apache.hadoop.ozone.container.keyvalue.helpers.ChunkUtils.readData(ChunkUtils.java:178)
              at org.apache.hadoop.ozone.container.keyvalue.impl.ChunkManagerImpl.readChunk(ChunkManagerImpl.java:197)
              at org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.handleReadChunk(KeyValueHandler.java:542)
              at org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.handle(KeyValueHandler.java:174)
              at org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatch(HddsDispatcher.java:178)
              at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.dispatchCommand(ContainerStateMachine.java:290)
              at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.readStateMachineData(ContainerStateMachine.java:404)
              at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.lambda$readStateMachineData$6(ContainerStateMachine.java:462)
              at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
              ... 1 more

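      This happens on the Ratis leader when it receives a readStateMachineData request for a log entry whose stateMachineData is not in Ratis's cached segments and whose writeStateMachineData has not completed yet. The read then goes to the chunk file on disk, presumably while the in-flight write still holds a lock on it, and the lock attempt in ChunkUtils.readData fails with OverlappingFileLockException. The proposed approach is to cache the stateMachineData inside ContainerStateMachine instead of caching it inside Ratis.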
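
      The proposed approach could look roughly like the sketch below. It is only an illustration under stated assumptions, not the code from the attached patches: the class StateMachineDataCache and its write and read methods are hypothetical names, and the disk operations are passed in as plain callbacks rather than going through the real chunk manager. The idea is that the payload is kept in memory, keyed by log index, for as long as the disk write is in flight, so a concurrent read never has to open or lock the chunk file.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Supplier;

/**
 * Hypothetical helper (not part of Ozone or Ratis): holds stateMachineData
 * for log entries whose disk write has not finished yet.
 */
public class StateMachineDataCache {
  // Payloads of in-flight writes, keyed by Raft log index.
  private final ConcurrentMap<Long, ByteBuffer> pending =
      new ConcurrentHashMap<>();

  /**
   * Caches the payload before the asynchronous disk write starts and
   * evicts it once the write has completed (normally or exceptionally).
   */
  public CompletableFuture<Void> write(long logIndex, ByteBuffer data,
      Runnable diskWrite) {
    pending.put(logIndex, data.asReadOnlyBuffer());
    return CompletableFuture.runAsync(diskWrite)
        .whenComplete((ignored, error) -> pending.remove(logIndex));
  }

  /**
   * Serves the payload from memory while the write is in flight; only
   * falls back to the disk read when no write is pending for the index.
   */
  public ByteBuffer read(long logIndex, Supplier<ByteBuffer> diskRead) {
    ByteBuffer cached = pending.get(logIndex);
    return cached != null ? cached.duplicate() : diskRead.get();
  }
}
```

      Because the entry is put into the map before the write is submitted and removed only after it completes, a read that misses the map can safely go to disk: the write for that index is either finished or was never started. An actual fix would also need to bound the memory used by pending entries, which this sketch does not attempt.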

        Attachments

        1. HDDS-850.000.patch
          11 kB
          Shashikant Banerjee
        2. HDDS-850.001.patch
          16 kB
          Shashikant Banerjee
        3. HDDS-850.002.patch
          16 kB
          Shashikant Banerjee
        4. HDDS-850.003.patch
          28 kB
          Shashikant Banerjee
        5. HDDS-850.004.patch
          28 kB
          Shashikant Banerjee

            People

            • Assignee: Shashikant Banerjee
            • Reporter: Shashikant Banerjee
            • Votes: 0
            • Watchers: 4
