Uploaded image for project: 'Hadoop Distributed Data Store'
  1. Hadoop Distributed Data Store
  2. HDDS-1753

Datanode unable to find chunk while replication data using ratis.

    XMLWordPrintableJSON

    Details

    • Target Version/s:

      Description

      Leader datanode is unable to read chunk from the datanode while replicating data from leader to follower.
      Please note that deletion of keys is also happening while the data is being replicated.

      2019-07-02 19:39:22,604 INFO  impl.RaftServerImpl (RaftServerImpl.java:checkInconsistentAppendEntries(972)) - 5ac88709-a3a2-4c8f-91de-5e54b617f05e: inconsistency entries. Reply:76a3eb0f-d7cd-477b-8973-db1
      014feb398<-5ac88709-a3a2-4c8f-91de-5e54b617f05e#70:FAIL,INCONSISTENCY,nextIndex:9771,term:2,followerCommit:9782
      2019-07-02 19:39:22,605 ERROR impl.ChunkManagerImpl (ChunkUtils.java:readData(161)) - Unable to find the chunk file. chunk info : ChunkInfo{chunkName='76ec669ae2cb6e10dd9f08c0789c5fdf_stream_a2850dce-def3
      -4d64-93d8-fa2ebafee933_chunk_1, offset=0, len=2048}
      2019-07-02 19:39:22,605 INFO  impl.RaftServerImpl (RaftServerImpl.java:checkInconsistentAppendEntries(990)) - 5ac88709-a3a2-4c8f-91de-5e54b617f05e: Failed appendEntries as latest snapshot (9770) already h
      as the append entries (first index: 1)
      2019-07-02 19:39:22,605 INFO  impl.RaftServerImpl (RaftServerImpl.java:checkInconsistentAppendEntries(972)) - 5ac88709-a3a2-4c8f-91de-5e54b617f05e: inconsistency entries. Reply:76a3eb0f-d7cd-477b-8973-db1
      014feb398<-5ac88709-a3a2-4c8f-91de-5e54b617f05e#71:FAIL,INCONSISTENCY,nextIndex:9771,term:2,followerCommit:9782
      2019-07-02 19:39:22,605 INFO  keyvalue.KeyValueHandler (ContainerUtils.java:logAndReturnError(146)) - Operation: ReadChunk : Trace ID: 4216d461a4679e17:4216d461a4679e17:0:0 : Message: Unable to find the c
      hunk file. chunk info ChunkInfo{chunkName='76ec669ae2cb6e10dd9f08c0789c5fdf_stream_a2850dce-def3-4d64-93d8-fa2ebafee933_chunk_1, offset=0, len=2048} : Result: UNABLE_TO_FIND_CHUNK
      2019-07-02 19:39:22,605 INFO  impl.RaftServerImpl (RaftServerImpl.java:checkInconsistentAppendEntries(990)) - 5ac88709-a3a2-4c8f-91de-5e54b617f05e: Failed appendEntries as latest snapshot (9770) already h
      as the append entries (first index: 2)
      2019-07-02 19:39:22,606 INFO  impl.RaftServerImpl (RaftServerImpl.java:checkInconsistentAppendEntries(972)) - 5ac88709-a3a2-4c8f-91de-5e54b617f05e: inconsistency entries. Reply:76a3eb0f-d7cd-477b-8973-db1
      014feb398<-5ac88709-a3a2-4c8f-91de-5e54b617f05e#72:FAIL,INCONSISTENCY,nextIndex:9771,term:2,followerCommit:9782
      19:39:22.606 [pool-195-thread-19] ERROR DNAudit - user=null | ip=null | op=READ_CHUNK {blockData=conID: 3 locID: 102372189549953034 bcsId: 0} | ret=FAILURE
      java.lang.Exception: Unable to find the chunk file. chunk info ChunkInfo{chunkName='76ec669ae2cb6e10dd9f08c0789c5fdf_stream_a2850dce-def3-4d64-93d8-fa2ebafee933_chunk_1, offset=0, len=2048}
              at org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatchRequest(HddsDispatcher.java:320) ~[hadoop-hdds-container-service-0.5.0-SNAPSHOT.jar:?]
              at org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatch(HddsDispatcher.java:148) ~[hadoop-hdds-container-service-0.5.0-SNAPSHOT.jar:?]
              at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.dispatchCommand(ContainerStateMachine.java:346) ~[hadoop-hdds-container-service-0.5.0-SNAPSHOT.jar:?]
              at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.readStateMachineData(ContainerStateMachine.java:476) ~[hadoop-hdds-container-service-0.5.0-SNAPSHOT.jar:?]
              at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.lambda$getCachedStateMachineData$2(ContainerStateMachine.java:495) ~[hadoop-hdds-container-service-0.5.0-SN
      APSHOT.jar:?]
              at com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4767) ~[guava-11.0.2.jar:?]
              at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3568) ~[guava-11.0.2.jar:?]
              at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2350) ~[guava-11.0.2.jar:?]
              at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313) ~[guava-11.0.2.jar:?]
              at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228) ~[guava-11.0.2.jar:?]
              at com.google.common.cache.LocalCache.get(LocalCache.java:3965) ~[guava-11.0.2.jar:?]
              at com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4764) ~[guava-11.0.2.jar:?]
              at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.getCachedStateMachineData(ContainerStateMachine.java:494) ~[hadoop-hdds-container-service-0.5.0-SNAPSHOT.ja
      r:?]
              at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.lambda$readStateMachineData$4(ContainerStateMachine.java:542) ~[hadoop-hdds-container-service-0.5.0-SNAPSHOT.jar:?]
              at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590) [?:1.8.0_171]
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_171]
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_171]
      

        Attachments

        1. HDDS-1753.000.patch
          33 kB
          Shashikant Banerjee

          Issue Links

            Activity

              People

              • Assignee:
                shashikant Shashikant Banerjee
                Reporter:
                msingh Mukul Kumar Singh
              • Votes:
                0 Vote for this issue
                Watchers:
                4 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved:

                  Time Tracking

                  Estimated:
                  Original Estimate - Not Specified
                  Not Specified
                  Remaining:
                  Remaining Estimate - 0h
                  0h
                  Logged:
                  Time Spent - 5.5h
                  5.5h