Uploaded image for project: 'Apache Ozone'
  1. Apache Ozone
  2. HDDS-1753

Datanode unable to find chunk while replication data using ratis.

Attach filesAttach ScreenshotVotersWatch issueWatchersCreate sub-taskLinkCloneUpdate Comment AuthorReplace String in CommentUpdate Comment VisibilityDelete Comments
    XMLWordPrintableJSON

    Details

    • Target Version/s:

      Description

      Leader datanode is unable to read chunk from the datanode while replicating data from leader to follower.
      Please note that deletion of keys is also happening while the data is being replicated.

      2019-07-02 19:39:22,604 INFO  impl.RaftServerImpl (RaftServerImpl.java:checkInconsistentAppendEntries(972)) - 5ac88709-a3a2-4c8f-91de-5e54b617f05e: inconsistency entries. Reply:76a3eb0f-d7cd-477b-8973-db1
      014feb398<-5ac88709-a3a2-4c8f-91de-5e54b617f05e#70:FAIL,INCONSISTENCY,nextIndex:9771,term:2,followerCommit:9782
      2019-07-02 19:39:22,605 ERROR impl.ChunkManagerImpl (ChunkUtils.java:readData(161)) - Unable to find the chunk file. chunk info : ChunkInfo{chunkName='76ec669ae2cb6e10dd9f08c0789c5fdf_stream_a2850dce-def3
      -4d64-93d8-fa2ebafee933_chunk_1, offset=0, len=2048}
      2019-07-02 19:39:22,605 INFO  impl.RaftServerImpl (RaftServerImpl.java:checkInconsistentAppendEntries(990)) - 5ac88709-a3a2-4c8f-91de-5e54b617f05e: Failed appendEntries as latest snapshot (9770) already h
      as the append entries (first index: 1)
      2019-07-02 19:39:22,605 INFO  impl.RaftServerImpl (RaftServerImpl.java:checkInconsistentAppendEntries(972)) - 5ac88709-a3a2-4c8f-91de-5e54b617f05e: inconsistency entries. Reply:76a3eb0f-d7cd-477b-8973-db1
      014feb398<-5ac88709-a3a2-4c8f-91de-5e54b617f05e#71:FAIL,INCONSISTENCY,nextIndex:9771,term:2,followerCommit:9782
      2019-07-02 19:39:22,605 INFO  keyvalue.KeyValueHandler (ContainerUtils.java:logAndReturnError(146)) - Operation: ReadChunk : Trace ID: 4216d461a4679e17:4216d461a4679e17:0:0 : Message: Unable to find the c
      hunk file. chunk info ChunkInfo{chunkName='76ec669ae2cb6e10dd9f08c0789c5fdf_stream_a2850dce-def3-4d64-93d8-fa2ebafee933_chunk_1, offset=0, len=2048} : Result: UNABLE_TO_FIND_CHUNK
      2019-07-02 19:39:22,605 INFO  impl.RaftServerImpl (RaftServerImpl.java:checkInconsistentAppendEntries(990)) - 5ac88709-a3a2-4c8f-91de-5e54b617f05e: Failed appendEntries as latest snapshot (9770) already h
      as the append entries (first index: 2)
      2019-07-02 19:39:22,606 INFO  impl.RaftServerImpl (RaftServerImpl.java:checkInconsistentAppendEntries(972)) - 5ac88709-a3a2-4c8f-91de-5e54b617f05e: inconsistency entries. Reply:76a3eb0f-d7cd-477b-8973-db1
      014feb398<-5ac88709-a3a2-4c8f-91de-5e54b617f05e#72:FAIL,INCONSISTENCY,nextIndex:9771,term:2,followerCommit:9782
      19:39:22.606 [pool-195-thread-19] ERROR DNAudit - user=null | ip=null | op=READ_CHUNK {blockData=conID: 3 locID: 102372189549953034 bcsId: 0} | ret=FAILURE
      java.lang.Exception: Unable to find the chunk file. chunk info ChunkInfo{chunkName='76ec669ae2cb6e10dd9f08c0789c5fdf_stream_a2850dce-def3-4d64-93d8-fa2ebafee933_chunk_1, offset=0, len=2048}
              at org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatchRequest(HddsDispatcher.java:320) ~[hadoop-hdds-container-service-0.5.0-SNAPSHOT.jar:?]
              at org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatch(HddsDispatcher.java:148) ~[hadoop-hdds-container-service-0.5.0-SNAPSHOT.jar:?]
              at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.dispatchCommand(ContainerStateMachine.java:346) ~[hadoop-hdds-container-service-0.5.0-SNAPSHOT.jar:?]
              at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.readStateMachineData(ContainerStateMachine.java:476) ~[hadoop-hdds-container-service-0.5.0-SNAPSHOT.jar:?]
              at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.lambda$getCachedStateMachineData$2(ContainerStateMachine.java:495) ~[hadoop-hdds-container-service-0.5.0-SN
      APSHOT.jar:?]
              at com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4767) ~[guava-11.0.2.jar:?]
              at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3568) ~[guava-11.0.2.jar:?]
              at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2350) ~[guava-11.0.2.jar:?]
              at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313) ~[guava-11.0.2.jar:?]
              at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228) ~[guava-11.0.2.jar:?]
              at com.google.common.cache.LocalCache.get(LocalCache.java:3965) ~[guava-11.0.2.jar:?]
              at com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4764) ~[guava-11.0.2.jar:?]
              at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.getCachedStateMachineData(ContainerStateMachine.java:494) ~[hadoop-hdds-container-service-0.5.0-SNAPSHOT.ja
      r:?]
              at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.lambda$readStateMachineData$4(ContainerStateMachine.java:542) ~[hadoop-hdds-container-service-0.5.0-SNAPSHOT.jar:?]
              at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590) [?:1.8.0_171]
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_171]
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_171]
      

        Attachments

          Activity

            People

            • Assignee:
              shashikant Shashikant Banerjee
              Reporter:
              msingh Mukul Kumar Singh

              Dates

              • Created:
                Updated:
                Resolved:

                Time Tracking

                Estimated:
                Original Estimate - Not Specified
                Not Specified
                Remaining:
                Remaining Estimate - 0h
                0h
                Logged:
                Time Spent - 5.5h
                5.5h

                  Issue deployment