Uploaded image for project: 'Apache Ozone'
  1. Apache Ozone
  2. HDDS-1753

Datanode unable to find chunk while replication data using ratis.

Log workAgile BoardRank to TopRank to BottomAttach filesAttach ScreenshotBulk Copy AttachmentsBulk Move AttachmentsVotersWatch issueWatchersCreate sub-taskConvert to sub-taskLinkCloneLabelsUpdate Comment AuthorReplace String in CommentUpdate Comment VisibilityDelete Comments
    XMLWordPrintableJSON

Details

    Description

      Leader datanode is unable to read chunk from the datanode while replicating data from leader to follower.
      Please note that deletion of keys is also happening while the data is being replicated.

      2019-07-02 19:39:22,604 INFO  impl.RaftServerImpl (RaftServerImpl.java:checkInconsistentAppendEntries(972)) - 5ac88709-a3a2-4c8f-91de-5e54b617f05e: inconsistency entries. Reply:76a3eb0f-d7cd-477b-8973-db1
      014feb398<-5ac88709-a3a2-4c8f-91de-5e54b617f05e#70:FAIL,INCONSISTENCY,nextIndex:9771,term:2,followerCommit:9782
      2019-07-02 19:39:22,605 ERROR impl.ChunkManagerImpl (ChunkUtils.java:readData(161)) - Unable to find the chunk file. chunk info : ChunkInfo{chunkName='76ec669ae2cb6e10dd9f08c0789c5fdf_stream_a2850dce-def3
      -4d64-93d8-fa2ebafee933_chunk_1, offset=0, len=2048}
      2019-07-02 19:39:22,605 INFO  impl.RaftServerImpl (RaftServerImpl.java:checkInconsistentAppendEntries(990)) - 5ac88709-a3a2-4c8f-91de-5e54b617f05e: Failed appendEntries as latest snapshot (9770) already h
      as the append entries (first index: 1)
      2019-07-02 19:39:22,605 INFO  impl.RaftServerImpl (RaftServerImpl.java:checkInconsistentAppendEntries(972)) - 5ac88709-a3a2-4c8f-91de-5e54b617f05e: inconsistency entries. Reply:76a3eb0f-d7cd-477b-8973-db1
      014feb398<-5ac88709-a3a2-4c8f-91de-5e54b617f05e#71:FAIL,INCONSISTENCY,nextIndex:9771,term:2,followerCommit:9782
      2019-07-02 19:39:22,605 INFO  keyvalue.KeyValueHandler (ContainerUtils.java:logAndReturnError(146)) - Operation: ReadChunk : Trace ID: 4216d461a4679e17:4216d461a4679e17:0:0 : Message: Unable to find the c
      hunk file. chunk info ChunkInfo{chunkName='76ec669ae2cb6e10dd9f08c0789c5fdf_stream_a2850dce-def3-4d64-93d8-fa2ebafee933_chunk_1, offset=0, len=2048} : Result: UNABLE_TO_FIND_CHUNK
      2019-07-02 19:39:22,605 INFO  impl.RaftServerImpl (RaftServerImpl.java:checkInconsistentAppendEntries(990)) - 5ac88709-a3a2-4c8f-91de-5e54b617f05e: Failed appendEntries as latest snapshot (9770) already h
      as the append entries (first index: 2)
      2019-07-02 19:39:22,606 INFO  impl.RaftServerImpl (RaftServerImpl.java:checkInconsistentAppendEntries(972)) - 5ac88709-a3a2-4c8f-91de-5e54b617f05e: inconsistency entries. Reply:76a3eb0f-d7cd-477b-8973-db1
      014feb398<-5ac88709-a3a2-4c8f-91de-5e54b617f05e#72:FAIL,INCONSISTENCY,nextIndex:9771,term:2,followerCommit:9782
      19:39:22.606 [pool-195-thread-19] ERROR DNAudit - user=null | ip=null | op=READ_CHUNK {blockData=conID: 3 locID: 102372189549953034 bcsId: 0} | ret=FAILURE
      java.lang.Exception: Unable to find the chunk file. chunk info ChunkInfo{chunkName='76ec669ae2cb6e10dd9f08c0789c5fdf_stream_a2850dce-def3-4d64-93d8-fa2ebafee933_chunk_1, offset=0, len=2048}
              at org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatchRequest(HddsDispatcher.java:320) ~[hadoop-hdds-container-service-0.5.0-SNAPSHOT.jar:?]
              at org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatch(HddsDispatcher.java:148) ~[hadoop-hdds-container-service-0.5.0-SNAPSHOT.jar:?]
              at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.dispatchCommand(ContainerStateMachine.java:346) ~[hadoop-hdds-container-service-0.5.0-SNAPSHOT.jar:?]
              at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.readStateMachineData(ContainerStateMachine.java:476) ~[hadoop-hdds-container-service-0.5.0-SNAPSHOT.jar:?]
              at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.lambda$getCachedStateMachineData$2(ContainerStateMachine.java:495) ~[hadoop-hdds-container-service-0.5.0-SN
      APSHOT.jar:?]
              at com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4767) ~[guava-11.0.2.jar:?]
              at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3568) ~[guava-11.0.2.jar:?]
              at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2350) ~[guava-11.0.2.jar:?]
              at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313) ~[guava-11.0.2.jar:?]
              at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228) ~[guava-11.0.2.jar:?]
              at com.google.common.cache.LocalCache.get(LocalCache.java:3965) ~[guava-11.0.2.jar:?]
              at com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4764) ~[guava-11.0.2.jar:?]
              at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.getCachedStateMachineData(ContainerStateMachine.java:494) ~[hadoop-hdds-container-service-0.5.0-SNAPSHOT.ja
      r:?]
              at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.lambda$readStateMachineData$4(ContainerStateMachine.java:542) ~[hadoop-hdds-container-service-0.5.0-SNAPSHOT.jar:?]
              at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590) [?:1.8.0_171]
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_171]
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_171]
      

      Attachments

        Activity

          This comment will be Viewable by All Users Viewable by All Users
          Cancel

          People

            shashikant Shashikant Banerjee Assign to me
            msingh Mukul Kumar Singh
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Time Tracking

              Estimated:
              Original Estimate - Not Specified
              Not Specified
              Remaining:
              Remaining Estimate - 0h
              0h
              Logged:
              Time Spent - 5.5h
              5.5h

              Slack

                Issue deployment