  1. Apache Ozone
  2. HDDS-10985

EC Reconstruction failed because the size of currentChunks was not equal to checksumBlockDataChunks

Details

    Description

      EC reconstruction failed with the exception java.lang.IllegalArgumentException: The chunk list has 9 entries, but the checksum chunks has 10 entries. They should be equal in size. The DN hit this problem while reconstructing EC data. I found that it can occur whether the missing replica is a data block or a parity block.

      EC Policy: rs-10-3-2048k
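
      For readers less familiar with the policy string, the following is a minimal sketch of how rs-10-3-2048k can be read as codec, data unit count, parity unit count, and chunk size. The class EcPolicySketch and its methods are hypothetical and written only for illustration; they are not Ozone's own classes. The sketch assumes 1-based replica indexes, as they appear in the log below, with data replicas first and parity replicas after them.

```java
// Hypothetical helper for illustration only; not part of Apache Ozone.
final class EcPolicySketch {
    final String codec;
    final int dataUnits;
    final int parityUnits;
    final long chunkBytes;

    private EcPolicySketch(String codec, int dataUnits, int parityUnits, long chunkBytes) {
        this.codec = codec;
        this.dataUnits = dataUnits;
        this.parityUnits = parityUnits;
        this.chunkBytes = chunkBytes;
    }

    // Parses strings of the form "<codec>-<data>-<parity>-<size>k", e.g. "rs-10-3-2048k".
    static EcPolicySketch parse(String policy) {
        String[] parts = policy.split("-");
        if (parts.length != 4) {
            throw new IllegalArgumentException("Unexpected policy format: " + policy);
        }
        String size = parts[3].toLowerCase();
        if (!size.endsWith("k")) {
            throw new IllegalArgumentException("Unexpected chunk size suffix: " + parts[3]);
        }
        long kib = Long.parseLong(size.substring(0, size.length() - 1));
        return new EcPolicySketch(parts[0], Integer.parseInt(parts[1]),
                Integer.parseInt(parts[2]), kib * 1024L);
    }

    // Assumption: replica indexes are 1-based, data replicas come first.
    boolean isParityIndex(int replicaIndex) {
        return replicaIndex > dataUnits && replicaIndex <= dataUnits + parityUnits;
    }

    public static void main(String[] args) {
        EcPolicySketch p = parse("rs-10-3-2048k");
        System.out.println(p.codec + " data=" + p.dataUnits + " parity=" + p.parityUnits
                + " chunkBytes=" + p.chunkBytes + " index11IsParity=" + p.isParityIndex(11));
    }
}
```

      Under these assumptions, the missing index 11 in the log below is a parity replica, since indexes 1 to 10 are data and 11 to 13 are parity.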

      DN.log: 

      2024-06-06 18:20:17,837 [ContainerReplicationThread-12] WARN org.apache.hadoop.ozone.container.ec.reconstruction.ECReconstructionCoordinatorTask: FAILED reconstructECContainersCommand: containerID=876481, replication=rs-10-3-2048k, missingIndexes=[11], sources={1=5919f690-3871-45d2-b414-004292b3e2d3(10.175.134.153/10.175.134.153), 2=718b671b-66ae-46eb-96fb-71411da7849d(10.175.134.172/10.175.134.172), 3=e0ce60b3-75d5-4d00-bcb9-7781ef61e827(10.175.134.135/10.175.134.135), 4=e9871cb6-44b0-4f39-ac8d-b04122dbd439(10.175.134.201/10.175.134.201), 5=b9319384-2f73-4610-9e03-c6b67bbfab0b(10.175.134.217/10.175.134.217), 6=9a0f6ff9-0772-4a1d-828e-96d3be50778c(10.175.134.199/10.175.134.199), 7=8c0800ad-0026-4fdd-bd6e-6d866e166e49(10.175.137.25/10.175.137.25), 8=24628bc9-5d7b-4310-a21f-9a35e2634fb4(10.175.134.200/10.175.134.200), 9=c23a4a3c-183a-4baf-ada4-e30800faa907(10.175.134.219/10.175.134.219), 10=c02658fa-898a-4406-a778-87653c2723c2(10.175.137.27/10.175.137.27), 12=2a598049-6f33-4f18-a32a-f9d1f2ad399d(10.175.137.43/10.175.137.43), 13=70cfa62e-5a7c-489e-bdf3-5527f9bb1679(10.175.134.203/10.175.134.203)}, targets={11=099a12a7-e276-4ce0-bb3d-d915879ba4d9(10.175.138.92/10.175.138.92)} after 316099 ms
      java.lang.IllegalArgumentException: The chunk list has 9 entries, but the checksum chunks has 10 entries. They should be equal in size.
              at com.google.common.base.Preconditions.checkArgument(Preconditions.java:143)
              at org.apache.hadoop.hdds.scm.storage.ECBlockOutputStream.executePutBlock(ECBlockOutputStream.java:144)
              at org.apache.hadoop.ozone.container.ec.reconstruction.ECReconstructionCoordinator.reconstructECBlockGroup(ECReconstructionCoordinator.java:340)
              at org.apache.hadoop.ozone.container.ec.reconstruction.ECReconstructionCoordinator.reconstructECContainerGroup(ECReconstructionCoordinator.java:180)
              at org.apache.hadoop.ozone.container.ec.reconstruction.ECReconstructionCoordinatorTask.runTask(ECReconstructionCoordinatorTask.java:68)
              at org.apache.hadoop.ozone.container.replication.ReplicationSupervisor$TaskRunner.run(ReplicationSupervisor.java:359)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
              at java.lang.Thread.run(Thread.java:750) 
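
      The stack trace shows the exception coming from a Preconditions.checkArgument call reached through ECBlockOutputStream.executePutBlock. The following is a rough sketch of what such a size check looks like, written in plain Java so it runs without Guava. The class ChunkListCheckSketch and the method checkSameSize are hypothetical names, and the list contents are placeholders; this is not the actual Ozone code, only an illustration of how a 9 versus 10 mismatch produces the message seen above.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical illustration of the failing size check; not the Ozone implementation.
final class ChunkListCheckSketch {

    // Throws IllegalArgumentException when the two chunk lists differ in length,
    // using the same message wording that appears in the DN log.
    static void checkSameSize(List<?> currentChunks, List<?> checksumBlockDataChunks) {
        if (currentChunks.size() != checksumBlockDataChunks.size()) {
            throw new IllegalArgumentException(String.format(
                    "The chunk list has %d entries, but the checksum chunks has %d entries. "
                            + "They should be equal in size.",
                    currentChunks.size(), checksumBlockDataChunks.size()));
        }
    }

    public static void main(String[] args) {
        // Placeholder lists standing in for the reconstructed chunks (9)
        // and the chunks recorded in the checksum block data (10).
        List<String> currentChunks = Collections.nCopies(9, "chunk");
        List<String> checksumBlockDataChunks = Collections.nCopies(10, "chunk");
        try {
            checkSameSize(currentChunks, checksumBlockDataChunks);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

      The sketch only reproduces the symptom. It does not explain why the reconstructed chunk list ends up one entry shorter than the checksum chunk list, which is the open question in this issue.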

       

            People

              Assignee: Unassigned
              Reporter: LiMinyu (srnjl)