  1. Apache Ozone
  2. HDDS-6548 HDDS Stability Fixes and Improvements
  3. HDDS-8141

Exception "Non-force deletion of non-empty container is not allowed" in datanode logs

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.4.0
    • Component/s: None
    • Labels: None

    Description

      This exception has been seen several times in datanode logs:

      2023-02-16 14:57:11,330 ERROR org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler: Received container deletion command for container 54652 but the container is not empty.
      2023-02-16 14:57:11,330 ERROR org.apache.hadoop.ozone.container.common.statemachine.commandhandler.DeleteContainerCommandHandler: Exception occurred while deleting the container.
      org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException: Non-force deletion of non-empty container is not allowed.
      	at org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.deleteInternal(KeyValueHandler.java:1133)
      	at org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.deleteContainer(KeyValueHandler.java:1094)
      	at org.apache.hadoop.ozone.container.ozoneimpl.ContainerController.deleteContainer(ContainerController.java:182)
      	at org.apache.hadoop.ozone.container.common.statemachine.commandhandler.DeleteContainerCommandHandler.lambda$handle$0(DeleteContainerCommandHandler.java:75)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      	at java.lang.Thread.run(Thread.java:750)
      

      This is a defensive code path that checks the block count metadata in RocksDB to determine whether the container is empty. It is not expected to be hit.
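
To make the failure mode concrete, here is a minimal sketch of such a defensive check. It is not the actual `KeyValueHandler` code: the class, method, and exception names are hypothetical, and it assumes the check treats any non-zero block counter as "not empty". Under that assumption, a counter that has drifted negative blocks a non-force delete just as a positive one would.

```java
// Hypothetical sketch; names and logic are illustrative, not Ozone's implementation.
final class NonForceDeleteSketch {

    /** Hypothetical stand-in for the StorageContainerException seen in the log. */
    static final class ContainerNotEmptyException extends RuntimeException {
        ContainerNotEmptyException(String message) {
            super(message);
        }
    }

    /**
     * Assumption: emptiness is decided purely from the block counter stored in
     * the container DB, and only an exact zero counts as empty.
     */
    static boolean looksEmpty(long blockCountFromDb) {
        return blockCountFromDb == 0;
    }

    /** Rejects a non-force delete when the stored counter says the container is not empty. */
    static void deleteContainer(long containerId, long blockCountFromDb, boolean force) {
        if (!force && !looksEmpty(blockCountFromDb)) {
            throw new ContainerNotEmptyException(
                "Non-force deletion of non-empty container is not allowed. containerID="
                    + containerId + " blockCount=" + blockCountFromDb);
        }
        // Actual removal of container data is elided in this sketch.
    }
}
```

With this sketch, `deleteContainer(54652L, -6L, false)` throws on every attempt, while the same call with a counter of `0`, or with `force` set to `true`, returns normally.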

      The last delete block command for this container was logged about 5 minutes before this message. We checked the disk for a few of the affected containers and found no block files present. The logs show SCM retrying the delete and getting the same result every time.

      Later, the container inspector was run on this cluster. It reported only one copy of this container in the whole cluster, with the following metadata:

      {
        "containerID": 54652,
        "schemaVersion": "2",
        "containerState": "CLOSED",
        "currentDatanodeID": "a160b3e2-a450-446d-a75c-898241a1ff7a",
        "originDatanodeID": "a160b3e2-a450-446d-a75c-898241a1ff7a",
        "dBMetadata": {
          "#BLOCKCOUNT": -6,
          "#BYTESUSED": -1431232412,
          "#PENDINGDELETEBLOCKCOUNT": 0,
          "#delTX": 46312,
          "#BCSID": 1548650
        },
        "aggregates": {
          "blockCount": 0,
          "usedBytes": 0,
          "pendingDeleteBlocks": 0,
          "pendingDeleteBytes": 0
        },
        "chunksDirectory": {
          "path": "<disk mount path>/current/containerDir106/54652/chunks",
          "present": true,
          "fileCount": 0
        },
        "dBMetadataDeleteCount_minus_aggregatedDeleteCount": 0,
        "correct": false,
        "errors": [
          {
            "property": "dBMetadata.#BLOCKCOUNT",
            "expected": 0,
            "actual": -6,
            "repaired": false
          },
          {
            "property": "dBMetadata.#BYTESUSED",
            "expected": 0,
            "actual": -1431232412,
            "repaired": false
          }
        ]
      }
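
The `errors` list in the report above can be read as a comparison between the counters stored in the container DB (`dBMetadata`) and the values aggregated from the blocks actually present (`aggregates`). The sketch below reproduces that comparison for the two mismatching properties. It is a hedged illustration only: the class and method names are hypothetical and this is not the container inspector's real code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not the container inspector's actual implementation.
final class MetadataInspectSketch {

    /** One entry of the "errors" list: a stored counter that disagrees with the aggregate. */
    static final class Mismatch {
        final String property;
        final long expected; // value aggregated from the blocks actually found
        final long actual;   // value stored in the container DB metadata

        Mismatch(String property, long expected, long actual) {
            this.property = property;
            this.expected = expected;
            this.actual = actual;
        }

        @Override
        public String toString() {
            return property + ": expected " + expected + ", actual " + actual;
        }
    }

    /** Compares the stored counters with the aggregated values and lists every disagreement. */
    static List<Mismatch> inspect(long dbBlockCount, long dbBytesUsed,
                                  long aggBlockCount, long aggUsedBytes) {
        List<Mismatch> errors = new ArrayList<>();
        if (dbBlockCount != aggBlockCount) {
            errors.add(new Mismatch("dBMetadata.#BLOCKCOUNT", aggBlockCount, dbBlockCount));
        }
        if (dbBytesUsed != aggUsedBytes) {
            errors.add(new Mismatch("dBMetadata.#BYTESUSED", aggUsedBytes, dbBytesUsed));
        }
        return errors;
    }
}
```

Calling `inspect(-6L, -1431232412L, 0L, 0L)` with the values from the report yields two mismatches, one per property, matching the two entries under `errors`.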
      

            People

              Assignee: Sumit Agrawal (sumitagrawl)
              Reporter: Ethan Rose (erose)