Apache Ozone / HDDS-9871

[snapshot] Deadlock in SnapshotCache

Details

    Description

      Snapshot diff requests fail when ozone.om.snapshot.db.max.open.files is set to -1, due to an issue in the double buffer flush.
      Snapshot creation request:

      2023-11-27 00:40:23,345 INFO [OM StateMachine ApplyTransaction Thread - 0]-org.apache.hadoop.ozone.om.request.snapshot.OMSnapshotCreateRequest: Created snapshot: 'snap-ay36z' with snapshotId: 'bf0c6141-4185-4361-b15f-c4aa71c5c6d8' under path 'vol-2xd36/buck-id806'
      

      Double Buffer flush logs:

      2023-11-27 00:10:23,826 INFO [OMDoubleBufferFlushThread]-org.apache.hadoop.hdds.utils.db.RDBCheckpointManager: Created checkpoint in rocksDB at /var/lib/hadoop-ozone/om/data035525/db.snapshots/checkpointState/om.db-b2e9acb3-fee2-4190-8272-0649edca8d93 in 30 milliseconds
      2023-11-27 00:10:23,827 INFO [OMDoubleBufferFlushThread]-org.apache.hadoop.hdds.utils.db.RDBCheckpointUtils: Waited for 1 milliseconds for checkpoint directory /var/lib/hadoop-ozone/om/data035525/db.snapshots/checkpointState/om.db-b2e9acb3-fee2-4190-8272-0649edca8d93 availability.
      2023-11-27 00:10:23,828 INFO [OMDoubleBufferFlushThread]-org.apache.hadoop.ozone.om.OmSnapshotManager: Created checkpoint : /var/lib/hadoop-ozone/om/data035525/db.snapshots/checkpointState/om.db-b2e9acb3-fee2-4190-8272-0649edca8d93 for snapshot snap-mswq9
      2023-11-27 00:10:39,586 INFO [OMDoubleBufferFlushThread]-org.apache.hadoop.hdds.utils.db.RDBCheckpointManager: Created checkpoint in rocksDB at /var/lib/hadoop-ozone/om/data035525/db.snapshots/checkpointState/om.db-3369ac3a-61e1-4eca-b3cf-eb2de0b2d688 in 30 milliseconds
      2023-11-27 00:10:39,586 INFO [OMDoubleBufferFlushThread]-org.apache.hadoop.hdds.utils.db.RDBCheckpointUtils: Waited for 0 milliseconds for checkpoint directory /var/lib/hadoop-ozone/om/data035525/db.snapshots/checkpointState/om.db-3369ac3a-61e1-4eca-b3cf-eb2de0b2d688 availability.
      2023-11-27 00:10:39,587 INFO [OMDoubleBufferFlushThread]-org.apache.hadoop.ozone.om.OmSnapshotManager: Created checkpoint : /var/lib/hadoop-ozone/om/data035525/db.snapshots/checkpointState/om.db-3369ac3a-61e1-4eca-b3cf-eb2de0b2d688 for snapshot snap-f5u3t
      2023-11-27 00:10:55,949 INFO [OMDoubleBufferFlushThread]-org.apache.hadoop.hdds.utils.db.RDBCheckpointManager: Created checkpoint in rocksDB at /var/lib/hadoop-ozone/om/data035525/db.snapshots/checkpointState/om.db-3a690c8f-f3ef-415d-b25c-3aaf763c9507 in 22 milliseconds
      2023-11-27 00:10:55,950 INFO [OMDoubleBufferFlushThread]-org.apache.hadoop.hdds.utils.db.RDBCheckpointUtils: Waited for 1 milliseconds for checkpoint directory /var/lib/hadoop-ozone/om/data035525/db.snapshots/checkpointState/om.db-3a690c8f-f3ef-415d-b25c-3aaf763c9507 availability.
      2023-11-27 00:10:55,950 INFO [OMDoubleBufferFlushThread]-org.apache.hadoop.ozone.om.OmSnapshotManager: Created checkpoint : /var/lib/hadoop-ozone/om/data035525/db.snapshots/checkpointState/om.db-3a690c8f-f3ef-415d-b25c-3aaf763c9507 for snapshot snap-jfktn
      2023-11-29 08:52:24,698 INFO [OMDoubleBufferFlushThread]-org.apache.hadoop.hdds.utils.db.RDBCheckpointManager: Created checkpoint in rocksDB at /var/lib/hadoop-ozone/om/data035525/db.snapshots/checkpointState/om.db-c3ba17ef-d947-454e-9c4f-b9063ae65650 in 15 milliseconds
      2023-11-29 08:52:24,715 INFO [OMDoubleBufferFlushThread]-org.apache.hadoop.hdds.utils.db.RDBCheckpointUtils: Waited for 16 milliseconds for checkpoint directory /var/lib/hadoop-ozone/om/data035525/db.snapshots/checkpointState/om.db-c3ba17ef-d947-454e-9c4f-b9063ae65650 availability.
      2023-11-29 08:52:24,717 WARN [OMDoubleBufferFlushThread]-org.apache.hadoop.ozone.om.OmSnapshotManager: Took 614733 ns to find endKey. Caller is deleteKeysFromDelKeyTableInSnapshotScope
      2023-11-29 08:52:24,718 INFO [OMDoubleBufferFlushThread]-org.apache.hadoop.ozone.om.OmSnapshotManager: Created checkpoint : /var/lib/hadoop-ozone/om/data035525/db.snapshots/checkpointState/om.db-c3ba17ef-d947-454e-9c4f-b9063ae65650 for snapshot snap-ay36z
      2023-11-29 08:52:24,745 INFO [OMDoubleBufferFlushThread]-org.apache.hadoop.hdds.utils.db.RDBCheckpointManager: Created checkpoint in rocksDB at /var/lib/hadoop-ozone/om/data035525/db.snapshots/checkpointState/om.db-bf0c6141-4185-4361-b15f-c4aa71c5c6d8 in 12 milliseconds
      2023-11-29 08:52:24,746 INFO [OMDoubleBufferFlushThread]-org.apache.hadoop.hdds.utils.db.RDBCheckpointUtils: Waited for 0 milliseconds for checkpoint directory /var/lib/hadoop-ozone/om/data035525/db.snapshots/checkpointState/om.db-bf0c6141-4185-4361-b15f-c4aa71c5c6d8 availability.
      2023-11-29 08:52:24,747 INFO [OMDoubleBufferFlushThread]-org.apache.hadoop.ozone.om.OmSnapshotManager: Created checkpoint : /var/lib/hadoop-ozone/om/data035525/db.snapshots/checkpointState/om.db-bf0c6141-4185-4361-b15f-c4aa71c5c6d8 for snapshot snap-ay36z
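
      In the excerpts above, the create request for 'snap-ay36z' is logged at 2023-11-27 00:40:23, while the "Created checkpoint" lines for that snapshot name appear at 2023-11-29 08:52:24, and there are two of them with different checkpoint directories. The following is a minimal sketch, not part of Ozone, for pulling those facts out of OM log lines. The function name summarize and the regular expressions are illustrative assumptions based only on the line format shown here, and comparing the checkpoint directory suffix with the snapshotId is likewise an assumption drawn from these excerpts. The sample lines are copied from the logs above.

```python
import re
from datetime import datetime

# Assumed line layout, inferred from the excerpts in this issue:
# "<timestamp> <LEVEL> [<thread>]-<logger>: <message>"
LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) \w+ "
    r"\[[^\]]+\]-[\w.]+: (?P<msg>.*)$"
)
CREATED_SNAPSHOT = re.compile(
    r"Created snapshot: '(?P<name>[^']+)' with snapshotId: '(?P<id>[^']+)'"
)
CREATED_CHECKPOINT = re.compile(
    r"Created checkpoint : (?P<path>\S+) for snapshot (?P<name>\S+)"
)


def summarize(lines):
    """Group create requests and checkpoint creations by snapshot name."""
    snapshots = {}
    for raw in lines:
        line = LINE.match(raw.strip())
        if line is None:
            continue  # not in the assumed format; skip it
        when = datetime.strptime(line["ts"], "%Y-%m-%d %H:%M:%S,%f")
        created = CREATED_SNAPSHOT.search(line["msg"])
        checkpoint = CREATED_CHECKPOINT.search(line["msg"])
        name = (created or checkpoint)["name"] if (created or checkpoint) else None
        if name is None:
            continue
        entry = snapshots.setdefault(
            name, {"requested": None, "snapshot_id": None, "checkpoints": []}
        )
        if created:
            entry["requested"] = when
            entry["snapshot_id"] = created["id"]
        else:
            entry["checkpoints"].append((when, checkpoint["path"]))
    return snapshots


# Three lines copied from the logs in this issue.
SAMPLE = [
    "2023-11-27 00:40:23,345 INFO [OM StateMachine ApplyTransaction Thread - 0]-org.apache.hadoop.ozone.om.request.snapshot.OMSnapshotCreateRequest: Created snapshot: 'snap-ay36z' with snapshotId: 'bf0c6141-4185-4361-b15f-c4aa71c5c6d8' under path 'vol-2xd36/buck-id806'",
    "2023-11-29 08:52:24,718 INFO [OMDoubleBufferFlushThread]-org.apache.hadoop.ozone.om.OmSnapshotManager: Created checkpoint : /var/lib/hadoop-ozone/om/data035525/db.snapshots/checkpointState/om.db-c3ba17ef-d947-454e-9c4f-b9063ae65650 for snapshot snap-ay36z",
    "2023-11-29 08:52:24,747 INFO [OMDoubleBufferFlushThread]-org.apache.hadoop.ozone.om.OmSnapshotManager: Created checkpoint : /var/lib/hadoop-ozone/om/data035525/db.snapshots/checkpointState/om.db-bf0c6141-4185-4361-b15f-c4aa71c5c6d8 for snapshot snap-ay36z",
]

if __name__ == "__main__":
    for name, info in summarize(SAMPLE).items():
        for when, path in info["checkpoints"]:
            lag = when - info["requested"] if info["requested"] else None
            matches_id = bool(info["snapshot_id"]) and path.endswith(info["snapshot_id"])
            print(name, "lag:", lag, "directory matches snapshotId:", matches_id)
```

      Run against a full OM log, the same grouping would show for each snapshot how long the flush thread took to create its checkpoint and whether more than one checkpoint was logged under the same snapshot name.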
      

            People

              aswinshakil Aswin Shakil
              jyosin Jyotirmoy Sinha