  1. Apache Ozone
  2. HDDS-11506

Improvements for large scale deletion

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: OM, Ozone Datanode, SCM
    • Labels: None

    Description

      Parent Jira to track bugs and improvements related to large scale deletion of data in Ozone. This includes:

      • Improving the speed at which space is reclaimed from the system.
      • Fixing bugs that have surfaced during large deletes.
      • Improving observability throughout the deletion process through logs, metrics, and dashboards.
      • Providing configurations that make deletion work well at scale out of the box, while simplifying the configuration experience when tuning is required.

        Issue Links

          1. DirectoryDeletion task ignored via ratis | Sub-task | Resolved | Sumit Agrawal
          2. Directory deletion get stuck having millions of directory | Sub-task | Resolved | Sumit Agrawal
          3. ServiceException is not logged when OM delete submission to Ratis fails | Sub-task | Resolved | Abhishek Pal
          4. Decouple delete batch limits from Ratis request size | Sub-task | Open | Sadanand Shenoy
          5. Logging improvements for deletion services | Sub-task | Open | Tejaskriya Madhan
          6. All deletion services should support multiple threads | Sub-task | Open | Aryan Gupta
          7. OM deletion services should have consistent metrics | Sub-task | Open | Tejaskriya Madhan
          8. Deletion services in SCM, OM and DN should have consistent metrics | Sub-task | In Progress | Tejaskriya Madhan
          9. Create Grafana dashboard for tracking system wide deletion | Sub-task | Open | Tejaskriya Madhan
          10. Set optimal default values for delete configurations based on live cluster testing | Sub-task | Open | Unassigned
          11. All deletion configurations should be configurable without restart | Sub-task | Open | Sarveksha Yeshavantha Raju
          12. Directory deletion service should support multiple threads | Sub-task | Open | Aryan Gupta
          13. Delete message body too large, causing SCM to fail writing raft log | Sub-task | Open | GuoHao
          14. Skip known tombstones when scanning rocksdb deletedtable in KeyDeletingService | Sub-task | Open | GuoHao
          15. Add metrics in SCM to print number of delete command sent and response received per datanode | Sub-task | Open | Tejaskriya Madhan
          16. Iterate whole scm delete block table in SCMBlockDeletingService before retrying the same transaction again | Sub-task | Open | Ashish Kumar
          17. Use seek to reach the start transaction instead of looping table having millions of record | Sub-task | Open | Ashish Kumar
          18. resetDeletedBlockRetryCount with --all may fail and can cause long db lock in large cluster | Sub-task | Open | Aryan Gupta
          19. Display scm block delete information for faster debugging | Sub-task | Open | Tejaskriya Madhan

            People

              Assignee: Unassigned
              Reporter: Ethan Rose (erose)
              Votes: 0
              Watchers: 5
