SPARK-48931 (project: Spark; parent: SPARK-44111 Prepare Apache Spark 4.0.0)

Reduce Cloud Store List API cost for state store maintenance task

Details

    Description

      Currently, the state store maintenance task decides which old RocksDB state store version files to delete by listing all snapshotted version files in the checkpoint directory. It does this on every maintenance run, which is once per minute by default. On cloud object stores, where list calls are billed, this frequency can result in high costs. To reduce the cost of state store maintenance, the maintenance task should list the object store less often. The proposal is to accumulate the versions that are eligible for deletion and issue a list call only when their number reaches a configured threshold.
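
      The accumulate-then-list idea can be sketched as follows. This is a minimal illustration in Python, not Spark's implementation: the class `ThresholdedCleaner`, its parameters `min_versions_to_retain` and `min_versions_to_delete`, and the `list_versions` / `delete_version` callables are all hypothetical names. It also assumes that each newly committed version makes roughly one older version eligible for deletion, so the commit count serves as a cheap estimate that needs no list call.

```python
from typing import Callable, List


class ThresholdedCleaner:
    """Hypothetical helper: defers the costly list call until enough
    versions are estimated to be deletable."""

    def __init__(
        self,
        list_versions: Callable[[], List[int]],   # stands in for the cloud list API
        delete_version: Callable[[int], None],    # stands in for a file delete
        min_versions_to_retain: int,
        min_versions_to_delete: int,
    ) -> None:
        self.list_versions = list_versions
        self.delete_version = delete_version
        self.min_versions_to_retain = min_versions_to_retain
        self.min_versions_to_delete = min_versions_to_delete
        self.pending = 0      # commits seen since the last list call
        self.list_calls = 0   # number of list calls issued so far

    def on_commit(self) -> None:
        # Assumption: one new version makes about one old version stale.
        self.pending += 1

    def run_maintenance(self) -> List[int]:
        # Below the threshold: skip the list call entirely.
        if self.pending < self.min_versions_to_delete:
            return []
        self.list_calls += 1
        versions = sorted(self.list_versions())
        # Keep the newest min_versions_to_retain versions, delete the rest.
        cutoff = max(0, len(versions) - self.min_versions_to_retain)
        stale = versions[:cutoff]
        for version in stale:
            self.delete_version(version)
        self.pending = 0
        return stale
```

      With a threshold of N and one commit per maintenance interval, this sketch issues about one list call every N maintenance runs instead of one per run. The trade-off is that up to N extra versions may sit in the checkpoint directory before they are cleaned up.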

          People

            riya-verm Riya Verma