  1. Spark
  2. SPARK-24485

Measure and log elapsed time for filesystem operations in HDFSBackedStateStoreProvider


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.4.0
    • Fix Version/s: 2.4.0
    • Component/s: Structured Streaming
    • Labels: None

    Description

      HDFSBackedStateStoreProvider performs several operations that communicate with the file system (mostly remote HDFS in production), and these operations account for a large part of its latency.

      It would be better to measure the elapsed time of each of these operations and log it, to help with debugging when the state store shows unexpectedly high latency.
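
The measurement proposed above could look roughly like the following. This is a minimal sketch, not the code from the actual Spark change: `TimedFsOps`, `Timed`, `timeTakenMs`, and `readAll` are hypothetical names, a local temporary file stands in for a remote HDFS path, and `System.out` stands in for a real logger.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical helper class, for illustration only.
class TimedFsOps {

    // Pairs the result of an operation with its elapsed wall-clock time.
    static final class Timed<T> {
        final T value;
        final long elapsedMs;

        Timed(T value, long elapsedMs) {
            this.value = value;
            this.elapsedMs = elapsedMs;
        }
    }

    // Runs the given operation and reports how long it took in milliseconds.
    // System.nanoTime() is monotonic, so it is safe for measuring intervals.
    static <T> Timed<T> timeTakenMs(Supplier<T> body) {
        long start = System.nanoTime();
        T result = body.get();
        long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        return new Timed<>(result, elapsedMs);
    }

    // Stand-in for a filesystem read; a real provider would go through HDFS.
    static byte[] readAll(Path path) {
        try {
            return Files.readAllBytes(path);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("state-delta", ".tmp");
        try {
            Files.write(file, "key=value".getBytes(StandardCharsets.UTF_8));

            // Wrap only the filesystem call, so the logged time reflects I/O latency.
            Timed<byte[]> read = timeTakenMs(() -> readAll(file));
            System.out.println("Read " + read.value.length + " bytes from " + file
                    + " in " + read.elapsedMs + " ms");
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

Wrapping each filesystem call separately, rather than timing a whole commit or load, makes it easier to tell from the logs which operation is responsible for a latency spike.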

          People

            Assignee: kabhwan Jungtaek Lim
            Reporter: kabhwan Jungtaek Lim
            Votes: 0
            Watchers: 2

            Dates

              Created:
              Updated:
              Resolved: