  1. Spark
  2. SPARK-41248

Add config flag to control the behavior of JSON partial results parsing in SPARK-40646


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.4.0
    • Fix Version/s: 3.4.0
    • Component/s: SQL
    • Labels: None

    Description

      This is a follow-up for https://issues.apache.org/jira/browse/SPARK-40646.

       

      Internal benchmarks showed that JSON partial results parsing can be 30% slower than parsing without the patch. I could not find a regression, and the Apache Spark JSON benchmark results are very similar with and without SPARK-40646.

      However, I would still like to add a config flag to enable or disable the feature in case a regression is observed in users' queries.
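
      A minimal sketch of how such a flag might be toggled per session. The flag name is not stated in this issue; the example assumes it is spark.sql.json.enablePartialResults, and the schema and input record are made up for illustration. It reads the same record with the flag on and off so the two results can be compared.

      ```scala
      import org.apache.spark.sql.SparkSession
      import org.apache.spark.sql.types.{IntegerType, StructField, StructType}

      object PartialResultsFlagSketch {
        def main(args: Array[String]): Unit = {
          val spark = SparkSession.builder()
            .master("local[1]")
            .appName("json-partial-results-flag")
            .getOrCreate()
          import spark.implicits._

          // Assumption: the flag name is not given in this issue.
          val flag = "spark.sql.json.enablePartialResults"

          // Hypothetical schema: field "y" is declared as an integer.
          val schema = StructType(Seq(
            StructField("a", StructType(Seq(
              StructField("x", IntegerType),
              StructField("y", IntegerType))))))

          // Hypothetical record in which "y" does not match the schema.
          val input = Seq("""{"a": {"x": 1, "y": "not-an-int"}}""").toDS()

          // Read the same record with the feature enabled, then disabled.
          for (enabled <- Seq("true", "false")) {
            spark.conf.set(flag, enabled)
            println(s"$flag = $enabled")
            spark.read.schema(schema).json(input).show(truncate = false)
          }

          spark.stop()
        }
      }
      ```

      If a query does regress, the same setting could be applied for a single session only, leaving other workloads on the default.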

      Benchmark results are attached below.

       

      Attachments

        1. json-benchmark-without-SPARK-40646.log
          10 kB
          Ivan Sadikov
        2. json-benchmark-with-SPARK-40646.log
          10 kB
          Ivan Sadikov

            People

              Assignee: ivan.sadikov Ivan Sadikov
              Reporter: sadikovi Ivan Sadikov
              Votes: 0
              Watchers: 3
