Spark / SPARK-27841

Improve UTF8String fromString()/toString()/numChars() performance when strings are ASCII


Details

    • Type: Improvement
    • Status: In Progress
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.1.0
    • Fix Version/s: None
    • Component/s: SQL
    • Labels: None

    Description

      UTF8String's fromString(), toString(), and numChars() methods are performance hotspots. For strings that consist entirely of ASCII characters, we can apply optimizations that significantly reduce memory allocation and copying, which should greatly improve performance for many common workloads.
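
A minimal sketch of what such ASCII fast paths might look like, written against plain byte arrays rather than Spark's actual UTF8String internals. The class name AsciiFastPath and its static methods are hypothetical and only illustrate the idea: in ASCII data every character is exactly one byte, so the character count equals the byte length, and conversion to and from String can skip general UTF-8 encoding and decoding. This is not the patch proposed in this issue.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of Spark.
final class AsciiFastPath {

    /** Returns true if every byte is in 0x00..0x7F (non-negative as a signed byte). */
    static boolean isAscii(byte[] bytes) {
        for (byte b : bytes) {
            if (b < 0) {
                return false;
            }
        }
        return true;
    }

    /** Counts code points in valid UTF-8 bytes; for ASCII this is just the byte length. */
    static int numChars(byte[] utf8) {
        if (isAscii(utf8)) {
            return utf8.length;
        }
        int count = 0;
        for (byte b : utf8) {
            // Continuation bytes have the form 10xxxxxx; every other byte starts a code point.
            if ((b & 0xC0) != 0x80) {
                count++;
            }
        }
        return count;
    }

    /** Encodes a String as UTF-8, narrowing chars directly while the input stays ASCII. */
    static byte[] fromString(String s) {
        int n = s.length();
        byte[] out = new byte[n];
        for (int i = 0; i < n; i++) {
            char c = s.charAt(i);
            if (c >= 0x80) {
                // Non-ASCII character found: fall back to the general encoder.
                return s.getBytes(StandardCharsets.UTF_8);
            }
            out[i] = (byte) c;
        }
        return out;
    }

    /** Decodes UTF-8 bytes; ASCII bytes map one-to-one to chars, so Latin-1 decoding is safe. */
    static String toString(byte[] utf8) {
        if (isAscii(utf8)) {
            return new String(utf8, StandardCharsets.ISO_8859_1);
        }
        return new String(utf8, StandardCharsets.UTF_8);
    }
}
```

Whether the Latin-1 decoding path is actually faster depends on the JVM version and its internal string representation, so any real change would need benchmarking. A production version would also likely cache or fuse the ASCII check rather than rescanning the bytes on each call.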

            People

              Assignee: Josh Rosen (joshrosen)
              Reporter: Josh Rosen (joshrosen)