Spark / SPARK-27841

Improve UTF8String fromString()/toString()/numChars() performance when strings are ASCII


    Details

    • Type: Improvement
    • Status: In Progress
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.0.0
    • Fix Version/s: None
    • Component/s: SQL
    • Labels: None

      Description

      UTF8String's fromString(), toString(), and numChars() methods are performance hotspots. For strings that consist entirely of ASCII characters, every character occupies exactly one byte in UTF-8, so these methods can skip general-purpose encoding and decoding work. Optimizing for this case can significantly reduce memory allocation and copying, which should improve performance for many common workloads.
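
      To illustrate the kind of fast path described above, here is a minimal sketch. It is not the patch for this issue and does not use Spark's UTF8String class. The class name AsciiFastPath and all of its methods are hypothetical, and it works on plain byte arrays. Whether each shortcut is actually faster depends on the JVM and would need benchmarking.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper, NOT part of Spark: sketches ASCII fast paths over UTF-8 bytes. */
final class AsciiFastPath {

  /** Returns true if every byte is below 0x80, i.e. the data is pure ASCII. */
  static boolean isAscii(byte[] utf8) {
    for (byte b : utf8) {
      if (b < 0) { // bytes >= 0x80 are negative when read as signed Java bytes
        return false;
      }
    }
    return true;
  }

  /** Counts code points; for pure-ASCII input this equals the byte length. */
  static int numChars(byte[] utf8) {
    if (isAscii(utf8)) {
      return utf8.length; // fast path: one byte per character
    }
    int count = 0;
    for (byte b : utf8) {
      // Slow path: count every byte that is not a continuation byte (10xxxxxx).
      if ((b & 0xC0) != 0x80) {
        count++;
      }
    }
    return count;
  }

  /** Decodes bytes to a String, avoiding multi-byte decoding for ASCII input. */
  static String toString(byte[] utf8) {
    if (isAscii(utf8)) {
      // ISO_8859_1 maps each byte to exactly one char; for ASCII the result
      // is identical to UTF-8 decoding.
      return new String(utf8, StandardCharsets.ISO_8859_1);
    }
    return new String(utf8, StandardCharsets.UTF_8);
  }

  /** Encodes a String to UTF-8, copying chars directly while they are ASCII. */
  static byte[] fromString(String s) {
    int n = s.length();
    byte[] out = new byte[n];
    for (int i = 0; i < n; i++) {
      char c = s.charAt(i);
      if (c >= 0x80) {
        // Non-ASCII character found: fall back to the general encoder.
        return s.getBytes(StandardCharsets.UTF_8);
      }
      out[i] = (byte) c;
    }
    return out;
  }
}
```

      In this sketch, AsciiFastPath.numChars(bytes) returns bytes.length without inspecting character boundaries when the input is ASCII, and fromString() allocates exactly one output array in the ASCII case. Non-ASCII input falls back to the standard UTF-8 paths, so results are unchanged for such strings.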

              People

              • Assignee: Josh Rosen (joshrosen)
              • Reporter: Josh Rosen (joshrosen)
