Spark / SPARK-47693

Improve UTF8_BINARY_LCASE collation comparison performance

Details

    • Type: Task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.0.0
    • Fix Version/s: 4.0.0
    • Component/s: SQL

    Description

      Current collation benchmarks indicate that UTF8_BINARY_LCASE collation comparisons are an order of magnitude slower (~7-10x) than plain binary comparisons. Improve performance by optimizing the lowercase comparison function for UTF8String instances so that it no longer performs a full lowercase conversion before the binary comparison.
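
      The idea can be illustrated with a minimal sketch. This is not the code merged for this issue; the class `LcaseCompare` and its methods are hypothetical names, and it works on plain `byte[]` rather than Spark's `UTF8String`. It compares ASCII bytes in place and only falls back to a full lowercase conversion when a non-ASCII byte is encountered. It assumes Java 9+ for `Arrays.compareUnsigned`.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Locale;

// Hypothetical illustration only; not Spark's UTF8String implementation.
public final class LcaseCompare {
    private LcaseCompare() {}

    // Helper for callers and tests: encode a String as UTF-8 bytes.
    public static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Lowercases a single ASCII code unit (0..127) without allocating.
    private static int asciiLower(int b) {
        return (b >= 'A' && b <= 'Z') ? b + ('a' - 'A') : b;
    }

    // Slow path: lowercase both strings fully, then compare the bytes as unsigned.
    private static int slowCompare(byte[] left, byte[] right) {
        String l = new String(left, StandardCharsets.UTF_8).toLowerCase(Locale.ROOT);
        String r = new String(right, StandardCharsets.UTF_8).toLowerCase(Locale.ROOT);
        return Arrays.compareUnsigned(utf8(l), utf8(r));
    }

    // Compares two UTF-8 byte arrays as if both had been lowercased first.
    // Only the sign of the result is meaningful.
    public static int compareLowerCase(byte[] left, byte[] right) {
        int n = Math.min(left.length, right.length);
        for (int i = 0; i < n; i++) {
            int l = left[i] & 0xFF;
            int r = right[i] & 0xFF;
            if (l >= 0x80 || r >= 0x80) {
                // Non-ASCII byte: fall back to the allocation-heavy path.
                return slowCompare(left, right);
            }
            int diff = asciiLower(l) - asciiLower(r);
            if (diff != 0) {
                return diff; // first differing ASCII byte decides the order
            }
        }
        // The shared prefix is ASCII and equal ignoring case; the shorter input sorts first.
        return Integer.compare(left.length, right.length);
    }
}
```

      For ASCII-only inputs this avoids allocating lowercased copies entirely, which is where a comparison-heavy workload would be expected to gain the most. For example, `LcaseCompare.compareLowerCase(LcaseCompare.utf8("Spark"), LcaseCompare.utf8("sPARK"))` returns 0.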

          People

            Assignee: Nikola Mandic (nikolamand-db)
            Reporter: Nikola Mandic (nikolamand-db)
            Votes: 0
            Watchers: 2

            Dates

              Created:
              Updated:
              Resolved: