Spark / SPARK-32559

Fix the trim logic in UTF8String.toInt/toLong that didn't handle Chinese characters correctly


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0
    • Fix Version/s: 3.0.1
    • Component/s: SQL

    Description

      The trim logic in the Cast expression, introduced in https://github.com/apache/spark/pull/26622, unexpectedly trims Chinese characters.

      For example, the SQL query select cast("1中文" as float) returns 1 instead of null.
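      The description does not say why the characters are trimmed. One plausible explanation, offered here as an assumption and not as Spark's actual code, is a trim loop that compares raw UTF-8 bytes against the space character: Java bytes are signed, so every byte of a multi-byte character (0x80 and above) is negative and passes a `b <= ' '` check. The sketch below reproduces the reported symptom under that assumption. The class `TrimSketch` and its methods `naiveTrim`, `safeTrim`, and `castToFloat` are hypothetical names used only for illustration, and the input string from the example is written with Unicode escapes.

```java
import java.nio.charset.StandardCharsets;

class TrimSketch {
    // Hypothetical buggy trim: compares signed bytes directly against ' '.
    // Bytes >= 0x80 are negative in Java, so they are treated as blanks.
    static String naiveTrim(String input) {
        byte[] b = input.getBytes(StandardCharsets.UTF_8);
        int start = 0;
        int end = b.length - 1;
        while (start <= end && b[start] <= ' ') start++;
        while (end >= start && b[end] <= ' ') end--;
        return new String(b, start, end - start + 1, StandardCharsets.UTF_8);
    }

    // Only non-negative bytes up to 0x20 count as blanks, so the bytes of
    // multi-byte UTF-8 characters are never trimmed.
    static boolean isAsciiBlank(byte x) {
        return x >= 0 && x <= ' ';
    }

    // Hypothetical corrected trim using the stricter blank check.
    static String safeTrim(String input) {
        byte[] b = input.getBytes(StandardCharsets.UTF_8);
        int start = 0;
        int end = b.length - 1;
        while (start <= end && isAsciiBlank(b[start])) start++;
        while (end >= start && isAsciiBlank(b[end])) end--;
        return new String(b, start, end - start + 1, StandardCharsets.UTF_8);
    }

    // Stand-in for the cast: returns null when the text is not a valid float.
    static Float castToFloat(String trimmed) {
        try {
            return Float.parseFloat(trimmed);
        } catch (NumberFormatException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // The string from the issue: '1' followed by U+4E2D and U+6587.
        String input = "1\u4e2d\u6587";
        System.out.println(castToFloat(naiveTrim(input))); // 1.0
        System.out.println(castToFloat(safeTrim(input)));  // null
    }
}
```

      With the naive check, the trailing Chinese characters are stripped before parsing, so the cast succeeds on "1". With the stricter check the input is left intact and the cast yields null, which is the behavior the issue expects.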

       


          People

            Assignee: EdisonWang
            Reporter: EdisonWang