SPARK-36081

Update the document about the behavior change of trimming characters for cast

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.3, 3.1.2, 3.2.0, 3.3.0
    • Fix Version/s: 3.2.0, 3.1.3, 3.0.4
    • Component/s: SQL
    • Labels: None

    Description

      sql-migration-guide.md describes the behavior of cast as follows.

      In Spark 3.0, when casting string value to integral types(tinyint, smallint, int and bigint), datetime types(date, timestamp and interval) and boolean type, the leading and trailing whitespaces (<= ASCII 32) will be trimmed before converted to these type values, for example, `cast(' 1\t' as int)` results `1`, `cast(' 1\t' as boolean)` results `true`, `cast('2019-10-10\t' as date)` results the date value `2019-10-10`. In Spark version 2.4 and below, when casting string to integrals and booleans, it does not trim the whitespaces from both ends; the foregoing results is `null`, while to datetimes, only the trailing spaces (= ASCII 32) are removed.
      

      This matches Spark 3.0.0, where `select cast('2019-10-10\b' as date);` returns `2019-10-10`.
      However, SPARK-32559 changed this behavior: since 3.0.1 the same query returns `NULL`, so the guide no longer describes the current behavior.
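
The difference between the two behaviors can be illustrated with a small Python model. This is only a sketch, not Spark's implementation: the helper names (`trim_le_ascii_32`, `trim_ascii_whitespace`, `cast_to_date`) are hypothetical, and the set of characters treated as whitespace since 3.0.1 (ASCII 9-13 and 28-32) is an assumption based on Java's `Character.isWhitespace`, which this issue does not state explicitly.

```python
from datetime import date

# Rule described in the migration guide for Spark 3.0.0:
# every leading/trailing character <= ASCII 32 is trimmed.
CHARS_LE_32 = "".join(chr(c) for c in range(33))

# ASSUMPTION: since 3.0.1 only ASCII whitespace is trimmed. The set below
# (tab..carriage return, 28..31, space) mirrors Java's Character.isWhitespace
# and is not taken from this issue.
ASCII_WHITESPACE = "".join(chr(c) for c in (*range(9, 14), *range(28, 33)))


def trim_le_ascii_32(s: str) -> str:
    """Model of the 3.0.0 rule: strip characters <= ASCII 32 from both ends."""
    return s.strip(CHARS_LE_32)


def trim_ascii_whitespace(s: str) -> str:
    """Model of the assumed 3.0.1+ rule: strip only ASCII whitespace."""
    return s.strip(ASCII_WHITESPACE)


def cast_to_date(s: str, trim):
    """Hypothetical stand-in for cast(... as date): None plays the role of NULL."""
    try:
        return date.fromisoformat(trim(s))
    except ValueError:
        return None


# '\b' is backspace (ASCII 8): trimmed under the first rule, kept under the second.
print(cast_to_date("2019-10-10\b", trim_le_ascii_32))
print(cast_to_date("2019-10-10\b", trim_ascii_whitespace))
print(cast_to_date("2019-10-10\t", trim_ascii_whitespace))
```

Under this model the backspace example parses as a date only with the first rule, while the tab example parses under both, which is the distinction the migration guide needs to spell out.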

          People

            Assignee: Kousuke Saruta (sarutak)
            Reporter: Kousuke Saruta (sarutak)