Spark / SPARK-22771

SQL concat for binary

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.2.1
    • Fix Version/s: 2.3.0
    • Component/s: SQL
    • Labels: None

    Description

      The Spark SQL concat function automatically casts its arguments to StringType and returns a string.
      This may match the behavior of traditional databases, but Spark has Binary as a standard type, and concatenating binary values should reasonably return another binary sequence.

      Take Python 2 as an example, where both str (bytes) and unicode can represent text: concatenating two values of the same type returns that type, and when the types are mixed (str + unicode) the more general type (unicode) is returned.
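
      The same-type half of this analogy can be checked with a minimal Python 3 sketch. One caveat, stated as an assumption about the paragraph above: the implicit str + unicode promotion it describes is Python 2 behavior. Python 3 does not coerce between bytes and str, so the mixed case is left out here.

```python
# Python 3 sketch: concatenation preserves the operand type.
raw = b"spark" + b"sql"    # bytes + bytes stays bytes
text = "spark" + "sql"     # str + str stays str

print(type(raw).__name__, raw)
print(type(text).__name__, text)
```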

      Following the same principle, I believe that concatenating binary values should return a binary.
      In terms of Spark behavior, this would affect only the case where all arguments are binary. All other cases should remain unchanged.
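
      The proposed rule can be sketched as a toy model in plain Python. This is not Spark's implementation: the names concat_result_type and concat are hypothetical, bytes stands in for Binary and str for StringType, and decoding binary as UTF-8 when casting to a string is an assumption made for the illustration.

```python
from typing import Sequence, Union

Value = Union[bytes, str]


def concat_result_type(arg_types: Sequence[str]) -> str:
    """Hypothetical helper: result type under the proposed rule."""
    # Only the all-binary case changes; everything else stays a string.
    if arg_types and all(t == "binary" for t in arg_types):
        return "binary"
    return "string"


def concat(*args: Value) -> Value:
    """Toy model of the proposed semantics for bytes/str arguments."""
    if args and all(isinstance(a, bytes) for a in args):
        # All arguments are binary: return a binary sequence.
        return b"".join(args)
    # Mixed or all-string arguments: cast to string as before.
    # Assumption: binary values are decoded as UTF-8 for the cast.
    return "".join(
        a.decode("utf-8") if isinstance(a, bytes) else a for a in args
    )
```

      In this model, concat(b"ab", b"cd") yields a bytes value, while concat("ab", b"cd") still yields a string, which matches the intent that only the all-binary case changes.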

          People

            Assignee: maropu Takeshi Yamamuro
            Reporter: ferdonline Fernando Pereira
            Votes: 0
            Watchers: 4
