Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.2.1
    • Fix Version/s: 2.3.0
    • Component/s: SQL
    • Labels: None

      Description

      The spark.sql concat function automatically casts its arguments to StringType and returns a string.
      This may match the behavior of traditional databases. However, Spark has Binary as a standard type, and concatenating binary values seems reasonable if the result is another binary sequence.

      Take Python 2 as an example, where both str (bytes) and unicode represent text: concatenating two values of the same type returns that type, and when the types are mixed (str + unicode) the most generic type (unicode) is returned.

      Following the same principle, I believe that when concat'ing binary it would make sense to return a binary.
      In terms of Spark behavior, it would affect only the case when all arguments are binary. All other cases should remain unchanged.
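
      The proposed rule can be sketched in plain Python as follows. This is only a model of the typing behavior described above, not Spark's implementation: the function name concat, the use of Python bytes and str to stand in for BinaryType and StringType, and the UTF-8 decoding step are all assumptions made for illustration, and null handling is not modelled.

```python
from typing import Union

# bytes stands in for Spark's BinaryType, str for StringType (assumption).
Value = Union[bytes, str]


def concat(*args: Value) -> Value:
    """Hypothetical model of the proposed concat typing rule."""
    if args and all(isinstance(a, bytes) for a in args):
        # Every argument is binary: the result stays binary.
        return b"".join(args)
    # Any other combination: cast each argument to a string first,
    # which mirrors the existing cast-to-string behavior.
    # UTF-8 decoding is an assumption of this sketch.
    return "".join(
        a.decode("utf-8") if isinstance(a, bytes) else a for a in args
    )
```

      With this sketch, concat(b"ab", b"cd") yields a bytes value, while concat(b"ab", "cd") and concat("ab", "cd") both yield a str, so only the all-binary case differs from the current behavior.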

            People

            • Assignee: Takeshi Yamamuro (maropu)
            • Reporter: Fernando Pereira (ferdonline)
            • Votes: 0
            • Watchers: 4
