Spark / SPARK-7847

Fix dynamic partition path escaping

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.3.0, 1.3.1, 1.4.0
    • Fix Version/s: 1.4.0
    • Component/s: SQL
    • Labels: None

    Description

      Background: when writing dynamic partitions, partition values are converted to strings and escaped if necessary. For example, if a partition column p of type String has the value A/B, the corresponding partition directory name is escaped into p=A%2fB.

      Currently, there are two issues with dynamic partition path escaping. The first is that escaped strings are not unescaped when partition values are read back. This one is easy to fix.
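
      The round trip that the first issue breaks can be sketched as follows. This is a minimal illustration in Python, not Spark's code: partition_dir and parse_partition_dir are hypothetical helpers, and urllib.parse.quote stands in for the real escaping routine (it emits uppercase hex, so A/B becomes A%2FB rather than A%2fB; unquote accepts either case).

```python
from urllib.parse import quote, unquote


def partition_dir(column, value):
    """Build a partition directory name, escaping every unsafe character.

    safe="" makes quote() escape "/" as well, so the value cannot
    introduce an extra path component.
    """
    return f"{column}={quote(value, safe='')}"


def parse_partition_dir(name):
    """Recover (column, value) from a directory name, unescaping the value.

    Skipping the unquote() call here reproduces the first issue:
    the reader would get back "A%2FB" instead of "A/B".
    """
    column, _, escaped = name.partition("=")
    return column, unquote(escaped)


directory = partition_dir("p", "A/B")
column, value = parse_partition_dir(directory)
print(directory, column, value)
```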

      The second issue is more subtle. In PR #5381 we tried to use Path.toUri.toString to fix an escaping issue related to S3 credentials containing a / character. Unfortunately, Path.toUri.toString also escapes % characters in the path. Thus, in the dynamic partitioning case mentioned above, p=A%2fB is double escaped into p=A%252fB (% is escaped into %25).
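
      The double escaping can be reproduced outside Spark. The sketch below is an analogy in Python, assuming that a second pass of urllib.parse.quote behaves like the re-encoding described above in that it always quotes %; it does not call Path.toUri.toString itself, and the variable names are illustrative only.

```python
from urllib.parse import quote, unquote

# First pass: the partition value is escaped when the directory is written.
escaped_once = "p=" + quote("A/B", safe="")

# Second pass: the whole path is encoded again by a routine that treats
# "%" as an ordinary character to be quoted, turning "%" into "%25".
escaped_twice = quote(escaped_once, safe="/=")

# A single unescape no longer recovers the original value.
unescaped_once = unquote(escaped_twice)

print(escaped_once)
print(escaped_twice)
print(unescaped_once)
```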

      The expected behavior is to escape only the URI user info part (S3 key and secret) and leave all other components untouched.
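
      One way to picture the expected behavior is the sketch below. It is written in Python under stated assumptions rather than taken from the actual fix: escape_user_info_only is a hypothetical helper, and the s3n scheme, bucket name, and credentials are made-up example values. Only the key and secret are percent-encoded; the already-escaped path is passed through unchanged.

```python
from urllib.parse import quote


def escape_user_info_only(scheme, access_key, secret_key, host, path):
    """Assemble a URI string, escaping only the user info component.

    The path is assumed to be escaped already (for example p=A%2fB),
    so it is appended verbatim to avoid turning "%" into "%25".
    """
    user_info = f"{quote(access_key, safe='')}:{quote(secret_key, safe='')}"
    return f"{scheme}://{user_info}@{host}{path}"


# Hypothetical credentials; the secret contains a "/" that must be escaped.
uri = escape_user_info_only(
    "s3n", "AKIAEXAMPLE", "abc/def", "bucket", "/table/p=A%2fB"
)
print(uri)
```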

          People

            Assignee: Cheng Lian
            Reporter: Cheng Lian