SPARK-46761

quoted strings in a JSON path should support ? characters

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0, 4.0.0
    • Fix Version/s: 4.0.0
    • Component/s: SQL

    Description

      I think this impacts all versions of Spark after SPARK-18677, which made the operator work at all in 2.1.0/2.0.3.

      It comes down to

       name <- '.' ~> "[^\\.\\[]+".r | "['" ~> "[^\\'\\?]+".r <~ "']"

      https://github.com/apache/spark/blob/01bb1b1a3dbfc68f41d9b13de863d26d587c7e2f/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala#L79

       

      The regular expression says that we want a [' followed by one or more characters that are neither a single quote ' nor a question mark ?, followed by a closing ']. The question mark looks out of place. When I put a question mark in a quoted string, the path fails to produce any result, but when I put the same data and path into https://jsonpath.com/ I get a result.

       

      data

      {"?":"QUESTION"} 

      path

      $['?'] 
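      The effect of the character class can be checked in isolation. The following is a minimal sketch, not Spark's parser: it wraps the quoted-name class from the rule above in a plain scala.util.matching.Regex and compares it with a relaxed class that excludes only the single quote. The names QuotedNameSketch, current, relaxed, and quotedName are hypothetical and exist only for this illustration, and the relaxed pattern is an assumption about what a fix could look like.

```scala
import scala.util.matching.Regex

// Sketch only: reproduces the character classes, not Spark's combinator parser.
object QuotedNameSketch {
  // Class as quoted in the description: excludes both ' and ?.
  val current: Regex = "\\['([^\\'\\?]+)'\\]".r
  // Hypothetical relaxed class: excludes only the closing quote.
  val relaxed: Regex = "\\['([^\\']+)'\\]".r

  // Returns the quoted name if the whole segment matches the pattern.
  def quotedName(pattern: Regex, segment: String): Option[String] =
    segment match {
      case pattern(name) => Some(name)
      case _             => None
    }

  def main(args: Array[String]): Unit = {
    val segment = "['?']"
    println(quotedName(current, segment)) // None
    println(quotedName(relaxed, segment)) // Some(?)
  }
}
```

      With the class as written, the segment ['?'] cannot match because ? is excluded from the name, which is consistent with the path producing no result.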

       

      I also see no tests validating that a question mark is not allowed, so I suspect that this is a long-standing bug.

            People

              Assignee: Robert Joseph Evans (revans2)
              Reporter: Robert Joseph Evans (revans2)