SPARK-22548

Incorrect nested AND expression pushed down to JDBC data source

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.2.0
    • Fix Version/s: 2.1.3, 2.2.1, 2.3.0
    • Component/s: SQL
    • Labels:
      None

      Description

      Let’s say I have a JDBC data source table ‘foobar’ with 3 rows:

      NAME              THEID
      =======================
      fred              1
      mary              2
      joe 'foo' "bar"   3

      This query returns an incorrect result:
      SELECT * FROM foobar WHERE (THEID > 0 AND TRIM(NAME) = 'mary') OR (NAME = 'fred')

      It’s supposed to return:
      fred 1
      mary 2

      But it returns:
      fred 1
      mary 2
      joe 'foo' "bar" 3

      This is because one leg of the nested AND predicate, TRIM(NAME) = 'mary', cannot be pushed down and is silently lost during JDBC push-down filter translation. The filter that reaches the database is then effectively (THEID > 0) OR (NAME = 'fred'), which matches all three rows. The same translation method is also called by Data Source V2. I have a fix for this issue and will open a PR.
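
To make the failure mode concrete, here is a minimal toy model of filter translation. It is only a sketch: the classes `Leaf`, `And`, `Or` and the functions `translate_lossy` and `translate_strict` are hypothetical names invented for illustration and are not Spark's actual classes or the code of the fix. It assumes a translator that returns `None` for anything it cannot push down, and contrasts an AND rule that keeps whichever legs translate with a conservative rule that requires both legs.

```python
from dataclasses import dataclass
from typing import Optional, Union


# Toy expression tree. All names here are hypothetical, not Spark's API.
@dataclass(frozen=True)
class Leaf:
    sql: str
    pushable: bool  # False models a predicate like TRIM(NAME) = 'mary'


@dataclass(frozen=True)
class And:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Or:
    left: "Expr"
    right: "Expr"


Expr = Union[Leaf, And, Or]


def translate_lossy(e: Expr) -> Optional[str]:
    """Lossy variant: an AND keeps whichever legs could be translated."""
    if isinstance(e, Leaf):
        return e.sql if e.pushable else None
    left, right = translate_lossy(e.left), translate_lossy(e.right)
    if isinstance(e, And):
        legs = [x for x in (left, right) if x is not None]
        if not legs:
            return None
        if len(legs) == 1:
            return legs[0]  # the untranslatable leg is silently dropped
        return f"({legs[0]}) AND ({legs[1]})"
    # OR: both sides must translate
    if left is None or right is None:
        return None
    return f"({left}) OR ({right})"


def translate_strict(e: Expr) -> Optional[str]:
    """Conservative variant: any compound predicate needs both legs."""
    if isinstance(e, Leaf):
        return e.sql if e.pushable else None
    left, right = translate_strict(e.left), translate_strict(e.right)
    if left is None or right is None:
        return None  # leave the whole predicate for the engine to evaluate
    op = "AND" if isinstance(e, And) else "OR"
    return f"({left}) {op} ({right})"


# The predicate from the query in this report.
predicate = Or(
    And(Leaf("THEID > 0", True), Leaf("TRIM(NAME) = 'mary'", False)),
    Leaf("NAME = 'fred'", True),
)

print(translate_lossy(predicate))   # (THEID > 0) OR (NAME = 'fred')
print(translate_strict(predicate))  # None
```

Dropping a leg is harmless for a top-level AND, because the result is only a superset that the remaining leg can still filter afterwards. Under an OR it widens the predicate that is treated as fully handled, which is how the third row gets through.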

            People

            • Assignee:
              Jia Li (jliwork)
            • Reporter:
              Jia Li (jliwork)