  1. Spark
  2. SPARK-22548

Incorrect nested AND expression pushed down to JDBC data source


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.2.0
    • Fix Version/s: 2.1.3, 2.2.1, 2.3.0
    • Component/s: SQL
    • Labels: None

    Description

      Let’s say I have a JDBC data source table ‘foobar’ with 3 rows:

      NAME THEID
      ==================
      fred 1
      mary 2
      joe 'foo' "bar" 3

      This query returns an incorrect result:
      SELECT * FROM foobar WHERE (THEID > 0 AND TRIM(NAME) = 'mary') OR (NAME = 'fred')

      It’s supposed to return:
      fred 1
      mary 2

      But it returns:
      fred 1
      mary 2
      joe 'foo' "bar" 3

      This is because one leg of the nested AND predicate, TRIM(NAME) = 'mary', cannot be pushed down, and it is silently dropped during JDBC push-down filter translation. The filter that remains, (THEID > 0) OR (NAME = 'fred'), matches all three rows. The same translation method is also called by Data Source V2. I have a fix for this issue and will open a PR.
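The effect can be reproduced outside Spark with a small model of filter translation. The sketch below is not Spark's code: the expression classes, the `translate` function and its `strict_and` flag are hypothetical names made up for illustration, and TRIM is simply assumed to have no pushable form. With `strict_and=False` an AND keeps whichever leg translates, which loses the TRIM leg; with `strict_and=True` a nested AND is only translated when both legs translate, which is one way to avoid the loss.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Gt:        # column > value
    column: str
    value: int


@dataclass(frozen=True)
class Eq:        # column = value
    column: str
    value: str


@dataclass(frozen=True)
class TrimEq:    # TRIM(column) = value; assumed to have no pushable form
    column: str
    value: str


@dataclass(frozen=True)
class And:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Or:
    left: "Expr"
    right: "Expr"


Expr = Union[Gt, Eq, TrimEq, And, Or]


def translate(expr: Expr, strict_and: bool) -> Optional[Expr]:
    """Return the pushable form of expr, or None if nothing can be pushed."""
    if isinstance(expr, (Gt, Eq)):
        return expr
    if isinstance(expr, TrimEq):
        return None
    left = translate(expr.left, strict_and)
    right = translate(expr.right, strict_and)
    if isinstance(expr, Or):
        # An OR is only pushable when both branches are.
        return Or(left, right) if left is not None and right is not None else None
    # expr is an And from here on.
    if left is not None and right is not None:
        return And(left, right)
    if strict_and:
        return None  # refuse to push a partial AND
    # Lossy behaviour: keep whichever leg translated, drop the other.
    return left if left is not None else right


def evaluate(expr: Expr, row: dict) -> bool:
    """Evaluate expr against one row, the way the database or Spark would."""
    if isinstance(expr, Gt):
        return row[expr.column] > expr.value
    if isinstance(expr, Eq):
        return row[expr.column] == expr.value
    if isinstance(expr, TrimEq):
        return row[expr.column].strip() == expr.value
    if isinstance(expr, And):
        return evaluate(expr.left, row) and evaluate(expr.right, row)
    return evaluate(expr.left, row) or evaluate(expr.right, row)


rows = [
    {"NAME": "fred", "THEID": 1},
    {"NAME": "mary", "THEID": 2},
    {"NAME": "joe 'foo' \"bar\"", "THEID": 3},
]

# (THEID > 0 AND TRIM(NAME) = 'mary') OR (NAME = 'fred')
predicate = Or(And(Gt("THEID", 0), TrimEq("NAME", "mary")), Eq("NAME", "fred"))

lossy = translate(predicate, strict_and=False)
safe = translate(predicate, strict_and=True)

print("lossy filter:", lossy)
print("rows matched by lossy filter:", [r["NAME"] for r in rows if evaluate(lossy, r)])
print("strict filter:", safe)
print("rows matched by full predicate:", [r["NAME"] for r in rows if evaluate(predicate, r)])
```

In this model the lossy translation pushes (THEID > 0) OR (NAME = 'fred'), which is weaker than the original predicate, while the strict translation pushes nothing and leaves the whole predicate to be evaluated after the rows are fetched.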


          People

            Assignee: jliwork Jia Li
            Reporter: jliwork Jia Li