Uploaded image for project: 'Spark'
  1. Spark
  2. SPARK-9182

filter and groupBy on DataFrames are not passed through to jdbc source

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Critical
    • Resolution: Duplicate
    • 1.4.1
    • None
    • SQL
    • None

    Description

      When running all of these API calls, the only one that passes the filter through to the backend jdbc source is equality. All filters in these commands should be able to be passed through to the jdbc database source.

      val url="jdbc:postgresql:grahn"
      val prop = new java.util.Properties
      val emp = sqlContext.read.jdbc(url, "emp", prop)
      
      emp.filter(emp("sal") === 5000).show()
      emp.filter(emp("sal") < 5000).show()
      emp.filter("sal = 3000").show()
      emp.filter("sal > 2500").show()
      emp.filter("sal >= 2500").show()
      emp.filter("sal < 2500").show()
      emp.filter("sal <= 2500").show()
      emp.filter("sal != 3000").show()
      emp.filter("sal between 3000 and 5000").show()
      emp.filter("ename in ('SCOTT','BLAKE')").show()
      

      We see from the PostgreSQL query log the following is run, and see that only equality predicates are passed through.

      LOG:  execute <unnamed>: SET extra_float_digits = 3
      LOG:  execute <unnamed>: SELECT "empno","ename","job","mgr","hiredate","sal","comm","deptno" FROM emp WHERE sal = 5000
      LOG:  execute <unnamed>: SET extra_float_digits = 3
      LOG:  execute <unnamed>: SELECT "empno","ename","job","mgr","hiredate","sal","comm","deptno" FROM emp
      LOG:  execute <unnamed>: SET extra_float_digits = 3
      LOG:  execute <unnamed>: SELECT "empno","ename","job","mgr","hiredate","sal","comm","deptno" FROM emp WHERE sal = 3000
      LOG:  execute <unnamed>: SET extra_float_digits = 3
      LOG:  execute <unnamed>: SELECT "empno","ename","job","mgr","hiredate","sal","comm","deptno" FROM emp
      LOG:  execute <unnamed>: SET extra_float_digits = 3
      LOG:  execute <unnamed>: SELECT "empno","ename","job","mgr","hiredate","sal","comm","deptno" FROM emp
      LOG:  execute <unnamed>: SET extra_float_digits = 3
      LOG:  execute <unnamed>: SELECT "empno","ename","job","mgr","hiredate","sal","comm","deptno" FROM emp
      LOG:  execute <unnamed>: SET extra_float_digits = 3
      LOG:  execute <unnamed>: SELECT "empno","ename","job","mgr","hiredate","sal","comm","deptno" FROM emp
      LOG:  execute <unnamed>: SET extra_float_digits = 3
      LOG:  execute <unnamed>: SELECT "empno","ename","job","mgr","hiredate","sal","comm","deptno" FROM emp
      LOG:  execute <unnamed>: SET extra_float_digits = 3
      LOG:  execute <unnamed>: SELECT "empno","ename","job","mgr","hiredate","sal","comm","deptno" FROM emp
      LOG:  execute <unnamed>: SET extra_float_digits = 3
      LOG:  execute <unnamed>: SELECT "empno","ename","job","mgr","hiredate","sal","comm","deptno" FROM emp
      

      Attachments

        Issue Links

          Activity

            People

              Unassigned Unassigned
              grahn Greg Rahn
              Votes:
              3 Vote for this issue
              Watchers:
              17 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: