Griffin / GRIFFIN-332

JDBC Connector: Ability to Select Specific Columns Instead of All the Columns

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • 0.6.0
    • None
    • accuracy-batch

    Description

      Background:
      Thanks to https://issues.apache.org/jira/browse/GRIFFIN-315, we already have a JDBC connector.
      However, it currently pulls all the columns using `"SELECT * FROM $fullTableName"`.
      This causes several issues for larger JDBC tables:

      • memory overhead for spark data frame
      • longer execution time
      • resource overhead on the RDBMS

      Proposed Improvement:
      I propose a feature that allows the JDBC connector to select only the required columns.

      Example:
      Suppose we have a rule `"rule":"src.id = tgt.id and src.country = tgt.country"`. Then we only need two columns, `id` and `country`.
      So, in the connector config we can add an additional `columns` clause to select only the required columns, like below:

       

      {
        "name": "src",
        "connector": {
          "type": "jdbc",
          "config": {
            "database": "mydatabase",
            "tablename": "mytable",
            "columns": "id, country",
            "url": "jdbc:sqlserver://myhost:1433;databaseName=mydatabase",
            "user": "user",
            "password": "password",
            "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
            "where": ""
          }
        }
      }
      

      We can implement it like this: if a `columns` clause is present, use it; otherwise fall back to `*` as the default.
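
The fallback described above could be sketched roughly as follows. This is a language-neutral illustration in Python, not Griffin's actual Scala connector code. The helper name `build_select` is hypothetical, and it assumes that the full table name is `database.tablename`, that a non-empty `where` value is appended as a `WHERE` clause, and that column names are plain identifiers. The identifier check is an extra precaution that is not part of the proposal.

```python
import re

# Hypothetical helper, not part of Griffin: builds the SELECT statement
# from a connector "config" dict shaped like the JSON example above.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_select(config):
    # Assumption: the full table name is "<database>.<tablename>".
    full_table_name = f'{config["database"]}.{config["tablename"]}'

    # Split the comma-separated "columns" value; a missing or empty value
    # yields an empty list.
    raw_columns = config.get("columns") or ""
    columns = [c.strip() for c in raw_columns.split(",") if c.strip()]

    # Extra precaution (not in the proposal): reject anything that is not
    # a plain identifier, since the value is spliced into the SQL text.
    for col in columns:
        if not _IDENTIFIER.match(col):
            raise ValueError(f"invalid column name: {col!r}")

    # Use the listed columns if any were given, otherwise default to "*".
    select_list = ", ".join(columns) if columns else "*"
    query = f"SELECT {select_list} FROM {full_table_name}"

    # Assumption: a non-empty "where" value is appended as a WHERE clause.
    where = (config.get("where") or "").strip()
    if where:
        query += f" WHERE {where}"
    return query
```

With the example config, this would produce `SELECT id, country FROM mydatabase.mytable`. A config without `columns` keeps the current `SELECT *` behaviour, so existing configurations would not need to change.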

            People

              Assignee: Unassigned
              Reporter: Obaidul Karim (obaid)
              Votes: 1
              Watchers: 2