ZEPPELIN-1115

Python: add SQL for DataFrame support with Table Display system

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.7.0
    • Component/s: python-interpreter
    • Labels:
      None

      Description

      In the Spark interpreter group we have the %sql interpreter, which supports the Table Display system. We use it in the tutorial notebook, and it is very convenient for data exploration.

      The idea is to have the same kind of support in the Python interpreter group, i.e.
      `%python.sql`, but only for Pandas DataFrames.
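
To make the goal concrete, here is a minimal sketch of turning a SQL result into Table Display text, which is a `%table` marker followed by a tab-separated header line and one tab-separated line per row. It is not the interpreter's implementation: the PR discussed here relies on Pandas and PandaSQL, while this stand-in uses the standard-library `sqlite3` module so that it runs without extra packages. The helper names `to_table_display` and `run_sql`, and the sample rows, are hypothetical.

```python
import sqlite3


def to_table_display(columns, rows):
    """Format a result set as Table Display text.

    The output is the %table marker, a tab-separated header line,
    then one tab-separated line per row.
    """
    lines = ["\t".join(columns)]
    lines += ["\t".join(str(value) for value in row) for row in rows]
    return "%table " + "\n".join(lines)


def run_sql(conn, query):
    """Run a query and return its result formatted for Table Display."""
    cursor = conn.execute(query)
    columns = [description[0] for description in cursor.description]
    return to_table_display(columns, cursor.fetchall())


# Hypothetical sample data standing in for a DataFrame named `rates`.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rates (age INTEGER, job TEXT, balance INTEGER)")
conn.executemany(
    "INSERT INTO rates VALUES (?, ?, ?)",
    [(30, "unemployed", 1787), (33, "services", 4789)],
)

output = run_sql(conn, "SELECT age, balance FROM rates ORDER BY age LIMIT 10")
print(output)
```

A notebook front end that understands the `%table` marker can then render the text as a sortable table or chart instead of raw output.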

          Activity

          githubbot ASF GitHub Bot added a comment -

          GitHub user bzz opened a pull request:

          https://github.com/apache/zeppelin/pull/1164

          ZEPPELIN-1115:

              1. What is this PR for?
                Add new interpreter to Python group: `%python.sql` for SQL over DataFrame support
              1. What type of PR is it?
                Improvement
              1. Todos
          • [x] add new interpreter `%python.sql`
          • [x] add test
          • [x] make Python-dependent tests excluded from CI
              • PythonInterpreterWithPythonInstalledTest
              • PythonPandasSqlInterpreterTest
              • run manually by `mvn -Dpython.test.exclude='' test -pl python -am`
          • [x] add docs for `%python.sql`
          • [ ] make `%python.sql` fail gracefully in case there is no Pandas or PandaSQL installed
              1. What is the Jira issue?
                ZEPPELIN-1115 (https://issues.apache.org/jira/browse/ZEPPELIN-1115)
              1. How should this be tested?
                Outline the steps to test the PR here.
              1. Screenshots (if appropriate)
              1. Questions:
          • Do the license files need an update?
          • Are there breaking changes for older versions?
          • Does this need documentation?
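
The open TODO in the list above, failing gracefully when Pandas or PandaSQL is missing, can be sketched as a dependency check that runs before any query. This is only an illustration under assumptions, not the code merged in the PR: the function names and the error wording are hypothetical, and the module names `pandas` and `pandasql` are assumed to be the import names of the two packages.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found, without importing them."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def check_sql_dependencies(required=("pandas", "pandasql")):
    """Return a readable error message if a dependency is missing, else None.

    The default module names are assumptions about what the interpreter needs.
    """
    missing = missing_modules(required)
    if missing:
        return (
            "%python.sql is unavailable: missing Python package(s) "
            + ", ".join(missing)
        )
    return None


# Report the problem as a message instead of letting an ImportError
# surface as a stack trace in the notebook paragraph.
message = check_sql_dependencies()
print(message if message else "All dependencies found")
```

Checking with `importlib.util.find_spec` avoids importing the packages just to test for their presence, so the check stays cheap.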

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/bzz/incubator-zeppelin ZEPPELIN-1115/python/add-sql-for-dataframes

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/zeppelin/pull/1164.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #1164


          commit 5f0621230e9bf3a33114127572c60aff3a600e50
          Author: Alexander Bezzubov <bzz@apache.org>
          Date: 2016-07-05T09:53:27Z

          Add draft implementation of %python.sql for DataFrames

          commit 3f4422ca87110f71eb9e401a9c22567db98fc8e4
          Author: Alexander Bezzubov <bzz@apache.org>
          Date: 2016-07-08T03:55:27Z

          Add %python.sql to interpreter menue

          commit 86b16209bc13da2b74416b9485d64a89a55454f2
          Author: Alexander Bezzubov <bzz@apache.org>
          Date: 2016-07-11T11:13:56Z

          Complete implementation of the PythonPandasSqlInterpreter

          commit da9dc742dd621a9c2e9e0cf81e8b07cecac8bfbb
          Author: Alexander Bezzubov <bzz@apache.org>
          Date: 2016-07-11T11:35:53Z

          Make test for PythonPandasSqlInterpreter usable

          commit 37c53a60b17b27fc4097063a7370561f52f9e792
          Author: Alexander Bezzubov <bzz@apache.org>
          Date: 2016-07-11T14:11:19Z

          Add docs for %python.sql feature

          commit ee6048aeb1f9c4a2bcae72d85cf8acf6ac998d69
          Author: Alexander Bezzubov <bzz@apache.org>
          Date: 2016-07-11T14:40:40Z

          Update Python Matplotlib notebook example


          githubbot ASF GitHub Bot added a comment -

          Github user bzz commented on the issue:

          https://github.com/apache/zeppelin/pull/1164

          Documentation review addressed in e432961

          githubbot ASF GitHub Bot added a comment -

          Github user bzz commented on the issue:

          https://github.com/apache/zeppelin/pull/1164

          Last TODO and a feedback on graceful failure addressed in 11da87c

          githubbot ASF GitHub Bot added a comment -

          Github user khalidhuseynov commented on the issue:

          https://github.com/apache/zeppelin/pull/1164

          Thanks for the improvement, LGTM

          githubbot ASF GitHub Bot added a comment -

          Github user bzz commented on the issue:

          https://github.com/apache/zeppelin/pull/1164

          Thank you guys for prompt reviews!

          I have added one minor TODO item to clean up the test profiles on CI; will merge after #747.

          githubbot ASF GitHub Bot added a comment -

          Github user bzz commented on the issue:

          https://github.com/apache/zeppelin/pull/1164

          Done, merging after CI ♻️ if there is no further discussion

          githubbot ASF GitHub Bot added a comment -

          Github user asfgit closed the pull request at:

          https://github.com/apache/zeppelin/pull/1164

          githubbot ASF GitHub Bot added a comment -

          GitHub user bzz opened a pull request:

          https://github.com/apache/zeppelin/pull/1238

          ZEPPELIN-1115: add correct %python.sql interpreter name to configuration

              1. What is this PR for?
                Hotfix for ZEPPELIN-1244 (https://issues.apache.org/jira/browse/ZEPPELIN-1244)
              1. What type of PR is it?
                Hot Fix
              1. What is the Jira issue?
                ZEPPELIN-1244 (https://issues.apache.org/jira/browse/ZEPPELIN-1244)
              1. How should this be tested?
                Remove the interpreter configuration and try `%python.sql`, e.g. over [bank.csv](http://mlr.cs.umass.edu/ml/datasets/Bank+Marketing) read as a `pandas.DataFrame`
                ```
                # stop Zeppelin
                rm conf/interpreter-setting.json
                # start Zeppelin

                %python
                import pandas as pd
                rates = pd.read_csv("http://www3.dsi.uminho.pt/pcortez/data/bank.csv", sep=";")

                %python.sql
                SELECT * FROM rates LIMIT 10
                ```

              1. Questions:
          • Do the license files need an update? No
          • Are there breaking changes for older versions? No
          • Does this need documentation? No

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/bzz/incubator-zeppelin python/fix/ZEPPELIN-1244

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/zeppelin/pull/1238.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #1238


          commit befeebe79546aada8db306880b3cf375c6ab7db0
          Author: Alexander Bezzubov <bzz@apache.org>
          Date: 2016-07-28T09:49:32Z

          add correct %python.sql interpreter name to configuration


          githubbot ASF GitHub Bot added a comment -

          Github user bzz commented on the issue:

          https://github.com/apache/zeppelin/pull/1238

          Merging it ASAP as a hotfix

          githubbot ASF GitHub Bot added a comment -

          Github user Leemoonsoo commented on the issue:

          https://github.com/apache/zeppelin/pull/1238

          LGTM

          githubbot ASF GitHub Bot added a comment -

          Github user bzz commented on the issue:

          https://github.com/apache/zeppelin/pull/1238

          Thanks for the prompt review, merging if there is no further discussion.

          githubbot ASF GitHub Bot added a comment -

          Github user asfgit closed the pull request at:

          https://github.com/apache/zeppelin/pull/1238


            People

            • Assignee: Alexander Bezzubov (bzz)
            • Reporter: Alexander Bezzubov (bzz)
            • Votes: 0
            • Watchers: 2
