Uploaded image for project: 'Zeppelin'
  1. Zeppelin
  2. ZEPPELIN-1344 Improving matplotlib integration with zeppelin
  3. ZEPPELIN-1345

Create a custom matplotlib backend that natively supports inline plotting in a python interpreter cell

    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.6.2
    • Fix Version/s: 0.7.0
    • Component/s: python-interpreter
    • Labels:
      None

      Description

      This is a sub-task for ZEPPELIN-1344. The deliverable will be a custom matplotlib backend which natively supports inline plotting in zeppelin notebook cells (for the python and eventually pyspark interpreters). More information regarding matplotlib backends can be found here:

      http://matplotlib.org/faq/usage_faq.html#what-is-a-backend

      The focus of this issue will be getting support for plotting just static images similar to those generated in a Jupyter notebook cell when %matplotlib inline is used. Getting support for more interactive plotting will also require a custom backend that will be based off of this one.

        Issue Links

          Activity

          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user a-rodin commented on the issue:

          https://github.com/apache/zeppelin/pull/1534

          I can't make the plot update from a different cell work in `%python` interpreter. I described the problem in details in ZEPPELIN-2511(https://issues.apache.org/jira/browse/ZEPPELIN-2511).

          Show
          githubbot ASF GitHub Bot added a comment - Github user a-rodin commented on the issue: https://github.com/apache/zeppelin/pull/1534 I can't make the plot update from a different cell work in `%python` interpreter. I described the problem in details in ZEPPELIN-2511 ( https://issues.apache.org/jira/browse/ZEPPELIN-2511 ).
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user bzz commented on the issue:

          https://github.com/apache/zeppelin/pull/1534

          @agoodm sorry for digging this out, but I have just realized that this PR changes only [Python `PyZeppelinContext`](https://github.com/apache/zeppelin/blob/master/python/src/main/resources/bootstrap.py#L121) and [Pyspark `PyZeppelinContext `](https://github.com/apache/zeppelin/blob/master/spark/src/main/resources/python/zeppelin_pyspark.py#L53) but does not touch [Python `Py4jZeppelinContext`](https://github.com/apache/zeppelin/blob/master/python/src/main/resources/bootstrap_input.py#L28) - wich is used for dynamic forms, in case `py4j` is installed on the system.

          In latter case Python interpreter becomes not useable - any line results in

          ```
          x = 1

          Traceback (most recent call last):
          File "<stdin>", line 1, in <module>
          AttributeError: 'Py4jZeppelinContext' object has no attribute '_displayhook'
          ```

          Show
          githubbot ASF GitHub Bot added a comment - Github user bzz commented on the issue: https://github.com/apache/zeppelin/pull/1534 @agoodm sorry for digging this out, but I have just realized that this PR changes only [Python `PyZeppelinContext`] ( https://github.com/apache/zeppelin/blob/master/python/src/main/resources/bootstrap.py#L121 ) and [Pyspark `PyZeppelinContext `] ( https://github.com/apache/zeppelin/blob/master/spark/src/main/resources/python/zeppelin_pyspark.py#L53 ) but does not touch [Python `Py4jZeppelinContext`] ( https://github.com/apache/zeppelin/blob/master/python/src/main/resources/bootstrap_input.py#L28 ) - wich is used for dynamic forms, in case `py4j` is installed on the system. In latter case Python interpreter becomes not useable - any line results in ``` x = 1 Traceback (most recent call last): File "<stdin>", line 1, in <module> AttributeError: 'Py4jZeppelinContext' object has no attribute '_displayhook' ```
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user asfgit closed the pull request at:

          https://github.com/apache/zeppelin/pull/1534

          Show
          githubbot ASF GitHub Bot added a comment - Github user asfgit closed the pull request at: https://github.com/apache/zeppelin/pull/1534
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user bzz commented on the issue:

          https://github.com/apache/zeppelin/pull/1534

          Looks great to me, 👍 for extra tests. Let's merge

          Show
          githubbot ASF GitHub Bot added a comment - Github user bzz commented on the issue: https://github.com/apache/zeppelin/pull/1534 Looks great to me, 👍 for extra tests. Let's merge
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user Leemoonsoo commented on the issue:

          https://github.com/apache/zeppelin/pull/1534

          Merge to master if there's no further discussion.

          Show
          githubbot ASF GitHub Bot added a comment - Github user Leemoonsoo commented on the issue: https://github.com/apache/zeppelin/pull/1534 Merge to master if there's no further discussion.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user Leemoonsoo commented on the issue:

          https://github.com/apache/zeppelin/pull/1534

          Looks good to me!

          Show
          githubbot ASF GitHub Bot added a comment - Github user Leemoonsoo commented on the issue: https://github.com/apache/zeppelin/pull/1534 Looks good to me!
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user felixcheung commented on the issue:

          https://github.com/apache/zeppelin/pull/1534

          Great, thank you for digging it through.

          Any more comment?

          Show
          githubbot ASF GitHub Bot added a comment - Github user felixcheung commented on the issue: https://github.com/apache/zeppelin/pull/1534 Great, thank you for digging it through. Any more comment?
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user agoodm commented on the issue:

          https://github.com/apache/zeppelin/pull/1534

          There is a JIRA issue: (ZEPPELIN-1623(https://issues.apache.org/jira/browse/ZEPPELIN-1623))

          Show
          githubbot ASF GitHub Bot added a comment - Github user agoodm commented on the issue: https://github.com/apache/zeppelin/pull/1534 There is a JIRA issue: ( ZEPPELIN-1623 ( https://issues.apache.org/jira/browse/ZEPPELIN-1623 ))
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user felixcheung commented on the issue:

          https://github.com/apache/zeppelin/pull/1534

          LGTM.
          Is there any known issue with SELENIUM tests? from a quick glance it seems to have failed a few times here.

          Show
          githubbot ASF GitHub Bot added a comment - Github user felixcheung commented on the issue: https://github.com/apache/zeppelin/pull/1534 LGTM. Is there any known issue with SELENIUM tests? from a quick glance it seems to have failed a few times here.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user agoodm commented on the issue:

          https://github.com/apache/zeppelin/pull/1534

          @felixcheung done, I have tested with the tutorial notebook and it seems to work as well as before. Good to merge now?

          Show
          githubbot ASF GitHub Bot added a comment - Github user agoodm commented on the issue: https://github.com/apache/zeppelin/pull/1534 @felixcheung done, I have tested with the tutorial notebook and it seems to work as well as before. Good to merge now?
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user felixcheung commented on the issue:

          https://github.com/apache/zeppelin/pull/1534

          How do we think about this https://github.com/apache/zeppelin/pull/1534#issuecomment-258332768

          Show
          githubbot ASF GitHub Bot added a comment - Github user felixcheung commented on the issue: https://github.com/apache/zeppelin/pull/1534 How do we think about this https://github.com/apache/zeppelin/pull/1534#issuecomment-258332768
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user Leemoonsoo commented on the issue:

          https://github.com/apache/zeppelin/pull/1534

          LGTM and merge to master if there's no further discussion.
          Thanks @agoodm for the great contribution.

          Show
          githubbot ASF GitHub Bot added a comment - Github user Leemoonsoo commented on the issue: https://github.com/apache/zeppelin/pull/1534 LGTM and merge to master if there's no further discussion. Thanks @agoodm for the great contribution.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user agoodm commented on the issue:

          https://github.com/apache/zeppelin/pull/1534

          @Leemoonsoo Tests are done, the only failure seems to be from a known flaky test ZEPPELIN-1623(https://issues.apache.org/jira/browse/ZEPPELIN-1623). Are we good to merge?

          Show
          githubbot ASF GitHub Bot added a comment - Github user agoodm commented on the issue: https://github.com/apache/zeppelin/pull/1534 @Leemoonsoo Tests are done, the only failure seems to be from a known flaky test ZEPPELIN-1623 ( https://issues.apache.org/jira/browse/ZEPPELIN-1623 ). Are we good to merge?
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user Leemoonsoo commented on the issue:

          https://github.com/apache/zeppelin/pull/1534

          @agoodm now CI has fixed on master. Could you rebase and see if CI test goes green?

          Show
          githubbot ASF GitHub Bot added a comment - Github user Leemoonsoo commented on the issue: https://github.com/apache/zeppelin/pull/1534 @agoodm now CI has fixed on master. Could you rebase and see if CI test goes green?
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user agoodm commented on the issue:

          https://github.com/apache/zeppelin/pull/1534

          @Leemoonsoo Yep, I can confirm... right before you posted that comment, I patched Never mind, I manually patched in the changes from that PR to my branch and got the exact same stacktrace as you in my interpreter log file after running a `%pyspark` paragraph.

          Show
          githubbot ASF GitHub Bot added a comment - Github user agoodm commented on the issue: https://github.com/apache/zeppelin/pull/1534 @Leemoonsoo Yep, I can confirm... right before you posted that comment, I patched Never mind, I manually patched in the changes from that PR to my branch and got the exact same stacktrace as you in my interpreter log file after running a `%pyspark` paragraph.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user Leemoonsoo commented on the issue:

          https://github.com/apache/zeppelin/pull/1534

          @agoodm It's my bad, i was testing it with https://github.com/apache/zeppelin/pull/1596 merged.
          Tested just last commit in the branch again, and the error has gone.

          Show
          githubbot ASF GitHub Bot added a comment - Github user Leemoonsoo commented on the issue: https://github.com/apache/zeppelin/pull/1534 @agoodm It's my bad, i was testing it with https://github.com/apache/zeppelin/pull/1596 merged. Tested just last commit in the branch again, and the error has gone.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user agoodm commented on the issue:

          https://github.com/apache/zeppelin/pull/1534

          @Leemoonsoo Sure, I will do that after this is merged.

          As for your error, I cannot reproduce it. I tested both the latest code from master as well as my fork after rebasing. How are you performing the maven build? I use:
          ```
          mvn clean package -Pbuild-distr -Ppyspark -DskipTests
          ```
          without modifying any of the configuration files.

          Show
          githubbot ASF GitHub Bot added a comment - Github user agoodm commented on the issue: https://github.com/apache/zeppelin/pull/1534 @Leemoonsoo Sure, I will do that after this is merged. As for your error, I cannot reproduce it. I tested both the latest code from master as well as my fork after rebasing. How are you performing the maven build? I use: ``` mvn clean package -Pbuild-distr -Ppyspark -DskipTests ``` without modifying any of the configuration files.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user Leemoonsoo commented on the issue:

          https://github.com/apache/zeppelin/pull/1534

          @agoodm
          I think placement of common python lib and problem printing output when `close=False` can be handled in the separate issues. Do you mind create issues?

          I have tested last commit and found that if i run %pyspark paragraph when spark interpreter is not initialized (i.e. right after Zeppelin start or spark interpreter restart), i'm seeing following error.

          ```
          ERROR [2016-11-04 11:32:09,138] (

          {pool-1-thread-2} PySparkInterpreter.java[open]:168) - Error
          java.lang.NullPointerException
          at org.apache.zeppelin.spark.PySparkInterpreter.createGatewayServerAndStartScript(PySparkInterpreter.java:197)
          at org.apache.zeppelin.spark.PySparkInterpreter.open(PySparkInterpreter.java:166)
          at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
          at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer.open(RemoteInterpreterServer.java:250)
          at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Processor$open.getResult(RemoteInterpreterService.java:1621)
          at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Processor$open.getResult(RemoteInterpreterService.java:1606)
          at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
          at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
          at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
          at java.lang.Thread.run(Thread.java:745)
          ERROR [2016-11-04 11:32:09,140] ({pool-1-thread-2}

          TThreadPoolServer.java[run]:296) - Error occurred during processing of message.
          org.apache.zeppelin.interpreter.InterpreterException: java.lang.NullPointerException
          at org.apache.zeppelin.spark.PySparkInterpreter.open(PySparkInterpreter.java:169)
          at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
          at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer.open(RemoteInterpreterServer.java:250)
          at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Processor$open.getResult(RemoteInterpreterService.java:1621)
          at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Processor$open.getResult(RemoteInterpreterService.java:1606)
          at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
          at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
          at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
          at java.lang.Thread.run(Thread.java:745)
          Caused by: java.lang.NullPointerException
          at org.apache.zeppelin.spark.PySparkInterpreter.createGatewayServerAndStartScript(PySparkInterpreter.java:197)
          at org.apache.zeppelin.spark.PySparkInterpreter.open(PySparkInterpreter.java:166)
          ... 10 more
          ```

          Do you see the same error?

          Show
          githubbot ASF GitHub Bot added a comment - Github user Leemoonsoo commented on the issue: https://github.com/apache/zeppelin/pull/1534 @agoodm I think placement of common python lib and problem printing output when `close=False` can be handled in the separate issues. Do you mind create issues? I have tested last commit and found that if i run %pyspark paragraph when spark interpreter is not initialized (i.e. right after Zeppelin start or spark interpreter restart), i'm seeing following error. ``` ERROR [2016-11-04 11:32:09,138] ( {pool-1-thread-2} PySparkInterpreter.java [open] :168) - Error java.lang.NullPointerException at org.apache.zeppelin.spark.PySparkInterpreter.createGatewayServerAndStartScript(PySparkInterpreter.java:197) at org.apache.zeppelin.spark.PySparkInterpreter.open(PySparkInterpreter.java:166) at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69) at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer.open(RemoteInterpreterServer.java:250) at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Processor$open.getResult(RemoteInterpreterService.java:1621) at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Processor$open.getResult(RemoteInterpreterService.java:1606) at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) ERROR [2016-11-04 11:32:09,140] ({pool-1-thread-2} TThreadPoolServer.java [run] :296) - Error occurred during processing of message. org.apache.zeppelin.interpreter.InterpreterException: java.lang.NullPointerException at org.apache.zeppelin.spark.PySparkInterpreter.open(PySparkInterpreter.java:169) at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69) at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer.open(RemoteInterpreterServer.java:250) at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Processor$open.getResult(RemoteInterpreterService.java:1621) at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Processor$open.getResult(RemoteInterpreterService.java:1606) at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.NullPointerException at org.apache.zeppelin.spark.PySparkInterpreter.createGatewayServerAndStartScript(PySparkInterpreter.java:197) at org.apache.zeppelin.spark.PySparkInterpreter.open(PySparkInterpreter.java:166) ... 10 more ``` Do you see the same error?
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user agoodm commented on the issue:

          https://github.com/apache/zeppelin/pull/1534

          @felixcheung
          In that case, I think that should be fine. Will do shortly.

          @Leemoonsoo
          The latest commit should fix the issue I just mentioned. Two other important things to mention:

          1) Any print statements made prior to printing the `%html` or `%angular` won't show up in the results. This means that when `close=False` and when any plots are open, print statements won't display since the output type magic will always come after user entered print-statements. This can be fixed for the matplotlib backend specifically by adding a pre-execute hook that prints the magic of the currently open plot, but it's not really ideal.

          2) Earlier we both had issues loading the tutorial notebook. As it turns out, the reason appears to be due the angular display system. This happens whenever I stop Zeppelin whenever variables are still bound to the angular display system. If I make sure to unbind everything first, then I can safely start Zeppelin again. I am pretty sure this issue was introduced in more recent commits since I didn't have this problem when I first made the tutorial notebook. It was not until after I rebased my branch to reproduce the problem when you initially discovered it. This is actually a serious issue, and in this case it would be bad to leave it unaddressed with this PR since that would mean anyone that runs the last example in the tutorial notebook without calling `plt.close()` at the end will brick their zeppelin installation until removing the notebook.

          How do you think we should proceed?

          Show
          githubbot ASF GitHub Bot added a comment - Github user agoodm commented on the issue: https://github.com/apache/zeppelin/pull/1534 @felixcheung In that case, I think that should be fine. Will do shortly. @Leemoonsoo The latest commit should fix the issue I just mentioned. Two other important things to mention: 1) Any print statements made prior to printing the `%html` or `%angular` won't show up in the results. This means that when `close=False` and when any plots are open, print statements won't display since the output type magic will always come after user entered print-statements. This can be fixed for the matplotlib backend specifically by adding a pre-execute hook that prints the magic of the currently open plot, but it's not really ideal. 2) Earlier we both had issues loading the tutorial notebook. As it turns out, the reason appears to be due the angular display system. This happens whenever I stop Zeppelin whenever variables are still bound to the angular display system. If I make sure to unbind everything first, then I can safely start Zeppelin again. I am pretty sure this issue was introduced in more recent commits since I didn't have this problem when I first made the tutorial notebook. It was not until after I rebased my branch to reproduce the problem when you initially discovered it. This is actually a serious issue, and in this case it would be bad to leave it unaddressed with this PR since that would mean anyone that runs the last example in the tutorial notebook without calling `plt.close()` at the end will brick their zeppelin installation until removing the notebook. How do you think we should proceed?
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user felixcheung commented on the issue:

          https://github.com/apache/zeppelin/pull/1534

          Great.

          I think it will be better to have interpreter/lib - it might be easier to set ACL on a common interpreter root in the event there are user downloaded interpreters for examples.

          Show
          githubbot ASF GitHub Bot added a comment - Github user felixcheung commented on the issue: https://github.com/apache/zeppelin/pull/1534 Great. I think it will be better to have interpreter/lib - it might be easier to set ACL on a common interpreter root in the event there are user downloaded interpreters for examples.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user agoodm commented on the issue:

          https://github.com/apache/zeppelin/pull/1534

          @felixcheung
          I updated the documentation and also added the example gif shown in this PR. In regards to getting python dependencies installed for our CI tests, I would be glad to help with that. With that in mind though I think I'll hold off on committing a unit test for pyspark until then since I'll probably get around to doing this anyway right after this PR gets merged, so not much point in going through the trouble of excluding the test in the `pom.xml`.

          The reason I put the backend files in `/lib` and not in `/interpreter/python/lib` is because I didn't want to worry about editing multiple versions of the same file if these were to be used by another interpreter (eg, I would also want to have them in `/interpreter/spark/lib`). I suppose it may be possible to just have a top-level `lib/python` get copied directly to both directories through maven, but I didn't see the added complexity being worth it.

          Also on a final note @Leemoonsoo , the previous CI failures have alerted me to another issue: With `%pyspark`, the output from the last statement in a paragraph no longer gets automatically printed, eg

          ```python
          %pyspark
          x = 1
          x
          ```
          This is because the do-nothing `displayhook()` call is now always the last statement thanks to the hook registry. This is only a problem with `%pyspark`, but not in `python`. I don't know enough about the internals of the former that would explain the difference, but it might be something we should look into later.

          Show
          githubbot ASF GitHub Bot added a comment - Github user agoodm commented on the issue: https://github.com/apache/zeppelin/pull/1534 @felixcheung I updated the documentation and also added the example gif shown in this PR. In regards to getting python dependencies installed for our CI tests, I would be glad to help with that. With that in mind though I think I'll hold off on committing a unit test for pyspark until then since I'll probably get around to doing this anyway right after this PR gets merged, so not much point in going through the trouble of excluding the test in the `pom.xml`. The reason I put the backend files in `/lib` and not in `/interpreter/python/lib` is because I didn't want to worry about editing multiple versions of the same file if these were to be used by another interpreter (eg, I would also want to have them in `/interpreter/spark/lib`). I suppose it may be possible to just have a top-level `lib/python` get copied directly to both directories through maven, but I didn't see the added complexity being worth it. Also on a final note @Leemoonsoo , the previous CI failures have alerted me to another issue: With `%pyspark`, the output from the last statement in a paragraph no longer gets automatically printed, eg ```python %pyspark x = 1 x ``` This is because the do-nothing `displayhook()` call is now always the last statement thanks to the hook registry. This is only a problem with `%pyspark`, but not in `python`. I don't know enough about the internals of the former that would explain the difference, but it might be something we should look into later.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user felixcheung commented on the issue:

          https://github.com/apache/zeppelin/pull/1534

          about the /lib directory, would it make sense if it's under the interpreter directory (like interpreter/python/lib)? seems like that would be consistent with everything interpreter is isolated

          Show
          githubbot ASF GitHub Bot added a comment - Github user felixcheung commented on the issue: https://github.com/apache/zeppelin/pull/1534 about the /lib directory, would it make sense if it's under the interpreter directory (like interpreter/python/lib)? seems like that would be consistent with everything interpreter is isolated
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user felixcheung commented on the issue:

          https://github.com/apache/zeppelin/pull/1534

          @agoodm how about a bit on the output in docs/interpreter/python.md? I do see your reference to the notebook but it is a bit harder for general user to access that way (and it is not readily readable off github either)

          great point on more packages for python and more tests 😄 perhaps you could help later?

          Show
          githubbot ASF GitHub Bot added a comment - Github user felixcheung commented on the issue: https://github.com/apache/zeppelin/pull/1534 @agoodm how about a bit on the output in docs/interpreter/python.md? I do see your reference to the notebook but it is a bit harder for general user to access that way (and it is not readily readable off github either) great point on more packages for python and more tests 😄 perhaps you could help later?
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user agoodm commented on the issue:

          https://github.com/apache/zeppelin/pull/1534

          @felixcheung The main idea is that the output is set to HTML if `angular=False` (default). If `angular=True`, the figure data gets bound to the angular display system by figure id and the output is set to angular. This is essentially explained already in the tutorial notebook. Would you like to see some separate documentation somewhere else? If so, where do you think I should put it?

          As for unit tests for pyspark, I believe this would be easy enough to do but in that case I would need to exclude them from the main test suite as we do with python because matplotlib is required. We should probably have python, matplotlib, pandas, and pandasql automatically downloaded for our CI tests, most likely using conda since those packages are in binary form.

          Show
          githubbot ASF GitHub Bot added a comment - Github user agoodm commented on the issue: https://github.com/apache/zeppelin/pull/1534 @felixcheung The main idea is that the output is set to HTML if `angular=False` (default). If `angular=True`, the figure data gets bound to the angular display system by figure id and the output is set to angular. This is essentially explained already in the tutorial notebook. Would you like to see some separate documentation somewhere else? If so, where do you think I should put it? As for unit tests for pyspark, I believe this would be easy enough to do but in that case I would need to exclude them from the main test suite as we do with python because matplotlib is required. We should probably have python, matplotlib, pandas, and pandasql automatically downloaded for our CI tests, most likely using conda since those packages are in binary form.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user felixcheung commented on the issue:

          https://github.com/apache/zeppelin/pull/1534

          very cool! could you add some documentation on how html or angular output can plug in to this?
          would it be possible to add the same tests to Spark pyspark interpreter too?

          Show
          githubbot ASF GitHub Bot added a comment - Github user felixcheung commented on the issue: https://github.com/apache/zeppelin/pull/1534 very cool! could you add some documentation on how html or angular output can plug in to this? would it be possible to add the same tests to Spark pyspark interpreter too?
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user Leemoonsoo commented on the issue:

          https://github.com/apache/zeppelin/pull/1534

          @agoodm Thanks for the quick update. It works well! Let's see if CI goes green.

          Show
          githubbot ASF GitHub Bot added a comment - Github user Leemoonsoo commented on the issue: https://github.com/apache/zeppelin/pull/1534 @agoodm Thanks for the quick update. It works well! Let's see if CI goes green.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user agoodm commented on the issue:

          https://github.com/apache/zeppelin/pull/1534

          @Leemoonsoo Try it now, I think it should work. The main thing I changed was printing the last image as HTML instead of ANGULAR (for the purpose of saving the image in the notebook). I also exported the old note first before making these changes, then deleted the old note in place of this new one.

          Show
          githubbot ASF GitHub Bot added a comment - Github user agoodm commented on the issue: https://github.com/apache/zeppelin/pull/1534 @Leemoonsoo Try it now, I think it should work. The main thing I changed was printing the last image as HTML instead of ANGULAR (for the purpose of saving the image in the notebook). I also exported the old note first before making these changes, then deleted the old note in place of this new one.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user agoodm commented on the issue:

          https://github.com/apache/zeppelin/pull/1534

          @Leemoonsoo
          Yep, I also had trouble starting Zeppelin. I can run it on my main development directory, but couldn't when I cloned the current master branch to an empty directory and patched my commits manually. I confirmed this by deleting the note directory and then restarting Zeppelin, after which there were no problems.

          Show
          githubbot ASF GitHub Bot added a comment - Github user agoodm commented on the issue: https://github.com/apache/zeppelin/pull/1534 @Leemoonsoo Yep, I also had trouble starting Zeppelin. I can run it on my main development directory, but couldn't when I cloned the current master branch to an empty directory and patched my commits manually. I confirmed this by deleting the note directory and then restarting Zeppelin, after which there were no problems.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user Leemoonsoo commented on the issue:

          https://github.com/apache/zeppelin/pull/1534

          Thanks @agoodm for great contribution.
          I have tested this and it works really well on both %pyspark and %python.
          Also verified `/lib/python` dir is packaged with -Pbuild-distr flag.

          One thing is, with the tutorial note `/notebook/2BQA35CJZ`, Zeppelin fails to start with error

          ```
          Caused by: java.lang.NullPointerException
          at org.apache.zeppelin.notebook.Notebook.loadNoteFromRepo(Notebook.java:447)
          at org.apache.zeppelin.notebook.Notebook.loadAllNotes(Notebook.java:469)
          at org.apache.zeppelin.notebook.Notebook.<init>(Notebook.java:123)
          at org.apache.zeppelin.server.ZeppelinServer.<init>(ZeppelinServer.java:107)
          ... 29 more
          ```

          Without tutorial note `notebook/2BQA35CJZ`, Zeppelin starts without problem.
          Do you experience the same problem with tutorial note?

          Show
          githubbot ASF GitHub Bot added a comment - Github user Leemoonsoo commented on the issue: https://github.com/apache/zeppelin/pull/1534 Thanks @agoodm for great contribution. I have tested this and it works really well on both %pyspark and %python. Also verified `/lib/python` dir is packaged with -Pbuild-distr flag. One thing is, with the tutorial note `/notebook/2BQA35CJZ`, Zeppelin fails to start with error ``` Caused by: java.lang.NullPointerException at org.apache.zeppelin.notebook.Notebook.loadNoteFromRepo(Notebook.java:447) at org.apache.zeppelin.notebook.Notebook.loadAllNotes(Notebook.java:469) at org.apache.zeppelin.notebook.Notebook.<init>(Notebook.java:123) at org.apache.zeppelin.server.ZeppelinServer.<init>(ZeppelinServer.java:107) ... 29 more ``` Without tutorial note `notebook/2BQA35CJZ`, Zeppelin starts without problem. Do you experience the same problem with tutorial note?
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user agoodm commented on the issue:

          https://github.com/apache/zeppelin/pull/1534

          Ok, that should do it for now. @Leemoonsoo @bzz @felixcheung please feel free to review.

          Show
          githubbot ASF GitHub Bot added a comment - Github user agoodm commented on the issue: https://github.com/apache/zeppelin/pull/1534 Ok, that should do it for now. @Leemoonsoo @bzz @felixcheung please feel free to review.
          Hide
          githubbot ASF GitHub Bot added a comment -

          GitHub user agoodm opened a pull request:

          https://github.com/apache/zeppelin/pull/1534

          [WIP]ZEPPELIN-1345 - Create a custom matplotlib backend that natively supports inline plotting in a python interpreter cell

              1. What is this PR for?
                This PR is the first of two major steps needed to improve matplotlib integration in Zeppelin (ZEPPELIN-1344). The latter, which is a plotting backend with fully interactive tools enabled, will be done afterwards in a separate PR. This PR specifically for automatically displaying output from calls to matplotlib plotting functions inline with each paragraph. Thanks to the addition of post-execute hooks (ZEPPELIN-1423), there is no need to call any `show()` function to display an inline plot, just like in Jupyter.
              1. What type of PR is it?
                Improvement
              1. Todos
                The main code has been written and anyone who reads this is encouraged to test it, but there are a few minor todos:
          • [ ] - Add unit tests
          • [ ] - Add documentation
          • [ ] - Add screenshot showing iterative plotting with angular mode
              1. What is the Jira issue?
                ZEPPELIN-1345(https://issues.apache.org/jira/browse/ZEPPELIN-1345)
              1. How should this be tested?
                In a pyspark or python paragraph, enter and run
                ```python
                import matplotlib.pyplot as plt
                plt.plot([1, 2, 3])
                ```
                The plot should be displayed automatically without calling any `show()` function whatsoever. A special method called `configure_mpl()` can also be used to modify the inline plotting behavior. For example,

          ```python
          z.configure_mpl(close=False, angular=True)
          plt.plot([1, 2, 3])
          ```
          allows for iterative updates to the plot provided you have PY4J installed for your python installation (which of course is always the case if you use pypsark). Doing something like:
          ```
          plt.plot([3, 2, 1])
          ```
          will update the plot that was generated by the previous paragraph by leveraging Zeppelin's Angular Display System. However, by setting `close=False`, matplotlib will no longer automatically close figures so it is now up to the user to explicitly close each figure instance they create. There's quite a bit more options for `z.configure_mpl()`, but I will save that discussion for the documentation.

              1. Screenshots (if appropriate)
                TODO
              1. Questions:
          • Does the licenses files need update? No
          • Is there breaking changes for older versions? No
          • Does this needs documentation? Yes

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/agoodm/zeppelin ZEPPELIN-1345

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/zeppelin/pull/1534.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #1534


          commit fbc41c196823f0274f1a64011eb6581f6154547f
          Author: Alex Goodman <agoodm@users.noreply.github.com>
          Date: 2016-10-05T22:26:14Z

          Add new matplotlib backend for python/pyspark interpreters

          commit ced636d1b96f4a566bf46e4b35bbabfa511ae146
          Author: Alex Goodman <agoodm@users.noreply.github.com>
          Date: 2016-10-06T22:16:59Z

          Added support for Angular Display System

          commit c5bcbcceb89e6daf3fdf484589c0aed8efe8d7ae
          Author: Alex Goodman <agoodm@users.noreply.github.com>
          Date: 2016-10-06T23:40:54Z

          Removed unused variable


          Show
          githubbot ASF GitHub Bot added a comment - GitHub user agoodm opened a pull request: https://github.com/apache/zeppelin/pull/1534 [WIP] ZEPPELIN-1345 - Create a custom matplotlib backend that natively supports inline plotting in a python interpreter cell What is this PR for? This PR is the first of two major steps needed to improve matplotlib integration in Zeppelin ( ZEPPELIN-1344 ). The latter, which is a plotting backend with fully interactive tools enabled, will be done afterwards in a separate PR. This PR specifically for automatically displaying output from calls to matplotlib plotting functions inline with each paragraph. Thanks to the addition of post-execute hooks ( ZEPPELIN-1423 ), there is no need to call any `show()` function to display an inline plot, just like in Jupyter. What type of PR is it? Improvement Todos The main code has been written and anyone who reads this is encouraged to test it, but there are a few minor todos: [ ] - Add unit tests [ ] - Add documentation [ ] - Add screenshot showing iterative plotting with angular mode What is the Jira issue? ZEPPELIN-1345 ( https://issues.apache.org/jira/browse/ZEPPELIN-1345 ) How should this be tested? In a pyspark or python paragraph, enter and run ```python import matplotlib.pyplot as plt plt.plot( [1, 2, 3] ) ``` The plot should be displayed automatically without calling any `show()` function whatsoever. A special method called `configure_mpl()` can also be used to modify the inline plotting behavior. For example, ```python z.configure_mpl(close=False, angular=True) plt.plot( [1, 2, 3] ) ``` allows for iterative updates to the plot provided you have PY4J installed for your python installation (which of course is always the case if you use pypsark). Doing something like: ``` plt.plot( [3, 2, 1] ) ``` will update the plot that was generated by the previous paragraph by leveraging Zeppelin's Angular Display System. However, by setting `close=False`, matplotlib will no longer automatically close figures so it is now up to the user to explicitly close each figure instance they create. There's quite a bit more options for `z.configure_mpl()`, but I will save that discussion for the documentation. Screenshots (if appropriate) TODO Questions: Does the licenses files need update? No Is there breaking changes for older versions? No Does this needs documentation? Yes You can merge this pull request into a Git repository by running: $ git pull https://github.com/agoodm/zeppelin ZEPPELIN-1345 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/zeppelin/pull/1534.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1534 commit fbc41c196823f0274f1a64011eb6581f6154547f Author: Alex Goodman <agoodm@users.noreply.github.com> Date: 2016-10-05T22:26:14Z Add new matplotlib backend for python/pyspark interpreters commit ced636d1b96f4a566bf46e4b35bbabfa511ae146 Author: Alex Goodman <agoodm@users.noreply.github.com> Date: 2016-10-06T22:16:59Z Added support for Angular Display System commit c5bcbcceb89e6daf3fdf484589c0aed8efe8d7ae Author: Alex Goodman <agoodm@users.noreply.github.com> Date: 2016-10-06T23:40:54Z Removed unused variable

            People

            • Assignee:
              agoodman Alex Goodman
              Reporter:
              agoodman Alex Goodman
            • Votes:
              3 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development