Livy / LIVY-774

Logging does not print to stdout or stderr correctly on PySpark through Livy


Details

    • Type: Bug
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: 0.7.0
    • Fix Version/s: 0.9.0
    • Component/s: API
    • Labels: None

    Description

      Summary

      When using PySpark through Livy from Zeppelin, Jupyter Notebook, or curl on Linux, the first attempt to log to stdout or stderr prints as expected. The second and every subsequent attempt fails with the error stack:  ValueError: I/O operation on closed file

      The same code works in the PySpark CLI on the master node; see the attachment Works_on_PySpark_CLI.png

      Steps to Reproduce

      In Zeppelin, using Livy as the interpreter:

      %pyspark
      
      import sys
      import logging
      
      // OUTPUT
      Spark Application Id: application_1591899500515_0002
      
      

      The first time we log to stdout or stderr, it works as expected.

      %pyspark
      
      logger = logging.getLogger("log_example")
      logger.setLevel(logging.ERROR)
      ch = logging.StreamHandler(sys.stderr)
      ch.setLevel(logging.ERROR)
      logger.addHandler(ch)
      logger.error("test error!")
      
      // OUTPUT is expected
      test error!

      When we log to stdout or stderr a second time, or any time after that, the error stack is shown instead.

      %pyspark
      
      logger.error("test error again!")
      
      // OUTPUT showing error stack
      --- Logging error ---
      Traceback (most recent call last):
        File "/usr/lib64/python3.7/logging/__init__.py", line 1028, in emit
          stream.write(msg + self.terminator)
        File "/tmp/1262710270598062870", line 534, in write
          super(UnicodeDecodingStringIO, self).write(s)
      ValueError: I/O operation on closed file
      Call stack:
        File "/tmp/1262710270598062870", line 714, in <module>
          sys.exit(main())
        File "/tmp/1262710270598062870", line 686, in main
          response = handler(content)
        File "/tmp/1262710270598062870", line 318, in execute_request
          result = node.execute()
        File "/tmp/1262710270598062870", line 229, in execute
          exec(code, global_dict)
        File "<stdin>", line 1, in <module>
      Message: 'test error again!'
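The traceback above shows the handler writing to a closed in-memory stream. One possible explanation, offered here as an assumption rather than a confirmed root cause, is that the interpreter swaps sys.stderr for a fresh buffer on every statement and closes the previous one, while the StreamHandler keeps the stream object it was given when it was created. The sketch below simulates that outside Livy. `run_statement` and `CurrentStderrHandler` are hypothetical names introduced for illustration, and the logger name is changed to avoid clashing with an existing logger.

```python
import io
import logging
import sys


def run_statement(code, env):
    """Hypothetical stand-in for a per-statement executor (not Livy's code).

    Assumption: stderr is swapped for a fresh buffer for each statement,
    and that buffer is closed once its contents have been collected.
    """
    buf = io.StringIO()
    saved = sys.stderr
    sys.stderr = buf
    try:
        exec(code, env)
        return buf.getvalue()
    finally:
        sys.stderr = saved
        buf.close()


class CurrentStderrHandler(logging.StreamHandler):
    """Illustrative handler that looks up sys.stderr on every emit."""

    def __init__(self, level=logging.NOTSET):
        # Skip StreamHandler.__init__, which would store a fixed stream.
        logging.Handler.__init__(self, level)

    @property
    def stream(self):
        return sys.stderr


env = {"sys": sys, "logging": logging,
       "CurrentStderrHandler": CurrentStderrHandler}

# Statement 1: the handler binds to the buffer that is current right now.
first = run_statement(
    'logger = logging.getLogger("log_example_sketch")\n'
    'logger.setLevel(logging.ERROR)\n'
    'logger.propagate = False  # keep the sketch isolated from root handlers\n'
    'ch = logging.StreamHandler(sys.stderr)\n'
    'logger.addHandler(ch)\n'
    'logger.error("test error!")\n',
    env,
)

# Statement 2: the handler still points at the buffer closed after statement 1.
second = run_statement('logger.error("test error again!")', env)

# Statement 3: a handler that resolves sys.stderr at emit time.
third = run_statement(
    'logger.removeHandler(ch)\n'
    'logger.addHandler(CurrentStderrHandler())\n'
    'logger.error("test error again!")\n',
    env,
)

print(repr(first))
print(second)
print(repr(third))
```

Under these assumptions the first statement prints the message, the second produces a "--- Logging error ---" report ending in the same ValueError, and the third prints the message again. Whether Livy's interpreter behaves this way would need to be checked against its source.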

      Jupyter Notebook and the Linux curl command produce the same error. See the attachments:

      1. Zeppelin_use_Livy_bug.png

      2. JupyterNotebook_use_Livy_bug.png

      3. LinuxCurl_use_Livy_error.png
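For readers without access to the screenshots, the curl reproduction can be approximated with a short script against Livy's REST API (POST /sessions, POST /sessions/{id}/statements, then polling the statement). This is a sketch only: the base URL `http://localhost:8998` is an assumption about the deployment, the helper names are made up for illustration, and the exact commands in the attached screenshot are not known.

```python
import json
import time
import urllib.request

LIVY_URL = "http://localhost:8998"  # assumption: Livy on its default port

SETUP = "\n".join([
    "import sys",
    "import logging",
    'logger = logging.getLogger("log_example")',
    "logger.setLevel(logging.ERROR)",
    "ch = logging.StreamHandler(sys.stderr)",
    "ch.setLevel(logging.ERROR)",
    "logger.addHandler(ch)",
    'logger.error("test error!")',
])
AGAIN = 'logger.error("test error again!")'


def livy_post(path, payload, base_url=LIVY_URL):
    """Build a JSON POST request for the Livy REST API (does not send it)."""
    return urllib.request.Request(
        base_url + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send a request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for(path, done_state, base_url=LIVY_URL, tries=120, delay=1.0):
    """Poll a session or statement until its "state" field matches."""
    for _ in range(tries):
        body = send(urllib.request.Request(base_url + path))
        if body["state"] == done_state:
            return body
        time.sleep(delay)
    raise TimeoutError("{} did not reach state {!r}".format(path, done_state))


def reproduce(base_url=LIVY_URL):
    """Run the two statements from this report in one PySpark session."""
    session = send(livy_post("/sessions", {"kind": "pyspark"}, base_url))
    session_path = "/sessions/{}".format(session["id"])
    wait_for(session_path, "idle", base_url)

    outputs = []
    for code in (SETUP, AGAIN):
        stmt = send(livy_post(session_path + "/statements",
                              {"code": code}, base_url))
        stmt_path = "{}/statements/{}".format(session_path, stmt["id"])
        outputs.append(wait_for(stmt_path, "available", base_url)["output"])
    return outputs
```

Calling `reproduce()` against a running Livy server returns the output objects of both statements, so the second can be compared with the error stack shown above.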

       

       


          People

            Assignee: Unassigned
            Reporter: Chao Gao (chaoga)
