

    • Type: Sub-task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Spark
    • Labels: None


      There is a race condition in RemoteSparkJobMonitor. The logging in the STARTED case of RemoteSparkJobMonitor#startMonitor is sometimes printed and sometimes not. This is easy to verify by running a qtest on TestMiniSparkOnYarnCliDriver and comparing the number of times "Query Hive on Spark job" is printed with the number of times "Finished successfully in" is printed.
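      The comparison described above can be automated. The following is a minimal sketch, assuming the qtest output has already been captured as a list of lines; the class name, helper name, and sample lines are made up for illustration, and only the two marker strings come from this issue.

```java
import java.util.List;

public class LogMarkerCount {
    // Counts the lines that contain the given marker string.
    public static long countLines(List<String> lines, String marker) {
        return lines.stream().filter(line -> line.contains(marker)).count();
    }

    public static void main(String[] args) {
        // Made-up sample lines; a real check would read the captured qtest log instead.
        List<String> log = List.of(
            "Query Hive on Spark job (details elided)",
            "Finished successfully in (details elided)",
            "Finished successfully in (details elided)");
        long started = countLines(log, "Query Hive on Spark job");
        long finished = countLines(log, "Finished successfully in");
        // A mismatch means some jobs finished without their STARTED-state logs.
        System.out.println("started=" + started + " finished=" + finished);
    }
}
```

      If the two counts differ, at least one job reached SUCCEEDED without its STARTED-state logs being printed.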

      The issue is that RemoteSparkJobMonitor polls once per second and checks the state of the JobHandle. Depending on the state, it prints some logging information. The content of the logs contains an implicit assumption that the logs in the STARTED state are printed before the logs in the SUCCEEDED state. However, this isn't always the case. The state transitions are driven by how long the remote Spark job takes to run, and if it finishes within one second, the monitor may never observe the STARTED state, so the logs in that state are never printed.
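      One way to picture the race, and one possible way to close it, is sketched below. This is not the code from the attached patches; it is a simplified model in which the states seen on successive polls are passed in as a list. The class name, the QUEUED state, and the log strings are assumptions made for illustration. The idea is to remember whether the STARTED-state logs were printed and to emit them late if SUCCEEDED is seen first.

```java
import java.util.ArrayList;
import java.util.List;

public class MonitorSketch {
    // STARTED and SUCCEEDED are named in this issue; QUEUED is an assumed earlier state.
    public enum State { QUEUED, STARTED, SUCCEEDED }

    // Replays the states seen on successive polls and returns the log lines emitted.
    public static List<String> monitor(List<State> polledStates) {
        List<String> log = new ArrayList<>();
        boolean startedLogged = false;
        for (State state : polledStates) {
            switch (state) {
                case STARTED:
                    if (!startedLogged) {
                        log.add("STARTED info");
                        startedLogged = true;
                    }
                    break;
                case SUCCEEDED:
                    // The job may have started and finished between two polls,
                    // so emit the STARTED info here if it was never printed.
                    if (!startedLogged) {
                        log.add("STARTED info");
                        startedLogged = true;
                    }
                    log.add("SUCCEEDED info");
                    return log;
                default:
                    break; // still waiting; poll again
            }
        }
        return log;
    }

    public static void main(String[] args) {
        // A fast job: the monitor never observes STARTED between polls.
        System.out.println(monitor(List.of(State.QUEUED, State.SUCCEEDED)));
    }
}
```

      Without the check in the SUCCEEDED branch, the fast-job sequence would produce only the SUCCEEDED-state log line, which is the behavior described above.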

      This can confuse users, and key debugging information that is printed only in the STARTED state is lost.


        1. HIVE-18684.1.patch
          12 kB
          Sahil Takiar
        2. HIVE-18684.2.patch
          15 kB
          Sahil Takiar
        3. HIVE-18684.3.patch
          16 kB
          Sahil Takiar




            Assignee: Sahil Takiar (stakiar)
            Reporter: Sahil Takiar (stakiar)



