
HIVE-10291: Hive on Spark job configuration needs to be logged [Spark Branch]


Details

    • Type: Sub-task (of HIVE-7292 Hive on Spark)
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.1.0
    • Fix Version/s: 1.2.0
    • Component/s: Spark
    • Labels: None

    Description

      In a Hive on MR job, all the job properties are put into the JobConf, which can then be viewed via the MR2 HistoryServer's Job UI.
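
      As a minimal illustration (a hypothetical sketch, not Hive source; the property and job names are invented), every setting for an MR job rides along in the JobConf that gets serialized as job.xml:

      import org.apache.hadoop.mapred.JobConf;

      public class JobConfSketch {
        public static void main(String[] args) {
          // For an MR job, all properties travel inside the JobConf.
          JobConf job = new JobConf();
          job.setJobName("hive-query-stage-1");        // invented stage name
          job.set("hive.exec.reducers.max", "1009");   // example Hive property
          // At submission the full JobConf is written out as job.xml, which
          // is what the MR2 HistoryServer's Job UI displays.
          System.out.println(job.get("hive.exec.reducers.max"));
        }
      }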

      However, in Hive on Spark we submit a single long-lived application. Hence, we put only the properties relevant to application submission (the spark and yarn properties) into the SparkConf, and only these are viewable through the Spark HistoryServer's Application UI.
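
      To make the contrast concrete, here is a hypothetical sketch (the class name and the exact filtering rule are assumptions, not the Hive source) of copying only submission-relevant keys into the SparkConf, so everything else never reaches the Spark HistoryServer UI:

      import java.util.LinkedHashMap;
      import java.util.Map;
      import org.apache.spark.SparkConf;

      public class SubmitConfFilter {
        // Keep only spark.* and yarn.* keys; drop hive.* and the rest.
        static SparkConf filter(Map<String, String> settings) {
          SparkConf conf = new SparkConf(false);  // don't load system defaults
          for (Map.Entry<String, String> e : settings.entrySet()) {
            if (e.getKey().startsWith("spark.") || e.getKey().startsWith("yarn.")) {
              conf.set(e.getKey(), e.getValue());
            }
          }
          return conf;
        }

        public static void main(String[] args) {
          Map<String, String> settings = new LinkedHashMap<>();
          settings.put("spark.executor.memory", "4g");    // kept
          settings.put("hive.exec.reducers.max", "1009"); // dropped
          System.out.println(filter(settings).contains("hive.exec.reducers.max")); // false
        }
      }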

      It is the Hive application code (RemoteDriver, aka RemoteSparkContext) that is responsible for serializing and deserializing the job.xml for each job (i.e., each query) within the application. Thus, for supportability, we also need to provide an equivalent mechanism to print the job.xml for each job.
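
      One possible shape for such a mechanism (a sketch under assumptions; the class, method, and log format are invented here and are not the committed patch) is to dump the deserialized per-job Configuration into the driver log in the same XML form as job.xml:

      import java.io.ByteArrayOutputStream;
      import java.io.IOException;
      import org.apache.hadoop.conf.Configuration;
      import org.slf4j.Logger;
      import org.slf4j.LoggerFactory;

      public class JobConfLogger {
        private static final Logger LOG = LoggerFactory.getLogger(JobConfLogger.class);

        // Log the job's Configuration so it can be inspected per query,
        // much like job.xml can be for an MR job.
        static void logJobConf(String queryId, Configuration jobConf) throws IOException {
          ByteArrayOutputStream out = new ByteArrayOutputStream();
          jobConf.writeXml(out);  // same XML layout as job.xml
          LOG.info("Configuration for job {}:\n{}", queryId, out.toString("UTF-8"));
        }
      }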

      Attachments

        1. HIVE-10291.3-spark.patch (2 kB, Szehon Ho)
        2. HIVE-10291.2-spark.patch (3 kB, Szehon Ho)
        3. HIVE-10291-spark.patch (3 kB, Szehon Ho)


          People

            Assignee: Szehon Ho
            Reporter: Szehon Ho
            Votes: 0
            Watchers: 5

            Dates

              Created:
              Updated:
              Resolved:
