Hive / HIVE-8118

Support work that has multiple child works to work around SPARK-3622 [Spark Branch]


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.1.0
    • Component/s: Spark

    Description

      In the current implementation, both SparkMapRecordHandler and SparkReduceRecordHandler take only one result collector, which means that the corresponding map or reduce task can have only one child. It's very common in multi-insert queries for a map or reduce task to have more than one child. A query like the following has two map tasks as parents:

      select name, sum(value) from dec group by name union all select name, value from dec order by name
      

      It's possible that in the future an optimization may be implemented so that a map work is followed by two reduce works, which are then connected to a union work.

      Thus, we should treat this as the general case. Tez currently provides a collector for each child operator in the map-side or reduce-side operator tree. We can take Tez as a reference.
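
The idea of one collector per child can be sketched as follows. This is only an illustration under stated assumptions: `MultiCollectorSketch`, `Collector`, `ListCollector`, and `RecordHandler` are hypothetical names, not Hive or Tez classes, and the child names "aggregate" and "sort" merely mirror the two branches of the example query.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these types are Hive or Tez classes.
final class MultiCollectorSketch {

    /** Minimal stand-in for a result collector that receives key/value pairs. */
    interface Collector {
        void collect(String key, long value);
    }

    /** Collector that buffers rows in memory so the output can be inspected. */
    static final class ListCollector implements Collector {
        final List<String> rows = new ArrayList<>();

        @Override
        public void collect(String key, long value) {
            rows.add(key + "=" + value);
        }
    }

    /** Record handler that holds one collector per child work instead of a single one. */
    static final class RecordHandler {
        private final Map<String, Collector> collectors = new LinkedHashMap<>();

        void addChild(String childName, Collector collector) {
            collectors.put(childName, collector);
        }

        /** Routes one row to the collector registered for the given child. */
        void emit(String childName, String key, long value) {
            Collector collector = collectors.get(childName);
            if (collector == null) {
                throw new IllegalArgumentException("No collector for child: " + childName);
            }
            collector.collect(key, value);
        }
    }

    /** Sends every input row to both children and returns what each child received. */
    static Map<String, List<String>> demo() {
        ListCollector toAggregate = new ListCollector();
        ListCollector toSort = new ListCollector();
        RecordHandler handler = new RecordHandler();
        handler.addChild("aggregate", toAggregate);
        handler.addChild("sort", toSort);

        String[] names = {"a", "b", "a"};
        long[] values = {1L, 2L, 3L};
        for (int i = 0; i < names.length; i++) {
            handler.emit("aggregate", names[i], values[i]);
            handler.emit("sort", names[i], values[i]);
        }

        Map<String, List<String>> out = new LinkedHashMap<>();
        out.put("aggregate", toAggregate.rows);
        out.put("sort", toSort.rows);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Keying the collectors by child keeps the handler independent of how many children a work has, which is the property the single-collector handlers lack.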

      Spark currently doesn't have a transformation that supports multiple output datasets from a single input dataset (SPARK-3622). This is a workaround for this gap.
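
To make the gap concrete, here is one generic way to emulate several outputs when every step may produce only a single dataset: tag each row with the child it is meant for, then filter once per child. This is a sketch in plain Java collections under assumed names (`TagAndFilterSketch`, `Row`, `Tagged` are hypothetical); it uses no Spark or Hive API and is not necessarily the approach taken for this issue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch in plain Java collections; it does not use Spark or Hive APIs.
final class TagAndFilterSketch {

    /** Stand-in for one record of the input dataset. */
    static final class Row {
        final String name;
        final long value;

        Row(String name, long value) {
            this.name = name;
            this.value = value;
        }
    }

    /** A row labelled with the index of the child that should consume it. */
    static final class Tagged {
        final int child;
        final Row row;

        Tagged(int child, Row row) {
            this.child = child;
            this.row = row;
        }
    }

    /** Single-output step: emits one tagged copy of every input row per child. */
    static List<Tagged> tagForChildren(List<Row> parent, int childCount) {
        List<Tagged> out = new ArrayList<>();
        for (Row row : parent) {
            for (int child = 0; child < childCount; child++) {
                out.add(new Tagged(child, row));
            }
        }
        return out;
    }

    /** Single-output step: keeps only the rows tagged for the given child. */
    static List<Row> selectChild(List<Tagged> tagged, int child) {
        return tagged.stream()
                .filter(t -> t.child == child)
                .map(t -> t.row)
                .collect(Collectors.toList());
    }

    /** Derives two child datasets from one parent dataset using only single-output steps. */
    static List<List<Row>> demo() {
        List<Row> parent = Arrays.asList(new Row("a", 1L), new Row("b", 2L));
        List<Tagged> tagged = tagForChildren(parent, 2);
        return Arrays.asList(selectChild(tagged, 0), selectChild(tagged, 1));
    }

    public static void main(String[] args) {
        for (List<Row> childRows : demo()) {
            System.out.println(childRows.size() + " rows");
        }
    }
}
```

The cost of this style of workaround is that the tagged dataset is scanned once per child, which is why a native multi-output transformation would be preferable.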

      This is likely a big change, so subtasks may be needed.

      With this, we can have a simpler and cleaner multi-insert implementation. This is also the problem observed in HIVE-7731 and HIVE-7503.

      Attachments

        1. HIVE-8118.pdf
          112 kB
          Xuefu Zhang



          People

            Assignee: Chao Sun (csun)
            Reporter: Xuefu Zhang (xuefuz)
            Votes: 0
            Watchers: 4
