Hive / HIVE-8233

multi-table insertion doesn't work with ForwardOperator [Spark Branch]

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Component/s: Spark

    Description

      Currently, for multi-table insertion, we start from the multiple FileSinkOperators and break the plan at their lowest common ancestor (LCA), adding a temporary FileSinkOperator and TableScanOperators. A special case is when the LCA is a ForwardOperator, in which case we don't break the plan, since it has already been optimized.

      However, there is an issue. Consider the following plan:

            ...
            RS_0
             |
            FOR
             |
           /   \
         GBY_1  GBY_2
          |     |
         ...   ...
          |     |
         RS_1  RS_2
          |     |
         ...   ...
          |     |
         FS_1  FS_2
      

      which may result in:

                RW
               /  \
             RW    RW
      

      Hence, because of the issues in HIVE-7731 and HIVE-8118, both downstream branches will receive duplicated (and identical) input.
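
      The LCA rule described above can be illustrated with a minimal sketch. This is not Hive's actual code: the `Op` class, the single-parent assumption, and the method names `lowestCommonAncestor` and `shouldBreakAt` are all hypothetical, chosen only to show why the plan above hits the ForwardOperator special case and is left unbroken.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class LcaSketch {
    /** Hypothetical stand-in for a Hive operator; assumes each operator has one parent. */
    static final class Op {
        final String name;
        final Op parent;
        Op(String name, Op parent) { this.name = name; this.parent = parent; }
    }

    /** Returns the chain from op up to the root, starting with op itself. */
    static List<Op> pathToRoot(Op op) {
        List<Op> path = new ArrayList<>();
        for (Op cur = op; cur != null; cur = cur.parent) {
            path.add(cur);
        }
        return path;
    }

    /** Lowest operator that is an ancestor of (or equal to) every sink. */
    static Op lowestCommonAncestor(List<Op> sinks) {
        Set<Op> common = new LinkedHashSet<>(pathToRoot(sinks.get(0)));
        for (Op sink : sinks.subList(1, sinks.size())) {
            common.retainAll(pathToRoot(sink));
        }
        // LinkedHashSet keeps bottom-up order, so the first survivor is the lowest.
        return common.isEmpty() ? null : common.iterator().next();
    }

    /** Assumption: a ForwardOperator is identified here by the name "FOR". */
    static boolean shouldBreakAt(Op lca) {
        return lca != null && !lca.name.equals("FOR");
    }

    public static void main(String[] args) {
        // Rebuild the plan from the description (the "..." steps are omitted).
        Op rs0 = new Op("RS_0", null);
        Op fwd = new Op("FOR", rs0);
        Op fs1 = new Op("FS_1", new Op("RS_1", new Op("GBY_1", fwd)));
        Op fs2 = new Op("FS_2", new Op("RS_2", new Op("GBY_2", fwd)));

        Op lca = lowestCommonAncestor(Arrays.asList(fs1, fs2));
        System.out.println("LCA = " + lca.name + ", break = " + shouldBreakAt(lca));
    }
}
```

      In this sketch the LCA of FS_1 and FS_2 is FOR, so no temporary FileSinkOperator and TableScanOperator pair is inserted and both GBY branches stay attached to the same upstream work.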

      Attachments

        1. HIVE-8233.5-spark.patch
          231 kB
          Chao Sun
        2. HIVE-8233.4-spark.patch
          231 kB
          Chao Sun
        3. HIVE-8233.3-spark.patch
          210 kB
          Chao Sun
        4. HIVE-8233.2-spark.patch
          210 kB
          Chao Sun
        5. HIVE-8233.1-spark.patch
          284 kB
          Chao Sun

            People

              csun Chao Sun
