Hive / HIVE-15844

Make ReduceSinkOperator independent of Acid

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.3.0
    • Component/s: Transactions
    • Labels: None

    Description

      1. Both FileSinkDesc and ReduceSinkDesc have a special code path for Update/Delete operations. The write type that selects this path is not always set correctly for ReduceSink; ReduceSinkDeDuplication is one place where it gets lost. Even when it isn't set correctly, SemanticAnalyzer.getPartitionColsFromBucketColsForUpdateDelete() elsewhere sets ROW_ID as the partition column of the ReduceSinkOperator, and UDFToInteger special-cases it to extract bucketId from ROW_ID. We need to modify Explain Plan to record the Write Type (i.e. insert/update/delete) so that we have tests that can catch errors here.
      2. Add validation at the end of the plan to make sure that any RSO/FSO that represents the end of the pipeline and writes to an ACID table has WriteType set (to something other than the default).
      3. We don't seem to have any tests where the number of buckets is greater than the number of reducers. Add those.
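
The validation proposed in item 2 could look roughly like the sketch below. It is self-contained and does not use Hive's real classes: `Op`, `WriteType`, and `validate` are hypothetical stand-ins invented for illustration, and `NOT_ACID` is assumed to play the role of the "default" write type. It is not the approach taken in the attached patches, only a way to make the intended check concrete: walk the operator tree and flag every terminal sink that writes to an ACID table without an explicit write type.

```java
import java.util.ArrayList;
import java.util.List;

class AcidPlanValidationSketch {

    /** Hypothetical stand-in for the write type; NOT_ACID is assumed to be the default. */
    enum WriteType { NOT_ACID, INSERT, UPDATE, DELETE }

    /** Hypothetical, simplified operator node (not Hive's Operator class). */
    static final class Op {
        final String name;
        final boolean isSink;            // true for a ReduceSink/FileSink-like operator
        final boolean writesToAcidTable; // true if the target table is transactional
        final WriteType writeType;
        final List<Op> children = new ArrayList<>();

        Op(String name, boolean isSink, boolean writesToAcidTable, WriteType writeType) {
            this.name = name;
            this.isSink = isSink;
            this.writesToAcidTable = writesToAcidTable;
            this.writeType = writeType;
        }

        /** Adds a child operator and returns this node so calls can be chained. */
        Op child(Op c) {
            children.add(c);
            return this;
        }
    }

    /** Returns one message per terminal sink that writes to an ACID table with the default write type. */
    static List<String> validate(Op root) {
        List<String> errors = new ArrayList<>();
        collect(root, errors);
        return errors;
    }

    static void collect(Op op, List<String> errors) {
        boolean endOfPipeline = op.children.isEmpty();
        if (op.isSink && endOfPipeline && op.writesToAcidTable
                && op.writeType == WriteType.NOT_ACID) {
            errors.add(op.name + ": writes to an ACID table but WriteType is the default");
        }
        for (Op c : op.children) {
            collect(c, errors);
        }
    }

    public static void main(String[] args) {
        // A plan whose final sink lost its write type, e.g. after an optimizer rewrite.
        Op bad = new Op("TS", false, false, WriteType.NOT_ACID)
                .child(new Op("FS", true, true, WriteType.NOT_ACID));
        // The same plan with the write type preserved.
        Op good = new Op("TS", false, false, WriteType.NOT_ACID)
                .child(new Op("FS", true, true, WriteType.UPDATE));

        System.out.println("bad plan errors:  " + validate(bad));
        System.out.println("good plan errors: " + validate(good));
    }
}
```

A check of this shape only inspects sinks with no children, so intermediate sinks and sinks that write to non-ACID tables are never reported.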

      Attachments

        1. HIVE-15844.08.patch
          907 kB
          Eugene Koifman
        2. HIVE-15844.07.patch
          168 kB
          Eugene Koifman
        3. HIVE-15844.06.patch
          166 kB
          Eugene Koifman
        4. HIVE-15844.05.patch
          163 kB
          Eugene Koifman
        5. HIVE-15844.04.patch
          165 kB
          Eugene Koifman
        6. HIVE-15844.03.patch
          45 kB
          Eugene Koifman
        7. HIVE-15844.02.patch
          18 kB
          Eugene Koifman
        8. HIVE-15844.01.patch
          6 kB
          Eugene Koifman

            People

              Assignee: ekoifman Eugene Koifman
              Reporter: ekoifman Eugene Koifman
              Votes: 0
              Watchers: 3
