Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-16832

duplicate ROW__ID possible in multi insert into transactional table

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Critical
    • Resolution: Fixed
    • 2.2.0
    • 3.0.0
    • Transactions
    • None

    Description

       create table AcidTablePart(a int, b int) partitioned by (p string) clustered by (a) into 2 buckets stored as orc TBLPROPERTIES ('transactional'='true');
       create temporary table if not exists data1 (x int);
       insert into data1 values (1);
       from data1
         insert into AcidTablePart partition(p) select 0, 0, 'p' || x
         insert into AcidTablePart partition(p='p1') select 0, 1
      

      Each branch of this multi-insert create a row in partition p1/bucket0 with ROW__ID=(1,0,0).
      The same can happen when running SQL Merge (HIVE-10924) statement that has both Insert and Update clauses when target table has 'transactional'='true','transactional_properties'='default' (see HIVE-14035). This is so because Merge is internally run as a multi-insert statement.

      The solution relies on statement ID introduced in HIVE-11030. Each Insert clause of a multi-insert is gets a unique ID.
      The ROW__ID.bucketId now becomes a bit packed triplet (format version, bucketId, statementId).
      (Since ORC stores field names in the data file we can't rename ROW__ID.bucketId).
      This ensures that there are no collisions and retains desired sort properties of ROW__ID.
      In particular SortedDynPartitionOptimizer works w/o any changes even in cases where there fewer reducers than buckets.

      Attachments

        1. HIVE-16832.01.patch
          32 kB
          Eugene Koifman
        2. HIVE-16832.03.patch
          78 kB
          Eugene Koifman
        3. HIVE-16832.04.patch
          84 kB
          Eugene Koifman
        4. HIVE-16832.05.patch
          95 kB
          Eugene Koifman
        5. HIVE-16832.06.patch
          95 kB
          Eugene Koifman
        6. HIVE-16832.08.patch
          113 kB
          Eugene Koifman
        7. HIVE-16832.09.patch
          114 kB
          Eugene Koifman
        8. HIVE-16832.10.patch
          128 kB
          Eugene Koifman
        9. HIVE-16832.11.patch
          227 kB
          Eugene Koifman
        10. HIVE-16832.14.patch
          34 kB
          Eugene Koifman
        11. HIVE-16832.15.patch
          87 kB
          Eugene Koifman
        12. HIVE-16832.16.patch
          95 kB
          Eugene Koifman
        13. HIVE-16832.17.patch
          100 kB
          Eugene Koifman
        14. HIVE-16832.18.patch
          105 kB
          Eugene Koifman
        15. HIVE-16832.19.patch
          117 kB
          Eugene Koifman
        16. HIVE-16832.20.patch
          121 kB
          Eugene Koifman
        17. HIVE-16832.20.patch
          121 kB
          Eugene Koifman
        18. HIVE-16832.21.patch
          121 kB
          Eugene Koifman
        19. HIVE-16832.22.patch
          127 kB
          Eugene Koifman

        Issue Links

          Activity

            People

              ekoifman Eugene Koifman
              ekoifman Eugene Koifman
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: