Hive / HIVE-14366

Conversion of a Non-ACID table to an ACID table produces non-unique primary keys


Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 1.0.0
    • Fix Version/s: 1.3.0, 2.1.1, 2.2.0
    • Component/s: Transactions
    • Labels: None

    Description

      When a Non-ACID table is converted to an ACID table, the primary key consisting of (original transaction id, bucket_id, row_id) is not generated uniquely: currently, row_id is set to 0 for most rows. This leads to correctness issues for such tables.
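
      To illustrate why a non-unique key matters, here is a minimal, self-contained sketch. It is not Hive code: the RowKey class and the helper methods are hypothetical stand-ins, and it simplifies the bug by giving every row row_id 0. It assumes a delete is matched against rows purely by the (original transaction id, bucket_id, row_id) triple, so colliding keys cause one delete to hide more rows than intended.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

public class RowIdCollisionSketch {

  /** Hypothetical stand-in for the (original transaction id, bucket_id, row_id) key. */
  static final class RowKey {
    final long originalTxnId;
    final int bucketId;
    final long rowId;

    RowKey(long originalTxnId, int bucketId, long rowId) {
      this.originalTxnId = originalTxnId;
      this.bucketId = bucketId;
      this.rowId = rowId;
    }

    @Override
    public boolean equals(Object o) {
      if (!(o instanceof RowKey)) {
        return false;
      }
      RowKey other = (RowKey) o;
      return originalTxnId == other.originalTxnId
          && bucketId == other.bucketId
          && rowId == other.rowId;
    }

    @Override
    public int hashCode() {
      return Objects.hash(originalTxnId, bucketId, rowId);
    }
  }

  /** Assigns keys to n rows; when uniqueRowIds is false, every row gets row_id 0. */
  static List<RowKey> assignKeys(int n, boolean uniqueRowIds) {
    List<RowKey> keys = new ArrayList<>();
    for (int i = 0; i < n; i++) {
      keys.add(new RowKey(0L, 0, uniqueRowIds ? i : 0L));
    }
    return keys;
  }

  /** Counts the rows still visible after a delete recorded against the given key. */
  static int survivorsAfterDelete(List<RowKey> keys, RowKey deleted) {
    int survivors = 0;
    for (RowKey key : keys) {
      if (!key.equals(deleted)) {
        survivors++;
      }
    }
    return survivors;
  }

  public static void main(String[] args) {
    List<RowKey> unique = assignKeys(5, true);
    List<RowKey> colliding = assignKeys(5, false);
    // Unique keys: deleting one row leaves the other four.
    System.out.println(survivorsAfterDelete(unique, unique.get(0)));       // prints 4
    // Colliding keys: the same delete matches every row.
    System.out.println(survivorsAfterDelete(colliding, colliding.get(0))); // prints 0
  }
}
```

      With unique row ids the delete removes exactly one row; with colliding ids the same delete matches every row that shares the key, which is the kind of incorrect result the unit test in this report is meant to catch.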

      The quickest way to reproduce this is to add the following unit test to ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java:

        @Test
        public void testOriginalReader() throws Exception {
          FileSystem fs = FileSystem.get(hiveConf);
          FileStatus[] status;
      
          // 1. Insert five rows to Non-ACID table.
          runStatementOnDriver("insert into " + Table.NONACIDORCTBL + "(a,b) values(1,2),(3,4),(5,6),(7,8),(9,10)");
      
          // 2. Convert NONACIDORCTBL to ACID table.
          runStatementOnDriver("alter table " + Table.NONACIDORCTBL + " SET TBLPROPERTIES ('transactional'='true')");
      
          // 3. Perform a major compaction.
          runStatementOnDriver("alter table " + Table.NONACIDORCTBL + " compact 'MAJOR'");
          runWorker(hiveConf);
      
          // 4. Perform a delete.
          runStatementOnDriver("delete from " + Table.NONACIDORCTBL + " where a = 1");
      
          // 5. A projection should now return only (3,4),(5,6),(7,8),(9,10), since (1,2) has been deleted.
          List<String> rs = runStatementOnDriver("select a,b from " + Table.NONACIDORCTBL + " order by a,b");
          int[][] resultData = new int[][] {{3,4}, {5,6}, {7,8}, {9,10}};
          Assert.assertEquals(stringifyValues(resultData), rs);
        }
      

      Attachments

        1. HIVE-14366.02.patch
          9 kB
          Eugene Koifman
        2. HIVE-14366.01.patch
          4 kB
          Saket Saurabh

          People

            Assignee: Saket Saurabh
            Reporter: Saket Saurabh
            Votes: 0
            Watchers: 5
