ORC-562

Don't wrap readerSchema in acidSchema if readerSchema is already acid

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.5.6, 1.6.0
    • Fix Version/s: 1.5.8, 1.6.2, 1.7.0
    • Component/s: Java
    • Labels: None

    Description

      create table tbl1 (a int, b string) partitioned by (ds string) stored as orc tblproperties ('transactional'='true');
      insert into tbl1 partition (ds) values (1, 'fred', 'today'), (2, 'wilma', 'yesterday');
      

      Because this table is transactional, every modification creates a new delta directory containing a delta file in ORC format. The schema of this file is:

      struct<operation:int,originaltransaction:bigint,bucket:int,rowid:bigint,currenttransaction:bigint,row:struct<a:int,b:string>>
      

      If I create a new partitioned table with the very same schema and point one of its partitions at one of the delta directories, I would expect to be able to query the contents of the delta file.

      This is currently not possible in ORC, because the original readerSchema is wrapped in acidSchema again, even though readerSchema is already acid. The result is a doubly nested schema:

      struct<operation:int,originalTransaction:bigint,bucket:int,rowId:bigint,currentTransaction:bigint,row:struct<operation:int,originaltransaction:bigint,bucket:int,rowid:bigint,currenttransaction:bigint,row:struct<a:int,b:string>>>
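
      The behaviour requested in the title can be sketched in plain Java. This is a minimal model under stated assumptions, not the attached patch and not ORC's own API: the class AcidSchemaSketch, its helpers struct, isAcid, toAcid and render, and the choice of modelling a struct as an ordered map are all hypothetical. The field names are copied from the schemas above, and the comparison is assumed to be case-insensitive because the delta file uses "originaltransaction" while the wrapper uses "originalTransaction".

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java model of the wrapping step; it does not use ORC's own classes. */
class AcidSchemaSketch {
    // Wrapper field names and types, copied from the schemas quoted in this issue.
    static final String[][] ACID_META = {
        {"operation", "int"}, {"originalTransaction", "bigint"}, {"bucket", "int"},
        {"rowId", "bigint"}, {"currentTransaction", "bigint"}
    };

    /** Builds an ordered struct from alternating name/type arguments (hypothetical helper). */
    static Map<String, Object> struct(Object... nameTypePairs) {
        Map<String, Object> fields = new LinkedHashMap<>();
        for (int i = 0; i < nameTypePairs.length; i += 2) {
            fields.put((String) nameTypePairs[i], nameTypePairs[i + 1]);
        }
        return fields;
    }

    /** True if the struct has the five ACID metadata fields followed by a struct named "row". */
    static boolean isAcid(Map<String, Object> schema) {
        List<String> names = new ArrayList<>(schema.keySet());
        if (names.size() != ACID_META.length + 1) {
            return false;
        }
        for (int i = 0; i < ACID_META.length; i++) {
            // Case-insensitive: the delta file uses "originaltransaction", not "originalTransaction".
            if (!names.get(i).equalsIgnoreCase(ACID_META[i][0])) {
                return false;
            }
        }
        String last = names.get(ACID_META.length);
        return last.equalsIgnoreCase("row") && schema.get(last) instanceof Map;
    }

    /** Wraps a row schema in the ACID envelope, unless it is already ACID. */
    static Map<String, Object> toAcid(Map<String, Object> readerSchema) {
        if (isAcid(readerSchema)) {
            return readerSchema; // already ACID: wrapping again would nest "row" twice
        }
        Map<String, Object> wrapped = new LinkedHashMap<>();
        for (String[] field : ACID_META) {
            wrapped.put(field[0], field[1]);
        }
        wrapped.put("row", readerSchema);
        return wrapped;
    }

    /** Renders a struct in the struct<name:type,...> notation used in this issue. */
    static String render(Object type) {
        if (!(type instanceof Map)) {
            return type.toString();
        }
        StringBuilder out = new StringBuilder("struct<");
        String sep = "";
        for (Map.Entry<?, ?> e : ((Map<?, ?>) type).entrySet()) {
            out.append(sep).append(e.getKey()).append(':').append(render(e.getValue()));
            sep = ",";
        }
        return out.append('>').toString();
    }

    public static void main(String[] args) {
        Map<String, Object> row = struct("a", "int", "b", "string");
        Map<String, Object> acid = toAcid(row);
        System.out.println(render(acid));
        // Passing the already-ACID schema back in leaves it unchanged.
        System.out.println(render(toAcid(acid)));
    }
}
```

      In this sketch the guard sits inside toAcid, so callers do not need to know whether the schema they hold came from a plain table or from a delta file.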
      

      Attachments

        1. ORC-562.02.patch
          3 kB
          László Pintér
        2. ORC-562.01.patch
          1 kB
          László Pintér

            People

              Assignee: László Pintér (lpinter)
              Reporter: László Pintér (lpinter)
              Votes: 0
              Watchers: 2

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 20m