ORC-991

Reading encrypted data throws an exception when a SQL filter is pushed down


Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 1.7.0, 1.6.8, 1.6.9, 1.6.10
    • Fix Version/s: 1.7.0, 1.6.11
    • Component/s: Java
    • Labels: None
    • Environment:
      1. ORC 1.6.8+
      2. SparkSQL 2.4.7
      3. JDK 1.8

    Description

      1. Create a table:

      CREATE TABLE `itmp8888`(`id` INT, `name` STRING)
      ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
      WITH SERDEPROPERTIES (
      'serialization.format' = '1'
      )
      STORED AS
      INPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
      OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
      TBLPROPERTIES (
      'transient_lastDdlTime' = '1631174384',
      'orc.encrypt' = 'AES_CTR_128:id,name',
      'orc.mask' = 'sha256:id,name',
      'orc.encrypt.ezk' = 'jNCeDBtNfT8wPaTpR34JHA=='
      )

      2. Insert data.

      3. A select statement with no filter works fine:

         select * from itmp8888

      4. A select statement whose filter includes an encrypted column throws an exception:

        select * from itmp8888 where id = 1

       

      5. The stack trace:

      Caused by: java.lang.AssertionError: Index is not populated for 1
        at org.apache.orc.impl.RecordReaderImpl$SargApplier.pickRowGroups(RecordReaderImpl.java:995)
        at org.apache.orc.impl.RecordReaderImpl.pickRowGroups(RecordReaderImpl.java:1083)
        at org.apache.orc.impl.RecordReaderImpl.readStripe(RecordReaderImpl.java:1101)
        at org.apache.orc.impl.RecordReaderImpl.advanceStripe(RecordReaderImpl.java:1151)
        at org.apache.orc.impl.RecordReaderImpl.advanceToNextRow(RecordReaderImpl.java:1186)
        at org.apache.orc.impl.RecordReaderImpl.<init>(RecordReaderImpl.java:248)
        at org.apache.orc.impl.ReaderImpl.rows(ReaderImpl.java:864)
        at org.apache.spark.sql.execution.datasources.orc.OrcColumnarBatchReader.initialize(OrcColumnarBatchReader.java:142)
        at org.apache.spark.sql.execution.datasources.orc.OrcFileFormat$$anonfun$buildReaderWithPartitionValues$2.apply(OrcFileFormat.scala:211)
        at org.apache.spark.sql.execution.datasources.orc.OrcFileFormat$$anonfun$buildReaderWithPartitionValues$2.apply(OrcFileFormat.scala:175)
        at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.org$apache$spark$sql$execution$datasources$FileScanRDD$$anon$$readCurrentFile(FileScanRDD.scala:124)
        at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:177)
        at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:101)

      6. Debugging the code, I found that the RowIndex is null for all the encrypted columns.
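
      For anyone trying to reproduce this, steps 2 to 4 can be sketched as the script below. The inserted rows are made-up sample values, since the report does not include the actual data. The last two statements are an untested diagnostic, not a confirmed workaround: assuming the assertion is reached only through ORC predicate pushdown, rerunning the failing query with Spark's `spark.sql.orc.filterPushdown` setting turned off should show whether pushdown is responsible.

      ```sql
      -- Step 2: hypothetical sample rows (not taken from the report)
      INSERT INTO itmp8888 VALUES (1, 'a'), (2, 'b');

      -- Step 3: no filter, reported to work
      SELECT * FROM itmp8888;

      -- Step 4: filter on an encrypted column, reported to fail
      SELECT * FROM itmp8888 WHERE id = 1;

      -- Diagnostic (assumption, untested): disable ORC filter pushdown and retry
      SET spark.sql.orc.filterPushdown=false;
      SELECT * FROM itmp8888 WHERE id = 1;
      ```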

       

          People

            Assignee: Guiyankuang Yiqun Zhang
            Reporter: hgs19921112 hgs
            Votes: 0
            Watchers: 3
