HIVE-18390

IndexOutOfBoundsException when querying a partitioned view in ColumnPruner


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.1.1
    • Fix Version/s: 3.0.0
    • Component/s: Query Planning, Views
    • Labels: None

    Description

      An IndexOutOfBoundsException is thrown when querying a partitioned view.
      During column pruning, each SEL operator collects the columns accessed in that operator.
      When ColumnPrunerSelectProc collects the accessed columns of a view, it first gets the index of each output column name in the view, then calls Table.getCols().get(index).getName() to get the
      name of the output column. However, Table.getCols() does not return all columns (partition columns are
      missing), so if a partition column is queried, an IndexOutOfBoundsException is thrown.

      REPRODUCE:

      create table foo
      (
      `a` string
      ) partitioned by (`b` string)
      ;
      
      create view bar partitioned on (b) as
      select a,b from foo;
      
      select * from bar;     --IndexOutOfBoundsException
      

      OPERATOR TREE:

      TS[0]
         |
      SEL[1]
         |
      SEL[2]
         |
      FS[3]
      

      SEL[1] collects the accessed columns (including partition column b). b's internal column name is '_col1' and the corresponding column index is 1, but bar's getCols() actually returns a list of length 1: ['a'], so tab.getCols().get(1) throws an IndexOutOfBoundsException.
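
The index mismatch described above can be reproduced outside Hive with plain lists. This is only a minimal sketch: the class ViewColumnLookup and its methods are hypothetical stand-ins, not Hive code, and the list ['a'] plays the role of what bar's getCols() returns.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ViewColumnLookup {
    // Hypothetical stand-in for the view's getCols(): non-partition columns only.
    static List<String> viewCols() {
        List<String> cols = new ArrayList<>();
        cols.add("a");
        return cols;
    }

    // Look up the position of an internal name among the SEL output names,
    // then use that position to index into the given column list.
    static String resolve(List<String> outputNames, List<String> cols, String internalName) {
        int index = outputNames.indexOf(internalName);
        return cols.get(index); // fails when index >= cols.size()
    }

    public static void main(String[] args) {
        List<String> outputNames = Arrays.asList("_col0", "_col1");
        // Column a resolves normally.
        System.out.println(resolve(outputNames, viewCols(), "_col0"));
        try {
            // Partition column b maps to index 1, which is past the end of ['a'].
            resolve(outputNames, viewCols(), "_col1");
        } catch (IndexOutOfBoundsException e) {
            System.out.println("IndexOutOfBoundsException for _col1");
        }
    }
}
```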

      HOW TO FIX:
      Instead of calling the view's getCols() method, we should get all columns, including partition columns.
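
The idea of the fix can be sketched with plain lists as well. This is an illustration under stated assumptions, not the content of the attached patch: the class ViewAllColumns and the helper allCols are hypothetical, and the partition columns are assumed to follow the regular columns in output order, as in the view bar above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ViewAllColumns {
    // Hypothetical helper: regular columns followed by partition columns.
    static List<String> allCols(List<String> cols, List<String> partCols) {
        List<String> all = new ArrayList<>(cols);
        all.addAll(partCols);
        return all;
    }

    // Same lookup as before, but against the combined column list.
    static String resolve(List<String> outputNames, List<String> all, String internalName) {
        return all.get(outputNames.indexOf(internalName));
    }

    public static void main(String[] args) {
        List<String> outputNames = Arrays.asList("_col0", "_col1");
        List<String> all = allCols(Arrays.asList("a"), Arrays.asList("b"));
        // With partition column b included, index 1 is now in range.
        System.out.println(resolve(outputNames, all, "_col0"));
        System.out.println(resolve(outputNames, all, "_col1"));
    }
}
```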

      Attachments

        1. HIVE-18390.patch
          4 kB
          Hengyu Dai

          People

            Assignee: Hengyu Dai (hengyu.dai)
            Reporter: Hengyu Dai (hengyu.dai)