  1. Hive
  2. HIVE-7629

Problem in SMB Joins between two Parquet tables


Details

    Description

      The issue occurs when two bucketed and sorted Parquet tables with different numbers of columns are joined with a sort-merge-bucket (SMB) join. The following exception is thrown:

      Caused by: java.lang.IndexOutOfBoundsException: Index: 2, Size: 2
              at java.util.ArrayList.rangeCheck(ArrayList.java:635)
              at java.util.ArrayList.get(ArrayList.java:411)
              at org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.init(DataWritableReadSupport.java:101)
              at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.getSplit(ParquetRecordReaderWrapper.java:204)
              at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:79)
              at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:66)
              at org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:51)
              at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:65)
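
The trace ends in an `ArrayList.get` call inside `DataWritableReadSupport.init`, with index 2 requested from a list of size 2. One plausible reading, not confirmed by the description itself, is that a list of projected column ids built for the wider table is being applied to the schema of the narrower one. The sketch below only illustrates that failure mode in plain Java. The class name, the `project` helper, and the column names are hypothetical and are not Hive code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ProjectionMismatchSketch {

    // Hypothetical helper: selects columns from a table schema by position,
    // similar in spirit to a reader applying a list of projected column ids.
    static List<String> project(List<String> schema, List<Integer> columnIds) {
        List<String> projected = new ArrayList<>();
        for (int id : columnIds) {
            // Throws IndexOutOfBoundsException when id >= schema.size().
            projected.add(schema.get(id));
        }
        return projected;
    }

    public static void main(String[] args) {
        // Assumed schemas: one table has three columns, the other has two.
        List<String> wideSchema = new ArrayList<>(Arrays.asList("id", "name", "value"));
        List<String> narrowSchema = new ArrayList<>(Arrays.asList("id", "name"));

        // Column ids computed against the three-column table.
        List<Integer> idsForWideTable = Arrays.asList(0, 2);

        // Applying the ids to the table they were built for works.
        System.out.println(project(wideSchema, idsForWideTable));

        // Applying the same ids to the two-column table asks for index 2
        // from a list of size 2 and fails.
        try {
            project(narrowSchema, idsForWideTable);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("projection mismatch: " + e.getMessage());
        }
    }
}
```

If this reading is right, the column ids need to be resolved per table before each reader is initialized. The attached patches are the authoritative reference for how that is actually handled.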
      

      Attachments

        1. HIVE-7629.patch
          20 kB
          Suma Shivaprasad
        2. HIVE-7629.1.patch
          21 kB
          Suma Shivaprasad

            People

              Assignee: Suma Shivaprasad (suma.shivaprasad)
              Reporter: Suma Shivaprasad (suma.shivaprasad)
              Votes: 0
              Watchers: 3
