Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-12608

Parquet Schema Evolution doesn't work when a column is dropped from array<struct<>>

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • None
    • 2.1.0
    • File Formats
    • None

    Description

      When a column is dropped from array<struct<>>, I got the following exception.

      I used the following sql to test it.

      CREATE TABLE arrays_of_struct_to_map (locations1 array<struct<c1:int,c2:int>>, locations2 array<struct<f1:int,
      f2:int,f3:int>>) STORED AS PARQUET;
      INSERT INTO TABLE arrays_of_struct_to_map select array(named_struct("c1",1,"c2",2)), array(named_struct("f1",
      77,"f2",88,"f3",99)) FROM parquet_type_promotion LIMIT 1;
      SELECT * FROM arrays_of_struct_to_map;
      – Testing schema evolution of dropping column from array<struct<>>
      ALTER TABLE arrays_of_struct_to_map REPLACE COLUMNS (locations1 array<struct<c1:int>>, locations2
      array<struct<f2:int>>);
      SELECT * FROM arrays_of_struct_to_map;

      2015-12-07 11:47:28,503 ERROR [main]: CliDriver (SessionState.java:printError(921)) - Failed with exception java.io.IOException:java.lang.RuntimeException: cannot find field c2 in [c1]
      java.io.IOException: java.lang.RuntimeException: cannot find field c2 in [c1]
      at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:507)
      at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:414)
      at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:138)
      at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1655)
      at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:227)
      at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159)
      at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370)
      at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:305)
      at org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1029)
      at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:1003)
      at org.apache.hadoop.hive.cli.TestCliDriver.runTest(TestCliDriver.java:139)
      at org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_type_promotion(TestCliDriver.java:123)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      at java.lang.reflect.Method.invoke(Method.java:606)
      at junit.framework.TestCase.runTest(TestCase.java:176)
      at junit.framework.TestCase.runBare(TestCase.java:141)
      at junit.framework.TestResult$1.protect(TestResult.java:122)
      at junit.framework.TestResult.runProtected(TestResult.java:142)
      at junit.framework.TestResult.run(TestResult.java:125)
      at junit.framework.TestCase.run(TestCase.java:129)
      at junit.framework.TestSuite.runTest(TestSuite.java:255)
      at junit.framework.TestSuite.run(TestSuite.java:250)
      at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:84)
      at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
      at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
      at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
      at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
      at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
      at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
      Caused by: java.lang.RuntimeException: cannot find field c2 in [c1]
      at org.apache.hadoop.hive.ql.io.parquet.convert.HiveStructConverter.getStructFieldTypeInfo(HiveStructConverter.java:130)
      at org.apache.hadoop.hive.ql.io.parquet.convert.HiveStructConverter.getFieldTypeIgnoreCase(HiveStructConverter.java:103)
      at org.apache.hadoop.hive.ql.io.parquet.convert.HiveStructConverter.init(HiveStructConverter.java:90)
      at org.apache.hadoop.hive.ql.io.parquet.convert.HiveStructConverter.<init>(HiveStructConverter.java:67)
      at org.apache.hadoop.hive.ql.io.parquet.convert.HiveStructConverter.<init>(HiveStructConverter.java:59)
      at org.apache.hadoop.hive.ql.io.parquet.convert.HiveGroupConverter.getConverterFromDescription(HiveGroupConverter.java:63)
      at org.apache.hadoop.hive.ql.io.parquet.convert.HiveGroupConverter.getConverterFromDescription(HiveGroupConverter.java:75)
      at org.apache.hadoop.hive.ql.io.parquet.convert.HiveCollectionConverter$ElementConverter.<init>(HiveCollectionConverter.java:141)
      at org.apache.hadoop.hive.ql.io.parquet.convert.HiveCollectionConverter.<init>(HiveCollectionConverter.java:52)

      Attachments

        1. HIVE-12608.1.patch
          4 kB
          Mohammad Islam

        Activity

          People

            kamrul Mohammad Islam
            kamrul Mohammad Islam
            Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: