Hive / HIVE-12608

Parquet Schema Evolution doesn't work when a column is dropped from array<struct<>>

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.1.0
    • Component/s: File Formats
    • Labels:
      None

      Description

      When a column is dropped from an array<struct<>> in a Parquet table, reading the table afterwards fails with the exception shown after the SQL below.

      I used the following SQL to reproduce it.

      CREATE TABLE arrays_of_struct_to_map (
        locations1 array<struct<c1:int,c2:int>>,
        locations2 array<struct<f1:int,f2:int,f3:int>>
      ) STORED AS PARQUET;
      INSERT INTO TABLE arrays_of_struct_to_map
        SELECT array(named_struct("c1",1,"c2",2)), array(named_struct("f1",77,"f2",88,"f3",99))
        FROM parquet_type_promotion LIMIT 1;
      SELECT * FROM arrays_of_struct_to_map;
      -- Testing schema evolution of dropping column from array<struct<>>
      ALTER TABLE arrays_of_struct_to_map REPLACE COLUMNS (
        locations1 array<struct<c1:int>>,
        locations2 array<struct<f2:int>>
      );
      SELECT * FROM arrays_of_struct_to_map;

      2015-12-07 11:47:28,503 ERROR [main]: CliDriver (SessionState.java:printError(921)) - Failed with exception java.io.IOException:java.lang.RuntimeException: cannot find field c2 in [c1]
      java.io.IOException: java.lang.RuntimeException: cannot find field c2 in [c1]
      at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:507)
      at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:414)
      at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:138)
      at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1655)
      at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:227)
      at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159)
      at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370)
      at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:305)
      at org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1029)
      at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:1003)
      at org.apache.hadoop.hive.cli.TestCliDriver.runTest(TestCliDriver.java:139)
      at org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_type_promotion(TestCliDriver.java:123)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      at java.lang.reflect.Method.invoke(Method.java:606)
      at junit.framework.TestCase.runTest(TestCase.java:176)
      at junit.framework.TestCase.runBare(TestCase.java:141)
      at junit.framework.TestResult$1.protect(TestResult.java:122)
      at junit.framework.TestResult.runProtected(TestResult.java:142)
      at junit.framework.TestResult.run(TestResult.java:125)
      at junit.framework.TestCase.run(TestCase.java:129)
      at junit.framework.TestSuite.runTest(TestSuite.java:255)
      at junit.framework.TestSuite.run(TestSuite.java:250)
      at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:84)
      at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
      at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
      at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
      at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
      at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
      at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
      Caused by: java.lang.RuntimeException: cannot find field c2 in [c1]
      at org.apache.hadoop.hive.ql.io.parquet.convert.HiveStructConverter.getStructFieldTypeInfo(HiveStructConverter.java:130)
      at org.apache.hadoop.hive.ql.io.parquet.convert.HiveStructConverter.getFieldTypeIgnoreCase(HiveStructConverter.java:103)
      at org.apache.hadoop.hive.ql.io.parquet.convert.HiveStructConverter.init(HiveStructConverter.java:90)
      at org.apache.hadoop.hive.ql.io.parquet.convert.HiveStructConverter.<init>(HiveStructConverter.java:67)
      at org.apache.hadoop.hive.ql.io.parquet.convert.HiveStructConverter.<init>(HiveStructConverter.java:59)
      at org.apache.hadoop.hive.ql.io.parquet.convert.HiveGroupConverter.getConverterFromDescription(HiveGroupConverter.java:63)
      at org.apache.hadoop.hive.ql.io.parquet.convert.HiveGroupConverter.getConverterFromDescription(HiveGroupConverter.java:75)
      at org.apache.hadoop.hive.ql.io.parquet.convert.HiveCollectionConverter$ElementConverter.<init>(HiveCollectionConverter.java:141)
      at org.apache.hadoop.hive.ql.io.parquet.convert.HiveCollectionConverter.<init>(HiveCollectionConverter.java:52)
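
The trace suggests that the converter looks up each field found in the Parquet file (c1, c2) by name in the table's current struct type ([c1]) and throws when the name is missing. The sketch below only illustrates that failure mode and one tolerant alternative. It is not the Hive source and not the attached patch; the class StructFieldLookup and both methods are hypothetical names, and the assumption that the file still carries c2 comes from the repro SQL above.

```java
import java.util.Arrays;
import java.util.List;

public class StructFieldLookup {

    /**
     * Strict lookup (hypothetical): returns the position of fileField in the
     * table's struct, ignoring case, and throws if the table no longer has it.
     */
    public static int strictIndex(List<String> tableFields, String fileField) {
        for (int i = 0; i < tableFields.size(); i++) {
            if (tableFields.get(i).equalsIgnoreCase(fileField)) {
                return i;
            }
        }
        throw new RuntimeException("cannot find field " + fileField + " in " + tableFields);
    }

    /**
     * Tolerant lookup (hypothetical): returns -1 for a field that exists in the
     * file but was dropped from the table, so a caller could skip it.
     */
    public static int tolerantIndex(List<String> tableFields, String fileField) {
        for (int i = 0; i < tableFields.size(); i++) {
            if (tableFields.get(i).equalsIgnoreCase(fileField)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Table schema after REPLACE COLUMNS vs. fields assumed to be in the file.
        List<String> tableFields = Arrays.asList("c1");
        List<String> fileFields = Arrays.asList("c1", "c2");

        for (String field : fileFields) {
            System.out.println(field + " -> " + tolerantIndex(tableFields, field));
        }

        try {
            strictIndex(tableFields, "c2");
        } catch (RuntimeException e) {
            // Same message shape as the exception reported above.
            System.out.println(e.getMessage());
        }
    }
}
```

With a tolerant lookup, a reader could ignore values for dropped fields instead of failing at converter construction; whether the actual fix takes this approach is not stated in this report.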

        Attachments

        1. HIVE-12608.1.patch
          4 kB
          Mohammad Islam

            People

            • Assignee:
              kamrul Mohammad Islam
              Reporter:
              kamrul Mohammad Islam
            • Votes:
              0
              Watchers:
              5
