HIVE-7886

Aggregation queries fail with RCFile based Hive tables with S3 storage

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.13.1
    • Fix Version/s: None
    • Component/s: File Formats
    • Labels: None

    Description

      Aggregation queries fail on Hive tables that use the RCFile format with S3 storage.

      My setup is Hadoop 2.5.0 and Hive 0.13.1.

      I created a table with the following schema:
      CREATE EXTERNAL TABLE `testtable`(
      `col1` string,
      `col2` tinyint,
      `col3` int,
      `col4` float,
      `col5` boolean,
      `col6` smallint)
      ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe'
      WITH SERDEPROPERTIES (
      'serialization.format'='\t',
      'line.delim'='\n',
      'field.delim'='\t'
      )
      STORED AS INPUTFORMAT
      'org.apache.hadoop.hive.ql.io.RCFileInputFormat'
      OUTPUTFORMAT
      'org.apache.hadoop.hive.ql.io.RCFileOutputFormat'
      LOCATION
      's3n://<testbucket>/testtable';

      When I run 'select count(*) from testtable', it fails with the following exception stack:

      Error: java.io.IOException: java.io.IOException: java.io.EOFException: Attempted to seek or read past the end of the file
      at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
      at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
      at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:256)
      at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:171)
      at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:198)
      at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:184)
      at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
      at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
      at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
      at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
      at java.security.AccessController.doPrivileged(Native Method)
      at javax.security.auth.Subject.doAs(Subject.java:415)
      at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
      at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
      Caused by: java.io.IOException: java.io.EOFException: Attempted to seek or read past the end of the file
      at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
      at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
      at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:344)
      at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:101)
      at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:41)
      at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:122)
      at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:254)
      ... 11 more
      Caused by: java.io.EOFException: Attempted to seek or read past the end of the file
      at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.processException(Jets3tNativeFileSystemStore.java:462)
      at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.handleException(Jets3tNativeFileSystemStore.java:411)
      at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.retrieve(Jets3tNativeFileSystemStore.java:234)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      at java.lang.reflect.Method.invoke(Method.java:601)
      at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
      at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
      at org.apache.hadoop.fs.s3native.$Proxy17.retrieve(Unknown Source)
      at org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.seek(NativeS3FileSystem.java:205)
      at org.apache.hadoop.fs.BufferedFSInputStream.seek(BufferedFSInputStream.java:96)
      at org.apache.hadoop.fs.BufferedFSInputStream.skip(BufferedFSInputStream.java:67)
      at java.io.DataInputStream.skipBytes(DataInputStream.java:220)
      at org.apache.hadoop.hive.ql.io.RCFile$ValueBuffer.readFields(RCFile.java:739)
      at org.apache.hadoop.hive.ql.io.RCFile$Reader.currentValueBuffer(RCFile.java:1720)
      at org.apache.hadoop.hive.ql.io.RCFile$Reader.getCurrentRow(RCFile.java:1898)
      at org.apache.hadoop.hive.ql.io.RCFileRecordReader.next(RCFileRecordReader.java:149)
      at org.apache.hadoop.hive.ql.io.RCFileRecordReader.next(RCFileRecordReader.java:44)
      at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:339)
      ... 15 more
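
      One possible reading of the innermost trace is that RCFile skips over column data with skipBytes, the buffered stream turns that skip into a seek, and the S3 native stream rejects a seek that lands at or past the end of the object. The toy model below only illustrates that interaction. It is not Hadoop or Hive code: ToySeekStream, allow_seek_to_end, and skip_trailing_block are hypothetical names, and the assumption that the failing seek targets exactly the end of the file is not confirmed by the trace.

```python
class ToySeekStream:
    """Toy in-memory stream whose skip() is implemented as a seek().

    With allow_seek_to_end=False, seeking to position == length is rejected,
    which is the (assumed) strict behaviour being illustrated.
    """

    def __init__(self, data: bytes, allow_seek_to_end: bool) -> None:
        self._data = data
        self._pos = 0
        self._allow_seek_to_end = allow_seek_to_end

    def seek(self, pos: int) -> None:
        limit = len(self._data)
        # Strict mode treats pos == limit as out of range; lenient mode allows it.
        past_end = pos > limit if self._allow_seek_to_end else pos >= limit
        if pos < 0 or past_end:
            raise EOFError("Attempted to seek or read past the end of the file")
        self._pos = pos

    def skip(self, n: int) -> int:
        # Mirrors the skip -> seek delegation visible in the stack trace.
        self.seek(self._pos + n)
        return n

    def read(self, n: int) -> bytes:
        chunk = self._data[self._pos:self._pos + n]
        self._pos += len(chunk)
        return chunk


def skip_trailing_block(allow_seek_to_end: bool) -> str:
    """Read a 3-byte header, then skip the 5 value bytes that end the file."""
    stream = ToySeekStream(b"HDR" + b"x" * 5, allow_seek_to_end)
    stream.read(3)
    try:
        stream.skip(5)  # target position 8 equals the stream length
    except EOFError as exc:
        return f"EOFError: {exc}"
    return "ok"


if __name__ == "__main__":
    print("strict: ", skip_trailing_block(allow_seek_to_end=False))
    print("lenient:", skip_trailing_block(allow_seek_to_end=True))
```

      In this model the same skip succeeds or fails depending only on whether the stream accepts a seek to the end position, which would be consistent with the query failing on S3 storage in particular.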

          People

            Assignee: Unassigned
            Reporter: Venkata Puneet Ravuri (vravuri@ea.com)