HIVE-5235: Infinite loop with ORC file and Hive 0.11

Details

    • Type: Bug
    • Status: Open
    • Priority: Blocker
    • Resolution: Unresolved
    • Affects Version/s: 0.11.0, 0.12.0
    • Fix Version/s: None
    • Component/s: None
    • Labels: None
    • Environment: Gentoo Linux with Hortonworks Hadoop hadoop-1.1.2.23.tar.gz and Apache Hive 0.11

    Description

      We are using Hive 0.11 with the ORC file format, and some tasks get blocked in some kind of infinite loop. They keep running indefinitely when we set a huge task expiry timeout. If we set the expiry timeout to 600 seconds, the tasks fail for not reporting progress, and finally the job fails.
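      For reference, the task expiry timeout mentioned here is presumably the standard Hadoop 1.x mapred.task.timeout property, in milliseconds: a task that reports no progress for that long is killed. A minimal sketch of setting it per job, where 600000 ms mirrors the 600 seconds above:

      import org.apache.hadoop.mapred.JobConf;

      public class TimeoutSketch {
        public static void main(String[] args) {
          JobConf conf = new JobConf();
          // Tasks silent for 600 s are killed for not reporting progress,
          // which is the failure mode described above.
          conf.setLong("mapred.task.timeout", 600000L);
        }
      }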

      The behavior is not consistent, and sometimes it changes between job executions. It happens for different queries.

      We are using Hive 0.11 with Hadoop hadoop-1.1.2.23 from Hortonworks. The blocked task keeps consuming 100% CPU, and the stack trace is consistently the same. Everything points to some kind of infinite loop. My guess is that it is related to the ORC file: maybe some pointer is written incorrectly, producing an infinite loop when reading, or maybe there is a bug in the reading stage.

      More information below. The stack trace:

       
      "main" prio=10 tid=0x00007f20a000a800 nid=0x1ed2 runnable [0x00007f20a8136000]
         java.lang.Thread.State: RUNNABLE
      	at java.util.zip.Inflater.inflateBytes(Native Method)
      	at java.util.zip.Inflater.inflate(Inflater.java:256)
      	- locked <0x00000000f42a6ca0> (a java.util.zip.ZStreamRef)
      	at org.apache.hadoop.hive.ql.io.orc.ZlibCodec.decompress(ZlibCodec.java:64)
      	at org.apache.hadoop.hive.ql.io.orc.InStream$CompressedStream.readHeader(InStream.java:128)
      	at org.apache.hadoop.hive.ql.io.orc.InStream$CompressedStream.read(InStream.java:143)
      	at org.apache.hadoop.hive.ql.io.orc.SerializationUtils.readVulong(SerializationUtils.java:54)
      	at org.apache.hadoop.hive.ql.io.orc.SerializationUtils.readVslong(SerializationUtils.java:65)
      	at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReader.readValues(RunLengthIntegerReader.java:66)
      	at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReader.next(RunLengthIntegerReader.java:81)
      	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$IntTreeReader.next(RecordReaderImpl.java:332)
      	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.next(RecordReaderImpl.java:802)
      	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:1214)
      	at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:71)
      	at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:46)
      	at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:274)
      	at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:101)
      	at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:41)
      	at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:108)
      	at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:300)
      	at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:218)
      	at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:236)
      	- eliminated <0x00000000e1459700> (a org.apache.hadoop.mapred.MapTask$TrackedRecordReader)
      	at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:216)
      	- locked <0x00000000e1459700> (a org.apache.hadoop.mapred.MapTask$TrackedRecordReader)
      	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
      	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
      	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
      	at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at javax.security.auth.Subject.doAs(Subject.java:415)
      	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1178)
      	at org.apache.hadoop.mapred.Child.main(Child.java:249)
      

      We have seen the same stack trace repeatedly across several executions of jstack.
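      The RUNNABLE thread stuck inside Inflater.inflateBytes is the classic symptom of a decompression loop that does not guard against truncated or corrupt input: java.util.zip.Inflater.inflate() returns 0 (rather than throwing) when it runs out of usable input, so a loop that only tests finished() spins forever at 100% CPU. A minimal illustrative sketch of that pitfall (not the actual ZlibCodec code):

      import java.util.zip.DataFormatException;
      import java.util.zip.Inflater;

      public class InflateLoopSketch {
        // Decompress 'in' into 'out' (assumed large enough). The n == 0 guard
        // is the point: without it, a truncated or corrupt stream makes this
        // loop spin forever in inflateBytes, exactly as in the stack trace.
        static int decompress(byte[] in, byte[] out) throws DataFormatException {
          Inflater inflater = new Inflater();
          try {
            inflater.setInput(in);
            int total = 0;
            while (!inflater.finished()) {
              int n = inflater.inflate(out, total, out.length - total);
              if (n == 0) {
                if (inflater.finished()) {
                  break;
                }
                // finished() is still false here: the stream claims more data
                // is coming, but none is available. Bail out instead of looping.
                throw new DataFormatException("truncated or corrupt zlib stream");
              }
              total += n;
            }
            return total;
          } finally {
            inflater.end();
          }
        }
      }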

      The log file for this kind of task is the following:

       
      2013-09-04 23:12:34,332 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
      2013-09-04 23:12:34,681 INFO org.apache.hadoop.mapred.TaskRunner: Creating symlink: /hd/hd6/mapred/local/taskTracker/distcache/1758137359311570022_-1138434677_1812098585/master/tmp/hive-datasalt/hive_2013-09-04_22-45-13_829_2202639092470021957/-mr-10004/22027cf8-f583-41d7-adb8-e7e74922d113 <- /hd/hd1/mapred/local/taskTracker/datasalt/jobcache/job_201309040511_0003/attempt_201309040511_0003_m_000803_1/work/HIVE_PLAN22027cf8-f583-41d7-adb8-e7e74922d113
      2013-09-04 23:12:34,718 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /hd/hd5/mapred/local/taskTracker/datasalt/jobcache/job_201309040511_0003/jars/com <- /hd/hd1/mapred/local/taskTracker/datasalt/jobcache/job_201309040511_0003/attempt_201309040511_0003_m_000803_1/work/com
      2013-09-04 23:12:34,733 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /hd/hd5/mapred/local/taskTracker/datasalt/jobcache/job_201309040511_0003/jars/javolution <- /hd/hd1/mapred/local/taskTracker/datasalt/jobcache/job_201309040511_0003/attempt_201309040511_0003_m_000803_1/work/javolution
      2013-09-04 23:12:34,743 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /hd/hd5/mapred/local/taskTracker/datasalt/jobcache/job_201309040511_0003/jars/org <- /hd/hd1/mapred/local/taskTracker/datasalt/jobcache/job_201309040511_0003/attempt_201309040511_0003_m_000803_1/work/org
      2013-09-04 23:12:34,756 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /hd/hd5/mapred/local/taskTracker/datasalt/jobcache/job_201309040511_0003/jars/job.jar <- /hd/hd1/mapred/local/taskTracker/datasalt/jobcache/job_201309040511_0003/attempt_201309040511_0003_m_000803_1/work/job.jar
      2013-09-04 23:12:34,768 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /hd/hd5/mapred/local/taskTracker/datasalt/jobcache/job_201309040511_0003/jars/javax <- /hd/hd1/mapred/local/taskTracker/datasalt/jobcache/job_201309040511_0003/attempt_201309040511_0003_m_000803_1/work/javax
      2013-09-04 23:12:34,779 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /hd/hd5/mapred/local/taskTracker/datasalt/jobcache/job_201309040511_0003/jars/hive-exec-log4j.properties <- /hd/hd1/mapred/local/taskTracker/datasalt/jobcache/job_201309040511_0003/attempt_201309040511_0003_m_000803_1/work/hive-exec-log4j.properties
      2013-09-04 23:12:34,790 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /hd/hd5/mapred/local/taskTracker/datasalt/jobcache/job_201309040511_0003/jars/META-INF <- /hd/hd1/mapred/local/taskTracker/datasalt/jobcache/job_201309040511_0003/attempt_201309040511_0003_m_000803_1/work/META-INF
      2013-09-04 23:12:34,796 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /hd/hd5/mapred/local/taskTracker/datasalt/jobcache/job_201309040511_0003/jars/javaewah <- /hd/hd1/mapred/local/taskTracker/datasalt/jobcache/job_201309040511_0003/attempt_201309040511_0003_m_000803_1/work/javaewah
      2013-09-04 23:12:34,813 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /hd/hd5/mapred/local/taskTracker/datasalt/jobcache/job_201309040511_0003/jars/.job.jar.crc <- /hd/hd1/mapred/local/taskTracker/datasalt/jobcache/job_201309040511_0003/attempt_201309040511_0003_m_000803_1/work/.job.jar.crc
      2013-09-04 23:12:35,127 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
      2013-09-04 23:12:35,412 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
      2013-09-04 23:12:35,473 INFO org.apache.hadoop.mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@5d0fa836
      2013-09-04 23:12:36,927 INFO org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader: Processing file hdfs://master:54310/user/hive/warehouse/hits/ddate=2013-08-28/000798_0
      2013-09-04 23:12:36,927 INFO org.apache.hadoop.mapred.MapTask: numReduceTasks: 72
      2013-09-04 23:12:36,936 INFO org.apache.hadoop.mapred.MapTask: io.sort.mb = 100
      2013-09-04 23:12:37,028 INFO org.apache.hadoop.mapred.MapTask: data buffer = 79691776/99614720
      2013-09-04 23:12:37,028 INFO org.apache.hadoop.mapred.MapTask: record buffer = 262144/327680
      2013-09-04 23:12:37,036 WARN org.apache.hadoop.io.compress.snappy.LoadSnappy: Snappy native library is available
      2013-09-04 23:12:37,036 INFO org.apache.hadoop.io.compress.snappy.LoadSnappy: Snappy native library loaded
      2013-09-04 23:12:37,046 INFO ExecMapper: maximum memory = 559284224
      2013-09-04 23:12:37,046 INFO ExecMapper: conf classpath = [file:/home/datasalt/sw/hadoop-1.1.2.23/conf/, file:/opt/oracle-jdk-bin-1.7.0.17/lib/tools.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/, file:/home/datasalt/sw/hadoop-1.1.2.23/hadoop-core-1.1.2.23.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/asm-3.2.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/aspectjrt-1.6.11.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/aspectjtools-1.6.11.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-beanutils-1.7.0.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-beanutils-core-1.8.0.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-cli-1.2.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-codec-1.4.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-collections-3.2.1.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-configuration-1.6.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-daemon-1.0.1.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-digester-1.8.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-el-1.0.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-httpclient-3.0.1.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-io-2.1.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-lang-2.4.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-logging-1.1.1.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-logging-api-1.0.4.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-math-2.1.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-net-3.1.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/core-3.1.1.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/hadoop-capacity-scheduler-1.1.2.23.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/hadoop-fairscheduler-1.1.2.23.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/hadoop-thriftfs-1.1.2.23.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/hsqldb-1.8.0.10.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jackson-core-asl-1.8.8.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jackson-mapper-asl-1.8.8.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jasper-compiler-5.5.12.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jasper-runtime-5.5.12.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jdeb-0.8.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jersey-core-1.8.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jersey-json-1.8.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jersey-server-1.8.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jets3t-0.6.1.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jetty-6.1.26.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jetty-util-6.1.26.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jsch-0.1.42.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/junit-4.5.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/kfs-0.2.2.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/log4j-1.2.15.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/mockito-all-1.8.5.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/oro-2.0.8.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/servlet-api-2.5-20081211.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/slf4j-api-1.4.3.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/slf4j-log4j12-1.4.3.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/xmlenc-0.52.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jsp-2.1/jsp-2.1.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jsp-2.1/jsp-api-2.1.jar, file:/hd/hd5/mapred/local/taskTracker/datasalt/jobcache/job_201309040511_0003/jars/classes, 
file:/hd/hd5/mapred/local/taskTracker/datasalt/jobcache/job_201309040511_0003/jars/, file:/hd/hd1/mapred/local/taskTracker/datasalt/distcache/-1837517190973317952_-652944854_1812099894/master/tmp/hadoop-datasalt/mapred/staging/datasalt/.staging/job_201309040511_0003/libjars/derby-10.4.2.0.jar/, file:/hd/hd1/mapred/local/taskTracker/datasalt/jobcache/job_201309040511_0003/attempt_201309040511_0003_m_000803_1/work/]
      2013-09-04 23:12:37,047 INFO ExecMapper: thread classpath = [file:/home/datasalt/sw/hadoop-1.1.2.23/conf/, file:/opt/oracle-jdk-bin-1.7.0.17/lib/tools.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/, file:/home/datasalt/sw/hadoop-1.1.2.23/hadoop-core-1.1.2.23.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/asm-3.2.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/aspectjrt-1.6.11.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/aspectjtools-1.6.11.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-beanutils-1.7.0.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-beanutils-core-1.8.0.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-cli-1.2.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-codec-1.4.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-collections-3.2.1.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-configuration-1.6.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-daemon-1.0.1.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-digester-1.8.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-el-1.0.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-httpclient-3.0.1.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-io-2.1.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-lang-2.4.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-logging-1.1.1.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-logging-api-1.0.4.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-math-2.1.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-net-3.1.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/core-3.1.1.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/hadoop-capacity-scheduler-1.1.2.23.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/hadoop-fairscheduler-1.1.2.23.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/hadoop-thriftfs-1.1.2.23.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/hsqldb-1.8.0.10.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jackson-core-asl-1.8.8.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jackson-mapper-asl-1.8.8.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jasper-compiler-5.5.12.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jasper-runtime-5.5.12.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jdeb-0.8.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jersey-core-1.8.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jersey-json-1.8.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jersey-server-1.8.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jets3t-0.6.1.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jetty-6.1.26.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jetty-util-6.1.26.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jsch-0.1.42.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/junit-4.5.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/kfs-0.2.2.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/log4j-1.2.15.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/mockito-all-1.8.5.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/oro-2.0.8.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/servlet-api-2.5-20081211.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/slf4j-api-1.4.3.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/slf4j-log4j12-1.4.3.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/xmlenc-0.52.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jsp-2.1/jsp-2.1.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jsp-2.1/jsp-api-2.1.jar, file:/hd/hd5/mapred/local/taskTracker/datasalt/jobcache/job_201309040511_0003/jars/classes, 
file:/hd/hd5/mapred/local/taskTracker/datasalt/jobcache/job_201309040511_0003/jars/, file:/hd/hd1/mapred/local/taskTracker/datasalt/distcache/-1837517190973317952_-652944854_1812099894/master/tmp/hadoop-datasalt/mapred/staging/datasalt/.staging/job_201309040511_0003/libjars/derby-10.4.2.0.jar/, file:/hd/hd1/mapred/local/taskTracker/datasalt/jobcache/job_201309040511_0003/attempt_201309040511_0003_m_000803_1/work/]
      2013-09-04 23:12:37,085 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Adding alias hits to work list for file hdfs://master:54310/user/hive/warehouse/hits/ddate=2013-08-28
      2013-09-04 23:12:37,089 INFO org.apache.hadoop.hive.ql.exec.MapOperator: dump TS struct<hour:int,user_ip:string,country_iso:string,user_language:string,num_hits:int,site_hostname:string,idsite:int,num_ads:int,num_ads_wanted:int,category:int,idcampaign:int,idvariation:int,sub_id:int,cpm_value:double,publisher_share:int,publisher_currency:string,advertiser_currency:string,idzone:int,idagent:int,idadvertiser:int,idpublisher:int,ddate:string>
      2013-09-04 23:12:37,091 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Adding alias hits to work list for file hdfs://master:54310/user/hive/warehouse/hits/ddate=2013-08-29
      2013-09-04 23:12:37,092 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Adding alias hits to work list for file hdfs://master:54310/user/hive/warehouse/hits/ddate=2013-08-30
      2013-09-04 23:12:37,093 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Adding alias hits to work list for file hdfs://master:54310/user/hive/warehouse/hits/ddate=2013-08-31
      2013-09-04 23:12:37,094 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Adding alias hits to work list for file hdfs://master:54310/user/hive/warehouse/hits/ddate=2013-09-01
      2013-09-04 23:12:37,094 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Adding alias hits to work list for file hdfs://master:54310/user/hive/warehouse/hits/ddate=2013-09-02
      2013-09-04 23:12:37,095 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Adding alias hits to work list for file hdfs://master:54310/user/hive/warehouse/hits/ddate=2013-09-03
      2013-09-04 23:12:37,096 INFO ExecMapper: 
      <MAP>Id =7
        <Children>
          <TS>Id =3
            <Children>
              <SEL>Id =2
                <Children>
                  <GBY>Id =1
                    <Children>
                      <RS>Id =0
                        <Parent>Id = 1 null<\Parent>
                      <\RS>
                    <\Children>
                    <Parent>Id = 2 null<\Parent>
                  <\GBY>
                <\Children>
                <Parent>Id = 3 null<\Parent>
              <\SEL>
            <\Children>
            <Parent>Id = 7 null<\Parent>
          <\TS>
        <\Children>
      <\MAP>
      2013-09-04 23:12:37,100 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Initializing Self 7 MAP
      2013-09-04 23:12:37,100 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: Initializing Self 3 TS
      2013-09-04 23:12:37,100 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: Operator 3 TS initialized
      2013-09-04 23:12:37,100 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: Initializing children of 3 TS
      2013-09-04 23:12:37,100 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: Initializing child 2 SEL
      2013-09-04 23:12:37,100 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: Initializing Self 2 SEL
      2013-09-04 23:12:37,105 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: SELECT struct<hour:int,user_ip:string,country_iso:string,user_language:string,num_hits:int,site_hostname:string,idsite:int,num_ads:int,num_ads_wanted:int,category:int,idcampaign:int,idvariation:int,sub_id:int,cpm_value:double,publisher_share:int,publisher_currency:string,advertiser_currency:string,idzone:int,idagent:int,idadvertiser:int,idpublisher:int,ddate:string>
      2013-09-04 23:12:37,105 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: Operator 2 SEL initialized
      2013-09-04 23:12:37,105 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: Initializing children of 2 SEL
      2013-09-04 23:12:37,105 INFO org.apache.hadoop.hive.ql.exec.GroupByOperator: Initializing child 1 GBY
      2013-09-04 23:12:37,105 INFO org.apache.hadoop.hive.ql.exec.GroupByOperator: Initializing Self 1 GBY
      2013-09-04 23:12:37,116 INFO org.apache.hadoop.hive.ql.exec.GroupByOperator: Operator 1 GBY initialized
      2013-09-04 23:12:37,116 INFO org.apache.hadoop.hive.ql.exec.GroupByOperator: Initializing children of 1 GBY
      2013-09-04 23:12:37,116 INFO org.apache.hadoop.hive.ql.exec.ReduceSinkOperator: Initializing child 0 RS
      2013-09-04 23:12:37,116 INFO org.apache.hadoop.hive.ql.exec.ReduceSinkOperator: Initializing Self 0 RS
      2013-09-04 23:12:37,116 INFO org.apache.hadoop.hive.ql.exec.ReduceSinkOperator: Using tag = -1
      2013-09-04 23:12:37,139 INFO org.apache.hadoop.hive.ql.exec.ReduceSinkOperator: Operator 0 RS initialized
      2013-09-04 23:12:37,139 INFO org.apache.hadoop.hive.ql.exec.ReduceSinkOperator: Initialization Done 0 RS
      2013-09-04 23:12:37,139 INFO org.apache.hadoop.hive.ql.exec.GroupByOperator: Initialization Done 1 GBY
      2013-09-04 23:12:37,139 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: Initialization Done 2 SEL
      2013-09-04 23:12:37,139 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: Initialization Done 3 TS
      2013-09-04 23:12:37,139 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Initialization Done 7 MAP
      2013-09-04 23:12:37,165 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Processing alias hits for file hdfs://master:54310/user/hive/warehouse/hits/ddate=2013-08-28
      2013-09-04 23:12:37,165 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 7 forwarding 1 rows
      2013-09-04 23:12:37,165 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 3 forwarding 1 rows
      2013-09-04 23:12:37,165 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 2 forwarding 1 rows
      2013-09-04 23:12:37,168 INFO ExecMapper: ExecMapper: processing 1 rows: used memory = 151151936
      2013-09-04 23:12:37,169 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 7 forwarding 10 rows
      2013-09-04 23:12:37,169 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 3 forwarding 10 rows
      2013-09-04 23:12:37,169 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 2 forwarding 10 rows
      2013-09-04 23:12:37,169 INFO ExecMapper: ExecMapper: processing 10 rows: used memory = 151151936
      2013-09-04 23:12:37,175 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 7 forwarding 100 rows
      2013-09-04 23:12:37,175 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 3 forwarding 100 rows
      2013-09-04 23:12:37,175 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 2 forwarding 100 rows
      2013-09-04 23:12:37,175 INFO ExecMapper: ExecMapper: processing 100 rows: used memory = 151151936
      2013-09-04 23:12:37,247 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 7 forwarding 1000 rows
      2013-09-04 23:12:37,250 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 3 forwarding 1000 rows
      2013-09-04 23:12:37,250 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 2 forwarding 1000 rows
      2013-09-04 23:12:37,251 INFO ExecMapper: ExecMapper: processing 1000 rows: used memory = 151151936
      2013-09-04 23:12:37,642 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 7 forwarding 10000 rows
      2013-09-04 23:12:37,642 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 3 forwarding 10000 rows
      2013-09-04 23:12:37,644 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 2 forwarding 10000 rows
      2013-09-04 23:12:37,644 INFO ExecMapper: ExecMapper: processing 10000 rows: used memory = 153032176
      2013-09-04 23:12:38,467 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 7 forwarding 100000 rows
      2013-09-04 23:12:38,468 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 3 forwarding 100000 rows
      2013-09-04 23:12:38,468 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 2 forwarding 100000 rows
      2013-09-04 23:12:38,469 INFO ExecMapper: ExecMapper: processing 100000 rows: used memory = 125346272
      2013-09-04 23:12:40,626 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 7 forwarding 1000000 rows
      2013-09-04 23:12:40,627 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 3 forwarding 1000000 rows
      2013-09-04 23:12:40,627 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 2 forwarding 1000000 rows
      2013-09-04 23:12:40,627 INFO ExecMapper: ExecMapper: processing 1000000 rows: used memory = 123383952
      

      For this particular example, the task was killed 6 hours after the last log line.

      The strace output of the process while it is in the infinite loop looks like the following:

       
      Process 7887 attached with 24 threads
      [pid  7930] futex(0x7f20a06df28c, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...>
      [pid  7922] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
      [pid  7920] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
      [pid  8333] accept(61,  <unfinished ...>
      [pid  7908] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
      [pid  7907] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
      [pid  7906] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
      [pid  7905] futex(0x7f20a00fa554, FUTEX_WAIT_PRIVATE, 3, NULL <unfinished ...>
      [pid  7903] futex(0x7f20a00f5254, FUTEX_WAIT_PRIVATE, 417, NULL <unfinished ...>
      [pid  7904] futex(0x7f20a00f7c54, FUTEX_WAIT_PRIVATE, 387, NULL <unfinished ...>
      [pid  7902] futex(0x7f20a7513dc0, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
      [pid  7901] futex(0x7f20a00a6f54, FUTEX_WAIT_PRIVATE, 7, NULL <unfinished ...>
      [pid  7899] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
      [pid  7898] futex(0x7f20a0026554, FUTEX_WAIT_PRIVATE, 39, NULL <unfinished ...>
      [pid  7900] futex(0x7f20a00a4b54, FUTEX_WAIT_PRIVATE, 7, NULL <unfinished ...>
      [pid  7897] futex(0x7f20a0024754, FUTEX_WAIT_PRIVATE, 39, NULL <unfinished ...>
      [pid  7896] futex(0x7f20a0022a54, FUTEX_WAIT_PRIVATE, 41, NULL <unfinished ...>
      [pid  7895] futex(0x7f20a0020c54, FUTEX_WAIT_PRIVATE, 39, NULL <unfinished ...>
      [pid  7894] futex(0x7f20a001ef54, FUTEX_WAIT_PRIVATE, 37, NULL <unfinished ...>
      [pid  7893] futex(0x7f20a001d154, FUTEX_WAIT_PRIVATE, 39, NULL <unfinished ...>
      [pid  7887] futex(0x7f20a81389d0, FUTEX_WAIT, 7890, NULL <unfinished ...>
      [pid  7892] futex(0x7f20a001b454, FUTEX_WAIT_PRIVATE, 35, NULL <unfinished ...>
      [pid  7891] futex(0x7f20a0019654, FUTEX_WAIT_PRIVATE, 37, NULL <unfinished ...>
      [pid  7906] <... restart_syscall resumed> ) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373836, 993325000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373837, 43598000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373837, 93858000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373837, 144101000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373837, 194344000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373837, 244659000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373837, 294924000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373837, 345198000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373837, 395462000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373837, 445768000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373837, 495983000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373837, 546226000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373837, 596489000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373837, 646798000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373837, 697063000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373837, 747370000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373837, 797688000}, ffffffff <unfinished ...>
      [pid  7899] <... restart_syscall resumed> ) = -1 ETIMEDOUT (Connection timed out)
      [pid  7899] futex(0x7f20a009d028, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7899] futex(0x7f20a009d054, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373838, 782077000}, ffffffff <unfinished ...>
      [pid  7906] <... futex resumed> )       = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373837, 848014000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373837, 898283000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373837, 948489000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373837, 998751000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373838, 49086000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373838, 99351000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373838, 149589000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373838, 199852000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373838, 250180000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373838, 300489000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373838, 350762000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373838, 401078000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373838, 451258000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373838, 501527000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373838, 551767000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373838, 601970000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373838, 652161000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373838, 702290000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373838, 752523000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373838, 802739000}, ffffffff <unfinished ...>
      [pid  7899] <... futex resumed> )       = -1 ETIMEDOUT (Connection timed out)
      [pid  7899] futex(0x7f20a009d028, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7899] futex(0x7f20a009d054, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373839, 782336000}, ffffffff <unfinished ...>
      [pid  7922] <... restart_syscall resumed> ) = -1 ETIMEDOUT (Connection timed out)
      [pid  7922] futex(0x7f20a06d3a28, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7922] futex(0x7f20a0550554, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7f20a0550550, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
      [pid  7907] <... restart_syscall resumed> ) = 0
      [pid  7907] futex(0x7f20a0550528, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7922] write(56, "\0\0\0\335\0\0\2.\0\4ping\0\0\0\2\0&org.apache.h"..., 225 <unfinished ...>
      [pid  7907] read(56,  <unfinished ...>
      [pid  7922] <... write resumed> )       = 225
      [pid  7907] <... read resumed> 0x7f2098000b70, 8192) = -1 EAGAIN (Resource temporarily unavailable)
      [pid  7922] futex(0x7f20a009d054, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7f20a009d050, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
      [pid  7899] <... futex resumed> )       = 0
      [pid  7922] futex(0x7f20a06d3e54, FUTEX_WAIT_PRIVATE, 53, NULL <unfinished ...>
      [pid  7907] epoll_ctl(59, EPOLL_CTL_ADD, 56, {...} <unfinished ...>
      [pid  7899] futex(0x7f20a009d028, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
      [pid  7907] <... epoll_ctl resumed> )   = 0
      [pid  7899] <... futex resumed> )       = 0
      [pid  7907] epoll_wait(59, {?} 0x7f20a0544000, 4096, 60000) = 1
      [pid  7899] mprotect(0x7f20a8147000, 4096, PROT_READ <unfinished ...>
      [pid  7907] futex(0x7f20a0551e54, FUTEX_WAIT_PRIVATE, 11, NULL <unfinished ...>
      [pid  7899] <... mprotect resumed> )    = 0
      [pid  7890] futex(0x7f20a000b854, FUTEX_WAIT_PRIVATE, 349, NULL <unfinished ...>
      [pid  7899] mprotect(0x7f20a8147000, 4096, PROT_READ|PROT_WRITE) = 0
      [pid  7899] mprotect(0x7f20a8148000, 4096, PROT_NONE) = 0
      [pid  7899] mprotect(0x7f20a8148000, 4096, PROT_READ) = 0
      [pid  7899] futex(0x7f20a0551e54, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7f20a0551e50, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
      [pid  7899] futex(0x7f20a06d3e54, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7f20a06d3e50, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1} <unfinished ...>
      [pid  7922] <... futex resumed> )       = 0
      [pid  7907] <... futex resumed> )       = 0
      [pid  7922] futex(0x7f20a06d3e28, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
      [pid  7899] <... futex resumed> )       = 1
      [pid  7907] futex(0x7f20a0551e28, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
      [pid  7899] futex(0x7f20a06d3e28, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
      [pid  7922] <... futex resumed> )       = 0
      [pid  7907] <... futex resumed> )       = 0
      [pid  7922] futex(0x7f20a06d3e28, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
      [pid  7899] <... futex resumed> )       = 1
      [pid  7922] <... futex resumed> )       = 0
      [pid  7899] futex(0x7f20a009d054, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 3, {1378373839, 794194000}, ffffffff <unfinished ...>
      [pid  7922] futex(0x7f20a06d3a54, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...>
      [pid  7907] futex(0x7f20a000b854, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7f20a000b850, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
      [pid  7890] <... futex resumed> )       = 0
      [pid  7890] futex(0x7f20a000b828, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
      [pid  7907] epoll_ctl(59, EPOLL_CTL_DEL, 56, {...} <unfinished ...>
      [pid  7890] <... futex resumed> )       = 0
      [pid  7907] <... epoll_ctl resumed> )   = 0
      [pid  7907] epoll_wait(59, {}, 4096, 0) = 0
      [pid  7907] read(56, "\0\0\2.\0\0\0\0\0\7boolean\1", 8192) = 18
      [pid  7907] futex(0x7f20a06d3a54, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7f20a06d3a50, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
      [pid  7922] <... futex resumed> )       = 0
      [pid  7907] futex(0x7f20a0550554, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1107, {1378373848, 793674000}, ffffffff <unfinished ...>
      [pid  7922] futex(0x7f20a06d3a28, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7922] futex(0x7f20a06d3a54, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 3, {1378373841, 794758000}, ffffffff <unfinished ...>
      [pid  7906] <... futex resumed> )       = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373838, 852867000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373838, 903052000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373838, 953226000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373839, 3436000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373839, 53695000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373839, 103966000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373839, 154232000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373839, 204466000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373839, 254802000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373839, 305067000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373839, 355318000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373839, 405623000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373839, 455817000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373839, 506130000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373839, 556395000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373839, 606661000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373839, 656898000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      [pid  7906] futex(0x7f20a0104928, FUTEX_WAKE_PRIVATE, 1) = 0
      [pid  7906] futex(0x7f20a0104954, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1378373839, 707177000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
      

      Attachments

        1. gendata.py (0.8 kB) by Prasanth Jayachandran

        Activity

          ferrerabertran Pere Ferrera Bertran added a comment -

          I think the issue might be the same as this Lucene issue: https://issues.apache.org/jira/browse/LUCENE-772

          When we import some data into ORC files, eventually some part gets corrupted, and subsequent jobs enter this infinite loop at some point when reading from the ORC file. We can reproduce the issue once the data has been imported, but if we re-import, the issue might disappear or appear at some other point. This is because the jobs we execute after importing don't use all the fields, so if one field gets corrupted we don't notice until we execute the aggregation that uses that field.

          So my guess is that it is possible for the ORC file to be corrupted while writing.
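          One way to surface this kind of corruption right after import would be to scan every new file while forcing all columns to be read. A minimal sketch, assuming the Hive 0.11-era ORC reader API (the file path argument is a placeholder):

          import org.apache.hadoop.conf.Configuration;
          import org.apache.hadoop.fs.FileSystem;
          import org.apache.hadoop.fs.Path;
          import org.apache.hadoop.hive.ql.io.orc.OrcFile;
          import org.apache.hadoop.hive.ql.io.orc.Reader;
          import org.apache.hadoop.hive.ql.io.orc.RecordReader;

          public class OrcScanCheck {
            public static void main(String[] args) throws Exception {
              Configuration conf = new Configuration();
              Path path = new Path(args[0]); // one ORC file under the partition directory
              FileSystem fs = path.getFileSystem(conf);
              Reader reader = OrcFile.createReader(fs, path);
              // include == null reads every column, so corruption in a field
              // that normal queries skip is hit immediately instead of later.
              RecordReader rows = reader.rows(null);
              Object row = null;
              long count = 0;
              while (rows.hasNext()) {
                row = rows.next(row);
                count++;
              }
              rows.close();
              System.out.println(path + ": " + count + " rows read successfully");
            }
          }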

          omalley Owen O'Malley added a comment -

          Does the stack trace change? Or is it always stuck in the zip routines?

          Would it be possible to email me (omalley@apache.org) one of the ORC files that cause the reader to lock up? (Even better of course would be if I could recreate the problem on my machine.)

          It looks like somehow the compressed stream got corrupted. Which OS/platform are you on? What is the output of "java -version"?


          ferrerabertran Pere Ferrera Bertran added a comment -

          Hi Owen,

          The stack trace is always stuck in the zip routines.

          We don't have such a file right now, but we can try to generate it again. We're unsure whether this is reproducible at a small scale, so we will just try, hoping that the environment triggers the bug by itself (we suspect the bug might be triggered only when the cluster is under a lot of pressure).

          Even if this happens, what file size can we send to your Apache e-mail? Will it fit?


          ferrerabertran Pere Ferrera Bertran added a comment -

          Hi Owen,

          When trying to import data again into ORC files, we got a different exception at Stage-3, but I think it could be closely related to the same issue. Some tasks failed with this exception consistently across 4 trials on different data nodes; others failed once and succeeded afterwards. This is the exception that was reproduced consistently across the 4 trials, so probably something got corrupted in previous stages:

          java.io.IOException: java.io.IOException: java.io.EOFException: Read past end of RLE integer from compressed stream Stream for column 11 kind DATA base: 26245065 offset: 32702649 limit: 32702649 uncompressed: 38143 to 38143
          	at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
          	at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
          	at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:302)
          	at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:218)
          	at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:236)
          	at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:216)
          	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
          	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
          	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
          	at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
          	at java.security.AccessController.doPrivileged(Native Method)
          	at javax.security.auth.Subject.doAs(Subject.java:415)
          	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1178)
          	at org.apache.hadoop.mapred.Child.main(Child.java:249)
          Caused by: java.io.IOException: java.io.EOFException: Read past end of RLE integer from compressed stream Stream for column 11 kind DATA base: 26245065 offset: 32702649 limit: 32702649 uncompressed: 38143 to 38143
          	at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
          	at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
          	at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:276)
          	at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:101)
          	at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:41)
          	at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:108)
          	at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:300)
          	... 11 more
          Caused by: java.io.EOFException: Read past end of RLE integer from compressed stream Stream for column 11 kind DATA base: 26245065 offset: 32702649 limit: 32702649 uncompressed: 38143 to 38143
          	at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReader.readValues(RunLengthIntegerReader.java:44)
          	at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReader.next(RunLengthIntegerReader.java:81)
          	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$IntTreeReader.next(RecordReaderImpl.java:332)
          	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.next(RecordReaderImpl.java:802)
          	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:1214)
          	at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:71)
          	at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:46)
          	at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:274)
          	... 15 more
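          For context, the "RLE integer" in this trace is read through ORC's variable-length integer encoding (the readVulong/readVslong frames above): 7 data bits per byte, a set high bit meaning "more bytes follow", plus a zigzag step for signed values. A minimal illustrative sketch (not the actual SerializationUtils code) shows why a truncated stream surfaces as exactly this EOFException:

          import java.io.EOFException;
          import java.io.IOException;
          import java.io.InputStream;

          public class VarIntSketch {
            // Unsigned base-128 varint: each byte carries 7 data bits; a set
            // high bit means another byte follows. EOF mid-value is corruption.
            static long readVulong(InputStream in) throws IOException {
              long result = 0;
              int shift = 0;
              int b;
              do {
                b = in.read();
                if (b == -1) {
                  // This is where "Read past end of RLE integer" comes from.
                  throw new EOFException("Read past end of RLE integer");
                }
                result |= (long) (b & 0x7f) << shift;
                shift += 7;
              } while ((b & 0x80) != 0);
              return result;
            }

            // Signed variant: zigzag-decode the unsigned value.
            static long readVslong(InputStream in) throws IOException {
              long v = readVulong(in);
              return (v >>> 1) ^ -(v & 1);
            }
          }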
          

          This is the task log:

          2013-09-12 20:47:45,050 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
          2013-09-12 20:47:45,294 INFO org.apache.hadoop.mapred.TaskRunner: Creating symlink: /hd/hd1/mapred/local/taskTracker/distcache/6799594255381089178_1804748353_341796893/master/tmp/hive-datasalt/hive_2013-09-12_05-36-35_024_230175250616241183/-mr-10006/4e87ee35-7660-440e-a25e-f2b85b3e9929 <- /hd/hd6/mapred/local/taskTracker/datasalt/jobcache/job_201309100703_0031/attempt_201309100703_0031_m_000255_0/work/HIVE_PLAN4e87ee35-7660-440e-a25e-f2b85b3e9929
          2013-09-12 20:47:45,307 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /hd/hd1/mapred/local/taskTracker/datasalt/jobcache/job_201309100703_0031/jars/META-INF <- /hd/hd6/mapred/local/taskTracker/datasalt/jobcache/job_201309100703_0031/attempt_201309100703_0031_m_000255_0/work/META-INF
          2013-09-12 20:47:45,310 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /hd/hd1/mapred/local/taskTracker/datasalt/jobcache/job_201309100703_0031/jars/javaewah <- /hd/hd6/mapred/local/taskTracker/datasalt/jobcache/job_201309100703_0031/attempt_201309100703_0031_m_000255_0/work/javaewah
          2013-09-12 20:47:45,311 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /hd/hd1/mapred/local/taskTracker/datasalt/jobcache/job_201309100703_0031/jars/.job.jar.crc <- /hd/hd6/mapred/local/taskTracker/datasalt/jobcache/job_201309100703_0031/attempt_201309100703_0031_m_000255_0/work/.job.jar.crc
          2013-09-12 20:47:45,312 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /hd/hd1/mapred/local/taskTracker/datasalt/jobcache/job_201309100703_0031/jars/job.jar <- /hd/hd6/mapred/local/taskTracker/datasalt/jobcache/job_201309100703_0031/attempt_201309100703_0031_m_000255_0/work/job.jar
          2013-09-12 20:47:45,314 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /hd/hd1/mapred/local/taskTracker/datasalt/jobcache/job_201309100703_0031/jars/javax <- /hd/hd6/mapred/local/taskTracker/datasalt/jobcache/job_201309100703_0031/attempt_201309100703_0031_m_000255_0/work/javax
          2013-09-12 20:47:45,315 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /hd/hd1/mapred/local/taskTracker/datasalt/jobcache/job_201309100703_0031/jars/hive-exec-log4j.properties <- /hd/hd6/mapred/local/taskTracker/datasalt/jobcache/job_201309100703_0031/attempt_201309100703_0031_m_000255_0/work/hive-exec-log4j.properties
          2013-09-12 20:47:45,317 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /hd/hd1/mapred/local/taskTracker/datasalt/jobcache/job_201309100703_0031/jars/org <- /hd/hd6/mapred/local/taskTracker/datasalt/jobcache/job_201309100703_0031/attempt_201309100703_0031_m_000255_0/work/org
          2013-09-12 20:47:45,318 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /hd/hd1/mapred/local/taskTracker/datasalt/jobcache/job_201309100703_0031/jars/com <- /hd/hd6/mapred/local/taskTracker/datasalt/jobcache/job_201309100703_0031/attempt_201309100703_0031_m_000255_0/work/com
          2013-09-12 20:47:45,319 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /hd/hd1/mapred/local/taskTracker/datasalt/jobcache/job_201309100703_0031/jars/javolution <- /hd/hd6/mapred/local/taskTracker/datasalt/jobcache/job_201309100703_0031/attempt_201309100703_0031_m_000255_0/work/javolution
          2013-09-12 20:47:45,555 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
          2013-09-12 20:47:45,733 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
          2013-09-12 20:47:45,747 INFO org.apache.hadoop.mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@1007d798
          2013-09-12 20:47:48,268 INFO org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader: Processing file hdfs://master:54310/tmp/hive-datasalt/hive_2013-09-12_05-36-35_024_230175250616241183/-ext-10002/ddate=2013-09-06/002565_0
          2013-09-12 20:47:48,268 INFO org.apache.hadoop.mapred.MapTask: numReduceTasks: 0
          2013-09-12 20:47:48,277 INFO ExecMapper: maximum memory = 559284224
          2013-09-12 20:47:48,277 INFO ExecMapper: conf classpath = [file:/home/datasalt/sw/hadoop-1.1.2.23/conf/, file:/opt/oracle-jdk-bin-1.7.0.25/lib/tools.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/, file:/home/datasalt/sw/hadoop-1.1.2.23/hadoop-core-1.1.2.23.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/asm-3.2.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/aspectjrt-1.6.11.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/aspectjtools-1.6.11.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-beanutils-1.7.0.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-beanutils-core-1.8.0.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-cli-1.2.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-codec-1.4.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-collections-3.2.1.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-configuration-1.6.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-daemon-1.0.1.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-digester-1.8.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-el-1.0.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-httpclient-3.0.1.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-io-2.1.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-lang-2.4.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-logging-1.1.1.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-logging-api-1.0.4.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-math-2.1.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-net-3.1.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/core-3.1.1.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/hadoop-capacity-scheduler-1.1.2.23.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/hadoop-fairscheduler-1.1.2.23.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/hadoop-thriftfs-1.1.2.23.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/hsqldb-1.8.0.10.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jackson-core-asl-1.8.8.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jackson-mapper-asl-1.8.8.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jasper-compiler-5.5.12.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jasper-runtime-5.5.12.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jdeb-0.8.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jersey-core-1.8.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jersey-json-1.8.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jersey-server-1.8.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jets3t-0.6.1.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jetty-6.1.26.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jetty-util-6.1.26.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jsch-0.1.42.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/junit-4.5.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/kfs-0.2.2.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/log4j-1.2.15.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/mockito-all-1.8.5.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/oro-2.0.8.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/servlet-api-2.5-20081211.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/slf4j-api-1.4.3.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/slf4j-log4j12-1.4.3.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/xmlenc-0.52.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jsp-2.1/jsp-2.1.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jsp-2.1/jsp-api-2.1.jar, file:/hd/hd1/mapred/local/taskTracker/datasalt/jobcache/job_201309100703_0031/jars/classes, 
file:/hd/hd1/mapred/local/taskTracker/datasalt/jobcache/job_201309100703_0031/jars/, file:/hd/hd2/mapred/local/taskTracker/datasalt/distcache/4273833688434235296_359306031_341796978/master/tmp/hadoop-datasalt/mapred/staging/datasalt/.staging/job_201309100703_0031/libjars/derby-10.4.2.0.jar/, file:/hd/hd6/mapred/local/taskTracker/datasalt/jobcache/job_201309100703_0031/attempt_201309100703_0031_m_000255_0/work/]
          2013-09-12 20:47:48,278 INFO ExecMapper: thread classpath = [file:/home/datasalt/sw/hadoop-1.1.2.23/conf/, file:/opt/oracle-jdk-bin-1.7.0.25/lib/tools.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/, file:/home/datasalt/sw/hadoop-1.1.2.23/hadoop-core-1.1.2.23.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/asm-3.2.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/aspectjrt-1.6.11.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/aspectjtools-1.6.11.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-beanutils-1.7.0.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-beanutils-core-1.8.0.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-cli-1.2.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-codec-1.4.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-collections-3.2.1.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-configuration-1.6.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-daemon-1.0.1.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-digester-1.8.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-el-1.0.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-httpclient-3.0.1.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-io-2.1.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-lang-2.4.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-logging-1.1.1.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-logging-api-1.0.4.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-math-2.1.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/commons-net-3.1.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/core-3.1.1.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/hadoop-capacity-scheduler-1.1.2.23.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/hadoop-fairscheduler-1.1.2.23.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/hadoop-thriftfs-1.1.2.23.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/hsqldb-1.8.0.10.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jackson-core-asl-1.8.8.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jackson-mapper-asl-1.8.8.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jasper-compiler-5.5.12.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jasper-runtime-5.5.12.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jdeb-0.8.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jersey-core-1.8.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jersey-json-1.8.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jersey-server-1.8.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jets3t-0.6.1.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jetty-6.1.26.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jetty-util-6.1.26.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jsch-0.1.42.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/junit-4.5.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/kfs-0.2.2.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/log4j-1.2.15.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/mockito-all-1.8.5.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/oro-2.0.8.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/servlet-api-2.5-20081211.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/slf4j-api-1.4.3.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/slf4j-log4j12-1.4.3.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/xmlenc-0.52.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jsp-2.1/jsp-2.1.jar, file:/home/datasalt/sw/hadoop-1.1.2.23/lib/jsp-2.1/jsp-api-2.1.jar, file:/hd/hd1/mapred/local/taskTracker/datasalt/jobcache/job_201309100703_0031/jars/classes, 
file:/hd/hd1/mapred/local/taskTracker/datasalt/jobcache/job_201309100703_0031/jars/, file:/hd/hd2/mapred/local/taskTracker/datasalt/distcache/4273833688434235296_359306031_341796978/master/tmp/hadoop-datasalt/mapred/staging/datasalt/.staging/job_201309100703_0031/libjars/derby-10.4.2.0.jar/, file:/hd/hd6/mapred/local/taskTracker/datasalt/jobcache/job_201309100703_0031/attempt_201309100703_0031_m_000255_0/work/]
          2013-09-12 20:47:48,300 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Adding alias hdfs://master:54310/tmp/hive-datasalt/hive_2013-09-12_05-36-35_024_230175250616241183/-ext-10002 to work list for file hdfs://master:54310/tmp/hive-datasalt/hive_2013-09-12_05-36-35_024_230175250616241183/-ext-10002/ddate=2013-09-06
          2013-09-12 20:47:48,302 INFO org.apache.hadoop.hive.ql.exec.MapOperator: dump TS struct<hour:int,user_ip:string,country_iso:string,user_language:string,num_hits:int,site_hostname:string,idsite:int,num_ads:int,num_ads_wanted:int,category:int,idcampaign:int,idvariation:int,sub_id:int,cpm_value:double,publisher_share:int,publisher_currency:string,advertiser_currency:string,idzone:int,idagent:int,idadvertiser:int,idpublisher:int,ddate:string>
          2013-09-12 20:47:48,303 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Adding alias hdfs://master:54310/tmp/hive-datasalt/hive_2013-09-12_05-36-35_024_230175250616241183/-ext-10002 to work list for file hdfs://master:54310/tmp/hive-datasalt/hive_2013-09-12_05-36-35_024_230175250616241183/-ext-10002/ddate=2013-09-07
          2013-09-12 20:47:48,304 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Adding alias hdfs://master:54310/tmp/hive-datasalt/hive_2013-09-12_05-36-35_024_230175250616241183/-ext-10002 to work list for file hdfs://master:54310/tmp/hive-datasalt/hive_2013-09-12_05-36-35_024_230175250616241183/-ext-10002/ddate=2013-09-08
          2013-09-12 20:47:48,305 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Adding alias hdfs://master:54310/tmp/hive-datasalt/hive_2013-09-12_05-36-35_024_230175250616241183/-ext-10002 to work list for file hdfs://master:54310/tmp/hive-datasalt/hive_2013-09-12_05-36-35_024_230175250616241183/-ext-10002/ddate=2013-09-09
          2013-09-12 20:47:48,306 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Adding alias hdfs://master:54310/tmp/hive-datasalt/hive_2013-09-12_05-36-35_024_230175250616241183/-ext-10002 to work list for file hdfs://master:54310/tmp/hive-datasalt/hive_2013-09-12_05-36-35_024_230175250616241183/-ext-10002/ddate=2013-09-10
          2013-09-12 20:47:48,307 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Adding alias hdfs://master:54310/tmp/hive-datasalt/hive_2013-09-12_05-36-35_024_230175250616241183/-ext-10002 to work list for file hdfs://master:54310/tmp/hive-datasalt/hive_2013-09-12_05-36-35_024_230175250616241183/-ext-10002/ddate=2013-09-11
          2013-09-12 20:47:48,307 INFO ExecMapper: 
          <MAP>Id =2
            <Children>
              <TS>Id =1
                <Children>
                  <FS>Id =0
                    <Parent>Id = 1 null<\Parent>
                  <\FS>
                <\Children>
                <Parent>Id = 2 null<\Parent>
              <\TS>
            <\Children>
          <\MAP>
          2013-09-12 20:47:48,310 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Initializing Self 2 MAP
          2013-09-12 20:47:48,310 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: Initializing Self 1 TS
          2013-09-12 20:47:48,310 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: Operator 1 TS initialized
          2013-09-12 20:47:48,310 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: Initializing children of 1 TS
          2013-09-12 20:47:48,310 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Initializing child 0 FS
          2013-09-12 20:47:48,310 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Initializing Self 0 FS
          2013-09-12 20:47:48,314 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Operator 0 FS initialized
          2013-09-12 20:47:48,314 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Initialization Done 0 FS
          2013-09-12 20:47:48,314 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: Initialization Done 1 TS
          2013-09-12 20:47:48,314 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Initialization Done 2 MAP
          2013-09-12 20:47:48,361 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Processing alias hdfs://master:54310/tmp/hive-datasalt/hive_2013-09-12_05-36-35_024_230175250616241183/-ext-10002 for file hdfs://master:54310/tmp/hive-datasalt/hive_2013-09-12_05-36-35_024_230175250616241183/-ext-10002/ddate=2013-09-06
          2013-09-12 20:47:48,361 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 2 forwarding 1 rows
          2013-09-12 20:47:48,361 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 1 forwarding 1 rows
          2013-09-12 20:47:48,365 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Final Path: FS hdfs://master:54310/tmp/hive-datasalt/hive_2013-09-12_05-36-35_024_230175250616241183/_tmp.-ext-10000/ddate=2013-09-06/000255_0
          2013-09-12 20:47:48,365 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Writing to temp file: FS hdfs://master:54310/tmp/hive-datasalt/hive_2013-09-12_05-36-35_024_230175250616241183/_task_tmp.-ext-10000/ddate=2013-09-06/_tmp.000255_0
          2013-09-12 20:47:48,365 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: New Final Path: FS hdfs://master:54310/tmp/hive-datasalt/hive_2013-09-12_05-36-35_024_230175250616241183/_tmp.-ext-10000/ddate=2013-09-06/000255_0
          2013-09-12 20:47:48,388 INFO ExecMapper: ExecMapper: processing 1 rows: used memory = 117452448
          2013-09-12 20:47:48,392 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 2 forwarding 10 rows
          2013-09-12 20:47:48,392 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 1 forwarding 10 rows
          2013-09-12 20:47:48,393 INFO ExecMapper: ExecMapper: processing 10 rows: used memory = 117452448
          2013-09-12 20:47:48,430 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 2 forwarding 100 rows
          2013-09-12 20:47:48,430 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 1 forwarding 100 rows
          2013-09-12 20:47:48,431 INFO ExecMapper: ExecMapper: processing 100 rows: used memory = 118610416
          2013-09-12 20:47:48,809 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 2 forwarding 1000 rows
          2013-09-12 20:47:48,809 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 1 forwarding 1000 rows
          2013-09-12 20:47:48,810 INFO ExecMapper: ExecMapper: processing 1000 rows: used memory = 98592408
          2013-09-12 20:47:49,406 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 2 forwarding 10000 rows
          2013-09-12 20:47:49,407 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 1 forwarding 10000 rows
          2013-09-12 20:47:49,408 INFO ExecMapper: ExecMapper: processing 10000 rows: used memory = 130619792
          2013-09-12 20:47:51,320 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 2 forwarding 100000 rows
          2013-09-12 20:47:51,321 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 1 forwarding 100000 rows
          2013-09-12 20:47:51,322 INFO ExecMapper: ExecMapper: processing 100000 rows: used memory = 122068200
          2013-09-12 20:48:10,995 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 2 forwarding 1000000 rows
          2013-09-12 20:48:10,995 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 1 forwarding 1000000 rows
          2013-09-12 20:48:10,996 INFO ExecMapper: ExecMapper: processing 1000000 rows: used memory = 154097584
          2013-09-12 20:48:33,717 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 2 forwarding 2000000 rows
          2013-09-12 20:48:33,717 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 1 forwarding 2000000 rows
          2013-09-12 20:48:33,717 INFO ExecMapper: ExecMapper: processing 2000000 rows: used memory = 309911600
          2013-09-12 20:48:56,451 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 2 forwarding 3000000 rows
          2013-09-12 20:48:56,451 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 1 forwarding 3000000 rows
          2013-09-12 20:48:56,452 INFO ExecMapper: ExecMapper: processing 3000000 rows: used memory = 263235784
          2013-09-12 20:49:12,316 INFO org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader: Processing file hdfs://master:54310/tmp/hive-datasalt/hive_2013-09-12_05-36-35_024_230175250616241183/-ext-10002/ddate=2013-09-06/002566_0
          2013-09-12 20:49:12,625 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Processing alias hdfs://master:54310/tmp/hive-datasalt/hive_2013-09-12_05-36-35_024_230175250616241183/-ext-10002 for file hdfs://master:54310/tmp/hive-datasalt/hive_2013-09-12_05-36-35_024_230175250616241183/-ext-10002/ddate=2013-09-06
          2013-09-12 20:49:20,324 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 2 forwarding 4000000 rows
          2013-09-12 20:49:20,324 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 1 forwarding 4000000 rows
          2013-09-12 20:49:20,325 INFO ExecMapper: ExecMapper: processing 4000000 rows: used memory = 307222344
          2013-09-12 20:49:43,626 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 2 forwarding 5000000 rows
          2013-09-12 20:49:43,626 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 1 forwarding 5000000 rows
          2013-09-12 20:49:43,627 INFO ExecMapper: ExecMapper: processing 5000000 rows: used memory = 390959336
          2013-09-12 20:50:06,151 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 2 forwarding 6000000 rows
          2013-09-12 20:50:06,151 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 1 forwarding 6000000 rows
          2013-09-12 20:50:06,152 INFO ExecMapper: ExecMapper: processing 6000000 rows: used memory = 393745752
          2013-09-12 20:50:44,108 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 2 forwarding 7000000 rows
          2013-09-12 20:50:44,108 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 1 forwarding 7000000 rows
          2013-09-12 20:50:44,108 INFO ExecMapper: ExecMapper: processing 7000000 rows: used memory = 316866264
          2013-09-12 20:50:51,899 INFO org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader: Processing file hdfs://master:54310/tmp/hive-datasalt/hive_2013-09-12_05-36-35_024_230175250616241183/-ext-10002/ddate=2013-09-06/002582_0
          2013-09-12 20:50:51,947 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Processing alias hdfs://master:54310/tmp/hive-datasalt/hive_2013-09-12_05-36-35_024_230175250616241183/-ext-10002 for file hdfs://master:54310/tmp/hive-datasalt/hive_2013-09-12_05-36-35_024_230175250616241183/-ext-10002/ddate=2013-09-06
          2013-09-12 20:51:06,473 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 2 forwarding 8000000 rows
          2013-09-12 20:51:06,473 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 1 forwarding 8000000 rows
          2013-09-12 20:51:06,473 INFO ExecMapper: ExecMapper: processing 8000000 rows: used memory = 247596776
          2013-09-12 20:51:28,606 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 2 forwarding 9000000 rows
          2013-09-12 20:51:28,606 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 1 forwarding 9000000 rows
          2013-09-12 20:51:28,607 INFO ExecMapper: ExecMapper: processing 9000000 rows: used memory = 261148536
          2013-09-12 20:51:51,043 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 2 forwarding 10000000 rows
          2013-09-12 20:51:51,044 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 1 forwarding 10000000 rows
          2013-09-12 20:51:51,044 INFO ExecMapper: ExecMapper: processing 10000000 rows: used memory = 372977008
          2013-09-12 20:52:12,277 INFO org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader: Processing file hdfs://master:54310/tmp/hive-datasalt/hive_2013-09-12_05-36-35_024_230175250616241183/-ext-10002/ddate=2013-09-06/002665_0
          2013-09-12 20:52:12,312 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Processing alias hdfs://master:54310/tmp/hive-datasalt/hive_2013-09-12_05-36-35_024_230175250616241183/-ext-10002 for file hdfs://master:54310/tmp/hive-datasalt/hive_2013-09-12_05-36-35_024_230175250616241183/-ext-10002/ddate=2013-09-06
          2013-09-12 20:52:14,571 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 2 forwarding 11000000 rows
          2013-09-12 20:52:14,572 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 1 forwarding 11000000 rows
          2013-09-12 20:52:14,572 INFO ExecMapper: ExecMapper: processing 11000000 rows: used memory = 354197824
          2013-09-12 20:52:37,247 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 2 forwarding 12000000 rows
          2013-09-12 20:52:37,248 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 1 forwarding 12000000 rows
          2013-09-12 20:52:37,248 INFO ExecMapper: ExecMapper: processing 12000000 rows: used memory = 363017344
          2013-09-12 20:53:14,269 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 2 forwarding 13000000 rows
          2013-09-12 20:53:14,269 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 1 forwarding 13000000 rows
          2013-09-12 20:53:14,269 INFO ExecMapper: ExecMapper: processing 13000000 rows: used memory = 457861440
          2013-09-12 20:53:35,824 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 2 forwarding 14000000 rows
          2013-09-12 20:53:35,824 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 1 forwarding 14000000 rows
          2013-09-12 20:53:35,825 INFO ExecMapper: ExecMapper: processing 14000000 rows: used memory = 433185992
          2013-09-12 20:53:47,321 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 2 finished. closing... 
          2013-09-12 20:53:47,321 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 2 forwarded 14510283 rows
          2013-09-12 20:53:47,321 INFO org.apache.hadoop.hive.ql.exec.MapOperator: DESERIALIZE_ERRORS:0
          2013-09-12 20:53:47,321 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 1 finished. closing... 
          2013-09-12 20:53:47,321 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 1 forwarded 14510283 rows
          2013-09-12 20:53:47,321 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: 0 finished. closing... 
          2013-09-12 20:53:47,321 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: 0 forwarded 0 rows
          2013-09-12 20:53:51,508 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 1 Close done
          2013-09-12 20:53:51,508 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 2 Close done
          2013-09-12 20:53:51,508 INFO ExecMapper: ExecMapper: processed 14510283 rows: used memory = 248811928
          2013-09-12 20:53:51,514 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
          2013-09-12 20:53:51,603 INFO org.apache.hadoop.io.nativeio.NativeIO: Initialized cache for UID to User mapping with a cache timeout of 14400 seconds.
          2013-09-12 20:53:51,604 INFO org.apache.hadoop.io.nativeio.NativeIO: Got UserName datasalt for UID 5001 from the native implementation
          2013-09-12 20:53:51,606 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:datasalt cause:java.io.IOException: java.io.IOException: java.io.EOFException: Read past end of RLE integer from compressed stream Stream for column 11 kind DATA base: 26245065 offset: 32702649 limit: 32702649 uncompressed: 38143 to 38143
          2013-09-12 20:53:51,607 WARN org.apache.hadoop.mapred.Child: Error running child
          java.io.IOException: java.io.IOException: java.io.EOFException: Read past end of RLE integer from compressed stream Stream for column 11 kind DATA base: 26245065 offset: 32702649 limit: 32702649 uncompressed: 38143 to 38143
          	at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
          	at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
          	at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:302)
          	at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:218)
          	at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:236)
          	at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:216)
          	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
          	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
          	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
          	at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
          	at java.security.AccessController.doPrivileged(Native Method)
          	at javax.security.auth.Subject.doAs(Subject.java:415)
          	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1178)
          	at org.apache.hadoop.mapred.Child.main(Child.java:249)
          Caused by: java.io.IOException: java.io.EOFException: Read past end of RLE integer from compressed stream Stream for column 11 kind DATA base: 26245065 offset: 32702649 limit: 32702649 uncompressed: 38143 to 38143
          	at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
          	at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
          	at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:276)
          	at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:101)
          	at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:41)
          	at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:108)
          	at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:300)
          	... 11 more
          Caused by: java.io.EOFException: Read past end of RLE integer from compressed stream Stream for column 11 kind DATA base: 26245065 offset: 32702649 limit: 32702649 uncompressed: 38143 to 38143
          	at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReader.readValues(RunLengthIntegerReader.java:44)
          	at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReader.next(RunLengthIntegerReader.java:81)
          	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$IntTreeReader.next(RecordReaderImpl.java:332)
          	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.next(RecordReaderImpl.java:802)
          	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:1214)
          	at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:71)
          	at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:46)
          	at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:274)
          	... 15 more
          2013-09-12 20:53:51,646 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task
          
          omalley Owen O'Malley added a comment -

          Pere,
          If the problem goes away when you regenerate the data, it is most likely a race condition down in the interface to the zlib compression library that results in the compressed bytes being corrupted. Both of the stacks you've posted are reading an integer column with compression turned on, and a data-corruption bug down in the zlib routines could easily explain them.

          • Which version of the JDK are you running on?
          • Which exact OS version are you running?

          If you come across one of the broken files, I would like to see it. You can use my gmail address (owen.omalley@gmail.com) to send files up to 10GB from a gmail address. Hopefully, 10GB is big enough.

          thejas Thejas Nair added a comment -

          Pere, we would like to get this fixed for the 0.12 release if possible. Can you please give Owen any additional information you have?


          ferrerabertran Pere Ferrera Bertran added a comment -

          Hi guys,

          Here are the OS details (uname -a):

          Linux xxxxxxxxxxxx 3.6.11-gentoo-xxx #1 SMP Wed Jan 23 12:25:47 EST 2013 x86_64 Intel(R) Xeon(R) CPU L5520 @ 2.27GHz GenuineIntel GNU/Linux

          The Java version at the moment of the crashes was "Java SE 7 Update 25". We are currently not using the ORC file format in this cluster, and we can't distribute data from it; however, we will try to launch a query against fake data to try to reproduce it.


          ferrerabertran Pere Ferrera Bertran added a comment -

          We managed to reproduce the issue again. We have a 350 MB file, taken from a big ORC table, that fails consistently. We will now send it as agreed.

          Generating the ORC table was not a problem, but when we then ran a throwaway query that scanned the whole table, the problem appeared.

          We are now running the latest Java (Update 40).
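
          For reference, a minimal full-scan query of the kind described above, using a hypothetical table name, could be:

          SELECT COUNT(*) FROM big_orc_table;

          In Hive 0.11 this launches a MapReduce job that reads every row of the table, which is enough to hit a corrupted compression block anywhere in the file.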


          prasanth_j Prasanth Jayachandran added a comment -

          Hi Pere,

          Are you using a generated dataset? If so, can you please upload the data-generation script?
          If any query that scans the whole table fails, it may be because the ORC file itself is corrupted. It would be much easier to reproduce this issue with the original data instead of the ORC file.


          ferrerabertran Pere Ferrera Bertran added a comment -

          The data is not generated, but there is nothing special about it: the same error happened when using different days of data.

          I sent the corrupted ORC file to Owen. It's just a bunch of integer columns, so, as I said, nothing special.

          omalley Owen O'Malley added a comment -

          Ok, here is the current state. We know that the problem is in the last compression block of the last column's data stream in the first stripe. The metadata is all consistent and the problem seems to be localized to that one compression block. Based on all of this we know that the problem is somewhere between rows 21770000 and 21860000. I'm going to try to localize the problem further.


          prasanth_j Prasanth Jayachandran added a comment -

          Hi Pere,

          I am working with Owen to help resolve this issue.
          We generated around ten columns of random integers with different ranges, around 25M rows, and created an ORC file with Hive 0.11. The ORC file had multiple stripes to mimic a similar scenario. We tried multiple runs with random datasets (also with JDK 7) but were not able to reproduce the issue. We tried it on Mac OS X but haven't tried it on Gentoo yet. Have you tried this on any OS other than Gentoo? If so, do you see similar issues on other OSes? Also, have you tried this on JDK 6? Would it be possible to post/send only the segment of integer values from the last column, between the rows that Owen specified, just to make sure there isn't some weird pattern in the input dataset?


          ferrerabertran Pere Ferrera Bertran added a comment -

          Hi Prasanth,

          We only know about the specified environment: Gentoo + Java 7. Unfortunately, we don't have much control over this cluster at the moment. What I can tell you is that the sequence of integers you mention also contains NULLs (maybe that changes something in your tests).


          prasanth_j Prasanth Jayachandran added a comment -

          Hi Pere,

          I do have some NULLs in a few columns. I used the attached gendata.py script to generate random integers; the last two columns include some NULLs. gendata.py generates 20M rows, and the fields in each row are tab-separated. I tried different datasets using this script, but it seems to work fine in my environment (Mac OS X and CentOS).
          Would it be possible for you to generate data using this script and test the generated data in your environment?
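
          For anyone reproducing this, here is a minimal sketch of loading such tab-separated output into an ORC table. The table names, column list, and input path are hypothetical (the real gendata.py schema has more columns):

          -- Staging table matching the tab-separated generator output.
          CREATE TABLE gen_text (c1 INT, c2 INT, c3 INT)
          ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
          STORED AS TEXTFILE;

          -- Hypothetical HDFS path for the generated file.
          LOAD DATA INPATH '/tmp/gendata.out' INTO TABLE gen_text;

          -- Rewrite the text data as ORC (zlib compression is the default in Hive 0.11).
          CREATE TABLE gen_orc STORED AS ORC AS SELECT * FROM gen_text;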

          omalley Owen O'Malley added a comment -

          Pere, thanks again for the example ORC file. Digging deeper, I can see something going wrong earlier in the file for column 13.

          When I compare the locations in the index to the locations in the data, they are aligned up to row 7,600,000, but by row 7,610,000 the locations are wrong. If I send you the JSON equivalent of the file you sent me, do you have a way of checking the data?


          ferrerabertran Pere Ferrera Bertran added a comment -

          Hi guys,

          We have this cluster as a proof of concept and the data rotates periodically, so there is a chance we don't have that data anymore. However, I will try to generate a corrupted ORC file from Prasanth's script, so that we have both the original data and the corrupted file, if that sounds convenient.

          omalley Owen O'Malley added a comment -

          That would be great.

          omalley Owen O'Malley added a comment -

          Pere,
          Could you also try turning off compression and see whether you can still reproduce the problem? To do so, set the table property "orc.compress"="NONE".

          You can see the rest of the documentation here: http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.0.0.2/ds_Hive/orcfile.html
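
          A minimal sketch of such a table definition; the table and column names here are made up for illustration:

          -- Same layout as the failing table, but with ORC compression disabled.
          CREATE TABLE events_orc_nocomp (
            id INT,
            value INT
          )
          STORED AS ORC
          TBLPROPERTIES ("orc.compress"="NONE");

          If the corruption still appears with compression off, that would rule out the zlib path as the culprit.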


          ferrerabertran Pere Ferrera Bertran added a comment -

          One thing I would like to emphasize is that the error is not deterministic with respect to the input data. The same input may or may not produce a corrupted file. If I leave it running for enough hours, at some point there is a high chance it will produce a corrupted ORC file. This is all I know so far.

          So I'm not sure it is valuable to reproduce this again with a particular input dataset, as you won't be able to reproduce it anyway. Some strange condition in this cluster causes random corruption; maybe some packet gets corrupted in the network under high load and the underlying implementation doesn't recover from that.

          I will, however, turn off compression and try to generate a corrupted file, so at least we will know something more.


          ferrerabertran Pere Ferrera Bertran added a comment -

          Hello,

          I used the script to generate a big file (60 GB) and ran 5 iterations of CREATE TABLE + SELECT without compression. One of them failed randomly with the same infinite-loop symptoms (tasks were killed for not reporting progress within the timeout).

          cyril.liao cyril liao added a comment -

          In my environment, the problem appeared when an int-type column contained null values. It was solved after I changed the null values to -1 as a default.
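
          A sketch of that workaround in HiveQL, using hypothetical table and column names, rewriting NULLs to -1 while copying into the ORC table:

          INSERT OVERWRITE TABLE events_orc
          SELECT id, COALESCE(int_col, -1) AS int_col
          FROM events_staging;

          COALESCE is just one way to substitute the -1 default; the point is that the ORC integer column never sees a NULL.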


          ivan.prado Iván de Prado added a comment -

          The same issue appeared in a cluster with the following software:

          Linux version 2.6.32-431.3.1.el6.x86_64 (mockbuild@x86-003.build.bos.redhat.com) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-4) (GCC) ) #1 SMP Fri Dec 13 06:58:20 EST 2013
          java version "1.6.0_30"
          OpenJDK Runtime Environment (IcedTea6 1.13.1) (rhel-3.1.13.1.el6_5-x86_64)
          OpenJDK 64-Bit Server VM (build 23.25-b01, mixed mode)
          CDH 5 beta2
          

          The exception I got when reading is:

          2014-02-22 09:47:44,912 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.io.EOFException: Read past end of RLE integer from compressed stream Stream for column 3 kind DATA position: 10739597 length: 10739597 range: 0 offset: 24245452 limit: 24245452 range 0 = 0 to 10739597 uncompressed: 62451 to 62451
          	at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.readValues(RunLengthIntegerReaderV2.java:46)
          	at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.next(RunLengthIntegerReaderV2.java:287)
          	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$IntTreeReader.next(RecordReaderImpl.java:422)
          	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.next(RecordReaderImpl.java:1193)
          	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:2240)
          	at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:105)
          	at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:56)
          	at org.apache.hcatalog.mapreduce.HCatRecordReader.nextKeyValue(HCatRecordReader.java:182)
          	at com.datasalt.pangool.tuplemr.mapred.lib.input.HCatTupleInputFormat$1.nextKeyValue(HCatTupleInputFormat.java:159)
          	at com.datasalt.pangool.tuplemr.mapred.lib.input.DelegatingRecordReader.nextKeyValue(DelegatingRecordReader.java:89)
          	at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:533)
          	at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
          	at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
          	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
          	at com.datasalt.pangool.tuplemr.mapred.lib.input.DelegatingMapper.run(DelegatingMapper.java:50)
          	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
          	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
          	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
          	at java.security.AccessController.doPrivileged(Native Method)
          	at javax.security.auth.Subject.doAs(Subject.java:415)
          	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
          	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
          

          ivan.prado Iván de Prado added a comment -

          Sorry, the Java version is:

          java version "1.6.0_31"
          Java(TM) SE Runtime Environment (build 1.6.0_31-b04)
          Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01, mixed mode)
          
          e64 Edwin Chiu added a comment -

          Seeing this issue with Hive 0.12 on HDP 2.0 as well...

          buryat Vadim Semenov added a comment -

          cyril.liao,

          it was solved after i change the null values to -1 as default.


          I confirm it: changing null to -1 for an INT column resolved the issue as well.


          People

            Assignee: Unassigned
            Reporter: ivan.prado Iván de Prado
            Votes: 3
            Watchers: 11
