Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Description
Test Case
#################
# STEP 01 - Setup Table and Data
#################
export MYCONN=jdbc:oracle:thin:@oracle.cloudera.com:1521/orcl12c;
export MYUSER=sqoop
export MYPSWD=cloudera
sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query "drop table t1_oracle"
sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query "create table t1_oracle (c1 int, c2 varchar(10))"
sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query "insert into t1_oracle values (1, 'data')"
sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query "select * from t1_oracle"

Output:
-------------------------------------
| C1         | C2                   |
-------------------------------------
| 1          | data                 |
-------------------------------------

#################
# STEP 02 - Import Data as Parquet
#################
sqoop import --connect $MYCONN --username $MYUSER --password $MYPSWD --table T1_ORACLE --target-dir /user/user1/t1_oracle_parquet --delete-target-dir --num-mappers 1 --as-parquetfile

Output:
16/12/21 07:11:47 INFO mapreduce.ImportJobBase: Transferred 1.624 KB in 50.1693 seconds (33.1478 bytes/sec)
16/12/21 07:11:47 INFO mapreduce.ImportJobBase: Retrieved 1 records.
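The sqlType values recorded in the imported file's embedded Avro schema (see step 03) are java.sql.Types constants. A minimal sketch (class name SqlTypeCheck is my own, not from Sqoop) confirming that C1's sqlType "2" is NUMERIC (Oracle stores INT as NUMBER) and C2's sqlType "12" is VARCHAR, even though both columns end up as nullable string in the Avro schema:

```java
import java.sql.Types;

public class SqlTypeCheck {
    public static void main(String[] args) {
        // JDBC type constants matching the sqlType fields in parquet.avro.schema:
        System.out.println("NUMERIC = " + Types.NUMERIC);  // 2  -> column C1
        System.out.println("VARCHAR = " + Types.VARCHAR);  // 12 -> column C2
    }
}
```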
#################
# STEP 03 - Verify Parquet Data
#################
hdfs dfs -ls /user/user1/t1_oracle_parquet/*.parquet
parquet-tools schema -d hdfs://namenode.cloudera.com/user/user1/t1_oracle_parquet/a6ba3dda-b5fc-42d7-9555-5837a12a036b.parquet

Output:
-rw-r--r--   3 user1 user1   597 2016-12-21 07:11 /user/user1/t1_oracle_parquet/a6ba3dda-b5fc-42d7-9555-5837a12a036b.parquet
---
message T1_ORACLE {
  optional binary C1 (UTF8);
  optional binary C2 (UTF8);
}
creator: parquet-mr version 1.5.0-cdh5.8.3 (build ${buildNumber})
extra: parquet.avro.schema = {"type":"record","name":"T1_ORACLE","doc":"Sqoop import of T1_ORACLE","fields":[{"name":"C1","type":["null","string"],"default":null,"columnName":"C1","sqlType":"2"},{"name":"C2","type":["null","string"],"default":null,"columnName":"C2","sqlType":"12"}],"tableName":"T1_ORACLE"}

file schema: T1_ORACLE
------------------------------------------------------------------------------------------------------------------------
C1: OPTIONAL BINARY O:UTF8 R:0 D:1
C2: OPTIONAL BINARY O:UTF8 R:0 D:1

row group 1: RC:1 TS:85
------------------------------------------------------------------------------------------------------------------------
C1: BINARY SNAPPY DO:0 FPO:4 SZ:40/38/0.95 VC:1 ENC:PLAIN,RLE,BIT_PACKED
C2: BINARY SNAPPY DO:0 FPO:44 SZ:49/47/0.96 VC:1 ENC:PLAIN,RLE,BIT_PACKED

#################
# STEP 04 - Export Parquet Data
#################
sqoop export --connect $MYCONN --username $MYUSER --password $MYPSWD --table T1_ORACLE --export-dir /user/user1/t1_oracle_parquet --num-mappers 1 --verbose

Output:
[sqoop debug]
16/12/21 07:15:06 INFO mapreduce.Job:  map 0% reduce 0%
16/12/21 07:15:40 INFO mapreduce.Job:  map 100% reduce 0%
16/12/21 07:15:40 INFO mapreduce.Job: Job job_1481911879790_0026 failed with state FAILED due to: Task failed task_1481911879790_0026_m_000000
Job failed as tasks failed.
failedMaps:1 failedReduces:0
16/12/21 07:15:40 INFO mapreduce.Job: Counters: 8
	Job Counters
		Failed map tasks=1
		Launched map tasks=1
		Data-local map tasks=1
		Total time spent by all maps in occupied slots (ms)=32125
		Total time spent by all reduces in occupied slots (ms)=0
		Total time spent by all map tasks (ms)=32125
		Total vcore-seconds taken by all map tasks=32125
		Total megabyte-seconds taken by all map tasks=32896000
16/12/21 07:15:40 WARN mapreduce.Counters: Group FileSystemCounters is deprecated. Use org.apache.hadoop.mapreduce.FileSystemCounter instead
16/12/21 07:15:40 INFO mapreduce.ExportJobBase: Transferred 0 bytes in 46.8304 seconds (0 bytes/sec)
16/12/21 07:15:40 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
16/12/21 07:15:40 INFO mapreduce.ExportJobBase: Exported 0 records.
16/12/21 07:15:40 DEBUG util.ClassLoaderStack: Restoring classloader: java.net.FactoryURLClassLoader@577cfae6
16/12/21 07:15:40 ERROR tool.ExportTool: Error during export: Export job failed!
[yarn debug]
2016-12-21 07:15:38,911 DEBUG [Thread-11] org.apache.sqoop.mapreduce.AsyncSqlOutputFormat: Committing transaction of 0 statements
2016-12-21 07:15:38,914 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : parquet.io.ParquetDecodingException: Can not read value at 1 in block 0 in file hdfs://nameservice1/user/user1/t1_oracle_parquet/a6ba3dda-b5fc-42d7-9555-5837a12a036b.parquet
	at parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:241)
	at parquet.hadoop.ParquetRecordReader.nextKeyValue(ParquetRecordReader.java:227)
	at org.kitesdk.data.spi.filesystem.AbstractCombineFileRecordReader.nextKeyValue(AbstractCombineFileRecordReader.java:68)
	at org.apache.hadoop.mapreduce.lib.input.CombineFileRecordReader.nextKeyValue(CombineFileRecordReader.java:69)
	at org.kitesdk.data.spi.AbstractKeyRecordReaderWrapper.nextKeyValue(AbstractKeyRecordReaderWrapper.java:55)
	at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:556)
	at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
	at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
	at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.ClassCastException: T1_ORACLE cannot be cast to org.apache.avro.generic.IndexedRecord
	at parquet.avro.AvroIndexedRecordConverter.start(AvroIndexedRecordConverter.java:185)
	at parquet.io.RecordReaderImplementation.read(RecordReaderImplementation.java:391)
	at parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:216)
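The "Caused by" line is the core of the failure: parquet-avro's AvroIndexedRecordConverter casts each materialized record to org.apache.avro.generic.IndexedRecord, but the record class in play during the export is the Sqoop-generated ORM class T1_ORACLE, which does not implement that interface. A self-contained sketch of the cast failure, using stand-in types (the interface and class bodies here are simplifications I wrote for illustration, not the real parquet-avro or Sqoop code):

```java
// Stand-in for org.apache.avro.generic.IndexedRecord (simplified to one method).
interface IndexedRecord {
    Object get(int i);
}

// Stand-in for the Sqoop-generated ORM class; note it does NOT implement IndexedRecord.
class T1_ORACLE {
    Integer c1;
    String c2;
}

public class CastFailureSketch {
    public static void main(String[] args) {
        Object rec = new T1_ORACLE();               // what the record reader materializes
        try {
            IndexedRecord r = (IndexedRecord) rec;  // what AvroIndexedRecordConverter.start attempts
            System.out.println("cast succeeded: " + r);
        } catch (ClassCastException e) {
            // Mirrors "T1_ORACLE cannot be cast to ... IndexedRecord" in the task log.
            System.out.println("ClassCastException: " + e.getMessage());
        }
    }
}
```

Because the exception is thrown inside the map task, it surfaces only in the YARN task log, while the Sqoop client output above reports nothing beyond "Export job failed!".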
Attachments
Issue Links
is cloned by: SQOOP-3088 Sqoop export with Parquet data failure does not contain the MapTask error (Open)