Details
Type: Bug
Status: Resolved
Priority: Critical
Resolution: Fixed
Affects Version/s: 1.4.5
Fix Version/s: None
Description
Parquet data import into a Hive table may fail if called a second time with the --hive-overwrite option set.
1. Run a successful Sqoop import with --hive-import.
2. Run another import with the --hive-overwrite option to overwrite the previously loaded data (see the example commands below).
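The following commands are a minimal sketch of this reproduction, not taken from the original report: the JDBC URL, host, credentials, and the foo/bar database and table names are illustrative placeholders inferred from the dataset name in the stack trace.

# Step 1: initial Parquet import into Hive (succeeds)
sqoop import \
  --connect jdbc:mysql://db.example.com/foo \
  --username user --password secret \
  --table bar \
  --hive-import \
  --hive-table foo.bar \
  --as-parquetfile

# Step 2: re-run the same import with --hive-overwrite (fails with DatasetExistsException)
sqoop import \
  --connect jdbc:mysql://db.example.com/foo \
  --username user --password secret \
  --table bar \
  --hive-import \
  --hive-overwrite \
  --hive-table foo.bar \
  --as-parquetfile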
Observed error:
ERROR sqoop.Sqoop: Got exception running Sqoop: org.kitesdk.data.DatasetExistsException: Metadata already exists for dataset: foo.bar
org.kitesdk.data.DatasetExistsException: Metadata already exists for dataset: foo.bar
    at org.kitesdk.data.spi.hive.HiveManagedMetadataProvider.create(HiveManagedMetadataProvider.java:51)
    at org.kitesdk.data.spi.hive.HiveManagedDatasetRepository.create(HiveManagedDatasetRepository.java:77)
    at org.kitesdk.data.Datasets.create(Datasets.java:239)
    at org.kitesdk.data.Datasets.create(Datasets.java:307)
    at org.apache.sqoop.mapreduce.ParquetJob.createDataset(ParquetJob.java:102)
    at org.apache.sqoop.mapreduce.ParquetJob.configureImportJob(ParquetJob.java:89)
    at org.apache.sqoop.mapreduce.DataDrivenImportJob.configureMapper(DataDrivenImportJob.java:106)
    at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:260)
    at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:668)
    at org.apache.sqoop.manager.MySQLManager.importTable(MySQLManager.java:118)
    at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:497)
    at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
    at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
    at org.apache.sqoop.Sqoop.main(Sqoop.java:236)