Details
Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
HiveStorageHandler.configureTableJobProperties() has been replaced with configureInputJobProperties() and configureOutputJobProperties().
Description
HiveStorageHandler.configureTableJobProperties() is called to allow the storage handler to set up any properties that the underlying inputformat/outputformat/serde may need. But the handler implementation does not know whether it is being called to configure input or output. This is a problem for handlers that set external state. In the case of HCatalog's HBase storageHandler, whenever a write needs to be configured we create a write transaction, which needs to be committed or aborted later on. Configuring for both input and output every time configureTableJobProperties() is called is therefore undesirable, since a read-only job would also open a write transaction. This has become an issue since HCatalog is dropping storageDrivers in favor of SerDe and StorageHandler (see HCATALOG-237).
My proposal is to replace configureTableJobProperties() with two methods:
configureInputJobProperties()
configureOutputJobProperties()
Each method will have the same signature as the existing method. I took a cursory look at the code and I believe the changes should be straightforward, given that we are not really changing any behavior, just splitting responsibility. If the community is fine with this approach I will go ahead and create a patch.
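To make the proposal concrete, here is a minimal sketch of how the split might look from a handler's point of view. It does not use the real Hive types: SplitStorageHandler, TransactionalHandlerSketch, the String table-name parameter, and the "sketch.*" property keys are all hypothetical stand-ins, and the transaction counter is only a placeholder for the external state an HBase-style handler would create.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the proposed split; NOT the real Hive interface.
interface SplitStorageHandler {
    void configureInputJobProperties(String tableName, Map<String, String> jobProperties);
    void configureOutputJobProperties(String tableName, Map<String, String> jobProperties);
}

// Hypothetical handler with external state, loosely modeled on the HBase case.
class TransactionalHandlerSketch implements SplitStorageHandler {
    // Placeholder for write transactions that must later be committed or aborted.
    private int openWriteTransactions = 0;

    @Override
    public void configureInputJobProperties(String tableName, Map<String, String> jobProperties) {
        // A read only needs to know which table to scan; no transaction is opened.
        jobProperties.put("sketch.input.table", tableName);
    }

    @Override
    public void configureOutputJobProperties(String tableName, Map<String, String> jobProperties) {
        // A write opens a transaction and records its id for the job.
        openWriteTransactions++;
        jobProperties.put("sketch.output.table", tableName);
        jobProperties.put("sketch.write.txn.id", "txn-" + openWriteTransactions);
    }

    int getOpenWriteTransactions() {
        return openWriteTransactions;
    }
}

class SplitConfigDemo {
    public static void main(String[] args) {
        TransactionalHandlerSketch handler = new TransactionalHandlerSketch();

        // Configuring a read leaves the external state untouched.
        Map<String, String> readProps = new HashMap<>();
        handler.configureInputJobProperties("web_logs", readProps);
        System.out.println("after input config: " + handler.getOpenWriteTransactions());

        // Configuring a write is the only path that opens a transaction.
        Map<String, String> writeProps = new HashMap<>();
        handler.configureOutputJobProperties("web_logs", writeProps);
        System.out.println("after output config: " + handler.getOpenWriteTransactions());
    }
}
```

With a single combined method, the handler in this sketch would have to open a transaction on every call, including calls made for read-only jobs. With the split, the side effect is confined to the output path.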
Attachments
Issue Links
- blocks: HCATALOG-263 Make HBaseHCatStorageHandler work with Hive (Open)
- is related to: HIVE-2764 Obtain delegation tokens for MR jobs in secure hbase setup (Closed)