Hive / HIVE-24396

[New Feature] Add data connector support for remote datasources

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.0.0
    • Component/s: Hive

      Description

      This feature adds support in the Hive Metastore for configuring data connectors to remote datasources and mapping their databases. Hive currently supports remote tables through StorageHandlers such as JDBCStorageHandler and HBaseStorageHandler.

      Data connectors are a natural extension of this: instead of mapping individual tables, we can map an entire database or catalog. The tables within it are mapped automatically at runtime. The metadata for these tables is not persisted in Hive; it is always mapped and built at runtime.
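To illustrate the "mapped at runtime, never persisted" idea, here is a minimal sketch in Java. Every class in it (`FakeRemoteSource`, `MappedTable`, `RuntimeConnector`) is hypothetical and exists only for this example; none of them is the actual Hive connector API. The in-memory map stands in for a real remote datasource.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a remote datasource: table name -> column names.
final class FakeRemoteSource {
    private final Map<String, List<String>> tables = new LinkedHashMap<>();

    void putTable(String name, List<String> columns) {
        tables.put(name, Collections.unmodifiableList(new ArrayList<>(columns)));
    }

    void dropTable(String name) {
        tables.remove(name);
    }

    List<String> tableNames() {
        return new ArrayList<>(tables.keySet());
    }

    List<String> columnsOf(String name) {
        return tables.get(name); // null when the table does not exist
    }
}

// Table metadata as seen from Hive; built per request and never stored.
final class MappedTable {
    final String dbName;
    final String tableName;
    final List<String> columns;

    MappedTable(String dbName, String tableName, List<String> columns) {
        this.dbName = dbName;
        this.tableName = tableName;
        this.columns = columns;
    }
}

// Hypothetical connector: every call goes to the remote source, so there is
// no persisted copy of the metadata that could go stale.
final class RuntimeConnector {
    private final String hiveDbName;
    private final FakeRemoteSource source;

    RuntimeConnector(String hiveDbName, FakeRemoteSource source) {
        this.hiveDbName = hiveDbName;
        this.source = source;
    }

    List<String> getTableNames() {
        return source.tableNames();
    }

    MappedTable getTable(String tableName) {
        List<String> columns = source.columnsOf(tableName);
        if (columns == null) {
            throw new IllegalArgumentException("No such remote table: " + tableName);
        }
        return new MappedTable(hiveDbName, tableName, columns);
    }
}

final class RuntimeMappingDemo {
    public static void main(String[] args) {
        FakeRemoteSource source = new FakeRemoteSource();
        source.putTable("orders", Arrays.asList("id", "amount"));
        RuntimeConnector connector = new RuntimeConnector("remote_db", source);
        System.out.println(connector.getTableNames());

        // A table added remotely is visible on the next lookup, with no sync step.
        source.putTable("customers", Arrays.asList("id", "name"));
        System.out.println(connector.getTableNames());
    }
}
```

Because nothing is cached in this sketch, a table created or dropped on the remote side shows up on the very next lookup. The trade-off is that every metadata call costs a round trip to the remote source.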

      With this feature, we introduce the concept of a type for databases in Hive: NATIVE or REMOTE. All existing databases are NATIVE. To create a REMOTE database, use the following syntax:
      CREATE REMOTE DATABASE remote_db USING <dataconnector> WITH DCPROPERTIES (....);
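The NATIVE/REMOTE distinction could be modeled roughly as below. This is a sketch under assumptions, not Hive's metastore code: `DatabaseType`, `DatabaseDef`, and `checkTableDdlAllowed` are hypothetical names. The rule that a REMOTE database must name a connector is inferred from the `USING <dataconnector>` clause, and the rejection of table DDL mirrors the sub-task titles on this issue ("Create table in REMOTE db should fail", "Drop/Alter table in REMOTE db should fail").

```java
// Hypothetical model of the database type introduced by this feature.
enum DatabaseType { NATIVE, REMOTE }

final class DatabaseDef {
    final String name;
    final DatabaseType type;
    final String connectorName; // null for NATIVE databases

    private DatabaseDef(String name, DatabaseType type, String connectorName) {
        this.name = name;
        this.type = type;
        this.connectorName = connectorName;
    }

    static DatabaseDef nativeDb(String name) {
        return new DatabaseDef(name, DatabaseType.NATIVE, null);
    }

    // Assumption: a REMOTE database is only meaningful with a connector.
    static DatabaseDef remoteDb(String name, String connectorName) {
        if (connectorName == null || connectorName.isEmpty()) {
            throw new IllegalArgumentException(
                "REMOTE database " + name + " needs a data connector");
        }
        return new DatabaseDef(name, DatabaseType.REMOTE, connectorName);
    }

    // Table metadata in a REMOTE database is owned by the remote source,
    // so table-level DDL issued through Hive is rejected in this sketch.
    void checkTableDdlAllowed(String operation) {
        if (type == DatabaseType.REMOTE) {
            throw new UnsupportedOperationException(
                operation + " is not allowed in REMOTE database " + name);
        }
    }
}

final class DatabaseTypeDemo {
    public static void main(String[] args) {
        DatabaseDef local = DatabaseDef.nativeDb("default");
        DatabaseDef remote = DatabaseDef.remoteDb("remote_db", "my_connector");

        local.checkTableDdlAllowed("CREATE TABLE"); // passes for NATIVE
        try {
            remote.checkTableDdlAllowed("CREATE TABLE");
        } catch (UnsupportedOperationException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Keeping the type on the database object lets a single check guard every table-level DDL path, rather than repeating the rule in each handler.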

      A design doc will be attached to this JIRA.

          Issue Links

          1. Create table in REMOTE db should fail (Sub-task, Resolved, Dantong Dong; Time Spent: 1h)
          2. Move create/drop/alter table to the provider interface (Sub-task, Resolved, Naveen Gangam)
          3. Support case-sensitivity for tables in REMOTE database. (Sub-task, In Progress, Sai Hemanth Gantasala; Time Spent: 0.5h)
          4. Implement connector provider for Derby DB (Sub-task, Resolved, Naveen Gangam)
          5. Add schema changes for MSSQL (Sub-task, Resolved, Sai Hemanth Gantasala; Time Spent: 10m)
          6. Add a generic JDBC implementation that can be used to other JDBC DBs (Sub-task, Open, Unassigned)
          7. Provide CachedStore implementation for dataconnectors (Sub-task, Open, Unassigned)
          8. Evaluate the need to have directSQL implementation for data connectors (Sub-task, Open, Unassigned)
          9. Throw error when respective connector JDBC jar is not present in the lib/ path. (Sub-task, Resolved, Sai Hemanth Gantasala; Time Spent: 0.5h)
          10. Add implementation for a 'hive' connector provider (Sub-task, Open, Narayanan Venkateswaran)
          11. [Evaluate] Dataconnector URL validation on create (Sub-task, Open, Naveen Gangam)
          12. [Evaluate] if ReplicationSpec is needed for DataConnectors. (Sub-task, Resolved, Naveen Gangam)
          13. Consider use of lambda expressions in formatters. (Sub-task, Open, Narayanan Venkateswaran)
          14. Reject location and managed locations in DDL for REMOTE databases. (Sub-task, Resolved, Dantong Dong; Time Spent: 1h 20m)
          15. Implement connector provider for MSSQL and Oracle (Sub-task, In Progress, Sai Hemanth Gantasala; Time Spent: 40m)
          16. Implement List<Table> getTables() for existing connectors. (Sub-task, Resolved, Dantong Dong; Time Spent: 1h 50m)
          17. Add hive authorization support for Data connectors. (Sub-task, Resolved, Dantong Dong; Time Spent: 1h)
          18. Drop/Alter table in REMOTE db should fail (Sub-task, Resolved, Dantong Dong; Time Spent: 40m)

              People

              • Assignee: Naveen Gangam (ngangam)
              • Reporter: Naveen Gangam (ngangam)
              • Votes: 0
              • Watchers: 11

                Dates

                • Created:
                  Updated:
                  Resolved:

                  Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 19h 10m