Hive / HIVE-24396

[New Feature] Add data connector support for remote datasources

Details

    Description

      This feature adds support in Hive Metastore for configuring data connectors for remote datasources and mapping their databases. Hive currently supports remote tables via StorageHandlers such as JDBCStorageHandler and HBaseStorageHandler.

      Data connectors are a natural extension of this: they map an entire database or catalog instead of individual tables. The tables within are mapped automatically at runtime. The metadata for these tables is not persisted in Hive; it is always mapped and built at runtime.

      With this feature, we introduce the concept of a type for databases in Hive: NATIVE or REMOTE. All existing databases are NATIVE. To create a REMOTE database, use the following syntax:
      CREATE REMOTE DATABASE remote_db USING <dataconnector> WITH DCPROPERTIES (....);
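
      As a rough sketch of how the proposed syntax might be used end to end: the connector name, JDBC URL, credentials, and remote database name below are hypothetical, and both the CREATE CONNECTOR statement and the property keys are assumptions that this description does not spell out. The keyword on the database statement follows the proposal above; the syntax that was eventually released may differ.

      ```sql
      -- Hypothetical connector definition (name, URL, and credentials are placeholders).
      -- CREATE CONNECTOR and its property keys are assumptions, not taken from this description.
      CREATE CONNECTOR mysql_conn
      TYPE 'mysql'
      URL 'jdbc:mysql://dbhost.example.com:3306'
      WITH DCPROPERTIES (
        'hive.sql.dbcp.username' = 'hive_user',
        'hive.sql.dbcp.password' = 'changeme'
      );

      -- Map a remote database through the connector, following the syntax proposed above.
      -- 'connector.remoteDbName' is an assumed key naming the database on the remote side.
      CREATE REMOTE DATABASE remote_db USING mysql_conn
      WITH DCPROPERTIES ('connector.remoteDbName' = 'sales');

      -- Tables in a REMOTE database are not persisted in Hive; they are resolved at runtime.
      USE remote_db;
      SHOW TABLES;
      ```

      Keeping the connection details on the connector and only the remote database name on the REMOTE database would let several databases share one connector definition.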

      A design doc will be attached to this JIRA.

      Attachments

        Issue Links

          1. Create table in REMOTE db should fail | Sub-task | Closed | Dantong Dong | Time Spent: 1h
          2. Move create/drop/alter table to the provider interface | Sub-task | Closed | Naveen Gangam
          3. Support case-sensitivity for tables in REMOTE database. | Sub-task | In Progress | Sai Hemanth Gantasala | Time Spent: 40m
          4. Implement connector provider for Derby DB | Sub-task | Closed | Naveen Gangam
          5. Add schema changes for MSSQL | Sub-task | Closed | Sai Hemanth Gantasala | Time Spent: 10m
          6. Add a generic JDBC implementation that can be used to other JDBC DBs | Sub-task | Open | Unassigned
          7. Provide CachedStore implementation for dataconnectors | Sub-task | Open | Unassigned
          8. Evaluate the need to have directSQL implementation for data connectors | Sub-task | Open | Unassigned
          9. Throw error when respective connector JDBC jar is not present in the lib/ path. | Sub-task | Closed | Sai Hemanth Gantasala | Time Spent: 0.5h
          10. Add implementation for a 'hive' connector provider | Sub-task | Open | Narayanan Venkateswaran
          11. [Evaluate] Dataconnector URL validation on create | Sub-task | Open | Naveen Gangam
          12. [Evaluate] if ReplicationSpec is needed for DataConnectors. | Sub-task | Closed | Naveen Gangam
          13. Consider use of lambda expressions in formatters. | Sub-task | Open | Narayanan Venkateswaran
          14. Reject location and managed locations in DDL for REMOTE databases. | Sub-task | Closed | Dantong Dong | Time Spent: 1h 20m
          15. Implement connector provider for MSSQL and Oracle | Sub-task | Closed | Sai Hemanth Gantasala | Time Spent: 2h
          16. Implement List<Table> getTables() for existing connectors. | Sub-task | Closed | Dantong Dong | Time Spent: 1h 50m
          17. Add hive authorization support for Data connectors. | Sub-task | Closed | Dantong Dong | Time Spent: 1h 10m
          18. Drop/Alter table in REMOTE db should fail | Sub-task | Closed | Dantong Dong | Time Spent: 40m
          19. Implement Connector Provider for Amazon Redshift | Sub-task | In Progress | Narayanan Venkateswaran | Time Spent: 1h 20m
          20. Detect timed out connections for providers and auto-reconnect | Sub-task | Closed | Butao Zhang | Time Spent: 7h 10m
          21. MySQL's bit datatype is default to void datatype in hive | Sub-task | Open | Butao Zhang | Time Spent: 5h
          22. Schema upgrade for MSSQL fails when adding TYPE column in DBS table | Sub-task | Closed | Ayush Saxena | Time Spent: 1h
          23. Drop data connector with argument ifNotExists(true) should not throw NoSuchObjectException | Sub-task | Closed | Butao Zhang | Time Spent: 40m
          24. HMSHandler get_all_tables method can not retrieve tables from remote database | Sub-task | Closed | Butao Zhang | Time Spent: 40m
          25. Improve data connector cache | Sub-task | Closed | Butao Zhang
          26. Implement JDBC Connector for HiveServer | Sub-task | Closed | Naveen Gangam
          27. Support VIEWs in the metadata federation | Sub-task | Open | Naveen Gangam
          28. [Postgres] Use schema names instead of db names | Sub-task | Open | Unassigned
          29. [Athena] Add connector for Amazon Athena | Sub-task | Open | Unassigned

            People

              Assignee: Naveen Gangam (ngangam)
              Reporter: Naveen Gangam (ngangam)
              Votes: 0
              Watchers: 11

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 36h 40m