  1. CarbonData
  2. CARBONDATA-3194

Support Hive Metastore in Presto CarbonData.

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Fix Version/s: 1.5.2

    Description

      The current Carbon Presto integration adds a separate Presto connector that 
      takes the carbon store folder and lists the databases and tables from its 
      subfolders. This implementation has several issues: 
      1. Databases and tables must follow a specific folder hierarchy, and the 
      folder names must always match the DB name and table name. 
      2. A table created in Presto is not directly visible to other execution 
      engines such as Spark. 
      3. Databases and tables created with a custom location do not work. 
      4. There is no access control on tables. 
      5. There is no interoperability between Carbon tables and Hive tables in 
      formats such as ORC or Parquet. For example, joining a Hive table with a 
      Carbon table is not possible. 

      To overcome these limitations, we can support the Hive Metastore in Presto 
      Carbon. Instead of creating a new Presto connector for Carbon, we can 
      extend the HiveConnector and add a new CarbonPageSourceFactory for reading 
      data and a FileWriterFactory for writing data. Carbon then becomes one of 
      the Hive-supported formats in Presto. Tables created in Spark are 
      reflected immediately in Presto Carbon, and the limitations listed above 
      are resolved by this type of implementation. 
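
      The idea of treating Carbon as one more format behind a shared metastore 
      can be sketched roughly as below. This is only an illustration of the 
      dispatch pattern: TableMeta, PageSourceFactory, forFormat, defaultFactories 
      and open are hypothetical stand-ins invented for this sketch, not the 
      actual Presto or CarbonData classes, and the "reader" is just a string. 

```java
import java.util.List;
import java.util.Optional;

// Illustrative only: every type here is a hypothetical stand-in,
// not the real Presto Hive connector or CarbonData API.
final class FormatDispatchSketch {

    // What a shared metastore would return for a table: its name and storage format.
    static final class TableMeta {
        final String database;
        final String table;
        final String format;

        TableMeta(String database, String table, String format) {
            this.database = database;
            this.table = table;
            this.format = format;
        }
    }

    // One factory per storage format; returns empty when the format is not its own.
    interface PageSourceFactory {
        Optional<String> createPageSource(TableMeta meta);
    }

    // Builds a factory that only accepts tables stored in the given format.
    static PageSourceFactory forFormat(String format, String readerName) {
        return meta -> format.equalsIgnoreCase(meta.format)
                ? Optional.of(readerName + " reader for " + meta.database + "." + meta.table)
                : Optional.empty();
    }

    // Existing formats plus one extra entry for Carbon (assumed format name "carbondata").
    static List<PageSourceFactory> defaultFactories() {
        return List.of(
                forFormat("orc", "orc"),
                forFormat("parquet", "parquet"),
                forFormat("carbondata", "carbon"));
    }

    // Asks each factory in turn; the first one that accepts the table wins.
    static String open(List<PageSourceFactory> factories, TableMeta meta) {
        for (PageSourceFactory factory : factories) {
            Optional<String> source = factory.createPageSource(meta);
            if (source.isPresent()) {
                return source.get();
            }
        }
        throw new IllegalArgumentException("No reader registered for format: " + meta.format);
    }

    public static void main(String[] args) {
        List<PageSourceFactory> factories = defaultFactories();
        System.out.println(open(factories, new TableMeta("sales", "orders", "carbondata")));
        System.out.println(open(factories, new TableMeta("sales", "customers", "orc")));
    }
}
```

      Because table lookup goes through the same metadata for every format, a 
      query can mix Carbon and ORC or Parquet tables, and only the reader 
      selected per table differs. 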

          People

            Assignee: Unassigned
            Reporter: Ravindra Pesala (ravi.pesala)
              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 9h 50m