Hive / HIVE-6589

Automatically add partitions for external tables


Details

    • Type: New Feature
    • Status: Reopened
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.14.0
    • Fix Version/s: None
    • Component/s: None
    • Labels: None
    • This issue is not resolved under this ticket. Needs to be worked on.

    Description

      I have a data stream being loaded into Hadoop via Flume. It loads into a date partition folder in HDFS. The path looks like this:

      /flume/my_data/YYYY/MM/DD/HH
      /flume/my_data/2014/03/02/01
      /flume/my_data/2014/03/02/02
      /flume/my_data/2014/03/02/03

      On top of it I create an EXTERNAL Hive table for querying. As of now, I have to add partitions manually. What I want is for Hive to "discover" those partitions for EXTERNAL tables. Additionally, I would like to specify a partition pattern so that when I query, Hive will know to use it to find the HDFS folder.
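
      Today, each new hour requires a manual statement along these lines (table name and paths follow the example layout above):

      ```sql
      -- Current workaround: register each hour's directory by hand.
      ALTER TABLE my_data ADD PARTITION (dt='2014-03-02', hour='01')
        LOCATION '/flume/my_data/2014/03/02/01';
      ALTER TABLE my_data ADD PARTITION (dt='2014-03-02', hour='02')
        LOCATION '/flume/my_data/2014/03/02/02';
      ```

      Note that MSCK REPAIR TABLE does not help here: it only discovers directories named in the key=value style (e.g. dt=2014-03-02/hour=01), not date-path layouts like the one Flume writes.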

      So something like this:

      CREATE EXTERNAL TABLE my_data (
        col1 STRING,
        col2 INT
      )
      PARTITIONED BY (
        dt STRING,
        hour STRING
      )
      LOCATION
        '/flume/my_data'
      TBLPROPERTIES (
        'hive.partition.spec' = 'dt=$Y-$M-$D, hour=$H',
        'hive.partition.spec.location' = '$Y/$M/$D/$H'
      );
      

          People

            Assignee: Unassigned
            Reporter: Ken Dallmeyer (dallmkp)
            Votes: 42
            Watchers: 36
