Hive / HIVE-6589

Automatically add partitions for external tables

    Details

    • Type: New Feature
    • Status: Reopened
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.14.0
    • Fix Version/s: None
    • Component/s: None
    • Labels: None
    • Release Note:
      This issue was not resolved under this ticket; it still needs to be worked on.

      Description

      I have a data stream being loaded into Hadoop via Flume. It loads into a date partition folder in HDFS. The path looks like this:

      /flume/my_data/YYYY/MM/DD/HH
      /flume/my_data/2014/03/02/01
      /flume/my_data/2014/03/02/02
      /flume/my_data/2014/03/02/03

      On top of it I create an EXTERNAL Hive table for querying. As of now, I have to add partitions manually. What I want is for Hive to "discover" those partitions automatically for EXTERNAL tables. Additionally, I would like to specify a partition pattern so that when I query, Hive knows to use the pattern to find the corresponding HDFS folder.
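
      For context, the manual step this feature would eliminate looks like this today, one statement per hour of data (paths taken from the stream above; the statement itself is standard HiveQL):

      ALTER TABLE my_data ADD IF NOT EXISTS
        PARTITION (dt='2014-03-02', hour='01')
        LOCATION '/flume/my_data/2014/03/02/01';

      Note that MSCK REPAIR TABLE is no help here, since it only discovers directories laid out in Hive's key=value style (e.g. dt=2014-03-02/hour=01), not arbitrary layouts like the one Flume writes.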

      So something like this:

      CREATE EXTERNAL TABLE my_data (
        col1 STRING,
        col2 INT
      )
      PARTITIONED BY (
        dt STRING,
        hour STRING
      )
      LOCATION
        '/flume/my_data'
      TBLPROPERTIES (
        'hive.partition.spec' = 'dt=$Y-$M-$D, hour=$H',
        'hive.partition.spec.location' = '$Y/$M/$D/$H'
      );
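
      To make the proposal concrete, here is a small sketch of how the two proposed table properties could map an HDFS path to a partition spec. The property names and the $Y/$M/$D/$H placeholders come from this ticket; the placeholder widths and everything else are assumptions for illustration only:

      ```python
      import re

      LOCATION = '/flume/my_data'
      PARTITION_SPEC = 'dt=$Y-$M-$D, hour=$H'   # proposed 'hive.partition.spec'
      SPEC_LOCATION = '$Y/$M/$D/$H'             # proposed 'hive.partition.spec.location'

      # Assumed widths: 4-digit year, 2-digit month/day/hour.
      PLACEHOLDERS = {'$Y': r'(?P<Y>\d{4})', '$M': r'(?P<M>\d{2})',
                      '$D': r'(?P<D>\d{2})', '$H': r'(?P<H>\d{2})'}

      def path_to_partition(path):
          """Return e.g. {'dt': '2014-03-02', 'hour': '01'}, or None if no match."""
          # Turn the location pattern into a regex with named groups.
          pattern = re.escape(SPEC_LOCATION)
          for ph, rx in PLACEHOLDERS.items():
              pattern = pattern.replace(re.escape(ph), rx)
          m = re.fullmatch(re.escape(LOCATION) + '/' + pattern, path)
          if m is None:
              return None
          # Substitute the captured values into each 'col=template' entry.
          spec = {}
          for part in PARTITION_SPEC.split(','):
              col, template = part.strip().split('=', 1)
              for ph in PLACEHOLDERS:
                  template = template.replace(ph, m.group(ph[1:]))
              spec[col] = template
          return spec

      print(path_to_partition('/flume/my_data/2014/03/02/01'))
      # {'dt': '2014-03-02', 'hour': '01'}
      ```

      With a mapping like this, a background discovery task could walk new directories under LOCATION and register each matching path as a partition automatically.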
      

            People

            • Assignee: Unassigned
            • Reporter: Ken Dallmeyer (dallmkp)
            • Votes: 59
            • Watchers: 51
