Hive / HIVE-951

Selectively include EXTERNAL TABLE source files via REGEX

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Query Processor
    • Labels: None
    • Tags: external table regex

      Description

      CREATE EXTERNAL TABLE should allow users to cherry-pick files via regular expression.
      CREATE EXTERNAL TABLE was designed to allow users to access data that exists outside of Hive, and
      it currently assumes that all of the files located under the supplied path should be included
      in the new table. Users frequently encounter directories containing multiple
      datasets, or directories that contain data in heterogeneous schemas, and it's often
      impractical or impossible to adjust the layout of the directory to meet the requirements of
      CREATE EXTERNAL TABLE. A good example of this problem is creating an external table based
      on the contents of an S3 bucket.

      One way to solve this problem is to extend the syntax of CREATE EXTERNAL TABLE
      as follows:

      CREATE EXTERNAL TABLE
      ...
      LOCATION path [file_regex]
      ...

      For example:

      CREATE EXTERNAL TABLE mytable1 ( a string, b string, c string )
      STORED AS TEXTFILE
      LOCATION 's3://my.bucket/' 'folder/2009.*\.bz2$';
      

      Creates mytable1, which includes all files in s3://my.bucket/ whose names match the regular expression above (in glob terms, 'folder/2009*.bz2').
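
The selection behavior described for mytable1 can be sketched in Python. This is only an illustration of the proposed semantics, not Hive code: the helper name select_files, the sample listing, and the choice to search (rather than full-match) the regex against each path relative to the table location are all assumptions made for this example.

```python
import re


def select_files(location, keys, file_regex):
    """Return full paths of the keys under `location` that match `file_regex`.

    Assumption: the regex is searched against the path relative to the
    table location, so anchors such as `$` must be written explicitly.
    """
    pattern = re.compile(file_regex)
    base = location if location.endswith("/") else location + "/"
    return [base + key for key in keys if pattern.search(key)]


# Hypothetical bucket listing, invented for illustration.
listing = [
    "folder/2009-01-01.bz2",
    "folder/2009-01-01.bz2.md5",  # rejected: does not end in .bz2
    "folder/2008-12-31.bz2",      # rejected: wrong year
    "other/2009-01-01.bz2",       # rejected: wrong folder
]

selected = select_files("s3://my.bucket/", listing, r"folder/2009.*\.bz2$")
print(selected)
```

Under these assumptions only folder/2009-01-01.bz2 is selected; the trailing `$` is what keeps the .md5 sidecar file out of the table.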

      CREATE EXTERNAL TABLE mytable2 ( d string, e int, f int, g int )
      STORED AS TEXTFILE
      LOCATION 'hdfs://data/' 'xyz.*2009.{4}\.bz2$';
      

      Creates mytable2 including all files located under hdfs://data/ whose names match the regular expression above (in glob terms, 'xyz*2009????.bz2').
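
The descriptions in this issue use shell-style globs, while the proposed file_regex argument is a regular expression, and the two notations are easy to confuse (`?` and `*` mean different things in each). The sketch below uses Python's standard fnmatch.translate to turn the glob from the mytable2 description into a regex and compares it with a hand-written equivalent. The helper name glob_matches, the sample file names, and the hand-written pattern are assumptions for illustration only.

```python
import fnmatch
import re


def glob_matches(glob, name):
    """Check `name` against a shell-style glob by translating it to a regex."""
    return re.match(fnmatch.translate(glob), name) is not None


glob = "xyz*2009????.bz2"
# Hand-written regex intended to accept the same names as the glob:
# glob `*` becomes `.*`, each glob `?` becomes `.`, and the literal dot is escaped.
regex = re.compile(r"^xyz.*2009.{4}\.bz2$")

# Hypothetical file names, invented for illustration.
names = ["xyz_20091231.bz2", "xyz_2009123.bz2", "abc_20091231.bz2"]

for name in names:
    print(name, glob_matches(glob, name), regex.search(name) is not None)
```

Both forms accept xyz_20091231.bz2 and reject the other two names, since xyz_2009123.bz2 has only three characters after 2009 and abc_20091231.bz2 lacks the xyz prefix.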

      1. HIVE-951.patch
        94 kB
        Carl Steinbach

            People

            • Assignee: Unassigned
            • Reporter: Carl Steinbach
            • Votes: 21
            • Watchers: 29
