Falcon / FALCON-1096

Scan Hive Metastore to automatically create Falcon feeds for existing Hive tables

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      In my organisation, we create a Hive table for each production dataset in HDFS. Creating a Hive table means supplying a lot of information about the dataset: its name, its fields with their types and comments, its location, its data format, properties in the form of key-value pairs, and a meaningful description. We think of Hive as a central, well-documented repository of our datasets.

      When using Falcon, we have to create a Falcon feed for each dataset (each of which corresponds to a Hive table) and specify again several properties that Hive already holds (e.g. the description).

      To simplify this, Falcon could scan the Hive Metastore, automatically create a feed for each Hive table, and inherit the table's properties.

      The properties of Hive tables could also be used when searching for a dataset in the new Falcon Web UI, e.g. field name, field comment, or file format. Other statistics, such as total file size or the last modification or access time, could be used as well.
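To illustrate the idea, here is a minimal sketch of how table metadata could be carried over into a feed description. It assumes the metadata has already been read from the Hive Metastore. The `HiveTable` class, the `feed_from_table` function, the `database-table` naming scheme, and the dictionary layout are all hypothetical and are not part of Falcon or Hive.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class HiveTable:
    """Hypothetical snapshot of what the Hive Metastore holds for one table."""
    database: str
    name: str
    location: str
    input_format: str
    comment: str = ""
    # Each column is a (name, type, comment) triple.
    columns: List[Tuple[str, str, str]] = field(default_factory=list)
    parameters: Dict[str, str] = field(default_factory=dict)


def feed_from_table(table: HiveTable) -> Dict[str, object]:
    """Derive a feed description from table metadata instead of retyping it."""
    return {
        # Assumed naming scheme: "<database>-<table>".
        "name": f"{table.database}-{table.name}",
        "description": table.comment,
        "location": table.location,
        "format": table.input_format,
        "fields": [
            {"name": n, "type": t, "comment": c} for n, t, c in table.columns
        ],
        # Copy so later edits to the feed do not touch the table snapshot.
        "properties": dict(table.parameters),
    }


if __name__ == "__main__":
    clicks = HiveTable(
        database="prod",
        name="clicks",
        location="hdfs:///data/prod/clicks",
        input_format="org.apache.hadoop.mapred.TextInputFormat",
        comment="Raw click events",
        columns=[("user_id", "bigint", "Anonymised user id")],
        parameters={"owner": "analytics"},
    )
    print(feed_from_table(clicks))
```

The resulting dictionary would still have to be turned into a real feed entity and submitted to Falcon; that step is left out here.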

        Activity

        pallavi.rao Pallavi Rao added a comment -

        +1
        Falcon can have a utility script to do that, with a whitelist/blacklist of Hive tables. The searchable fields can be specified as tags in the feed.
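A rough sketch of what such a utility could do is given below. The function names are hypothetical, the shell-style wildcard patterns are just one possible way to express a whitelist/blacklist, and the comma-separated `key=value` rendering of tags is an assumption about the tag format.

```python
from fnmatch import fnmatchcase
from typing import Dict, Iterable, List, Sequence


def select_tables(
    names: Iterable[str],
    whitelist: Sequence[str],
    blacklist: Sequence[str] = (),
) -> List[str]:
    """Keep names that match a whitelist pattern and no blacklist pattern."""
    return [
        n
        for n in names
        if any(fnmatchcase(n, p) for p in whitelist)
        and not any(fnmatchcase(n, p) for p in blacklist)
    ]


def tags_from_properties(props: Dict[str, str], searchable: Sequence[str]) -> str:
    """Render the chosen table properties as comma-separated key=value pairs."""
    return ",".join(f"{k}={props[k]}" for k in searchable if k in props)


if __name__ == "__main__":
    tables = ["prod.clicks", "prod.tmp_clicks", "dev.clicks"]
    print(select_tables(tables, whitelist=["prod.*"], blacklist=["*.tmp_*"]))
    print(tags_from_properties({"format": "orc", "owner": "analytics"}, ["owner", "format"]))
```

The blacklist takes precedence over the whitelist here, so a temporary table is skipped even when its database is whitelisted.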

        suhas.ysr Suhas Vasu added a comment -

        As per FALCON-703, we are introducing a plugin which listens to JMS messages and registers the partitions on the HCatalog table.

        I would prefer a separate plugin for this purpose, so that it can be enabled only by users who really need the feature.


          People

          • Assignee:
            Unassigned
            Reporter:
            kawaa Adam Kawa
          • Votes:
            0
            Watchers:
            3

            Dates

            • Created:
              Updated:
