IMPALA / IMPALA-2920

Provide an HDFS pseudotable in Impala

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Minor
    • Resolution: Won't Fix
    • Affects Version/s: Impala 2.2.4
    • Fix Version/s: None
    • Component/s: Backend, Catalog, Frontend

    Description

      Would it be possible to implement one or more pseudo tables in Impala that could be queried for HDFS information? For example, instead of running something like "hdfs dfs -ls *" from the command line (or going through an HDFS API) and then cutting, sorting, filtering, and dumping the output to a file to operate on, it would be nice to run something like: select filename from hdfs where path like '/tmp/some/path/%'. Similar queries could group files by directory, report file sizes, and so on. There are many potential uses, and all of it could be done easily through the Hue interface.
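
      To make the request concrete, here is a minimal sketch of the kind of query being asked for, emulated outside Impala. It is not an Impala feature (this issue was resolved as Won't Fix). It assumes listing text in the shape printed by "hdfs dfs -ls -R" and loads it into an in-memory SQLite table. The table name hdfs, its column names, the helper functions, and the sample paths are all hypothetical and chosen only for illustration.

```python
import sqlite3

# Sample text in the shape printed by `hdfs dfs -ls -R`; the paths are made up.
SAMPLE_LISTING = """\
Found 4 items
drwxr-xr-x   - joe hadoop          0 2015-11-02 10:15 /tmp/some/path/a
-rw-r--r--   3 joe hadoop       1024 2015-11-02 10:16 /tmp/some/path/a/part-0
-rw-r--r--   3 joe hadoop       2048 2015-11-02 10:17 /tmp/some/path/a/part-1
-rw-r--r--   3 joe hadoop        512 2015-11-02 10:18 /tmp/other/data.csv
"""


def parse_listing(text):
    """Turn listing lines into tuples; lines without 8 fields are skipped."""
    rows = []
    for line in text.splitlines():
        # maxsplit=7 keeps a path containing spaces in one piece.
        fields = line.split(None, 7)
        if len(fields) != 8:
            continue  # e.g. the "Found N items" header
        perms, _repl, owner, group, size, date, time, path = fields
        directory, _, filename = path.rpartition("/")
        rows.append((path, directory or "/", filename,
                     int(perms.startswith("d")), int(size),
                     owner, group, f"{date} {time}"))
    return rows


def build_table(rows):
    """Load parsed rows into an in-memory table named `hdfs` (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE hdfs (path TEXT, directory TEXT, filename TEXT, "
        "is_dir INTEGER, size INTEGER, owner TEXT, grp TEXT, modified TEXT)"
    )
    conn.executemany("INSERT INTO hdfs VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows)
    return conn


if __name__ == "__main__":
    conn = build_table(parse_listing(SAMPLE_LISTING))

    # The query from the description.
    for (name,) in conn.execute(
        "SELECT filename FROM hdfs WHERE path LIKE '/tmp/some/path/%' ORDER BY path"
    ):
        print(name)

    # Total file size per directory.
    for directory, total in conn.execute(
        "SELECT directory, SUM(size) FROM hdfs WHERE is_dir = 0 "
        "GROUP BY directory ORDER BY directory"
    ):
        print(directory, total)
```

      To run this against a real cluster, the sample text could be replaced with the captured output of "hdfs dfs -ls -R <path>". The result is a snapshot taken at listing time, not the live view a real pseudo table would provide.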

          People

            Assignee: Unassigned
            Reporter: Joe Slagel (jslagel_impala_9c97)
            Votes: 1
            Watchers: 3
