IMPALA-2920

Provide an HDFS pseudotable in Impala

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: Impala 2.2.4
    • Fix Version/s: None
    • Component/s: Frontend
    • Labels:

      Description

      Would it be possible to implement some sort of pseudo-table(s) in Impala that one could query to get HDFS information? For example, instead of running something like "hdfs dfs -ls *" from the command line (or going through an HDFS API) and then cutting, sorting, filtering, and dumping the output to a file to operate on, it would be nice to run something like "select filename from hdfs where path like '/tmp/some/path/%'". Similarly, you could then run queries that group files by directory, sum file sizes, and so on. There are many ways this could be useful, and all of it could be done easily through the Hue interface.
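To make the current workaround concrete, here is a minimal Python sketch of the "list, cut, filter, aggregate" step that the proposed pseudo-table would replace. It assumes the usual eight-column layout of "hdfs dfs -ls -R" output (permissions, replication, owner, group, size, date, time, path). The sample listing, the owner and group names, and the helper names parse_ls and size_by_directory are all hypothetical and exist only for illustration; nothing here is part of Impala.

```python
from collections import defaultdict
import posixpath

# Hypothetical sample of `hdfs dfs -ls -R` output; paths, owners, and sizes are made up.
SAMPLE_LS_OUTPUT = """\
drwxr-xr-x   - impala hadoop          0 2015-10-01 12:00 /tmp/some/path/a
-rw-r--r--   3 impala hadoop       1024 2015-10-01 12:01 /tmp/some/path/a/part-0
-rw-r--r--   3 impala hadoop       2048 2015-10-01 12:02 /tmp/some/path/a/part-1
-rw-r--r--   3 impala hadoop        512 2015-10-01 12:03 /tmp/some/path/b.txt
"""


def parse_ls(text):
    """Turn listing text into a list of row dicts (one per file or directory)."""
    rows = []
    for line in text.splitlines():
        # Split into at most 8 fields so a path containing spaces stays intact.
        fields = line.split(None, 7)
        if len(fields) != 8:
            continue  # skips blank lines and headers such as "Found 3 items"
        perms, _replication, owner, group, size, date, time, path = fields
        rows.append({
            "is_dir": perms.startswith("d"),
            "owner": owner,
            "group": group,
            "size": int(size),
            "modified": f"{date} {time}",
            "path": path,
        })
    return rows


def size_by_directory(rows):
    """Rough equivalent of: select dir, sum(size) ... group by dir (files only)."""
    totals = defaultdict(int)
    for row in rows:
        if not row["is_dir"]:
            totals[posixpath.dirname(row["path"])] += row["size"]
    return dict(totals)


if __name__ == "__main__":
    rows = parse_ls(SAMPLE_LS_OUTPUT)
    # Rough equivalent of: select filename from hdfs where path like '/tmp/some/path/%'
    for row in rows:
        if row["path"].startswith("/tmp/some/path/") and not row["is_dir"]:
            print(row["path"], row["size"])
    print(size_by_directory(rows))
```

In practice the listing text would come from running the hdfs command and capturing its output. With a pseudo-table, both the filter and the aggregation would collapse into a single SQL statement.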

            People

            • Assignee:
              Unassigned
              Reporter:
              jslagel_impala_9c97 Joe Slagel
            • Votes:
              1
              Watchers:
              2
