IMPALA-801

Add function or virtual column for file name


Details

    Description

      Hive can list the data files in a table through its INPUT__FILE__NAME virtual column. For example, the following query lists all the data files for the table or partition, with the number of rows in each:

      select INPUT__FILE__NAME, count(*) from <table_name> where dt='20140210' group by INPUT__FILE__NAME;
      

      This has two advantages over the existing "show files" functionality:

      • The output can be used in arbitrary SQL statements.
      • You can see which record came from which file.
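
      As a rough illustration of both points, here is a sketch in Hive syntax. It reuses the query above as a subquery and then looks up the file behind individual records. The id column and the 1000-row threshold are hypothetical, chosen only for the example; <table_name> is the same placeholder as above.

      ```sql
      -- First advantage: the file listing is ordinary query output,
      -- so it can be filtered or joined like any other result.
      -- Here: files in the partition holding fewer than 1000 rows (arbitrary threshold).
      SELECT f.file_name, f.row_count
      FROM (
        SELECT INPUT__FILE__NAME AS file_name, count(*) AS row_count
        FROM <table_name>
        WHERE dt = '20140210'
        GROUP BY INPUT__FILE__NAME
      ) f
      WHERE f.row_count < 1000;

      -- Second advantage: each row carries the name of the file it came from.
      -- The id column is assumed to exist only for this example.
      SELECT id, INPUT__FILE__NAME
      FROM <table_name>
      WHERE dt = '20140210' AND id = 42;
      ```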

            People

              boroknagyz Zoltán Borók-Nagy
              udai Udai Kiran Potluri
              Votes: 9
              Watchers: 11
