IMPALA-2807

During an INSERT, Impala creates too many files for a table smaller than the block size

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: Impala 2.3.0
    • Fix Version/s: None
    • Component/s: Perf Investigation
    • Labels: None

      Description

      When loading the "customer" table from a TPC-DS-based schema, the insert creates 20 files, which equals the number of Impala nodes in the cluster.
      The total size of this table is 204.2 MiB, which would fit in a single block, yet it occupies 20 blocks in this case.
      When the same insert command was run with a single impalad in the cluster, one block held all the table data and only one HDFS file was created.
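
      The arithmetic behind the report can be sketched as follows. This is a simplified model, not Impala's actual writer logic: it assumes every node that participates in the insert writes exactly one file, and it assumes a 256 MiB block size (the report only states that 204.2 MiB fits in a single block). The function names are hypothetical.

```python
import math

MIB = 1024 * 1024


def min_blocks(table_bytes: float, block_bytes: int) -> int:
    """Fewest blocks needed if the data were written as a single file."""
    return max(1, math.ceil(table_bytes / block_bytes))


def files_per_insert(writer_nodes: int) -> int:
    """Simplified model (assumption): each writing node produces one file."""
    return writer_nodes


table_bytes = 204.2 * MIB  # table size from the report
block_bytes = 256 * MIB    # ASSUMPTION: the report does not state the block size
nodes = 20                 # Impala nodes in the reported cluster

files = files_per_insert(nodes)
avg_file_mib = table_bytes / files / MIB

print("blocks needed in the ideal case:", min_blocks(table_bytes, block_bytes))
print("files written by", nodes, "nodes:", files)
print("average file size (MiB):", round(avg_file_mib, 2))
```

      Under these assumptions each of the 20 files holds only about a twentieth of the data, so every file occupies its own mostly empty block, whereas a single writer would fill one block.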

            People

            • Assignee: Unassigned
            • Reporter: Dileep Kumar (dkumar@cloudera.com)
            • Votes: 0
            • Watchers: 3

              Dates

              • Created:
                Updated:
                Resolved: