IMPALA-7854

Slow ALTER TABLE and LOAD DATA statements for tables with a large number of partitions


Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: Impala 2.12.0
    • Fix Version/s: None
    • Component/s: Catalog
    • Environment: 14 Nodes
      Table in question has 20 columns, 3 partition columns, and 57,475 partitions

    Description

      ALTER TABLE and LOAD DATA statements take minutes (9 minutes for ALTER TABLE and 6 minutes for LOAD DATA) for tables with a large number of partitions.

      Our workaround was to run the LOAD DATA in Hive and then refresh the affected partition in Impala with a REFRESH statement that has a PARTITION clause.
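      The workaround can be sketched as the pair of statements below: one to run in Hive, one to run in Impala afterwards. This is only an illustration, not taken from the reporter's environment. The helper names, the table name, the staging path, and the partition columns are all hypothetical, and the statement shapes assume the documented Hive LOAD DATA ... PARTITION and Impala REFRESH ... PARTITION syntax.

```python
def partition_spec(partition):
    """Render {'year': 2018, 'region': 'us'} as: year=2018, region='us'.

    Sketch only: string values are quoted but NOT escaped, so this is
    not safe for untrusted input.
    """
    parts = []
    for col, val in partition.items():
        rendered = f"'{val}'" if isinstance(val, str) else str(val)
        parts.append(f"{col}={rendered}")
    return ", ".join(parts)


def workaround_statements(table, hdfs_path, partition):
    """Return (hive_sql, impala_sql) for the workaround.

    hive_sql   -- run in Hive to move the staged files into the partition
    impala_sql -- run in Impala afterwards so it reloads only that partition
    """
    spec = partition_spec(partition)
    hive_sql = (
        f"LOAD DATA INPATH '{hdfs_path}' "
        f"INTO TABLE {table} PARTITION ({spec})"
    )
    impala_sql = f"REFRESH {table} PARTITION ({spec})"
    return hive_sql, impala_sql


if __name__ == "__main__":
    # Hypothetical table with three partition columns, as in the report.
    hive_sql, impala_sql = workaround_statements(
        "sales.events",
        "/staging/events/2018-11-14",
        {"year": 2018, "month": 11, "region": "us"},
    )
    print(hive_sql)
    print(impala_sql)
```

      Both statements name the same partition, so the Impala refresh is limited to the partition that Hive just loaded rather than the whole table.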

      • 14 Nodes
      • Table in question has 20 columns, 3 partition columns, and 57,475 partitions

          People

            Assignee: Vihang Karajgaonkar (vihangk1)
            Reporter: vietn
