IMPALA-12809

Iceberg metadata table scanner should always be scheduled to the coordinator


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Impala 4.4.0
    • None
    • Component/s: Backend

    Description

      On larger clusters the Iceberg metadata scanner can be scheduled to executors, for example during a join. In this case the fragment fails a precondition check, because either the frontend_ object or the table will not be present. Setting exec_at_coord to true is not enough; these fragments should be scheduled to the coord_only_executor_group.

      Additionally, setting NUM_NODES=1 should be a viable workaround.

      Reproducible with the following local dev Impala cluster:
      ./bin/start-impala-cluster.py --cluster_size=3 --num_coordinators=1 --use_exclusive_coordinators
      and query:
      select count(b.parent_id) from functional_parquet.iceberg_query_metadata.history a
      join functional_parquet.iceberg_query_metadata.history b on a.snapshot_id = b.snapshot_id;
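
The NUM_NODES=1 workaround mentioned above could be tried against the same reproduction query roughly as follows. This is a minimal sketch, not a verified fix: it assumes impala-shell is on the PATH and that its default host and port reach the coordinator of the local dev cluster, and the QUERY variable name is only illustrative.

```shell
# Sketch of the suggested workaround: set NUM_NODES=1 before the failing join.
# QUERY is a hypothetical variable name used only for this example.
QUERY='set num_nodes=1;
select count(b.parent_id) from functional_parquet.iceberg_query_metadata.history a
join functional_parquet.iceberg_query_metadata.history b on a.snapshot_id = b.snapshot_id;'

# Run the statements if impala-shell is available; otherwise just print them.
if command -v impala-shell >/dev/null 2>&1; then
  impala-shell -q "$QUERY"
else
  printf '%s\n' "$QUERY"
fi
```

Because NUM_NODES is set only for that impala-shell session, other sessions keep their normal distributed planning.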

          People

            daniel.becker Daniel Becker
            tmate Tamas Mate