IMPALA-10317

Add query option that limits join #rows at runtime

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: Impala 4.0.0
    • Component/s: Backend
    • Labels: None

    Description

      Reject a query at runtime when one of its join operators produces too many rows.
      This is a mechanism to protect the cluster from potentially harmful queries.

      When the joined tables have very high cardinality and the join conditions are poorly selective, a join can produce an enormous number of rows, sometimes tens of billions, which degrades the cluster and affects other running queries.

      In our environment, we added the NUM_JOIN_ROWS_PRODUCED_LIMIT query option to limit the number of rows produced by a single join operator.
      The implementation is modeled on IMPALA-6034 and uses the query summary (see the attached figure): it checks the #Rows value of each join operator.
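
The check described above can be sketched as follows. This is a minimal illustration and not the Impala backend code: `OperatorStats`, `find_join_over_limit`, the sample operator names, and the convention that a limit of 0 means "no limit" are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class OperatorStats:
    """One simplified row of a query summary (hypothetical shape)."""
    name: str  # operator label, e.g. "02:HASH JOIN"
    rows: int  # rows produced so far (the "#Rows" column)


def find_join_over_limit(
    operators: Iterable[OperatorStats], limit: int
) -> Optional[OperatorStats]:
    """Return the first join operator whose row count exceeds `limit`.

    Assumption for this sketch: a limit of 0 (or less) disables the check.
    Non-join operators are ignored regardless of how many rows they produce.
    """
    if limit <= 0:
        return None
    for op in operators:
        if "JOIN" in op.name.upper() and op.rows > limit:
            return op
    return None


# Made-up summary: only the join exceeds the one-billion-row limit.
summary = [
    OperatorStats("00:SCAN HDFS", 5_000_000_000),
    OperatorStats("02:HASH JOIN", 12_000_000_000),
    OperatorStats("03:AGGREGATE", 10),
]

offender = find_join_over_limit(summary, limit=1_000_000_000)
if offender is not None:
    print(f"reject query: {offender.name} produced {offender.rows} rows")
```

The scan in the sample also exceeds the limit but is not flagged, because the limit applies per join operator rather than to the query as a whole.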

      Attachments

        1. query82_summary.png
          59 kB
          Fucun Chu

          People

            Assignee: Fucun Chu (chufucun)
            Reporter: Fucun Chu (chufucun)
            Votes: 0
            Watchers: 4
