IMPALA-10393

Iceberg field id-based column resolution fails in ASAN builds

    Description

      For MAP types, field id-based resolution indexes the top-level columns with the current 'table_idx - 1' at:

      https://github.com/apache/impala/blob/4c0bdbada0bc0eeb0435e1ea647573566f0cddbd/be/src/exec/parquet/parquet-metadata-utils.cc#L769-L771

      In this case table_idx is either SchemaPathConstants::MAP_KEY or SchemaPathConstants::MAP_VALUE, which are 0 and 1 respectively. Hence 'table_idx - 1' can be -1, which is not a valid index for a vector, so we get an ASAN error. Even when 'table_idx - 1' is 0, the lookup reads a top-level column and we get a wrong field id.

      Note that at this point in the schema resolution we have already found a MAP type with a matching field id, so it is safe to resolve the child directly by the value of 'table_idx' (which is the position of the child: MAP_KEY or MAP_VALUE).
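The index arithmetic described above can be illustrated with a minimal C++ sketch. It is not the Impala code: `SchemaNode`, `IsValidIndex`, and `ResolveMapChild` are hypothetical names, and the MAP node is simplified to hold its key and value as direct children. Only the constant values 0 and 1 come from the description.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for Impala's SchemaPathConstants; the values 0 and 1
// are the ones stated in the issue description.
namespace SchemaPathConstants {
constexpr int MAP_KEY = 0;
constexpr int MAP_VALUE = 1;
}  // namespace SchemaPathConstants

// Hypothetical, simplified schema node: a field id plus child nodes.
// Assumption: a MAP node stores its key at child 0 and its value at child 1.
struct SchemaNode {
  int field_id;
  std::vector<SchemaNode> children;
};

// True if 'idx' can be used to index a vector with 'size' elements.
bool IsValidIndex(int idx, std::size_t size) {
  return idx >= 0 && static_cast<std::size_t>(idx) < size;
}

// Resolves the child of an already-matched MAP node directly by 'table_idx',
// without subtracting 1. Returns nullptr if the child is missing.
const SchemaNode* ResolveMapChild(const SchemaNode& map_node, int table_idx) {
  assert(table_idx == SchemaPathConstants::MAP_KEY ||
         table_idx == SchemaPathConstants::MAP_VALUE);
  if (!IsValidIndex(table_idx, map_node.children.size())) return nullptr;
  return &map_node.children[table_idx];
}
```

With a MAP node whose key and value children carry field ids 11 and 12, `ResolveMapChild(node, SchemaPathConstants::MAP_KEY)` yields the node with id 11, whereas `IsValidIndex(SchemaPathConstants::MAP_KEY - 1, 2)` is false, which corresponds to the out-of-bounds access that ASAN flags.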

          People

            boroknagyz Zoltán Borók-Nagy