IMPALA-10135

Insert events don't contain the inserted data files

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate

    Description

      When Impala generates INSERT events, it doesn't include the newly inserted data files.

      The problem is that Impala misuses Sets.difference(set1, set2). From the API doc at https://guava.dev/releases/28.2-jre/api/docs/com/google/common/collect/Sets.html#difference-java.util.Set-java.util.Set-

      "The returned set contains all elements that are contained by set1 and not contained by set2. set2 may also contain elements not present in set1; these are simply ignored."

      So the name "difference" is a bit misleading: it is really a subtraction of set2 from set1, not a symmetric difference.

      Unfortunately, Impala passes the parameters in the wrong order: Sets.difference(beforeInsert, afterInsert):

      https://github.com/apache/impala/blob/4cb3c3556e77ee24003383155ca5e1b70be4db6e/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java#L4581

      A plain INSERT only adds files, so every file in beforeInsert is also in afterInsert, and the result will always be empty.
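
      The effect of the argument order can be sketched in a few lines. This is an illustration only, not the code in CatalogOpExecutor: the helper difference() is a hypothetical stand-in that mirrors the documented semantics of Guava's Sets.difference using only java.util (it returns a copy rather than a view), and the file names are made up.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class InsertDiffSketch {
    // Hypothetical stand-in for Guava's Sets.difference(set1, set2):
    // returns the elements of set1 that are not in set2.
    // Unlike Guava, this returns a copy rather than an unmodifiable view.
    static Set<String> difference(Set<String> set1, Set<String> set2) {
        Set<String> result = new HashSet<>(set1);
        result.removeAll(set2);
        return result;
    }

    public static void main(String[] args) {
        // Made-up file names, for illustration only.
        Set<String> beforeInsert = new HashSet<>(Arrays.asList("f1.parq", "f2.parq"));
        Set<String> afterInsert =
            new HashSet<>(Arrays.asList("f1.parq", "f2.parq", "f3.parq"));

        // Argument order described in this report: nothing is left over.
        System.out.println(difference(beforeInsert, afterInsert)); // prints []

        // Swapped order: only the newly added file remains.
        System.out.println(difference(afterInsert, beforeInsert)); // prints [f3.parq]
    }
}
```

      With the arguments swapped, the result is the set of files that exist after the insert but not before it, which is presumably what the INSERT event is meant to carry.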

      There is a separate problem with INSERT OVERWRITE: it doesn't send any INSERT events.

            People

              vihangk1 Vihang Karajgaonkar
              boroknagyz Zoltán Borók-Nagy
              Votes: 0
              Watchers: 2
