HIVE-5817: column name to index mapping in VectorizationContext is broken


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.13.0
    • Fix Version/s: 0.13.0
    • Component/s: Vectorization
    • Labels: None

    Description

      Columns coming from different operators may have the same internal names ("_colNN"). There is a query of the form select b.cb, a.ca from a JOIN b ON ... JOIN x ON ...; (distilled from a more complex query) that runs fine without vectorization. With vectorization, it runs fine for most choices of ca, but for some it fails (or could return incorrect results). The cause is in how VectorizationContext builds its column-to-VRG-index map: the internal column name that the first map join operator adds to the map for ca may be the same as the internal name that the second map join operator tries to add for cb. The second VMJ does not add its entry (see the code in the constructor), so when it is time to produce output it retrieves the wrong index from the map by name, and then the wrong vector from the VRG.
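
      The collision described above can be illustrated with a minimal sketch. This is not Hive's actual code: the class ColumnMapCollisionSketch, its methods, and the indices 3 and 7 are hypothetical, and the "first registration wins" behavior is an assumption based on the statement that the second VMJ does not add its entry.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a shared column-name-to-index map; not Hive's actual class. */
public class ColumnMapCollisionSketch {
    private final Map<String, Integer> columnMap = new HashMap<>();

    /** Registers a column only if the name is not already present (first writer wins). */
    public void addColumn(String internalName, int batchIndex) {
        columnMap.putIfAbsent(internalName, batchIndex);
    }

    /** Looks up the index by internal name alone, with no notion of which operator asked. */
    public int indexOf(String internalName) {
        Integer index = columnMap.get(internalName);
        if (index == null) {
            throw new IllegalArgumentException("Unknown column: " + internalName);
        }
        return index;
    }

    public static void main(String[] args) {
        ColumnMapCollisionSketch ctx = new ColumnMapCollisionSketch();
        ctx.addColumn("_col0", 3); // first map join registers a.ca (index is made up)
        ctx.addColumn("_col0", 7); // second map join tries to register b.cb; silently ignored
        // The second operator wanted index 7 but the lookup by name yields 3.
        System.out.println("_col0 -> " + ctx.indexOf("_col0"));
    }
}
```

      In this sketch the second operator reads the vector at the first operator's index, which mirrors the failure mode reported here: either an error, when the vector types differ, or silently wrong values.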

      Attachments

        1. HIVE-5817.00-broken.patch
          60 kB
          Sergey Shelukhin
        2. HIVE-5817.4.patch
          14 kB
          Remus Rusanu
        3. HIVE-5817.5.patch
          17 kB
          Remus Rusanu
        4. HIVE-5817.6.patch
          34 kB
          Ashutosh Chauhan
        5. HIVE-5817-uniquecols.broken.patch
          63 kB
          Sergey Shelukhin


            People

              Assignee: Remus Rusanu (rusanu)
              Reporter: Sergey Shelukhin (sershe)