HIVE-1682

Wrong results with MAPJOIN when cols from non-MAPJOINed table are selected

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Invalid
    • Affects Version/s: 0.7.0
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None
    • Environment:

      Hive trunk (rev 1003407)
      Hadoop 20.2

      Description

      The results of this query are wrong:

      set hive.mapjoin.cache.numrows=100;
      select /*+ MAPJOIN(invites) */ pokes.bar from pokes join invites on (pokes.bar = invites.bar);

      Results of all the queries below match:

      /* This is the same as the problematic query, but without setting hive.mapjoin.cache.numrows, which defaults to 25k, far more than the number of rows in the pokes table. */
      select /*+ MAPJOIN(invites) */ pokes.bar from pokes join invites on (pokes.bar = invites.bar);

      set hive.mapjoin.cache.numrows=100;
      select /*+ MAPJOIN(invites) */ invites.bar from pokes join invites on (pokes.bar = invites.bar);

      select invites.bar from pokes join invites on (pokes.bar = invites.bar);

      select pokes.bar from pokes join invites on (pokes.bar = invites.bar);
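
A claim such as "the results match" can be checked by saving each query's output to a file and comparing the files as multisets, since row order is not guaranteed without an ORDER BY. The following is a minimal sketch and not part of the original report; the helper names `load_rows` and `diff_results` are hypothetical, as are the file names in the usage note.

```python
from collections import Counter


def load_rows(path):
    """Read one result row per line from a saved query output file."""
    with open(path) as f:
        return [line.rstrip("\n") for line in f]


def diff_results(expected_rows, actual_rows):
    """Compare two result sets as multisets, ignoring row order.

    Returns (missing, extra): rows present only in the expected output,
    and rows present only in the actual output, each with its count.
    """
    expected = Counter(expected_rows)
    actual = Counter(actual_rows)
    missing = expected - actual  # rows the baseline has that the other run lacks
    extra = actual - expected    # rows that only the other run produced
    return missing, extra
```

For example, if the plain join's output is saved as `plain.txt` and the MAPJOIN run with numrows=100 as `mapjoin.txt` (hypothetical names), `diff_results(load_rows("plain.txt"), load_rows("mapjoin.txt"))` returns two empty counters when the results agree.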

        Activity

        Thiruvel Thirumoolan made changes:
        Field        Original Value    New Value
        Status       Open [ 1 ]        Resolved [ 5 ]
        Resolution                     Invalid [ 6 ]
        Thiruvel Thirumoolan created issue.

          People

          • Assignee:
            Unassigned
            Reporter:
            Thiruvel Thirumoolan
          • Votes:
            0
            Watchers:
            1

            Dates

            • Created:
              Updated:
              Resolved:
