  1. Apache Drill
  2. DRILL-6217

NaN/Inf: NestedLoopJoin processes NaN values incorrectly

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.12.0
    • Fix Version/s: 1.13.0
    • Component/s: None
    • Labels:

      Description

      AFFECTED_FUNCTIONALITY: INNER JOIN (nestedloopjoin)

      ISSUE_DESCRIPTION: With nestedloopjoin, the query result implies NaN != NaN, whereas hashjoin and mergejoin behave differently and treat NaN = NaN. As far as I understand, nestedloopjoin should behave like hashjoin and mergejoin.

      STEPS:

      • Upload the attached file (ObjsX.json) to the Hadoop file system;
      • Set the following system options:
        set planner.enable_nljoin_for_scalar_only = false
        set planner.enable_hashjoin = false
        set planner.enable_mergejoin = false
        set planner.enable_nestedloopjoin = true
      • Run the following SQL query:
         select distinct t.name from dfs.tmp.`ObjsX.json` t inner join dfs.tmp.`ObjsX.json` tt on t.attr4 = tt.attr4 

      EXPECTED_RESULT: The query should return the following result:

      	 object1
      	 object2
      	 object3
      	 object4
      	

      ACTUAL_RESULT: The actual result is:

      	 object2
      	 object3
      	 object4
      	

      Please investigate and fix
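
      The difference in behavior described above can be reproduced outside Drill. The following is a minimal Java sketch, not Drill's actual join code: the class name, method names, and sample values are hypothetical, and it only assumes that the missing row (object1) carries NaN in attr4. It shows that a pairwise comparison with the primitive == operator never matches NaN to NaN, while a hash lookup on boxed Double keys (which relies on Double.equals) and a comparison via Double.compare both do.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: hypothetical names, not taken from the Drill code base.
public class NaNJoinSemantics {

    // Nested-loop style: every pair is compared with primitive ==.
    // Under IEEE 754 semantics, NaN == NaN is false, so NaN rows never match.
    static int nestedLoopMatches(double[] left, double[] right) {
        int matches = 0;
        for (double l : left) {
            for (double r : right) {
                if (l == r) {
                    matches++;
                }
            }
        }
        return matches;
    }

    // Hash style: keys are boxed Doubles, and HashMap uses Double.equals,
    // which treats NaN as equal to NaN.
    static int hashMatches(double[] left, double[] right) {
        Map<Double, Integer> build = new HashMap<>();
        for (double r : right) {
            build.merge(r, 1, Integer::sum);
        }
        int matches = 0;
        for (double l : left) {
            matches += build.getOrDefault(l, 0);
        }
        return matches;
    }

    // Nested-loop style again, but comparing with Double.compare,
    // which returns 0 for two NaN values.
    static int nestedLoopMatchesWithCompare(double[] left, double[] right) {
        int matches = 0;
        for (double l : left) {
            for (double r : right) {
                if (Double.compare(l, r) == 0) {
                    matches++;
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Hypothetical attr4 values: one NaN row plus three ordinary rows.
        double[] attr4 = {Double.NaN, 1.0, 2.0, 3.0};
        System.out.println("primitive ==   : " + nestedLoopMatches(attr4, attr4));
        System.out.println("hash lookup    : " + hashMatches(attr4, attr4));
        System.out.println("Double.compare : " + nestedLoopMatchesWithCompare(attr4, attr4));
    }
}
```

      In this sketch the self-join finds three matches with primitive == and four with the other two approaches, which mirrors the three-row ACTUAL_RESULT versus the four-row EXPECTED_RESULT. Note that Double.equals and Double.compare also distinguish 0.0 from -0.0, so they are not a drop-in replacement for == in every case.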

        Attachments

        1. ObjsX.json
          0.7 kB
          Volodymyr Tkach

              People

              • Assignee: Volodymyr Tkach
              • Reporter: Volodymyr Tkach
              • Reviewer: Arina Ielchiieva
              • Votes: 0
              • Watchers: 3
