  1. Apache Drill
  2. DRILL-6217

NaN/Inf: NestedLoopJoin processes NaN values incorrectly

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.12.0
    • Fix Version/s: 1.13.0
    • None

    Description

      AFFECTED_FUNCTIONALITY: INNER JOIN (nestedloopjoin)

      ISSUE_DESCRIPTION: With nestedloopjoin, the query result treats NaN as not equal to NaN (NaN != NaN), whereas hashjoin and mergejoin treat NaN as equal to NaN (NaN = NaN). As far as I understand, nestedloopjoin should behave like hashjoin / mergejoin.

      STEPS:

      • Upload the attached file (ObjsX.json) to Hadoop fs;
      • Set the following system settings:
        set `planner.enable_nljoin_for_scalar_only` = false;
        set `planner.enable_hashjoin` = false;
        set `planner.enable_mergejoin` = false;
        set `planner.enable_nestedloopjoin` = true;
      • Run the following SQL query:
         select distinct t.name from dfs.tmp.`ObjsX.json` t inner join dfs.tmp.`ObjsX.json` tt on t.attr4 = tt.attr4;

      EXPECTED_RESULT: The query should return:

        object1
        object2
        object3
        object4

      ACTUAL_RESULT: The query returns:

        object2
        object3
        object4

      Please investigate and fix.
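
      The behaviour described above matches the difference between IEEE 754 comparison of primitive doubles and Java's boxed Double equality. The sketch below is only an illustration under that assumption; it is not Drill's code, and this report does not confirm it as the root cause. The class name, method names, and key values are hypothetical stand-ins for the attr4 column.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; none of these names come from Drill.
class NanJoinSketch {

    // Nested-loop style equi-join on primitive doubles using ==.
    // Under IEEE 754, NaN == NaN is false, so NaN keys never match.
    public static int nestedLoopMatches(double[] left, double[] right) {
        int matches = 0;
        for (double l : left) {
            for (double r : right) {
                if (l == r) {
                    matches++;
                }
            }
        }
        return matches;
    }

    // Hash-join style equi-join: build a map over boxed keys, then probe it.
    // Double.equals and Double.hashCode are based on doubleToLongBits,
    // so a NaN key finds the NaN entry.
    public static int hashMatches(double[] left, double[] right) {
        Map<Double, Integer> build = new HashMap<>();
        for (double r : right) {
            build.merge(r, 1, Integer::sum);
        }
        int matches = 0;
        for (double l : left) {
            matches += build.getOrDefault(l, 0);
        }
        return matches;
    }

    // Nested loop with a NaN-aware comparison. Double.compare returns 0
    // for NaN vs NaN. Note that it also treats 0.0 and -0.0 as different,
    // unlike ==.
    public static int nestedLoopMatchesNanEqual(double[] left, double[] right) {
        int matches = 0;
        for (double l : left) {
            for (double r : right) {
                if (Double.compare(l, r) == 0) {
                    matches++;
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Hypothetical join keys: one NaN plus three ordinary values.
        double[] keys = {Double.NaN, 1.0, 2.0, 3.0};
        System.out.println(nestedLoopMatches(keys, keys));         // 3
        System.out.println(hashMatches(keys, keys));               // 4
        System.out.println(nestedLoopMatchesNanEqual(keys, keys)); // 4
    }
}
```

      With these hypothetical keys, the == based loop drops the NaN row from a self-join, while the hash-based and Double.compare-based variants keep it. That mirrors the three-row versus four-row results in this report.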

      Attachments

        1. ObjsX.json
          0.7 kB
          Volodymyr Tkach

            People

              Assignee: Volodymyr Tkach (volodymyr.tkach)
              Reporter: Volodymyr Tkach (volodymyr.tkach)
              Reviewer: Arina Ielchiieva
              Votes: 0
              Watchers: 3
