Apache Drill / DRILL-6217

NaN/Inf: NestedLoopJoin processes NaN values incorrectly


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.12.0
    • Fix Version/s: 1.13.0
    • None

    Description

      AFFECTED_FUNCTIONALITY: INNER JOIN (nestedloopjoin)

      ISSUE_DESCRIPTION: According to the nestedloopjoin query result, NaN != NaN, whereas hashjoin / mergejoin treat NaN = NaN. As far as I understand, nestedloopjoin should behave like hashjoin / mergejoin.

      STEPS:

      • Upload the attached file (ObjsX.json) to the Hadoop file system;
      • Set the following system options:
        set planner.enable_nljoin_for_scalar_only = false
        set planner.enable_hashjoin = false
        set planner.enable_mergejoin = false
        set planner.enable_nestedloopjoin = true
      • Run the following SQL query:
         select distinct t.name from dfs.tmp.`ObjsX.json` t inner join dfs.tmp.`ObjsX.json` tt on t.attr4 = tt.attr4

        EXPECTED_RESULT: The query is expected to return:

        	 object1
        	 object2
        	 object3
        	 object4
        	

      ACTUAL_RESULT: The actual result is:

      	 object2
      	 object3
      	 object4
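
The contrast in the issue description (NaN != NaN versus NaN = NaN) corresponds to two equality notions that plain Java exposes for doubles. The sketch below only illustrates those notions; it is not Drill code, and the class and method names are hypothetical. Whether Drill's join operators use these exact comparisons is an assumption, not something this issue states.

```java
import java.util.HashMap;
import java.util.Map;

public class NanEqualitySketch {
    // IEEE 754 comparison: NaN is never equal to anything, itself included.
    public static boolean ieeeEquals(double a, double b) {
        return a == b;
    }

    // Total-order comparison: Double.compare treats NaN as equal to NaN.
    public static boolean totalOrderEquals(double a, double b) {
        return Double.compare(a, b) == 0;
    }

    // Hash-based lookup: boxed Double keys are matched with Double.equals,
    // which also treats NaN as equal to NaN.
    public static boolean hashLookupFinds(double key) {
        Map<Double, String> table = new HashMap<>();
        table.put(key, "row");
        return table.containsKey(key);
    }

    public static void main(String[] args) {
        System.out.println(ieeeEquals(Double.NaN, Double.NaN));       // false
        System.out.println(totalOrderEquals(Double.NaN, Double.NaN)); // true
        System.out.println(hashLookupFinds(Double.NaN));              // true
    }
}
```

A join that evaluates its condition with the first notion drops rows whose key is NaN, while one that relies on hashing or ordering keeps them, which would be consistent with the difference reported here.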
      	

      Please investigate and fix.
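
To make the expected and actual results concrete, here is a minimal nested-loop self-join in plain Java with a pluggable key comparison. Everything in it is hypothetical: the contents of ObjsX.json are not shown in this issue, so the rows below simply assume that object1 carries NaN in attr4, and the class is a sketch of the general technique rather than Drill's NestedLoopJoin operator or the fix that was applied.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.BiPredicate;

public class NestedLoopNanSketch {
    // Hypothetical row shape; the real attachment may contain other fields.
    public static final class Row {
        final String name;
        final double attr4;

        public Row(String name, double attr4) {
            this.name = name;
            this.attr4 = attr4;
        }
    }

    // Hypothetical data: object1 is assumed to hold NaN in attr4.
    public static List<Row> sampleRows() {
        List<Row> rows = new ArrayList<>();
        rows.add(new Row("object1", Double.NaN));
        rows.add(new Row("object2", 1.0));
        rows.add(new Row("object3", 2.0));
        rows.add(new Row("object4", 3.0));
        return rows;
    }

    // Inner self-join by nested loops; returns distinct left-side names
    // in the order they are first matched.
    public static Set<String> distinctJoinedNames(
            List<Row> rows, BiPredicate<Double, Double> keysMatch) {
        Set<String> names = new LinkedHashSet<>();
        for (Row left : rows) {
            for (Row right : rows) {
                if (keysMatch.test(left.attr4, right.attr4)) {
                    names.add(left.name);
                }
            }
        }
        return names;
    }

    public static void main(String[] args) {
        List<Row> rows = sampleRows();
        // IEEE 754 equality: the NaN row never matches, not even itself.
        System.out.println(distinctJoinedNames(
                rows, (a, b) -> a.doubleValue() == b.doubleValue()));
        // prints [object2, object3, object4]

        // NaN-aware equality: the NaN row matches itself.
        System.out.println(distinctJoinedNames(
                rows, (a, b) -> Double.compare(a, b) == 0));
        // prints [object1, object2, object3, object4]
    }
}
```

With the IEEE comparison the sketch reproduces the three-row ACTUAL_RESULT, and with the NaN-aware comparison it reproduces the four-row EXPECTED_RESULT, under the stated assumption about the data.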

          People

            Volodymyr Tkach (volodymyr.tkach)
            Volodymyr Tkach (volodymyr.tkach)
            Arina Ielchiieva