Apache Drill / DRILL-6856

Wrong result returned if the query filters a boolean column with both "is true" and "is null" conditions


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.15.0
    • Fix Version/s: 1.16.0
    • None
    • None

    Description

      Data:
      A Parquet file with a nullable boolean column that contains null values.
      An example is attached.

      Query:

      select bool_col from dfs.tmp.`Test_data` where bool_col is true or bool_col is null
      

      Result:

      null
      null
      

      Plan:

      00-00    Screen : rowType = RecordType(ANY bool_col): rowcount = 3.75, cumulative cost = {37.875 rows, 97.875 cpu, 15.0 io, 0.0 network, 0.0 memory}, id = 1980
      00-01      Project(bool_col=[$0]) : rowType = RecordType(ANY bool_col): rowcount = 3.75, cumulative cost = {37.5 rows, 97.5 cpu, 15.0 io, 0.0 network, 0.0 memory}, id = 1979
      00-02        SelectionVectorRemover : rowType = RecordType(ANY bool_col): rowcount = 3.75, cumulative cost = {33.75 rows, 93.75 cpu, 15.0 io, 0.0 network, 0.0 memory}, id = 1978
      00-03          Filter(condition=[IS NULL($0)]) : rowType = RecordType(ANY bool_col): rowcount = 3.75, cumulative cost = {30.0 rows, 90.0 cpu, 15.0 io, 0.0 network, 0.0 memory}, id = 1977
      00-04            Scan(table=[[dfs, tmp, Test_data]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///tmp/Test_data]], selectionRoot=maprfs:/tmp/Test_data, numFiles=1, numRowGroups=1, usedMetadataFile=false, columns=[`bool_col`]]]) : rowType = RecordType(ANY bool_col): rowcount = 15.0, cumulative cost = {15.0 rows, 15.0 cpu, 15.0 io, 0.0 network, 0.0 memory}, id = 1976
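
      The missing predicate can also be checked mechanically. The following is a minimal sketch, not part of Drill: the helper name `filter_condition` and the regular expression are assumptions made for illustration, and the input is the Filter line of the plan above, shortened. It pulls out the condition text of the Filter operator so a test can assert whether the "is true" check survived planning.

```python
import re
from typing import Optional

# The Filter line of the plan in this report, shortened for readability.
PLAN_FILTER_LINE = (
    "00-03          Filter(condition=[IS NULL($0)]) : "
    "rowType = RecordType(ANY bool_col): rowcount = 3.75"
)


def filter_condition(plan_text: str) -> Optional[str]:
    """Return the text inside Filter(condition=[...]), or None if no Filter is found.

    Hypothetical helper: it assumes the condition itself contains no "])" sequence.
    """
    match = re.search(r"Filter\(condition=\[(.*?)\]\)", plan_text)
    return match.group(1) if match else None


condition = filter_condition(PLAN_FILTER_LINE)
print(condition)
# The query asked for "is true OR is null", so a check for truth is expected
# somewhere in the condition. Here only the null check is present.
print("IS TRUE" in (condition or ""))
```

      How a corrected plan renders the condition is not shown in this report, so the sketch only tests for the absence of the "is true" check rather than for an exact expected string.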
      

      Notes:

      • Rows with "true" values are missing from the result, although they satisfy the "is true" condition.
      • The result is correct if "bool_col = true" is used instead of "bool_col is true".
      • The plan shows that the "is true" condition is absent from the Filter operator; only IS NULL($0) remains.
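
      To make the expected semantics concrete, here is a small sketch that models the three predicates in plain Python, with `None` standing in for SQL NULL. The sample values and all function names are assumptions for illustration; the contents of the attached Parquet file are not shown in this report.

```python
from typing import List, Optional

# Hypothetical sample of a nullable boolean column (not the attached file).
SAMPLE: List[Optional[bool]] = [True, False, None, True, None]


def is_true(value: Optional[bool]) -> bool:
    """SQL `x IS TRUE`: true only for TRUE, false for FALSE and NULL."""
    return value is True


def is_null(value: Optional[bool]) -> bool:
    """SQL `x IS NULL`."""
    return value is None


def expected_filter(rows: List[Optional[bool]]) -> List[Optional[bool]]:
    """WHERE bool_col IS TRUE OR bool_col IS NULL, as the query is written."""
    return [v for v in rows if is_true(v) or is_null(v)]


def observed_filter(rows: List[Optional[bool]]) -> List[Optional[bool]]:
    """What the reported plan evaluates: only IS NULL($0)."""
    return [v for v in rows if is_null(v)]


def workaround_filter(rows: List[Optional[bool]]) -> List[Optional[bool]]:
    """WHERE bool_col = true OR bool_col IS NULL.

    `NULL = true` is unknown, which a WHERE clause treats as not matching,
    so the equality branch keeps only non-null true values.
    """
    return [v for v in rows if (v is not None and v == True) or is_null(v)]  # noqa: E712


print(expected_filter(SAMPLE))
print(observed_filter(SAMPLE))
print(workaround_filter(SAMPLE))
```

      Under this model the "= true" form and the "is true" form select the same rows, which is why the first is a valid workaround, while the reported plan keeps only the null rows.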

          People

            ihuzenko Igor Guzenko
            angozhiy Anton Gozhiy
            Vova Vysotskyi Vova Vysotskyi