Apache Drill
DRILL-789

Left outer join returns "null" values for columns from the right table

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.4.0
    • Component/s: Execution - Flow
    • Labels:
      None

      Description

      git.commit.id.abbrev=5d7e3d3

      0: jdbc:drill:schema=dfs> select voter.name voter_name, voter.registration registration, student.name student_name, student.gpa gpa from voter left outer join student on (student.name = voter.name) where voter.age < 30;
      +--------------+--------------+---------------+-------+
      | voter_name   | registration | student_name  | gpa   |
      +--------------+--------------+---------------+-------+
      | [B@6ca2652   | [B@4199d4f9  | null          | null  |
      | [B@5a6d4914  | [B@3cd8ee6d  | null          | null  |
      | [B@460d5550  | [B@155c1b1e  | null          | null  |
      | [B@51f85986  | [B@7bd9675   | null          | null  |
      | [B@2fe0df4b  | [B@5463cd7b  | null          | null  |
      | [B@64477185  | [B@e6e0632   | null          | null  |
      +--------------+--------------+---------------+-------+
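
      The [B@... tokens above are Java's default toString output for byte arrays, so the string columns are printed raw rather than decoded; the actual bug is that every right-table column comes back as null. For readable output when reproducing, the string columns can be cast explicitly (a minimal sketch, mirroring the cast used in the comments below and assuming the same voter and student tables):

      0: jdbc:drill:schema=dfs> select cast(voter.name as varchar(20)) voter_name, cast(voter.registration as varchar(20)) registration, cast(student.name as varchar(20)) student_name, student.gpa gpa from voter left outer join student on (student.name = voter.name) where voter.age < 30;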

      Attachments

      1. DRILL-789.patch (5 kB, Steven Phillips)

        Activity

        Krystal added a comment -

        git.commit.id.abbrev=4198a17

        Query now returns expected data.

        Krystal added a comment -

        git.commit.id.abbrev=5b8f8d8

        The query now fails with an error:

        0: jdbc:drill:schema=dfs> select voter.name voter_name, voter.registration registration, student.name student_name, student.gpa gpa from voter left outer join student on (student.name = voter.name);
        message: "Failure while setting up Foreman. < AssertionError"

        It seems the problem occurs when the columns in the select statement are not the same as the join column (in this case the "name" column), or when the select columns come from the right table. The following modified queries run successfully:

        select student.name, voter.name from student left outer join voter on (student.student_id = voter.voter_id);
        select student.student_id, voter.voter_id, student.name, student.age from student left outer join voter on (student.student_id = voter.voter_id);

        If you add a column from the voter table to either of the queries above, the query fails:
        select student.student_id, voter.voter_id, student.name, student.age, voter.contributions from student left outer join voter on (student.student_id = voter.voter_id);

        Below is the stack trace:
        java.lang.AssertionError
        org.eigenbase.sql2rel.SqlToRelConverter$Blackboard.getRootField(SqlToRelConverter.java:3932) ~[optiq-core-0.7-20140513.013236-5.jar:na]
        org.eigenbase.sql2rel.SqlToRelConverter.adjustInputRef(SqlToRelConverter.java:3214) ~[optiq-core-0.7-20140513.013236-5.jar:na]
        org.eigenbase.sql2rel.SqlToRelConverter.convertIdentifier(SqlToRelConverter.java:3189) ~[optiq-core-0.7-20140513.013236-5.jar:na]
        org.eigenbase.sql2rel.SqlToRelConverter.access$1400(SqlToRelConverter.java:62) ~[optiq-core-0.7-20140513.013236-5.jar:na]
        org.eigenbase.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:4220) ~[optiq-core-0.7-20140513.013236-5.jar:na]
        org.eigenbase.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:3611) ~[optiq-core-0.7-20140513.013236-5.jar:na]
        org.eigenbase.sql.SqlIdentifier.accept(SqlIdentifier.java:222) ~[optiq-core-0.7-20140513.013236-5.jar:na]
        org.eigenbase.sql2rel.SqlToRelConverter$Blackboard.convertExpression(SqlToRelConverter.java:4116) ~[optiq-core-0.7-20140513.013236-5.jar:na]
        org.eigenbase.sql2rel.StandardConvertletTable$7.convertCall(StandardConvertletTable.java:130) ~[optiq-core-0.7-20140513.013236-5.jar:na]
        org.eigenbase.sql2rel.SqlNodeToRexConverterImpl.convertCall(SqlNodeToRexConverterImpl.java:52) ~[optiq-core-0.7-20140513.013236-5.jar:na]
        org.eigenbase.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:4210) ~[optiq-core-0.7-20140513.013236-5.jar:na]
        org.eigenbase.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:3611) ~[optiq-core-0.7-20140513.013236-5.jar:na]
        org.eigenbase.sql.SqlCall.accept(SqlCall.java:133) ~[optiq-core-0.7-20140513.013236-5.jar:na]
        org.eigenbase.sql2rel.SqlToRelConverter$Blackboard.convertExpression(SqlToRelConverter.java:4116) ~[optiq-core-0.7-20140513.013236-5.jar:na]
        org.eigenbase.sql2rel.SqlToRelConverter.convertSelectList(SqlToRelConverter.java:3427) ~[optiq-core-0.7-20140513.013236-5.jar:na]
        org.eigenbase.sql2rel.SqlToRelConverter.convertSelectImpl(SqlToRelConverter.java:519) ~[optiq-core-0.7-20140513.013236-5.jar:na]
        org.eigenbase.sql2rel.SqlToRelConverter.convertSelect(SqlToRelConverter.java:474) ~[optiq-core-0.7-20140513.013236-5.jar:na]
        org.eigenbase.sql2rel.SqlToRelConverter.convertQueryRecursive(SqlToRelConverter.java:2787) ~[optiq-core-0.7-20140513.013236-5.jar:na]
        org.eigenbase.sql2rel.SqlToRelConverter.convertQuery(SqlToRelConverter.java:432) ~[optiq-core-0.7-20140513.013236-5.jar:na]
        net.hydromatic.optiq.prepare.PlannerImpl.convert(PlannerImpl.java:192) ~[optiq-core-0.7-20140513.013236-5.jar:na]
        org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToRel(DefaultSqlHandler.java:124) ~[drill-java-exec-1.0.0-m2-incubating-SNAPSHOT-rebuffed.jar:1.0.0-m2-incubating-SNAPSHOT]
        org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan(DefaultSqlHandler.java:106) ~[drill-java-exec-1.0.0-m2-incubating-SNAPSHOT-rebuffed.jar:1.0.0-m2-incubating-SNAPSHOT]
        org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:129) ~[drill-java-exec-1.0.0-m2-incubating-SNAPSHOT-rebuffed.jar:1.0.0-m2-incubating-SNAPSHOT]
        org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:313) ~[drill-java-exec-1.0.0-m2-incubating-SNAPSHOT-rebuffed.jar:1.0.0-m2-incubating-SNAPSHOT]
        org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:167) ~[drill-java-exec-1.0.0-m2-incubating-SNAPSHOT-rebuffed.jar:1.0.0-m2-incubating-SNAPSHOT]
        java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_45]
        java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_45]
        java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]

        Krystal added a comment -

        git.commit.id.abbrev=f948d71

        Running the same query, I am getting the following error:
        message: "Failure while running fragment. < IllegalStateException:[ Failure while closing accountor. Expected private and shared pools to be set to initial values. However, one or more were not. Stats are
        zone init allocated delta
        private 1000000 1000000 0
        shared 9999000000 9998996414 3586. ]"
        ]

        Stack trace:
        java.lang.IllegalStateException: Failure while closing accountor. Expected private and shared pools to be set to initial values. However, one or more were not. Stats are
        zone init allocated delta
        private 1000000 1000000 0
        shared 9999000000 9998996414 3586.
        org.apache.drill.exec.memory.AtomicRemainder.close(AtomicRemainder.java:164) ~[drill-java-exec-1.0.0-m2-incubating-SNAPSHOT-rebuffed.jar:1.0.0-m2-incubating-SNAPSHOT]
        org.apache.drill.exec.memory.Accountor.close(Accountor.java:156) ~[drill-java-exec-1.0.0-m2-incubating-SNAPSHOT-rebuffed.jar:1.0.0-m2-incubating-SNAPSHOT]
        org.apache.drill.exec.memory.TopLevelAllocator$ChildAllocator.close(TopLevelAllocator.java:170) ~[drill-java-exec-1.0.0-m2-incubating-SNAPSHOT-rebuffed.jar:1.0.0-m2-incubating-SNAPSHOT]
        org.apache.drill.exec.ops.OperatorContext.close(OperatorContext.java:56) ~[drill-java-exec-1.0.0-m2-incubating-SNAPSHOT-rebuffed.jar:1.0.0-m2-incubating-SNAPSHOT]
        org.apache.drill.exec.physical.impl.partitionsender.PartitionSenderRootExec.stop(PartitionSenderRootExec.java:353) ~[drill-java-exec-1.0.0-m2-incubating-SNAPSHOT-rebuffed.jar:1.0.0-m2-incubating-SNAPSHOT]
        org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:114) ~[drill-java-exec-1.0.0-m2-incubating-SNAPSHOT-rebuffed.jar:1.0.0-m2-incubating-SNAPSHOT]
        java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_45]
        java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_45]
        java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]

        Aman Sinha added a comment -

        I am able to reproduce this on my machine using data supplied by Krystal. The issue reproduces when join columns have null values and we are doing a distributed hash join (does not repro with single mode hash join or distributed merge join). I am adding some instrumentation to debug this further.
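
        Since the wrong results only show up with a distributed hash join, one way to sidestep the issue while it is being debugged is to force a merge join for the session; a sketch, assuming the standard planner.enable_hashjoin session option (a hypothetical workaround, not something verified in this report):

        0: jdbc:drill:schema=dfs> alter session set `planner.enable_hashjoin` = false;
        0: jdbc:drill:schema=dfs> select voter.name voter_name, voter.registration registration, student.name student_name, student.gpa gpa from voter left outer join student on (student.name = voter.name) where voter.age < 30;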

        Aman Sinha added a comment -

        The wrong result seems to be caused by distribution on a varchar column. If I disable exchanges (set planner.disable_exchanges to true) on the cluster, we get the right results. Also, joining on the voter_id = student_id columns returns the right results.
        BTW, bug DRILL-787 with inner joins also has the same underlying cause.
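
        For reference, the exchange setting referred to above is a session option and can be toggled from sqlline; a minimal sketch (option name as given in this comment, exact syntax assumed):

        0: jdbc:drill:schema=dfs> alter session set `planner.disable_exchanges` = true;
        0: jdbc:drill:schema=dfs> alter session set `planner.disable_exchanges` = false;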

        Krystal added a comment -

        Here is the full explain plan:
        0: jdbc:drill:schema=dfs> explain plan for select voter.name voter_name, voter.registration registration, student.name student_name, student.gpa gpa from voter inner join student on (student.name = voter.name) where voter.age < 30;
        text:

        ScreenPrel
          UnionExchangePrel
            ProjectPrel(voter_name=[$1], registration=[$3], student_name=[$5], gpa=[$6])
              HashJoinPrel(condition=[=($5, $1)], joinType=[inner])
                HashToRandomExchangePrel(dist0=[[$1]])
                  FilterPrel(condition=[<($2, 30)])
                    ScanPrel(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/p1tests/voter]], selectionRoot=/drill/testdata/p1tests/voter, columns=[SchemaPath [`name`], SchemaPath [`age`], SchemaPath [`registration`]]]])
                HashToRandomExchangePrel(dist0=[[$1]])
                  ScanPrel(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/p1tests/student]], selectionRoot=/drill/testdata/p1tests/student, columns=[SchemaPath [`name`], SchemaPath [`gpa`]]]])

        json:
        {
          "head" : Unknown macro: { "version" },
          "graph" : [ {
            "pop" : "parquet-scan",
            "@id" : 1,
            "entries" : [ { "path" : "maprfs:/drill/testdata/p1tests/voter" } ],
            "storage" : {
              "type" : "file",
              "connection" : "maprfs:///",
              "workspaces" : Unknown macro: { "root" },
              "formats" : null
            },
            "format" : { "type" : "parquet" },
            "columns" : [ "`name`", "`age`", "`registration`" ],
            "selectionRoot" : "/drill/testdata/p1tests/voter"
          }, {
            "pop" : "filter", "@id" : 2, "child" : 1, "expr" : "less_than(`age`, 30) ", "initialAllocation" : 1000000, "maxAllocation" : 10000000000
          }, {
            "pop" : "hash-to-random-exchange", "@id" : 3, "child" : 2, "expr" : "hash(`name`) ", "initialAllocation" : 1000000, "maxAllocation" : 10000000000
          }, {
            "pop" : "parquet-scan",
            "@id" : 4,
            "entries" : [ { "path" : "maprfs:/drill/testdata/p1tests/student" } ],
            "storage" : {
              "type" : "file",
              "connection" : "maprfs:///",
              "workspaces" : Unknown macro: { "root" },
              "formats" : null
            },
            "format" : { "type" : "parquet" },
            "columns" : [ "`name`", "`gpa`" ],
            "selectionRoot" : "/drill/testdata/p1tests/student"
          }, {
            "pop" : "hash-to-random-exchange", "@id" : 5, "child" : 4, "expr" : "hash(`name`) ", "initialAllocation" : 1000000, "maxAllocation" : 10000000000
          },
          Unknown macro: { "pop" },
          Unknown macro: { "pop" },
          Unknown macro: { "pop" },
          {
            "pop" : "union-exchange", "@id" : 9, "child" : 8, "initialAllocation" : 1000000, "maxAllocation" : 10000000000
          }, {
            "pop" : "screen", "@id" : 10, "child" : 9, "initialAllocation" : 1000000, "maxAllocation" : 10000000000
          } ]
        }

        1 row selected (0.495 seconds)

        Aman Sinha added a comment -

        I am getting the right results on the latest master branch using the voter.parquet and student.parquet files that I had received from QA some time ago. Here's my query.

        0: jdbc:drill:zk=local> select cast(voter.name as varchar(20)) voter_name, cast(voter.registration as varchar(20)) registration, cast(student.name as varchar(20)) student_name, student.gpa gpa from dfs.`/Users/asinha/parquet/voter.parquet` voter left outer join dfs.`/Users/asinha/parquet/student.parquet` student on (student.name = voter.name) where voter.age < 30;
        +-----------------+--------------+-----------------+-------+
        | voter_name      | registration | student_name    | gpa   |
        +-----------------+--------------+-----------------+-------+
        | katie nixon     | socialist    | katie nixon     | 1.46  |
        | katie nixon     | socialist    | katie nixon     | 1.32  |
        | calvin allen    | democrat     | null            | null  |
        | rachel garcia   | republican   | rachel garcia   | 1.17  |
        | holly robinson  | independent  | null            | null  |
        | holly falkner   | libertarian  | holly falkner   | 1.36  |
        | holly falkner   | libertarian  | holly falkner   | 1.29  |
        +-----------------+--------------+-----------------+-------+
        Krystal added a comment -

        0: jdbc:drill:schema=dfs> explain plan for select voter.name voter_name, voter.registration registration, student.name student_name, student.gpa gpa from voter left outer join student on (student.name = voter.name) where voter.age < 30;
        text:

        ScreenPrel
          UnionExchangePrel
            ProjectPrel(voter_name=[$1], registration=[$3], student_name=[$5],
        1 row selected (0.604 seconds)

        Mehant Baid added a comment -

        Could you provide the plan?

        Krystal added a comment -

        Here is part of the result from postgres:

        postgres=#select voter.name voter_name, voter.registration registration, student.name student_name, student.gpa gpa from voter left outer join student on (student.name = voter.name) where voter.age < 30;
           voter_name    | registration |  student_name   | gpa
        -----------------+--------------+-----------------+------
         luke white      | independent  | luke white      | 3.61
         victor nixon    | green        | victor nixon    | 1.04
         calvin carson   | libertarian  | calvin carson   | 2.00
         rachel garcia   | republican   | rachel garcia   | 1.17
         ethan ichabod   | democrat     | ethan ichabod   | 2.54
         xavier ichabod  | democrat     | xavier ichabod  | 2.38
         xavier ichabod  | independent  | xavier ichabod  | 2.38
         holly xylophone | socialist    | holly xylophone | 2.18

        From Drill, every row has "null" for the columns from the right table.

        Aman Sinha added a comment -

        Why is this an issue? Can you provide the results you are comparing to?


          People

          • Assignee: Unassigned
          • Reporter: Krystal
          • Votes: 0
          • Watchers: 3
