Uploaded image for project: 'Apache Drill'
  1. Apache Drill
  2. DRILL-64

In reference implementation, full outer join returns too many rows

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Minor
    • Resolution: Fixed
    • None
    • 0.1.0-m1
    • None
    • None

    Description

      Full outer join returns too many rows. The query

      {
      "head" : {
      "type" : "apache_drill_logical_plan",
      "version" : 1,
      "generator" :

      { "type" : "manual", "info" : "na" }

      },
      "storage" : [

      { "type" : "queue", "name" : "queue" }

      ,

      { "type" : "classpath", "name" : "donuts-json" }

      ],
      "query" : [ {
      "op" : "scan",
      "@id" : 1,
      "memo" : "initial_scan",
      "storageengine" : "donuts-json",
      "selection" :

      { "path" : "/employees.json", "type" : "JSON" }

      ,
      "ref" : "_MAP"
      }, {
      "op" : "project",
      "input" : 1,
      "@id" : 2,
      "projections" : [

      { "ref" : "output.deptId", "expr" : "_MAP.deptId" }

      ]
      }, {
      "op" : "project",
      "input" : 2,
      "@id" : 10,
      "projections" : [

      { "ref" : "output.deptId1", "expr" : "deptId" }

      ]
      }, {
      "op" : "scan",
      "@id" : 3,
      "memo" : "initial_scan",
      "storageengine" : "donuts-json",
      "selection" :

      { "path" : "/departments.json", "type" : "JSON" }

      ,
      "ref" : "_MAP"
      }, {
      "op" : "project",
      "input" : 3,
      "@id" : 4,
      "projections" : [

      { "ref" : "output.deptId", "expr" : "_MAP.deptId" }

      ]
      }, {
      "op": "join",
      "left": 10,
      "right": 4,
      "@id" : 5,
      "type": "outer",
      "conditions": [

      {"relationship": "==", "left": "deptId1", "right": "deptId"}

      ]
      }, {
      "op" : "store",
      "input" : 5,
      "@id" : 6,
      "memo" : "output sink",
      "target" :

      { "number" : 0 }

      ,
      "partition" : null,
      "storageEngine" : "queue"
      } ]
      }

      returns

      { "deptId" : 31, "deptId1" : 31} { "deptId" : 33, "deptId1" : 33} { "deptId" : 33, "deptId1" : 33} { "deptId" : 34, "deptId1" : 34} { "deptId" : 34, "deptId1" : 34} { "deptId1" : null} { "deptId" : 31} { "deptId" : 33} { "deptId" : 34} { "deptId" : 35}

      but I think it should return

      { "deptId" : 31, "deptId1" : 31} { "deptId" : 33, "deptId1" : 33} { "deptId" : 33, "deptId1" : 33} { "deptId" : 34, "deptId1" : 34} { "deptId" : 34, "deptId1" : 34} { "deptId1" : null} { "deptId" : 35}

      because only 35 on the left is unmatched on the right;

      {31, 33, 34}

      have at least one match.

      Also, it's cosmetic, but the "outer" join type should be called "full" because "left" is also a kind of outer join, at least in SQL parlance.

      Attachments

        Activity

          People

            tnachen Timothy Chen
            julianhyde Julian Hyde
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: