Uploaded image for project: 'Apache Drill'
  1. Apache Drill
  2. DRILL-7176

Invalid result for MAP structure in MongoDB collection

    XMLWordPrintableJSON

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.15.0
    • Fix Version/s: Future
    • Component/s: Storage - MongoDB
    • Labels:
      None

      Description

      The following query works fine for JSON file:

      apache drill (mongo.local)> select * from dfs.`/tmp/test.json` o;
      +-------------------------------------+--------+
      |                 _id                 |  name  |
      +-------------------------------------+--------+
      | {"$oid":"5cb0e161f0849231dfe16d99"} | thiago |
      | {"$oid":"5cb0e161f0849231dfe16c01"} | manuel |
      +-------------------------------------+--------+
      2 rows selected (0.129 seconds)
      apache drill (mongo.local)> select * from dfs.`/tmp/test.json` o where o.`_id`.`$oid` = '5cb0e161f0849231dfe16d99';
      +-------------------------------------+--------+
      |                 _id                 |  name  |
      +-------------------------------------+--------+
      | {"$oid":"5cb0e161f0849231dfe16d99"} | thiago |
      +-------------------------------------+--------+
      1 row selected (0.168 seconds)
      

      But the same query on the same MongoDB collection returns wrong result:

      apache drill (mongo.local)> use mongo.local;
      +------+-----------------------------------------+
      |  ok  |                 summary                 |
      +------+-----------------------------------------+
      | true | Default schema changed to [mongo.local] |
      +------+-----------------------------------------+
      1 row selected (0.107 seconds)
      apache drill (mongo.local)> alter session set store.mongo.bson.record.reader = false;
      +------+-----------------------------------------+
      |  ok  |                 summary                 |
      +------+-----------------------------------------+
      | true | store.mongo.bson.record.reader updated. |
      +------+-----------------------------------------+
      1 row selected (0.066 seconds)
      apache drill (mongo.local)> select * from json_test;
      +-------------------------------------+--------+
      |                 _id                 |  name  |
      +-------------------------------------+--------+
      | {"$oid":"5cb0e161f0849231dfe16d99"} | thiago |
      | {"$oid":"5cb0e161f0849231dfe16c01"} | manuel |
      +-------------------------------------+--------+
      2 rows selected (0.115 seconds)
      apache drill (mongo.local)> select * from json_test o where o.`_id`.`$oid` = '5cb0e161f0849231dfe16d99';
      +----+
      | ** |
      +----+
      +----+
      No rows selected (0.121 seconds)
      

      The logical plan for the last query:

      00-00    Screen : rowType = RecordType(DYNAMIC_STAR **): rowcount = 2.0, cumulative cost = {6.2 rows, 8.2 cpu, 140.0 io, 0.0 network, 0.0 memory}, id = 10157
      00-01      Project(**=[$0]) : rowType = RecordType(DYNAMIC_STAR **): rowcount = 2.0, cumulative cost = {6.0 rows, 8.0 cpu, 140.0 io, 0.0 network, 0.0 memory}, id = 10156
      00-02        Project(**=[$0]) : rowType = RecordType(DYNAMIC_STAR **): rowcount = 2.0, cumulative cost = {4.0 rows, 6.0 cpu, 140.0 io, 0.0 network, 0.0 memory}, id = 10155
      00-03          Scan(table=[[mongo, local, json_test]], groupscan=[MongoGroupScan [MongoScanSpec=MongoScanSpec [dbName=local, collectionName=json_test, filters=Document{{_id=Document{{$eq=5cb0e161f0849231dfe16d99}}}}], columns=[`**`, `_id`]]]) : rowType = RecordType(DYNAMIC_STAR **, ANY _id): rowcount = 2.0, cumulative cost = {2.0 rows, 4.0 cpu, 140.0 io, 0.0 network, 0.0 memory}, id = 10154
      

      The same query, but with specified filed in project works fine:

      apache drill (mongo.local)> select `_id` from json_test o where o.`_id`.`$oid` = '5cb0e161f0849231dfe16d99';
      +-------------------------------------+
      |                 _id                 |
      +-------------------------------------+
      | {"$oid":"5cb0e161f0849231dfe16d99"} |
      +-------------------------------------+
      1 row selected (0.133 seconds)
      

      It looks like a planning issue, however the similar query on other field works fine with a similar logical plan:

      apache drill (mongo.local)> select * from json_test o  where o.`name` = 'thiago';
      +-------------------------------------+--------+
      |                 _id                 |  name  |
      +-------------------------------------+--------+
      | {"$oid":"5cb0e161f0849231dfe16d99"} | thiago |
      +-------------------------------------+--------+
      1 row selected (0.18 seconds)
      
      00-00    Screen : rowType = RecordType(DYNAMIC_STAR **): rowcount = 2.0, cumulative cost = {6.2 rows, 8.2 cpu, 140.0 io, 0.0 network, 0.0 memory}, id = 9981
      00-01      Project(**=[$0]) : rowType = RecordType(DYNAMIC_STAR **): rowcount = 2.0, cumulative cost = {6.0 rows, 8.0 cpu, 140.0 io, 0.0 network, 0.0 memory}, id = 9980
      00-02        Project(**=[$0]) : rowType = RecordType(DYNAMIC_STAR **): rowcount = 2.0, cumulative cost = {4.0 rows, 6.0 cpu, 140.0 io, 0.0 network, 0.0 memory}, id = 9979
      00-03          Scan(table=[[mongo, local, json_test]], groupscan=[MongoGroupScan [MongoScanSpec=MongoScanSpec [dbName=local, collectionName=json_test, filters=Document{{name=Document{{$eq=thiago}}}}], columns=[`**`, `name`]]]) : rowType = RecordType(DYNAMIC_STAR **, ANY name): rowcount = 2.0, cumulative cost = {2.0 rows, 4.0 cpu, 140.0 io, 0.0 network, 0.0 memory}, id = 9978
      

      Datasource JSON file in attachments.
      Upload it to MongoDB:

      vitalii@vitalii-UX331UN:~$ mongoimport --host localhost --db local --collection vitalii_test --file /tmp/test.json
      
      vitalii@vitalii-UX331UN:~$ mongo
      ...
      rs0:PRIMARY> db.vitalii_test.find()
      { "_id" : ObjectId("5cb0e161f0849231dfe16d99"), "name" : "thiago" }
      { "_id" : ObjectId("5cb0e161f0849231dfe16c01"), "name" : "manuel" }
      

        Attachments

          Activity

            People

            • Assignee:
              Unassigned
              Reporter:
              vitalii Vitalii Diravka
            • Votes:
              1 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated: