Details
-
Bug
-
Status: Open
-
Major
-
Resolution: Unresolved
-
1.15.0
-
None
Description
The following query works fine for JSON file:
apache drill (mongo.local)> select * from dfs.`/tmp/test.json` o; +-------------------------------------+--------+ | _id | name | +-------------------------------------+--------+ | {"$oid":"5cb0e161f0849231dfe16d99"} | thiago | | {"$oid":"5cb0e161f0849231dfe16c01"} | manuel | +-------------------------------------+--------+ 2 rows selected (0.129 seconds) apache drill (mongo.local)> select * from dfs.`/tmp/test.json` o where o.`_id`.`$oid` = '5cb0e161f0849231dfe16d99'; +-------------------------------------+--------+ | _id | name | +-------------------------------------+--------+ | {"$oid":"5cb0e161f0849231dfe16d99"} | thiago | +-------------------------------------+--------+ 1 row selected (0.168 seconds)
But the same query on the same MongoDB collection returns wrong result:
apache drill (mongo.local)> use mongo.local; +------+-----------------------------------------+ | ok | summary | +------+-----------------------------------------+ | true | Default schema changed to [mongo.local] | +------+-----------------------------------------+ 1 row selected (0.107 seconds) apache drill (mongo.local)> alter session set store.mongo.bson.record.reader = false; +------+-----------------------------------------+ | ok | summary | +------+-----------------------------------------+ | true | store.mongo.bson.record.reader updated. | +------+-----------------------------------------+ 1 row selected (0.066 seconds) apache drill (mongo.local)> select * from json_test; +-------------------------------------+--------+ | _id | name | +-------------------------------------+--------+ | {"$oid":"5cb0e161f0849231dfe16d99"} | thiago | | {"$oid":"5cb0e161f0849231dfe16c01"} | manuel | +-------------------------------------+--------+ 2 rows selected (0.115 seconds) apache drill (mongo.local)> select * from json_test o where o.`_id`.`$oid` = '5cb0e161f0849231dfe16d99'; +----+ | ** | +----+ +----+ No rows selected (0.121 seconds)
The logical plan for the last query:
00-00 Screen : rowType = RecordType(DYNAMIC_STAR **): rowcount = 2.0, cumulative cost = {6.2 rows, 8.2 cpu, 140.0 io, 0.0 network, 0.0 memory}, id = 10157 00-01 Project(**=[$0]) : rowType = RecordType(DYNAMIC_STAR **): rowcount = 2.0, cumulative cost = {6.0 rows, 8.0 cpu, 140.0 io, 0.0 network, 0.0 memory}, id = 10156 00-02 Project(**=[$0]) : rowType = RecordType(DYNAMIC_STAR **): rowcount = 2.0, cumulative cost = {4.0 rows, 6.0 cpu, 140.0 io, 0.0 network, 0.0 memory}, id = 10155 00-03 Scan(table=[[mongo, local, json_test]], groupscan=[MongoGroupScan [MongoScanSpec=MongoScanSpec [dbName=local, collectionName=json_test, filters=Document{{_id=Document{{$eq=5cb0e161f0849231dfe16d99}}}}], columns=[`**`, `_id`]]]) : rowType = RecordType(DYNAMIC_STAR **, ANY _id): rowcount = 2.0, cumulative cost = {2.0 rows, 4.0 cpu, 140.0 io, 0.0 network, 0.0 memory}, id = 10154
The same query, but with specified filed in project works fine:
apache drill (mongo.local)> select `_id` from json_test o where o.`_id`.`$oid` = '5cb0e161f0849231dfe16d99'; +-------------------------------------+ | _id | +-------------------------------------+ | {"$oid":"5cb0e161f0849231dfe16d99"} | +-------------------------------------+ 1 row selected (0.133 seconds)
It looks like a planning issue, however the similar query on other field works fine with a similar logical plan:
apache drill (mongo.local)> select * from json_test o where o.`name` = 'thiago'; +-------------------------------------+--------+ | _id | name | +-------------------------------------+--------+ | {"$oid":"5cb0e161f0849231dfe16d99"} | thiago | +-------------------------------------+--------+ 1 row selected (0.18 seconds)
00-00 Screen : rowType = RecordType(DYNAMIC_STAR **): rowcount = 2.0, cumulative cost = {6.2 rows, 8.2 cpu, 140.0 io, 0.0 network, 0.0 memory}, id = 9981 00-01 Project(**=[$0]) : rowType = RecordType(DYNAMIC_STAR **): rowcount = 2.0, cumulative cost = {6.0 rows, 8.0 cpu, 140.0 io, 0.0 network, 0.0 memory}, id = 9980 00-02 Project(**=[$0]) : rowType = RecordType(DYNAMIC_STAR **): rowcount = 2.0, cumulative cost = {4.0 rows, 6.0 cpu, 140.0 io, 0.0 network, 0.0 memory}, id = 9979 00-03 Scan(table=[[mongo, local, json_test]], groupscan=[MongoGroupScan [MongoScanSpec=MongoScanSpec [dbName=local, collectionName=json_test, filters=Document{{name=Document{{$eq=thiago}}}}], columns=[`**`, `name`]]]) : rowType = RecordType(DYNAMIC_STAR **, ANY name): rowcount = 2.0, cumulative cost = {2.0 rows, 4.0 cpu, 140.0 io, 0.0 network, 0.0 memory}, id = 9978
Datasource JSON file in attachments.
Upload it to MongoDB:
vitalii@vitalii-UX331UN:~$ mongoimport --host localhost --db local --collection vitalii_test --file /tmp/test.json
vitalii@vitalii-UX331UN:~$ mongo ... rs0:PRIMARY> db.vitalii_test.find() { "_id" : ObjectId("5cb0e161f0849231dfe16d99"), "name" : "thiago" } { "_id" : ObjectId("5cb0e161f0849231dfe16c01"), "name" : "manuel" }