  1. Apache Drill
  2. DRILL-7428

Drill incorrectly allows a repeated map field to be projected to top level

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      Consider the following query from the MongoDB tests:

      select t.name as name, t.topping.type as type 
        from mongo.%s.`%s` t where t.sales >= 150
      

      The query is used in TestMongoQueries.testUnShardedDBInShardedClusterWithProjectionAndFilter().
       
      Here, topping turns out to be a repeated map, and the query projects members of that map to the top level. The query returns five rows, but the repeated map holds 24 values in total. The Project operator allows the projection, producing an output batch in which most vectors have 5 values, while the column projected from topping, now at the top level and no longer inside the map, has 24 values.

      As a result, the first five values, which all belonged to the first record (record 0), are now spread across the five top-level records, while the values that belonged to records 1-4 are lost.

      Thus, this is a data corruption bug.
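
The misalignment can be illustrated with a small stand-alone sketch. This is not Drill code: it assumes a repeated column is stored as a flat values list plus a per-row offsets list, and the per-row counts (5, 5, 5, 5, 4) and the value names are hypothetical, chosen only to match the 5 rows and 24 values described above.

```python
# Hypothetical data: 5 rows whose repeated "topping" entries total 24 values.
counts_per_row = [5, 5, 5, 5, 4]
topping_types = [
    [f"row{r}-type{k}" for k in range(n)] for r, n in enumerate(counts_per_row)
]
names = [f"name{r}" for r in range(len(counts_per_row))]


def repeated_column(values_per_row):
    """Flatten per-row lists into (offsets, values).

    Assumed layout: row i owns values[offsets[i]:offsets[i + 1]].
    """
    offsets = [0]
    values = []
    for row in values_per_row:
        values.extend(row)
        offsets.append(len(values))
    return offsets, values


def faulty_top_level_projection(values, row_count):
    """Mimic the reported behaviour: ignore offsets, read one value per row."""
    return values[:row_count]


offsets, values = repeated_column(topping_types)
projected = faulty_top_level_projection(values, len(names))

# Pair each top-level row with the value it would now appear to own.
for name, value in zip(names, projected):
    print(name, value)
```

In this sketch every projected value comes from row 0, and the 19 values owned by rows 1-4 never reach the output, which is the corruption described above.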

            People

            • Assignee:
              Unassigned
            • Reporter:
              Paul Rogers