Uploaded image for project: 'Apache Drill'
  1. Apache Drill
  2. DRILL-7082

Inconsistent results with implicit partition columns, multi scans

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Minor
    • Resolution: Fixed
    • 1.15.0
    • 1.17.0
    • None
    • None

    Description

      The runtime behavior of implicit partition columns is wildly inconsistent to the point of being unusable. Consider the following query:

      SELECT * FROM `myTable`
      

      Where myTable is a directory of CSV files, each with schema (a, b, c):

      myTable
      |- file1.csv
      |- nested
         |- file2.csv
      

      Our test files are small. Turn out that, even if we write a test that scans a few files, such as the above example, Drill will group all the reads into a single fragment with a single scan operator. When that happens:

      • The partition columns appear before the data columns: (dir0, a, b, c).
      • The partition columns always appear in every row.

      We get the above result because a single scan operator sees both files and knows the right number of partition columns to create for each.

      But, we know that, if two scans each read files at different depths, the "shallower" one won't see as many partition directories as the "deeper" one. To test this, I modified the text reader to accept a new session option that sets the minimum parallelization. I set it to 2 (same as the number of files.) One could probably also see this by creating large text files so that the Drill parallelizer will choose to create two fragments.

      Then, I ran the above query 10 times. Now, I get these results:

      • Half the time, the first row has only the data columns (a, b, c), the other half of the time the first row has a partition column. (Depending on which file returned data first.)
      • Some of the time the partition column appears in the first position (dir0, a, b, c) and some of the time in the last (a, b, c, dir0). (I have no idea why.)

      The result is, from a two-file query, depending on random factors, your first row schema could be:

      • (a, b, c)
      • (dir0, a, b, c)
      • (a, b, c, dir0)

      In many cases, the second row comes with a hard schema change to a different format.

      The above is demonstrated in the (soon to be provided) TestPartitionRace unit test.

      IMHO, the behavior is basically unusable as any JDBC/ODBC client will see an inconsistent, changing schema. Instead, what a user would expect is:

      • The partition columns are in the same location in every row (preferably at the end, so data columns remain in fixed positions regardless of the number of partition columns.)
      • The same number of columns in every row. This means that all scan operators must use a single uniform partition depth count, preferably set at plan type in the group scan node that has visibility to all the files to scan.

      Attachments

        Activity

          People

            Unassigned Unassigned
            Paul.Rogers Paul Rogers
            Votes:
            1 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: