Apache Drill / DRILL-5612

Random failure in TestMergeJoinWithSchemaChanges


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.11.0
    • Fix Version/s: 1.20.0
    • Component/s: None
    • Labels: None

    Description

      The unit test org.apache.drill.exec.physical.impl.join.TestMergeJoinWithSchemaChanges#testMissingAndNewColumns is subject to random failures, perhaps due to changes in file order in readers.

      The test builds a number of input files, then executes queries against them. On most runs, the output is fine:

      Running org.apache.drill.exec.physical.impl.join.TestMergeJoinWithSchemaChanges#testMissingAndNewColumns
      /home/.../target/1498606483211-0/mergejoin-schemachanges-left
      /home/.../target/1498606483211-1/mergejoin-schemachanges-right
      

      But, on occasion, the query fails:

      org.apache.drill.exec.physical.impl.join.TestMergeJoinWithSchemaChanges
      testMissingAndNewColumns(org.apache.drill.exec.physical.impl.join.TestMergeJoinWithSchemaChanges)  Time elapsed: 0.569 sec  <<< ERROR!
      ...: UNSUPPORTED_OPERATION ERROR: Sort doesn't currently support sorts with changing schemas
      
      Fragment 0:0
      
        (org.apache.drill.exec.exception.SchemaChangeException) Sort currently only supports a single schema.
          org.apache.drill.exec.physical.impl.sort.SortRecordBatchBuilder.build():152
          org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.innerNext():476
      ...
      

      The exception is thrown by this check in SortRecordBatchBuilder.build(), the first frame of the stack trace above:

        public void build(VectorContainer outputContainer) throws SchemaChangeException {
          outputContainer.clear();
          if (batches.keySet().size() > 1) {
            throw new SchemaChangeException("Sort currently only supports a single schema.");
          }
      

      The above code has not changed in quite some time. The failure is in the "legacy" external sort.

      Although the external sort does support schema changes, it only does so in the form of a union vector, which must be enabled. (Other tests validate that schema changes work.)

      What is likely happening here is that the sort sometimes sees two files with differing schemas, while on other runs multiple threads execute so that each sort sees only one file. This speculation can be verified by looking at a log file (not available in the test run that failed) to see whether the scan under the sort read more than one file.
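
      The two cases can be illustrated with a minimal, self-contained sketch. This is not Drill code: the class SingleSchemaSortSketch, its methods, and the column names are hypothetical, and a schema is modeled simply as a list of column names. It only mirrors the shape of the batches.keySet().size() > 1 guard quoted above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model of the single-schema guard; not Drill code.
class SingleSchemaSortSketch {
    // Batches grouped by schema; a schema is modeled as a list of column names.
    private final Map<List<String>, List<String>> batches = new LinkedHashMap<>();

    void add(List<String> schema, String batchName) {
        batches.computeIfAbsent(schema, k -> new ArrayList<>()).add(batchName);
    }

    // Same shape as the quoted guard: more than one distinct schema is rejected.
    void build() {
        if (batches.keySet().size() > 1) {
            throw new IllegalStateException("Sort currently only supports a single schema.");
        }
    }

    // Feeds one batch per schema into a single sort and reports whether build() passes.
    static boolean buildSucceeds(List<List<String>> schemasSeenBySort) {
        SingleSchemaSortSketch sort = new SingleSchemaSortSketch();
        int i = 0;
        for (List<String> schema : schemasSeenBySort) {
            sort.add(schema, "batch-" + i++);
        }
        try {
            sort.build();
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical column names, chosen only for illustration.
        List<String> fileA = Arrays.asList("k", "v");
        List<String> fileB = Arrays.asList("k", "v", "v1");

        // One sort reads both files: two distinct schemas, so build() fails.
        System.out.println(buildSucceeds(Arrays.asList(fileA, fileB)));
        // Each sort reads a single file: one schema each, so build() passes.
        System.out.println(buildSucceeds(Arrays.asList(fileA)));
        System.out.println(buildSucceeds(Arrays.asList(fileB)));
    }
}
```

      Under these assumptions, the first call reports false and the other two report true, which matches the pattern of an occasional failure that depends on how files are distributed across sorts.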

      Or, perhaps the order of the JSON files matters. File order may vary across machines, since listing a directory on Linux does not guarantee any particular order.
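
      If file order does turn out to be the cause, one way to take it out of the picture is to sort the listing explicitly. The sketch below is an illustration only: the class DeterministicListingSketch and the file names are hypothetical, and it does not show how Drill readers or the test actually enumerate files. It relies on the documented behavior that java.io.File.listFiles() makes no guarantee about the order of its result.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper; not part of Drill or of the failing test.
class DeterministicListingSketch {
    // File.listFiles() documents no ordering guarantee, so sort names explicitly.
    static String[] sortedNames(File dir) {
        File[] files = dir.listFiles();
        if (files == null) {
            // Not a directory, or an I/O error occurred.
            return new String[0];
        }
        String[] names = new String[files.length];
        for (int i = 0; i < files.length; i++) {
            names[i] = files[i].getName();
        }
        Arrays.sort(names);
        return names;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("listing-sketch");
        // Create files in a non-alphabetical order on purpose.
        for (String name : new String[] {"c.json", "a.json", "b.json"}) {
            Files.createFile(dir.resolve(name));
        }
        // Prints the names in sorted order regardless of the file system's order.
        System.out.println(Arrays.toString(sortedNames(dir.toFile())));
    }
}
```

      With a fixed order, a run either always reproduces the schema change in the sort or never does, which would help confirm or rule out this explanation.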

          People

            Assignee: Vitalii Diravka
            Reporter: Paul Rogers
            Votes: 0
            Watchers: 3
