Spark / SPARK-20407

ParquetQuerySuite 'Enabling/disabling ignoreCorruptFiles' flaky test


    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.2.0
    • Fix Version/s: 2.1.1, 2.2.0
    • Component/s: Tests
    • Labels:


      The ParquetQuerySuite test "Enabling/disabling ignoreCorruptFiles" fails intermittently. When one task fails, the driver-side call returns and the test code continues, but other tasks may still be running; those tasks are only killed once they reach their next kill point.

      This causes two specific problems:
      1. Files can be closed some time after the test finishes, so DebugFilesystem.assertNoOpenStreams fails. One solution is to change SharedSQLContext so that it calls assertNoOpenStreams inside eventually {}.

      2. The ParquetFileReader constructor in Apache Parquet 1.8.2 can leak a stream. The stream is opened at line 538, and if the next line throws an exception the constructor fails, leaving Spark with no way to close the file.
      This happens in this test because the test deletes the temporary directory at the end, while tasks might still be running. Deleting the directory causes the constructor to fail.
      A possible solution is to call Thread.sleep at the end of the test, or to wait until all tasks have definitely been killed before the test finishes.
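
      The retry idea behind the first proposal can be sketched in plain Scala. This is only an illustration, not the actual patch: the `openStreams` counter is a hypothetical stand-in for the bookkeeping done by DebugFilesystem, and the hand-rolled `eventually` helper only mimics the ScalaTest construct of the same name (in a real suite one would presumably mix in `org.scalatest.concurrent.Eventually` instead). The straggler thread plays the role of a task that is killed, and closes its stream, shortly after the test body has returned.

```scala
import java.util.concurrent.atomic.AtomicInteger

object EventuallySketch {
  // Hypothetical stand-in for DebugFilesystem's tracking of open streams.
  val openStreams = new AtomicInteger(0)

  // Fails if any stream is still open, like the check described in the ticket.
  def assertNoOpenStreams(): Unit = {
    val n = openStreams.get()
    if (n != 0) throw new IllegalStateException(s"$n stream(s) still open")
  }

  // Hand-rolled retry helper (an assumption, not ScalaTest's implementation):
  // re-runs the block until it stops throwing or the timeout expires.
  def eventually[T](timeoutMs: Long, intervalMs: Long)(block: => T): T = {
    val deadline = System.nanoTime() + timeoutMs * 1000000L
    var result: Option[T] = None
    while (result.isEmpty) {
      try {
        result = Some(block)
      } catch {
        case e: Exception =>
          // Give up and surface the last failure once the deadline has passed.
          if (System.nanoTime() >= deadline) throw e
          Thread.sleep(intervalMs)
      }
    }
    result.get
  }

  def main(args: Array[String]): Unit = {
    // A still-running task holds one stream open when the test body returns.
    openStreams.incrementAndGet()
    val straggler = new Thread(new Runnable {
      def run(): Unit = {
        Thread.sleep(200) // the task reaches its kill point a little later
        openStreams.decrementAndGet()
      }
    })
    straggler.start()

    // An immediate assertion would fail here; retrying tolerates the delay.
    eventually(timeoutMs = 5000, intervalMs = 50) { assertNoOpenStreams() }
    straggler.join()
    println("no open streams")
  }
}
```

      Retrying the assertion only addresses the first problem. It does not prevent the constructor leak described in the second, which still requires the tasks to have finished before the temporary directory is deleted.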




            • Assignee:
              bograd Bogdan Raducanu


              • Created: