  1. Spark
  2. SPARK-32742

FileOutputCommitter warns "No Output found for attempt"

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.4.0
    • Fix Version/s: None
    • Component/s: Spark Core
    • Labels: None
    • Environment: Hadoop 2.6.0-cdh5.16.2, YARN (MR2 included)


    Description

      Hi team,

      This is my first time reporting an issue here.

      We submitted and ran a Spark job on the cluster.

      We found that one of the Parquet output partitions is missing from the output directory. We checked the Spark job log, and every task is reported as successful. The output record count matches the expected number.

      However, when we checked the container log, we found the warning "No Output found for attempt_20200819094307_0003_m_000002_11". After this warning, the output was not moved from the taskAttemptPath to the output directory. As a result, some of the output rows are missing.

      Re-running the job resolved the issue, but this report is critical for us. We would appreciate any advice on the cause.
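One way to confirm which partition is absent is to compare the part files in the output directory against the expected number of partitions. The following is a minimal sketch, not part of the original report: it assumes the usual Spark naming pattern `part-<index>-<uuid>-...`, and the function name `missing_part_indices` and the file names in the example listing are hypothetical. An index with no file is not proof of data loss on its own, since a partition that produced no rows may legitimately write no file.

```python
import re

# Spark-style part file names look like part-00002-<uuid>-c000.snappy.parquet;
# the first numeric group is the index of the partition that wrote the file.
PART_FILE = re.compile(r"^part-(\d+)-")


def missing_part_indices(file_names, expected_partitions):
    """Return partition indices in [0, expected_partitions) that have no part file."""
    seen = set()
    for name in file_names:
        match = PART_FILE.match(name)
        if match:
            seen.add(int(match.group(1)))
    return [i for i in range(expected_partitions) if i not in seen]


if __name__ == "__main__":
    # Hypothetical listing of the output directory (file names are made up).
    listing = [
        "_SUCCESS",
        "part-00000-1f0c-c000.snappy.parquet",
        "part-00001-1f0c-c000.snappy.parquet",
        "part-00003-1f0c-c000.snappy.parquet",
    ]
    print(missing_part_indices(listing, 4))
```

The listing itself would come from whatever tool is available on the cluster (for example a directory listing of the output path); only the file names are needed.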

       

      Below are the container logs:

       

      20/08/19 09:44:57 INFO output.FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
      20/08/19 09:44:57 INFO datasources.SQLHadoopMapReduceCommitProtocol: Using user defined output committer class parquet.hadoop.ParquetOutputCommitter
      20/08/19 09:44:57 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 2
      20/08/19 09:44:57 INFO output.FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
      20/08/19 09:44:57 INFO datasources.SQLHadoopMapReduceCommitProtocol: Using output committer class parquet.hadoop.ParquetOutputCommitter
      20/08/19 09:44:57 INFO codegen.CodeGenerator: Code generated in 12.370642 ms
      20/08/19 09:44:57 INFO codegen.CodeGenerator: Code generated in 6.927118 ms
      20/08/19 09:44:57 INFO codegen.CodeGenerator: Code generated in 12.004204 ms
      20/08/19 09:44:57 INFO parquet.ParquetWriteSupport: Initialized Parquet WriteSupport with Catalyst schema:
      ..... (skipped)
      20/08/19 09:44:57 WARN output.FileOutputCommitter: No Output found for attempt_20200819094307_0003_m_000002_11
      20/08/19 09:44:57 INFO mapred.SparkHadoopMapRedUtil: attempt_20200819094307_0003_m_000002_11: Committed
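Because the Spark job log showed every task as successful, the warning is only visible in the container logs. A small script along these lines could scan collected container logs for the same warning and break each attempt ID into its fields. This is a sketch under stated assumptions: the function names are hypothetical, the warning pattern is taken from the log excerpt above, and the field split follows the general Hadoop attempt ID layout `attempt_<jtIdentifier>_<jobId>_<type>_<taskId>_<attemptId>`. How Spark populates each field may vary by version.

```python
import re

# Matches the warning line shown in the container log above.
WARNING = re.compile(
    r"WARN\s+\S*FileOutputCommitter: No Output found for (attempt_\S+)"
)
# Hadoop attempt ID layout: attempt_<jtIdentifier>_<jobId>_<type>_<taskId>_<attemptId>
ATTEMPT_ID = re.compile(r"^attempt_(\d+)_(\d+)_([a-z])_(\d+)_(\d+)$")


def find_no_output_attempts(log_lines):
    """Return the attempt IDs named in 'No Output found' warnings, in log order."""
    found = []
    for line in log_lines:
        match = WARNING.search(line)
        if match:
            found.append(match.group(1))
    return found


def parse_attempt_id(attempt_id):
    """Split an attempt ID string into its component fields."""
    match = ATTEMPT_ID.match(attempt_id)
    if match is None:
        raise ValueError(f"not an attempt ID: {attempt_id!r}")
    jt_identifier, job_id, task_type, task_id, attempt = match.groups()
    return {
        "jt_identifier": jt_identifier,
        "job_id": int(job_id),
        "task_type": task_type,
        "task_id": int(task_id),
        "attempt": int(attempt),
    }


if __name__ == "__main__":
    # Two lines copied from the container log in this report.
    sample = [
        "20/08/19 09:44:57 WARN output.FileOutputCommitter: No Output found for attempt_20200819094307_0003_m_000002_11",
        "20/08/19 09:44:57 INFO mapred.SparkHadoopMapRedUtil: attempt_20200819094307_0003_m_000002_11: Committed",
    ]
    for attempt in find_no_output_attempts(sample):
        print(attempt, parse_attempt_id(attempt))
```

For the warning in this report, the task ID field is 2, which can be compared against the index of the part file missing from the output directory.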
      

       

          People

            Assignee: Unassigned
            Reporter: Ryan Luo (crazyredevil)