  1. Hadoop YARN
  2. YARN-4383

TeraGen Application allows same output directory for multiple jobs

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved

    Description

      When TeraGen is run multiple times with the same output directory, it should validate the directory and fail.

      In some cases, however, the job continues and later fails with exceptions.

      I think the cause is in

       org.apache.hadoop.examples.terasort.TeraOutputFormat.checkOutputSpecs (TeraOutputFormat.java): it permits an output directory that already exists if the directory has only one child and that child is PARTITION_FILENAME.
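
      The behaviour described above can be modelled without a cluster. The following is a minimal sketch, not the Hadoop source: it uses java.nio.file on a local directory in place of the Hadoop FileSystem API, the class and method names are hypothetical, and the value "_partition.lst" for PARTITION_FILENAME is an assumption. It contrasts the lenient check described in this report with the strict check the reporter expects for TeraGen.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class OutputDirCheckSketch {
    // Assumed value; stands in for the PARTITION_FILENAME constant named in the report.
    public static final String PARTITION_FILENAME = "_partition.lst";

    // Lenient check as described in the report: an existing output directory
    // is accepted when its only child is a regular file named PARTITION_FILENAME.
    public static boolean lenientCheckPasses(Path outDir) throws IOException {
        if (!Files.exists(outDir)) {
            return true; // nothing there yet, so the job may create it
        }
        List<Path> kids = listChildren(outDir);
        return kids.size() == 1
                && Files.isRegularFile(kids.get(0))
                && PARTITION_FILENAME.equals(kids.get(0).getFileName().toString());
    }

    // Strict check the reporter expects: any existing output directory is rejected.
    public static boolean strictCheckPasses(Path outDir) {
        return !Files.exists(outDir);
    }

    // Lists the direct children of a directory and closes the stream afterwards.
    private static List<Path> listChildren(Path dir) throws IOException {
        try (Stream<Path> s = Files.list(dir)) {
            return s.collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Simulate an output directory left behind with only the partition file.
        Path outDir = Files.createTempDirectory("teragen-out");
        Files.createFile(outDir.resolve(PARTITION_FILENAME));

        System.out.println("lenient check passes: " + lenientCheckPasses(outDir)); // true
        System.out.println("strict check passes:  " + strictCheckPasses(outDir));  // false
    }
}
```

      In this model, a second job pointed at the same directory passes the lenient check whenever only the partition file is present, which matches the case the report says is wrongly permitted.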
      

          People

            Assignee: Unassigned
            Reporter: tongshiquan
            Votes: 0
            Watchers: 2
