  1. Hadoop Common
  2. HADOOP-4927

Part files on the output filesystem are created irrespective of whether the corresponding task has anything to write there


Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed
    • Release Note: All output part files are created regardless of whether the corresponding task has output.

    Description

      When OutputFormat.getRecordWriter is invoked, a part file is created on the output filesystem. However, the resulting RecordWriter is not used until the task (the user's code) calls OutputCollector.collect. As a result, a task that never calls OutputCollector.collect still leaves an empty part file behind.
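
      The behaviour described above, and one possible way around it, can be illustrated with a minimal, self-contained sketch. It does not use the Hadoop API and is not taken from the attached patches: SimpleRecordWriter, EagerFileWriter and LazyFileWriter are hypothetical names, and a local temporary directory stands in for the output filesystem. The eager writer opens its part file in the constructor, while the lazy writer defers file creation until the first record is written.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class LazyWriterSketch {

    /** Hypothetical stand-in for a record writer; this is not Hadoop's RecordWriter. */
    interface SimpleRecordWriter {
        void write(String key, String value) throws IOException;
        void close() throws IOException;
    }

    /** Eager variant: the part file exists as soon as the writer is constructed. */
    static class EagerFileWriter implements SimpleRecordWriter {
        private final BufferedWriter out;

        EagerFileWriter(Path partFile) throws IOException {
            // Opening the writer creates the (still empty) file immediately.
            this.out = Files.newBufferedWriter(partFile, StandardCharsets.UTF_8);
        }

        public void write(String key, String value) throws IOException {
            out.write(key + "\t" + value + "\n");
        }

        public void close() throws IOException {
            out.close();
        }
    }

    /** Lazy variant: the part file is created only when the first record arrives. */
    static class LazyFileWriter implements SimpleRecordWriter {
        private final Path partFile;
        private BufferedWriter out; // stays null until write() is called

        LazyFileWriter(Path partFile) {
            this.partFile = partFile;
        }

        public void write(String key, String value) throws IOException {
            if (out == null) {
                out = Files.newBufferedWriter(partFile, StandardCharsets.UTF_8);
            }
            out.write(key + "\t" + value + "\n");
        }

        public void close() throws IOException {
            // Nothing to close (and no file on disk) if no record was written.
            if (out != null) {
                out.close();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("lazy-writer-sketch");
        Path eagerPart = dir.resolve("part-00000");
        Path lazyPart = dir.resolve("part-00001");

        // Both "tasks" finish without emitting a single record.
        new EagerFileWriter(eagerPart).close();
        new LazyFileWriter(lazyPart).close();

        System.out.println("eager part file exists: " + Files.exists(eagerPart));
        System.out.println("lazy part file exists:  " + Files.exists(lazyPart));
    }
}
```

      In this sketch the eager writer leaves an empty part-00000 behind, whereas the lazy writer leaves no file when nothing was written. Whether missing part files are acceptable depends on downstream consumers, so a lazy wrapper of this kind is probably best treated as opt-in.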

      Attachments

        1. hadoop-4927.patch
          15 kB
          Jothi Padmanabhan
        2. hadoop-4927-v1.patch
          17 kB
          Jothi Padmanabhan
        3. hadoop-4927-v2.patch
          18 kB
          Jothi Padmanabhan
        4. hadoop-4927-v3.patch
          24 kB
          Jothi Padmanabhan
        5. hadoop-4927-v4.patch
          33 kB
          Jothi Padmanabhan
        6. hadoop-4927-v5.patch
          34 kB
          Jothi Padmanabhan
        7. hadoop-4927-v6.patch
          34 kB
          Jothi Padmanabhan
        8. hadoop-4927-y20.patch
          34 kB
          Jothi Padmanabhan

          People

            Assignee: Jothi Padmanabhan (jothipn)
            Reporter: Devaraj Das (ddas)
            Votes: 0
            Watchers: 5
