  1. Apache Tez
  2. TEZ-4057

Fix Unsorted broadcast shuffle umasks

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.9.2
    • Fix Version/s: 0.10.1, 0.9.3
    • Component/s: None
    • Labels:
      None

      Description

      
          if (numPartitions == 1 && !pipelinedShuffle) {
            //special case, where in only one partition is available.
            finalOutPath = outputFileHandler.getOutputFileForWrite();
            finalIndexPath = outputFileHandler.getOutputIndexFileForWrite(indexFileSizeEstimate);
            skipBuffers = true;
            writer = new IFile.Writer(conf, rfs, finalOutPath, keyClass, valClass,
                codec, outputRecordsCounter, outputRecordBytesCounter);
          } else {
            skipBuffers = false;
            writer = null;
          }
      

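      Broadcast outputs have a single partition, so they take the special-case path above, which writes directly to the final output file and never updates that file's permissions to account for the umask.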

      total 8.0K
      -rw------- 1 hive hadoop 15 Mar 27 20:30 file.out
      -rw-r----- 1 hive hadoop 32 Mar 27 20:30 file.out.index
      

      As a result, the index file ends up group-readable while the .out file is readable only by its owner.
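
One way to avoid this kind of mismatch is to set the mode explicitly once the file exists, instead of relying on the process umask. The sketch below illustrates the idea with java.nio.file rather than the Hadoop FileSystem API that Tez uses, so it is not the change made in TEZ-4057.1.patch. The class SpillFilePerms, its two methods, and the rw-r----- target mode (taken from file.out.index in the listing above) are assumptions for illustration, and the code needs a POSIX file system.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

// Hypothetical helper, not part of Tez: forces a fixed mode on an output file
// so the result does not depend on the umask in effect when it was created.
final class SpillFilePerms {
  // rw-r----- (0640): assumed target mode, matching file.out.index in the listing.
  static final Set<PosixFilePermission> SPILL_FILE_PERMS =
      PosixFilePermissions.fromString("rw-r-----");

  private SpillFilePerms() {}

  // Creates an empty temp file that is readable and writable by its owner only,
  // mimicking the rw------- mode seen on file.out.
  static Path createOwnerOnlyTempFile() {
    try {
      return Files.createTempFile("file", ".out",
          PosixFilePermissions.asFileAttribute(
              PosixFilePermissions.fromString("rw-------")));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Sets the target mode if the file does not already have it, then returns
  // the file's resulting mode as an "rwxrwxrwx"-style string.
  static String ensureSpillPerms(Path file) {
    try {
      Set<PosixFilePermission> current = Files.getPosixFilePermissions(file);
      if (!current.equals(SPILL_FILE_PERMS)) {
        Files.setPosixFilePermissions(file, SPILL_FILE_PERMS);
      }
      return PosixFilePermissions.toString(Files.getPosixFilePermissions(file));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

Calling SpillFilePerms.ensureSpillPerms on both the data file and the index file after they are written would leave the two with the same mode. The check before the set call only skips a redundant file system operation when the mode is already correct.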

        Attachments

        1. TEZ-4057.1.patch
          2 kB
          Eric Wohlstadter

            People

            • Assignee:
              ewohlstadter Eric Wohlstadter
              Reporter:
              gopalv Gopal Vijayaraghavan
            • Votes:
              0
              Watchers:
              3
