Hadoop Map/Reduce / MAPREDUCE-2046

A CombineFileInputSplit cannot be less than a dfs block

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.22.0
    • Component/s: None
    • Labels:
      None

      Description

      I ran into this while testing some Hive features.

      Whether we use HiveInputFormat or CombineHiveInputFormat, a split cannot be smaller than one DFS block.
      This is a problem if we want to increase the block size for older data to reduce memory consumption on the
      NameNode, because the minimum split size then grows along with the block size.

      It would be useful if the input split size were independent of the DFS block size.
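
To make the request concrete, here is a minimal arithmetic sketch in Python. It is not Hadoop code and does not reflect the attached patches: `carve_splits` and its `max_split_size` parameter are hypothetical names used only for illustration. It compares block-bound splitting (one split per block) with splitting under a cap that may be smaller than the block size.

```python
def carve_splits(file_length, block_size, max_split_size=None):
    """Return (offset, length) pairs covering a file of file_length bytes.

    Hypothetical helper for illustration only, not a Hadoop API.
    With max_split_size=None, every split is one whole block.
    With a cap, each block is cut into pieces of at most that size,
    so a split can be smaller than a block.
    """
    if file_length < 0 or block_size <= 0:
        raise ValueError("file_length must be >= 0 and block_size > 0")
    if max_split_size is not None and max_split_size <= 0:
        raise ValueError("max_split_size must be > 0 when given")

    splits = []
    for block_start in range(0, file_length, block_size):
        block_end = min(block_start + block_size, file_length)
        # No cap: the piece is the whole block. Cap: at most max_split_size.
        piece = (block_end - block_start) if max_split_size is None else max_split_size
        offset = block_start
        while offset < block_end:
            length = min(piece, block_end - offset)
            splits.append((offset, length))
            offset += length
    return splits


MB = 1024 * 1024

# A 1 GB file stored in 512 MB blocks.
block_bound = carve_splits(1024 * MB, 512 * MB)
capped = carve_splits(1024 * MB, 512 * MB, max_split_size=128 * MB)

print(len(block_bound), "splits when a split must cover a whole block")
print(len(capped), "splits when splits are capped at 128 MB")
```

In this sketch, raising the block size from 128 MB to 512 MB would cut the number of block-bound splits by a factor of four, while the capped variant keeps the same split count regardless of block size. That decoupling is what the issue asks for.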

        Attachments

        1. patch-2046-ydist.txt
          6 kB
          Amareshwari Sriramadasu
        2. combineFileInputFormatMaxSize2.txt
          7 kB
          dhruba borthakur
        3. combineFileInputFormatMaxSize.txt
          7 kB
          dhruba borthakur

            People

            • Assignee: dhruba borthakur
            • Reporter: Namit Jain
            • Votes: 0
            • Watchers: 11
