  1. Hadoop Common
  2. HADOOP-12657

Add an option to skip newline on empty files with getMerge -nl

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.6.0, 2.7.1
    • Fix Version/s: 2.8.0, 3.0.0-alpha1
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Added -skip-empty-file option to hadoop fs -getmerge command. With the option, delimiter (LF) is not printed for empty files even if -nl option is used.

      Description

      Hello everyone,

      I recently needed to use the newline option -nl with getMerge because the files I had to merge did not end with one. I was merging all the files from one directory, and unfortunately this directory also included empty files, which led to multiple newlines being appended after some files. I had to remove them manually afterwards.

      In this situation it might be good to have another argument that allows skipping empty files.
      Things one could try in order to implement this feature:

      The call IOUtils.copyBytes(in, out, getConf(), false); doesn't
      return the number of bytes copied. That would be convenient, as one could
      skip appending the newline when 0 bytes were copied. Alternatively, one could check the file size beforehand.
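
      The first idea above can be sketched in plain Java as follows. This is only an illustration using java.io streams, not the code from the attached patches and not Hadoop's IOUtils; the names GetMergeSketch, copyAndCount, merge and mergeStrings are hypothetical and exist only for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: none of these names come from Hadoop.
class GetMergeSketch {

    // Copies everything from in to out and reports how many bytes were copied.
    static long copyAndCount(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[4096];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        return total;
    }

    // Appends each source to out, followed by a newline (the -nl behaviour).
    // When skipEmpty is true, a source that yielded 0 bytes gets no newline.
    static void merge(List<InputStream> sources, OutputStream out, boolean skipEmpty)
            throws IOException {
        for (InputStream in : sources) {
            long copied = copyAndCount(in, out);
            if (copied > 0 || !skipEmpty) {
                out.write('\n');
            }
        }
    }

    // Convenience wrapper that treats each string as the content of one file.
    static String mergeStrings(List<String> contents, boolean skipEmpty) throws IOException {
        List<InputStream> sources = new ArrayList<>();
        for (String content : contents) {
            sources.add(new ByteArrayInputStream(content.getBytes(StandardCharsets.UTF_8)));
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        merge(sources, out, skipEmpty);
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        List<String> files = Arrays.asList("a", "", "b");
        // Without skipping, the empty file contributes a bare newline.
        System.out.println(mergeStrings(files, false).equals("a\n\nb\n")); // prints true
        // With skipping, the empty file contributes nothing.
        System.out.println(mergeStrings(files, true).equals("a\nb\n"));    // prints true
    }
}
```

      The second idea, checking the file size before copying, would avoid the need for a counting copy but requires the size to be known up front.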

      I posted this idea on the mailing list http://mail-archives.apache.org/mod_mbox/hadoop-user/201507.mbox/%3C55B25140.3060005%40trivago.com%3E but I didn't really get many responses, so I thought I might try this way.

        Attachments

        1. HDFS-8836-07.patch
          8 kB
          Kanaka Kumar Avvaru
        2. HDFS-8836-06.patch
          8 kB
          Kanaka Kumar Avvaru
        3. HDFS-8836-05.patch
          7 kB
          Kanaka Kumar Avvaru
        4. HDFS-8836-04.patch
          7 kB
          Kanaka Kumar Avvaru
        5. HDFS-8836-03.patch
          7 kB
          Kanaka Kumar Avvaru
        6. HDFS-8836-02.patch
          7 kB
          Kanaka Kumar Avvaru
        7. HDFS-8836-01.patch
          6 kB
          Kanaka Kumar Avvaru

            People

            • Assignee:
              kanaka Kanaka Kumar Avvaru
              Reporter:
              jfilipiak Jan Filipiak
            • Votes:
              0
              Watchers:
              6
