  1. Spark
  2. SPARK-2867

saveAsHadoopFile() in PairRDDFunctions.scala should allow use of other OutputCommitter classes

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Not A Problem
    • Affects Version/s: 1.0.0, 1.1.0
    • Fix Version/s: None
    • Component/s: Spark Core
    • Labels: None

    Description

      saveAsHadoopFile() in PairRDDFunctions.scala hard-codes the OutputCommitter class as FileOutputCommitter, because of the following line in the source:

      hadoopConf.setOutputCommitter(classOf[FileOutputCommitter])

      However, the OutputCommitter is configurable in a regular Hadoop MapReduce program. Users can set "mapred.output.committer.class" to change the committer class, as other Hadoop programs do.
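      For reference, a minimal sketch of how a plain Hadoop program might select a committer through this key. MyOutputCommitter and CommitterConfigExample are hypothetical names used only for illustration, and the sketch assumes the old org.apache.hadoop.mapred API is on the classpath:

```scala
import org.apache.hadoop.mapred.{FileOutputCommitter, JobConf, OutputCommitter}

// Hypothetical committer for illustration only; it inherits all behavior
// from FileOutputCommitter and adds nothing of its own.
class MyOutputCommitter extends FileOutputCommitter

object CommitterConfigExample {
  def main(args: Array[String]): Unit = {
    val conf = new JobConf()

    // Register the committer under the key named in this issue.
    // conf.setOutputCommitter(classOf[MyOutputCommitter]) writes the same key.
    conf.setClass(
      "mapred.output.committer.class",
      classOf[MyOutputCommitter],
      classOf[OutputCommitter])

    // Read the setting back to confirm which class is registered.
    println(conf.get("mapred.output.committer.class"))
  }
}
```

      With the hard-coded assignment described above, a setting like this is overwritten before the job runs.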

      saveAsHadoopFile() should drop this hard-coded assignment and provide a way to specify the OutputCommitter it uses.
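      One way the request could be met is to fall back to FileOutputCommitter only when the caller has not chosen a committer. The sketch below is an assumption about what such a change might look like, not Spark's actual code; applyDefaultCommitter and CommitterDefaulting are hypothetical names, and it assumes "mapred.output.committer.class" is absent from the configuration unless the user sets it:

```scala
import org.apache.hadoop.mapred.{FileOutputCommitter, JobConf}

object CommitterDefaulting {
  // Hypothetical helper, not part of Spark: apply the default committer
  // only if the user has not already configured one.
  def applyDefaultCommitter(hadoopConf: JobConf): Unit = {
    // Check the raw key rather than getOutputCommitter, because
    // getOutputCommitter falls back to FileOutputCommitter when unset.
    if (hadoopConf.get("mapred.output.committer.class") == null) {
      hadoopConf.setOutputCommitter(classOf[FileOutputCommitter])
    }
  }
}
```

      A caller would then pass a JobConf with the committer already set to the saveAsHadoopFile() overload that accepts a JobConf, and the setting would survive.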

          People

            Assignee: Unassigned
            Reporter: Joseph Su (joesu)
            Votes: 1
            Watchers: 2
