SPARK-18432: Fix HDFS block size in programming guide

    Details

    • Type: Documentation
    • Status: Resolved
    • Priority: Trivial
    • Resolution: Fixed
    • Affects Version/s: 2.0.1
    • Fix Version/s: 2.0.3, 2.1.0
    • Component/s: Documentation
    • Labels: None

      Description

      http://spark.apache.org/docs/latest/programming-guide.html
      "By default, Spark creates one partition for each block of the file (blocks being 64MB by default in HDFS)"

      Currently, the default block size in HDFS is 128MB.
      The default was already increased to 128MB in Hadoop 2.2.0 (the oldest Hadoop version supported by Spark): https://issues.apache.org/jira/browse/HDFS-4053

      Since this explanation is confusing, I'd like to fix the value from 64MB to 128MB.
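
      For illustration (not part of the original report), a minimal sketch of how the block size maps to partition count, assuming a hypothetical 1GB file on HDFS and the Hadoop 2.2+ default 128MB block size; the file path and app name below are made up:

      import org.apache.spark.{SparkConf, SparkContext}

      object BlockSizePartitions {
        def main(args: Array[String]): Unit = {
          val sc = new SparkContext(new SparkConf().setAppName("block-size-demo"))

          // Hypothetical 1GB file on HDFS (path is an assumption for illustration).
          // With the Hadoop 2.2+ default block size of 128MB, the file spans
          // 1024MB / 128MB = 8 blocks, so Spark creates 8 partitions by default,
          // not the 16 implied by the guide's old 64MB figure.
          val lines = sc.textFile("hdfs:///data/one-gigabyte-file.txt")
          println(s"Partitions: ${lines.getNumPartitions}")

          sc.stop()
        }
      }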

            People

            • Assignee:
              Noritaka Sekiyama (moomindani)
            • Reporter:
              Noritaka Sekiyama (moomindani)
            • Votes:
              0
            • Watchers:
              2

              Dates

              • Created:
              • Updated:
              • Resolved: