Spark / SPARK-10135

Percent of pruned partitions is shown wrong

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Trivial
    • Resolution: Fixed
    • Affects Version/s: 1.4.0
    • Fix Version/s: 1.5.0
    • Component/s: SQL
    • Labels:
    • Target Version/s:

      Description

      When reading partitioned Parquet data in Spark SQL, an info message reports how many partitions were selected and what percentage was pruned. The percentage in that message is computed incorrectly.

      Actual:
      "Selected 15 partitions out of 181, pruned -1106.6666666666667% partitions."

      Expected:
      "Selected 15 partitions out of 181, pruned 91.71270718232044% partitions."

      Fix: (I'm a newbie here, so please help make a patch, thanks!)
      In DataSourceStrategy.scala, in the apply() method,

      instead of:
      val percentPruned = (1 - total.toDouble / selected.toDouble) * 100
      it should be:
      val percentPruned = (1 - selected.toDouble / total.toDouble) * 100
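      To make the arithmetic concrete, here is a minimal standalone sketch that evaluates both formulas for the numbers in the report (15 selected out of 181). The object `PruningPercent` and the helper `percentPruned` are hypothetical names used only for illustration and are not part of Spark. The guard for `total == 0` is also an assumption, not something taken from DataSourceStrategy.scala.

```scala
object PruningPercent {
  // Hypothetical helper (not Spark code): percentage of partitions that were pruned.
  // Assumption: when there are no partitions at all, report 0% pruned instead of NaN.
  def percentPruned(selected: Int, total: Int): Double =
    if (total == 0) 0.0
    else (1 - selected.toDouble / total.toDouble) * 100

  def main(args: Array[String]): Unit = {
    val selected = 15
    val total = 181

    // Formula from the report's "instead of" line: the ratio is inverted,
    // so the result is negative whenever selected < total.
    val buggy = (1 - total.toDouble / selected.toDouble) * 100

    // Formula from the report's "should be" line.
    val fixed = percentPruned(selected, total)

    println(s"Selected $selected partitions out of $total, pruned $buggy% partitions. (buggy)")
    println(s"Selected $selected partitions out of $total, pruned $fixed% partitions. (fixed)")
  }
}
```

      With the inverted ratio, 181 / 15 is about 12.07, so the result is roughly -1106.67. With the corrected ratio, 15 / 181 is about 0.083, so the result is roughly 91.71, matching the expected message above.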

            People

            • Assignee:
              rxin Reynold Xin
              Reporter:
              romi-totango Romi Kuntsman
            • Votes:
              0
              Watchers:
              2

                Time Tracking

                Estimated: 1h
                Remaining: 1h
                Logged: Not Specified