Spark / SPARK-6177

Add note in LDA example to remind possible coalesce


Details

    • Type: Improvement
    • Status: Closed
    • Priority: Trivial
    • Resolution: Fixed
    • Affects Version/s: 1.2.1
    • Fix Version/s: 1.4.0
    • Component/s: Examples, MLlib
    • Labels: None

    Description

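      Add a comment to the LDA example that suggests calling coalesce, so that users can avoid the potentially very large number of partitions produced by sc.textFile.

      sc.textFile creates an RDD with at least one partition for each input file, so a corpus stored as many small files yields a very large number of partitions, which degrades LDA performance.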
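
As a rough illustration of the situation described above, the sketch below reads a corpus with sc.textFile and coalesces it only when the partition count exceeds a cap. It is a minimal PySpark-style sketch, not the comment or code that was added to the LDA example: the helper names `target_partitions` and `load_corpus` and the default cap of 64 are assumptions made for illustration, and `sc` is assumed to be an existing SparkContext.

```python
def target_partitions(current, cap):
    """Partition count to aim for: at most `cap`, and never less than 1."""
    return max(1, min(current, cap))


def load_corpus(sc, path, max_partitions=64):
    """Read text files and reduce the partition count if there are too many.

    `sc` is an existing SparkContext. `max_partitions` is a hypothetical cap
    chosen for this sketch, not a value taken from the issue.
    """
    docs = sc.textFile(path)              # at least one partition per input file
    current = docs.getNumPartitions()
    wanted = target_partitions(current, max_partitions)
    # coalesce() without a shuffle merges existing partitions,
    # so it is only useful for reducing the count.
    return docs.coalesce(wanted) if wanted < current else docs
```

A reasonable cap depends on the cluster; a small multiple of the total number of executor cores is a common starting point, but that is a tuning choice rather than something this issue specifies.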


          People

            Assignee: yuhao yang (yuhaoyan)
            Reporter: yuhao yang (yuhaoyan)
            Votes: 0
            Watchers: 2

            Dates

              Created:
              Updated:
              Resolved:

              Time Tracking

                Estimated: 1h
                Remaining: 1h
                Logged: Not Specified