  1. Spark
  2. SPARK-20443

The blockSize of MLlib ALS should be settable by the user

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Incomplete
    • Affects Version/s: 2.3.0
    • Fix Version/s: None
    • Component/s: ML, MLlib

    Description

      The blockSize of MLlib ALS has a large effect on ALS performance.
      In our test, a blockSize of 128 was about 4x faster than a blockSize of 4096 (the default value).
      Our test results are as follows:
      blockSize (recommendationForAll time):
      128 (124s), 256 (160s), 512 (184s), 1024 (244s), 2048 (332s), 4096 (488s), 8192 (OOM)
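
The "about 4x" figure can be checked directly from the timings listed above. The following is a minimal sketch in Python; the variable names are illustrative and do not come from the Spark code base, and the 8192 run is left out because it ended in OOM.

```python
# Timings reported in this issue: blockSize -> recommendationForAll time (seconds).
# The 8192 run is omitted because it failed with OOM.
times_s = {128: 124, 256: 160, 512: 184, 1024: 244, 2048: 332, 4096: 488}

DEFAULT_BLOCK_SIZE = 4096  # default value as stated in this issue

# Speedup of each setting relative to the default blockSize.
baseline = times_s[DEFAULT_BLOCK_SIZE]
speedups = {bs: baseline / t for bs, t in times_s.items()}

for bs in sorted(speedups):
    print(f"blockSize={bs:5d}  time={times_s[bs]:4d}s  speedup={speedups[bs]:.2f}x")
```

In these measurements every halving of blockSize from 4096 down to 128 reduced the run time.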

      The Test Environment:
      3 workers, each with 10 cores, 30 GB of memory, and 1 executor.
      The data: 480,000 users and 17,000 items.
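
To give a feel for what blockSize means at this data size, here is a rough back-of-the-envelope sketch in Python. It rests on assumptions that are not stated in this issue: that users and items are each split into blocks of at most blockSize rows, that every user block is paired with every item block, and that each pair produces one dense blockSize x blockSize matrix of 8-byte scores. The function `block_stats` is hypothetical and only does the arithmetic.

```python
NUM_USERS = 480_000   # from the test data in this issue
NUM_ITEMS = 17_000    # from the test data in this issue
TOTAL_CORES = 3 * 10  # 3 workers x 10 cores each


def ceil_div(n: int, d: int) -> int:
    """Integer ceiling division."""
    return (n + d - 1) // d


def block_stats(block_size: int) -> dict:
    """Hypothetical helper: block counts and per-pair score size for a blockSize."""
    user_blocks = ceil_div(NUM_USERS, block_size)
    item_blocks = ceil_div(NUM_ITEMS, block_size)
    # Assumption: one dense block_size x block_size matrix of 8-byte scores
    # per (user block, item block) pair; this is an upper bound for full blocks.
    score_mib = block_size * block_size * 8 / 2**20
    return {
        "user_blocks": user_blocks,
        "item_blocks": item_blocks,
        "pairs": user_blocks * item_blocks,
        "score_mib": score_mib,
    }


for bs in (128, 256, 512, 1024, 2048, 4096, 8192):
    s = block_stats(bs)
    print(f"blockSize={bs:5d}  pairs={s['pairs']:7d}  "
          f"pairs/core={s['pairs'] / TOTAL_CORES:9.1f}  "
          f"scores per pair={s['score_mib']:7.3f} MiB")
```

Under these assumptions a single score matrix grows from 0.125 MiB at blockSize 128 to 128 MiB at 4096 and 512 MiB at 8192, which may be related to the OOM seen at 8192. This is only an estimate and not a measurement.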

      Attachments

        1. blockSize.jpg
          34 kB
          Peng Meng

          People

            Assignee: Unassigned
            Reporter: Peng Meng (peng.meng@intel.com)
            Votes: 0
            Watchers: 5