Uploaded image for project: 'Spark'
  1. Spark
  2. SPARK-3541

Improve ALS internal storage

    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • None
    • 1.3.0
    • MLlib
    • None

    Description

      The internal storage of ALS uses many small objects, which increases the GC pressure and makes ALS difficult to scale to very large scale, e.g., 50 billion ratings. In such cases, the full GC may take more than 10 minutes to finish. That is longer than the default heartbeat timeout and hence executors will be removed under default settings.

      We can use primitive arrays to reduce the number of objects significantly. This requires big change to the ALS implementation.

      Attachments

        Issue Links

          Activity

            People

              mengxr Xiangrui Meng
              mengxr Xiangrui Meng
              Votes:
              1 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated:
                  Original Estimate - 96h
                  96h
                  Remaining:
                  Remaining Estimate - 96h
                  96h
                  Logged:
                  Time Spent - Not Specified
                  Not Specified