Spark / SPARK-2916

[MLlib] While running regression tests with dense vectors of length greater than 1000, the treeAggregate blows up after several iterations


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Component/s: MLlib, Spark Core

    Description

      While running any of the regression algorithms with gradient descent, the treeAggregate blows up after several iterations.

      Observed on a 16-node EC2 cluster with a matrix of dimensions 1,000,000 x 5,000.

      To reproduce the problem, call aggregate repeatedly, roughly 50-60 times or more.
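A minimal sketch of what such a reproduction might look like, assuming a plain `RDD.aggregate` over synthetic dense rows is enough to trigger the failure. The object name, the synthetic data, the partition count, and the iteration count are illustrative assumptions and are not taken from the original report; only the matrix dimensions come from the description above.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical reproduction sketch; names and data are assumptions.
object AggregateReproSketch {
  def main(args: Array[String]): Unit = {
    // The master URL is expected to come from spark-submit.
    val conf = new SparkConf().setAppName("SPARK-2916-repro-sketch")
    val sc = new SparkContext(conf)

    val numRows = 1000000     // rows, as in the report
    val dim = 5000            // dense vector length, as in the report
    val numPartitions = 64    // assumed value
    val numIterations = 60    // "50-60 times" per the report

    // Synthetic dense rows generated on the executors (not the reporter's data).
    val data = sc.parallelize(0 until numRows, numPartitions)
      .map(i => Array.fill(dim)((i % 7).toDouble))

    for (iter <- 1 to numIterations) {
      // Element-wise sum of all rows; each task ships a dim-length array back.
      val sum = data.aggregate(new Array[Double](dim))(
        (acc, row) => { var j = 0; while (j < dim) { acc(j) += row(j); j += 1 }; acc },
        (a, b) => { var j = 0; while (j < dim) { a(j) += b(j); j += 1 }; a }
      )
      println(s"iteration $iter: sum(0) = ${sum(0)}")
    }

    sc.stop()
  }
}
```

The description mentions treeAggregate specifically; swapping it in for `aggregate` would be closer to what the regression algorithms do, but the exact call depends on the Spark version in use.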

      Testing led to a possible workaround: setting
      `spark.cleaner.referenceTracking false`

      seems to help, so the problem is most probably related to the cleanup.
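For reference, the workaround setting could be applied programmatically along these lines. This is only a sketch: the application name is a placeholder, and the same key/value pair can instead be placed in `spark-defaults.conf` as written in the description.

```scala
import org.apache.spark.SparkConf

// Disable reference-tracking cleanup (workaround from the description).
val conf = new SparkConf()
  .setAppName("regression-job") // placeholder name, an assumption
  .set("spark.cleaner.referenceTracking", "false")
```

With reference tracking disabled, Spark no longer automatically cleans up state tied to garbage-collected RDDs, shuffles, and broadcasts, so this is a diagnostic workaround rather than a fix.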


            People

              Assignee: Unassigned
              Reporter: Burak Yavuz (brkyvz)
