Spark / SPARK-2916

[MLlib] While running regression tests with dense vectors of length greater than 1000, the treeAggregate blows up after several iterations


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Component/s: MLlib, Spark Core

    Description

      While running any of the regression algorithms with gradient descent, the treeAggregate blows up after several iterations.

      Observed on a 16-node EC2 cluster with a 1,000,000 x 5,000 data matrix.

      To reproduce the problem, call aggregate repeatedly, roughly 50-60 times or more.
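
The reproduction steps above can be sketched roughly as follows. This is a minimal sketch, not the reporter's actual test: the object name, the synthetic data, the partition count, and the helper `addInPlace` are all hypothetical. It also assumes a Spark version where `treeAggregate` is available directly on `RDD`; in the releases current when this issue was filed it was provided through MLlib's `RDDFunctions` instead.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object TreeAggregateRepro {
  // Element-wise sum that mutates and returns the accumulator.
  // Hypothetical helper, used as both seqOp and combOp.
  def addInPlace(acc: Array[Double], x: Array[Double]): Array[Double] = {
    var i = 0
    while (i < acc.length) {
      acc(i) += x(i)
      i += 1
    }
    acc
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("SPARK-2916 repro sketch")
    val sc = new SparkContext(conf)

    val dim = 5000         // vector length, matching the 5,000 columns in the report
    val numRows = 1000000  // row count from the report; scale down on a small cluster
    val numIterations = 60 // the report mentions 50-60 aggregate calls

    // Synthetic dense rows (an assumption; the report does not describe its data).
    val data = sc.parallelize(0 until numRows, 64)
      .map(i => Array.fill(dim)(i.toDouble))
      .cache()

    // Repeated aggregation, mimicking the per-iteration gradient sum
    // performed by gradient descent.
    for (iter <- 1 to numIterations) {
      val sum = data.treeAggregate(new Array[Double](dim))(addInPlace, addInPlace)
      println(s"iteration $iter: first component = ${sum(0)}")
    }

    sc.stop()
  }
}
```

Whether this exact loop triggers the failure depends on cluster size and memory settings; the report only states that the problem appears after several iterations at this scale.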

      Testing led to a possible workaround: setting
      `spark.cleaner.referenceTracking false`

      seems to help, so the problem is most likely related to the cleanup.
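
For illustration, the workaround could be applied programmatically when building the context. This is a sketch under assumptions: the object name, method name, and application name are hypothetical, and only the property key and value come from the report.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object CleanerWorkaround {
  // Builds a configuration with cleaner reference tracking turned off,
  // as suggested by the workaround in the report.
  def buildConf(): SparkConf =
    new SparkConf()
      .setAppName("regression-with-cleaner-disabled") // hypothetical name
      .set("spark.cleaner.referenceTracking", "false")

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(buildConf())
    // ... run the regression job here ...
    sc.stop()
  }
}
```

The same property can also be passed at launch with `--conf spark.cleaner.referenceTracking=false`. Note that turning off reference tracking disables automatic cleanup of unused RDD, shuffle, and broadcast data, so this is likely better treated as a diagnostic step than as a permanent setting.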


          People

            Assignee: Unassigned
            Reporter: Burak Yavuz (brkyvz)
            Votes: 0
            Watchers: 4