SPARK-1006

MLlib ALS gets stack overflow with too many iterations

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Component/s: MLlib

    Description

      The tipping point seems to be around 50 iterations. We should fix this by checkpointing the RDDs every 10-20 iterations to break the lineage chain, but checkpointing currently requires HDFS to be installed, which not all users will have.
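The effect of periodic checkpointing on lineage length can be illustrated with a toy model. This is a minimal sketch in plain Python, not Spark code: `Node`, `checkpoint`, and `run_iterations` are hypothetical names invented for the illustration, and the model assumes each iteration adds exactly one step to the lineage. In Spark itself the corresponding calls would be `SparkContext.setCheckpointDir` and `RDD.checkpoint`, which this sketch does not use.

```python
class Node:
    """Hypothetical stand-in for an RDD: it only remembers its parent (its lineage)."""

    def __init__(self, parent=None):
        self.parent = parent


def lineage_depth(node):
    """Count the ancestors of a node by walking its parent links."""
    depth = 0
    while node.parent is not None:
        node = node.parent
        depth += 1
    return depth


def checkpoint(node):
    """Toy checkpoint: return a node with no parent, as if the data had been
    written to stable storage and no longer depended on its ancestors."""
    return Node(parent=None)


def run_iterations(num_iterations, checkpoint_interval=None):
    """Add one lineage step per iteration and optionally truncate the lineage
    every `checkpoint_interval` iterations. Returns the largest depth seen."""
    current = Node()
    max_depth = 0
    for i in range(1, num_iterations + 1):
        current = Node(parent=current)  # one more step in the lineage chain
        max_depth = max(max_depth, lineage_depth(current))
        if checkpoint_interval and i % checkpoint_interval == 0:
            current = checkpoint(current)  # break the chain here
    return max_depth
```

Without checkpointing, the depth grows with the iteration count; with an interval of 10, it never exceeds 10 in this model, which is the property that would keep a recursive traversal of the lineage shallow.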

      We might also be able to fix DAGScheduler to not be recursive.
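The general idea of replacing a recursive graph walk with an explicit stack can be sketched as follows. This is a toy model in plain Python and makes no claim about how DAGScheduler is actually structured: `Node`, `build_chain`, `visit_recursive`, and `visit_iterative` are hypothetical names introduced only for the illustration.

```python
class Node:
    """Hypothetical node in a dependency graph, with links to its parents."""

    def __init__(self, parents=()):
        self.parents = list(parents)


def build_chain(length):
    """Build a linear chain with `length` dependency steps; return the last node."""
    node = Node()
    for _ in range(length):
        node = Node(parents=[node])
    return node


def visit_recursive(node, seen=None):
    """Recursive walk: call depth grows with the length of the chain."""
    if seen is None:
        seen = set()
    if node in seen:
        return seen
    seen.add(node)
    for parent in node.parents:
        visit_recursive(parent, seen)
    return seen


def visit_iterative(node):
    """Same walk with an explicit stack: call depth stays constant."""
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(current.parents)
    return seen
```

Both functions visit the same set of nodes on a short chain, but only the iterative version is independent of the interpreter's call-stack limit, so it can handle a chain of arbitrary length in this model.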

            People

              Assignee: Unassigned
              Reporter: Matei Alexandru Zaharia
              Votes: 3
              Watchers: 14
