[SPARK-12760] inaccurate description for difference between local vs cluster mode in closure handling - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Minor
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 1.6.1, 2.0.0
Component/s: Documentation
Labels: None

Description

The Spark documentation includes an example illustrating how `local` and `cluster` modes can differ: http://spark.apache.org/docs/latest/programming-guide.html#example

"In local mode with a single JVM, the above code will sum the values within the RDD and store it in counter. This is because both the RDD and the variable counter are in the same memory space on the driver node."

However, this does not seem to be true. Even in `local` mode, it seems the counter value should still be 0: the summing happens on the executor's copy of the variable, in the executor's memory space, while the value in the driver's memory space stays at 0. I tested this snippet and verified that in `local` mode the value is indeed still 0.

Is the doc wrong, or am I perhaps missing something the doc is trying to say?
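
The behavior described above can be illustrated without Spark. The following is only a minimal sketch of the copy semantics, not Spark's actual implementation: it assumes the closure is copied before it runs on an executor, and uses `copy.deepcopy` as a stand-in for the serialize/deserialize round trip. The names `CounterTask` and `run_on_executor` are hypothetical and exist only for this illustration.

```python
import copy


class CounterTask:
    """Hypothetical stand-in for a closure that captures a `counter` variable."""

    def __init__(self):
        self.counter = 0

    def __call__(self, x):
        # Mutates whichever copy of the task this call runs on.
        self.counter += x


def run_on_executor(task, data):
    """Hypothetical executor: works on a copy of the task, never the original.

    Assumption: deepcopy approximates shipping a serialized closure to an executor.
    """
    shipped = copy.deepcopy(task)
    for x in data:
        shipped(x)
    return shipped.counter


driver_task = CounterTask()
executor_total = run_on_executor(driver_task, [1, 2, 3, 4, 5])

print("driver counter:", driver_task.counter)  # the driver's copy is untouched
print("executor counter:", executor_total)     # the sum lives only in the copy
```

Under this assumption, the driver-side counter stays at 0 even though everything runs in a single process, which matches the result observed in `local` mode.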

Issue Links

links to

[Github] Pull Request #10866 (srowen)

[Github] Pull Request #10867 (mortada)

People

Assignee: Mortada Mehyar

Reporter: Mortada Mehyar

Dates

Created: 11/Jan/16 22:49

Updated: 23/Jan/16 11:45

Resolved: 23/Jan/16 11:45