[SPARK-12595] fold should pass arguments to op in the correct order - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Trivial
Resolution: Duplicate
Affects Version/s: 1.0.0, 1.1.0, 1.2.0, 1.3.0, 1.4.0, 1.5.0, 1.6.0
Fix Version/s: None
Component/s: PySpark
Labels: None

External issue URL: http://stackoverflow.com/q/34529953/1560062

Description

At the moment, the fold method reverses the order of arguments and passes the accumulator as the second (right-hand) argument to op:

        def func(iterator):
            acc = zeroValue
            for obj in iterator:
                acc = op(obj, acc)
            yield acc

This is confusing (see the linked SO question) and clearly conflicts with the documentation:

The function op(t1, t2) is allowed to modify t1 and return it as its result value to avoid object allocation; however, it should not modify t2

and may become a bug if the implementation changes.
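
The effect can be reproduced without a Spark cluster. The following is a minimal pure-Python sketch, not PySpark code: `fold_partition_reversed` mimics the per-partition loop quoted above, while `fold_partition_expected` shows the order the documentation implies. Both helper names, as well as `concat` and `extend_in_place`, are hypothetical and exist only for illustration.

```python
def fold_partition_reversed(iterator, zero_value, op):
    """Mimics the loop quoted above: the accumulator is the SECOND argument."""
    acc = zero_value
    for obj in iterator:
        acc = op(obj, acc)
    return acc


def fold_partition_expected(iterator, zero_value, op):
    """Order implied by the documentation: the accumulator is the FIRST argument."""
    acc = zero_value
    for obj in iterator:
        acc = op(acc, obj)
    return acc


def concat(t1, t2):
    # Non-commutative op: the result depends on argument order.
    return t1 + t2


def extend_in_place(t1, t2):
    # Follows the documented contract: modify and return t1, leave t2 untouched.
    t1.extend(t2)
    return t1


# 1. A non-commutative op gives different results depending on the order.
reversed_result = fold_partition_reversed(["a", "b", "c"], "", concat)
expected_result = fold_partition_expected(["a", "b", "c"], "", concat)

# 2. An op that mutates t1 ends up mutating the elements, not the accumulator,
#    when the order is reversed.
elements_reversed = [[1], [2]]
fold_partition_reversed(elements_reversed, [], extend_in_place)

elements_expected = [[1], [2]]
fold_partition_expected(elements_expected, [], extend_in_place)

print(reversed_result, expected_result)
print(elements_reversed, elements_expected)
```

With the reversed order, an op written according to the documented contract modifies the input elements instead of the accumulator, which is the scenario in which the current behavior could turn into a real bug.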

Issue Links

duplicates: SPARK-7683 Confusing behavior of fold function of RDD in pyspark (Resolved)

People

Assignee: Unassigned

Reporter: Maciej Szymkiewicz

Votes: 0

Watchers: 1

Dates

Created: 01/Jan/16 10:45

Updated: 01/Jan/16 15:44

Resolved: 01/Jan/16 15:44