[SPARK-34015] SparkR partition timing summary reports input time correctly - ASF JIRA

XML

Word

Printable

JSON

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 2.3.2, 3.0.1
Fix Version/s: 3.1.1, 3.2.0
Component/s: SparkR
Labels:
None
Environment:

Observed on CentOS-7 running spark 2.3.1 and on my mac running master

Flags:

Patch

Description

When sparkR is run at log level INFO, a summary of how the worker spent its time processing the partition is printed. There is a logic error where it is over-reporting the time inputting rows.

In detail: the variable inputElap in a wider context is used to mark the beginning of reading rows, but in the part changed here it was used as a local variable for measuring compute time. Thus, the error is not observable if there is only one group per partition, which is what you get in unit tests.

For our application, here's what a log entry looks like before these changes were applied:

20/10/09 04:08:58 WARN RRunner: Times: boot = 0.013 s, init = 0.005 s, broadcast = 0.000 s, read-input = 529.471 s, compute = 492.037 s, write-output = 0.020 s, total = 1021.546 s

this indicates that we're spending more time reading rows than operating on the rows.

After these changes, it looks like this:

20/12/15 06:43:29 WARN RRunner: Times: boot = 0.013 s, init = 0.010 s, broadcast = 0.000 s, read-input = 120.275 s, compute = 1680.161 s, write-output = 0.045 s, total = 1812.553 s

Attachments

Issue Links

links to

[Github] Pull Request #31021 (WamBamBoozle)

Activity

People

Assignee:: Tom Howland

Reporter:: Tom Howland

Shepherd:: hyukjin.kwon

Votes:: 0 Vote for this issue

Watchers:: 2 Start watching this issue

Dates

Created:: 05/Jan/21 18:05

Updated:: 12/Dec/22 18:10

Resolved:: 06/Jan/21 02:41