[SPARK-31217] Unnecessary persist on cumulativeCounts in BinaryClassificationMetrics - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Incomplete
Affects Version/s: 2.4.4, 2.4.5
Fix Version/s: None
Component/s: ML, MLlib
Labels:
- bulk-closed

Description

In mllib.evaluation.BinaryClassificationMetrics, cumulativeCounts is persisted as part of a lazy initialization. However, when I run LogisticRegressionSummaryExample and ModelSelectionViaCrossValidationExample, I find that the cached cumulativeCounts is used by only one action during execution, so caching it brings no benefit there.
I therefore think it should not be cached during initialization. Instead, we could add an explicit persist() API to this class, mirroring the existing unpersist() API in BinaryClassificationMetrics that releases the cached cumulativeCounts.
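
A minimal sketch of how the reported behavior could be observed from user code, assuming Spark 2.4.x with spark-mllib on the classpath and a local master. The object name, the run() helper, and the sample scores are made up for illustration. The persist() mentioned in the comments is the API proposed above and does not exist in the class today.

```scala
import org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
import org.apache.spark.sql.SparkSession

// Hypothetical demo object; not part of Spark.
object CumulativeCountsCacheDemo {

  // Returns the number of persisted RDDs (after one action, after unpersist()).
  def run(): (Int, Int) = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("CumulativeCountsCacheDemo")
      .getOrCreate()
    val sc = spark.sparkContext

    // Made-up (score, label) pairs, only for illustration.
    val scoreAndLabels = sc.parallelize(Seq(
      (0.9, 1.0), (0.8, 1.0), (0.6, 0.0), (0.4, 1.0), (0.2, 0.0)))

    val metrics = new BinaryClassificationMetrics(scoreAndLabels)

    // A single metric call triggers the lazy initialization, which,
    // per the description above, persists cumulativeCounts.
    // Under the proposal, persistence would happen only if the caller
    // first invoked a (hypothetical) metrics.persist().
    val auc = metrics.areaUnderROC()
    println(s"AUC = $auc")

    val cachedAfterAction = sc.getPersistentRDDs.size
    println(s"Persisted RDDs after one action: $cachedAfterAction")

    // Existing API: releases the cached cumulativeCounts.
    metrics.unpersist()
    val cachedAfterUnpersist = sc.getPersistentRDDs.size
    println(s"Persisted RDDs after unpersist(): $cachedAfterUnpersist")

    spark.stop()
    (cachedAfterAction, cachedAfterUnpersist)
  }

  def main(args: Array[String]): Unit = run()
}
```

If only one metric is ever computed, as in the two examples named above, the persisted RDD is never read a second time, which is the case the proposed opt-in persist() is meant to avoid.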

People

Assignee: Unassigned
Reporter: IcySanwitch
Votes: 0
Watchers: 1

Dates

Created: 22/Mar/20 15:32
Updated: 25/May/21 01:52
Resolved: 25/May/21 01:40