[SPARK-14567] Add instrumentation logs to MLlib training algorithms - ASF JIRA

Details

Type: Umbrella
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 2.2.0
Component/s: ML, MLlib
Labels: None

Description

To debug performance issues when training MLlib algorithms, it is useful to log some metrics about the training dataset, the training parameters, etc.

This ticket is an umbrella to add some simple logging messages to the most common MLlib estimators. There should be no performance impact on the current implementation, and the output is simply printed in the logs.

Here are some values that are of interest when debugging training tasks:

number of features
number of instances
number of partitions
number of classes
input RDD/DataFrame cache (storage) level
hyper-parameters
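
As a rough illustration of the values listed above, here is a minimal Python sketch of what such a log line could look like. It is not the implementation tracked by this ticket. The helper names `format_training_log` and `collect_dataset_metrics` and the logger name are hypothetical, and the sketch assumes the input is a PySpark DataFrame exposing `rdd.getNumPartitions()` and `storageLevel`. The number of instances is passed in rather than computed, on the assumption that calling `count()` only for logging would conflict with the "no performance impact" goal.

```python
import json
import logging

# Hypothetical logger name, chosen for this sketch only.
logger = logging.getLogger("training_instrumentation")


def format_training_log(estimator_name, metrics):
    """Render the metrics as one stable, grep-friendly log line."""
    # sort_keys keeps the field order deterministic across runs;
    # default=str lets non-JSON values (e.g. a storage level) be printed.
    payload = json.dumps(metrics, sort_keys=True, default=str)
    return "%s training: %s" % (estimator_name, payload)


def collect_dataset_metrics(df, num_features=None, num_classes=None,
                            num_instances=None, params=None):
    """Gather cheap facts about the training input.

    `df` is assumed to behave like a PySpark DataFrame. Values that would
    require an extra Spark job (such as the instance count) are only
    reported if the caller already knows them.
    """
    return {
        "numFeatures": num_features,
        "numInstances": num_instances,
        "numClasses": num_classes,
        "numPartitions": df.rdd.getNumPartitions(),
        "storageLevel": str(df.storageLevel),
        "params": params or {},
    }


def log_training_start(estimator_name, df, **known_values):
    """Emit the instrumentation line at INFO level and return it."""
    line = format_training_log(
        estimator_name, collect_dataset_metrics(df, **known_values))
    logger.info(line)
    return line
```

A caller would invoke `log_training_start("LogisticRegression", training_df, num_features=10, params={"maxIter": 100})` once before fitting, so the only cost is reading metadata that is already available.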

Sub-Tasks

There are no Sub-Tasks for this issue.

People

Assignee: Timothy Hunter

Reporter: Timothy Hunter

Votes: 0

Watchers: 3

Dates

Created: 12/Apr/16 18:34

Updated: 17/Jan/17 23:40

Resolved: 17/Jan/17 23:40