[SPARK-7432] Flaky test in PySpark CrossValidator doc test - ASF JIRA

XML

Word

Printable

JSON

Details

Type: Bug
Status: Resolved
Priority: Critical
Resolution: Fixed
Affects Version/s: 1.4.0
Fix Version/s: 1.4.0
Component/s: ML, PySpark
Labels:
None

Target Version/s:

1.4.0

Description

There was a test failure in the doc test in Python CrossValidator:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/32058/consoleFull

Here's the full doc test:

    >>> from pyspark.ml.classification import LogisticRegression
    >>> from pyspark.ml.evaluation import BinaryClassificationEvaluator
    >>> from pyspark.mllib.linalg import Vectors
    >>> dataset = sqlContext.createDataFrame(
    ...     [(Vectors.dense([0.0, 1.0]), 0.0),
    ...      (Vectors.dense([1.0, 2.0]), 1.0),
    ...      (Vectors.dense([0.55, 3.0]), 0.0),
    ...      (Vectors.dense([0.45, 4.0]), 1.0),
    ...      (Vectors.dense([0.51, 5.0]), 1.0)] * 10,
    ...     ["features", "label"])
    >>> lr = LogisticRegression()
    >>> grid = ParamGridBuilder().addGrid(lr.maxIter, [0, 1, 5]).build()
    >>> evaluator = BinaryClassificationEvaluator()
    >>> cv = CrossValidator(estimator=lr, estimatorParamMaps=grid, evaluator=evaluator)
    >>> cvModel = cv.fit(dataset)
    >>> expected = lr.fit(dataset, {lr.maxIter: 5}).transform(dataset)
    >>> cvModel.transform(dataset).collect() == expected.collect()
    True

Here's the failure message:

Running test: pyspark/ml/tuning.py ... **********************************************************************
File "pyspark/ml/tuning.py", line 108, in __main__.CrossValidator
Failed example:
    cvModel.transform(dataset).collect() == expected.collect()
Expected:
    True
Got:
    False
**********************************************************************
   1 of  11 in __main__.CrossValidator
***Test Failed*** 1 failures.
Had test failures; see logs.
[error] Got a return code of 255 on line 240 of the run-tests script.

Attachments

Issue Links

links to

[Github] Pull Request #5962 (mengxr)

[Github] Pull Request #6572 (mengxr)

Activity

People

Assignee:: Xiangrui Meng

Reporter:: Joseph K. Bradley

Votes:: 0 Vote for this issue

Watchers:: 3 Start watching this issue

Dates

Created:: 07/May/15 03:27

Updated:: 02/Jun/15 15:51

Resolved:: 02/Jun/15 15:51