Description
The implementation of nDCG evaluation with relevance scores in MLlib (added in 3.4.0, see https://issues.apache.org/jira/browse/SPARK-39446 and pull request) logs the following warning when the input data is inconsistent: "# of ground truth set and # of relevance value set should be equal, check input data"
The logic for raising warnings is faulty at the moment: it raises a warning when the following conditions are both true:
- rel is empty
- lab.size and rel.size are not equal.
With the current logic, RankingMetrics will:
- raise an incorrect warning when a user is using it in the "binary" mode (i.e. no relevance values in the input)
- not raise a warning (which could be necessary) when the user is using it in the "non-binary" mode (i.e. with relevance values in the input)
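The two cases above can be reproduced with a small stand-alone predicate that mirrors the condition as described in this report. This is only a sketch: the object name CurrentWarningCheck, the method warns, and the Int label type are assumptions made for illustration and are not taken from the Spark source.

```scala
// Hypothetical stand-alone predicate mirroring the condition described above.
// It is not copied from RankingMetrics; names and the Int label type are assumptions.
object CurrentWarningCheck {
  // lab: ground truth labels, rel: relevance values (empty in "binary" mode)
  def warns(lab: Array[Int], rel: Array[Double]): Boolean =
    rel.isEmpty && lab.size != rel.size

  def main(args: Array[String]): Unit = {
    // "Binary" mode: no relevance values, so the sizes differ and the warning fires.
    println(warns(Array(1, 2, 3), Array.empty[Double])) // prints true
    // "Non-binary" mode with one relevance value missing: no warning is raised.
    println(warns(Array(1, 2, 3), Array(3.0, 1.0)))     // prints false
  }
}
```

With this condition, any non-empty ground truth set in "binary" mode triggers the warning, while a genuine size mismatch in "non-binary" mode is never reported.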
Instead, the warning should be raised when the following conditions are both true:
- rel is not empty
- lab.size and rel.size are not equal.
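A minimal sketch of the proposed condition, written as a stand-alone predicate. The object name ProposedWarningCheck, the method shouldWarn, and the Int label type are hypothetical and chosen only for illustration. The actual fix would live inside RankingMetrics and may be expressed differently.

```scala
// Hypothetical sketch of the proposed check; not the actual Spark patch.
object ProposedWarningCheck {
  // Warn only when relevance values were supplied and their count
  // does not match the number of ground truth labels.
  def shouldWarn(lab: Array[Int], rel: Array[Double]): Boolean =
    rel.nonEmpty && lab.size != rel.size

  def main(args: Array[String]): Unit = {
    // "Binary" mode: rel is empty, so no warning.
    println(shouldWarn(Array(1, 2, 3), Array.empty[Double])) // prints false
    // "Non-binary" mode, sizes differ: warning.
    println(shouldWarn(Array(1, 2, 3), Array(3.0, 1.0)))     // prints true
    // "Non-binary" mode, sizes match: no warning.
    println(shouldWarn(Array(1, 2, 3), Array(3.0, 1.0, 0.5))) // prints false
  }
}
```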