Spark / SPARK-2748

Loss of precision for small arguments to Math.exp, Math.log


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.0.1
    • Fix Version/s: 1.1.0
    • Component/s: GraphX, MLlib
    • Labels:
      None

      Description

      In a few places in MLlib, an expression of the form log(1.0 + p) is evaluated. When p is so small that 1.0 + p == 1.0 in floating point, the result is 0.0. However, the correct answer is very near p. This is why Math.log1p exists.
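
To make the loss concrete, here is a minimal Java sketch comparing the two forms for a tiny argument. The class and method names (`Log1pDemo`, `naiveLog1p`, `accurateLog1p`) and the value p = 1e-20 are made up for illustration and are not taken from MLlib; only `Math.log` and `Math.log1p` are real API.

```java
// Hypothetical demo class; not MLlib code.
final class Log1pDemo {

    // Naive form: 1.0 + p rounds to exactly 1.0 once p is far below
    // the spacing of doubles near 1.0, so all information in p is lost.
    static double naiveLog1p(double p) {
        return Math.log(1.0 + p);
    }

    // Math.log1p computes log(1 + p) without forming 1 + p explicitly.
    static double accurateLog1p(double p) {
        return Math.log1p(p);
    }

    public static void main(String[] args) {
        double p = 1e-20; // illustrative value, small enough that 1.0 + p == 1.0
        System.out.println("1.0 + p == 1.0 : " + (1.0 + p == 1.0));
        System.out.println("log(1.0 + p)   : " + naiveLog1p(p));
        System.out.println("log1p(p)       : " + accurateLog1p(p));
    }
}
```

With this p, the naive form returns 0.0 while `Math.log1p` returns a value very close to p, which matches the behavior described above.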

      Similarly for one instance of exp(m) - 1 in GraphX; there's a special Math.expm1 method.
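
The same effect can be sketched for exp(m) - 1. The names below (`Expm1Demo`, `naiveExpm1`, `accurateExpm1`) and the value m = 1e-20 are hypothetical and do not come from the GraphX source; only `Math.exp` and `Math.expm1` are real API.

```java
// Hypothetical demo class; not GraphX code.
final class Expm1Demo {

    // Naive form: exp(m) is rounded to a double near 1.0 first, so the
    // subtraction cancels almost every significant digit of the result.
    static double naiveExpm1(double m) {
        return Math.exp(m) - 1.0;
    }

    // Math.expm1 computes exp(m) - 1 directly and stays accurate near zero.
    static double accurateExpm1(double m) {
        return Math.expm1(m);
    }

    public static void main(String[] args) {
        double m = 1e-20; // illustrative value
        System.out.println("exp(m) - 1 : " + naiveExpm1(m));
        System.out.println("expm1(m)   : " + accurateExpm1(m));
    }
}
```

For tiny m the true value of exp(m) - 1 is very close to m, so the naive result is off by roughly 100% in relative terms even though its absolute error is small.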

      The errors occur only for very small arguments, but such arguments are entirely possible in machine learning algorithms, where these expressions are used.

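      Also, while we're here, naftaliharris discovered a case in Python where 1 - 1 / (1 + exp(margin)) is less accurate than exp(margin) / (1 + exp(margin)). The two are algebraically equal, but when exp(margin) is tiny the first form subtracts two nearly equal numbers and loses the result to cancellation. I don't think there's a JIRA on that one, so maybe this can serve as an umbrella for all of these related issues.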
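
The Python case can be sketched the same way. This is a Java rendering written for illustration, not the original Python code; the names `LogisticDemo`, `subtractiveForm`, `directForm` and the value margin = -40 are assumptions.

```java
// Hypothetical demo class; a Java rendering of the two forms, not the PySpark code.
final class LogisticDemo {

    // 1 - 1 / (1 + exp(margin)): for very negative margin, 1 + exp(margin)
    // rounds to 1.0, so the whole expression collapses to 1.0 - 1.0.
    static double subtractiveForm(double margin) {
        return 1.0 - 1.0 / (1.0 + Math.exp(margin));
    }

    // exp(margin) / (1 + exp(margin)): no subtraction of nearly equal
    // values, so the small numerator survives.
    static double directForm(double margin) {
        double e = Math.exp(margin);
        return e / (1.0 + e);
    }

    public static void main(String[] args) {
        double margin = -40.0; // illustrative value; exp(-40) is far below 1e-16
        System.out.println("subtractive form : " + subtractiveForm(margin));
        System.out.println("direct form      : " + directForm(margin));
    }
}
```

At margin = -40 the subtractive form returns 0.0, while the direct form returns a small positive value close to exp(-40). One caveat of this sketch: for very large positive margins, exp(margin) overflows to infinity and the direct form as written yields NaN, so a production version would likely need to branch on the sign of margin.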

            People

            • Assignee:
              srowen Sean R. Owen
              Reporter:
              srowen Sean R. Owen
