SPARK-21476

RandomForest classification model not using broadcast in transform

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Incomplete
    • Affects Version/s: 2.2.0
    • Fix Version/s: None
    • Component/s: ML

    Description

      I see significant task deserialization latency when running prediction with pipelines that use RandomForestClassificationModel. Digging into the source, I found that RandomForestClassificationModel does not define its own transform method; the call resolves to the one inherited from its parent, ProbabilisticClassificationModel. The only concrete method that RandomForestClassificationModel supplies and that this transform actually uses is predictRaw, and predictRaw does not use broadcasting. The model is therefore presumably serialized into the task closure and deserialized again by every task, instead of being broadcast once per executor.
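To make the suspected cost concrete, here is a toy model in plain Python. It is not Spark code and does not reproduce Spark's serializer. It only compares how many bytes have to be deserialized when a large object rides inside every task with the case where tasks carry a small handle and each executor loads the object once. The list standing in for the forest, the executor and task counts, and the `task_payload` helper are all assumptions made for illustration.

```python
import pickle

# Hypothetical stand-in for a trained forest: a large list of numbers.
model = [float(i) for i in range(100_000)]
model_bytes = pickle.dumps(model)

NUM_EXECUTORS = 2   # assumed cluster size
NUM_TASKS = 16      # assumed number of tasks in the stage


def task_payload(partition_id, captured):
    """Serialize one toy task: a partition id plus whatever its closure captured."""
    return pickle.dumps({"partition": partition_id, "captured": captured})


# Without broadcast: the model sits inside every task payload,
# so every task deserializes it again.
no_broadcast = sum(len(task_payload(p, model)) for p in range(NUM_TASKS))

# With broadcast: each task carries only a small handle (here an int id),
# and each executor deserializes the model once.
handles = sum(len(task_payload(p, 0)) for p in range(NUM_TASKS))
with_broadcast = NUM_EXECUTORS * len(model_bytes) + handles

print(f"bytes deserialized without broadcast: {no_broadcast}")
print(f"bytes deserialized with broadcast:    {with_broadcast}")
```

In this toy model the first figure grows with the number of tasks, while the second grows with the number of executors, which matches the kind of per-task overhead described above.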

          People

            Assignee: Unassigned
            Reporter: Saurabh Agrawal (sagraw)
            Votes: 9
            Watchers: 6
