Spark / SPARK-31169

Random Forest in SparkML 2.3.3 vs 2.4.x

    Details

    • Type: Question
    • Status: Resolved
    • Priority: Major
    • Resolution: Invalid
    • Affects Version/s: 2.3.3, 2.4.0, 2.4.3
    • Fix Version/s: None
    • Component/s: ML

      Description

      Hi all,

      When I trained a model with the Random Forest algorithm, I got different results across Spark versions, even though the input data, label ratio, and hyperparameters were the same for every training run. Detailed training results are in the attached files. The results with Spark 2.3.3 are much better, so I would like to ask whether there have been any changes to Random Forest (or other algorithms) in MLlib.

      Many thanks.

        Attachments

        1. spark233.jpg
          47 kB
          Nguyen Nhanduc
        2. spark240.jpg
          48 kB
          Nguyen Nhanduc
        3. spark243.jpg
          48 kB
          Nguyen Nhanduc

            People

            • Assignee:
              Unassigned
              Reporter:
              meocon Nguyen Nhanduc
            • Votes:
              0
              Watchers:
              2