Spark / SPARK-31169

Random Forest in SparkML 2.3.3 vs 2.4.x

Details

    • Type: Question
    • Status: Resolved
    • Priority: Major
    • Resolution: Invalid
    • Affects Version/s: 2.3.3, 2.4.0, 2.4.3
    • Fix Version/s: None
    • Component/s: ML

    Description

      Hi all,

      When I trained a model with the Random Forest algorithm, I got different results on different versions of Spark, even though the input data, label ratio, and hyperparameters were the same for every training run. The detailed training results are in the attached files. The results with Spark 2.3.3 are much better, so I would like to ask whether there have been any changes to Random Forest (or other algorithms) in MLlib between these versions.

      Many thanks.

      Attachments

        1. spark233.jpg
          47 kB
          Nguyen Nhanduc
        2. spark240.jpg
          48 kB
          Nguyen Nhanduc
        3. spark243.jpg
          48 kB
          Nguyen Nhanduc

          People

            Assignee: Unassigned
            Reporter: Nguyen Nhanduc (meocon)
            Votes: 0
            Watchers: 1
