Spark / SPARK-25941

Random forest score decreased after upgrading the Spark version

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: 2.3.2
    • Fix Version/s: None
    • Component/s: Deploy, Input/Output, ML
    • Labels:

      Description

      Problem description

      I ran the same random forest model under two different Spark versions and compared the resulting scores:

      • spark-core_2.10, version 2.0.0
        • RandomForestsKaggle Score = 0.8978765219058574
      • spark-core_2.11, version 2.4.0
        • RandomForestsKaggle Score = 0.8886987035251259
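
For scale, the gap between the two scores listed above can be computed directly. This is a minimal Python sketch that only does the arithmetic; the variable names are illustrative and are not taken from the linked repository.

```python
# Scores copied verbatim from the report above.
score_spark_2_0_0 = 0.8978765219058574  # spark-core_2.10, version 2.0.0
score_spark_2_4_0 = 0.8886987035251259  # spark-core_2.11, version 2.4.0

# Absolute drop in score, and the drop relative to the older score.
absolute_drop = score_spark_2_0_0 - score_spark_2_4_0
relative_drop = absolute_drop / score_spark_2_0_0

print(f"absolute drop: {absolute_drop:.4f}")  # absolute drop: 0.0092
print(f"relative drop: {relative_drop:.2%}")  # relative drop: 1.02%
```

The absolute difference is a little under 0.01, which matches the rounded figure given in the conclusion.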

      Source: https://github.com/smartscity/Kaggle_Titanic_spark (example code and README)

      Introduction

      This case uses the Titanic competition on Kaggle: https://www.kaggle.com/c/titanic

      Conclusion

      After upgrading Spark to version 2.4.0, the random forest score dropped by about 0.01 (from 0.8979 to 0.8887).

      Expectation

      The random forest score should not drop when the Spark version is upgraded.

       


            People

            • Assignee: Unassigned
            • Reporter: lyl2008dsg jack li
            • Votes: 0
            • Watchers: 3
