Description
Hello,
I have an engine running on Spark 2 that trains a DecisionTreeClassifier model with CrossValidator:
    from pyspark.ml.classification import DecisionTreeClassifier
    from pyspark.ml.evaluation import BinaryClassificationEvaluator
    from pyspark.ml.tuning import CrossValidator, ParamGridBuilder

    dt = DecisionTreeClassifier(maxBins=10000, seed=0)
    cv_dt_evaluator = BinaryClassificationEvaluator(
        metricName="", rawPredictionCol="probability")

    # Create param grid and cross validator for model selection
    dt_grid = ParamGridBuilder()\
        .addGrid(dt.minInstancesPerNode, [100])\
        .addGrid(dt.maxDepth, [10])\
        .build()

    cv = CrossValidator(
        estimator=dt,
        estimatorParamMaps=dt_grid,
        evaluator=cv_dt_evaluator,
        parallelism=4,
        numFolds=4
    )
I want to migrate from Spark 2 to Spark 3. I ran DecisionTreeClassifier on the same data with the same parameter values on both versions, but the results are completely different, especially the tree structure: on Spark 3 the trees are shallower and have fewer splits. I have read the documentation but could not find an explanation for the difference.
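To make the comparison concrete, one option is to dump the fitted tree with `toDebugString` on each version and diff the two dumps. The following is only a minimal sketch under assumptions: the helper names `split_count` and `diff_trees` and the two sample dumps are hypothetical and written for illustration, not output from my engine. The first line of each dump is skipped on the assumption that the header wording differs between versions.

```python
import difflib


def split_count(dump):
    """Count internal nodes: each split appears once as an 'If (' line."""
    return sum(1 for line in dump.splitlines()
               if line.strip().startswith("If ("))


def diff_trees(dump_a, dump_b):
    """Unified diff of two tree dumps, ignoring the first (header) line."""
    body_a = dump_a.splitlines()[1:]
    body_b = dump_b.splitlines()[1:]
    return list(difflib.unified_diff(body_a, body_b,
                                     "spark2", "spark3", lineterm=""))


# Hypothetical dumps; in practice they could be obtained with something like
# cv.fit(train_df).bestModel.toDebugString (train_df is an assumed DataFrame).
spark2_dump = """header
  If (feature 0 <= 0.5)
   If (feature 1 <= 2.0)
    Predict: 0.0
   Else (feature 1 > 2.0)
    Predict: 1.0
  Else (feature 0 > 0.5)
   Predict: 1.0"""

spark3_dump = """header
  If (feature 0 <= 0.5)
   Predict: 0.0
  Else (feature 0 > 0.5)
   Predict: 1.0"""

print(split_count(spark2_dump), split_count(spark3_dump))
for line in diff_trees(spark2_dump, spark3_dump):
    print(line)
```

Comparing the split counts and the diff shows at which node the two trees first diverge, which narrows down whether the difference comes from the candidate split thresholds or from the stopping criteria.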
Can you help me find a solution to this problem?
Thanks in advance for your help