Details
Type: Question
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version: 3.0.1
Fix Version: None
Component: None
Description
SparkR throws a "node stack overflow" error when running the code below on R 4.0.2 with Spark 3.0.1.
The same code works on R 3.3.3 with Spark 2.2.1 (and SparkR 2.4.5).
source('sample.R')
myclsr = myclosure_func()
myclsr$get_some_date('2021-01-01')

## spark.lapply throws node stack overflow
result = spark.lapply(c('2021-01-01', '2021-01-02'), function (rdate) {
  source('sample.R')
  another_closure = myclosure_func()
  return(another_closure$get_some_date(rdate))
})
Sample.R
## util function, which calls itself
getPreviousBusinessDate <- function(asofdate) {
  asdt <- as.Date(asofdate) - 1
  wd <- format(as.Date(asdt), "%A")
  if (wd == "Saturday" | wd == "Sunday") {
    return(getPreviousBusinessDate(asdt))
  }
  return(asdt)
}

## closure which calls util function
myclosure_func = function() {
  myclosure = list()
  get_some_date = function (random_date) {
    return(getPreviousBusinessDate(random_date))
  }
  myclosure$get_some_date = get_some_date
  return(myclosure)
}
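Since the report singles out that the util function calls itself, one sketch of a workaround (an assumption on my part, not verified against this environment: it presumes the overflow is tied to handling the self-referencing function when the closure is shipped to workers) is a non-recursive rewrite that loops instead of recursing:

```r
## Iterative variant: steps back one day at a time until the date is a weekday.
## Behavior matches the recursive version; format(..., "%A") is locale-dependent,
## as in the original.
getPreviousBusinessDate <- function(asofdate) {
  asdt <- as.Date(asofdate) - 1
  while (format(asdt, "%A") %in% c("Saturday", "Sunday")) {
    asdt <- asdt - 1
  }
  asdt
}
```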
This seems to have been caused by sourcing sample.R twice: once before the Spark session is started, and again inside the function passed to spark.lapply.
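If double-sourcing is indeed the trigger, one possible mitigation (a sketch under that assumption, not a confirmed fix) is to source sample.R into a fresh environment inside the worker, using base R's `source(file, local = env)`, so the worker-side definitions do not collide with the driver-side ones captured in the serialized closure:

```r
result = spark.lapply(c('2021-01-01', '2021-01-02'), function (rdate) {
  env <- new.env()                  # fresh environment for worker-side definitions
  source('sample.R', local = env)   # keeps them out of the captured global env
  another_closure <- env$myclosure_func()
  another_closure$get_some_date(rdate)
})
```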