I have the following schema in a Dataset -
root
 |-- userId: string (nullable = true)
 |-- data: map (nullable = true)
 |    |-- key: string
 |    |-- value: struct (valueContainsNull = true)
 |    |    |-- startTime: long (nullable = true)
 |    |    |-- endTime: long (nullable = true)
 |    |    |-- offset: long (nullable = true)
And I have the following classes (plus setters and getters, which I omitted for brevity) -
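(The actual classes were omitted above; purely for illustration, a bean mirroring the `value` struct in the schema might look like the following. The class name `TimeRange` is hypothetical; only the field names come from the schema.)

```java
import java.io.Serializable;

// Hypothetical bean matching the value struct in the schema above.
// Field names (startTime, endTime, offset) are taken from the schema;
// the class name and everything else is illustrative only.
public class TimeRange implements Serializable {
    private long startTime;
    private long endTime;
    private long offset;

    public long getStartTime() { return startTime; }
    public void setStartTime(long startTime) { this.startTime = startTime; }

    public long getEndTime() { return endTime; }
    public void setEndTime(long endTime) { this.endTime = endTime; }

    public long getOffset() { return offset; }
    public void setOffset(long offset) { this.offset = offset; }
}
```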
I collect the result the following way -
I run several computations to get the result I want, and the result is correct at every step right up until I collect it.
This is the (correct) result before collecting -
This is the result after collecting the results -
data startTime: 1498870800
data endTime: 1498854000
Note that after collecting, endTime (1498854000) is earlier than startTime (1498870800), as if the two values had been swapped. I tend to believe it is a Spark issue. I would love any suggestions on how to work around it.
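For what it's worth, one plausible class of bug that produces exactly this symptom is a mismatch between the order of fields in the schema and the order in which a decoder walks the bean's fields (for example, alphabetically) while still reading the underlying row positionally. The sketch below is not Spark code; it is a minimal, self-contained model of that mismatch, with hypothetical values, to show how field values get scrambled:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class FieldOrderDemo {
    // Decode a positional row into named fields using the given field order.
    static Map<String, Long> decode(String[] fieldOrder, long[] rowValues) {
        Map<String, Long> out = new LinkedHashMap<>();
        for (int i = 0; i < fieldOrder.length; i++) {
            out.put(fieldOrder[i], rowValues[i]); // positional read
        }
        return out;
    }

    public static void main(String[] args) {
        // Row values laid out in schema order: startTime, endTime, offset.
        // These numbers are hypothetical, chosen only for illustration.
        long[] row = {1498854000L, 1498870800L, 42L};

        // Correct decode: field order matches the schema order.
        Map<String, Long> ok =
            decode(new String[]{"startTime", "endTime", "offset"}, row);

        // Buggy decode: fields walked alphabetically while the row is
        // still read positionally -- every value lands in the wrong field.
        Map<String, Long> bad =
            decode(new String[]{"endTime", "offset", "startTime"}, row);

        System.out.println("ok:  " + ok);
        System.out.println("bad: " + bad);
    }
}
```

If something like this is the cause, explicitly selecting the struct's columns in the same order as the target class's fields (or renaming them) before collecting may be a usable workaround.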