I'm attaching a simplified reproducible example of the problem:
1. Loading a JSON file from HDFS as a DataFrame
2. Creating 3 DataFrames: PRCP, TMIN, TMAX
3. Joining the DataFrames together. Each of them has a column with the same name, "value", so I rename those columns after the join.
4. The output seems incorrect: the first column has the correct values, but the other two columns appear to contain copies of the values from the first column.
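To make the expected result of steps 2 to 4 concrete, here is a minimal plain-Python sketch that does not use Spark at all. It is only an illustration of the join semantics I would expect, not the code from this report. The records are invented and are not the contents of data.json, and the field names `date` and `metric`, as well as the assumption that the frames are joined on a shared date key, are hypothetical.

```python
# Plain-Python sketch of the expected join result (no Spark involved).
# All records below are invented for illustration; they are NOT from data.json.
# The "date" and "metric" field names and the join key are assumptions.
records = [
    {"date": "20150101", "metric": "PRCP", "value": 5},
    {"date": "20150101", "metric": "TMIN", "value": -3},
    {"date": "20150101", "metric": "TMAX", "value": 8},
    {"date": "20150102", "metric": "PRCP", "value": 0},
    {"date": "20150102", "metric": "TMIN", "value": -1},
    {"date": "20150102", "metric": "TMAX", "value": 11},
]


def values_by_date(rows, metric):
    """Return {date: value} for one metric, standing in for one filtered frame."""
    return {r["date"]: r["value"] for r in rows if r["metric"] == metric}


def join_metrics(rows, metrics=("PRCP", "TMIN", "TMAX")):
    """Inner-join the per-metric frames on date, with one renamed column each."""
    frames = {m: values_by_date(rows, m) for m in metrics}
    # Keep only the dates present in every frame (inner-join behaviour).
    common_dates = set.intersection(*(set(f) for f in frames.values()))
    return [
        {"date": d, **{m: frames[m][d] for m in metrics}}
        for d in sorted(common_dates)
    ]


joined = join_metrics(records)
for row in joined:
    print(row)
```

In a correct result each of the three renamed columns keeps its own values. The behaviour I am seeing corresponds to the TMIN and TMAX columns repeating the PRCP values instead.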
Here's the sample code:
The output is:
And the join output is:
- data.json file that is read from HDFS