Details
Type: Bug
Status: Resolved
Priority: Critical
Resolution: Fixed
2.4.6, 3.0.0
Description
When a pandas DataFrame uses floats as its index, createDataFrame produces wrong results:
>>> import pandas as pd
>>> spark.createDataFrame(pd.DataFrame({'a': [1,2,3]}, index=[2., 3., 4.])).show()
+---+
|  a|
+---+
|  1|
|  1|
|  2|
+---+
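This is because direct slicing such as pdf[2:] treats the slice bounds as index labels, not row positions, when the index contains floats. With .iloc, or with an integer index, the same slice is positional: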
>>> pd.DataFrame({'a': [1,2,3]}, index=[2., 3., 4.])[2:]
     a
2.0  1
3.0  2
4.0  3
>>> pd.DataFrame({'a': [1,2,3]}, index=[2., 3., 4.]).iloc[2:]
     a
4.0  3
>>> pd.DataFrame({'a': [1,2,3]}, index=[2, 3, 4])[2:]
   a
4  3
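For illustration, here is a minimal sketch of splitting a pandas DataFrame into row chunks by position, so that the result does not depend on the index dtype. The helper name split_by_position and the step parameter are hypothetical and are not taken from Spark's code; this only shows the .iloc-based slicing contrasted above, not necessarily the exact fix that was applied.

```python
import pandas as pd


def split_by_position(pdf, step):
    """Hypothetical helper: return consecutive row chunks of `pdf` by position."""
    # .iloc always slices by row position, regardless of the index dtype,
    # so a float index such as [2., 3., 4.] does not change which rows are taken.
    return [pdf.iloc[start:start + step] for start in range(0, len(pdf), step)]


pdf = pd.DataFrame({'a': [1, 2, 3]}, index=[2., 3., 4.])
chunks = split_by_position(pdf, 1)

# Flatten the chunks back into a single list of values.
values = [v for chunk in chunks for v in chunk['a'].tolist()]
print(values)  # [1, 2, 3]
```

Each row appears exactly once across the chunks, in its original order, which is what the example at the top of this description should have produced.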