Description
See:
https://github.com/databricks/koalas/pull/1325#discussion_r647889901
https://github.com/databricks/koalas/pull/1325#discussion_r647890007
midx1 = ps.MultiIndex.from_tuples([('a', 'x', 1), ('b', 'z', 2), ('k', 'z', 3)])
midx1.difference(idx1)
pyspark.pandas.exceptions.PandasNotImplementedError: The method `pd.Index.__iter__()` is not implemented. If you want to collect your data as an NumPy array, use 'to_numpy()' instead.
In addition, calling MultiIndex.from_tuples collects all of the data onto the driver side.
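For reference, the result that `difference` is expected to produce can be sketched in plain Python, without Spark. This is only an illustration under assumptions: it assumes pandas-style semantics (unique entries of the left index that are absent from the other, sorted by default), the helper name `multiindex_difference` is hypothetical, and because `idx1` is not defined in this report, the `other` value below is made up for the example.

```python
from typing import Iterable, List, Tuple


def multiindex_difference(
    left: Iterable[Tuple], other: Iterable[Tuple], sort: bool = True
) -> List[Tuple]:
    """Hypothetical helper: tuples in `left` that do not appear in `other`.

    Assumes pandas-like behaviour: duplicates are dropped and the result
    is sorted unless sort=False.
    """
    other_set = set(other)
    seen = set()
    result = []
    for item in left:
        # Keep the first occurrence of each tuple that is missing from `other`.
        if item not in other_set and item not in seen:
            seen.add(item)
            result.append(item)
    return sorted(result) if sort else result


# Tuples taken from the report; `other` is an assumed stand-in for idx1.
midx1_tuples = [("a", "x", 1), ("b", "z", 2), ("k", "z", 3)]
other = [("a", "x", 1)]

expected = multiindex_difference(midx1_tuples, other)
print(expected)
```

A distributed implementation would need to produce the same tuples without iterating over the index or rebuilding it on the driver, which is the concern raised above.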
Issue Links
- relates to
  - SPARK-35682 Pin mypy version in GitHub Actions CI (Resolved)
  - SPARK-35684 Bump up mypy version in GitHub Actions (Resolved)