Spark / SPARK-44957

Make PySpark (pyspark-sql module) tests pass without any dependency


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.0.0
    • Fix Version/s: 3.5.0, 4.0.0
    • Component/s: PySpark, Tests
    • Labels: None

    Description

      ./python/run-tests --python-executables=python3 --modules=pyspark-sql
      Running PySpark tests. Output is in /.../spark/python/unit-tests.log
      Will test against the following Python executables: ['python3']
      Will test the following Python modules: ['pyspark-sql']
      python3 python_implementation is CPython
      python3 version is: Python 3.10.12
      Starting test(python3): pyspark.sql.tests.pandas.test_pandas_grouped_map_with_state (temp output: /.../spark/python/target/8e530108-4d5e-46e4-88fb-8f0dfb7b47e2/python3__pyspark.sql.tests.pandas.test_pandas_grouped_map_with_state__jggatex7.log)
      Starting test(python3): pyspark.sql.tests.pandas.test_pandas_grouped_map (temp output: /.../spark/python/target/3b6e9e5a-c479-408c-9365-8286330e8e7c/python3__pyspark.sql.tests.pandas.test_pandas_grouped_map__1lrovmur.log)
      Starting test(python3): pyspark.sql.tests.pandas.test_pandas_cogrouped_map (temp output: /.../spark/python/target/68c7cf56-ed7a-453e-8d6d-3a0eb519d997/python3__pyspark.sql.tests.pandas.test_pandas_cogrouped_map__sw2875dr.log)
      Starting test(python3): pyspark.sql.tests.pandas.test_pandas_map (temp output: /.../spark/python/target/90712186-a104-4491-ae0d-2b5ab973991b/python3__pyspark.sql.tests.pandas.test_pandas_map__ysp4911q.log)
      Traceback (most recent call last):
        File "/.../miniconda3/envs/vanilla-3.10/lib/python3.10/runpy.py", line 196, in _run_module_as_main
          return _run_code(code, main_globals, None,
        File "/.../miniconda3/envs/vanilla-3.10/lib/python3.10/runpy.py", line 86, in _run_code
          exec(code, run_globals)
        File "/.../workspace/forked/spark/python/pyspark/sql/tests/pandas/test_pandas_map.py", line 27, in <module>
          from pyspark.testing.sqlutils import (
        File "/.../workspace/forked/spark/python/pyspark/testing/__init__.py", line 19, in <module>
          from pyspark.testing.pandasutils import assertPandasOnSparkEqual
        File "/.../workspace/forked/spark/python/pyspark/testing/pandasutils.py", line 22, in <module>
          import pandas as pd
      ModuleNotFoundError: No module named 'pandas'
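
The traceback above shows an unconditional `import pandas as pd` being reached while a test module is merely imported, so the whole module fails on a vanilla Python environment. One common way to avoid this is to probe for the optional package first and skip only the tests that need it. The following is a minimal sketch of that pattern, not the change actually merged for this issue; `have_package`, `PandasDependentTests`, and `DependencyFreeTests` are hypothetical names used for illustration.

```python
import importlib.util
import unittest


def have_package(name: str) -> bool:
    """Hypothetical helper: True if the top-level package `name` is importable."""
    return importlib.util.find_spec(name) is not None


# Probe once at import time instead of importing pandas unconditionally.
have_pandas = have_package("pandas")

if have_pandas:
    import pandas as pd


@unittest.skipIf(not have_pandas, "pandas is not installed")
class PandasDependentTests(unittest.TestCase):
    """Skipped as a whole when pandas is missing, so `pd` is never touched."""

    def test_frame_shape(self):
        df = pd.DataFrame({"a": [1, 2, 3]})
        self.assertEqual(df.shape, (3, 1))


class DependencyFreeTests(unittest.TestCase):
    """Runs in any environment, with or without pandas."""

    def test_runs_without_pandas(self):
        self.assertEqual(sum([1, 2, 3]), 6)
```

With this layout, importing the module never raises `ModuleNotFoundError`; when pandas is absent, the pandas-dependent tests are reported as skipped rather than aborting the run.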
      

            People

              Assignee: Hyukjin Kwon (gurwls223)
              Reporter: Hyukjin Kwon (gurwls223)