SPARK-36989: Migrate type hint data tests

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.3.0
    • Fix Version/s: 3.3.0
    • Component/s: PySpark, Tests
    • Labels: None

    Description

      Before the migration, pyspark-stubs contained a set of data tests modeled after mypy's own data tests and built on mypy's internal test utilities.

      These were omitted during the migration for a few reasons:

      • Simplicity.
      • Relative slowness.
      • Dependence on non-public API.

      Data tests are useful for a number of reasons:

      • Improving test coverage for type hints.
      • Checking that type checkers infer the expected types.
      • Checking that type checkers reject incorrect code.
      • Detecting unusual errors in code that otherwise type checks.

      In particular, the last two purposes cannot be served by simply type checking the existing codebase.
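
      As a minimal sketch of what such checks look like, the following uses a hypothetical overloaded function `scale` (an illustration only, not part of PySpark). The statements under `TYPE_CHECKING` never run; they exist only for a type checker such as mypy, and the comments describe the outcome a data test would assert rather than captured checker output.

```python
from typing import TYPE_CHECKING, Union, overload


# Hypothetical function used only for illustration.
@overload
def scale(value: int) -> int: ...
@overload
def scale(value: str) -> str: ...
def scale(value: Union[int, str]) -> Union[int, str]:
    # Doubles a number or repeats a string twice.
    return value * 2


if TYPE_CHECKING:
    # Seen by the type checker only; skipped at runtime.
    reveal_type(scale(1))     # a data test would expect int here
    reveal_type(scale("ab"))  # a data test would expect str here
    scale(1.5)                # expected to be rejected: no overload accepts float
```

      A call like `scale(1.5)` would never appear in a working codebase, which is why checking the existing code alone cannot confirm that it is rejected.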

      Data tests are not required for all annotations and can be restricted to code that has a high risk of failure:

      • Complex overloaded signatures.
      • Complex generics.
      • Generic self annotations.
      • Code containing type: ignore.
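
      To make one of these categories concrete, here is a small sketch of a generic self annotation. The class `Box` and its method `total` are hypothetical names chosen for illustration. The comments state what a type checker is expected to decide; they are not recorded output.

```python
from typing import TYPE_CHECKING, Generic, TypeVar

T = TypeVar("T")


class Box(Generic[T]):
    """Hypothetical generic container used only for illustration."""

    def __init__(self, item: T) -> None:
        self.item = item

    # Annotating `self` with a concrete parameterization restricts this
    # method to boxes that hold integers.
    def total(self: "Box[int]", extra: int) -> int:
        return self.item + extra


if TYPE_CHECKING:
    # Seen by the type checker only; skipped at runtime.
    Box(1).total(2)    # expected to be accepted
    Box("a").total(2)  # expected to be rejected: self is Box[str], not Box[int]
```

      Annotations of this kind are easy to get subtly wrong, and an error only shows up when a caller uses the wrong parameterization, so a dedicated test case is the natural place to pin the behavior.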

      The biggest risk is that output matchers have to be updated whenever a signature changes or mypy's output changes.
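
      The sketch below illustrates this maintenance cost with two hypothetical matchers, `exact_match` and `pattern_match`. The output lines are illustrative strings, not captured from a real type checker run, and the helpers are assumptions made for this example rather than part of any test framework.

```python
import re
from typing import List


def exact_match(expected: List[str], actual: List[str]) -> bool:
    # Strict matcher: any change in wording or quoting breaks the test.
    return expected == actual


def pattern_match(patterns: List[str], actual: List[str]) -> bool:
    # Looser matcher: each output line must fully match its regex.
    if len(patterns) != len(actual):
        return False
    return all(re.fullmatch(p, line) for p, line in zip(patterns, actual))


# Illustrative output before and after a cosmetic change in quoting.
old_output = ["main:3: note: Revealed type is 'builtins.int'"]
new_output = ['main:3: note: Revealed type is "builtins.int"']

# The regex tolerates either quote style around the type name.
patterns = [r"main:3: note: Revealed type is ['\"]builtins\.int['\"]"]

print(exact_match(old_output, new_output))   # False
print(pattern_match(patterns, old_output))   # True
print(pattern_match(patterns, new_output))   # True
```

      Looser patterns reduce churn from cosmetic output changes, but a change to the tested signature still requires updating the expectations.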

      An example of a problem detected with data tests can be found in the SPARK-36894 PR (https://github.com/apache/spark/pull/34146).

            People

              Assignee: zero323 Maciej Szymkiewicz
              Reporter: zero323 Maciej Szymkiewicz