Description
Before the migration, pyspark-stubs contained a set of data tests modeled after mypy's own data tests and built on mypy's internal test utilities.
These were omitted during the migration for a few reasons:
- Simplicity.
- Relative slowness.
- Dependence on non-public API.
Data tests are useful for a number of reasons:
- Improving test coverage for type hints.
- Checking whether type checkers infer the expected types.
- Checking whether type checkers reject incorrect code.
- Detecting unusual errors in code that otherwise type checks.
In particular, the last two cannot be achieved by simply type checking the existing codebase.
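The idea of a data test can be illustrated with a minimal sketch: a snippet of source is paired with the diagnostics it is expected to produce, and the checker's output is compared against them. The names `DataTestCase`, `run_case`, and `mypy_check` are hypothetical and are not taken from pyspark-stubs or Spark. The sketch uses only mypy's public `mypy.api.run` entry point and assumes mypy is installed when `mypy_check` is actually called.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class DataTestCase:
    """Hypothetical container: a snippet plus the diagnostics it should produce."""

    name: str
    source: str
    expected: List[str]  # regular expressions, one per expected output line


def mypy_check(source: str) -> List[str]:
    """Type check a snippet with mypy's public API and return its output lines."""
    from mypy import api  # imported lazily; requires mypy to be installed

    stdout, _stderr, _status = api.run(["--no-error-summary", "-c", source])
    return [line for line in stdout.splitlines() if line.strip()]


def run_case(
    case: DataTestCase, check: Callable[[str], List[str]] = mypy_check
) -> List[str]:
    """Return a list of mismatches; an empty list means the case passed."""
    actual = check(case.source)
    problems = []
    if len(actual) != len(case.expected):
        problems.append(
            f"{case.name}: expected {len(case.expected)} lines, got {len(actual)}"
        )
    for pattern, line in zip(case.expected, actual):
        if not re.search(pattern, line):
            problems.append(f"{case.name}: {line!r} does not match {pattern!r}")
    return problems


# A negative case: the checker is expected to reject this assignment.
case = DataTestCase(
    name="reject_str_assigned_to_int",
    source='x: int = "a"\n',
    expected=[r"error: Incompatible types in assignment"],
)
```

Calling `run_case(case)` would invoke mypy on the snippet. Passing a different `check` callable makes it possible to exercise the comparison logic without a type checker. Matching on patterns rather than exact lines is a design choice in this sketch, intended to keep cases less sensitive to changes in checker wording.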
Data tests are not required for all annotations and can be restricted to code with a high likelihood of failure:
- Complex overloaded signatures.
- Complex generics.
- Generic self annotations.
- Code containing `type: ignore`.
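As an illustration of the first category, the following sketch shows an overloaded signature of the kind a data test might target. The function `to_upper` is hypothetical and does not exist in PySpark. The comments describe what a data test would be expected to assert, not verbatim checker output.

```python
from typing import List, Union, overload


@overload
def to_upper(value: str) -> str: ...
@overload
def to_upper(value: List[str]) -> List[str]: ...


def to_upper(value: Union[str, List[str]]) -> Union[str, List[str]]:
    """Hypothetical helper whose return type depends on the argument type."""
    if isinstance(value, str):
        return value.upper()
    return [item.upper() for item in value]


# Positive cases: a data test would assert that the checker infers
# `str` for the first call and a list of `str` for the second.
single = to_upper("spark")
many = to_upper(["a", "b"])

# Negative case: a data test would assert that the checker rejects this call,
# because no overload accepts an int. It is left commented out because it
# would also fail at runtime.
# to_upper(1)
```

Simply type checking an existing codebase exercises only the calls that happen to appear in it, so the negative case above would go unverified without a dedicated test.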
The biggest risk is that output matchers have to be updated whenever a signature or the mypy output changes.
An example of a problem detected with data tests can be found in the SPARK-36894 PR (https://github.com/apache/spark/pull/34146).
Issue Links
- causes SPARK-48068 `mypy` should have `--python-executable` parameter (Resolved)