Details
Type: Bug
Status: Open
Priority: P3
Resolution: Unresolved
Affects versions: 2.32.0, 2.33.0, 2.34.0, 2.35.0
Description
After upgrading our Python project from 2.31.0 to 2.33.0, we started getting TypeCheckErrors such as
apache_beam.typehints.decorators.TypeCheckError: Type hint violation for 'all_data/combine_new_and_all': requires Tuple[Tuple[Any, Any], Dict[str, Iterable[_CombinedEntry]]] but got Tuple[Tuple[int, int], Dict[str, List[Union[]]]] for element
where the output value of a CoGroupByKey() is apparently incorrectly deduced to be a Dict[str, List[Union[]]].
I managed to build a small repro case:
    import apache_beam as beam
    from typing import Dict, Iterable, Tuple

    {
        "foo": [(42, "foo")],
        "bar": [(42, "bar")],
    } | beam.CoGroupByKey().with_output_types(Tuple[int, Dict[str, Iterable[str]]])
which raises
apache_beam.typehints.decorators.TypeCheckError: Output type hint violation at CoGroupByKey: expected Tuple[int, Dict[str, Iterable[str]]], got Tuple[int, Dict[str, List[Union[]]]]
or alternatively, using a TestPipeline:
    import apache_beam as beam
    from apache_beam.testing.test_pipeline import TestPipeline
    from apache_beam.testing.util import assert_that, equal_to
    from typing import Dict, Iterable, Tuple

    with TestPipeline() as p:
        actual = {
            "foo": p | "create_foo" >> beam.Create([(42, "foo")]),
            "bar": p | "create_bar" >> beam.Create([(42, "bar")]),
        } | beam.CoGroupByKey().with_output_types(Tuple[int, Dict[str, Iterable[str]]])
        assert_that(actual, equal_to([(42, {"foo": ["foo"], "bar": ["bar"]})]))
Oh, and one more thing, about that Tuple[Any, Any] from the original error message I posted. We can reproduce that like this:
    import apache_beam as beam
    from typing import Dict, Iterable, NewType, Tuple

    key = NewType("key", int)

    {
        "foo": [(key(1337), "foo")],
        "bar": [(key(1337), "bar")],
    } | beam.CoGroupByKey().with_output_types(Tuple[key, Dict[str, Iterable[str]]])
which raises
apache_beam.typehints.decorators.TypeCheckError: Output type hint violation at CoGroupByKey: expected Tuple[Any, Dict[str, Iterable[str]]], got Tuple[int, Dict[str, List[Union[]]]]
It looks like NewType is treated as Any? That surprised me.
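For context on the NewType observation, here is a minimal standard-library sketch (no Beam involved) of what a NewType value looks like at runtime. It only illustrates why the "got" side of the message reports int for the key. It does not show why the declared hint ends up as Any; that would happen somewhere in Beam's type hint conversion, which I have not traced.

```python
from typing import NewType

# Same declaration as in the repro case above.
key = NewType("key", int)

value = key(1337)

# At runtime, calling a NewType returns its argument unchanged,
# so the element key is a plain int.
print(type(value) is int)        # True

# The wrapped type stays available on the NewType object itself.
print(key.__supertype__ is int)  # True
```

So the static distinction between key and int only exists for type checkers such as mypy; any check on the runtime values can only ever see int.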
I could also reproduce the issue in 2.32.0.