BEAM-9380

Python: ReadFromDatastore Embedded Entities

Details

    • Type: Bug
    • Status: Open
    • Priority: P3
    • Resolution: Unresolved
    • Affects Version/s: 2.17.0, 2.18.0, 2.19.0
    • Fix Version/s: None
    • Component/s: io-py-gcp
    • Labels: None

    Description

      Issue 8405 discussed supporting embedded entities in the methods that convert to and from the client entity type.

      This feature was added in [PR 9805|https://github.com/apache/beam/pull/9805], which was shipped in 2.17.

      However, the conversion appears to fail when a Datastore embedded entity does not have a key.
      Keys in embedded entities are optional, so this is a valid case.

      As a result, a pipeline that uses the apache_beam.io.gcp.datastore.v1new.datastoreio.ReadFromDatastore transform fails with the following stack trace:

        File "apache_beam/runners/common.py", line 780, in apache_beam.runners.common.DoFnRunner.process
        File "apache_beam/runners/common.py", line 440, in apache_beam.runners.common.SimpleInvoker.invoke_process
        File "apache_beam/runners/common.py", line 895, in apache_beam.runners.common._OutputProcessor.process_outputs
        File "/Users/quentin/Work/sensome/dev/senback/dataflow/senback-reporting-db-import-flow/venv/lib/python3.7/site-packages/apache_beam/io/gcp/datastore/v1new/datastoreio.py", line 264, in process
          yield types.Entity.from_client_entity(client_entity)
        File "/Users/quentin/Work/sensome/dev/senback/dataflow/senback-reporting-db-import-flow/venv/lib/python3.7/site-packages/apache_beam/io/gcp/datastore/v1new/types.py", line 225, in from_client_entity
          value = Entity.from_client_entity(value)
        File "/Users/quentin/Work/sensome/dev/senback/dataflow/senback-reporting-db-import-flow/venv/lib/python3.7/site-packages/apache_beam/io/gcp/datastore/v1new/types.py", line 219, in from_client_entity
          Key.from_client_key(client_entity.key),
        File "/Users/quentin/Work/sensome/dev/senback/dataflow/senback-reporting-db-import-flow/venv/lib/python3.7/site-packages/apache_beam/io/gcp/datastore/v1new/types.py", line 156, in from_client_key
          return Key(client_key.flat_path, project=client_key.project,
      AttributeError: 'NoneType' object has no attribute 'flat_path' [while running 'Read from Datastore/Read']
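
The failure path in the trace can be reproduced without Beam or Datastore. The sketch below is only an illustration: `ClientKey`, `ClientEntity`, `Key`, and `Entity` are simplified, hypothetical stand-ins for the google-cloud-datastore and `v1new/types.py` classes, not their real implementations. `Key.from_client_key` mirrors the line that raises in the trace, and the `None` check in `Entity.from_client_entity` is one possible guard, assumed here rather than taken from any actual fix.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional, Tuple


@dataclass
class ClientKey:
    """Hypothetical stand-in for the Datastore client key type."""
    flat_path: Tuple[Any, ...]
    project: Optional[str] = None


@dataclass
class ClientEntity:
    """Hypothetical stand-in for the client entity; the key is optional."""
    key: Optional[ClientKey] = None
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Key:
    """Simplified stand-in for the Beam-side key type."""
    path_elements: Tuple[Any, ...]
    project: Optional[str] = None

    @staticmethod
    def from_client_key(client_key: ClientKey) -> "Key":
        # Same attribute access as in the trace: fails if client_key is None.
        return Key(client_key.flat_path, project=client_key.project)


@dataclass
class Entity:
    """Simplified stand-in for the Beam-side entity type."""
    key: Optional[Key]
    properties: Dict[str, Any] = field(default_factory=dict)

    @staticmethod
    def from_client_entity(client_entity: ClientEntity) -> "Entity":
        # Assumed guard: convert the key only when the entity has one.
        key = None
        if client_entity.key is not None:
            key = Key.from_client_key(client_entity.key)
        entity = Entity(key)
        for name, value in client_entity.properties.items():
            # Embedded entities are converted recursively, as in the trace.
            if isinstance(value, ClientEntity):
                value = Entity.from_client_entity(value)
            entity.properties[name] = value
        return entity


# An embedded entity without a key, nested inside a keyed parent entity.
embedded = ClientEntity(key=None, properties={"city": "Paris"})
parent = ClientEntity(
    key=ClientKey(("Person", 1), project="demo"),
    properties={"address": embedded},
)
converted = Entity.from_client_entity(parent)
```

Without the `is not None` check, the recursive call for `address` would reach `Key.from_client_key(None)` and raise the same `AttributeError` as in the trace above.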
      

       


      People

        Assignee: Unassigned
        Reporter: Cavalié Quentin
