Details
Type: Bug
Status: Resolved
Priority: P3
Resolution: Fixed
Description
When writing to Datastore, Beam groups writes into large batches (usually 500 entities per write, the maximum permitted by the API). If these writes are slow to commit on the serving side, the request may time out before all of the entities are written.
When this happens, the connector loses any progress made on that batch: the writes are non-transactional, so some entities may already have been committed, but partial results are not returned, so the connector must assume every entity needs rewriting. It then retries the write with the same set of entities, which may time out in the same way repeatedly. Whether a batch commits slowly depends on factors on the Datastore serving side, some of which are transient (e.g. hotspots) and some of which are not.
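For context, a minimal pipeline sketch of the write path that hits this behavior. The project ID, kind name, and entity payload are hypothetical, and the element count is only chosen so the connector forms full batches; the batching and retries themselves happen inside DatastoreIO's write transform, not in user code.

import static com.google.datastore.v1.client.DatastoreHelper.makeKey;
import static com.google.datastore.v1.client.DatastoreHelper.makeValue;

import com.google.datastore.v1.Entity;
import java.util.ArrayList;
import java.util.List;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.gcp.datastore.DatastoreIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.ParDo;

public class DatastoreWriteExample {
  public static void main(String[] args) {
    Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

    // Enough elements that the connector groups them into full batches
    // (up to 500 entities per commit, as described above).
    List<Long> ids = new ArrayList<>();
    for (long i = 0; i < 10_000; i++) {
      ids.add(i);
    }

    p.apply("CreateIds", Create.of(ids))
        .apply("BuildEntities", ParDo.of(new DoFn<Long, Entity>() {
          @ProcessElement
          public void processElement(@Element Long id, OutputReceiver<Entity> out) {
            // Hypothetical kind and property; any entity shape exercises the same path.
            out.output(
                Entity.newBuilder()
                    .setKey(makeKey("ExampleKind", "id-" + id))
                    .putProperties("payload", makeValue("value-" + id).build())
                    .build());
          }
        }))
        // The batching, commit, and retry-on-timeout behavior described in this
        // issue all occur inside this transform.
        .apply("WriteToDatastore",
            DatastoreIO.v1().write().withProjectId("my-project"));

    p.run().waitUntilFinish();
  }
}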
We (Datastore) are developing a fix for this.