Details
Improvement
Status: Open
P3
Resolution: Unresolved
2.14.0
None
None
Description
Similar to BEAM-5807. TensorFlow's de facto storage format is TFRecord files with an Example proto payload and its own schema.proto. We already have TFRecordIO support. We still need to implement:
- Conversion between Beam and TF schema
- Conversion between Beam Row and TF Example proto
- TFRecordTableProvider
https://github.com/tensorflow/metadata/blob/master/tensorflow_metadata/proto/v0/schema.proto
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/example/example.proto
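A rough sketch of the type mapping that both conversions would need, offered only as a starting point. A tf.train.Feature holds exactly one of three list kinds (bytes_list, float_list, int64_list), so every Beam field type has to collapse onto one of those. The `FieldType` enum below is a hypothetical stand-in loosely modeled on Beam's schema type names, not the real Beam class, and the choices for BOOLEAN (stored as 0/1) and DOUBLE (narrowed to float) are assumptions, not decisions taken in this ticket.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class TfExampleTypeMapping {

  /** The three list kinds a tf.train.Feature can hold. */
  enum FeatureKind { BYTES_LIST, FLOAT_LIST, INT64_LIST }

  /** Hypothetical stand-in for a subset of Beam schema field types. */
  enum FieldType { BYTE, INT16, INT32, INT64, BOOLEAN, FLOAT, DOUBLE, STRING, BYTES }

  /** Picks the feature kind that would carry a value of the given field type. */
  static FeatureKind featureKindFor(FieldType type) {
    switch (type) {
      case BYTE:
      case INT16:
      case INT32:
      case INT64:
      case BOOLEAN: // assumption: booleans are encoded as 0 or 1
        return FeatureKind.INT64_LIST;
      case FLOAT:
      case DOUBLE: // assumption: doubles are narrowed to 32-bit floats (lossy)
        return FeatureKind.FLOAT_LIST;
      case STRING: // strings would be stored as UTF-8 bytes
      case BYTES:
        return FeatureKind.BYTES_LIST;
      default:
        throw new IllegalArgumentException("Unsupported field type: " + type);
    }
  }

  /** Maps every field of a (name -> type) schema, preserving field order. */
  static Map<String, FeatureKind> featureKindsFor(Map<String, FieldType> schema) {
    Map<String, FeatureKind> result = new LinkedHashMap<>();
    for (Map.Entry<String, FieldType> field : schema.entrySet()) {
      result.put(field.getKey(), featureKindFor(field.getValue()));
    }
    return result;
  }

  public static void main(String[] args) {
    // Hypothetical field names, for illustration only.
    Map<String, FieldType> schema = new LinkedHashMap<>();
    schema.put("user_id", FieldType.INT64);
    schema.put("score", FieldType.DOUBLE);
    schema.put("name", FieldType.STRING);
    System.out.println(featureKindsFor(schema));
  }
}
```

The mapping is many-to-one, so the reverse direction (Example to Row) cannot recover the original Beam type from the Example alone. That is presumably where the TF schema, or a user-supplied Beam schema, would come in.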
Also, it seems the metadata protos are not published as Java artifacts:
https://github.com/tensorflow/metadata/issues/5
My WIP branch: https://github.com/spotify/beam/tree/neville/tf