  1. Beam
  2. BEAM-7921

Add Schema support for Tensorflow

Details

    • Type: Improvement
    • Status: Open
    • Priority: P3
    • Resolution: Unresolved
    • Affects Version/s: 2.14.0
    • Fix Version/s: None
    • Component/s: io-java-tfrecord
    • Labels: None

    Description

      Similar to BEAM-5807. TensorFlow's de facto storage format is TFRecord files with an Example proto payload, and TensorFlow has its own schema.proto. We already have TFRecordIO support. We still need to implement:

      • Conversion between Beam and TF schema
      • Conversion between Beam Row and TF Example proto
      • TFRecordTableProvider
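
      To illustrate the first two items, here is a minimal sketch of the type mapping and value conversion. It uses stand-in types only: `BeamType`, `FeatureKind`, `Feature`, and the map-based schema and row are hypothetical placeholders for Beam's Schema/Row and the Example proto, not real APIs. The one fact it relies on is that a feature in example.proto holds one of three value lists (bytes, float, int64). Mapping booleans to int64 and narrowing doubles to float are assumptions, not decisions made in this issue.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Stand-in types only; a real implementation would use Beam's Schema/Row and the Example proto. */
final class TfExampleSketch {

  /** Hypothetical subset of Beam field types. */
  enum BeamType { BYTES, STRING, INT32, INT64, FLOAT, DOUBLE, BOOLEAN }

  /** The three value lists a feature in example.proto can hold. */
  enum FeatureKind { BYTES_LIST, INT64_LIST, FLOAT_LIST }

  /** Stand-in for a single feature: one kind plus its values. */
  static final class Feature {
    final FeatureKind kind;
    final List<Object> values;

    Feature(FeatureKind kind, List<Object> values) {
      this.kind = kind;
      this.values = values;
    }
  }

  /** Schema direction: choose the feature kind for a Beam field type. */
  static FeatureKind toFeatureKind(BeamType type) {
    switch (type) {
      case BYTES:
      case STRING:
        return FeatureKind.BYTES_LIST;
      case INT32:
      case INT64:
      case BOOLEAN: // assumption: booleans are stored as 0/1
        return FeatureKind.INT64_LIST;
      case FLOAT:
      case DOUBLE: // assumption: doubles are narrowed to 32-bit floats
        return FeatureKind.FLOAT_LIST;
      default:
        throw new IllegalArgumentException("Unsupported type: " + type);
    }
  }

  /** Row direction: convert one row (field name -> value) into named features. */
  static Map<String, Feature> toExample(Map<String, BeamType> schema, Map<String, Object> row) {
    Map<String, Feature> features = new LinkedHashMap<>();
    for (Map.Entry<String, BeamType> field : schema.entrySet()) {
      Object value = row.get(field.getKey());
      if (value == null) {
        continue; // a null field is simply left out of the example
      }
      FeatureKind kind = toFeatureKind(field.getValue());
      List<Object> values = new ArrayList<>();
      values.add(normalize(kind, value));
      features.put(field.getKey(), new Feature(kind, values));
    }
    return features;
  }

  /** Coerce a value to the Java type matching the chosen feature kind. */
  private static Object normalize(FeatureKind kind, Object value) {
    switch (kind) {
      case INT64_LIST:
        return value instanceof Boolean
            ? (((Boolean) value) ? 1L : 0L)
            : ((Number) value).longValue();
      case FLOAT_LIST:
        return ((Number) value).floatValue();
      default:
        return value instanceof String
            ? ((String) value).getBytes(StandardCharsets.UTF_8)
            : value;
    }
  }
}
```

      The reverse direction (Example to Row) is lossier in this sketch: an int64 list does not say whether the Beam field was INT32, INT64, or BOOLEAN, so the Beam schema (or a TF schema.proto) would have to be supplied alongside the data.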

      https://github.com/tensorflow/metadata/blob/master/tensorflow_metadata/proto/v0/schema.proto

      https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/example/example.proto

      Also, it seems the metadata protos are not published as Java artifacts:

      https://github.com/tensorflow/metadata/issues/5

      My WIP branch: https://github.com/spotify/beam/tree/neville/tf

          People

            Assignee: Unassigned
            Reporter: Neville Li (sinisa_lyh)
            Votes: 0
            Watchers: 6
