  Beam / BEAM-9502

SchemaCoder is not update compatible

Details

    Description

      See relevant dev@ discussion. Runners should consider schemas compatible if they have the same fields in the same order. They should also get the ability to re-order fields in equivalent schemas (same fields, possibly out of order) using the encoding_position field (BEAM-10277).
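      The two notions above (compatible: same fields in the same order; equivalent: same fields in any order) can be sketched as follows. This is only an illustration, not Beam or Dataflow code: `SchemaCompat`, its methods, and the "name:TYPE" string representation of a field are hypothetical, and field names are assumed to be unique.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper; a schema is modelled as an ordered list of "name:TYPE" strings. */
final class SchemaCompat {
  private SchemaCompat() {}

  /** Compatible in the sense above: the same fields in the same order. */
  static boolean compatible(List<String> oldFields, List<String> newFields) {
    return oldFields.equals(newFields);
  }

  /** Equivalent: the same fields, possibly out of order. */
  static boolean equivalent(List<String> a, List<String> b) {
    List<String> sortedA = new ArrayList<>(a);
    List<String> sortedB = new ArrayList<>(b);
    Collections.sort(sortedA);
    Collections.sort(sortedB);
    return sortedA.equals(sortedB);
  }

  /**
   * For each field of the new schema, returns the position that field had in
   * the old schema, i.e. the encoding position needed to keep the old layout.
   */
  static int[] encodingPositions(List<String> oldFields, List<String> newFields) {
    if (!equivalent(oldFields, newFields)) {
      throw new IllegalArgumentException("schemas are not equivalent");
    }
    int[] positions = new int[newFields.size()];
    for (int i = 0; i < newFields.size(); i++) {
      positions[i] = oldFields.indexOf(newFields.get(i));
    }
    return positions;
  }
}
```

      With an old schema of [id, name] and a new schema of [name, id], `compatible` is false, `equivalent` is true, and `encodingPositions` maps the new fields back to the old order.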

      Original Description:

      SchemaCoder assigns a random UUID, causing Dataflow's compatibility check to fail

      Since fe4b7794, Schema.equals compares only the UUIDs, for faster comparison.
      Since 0b3b18c6, SchemaCoder forces a random UUID when schema.uuid is null.

      As a result, when trying to update (--update) a Dataflow job that uses row schemas in user code, the compatibility check fails because SchemaCoder produces a different random UUID.
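      The interaction described above can be reproduced in a toy model. `ToySchema` below is a hypothetical stand-in, not Beam's Schema class; it only mimics the two behaviours named in this report (equality by UUID alone, and a random UUID assigned when none is set).

```java
import java.util.List;
import java.util.Objects;
import java.util.UUID;

/** Hypothetical stand-in for a schema; not the real Beam class. */
final class ToySchema {
  final List<String> fields;
  UUID uuid; // null until something assigns it

  ToySchema(List<String> fields) {
    this.fields = fields;
  }

  /** Mimics the coder forcing a random UUID when none is set. */
  ToySchema withRandomUuidIfMissing() {
    if (uuid == null) {
      uuid = UUID.randomUUID();
    }
    return this;
  }

  /** Mimics equality that looks only at the UUID, not at the fields. */
  @Override
  public boolean equals(Object other) {
    if (!(other instanceof ToySchema)) {
      return false;
    }
    return Objects.equals(uuid, ((ToySchema) other).uuid);
  }

  @Override
  public int hashCode() {
    return Objects.hashCode(uuid);
  }
}
```

      Two `ToySchema` instances built from identical field lists, as would happen when the same pipeline is submitted twice, receive different random UUIDs and therefore compare as unequal even though their fields match.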

       

      The user can set the UUID after creating the Schema, but not through Schema.Builder,
      and I'm afraid most users, who are unaware of the internal implementation, won't do that.

       

      In my branch, I added .withUUID and .withRandomUUID to Schema.Builder.

      But I think a better solution would be to calculate the UUID from the schema itself.

      Any thoughts?
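      One way to derive a UUID from the schema itself is a name-based UUID over a canonical rendering of the fields. The sketch below uses the JDK's `UUID.nameUUIDFromBytes`; the class `DeterministicSchemaUuid`, the "name:TYPE" field strings, and the newline-joined canonical form are assumptions made for illustration, not what Beam does.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.UUID;

/** Hypothetical helper; a schema is modelled as an ordered list of "name:TYPE" strings. */
final class DeterministicSchemaUuid {
  private DeterministicSchemaUuid() {}

  /** Same fields in the same order always yield the same UUID. */
  static UUID fromFields(List<String> fields) {
    // Field order is part of the canonical form, so reordering changes the UUID.
    String canonical = String.join("\n", fields);
    return UUID.nameUUIDFromBytes(canonical.getBytes(StandardCharsets.UTF_8));
  }
}
```

      Because the UUID depends only on the field list, rebuilding the same schema for an updated job would yield the same value, while a reordered schema would get a different one.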

      reuvenlax

       


            People

              Assignee: Unassigned
              Reporter: Yaron Neuman (yaronneuman)
              Votes: 1
              Watchers: 5

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 2h