Beam / BEAM-4076

Schema followups


    Details

    • Type: Improvement
    • Status: Open
    • Priority: P3
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: beam-model, sdk-java-core
    • Labels:

      Description

      This umbrella issue collects follow-up subtasks for Beam schemas, which were moved from SQL into the core Java SDK and made type-name-based rather than coder-based.

        Sub-Tasks

          1.
          Refactor builder field nullability Sub-task Resolved Kenneth Knowles

          Time Spent - 2h 40m
          2.
          Review Schema API surface Sub-task Resolved Kenneth Knowles  
          3.
          Define & document the domain of Schema types prominently Sub-task Open Unassigned  
          4.
          Consider Schema.join to automatically produce a correct joined schema Sub-task Open Unassigned  
          5.
          Review of schema metadata vs schema types Sub-task Open Unassigned  
          6.
          remove RowSqlTypeBuilder Sub-task Resolved Kenneth Knowles

          Time Spent - 1.5h
          7.
          FieldType should be a proper algebraic type Sub-task Open Unassigned  
          8.
          Find remaining uses of rowType and RowType, etc, and make them Schema as appropriate Sub-task Resolved Kenneth Knowles

          Time Spent - 50m
          9.
          Document the SDK contract for a PCollection having a schema Sub-task Open Unassigned  
          10.
          SQL operators and primitive values should use a richer type system than SqlTypeName Sub-task Open Unassigned  
          11.
          Validate that OutputReceiver<Row> is only allowed if the output PCollection has a schema Sub-task Open Unassigned  
          12.
          SchemaRegistry should support a ServiceLoader interface Sub-task Resolved Reuven Lax

          Time Spent - 1h
          13.
          Create a lazy row on top of a generic Getter interface Sub-task Resolved Reuven Lax  
          14.
          Provide automatic schema registration for POJOs Sub-task Resolved Reuven Lax

          Time Spent - 12h 50m
          15.
          Provide automatic schema registration for AVROs Sub-task Resolved Reuven Lax

          Time Spent - 13.5h
          16.
          Provide automatic schema registration for Protos Sub-task Resolved Reuven Lax  
          17.
          Provide automatic schema registration for BigQuery TableRows Sub-task Open Unassigned  
          18.
          Analyze FieldAccessDescriptors and drop fields that are never accessed Sub-task Open Unassigned  
          19.
          Support unknown fields in Rows Sub-task Open Unassigned  
          20.
          Schemas across pipeline modifications Sub-task Open Unassigned  
          21.
          Investigate other encoding mechanism for SchemaCoder Sub-task Open Unassigned  
          22.
          Create a library of useful transforms that use schemas Sub-task Resolved Reuven Lax

          Time Spent - 30.5h
          23.
          Improve performance of SchemaCoder Sub-task Resolved Reuven Lax

          Time Spent - 3h 10m
          24.
          Enable schemas for all runners Sub-task Open Unassigned

          Time Spent - 1.5h
          25.
          Move Nexmark and SQL to use the new Schema framework Sub-task Resolved Reuven Lax

          Time Spent - 2.5h
          26.
          Schemas do not work on Dataflow runner of FnApi Runner Sub-task Resolved Reuven Lax  
          27.
          Allow users to annotate POJOs and JavaBeans for richer functionality Sub-task Resolved Reuven Lax

          Time Spent - 7h 50m
          28.
          Create automatic schema registration for AutoValue classes. Sub-task Resolved Reuven Lax

          Time Spent - 50m
          29.
          Generated row object for POJOs, Avros, and JavaBeans should work if the wrapped class is package private Sub-task Resolved Reuven Lax

          Time Spent - 1h 10m
          30.
          Allow textual selection syntax for schema fields Sub-task Resolved Unassigned

          Time Spent - 2.5h
          31.
          Nested collection types cause NullPointerException when converting to a POJO Sub-task Resolved Reuven Lax

          Time Spent - 1h 40m
          32.
          ParDo should allow any type with a compatible registered schema in the @Element parameter Sub-task Resolved Reuven Lax  
          33.
          Support schemas in BigQueryIO.Write Sub-task Resolved Reuven Lax

          Time Spent - 2h 50m
          34.
          BigQueryIO.Read should automatically produce schemas Sub-task Resolved Charith Ellawala

          Time Spent - 2.5h
          35.
          The JdbcIO source should produce schemas Sub-task Resolved Charith Ellawala

          Time Spent - 4h 10m
          36.
          The JdbcIO sink should accept schemas Sub-task Resolved Shehzaad Nakhoda

          Time Spent - 8h 10m
          37.
          isSubType isSuperType methods do not belong in Schema.FieldType Sub-task Open Unassigned  
          38.
          Create LogicalType for Schema fields Sub-task Resolved Unassigned

          Time Spent - 3h 50m
          39.
          Create proto representation for schemas Sub-task Open Unassigned

          Time Spent - 2.5h
          40.
          Support lazy iterables in schemas Sub-task Resolved Reuven Lax

          Time Spent - 2.5h
          41.
          Select transform has non-intuitive semantics Sub-task Resolved Reuven Lax

          Time Spent - 12h 10m
          42.
          Remove KV from Schema transforms Sub-task Resolved Brian Hulette

          Time Spent - 2h 10m
          43.
          Convert should support more boxing and unboxing Sub-task Open Unassigned  
          44.
          Add transforms for modifying schemas Sub-task Resolved Reuven Lax

          Time Spent - 4h 20m
          45.
          Remove FieldType metadata Sub-task Open Unassigned  
          46.
          Beam transforms reorder fields Sub-task Open Unassigned  
          47.
          Select needs the ability to rename fields Sub-task Resolved Unassigned  
          48.
          Create a better Schema builder Sub-task Open Unassigned  
          49.
          PubSubIO.writeAvros should infer beam schemas Sub-task Open Unassigned  
          50.
          KafkaIO should support inferring schemas when reading Avro Sub-task Open Unassigned

          Time Spent - 3h 10m
          51.
          Allow selecting slices of arrays and maps Sub-task Open Unassigned  
          52.
          Add support for generics in schema inference Sub-task Open Unassigned  
          53.
          Select should support nested schema selectors Sub-task Resolved Unassigned

          Time Spent - 2h
          54.
          Some JDBC types do not have an equivalent Beam schema type Sub-task Open Unassigned  
          55.
          Duplication of code between JDBC Read classes Sub-task Open Unassigned  
          56.
          Support Avro dates in Schemas Sub-task Resolved Gleb Kanterov

          Time Spent - 1h
          57.
          Support Avro enums in Schemas Sub-task Resolved Gleb Kanterov  
          58.
          Protobuf Beam Schema support Sub-task Resolved Alex Van Boxel

          Time Spent - 26h 40m
          59.
          The Row object needs better builders Sub-task Resolved Unassigned

          Time Spent - 5h 10m
          60.
          ByteBuddy Schema code does not properly handle null values Sub-task Resolved Reuven Lax

          Time Spent - 50m
          61.
          support schemas in state API Sub-task Open Unassigned

          Time Spent - 50m
          62.
          Schema Select does not properly handle nested nullable fields Sub-task Resolved Reuven Lax

          Time Spent - 2.5h
          63.
          Coder inference should be disabled for Row types Sub-task Resolved Reuven Lax

          Time Spent - 1h
          64.
          Add examples using Schema-based APIs Sub-task Open Unassigned  
          65.
          Support Dataflow update when schemas are used Sub-task Resolved Reuven Lax  


              People

              • Assignee: Unassigned
              • Reporter: Kenneth Knowles (kenn)
              • Votes: 0
              • Watchers: 4

                Dates

                • Created:
                  Updated:

                  Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 195.5h