Beam / BEAM-4076

Schema followups

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Labels:
      None

      Description

      This umbrella bug tracks follow-up subtasks for Beam schemas, which were moved from SQL to the core Java SDK and made type-name-based rather than coder-based.

        Sub-Tasks

        1.
        Refactor builder field nullability Sub-task Resolved Kenneth Knowles

        100%

        Original Estimate - Not Specified
        Time Spent - 2h 40m
        2.
        Review Schema API surface Sub-task Open Unassigned  
        3.
        Define & document the domain of Schema types prominently Sub-task Open Unassigned  
        4.
        Consider Schema.join to automatically produce a correct joined schema Sub-task Open Unassigned  
        5.
        Review of schema metadata vs schema types Sub-task Open Unassigned  
        6.
        remove RowSqlTypeBuilder Sub-task Resolved Kenneth Knowles

        100%

        Original Estimate - Not Specified
        Time Spent - 1.5h
        7.
        FieldType should be a proper algebraic type Sub-task Open Unassigned  
        8.
        Find remaining uses of rowType and RowType, etc, and make them Schema as appropriate Sub-task Resolved Kenneth Knowles

        100%

        Original Estimate - Not Specified
        Time Spent - 50m
        9.
        Document the SDK contract for a PCollection having a schema Sub-task Open Unassigned  
        10.
        SQL operators and primitive values should use a richer type system than SqlTypeName Sub-task Open Unassigned  
        11.
        Validate that OutputReceiver<Row> is only allowed if the output PCollection has a schema Sub-task Open Reuven Lax  
        12.
        SchemaRegistry should support a ServiceLoader interface Sub-task Resolved Reuven Lax

        100%

        Original Estimate - Not Specified
        Time Spent - 1h
        13.
        Create a lazy row on top of a generic Getter interface Sub-task Resolved Reuven Lax  
        14.
        Provide automatic schema registration for POJOs Sub-task Resolved Reuven Lax

        100%

        Original Estimate - Not Specified
        Time Spent - 12h 50m
        15.
        Provide automatic schema registration for AVROs Sub-task Resolved Reuven Lax

        100%

        Original Estimate - Not Specified
        Time Spent - 13.5h
        16.
        Provide automatic schema registration for Protos Sub-task Open Shehzaad Nakhoda  
        17.
        Provide automatic schema registration for BigQuery TableRows Sub-task Open Reuven Lax  
        18.
        Analyze FieldAccessDescriptors and drop fields that are never accessed Sub-task Open Reuven Lax  
        19.
        Support unknown fields in Rows Sub-task Open Reuven Lax  
        20.
        Schemas across pipeline modifications Sub-task Open Reuven Lax  
        21.
        Investigate other encoding mechanism for SchemaCoder Sub-task Open Reuven Lax  
        22.
        Create a library of useful transforms that use schemas Sub-task Open Reuven Lax

        100%

        Original Estimate - Not Specified
        Time Spent - 30.5h
        23.
        Improve performance of SchemaCoder Sub-task Open Reuven Lax

        100%

        Original Estimate - Not Specified
        Time Spent - 3h 10m
        24.
        Enable schemas for all runners Sub-task Open Reuven Lax

        100%

        Original Estimate - Not Specified
        Time Spent - 1.5h
        25.
        Move Nexmark and SQL to use the new Schema framework Sub-task Resolved Reuven Lax

        100%

        Original Estimate - Not Specified
        Time Spent - 2.5h
        26.
        Schemas do not work on Dataflow runner or FnApi Runner Sub-task Closed Reuven Lax  
        27.
        Allow users to annotate POJOs and JavaBeans for richer functionality Sub-task Resolved Reuven Lax

        100%

        Original Estimate - Not Specified
        Time Spent - 7h 50m
        28.
        Create automatic schema registration for AutoValue classes. Sub-task Resolved Reuven Lax

        100%

        Original Estimate - Not Specified
        Time Spent - 40m
        29.
        Generated row object for POJOs, Avros, and JavaBeans should work if the wrapped class is package private Sub-task Resolved Reuven Lax

        100%

        Original Estimate - Not Specified
        Time Spent - 1h 10m
        30.
        Allow textual selection syntax for schema fields Sub-task Open Reuven Lax

        100%

        Original Estimate - Not Specified
        Time Spent - 2.5h
        31.
        Nested collection types cause NullPointerException when converting to a POJO Sub-task Resolved Reuven Lax

        100%

        Original Estimate - Not Specified
        Time Spent - 1h 40m
        32.
        ParDo should allow any type with a compatible registered schema in the @Element parameter Sub-task Resolved Reuven Lax  
        33.
        Support schemas in BigQueryIO.Write Sub-task Resolved Reuven Lax

        100%

        Original Estimate - Not Specified
        Time Spent - 2h 50m
        34.
        BigQueryIO.Read should automatically produce schemas Sub-task Resolved Charith Ellawala

        100%

        Original Estimate - Not Specified
        Time Spent - 2.5h
        35.
        The JdbcIO source should produce schemas Sub-task Resolved Charith Ellawala

        100%

        Original Estimate - Not Specified
        Time Spent - 4h 10m
        36.
        The JdbcIO sink should accept schemas Sub-task Resolved Shehzaad Nakhoda

        100%

        Original Estimate - Not Specified
        Time Spent - 8h 10m
        37.
        isSubType isSuperType methods do not belong in Schema.FieldType Sub-task Open Reuven Lax  
        38.
        Create LogicalType for Schema fields Sub-task Open Reuven Lax

        100%

        Original Estimate - Not Specified
        Time Spent - 3h 50m
        39.
        Create proto representation for schemas Sub-task Open Reuven Lax

        100%

        Original Estimate - Not Specified
        Time Spent - 2.5h
        40.
        Support lazy iterables in schemas Sub-task Resolved Reuven Lax

        100%

        Original Estimate - Not Specified
        Time Spent - 2.5h
        41.
        Select transform has non-intuitive semantics Sub-task Resolved Reuven Lax

        100%

        Original Estimate - Not Specified
        Time Spent - 12h 10m
        42.
        Remove KV from Schema transforms Sub-task Resolved Brian Hulette

        100%

        Original Estimate - Not Specified
        Time Spent - 2h 10m
        43.
        Convert should support more boxing and unboxing Sub-task Open Unassigned  
        44.
        Add transforms for modifying schemas Sub-task Resolved Reuven Lax

        100%

        Original Estimate - Not Specified
        Time Spent - 4h 20m
        45.
        Remove FieldType metadata Sub-task Open Reuven Lax  
        46.
        Beam transforms reorder fields Sub-task Open Unassigned  
        47.
        Select needs the ability to rename fields Sub-task Resolved Unassigned  
        48.
        Create a better Schema builder Sub-task Open Unassigned  
        49.
        PubSubIO.writeAvros should infer beam schemas Sub-task Open Unassigned  
        50.
        KafkaIO should support inferring schemas when reading Avro Sub-task Open Ismaël Mejía

        100%

        Original Estimate - Not Specified
        Time Spent - 2h 20m
        51.
        Allow selecting slices of arrays and maps Sub-task Open Unassigned  
        52.
        Add support for generics in schema inference Sub-task Open Unassigned  
        53.
        Select should support nested schema selectors Sub-task Open Unassigned

        100%

        Original Estimate - Not Specified
        Time Spent - 2h
        54.
        Some JDBC types do not have an equivalent Beam schema type Sub-task Open Unassigned  
        55.
        Duplication of code between JDBC Read classes Sub-task Open Unassigned  
        56.
        Support Avro dates in Schemas Sub-task Closed Gleb Kanterov

        100%

        Original Estimate - Not Specified
        Time Spent - 1h
        57.
        Support Avro enums in Schemas Sub-task Closed Gleb Kanterov  
        58.
        Protobuf Beam Schema support Sub-task Closed Alex Van Boxel

        100%

        Original Estimate - Not Specified
        Time Spent - 26h 40m
        59.
        The Row object needs better builders Sub-task Open Reuven Lax

        100%

        Original Estimate - Not Specified
        Time Spent - 5h 10m
        60.
        ByteBuddy Schema code does not properly handle null values Sub-task Resolved Reuven Lax

        100%

        Original Estimate - Not Specified
        Time Spent - 50m
        61.
        support schemas in state API Sub-task Open Reuven Lax

        100%

        Original Estimate - Not Specified
        Time Spent - 10m
        62.
        Schema Select does not properly handle nested nullable fields Sub-task Triage Needed Unassigned

        100%

        Original Estimate - Not Specified
        Time Spent - 2.5h
        63.
        Move Beam SQL to use the schema join transforms Sub-task Triage Needed Unassigned  
        64.
        Coder inference should be disabled for Row types Sub-task Triage Needed Unassigned

        100%

        Original Estimate - Not Specified
        Time Spent - 40m

            People

            • Assignee:
              Unassigned
              Reporter:
              kenn Kenneth Knowles
            • Votes:
              0
              Watchers:
              3

                Time Tracking

                Estimated:
                Not Specified
                Remaining:
                0h
                Logged:
                193h 10m