Uploaded image for project: 'Beam'
  1. Beam
  2. BEAM-9144

Beam's own Avro TimeConversion class in beam-sdk-java-core

    XMLWordPrintableJSON

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.18.0
    • Fix Version/s: 2.19.0
    • Component/s: sdk-java-core
    • Labels:
      None

      Description

      From Aaron's comment in https://issues.apache.org/jira/browse/BEAM-8388?focusedCommentId=17016476&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17016476 .

      My org must use Avro 1.9.x (due to some Avro schema resolution issues resolved in 1.9.x) so downgrading Avro is not possible for us.
      Beam 2.16.0 is compatible with our usage of Avro 1.9.x – but upgrading to 2.17.0 we are broken as 2.17.0 links to Java classes in Avro 1.8.x that are not available in 1.9.x.

      The Java class is org.apache.avro.data.TimeConversions.TimestampConversion in Avro 1.8.
      It's renamed to org.apache.avro.data.JodaTimeConversions in Avro 1.9.

      Beam Java SDK cannot upgrade Avro to 1.9

      Beam has Spark runners and Spark has not yet upgraded to Avro 1.9.

      Illustration of the dependency

      Short-term Solution

      As illustrated above, as long as Beam Java SDK uses only the intersection of Avro classes, method, and fields between Avro 1.8 and 1.9, it will provide flexibility in runtime Avro versions (as it did until Beam 2.16).

      Difference of the TimeConversion Classes

      Avro 1.9's TimestampConversion overrides getRecommendedSchema method. Details below:

      Avro 1.8's TimeConversions.TimestampConversion:

        public static class TimestampConversion extends Conversion<DateTime> {
          @Override
          public Class<DateTime> getConvertedType() {
            return DateTime.class;
          }
      
          @Override
          public String getLogicalTypeName() {
            return "timestamp-millis";
          }
      
          @Override
          public DateTime fromLong(Long millisFromEpoch, Schema schema, LogicalType type) {
            return new DateTime(millisFromEpoch, DateTimeZone.UTC);
          }
      
          @Override
          public Long toLong(DateTime timestamp, Schema schema, LogicalType type) {
            return timestamp.getMillis();
          }
        }
      

      Avro 1.9's JodaTimeConversions.TimestampConversion:

        public static class TimestampConversion extends Conversion<DateTime> {
          @Override
          public Class<DateTime> getConvertedType() {
            return DateTime.class;
          }
      
          @Override
          public String getLogicalTypeName() {
            return "timestamp-millis";
          }
      
          @Override
          public DateTime fromLong(Long millisFromEpoch, Schema schema, LogicalType type) {
            return new DateTime(millisFromEpoch, DateTimeZone.UTC);
          }
      
          @Override
          public Long toLong(DateTime timestamp, Schema schema, LogicalType type) {
            return timestamp.getMillis();
          }
      
          @Override
          public Schema getRecommendedSchema() {
            return LogicalTypes.timestampMillis().addToSchema(Schema.create(Schema.Type.LONG));
          }
        }
      

        Attachments

        1. avro-beam-dependency-graph.png
          68 kB
          Tomo Suzuki
        2. dataflow_step_job_id_OBFUSC-0.json
          382 kB
          Aaron Dixon
        3. dataflow-not-finish.png
          19 kB
          Tomo Suzuki
        4. NoClassDefFoundError in word-count-beam.png
          382 kB
          Tomo Suzuki
        5. dataflowWorkerJar_succeeded.png
          158 kB
          Tomo Suzuki

          Issue Links

            Activity

              People

              • Assignee:
                suztomo Tomo Suzuki
                Reporter:
                suztomo Tomo Suzuki
              • Votes:
                0 Vote for this issue
                Watchers:
                5 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved:

                  Time Tracking

                  Estimated:
                  Original Estimate - Not Specified
                  Not Specified
                  Remaining:
                  Remaining Estimate - 0h
                  0h
                  Logged:
                  Time Spent - 2h 50m
                  2h 50m