Uploaded image for project: 'Beam'
  1. Beam
  2. BEAM-9144

Beam's own Avro TimeConversion class in beam-sdk-java-core

Details

    • Bug
    • Status: Triage Needed
    • P2
    • Resolution: Fixed
    • 2.18.0
    • 2.19.0
    • sdk-java-core

    Description

      From Aaron's comment in https://issues.apache.org/jira/browse/BEAM-8388?focusedCommentId=17016476&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17016476 .

      My org must use Avro 1.9.x (due to some Avro schema resolution issues resolved in 1.9.x) so downgrading Avro is not possible for us.
      Beam 2.16.0 is compatible with our usage of Avro 1.9.x – but upgrading to 2.17.0 we are broken as 2.17.0 links to Java classes in Avro 1.8.x that are not available in 1.9.x.

      The Java class is org.apache.avro.data.TimeConversions.TimestampConversion in Avro 1.8.
      It's renamed to org.apache.avro.data.JodaTimeConversions in Avro 1.9.

      Beam Java SDK cannot upgrade Avro to 1.9

      Beam has Spark runners and Spark has not yet upgraded to Avro 1.9.

      Illustration of the dependency

      Short-term Solution

      As illustrated above, as long as Beam Java SDK uses only the intersection of Avro classes, method, and fields between Avro 1.8 and 1.9, it will provide flexibility in runtime Avro versions (as it did until Beam 2.16).

      Difference of the TimeConversion Classes

      Avro 1.9's TimestampConversion overrides getRecommendedSchema method. Details below:

      Avro 1.8's TimeConversions.TimestampConversion:

        public static class TimestampConversion extends Conversion<DateTime> {
          @Override
          public Class<DateTime> getConvertedType() {
            return DateTime.class;
          }
      
          @Override
          public String getLogicalTypeName() {
            return "timestamp-millis";
          }
      
          @Override
          public DateTime fromLong(Long millisFromEpoch, Schema schema, LogicalType type) {
            return new DateTime(millisFromEpoch, DateTimeZone.UTC);
          }
      
          @Override
          public Long toLong(DateTime timestamp, Schema schema, LogicalType type) {
            return timestamp.getMillis();
          }
        }
      

      Avro 1.9's JodaTimeConversions.TimestampConversion:

        public static class TimestampConversion extends Conversion<DateTime> {
          @Override
          public Class<DateTime> getConvertedType() {
            return DateTime.class;
          }
      
          @Override
          public String getLogicalTypeName() {
            return "timestamp-millis";
          }
      
          @Override
          public DateTime fromLong(Long millisFromEpoch, Schema schema, LogicalType type) {
            return new DateTime(millisFromEpoch, DateTimeZone.UTC);
          }
      
          @Override
          public Long toLong(DateTime timestamp, Schema schema, LogicalType type) {
            return timestamp.getMillis();
          }
      
          @Override
          public Schema getRecommendedSchema() {
            return LogicalTypes.timestampMillis().addToSchema(Schema.create(Schema.Type.LONG));
          }
        }
      

      Attachments

        1. dataflowWorkerJar_succeeded.png
          158 kB
          Tomo Suzuki
        2. NoClassDefFoundError in word-count-beam.png
          382 kB
          Tomo Suzuki
        3. dataflow-not-finish.png
          19 kB
          Tomo Suzuki
        4. dataflow_step_job_id_OBFUSC-0.json
          382 kB
          Aaron Dixon
        5. avro-beam-dependency-graph.png
          68 kB
          Tomo Suzuki

        Issue Links

          Activity

            People

              suztomo Tomo Suzuki
              suztomo Tomo Suzuki
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated:
                  Original Estimate - Not Specified
                  Not Specified
                  Remaining:
                  Remaining Estimate - 0h
                  0h
                  Logged:
                  Time Spent - 2h 50m
                  2h 50m

                  Slack

                    Issue deployment