Crunch (Retired) / CRUNCH-495

Fix case class/SpecificRecord interactions in Scrunch

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.11.0
    • Fix Version/s: 0.12.0
    • Component/s: Scrunch
    • Labels: None

    Description

      So this is a fun one: as part of the work for 0.11, I wrote a way to serialize Scala case classes as Avro generic records. However, if AvroMode.SPECIFIC is enabled on an MR job (e.g., a join between one PTable that contains specific record instances and another PTable that contains instances of a case class), Avro's SpecificData object gets confused when it sees the Avro schema I generate for the case class: the name of the schema is identical to the name of the case class on the JVM, so Avro thinks the record is an actual instance of a SpecificRecord.

      The solution I came up with is to slightly modify the name of the generated Avro generic schema for the case class so that it no longer matches the case class name exactly, which keeps Avro from mistaking the record for a SpecificRecord.
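
      To illustrate the collision and the fix, here is a minimal sketch in plain Java with no Avro dependency. It assumes the confusion comes from a lookup that tries to load a JVM class whose name equals the schema name. The class SchemaNameSketch, the stand-in class Point, both helper methods, and the "_generic" suffix are hypothetical and chosen only for illustration; the actual renaming rule is whatever the attached patch does.

```java
import java.util.Optional;

public class SchemaNameSketch {
    // Stand-in for a Scala case class. It is NOT an Avro SpecificRecord.
    public static class Point {
        int x;
        int y;
    }

    // Assumption: a name-based lookup that tries to load a JVM class
    // with the same name as the schema, and reports whether one exists.
    public static Optional<Class<?>> classForSchemaName(String schemaName) {
        try {
            return Optional.of(Class.forName(schemaName));
        } catch (ClassNotFoundException e) {
            return Optional.empty();
        }
    }

    // Hypothetical renaming rule: append a suffix so the generated schema
    // name no longer equals the class name. The suffix is illustrative only.
    public static String genericSchemaName(Class<?> clazz) {
        return clazz.getName() + "_generic";
    }

    public static void main(String[] args) {
        String colliding = Point.class.getName();
        String renamed = genericSchemaName(Point.class);

        // Schema named exactly like the class: the lookup finds a class.
        System.out.println(colliding + " found: "
            + classForSchemaName(colliding).isPresent());

        // Slightly modified schema name: the lookup finds nothing.
        System.out.println(renamed + " found: "
            + classForSchemaName(renamed).isPresent());
    }
}
```

      With the unmodified name, the lookup resolves to the case class stand-in even though it is not a specific record. With the modified name, no class is found, so a lookup of this kind has nothing to mistake for a SpecificRecord.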

      Attachments

        1. CRUNCH-495.patch
          5 kB
          Josh Wills

          People

            Assignee: Josh Wills (jwills)
            Reporter: Josh Wills (jwills)
            Votes: 0
            Watchers: 1
