Crunch (Retired) / CRUNCH-370

Update Parquet dependency in Crunch pom

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.9.0, 0.8.2
    • Fix Version/s: 0.10.0, 0.8.3
    • Component/s: IO
    • Labels:
      None

      Description

      Crunch currently supports Avro-to-Parquet conversion through the AvroParquetFileTarget and AvroParquetFileSource classes. When I used these classes to convert Avro files to Parquet, some inputs failed with the following exception: "org.apache.crunch.CrunchRuntimeException: parquet.io.ParquetEncodingException: empty fields are illegal, the field should be ommited completely instead"

      After further debugging, I found that the failure comes from the AvroWriteSupport class in Parquet. The bug was fixed as part of the 1.2.3 milestone: https://github.com/Parquet/parquet-mr/issues/162. The latest Parquet version is 1.3.2.

      However, Crunch still depends on Parquet 1.2.0: https://github.com/apache/crunch/blob/master/pom.xml#L77
      As part of this improvement, the Parquet dependency version in Crunch should be updated to the latest release or, failing that, to at least 1.2.3.
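
      For illustration, the change would amount to roughly the following in pom.xml. This is only a sketch: the property name parquet.version and the dependencyManagement layout are assumptions about how the Crunch pom declares Parquet, and the attached patch may differ. The version 1.2.3 is the minimum named above; 1.3.2 could be substituted.

      ```xml
      <!-- Sketch only: property name and placement are assumed, not copied from the Crunch pom -->
      <properties>
        <!-- Previously 1.2.0; 1.2.3 is the milestone that carries the AvroWriteSupport fix -->
        <parquet.version>1.2.3</parquet.version>
      </properties>

      <dependencyManagement>
        <dependencies>
          <dependency>
            <!-- Maven coordinates used by the Parquet 1.x Avro bindings -->
            <groupId>com.twitter</groupId>
            <artifactId>parquet-avro</artifactId>
            <version>${parquet.version}</version>
          </dependency>
        </dependencies>
      </dependencyManagement>
      ```

      After the change, running mvn dependency:tree -Dincludes=com.twitter should show which Parquet version each module actually resolves.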

        Attachments

        1. CRUNCH-370.patch
          0.8 kB
          Micah Whitacre

            People

            • Assignee:
              mkwhitacre Micah Whitacre
              Reporter:
              anand.kothapalli Anandsagar Kothapalli
            • Votes:
              0
              Watchers:
              4