Chukwa / CHUKWA-452

ChukwaArchive byte[] needs to be wrapped in DataByteArray

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.4.0
    • Fix Version/s: 0.4.0
    • Component/s: None
    • Labels: None

      Description

      I've been trying to read the data inserted into /chukwa/finalArchives using the ChukwaArchive loader in Pig.
      When I try to STORE only the data field using BinaryStorage, I get:

      FAILED

      java.lang.ClassCastException: [B cannot be cast to org.apache.pig.data.DataByteArray
      at org.apache.pig.builtin.BinaryStorage.putNext(BinaryStorage.java:128)
      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:200)
      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:174)
      at org.apache.hadoop.mapred.MapTask$DirectMapOutputCollector.collect(MapTask.java:642)
      at org.apache.hadoop.mapred.MapTask$OldOutputCollector.collect(MapTask.java:466)
      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.collect(PigMapOnly.java:70)
      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:255)
      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:244)
      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.map(PigMapOnly.java:65)
      at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
      at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
      at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
      at org.apache.hadoop.mapred.Child.main(Child.java:170)

      Looking at the ChukwaArchive class, I saw that the data field (a byte[]) is set on the Tuple without being wrapped in a DataByteArray.
      The patch I'm submitting fixes this by wrapping the byte[] in a DataByteArray before it is added to the Tuple.
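
For illustration only, here is a minimal sketch of why the cast fails and why wrapping fixes it. It is not the contents of CHUKWA-452.patch. So that it runs without Pig on the classpath, it uses a stand-in `DataByteArray` that mimics only the `byte[]` constructor and `get()` of `org.apache.pig.data.DataByteArray`, and a plain `List<Object>` in place of a Pig Tuple. The names `WrapSketch`, `writeField`, `tupleUnwrapped`, `tupleWrapped` and `storeFails` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class WrapSketch {
    /** Stand-in for org.apache.pig.data.DataByteArray (assumption: only the byte[] constructor and get() are mimicked). */
    static final class DataByteArray {
        private final byte[] data;
        DataByteArray(byte[] data) { this.data = data; }
        byte[] get() { return data; }
    }

    /** Mimics a store function that casts a field to DataByteArray before writing it. */
    static int writeField(Object field) {
        return ((DataByteArray) field).get().length;
    }

    /** Loader behaviour described in the report: the raw byte[] goes straight into the tuple. */
    static List<Object> tupleUnwrapped(byte[] data) {
        List<Object> tuple = new ArrayList<>();
        tuple.add(data);
        return tuple;
    }

    /** Loader behaviour after the fix: the byte[] is wrapped first. */
    static List<Object> tupleWrapped(byte[] data) {
        List<Object> tuple = new ArrayList<>();
        tuple.add(new DataByteArray(data));
        return tuple;
    }

    /** Returns true if writing the first field throws ClassCastException. */
    static boolean storeFails(List<Object> tuple) {
        try {
            writeField(tuple.get(0));
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        byte[] payload = {1, 2, 3};
        System.out.println(storeFails(tupleUnwrapped(payload))); // prints "true"
        System.out.println(storeFails(tupleWrapped(payload)));   // prints "false"
    }
}
```

The unwrapped case reproduces the "[B cannot be cast" failure in miniature: the store side expects the wrapper type, so the loader has to supply it.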

      1. CHUKWA-452.patch
        1.0 kB
        Gerrit Jansen van Vuuren

        Activity

        eyang Eric Yang added a comment -

        This actually changes the Chukwa Archive format to be Pig-specific. ChukwaArchive should be extensible to become an archiver for the Pig data format. It would be better to leave Chukwa Archive as it is and create a new archiver class that extends it. What do you guys think?

        asrabkin Ari Rabkin added a comment -

        Eric, I don't quite follow. This patch only modifies code in contrib/chukwa-pig/. The ChukwaArchive in question is the Pig data format of that name.

        I vote to commit this.

        jboulon Jerome Boulon added a comment -

        +1. The code is specifically designed for Pig; it's a wrapper that I wrote to be "Pig friendly".

        gerritjvv Gerrit Jansen van Vuuren added a comment -

        Yes, this is just for Pig support; it enables Pig to read the final Chukwa archive files.

        eyang Eric Yang added a comment -

        My fault for not seeing this correctly.

        +1 Looks good.

        asrabkin Ari Rabkin added a comment -

        I just committed this. Thanks, Gerrit!

        hudson Hudson added a comment -

        Integrated in Chukwa-trunk #330 (See http://hudson.zones.apache.org/hudson/job/Chukwa-trunk/330/ )

          People

          • Assignee: Unassigned
          • Reporter: Gerrit Jansen van Vuuren
          • Votes: 0
          • Watchers: 2
