Chukwa / CHUKWA-495

Implement Pig 0.7.0 compatible Loader and Storer classes

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None
    • Release Note:
      Chukwa now requires Pig 0.7.x. ChukwaStorage has been replaced by ChukwaLoader and ChukwaStorer classes.

      Description

      Pig 0.7.0 introduces a revamped Load/Store model that is not backward compatible with previous Pig releases. We need to create new classes to handle loading/storing Chukwa data from Pig. Since the new load/store model uses abstract super classes instead of interfaces, I propose we deprecate org.apache.hadoop.chukwa.ChukwaStorage and create the following classes:

      org.apache.hadoop.chukwa.pig.ChukwaLoader
      org.apache.hadoop.chukwa.pig.ChukwaStorer
      

      Note the addition of the pig sub-package. Thoughts about this approach?
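
The constraint behind the proposed split can be illustrated with a minimal Java sketch. Pig 0.7 defines loading and storing through the abstract classes org.apache.pig.LoadFunc and org.apache.pig.StoreFunc, and a Java class can extend only one superclass, so a single class can no longer fill both roles. LoaderBase, StorerBase, SketchLoader, SketchStorer, and copy below are hypothetical stand-ins invented for illustration; they are not the Pig API and not the code in the patch.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class LoadStoreSplitSketch {

    // Hypothetical stand-in for an abstract loader base class (not the real Pig type).
    static abstract class LoaderBase {
        // Returns the next record, or null when the input is exhausted.
        abstract String getNext();
    }

    // Hypothetical stand-in for an abstract storer base class (not the real Pig type).
    static abstract class StorerBase {
        abstract void putNext(String record);
    }

    // With interfaces, one class could implement both roles. With abstract
    // base classes the following would not compile, because Java allows
    // only one superclass:
    //   static class Storage extends LoaderBase, StorerBase { }

    // Loader half of the split: reads records from an in-memory list.
    static final class SketchLoader extends LoaderBase {
        private final Iterator<String> records;

        SketchLoader(List<String> input) {
            this.records = input.iterator();
        }

        @Override
        String getNext() {
            return records.hasNext() ? records.next() : null;
        }
    }

    // Storer half of the split: collects records into an in-memory list.
    static final class SketchStorer extends StorerBase {
        final List<String> written = new ArrayList<>();

        @Override
        void putNext(String record) {
            written.add(record);
        }
    }

    // Drains the loader into the storer and returns the number of records copied.
    static int copy(LoaderBase loader, StorerBase storer) {
        int count = 0;
        for (String r = loader.getNext(); r != null; r = loader.getNext()) {
            storer.putNext(r);
            count++;
        }
        return count;
    }
}
```

Keeping the two roles in separate classes, as proposed, matches this one-superclass limit.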

      Attachments

      1. pig-0.7.0-test.jar (885 kB, Bill Graham)
      2. pig-0.7.0.jar.gz (9.38 MB, Bill Graham)
      3. chukwa-pig.jar (24 kB, Bill Graham)
      4. CHUKWA-495.1.patch (57 kB, Bill Graham)

        Activity

        asrabkin Ari Rabkin added a comment -

        I just committed this. Thanks, Bill!

        asrabkin Ari Rabkin added a comment -

        No, it looked fine. I was holding off on committing pending more operational experience; I assumed the people who needed it would be trying it. I'll commit it tomorrow barring objection.

        billgraham Bill Graham added a comment -

        Ping. Any comments on this patch?

        billgraham Bill Graham added a comment -

        Uploading the Pig core jar gzipped, since the uncompressed jar is > 10M.

        billgraham Bill Graham added a comment -

        Attached is CHUKWA-495.1.patch, which includes the new classes mentioned above, along with a refactored ChukwaArchive class. Please review these changes. Note the change in line 82 of TestLocalChukwaStorage. Basically, when the output path is set to chukwa-pig.evt, Pig writes out to chukwa-pig.evt/part-m-00000. I'm not sure how to keep that from happening, so we need to make sure that change is OK.

        I've also attached pig-0.7.0 jars built from the pig distro at this mirror (pig.jar and pig-test.jar should be removed):
        http://mirror.atlanticmetro.net/apache/hadoop/pig/pig-0.7.0/

        I wasn't able to find a Pig distro in the Maven repositories listed in the Maven configs. If we find one, we can handle that change in a separate JIRA.

        Finally, I don't know what the purpose is of having contrib/chukwa-pig/chukwa-pig.jar in SVN, since it's generated at build time, but I haven't removed it. I've uploaded my version too, in case we need to commit it to keep the checked-in jar current.
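
The part-file behaviour described above means a reader of the job output has to treat the configured path as a directory. The following is a minimal sketch of that lookup, assuming the output sits on a local filesystem reachable through java.nio.file (a real cluster would go through the Hadoop FileSystem API instead). PartFileSketch and resolveOutputFiles are hypothetical names, not part of Chukwa or Pig.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class PartFileSketch {

    // Returns the files a reader should open for a given output path:
    // the path itself if it is a regular file, or its part-* children
    // (sorted by path) if it is a directory. Returns an empty list if
    // the path does not exist.
    static List<Path> resolveOutputFiles(Path output) {
        List<Path> files = new ArrayList<>();
        if (Files.isRegularFile(output)) {
            files.add(output);
        } else if (Files.isDirectory(output)) {
            // The glob skips bookkeeping entries that do not start with "part-".
            try (DirectoryStream<Path> parts = Files.newDirectoryStream(output, "part-*")) {
                for (Path p : parts) {
                    files.add(p);
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            Collections.sort(files);
        }
        return files;
    }
}
```

Called with the chukwa-pig.evt path, this would return the part-m-00000 file inside it rather than the directory itself.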

        billgraham Bill Graham added a comment -

        I've tested the loader on my cluster and it's working well. I don't have any use cases where the storer is used in my cluster though, so I can't easily test that. The pig unit tests pass though. I still need to update the ChukwaArchiver class as well and then I can submit a patch, hopefully in the next day or two.

        asrabkin Ari Rabkin added a comment -

        Have you tested the new patch on your cluster? Is it committable yet?

        eyang Eric Yang added a comment -

        It's best to stay current with the dependency library. Yes, please remove the stale code as part of your patch. Thanks.

        billgraham Bill Graham added a comment -

        Changing title. I've written both the loader and the storer classes.

        It will be tricky to deprecate ChukwaStorage, because it will require a different Pig version < 0.7 to compile, but the other classes will require 0.7. Should we remove this class instead?

        eyang Eric Yang added a comment -

        +1 on this refactor.


          People

          • Assignee:
            billgraham Bill Graham
            Reporter:
            billgraham Bill Graham