PIG-2803

Include Wonderdog (ElasticSearch Integration) in contrib/

    Details

    • Release Note:
      Contrib is where code goes to die.

      Description

      I propose adding Wonderdog to Pig's contrib/.

      Wonderdog is an Apache 2.0 licensed project that adds Hadoop and Pig integration for ElasticSearch. It lets you index any Pig relation with a single UDF call. Both writing searchable indexes and loading data based on search queries are supported.

      More information on Wonderdog is available at https://github.com/infochimps-labs/wonderdog and a great introduction to ElasticSearch is available at http://www.elasticsearchtutorial.com/elasticsearch-in-5-minutes.html

      Wonderdog broke in Pig 0.10.0 and was patched to work again in https://github.com/infochimps-labs/wonderdog/pull/9. Even so, Pig creates schema files when storing and loading JSON, and these must be removed manually before Wonderdog will work.
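
      A minimal sketch of that manual cleanup step, assuming the files in question are the `.pig_schema` and `.pig_header` sidecar files that Pig's JsonStorage writes next to its output, and that the output directory is on a local filesystem (output on HDFS would need an equivalent `hadoop fs -rm`). The helper name `remove_pig_metadata` is hypothetical and is not part of Pig or Wonderdog.

```python
import os
import tempfile

# Assumption: these are the sidecar files Pig's JsonStorage leaves in an output directory.
PIG_METADATA_FILES = (".pig_schema", ".pig_header")


def remove_pig_metadata(output_dir):
    """Delete Pig schema/header sidecar files from a local output directory.

    Returns the sorted list of file names that were actually removed.
    Data files (for example part-m-00000) are left untouched.
    """
    removed = []
    for name in PIG_METADATA_FILES:
        path = os.path.join(output_dir, name)
        if os.path.isfile(path):
            os.remove(path)
            removed.append(name)
    return sorted(removed)


if __name__ == "__main__":
    # Demonstration against a throwaway directory standing in for Pig's output.
    demo_dir = tempfile.mkdtemp()
    for name in (".pig_schema", "part-m-00000"):
        open(os.path.join(demo_dir, name), "w").close()
    print(remove_pig_metadata(demo_dir))
```

      Running the cleanup between the Pig job that writes JSON and the step that hands the data to Wonderdog keeps the workaround in one place until the underlying issue is resolved.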

      Going forward, I would like the Pig project to maintain Wonderdog in contrib/ and verify that it works with each new release. Wonderdog is an incredibly useful library whose license is compatible with Pig's. Together with ElasticSearch, it lets any user index their Pig relations and load subsets of data by pushing search queries down to ElasticSearch.

      I use Wonderdog in production and in my book, so I volunteer to do the maintenance on contrib/wonderdog.

        Activity

        Philip (flip) Kromer added a comment -

        Infochimps (current maintainers of wonderdog) would be very enthusiastic to transfer this to contrib/. As far as I know the version compatibility issues have been addressed.

        Russell Jurney added a comment -

        For an example of how to use Pig 0.11 with Elasticsearch/Wonderdog, see: https://github.com/rjurney/Agile_Data_Code/tree/master/ch03 https://github.com/rjurney/Agile_Data_Code/tree/master/ch07

        Russell Jurney added a comment -

        Wonderdog works with Pig 0.10, 0.10.1 and 0.11, and with the latest ElasticSearch, 0.20.2 last I checked.

        It just isn't a part of Pig - but a separate package, and should stay that way imo.

        Alex McLintock added a comment -

        Russ, where would be a good place to discuss this? Is it time to use a different tool for transferring data from Hadoop to ElasticSearch? Is it something that is fixable with some decent Java dev time?

        Alex McLintock added a comment -

        Does this ticket mean that Wonderdog is not currently compatible with the 0.10.0 release?

        Alan Gates added a comment -

        I think this should be discussed as a proposal on the dev list. I'll start the discussion.

        Russell Jurney added a comment -

        See https://github.com/infochimps-labs/wonderdog/issues/10

          People

          • Assignee: Russell Jurney
          • Reporter: Russell Jurney
          • Votes: 0
          • Watchers: 6
