Pig / PIG-2562

Apache Pig does not work on Amazon's Elastic MapReduce

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.9.1, 0.9.2, 0.10.0, 0.11
    • Fix Version/s: 0.10.0
    • Component/s: None
    • Labels:
    • Environment:

      Amazon Elastic MapReduce

      Description

      See https://forums.aws.amazon.com/thread.jspa?messageID=323063

      According to this thread, S3 works with Pig on EMR only when Amazon's proprietary hadoop-core.jar is used. Stock Apache Pig does not work.

      Example:

      Apache Pig branch-0.9 as of today:

      hadoop@ip-10-195-159-114:~$ pig/bin/pig
      grunt> cd s3://elasticmapreduce/samples/pig-apache/input/
      2012-02-29 05:45:22,282 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2999: Unexpected internal error. This file system object (hdfs://10.195.159.114:9000) does not support access to the request path 's3://elasticmapreduce/samples/pig-apache/input' You possibly called FileSystem.get(conf) when you should have called FileSystem.get(uri, conf) to obtain a file system supporting your path.
      Details at logfile: /home/hadoop/pig_1330494091268.log
      grunt> quit

      EMR's Pig as of today:
      hadoop@ip-10-195-159-114:~$ pig
      2012-02-29 05:45:35,626 [main] INFO org.apache.pig.Main - Logging error messages to: /home/hadoop/pig_1330494335621.log
      2012-02-29 05:45:35,841 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://10.195.159.114:9000
      2012-02-29 05:45:36,200 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: 10.195.159.114:9001
      grunt> cd s3://elasticmapreduce/samples/pig-apache/input/
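      The error in the Apache Pig session above distinguishes two lookups: FileSystem.get(conf), which returns the cluster's default file system regardless of the path, and FileSystem.get(uri, conf), which picks the file system from the path's own URI. The sketch below is a small Python model of that difference, not Hadoop or Pig code. The function names and the DEFAULT_FS constant are hypothetical and exist only for illustration; the host and bucket values are copied from the session above.

```python
from urllib.parse import urlparse

# Hypothetical stand-in for the cluster configuration: the default
# file system is the HDFS namenode shown in the session above.
DEFAULT_FS = "hdfs://10.195.159.114:9000"


def default_filesystem():
    """Model of a FileSystem.get(conf)-style lookup: ignores the path."""
    return DEFAULT_FS


def filesystem_for(path):
    """Model of a FileSystem.get(uri, conf)-style lookup.

    Uses the scheme and authority of the path itself, falling back to
    the default file system only when the path has no scheme.
    """
    parsed = urlparse(path)
    if not parsed.scheme:
        return DEFAULT_FS
    return f"{parsed.scheme}://{parsed.netloc}"


def check_path(fs, path):
    """Return `path` unchanged, or raise if it names a different file system than `fs`."""
    parsed = urlparse(path)
    if parsed.scheme and f"{parsed.scheme}://{parsed.netloc}" != fs:
        raise ValueError(
            f"This file system object ({fs}) does not support access "
            f"to the request path '{path}'"
        )
    return path
```

      In this model, check_path(filesystem_for(p), p) succeeds for an s3:// path, while check_path(default_filesystem(), p) raises, which mirrors the ERROR 2999 message reported by grunt.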

        Activity

        Russell Jurney created issue -
        Daniel Dai added a comment -

        How about load/store?
        Russell Jurney added a comment -

        Nothing with S3 works.
        Dmitriy V. Ryaboy added a comment -

        According to the Hortonworks blog post, 0.10 should work with EMR. Can you verify that the 0.10 release still has this problem, and post instructions to reproduce?
        Daniel Dai added a comment -

        This should be solved in 0.10.0. If you see any further issues, please open a new Jira with more specific information.
        Daniel Dai made changes -
        Field            Original Value    New Value
        Status           Open [ 1 ]        Resolved [ 5 ]
        Fix Version/s                      0.10.0 [ 12316246 ]
        Resolution                         Fixed [ 1 ]
        Russell Jurney added a comment -

        I think this should be verified before we close it.
        Daniel Dai added a comment -

        Pig mostly works with S3. The case here is that the home directory cannot be on S3, which should be a minor feature. We should open a new Jira with a more specific title to track it.
        Russell Jurney added a comment -

        Pig 0.9 did not work with S3 at all. Somebody needs to test this in 0.10 before it is closed.
        Russell Jurney made changes -
        Field            Original Value    New Value
        Resolution       Fixed [ 1 ]
        Status           Resolved [ 5 ]    Reopened [ 4 ]
        Daniel Dai added a comment -

        Yes, we committed several S3 patches to 0.10. I tested that S3 works for script files, jars, macros, parameter files, and script UDF files in 0.10.0.
        dan young added a comment -

        Is this still a known issue?
        Russell Jurney made changes -
        Field            Original Value    New Value
        Status           Reopened [ 4 ]    Resolved [ 5 ]
        Resolution                         Fixed [ 1 ]

          People

          • Assignee: Unassigned
          • Reporter: Russell Jurney
          • Votes: 0
          • Watchers: 2
