Uploaded image for project: 'Pig'
  1. Pig
  2. PIG-2872

StoreFuncInterface.setStoreLocation get's a copy of a Configuration object

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.11
    • Fix Version/s: None
    • Component/s: impl
    • Labels:
      None
    • Environment:

      Pig trunk, Hadoop 0.20.205 with Kerberos, ElasticSearch trunk, Wonderdog trunk

      Description

      When an implementation of StoreFuncInterface.setStoreLocation is called from JobControlCompiler.getJob, it is passed a copy of the Configuration that will be used for the Job that will be submitted:

      JobControlCompiler.java
      sFunc.setStoreLocation(st.getSFile().getFileName(), new org.apache.hadoop.mapreduce.Job(nwJob.getConfiguration()));
      

      When a new org.apache.hadoop.mapreduce.Job is created it creates a copy of the Configuration object, as far as I know. Thus anything added to the Configuration object in the implementation of setStoreLocation will not be included in the Configuration of nwJob in JobControlCompiler.getJob.

      I notice this goes wrong in Wonderdog, which needs to include the Elasticsearch configuration file in the DistributedCache. It is added to mapred.cache.files through setStoreLocation, but this setting doesn't make it back into the Job returned by JobControlCompiler.getJob, and is therefore never localized.

      This might be intentional semantics within Pig, but I'm not familiar enough with StoreFuncs to know whether it is.

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                Unassigned
                Reporter:
                evertlammerts Evert Lammerts
              • Votes:
                0 Vote for this issue
                Watchers:
                7 Start watching this issue

                Dates

                • Created:
                  Updated: