PIG-2912

Pig should clone JobConf while creating JobContextImpl and TaskAttemptContextImpl in Hadoop23

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.9.3, 0.10.1
    • Fix Version/s: 0.9.3, 0.11, 0.10.1
    • Component/s: None
    • Labels:
      None

      Description

      There is a change in the semantics of
      JobContext::JobContext(Configuration, JobID). In .20 the Configuration was
      cloned; in .23 it is adopted as-is (if it's a JobConf). This causes the same
      Configuration instance to be written to for different tables in the same job.

      This affects multi-store commands in Pig on Hadoop 23/2.0. The
      cloning in HadoopShims was part of PIG-2578, but that change was reverted due to other issues.
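
      The effect can be illustrated with a minimal sketch. It deliberately avoids the real Hadoop classes so that it runs without Hadoop on the classpath: `FakeConf`, `AdoptingContext`, and `CloningContext` are hypothetical stand-ins for JobConf and the .23/.20 context constructors, and the property name is made up.

```java
import java.util.HashMap;
import java.util.Map;

public class SharedConfSketch {
    /** Hypothetical stand-in for a Hadoop JobConf: a mutable key/value map. */
    static class FakeConf {
        private final Map<String, String> props = new HashMap<>();
        FakeConf() {}
        FakeConf(FakeConf other) { props.putAll(other.props); } // copy constructor
        void set(String key, String value) { props.put(key, value); }
        String get(String key) { return props.get(key); }
    }

    /** Models the .23 behaviour: the passed-in configuration is adopted. */
    static class AdoptingContext {
        final FakeConf conf;
        AdoptingContext(FakeConf conf) { this.conf = conf; }
    }

    /** Models the .20 behaviour: the passed-in configuration is cloned. */
    static class CloningContext {
        final FakeConf conf;
        CloningContext(FakeConf conf) { this.conf = new FakeConf(conf); }
    }

    /** What store B reads after store A writes, when both contexts adopt the same conf. */
    public static String seenByOtherStoreWhenAdopted() {
        FakeConf jobConf = new FakeConf();
        AdoptingContext storeA = new AdoptingContext(jobConf);
        AdoptingContext storeB = new AdoptingContext(jobConf);
        storeA.conf.set("output.table", "A");
        return storeB.conf.get("output.table");
    }

    /** The same scenario when each context works on its own copy. */
    public static String seenByOtherStoreWhenCloned() {
        FakeConf jobConf = new FakeConf();
        CloningContext storeA = new CloningContext(jobConf);
        CloningContext storeB = new CloningContext(jobConf);
        storeA.conf.set("output.table", "A");
        return storeB.conf.get("output.table");
    }

    public static void main(String[] args) {
        System.out.println("adopted: store B sees " + seenByOtherStoreWhenAdopted());
        System.out.println("cloned:  store B sees " + seenByOtherStoreWhenCloned());
    }
}
```

      With the adopting variant, store B observes the value written by store A; with the cloning variant it does not.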

      1. PIG-2912-branch09.patch
        7 kB
        Rohini Palaniswamy
      2. PIG-2912-branch10.patch
        7 kB
        Rohini Palaniswamy
      3. PIG-2912-trunk.patch
        7 kB
        Rohini Palaniswamy

        Activity

        Rohini Palaniswamy added a comment -

        The patch creates a clone if a JobConf is passed.

        The test case added to TestMultiQueryLocal ensures that settings added in the backend by one store are not passed to other stores in a multi-store script.
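
        A rough sketch of the idea, not the actual patch: `createJobContext` below is a hypothetical helper standing in for the shim method, and `FakeConf`, `FakeJobConf`, and `FakeJobContext` are stand-ins for the Hadoop classes so the example runs on its own. The property names are invented for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class ShimCloneSketch {
    /** Hypothetical stand-in for Hadoop's Configuration. */
    static class FakeConf {
        private final Map<String, String> props = new HashMap<>();
        FakeConf() {}
        FakeConf(FakeConf other) { props.putAll(other.props); } // copy constructor
        void set(String key, String value) { props.put(key, value); }
        String get(String key) { return props.get(key); }
    }

    /** Hypothetical stand-in for JobConf. */
    static class FakeJobConf extends FakeConf {
        FakeJobConf() {}
        FakeJobConf(FakeConf other) { super(other); }
    }

    /** Mimics the .23 constructor: a JobConf is adopted, anything else is wrapped in a new one. */
    static class FakeJobContext {
        final FakeJobConf conf;
        FakeJobContext(FakeConf conf) {
            this.conf = (conf instanceof FakeJobConf) ? (FakeJobConf) conf : new FakeJobConf(conf);
        }
    }

    /** Hypothetical shim method: copy a JobConf first so the context gets its own instance. */
    static FakeJobContext createJobContext(FakeConf conf) {
        FakeConf toUse = (conf instanceof FakeJobConf) ? new FakeJobConf(conf) : conf;
        return new FakeJobContext(toUse);
    }

    /** Mirrors the idea of the test: a setting added by one store must not reach another. */
    public static boolean storesAreIsolated() {
        FakeJobConf jobConf = new FakeJobConf();
        jobConf.set("shared.setting", "from-frontend");
        FakeJobContext storeA = createJobContext(jobConf);
        FakeJobContext storeB = createJobContext(jobConf);
        storeA.conf.set("store.a.only", "x");
        return storeB.conf.get("store.a.only") == null
                && jobConf.get("store.a.only") == null
                && "from-frontend".equals(storeB.conf.get("shared.setting"));
    }

    public static void main(String[] args) {
        System.out.println("stores isolated: " + storesAreIsolated());
    }
}
```

        Copying only when a JobConf is passed matches the .23 constructor's behaviour, which already builds a new JobConf for any other Configuration.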

        Dmitriy V. Ryaboy added a comment -

        Could you create a test in which the storage or loader function uses UDFContext to ship information around? I don't think the current test probes the issue that caused us to roll back PIG-2578.

        Rohini Palaniswamy added a comment -

        Dmitriy,
        This patch mainly addresses a behaviour change between Hadoop 20 and 23 in how a new JobContext is instantiated. It handles the change in HadoopShims so that JobContext objects in the backend do not get overwritten when there are multiple stores. The main problem with PIG-2578 was in JobControlCompiler, and it changed frontend behaviour. I will create a separate JIRA to add a test case for PIG-2578.

        Hadoop 20:

        JobContext(JobConf conf, org.apache.hadoop.mapreduce.JobID jobId,
                   Progressable progress) {
          super(conf, jobId); // gets cloned
          this.job = conf;
          this.progress = progress;
        }
        

        Hadoop 23:

        public JobContextImpl(Configuration conf, JobID jobId) {
          if (conf instanceof JobConf) {
            this.conf = (JobConf) conf; // gets assigned
          } else {
            this.conf = new JobConf(conf);
          }
          // ...
        }
        
        Daniel Dai added a comment -

        +1. We should keep the same behavior between 20 and 23. Putting the abstraction in the shims is the right approach. Will commit soon.

        Daniel Dai added a comment -

        Patch committed to 0.9/0.10/trunk. Thanks Rohini!


          People

          • Assignee:
            Rohini Palaniswamy
            Reporter:
            Rohini Palaniswamy
