Pig
  1. Pig
  2. PIG-236

properties specified on the command line are ignored

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.0.0
    • Fix Version/s: 0.1.0
    • Component/s: None
    • Labels:
      None

      Description

      Looks like code in src/org/apache/pig/impl/util/PropertiesUtil.java does not take system properties into account.

      1. pig-236.patch
        4 kB
        Pradeep Kamath

        Activity

        Olga Natkovich created issue -
        Olga Natkovich made changes -
        Field Original Value New Value
        Assignee Pradeep Kamath [ pkamath ]
        Pradeep Kamath made changes -
        Status Open [ 1 ] In Progress [ 3 ]
        Hide
        Pradeep Kamath added a comment - - edited

        Currently the code is not looking at the "System properties" which include the command line "-D" overrides and hence the issue reported in this bug. Here is the plan to address this:

        First the pig.properties (and currently .pigrc - from comments in the code, there seems to be a plan to not support .pigrc in the future) file(s) will be read into the "properties" object. Then the "System properties" (from System.getProperties()) will be added into the properties object overriding any existing key/values.

        A repercussion of this is that any default java or system properties (like java.io.tmpdir) that a user wants to override should be supplied in the commandline and NOT in pig.properties or .pigrc

        Show
        Pradeep Kamath added a comment - - edited Currently the code is not looking at the "System properties" which include the command line "-D" overrides and hence the issue reported in this bug. Here is the plan to address this: First the pig.properties (and currently .pigrc - from comments in the code, there seems to be a plan to not support .pigrc in the future) file(s) will be read into the "properties" object. Then the "System properties" (from System.getProperties()) will be added into the properties object overriding any existing key/values. A repercussion of this is that any default java or system properties (like java.io.tmpdir) that a user wants to override should be supplied in the commandline and NOT in pig.properties or .pigrc
        Hide
        Pi Song added a comment -

        +1
        This will increase usability

        Show
        Pi Song added a comment - +1 This will increase usability
        Hide
        Pradeep Kamath added a comment -

        In the code, there were several changes made:
        1) In Main the earlier code was setting JOBTRACKER_LOCATION prematurely to "local" in the properties - this has been removed since this should be determined from hadoop-site.xml and then overriden by pig.properties - see the 3) below

        2) Changed PropertiesUtil.java to also include properties from System including command line overrides

        3) In HEexecutionEngine.java, for the non HOD case, the earlier code was only reading "properties" object (built from pig.properties/.pigrc and System.getProperties() which includes command line overrides). Now it first reads hadoop-site.xml and then adds on (over-writing existing keys) properties from the "properties" object. Towards this also added a utility function in ConfigurationUtil.java

        Show
        Pradeep Kamath added a comment - In the code, there were several changes made: 1) In Main the earlier code was setting JOBTRACKER_LOCATION prematurely to "local" in the properties - this has been removed since this should be determined from hadoop-site.xml and then overriden by pig.properties - see the 3) below 2) Changed PropertiesUtil.java to also include properties from System including command line overrides 3) In HEexecutionEngine.java, for the non HOD case, the earlier code was only reading "properties" object (built from pig.properties/.pigrc and System.getProperties() which includes command line overrides). Now it first reads hadoop-site.xml and then adds on (over-writing existing keys) properties from the "properties" object. Towards this also added a utility function in ConfigurationUtil.java
        Pradeep Kamath made changes -
        Affects Version/s 0.0.0 [ 12312847 ]
        Status In Progress [ 3 ] Patch Available [ 10002 ]
        Fix Version/s 0.1.0 [ 12312848 ]
        Hide
        Pradeep Kamath added a comment -

        Patch file - details in comments section

        Show
        Pradeep Kamath added a comment - Patch file - details in comments section
        Pradeep Kamath made changes -
        Attachment pig-236.patch [ 12381810 ]
        Hide
        Pi Song added a comment -

        Overall patch looks good except that I don't understand why you look up hadoop-site.xml in this location.

        configuration.addResource("/hadoop-site.xml");
        
        Show
        Pi Song added a comment - Overall patch looks good except that I don't understand why you look up hadoop-site.xml in this location. configuration.addResource("/hadoop-site.xml");
        Hide
        Pradeep Kamath added a comment -

        Hadoop looks for the named file in configuration.AddResource() in the classpath. So by specifying "/hadoop-site.xml" as the argument, we are letting the configuration.AddResource() pick it up from the first location in the classpath where such a file is present.

        Show
        Pradeep Kamath added a comment - Hadoop looks for the named file in configuration.AddResource() in the classpath. So by specifying "/hadoop-site.xml" as the argument, we are letting the configuration.AddResource() pick it up from the first location in the classpath where such a file is present.
        Hide
        Pradeep Kamath added a comment -

        Using "hadoop-site.xml" in configuration.addResource(); in HExecutionEngine.init(Properties) instead of "/hadoop-site.xml"

        Show
        Pradeep Kamath added a comment - Using "hadoop-site.xml" in configuration.addResource(); in HExecutionEngine.init(Properties) instead of "/hadoop-site.xml"
        Pradeep Kamath made changes -
        Attachment pig-236.patch [ 12381831 ]
        Pradeep Kamath made changes -
        Attachment pig-236.patch [ 12381810 ]
        Hide
        Pradeep Kamath added a comment -

        "/hadoop-site.xml" was confusing - so I changed it to "hadoop-site.xml" as indicated in previous comment and uploaded a new patch with this change.

        Show
        Pradeep Kamath added a comment - "/hadoop-site.xml" was confusing - so I changed it to "hadoop-site.xml" as indicated in previous comment and uploaded a new patch with this change.
        Hide
        Pi Song added a comment -

        +1

        We have two semantics here.
        1) Some configuration files have to be under

        {pig.home}

        /conf
        2) Some configuration files have to be under classpath.

        Possibly we should also fix the shell script to add conf in the classpath as well so that we can say "put everything under conf"

        Show
        Pi Song added a comment - +1 We have two semantics here. 1) Some configuration files have to be under {pig.home} /conf 2) Some configuration files have to be under classpath. Possibly we should also fix the shell script to add conf in the classpath as well so that we can say "put everything under conf"
        Hide
        Pradeep Kamath added a comment -

        The Configuration constructor by default handles hadoop-default.xml and hadoop-site.xml.

        
          public Configuration() {
            if (LOG.isDebugEnabled()) {
              LOG.debug(StringUtils.stringifyException(new IOException("config()")));
            }
            resources.add("hadoop-default.xml");
            resources.add("hadoop-site.xml");
          }
        
        

        Attaching a new patch to make use of this. The logic is the same as before, only with fewer changes to use the above fact.

        Show
        Pradeep Kamath added a comment - The Configuration constructor by default handles hadoop-default.xml and hadoop-site.xml. public Configuration() { if (LOG.isDebugEnabled()) { LOG.debug(StringUtils.stringifyException( new IOException( "config()" ))); } resources.add( "hadoop- default .xml" ); resources.add( "hadoop-site.xml" ); } Attaching a new patch to make use of this. The logic is the same as before, only with fewer changes to use the above fact.
        Pradeep Kamath made changes -
        Attachment pig-236.patch [ 12381843 ]
        Pradeep Kamath made changes -
        Attachment pig-236.patch [ 12381831 ]
        Hide
        Amir Youssefi added a comment -

        I thinks it's time we have a Wiki page for different configuration files and how they override each other:

        http://wiki.apache.org/pig/

        or

        http://wiki.apache.org/pig/FAQ

        Show
        Amir Youssefi added a comment - I thinks it's time we have a Wiki page for different configuration files and how they override each other: http://wiki.apache.org/pig/ or http://wiki.apache.org/pig/FAQ
        Hide
        Olga Natkovich added a comment -

        I don't think we need to pick up config from hadoop-default. That suppose to be used by the backend only.

        Show
        Olga Natkovich added a comment - I don't think we need to pick up config from hadoop-default. That suppose to be used by the backend only.
        Hide
        Pi Song added a comment -

        Like Pradeep mentioned, trying to read both "hadoop-default.xml" and "hadoop-site.xml" from classpath is a default behavior of Hadoop Configuration so we just have to live with it.

        Show
        Pi Song added a comment - Like Pradeep mentioned, trying to read both "hadoop-default.xml" and "hadoop-site.xml" from classpath is a default behavior of Hadoop Configuration so we just have to live with it.
        Hide
        Alan Gates added a comment -

        These changes should be tested against a static cluster, against a cluster that uses HOD, and against a cluster that has been allocated with HOD. Pradeep, have you tested all three of those scenarios?

        Show
        Alan Gates added a comment - These changes should be tested against a static cluster, against a cluster that uses HOD, and against a cluster that has been allocated with HOD. Pradeep, have you tested all three of those scenarios?
        Hide
        Olga Natkovich added a comment -

        yes, pradeep and I tested all 3 of these cases together.

        Show
        Olga Natkovich added a comment - yes, pradeep and I tested all 3 of these cases together.
        Hide
        Alan Gates added a comment -

        Testing found that the fix did not work in the case where HOD was being used to allocate clusters. Pradeep will take a look at fixing this.

        Show
        Alan Gates added a comment - Testing found that the fix did not work in the case where HOD was being used to allocate clusters. Pradeep will take a look at fixing this.
        Alan Gates made changes -
        Status Patch Available [ 10002 ] Open [ 1 ]
        Hide
        Pradeep Kamath added a comment -

        The issue with the earlier patch was that the following code was moved up only into the non-HOD case in the "if-else"

        configuration = ConfigurationUtil.toConfiguration(properties);
        

        Now this call is made again after the if-else in the snippet below - the comment
        explains the code in the snippet:

        
                ds = new HDataStorage(properties);
                        
                // The above HDataStorage constructor sets DEFAULT_REPLICATION_FACTOR_KEY in properties.
                // So we need to reconstruct the configuration object for the non HOD case
                // In the HOD case, this is the first time the configuration object will be created
                configuration = ConfigurationUtil.toConfiguration(properties);	
        
        
        Show
        Pradeep Kamath added a comment - The issue with the earlier patch was that the following code was moved up only into the non-HOD case in the "if-else" configuration = ConfigurationUtil.toConfiguration(properties); Now this call is made again after the if-else in the snippet below - the comment explains the code in the snippet: ds = new HDataStorage(properties); // The above HDataStorage constructor sets DEFAULT_REPLICATION_FACTOR_KEY in properties. // So we need to reconstruct the configuration object for the non HOD case // In the HOD case , this is the first time the configuration object will be created configuration = ConfigurationUtil.toConfiguration(properties);
        Pradeep Kamath made changes -
        Attachment pig-236.patch [ 12381913 ]
        Pradeep Kamath made changes -
        Attachment pig-236.patch [ 12381843 ]
        Pradeep Kamath made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Hide
        Alan Gates added a comment -

        Fix checked in at revision 655940. Tested this patch against static cluster, cluster using hod, and cluster allocated via hod.

        Show
        Alan Gates added a comment - Fix checked in at revision 655940. Tested this patch against static cluster, cluster using hod, and cluster allocated via hod.
        Alan Gates made changes -
        Resolution Fixed [ 1 ]
        Status Patch Available [ 10002 ] Resolved [ 5 ]
        Alan Gates made changes -
        Status Resolved [ 5 ] Closed [ 6 ]
        Transition Time In Source Status Execution Times Last Executer Last Execution Date
        Open Open In Progress In Progress
        6m 21s 1 Pradeep Kamath 09/May/08 01:21
        In Progress In Progress Patch Available Patch Available
        1d 27m 1 Pradeep Kamath 10/May/08 01:49
        Patch Available Patch Available Open Open
        2d 16h 21m 1 Alan Gates 12/May/08 18:11
        Open Open Patch Available Patch Available
        10h 54m 1 Pradeep Kamath 13/May/08 05:05
        Patch Available Patch Available Resolved Resolved
        12h 30m 1 Alan Gates 13/May/08 17:36
        Resolved Resolved Closed Closed
        680d 5h 24m 1 Alan Gates 24/Mar/10 22:01

          People

          • Assignee:
            Pradeep Kamath
            Reporter:
            Olga Natkovich
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development