Uploaded image for project: 'Pig'
  1. Pig
  2. PIG-2425

Aggregate Warning does not work as expected on Embedding Pig in Java 0.9.1

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.9.1
    • Fix Version/s: 0.10.0
    • Component/s: None
    • Labels:
    • Patch Info:
      Patch Available

      Description

      Property "aggregate.warning" is not being set by default when running PigServer, embedding Pig in Java.

      I was initially creating a PigServer object this way:

        PigServer pigServer = new PigServer(ExecType.MAPREDUCE);
      

      But this generated detailed logs in the log directory. To code around this on the client-side you could do

      Properties properties = PropertiesUtil.loadDefaultProperties();
      properties.setProperty("aggregate.warning", "true");
      PigServer pigServer = new PigServer(ExecType.MAPREDUCE, properties);
      

      The behavior between Pig scripting and Embedded Pig should be similar. Looking at
      the main constructor of PigServer, it looks like "aggregateWarning" is set
      to false if its not loaded in to Properties object.

      public PigServer(PigContext context, boolean connect) throws ExecException {
               this.pigContext = context;
               currDAG = new Graph(false);
      
               aggregateWarning =
      "true".equalsIgnoreCase(pigContext.getProperties().getProperty("aggregate.warning"));
               isMultiQuery =
      "true".equalsIgnoreCase(pigContext.getProperties().getProperty("opt.multiquery","true"));
      
               jobName = pigContext.getProperties().getProperty(
                       PigContext.JOB_NAME,
                       PigContext.JOB_NAME_PREFIX + ":DefaultJobName");
      
               if (connect) {
                   pigContext.connect();
               }
      
               addJarsFromProperties();
           }
      

      I suggest adding "aggregate.warning" to Properties object of PigContext so its picked up across all users of this property (MapReduceLauncher)

      public PigServer(PigContext context, boolean connect) throws ExecException {
              this.pigContext = context;
              currDAG = new Graph(false);
      
              aggregateWarning = "true".equalsIgnoreCase(pigContext.getProperties().getProperty("aggregate.warning", "true"));
              if(aggregateWarning) {
              	pigContext.getProperties().setProperty("aggregate.warning", "true");
              }
              	
              isMultiQuery = "true".equalsIgnoreCase(pigContext.getProperties().getProperty("opt.multiquery","true"));
              
              jobName = pigContext.getProperties().getProperty(
                      PigContext.JOB_NAME,
                      PigContext.JOB_NAME_PREFIX + ":DefaultJobName");
      
              if (connect) {
                  pigContext.connect();
              }
      
              addJarsFromProperties();
          }
      
      1. aggregateWarning.patch
        0.8 kB
        Prashant Kommireddi
      2. aggregateWarning2425.txt
        5 kB
        Prashant Kommireddi

        Activity

        Hide
        thejas Thejas M Nair added a comment -

        To ensure that the default properties are same when commandline or PigServer is used, it would be better to have same function set it from Main.run and PigServer constructor.

        The following section in Main can be moved to a function in PropertiesUtil and get called from PropertiesUtil.loadDefaultProperties. We also need a test case. Can you add one unit test on the lines of testPigProperties in TestPigServer.java ?

          if (properties.getProperty("aggregate.warning") == null) {
                    //by default warning aggregation is on
                    properties.setProperty("aggregate.warning", ""+true);
                }
        
                if (properties.getProperty("opt.multiquery") == null) {
                    //by default multiquery optimization is on
                    properties.setProperty("opt.multiquery", ""+true);
                }
        
                if (properties.getProperty("stop.on.failure") == null) {
                    //by default we keep going on error on the backend
                    properties.setProperty("stop.on.failure", ""+false);
                }
        
        
        Show
        thejas Thejas M Nair added a comment - To ensure that the default properties are same when commandline or PigServer is used, it would be better to have same function set it from Main.run and PigServer constructor. The following section in Main can be moved to a function in PropertiesUtil and get called from PropertiesUtil.loadDefaultProperties. We also need a test case. Can you add one unit test on the lines of testPigProperties in TestPigServer.java ? if (properties.getProperty( "aggregate.warning" ) == null ) { //by default warning aggregation is on properties.setProperty( "aggregate.warning" , ""+ true ); } if (properties.getProperty( "opt.multiquery" ) == null ) { //by default multiquery optimization is on properties.setProperty( "opt.multiquery" , ""+ true ); } if (properties.getProperty( "stop.on.failure" ) == null ) { //by default we keep going on error on the backend properties.setProperty( "stop.on.failure" , ""+ false ); }
        Hide
        prkommireddi Prashant Kommireddi added a comment -

        That makes sense. Thejas, about naming does "setDefaultsIfUnset(Properties properties)" sound ok? This would be a private method within PropertiesUtil and called by loadDefaultProperties(Properties properties).

         /**
             * Sets properties to their default values if not set by Client
             * @param properties
             */
            private static void setDefaultsIfUnset(Properties properties) {
            	if (properties.getProperty("aggregate.warning") == null) {
                    //by default warning aggregation is on
                    properties.setProperty("aggregate.warning", ""+true);
                }
        
                if (properties.getProperty("opt.multiquery") == null) {
                    //by default multiquery optimization is on
                    properties.setProperty("opt.multiquery", ""+true);
                }
        
                if (properties.getProperty("stop.on.failure") == null) {
                    //by default we keep going on error on the backend
                    properties.setProperty("stop.on.failure", ""+false);
                }
            }
        
        Show
        prkommireddi Prashant Kommireddi added a comment - That makes sense. Thejas, about naming does "setDefaultsIfUnset(Properties properties)" sound ok? This would be a private method within PropertiesUtil and called by loadDefaultProperties(Properties properties). /** * Sets properties to their default values if not set by Client * @param properties */ private static void setDefaultsIfUnset(Properties properties) { if (properties.getProperty( "aggregate.warning" ) == null ) { //by default warning aggregation is on properties.setProperty( "aggregate.warning" , ""+ true ); } if (properties.getProperty( "opt.multiquery" ) == null ) { //by default multiquery optimization is on properties.setProperty( "opt.multiquery" , ""+ true ); } if (properties.getProperty( "stop.on.failure" ) == null ) { //by default we keep going on error on the backend properties.setProperty( "stop.on.failure" , ""+ false ); } }
        Hide
        prkommireddi Prashant Kommireddi added a comment -

        I would be adding a test case to TestPigServer

        @Test
        	public void testDefaultPigProperties() throws Throwable {
        		File propertyFile = new File("pig.properties");
        
        		Properties properties = PropertiesUtil.loadDefaultProperties();
        		Assert
        		.assertTrue(properties.getProperty(
        				"pig.exec.reducers.max").equals("999"));
        		Assert.assertTrue(properties.getProperty("aggregate.warning").equals("true"));
        		Assert.assertTrue(properties.getProperty("opt.multiquery").equals("true"));
        		Assert.assertTrue(properties.getProperty("stop.on.failure").equals("false"));
        		
        		PrintWriter out = new PrintWriter(new FileWriter(propertyFile));
        		out.println("aggregate.warning=false");
        		out.println("opt.multiquery=false");
        		out.println("stop.on.failure=true");
        		
        		out.close();
        
        		properties = PropertiesUtil.loadDefaultProperties();
        		Assert.assertTrue(properties.getProperty("aggregate.warning")
        				.equals("false"));
        		Assert.assertTrue(properties.getProperty("opt.multiquery")
        				.equals("false"));
        		Assert.assertTrue(properties.getProperty("stop.on.failure")
        				.equals("true"));
        
        		propertyFile.delete();
        	}
        
        Show
        prkommireddi Prashant Kommireddi added a comment - I would be adding a test case to TestPigServer @Test public void testDefaultPigProperties() throws Throwable { File propertyFile = new File( "pig.properties" ); Properties properties = PropertiesUtil.loadDefaultProperties(); Assert .assertTrue(properties.getProperty( "pig.exec.reducers.max" ).equals( "999" )); Assert.assertTrue(properties.getProperty( "aggregate.warning" ).equals( " true " )); Assert.assertTrue(properties.getProperty( "opt.multiquery" ).equals( " true " )); Assert.assertTrue(properties.getProperty( "stop.on.failure" ).equals( " false " )); PrintWriter out = new PrintWriter( new FileWriter(propertyFile)); out.println( "aggregate.warning= false " ); out.println( "opt.multiquery= false " ); out.println( "stop.on.failure= true " ); out.close(); properties = PropertiesUtil.loadDefaultProperties(); Assert.assertTrue(properties.getProperty( "aggregate.warning" ) .equals( " false " )); Assert.assertTrue(properties.getProperty( "opt.multiquery" ) .equals( " false " )); Assert.assertTrue(properties.getProperty( "stop.on.failure" ) .equals( " true " )); propertyFile.delete(); }
        Hide
        prkommireddi Prashant Kommireddi added a comment -

        Slightly modified the test case to add testing with PigServer.

        @Test
        	public void testDefaultPigProperties() throws Throwable {
            	//Test with PigServer
            	PigServer pigServer = new PigServer(ExecType.MAPREDUCE);
            	Properties properties = pigServer.getPigContext().getProperties();
            	
            	Assert
        		.assertTrue(properties.getProperty(
        				"pig.exec.reducers.max").equals("999"));
        		Assert.assertTrue(properties.getProperty("aggregate.warning").equals("true"));
        		Assert.assertTrue(properties.getProperty("opt.multiquery").equals("true"));
        		Assert.assertTrue(properties.getProperty("stop.on.failure").equals("false"));
            	
        		//Test with properties file
        		File propertyFile = new File("pig.properties");
        
        		properties = PropertiesUtil.loadDefaultProperties();
        		
        		Assert
        		.assertTrue(properties.getProperty(
        				"pig.exec.reducers.max").equals("999"));
        		Assert.assertTrue(properties.getProperty("aggregate.warning").equals("true"));
        		Assert.assertTrue(properties.getProperty("opt.multiquery").equals("true"));
        		Assert.assertTrue(properties.getProperty("stop.on.failure").equals("false"));
        		
        		PrintWriter out = new PrintWriter(new FileWriter(propertyFile));
        		out.println("aggregate.warning=false");
        		out.println("opt.multiquery=false");
        		out.println("stop.on.failure=true");
        		
        		out.close();
        
        		properties = PropertiesUtil.loadDefaultProperties();
        		Assert.assertTrue(properties.getProperty("aggregate.warning")
        				.equals("false"));
        		Assert.assertTrue(properties.getProperty("opt.multiquery")
        				.equals("false"));
        		Assert.assertTrue(properties.getProperty("stop.on.failure")
        				.equals("true"));
        
        		propertyFile.delete();
        	}
        
        Show
        prkommireddi Prashant Kommireddi added a comment - Slightly modified the test case to add testing with PigServer. @Test public void testDefaultPigProperties() throws Throwable { //Test with PigServer PigServer pigServer = new PigServer(ExecType.MAPREDUCE); Properties properties = pigServer.getPigContext().getProperties(); Assert .assertTrue(properties.getProperty( "pig.exec.reducers.max" ).equals( "999" )); Assert.assertTrue(properties.getProperty( "aggregate.warning" ).equals( " true " )); Assert.assertTrue(properties.getProperty( "opt.multiquery" ).equals( " true " )); Assert.assertTrue(properties.getProperty( "stop.on.failure" ).equals( " false " )); //Test with properties file File propertyFile = new File( "pig.properties" ); properties = PropertiesUtil.loadDefaultProperties(); Assert .assertTrue(properties.getProperty( "pig.exec.reducers.max" ).equals( "999" )); Assert.assertTrue(properties.getProperty( "aggregate.warning" ).equals( " true " )); Assert.assertTrue(properties.getProperty( "opt.multiquery" ).equals( " true " )); Assert.assertTrue(properties.getProperty( "stop.on.failure" ).equals( " false " )); PrintWriter out = new PrintWriter( new FileWriter(propertyFile)); out.println( "aggregate.warning= false " ); out.println( "opt.multiquery= false " ); out.println( "stop.on.failure= true " ); out.close(); properties = PropertiesUtil.loadDefaultProperties(); Assert.assertTrue(properties.getProperty( "aggregate.warning" ) .equals( " false " )); Assert.assertTrue(properties.getProperty( "opt.multiquery" ) .equals( " false " )); Assert.assertTrue(properties.getProperty( "stop.on.failure" ) .equals( " true " )); propertyFile.delete(); }
        Hide
        thejas Thejas M Nair added a comment -

        +1. Patch committed to 0.10 branch and trunk.
        Prashant, Thanks for the contribution!

        Show
        thejas Thejas M Nair added a comment - +1. Patch committed to 0.10 branch and trunk. Prashant, Thanks for the contribution!

          People

          • Assignee:
            prkommireddi Prashant Kommireddi
            Reporter:
            prkommireddi Prashant Kommireddi
          • Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development