Pig / PIG-2650

Convenience mock Loader and Storer to simplify unit testing of Pig scripts

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.11, 0.10.1
    • Component/s: None
    • Labels:
      None
    • Patch Info:
      Patch Available

      Description

      A test would look as follows:

      PigServer pigServer = new PigServer(ExecType.LOCAL);
      TupleFactory tf = TupleFactory.getInstance();
      Data data = Storage.resetData(pigServer.getPigContext());
      data.set("foo", Arrays.asList(
          tf.newTuple("a"),
          tf.newTuple("b"),
          tf.newTuple("c")
          ));
      
      pigServer.registerQuery("A = LOAD 'foo' USING mock.Storage();");
      // some complex script to test
      pigServer.registerQuery("STORE A INTO 'bar' USING mock.Storage();");
      
      Iterator<Tuple> out = data.get("bar").iterator();
      assertEquals("a", out.next().get(0));
      assertEquals("b", out.next().get(0));
      assertEquals("c", out.next().get(0));
      
      1. PIG-2650.patch
        12 kB
        Julien Le Dem
      2. PIG-2650-a.patch
        12 kB
        Julien Le Dem
      3. PIG-2650-b.patch
        17 kB
        Julien Le Dem
      4. PIG-2650-c.patch
        19 kB
        Julien Le Dem

        Activity

        Julien Le Dem created issue -
        Julien Le Dem added a comment -

        attaching PIG-2650.patch

        Julien Le Dem made changes -
        Field Original Value New Value
        Attachment PIG-2650.patch [ 12522507 ]
        Julien Le Dem made changes -
        Patch Info Patch Available [ 10042 ]
        Gianmarco De Francisci Morales added a comment -

        Hi, nice idea!

        What do you think of also adding something to clean up the static hashmaps that you use as a "filesystem"?
        I think it would be good practice to start each test from a clean state if we don't want tests to interact (e.g. something actually failed but the result looks good because it comes from another execution).

        Julien Le Dem added a comment -

        PIG-2650-a.patch replaces the Loader and Storer with a single mock.Storage implementation that includes a mechanism to avoid side effects between tests.
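
One way such a mechanism could work is to scope the in-memory "filesystem" to a per-test key instead of sharing a single static map. The sketch below only illustrates that idea and is not the code in the patch: `MockStore`, `reset`, `store`, `load`, and the string test key are hypothetical names, and plain strings stand in for tuples. The description shows that `Storage.resetData` takes the `PigContext`; how the patch scopes data internally is not shown on this page.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory "filesystem" whose contents are scoped to a test key. */
final class MockStore {
    // One independent data set per test key, so tests cannot see each other's locations.
    private static final Map<String, Map<String, List<String>>> STORES = new HashMap<>();

    private MockStore() {}

    /** Discards anything stored under this key and starts from an empty data set. */
    static synchronized void reset(String testKey) {
        STORES.put(testKey, new HashMap<>());
    }

    /** What a storer would call: append rows to a location for the given test. */
    static synchronized void store(String testKey, String location, List<String> rows) {
        dataFor(testKey).computeIfAbsent(location, k -> new ArrayList<>()).addAll(rows);
    }

    /** What a loader would call: fail loudly if the location was never written. */
    static synchronized List<String> load(String testKey, String location) {
        List<String> rows = dataFor(testKey).get(location);
        if (rows == null) {
            throw new IllegalStateException("No data at location '" + location + "'");
        }
        return new ArrayList<>(rows);
    }

    private static Map<String, List<String>> dataFor(String testKey) {
        Map<String, List<String>> data = STORES.get(testKey);
        if (data == null) {
            throw new IllegalStateException("reset was never called for key '" + testKey + "'");
        }
        return data;
    }
}
```

With this shape, a stale result from an earlier run cannot make a failing test look green: after `reset`, reading a location that the current test never wrote throws instead of returning old rows.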

        Julien Le Dem made changes -
        Attachment PIG-2650-a.patch [ 12522593 ]
        Julien Le Dem made changes -
        Description
        Original Value:
        A test would look as follows:
        {code}
        TupleFactory tf = TupleFactory.getInstance();
        Loader.setData("foo", Arrays.asList(
            tf.newTuple("a"),
            tf.newTuple("b"),
            tf.newTuple("c")
            ));

        PigServer pigServer = new PigServer(ExecType.LOCAL);
        pigServer.registerQuery("A = LOAD 'foo' USING mock.Loader();");
        // some complex script to test
        pigServer.registerQuery("STORE A INTO 'bar' USING mock.Storer();");

        List<Tuple> data = Storer.getData("bar");
        assertEquals("a", data.get(0).get(0));
        assertEquals("b", data.get(1).get(0));
        assertEquals("c", data.get(2).get(0));
        {code}
        New Value:
        A test would look as follows:
        {code}
        PigServer pigServer = new PigServer(ExecType.LOCAL);
        TupleFactory tf = TupleFactory.getInstance();
        Data data = Storage.resetData(pigServer.getPigContext());
        data.set("foo", Arrays.asList(
            tf.newTuple("a"),
            tf.newTuple("b"),
            tf.newTuple("c")
            ));

        pigServer.registerQuery("A = LOAD 'foo' USING mock.Storage();");
        // some complex script to test
        pigServer.registerQuery("STORE A INTO 'bar' USING mock.Storage();");

        Iterator<Tuple> out = data.get("bar").iterator();
        assertEquals("a", out.next().get(0));
        assertEquals("b", out.next().get(0));
        assertEquals("c", out.next().get(0));
        {code}
        David Capwell added a comment -

        Could Data also have an overloaded method
        public void set(String location, Tuple... data)? That would make the tests a little cleaner.
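
A minimal sketch of how such an overload could sit next to a collection-based `set`, assuming the varargs version simply delegates. `DataSketch` is a hypothetical, simplified stand-in for `Data`, and plain strings stand in for `Tuple`; none of this is taken from the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the mock Data holder; String plays the role of Tuple. */
final class DataSketch {
    private final Map<String, List<String>> locations = new HashMap<>();

    /** Collection-based setter: the caller builds the collection itself. */
    public void set(String location, Collection<String> data) {
        locations.put(location, new ArrayList<>(data));
    }

    /** Convenience overload: varargs are wrapped and passed to the collection version. */
    public void set(String location, String... data) {
        set(location, Arrays.asList(data));
    }

    /** Returns the rows stored at the location, or null if nothing was set. */
    public List<String> get(String location) {
        return locations.get(location);
    }
}
```

A test can then write `data.set("foo", "a", "b", "c")` instead of wrapping the rows in `Arrays.asList(...)`. Passing a list still resolves to the collection version, because Java only considers varargs methods when no fixed-arity method applies.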

        Julien Le Dem added a comment -

        PIG-2650-b.patch adds some helper methods and support for setting and getting schemas.

        Julien Le Dem made changes -
        Attachment PIG-2650-b.patch [ 12522671 ]
        Gianmarco De Francisci Morales added a comment -

        Hi Julien,

        Looks great!
        Can you fix testMockSchema()?
        It fails both on the schema comparison and on the data comparison.

        Maybe use Utils.getSchemaFromString() instead of comparing Strings?
        The schema in the query is (a:chararray, b:chararray) so I would do:

            Assert.assertEquals(Utils.getSchemaFromString("a:chararray,b:chararray"), data.getSchema("bar"));
            
            Iterator<Tuple> out = data.get("bar").iterator();
            
            Assert.assertEquals(tuple("a", "a"), out.next());
            Assert.assertEquals(tuple("b", "b"), out.next());
            Assert.assertEquals(tuple("c", "c"), out.next());
        

        Apart from this small issue, +1
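
The reason for comparing parsed schemas rather than strings is that two spellings of the same schema can differ only in formatting. The sketch below illustrates this with a hypothetical mini-parser; `SchemaSketch.parse` is an assumption made for illustration and is not Pig's `Utils.getSchemaFromString`, which returns a real `Schema` object.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical mini-parser for "name:type" lists; it only illustrates the idea. */
final class SchemaSketch {
    private SchemaSketch() {}

    /** Turns "(a:chararray, b:chararray)" into the normalized list [a:chararray, b:chararray]. */
    static List<String> parse(String schema) {
        String trimmed = schema.trim();
        // Accept an optional pair of surrounding parentheses.
        if (trimmed.startsWith("(") && trimmed.endsWith(")")) {
            trimmed = trimmed.substring(1, trimmed.length() - 1);
        }
        List<String> fields = new ArrayList<>();
        for (String field : trimmed.split(",")) {
            String[] parts = field.split(":");
            if (parts.length != 2) {
                throw new IllegalArgumentException("Expected name:type but got '" + field + "'");
            }
            // Whitespace around names and types carries no meaning, so drop it.
            fields.add(parts[0].trim() + ":" + parts[1].trim());
        }
        return fields;
    }
}
```

Here `parse("(a:chararray, b:chararray)")` and `parse("a:chararray,b:chararray")` compare equal even though the raw strings do not, which is the kind of spurious mismatch a string comparison in a test can trip over.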

        Julien Le Dem added a comment -

        PIG-2650-c.patch fixes the test.

        Julien Le Dem made changes -
        Attachment PIG-2650-c.patch [ 12523520 ]
        Gianmarco De Francisci Morales added a comment -

        Tests pass, looks good. +1
        Thanks Julien!

        Julien Le Dem made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Fix Version/s 0.11 [ 12318878 ]
        Resolution Fixed [ 1 ]
        Julien Le Dem made changes -
        Fix Version/s 0.10.1 [ 12320547 ]
        Daniel Dai made changes -
        Status Resolved [ 5 ] Closed [ 6 ]

          People

          • Assignee:
            Julien Le Dem
            Reporter:
            Julien Le Dem
          • Votes:
            0
            Watchers:
            1
