Pig
  1. Pig
  2. PIG-2509

Util.getSchemaFromString fails with java.lang.NullPointerException when a tuple in a bag has no name (as when used in MongoStorage UDF)

    Details

    • Hadoop Flags:
      Reviewed

      Description

      The MongoStorage UDF ( https://github.com/mongodb/mongo-hadoop ) fails on its call to Util.getSchemaFromString when a tuple is not named. The call succeeds when the tuple is explicitly named. As tuple naming is optional, this is a bug.

      See http://www.mail-archive.com/user%40pig.apache.org/msg04199.html and http://www.mail-archive.com/dev%40pig.apache.org/msg08016.html

      This makes me sad

      1. PIG-2509-0.patch
        2 kB
        Jonathan Coveney

        Activity

        Transition Time In Source Status Execution Times Last Executer Last Execution Date
        Open Open Patch Available Patch Available
        1d 21h 9m 1 Jonathan Coveney 08/Feb/12 19:05
        Resolved Resolved Reopened Reopened
        2d 11h 56m 1 Jonathan Coveney 28/Feb/12 17:52
        Reopened Reopened Patch Available Patch Available
        9s 1 Jonathan Coveney 28/Feb/12 17:52
        Patch Available Patch Available Resolved Resolved
        19d 16h 39m 2 Daniel Dai 01/Mar/12 23:42
        Resolved Resolved Closed Closed
        55d 20h 51m 1 Daniel Dai 26/Apr/12 21:33
        Hide
        Russell Jurney added a comment -

        MongoStorage is still not working when tuples aren't named, but as there is no longer a call to Util.getSchemaFromString at all in mongo-hadoop, this problem appears to be unrelated.

        Show
        Russell Jurney added a comment - MongoStorage is still not working when tuples aren't named, but as there is no longer a call to Util.getSchemaFromString at all in mongo-hadoop, this problem appears to be unrelated.
        Daniel Dai made changes -
        Status Resolved [ 5 ] Closed [ 6 ]
        Daniel Dai made changes -
        Status Patch Available [ 10002 ] Resolved [ 5 ]
        Hadoop Flags Reviewed [ 10343 ]
        Fix Version/s 0.9.3 [ 12319456 ]
        Resolution Fixed [ 1 ]
        Hide
        Daniel Dai added a comment -

        Unit test pass. test-patch:
        [exec] -1 overall.
        [exec]
        [exec] +1 @author. The patch does not contain any @author tags.
        [exec]
        [exec] +1 tests included. The patch appears to include 3 new or modified tests.
        [exec]
        [exec] -1 javadoc. The javadoc tool appears to have generated 1 warning messages.
        [exec]
        [exec] -1 javac. The applied patch generated 904 javac compiler warnings (more than the trunk's current 900 warnings).
        [exec]
        [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
        [exec]
        [exec] -1 release audit. The applied patch generated 533 release audit warnings (more than the trunk's current 530 warnings).

        javac warning is caused by Antlr. javadoc and release audit warning is not related.

        I don't feel a compelling reason to check it into 0.9 branch. Commit into 0.10/trunk first.

        Thanks Jonathan!

        Show
        Daniel Dai added a comment - Unit test pass. test-patch: [exec] -1 overall. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 3 new or modified tests. [exec] [exec] -1 javadoc. The javadoc tool appears to have generated 1 warning messages. [exec] [exec] -1 javac. The applied patch generated 904 javac compiler warnings (more than the trunk's current 900 warnings). [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings. [exec] [exec] -1 release audit. The applied patch generated 533 release audit warnings (more than the trunk's current 530 warnings). javac warning is caused by Antlr. javadoc and release audit warning is not related. I don't feel a compelling reason to check it into 0.9 branch. Commit into 0.10/trunk first. Thanks Jonathan!
        Jonathan Coveney made changes -
        Status Reopened [ 4 ] Patch Available [ 10002 ]
        Jonathan Coveney made changes -
        Resolution Fixed [ 1 ]
        Status Resolved [ 5 ] Reopened [ 4 ]
        Hide
        Jonathan Coveney added a comment -

        This was closed incorrectly, and needs to be reviewed and committed

        Show
        Jonathan Coveney added a comment - This was closed incorrectly, and needs to be reviewed and committed
        Russell Jurney made changes -
        Status Patch Available [ 10002 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        Hide
        Russell Jurney added a comment -

        Tested. Working well in MongoStorage.

        Show
        Russell Jurney added a comment - Tested. Working well in MongoStorage.
        Hide
        Russell Jurney added a comment -

        I have applied this patch, and have been using it since. It works. Tests pass. Can we resolve and include this in 0.10?

        Show
        Russell Jurney added a comment - I have applied this patch, and have been using it since. It works. Tests pass. Can we resolve and include this in 0.10?
        Jonathan Coveney made changes -
        Attachment PIG-2509-0.patch [ 12513845 ]
        Jonathan Coveney made changes -
        Field Original Value New Value
        Status Open [ 1 ] Patch Available [ 10002 ]
        Assignee Jonathan Coveney [ jcoveney ]
        Fix Version/s 0.10 [ 12316246 ]
        Fix Version/s 0.9.3 [ 12319456 ]
        Fix Version/s 0.11 [ 12318878 ]
        Hide
        Jonathan Coveney added a comment -

        The fix basically just allows a tuple where it is explicitly called "null:" as in the case of "b:

        {null:(a:int)}

        ". In such a case, it will just throw out the tuple name, which will make it equivalent to "b:

        {(a:int)}

        ". Russell ran it and it worked. Russell, I did change it slightly, so it'd be lovely if you can try it again just to be sure.

        I successfully applied to 0.9 and ran "ant test-commit", and applied it to 0.10 and 0.11 and ran "ant -Dtestcase=TestSchema test."

        Show
        Jonathan Coveney added a comment - The fix basically just allows a tuple where it is explicitly called "null:" as in the case of "b: {null:(a:int)} ". In such a case, it will just throw out the tuple name, which will make it equivalent to "b: {(a:int)} ". Russell ran it and it worked. Russell, I did change it slightly, so it'd be lovely if you can try it again just to be sure. I successfully applied to 0.9 and ran "ant test-commit", and applied it to 0.10 and 0.11 and ran "ant -Dtestcase=TestSchema test."
        Hide
        Russell Jurney added a comment -

        The offending code that is called is here: class org.apache.pig.newplan.logical.Util

        ...

        public static LogicalSchema.LogicalFieldSchema translateFieldSchema(Schema.FieldSchema fs) {
        LogicalSchema newSchema = null;
        if (fs.schema!=null)

        { newSchema = translateSchema(fs.schema); }

        LogicalSchema.LogicalFieldSchema newFs = new LogicalSchema.LogicalFieldSchema(fs.alias, newSchema, fs.type);
        return newFs;
        }

        /**

        • This function translates the new LogicalSchema into old Schema format required
        • by PhysicalOperators
        • @param schema LogicalSchema to be converted to Schema
        • @return Schema that is converted from LogicalSchema
        • @throws FrontendException
          */
          public static Schema translateSchema(LogicalSchema schema) {
          if (schema == null) { return null; }

        Schema s2 = new Schema();
        List<LogicalSchema.LogicalFieldSchema> ll = schema.getFields();
        for (LogicalSchema.LogicalFieldSchema f: ll) {
        Schema.FieldSchema f2 = null;
        try

        { f2 = new Schema.FieldSchema(f.alias, translateSchema(f.schema), f.type); f2.canonicalName = ((Long)f.uid).toString(); s2.add(f2); }

        catch (FrontendException e) {
        }
        }

        return s2;
        }

        Show
        Russell Jurney added a comment - The offending code that is called is here: class org.apache.pig.newplan.logical.Util ... public static LogicalSchema.LogicalFieldSchema translateFieldSchema(Schema.FieldSchema fs) { LogicalSchema newSchema = null; if (fs.schema!=null) { newSchema = translateSchema(fs.schema); } LogicalSchema.LogicalFieldSchema newFs = new LogicalSchema.LogicalFieldSchema(fs.alias, newSchema, fs.type); return newFs; } /** This function translates the new LogicalSchema into old Schema format required by PhysicalOperators @param schema LogicalSchema to be converted to Schema @return Schema that is converted from LogicalSchema @throws FrontendException */ public static Schema translateSchema(LogicalSchema schema) { if (schema == null) { return null; } Schema s2 = new Schema(); List<LogicalSchema.LogicalFieldSchema> ll = schema.getFields(); for (LogicalSchema.LogicalFieldSchema f: ll) { Schema.FieldSchema f2 = null; try { f2 = new Schema.FieldSchema(f.alias, translateSchema(f.schema), f.type); f2.canonicalName = ((Long)f.uid).toString(); s2.add(f2); } catch (FrontendException e) { } } return s2; }
        Hide
        Russell Jurney added a comment -
        Show
        Russell Jurney added a comment - See also https://jira.mongodb.org/browse/HADOOP-19
        Russell Jurney created issue -

          People

          • Assignee:
            Jonathan Coveney
            Reporter:
            Russell Jurney
          • Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development