Pig / PIG-798

Schema errors when using PigStorage and none when using BinStorage in FOREACH??

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.2.0, 0.3.0, 0.4.0, 0.5.0, 0.6.0, 0.7.0, 0.8.0
    • Fix Version/s: None
    • Component/s: impl
    • Labels: None

      Description

      In the following script I have a tab-separated text file, which I load using PigStorage() and store using BinStorage().

      A = load '/user/viraj/visits.txt' using PigStorage() as (name:chararray, url:chararray, time:chararray);
      
      B = group A by name;
      
      store B into '/user/viraj/binstoragecreateop' using BinStorage();
      
      dump B;
      

      I later load file 'binstoragecreateop' in the following way.

      
      A = load '/user/viraj/binstoragecreateop' using BinStorage();
      
      B = foreach A generate $0 as name:chararray;
      
      dump B;
      

      Result
      =======================================================================
      (Amy)
      (Fred)
      =======================================================================
      The above code works properly and returns the right results. If I use PigStorage() to achieve the same, I get the following error.

      A = load '/user/viraj/visits.txt' using PigStorage();
      
      B = foreach A generate $0 as name:chararray;
      
      dump B;
      
      

      =======================================================================

      2009-05-02 03:58:50,662 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1022: Type mismatch merging schema prefix. Field Schema: bytearray. Other Field Schema: name: chararray
      Details at logfile: /home/viraj/pig-svn/trunk/pig_1241236728311.log
      

      =======================================================================
      So why should the semantics of BinStorage(), where it is OK not to specify a schema, be different from those of PigStorage()? Should this not be consistent across both?

      1. schemaerr.pig
        0.5 kB
        Viraj Bhat
      2. visits.txt
        0.2 kB
        Viraj Bhat
      3. binstoragecreateop
        0.3 kB
        Viraj Bhat

        Activity

        Daniel Dai added a comment -

        Yes, we actually want to fix it; see PIG-2315.

        Subir S added a comment -

        Based on the comment above, shouldn't the documentation at http://pig.apache.org/docs/r0.8.1/piglatin_ref2.html#schemaforeach also reflect the correct usage?

        Prashant Kommireddi added a comment -

        Thanks, Daniel. Unfortunately my UDF cannot return a numeric type, as the return value could be non-numeric as well. The value depends on the arguments I pass to the function.

        For Error 2: how is it that if my UDF just returned a single chararray instead of a tuple of chararrays, the SUM function would work? For example:

        A = LOAD 'input.dat'; 
        --assuming MYUDF(*,'id') always returns numeric value
        B = FOREACH A GENERATE (int)MYUDF(*, 'id') as id;
        C = GROUP B BY $0;
        D = FOREACH C GENERATE group, SUM(id);
        
        Daniel Dai added a comment -

        Hi, Prashant,
        Error 1: If your UDF generates a DataByteArray and you later convert it to another data type, Pig doesn't know how to convert the bytes into the real type. This is a hole in Pig that we are working on. For now, please avoid generating a DataByteArray (DBA) in a UDF.

        Error 2: SUM is overloaded for all numeric types, so if your UDF returns a numeric type instead of chararray, the right SUM UDF will be picked up and the error will go away.

        Prashant Kommireddi added a comment -

        I developed a UDF that returns a Tuple of DataByteArray. The error corresponding to this return type is:

        Backend error message
        ---------------------
        org.apache.pig.backend.executionengine.ExecException: ERROR 1075: Received a bytearray from the UDF. Cannot determine how to convert the bytearray to string.
        	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POCast.getNext(POCast.java:657)
        	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:322)
        	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:332)
        	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:284)
        	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:267)
        	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:262)
        	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
        	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
        	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
        	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
        	at org.apache.hadoop.mapred.Child.main(Child.java:170)
        
        Pig Stack Trace
        ---------------
        ERROR 1066: Unable to open iterator for alias result
        
        org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias result
        	at org.apache.pig.PigServer.openIterator(PigServer.java:901)
        	at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:655)
        	at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:303)
        	at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:188)
        	at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:164)
        	at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
        	at org.apache.pig.Main.run(Main.java:553)
        	at org.apache.pig.Main.main(Main.java:108)
        	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        	at java.lang.reflect.Method.invoke(Method.java:597)
        	at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
        Caused by: java.io.IOException: Couldn't retrieve job.
        	at org.apache.pig.PigServer.store(PigServer.java:965)
        	at org.apache.pig.PigServer.openIterator(PigServer.java:876)
        	... 12 more
        

        If I change the return type to a Tuple of chararrays, I receive an error at a later stage, when using SUM:

        Backend error message
        ---------------------
        org.apache.pig.backend.executionengine.ExecException: ERROR 2106: Error while computing sum in Initial
        	at org.apache.pig.builtin.SUM$Initial.exec(SUM.java:113)
        	at org.apache.pig.builtin.SUM$Initial.exec(SUM.java:85)
        	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:216)
        	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:253)
        	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:334)
        	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:332)
        	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:284)
        	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:290)
        	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:256)
        	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:267)
        	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:262)
        	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
        	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
        	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
        	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
        	at org.apache.hadoop.mapred.Child.main(Child.java:170)
        Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to org.apache.pig.data.DataByteArray
        	at org.apache.pig.builtin.SUM$Initial.exec(SUM.java:99)
        	... 15 more
        
        Pig Stack Trace
        ---------------
        ERROR 1066: Unable to open iterator for alias result
        
        org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias result
        	at org.apache.pig.PigServer.openIterator(PigServer.java:901)
        	at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:655)
        	at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:303)
        	at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:188)
        	at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:164)
        	at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
        	at org.apache.pig.Main.run(Main.java:553)
        	at org.apache.pig.Main.main(Main.java:108)
        	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        	at java.lang.reflect.Method.invoke(Method.java:597)
        	at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
        Caused by: java.io.IOException: Couldn't retrieve job.
        	at org.apache.pig.PigServer.store(PigServer.java:965)
        	at org.apache.pig.PigServer.openIterator(PigServer.java:876)
        	... 12 more
        
        Ashutosh Chauhan added a comment -

        As follows:

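        A = load 'input' using PigStorage();
        -- UD is the alias DEFINEd for CustomUDF in Prashant's question
        B = FOREACH A GENERATE FLATTEN(UD(TOTUPLE(*), TOTUPLE('abc','def','ghi')));
        -- cast each field and name it individually
        C = FOREACH B GENERATE (int)$0 AS a, (chararray)$1 AS b, (double)$2 AS c;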
        
        Prashant Kommireddi added a comment -

        Ashutosh, how can I do something like this?

        REGISTER '/home/foo/bar/piggy.jar';
        DEFINE UD com.foo.bar.CustomUDF();
        
        A = load 'input' using PigStorage();
        B = FOREACH A GENERATE FLATTEN(UD(TOTUPLE(*), TOTUPLE('abc','def','ghi'))) as (a:int, b:chararray, c:double);
        

        Here, CustomUDF is a function that takes in (tuple, tuple) and returns a tuple.
        Unfortunately, if there is no way to make the above work, I would have to change the function to return a single byte or chararray and make multiple function calls. That is probably not the best thing to do performance-wise.

        Viraj Bhat added a comment -

        Ashutosh, thanks for clarifying. We will wait until that bug is fixed in BinStorage.

        Viraj

        Ashutosh Chauhan added a comment -

        1.

         b = foreach a generate (chararray) $0 as name; 
        

        2.

        B = foreach A generate $0 as name:chararray;
        

        @Viraj,

        I discussed this with Alan and Daniel. The language semantics for achieving this functionality, whatever the loader, is form 1. The fact that form 2 works for BinStorage is unfortunate and is a bug. It is currently there for backward compatibility and will eventually be removed.
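
        A minimal sketch of two loader-independent ways to get a typed name column, assuming the tab-separated visits.txt from the description. The relation names are illustrative, and the second option is simply the cast form (1 above):

        ```pig
        -- Option 1: declare the schema at load time, as in the original script.
        A = load '/user/viraj/visits.txt' using PigStorage() as (name:chararray, url:chararray, time:chararray);
        B = foreach A generate name;

        -- Option 2: load without a schema and cast the projected column explicitly.
        C = load '/user/viraj/visits.txt' using PigStorage();
        D = foreach C generate (chararray)$0 as name;

        describe B;
        describe D;
        ```

        Neither option relies on the BinStorage-specific behaviour of form 2, so both should keep working if that behaviour is removed.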

        Viraj Bhat added a comment -

        Hi Ashutosh,
        Yes, that is possible. I know that we can do that with PigStorage(), but why can we not do the following with PigStorage()? Why do I need to cast as (chararray)?

        A = load 'somedata' using PigStorage();
        B = foreach A generate $0 as name:chararray;
        dump B;
        

        But this is possible in BinStorage(), why is this not consistent?

        Is it that BinStorage() has schemas embedded while PigStorage() does not?

        Should this not be fixed to make it consistent across storage formats?

        Viraj

        Ashutosh Chauhan added a comment -

        You can specify a schema in FOREACH GENERATE with the PigStorage loader as follows:

        grunt> a = load 'data' using PigStorage();
        grunt> b = foreach a generate (chararray) $0 as name; 
        grunt> describe b;
        b: {name: chararray}
        grunt> dump b;
        

        I get the expected result.

        Viraj Bhat added a comment -

        Hi Ashutosh,
        The problem here is not about using the data interchangeably between BinStorage() and PigStorage(); it is about the consistency of schema handling. Sorry if the description was unclear.

        I can see that it is possible to write statements such as this using BinStorage():

        A = load 'somedata' using BinStorage();
        B = foreach A generate $0 as name:chararray;
        dump B;
        

        but not using PigStorage().

        Should we not support the following statement? As a user, I am interested in projecting the first column and casting it to a chararray. I am not interested in knowing the schemas of the other columns.

        It fails when I do the following:

        A = load 'somedata' using PigStorage();
        B = foreach A generate $0 as name:chararray;
        dump B;
        

        Can you tell me why the schema specification in FOREACH GENERATE works with BinStorage and not in PigStorage?

        Viraj

        Ashutosh Chauhan added a comment -

        Viraj,

        I am confused by this description. It seems to me that you are first storing some data using BinStorage and then loading it using PigStorage. If so, it will obviously not work. PigStorage and BinStorage aren't interoperable in this way. Specifically, data stored using BinStorage can only be loaded using BinStorage.

        Viraj Bhat added a comment -

        Attached the Pig script, the input file, and the BinStorage() input file.


          People

          • Assignee: Alan Gates
          • Reporter: Viraj Bhat
          • Votes: 1
          • Watchers: 2
