Pig / PIG-3488

REGEX_EXTRACT does not work on a dereferenced map value from an HBase column family

    Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 0.10.0
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      I am not sure if this has been fixed in a later version.

      USERS = LOAD 'hbase://my_table' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('s:*', '-limit=1') AS (my_map:map[chararray]);
      TEST = FOREACH USERS GENERATE (chararray)my_map#'my_key';
      TEST2 = FOREACH TEST GENERATE REGEX_EXTRACT($0, 'expires": (.*)}', 1) ;
      illustrate TEST2

      java.lang.ClassCastException: org.apache.pig.data.DataByteArray cannot be cast to java.lang.String
      at org.apache.pig.builtin.REGEX_EXTRACT.exec(REGEX_EXTRACT.java:85)
      at org.apache.pig.builtin.REGEX_EXTRACT.exec(REGEX_EXTRACT.java:47)
      at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:216)
      at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:305)
      at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:322)
      at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:332)
      at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:284)
      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:271)
      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:266)
      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
      at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140)
      at org.apache.pig.pen.LocalMapReduceSimulator.launchPig(LocalMapReduceSimulator.java:194)
      at org.apache.pig.pen.ExampleGenerator.getData(ExampleGenerator.java:257)
      at org.apache.pig.pen.ExampleGenerator.getData(ExampleGenerator.java:238)
      at org.apache.pig.pen.LineageTrimmingVisitor.init(LineageTrimmingVisitor.java:103)
      at org.apache.pig.pen.LineageTrimmingVisitor.<init>(LineageTrimmingVisitor.java:98)
      at org.apache.pig.pen.ExampleGenerator.getExamples(ExampleGenerator.java:166)
      at org.apache.pig.PigServer.getExamples(PigServer.java:1206)
      at org.apache.pig.tools.grunt.GruntParser.processIllustrate(GruntParser.java:725)
      at org.apache.pig.tools.pigscript.parser.PigScriptParser.Illustrate(PigScriptParser.java:591)
      at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:306)
      at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:189)
      at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
      at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
      at org.apache.pig.Main.run(Main.java:490)
      at org.apache.pig.Main.main(Main.java:111)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      at java.lang.reflect.Method.invoke(Method.java:597)
      at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
      2013-09-26 15:50:12,762 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2997: Encountered IOException. Exception : org.apache.pig.data.DataByteArray cannot be cast to java.lang.String
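
The trace shows REGEX_EXTRACT.exec failing while casting its input to java.lang.String, which suggests the field still holds a DataByteArray at run time despite the declared chararray type. The Java sketch below reproduces that failure mode in isolation. BytesValue is a hypothetical stand-in for org.apache.pig.data.DataByteArray, and strictCast and lenient are illustrative names, not Pig APIs. That other string functions convert their input with something like String.valueOf is an assumption, not a confirmed reading of the Pig source.

```java
import java.nio.charset.StandardCharsets;

final class CastSketch {
    // Hypothetical stand-in for org.apache.pig.data.DataByteArray (not the real class).
    static final class BytesValue {
        private final byte[] data;

        BytesValue(String s) {
            this.data = s.getBytes(StandardCharsets.UTF_8);
        }

        @Override
        public String toString() {
            return new String(data, StandardCharsets.UTF_8);
        }
    }

    // Mimics a function that assumes its argument is already a String.
    static String strictCast(Object field) {
        return (String) field; // throws ClassCastException when field is a BytesValue
    }

    // Mimics a function that converts whatever it receives to text (assumed behaviour).
    static String lenient(Object field) {
        return String.valueOf(field);
    }

    public static void main(String[] args) {
        Object field = new BytesValue("{\"expires\": 1324771198000}");
        System.out.println(lenient(field));
        try {
            strictCast(field);
        } catch (ClassCastException e) {
            System.out.println("ClassCastException: " + e.getMessage());
        }
    }
}
```

If this reading is right, the (chararray) cast in the script is not converting the underlying value before it reaches REGEX_EXTRACT.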

      If I try this with other string functions, such as CONCAT, it works fine:

      TEST3 = FOREACH TEST GENERATE CONCAT($0,$0);
      illustrate TEST3

      ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

      USERS my_map:map(:chararray)
      ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
        {my_key={"data": {"added": 1323561598000, "lastseen": 1323561598000, "expires": 1324771198000}

      }}
      ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
      --------------------------------------------------------------------------------------------------------

      TEST :chararray

      --------------------------------------------------------------------------------------------------------

        {"data":{"added": 1323561598000, "lastseen": 1323561598000, "expires": 1324771198000}}

      --------------------------------------------------------------------------------------------------------
      --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

      TEST3 :chararray

      --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

        {"data":{"added": 1323561598000, "lastseen": 1323561598000, "expires": 1324771198000}}{"data":{"added": 1323561598000, "lastseen": 1323561598000, "expires": 1324771198000}}

      --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
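
For reference, here is a minimal Java sketch of what the pattern from TEST2 would capture from the TEST value shown above, once the input arrives as a String. It assumes REGEX_EXTRACT applies java.util.regex with find-style matching. RegexSketch and extract are illustrative names, and the [0-9]+ pattern is only a suggested alternative, not part of the original script.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RegexSketch {
    // Returns the requested capture group of the first match, or null if nothing matches.
    static String extract(String input, String regex, int group) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(group) : null;
    }

    public static void main(String[] args) {
        // The TEST value from the illustrate output.
        String value = "{\"data\":{\"added\": 1323561598000, "
                + "\"lastseen\": 1323561598000, \"expires\": 1324771198000}}";

        // Pattern from the report: the greedy (.*) runs up to the last closing brace.
        System.out.println(extract(value, "expires\": (.*)}", 1));

        // Suggested alternative: capture digits only.
        System.out.println(extract(value, "expires\": ([0-9]+)", 1));
    }
}
```

Because the value ends in two closing braces, the greedy group in the original pattern keeps the inner brace, while the digits-only pattern captures just the timestamp.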


          People

          • Assignee: Unassigned
          • Reporter: Sean Slattery
          • Votes: 0
          • Watchers: 1