Description
I have a xml file which I want to parse using the piggybank XPath udf.
The xml is:
<Aa name="test1">
<Bb Cc="1"/>
<Bb Cc="1"/>
<Bb Cc="1"/>
<Bb Cc="1"/>
<Dd>test2</Dd>
</Aa>
The xpath contains a sum aggregate to sum all Cc values.
The complete pig script:
REGISTER piggybank.jar
DEFINE XPath org.apache.pig.piggybank.evaluation.xml.XPath();
DEFINE XPathAll org.apache.pig.piggybank.evaluation.xml.XPathAll();
XMLFile = LOAD '/demo/test.xml' using org.apache.pig.piggybank.storage.XMLLoader('Aa') as (xmlContents:chararray);
MyOutput = FOREACH XMLFile GENERATE XPathAll(xmlContents,'Aa/@name',true,false).$0 AS Aa:chararray,XPath(xmlContents,'sum(Aa/Bb/@Cc)') AS Cc:Double, XPath(xmlContents,'Aa/Dd') AS Dd:chararray;
STORE MyOutput INTO 'Output/MyOutput' USING PigStorage('|');
MyOutput:
test1||test2
So i'm missing the aggregate 4 in column 2.
Attachments
Attachments
Issue Links
- links to