Details
-
Bug
-
Status: Open
-
Major
-
Resolution: Unresolved
-
0.12.1
-
None
-
None
-
MacOs 10.7.5 Java 1.7 hadoop 2.2.0
Description
When specifying the -tagFile option in foreach iteration the first tuple gets always overwritten with the file-name. Even when the file name is not selected. In the following example instead of the date and dividend, the result contains the file name and the dividend.
Example:
divs = load 'data.csv' using PigStorage(';','-tagFile') as (file:chararray, exchange:chararray, symbol:chararray, date:chararray, dividends:float);
subtable = foreach divs generate date as d, dividends as divs;
store subtable into 'sub_dividend';
Test Input data.csv:
NYSE;CPO;2009-12-30;0.14
NYSE;CPO;2009-01-06;0.14
NYSE;CCS;2009-10-28;0.414
NYSE;CCS;2009-01-28;0.414
NYSE;CIF;2009-12-09;0.029
PigUnit Test for it:
@Test
public void testPigScript() throws IOException, ParseException {
String[] script =
;
PigTest test = new PigTest(script);
String[] output =
{ "(2009-12-30,0.14)\n" + "(2009-01-06,0.14)\n" + "(2009-10-28,0.414)\n" + "(2009-01-28,0.414)\n" + "(2009-12-09,0.029)" };
test.assertOutput("subtable",output);
}