Details
-
Bug
-
Status: Closed
-
Major
-
Resolution: Duplicate
-
0.1.0
-
None
-
None
-
Linux
$pig --version
Apache Pig version 0.1.0-dev (r750430)
compiled Mar 07 2009, 09:20:13
Description
grunt> rf_src = LOAD 'rf_test.txt' USING PigStorage(',') AS (lhs:chararray, rhs:chararray, r:float, p:float, c:float);
grunt> rf_grouped = GROUP rf_src BY rhs;
grunt> lhs_grouped = FOREACH rf_grouped GENERATE group as rhs, rf_src.(lhs, r) as lhs, MAX(rf_src.p) as p, MAX(rf_src.c) AS c;
grunt> describe lhs_grouped;
lhs_grouped: {rhs: chararray,lhs:
,p: float,c: float}
I think it should be:
lhs_grouped: {rhs: chararray,lhs:
,p: float,c: float}
Because of this, we are not able to perform UNION on 2 sets because union on incompatible schemas is causing a complete loss of schema information, making further processing impossible.
This is what we want to UNION with:
grunt> asrc = LOAD 'atest.txt' USING PigStorage(',') AS (rhs:chararray, a:int);
grunt> aa = FOREACH asrc GENERATE rhs, (bag
) null as lhs, -10F as p, -10F as c;
grunt> describe aa;
aa: {rhs: chararray,lhs:
,p: float,c: float}
If there is something wrong with what I am trying to do, please let me know.
Attachments
Issue Links
- depends upon
-
PIG-694 Schema merge should take into account bags with tuples and bags with schemas
- Closed