Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Won't Fix
-
None
-
None
-
None
Description
We are constantly hitting the java heap space memory issue if the combiner is enabled on our jobs.
Configs:
pig.cachedbag.memusage=20
io.sort.mb=300
pig.exec.nocombiner=false
mapred.child.java.opts=-Xmx750m
Sample job:
A = LOAD '$INPUT' USING com.contextweb.pig.CWHeaderLoader('$WORK_DIR/schema/rpt.xml'); AA = foreach A GENERATE checkPointStart, PublisherId, TagId, ContextCategoryId,Impressions, Clicks, Actions; DESCRIBE AA; B = GROUP AA BY (checkPointStart, PublisherId, TagId, ContextCategoryId); result = FOREACH B GENERATE group, SUM(AA.Impressions) as Impressions, SUM(AA.Clicks) as Clicks, SUM(AA.Actions) as Actions; DESCRIBE result; STORE result INTO '$OUTPUT' USING com.contextweb.pig.CWHeaderStore();
Mapper Error Log:
2011-01-12 18:43:22,084 FATAL org.apache.hadoop.mapred.Child: Error running child : java.lang.OutOfMemoryError: Java heap space
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:799)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:549)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:631)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:315)
at org.apache.hadoop.mapred.Child$4.run(Child.java:217)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1063)
at org.apache.hadoop.mapred.Child.main(Child.java:211)
Attachments
Issue Links
- duplicates
-
PIG-1815 pig task retains used instances of PhysicalPlan
- Closed