Details
-
Improvement
-
Status: Resolved
-
Trivial
-
Resolution: Fixed
-
None
-
None
-
None
-
Reviewed
Description
In order to determine when to start spilling bags, Pig uses MemoryNotification for both MEMORY_THRESHOLD_EXCEEDED and MEMORY_COLLECTION_THRESHOLD_EXCEEDED.
https://docs.oracle.com/javase/8/docs/api/java/lang/management/MemoryNotificationInfo.html
Since spilling a large bag is expensive, Pig explicitly call System.gc() when the expected size is huge. I think we can skip this step when notification is based on MEMORY_COLLECTION_THRESHOLD_EXCEEDED since this means jvm has called the gc already.