Details
Type: Improvement
Status: Resolved
Priority: Major
Resolution: Done
Description
Change List<Integer> and List<Double> to int[] and float[].
Using the primitive types gave me around a 2x performance improvement when processing a Yahoo! Music dataset. The major benefit comes from creating significantly fewer objects. Using floats instead of doubles also seems to improve overall performance.
Note that Beam does not provide efficient coders for int[] and float[] by default, so we need to add custom coders for those types to avoid falling back to the inefficient Java serializer.
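As a rough illustration of what such a coder might do, the sketch below writes a primitive array as a 4-byte length followed by the raw elements. It uses only the JDK; the class name PrimitiveArrayCodec and its method names are hypothetical and are not taken from this change or from Beam.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper (not part of Beam): length-prefixed binary
// encoding for int[] and float[] without boxing or Java serialization.
final class PrimitiveArrayCodec {
    private PrimitiveArrayCodec() {}

    // Writes the array length, then each element as 4 big-endian bytes.
    static void encodeInts(int[] values, OutputStream out) throws IOException {
        DataOutputStream data = new DataOutputStream(out);
        data.writeInt(values.length);
        for (int v : values) {
            data.writeInt(v);
        }
        data.flush(); // flush only; the caller owns the stream
    }

    // Reads back an array written by encodeInts.
    static int[] decodeInts(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int[] values = new int[data.readInt()];
        for (int i = 0; i < values.length; i++) {
            values[i] = data.readInt();
        }
        return values;
    }

    // Same layout for float[]: length, then 4 bytes per element.
    static void encodeFloats(float[] values, OutputStream out) throws IOException {
        DataOutputStream data = new DataOutputStream(out);
        data.writeInt(values.length);
        for (float v : values) {
            data.writeFloat(v);
        }
        data.flush();
    }

    // Reads back an array written by encodeFloats.
    static float[] decodeFloats(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        float[] values = new float[data.readInt()];
        for (int i = 0; i < values.length; i++) {
            values[i] = data.readFloat();
        }
        return values;
    }
}
```

In a pipeline, a coder class (for example a subclass of Beam's CustomCoder<int[]>) could delegate its encode and decode methods to helpers like these. The actual coders added for this issue may use a different layout.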