Details
Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Description
We can reduce the per-row (per-buffer) processing time by not serializing the model weights into the state for every row. For all rows except the last, the only thing we need from the model state is agg_image_count, so the state can be set to just agg_image_count for those rows. For the last row, the behavior remains the same as before: the full state, including the model weights, is serialized.
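A minimal sketch of the idea, not the actual patch: the function names, the float32 layout, and the is_last_row flag below are hypothetical, and the weights are assumed to stay available in memory between rows. It only illustrates returning the bare agg_image_count for intermediate rows and serializing the weights once, on the last row.

```python
import struct


def serialize_state(agg_image_count, weights):
    """Pack the image count followed by the flat weights as little-endian float32.

    Hypothetical layout, chosen only for illustration.
    """
    fmt = "<%df" % (len(weights) + 1)
    return struct.pack(fmt, float(agg_image_count), *weights)


def transition_state(agg_image_count, rows_in_buffer, weights, is_last_row):
    """Return the state to hand back after processing one row/buffer.

    For every row except the last, only the running image count is returned,
    so the (potentially large) weights are not serialized. On the last row the
    full state is serialized, as before.
    """
    agg_image_count += rows_in_buffer
    if is_last_row:
        return serialize_state(agg_image_count, weights)
    return agg_image_count


# Usage: two buffers of 32 images each; only the second call serializes weights.
weights = [0.5, -1.0, 2.0]
state = transition_state(0, 32, weights, is_last_row=False)
state = transition_state(state, 32, weights, is_last_row=True)
```

The cost of the intermediate rows then no longer depends on the model size, since the weights are packed once per iteration instead of once per row.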
With places10, 20 msts, on gpdb6 (with 1 epoch for all msts), per-iteration time went down from around 1700-1800 secs to 1600-1700 secs.
The improvement should be more pronounced when training with a bigger dataset like places365/imagenet.