Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Description
We currently cache Java objects, and these have high overhead. Average stripe metadata takes 200-500 KB on real files. Bloom filters blow up by more than 5x because they are stored as a list of Long-s, which pushes the total up to 5 MB per stripe. That is undesirable.
We should either create more compact objects for ORC (which might be good in general) or store serialized metadata and deserialize it when needed.
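A minimal sketch of the second option, assuming the bloom filter bitset arrives as a List<Long>. The class LazyBitset and its methods are hypothetical names used for illustration only; they are not part of ORC or Hive. The idea is to keep a flat byte array in the cache (8 bytes per word, with no per-element object header or list reference) and decode to a primitive long[] only when a reader needs it.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.List;

/** Hypothetical cache entry: holds a bitset in serialized form and decodes on demand. */
public class LazyBitset {
    // Flat storage: 8 bytes per word, no boxed Long objects kept alive.
    private final byte[] serialized;

    private LazyBitset(byte[] serialized) {
        this.serialized = serialized;
    }

    /** Packs a boxed list of words into a flat byte array. */
    public static LazyBitset fromBoxed(List<Long> words) {
        ByteBuffer buf = ByteBuffer.allocate(words.size() * Long.BYTES);
        for (long w : words) {
            buf.putLong(w);
        }
        return new LazyBitset(buf.array());
    }

    /** Deserializes into a primitive array only when the caller asks for it. */
    public long[] decode() {
        long[] out = new long[serialized.length / Long.BYTES];
        ByteBuffer.wrap(serialized).asLongBuffer().get(out);
        return out;
    }

    /** Payload size held in the cache, excluding the array header. */
    public int cachedBytes() {
        return serialized.length;
    }

    public static void main(String[] args) {
        List<Long> boxed = Arrays.asList(1L, -2L, 42L);
        LazyBitset entry = LazyBitset.fromBoxed(boxed);
        System.out.println("cached payload bytes: " + entry.cachedBytes());
        System.out.println("decoded words: " + Arrays.toString(entry.decode()));
    }
}
```

The trade-off is a decode cost on each access in exchange for a smaller resident footprint. Whether that is acceptable depends on how often cached stripe metadata is actually read, which this sketch does not measure.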